Classical Optics and its Applications


Author: Masud Mansuripur


CLASSICAL OPTICS AND ITS APPLICATIONS Second Edition

Covering a broad range of fundamental topics in classical optics and electromagnetism, this book is ideal for graduate-level courses in optics, providing supplementary reading materials for teachers and students alike. Industrial scientists and engineers developing modern optical systems will also find it an invaluable resource. Now in color, this second edition contains 13 new chapters, covering optical pulse compression, the Hanbury Brown–Twiss experiment, the Sagnac effect, Doppler shift and stellar aberration, and optics of semiconductor diode lasers. The first half of the book deals primarily with the basic concepts of optics, while the second half describes how these concepts can be used in a variety of technological applications. Each chapter is concerned with a single topic, developing an understanding through the use of diagrams, examples, numerical simulations, and logical arguments. The mathematical content is kept to a minimum to provide the reader with insightful discussions of optical phenomena.

Masud Mansuripur received a Bachelor of Science degree in Electrical Engineering from Arya-Mehr University of Technology in Tehran, Iran (1977), a Master of Science in Electrical Engineering from Stanford University (1978), a Master of Science in Mathematics from Stanford University (1980), and a Ph.D. in Electrical Engineering from Stanford University (1981). He has been Professor of Optical Sciences at the University of Arizona since 1988. His areas of research include optical data storage, optical signal processing, magneto-optical properties of thin magnetic films, radiation pressure, interaction of light with sub-wavelength structures, and the optical and thermal characterization of thin films and stacks. A Fellow of the Optical Society of America, he has published more than 250 papers in various technical journals, holds eight patents, has given numerous invited talks at international scientific conferences, and has been a contributing editor of Optics & Photonics News, the magazine of the Optical Society of America. Professor Mansuripur’s published books include Introduction to Information Theory (1987) and The Physical Principles of Magneto-optical Recording (1995).

http://ebooks.cambridge.org/ebook.jsf?bid=CBO9780511803796


MASUD MANSURIPUR College of Optical Sciences University of Arizona, Tucson


Cambridge University Press
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo, Delhi

Cambridge University Press, The Edinburgh Building, Cambridge CB2 8RU, UK

Published in the United States of America by Cambridge University Press, New York

www.cambridge.org
Information on this title: www.cambridge.org/9780521881692

© M. Mansuripur 2009

This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First edition published 2002
Second edition published 2009

Printed in the United Kingdom at the University Press, Cambridge

A catalogue record for this publication is available from the British Library

ISBN 978-0-521-88169-2 hardback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.

To Annegret, Kaveh, and Tobias


Contents

Preface to the second English edition
Preface to the first edition
Introduction
1 Abbe’s sine condition
2 Fourier optics
3 Effect of polarization on diffraction in systems of high numerical aperture
4 Gaussian beam optics
5 Coherent and incoherent imaging
6 First-order temporal coherence in classical optics
7 The van Cittert–Zernike theorem
8 Partial polarization, Stokes parameters, and the Poincaré sphere
9 Second-order coherence and the Hanbury Brown–Twiss experiment
10 What in the world are surface plasmons?
11 Surface plasmon polaritons on metallic surfaces
12 The Faraday effect
13 The magneto-optical Kerr effect
14 The Sagnac interferometer
15 Fabry–Pérot etalons in polarized light
16 The Ewald–Oseen extinction theorem
17 Reciprocity in classical linear optics
18 Optical pulse compression
19 The uncertainty principle in classical optics
20 Omni-directional dielectric mirrors
21 Linear optical vortices
22 Geometric-optical rays, Poynting’s vector, and the field momenta
23 Doppler shift, stellar aberration, and convection of light by moving media
24 Diffraction gratings
25 Diffractive optical elements
26 The Talbot effect
27 Some quirks of total internal reflection
28 Evanescent coupling
29 Internal and external conical refraction
30 Transmission of light through small elliptical apertures
31 The method of Fox and Li
32 The beam propagation method
33 Launching light into a fiber
34 The optics of semiconductor diode lasers
35 Michelson’s stellar interferometer
36 Bracewell’s interferometric telescope
37 Scanning optical microscopy
38 Zernike’s method of phase contrast
39 Polarization microscopy
40 Nomarski’s differential interference contrast microscope
41 The van Leeuwenhoek microscope
42 Projection photolithography
43 Interaction of light with subwavelength structures
44 The Ronchi test
45 The Shack–Hartmann wavefront sensor
46 Ellipsometry
47 Holography and holographic interferometry
48 Self-focusing in nonlinear optical media
49 Spatial optical solitons
50 Laser heating of multilayer stacks
Index

Preface to the second English edition

Following the publication of the first edition of this book, I wrote (or co-wrote) nine additional columns for Optics & Photonics News (OPN), which appeared between April 2001 and April 2007. Some of these columns were included in the Japanese enlarged edition of the book, published in 2006; all nine columns are now included in this second English edition. Throughout the years, I also wrote four columns which were not submitted to OPN, because they ended up being somewhat lengthy and perhaps too mathematical for the general readership of the OPN; these appear here for the first time as Chapters 9, 14, 18, and 25.

The selection of topics and the exposition style of the thirteen new chapters of the present edition follow the same principles and guidelines as did the original thirty-seven chapters of the first edition. In each case a topic is chosen either for its intrinsic value as a foundational contribution to the science of optics (e.g., the Sagnac effect, second-order coherence, the Doppler shift), or because of its technological significance (e.g., optical pulse compression, semiconductor diode lasers, diffractive optical elements). To a large extent, the fifty chapters of the present book are independent of each other and can be read in any desired sequence. Occasionally, when the information in one chapter could benefit the understanding of the material in another, cross references are provided. The presentation style is pedagogic and informal, with mathematics used sparingly unless it is deemed essential and unavoidable. Computer simulations are used extensively throughout the book as an aid to explaining the concepts and to provide concrete examples of the physical phenomena under consideration.

As was the case in the first edition, the software packages DIFFRACT and MULTILAYER, both products of MM Research, Inc., Tucson, Arizona, were used for the numerical simulations reported in the new Chapters 19, 20, 23, 25, 33, 34, 43, and 49. The computer simulations of Chapters 11, 30, and 43 were carried out by Armis Zakharian using his software package Sim3D_Max, a product of Nonlinear Control Strategies, Inc., Tucson, Arizona. The basis of Sim3D_Max is the Finite Difference Time Domain (FDTD) technique for solving Maxwell’s equations as described, for example, in Computational Electrodynamics by A. Taflove and S. C. Hagness, Artech House (2000).

Professor Emeritus Jumpei Tsujiuchi of the Tokyo Institute of Technology, Japan, has painstakingly translated all of my OPN columns for the Japanese magazine O Plus E; these articles appeared in print between 2001 and 2005. Subsequently, Professor Tsujiuchi arranged for the collection of the translated articles to be published in book form by the New Technology Communications Co., Ltd. The Japanese enlarged edition of Classical Optics and its Applications has been in print since 2006. I am grateful to Professor Tsujiuchi for his dedication and his untiring efforts to bring these articles to the attention of the technical community in Japan. For guidance and illuminating discussions, I am indebted to Armis Zakharian of the Corning Corp., Jeffrey Wilde of the Capella Photonics, Inc., Seiji Yonezawa of the Comets, Inc. (Japan), Sjoerd Stallinga of the Philips Research Laboratories (Netherlands), and Kenji Konno of Minolta Corp. (Japan). I am also grateful to the following colleagues from the College of Optical Sciences, University of Arizona, for sharing their insights with me: Ewan Wright, Jerome Moloney, Brian Anderson, Mahmoud Fallahi, Jason Jones, Jose Sasian, Nasser Peyghambarian, James Wyant, Dennis Howe, Pavel Polynkin, and Pierre Meystre. Special thanks are due to Ewan Wright, Armis Zakharian, and Jerome Moloney for granting permission to publish our joint articles in this volume. (The co-authors of the corresponding chapters are acknowledged in footnotes to each chapter.) While working on several chapters of this book, my research has been supported by the United States Air Force Office of Scientific Research (AFOSR) through contracts F49620–03–1–0194, FA9550–04–1–0213, FA9550–04–1–0355 awarded by the Joint Technology Office; I thank Dr Arje Nachman of the AFOSR for his support of our research program over the past several years. 
I also would like to thank the editor at the Cambridge University Press, Dr Simon Capelin, and his professional staff for their superb handling of the publication of the English editions of this book. Last but not least, I must mention with deep gratitude the loving care and support of my wife, Annegret, during the years that this book has been in preparation. As with previous editions, it is to her and to our children, Kaveh and Tobias, that the second English edition of the book is dedicated.

Preface to the first edition

I started writing the Engineering column of Optics & Photonics News (OPN) in early 1997. Since then nearly forty articles have appeared, covering a broad range of topics in classical optical physics and engineering. My original goal was to introduce students and practising engineers to some of the most fascinating topics in classical optics. This I planned to achieve with minimal usage of the mathematical language that pervades the literature of the field. I had met many bright students and practitioners who either did not know or did not fully appreciate some of the major concepts of classical optics such as the Talbot effect, Abbe’s sine condition, the Goos–Hänchen effect, Hamilton’s internal and external conical refraction, Zernike’s method of phase contrast, Michelson’s stellar interferometer, and so on. My columns were going to have little mathematics but an abundance of pictures and pedagogical arguments, to bring forth the essence of the physics involved in each phenomenon. In the process, I hoped, the readers would appreciate the beauty of the subject and, if they found it interesting, would dig deeper by searching the cited literature. A unique tool available to me for this purpose was the computer programs DIFFRACT™, MULTILAYER™, and TEMPROFILE™, which I have developed in the course of my research over the past 20 years. The first of these programs simulates the propagation of light through optical systems consisting of discrete elements such as lasers, lenses, mirrors, prisms, phase/amplitude masks, gratings, polarizers, wave-plates, multilayer stacks, birefringent crystals, diffraction gratings, and optically active materials. The other two programs simulate the optical and thermal behavior of multilayer stacks. I have used these programs to generate graphs and pictures to explain the various phenomena in ways that would promote a better understanding. The articles have been successful beyond my wildest dreams.
While I had hoped that a few readers would find something useful in this series, I have received notes, e-mails, and verbal comments from distinguished scholars around the world who have found the columns stimulating and helpful. Some teachers informed me that they use the articles for their classroom teaching, and I have heard of several readers who collect the articles for future reference. All in all, I have been pleasantly surprised by the positive reaction of the OPN readers to these columns.

Optics & Photonics News is not an archival journal and, therefore, will not be widely available to future students. Thus I believe that collecting the articles here in one book, which provides for ease of cross-referencing, will be useful. Moreover, the book contains additional explanations of topics that were originally curtailed for lack of space in OPN; it includes corrections to errors discovered afterwards and incorporates some comments and criticisms made by OPN readers as well as my answers to these criticisms.

This book covers a broad range of topics: classical diffraction theory, the optics of crystals, the peculiarities of polarized light, thin-film multilayer stacks and coatings, geometrical optics and ray-tracing, various forms of optical microscopy, interferometry, coherence, holography, nonlinear optics, etc. It could serve as a companion to the principal text used in a number of academic courses in physics, engineering, and optics; it should be useful for university teachers as a guide to selecting topics for a graduate-level course; it should be useful also for self-study by graduate students. It could be used fruitfully by engineers who develop optical systems such as laser printers, scanners, cameras, displays, image-processing equipment, lasers and laser-based systems, telescopes, optical storage and communication systems, spectrometers, etc. I believe anyone working in the field of optics could benefit from this book, by being exposed to some of the major concepts and ideas (developed over the last three centuries) that shape our modern understanding of optics.

Some of the original OPN columns were written jointly with colleagues and students; these are identified in the footnotes and the corresponding co-authors acknowledged.
I thank Ewan Wright and Rongguang Liang of the Optical Sciences Center, Lifeng Li of Tsinghua University, Mahmoud Fallahi of Nortel Co., and Wei-Hung Yeh of Maxoptix Co. for their collaboration as well as for giving permission to publish our joint papers in this collection. I also would like to acknowledge the late Peter Franken, Pierre Meystre, Yung-Chieh Hsieh, Dennis Howe, Glenn Sincerbox, Harrison Barrett, Roland Shack, José Sasian, Michael Descour, Arvind Marathay, Ray Chiao, James Wyant, Marc Levenson, Ronald Gerber, James Burge, Ferry Zijp and Dror Sarid, who shared their valuable insights with me and/or criticized the drafts of several articles prior to publication. Needless to say, I am solely responsible for any remaining errors and inaccuracies. For their help with graphics and word processing, I am grateful to our administrative assistants Patricia Gransie, Nonie Veccia, Marylou Myers, and Amanda Palma. Last but not least, I am grateful to my wife, Annegret, who has tolerated me with love and patience over the past four years while this book was being written. It is to her and to our children, Kaveh and Tobias, that this book is dedicated.

Introduction

The common threads that run through this book are the classical phenomena of diffraction, interference, and polarization. Although the reader is expected to be generally familiar with these electromagnetic phenomena, the book does cover some of the principles of classical optics in the early chapters. The basic ideas of diffraction and Fourier optics are introduced in chapters 1 through 4; this introduction is followed by a detailed discussion of spatial and temporal coherence and of partial polarization in chapters 5 through 8. These concepts are then used throughout the book to explain phenomena that are either of technological import or significant in their own right as natural occurrences that deserve attention. Each chapter is concerned with a single topic (e.g., surface plasmons, diffraction gratings, evanescent coupling, photolithography) and attempts to develop an understanding of this subject through the use of pictures, examples, numerical simulations, and logical argument. The reader already familiar with a particular topic is likely to learn more about its applications, to appreciate better the physics behind some of the formulas he or she may have previously encountered, and perhaps even learn a thing or two about the nuances of the subject. For the reader who is new to the field, our presentation is aimed to provide an introduction, an intuitive feel for the physical and/or technological issues involved, and, hopefully, motivation for digging deeper by consulting the cited references. For the most part, this book avoids repeating what is already in the open literature, aiming instead to expose concepts and ideas, ask critical questions, and provide answers by appealing to the reader’s intuition rather than to his or her mathematical skills. 
Some of the chapters address fundamental problems that historically have been crucial to our modern understanding of optics; conical refraction, the Talbot effect, the principle of holography, and the Ewald–Oseen extinction theorem are representatives of this class. Other chapters introduce devices and phenomena of great scientific and technological importance; Fabry–Pérot etalons, the magneto-optical Faraday and Kerr effects, and the phenomenon of total internal reflection fall into this second category. Many of the remaining chapters single out a tool or an instrument that not only is of immense technological value but also has its unique principles of operation, worthy of detailed understanding; examples include various microscopes and telescopes, lithographic systems, ellipsometers, and so on. Occasionally a theoretical concept or a numerical method is found that has a wide range of applications; we have devoted a few chapters to these topics, such as the method of Fox and Li, the beam propagation method, and the concept of reciprocity in classical optics.

The majority of the computer simulations reported in this book were performed with the software packages DIFFRACT, MULTILAYER, and TEMPROFILE, which I have written in the course of the past twenty years and which are now commercially available. These programs in turn are based on theoretical methods and numerical algorithms that are fully documented in several of my publications [1–6]. In a few chapters, I have collaborated with Professor Lifeng Li (now at the Tsinghua University in China). Here, we have used Professor Li’s program DELTA, also commercially available, for calculations pertaining to diffraction gratings. The theoretical foundations of DELTA are described in Professor Li’s publications [7].

Throughout the book, black-and-white pictures will be used to display the various properties of an optical beam; these include cross-sectional distributions of intensity, phase, polarization, and the Poynting vector. A unified scheme for the gray-scale encoding of real-valued functions of two variables is used in all the chapters, and it is helpful to review these methods at the outset. In the convention adopted the beam always propagates along the Z-axis, and its cross-sectional plane is XY.
The Cartesian XYZ coordinate system is right-handed, the polar angles are measured from the positive Z-axis, and the azimuthal angles are measured from the positive X-axis towards the positive Y-axis. In general, the beam has three components of polarization along the X-, Y-, and Z-axes of the coordinate system, that is, its electric field E has components $E_x(x, y)$, $E_y(x, y)$, and $E_z(x, y)$ at any given cross-sectional plane, say, at $z = z_0$. Since the E-field components are complex-valued, their complete specification requires two distributions for each component, namely, amplitude and phase. The following paragraphs describe in some detail the encoding scheme used for displaying different cross-sectional properties of the beam and also provide a few examples.

Plots of intensity distribution

The electric-field intensity is the square of the field amplitude at any given location in space. Thus, for example, the intensity distribution in the cross-sectional XY-plane for the E-field component along the X-axis is denoted by $I_x(x, y) = |E_x(x, y)|^2$. Figure 0.1 shows plots of intensity distribution for the three components of polarization of a Laguerre–Gaussian beam propagating along the Z-axis. The black pixels represent locations where the intensity is at its minimum (zero in the present case), the white pixels correspond to the locations of maximum intensity within the corresponding frame, and the gray pixels linearly interpolate between these minimum and maximum values. In the case of Figure 0.1, the beam was taken to be linearly polarized at 45° to the X-axis, leading to identical distributions for the X- and Y-components of polarization. The much weaker Z-component is computed to ensure that the Maxwell equations will be satisfied for the assumed distributions of the X- and Y-polarization components. In general, one may assume arbitrary distributions for $E_x$ and $E_y$ within a given cross-sectional XY-plane. To determine $E_z$ in a self-consistent manner, one must break up the $E_x$ and $E_y$ distributions into their plane-wave constituents and proceed to determine $E_z$ for each plane wave that propagates along the unit vector $\boldsymbol{\sigma} = (\sigma_x, \sigma_y, \sigma_z)$ by requiring the inner product of E and $\boldsymbol{\sigma}$ to vanish (i.e., $E_x\sigma_x + E_y\sigma_y + E_z\sigma_z = 0$). One must then superimpose the Z-components of all the plane waves thus obtained to arrive at the total distribution of $E_z$. In Figure 0.1 the peak intensities in the three frames are in the ratios $I_x : I_y : I_z = 1.0 : 1.0 : 1.47 \times 10^{-7}$.

Logarithmic plots of intensity distribution

In order to emphasize the weaker regions of an intensity distribution, we will show on numerous occasions the distribution of the logarithm of the intensity. First, the intensity distribution is normalized by its peak value, then the base-10 logarithm of the normalized intensity is computed and all values below some cutoff point are truncated. For instance, if the cutoff is set to $\alpha$ then all values of the normalized intensity below $10^{\alpha}$ are reset to $10^{\alpha}$; the range of the logarithm of normalized intensity thus becomes $(\alpha, 0)$. The continuum of gray levels from black to bright-white is then mapped linearly onto this interval and used to display plots of normalized intensity on the logarithmic scale. When it is deemed useful or necessary, the corresponding value of $\alpha$ will be indicated in a figure’s caption. Figure 0.2 shows two plots of the same intensity distribution at the focal plane of a comatic lens. In (a) the distribution is linearly mapped onto the gray-scale, whereas in (b) the logarithm of intensity with a cutoff at $\alpha = -4$ is displayed. The latter is similar to what would be obtained by over-exposing a photographic plate placed at the focal plane of the lens.
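The logarithmic encoding just described can be sketched in a few lines of Python with NumPy. This is only an illustration of the stated steps (normalize by the peak, take the base-10 logarithm, truncate at the cutoff, map linearly onto gray levels). The function name `log_gray_levels`, the parameter name `alpha` for the cutoff exponent, and the choice of 0 for black and 1 for bright-white are assumptions made for this sketch; they are not taken from the software used to produce the book's figures.

```python
import numpy as np

def log_gray_levels(intensity, alpha=-4.0):
    """Map an intensity array to gray levels in [0, 1] on a log scale.

    alpha is the (negative) cutoff exponent: normalized intensities
    below 10**alpha are reset to 10**alpha before taking the logarithm.
    Gray level 0 stands for black, 1 for bright-white (assumed encoding).
    """
    intensity = np.asarray(intensity, dtype=float)
    normalized = intensity / intensity.max()         # peak value becomes 1
    truncated = np.maximum(normalized, 10.0 ** alpha)
    log_values = np.log10(truncated)                 # lies in [alpha, 0]
    return (log_values - alpha) / (0.0 - alpha)      # linear map onto [0, 1]

# Hypothetical sample values spanning six orders of magnitude.
sample = np.array([1.0, 1e-2, 1e-4, 1e-6])
levels = log_gray_levels(sample, alpha=-4.0)
```

With a cutoff exponent of -4, everything at or below one ten-thousandth of the peak is rendered black, which is why weak side-lobes become visible while the bright core saturates toward white.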

Figure 0.1 Plots of intensity distribution in the cross-sectional plane of a Laguerre–Gaussian beam for the three components of the E-field. In each frame the black pixels represent locations of zero intensity, while the white pixels represent locations of maximum intensity in the corresponding frame. The beam is assumed to propagate along the Z-axis, linearly polarized at 45° to the X-axis. (a) Intensity of the component of polarization along the X-axis, $I_x(x, y) = |E_x(x, y)|^2$, (b) $I_y(x, y) = |E_y(x, y)|^2$, (c) $I_z(x, y) = |E_z(x, y)|^2$. The peak intensities in (a), (b), (c) are in the ratios $1.0 : 1.0 : 1.47 \times 10^{-7}$, respectively. [Frames (a), (b), (c); axes $x/\lambda$ and $y/\lambda$, each from −10000 to +10000.]
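The self-consistent determination of the Z-component described earlier in this Introduction (decompose the transverse field into plane waves, require each plane wave's E-field to be orthogonal to its propagation direction, then superimpose) can be sketched with a two-dimensional FFT. The following is a minimal illustration under stated assumptions, not the algorithm of the DIFFRACT program: the function name `z_component` is hypothetical, lengths are measured in units of the wavelength so that spatial frequencies coincide with direction cosines, the grid is square and periodic, and evanescent components are simply discarded.

```python
import numpy as np

def z_component(ex, ey, dx):
    """Estimate Ez on a grid from Ex and Ey via plane-wave decomposition.

    ex, ey : 2D complex arrays indexed as [y, x]
    dx     : sample spacing in units of the wavelength (assumption)
    """
    ny, nx = ex.shape
    # Direction cosines of each plane-wave constituent.
    sx = np.fft.fftfreq(nx, d=dx)[np.newaxis, :]
    sy = np.fft.fftfreq(ny, d=dx)[:, np.newaxis]
    s2 = sx**2 + sy**2
    propagating = s2 < 1.0                 # evanescent waves are dropped
    sz = np.sqrt(np.where(propagating, 1.0 - s2, 1.0))
    ex_hat = np.fft.fft2(ex)
    ey_hat = np.fft.fft2(ey)
    # Orthogonality of E and the propagation direction, per plane wave:
    # Ex*sx + Ey*sy + Ez*sz = 0
    ez_hat = np.where(propagating, -(ex_hat * sx + ey_hat * sy) / sz, 0.0)
    return np.fft.ifft2(ez_hat)            # superimpose all plane waves

# Demo: a single tilted plane wave polarized along X (hypothetical values).
n, dx = 64, 0.5
sigma_x = 0.125                            # lies exactly on the FFT grid
x = np.arange(n) * dx
ex_demo = np.tile(np.exp(2j * np.pi * sigma_x * x), (n, 1))
ey_demo = np.zeros_like(ex_demo)
ez_demo = z_component(ex_demo, ey_demo, dx)
```

For the single plane wave in the demo, the result reduces to Ez = −Ex·σx/σz, which is the orthogonality condition applied once; a general beam is handled as the sum of many such contributions.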

Figure 0.2 (a) Intensity distribution in the focal plane of a 0.5NA lens having $1.5\lambda$ of third-order coma (Seidel aberration). The uniformly distributed incident beam is assumed to be circularly polarized. In the focal plane, the X-, Y-, and Z-components of the electric field vector are added together to yield the total E-field intensity. (b) Same as (a) but on a logarithmic scale with $\alpha = -4$ (see text). [Frames (a) and (b); axes $x/\lambda$ and $y/\lambda$, each from −5 to +5.]

Plots of phase distribution

In several chapters we will show plots of phase distribution in a beam’s cross-sectional plane. The phase, a modulo-$2\pi$ entity, will always be limited to a range less than or equal to 360°. We typically divide the range of phase values for a given distribution into equal sub-intervals, assigning black to the minimum value, bright-white to the maximum value, and various gray levels to the values in between. A sharp discontinuity (from black to white or vice versa) appearing in these phase plots would be of no physical significance, since it merely indicates a 360° phase jump.

Figure 0.3 is a cross-sectional plot of the phase distribution for the Laguerre–Gaussian beam whose intensity distribution was given in Figure 0.1. The three frames of Figure 0.3 correspond to the components of polarization along the X-, Y-, and Z-axes. The black pixels represent the minimum phase, −180°, and the white pixels correspond to the maximum phase, +180°; the gray pixels cover the continuous range of values in between.

Figure 0.3 Plots of phase distribution in the cross-sectional plane of the Laguerre–Gaussian beam depicted in Figure 0.1. Frames (a), (b), and (c) correspond, respectively, to the components of the E-field along the X-, Y-, and Z-coordinate axes. In each frame the black pixels represent a phase of −180° and the white pixels correspond to a phase of +180°; the gray pixels linearly interpolate between these two extreme values. [Frames (a), (b), (c); axes $x/\lambda$ and $y/\lambda$, each from −10000 to +10000.]

Ellipse of polarization

Consider a collimated beam of light propagating along the Z-axis. In general, the state of polarization of the beam at any given point is elliptical, as shown in Figure 0.4. So long as the electric-field vector E may be assumed to be confined to the XY-plane, it may be resolved into two orthogonal components, along the X- and Y-axes say. If $E_x$ and $E_y$ happen to be in phase, the polarization will be linear along some direction specified by the angle $\rho$. If, on the other hand, the phase difference between $E_x$ and $E_y$ is 90° then the polarization will be elliptical, the major and minor axes of the ellipse lying along the X- and Y-axes. In general, the phase difference between $E_x$ and $E_y$ is somewhere between 0° and 360°, giving rise to an ellipse whose major axis has an angle $\rho$ with the X-axis and whose ellipticity is given by the angle $\eta$. When the polarization is linear, $\eta = 0^\circ$; for light that is right circularly polarized (RCP), $\eta = +45^\circ$, whereas for light that is left circularly polarized (LCP), $\eta = -45^\circ$. In general, $-90^\circ < \rho \le 90^\circ$ and $-45^\circ \le \eta \le 45^\circ$.

Figure 0.5 shows cross-sectional plots of intensity and polarization state for a beam with a highly non-uniform state of polarization. Frame (a) is the logarithmic intensity pattern in the XY-plane. The polarization rotation angle $\rho(x, y)$ is depicted in (b), while the ellipticity $\eta(x, y)$ is shown in (c). The gray-scale in Figure 0.5(b) is a linear map of the values of $\rho$ from −90° (black) to +90° (white). Similarly, the plot of $\eta$ in Figure 0.5(c) is linearly encoded in gray-scale, with black representing −45° and white representing +45°. In the plot of $\rho$ depicted in Figure 0.5(b), there are random-looking jumps between black and bright-white pixels. This is due to the ambiguity of the polarization rotation angle when either the E-field intensity is zero or the ellipticity $\eta$ is ±45°. In these regions, a small numerical error could readily cause a discrete jump between $\rho_{\min} = -90^\circ$ and $\rho_{\max} = +90^\circ$.

Figure 0.4 The ellipse of polarization is uniquely specified by $E_x$ and $E_y$, the complex-valued electric field components along the X- and Y-axes. The major axis of the ellipse makes an angle $\rho$ with the X-direction, and the angle $\eta$ facing the minor axis represents the polarization ellipticity.
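As a numerical companion to the ellipse of polarization, the orientation angle of the major axis and the ellipticity angle can be computed from the two complex field components. The sketch below goes through the Stokes parameters, which is one standard route and not necessarily the one used for the book's figures. The names `polarization_ellipse`, `rho`, and `eta` are chosen for this illustration. The sign of `eta` is taken positive when the phase of Ey exceeds that of Ex; which handedness (RCP or LCP) that corresponds to depends on the time-dependence convention, which is an assumption left open here.

```python
import numpy as np

def polarization_ellipse(ex, ey):
    """Return (rho, eta) in degrees for complex field components ex, ey.

    rho : orientation of the major axis, in (-90, 90]
    eta : ellipticity angle, in [-45, 45]
    """
    ex = np.atleast_1d(np.asarray(ex, dtype=complex))
    ey = np.atleast_1d(np.asarray(ey, dtype=complex))
    s0 = np.abs(ex)**2 + np.abs(ey)**2
    s1 = np.abs(ex)**2 - np.abs(ey)**2
    s2 = 2.0 * np.real(ex * np.conj(ey))
    s3 = 2.0 * np.imag(np.conj(ex) * ey)   # sign convention is an assumption
    # Guard against zero intensity, where both angles are undefined.
    ratio = np.divide(s3, s0, out=np.zeros_like(s0), where=s0 > 0)
    rho = 0.5 * np.degrees(np.arctan2(s2, s1))
    eta = 0.5 * np.degrees(np.arcsin(np.clip(ratio, -1.0, 1.0)))
    return rho, eta

# Demo: linear at 45 degrees, circular, and a 2:1 ellipse aligned with X.
ex_demo = np.array([1.0, 1.0, 2.0])
ey_demo = np.array([1.0, 1.0j, 1.0j])
rho, eta = polarization_ellipse(ex_demo, ey_demo)
```

In the circular case both s1 and s2 vanish, so the orientation angle is numerically arbitrary; this is the same ambiguity that produces the random-looking black-to-white jumps in plots of the polarization rotation angle.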

Figure 0.5 Distributions of intensity and polarization in the cross-section of a beam having a non-uniform polarization state. (a) Logarithmic plot of intensity distribution having cutoff at $\alpha = -4$. (b) Polarization rotation angle $\rho$; the gray-scale is linearly mapped onto $\rho$, from black at $\rho_{\min} = -90^\circ$ to bright-white at $\rho_{\max} = +90^\circ$. (c) Polarization ellipticity $\eta$; the gray-scale is linearly mapped onto $\eta$, from black at $\eta_{\min} = -45^\circ$ to bright-white at $\eta_{\max} = +45^\circ$. [Frames (a), (b), (c); axes $x/\lambda$ and $y/\lambda$, each from −150 to +150.]


References for the Introduction

1 M. Mansuripur, The Physical Principles of Magneto-optical Recording, Cambridge University Press, UK, 1995.
2 M. Mansuripur, Distribution of light at and near the focus of high numerical aperture objectives, J. Opt. Soc. Am. A 3, 2086 (1986).
3 M. Mansuripur, Certain computational aspects of vector diffraction problems, J. Opt. Soc. Am. A 6, 786 (1989). See also the erratum in J. Opt. Soc. Am. A 10, 382–383 (1993).
4 M. Mansuripur, Analysis of multilayer thin film structures containing magneto-optic and anisotropic media at oblique incidence using 2 × 2 matrices, J. Appl. Phys. 67, 6466–6475 (1990).
5 M. Mansuripur, G. A. N. Connell, and J. W. Goodman, Laser-induced local heating of multilayers, Appl. Opt. 21, 1106 (1982).
6 M. Mansuripur and G. A. N. Connell, Laser induced local heating of moving multilayer media, Appl. Opt. 22, 666 (1983).
7 Lifeng Li, Multilayer-coated diffraction gratings: differential method of Chandezon et al. revisited, J. Opt. Soc. Am. A 11, 2816–2828 (1994).

1 Abbe’s sine condition

Ernst Abbe (1840–1905), professor of physics and mathematics and director of the astronomical observatory at Jena, was also the research director of the Zeiss optical works. In 1868 he invented the apochromatic lens, thus eliminating the primary and secondary color distortion in microscopes. Abbe developed a clear theoretical understanding of limits to resolution and magnification in optical image-forming systems and discovered the sine condition for a lens to form a sharp image without the defects of coma and spherical aberration. (Jena Review, 1965, Zeiss Archive, Courtesy AIP Emilio Segrè Visual Archives.)

Ernst Abbe (1840–1905), professor of physics and mathematics at the University of Jena, Germany, and major partner in the Carl Zeiss company, made important contributions to the theory and practice of optical microscopy.¹ His compound microscope was a superb optical design based on a theoretical understanding of diffraction and minimization of the effects of aberrations.² Abbe enunciated his famous sine condition regarding the axial point in the object plane of a centered image-forming system such as a microscope or a telescope. When this condition is satisfied, “aberration-free” imaging of the object points located in the vicinity of the optical axis is assured.¹⁻⁶ This chapter provides a heuristic description of the sine condition, which, in the words of Conrady, is “one of the most remarkable and labor-saving theorems in the whole realm of applied optics”.⁷

As the chapter follows a rather unconventional approach towards explaining the sine condition, it is worthwhile to highlight its main features at the outset. An introduction of the necessary geometric-optical concepts provides the basis for defining the sine condition. This is followed by establishing, for an axial object point, a one-to-one mapping between the principal planes of the imaging system. The wavefront entering the system at the first principal plane (p.p.) is thus related to that emerging from the second p.p. To describe the imaging of near-axis regions, we switch to a wave-optical viewpoint. Assuming that the axial object point is shifted to a nearby off-axis location, we derive the spatial phase modulation imparted to the emergent wavefront in consequence of this small shift. By then it should be apparent that aberration-free imaging of the off-axis point requires this spatial phase modulation to be linear in a certain coordinate system and that Abbe’s sine condition is both necessary and sufficient to guarantee this linearity.

A lens that violates the sine condition

To appreciate the significance of Abbe’s sine condition consider the plano-convex lens shown in Figure 1.1. A collimated beam of light propagating along the optical axis Z enters the flat facet of this lens and, upon exiting the second, hyperboloidal, surface, converges toward the focal point. The conic constant of the second surface is chosen to bring the beam to a perfect (i.e., diffraction-limited) focus at the rear focal plane of the lens.
The logarithmic plot of intensity distribution at the focal plane (see Figure 1.2(a)) reveals the focused spot to be the well-known Airy pattern for this 0.75NA lens. If the incident beam is tilted by a small amount, the focus shifts to an off-axis location but, more importantly, it acquires a significant amount of coma (see Figure 1.2(b)). Thus it is clear that a lens that works well for an axial object point is not necessarily suitable for the imaging of near-axis regions. The sine condition is intended to alleviate this problem. For comparison with a case to be described later, Figure 1.2(c) shows the phase distribution of the oblique beam at the front facet of the plano-convex lens; similarly Figure 1.2(d) shows the phase distribution of the emergent beam (minus the curvature) at the second p.p. Note that the clear aperture at the second p.p. is reduced in size and that the emergent phase pattern is “compressed” toward the optical axis in a nonlinear fashion. As we shall see below, the emergent phase pattern is quite different for a lens that does satisfy the sine condition.

Figure 1.1 A plano-convex lens brings a collimated beam to perfect focus on an axial point. The lens is designed for λ = 633 nm; it has a 4 mm diameter clear aperture, a focal length of 1.1133 mm, and a numerical aperture of NA = 0.75. The refractive index of the lens glass is n = 2.5, its thickness at the center is 1 mm, and its hyperboloidal surface has radius of curvature Rc = 1.67 mm and conic constant k = −n² = −6.25. The second principal plane of this lens is tangent to its curved surface at the apex. Both surfaces of the lens are assumed to be antireflection coated.

Geometric-optical concepts

The sine condition applies to a centered optical system designed for “aberration-free” imaging of a small patch within the object plane to a corresponding patch within the image plane (see Figure 1.3). The imaging system is intended for a given pair of conjugate planes, so that the distance z0 between the object and the first p.p. of the system is fixed, as is the distance z1 between the image and the second p.p. The lens formula 1/z0 + 1/z1 = 1/f, where f is the focal length of the system, applies here.⁵ Throughout this chapter, attention is confined to systems where both the object and image are in air; extension of the results to situations where the object space and image space have differing refractive indices (e.g., immersion-oil microscopy) is straightforward but is not discussed.⁴,⁵

In the present context, “aberration-free” imaging means that a cone of light emanating from any point (x0, y0) in the small patch within the object plane, when captured by the optical system, is turned into a convergent cone that, to a first approximation in the relevant parameters, comes to focus at (x1, y1) in the image

plane (see Figure 1.4).⁴,⁵ The point (x1, y1) is conjugate to (i.e., the Gaussian image of) the point (x0, y0). Since the system is circularly symmetric around the optical axis, the axial point at the center of the object plane is imaged to the axial point at the center of the image plane. Denoting the distance between (x0, y0) and the origin of the object plane by d0 and, similarly, the distance between (x1, y1) and the origin of the image plane by d1, the transverse magnification m of the system is d1/d0. It is not difficult to show that m is also equal to z1/z0 (see Figure 1.3).

Figure 1.2 (a) Logarithmic plot of intensity distribution at the focal plane of the plano-convex lens of Figure 1.1 for a circularly polarized, collimated beam traveling along the optical axis. (b) Same as (a) but for an obliquely incident beam traveling at 0.076° relative to the optical axis. (c) Distribution of phase for the oblique beam entering the lens at its flat surface. The gray-scale covers the interval from −180° (black) to +180° (white). (d) Distribution of phase for the oblique beam emerging from the lens at its second p.p. [Vertical axis y/λ: −7.5 to 7.5 in (a) and (b); −3250 to 3250 in (c) and (d).]

Figure 1.3 A small planar object in the vicinity of the optical axis in the X0Y0-plane is imaged onto a small region of the X1Y1-plane. The principal planes of the imaging system are also shown. The object and image planes are assumed to be in air, so that the refractive indices of both the object space and the image space may be set to unity.

Figure 1.4 The cone of light emanating from an off-axis object point (x0, y0) is captured by the imaging system and brought to focus at the corresponding image point (x1, y1). Note that beyond the paraxial regime the rays entering the first p.p. at a given height do not necessarily emerge from the second p.p. at the same height.

Principal planes

The concept of the principal planes is rooted in paraxial ray-tracing (i.e., Gaussian optics), where the angles between the rays and the optical axis are so small that the sine and the tangent of each angle can be approximated by the value of the angle itself, sin θ ≈ tan θ ≈ θ. In the neighborhood of the optical axis, therefore, the entire system may be represented by a 2 × 2 matrix, and the principal planes are uniquely determined from this so-called ABCD matrix of the system.⁵
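As an illustration of how the principal planes follow from the ABCD matrix, the sketch below traces the plano-convex lens of Figure 1.1 paraxially. It is a minimal example under stated assumptions, not a procedure taken from the text: the ray vector is taken as (height, angle), the helper names are hypothetical, and the curved surface is given a radius of −1.67 mm in the usual sign convention (center of curvature to the left of the vertex).

```python
# Paraxial (ABCD) model of the plano-convex lens of Figure 1.1.
# Assumed convention: ray vector = (height, angle); lengths in mm.

def matmul(p, q):
    """Product of two 2x2 matrices stored as nested tuples."""
    return (
        (p[0][0] * q[0][0] + p[0][1] * q[1][0], p[0][0] * q[0][1] + p[0][1] * q[1][1]),
        (p[1][0] * q[0][0] + p[1][1] * q[1][0], p[1][0] * q[0][1] + p[1][1] * q[1][1]),
    )

def refraction(n1, n2, radius=None):
    """Refraction from index n1 into n2; radius=None denotes a flat surface."""
    c = 0.0 if radius is None else (n1 - n2) / (radius * n2)
    return ((1.0, 0.0), (c, n1 / n2))

def translation(d):
    """Free propagation over a distance d inside a uniform medium."""
    return ((1.0, d), (0.0, 1.0))

def cardinal_points(m):
    """Focal length and principal-plane positions for a system in air."""
    (a, _b), (c, d) = m
    f = -1.0 / c
    first_pp = (d - 1.0) / c    # from the first surface, positive to the right
    second_pp = (1.0 - a) / c   # from the last surface, positive to the right
    return f, first_pp, second_pp

n_glass, thickness, rc = 2.5, 1.0, -1.67   # rc sign is an assumed convention
system = matmul(refraction(n_glass, 1.0, rc),
                matmul(translation(thickness), refraction(1.0, n_glass)))
f, pp1, pp2 = cardinal_points(system)
print(f"focal length: {f:.4f} mm")
print(f"first p.p.: {pp1:.4f} mm to the right of the flat surface")
print("second p.p. at the apex of the curved surface:", abs(pp2) < 1e-12)
```

With these inputs the paraxial focal length agrees with the 1.1133 mm quoted in the caption of Figure 1.1, and the second principal plane falls on the apex of the curved surface, as the caption states.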


The principal planes are conjugate planes with unit transverse magnification. A ray entering the first p.p. at a certain height h will emerge from the second p.p. at the same height, as shown in Figure 1.5(a). Thus h ≈ z0θ0 ≈ z1θ1, where θ0 and θ1 are the angles of the incident and the emergent rays with the optical axis. Note that, within the framework of the paraxial approximation, the system’s entrance aperture at the first p.p. is identical in size and shape to the exit aperture located at the second p.p. (The term aperture as used here should not be confused with pupil, which has a more specific meaning in geometrical optics. The entrance and exit pupils also define the boundaries of the cones of light that enter and exit the system, but the pupils are not necessarily located at the principal planes.)

Beyond the paraxial regime, the principal planes cease to be conjugate planes. Depending on its direction, a ray entering the first p.p. at a given height h might emerge from different locations on the second p.p. One might confine attention to a specific set of rays, such as those emanating from the axial point in the object plane, in order to fix the directions of rays that enter the system. Yet there is no guarantee that the height h of a ray on entering the first p.p. will remain the same when it emerges from the second p.p. Of course one can impose this as a requirement on the system, but many other possibilities exist that are equally plausible, as long as they conform to the constraints of the paraxial regime. Abbe’s sine condition is one such requirement placed on the heights of the entering and emerging rays.

Figure 1.5 (a) In the paraxial regime the height h of a ray is measured from the optical axis in the principal planes. (b) In systems that operate beyond the paraxial regime one may define the ray height at the point where the ray crosses a reference sphere. When a system satisfies Abbe’s sine condition the height of a ray thus defined remains the same upon entering and exiting the system.

The sine condition

Let us define two spherical surfaces, one in the object space, centered on the axial object point and tangent to the first p.p., and the other in the image space, centered on the axial image point and tangent to the second p.p. (see Figure 1.5(b)). Instead of assigning heights to the rays in the principal planes, the heights are assigned at the points where the rays cross these spherical surfaces. Thus, upon entering the system, h = z0 sin θ0. (If the height were assigned at the principal plane, the above expression would be written with tangent instead of sine.) Abbe’s sine condition requires that all rays emanating from the axial object point within the incident cone must emerge in the image space, where they form a converging cone toward the axial image point, at the same height at which they entered the system.⁴

As long as the rays are close to the optical axis (where the spheres are tangent to the principal planes), the tangent and sine of a given angle are nearly the same. Thus Abbe’s sine condition is consistent with the fact that, in the paraxial regime, the principal planes are unit-magnification conjugate planes. For the rays beyond the paraxial region, sin θ deviates from tan θ and the height of a given ray at the entrance sphere is no longer the same as its height at the first p.p. (Similarly, the height of an emergent ray at the exit sphere differs from its height at the second p.p.) In a sense, therefore, the sine condition requires the bending of the principal planes into spheres to preserve the paraxial property that a ray entering the system at a given height emerges from the system at the same height.
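The two ways of matching ray heights can be compared numerically. The following sketch is illustrative only: the conjugate distances z0 and z1 are hypothetical values, and the function names are not from the text. It computes the emergent angle θ1 when heights are matched on the reference spheres (z0 sin θ0 = z1 sin θ1) and when they are matched on the principal planes (z0 tan θ0 = z1 tan θ1).

```python
import math

def emergent_angle_sine(theta0, z0, z1):
    """Heights matched on the reference spheres: z0*sin(theta0) = z1*sin(theta1)."""
    return math.asin(z0 * math.sin(theta0) / z1)

def emergent_angle_tangent(theta0, z0, z1):
    """Heights matched on the principal planes: z0*tan(theta0) = z1*tan(theta1)."""
    return math.atan(z0 * math.tan(theta0) / z1)

z0, z1 = 10.0, 20.0   # hypothetical conjugate distances (same length unit)
for deg in (1, 10, 30, 60):
    theta0 = math.radians(deg)
    t_sine = math.degrees(emergent_angle_sine(theta0, z0, z1))
    t_tan = math.degrees(emergent_angle_tangent(theta0, z0, z1))
    print(f"theta0 = {deg:2d} deg   sine rule: {t_sine:7.4f} deg   tangent rule: {t_tan:7.4f} deg")
```

Near the axis the two rules give practically the same emergent angle, which is why both are compatible with the paraxial picture; at large cone angles they separate noticeably.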


Whereas in the paraxial regime the angular magnification is θ1/θ0 = 1/m, where m is the transverse magnification of the system, it is the ratio (sin θ1)/(sin θ0) that equals 1/m in a system satisfying the sine condition. This turns out to be of crucial significance for the image-forming system, as will be shown below. To emphasize the point, note that in the system of Figure 1.5(a), where the entering and emerging ray heights are equal at the principal planes, the ratio (tan θ1)/(tan θ0) equals 1/m, whereas in the system of Figure 1.5(b), which satisfies Abbe’s sine condition, the relevant ratio is (sin θ1)/(sin θ0).

Aplanatic system

A system that yields an aberration-free image of the axial object point and satisfies Abbe’s sine condition is said to be “aplanatic”.⁴,⁵ Many imaging systems in use today satisfy these conditions to a good approximation, if not exactly. Note that the clear-aperture diameter of an aplanatic system as seen on the first p.p. is no longer equal to that on the second p.p. If NA0 is the numerical aperture of the largest cone of light emanating from the axial object point and captured by the system, the aperture radius on the entrance sphere is z0NA0 whereas that on the first p.p. is z0 tan[sin⁻¹(NA0)]. Similarly, in the image space the aperture radius on the exit sphere is z1NA1 while that on the second p.p. is z1 tan[sin⁻¹(NA1)]. Abbe’s sine condition guarantees that z0NA0 = z1NA1 but, unless the imaging system has unit magnification, the aperture radii at the two principal planes are not equal.

What is surprising about the sine condition is that a requirement imposed solely on the cones of light corresponding to the on-axis points affects the quality of imaging for nearby off-axis points: once the sine condition has been satisfied, all near-axis points within the object plane will be imaged, essentially free of aberration, to their conjugates in the image plane.
Without the sine condition, however, images of the near-axis points would be degraded by aberrations, most prominently by coma. It is this surprising property of the sine condition that we shall elucidate further.

The wave-optical viewpoint

Having secured a one-to-one mapping between the distribution of light entering the first p.p. and that exiting the second p.p., for an axial object point, we now switch to the viewpoint of wave optics and consider the perturbation of the wavefront in response to a slight off-axis shift of the axial object point.

In the diffraction analysis of lenses conducted within the paraxial approximation, it is customary to assign to the second p.p. the same complex-amplitude distribution that exists on the first p.p. This distribution is then augmented by aberrations of the lens, if any, to account for deviations of the emergent wavefront from perfect sphericity.⁸ Thus if A1(x, y) represents the complex-amplitude distribution at the first p.p., the distribution at the second p.p. will be written

$$A_2(x, y) = A_1(x, y)\,\exp[i(2\pi/\lambda)W(x, y)]\,\exp\!\left[-i(2\pi/\lambda)\sqrt{x^2 + y^2 + f^2}\,\right]. \tag{1.1}$$

Here λ is the wavelength of the light, W(x, y) represents wavefront aberrations, and the second exponential factor corresponds to a perfect spherical wavefront converging toward the focal point in the image space.

For wide-aperture systems, Eq. (1.1) must be modified to account for deviations from the paraxial regime. For example, if the ray emerging from (x, y) in the second p.p. enters the first p.p. at (x′, y′), then A1(x, y) in Eq. (1.1) must be replaced by A1(x′, y′), and the Jacobian of the transformation between the two principal planes must be properly taken into account, to preserve the optical energy throughput of the system.

Strictly speaking, since in non-paraxial regions the principal planes are no longer conjugate planes, it follows that a one-to-one mapping between these planes is meaningless. In practice, however, the field of view of the lens is so small that a cone of light emanating from any point within the field of view is essentially the same as the axial cone in Figure 1.5(b), but endowed with some form of phase/amplitude modulation. Thus the correspondence between a pair of points such as (x′, y′) on the first p.p. and (x, y) on the second p.p., established for the axial cone, remains approximately valid for all object points. Any phase/amplitude perturbation affecting the beam at (x′, y′) can then be transferred directly to the beam at (x, y), and the resulting distribution within the second p.p. may be used as the initial distribution for further propagation through the image space.
(Many authors prefer to use the amplitude distribution over the spherical exit surface in Figure 1.5(b) as the initial distribution, without ever referring to the principal planes. If one is interested in diffraction analysis using a plane wave spectrum, however, one should start with initial conditions that are defined on a flat surface, in which case the second p.p. provides a natural frame of reference.)

Wavefront perturbation due to off-axis shift of the object point

The distribution of the complex amplitude at the first p.p. due to a cone of light emanating from the off-axis point (x0, y0) may be determined by reference to Figure 1.6. The distance from (x′, y′) to the off-axis point differs from that to the on-axis point by

$$\Delta l \approx x_0 S_{0x} + y_0 S_{0y}. \tag{1.2}$$

Figure 1.6 The ray leaving the off-axis point (x0, y0) and arriving at (x′, y′) will travel a slightly different distance than the ray from the axial point (0, 0) that travels along S0 toward the same location. When (x0, y0) is sufficiently close to the optical axis, the path-length difference between these two rays can be approximated by the projection on S0 of the line joining (x0, y0) to the point at the origin. The same argument applies to the conjugate rays in the image space.

To a first approximation, therefore, upon arrival at the first p.p. the cone of light that originates at (x0, y0) will be the same as that which originated from the axial point, albeit with a modulation by the following phase factor:

$$\exp[i(2\pi/\lambda)\Delta l] \approx \exp[i(2\pi/\lambda)(x_0 S_{0x} + y_0 S_{0y})]. \tag{1.3}$$

Note that the phase in Eq. (1.3) is linear in (S0x, S0y) but not in (x′, y′). The same phase factor will appear on the beam at (x, y) on the second p.p. Now, and this is the crux of the matter, if the sine condition is satisfied then this phase factor can be replaced by exp[i(2π/λ)(x1Sx + y1Sy)], because the angular magnification between (S0x, S0y) and (Sx, Sy) is exactly the reciprocal of the transverse magnification m between (x0, y0) and (x1, y1). The distribution at the second p.p. now corresponds to a spherical wavefront, converging toward (x1, y1) and having no aberrations whatsoever. This is the essence of the sine condition, which cannot be over-emphasized; it is the reason why there is “aberration-free” imaging of near-axial points.

A wide-aperture aplanat

As an example, consider an ideal infinite-conjugate aplanatic lens having z0 = ∞, NA0 = 0, z1 = f = 4000λ and NA1 = 0.75. The phase pattern of an obliquely incident plane wave at the first p.p. of this lens is shown in Figure 1.7(a). The beam has a linear phase over the entire entrance aperture, as expected of a plane wave at oblique incidence. Upon emerging from the second p.p. the phase pattern of the beam is that of Figure 1.7(b). In compliance with the sine condition the exit aperture is seen to be larger than the entrance aperture, and the phase pattern has undergone some sort of nonlinear “stretching”. (The emergent phase pattern in Figure 1.7(b), however, is nonlinear because it is displayed in the x, y coordinates; in the coordinates Sx, Sy it would be perfectly linear.) The emergent beam comes to focus at the focal plane of the lens, creating the off-axis Airy pattern shown in Figure 1.7(c). For comparison, the on-axis focused spot of the same lens is also shown in the figure. As expected, the off-axis spot is free from aberrations, and the two spots are essentially identical.

It is not difficult to design an aplanat with the characteristics of the lens in the above example; a specific design is shown in Figure 1.8. The various parameters of this meniscus, which consists of two conic surfaces, are listed in the figure caption.

Offense against the sine condition

Let us now examine the special case of a lens in which the ray heights have been made equal at the principal planes. Here (x, y) = (x′, y′), and the difference between the actual and the ideal (i.e., aberration-free) emergent wavefronts will be

$$W(x, y) = (x_0 S_{0x} + y_0 S_{0y}) - (x_1 S_x + y_1 S_y). \tag{1.4}$$

Note that Sx and Sy are proportional to sin θ, but in the present case it is tan θ that is magnified by 1/m. A Taylor series expansion yields

$$\tan\theta = \sin\theta \big/ \sqrt{1 - \sin^2\theta} = \sin\theta + \tfrac{1}{2}\sin^3\theta + \tfrac{3}{8}\sin^5\theta + \cdots. \tag{1.5}$$

To a first approximation, therefore, the difference between sin θ and tan θ is proportional to sin³θ. This difference, when inserted in Eq. (1.4), produces primary coma.
Thus when the rays that enter at a given height on the first p.p. emerge at the same height on the second p.p., perfect imaging of the axial point results in comatic imaging of the near-axis points. Similar arguments may be advanced for systems that violate the sine condition in ways other than described above. In general, offense against the sine condition results in primary and higher-order coma in near-axis regions of the image plane.
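The expansion of tan θ in powers of sin θ can be checked numerically. The sketch below is only a sanity check with hypothetical function names; it compares the exact value with the series truncated after the sin⁵θ term and shows that (tan θ − sin θ)/sin³θ approaches 1/2 for small angles, which is the cubic dependence associated with primary coma.

```python
import math

def tan_exact(s):
    """tan(theta) written in terms of s = sin(theta), valid for 0 <= s < 1."""
    return s / math.sqrt(1.0 - s * s)

def tan_series(s):
    """Series for tan(theta) in powers of s, truncated after the s**5 term."""
    return s + 0.5 * s**3 + 0.375 * s**5

for s in (0.1, 0.3, 0.5, 0.75):
    exact = tan_exact(s)
    print(f"sin = {s:4.2f}  tan = {exact:.6f}  series = {tan_series(s):.6f}  "
          f"(tan - sin)/sin^3 = {(exact - s) / s**3:.4f}")
```

At small numerical apertures the truncated series is essentially exact; toward NA = 0.75 the higher-order terms become visible, in line with the remark that offense against the sine condition produces higher-order coma as well.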

Figure 1.7 (a) Distribution of phase at the first p.p. of an infinite-conjugate lens having NA1 = 0.75 and f = 4000λ. The entrance aperture radius is 3000λ, and the incident beam propagates at θ = 0.076° relative to the optical axis. The gray-scale covers the interval from −180° (black) to +180° (white). (b) Distribution of phase at the second p.p. Since the lens satisfies Abbe’s sine condition the exit aperture radius is 4536λ. (c) Logarithmic plot of intensity distribution at the focal plane showing the axial focused spot (center) and the off-axis spot corresponding to an oblique incidence angle of θ = 0.076°. The spots are nearly identical; both are substantially free from aberrations. [Vertical axis y/λ: −4700 to 4700 in (a) and (b); −7.5 to 7.5 in (c), where the horizontal axis x/λ covers the same range.]
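The two aperture radii quoted in the caption of Figure 1.7 follow from the expressions given for an aplanat: z1NA1 on the reference sphere (equal, by the sine condition, to the entrance aperture radius of this infinite-conjugate lens) and z1 tan[sin⁻¹(NA1)] on the second p.p. A short check, with hypothetical function names and lengths in units of λ:

```python
import math

def radius_on_sphere(z, na):
    """Aperture radius measured on the reference sphere of radius z."""
    return z * na

def radius_on_principal_plane(z, na):
    """Aperture radius measured on the principal plane at distance z."""
    return z * math.tan(math.asin(na))

f, na1 = 4000.0, 0.75   # values from the wide-aperture aplanat example
print("entrance aperture radius:", round(radius_on_sphere(f, na1)))
print("exit aperture radius on the second p.p.:", round(radius_on_principal_plane(f, na1)))
```

The printed values reproduce the 3000λ and 4536λ of the caption.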

Figure 1.8 An aplanatic meniscus lens brings collimated beams to diffraction-limited focus within its focal plane in the vicinity of the optical axis. This 4 mm diameter lens has f = 2.6733 mm and NA = 0.75. The refractive index of the lens glass is n = 2.49486, its thickness at the center is 1 mm, and its conic surfaces have the following radii of curvature and conic constants: first surface, Rc = 2.26875 mm, k = 0.20945; second surface, Rc = 3.87493 mm, k = 0.08173. The second principal plane is 0.2894 mm to the right of the first surface’s vertex.

The image of a diffraction grating

An appealing argument in favor of the sine condition involves the image of a diffraction grating.⁹ Consider a small grating of period P placed perpendicular to the optical axis in the object plane of the system of Figure 1.3; the illumination is coherent, collimated, and monochromatic with wavelength λ. The nth diffraction order leaves the grating at the Bragg angle θn relative to the optical axis, where sin θn = nλ/P. In the image plane the grating period is mP, where m is the transverse magnification of the system, so the plane waves that interfere to form the image must arrive with direction sines nλ/(mP). Therefore, to obtain a distortion-free image it is necessary that all the sin θn be magnified by 1/m; in other words, the sine condition must be satisfied.

References for Chapter 1
1 E. Abbe, Jenaisch. Ges. Med. Naturw. (1879); also Carl. Repert. Phys. 16, 303 (1880).
2 C. Hockin, J. Roy. Micro. Soc. (2) 4, 337 (1884).
3 A. B. Porter, Phil. Mag. (6) 11, 154 (1906).
4 M. Born and E. Wolf, Principles of Optics, sixth edition, Pergamon Press, Oxford, 1980.
5 M. V. Klein, Optics, Wiley, New York, 1970.
6 J. M. Stone, Radiation and Optics, McGraw-Hill, New York, 1963.
7 A. E. Conrady, Applied Optics and Optical Design, Dover, New York, 1957.
8 J. W. Goodman, Introduction to Fourier Optics, McGraw-Hill, New York, 1968.
9 Douglas Goodman, private communication.


2 Fourier optics

The classical theory of diffraction originated in the work of the French physicist Augustin Jean Fresnel, in the first quarter of the nineteenth century. Fresnel’s ideas were subsequently expanded and elaborated by, among others, William Rowan Hamilton, Gustav Kirchhoff, George Biddell Airy, John William Strutt (Lord Rayleigh), Ernst Abbe, and Arnold Sommerfeld, leading to a complete understanding of light in its wave aspects.¹ The Fourier-transform operation occurs naturally in any formulation of the theory of diffraction, giving rise to a body of literature that has come to be known as Fourier optics.² The prominence of Fourier transforms in physical optics is rooted in the fact that any spatial distribution of the complex amplitude of light can be considered a superposition of plane waves.³ (Plane waves, of course, are eigenfunctions of Maxwell’s equations for the propagation of electromagnetic fields through homogeneous media.¹,⁴)

Many students of Fourier optics are intimidated by the approximations involved in deriving its basic formulas, but it turns out that the majority of these approximations are in fact unnecessary: by starting from a plane-wave expansion of the light amplitude distribution, rather than the traditional Huygens’ principle,¹,²,⁴ one can readily arrive at the fundamental results of the classical theory either directly or after applying the stationary-phase approximation.¹,³ (For a detailed discussion of the stationary-phase method see the appendix to this chapter.) The goal of the present chapter is to show how decomposition into, and subsequent superposition of, plane waves can lead straightforwardly to the near-field (Fresnel) and far-field (Fraunhofer) formulas, to elucidation of the Fourier transforming properties of a lens, and to the essence of Abbe’s theory of image formation. Along the way, several numerical examples will demonstrate the utility of the derived formulas.


Augustin Jean Fresnel (1788–1827). His work in optics received scant public recognition during his lifetime, but Fresnel maintained that not even acclaim from distinguished colleagues could compare with the pleasure of discovering a theoretical truth or confirming a calculation experimentally. (Photo: Smithsonian Institution, courtesy of AIP Emilio Segrè Visual Archives.)

Jean Baptiste Joseph Fourier (1768–1830) began to work on the theory of heat around 1804 and by 1807 had completed a memoir, On the Propagation of Heat in Solid Bodies, in which periodic functions were expressed as the sum of an infinite series of sines and cosines. Lagrange and Laplace objected to Fourier’s expansion on the grounds that it lacked generality and rigor. Fourier’s treatise, The Analytical Theory of Heat, was not published until 1822. (Photo: Deutsches Museum, courtesy of AIP Emilio Segrè Visual Archives.)

Siméon Denis Poisson (1781–1840). In 1818, during the judging of Fresnel’s paper on diffraction at the Paris Academy, Poisson argued that the consequence of Fresnel’s theory was the absurdity that the center of the shadow of an opaque disk should be illuminated. This unexpected effect was subsequently observed. (Photo: courtesy of AIP Emilio Segrè Visual Archives.)

Joseph von Fraunhofer (1787–1826), German physicist who first studied the dark lines in the spectrum of the Sun. The first to use diffraction gratings, his work set the stage for the further development of spectroscopy. (Photo: Bavarian Academy of Sciences, courtesy of AIP Emilio Segrè Visual Archives.)

Sir George Biddell Airy (1801–1892) became Lucasian Professor of Mathematics at Cambridge only three years after graduating from Trinity College in 1823. He was Astronomer Royal from 1835 to 1881. Airy contributed to the understanding of the rainbow by studying the effects of diffraction from raindrops. (Photo: courtesy of AIP Emilio Segrè Visual Archives, E. Scott Barr Collection.)


Electromagnetic plane waves

A plane-wave solution of Maxwell’s equations in a homogeneous environment can be expressed as

$$\mathbf{a}(x, y, z) = \mathbf{A}_0 \exp[i(2\pi/\lambda)(x\sigma_x + y\sigma_y + z\sigma_z)]. \tag{2.1a}$$

Here λ is the wavelength of the light, A0 is a complex vector representing the magnitude and state of polarization of the E-field at the origin of the coordinate system, and σ = (σx, σy, σz) is a unit vector specifying the direction of propagation. In general, σz is related to σx and σy by

$$\sigma_z = (1 - \sigma_x^2 - \sigma_y^2)^{1/2}. \tag{2.1b}$$

On the one hand, σz will be real-valued if σx² + σy² ≤ 1, in which case the plane wave is said to be homogeneous or propagating. On the other hand, if σx² + σy² > 1 then σz becomes imaginary and the plane wave is called inhomogeneous or evanescent.

In scalar diffraction theory, the state of polarization of the light is ignored and A0 is treated as a complex constant. Furthermore, if the x, y, z coordinates are normalized by the wavelength λ, then this parameter disappears from all subsequent equations. Throughout this chapter, therefore, all lengths will be assumed to be normalized by λ; a propagation distance of 1000, for example, should be understood as a distance of 1000λ.

Expansion into plane waves

Consider the complex-amplitude distribution a(x, y, z = 0) in the XY-plane at z = 0. The Fourier transform of a(x, y, z = 0) is defined as

$$A(\sigma_x, \sigma_y) = \iint_{-\infty}^{\infty} a(x, y, z = 0)\,\exp[-i2\pi(x\sigma_x + y\sigma_y)]\,dx\,dy. \tag{2.2a}$$

Gustav Robert Kirchhoff (1824–1887), Professor of physics at Heidelberg, Breslau and Berlin. His discovery that a gas absorbs the same wavelengths that it emits when heated explained the numerous dark lines (Fraunhofer lines) in the Sun’s spectrum, marking the beginning of a new era in astronomy. Kirchhoff placed Fresnel’s ideas on a firm theoretical basis, formulating what is now referred to as the Fresnel–Kirchhoff diffraction theory. (Photo: courtesy of AIP Emilio Segrè Visual Archives, W. F. Meggers Collection.)


The inverse Fourier transform may therefore be written

$$a(x, y, z = 0) = \iint_{-\infty}^{\infty} A(\sigma_x, \sigma_y)\,\exp[i2\pi(x\sigma_x + y\sigma_y)]\,d\sigma_x\,d\sigma_y. \tag{2.2b}$$

Because Maxwell’s equations are linear, any superposition of plane waves within homogeneous linear media is also a solution of Maxwell’s equations. In general, the superposition of plane waves in Eq. (2.2b) contains both propagating and evanescent waves. At a distance z = z0 from the origin, the complex-amplitude distribution of the light is thus given by

$$a(x, y, z = z_0) = \iint_{-\infty}^{\infty} A(\sigma_x, \sigma_y)\,\exp[i2\pi(x\sigma_x + y\sigma_y + z_0\sigma_z)]\,d\sigma_x\,d\sigma_y. \tag{2.3}$$

Equation (2.3) is the fundamental formula of the classical theory of diffraction. It provides the following simple recipe for computing the distribution of the field at the plane z = z0 given the initial distribution at z = 0:
(i) compute the Fourier transform A(σx, σy) of the initial distribution;
(ii) multiply A(σx, σy) by the phase factor, which may be written as exp(i2πz0σz) = exp[i2πz0 √(1 − σx² − σy²)];
(iii) compute the inverse Fourier transform of the resulting function.
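The three steps can be sketched with NumPy’s FFT routines. This is a minimal illustration and not the code used for the figures in this chapter: the function name `propagate`, the mesh size, and the sample spacing `dx` are assumptions of the sketch, and all lengths are in units of λ, as in the text.

```python
import numpy as np

def propagate(a0, dx, z0):
    """Propagate a sampled field a0 (2D array, spacing dx) over a distance z0.

    All lengths are in units of the wavelength.
    """
    a0 = np.asarray(a0, dtype=complex)
    ny, nx = a0.shape
    # (i) Fourier transform of the initial distribution
    spectrum = np.fft.fft2(a0)
    # Direction cosines sampled on the FFT frequency grid
    sx = np.fft.fftfreq(nx, d=dx)
    sy = np.fft.fftfreq(ny, d=dx)
    sxx, syy = np.meshgrid(sx, sy)
    # sigma_z is real for propagating waves and positive-imaginary for
    # evanescent waves, which therefore decay with increasing z0
    sz = np.sqrt((1.0 - sxx**2 - syy**2).astype(complex))
    # (ii) multiply by the propagation phase factor
    spectrum = spectrum * np.exp(2j * np.pi * z0 * sz)
    # (iii) inverse Fourier transform
    return np.fft.ifft2(spectrum)

# Usage: a circular aperture of radius 3000 on a hypothetical 256 x 256 mesh
n, dx = 256, 32.0
x = (np.arange(n) - n // 2) * dx
xx, yy = np.meshgrid(x, x)
aperture = (xx**2 + yy**2 <= 3000.0**2).astype(complex)
intensity = np.abs(propagate(aperture, dx, 1.0e6)) ** 2
print(intensity.shape, float(intensity.max()))
```

Because the FFT treats the mesh as periodic, the window must be wide enough to contain the spreading beam and the sampling fine enough to resolve the phase factor of step (ii); the mesh chosen here is for demonstration only.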

The above recipe is applicable to many practical problems, without the need to introduce any approximations or simplifications. Some consequences of Eq. (2.3) are explored in the following examples.

Diffraction from a circular aperture

Figure 2.1 shows the computed intensity patterns at various distances from a circular aperture of radius $r_0 = 3000$ illuminated by a uniform plane wave. From (a) to (f) the assumed distances from the aperture are $z_0 = 0$, $0.5\times10^6$, $0.75\times10^6$, $1.0\times10^6$, $2.25\times10^6$, and $9.0\times10^6$. (These distances correspond to the Fresnel numbers¹ $N = r_0^2/z_0 = \infty$, 18, 12, 9, 4, and 1, respectively.) The computations were carried out by discretizing the initial distribution on a 512 × 512 mesh and then applying the fast Fourier transform (FFT) algorithm. On a modern personal computer the time needed for these calculations is less than a second.

[Figure 2.1: six intensity panels, (a) to (f); horizontal axis $x$ from $-4000$ to $4000$.]

Figure 2.1 Computed intensity patterns at various distances from a circular aperture of radius $r_0 = 3000$, illuminated by a uniform plane wave. The assumed distances from the aperture are (a) $z_0 = 0$, (b) $z_0 = 0.5\times10^6$, (c) $z_0 = 0.75\times10^6$, (d) $z_0 = 10^6$, (e) $z_0 = 2.25\times10^6$, (f) $z_0 = 9.0\times10^6$. Note that the center of the diffracted beam is dark in (b), (c) and (e), while it is bright in (d) and (f).

Diffraction-free beams

If the propagation phase factor in Eq. (2.3) happens to be a constant then it can be taken out of the integral, in which case, aside from a multiplicative phase factor, the distribution at $z = z_0$ becomes equal to that at $z = 0$. This occurs if the Fourier transform $A(\sigma_x,\sigma_y)$ of the initial distribution happens to be non-zero only over a circle of fixed radius in the Fourier plane, that is, if $A(\sigma_x,\sigma_y) = 0$ everywhere except where $\sigma_x^2 + \sigma_y^2 = \rho_0^2$. Under these circumstances, Eq. (2.3) yields

$$a(x, y, z=z_0) = \exp[i2\pi z_0(1-\rho_0^2)^{1/2}]\,a(x, y, z=0). \tag{2.4}$$

According to Eq. (2.4), any initial distribution that is confined to a circle of radius $\rho_0$ in the Fourier domain will not diffract while propagating along the Z-axis.⁵ A particularly simple case occurs when $A(\sigma_x,\sigma_y) = \delta(\rho-\rho_0)$, where $\delta(\cdot)$ is Dirac's delta function and $\rho = (\sigma_x^2+\sigma_y^2)^{1/2}$. The inverse transform of this delta function is, apart from a constant factor, a zeroth-order Bessel function of the first kind, namely, $a(x, y, z=0) = J_0(2\pi\rho_0 r)$, where $r = (x^2+y^2)^{1/2}$. Needless to say, any azimuthal variation of the amplitude and/or phase of the above delta function around the circle of radius $\rho_0$ in the Fourier domain yields another non-diffracting beam. Moreover, if the radius $\rho_0$ is less than unity then the non-diffracting beam will be a propagating beam, whereas $\rho_0 > 1$ corresponds to an exponentially attenuating, non-diffracting, evanescent beam.
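The argument behind Eq. (2.4) can be checked by summing a finite number of plane waves whose transverse direction cosines lie on a circle. The sketch below is only an illustration under that discretization: the function name `ring_field`, the number of waves, and the sample values are assumptions, not part of the text.

```python
import cmath
import math

def ring_field(x, y, z, rho0, n_waves=720, weights=None):
    """Average of plane waves whose (sigma_x, sigma_y) lie on a circle of radius rho0.

    weights(phi) optionally sets an azimuthal amplitude/phase variation.
    Lengths are in units of the wavelength.
    """
    # Every wave on the ring shares the same sigma_z (imaginary if rho0 > 1).
    sigma_z = cmath.sqrt(1.0 - rho0**2)
    total = 0j
    for k in range(n_waves):
        phi = 2.0 * math.pi * k / n_waves
        sx, sy = rho0 * math.cos(phi), rho0 * math.sin(phi)
        w = 1.0 if weights is None else weights(phi)
        total += w * cmath.exp(2j * math.pi * (x * sx + y * sy + z * sigma_z))
    return total / n_waves

# Usage: compare the modulus at z = 0 and z = 1.0e6 at an off-axis point.
a_start = ring_field(300.0, -120.0, 0.0, 0.001)
a_far = ring_field(300.0, -120.0, 1.0e6, 0.001)
print(abs(a_start), abs(a_far))
```

Since the factor $\exp(i2\pi z\sigma_z)$ is common to every term, the two moduli agree to rounding error, with or without an azimuthal weight; for $\rho_0 > 1$ the same factor becomes a decaying exponential.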


Poisson's bright spot

A bright spot appearing at the center of the geometrical shadow of an opaque disk was first predicted by S. D. Poisson in an attempt to refute Fresnel's theory of diffraction. Fresnel's theory was vindicated, however, when François Arago confirmed the existence of the bright spot in an experiment.¹,⁴,⁶ The diagram in Figure 2.2 shows a collimated beam blocked at the center by a disk of radius $r_0$. The complex-amplitude distribution immediately after the disk is denoted by $a(x, y, z=0)$. The volume under the Fourier transform $A(\sigma_x,\sigma_y)$ of this distribution over the $\sigma_x\sigma_y$-plane is zero, because the central value $a(0, 0, 0)$ of the initial distribution is zero. However, the volume under the Fourier transform of the beam's cross-section at $z = z_0$ is not zero, because $A(\sigma_x,\sigma_y)$ is multiplied by the phase factor $\exp[i2\pi z_0(1-\rho^2)^{1/2}]$, which changes the phase of the Fourier transform as a function of the radius $\rho$ in the $\sigma_x\sigma_y$-plane. A non-zero volume in the Fourier domain implies that the central value of the distribution $a(x=0, y=0, z=z_0)$ is also non-zero, that is, the center of the distribution at $z = z_0$ is no longer dark. For a disk of radius $r_0 = 2500$, Figure 2.3 shows the computed intensity distributions (a) immediately after the disk, (b) at $z_0 = 2.0\times10^6$, and (c) at $z_0 = 4.0\times10^6$.

[Figure 2.2: schematic with axes X, Y, Z; opaque disk of radius $r_0$; observation plane at distance $z_0$.]

Figure 2.2 A collimated beam illuminates an opaque circular disk of radius $r_0$. At a distance $z_0$ from the disk the intensity distribution in the XY-plane contains a bright spot at the center of the geometrical shadow of the disk.

[Figure 2.3: three intensity panels, (a) to (c); horizontal axis $x$ from $-5000$ to $5000$.]

Figure 2.3 Computed intensity patterns at various distances from an opaque circular disk of radius $r_0 = 2500$, illuminated by a collimated Gaussian beam having a 1/e (amplitude) radius of 5000. The distances from the disk are (a) $z_0 = 0$, (b) $z_0 = 2.0\times10^6$, (c) $z_0 = 4.0\times10^6$.

Poisson's bright spot may be considered as the focus of a collimated beam produced by an opaque disk. The disk, therefore, behaves as a lens, albeit a dark one;⁶ an illuminated object placed before the disk forms an image through the dark lens, as shown in Figure 2.4. In this particular example, the object, shown in Figure 2.4(a), is a circular aperture partially covered by four small obstacles. The object is back-illuminated incoherently, by an extended quasi-monochromatic source, through a 0.005NA condenser lens. A dark lens of radius $r_0 = 2500$ at a distance of $10^6$ from the object produces the real image shown in Figure 2.4(b) at a distance of $2.0\times10^6$ behind the dark lens. We mention in passing that the incoherence of the illumination is essential for the success of this imaging process; interference effects totally obscure the image when the object is coherently illuminated.
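The non-zero central value behind an opaque disk can be illustrated with a short on-axis calculation. This sketch is not the FFT computation used for Figure 2.3: it assumes the paraxial (Fresnel) form of the propagation kernel, $\exp(i2\pi z_0)\exp(i\pi r^2/z_0)/(iz_0)$, together with the Gaussian illumination quoted in the caption of Figure 2.3 (amplitude $\exp(-r^2/w^2)$ with $w = 5000$). The function names are hypothetical.

```python
import cmath
import math

def on_axis_closed_form(r0, w, z0):
    """On-axis amplitude behind an opaque disk of radius r0 (Fresnel approximation).

    Integral of exp(-c r^2) * 2*pi*r dr from r0 to infinity equals (pi/c) exp(-c r0^2),
    with c = 1/w^2 - i*pi/z0. Lengths are in units of the wavelength.
    """
    c = 1.0 / w**2 - 1j * math.pi / z0
    kernel = cmath.exp(2j * math.pi * z0) / (1j * z0)
    return kernel * (math.pi / c) * cmath.exp(-c * r0**2)

def on_axis_numeric(r0, w, z0, steps=200_000, u_max_over_w2=40.0):
    """Same quantity by midpoint-rule integration over u = r^2."""
    c = 1.0 / w**2 - 1j * math.pi / z0
    u_lo, u_hi = r0**2, u_max_over_w2 * w**2
    du = (u_hi - u_lo) / steps
    total = 0j
    for k in range(steps):
        total += cmath.exp(-c * (u_lo + (k + 0.5) * du))
    integral = math.pi * total * du  # 2*pi*r dr = pi du
    return cmath.exp(2j * math.pi * z0) / (1j * z0) * integral

# Usage: on-axis intensity at the two distances of Figure 2.3(b) and (c).
for z0 in (2.0e6, 4.0e6):
    print(z0, abs(on_axis_closed_form(2500.0, 5000.0, z0))**2)
```

Under these assumptions the on-axis intensity is clearly non-zero, of the order of $\exp(-2r_0^2/w^2)$ in units where the incident beam's peak intensity at the disk is 1, even though the field immediately behind the disk vanishes on the axis.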

[Figure 2.4: two panels; horizontal axis $x$ from $-1200$ to $1200$ in (a) and from $-2400$ to $2400$ in (b).]

Figure 2.4 Incoherent imaging by means of a dark lens. The object in (a) is illuminated by an extended quasi-monochromatic source through a 0.005NA condenser of focal length $f = 6.0\times10^5$. The source consists of 529 mutually incoherent point sources, imaged by the condenser at a distance of $\Delta z = 10^5$ before the object. The dark lens is an opaque circular disk of radius $r_0 = 2500$, placed a distance of $\Delta z = 10^6$ from the object. The image in (b) is computed at a distance of $z_0 = 2.0\times10^6$ behind the dark lens.

Distribution of light in the far field

As the value of $z_0$ increases, Eq. (2.3) becomes exceedingly difficult to compute, because the rapid oscillations of the exponential phase factor require dense sampling of the functions in the $\sigma_x\sigma_y$-plane. In this regime, however, the stationary-phase approximation¹ becomes applicable. For a fixed value of $(x, y, z_0)$, the exponent under the integral in Eq. (2.3) may be considered to be a function of $(\sigma_x,\sigma_y)$. This function has a single stationary point at $(\sigma_{x0},\sigma_{y0}) = (x, y)/(x^2+y^2+z_0^2)^{1/2}$. At all other points in the $\sigma_x\sigma_y$-plane the complex exponential oscillates so rapidly that the local integral effectively vanishes; only at the stationary point does the integral yield a non-zero value. At this point the exponent can be replaced by the first few terms in its Taylor-series expansion around the stationary point, namely,

$$x\sigma_x + y\sigma_y + z_0\sigma_z \approx (x^2+y^2+z_0^2)^{1/2}\Big[1 - \tfrac{1}{2}(1+x^2/z_0^2)(\sigma_x-\sigma_{x0})^2 - (xy/z_0^2)(\sigma_x-\sigma_{x0})(\sigma_y-\sigma_{y0}) - \tfrac{1}{2}(1+y^2/z_0^2)(\sigma_y-\sigma_{y0})^2\Big]. \tag{2.5}$$

The integral in Eq. (2.3) is then readily computed, without further approximations, yielding

$$a(x, y, z=z_0) \approx -\frac{i}{(x^2+y^2+z_0^2)^{1/2}}\,\exp[i2\pi(x^2+y^2+z_0^2)^{1/2}]\,\frac{A(\sigma_{x0},\sigma_{y0})}{[1+(x/z_0)^2+(y/z_0)^2]^{1/2}}. \tag{2.6}$$
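For readers who want to see where the stationary point and the coefficients of Eq. (2.5) come from, the exponent can be differentiated directly. The shorthand $g$ and $R$ below is introduced only for this worked step and is not the book's notation.

```latex
\begin{aligned}
g(\sigma_x,\sigma_y) &= x\sigma_x + y\sigma_y + z_0\sigma_z, \qquad
\sigma_z = (1-\sigma_x^2-\sigma_y^2)^{1/2}, \qquad
R = (x^2+y^2+z_0^2)^{1/2},\\
\frac{\partial g}{\partial\sigma_x} &= x - \frac{z_0\sigma_x}{\sigma_z} = 0, \quad
\frac{\partial g}{\partial\sigma_y} = y - \frac{z_0\sigma_y}{\sigma_z} = 0
\;\Rightarrow\; (\sigma_{x0},\sigma_{y0},\sigma_{z0}) = (x, y, z_0)/R,\\
g(\sigma_{x0},\sigma_{y0}) &= (x^2+y^2+z_0^2)/R = R,\\
\frac{\partial^2 g}{\partial\sigma_x^2}\bigg|_{0} &= -z_0\left(\frac{1}{\sigma_z}+\frac{\sigma_x^2}{\sigma_z^3}\right)\bigg|_{0} = -R\,(1+x^2/z_0^2), \qquad
\frac{\partial^2 g}{\partial\sigma_y^2}\bigg|_{0} = -R\,(1+y^2/z_0^2),\\
\frac{\partial^2 g}{\partial\sigma_x\,\partial\sigma_y}\bigg|_{0} &= -z_0\,\frac{\sigma_x\sigma_y}{\sigma_z^3}\bigg|_{0} = -R\,xy/z_0^2,\\
g_{\sigma_x\sigma_x}g_{\sigma_y\sigma_y} - g_{\sigma_x\sigma_y}^2 &= R^2\,(1+x^2/z_0^2+y^2/z_0^2) = R^4/z_0^2 .
\end{aligned}
```

Inserting these derivatives into a second-order Taylor expansion gives the bracket of Eq. (2.5), and the square root of the last line accounts for the obliquity denominator in Eq. (2.6).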

[Figure 2.5: schematic with axes X and Z; incident plane wave, object, direction vectors $\boldsymbol{\sigma}$ and $\boldsymbol{\sigma}_0$, angle $\theta$, observation distance $z_0$.]

Figure 2.5 A phase/amplitude object is illuminated by a plane wave propagating along the Z-axis. The diffracted beam is a superposition of plane waves of differing amplitudes, propagating along directions indicated by the unit vectors $\boldsymbol{\sigma}$. The far-field pattern appears at a sufficiently large distance $z_0$ from the object. Whether the field is observed on the XY-plane at $z = z_0$ or on the spherical surface of radius $z_0$ centered on the object, the $x$, $y$ coordinates of a given point are the usual coordinates measured along the X- and Y-axes. In either case, the far-field amplitude is proportional to the complex amplitude of the plane wave whose propagation direction $\boldsymbol{\sigma}_0$ is directly aimed at the observation point.

This is the so-called Fraunhofer (or far-field) distribution arising from the initial distribution $a(x, y, z=0)$. The far field is expressed in terms of the Fourier transform $A(\sigma_x,\sigma_y)$ of the initial distribution evaluated at $(\sigma_{x0},\sigma_{y0}) = (x, y)/(x^2+y^2+z_0^2)^{1/2}$. Note how the obliquity factor $\cos\theta = 1/[1+(x/z_0)^2+(y/z_0)^2]^{1/2}$ enters the above equation (see Figure 2.5). If the far field is observed on a spherical surface of radius $z_0$ centered on the object (see Figure 2.5) then the curvature phase factor becomes a constant and $(\sigma_{x0},\sigma_{y0})$ reduces to $(x/z_0, y/z_0)$, yielding the following simple formula for the far-field pattern on a spherical surface of radius $z_0$:

$$a(x, y, z) \approx -(i/z_0)\,\exp(i2\pi z_0)\,A(x/z_0, y/z_0)\,\cos\theta. \tag{2.7}$$

The conservation of optical power passing through any cross-section of the beam may be verified by integrating the squared modulus of the functions appearing in Eqs. (2.6) and (2.7) over their respective domains.
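As a worked instance of this check, consider Eq. (2.6) on the plane $z = z_0$. The steps below are a sketch that assumes $A(\sigma_x,\sigma_y)$ is negligible outside the unit circle (no evanescent content); the shorthand $R = (x^2+y^2+z_0^2)^{1/2}$ is introduced here for convenience and is not the book's notation.

```latex
\begin{aligned}
|a(x,y,z_0)|^2 &\approx \frac{|A(\sigma_{x0},\sigma_{y0})|^2}{R^2\,[1+(x/z_0)^2+(y/z_0)^2]}
 = \frac{z_0^2}{R^4}\,|A(\sigma_{x0},\sigma_{y0})|^2,\\
(\sigma_{x0},\sigma_{y0}) = (x,y)/R \;&\Rightarrow\;
\frac{\partial(\sigma_{x0},\sigma_{y0})}{\partial(x,y)}
 = \frac{(y^2+z_0^2)(x^2+z_0^2) - x^2y^2}{R^6} = \frac{z_0^2}{R^4},\\
\iint |a(x,y,z_0)|^2\,dx\,dy &\approx \iint_{\sigma_x^2+\sigma_y^2<1} |A(\sigma_x,\sigma_y)|^2\,d\sigma_x\,d\sigma_y
 = \iint |a(x,y,0)|^2\,dx\,dy .
\end{aligned}
```

The factor $z_0^2/R^4$ in the squared modulus is exactly the Jacobian of the change of variables, and the last equality is Parseval's theorem applied to the initial distribution.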


Far field of an annular aperture

To demonstrate the utility of Eq. (2.6), we use as the initial distribution the narrow ring of light transmitted through an annular aperture (of width 100 and average radius 1000), shown in Figure 2.6(a). After propagating a distance of $10^6$, the far-field pattern of Figure 2.6(b) is obtained. (To enhance the weak rings of this distribution, a gray-scale plot of the logarithm of intensity is displayed.) The far field is essentially a Bessel beam with a curvature phase factor. To eliminate the curvature, we use a 0.0075NA lens of focal length $f = 10^6$ to collimate the beam in the far field of the annular aperture. The emerging truncated and collimated Bessel beam at the exit pupil of the lens is shown in Figure 2.6(c). This beam is not completely diffraction-free because it has a finite diameter. For instance, after it has propagated a distance of $10^6$ from the exit pupil of the collimating lens one observes the intensity pattern of Figure 2.6(d). The intensity distribution after propagating another distance of $2\times10^6$ is shown in Figure 2.6(e). Finally, Figure 2.6(f) shows the intensity distribution observed after a further distance of $5\times10^6$. Note how the decay of this truncated Bessel beam starts from the outer rings and moves toward the center as the beam propagates.

[Figure 2.6: six panels, (a) to (f); horizontal axis $x$ from $-1500$ to $1500$ in (a) and from $-8000$ to $8000$ in (b) to (f).]

Figure 2.6 Logarithmic plots of intensity distribution at various cross-sections of a beam. (a) A transparent ring (radius 1000, width 100), illuminated with a collimated uniform beam propagating along the Z-axis. (b) Far-field pattern of the ring in the XY-plane at $z_0 = 10^6$. (c) The beam in (b) after collimation by a 0.0075NA lens of focal length $f = 10^6$. (d) The collimated beam in (c) after propagating in free space a distance of $10^6$. (e) The beam in (d) after propagating a distance of $2.0\times10^6$. (f) The beam in (e) after propagating a distance of $5.0\times10^6$.


The Airy pattern at the focal plane of a lens

Consider the infinite-conjugate aplanatic lens of focal length $f$ shown in Figure 2.7. (For a discussion of aplanatism see Chapter 1, Abbe's sine condition.) To determine the light amplitude distribution around the focal point F, we need the distribution in the second principal plane, which is given by

$$a(x, y, z=0) = a_0(x_1, y_1)\,\cos^{3/2}\theta\,\exp[-i2\pi(x^2+y^2+f^2)^{1/2}]. \tag{2.8}$$

The amplitude distribution at the entrance pupil (assumed to coincide with the 1st principal plane) is denoted by $a_0(x_1, y_1)$. The coordinates at the 1st and 2nd principal planes are related as follows: $(x_1, y_1) = (fx, fy)/(x^2+y^2+f^2)^{1/2}$. The corresponding infinitesimal areas in the two principal planes are in the ratio $\cos^3\theta$, where $\cos\theta = f/(x^2+y^2+f^2)^{1/2}$; the amplitude in Eq. (2.8) is therefore scaled by $\cos^{3/2}\theta$ to conserve optical power between the entrance and exit pupils. The exponential phase factor in Eq. (2.8) is the curvature imparted by a perfect lens to the emergent beam.

[Figure 2.7: schematic with axes X1, X, X2 and Z; incident beam, first and second principal planes (p.p.), aperture radius $r_a = f\,\mathrm{NA}$, angle $\theta$, focal point F, focal length $f$, observation distance $z_0$.]

Figure 2.7 A collimated beam of light enters an infinity-corrected, aplanatic lens of focal length $f$ and numerical aperture NA. The entrance and exit pupils are at the first and second principal planes. A ray entering at a height $(x_1, y_1)$ on the first principal plane appears at the same height on the spherical surface centered at the rear focal point F and tangent to the second principal plane. In the absence of aberrations, all emergent rays converge to the focal point F. The distribution in the XY-plane at $z = z_0$ is given by Eq. (2.11).

To determine, in accordance with Eq. (2.2a), the Fourier transform of the initial distribution given by Eq. (2.8), we invoke the stationary-phase approximation.¹ The exponent of the integrand under the Fourier integral may be expanded in a Taylor series around its stationary point, $(x_0, y_0) = -(f\sigma_x, f\sigma_y)/(1-\sigma_x^2-\sigma_y^2)^{1/2}$, yielding

$$x\sigma_x + y\sigma_y + (x^2+y^2+f^2)^{1/2} \approx (1-\sigma_x^2-\sigma_y^2)^{1/2}\Big\{ f + \frac{1}{f}\Big[\tfrac{1}{2}(1-\sigma_x^2)(x-x_0)^2 - \sigma_x\sigma_y(x-x_0)(y-y_0) + \tfrac{1}{2}(1-\sigma_y^2)(y-y_0)^2\Big]\Big\}. \tag{2.9}$$

Without any other approximations, the Fourier transform of the initial distribution is found to be

$$A(\sigma_x,\sigma_y) \approx -i f\,a_0(-f\sigma_x, -f\sigma_y)\,\exp[-i2\pi f(1-\sigma_x^2-\sigma_y^2)^{1/2}]\,/\,(1-\sigma_x^2-\sigma_y^2)^{1/4}. \tag{2.10}$$

When the above function is substituted in Eq. (2.3) we obtain

$$a(x_2, y_2, z=z_0) \approx -i f \iint \frac{a_0(-f\sigma_x, -f\sigma_y)}{(1-\sigma_x^2-\sigma_y^2)^{1/4}}\,\exp[i2\pi(z_0-f)(1-\sigma_x^2-\sigma_y^2)^{1/2}]\,\exp[i2\pi(x_2\sigma_x + y_2\sigma_y)]\,d\sigma_x\,d\sigma_y. \tag{2.11}$$

For a given distribution $a_0(x_1, y_1)$ at the entrance pupil, Eq. (2.11) gives the distribution at and near the focal plane of the aplanatic lens of Figure 2.7. If the final distribution is sought in the focal plane (i.e., $z_0 = f$) and if the factor $\cos^{1/2}\theta = (1-\sigma_x^2-\sigma_y^2)^{1/4}$ is ignored (i.e., the paraxial approximation), then the focal-plane distribution becomes simply the Fourier transform of the entrance-pupil distribution. For an aberration-free lens having a circular aperture of radius $r_a = f\,\mathrm{NA}$, and for a uniform incident beam, the focal-plane distribution is thus proportional to $J_1(2\pi\,\mathrm{NA}\,r)/r$, where $J_1(\cdot)$ is the first-order Bessel function of the first kind and $r = (x_2^2+y_2^2)^{1/2}$. This is known as the Airy pattern, a plot of which appears in Figure 2.8.
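The Airy amplitude is easy to evaluate without a special-function library. The following minimal sketch computes $J_1$ from Bessel's integral and locates the first zero of the pattern by bisection; the function names, the bracketing interval, and the step counts are illustrative choices, not taken from the text.

```python
import math

def bessel_j1(x, steps=2000):
    """J1(x) from Bessel's integral (1/pi) * int_0^pi cos(t - x sin t) dt, midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += math.cos(t - x * math.sin(t))
    return total * h / math.pi

def airy_amplitude(r, na=1.0):
    """J1(2*pi*NA*r) / (pi*NA*r), normalized to 1 at r = 0 (r in wavelengths)."""
    if r == 0.0:
        return 1.0
    u = math.pi * na * r
    return bessel_j1(2.0 * u) / u

def first_zero(na=1.0, lo=0.3, hi=0.9, tol=1e-10):
    """Bisection for the first zero of the Airy amplitude; the bracket scales as 1/NA."""
    lo, hi = lo / na, hi / na
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if airy_amplitude(lo, na) * airy_amplitude(mid, na) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Usage: radius of the first dark ring for NA = 1 and NA = 0.5.
print(first_zero(1.0), first_zero(0.5))
```

For NA = 1 the bisection should land close to the value $r \approx 0.61$ quoted in the caption of Figure 2.8, and the radius scales as 1/NA for other apertures.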

Fourier-transforming property of a lens

An infinity-corrected lens produces in its focal plane the Fourier transform $A_0(\sigma_x,\sigma_y)$ of a complex-amplitude distribution $a_0(x_1, y_1)$ placed before the lens. This behavior is readily understood if one recognizes that the input distribution is a superposition of plane waves, each propagating in a different direction. The lens captures these plane waves and brings them to focus within its focal plane. The amplitude of each focused spot is thus proportional to the corresponding plane-wave amplitude. The finite aperture of the lens spreads each focused spot into an Airy function, giving rise to a focal-plane distribution that is the convolution between the object's Fourier transform and the lens's Airy pattern.

[Figure 2.8: plot of $J_1(2\pi r)/(\pi r)$ (vertical axis, 0 to 1) versus $r$ (horizontal axis, 0 to 5); inset with $x$ and $y$ axes from $-5$ to $5$.]

Figure 2.8 Plot of the Airy function $J_1(2\pi r)/(\pi r)$ versus the radial distance $r$ from the focal point. The first zero of the Airy function is at $r \approx 0.61$. The inset shows a logarithmic plot of the intensity distribution at the focal plane of a 0.5NA diffraction-limited lens. This Airy pattern, being the result of a scalar calculation, shows circular symmetry. In practice, both unpolarized and circularly polarized incident beams produce circularly symmetric Airy patterns. However, for linearly polarized light the Airy pattern tends to be slightly elongated along the direction of the incident E-field.

To study in some detail the properties of an aplanatic, infinite-conjugate lens, consider Figure 2.9. Here a plane wave propagating at angle $\theta$ relative to the Z-axis enters the lens at its first principal plane. At the entrance pupil (which is assumed to coincide with the 1st principal plane) the ray heights are the same as those at the exit pupil, which is a spherical cap of radius $f$ centered at the focal point F. A ray entering at height $x_1$ has phase $2\pi x_1\sigma_x$, which it retains as it emerges from the exit pupil. The ray then acquires an additional phase in propagating from the exit pupil to the focus at $x_2 = f\sigma_x$. The total phase at this focus (relative to that at F) is thus given by

$$\phi(x_1,\sigma_x) = 2\pi\big\{x_1\sigma_x + [(x_1 - f\sigma_x)^2 + (f^2 - x_1^2)]^{1/2} - f\big\} = 2\pi f\big\{(x_1/f)\sigma_x + [1 + \sigma_x^2 - 2(x_1/f)\sigma_x]^{1/2} - 1\big\}. \tag{2.12}$$

Figure 2.9 Fourier transform lens having focal length $f$ and aperture radius $r_a = f\,\mathrm{NA}$. The incident plane wave makes an angle $\theta$ with the Z-axis in the XZ-plane, that is, $(\sigma_x,\sigma_y) = (\sin\theta, 0)$. The beam emerging from the lens converges to the point $(x_2, y_2) = (f\sin\theta, 0)$ within the focal plane. The height of a ray entering the lens at the first principal plane is the same as that of the emergent ray measured on a spherical surface of radius $f$ centered at the rear focal point F.

For small values of both $x_1/f$ and $\sigma_x$, the above expression may be approximated as $\phi(x_1,\sigma_x) \approx \pi f\sigma_x^2$, which is independent of $x_1$. The various rays of the plane wave, having thus acquired the common phase factor $\exp(i\pi f\sigma_x^2)$, converge to a common focus in the vicinity of the optical axis. Further away from the axis, of course, higher-order terms will cause aberration. Unless the lens is properly designed to correct these aberrations, the acceptable values of NA and $\sigma_x$ will indeed be very small. For example, Figure 2.10 shows plots of $\phi(x_1,\sigma_x) - \pi f\sigma_x^2$ versus $x_1/f$ for several values of $\sigma_x$, for a lens having NA = 0.05 and $f = 25\,000$. Note that to keep the maximum phase deviation at the edge of the pupil below 90° one must restrict the aperture radius to $r_a \le 0.05f$ and the values of $\sigma_x$ to the range within $\pm 0.055$. We conclude that, under appropriate conditions, a plane wave entering the lens at $\sigma_x = \sin\theta$ comes to diffraction-limited focus at $x_2 = f\sigma_x$, with a phase $\pi f\sigma_x^2 = \pi x_2^2/f$. Because of the finite aperture of the lens, the focused spot will be not a geometric point but an Airy pattern of diameter $\sim 1/\mathrm{NA}$. Therefore, for an object $a_0(x_1, y_1)$ at the entrance pupil the focal-plane distribution is related to the Fourier transform $A_0(\sigma_x,\sigma_y)$ of the object as follows:

$$a(x_2, y_2) \propto \exp[i\pi(x_2^2+y_2^2)/f]\;A_0(x_2/f, y_2/f) * \mathrm{Airy}(x_2, y_2). \tag{2.13}$$

[Figure 2.10: curves of $\phi(x_1,\sigma_x) - \pi f\sigma_x^2$ in degrees (vertical axis, $-100$ to $0$) versus $x_1/f$ (horizontal axis, $-0.05$ to $0.05$) for $\sigma_x = 0.01, 0.02, 0.03, 0.04, 0.05, 0.06$; $f = 25\,000$, NA = 0.05.]

Figure 2.10 Plots of $\phi(x_1,\sigma_x) - \pi f\sigma_x^2$ versus $x_1/f$ for several values of $\sigma_x = \sin\theta$ from 0.01 to 0.06 in the system of Figure 2.9. The function $\phi$ is given by Eq. (2.12), and the specific values of the lens parameters used in the calculations are NA = 0.05, $f = 25\,000$.

Needless to say, the range of $(x_2, y_2)$ in Eq. (2.13) is limited to the region for which the lens is properly designed to focus the incident plane waves into diffraction-limited spots. In the absence of aberrations, the angular resolution of such a lens is solely dependent on the lens-aperture radius $r_a$ and is given by $\Delta\sigma_x = \Delta\sigma_y \approx 0.61/r_a$. (Like all other spatial dimensions in this chapter, $r_a$ is assumed to be normalized by the wavelength $\lambda$ of the light.)

Similar considerations apply when the object is placed a distance $z_1$ before the first principal plane. In this case each plane wave leaving the object must travel a different distance to reach the entrance pupil. By the time it reaches the entrance pupil, a plane wave traveling along the direction $(\sigma_x,\sigma_y,\sigma_z)$ will have acquired a phase $2\pi z_1\sigma_z$, which, apart from a constant, may be approximated as $-\pi z_1(\sigma_x^2+\sigma_y^2)$. Under these circumstances, Eq. (2.13) remains valid provided the exponent of the first term on the right-hand side is multiplied by $(1 - z_1/f)$. In the special case where $z_1 = f$, the quadratic phase factor in Eq. (2.13) disappears altogether, leaving a simple Fourier-transform relation between the distributions in the front and rear focal planes. As an example, Figure 2.11 shows a phase/amplitude object placed in the front focal plane of a 0.05NA lens (see frames (a) and (b)), and the corresponding Fourier transform as observed in the rear focal plane (frames (c) and (d)).⁵

Figure 2.11 (a), (b) Intensity and phase distributions in the XY-plane for an object and (c), (d) for its Fourier transform. The object is in the front focal plane of a 0.05NA lens having $f = 10^5$, illuminated with a plane wave propagating along the Z-axis; the Fourier transform is observed in the rear focal plane. The intensity distribution in the Fourier-transform plane, (c), is displayed on a logarithmic scale to enhance its weak regions. The phase plots in (b) and (d) are encoded in gray-scale (black represents $-180°$, white represents $+180°$).

Abbe's theory of image-formation

Figure 2.12 is a diagram of the basic image-forming system. Both the entrance and exit pupils are assumed to be at the principal planes of the lens; in compliance with Abbe's sine condition, the pupils are spherical caps centered at the axial object and image points. The distance between the object and the first principal plane is $z_1$ and that between the second principal plane and the image is $z_2$. The lateral magnification of the system, therefore, is $M = -z_2/z_1$. A plane wave leaving the object at an angle $\theta$ relative to the Z-axis emerges from the exit pupil, each one of its rays having the same height and the same optical phase as at the entrance pupil. Confining attention to the two-dimensional XZ-plane, and denoting the direction cosine of a ray in the object space by $\sigma_{x1} = \sin\theta$, the ray height $x$ at the entrance pupil is found from simple geometry to be

$$x = x_1(1-\sigma_{x1}^2) + \sigma_{x1}\,[z_1^2 - x_1^2(1-\sigma_{x1}^2)]^{1/2}. \tag{2.14}$$

Figure 2.12 Diagram of a simple imaging system. The object and image distances from the respective principal planes are $z_1$ and $z_2$. The height of a ray entering the lens is measured on a spherical surface of radius $z_1$ centered at the axial object point. Similarly, the height of a ray exiting the system is measured on the spherical surface of radius $z_2$ centered on the axial image point. For any given ray, the entering and exiting heights are equal. Only one plane wave (leaving the object at an angle $\theta$) is shown. The various rays of this plane wave converge to a focus in the image space, then continue to propagate to the image plane.

A ray leaving the object at $x_1$ intersects the image at $x_2 = Mx_1$. Obviously, the ray fan reaching the image plane in Figure 2.12 is not a plane wave. However, it will be seen that this bundle of rays has a phase distribution that can be expressed as the sum of a linear term $\phi_1$ and a nearly quadratic term $\phi_2$. The linear term is identical with that of the plane wave leaving the object, namely,

$$\phi_1(x_2,\sigma_{x2}) = 2\pi x_2\sigma_{x2} = 2\pi x_1\sigma_{x1}. \tag{2.15}$$

Since $x_2$ is a version of $x_1$ magnified by a factor of $M$, $\sigma_{x2}$ must be a version of $\sigma_{x1}$ demagnified by a factor of $1/M$, so the above equality is exactly satisfied. Note in Figure 2.12 that although $\theta$ is the same for all the rays that leave the object within a given plane wave, the corresponding angle $\theta'$ in the image plane varies from ray to ray. Therefore $\sigma_{x2}$, which is defined here as $\sigma_{x1}/M$, equals $\sin\theta'$ only for the ray that goes through the center of the image at $x_2 = 0$.

The quadratic phase $\phi_2$ is acquired while covering the path from $x_1$ at the object plane to $x_2$ in the image plane. A ray leaving the object at $x_1$ enters the lens at a height $x$ given by Eq. (2.14), emerges from the exit pupil at the same height and with the same optical phase as at the entrance pupil, and then proceeds to $x_2$ in the image plane. The phase acquired in going from $x_1$ to $x_2$ relative to that at the image center may thus be written

$$\phi_2(x_1,\sigma_{x1}) = 2\pi\big\{(x - x_1)/\sigma_{x1} + [(x - x_2)^2 + (z_2^2 - x^2)]^{1/2} - (z_1 + z_2)\big\}. \tag{2.16}$$

Noting that $x_2 = Mx_1$, $\phi_2$ may also be considered a function of $x_2$ and $\sigma_{x1}$. Equation (2.16) yields a nearly quadratic phase factor in $x_2$, which may be plotted for different values of $\sigma_{x1} = \sin\theta$. Figure 2.13 shows a set of such plots within a field of view $|x_2| < 250$ for a system in which $z_1 = 10^4$ and $z_2 = 10^5$. Different curves correspond to different values of $\theta$. Note that if the slight differences between these curves are ignored (and the maximum difference in $\phi_2$ is only about 10° in the present example), then the quadratic phase factor $\phi_2$ is essentially independent of $\theta$. When $\phi_2$ as a function of $x_2$ is expanded in a Taylor series, the lowest-order term is found to be

$$\phi_2 \approx (\pi/z_2)\,[1 + (z_1/z_2) - (z_1/z_2)\sigma_{x1}^2]\,x_2^2. \tag{2.17}$$

[Figure 2.13: curves of $\phi_2(x_2,\sigma_{x1})$ in degrees (vertical axis, 0 to 120) versus $x_2$ (horizontal axis, $-250$ to $250$) for $\sigma_{x1} = 0.00, 0.25, 0.50, 0.75, 1.00$; $z_1 = 10^4$, $z_2 = 10^5$.]

Figure 2.13 Plots of the function $\phi_2(x_2,\sigma_{x1})$ versus $x_2$ for several values of $\sigma_{x1} = \sin\theta$ equal to (top to bottom) 0.00, 0.25, 0.50, 0.75, 1.00. (See Figure 2.12 and Eq. (2.16); $x_2$ is related to $x_1$ through $x_2 = Mx_1$.) The assumed system parameters are $z_1 = 10^4$, $z_2 = 10^5$. The field of view in the image plane is confined to the region $|x_2| < 250$.

If $z_2$ is sufficiently large, this quadratic phase can be ignored, yielding a plane-wave output for a plane-wave input. However, when $\phi_2$ is too large to be ignored its dependence on $\sigma_{x1}$ may be insignificant. This happens when the magnification $z_2/z_1$ is either very large or very small. The case $z_2/z_1 \gg 1$ is obvious when one considers the coefficient of $\sigma_{x1}^2$ in Eq. (2.17). In the case of large demagnification, $z_2/z_1 \ll 1$, the range of $\sigma_{x1}$ is limited to $|\sigma_{x1}| < z_2/z_1$, rendering $\phi_2$ essentially independent of $\sigma_{x1}$ once again.

The quadratic phase factor $\exp(i\phi_2)$, being more or less independent of $\theta$, can thus be factored out. This means that those plane waves that leave the object and manage to get through the lens to the image plane have the requisite uniform amplitude and linear phase expected of a plane wave. These plane waves, when superimposed upon each other, produce in the image plane a magnified (or demagnified) image of the object. Thus the differences between object and image are: (i) the image is multiplied by a nearly quadratic phase factor, $\exp(i\phi_2)$; (ii) the plane waves having a large angle $\theta$ miss the lens and, therefore, do not contribute to the image. This truncation by a circular aperture in the Fourier domain is equivalent to convolution with an Airy function in the image plane. The amplitude distribution in the image plane is thus given by

$$a_{\mathrm{image}}(x_2, y_2) = \exp(i\phi_2)\,[a_{\mathrm{object}}(x_2/M, y_2/M) * \mathrm{Airy}(x_2, y_2)]. \tag{2.18}$$

Figure 2.14 shows two examples of coherent imaging through a diffraction-limited lens. The object's intensity and phase are shown in Figures 2.14(a), (b). This object has several fine features which, being smaller than a wavelength, are below the resolution of any optical imaging system. A coherent and uniform beam propagating along the Z-axis illuminates the object. The entrance pupil of the imaging lens, located at $z_1 = 10^4$, is in the far field of the object. Figures 2.14(c), (d) show the intensity and phase patterns at the image plane of a 10×, 0.6NA lens. Similarly, Figures 2.14(e), (f) show the intensity and phase distributions in the image plane of a 10×, 0.95NA lens. The higher-NA lens, capturing more of the high-frequency Fourier components of the object, yields a superior image. Both lenses, however, fail to reproduce the very fine features of the object.

Figure 2.14 Distributions of intensity (left column) and phase (right column) at the object and image planes of a coherent imaging system. The phase plots are encoded in gray scale: black represents $-180°$, white represents $+180°$. (a), (b) Distributions in the plane of the object. (c), (d) Image obtained with a 10×, 0.6NA objective lens. (e), (f) Image obtained with a 10×, 0.95NA objective lens.

Appendix to Chapter 2: The stationary-phase approximation

Consider the two-dimensional integral

$$I = \iint f(x, y)\,\exp[i\eta g(x, y)]\,dx\,dy, \tag{A2.1}$$


where, in general, $f(x, y)$ is a complex function, $g(x, y)$ is a real function, $\eta$ is a large real number, and the domain of integration is a subset of the XY-plane. In the neighborhood of an arbitrary point $(x_0, y_0)$, within the domain of integration, small variations in $g(x, y)$ will be amplified by $\eta$; this will result in rapid oscillations of the phase factor $\exp[i\eta g(x, y)]$. Assuming that $f(x, y)$ in the neighborhood of $(x_0, y_0)$ is a slowly varying function, the oscillations result in a negligible contribution from this neighborhood to the integral. The main contributions to the integral then come from the regions in which $g(x, y)$ is nearly constant. These regions are in the vicinity of stationary points $(x_0, y_0)$, which are defined by the following relation:

$$\partial g(x, y)/\partial x = \partial g(x, y)/\partial y = 0. \tag{A2.2}$$

Around each stationary point one may expand $g(x, y)$ in a Taylor series up to the second-order term to obtain

$$g(x, y) \approx g(x_0, y_0) + \tfrac{1}{2}g_{xx}(x_0, y_0)(x-x_0)^2 + g_{xy}(x_0, y_0)(x-x_0)(y-y_0) + \tfrac{1}{2}g_{yy}(x_0, y_0)(y-y_0)^2. \tag{A2.3}$$

Replacing the expression for $g(x, y)$ in Eq. (A2.1) with that in Eq. (A2.3), and taking $f(x, y)$ outside the integral, yields

$$I \approx \sum f(x_0, y_0)\,\exp[i\eta g(x_0, y_0)] \times \iint_{-\infty}^{\infty} \exp\big\{i(\eta/2)\,[g_{xx}(x-x_0)^2 + 2g_{xy}(x-x_0)(y-y_0) + g_{yy}(y-y_0)^2]\big\}\,dx\,dy, \tag{A2.4}$$

where the summation is over all stationary points $(x_0, y_0)$. Notice that the domain of integration is now extended to the entire plane, since the contribution to the integral from regions outside the immediate neighborhood of the stationary points is, in any event, negligible. The double integral in Eq. (A2.4) can be readily carried out, yielding

$$I \approx (2\pi i/\eta)\sum \nu\,|g_{xx}g_{yy} - g_{xy}^2|^{-1/2}\,\exp[i\eta g(x_0, y_0)]\,f(x_0, y_0), \tag{A2.5}$$

where the summation is again over all stationary points $(x_0, y_0)$ and the coefficient $\nu$ is given by

$$\nu = \begin{cases} +1 & \text{if } g_{xx}g_{yy} > g_{xy}^2 \text{ and } g_{xx} > 0,\\ -1 & \text{if } g_{xx}g_{yy} > g_{xy}^2 \text{ and } g_{xx} < 0,\\ -i & \text{if } g_{xx}g_{yy} < g_{xy}^2. \end{cases}$$

Equation (A2.5) is the final result of this appendix. If the numerical value of $g_{xx}g_{yy} - g_{xy}^2$ happens to be exactly zero at a particular stationary point or if a stationary point occurs on the boundary of the domain of integration in Eq. (A2.1) then Eq. (A2.5) no longer applies. In our analysis of diffraction problems, however, these special cases will not be encountered.

References for Chapter 2

1 M. Born and E. Wolf, Principles of Optics, sixth edition, Pergamon Press, Oxford, 1980.
2 J. W. Goodman, Introduction to Fourier Optics, second edition, McGraw-Hill, New York, 1996.
3 L. Mandel and E. Wolf, Optical Coherence and Quantum Optics, Cambridge University Press, UK, 1995.
4 M. V. Klein, Optics, Wiley, New York, 1970.
5 J. Durnin, J. J. Miceli, and J. H. Eberly, Diffraction-free beams, Phys. Rev. Lett. 58, 1499–1501 (1987).
6 F. A. Jenkins and H. E. White, Fundamentals of Optics, fourth edition, McGraw-Hill, New York, 1976.

3 Effect of polarization on diffraction in systems of high numerical aperture

The classical theory of diffraction, according to which the distribution of light at the focal plane of a lens is the Fourier transform of the distribution at its entrance pupil, is applicable to lenses of moderate numerical aperture (NA). The incident beam, of course, must be monochromatic and coherent, but its polarization state is irrelevant since the classical theory is a scalar theory (see Chapter 2, “Fourier optics”). If the incident beam happens to be a plane wave and the lens is free from aberrations then the focused spot will have the wellknown Airy pattern. When the incident beam is Gaussian the focused spot will also be Gaussian, since this particular profile is preserved under Fourier transformation. In general, arbitrary distributions of the incident beam, with or without aberrations and defocus, can be transformed numerically, using the fast Fourier transform (FFT) algorithm, to yield the distribution in the vicinity of the focus. There are two basic reasons for the applicability of the classical scalar theory to systems of moderate NA. The first is that bending of the rays by the focusing element(s) is fairly small, causing the electromagnetic field vectors (E and B) before and after the lens to have more or less the same orientations. A scalar amplitude assigned to each point on the emergent wavefront from a system having low to moderate values of NA is sufficient to describe its electromagnetic state, whereas in the high-NA regime one can no longer ignore the vectorial nature of light. The second reason for the success of the classical scalar theory (within its proper limits) is that a certain integral – that which represents the decomposition of a convergent wavefront into its planewave constituents – submits to evaluation by the method of stationary-phase approximation. The remaining integral – that which represents the superposition of plane waves arriving at the focal plane – is then calculated with the aid of Fourier transformation. 
When the stationary-phase technique fails, so does the classical scalar theory, as is evidenced, for instance, in systems of very
low numerical aperture: The well-known focal-shift phenomenon is but one manifestation of the failure of the stationary-phase approximation in very-low-NA systems [1]. In the stationary-phase approximation the plane-wave spectrum of the convergent beam at the exit pupil coincides with the light amplitude distribution at that pupil, thus enabling each geometric-optical ray to represent one plane wave of the spectrum, namely, that which propagates in the direction of the ray [2]. This correspondence between rays and plane waves, which is an important feature of many diffraction problems, is therefore understood to be a direct consequence of the stationary-phase approximation. Now, let $\theta$ be the angle between a converging ray in the image space and the optical axis at the focal point. Since the projection of the wave vector $\mathbf{k}$ onto the exit pupil has length $k\sin\theta$, whereas the intersection of the ray with the pupil occurs at a radius $r = f\tan\theta$, then in order to convert from light amplitude distribution to the corresponding plane-wave spectrum one must compress the distribution function at the exit pupil. Aside from a trivial scaling of the aperture’s radius by the focal length $f$, the radial compression must assign to $r = \sin\theta$ the value of the function at $r = \tan\theta$; this must be followed by proper normalization to preserve the integrated intensity. The compressed distribution is therefore confined to a disk of radius $\mathrm{NA} = \sin\theta_{\max}$, where $\theta_{\max}$ is the angle subtended by the rim of the exit pupil at the focal point. This scaling, compression, and normalization procedure is not merely justified on heuristic grounds but, as discussed in the preceding chapter, is a rigorous consequence of the stationary-phase approximation itself. For lenses of low to moderate numerical aperture (say, NA < 0.2) the difference between $\sin\theta$ and $\tan\theta$ is negligible, and the effects of compression can be ignored.
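To put a number on the statement that the difference between $\sin\theta$ and $\tan\theta$ is negligible at low NA, the short Python sketch below evaluates the relative mismatch $\tan\theta/\sin\theta - 1$ at the rim of the pupil. The function name and the sample NA values are our own choices, made only for illustration.

```python
import math


def compression_mismatch(na):
    """Return tan(theta)/sin(theta) - 1 at the pupil rim, where sin(theta) = NA."""
    theta = math.asin(na)
    return math.tan(theta) / math.sin(theta) - 1.0


for na in (0.1, 0.2, 0.5, 0.966):
    print(f"NA = {na:5.3f}: tan/sin - 1 = {compression_mismatch(na):.4f}")
```

At NA = 0.2 the mismatch at the rim is only about 2%, whereas at NA = 0.966 the uncompressed radius exceeds the compressed one by well over a factor of three, which is why the compression step cannot be skipped for high-NA lenses.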
For such low-NA lenses, the plane-wave spectrum at the exit pupil is usually the same as the incident distribution at the entrance pupil, modified only by the presence of aberrations. For lenses of high numerical aperture, however, it is necessary to obtain the exit-pupil distribution (from the knowledge of lens characteristics and the entrance-pupil distribution) before proceeding to the compression operation. Noteworthy in this respect is the aplanatic lens, which, by virtue of satisfying Abbe’s sine condition, guarantees that the compressed exit-pupil distribution is identical with the entrance-pupil distribution.

Bending of polarization vector

To account for polarization effects at high numerical aperture, one usually ignores transmission losses at the various surfaces of a lens, assuming that a ray goes through the system unattenuated but with its polarization vector bent in accordance with the known laws of refraction [2,3,4,5]. (The assumption of losslessness

Figure 3.1 Focusing of linearly polarized light by a high-NA lens, shown in perspective, causes bending of the polarization vectors. The amount and direction of bending depend on the coordinates of the ray.

is not necessary here, but it simplifies the problem by enabling the polarization state of individual rays at the exit pupil to be determined solely on the basis of their coordinates, without requiring detailed knowledge of the lens structure.) For a linearly polarized incident beam, Figure 3.1 shows the bending of the E-vector at two azimuthal positions. The ray at the top of the lens contributes both an X- and a Z-component to the distribution in the image space, whereas the ray in the YZ-plane contributes only an X-component. By the same token, rays intermediate between those shown here will contribute to the polarization along all three axes.

We present a simple treatment of polarization-related phenomena within the framework of the classical theory of diffraction. This will not be a rigorous treatment based on Maxwell’s equations; rather, it will be rooted in reasonable physical arguments based on the bending of rays (or plane waves) by prisms. Our approach to vector diffraction is in keeping with the spirit of diffraction theory; it is not exact as far as Maxwell’s equations are concerned but incorporates intuitive ideas about the propagation of electromagnetic waves.

With reference to Figure 3.2, consider a plane wave propagating along the unit vector $\boldsymbol{\sigma}_0 = (0, 0, 1)$, i.e., along the Z-axis, having linear polarization in the X-direction. Let a prism be placed in the path of this beam, with orientation such that the emerging beam would propagate in a direction specified by the unit vector $\boldsymbol{\sigma}_1 = (\sigma_x, \sigma_y, \sigma_z)$. Now, the incident polarization vector $\mathbf{E}_0 = (1, 0, 0)$ may be decomposed into two components: one, the so-called p-polarization, is in the plane of $\boldsymbol{\sigma}_0$ and $\boldsymbol{\sigma}_1$; the other, known as the s-polarization, is perpendicular to this plane. As the latter component (perpendicular to the $\boldsymbol{\sigma}_0\boldsymbol{\sigma}_1$-plane) emerges from the prism, it will have suffered no deviation in direction. The p-component,

Figure 3.2 Lossless refraction of a polarized plane wave by a prism. The original direction of propagation is $\boldsymbol{\sigma}_0 = (0, 0, 1)$ and the corresponding polarization vector is $\mathbf{E}_0$. After refraction, the beam assumes a new direction $\boldsymbol{\sigma}_1 = (\sigma_x, \sigma_y, \sigma_z)$, and its new polarization state becomes $\mathbf{E}_1$. The same geometry would apply for diffraction of the beam by a grating.

however, will have been reoriented such that it remains perpendicular to the emergent direction. If it is further assumed that no losses, due to surface reflections or otherwise, occur in this refraction process, one can use simple geometry to determine the emerging polarization direction. A similar calculation can be performed for an incident plane wave linearly polarized along the Y-axis. Details of these calculations are left to the reader, but the final results are listed in Table 3.1.

Notice that the reorientation of the polarization vector described in Table 3.1, while a consequence of the change in the direction of propagation, is independent of the particular mechanism responsible for that change. Given an initial direction $\boldsymbol{\sigma}_0$ and a direction for the emerging beam $\boldsymbol{\sigma}_1$, one can use Table 3.1 to identify the emergent components of polarization for an arbitrary state of incident polarization. In the stationary-phase approximation each ray is associated with a single plane wave, the three polarization components of which may be treated independently of each other. Therefore, for each of the components $E_x$, $E_y$, $E_z$ of the emergent beam, a single superposition integral (i.e., Fourier transform) yields the sought-after distribution in the focal plane.

Example

The technique described in the preceding section is quite general and can be applied to arbitrary incident distributions having arbitrary polarization states, while taking into account various lens aberrations (including substantial amounts of defocus). Computed results for an aberration-free, aplanatic lens having $\mathrm{NA} = \sin 75^\circ = 0.966$ and $f = 3000\lambda$ are shown in Figure 3.3. The assumed geometry in these calculations is that depicted in Figure 3.1, where the incident beam is a

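Table 3.1. Polarization $\mathbf{E}_1$ of a refracted beam when the incident polarization $\mathbf{E}_0$ is along the X- or Y-axis. The refraction (from $\boldsymbol{\sigma}_0$ to $\boldsymbol{\sigma}_1$) is lossless.

| Incident polarization, with $\boldsymbol{\sigma}_0 = (0, 0, 1)$ | Emergent polarization, with $\boldsymbol{\sigma}_1 = (\sigma_x, \sigma_y, \sigma_z)$ |
|---|---|
| $\mathbf{E}_0 = (1, 0, 0)$ | $\mathbf{E}_1 = \bigl(1 - \sigma_x^2/(1 + \sigma_z),\ -\sigma_x\sigma_y/(1 + \sigma_z),\ -\sigma_x\bigr)$ |
| $\mathbf{E}_0 = (0, 1, 0)$ | $\mathbf{E}_1 = \bigl(-\sigma_x\sigma_y/(1 + \sigma_z),\ 1 - \sigma_y^2/(1 + \sigma_z),\ -\sigma_y\bigr)$ |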
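The entries of Table 3.1 can be checked numerically. The Python sketch below is a minimal illustration under our own assumptions: the function names and the sample ray direction are hypothetical, and the signs used are those implied by a rigid rotation that leaves the s-component untouched and turns the p-component so that it stays perpendicular to the ray. It bends an arbitrary transverse field and verifies that the result is perpendicular to $\boldsymbol{\sigma}_1$ and has the same magnitude as the incident field.

```python
import math


def bend_polarization(e0x, e0y, sigma):
    """Emergent E-field for an incident field (e0x, e0y, 0) travelling along +Z,
    after lossless bending into the unit direction sigma = (sx, sy, sz)."""
    sx, sy, sz = sigma
    d = 1.0 + sz  # assumes sz > -1, i.e. the ray is not exactly reversed
    # Images of the X- and Y-polarized unit fields (the two rows of the table).
    ex_image = (1.0 - sx * sx / d, -sx * sy / d, -sx)
    ey_image = (-sx * sy / d, 1.0 - sy * sy / d, -sy)
    return tuple(e0x * u + e0y * v for u, v in zip(ex_image, ey_image))


def dot(u, v):
    """Real dot product of two 3-vectors."""
    return sum(p * q for p, q in zip(u, v))


# Hypothetical ray: polar angle 60 degrees, azimuth 30 degrees.
theta, phi = math.radians(60.0), math.radians(30.0)
sigma1 = (math.sin(theta) * math.cos(phi),
          math.sin(theta) * math.sin(phi),
          math.cos(theta))

e1 = bend_polarization(1.0, 0.0, sigma1)
print("E1 =", e1)
print("E1 . sigma1 =", dot(e1, sigma1))  # transversality: should vanish
print("|E1|^2 =", dot(e1, e1))           # losslessness: should equal 1
```

For a ray in the YZ-plane ($\sigma_x = 0$) the function returns the X-polarized field unchanged, in agreement with the discussion of Figure 3.1.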

Figure 3.3 Intensity profiles of the three components of polarization at the focal plane of an aplanatic lens ($\mathrm{NA} = 0.966$, $f = 3000\lambda$), illuminated with a linearly polarized plane wave. Panels: (a) $|E_x|^2$, (b) $|E_y|^2$, (c) $|E_z|^2$, each plotted over $x/\lambda$ and $y/\lambda$ from $-3$ to $+3$. For best viewing, the vertical scale is chosen differently in the three cases: the peak intensities in (a), (b), (c), corresponding to the X-, Y-, Z-components of polarization, are in the ratios 1.00 : 0.0081 : 0.192.

uniform plane wave with linear polarization along the X-axis. Frames (a)–(c) in Figure 3.3 are intensity plots for the X-, Y-, and Z-components of polarization in the focal plane; their peak intensities are in the ratio 1.00 : 0.0081 : 0.192. The corresponding gray-scale plots appear in Figure 3.4; frames (a)–(c) show the intensity distributions and frames (d)–(f) display their logarithmic counterparts. The observed four-fold symmetry of the Y-component and the two-fold symmetry

Figure 3.4 Gray-scale plots of intensity distribution at the focal plane of an aplanatic lens ($\mathrm{NA} = 0.966$, $f = 3000\lambda$), illuminated with a linearly polarized plane wave. Frames (a)–(c) show the intensity plots, while frames (d)–(f) display the logarithm of intensity. In each column the top frame represents the X-component of polarization, the middle frame corresponds to the Y-component, and the bottom frame to the Z-component. The horizontal axis, $x/\lambda$, spans $-2$ to $+2$ in the left column and $-3$ to $+3$ in the right column.

of the Z-component are consistent with one’s expectations based on ray-bending arguments. The contour plot in Figure 3.5 of the total E-field energy density, $|E_x|^2 + |E_y|^2 + |E_z|^2$, shows an elliptical profile, the ellipse having its major axis in the direction of the incident polarization. (Richards and Wolf [5] obtained the same result using a somewhat different formulation of the diffraction problem.) This result indicates a slight improvement in the resolution of a microscope or telescope that uses linearly polarized light, as long as the feature that needs to be resolved is oriented along the minor axis of the ellipse, namely, in the direction perpendicular to that of the incident polarization.

Figure 3.5 Contour plot representing the sum of the three intensity profiles shown in Figure 3.3, i.e., the total E-field energy density distribution in the focal plane of the aplanatic lens. Both axes, $x/\lambda$ and $y/\lambda$, span $-3$ to $+3$.

The computations reported here required no more than three seconds on a modern Pentium-based personal computer using a 512 × 512 square mesh.

References for Chapter 3

[1] V. N. Mahajan, Axial irradiance and optimum focusing of laser beams, Appl. Opt. 22, 3042–3053 (1983).
[2] J. J. Stamnes, Waves in Focal Regions, Adam Hilger, Bristol, 1986.
[3] M. Mansuripur, Certain computational aspects of vector diffraction problems, J. Opt. Soc. Am. A 6, 786–805 (1989).
[4] H. H. Hopkins, The Airy disk formula for systems of high relative aperture, Proc. Phys. Soc. London 55, 116–128 (1943).
[5] B. Richards and E. Wolf, Electromagnetic diffraction in optical systems: structure of the image field in an aplanatic system, Proc. Roy. Soc. Ser. A 253, 358–379 (1959).

4 Gaussian beam optics

A Gaussian beam is perhaps the simplest possible waveform that shows many of the effects of diffraction. Using Gaussian beams one can study diffraction in the near field and the far field, examine beam divergence upon propagation, investigate diffraction-limited focusing through a lens, observe the Gouy phase shift, and analyze many other interesting properties of electromagnetic waves. Although Gaussian beams have been thoroughly analyzed in the literature [1,2], it is worthwhile to examine them in the Fourier domain from a less well-known perspective. The need for the paraxial approximation (inherent in all treatments of Gaussian beams) becomes particularly clear when employing the Fourier method of analysis. There is also the issue of separability of the x- and y-dependences of the Gaussian beam profile (assuming propagation along the Z-axis), which is often assumed but not properly explained in the literature. It turns out that separability is neither necessary nor desirable and that the two-dimensional analysis of a nonseparable beam is quite straightforward. It must be emphasized that separability is not always achievable by rotating the coordinate axes. When the real and imaginary parts of the Gaussian exponent require different rotations to become separable, the x- and y-dependences remain entangled, thus necessitating a two-dimensional analysis.

Cross-sectional amplitude profile

For a generalized Gaussian beam propagating along the Z-axis, the complex amplitude distribution in the cross-sectional XY-plane is given by

$$\hat{a}(x, y, z = 0) = \hat{a}_0 \exp[-\pi(ax^2 + 2bxy + cy^2)]. \tag{4.1a}$$

Here the complex constant $\hat{a}_0$ is the amplitude at the origin of the coordinate system and the coefficients $a = a_1 + ia_2$, $b = b_1 + ib_2$, $c = c_1 + ic_2$ are fixed complex numbers. The only constraints on these parameters are $a_1 \ge 0$, $c_1 \ge 0$, and $a_1c_1 \ge b_1^2$, lest the amplitude diverges to infinity. The power content of the beam (i.e., the integrated intensity over the XY-plane) is readily found to be

$$P = \tfrac{1}{2}\,|\hat{a}_0|^2 \big/ \sqrt{a_1c_1 - b_1^2}. \tag{4.1b}$$

The real parts of the a, b, c parameters determine the profile of the beam’s magnitude in the XY-plane at $z = 0$, while their imaginary parts determine the beam’s phase profile. The contours of constant magnitude are ellipses oriented at $\theta_1$ relative to X, where $\tan 2\theta_1 = 2b_1/(a_1 - c_1)$; the major and minor diameters of these ellipses are proportional to $\{(a_1 + c_1) \mp [(a_1 - c_1)^2 + 4b_1^2]^{1/2}\}^{-1/2}$. The phase contours are ellipses or hyperbolas whose axes are oriented at $\theta_2$ relative to X, where $\tan 2\theta_2 = 2b_2/(a_2 - c_2)$. In general $\theta_1 \ne \theta_2$ and therefore coordinate rotations cannot separate the x- and y-dependences of the Gaussian beam profile. When $a_2c_2 > b_2^2$ the contours of constant phase are ellipses; otherwise, they are hyperbolas. Figure 4.1 shows two examples of amplitude and phase distributions for Gaussian beams having different sets of the a, b, c parameters.
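The quantities introduced in this section are easy to tabulate for a given beam. The following Python sketch is only an illustration: the function name and the sample values of a, b, c are our own assumptions (they are not the parameters of Figure 4.1), and the strict inequalities enforced in the code are chosen so that the power integral is finite.

```python
import math


def beam_summary(a, b, c, a0=1.0):
    """Power and contour geometry of a0 * exp[-pi (a x^2 + 2 b x y + c y^2)]."""
    a1, b1, c1 = a.real, b.real, c.real
    a2, b2, c2 = a.imag, b.imag, c.imag
    det1 = a1 * c1 - b1 * b1
    if a1 <= 0 or c1 <= 0 or det1 <= 0:
        raise ValueError("magnitude profile is not normalizable")
    power = 0.5 * abs(a0) ** 2 / math.sqrt(det1)  # Eq. (4.1b)
    # Principal-axis angles; with this atan2 branch theta1 is the narrow axis
    # of the magnitude ellipse (the direction of fastest decay).
    theta1 = 0.5 * math.atan2(2 * b1, a1 - c1)
    theta2 = 0.5 * math.atan2(2 * b2, a2 - c2)
    root = math.hypot(a1 - c1, 2 * b1)
    # Diameters scale as [(a1 + c1) -/+ root]^(-1/2): major first, then minor.
    diameters = ((a1 + c1 - root) ** -0.5, (a1 + c1 + root) ** -0.5)
    phase_shape = "ellipses" if a2 * c2 > b2 * b2 else "hyperbolas"
    return power, theta1, theta2, diameters, phase_shape


# Illustrative (assumed) parameters, in units where lengths are normalized by the wavelength.
power, theta1, theta2, diameters, shape = beam_summary(
    0.010 + 0.020j, 0.004 - 0.003j, 0.020 + 0.010j)
print(f"power = {power:.3f}")
print(f"theta1 = {math.degrees(theta1):.2f} deg, theta2 = {math.degrees(theta2):.2f} deg")
print(f"relative major/minor diameters = {diameters[0]:.3f}, {diameters[1]:.3f}")
print("phase contours are", shape)
```

Because $\theta_1$ and $\theta_2$ generally differ, no single rotation of the axes removes both cross terms, which is the nonseparability emphasized in the text.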

Propagation in free space

The Fourier transform of the Gaussian profile in Eq. (4.1a) is given by

$$\hat{A}(\sigma_x, \sigma_y) = \mathcal{F}\{\hat{a}(x, y, z = 0)\} = \frac{\hat{a}_0}{\sqrt{ac - b^2}} \exp[-\pi(\alpha\sigma_x^2 + 2\beta\sigma_x\sigma_y + \gamma\sigma_y^2)]. \tag{4.2a}$$

Here $\alpha = c/(ac - b^2)$, $\beta = -b/(ac - b^2)$, and $\gamma = a/(ac - b^2)$. In matrix notation,

$$\begin{pmatrix} \alpha & \beta \\ \beta & \gamma \end{pmatrix} = \begin{pmatrix} a & b \\ b & c \end{pmatrix}^{-1}. \tag{4.2b}$$

When a beam travels a distance $z_0$ in free space, its Fourier transform is multiplied by the transfer function of propagation, $\exp[i2\pi z_0 (1 - \sigma_x^2 - \sigma_y^2)^{1/2}]$ (see Chapter 2, “Fourier optics”). This is true irrespective of whether $z_0$ is positive or negative; in other words, both forward and backward propagation can be treated by the same formalism. The reason that the wavelength $\lambda$ of the light does not appear in these equations is that all spatial coordinates are assumed to be normalized by $\lambda$, that is, $x$, $y$, $z_0$ are dimensionless quantities. Invoking the standard paraxial approximation, the above transfer function is replaced by $\exp(i2\pi z_0)\exp[-i\pi z_0(\sigma_x^2 + \sigma_y^2)]$. Multiplying this transfer function

Figure 4.1 Distributions of intensity (left) and phase (right) in the cross-sections of two Gaussian beams having different a, b, c parameters. The phase plots are encoded in gray-scale, black representing $-180^\circ$ and white representing $+180^\circ$. (a), (b) $a = 0.009 - 0.023i$, $b = 0.006 - 0.002i$, $c = 0.012 - 0.016i$; (c), (d) $a = 0.011 - 0.023i$, $b = 0.01 - 0.003i$, $c = 0.016 + 0.012i$. The horizontal axis, $x/\lambda$, spans $-20$ to $20$.

into $\hat{A}(\sigma_x, \sigma_y)$ of Eq. (4.2a) converts $\hat{a}_0$ to $\hat{a}_0\exp(i2\pi z_0)$, $\alpha$ to $\alpha + iz_0$, and $\gamma$ to $\gamma + iz_0$, while keeping $\beta$ unchanged. The beam’s Fourier transform thus retains its Gaussian form and, consequently, the profile of the beam at $z = z_0$ remains Gaussian, albeit with different a, b, c parameters and with a different value for $\hat{a}_0$. It is readily verified that the new parameters of the beam at $z = z_0$ are given by

$$\begin{pmatrix} a' & b' \\ b' & c' \end{pmatrix} = \begin{pmatrix} \alpha + iz_0 & \beta \\ \beta & \gamma + iz_0 \end{pmatrix}^{-1}, \tag{4.3a}$$

$$\hat{a}_0' = \hat{a}_0 \exp(i2\pi z_0)\sqrt{(a'c' - b'^2)/(ac - b^2)} = \hat{a}_0 \exp(i2\pi z_0) \big/ \sqrt{1 - (ac - b^2)z_0^2 + i(a + c)z_0}. \tag{4.3b}$$
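Equations (4.2b), (4.3a), and (4.3b) translate almost line by line into code. The Python sketch below is a minimal illustration under our own assumptions: the function names and sample parameters are hypothetical, and the principal branch of the complex square root is used, which is adequate when the starting plane is at or near a waist but may need explicit branch tracking otherwise. As a sanity check it confirms that the power of Eq. (4.1b) is conserved and that propagating forward and then backward recovers the original a, b, c.

```python
import cmath
import math


def invert2(p, q, r):
    """Inverse of the symmetric matrix [[p, q], [q, r]], returned as (p', q', r')."""
    det = p * r - q * q
    return r / det, -q / det, p / det


def propagate(a, b, c, a0, z0):
    """Paraxial free-space propagation by z0 (lengths in units of the wavelength)."""
    alpha, beta, gamma = invert2(a, b, c)                                   # Eq. (4.2b)
    a_new, b_new, c_new = invert2(alpha + 1j * z0, beta, gamma + 1j * z0)  # Eq. (4.3a)
    # Second form of Eq. (4.3b), principal branch of the square root.
    denom = cmath.sqrt(1 - (a * c - b * b) * z0 ** 2 + 1j * (a + c) * z0)
    a0_new = a0 * cmath.exp(2j * math.pi * z0) / denom
    return a_new, b_new, c_new, a0_new


def power(a, b, c, a0):
    """Integrated intensity, Eq. (4.1b)."""
    return 0.5 * abs(a0) ** 2 / math.sqrt(a.real * c.real - b.real ** 2)


# Illustrative (assumed) beam and propagation distance.
a, b, c, a0, z0 = 0.02 + 0.01j, 0.005 - 0.002j, 0.03 - 0.015j, 1.0 + 0j, 25.0
a1, b1, c1, a01 = propagate(a, b, c, a0, z0)
a2, b2, c2, a02 = propagate(a1, b1, c1, a01, -z0)

print("parameters at z0:", a1, b1, c1)
print("power before and after:", power(a, b, c, a0), power(a1, b1, c1, a01))
print("round-trip error in a, b, c:", abs(a2 - a), abs(b2 - b), abs(c2 - c))
```

Power conservation is expected here because the paraxial transfer function has unit modulus, so it redistributes phase in the Fourier domain without changing the spectrum's magnitude.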

Thus the beam remains Gaussian as it propagates along Z, but its magnitude and phase profiles change continuously. Figure 4.2 shows computed cross-sectional profiles for a beam at several locations along the Z-axis. Note how the elliptical

Figure 4.2 Distributions of intensity (left) and phase (right) in the cross-sectional planes of a Gaussian beam propagating along the Z-axis. The parameters of the beam at $z = 0$ are $a = 0.01 + 0.05i$, $b = 0.005 - 0.04i$, $c = 0.02 - 0.12i$. The phase plots are encoded in gray-scale, black representing $-180^\circ$ and white representing $+180^\circ$. From top to bottom, the propagation distances along Z are 0, $5\lambda$, $25\lambda$, and $1000\lambda$. In the bottom right-hand frame the far-field curvature phase factor (corresponding to $a_2 = c_2 = -0.001$) has been subtracted. The vertical axis, $y/\lambda$, spans $\pm 8$, $\pm 10$, $\pm 20$, and $\pm 1000$ in the four rows, respectively.

cross-section of the intensity profile rotates with increasing z0 and also how the phase contours change from hyperbolas to ellipses and vice versa.

The beam waist

The waist is the cross-section of the beam at which the phase is uniform, i.e., it is independent of x and y. In general, a Gaussian beam does not have to have a waist but if a waist exists then the a, b, c parameters in that cross-section will be real. A question arises as to when an arbitrary Gaussian beam (for which the a, b, c parameters at a given cross-section are complex) can be said to have a waist. In other words, does a value of $z_0$ (positive or negative) exist at which $a'$, $b'$, $c'$ are real? According to Eq. (4.3a), this requirement is met if $\beta$ is real and the imaginary parts of $\alpha$ and $\gamma$ are identical, so that $iz_0$ will end up canceling their imaginary parts. This is equivalent to requiring both $\beta$ and $\alpha - \gamma$ to be real-valued. Considering the relationship between a, b, c and $\alpha$, $\beta$, $\gamma$ in Eq. (4.2b), it is not difficult to show that the necessary and sufficient condition for an arbitrary Gaussian beam to have a waist is that, in the complex plane, the three vectors $b$, $a - c$, and $ac - b^2$ must be parallel (or antiparallel) to each other. In other words, these three complex numbers must lie along a straight line that goes through the origin of the complex plane. This requirement, of course, is in addition to the other Gaussian beam requirements, namely, $a_1 \ge 0$, $c_1 \ge 0$, $a_1c_1 \ge b_1^2$. When the a, b, c parameters satisfy all the above constraints, the beam will have a waist at a specific location along the Z-axis. The waist is unique, because there is only one value of $z_0$ that can cancel the imaginary parts of both $\alpha$ and $\gamma$ in Eq. (4.3a).

When a waist exists, there is symmetry between the locations before and after the waist. Let the waist be at $z = 0$. Then the a, b, c parameters at this location will be real, which means that the corresponding $\alpha$, $\beta$, $\gamma$ are real as well. Now, any nonzero value of $z_0$ will make $\alpha + iz_0$ and $\gamma + iz_0$ complex, while $-z_0$ will yield their complex conjugates. Therefore, the a, b, c parameters on opposite sides of the waist will be complex conjugates of each other. This means that the intensity profiles on opposite sides are identical, while the phase profiles differ by a minus sign.
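The waist criterion stated above lends itself to a direct numerical test. The following Python sketch is a hedged illustration: the function names, the tolerance, and the sample parameters are our own assumptions. It checks whether $b$, $a - c$, and $ac - b^2$ are collinear through the origin of the complex plane and, when they are, returns the signed distance to the waist as the value of $z_0$ that cancels the imaginary part of $\alpha$.

```python
def to_fourier(a, b, c):
    """Map (a, b, c) to (alpha, beta, gamma) via the 2x2 inverse of Eq. (4.2b)."""
    det = a * c - b * b
    return c / det, -b / det, a / det


def propagate_abc(a, b, c, z0):
    """New (a, b, c) after paraxial propagation by z0, following Eq. (4.3a)."""
    alpha, beta, gamma = to_fourier(a, b, c)
    return to_fourier(alpha + 1j * z0, beta, gamma + 1j * z0)


def has_waist(a, b, c, tol=1e-9):
    """True if b, a - c and a*c - b**2 are parallel or antiparallel complex vectors."""
    def cross(u, v):
        # z-component of the cross product of u and v regarded as plane vectors
        return u.real * v.imag - u.imag * v.real
    v1, v2, v3 = b, a - c, a * c - b * b
    pairs = ((v1, v2), (v1, v3), (v2, v3))
    return all(abs(cross(u, v)) <= tol * abs(u) * abs(v) for u, v in pairs)


def waist_distance(a, b, c):
    """Signed propagation distance z0 to the waist; meaningful only if has_waist() is True."""
    alpha, _, _ = to_fourier(a, b, c)
    return -alpha.imag


# Hypothetical beam with real parameters (a waist) moved 40 wavelengths downstream.
moved = propagate_abc(0.02 + 0j, 0.005 + 0j, 0.03 + 0j, 40.0)
print("has waist:", has_waist(*moved), " distance to waist:", waist_distance(*moved))
# A generic complex parameter set need not possess a waist at all.
print("has waist:", has_waist(0.02 + 0.01j, 0.005 - 0.002j, 0.03 - 0.015j))
```

In the first case the reported distance is negative, meaning the waist lies behind the current plane, as it must for a beam that has already passed through it.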
The beam is always convergent before, and divergent after, the waist.

The Gouy phase shift

Aside from the usual linear phase factor $\exp(i2\pi z_0)$, Eq. (4.3b) contains an additional phase whose value depends non-linearly on $z_0$. This is the phase associated with the square-root factor on the right-hand side of the equation. Consider the special case when a, b, c are all real-valued, i.e., when the beam waist is at $z = 0$. As $z_0$ increases from zero and acquires positive values, the real part under the square root, $1 - (ac - b^2)z_0^2$, decreases while the imaginary part, $(a + c)z_0$, increases. Thus, the phase of $\hat{a}_0'$ associated with the square root, namely,

$$\psi = -\tfrac{1}{2}\tan^{-1}\{(a + c)z_0/[1 - (ac - b^2)z_0^2]\}, \tag{4.4}$$

approaches $-90^\circ$ for sufficiently large $z_0$. Similarly, when $z_0$ goes from 0 to negative values, $\psi$ moves toward $+90^\circ$. It is thus seen that, in crossing the waist, the beam undergoes a $180^\circ$ phase shift. This phase shift, which is particularly rapid near the focus of a lens, was first observed experimentally by the French physicist L. Georges Gouy in 1890 [2,3,4].

To demonstrate an observable effect of the Gouy phase, consider the experiment depicted in Figure 4.3. Here an aberration-free lens is split into two identical halves, and the upper half-lens is translated forward by $\Delta z = 300\lambda$. A collimated uniform beam of light is directed at the split lens, and the distribution of intensity in the region between the two foci, $F_1$ and $F_2$, is monitored. Figure 4.4 shows computed intensity

Figure 4.3 A split lens brings a collimated uniform beam of wavelength $\lambda$ to two different foci, $F_1$ and $F_2$, along the Z-axis. The region of interest is between the two focal planes and above the Z-axis. At first glance the rays going through each half-lens are expected to have the same phase when they arrive in the vicinity of the Z-axis. However, because of the Gouy effect, the beam going through the upper lens and arriving at an observation point before focus will be phase-shifted by about $180^\circ$ relative to the beam going through the lower lens and arriving at the same observation point after focus. In our numerical example, the lens (before splitting) has $\mathrm{NA} = 0.1$ and focal length $= 30000\lambda$, and the separation between the half-lenses is $\Delta z = 300\lambda$.

Figure 4.4 Computed intensity patterns in the XY-plane at the mid-point between the two foci in the system of Figure 4.3. (a) Upper half-lens blocked; (b) lower half-lens blocked; (c) both half-lenses transmitting the incident beam, and the two emergent beams interfering at the observation plane. Note that the central region of the distribution in (c), corresponding to points near the Z-axis, is dark. Both axes, $x/\lambda$ and $y/\lambda$, span $-20$ to $20$ in each frame.

Figure 4.5 Plots of intensity distribution in the XY-plane at various locations along the Z-axis in the system of Figure 4.3. From (a) to (i) the observation plane moves in steps of $37.5\lambda$ from the first focus, at $F_1$, to the second focus, at $F_2$. Both axes, $x/\lambda$ and $y/\lambda$, span $-20$ to $20$ in each frame.

patterns in a vertical plane half-way between $F_1$ and $F_2$ when (a) the upper half-lens is blocked, (b) the lower half-lens is blocked, and (c) the light is allowed to go through both half-lenses. For locations near the Z-axis, where the optical path lengths are nearly identical, the light amplitudes contributed by the two half-lenses are expected to be in phase, resulting in constructive interference. However, as Figure 4.4(c) clearly demonstrates, the vicinity of the optical axis is dark. This destructive interference is caused by the nearly $180^\circ$ Gouy phase shift between the “before-focus” and “after-focus” beams arriving from the two half-lenses. Figure 4.5 shows several cross-sectional plots of intensity distribution in the region between $F_1$ and $F_2$, starting at $F_1$ and moving in steps of $37.5\lambda$ to $F_2$. The intensity at and near the Z-axis is seen to diminish as the mid-plane between the foci is approached from either side.

The Rayleigh range

For the generalized Gaussian beam of Eq. (4.1a), there is only one way to define the Rayleigh range [2], and that is in terms of the Gouy phase shift. To admit a Rayleigh range the beam must have a waist, which we assume to be at $z = 0$, so the a, b, c parameters at this location are real-valued. With reference to Eq. (4.4), the Rayleigh range is the distance $z_0$ at which the Gouy phase $\psi$ is $-45^\circ$, i.e., $z_0 = 1/\sqrt{ac - b^2}$. In the special case when $a = c$ and $b = 0$ (i.e., when the beam is circularly symmetric) the Rayleigh range $z_0$ is $1/a$. In this case the beam diameter at the Rayleigh range is a factor of $\sqrt{2}$ larger than that at the waist; also the beam curvature can be shown to attain its maximum value at the Rayleigh range.

Effect of lens on Gaussian beam

In the paraxial approximation a lens imparts a quadratic phase factor to the incident beam. If the lens happens to be astigmatic, the phase factors along the X- and Y-axes will have different curvatures, and if the astigmatic lens happens to be rotated within the XY-plane then the quadratic phase factor will have an xy term as well. We assume that the lens aperture is large enough to transmit the beam without significant truncation and, therefore, to affect negligibly its amplitude profile. All in all, the effect of a lens on a Gaussian beam is to multiply its complex amplitude by the following transmission function:

$$t(x, y) = \exp[-i\pi(px^2 + 2qxy + ry^2)]. \tag{4.5}$$

Here p, q, r are real-valued constants related to the principal radii of curvature of the lens. Thus when the Gaussian beam of Eq. (4.1a) passes through the lens described by Eq. (4.5), $a_2$ will be augmented by p, $b_2$ by q, and $c_2$ by r. The beam can then be propagated in the free-space region beyond the lens using the aforementioned analytical tools.
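The Gouy phase of Eq. (4.4), the Rayleigh range, and the lens rule of Eq. (4.5) can be collected in a few lines of Python. This is a sketch under our own assumptions: the function names and sample numbers are hypothetical, the waist is taken to be at $z = 0$ with real a, b, c, and the two-argument arctangent is used so that the phase varies continuously from $+90^\circ$ to $-90^\circ$ as $z_0$ runs from large negative to large positive values.

```python
import math


def gouy_phase(a, b, c, z0):
    """Eq. (4.4) for real a, b, c (waist at z = 0); result in radians."""
    return -0.5 * math.atan2((a + c) * z0, 1.0 - (a * c - b * b) * z0 ** 2)


def rayleigh_range(a, b, c):
    """Distance from the waist at which the Gouy phase reaches -45 degrees."""
    return 1.0 / math.sqrt(a * c - b * b)


def apply_lens(a, b, c, p, q, r):
    """Multiply the beam by exp[-i pi (p x^2 + 2 q x y + r y^2)], Eq. (4.5):
    the imaginary parts of a, b, c are augmented by p, q, r."""
    return a + 1j * p, b + 1j * q, c + 1j * r


# Illustrative circular beam at its waist (a = c, b = 0), lengths in wavelengths.
a, b, c = 0.01, 0.0, 0.01
z_r = rayleigh_range(a, b, c)
print(f"Rayleigh range = {z_r:.1f} wavelengths")
for z0 in (-10 * z_r, -z_r, 0.0, z_r, 10 * z_r):
    print(f"z0 = {z0:8.1f}   Gouy phase = {math.degrees(gouy_phase(a, b, c, z0)):7.2f} deg")

# A hypothetical astigmatic lens acting on the same beam.
print(apply_lens(complex(a), complex(b), complex(c), 0.002, 0.0005, 0.001))
```

For the circular beam the Rayleigh range evaluates to $1/a$, and the printed phases pass through $\pm 45^\circ$ at $z_0 = \mp z_R$, consistent with the definition given above.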

Higher-order Gaussian beams

We confine the discussion of higher-order beams to the one-dimensional case only, as the extension to two dimensions is straightforward. Consider the Gaussian function $\exp(-\pi ax^2)$, where a is a complex constant. The nth derivative of this function with respect to x may be used to define an initial amplitude distribution as follows:

$$\hat{a}(x, z = 0) = \hat{a}_0 H_n(\sqrt{\pi a}\,x)\exp(-\pi ax^2) = \hat{a}_0 (-1)^n (\pi a)^{-n/2} \frac{d^n}{dx^n}[\exp(-\pi ax^2)]. \tag{4.6a}$$

Here the nth-order Hermite polynomial $H_n(x)$ is defined as

$$H_n(x)\exp(-x^2) = (-1)^n \frac{d^n}{dx^n}[\exp(-x^2)]. \tag{4.6b}$$

The Fourier transform of the distribution in Eq. (4.6a) is readily evaluated using the differentiation theorem [5],

$$\mathcal{F}\left\{\frac{d^n}{dx^n}[\exp(-\pi ax^2)]\right\} = (i2\pi\sigma)^n \mathcal{F}\{\exp(-\pi ax^2)\} = (i2\pi\sigma)^n a^{-1/2}\exp(-\pi\sigma^2/a). \tag{4.7}$$

To account for propagation by a distance $z_0$ along the Z-axis, the Fourier transform of the initial distribution in Eq. (4.6a) is multiplied by the transfer function of free-space propagation, which, in the paraxial approximation, is $\exp(i2\pi z_0)\exp(-i\pi z_0\sigma^2)$. This means that the coefficient $1/a$ in the exponent of the Gaussian function on the right-hand side of Eq. (4.7) is augmented by $iz_0$, yielding

$$\frac{1}{a'} = \frac{1}{a} + iz_0. \tag{4.8}$$

The light amplitude distribution at $z = z_0$ is then obtained by an inverse Fourier transform, yielding

$$\hat{a}(x, z = z_0) = \hat{a}_0 \exp(i2\pi z_0)\,(a'/a)^{(n+1)/2} H_n(\sqrt{\pi a'}\,x)\exp(-\pi a'x^2). \tag{4.9}$$

Since in general $a'$ is complex, the above eigenfunctions of propagation in free space contain Hermite polynomials with a complex argument. Siegman [2,6] refers to these as the “elegant” solutions of the wave equation in free space. The elegant solutions are substantially different from the so-called “standard” solutions, whose argument of the Hermite polynomial is real.

Assuming that the Hermite–Gaussian beam of Eq. (4.9) has its waist at $z = 0$, the parameter a of the beam will be real-valued. The normalization factor $(a'/a)^{(n+1)/2} = (1 + iaz_0)^{-(n+1)/2}$ thus contributes its phase angle, $-\tfrac{1}{2}(n + 1)\tan^{-1}(az_0)$, to the Gouy phase. Note that the complex-argument Hermite polynomial $H_n(\sqrt{\pi a'}\,x)$ also has a z-dependent phase, which contributes to the overall phase pattern in the beam’s cross-section.

References for Chapter 4

[1] H. Kogelnik and T. Li, Laser beams and resonators, Appl. Opt. 5, 1150–1167 (1966).
[2] A. E. Siegman, Lasers, University Science Books, California (1986).
[3] L. G. Gouy, Compt. Rend. Acad. Sci. Paris 110, 1251 (1890).
[4] M. Born and E. Wolf, Principles of Optics, sixth edition, Pergamon Press, New York, 1980.
[5] R. N. Bracewell, The Fourier Transform and its Applications, McGraw-Hill, New York, 1978.
[6] A. E. Siegman, Hermite-gaussian functions of complex argument as optical-beam eigenfunctions, J. Opt. Soc. Am. 63, 1093–1094 (1973).

5 Coherent and incoherent imaging

The basic elements of an imaging system are shown in Figure 5.1. The light from a source, either coherent (e.g., a laser) or incoherent (e.g., an incandescent lamp or an arc lamp), is collected by the illumination optics (e.g., a condenser lens) and projected onto the object. An image is then formed by an objective lens upon a screen, a photographic plate, a CCD camera, the retina of an eye, etc. Assuming that the objective lens is free from aberrations, the resolution and the contrast of the image are determined not only by the numerical aperture of the objective lens but also by the properties of the light source and the illumination optics.

The source and the illumination optics

Three types of illumination will be considered. For collimated and coherent illumination we assume a monochromatic laser beam brought to focus at the plane of the object with a condenser lens having a very small numerical aperture (NA). Figure 5.2(a) is the logarithmic intensity distribution at the object plane, produced by a 0.03NA condenser. This distribution has the shape of an Airy pattern, with a central lobe diameter of $1.22\lambda/\mathrm{NA} \approx 41\lambda$, where $\lambda$ is the wavelength of the light source. Since the objects of interest will be small compared to the Airy disk diameter, and since they will be placed near the center of the Airy disk, this illumination qualifies as coherent, fairly uniform, and nearly collimated.

The second type of illumination is also produced by a coherent monochromatic laser beam, but with a high-NA condenser. This time we place the focal point of the condenser somewhat before the object in order to produce within the object plane a bright spot large enough to cover the field of view of the objective lens. Figure 5.2(b) is the logarithmic intensity distribution at the object plane, produced by a coherent beam brought to focus by a 0.25NA condenser at a distance

Figure 5.1 Schematic diagram of a simple imaging system. The light source is projected by the illumination optics onto an object, allowing the objective lens to form an image of this object at the image plane.

Figure 5.2 Computed intensity patterns at the plane of the object corresponding to various types of illumination. (a) Logarithmic plot ($\alpha = 4$) of the intensity distribution obtained from a coherent source with a 0.03NA condenser lens. (b) Logarithmic plot ($\alpha = 4$) of the intensity distribution obtained from a coherent source with a 0.25NA condenser lens. The beam is focused to a plane located just $50\lambda$ before the plane of the object. (c) Same as (b) but showing the intensity distribution rather than its logarithm. (d) Intensity distribution corresponding to an incoherent light source consisting of 37 independent point sources obtained with a 0.25NA condenser lens. Again the source is imaged to a plane located $50\lambda$ before the plane of the object.

of $50\lambda$ before the object. The beam incident on the object is, therefore, divergent and, although it covers the area of interest, its intensity distribution is not very uniform. This nonuniformity may be better appreciated by considering the corresponding plot of intensity distribution in Figure 5.2(c). (Note the different scales of Figures 5.2(b), (c).)

The third type of illumination to be examined is incoherent illumination. We emphasize at the outset that our concern here is solely with spatial incoherence and, as such, we will assume that the source is quasi-monochromatic. (Departure from monochromaticity is a requirement for any source that is to exhibit spatial incoherence; the bandwidth of the source can nonetheless be narrow enough to give its light a long coherence time, making it in effect a temporally coherent source.) To simulate an incoherent source we assumed that the quasi-monochromatic light emerging from a fiber bundle consisting of 37 fibers is imaged with a 0.25NA condenser lens to a plane located a distance of $50\lambda$ before the plane of the object in Figure 5.1. Each fiber within the bundle acts as a coherent point source whose projected intensity distribution at the object plane will be the same as that shown in Figure 5.2(c). When these fibers are properly arranged in space and their intensity distributions added together, we obtain the intensity pattern displayed in Figure 5.2(d). This is a fairly uniform distribution over its central region, which is where the objects of interest will be placed. Although the source could have been imaged directly onto the object plane in this case, the $50\lambda$ defocus helps to create a more uniform illumination. With this type of illumination, in order to compute the intensity distribution at the image plane, we treat the 37 fibers as independent point sources – each a coherent point source in its own right.
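The distinction drawn here between coherent and incoherent superposition can be made concrete with a toy calculation. The Python sketch below is only an illustration under our own assumptions: the two sampled fields and the function names are hypothetical and stand in for the fields of individual point sources. For mutually coherent sources the complex amplitudes are added before squaring, whereas for independent sources the intensities are added.

```python
import cmath


def coherent_intensity(fields):
    """Add complex amplitudes point by point, then take the squared magnitude."""
    return [abs(sum(samples)) ** 2 for samples in zip(*fields)]


def incoherent_intensity(fields):
    """Add the intensities of mutually independent sources point by point."""
    return [sum(abs(s) ** 2 for s in samples) for samples in zip(*fields)]


# Two hypothetical unit-amplitude sources sampled at three points; the second
# is phase-shifted by 0, 90 and 180 degrees relative to the first.
field_1 = [1 + 0j, 1 + 0j, 1 + 0j]
field_2 = [cmath.exp(1j * phase) for phase in (0.0, cmath.pi / 2, cmath.pi)]

print("coherent:  ", coherent_intensity([field_1, field_2]))    # shows interference
print("incoherent:", incoherent_intensity([field_1, field_2]))  # no interference
```

The coherent sum swings between roughly four times the single-source intensity and zero, while the incoherent sum stays at twice the single-source intensity everywhere, which is why adding many independent point sources yields a smoother illumination pattern.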
We then compute the image obtained with each source independently, and add the intensities of the resulting 37 images together to obtain the final image.

The imaging optics

The objective lens used in the simulations described below is free from aberrations and, therefore, its performance is diffraction-limited. The objective is a finite-conjugate lens with a numerical aperture of 0.25 (on the side of the object), a focal length of 5000λ, and a magnification of 10. Two types of object will be used in these simulations. The first is an amplitude grating with a period of 3λ and a 50% duty cycle, shown in Figure 5.3(a). According to the classical optics textbooks,1,2,3 the spatial frequency of this grating is higher than the cutoff frequency of the modulation transfer function (MTF) of the objective lens for coherent illumination, f_c = NA/λ = (4λ)⁻¹, but less than that for incoherent illumination, f_c = 2NA/λ = (2λ)⁻¹. We will

Figure 5.3 (a) Amplitude grating with a period of 3λ and a 50% duty cycle, used as the object in some of the simulations. (b) Pattern of marks with different sizes and separations on a uniform background. In some cases these marks will be black on a transparent background, in other cases they will be transparent marks on a black background, in yet other cases they will be phase objects with 100% transmissivity, imparting a 180° phase shift to the incident beam. [Axes: x/λ and y/λ, each from −12.5 to 12.5.]

examine the images of this grating under both coherent and incoherent illumination and draw certain conclusions about the classical treatment of this problem. The second type of object with which we will be concerned is a mask imprinted with seven marks of various sizes and shapes, shown in Figure 5.3(b). The largest mark is 10λ long, and the smallest mark is 3λ wide. These marks are large enough to yield a reasonably clear image with both coherent and incoherent illumination. In one case the marks will be assumed to be bright objects on a dark background, in another case they will be dark objects on a bright background, in yet a third case they will be 180° phase objects having the same amplitude transmissivity as the background.

Resolution of the imaging system

Let the grating of Figure 5.3(a) be the object in the system of Figure 5.1. If the collimated beam of Figure 5.2(a) is used to illuminate this object then no image will be formed, because all diffracted orders (except the zeroth order) will miss the entrance pupil of the objective lens; the situation is depicted schematically in Figure 5.4. Denoting the period of the grating by P, the deviation angle θ of the first diffracted order will be given by sin θ = λ/P. (For the present grating sin θ = 1/3, which exceeds the objective's NA of 0.25.) This is the origin of the well-known assertion that the MTF cutoff frequency of a coherent imaging system is f_c = NA/λ. If, however, the coherent illuminating beam is not collimated but is in the form of a cone of light, as in the case of the distribution shown in Figure 5.2(c), then an

Figure 5.4 A collimated coherent beam (wavelength λ) illuminates a grating of period P at normal incidence. The first diffracted orders will miss the objective lens if the lens’s numerical aperture NA is less than λ/P. (Some provision must be made for the expansion of the beam diameter at long propagation distances.) [Diagram labels: incident beam, grating, 0th order, +1st order, −1st order.]

Figure 5.5 Computed intensity distribution (a) at the exit pupil of the objective lens and (b) at the image plane, corresponding to coherent illumination with the divergent beam of Figure 5.2(c). The object is the grating of Figure 5.3(a). [Axes in units of λ.]

“image” of the grating will be formed. Figure 5.5 shows computed plots of intensity distribution (a) at the exit pupil of the objective lens, where the overlap between the zeroth-order and the first-order beams is clearly visible, and (b) at the image plane, where an “image” of the grating is seen superimposed on a nonuniform pattern of illumination. The reason that a coherent cone of light produces an image of the grating whereas a collimated beam fails to do so may be understood by studying Figure 5.6: the diffracted first-order cones are captured by the objective lens as long as the lens’s NA is greater than λ/(2P). The MTF cutoff frequency for this type of illumination, therefore, is f_c = 2NA/λ.

Figure 5.6 A cone of coherent light (wavelength λ), coming from a condenser lens and illuminating a grating of period P, creates several diffracted cones. If the apex angle of the incident cone is sufficiently large, the first-order beams will be captured by the objective lens as long as NA > λ/(2P). [Diagram labels: incident beam, condenser lens, grating, objective lens, +1st order cone, −1st order cone.]

Figure 5.7 Computed intensity distribution (a) at the exit pupil of the objective lens and (b) at the image plane, corresponding to the incoherent illumination depicted in Figure 5.2(d). The object is the grating of Figure 5.3(a). [Axes in units of λ.]

The case of incoherent illumination is now easy to understand. Since the beam in Figure 5.2(d) is a superposition of 37 divergent cones similar to that of Figure 5.2(c), the grating’s image will have the same resolution as that obtained with a single cone of light, but it will have a more uniform contrast because it is an average over a large number of point sources. Figure 5.7 shows the computed patterns of intensity (a) at the exit pupil of the objective lens and (b) at the image plane obtained with incoherent illumination. The MTF cutoff for this type of illumination, f_c = 2NA/λ, is the same as that for coherent illumination with a cone of light, which is twice as large as the cutoff frequency for collimated coherent illumination.
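The cutoff argument of this section can be checked with a few lines of arithmetic. The sketch below is not the diffraction code used for the figures; it assumes a purely geometric model in which the illumination fills direction sines up to the condenser NA and the grating shifts each ray's direction sine by λ/P. The function names are hypothetical, and all lengths are in units of λ.

```python
# All lengths are in units of the wavelength, so spatial frequencies are in 1/wavelength.

def cutoff_frequency(na_objective, na_condenser=0.0):
    """Highest grating frequency whose zeroth and first orders can both be collected.

    Geometric model (an assumption of this sketch): illumination direction
    sines fill [-na_condenser, +na_condenser], and the grating shifts a ray's
    direction sine by 1/period. Only rays whose zeroth order also enters the
    objective contribute to the image.
    """
    most_oblique = min(na_condenser, na_objective)
    return na_objective + most_oblique


def grating_is_resolved(period, na_objective, na_condenser=0.0):
    """True if a grating of the given period lies below the cutoff frequency."""
    return 1.0 / period < cutoff_frequency(na_objective, na_condenser)


NA = 0.25       # objective NA quoted in the text
PERIOD = 3.0    # grating period of 3 wavelengths quoted in the text

collimated = grating_is_resolved(PERIOD, NA, na_condenser=0.0)
cone = grating_is_resolved(PERIOD, NA, na_condenser=0.25)
print("collimated beam resolves the grating:", collimated)
print("0.25NA cone resolves the grating:    ", cone)
```

With the numbers used in this chapter the collimated case gives a cutoff of NA/λ (period 4λ) and the 0.25NA cone gives 2NA/λ (period 2λ), so the 3λ grating falls between the two.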


Images of non-periodic objects

The mask containing marks of different sizes shown in Figure 5.3(b) provides a good test object for comparing images obtained under coherent and incoherent illumination. Consider the case of transparent marks on a dark background (an amplitude object), imaged with collimated coherent illumination (see Figure 5.8), and also with incoherent illumination (see Figure 5.9). The resolution of the former is obviously inferior to that of the latter, and the spurious fringes appearing in the coherent image are responsible for at least some of the image-quality degradation. (As an aside, note that the exit-pupil distribution in the case of coherent illumination displays much more structure than that obtained with incoherent light.) Using the same object as in Figure 5.3(b) but assuming that the marks are black features on a transparent background (i.e., reversing the contrast) we

Figure 5.8 Coherent imaging of the seven transparent marks on a black background shown in Figure 5.3(b). The incident distribution is the collimated coherent beam of Figure 5.2(a). (a) Logarithmic plot (α = 4) of intensity distribution at the exit pupil of the objective lens. (b) Logarithmic plot (α = 4) of intensity distribution at the image plane. (c) Magnified view of the central region of the image shown in (b); in this case α = 3. (d) Same as (c) but showing the distribution of intensity rather than its logarithm. [Axes in units of λ.]

Figure 5.9 Incoherent imaging of the seven transparent marks on a black background; the incident distribution is that of Figure 5.2(d). (a) Intensity distribution at the exit pupil of the objective lens. (b) Intensity distribution at the image plane. (c) Same as (b) but on a logarithmic scale (α = 3). [Axes in units of λ.]

obtain the distributions of Figure 5.10 in the case of collimated coherent illumination and those of Figure 5.11 in the case of incoherent illumination. Note the similarity between the exit-pupil distributions in Figures 5.8(a) and 5.10(a), indicating that Babinet’s principle is at work here.2,3 Also note in Figure 5.10(b) that, in addition to the marks, the rings of the Airy pattern of the illuminating beam are also captured in the image. The logarithmic intensity distributions in Figures 5.10(b), (c) show gray spots in the middle of dark marks, a feature that is less prominent in the incoherent image of Figure 5.11(c).

Figure 5.10 Coherent imaging of the seven black marks on a transparent background shown in Figure 5.3(b). The incident distribution is the collimated coherent beam of Figure 5.2(a). (a) Logarithmic plot (α = 4) of intensity distribution at the exit pupil of the objective lens. (b) Logarithmic plot (α = 4) of intensity distribution at the image plane, showing the images of the marks as well as the rings of the Airy pattern. (c) Magnified view of the central region of the image shown in (b); in this case α = 3. (d) Same as (c) but showing the distribution of intensity rather than its logarithm. [Axes in units of λ.]

Finally we assume that the marks on the mask of Figure 5.3(b) represent transparent phase objects that impart a phase shift of 180° (relative to the background) to the incident beam. Figure 5.12 shows the computed intensity distributions at the objective’s exit pupil and at the image plane, for the case of illumination by the collimated coherent beam of Figure 5.2(a). Figure 5.13 shows the corresponding distributions for incoherent illumination. Note how diffraction from mark boundaries can create an “image” of the marks in a case where no explicit phase-contrast mechanism is present.3 In the two simulations depicted in Figures 5.10 and 5.12, the amplitude transmission functions of the respective objects differ only by an additive constant term. Therefore, the image in Figure 5.10(b), for instance, may be derived from that in Figure 5.12(b) by the addition of the image of the incident beam, it being understood that the quantities being added are the complex amplitudes, not the intensities.

Figure 5.11 Incoherent imaging of the seven black marks on a transparent background; the incident distribution is that of Figure 5.2(d). (a) Intensity distribution at the exit pupil of the objective lens. (b) Intensity distribution at the image plane. (c) Same as (b) but on a logarithmic scale (α = 1.7). [Axes in units of λ.]

Figure 5.12 Same as Figure 5.10 but for a phase object. The assumed object in this case is the mask of Figure 5.3(b), which has uniform transmissivity everywhere; its marks impart a relative phase shift of 180° to the incident beam. [Axes in units of λ.]

Figure 5.13 Same as Figure 5.11 but for a phase object. The assumed object in this case is the mask of Figure 5.3(b), which has uniform transmissivity everywhere; its marks impart a relative phase shift of 180° to the incident beam. (For the logarithmic plot in (c) α = 1.4.) [Axes in units of λ.]

References for Chapter 5

1 J. W. Goodman, Introduction to Fourier Optics, McGraw-Hill, New York, 1968.
2 M. Born and E. Wolf, Principles of Optics, 6th edition, Pergamon Press, Oxford, 1980.
3 M. V. Klein, Optics, Wiley, New York, 1970.

6 First-order temporal coherence in classical optics†

A truly monochromatic beam of light, if it ever existed, would be perfectly coherent. Suppose that such a beam is split into two parts and each part propagated over an arbitrary distance. When the parts are finally brought together and mixed, no matter how different the two path lengths may have been, the resulting waveform will exhibit constructive and destructive interference in the form of bright and dark fringes. The coherence length of a monochromatic beam is therefore infinite, in the sense that the path-length difference can be as large as desired without hampering one’s ability to create interference patterns. Real sources of light, of course, are never monochromatic. White light restricted to the visible range of wavelengths from 400 nm to 700 nm, for example, has a coherence length of only a couple of micrometers. A green filter passing sunlight at λ_0 = 550 nm with a 10 nm bandwidth produces a beam with a coherence length of about 50 μm. The red line of cadmium (λ_0 = 643.8 nm) has a nearly Gaussian spectrum with a 0.0013 nm width at half peak intensity, leading to a coherence length of nearly 30 cm.1 This is similar to the coherence length of a short, inexpensive HeNe laser (λ_0 = 632.8 nm) with a few longitudinal modes and a typical bandwidth of Δf ≈ 1 GHz. A stabilized HeNe laser operating in a single longitudinal mode (Δf ≈ 100 kHz) has a coherence length of several kilometers. It is important therefore to understand the role of spectral bandwidth in enhancing or diminishing the performance of an optical system that, by design or by coincidence, involves interference. The subject of temporal coherence has been covered extensively in modern and classical textbooks,1,2,3,4,5 and it is not our intention here to repeat what is already well known. Instead, we present an alternative viewpoint that draws on the similarities between a waveform extended over a long span of time and a compact wave packet that exists for a relatively short period. We will show that,

† This chapter is coauthored with Ewan M. Wright, Professor of Optical Sciences at the University of Arizona.


as far as first-order temporal coherence is concerned, the wave packet can be substituted for the extended waveform in analyzing the results of interference experiments. While describing the properties of wave packets, we also mention some interesting observations concerning their reflection from, and transmission through, multilayer stacks.

Time dependence, frequency spectrum, and phase

Consider a superposition of plane waves, propagating in free space along the Z-axis and covering a range of (temporal) frequencies at and around f = f_0. The discrete frequencies f_n comprising the spectrum of this waveform are assumed to have a fixed spacing Δf as follows:

$$f_n = f_0 + n\,\Delta f = (N_0 + n)\,\Delta f. \tag{6.1}$$

The amplitude of the waveform is

$$a(z,t) = \sum_n A_n (\Delta f)^{1/2} \cos\!\left[2\pi f_n (z/c - t) + \phi_n\right], \tag{6.2}$$

where A_n and φ_n are the amplitude and phase of the spectral component whose frequency is f_n, and c is the speed of light in vacuum. The constant multiplier (Δf)^(1/2) is for normalization purposes only, its significance becoming clear as the discussion proceeds. We set the central frequency f_0 = 4.74 × 10¹⁴ Hz (corresponding to λ_0 = 632.8 nm) and Δf = 4.74 × 10¹² Hz, which leads to N_0 = 100. We adopt a Gaussian shape for the distribution of the amplitudes A_n, as shown in Figure 6.1(a), and let the values of n in Eq. (6.2) range from −15 to +14, for a total of 30 discrete wavelengths in the spectrum. To a large extent these choices are arbitrary, but the points that we seek to clarify by way of examples based on these choices are quite general in nature. Throughout this chapter the same amplitude coefficients {A_n} are assumed for all realizations of the waveform a(z, t), but the phase angles {φ_n}, although fixed for any particular waveform, differ for different realizations. The statistical properties of a(z, t) are thus uniquely determined by the joint probability distribution over {φ_n}. Furthermore, we consider stationary processes for which the ensemble average over different phase-angle realizations coincides with the time average derived from a single realization. This restriction of randomness to spectral phase simplifies the discussion without affecting the validity of the final results. Since the spectrum in Figure 6.1(a) is a discrete function of frequency, the corresponding amplitude a(z, t) considered either as a function of time at a fixed point z, or as a function of z at a given instant of time t, will be periodic. With z fixed, for example, the period of the function in the time domain will be

Figure 6.1 (a) A truncated Gaussian function sampled at regular intervals represents the frequency spectrum of a waveform. (b) The waveform as a function of time obtained by Fourier-transforming the spectrum in (a), assuming that the phase is a linear function of frequency. Since the spectrum is sampled at Δf = 4.74 × 10¹² Hz, the waveform is repeated with a period of 211 fs; only one period of the wave packet is shown. (c) Close-up of the wave packet. [Axes: (a) A(f) versus frequency (10¹⁴ Hz); (b), (c) a(t) versus time (fs).]

T = 1/Δf = 211 femtoseconds. A plot of a(z = 0, t) over a full period T is shown in Figure 6.1(b), and a close-up of the wave packet appears in Figure 6.1(c). (This is reminiscent of the pulse train emerging from a mode-locked laser.) The width of the packet in Figure 6.1(b) is about 20 fs, which is of the same order of magnitude as the inverse of the spectral width (29Δf = 1.37 × 10¹⁴ Hz). To increase the period T without changing the overall shape of the wave packet one must increase the rate of sampling of the spectrum of Figure 6.1(a), by selecting additional


frequencies in between those that are already chosen. In this way, both the spectrum and the wave packet retain their shapes but Δf becomes smaller while T becomes larger. In the limit Δf → 0 the separation T between adjacent wave packets approaches infinity. Where the first-order coherence of a given waveform is concerned, the phase distribution over its spectral range is irrelevant, even though the shape of the waveform as a function of time is significantly affected by this phase distribution. For example, in Figure 6.1(b) the phase φ_n is assumed to be a linear function of frequency, whereas if φ_n is picked randomly for each f_n then an extended function such as that in Figure 6.2 is obtained. (The latter might, for example, be the output of a multi-longitudinal-mode laser.) There are many possible choices for {φ_n} and each choice yields a more or less extended function of time. Only on rare occasions do we find a compact wave packet similar to that in Figure 6.1(b). However, all functions obtained by different choices of {φ_n} are identical in their first-order coherence attributes. In other words, the compact packet of Figure 6.1(b) has the same degree of first-order coherence as the extended waveform of Figure 6.2. The time-averaged intensity of the waveform at an arbitrary point z = z_0 is readily computed from Eq. (6.2) as follows:

$$\langle I(z = z_0)\rangle = \frac{1}{T}\int_0^T a^2(z = z_0, t)\,dt = \frac{1}{2}\sum_n A_n^2\,\Delta f. \tag{6.3}$$

Note that the right-hand side of Eq. (6.3), being the area under the square of the spectral distribution of Figure 6.1(a), remains constant as the sampling rate increases. Thus reducing Δf in order to increase the period T does not affect the average intensity of the waveform.
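As a numerical illustration of Eqs. (6.2) and (6.3), the following sketch synthesizes the waveform at z = 0 for a linear and for a random choice of spectral phase and compares the time-averaged intensity of each with ½ΣA_n²Δf. The frequency spacing and the range of n are those given in the text; the Gaussian width `SIGMA`, the packet centre `T0_FS`, the random seed, and the choice of THz and fs as working units are assumptions of this sketch, not values taken from the chapter.

```python
import math
import random

DF_THZ = 4.74                 # frequency spacing from the text, in THz
N0 = 100                      # f_0 = N0 * DF_THZ
N_VALUES = range(-15, 15)     # n = -15 ... +14 (30 components)
SIGMA = 6.0                   # ASSUMED Gaussian width, in units of DF_THZ
PERIOD_FS = 1000.0 / DF_THZ   # T = 1/DF, about 211 fs

AMPS = [math.exp(-(n / SIGMA) ** 2) for n in N_VALUES]


def waveform(t_fs, phases):
    """a(z = 0, t) of Eq. (6.2); t in femtoseconds, phases in radians."""
    total = 0.0
    for n, amp, phi in zip(N_VALUES, AMPS, phases):
        f_thz = (N0 + n) * DF_THZ
        # THz times fs gives 1e-3 cycles
        total += amp * math.sqrt(DF_THZ) * math.cos(-2 * math.pi * f_thz * 1e-3 * t_fs + phi)
    return total


def mean_intensity(phases, samples=1024):
    """Time average of a^2 over one period, by uniform sampling."""
    values = (waveform(k * PERIOD_FS / samples, phases) ** 2 for k in range(samples))
    return sum(values) / samples


T0_FS = 20.0  # ASSUMED packet centre for the linear-phase case
linear_phases = [2 * math.pi * (N0 + n) * DF_THZ * 1e-3 * T0_FS for n in N_VALUES]
rng = random.Random(0)
random_phases = [rng.uniform(0.0, 2 * math.pi) for _ in N_VALUES]

expected = 0.5 * sum(a * a for a in AMPS) * DF_THZ   # right-hand side of Eq. (6.3)
print("linear phase:", mean_intensity(linear_phases))
print("random phase:", mean_intensity(random_phases))
print("Eq. (6.3):   ", expected)
```

Both phase choices give the same average intensity to within rounding, even though one waveform is a compact packet and the other is spread over the whole period.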

Figure 6.2 Waveform obtained by Fourier-transforming the frequency spectrum of Figure 6.1(a) after assigning it a randomly selected phase at each frequency. [Axes: a(t) versus time (fs).]


The Mach–Zehnder interferometer

Temporal coherence is usually measured with a Michelson interferometer. For our present purposes, however, we will consider a slightly modified version of the Mach–Zehnder interferometer, shown in Figure 6.3. The collimated beam of light entering the device is split equally between its two arms at the first beam splitter (BS). The two beams are reflected by the mirrors at the end of each arm, then recombined at the second BS. If, on the one hand, the two beams happen to be perfectly in phase when they arrive at the second BS, they interfere constructively in channel 1 (see Figure 6.3) and deliver their combined total optical energy to photodetector 1; photodetector 2 in this case receives no light at all. If, on the other hand, the two beams are relatively phase-shifted by Δφ = 180°, they appear collectively at detector 2, leaving detector 1 in the dark. For intermediate values of Δφ the energy of the beams is split between the two detectors, the splitting ratio being 50/50 when Δφ = ±90°. Now suppose the relative phase between the two beams can be varied continuously by adjusting the length of one of the interferometer’s arms. Then S_1, the output of detector 1, reaches its maximum when the two arms become identical in length. As the movable reflector is then extended by a quarter of a wavelength (adding half a wavelength to the round-trip path), Δφ becomes 180° and S_1 reaches a minimum. As long as the two beams remain coherent (or partially coherent) this behavior is periodically repeated, the output of each detector oscillating between a maximum and a minimum. Once the arm lengths differ by more than the coherence length of the beam, the oscillations die down and both channels receive equal amounts of light, irrespective of the path-length difference between the arms. For the wave packet of Figure 6.1, we show in Figure 6.4 the computed output of detector 1 as a function of Δz = cτ/2, where τ is the time delay between the two arms of the interferometer.
The time-averaged detector outputs may be written

$$\begin{aligned}
S_{1,2}(\tau) &= \frac{1}{T}\int_0^T \left\{\tfrac{1}{2}\left[a(t) \pm a(t-\tau)\right]\right\}^2 dt \\
&= \frac{1}{2T}\int_0^T a^2(t)\,dt \;\pm\; \frac{1}{2T}\int_0^T a(t)\,a(t-\tau)\,dt \\
&= \frac{1}{2}\left[\langle I\rangle \pm \frac{1}{2}\sum_n A_n^2\,\Delta f \cos(2\pi f_n \tau)\right].
\end{aligned} \tag{6.4}$$

The first term on the right-hand side of this equation is a constant, independent of τ, while the second term is the autocorrelation function of the waveform a(t) and coincides with the first-order field coherence function in the case of a stationary process. The Fourier series coefficients of this autocorrelation

Figure 6.3 The Mach–Zehnder interferometer is used in analyzing the temporal coherence of a collimated beam of light. The incoming beam is split equally between the two arms of the device at the first BS. The two arms are identical except that the end-reflector is fixed in one arm and movable in the other. After traveling along these separate arms the beams are recombined at the second BS. When the optical path lengths of the two arms are identical, the beams interfere constructively in channel 1 and deliver their entire energy to photodetector 1. Deviations from path-length equality can send to channel 2 either the entire beam or a fraction of it. The movable reflector is used to adjust the optical path-length difference between the arms. [Diagram labels: incident beam, beam-splitter 1, mirrors, fixed reflector, movable reflector (extension Δz), beam-splitter 2, photodetector 1, photodetector 2.]

Figure 6.4 The signal S_1 of detector 1 as a function of the extension Δz of the movable end-reflector of the interferometer. The assumed incoming beam is the packet shown in Figure 6.1. [Axes: S_1 (0 to 1) versus Δz (μm), from −3 to 3.]

function are {A_n²} and are independent of {φ_n}. It is thus clear that the signals S_1(τ) and S_2(τ), and hence the first-order temporal coherence of the waveform, depend only on the magnitude (and not the phase) of the spectral distribution, as was asserted earlier.
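The phase independence of Eq. (6.4) can be verified numerically. The sketch below evaluates S_1 in two ways: from the closed form on the last line of Eq. (6.4), which involves no phases at all, and by directly time-averaging {½[a(t) + a(t − τ)]}² for a waveform with randomly chosen phases. As before, the Gaussian width `SIGMA`, the random seed, the test delay, and the THz/fs units are assumptions of this sketch.

```python
import math
import random

DF_THZ = 4.74                 # frequency spacing from the text, in THz
N0 = 100
N_VALUES = range(-15, 15)     # n = -15 ... +14
SIGMA = 6.0                   # ASSUMED Gaussian width, in units of DF_THZ
PERIOD_FS = 1000.0 / DF_THZ   # T = 1/DF

AMPS = [math.exp(-(n / SIGMA) ** 2) for n in N_VALUES]
MEAN_I = 0.5 * sum(a * a for a in AMPS) * DF_THZ     # Eq. (6.3)


def detector_outputs(tau_fs):
    """(S1, S2) from the closed form of Eq. (6.4); no phases are needed."""
    corr = 0.5 * sum(
        a * a * DF_THZ * math.cos(2 * math.pi * (N0 + n) * DF_THZ * 1e-3 * tau_fs)
        for n, a in zip(N_VALUES, AMPS)
    )
    return 0.5 * (MEAN_I + corr), 0.5 * (MEAN_I - corr)


def waveform(t_fs, phases):
    """a(z = 0, t) of Eq. (6.2); t in femtoseconds."""
    return sum(
        a * math.sqrt(DF_THZ) * math.cos(-2 * math.pi * (N0 + n) * DF_THZ * 1e-3 * t_fs + p)
        for n, a, p in zip(N_VALUES, AMPS, phases)
    )


def s1_by_time_average(tau_fs, phases, samples=1024):
    """First line of Eq. (6.4) with the + sign, averaged over one period."""
    total = 0.0
    for k in range(samples):
        t = k * PERIOD_FS / samples
        total += (0.5 * (waveform(t, phases) + waveform(t - tau_fs, phases))) ** 2
    return total / samples


rng = random.Random(1)
random_phases = [rng.uniform(0.0, 2 * math.pi) for _ in N_VALUES]
tau = 1.3  # an arbitrary delay in fs, chosen for illustration
s1_formula, s2_formula = detector_outputs(tau)
s1_direct = s1_by_time_average(tau, random_phases)
print("S1 from Eq. (6.4):     ", s1_formula)
print("S1 from time averaging:", s1_direct)
```

The two numbers agree to rounding error for any choice of phases, and S_1 + S_2 always equals the average intensity ⟨I⟩.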


Coherence length

Figure 6.5 shows the waveforms arriving in channels 1 and 2 when the wave packet of Figure 6.1(b) is sent through the interferometer, with its movable arm extended by Δz = cT/8 = 7.91 μm. The time delay between the packets traveling in the two arms is therefore τ = T/4. Since this delay is longer than the duration of each packet, the two packets upon arriving at the second BS do not overlap and, therefore, appear separately in both channels. Obviously no interference takes

Figure 6.5 Waveforms arriving at (a) channel 1 and (b) channel 2 of the Mach–Zehnder interferometer. The assumed incoming beam is the packet of Figure 6.1, and the movable arm of the interferometer has been extended by Δz = cT/8 = 7.91 μm. Because the delay is longer than the width of the packet no interference takes place. The two packets act independently and appear in both channels, albeit at half the original magnitude of the incoming wave. Note that the first packet in channel 2, having been transmitted through both beam-splitters, is flipped relative to the second packet, which has been reflected at both beam-splitters. In contrast, each packet arriving in channel 1 has been reflected at one and transmitted at the other beam-splitter. As a result, there is no relative phase shift between the two packets in channel 1. [Axes: a(t) versus time (fs), 0 to 200.]


place in this case and each channel receives an equal share from each packet, each with one-half of the original amplitude. In the above example, where the delay τ between the two arms of the interferometer is T/4, one can divide the frequency content of the wave packet into four categories. The first category consists of the frequencies f = 85Δf, 89Δf, 93Δf, . . . , 113Δf. All these terms are phase-shifted by 90° and, when combined at the second BS, are equally split between channels 1 and 2. The output of channel 1 for these frequency components is shown in Figure 6.6(a). The second category consists of the frequencies f = 86Δf, 90Δf, 94Δf, . . . , 114Δf, which are phase-shifted by 180° and, therefore, appear exclusively in channel 2. The third category, consisting of the frequencies f = 87Δf, 91Δf, 95Δf, . . . , 111Δf, is phase-shifted by −90° (equivalently, 270°) and is, once again, equally split between the two channels; the output of channel 1 for these components is shown in Figure 6.6(b). The fourth and last category consists of frequencies f = 88Δf, 92Δf, 96Δf, . . . , 112Δf, which are not phase-shifted at all and appear in their entirety in channel 1; these are shown in Figure 6.6(c). Now if the three sets of signals in Figure 6.6 are added together the twin packet of Figure 6.5(a) will be obtained. It is clear that the behavior of individual frequency components (or groups of such components that acquire the same phase shift) is independent of all the other components; this is simply a statement of the principle of superposition for the linear system under consideration. Furthermore, the fraction of each component appearing in a given channel is only a function of the phase delay acquired by that component between arms 1 and 2, independent of the original phase of that component.
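The four categories can be reproduced by simple bookkeeping. For a delay τ = T/4 the harmonic f = mΔf acquires a phase of 360° × m/4, and, by Eq. (6.4) applied to a single component, the fraction of its power reaching detector 1 is cos²(Δφ/2). The short sketch below groups the harmonics m = 85 to 114 accordingly; the function names are illustrative only.

```python
import math

HARMONICS = range(85, 115)    # f = m * Δf for m = 85 ... 114 (n = -15 ... +14)


def phase_shift_deg(m):
    """Phase delay of harmonic m for tau = T/4, reduced to [0, 360) degrees."""
    return (90 * m) % 360     # 360 * m * (1/4), kept in integer arithmetic


def channel1_fraction(m):
    """Fraction of the harmonic's power reaching detector 1: cos^2(phase/2)."""
    return math.cos(math.radians(phase_shift_deg(m)) / 2) ** 2


groups = {}
for m in HARMONICS:
    groups.setdefault(phase_shift_deg(m), []).append(m)

for shift in (90, 180, 270, 0):
    members = groups[shift]
    print(f"{shift:3d} deg: m = {members[0]}, {members[1]}, ..., {members[-1]};"
          f" fraction in channel 1 = {channel1_fraction(members[0]):.2f}")
```

The groups come out as listed in the text: the 90° and 270° groups are split evenly, the 180° group goes entirely to channel 2, and the unshifted group goes entirely to channel 1.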
Remembering that the various frequency terms are orthogonal to each other, the behavior of the overall waveform within the interferometer must be independent of the initial phase of its individual components. Thus we see that the analysis of the packet of Figure 6.1(b) applies equally to the extended waveform of Figure 6.2. These different-looking functions share the same spectrum but have differing phase distributions over their common range of frequencies. In particular, the coherence length is equal to the width of the wave packet obtained by setting all φ_n equal to zero. The width of the packet, of course, is roughly equal to the inverse of its spectral bandwidth. In addition to the phase angles φ_n initially present in, and those acquired during propagation of, a given wave packet, the field may accumulate further phase shifts due to dispersive elements (such as mirrors and prisms) in its path. These phase shifts manifest themselves as delays or distortions of the packet. It is of some interest, therefore, to study reflection and transmission delays caused by dispersive elements in order to evaluate their impact on interferometric measurements.
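The statement that the packet width is roughly the inverse of the spectral bandwidth leads to the familiar order-of-magnitude estimate for the coherence length, c/Δf, or equivalently λ_0²/Δλ for a narrow line. The sketch below applies this estimate to three of the sources quoted at the start of the chapter. It is a rule of thumb only: the exact numerical factor depends on the line shape and on how the widths are defined, which is an assumption this sketch does not try to resolve.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def coherence_length_from_bandwidth(delta_f_hz):
    """Order-of-magnitude coherence length c / Δf, in metres."""
    return C / delta_f_hz


def coherence_length_from_linewidth(lambda0_m, delta_lambda_m):
    """Same estimate written in wavelength terms, λ0² / Δλ (narrow line assumed)."""
    return lambda0_m ** 2 / delta_lambda_m


cadmium = coherence_length_from_linewidth(643.8e-9, 0.0013e-9)
hene_multimode = coherence_length_from_bandwidth(1e9)      # Δf of about 1 GHz
hene_single_mode = coherence_length_from_bandwidth(100e3)  # Δf of about 100 kHz

print(f"cadmium red line:      {cadmium * 100:.0f} cm")
print(f"multimode HeNe:        {hene_multimode * 100:.0f} cm")
print(f"single-mode HeNe:      {hene_single_mode / 1000:.1f} km")
```

The results, roughly 30 cm for both the cadmium line and the multimode HeNe laser and about 3 km for the stabilized laser, agree with the values quoted in the text.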

Figure 6.6 The spectrum of the wave packet in Figure 6.1(a) can be considered as the superposition of four groups of frequencies. One of these groups appears exclusively in channel 2. The other three groups appear in channel 1 either fully or partially. The waveforms shown here are those that would have appeared in channel 1 had the other groups been absent. When these three waveforms are added together they reconstruct the pair of wave packets shown in Figure 6.5(a). [Axes: a(t) versus time (fs), 0 to 200, for panels (a), (b), (c).]

Delay upon reflection

As an example consider a 12-layer dielectric stack, Figure 6.7, consisting of alternating layers of quartz and strontium titanate. At the central wavelength of λ_0 = 632.8 nm the refractive indices of these materials are 1.46 and 2.39,

Figure 6.7 Schematic diagram of a quarter-wave stack consisting of six pairs of SiO2/SrTiO3 layers; the entire stack is 1044 nm thick. To simplify the analysis of the transmitted beam, the central region of the substrate is assumed to have been etched away. In calculating the reflection and transmission coefficients of the stack the wavelength dependence of the refractive indices of both types of layer has been taken into consideration. [Diagram labels: incident beam, quartz (108 nm), strontium titanate (66 nm), substrate.]

respectively.6 (The indices vary somewhat within the wavelength range of interest, and the corresponding dispersion is taken into account in the following calculations.) The thickness of the quartz layer is 108 nm and that of SrTiO3 is 66 nm, each being a quarter-wave thick at λ_0. The stack is grown on a substrate whose central region has been subsequently removed. The hole thus created in the substrate is of no consequence for our analysis of reflection, but it simplifies the discussion in the following section concerning transmission through the stack. Figure 6.8 shows computed plots of amplitude and phase for the reflection and transmission coefficients of the stack in the frequency range covered by the wave packet of Figure 6.1. Note that, within the bandwidth of interest, the phase φ_r of the reflection coefficient is essentially a linear function of frequency with a slope of 1.5° per THz. This slope represents a 4.2 fs delay for the packet upon reflection from the stack. It might therefore be argued that, upon arrival at the surface, the packet spends 4.2 fs in exploring the stack before bouncing back. Roughly speaking, the delay may be associated with a penetration depth of 625 nm for this 1044 nm thick stack. (For an aluminum mirror the corresponding slope is found to be 0.03° per THz, leading to a reflection delay of 0.083 fs and an estimated penetration depth of only 12.5 nm.)

Delay upon transmission

For the wave packet transmitted through the stack of Figure 6.7 the slope of the phase φ_t(f) in Figure 6.8(b) is about 0.95° per THz, which amounts to a delay of Δt = 2.6 fs. Note, however, that the total thickness of the stack is 1044 nm,

Figure 6.8 Computed amplitude and phase of the reflection and transmission coefficients r and t of the multilayer stack of Figure 6.7. The depicted range of frequencies covers the entire bandwidth of the wave packet shown in Figure 6.1. [Axes: (a) amplitudes |r| and |t|, (b) phases φ_r and φ_t in degrees, both versus frequency (10¹⁴ Hz) from 3.75 to 5.75.]

requiring 3.5 fs for the light to cover this distance at its vacuum speed c. It appears therefore that in passing through the stack the packet has exceeded the speed of light.7,8,9,10 Since the special theory of relativity appears to have been violated, we take a closer look at the transmitted beam. Note in Figure 6.8(a) that the transmitted amplitude |t| is not constant over the range of frequencies of the wave packet but rises at both ends. This means that the actual transmitted spectrum is somewhat broadened (see Figure 6.9(a)). Taking into account the actual amplitude and phase of the transmission coefficient, we find the transmitted packet to be that of Figure 6.9(b). The peak of this packet is in fact delayed by about 2.6 fs, implying its faster-than-light

Figure 6.9 The wave packet transmitted through the stack of Figure 6.7 has a broadened spectrum A(f), as shown in (a) versus frequency in units of 10^14 Hz. This spectral broadening, together with the linear phase shift $\phi_t(f)$ depicted in Figure 6.8(b), results in the compressed and delayed packet a(t) shown in (b) versus time in fs. Had the spectral broadening been ignored and only the phase shift $\phi_t(f)$ taken into account, the transmitted packet would have resembled that in (c).

propagation, but the entire packet is also compressed, which means that its starting point is about 5 fs behind that of the incoming packet (compare Figure 6.9(b) with Figure 6.1(c)). This delay of the starting point ensures that special relativity is not violated. Had we ignored the broadening of the spectrum and only included the phase shift $\phi_t(f)$ in our transmission calculations, we would have obtained the packet of Figure 6.9(c), which is only delayed relative to the

Figure 6.10 The signal $S_1$ of detector 1 versus the extension $\Delta z$ (in μm) of the movable end-reflector of the interferometer. The stack of Figure 6.7 is installed in the fixed arm while the adjustable arm is extended to compensate for the transmission delay through the stack. The incoming beam is assumed to be the wave packet of Figure 6.1(b).

incoming packet by 2.6 fs, in obvious violation of special relativity. Spectral broadening caused by the transmission curve of the stack thus results in a compression that ultimately delays the emergence of the packet, and in so doing reaffirms the impossibility of communication beyond the speed of light. It is interesting to note that a measurement of the transmission delay by the interferometer of Figure 6.3 also leads to an apparent violation of special relativity. Such a measurement ostensibly determines the delay by measuring the peak of $S_1$ (the output of detector 1) when the multilayer stack is inserted in the fixed arm of the device and the movable arm is extended to maximize $S_1$. The corresponding signal (see Figure 6.10) is obtained by cross-correlating the wave packets of Figures 6.1(c) and 6.9(b). The peak of this curve occurs at $\Delta z = 0.4$ μm, which is in agreement with the 2.6 fs delay calculated earlier. One must bear in mind, of course, that the interferometer measures the average delay of the packet upon transmission through the stack, and not the delay of its starting point.
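The delays quoted in this section follow from the phase slopes by simple arithmetic, which the short Python sketch below retraces. It is only an illustration: the function names are hypothetical, the slopes are the rounded values read from the text, and the penetration depth is taken to be half the distance covered at speed c during the delay (a round-trip reading that is an assumption, although it reproduces the quoted 625 nm and 12.5 nm).

```python
C_NM_PER_FS = 299.792458  # vacuum speed of light, in nm per fs


def delay_fs(slope_deg_per_thz):
    """Delay implied by a linear phase slope given in degrees per THz."""
    # 360 degrees per THz is one full cycle per THz, i.e. a delay of 1 ps = 1000 fs.
    return slope_deg_per_thz / 360.0 * 1000.0


def penetration_depth_nm(delay):
    """Depth reached if the delay is spent on a round trip at speed c (assumed)."""
    return 0.5 * C_NM_PER_FS * delay


def vacuum_transit_fs(thickness_nm):
    """Time needed to cross the given thickness at the vacuum speed of light."""
    return thickness_nm / C_NM_PER_FS


reflection_delay = delay_fs(1.5)     # multilayer stack, reflection
transmission_delay = delay_fs(0.95)  # multilayer stack, transmission
aluminum_delay = delay_fs(0.03)      # aluminum mirror, reflection

print(f"reflection delay        : {reflection_delay:.2f} fs")
print(f"penetration depth       : {penetration_depth_nm(reflection_delay):.0f} nm")
print(f"aluminum delay          : {aluminum_delay:.3f} fs")
print(f"aluminum depth          : {penetration_depth_nm(aluminum_delay):.1f} nm")
print(f"transmission delay      : {transmission_delay:.2f} fs")
print(f"vacuum transit, 1044 nm : {vacuum_transit_fs(1044.0):.2f} fs")
```

The last two printed values make the apparent paradox explicit: the delay of the peak upon transmission is shorter than the vacuum transit time through the same thickness.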

References for Chapter 6

1 M. Born and E. Wolf, Principles of Optics, sixth edition, Pergamon Press, Oxford, 1980.
2 L. Mandel and E. Wolf, Optical Coherence and Quantum Optics, Cambridge University Press, UK, 1995.
3 M. V. Klein, Optics, Wiley, New York, 1970.
4 R. Loudon, The Quantum Theory of Light, second edition, Clarendon Press, Oxford, 1992.
5 P. Meystre and M. Sargent III, Elements of Quantum Optics, Springer-Verlag, Berlin, 1990.
6 The refractive indices of strontium titanate in the wavelength range (500 nm, 800 nm) are taken from W. J. Tropf, M. E. Thomas, and T. J. Harris, Handbook of Optics, Vol. II, Michael Bass, editor, chapter 33, p. 33.72. Those of quartz and aluminum are taken from the Handbook of Chemistry and Physics, 67th edition, R. C. Weast, editor.
7 C. G. B. Garrett and D. E. McCumber, Phys. Rev. A 1, 305, 1970.
8 S. Chu and S. Wong, Phys. Rev. Lett. 49, 1293, 1982.
9 R. Y. Chiao, P. G. Kwiat, and A. M. Steinberg, Faster than light? Scientific American, 52–60, August 1993.
10 R. Y. Chiao and A. M. Steinberg, Tunneling times and superluminality, in Progress in Optics, ed. by E. Wolf, Vol. 37, 347–406, Elsevier, Amsterdam, 1997.

7 The van Cittert–Zernike theorem

The beam of light emanating from a quasi-monochromatic point source (or a sufficiently distant extended source) is said to be spatially coherent: the reason is that, at any two points on a given cross-section of the beam, the oscillating electromagnetic fields maintain their relative phase at all times. If an opaque screen with two pinholes is placed at such a cross-section, Young’s interference fringes will form, and the observed fringe contrast will be 100% (at and around the center of the fringe pattern). This is the sense in which the fields at two points are said to be spatially coherent relative to each other. If the relative phase of the fields at the two points varies randomly with time, the pair of point sources will fail to produce Young’s fringes and, therefore, the fields are considered to be incoherent. In practice there is a continuum of possibilities between the aforementioned extremes, and the resulting fringe contrast may fall anywhere between zero and 100%. The fields at the two points are then said to be partially coherent with respect to one another, and the properly defined fringe contrast in Young’s experiment is used as the measure of their degree of coherence. Optical systems involving partially coherent illumination are explored in several other chapters of this book; see, for example, “Coherent and incoherent imaging” (Chapter 5), “Michelson’s stellar interferometer” (Chapter 35), “Zernike’s method of phase contrast” (Chapter 38), and “Polarization microscopy” (Chapter 39). Chapter 6 described first-order temporal coherence using a simple analytical method. A similar approach will be employed here to study first-order spatial coherence. Coherence theory has been treated extensively in modern and classical textbooks, and it is not our intention here to repeat what is already well known.1,2,3,4 Our goal is to present a simple derivation of the van Cittert–Zernike theorem without invoking the theories of probability and stochastic processes. 
This is possible because most sources of practical interest are ergodic, meaning that time-averaging over a typical waveform emanated by the source yields statistical information about the source’s inherently random radiation processes. We shall make exclusive use of time-averaging to derive the degree of coherence of a pair of points within the field of an extended, quasi-monochromatic, incoherent source.

Time dependence, frequency spectrum, and phase

Consider a point source P, radiating into free space with a range of temporal frequencies at and around $f = f_0$. The discrete frequencies $f_n$ are assumed to have a fixed spacing $\Delta f$ as follows:

$$f_n = f_0 + n\,\Delta f = (N_0 + n)\,\Delta f. \tag{7.1}$$

At a given point in space, the (scalar) amplitude of the radiated waveform may be written

$$a(t) = \sum_n A_n (\Delta f)^{1/2} \cos(2\pi f_n t - \phi_n), \tag{7.2}$$

where $A_n$ and $\phi_n$ are the amplitude and phase of the component whose frequency is $f_n$. The significance of the constant multiplier $(\Delta f)^{1/2}$, which is there for normalization purposes only, will become clear shortly. We set the central frequency $f_0 = 5.454 \times 10^{14}$ Hz (corresponding to yellow light of wavelength $\lambda = 550$ nm) and choose $\Delta f = 5.454 \times 10^{12}$ Hz, which leads to $N_0 = 100$. We adopt a Gaussian shape for the distribution of the amplitudes $A_n$, as shown in Figure 7.1(a), and let the value of n in Eq. (7.2) range from −4 to +4, for a total of nine discrete wavelengths in the spectrum. To a large extent these choices are arbitrary but, as before, the points that we seek to clarify by way of examples based on these choices are quite general in nature. Since in the Fourier domain the spectrum in Figure 7.1(a) is a discrete function of frequency, the corresponding amplitude a(t) must be periodic, with a period of $T = 1/\Delta f \approx 183$ fs. A plot of a(t) over a full period T is shown in Figure 7.1(b), where the values of $\phi_n$ at each frequency are chosen randomly and independently of each other. To increase the period T without changing the overall shape of the spectrum one must increase the rate of spectral sampling in Figure 7.1(a), by selecting additional frequencies in between those that are already chosen. In this way the spectrum retains its shape, but $\Delta f$ becomes smaller while T becomes larger. In the limit $\Delta f \to 0$ the period T of the waveform approaches infinity. As far as the first-order coherence of a given waveform is concerned, the specific phase distribution over its spectral range is irrelevant, even though the shape of the function a(t) may be significantly affected by this phase distribution. For example, in Figure 7.1(b) the value of $\phi_n$ at each frequency is chosen randomly, whereas if $\phi_n$ were chosen as a linear function of frequency then a waveform such as that of

Figure 7.1 (a) A truncated Gaussian function sampled at regular intervals represents the frequency spectrum of a waveform (amplitude versus frequency in units of 10^14 Hz). (b) The waveform (amplitude versus time in fs) obtained by Fourier-transforming the spectrum in (a), after assigning it a randomly selected phase at each frequency. Since the spectrum is sampled at $\Delta f = 5.454 \times 10^{12}$ Hz, its Fourier transform is repeated with a period of 183 fs. Only one period of the waveform is shown. (c) The waveform as a function of time derived by Fourier-transforming the spectrum in (a), assuming that its phase is a linear function of frequency.

Figure 7.1(c) would have been obtained. There are many possible choices for $\{\phi_n\}$, and each one yields a more or less extended function of time such as that of Figure 7.1(b). Only on rare occasions does one find a compact wave packet similar to that of Figure 7.1(c). However, the compact packet has the same first-order coherence properties as the extended waveform.
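The construction of Eq. (7.2) can be tried out numerically. The Python sketch below synthesizes the nine-line waveform with random phases and with a linear phase, then checks the period $T = 1/\Delta f$. It is a minimal illustration under stated assumptions: time is measured in fs and frequency in fs^-1, the Gaussian width `SIGMA` is a guess (the text does not state it), and all names are hypothetical.

```python
import math
import random

DF = 5.454e-3          # spectral sampling interval in fs^-1 (5.454e12 Hz)
N0 = 100               # f_n = (N0 + n) * DF, so that f_0 = 0.5454 fs^-1
ORDERS = range(-4, 5)  # n = -4, ..., +4 (nine spectral lines)
SIGMA = 2.0            # ASSUMED Gaussian width, in units of DF
T = 1.0 / DF           # period of the waveform, about 183 fs

# Gaussian amplitudes A_n
AMPS = {n: math.exp(-0.5 * (n / SIGMA) ** 2) for n in ORDERS}


def waveform(t, phases):
    """a(t) of Eq. (7.2) for a phase assignment {n: phi_n}; t in fs."""
    return sum(
        AMPS[n] * math.sqrt(DF)
        * math.cos(2 * math.pi * (N0 + n) * DF * t - phases[n])
        for n in ORDERS
    )


rng = random.Random(0)
random_phases = {n: rng.uniform(0.0, 2 * math.pi) for n in ORDERS}
# Linear phase phi_n = 2*pi*f_n*t0 lines up all components at t0 = T/2.
linear_phases = {n: 2 * math.pi * (N0 + n) * DF * (T / 2) for n in ORDERS}

times = [k * T / 2048 for k in range(2048)]
peak_random = max(abs(waveform(t, random_phases)) for t in times)
peak_linear = max(abs(waveform(t, linear_phases)) for t in times)

print(f"period T          = {T:.1f} fs")
print(f"peak, random phase = {peak_random:.4f}")
print(f"peak, linear phase = {peak_linear:.4f}")
```

With a linear phase all nine components add in step at one instant, so the peak reaches the largest value the amplitudes allow, which is the compact packet described in the text; a random assignment spreads the same energy over the whole period.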

Intensity

The average intensity of the waveform a(t) in Eq. (7.2) is readily computed as follows:

$$\langle I \rangle = \frac{1}{T}\int_0^T a^2(t)\,dt = \frac{1}{2}\sum_n A_n^2\,\Delta f. \tag{7.3}$$

Note that the right-hand side of Eq. (7.3), being the area under the square of the spectral distribution function of Figure 7.1(a), remains constant as the sampling rate increases. Thus reducing $\Delta f$ in order to increase the period T does not affect the average intensity. Although the average intensity does not depend on $\{\phi_n\}$, the fluctuations in intensity are most definitely affected by this phase distribution. A thermal source (such as an incandescent lamp) tends to “assign” the values of $\phi_n$ randomly and independently of each other, thus resulting in significant fluctuations in I(t). This behavior may be observed by examining the typical waveform in Figure 7.1(b). In fact, it can be shown that, for a thermal source, $\langle [I(t) - \langle I \rangle]^2 \rangle = \langle I \rangle^2$. However, it is possible to assign the phase angles in such a way as to minimize the intensity fluctuations. In a well-stabilized single-mode laser, for instance, the locking of the phase angles renders the root-mean-square fluctuations of intensity negligible, that is, $\langle I^2(t) \rangle = \langle I \rangle^2$. These considerations, however, pertain to higher-order statistics and, as far as first-order coherence is concerned, one could as well ignore the specific phase distribution.
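Equation (7.3) and the two claims attached to it (independence of the phases, and near-constancy under finer spectral sampling) can be checked with a few lines of Python. This is a sketch under assumptions: units are fs and fs^-1, the Gaussian width `SIGMA` is a guess, the spectrum is truncated at four original sampling intervals on either side of $f_0$, and the helper names are hypothetical.

```python
import math
import random

DF0 = 5.454e-3    # original sampling interval, in fs^-1
F0 = 100 * DF0    # central frequency, in fs^-1
SIGMA = 2 * DF0   # ASSUMED Gaussian width of the amplitude spectrum


def spectrum(refine):
    """Sample the truncated Gaussian at interval DF0 / refine.

    Returns (df, [(f_n, A_n), ...])."""
    df = DF0 / refine
    m = 4 * refine
    lines = [(F0 + k * df, math.exp(-0.5 * (k * df / SIGMA) ** 2))
             for k in range(-m, m + 1)]
    return df, lines


def mean_intensity_formula(df, lines):
    """Right-hand side of Eq. (7.3)."""
    return 0.5 * df * sum(a * a for _, a in lines)


def mean_intensity_numeric(df, lines, phases, samples=4096):
    """Time average of a(t)^2 over one period T = 1/df."""
    period = 1.0 / df
    total = 0.0
    for j in range(samples):
        t = j * period / samples
        a_t = sum(a * math.sqrt(df) * math.cos(2 * math.pi * f * t - p)
                  for (f, a), p in zip(lines, phases))
        total += a_t * a_t
    return total / samples


df, lines = spectrum(1)
rng = random.Random(0)
phases_a = [rng.uniform(0.0, 2 * math.pi) for _ in lines]
phases_b = [rng.uniform(0.0, 2 * math.pi) for _ in lines]

formula_value = mean_intensity_formula(df, lines)
numeric_a = mean_intensity_numeric(df, lines, phases_a)
numeric_b = mean_intensity_numeric(df, lines, phases_b)
# Finer sampling of the same spectral shape (formula only).
refined_values = {r: mean_intensity_formula(*spectrum(r)) for r in (1, 2, 4, 8)}

print(formula_value, numeric_a, numeric_b)
print(refined_values)
```

The two numerical averages agree with the formula for unrelated phase sets, and the refined sums differ only by the small discretization error of the truncated Gaussian.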

The cross-correlation function

Consider two points P and P′ on an extended source. The oscillations at these points are independent of each other. Thus the sets of values $\{\phi_n\}$ and $\{\phi'_n\}$ assigned to $a_P(t)$ and $a_{P'}(t)$ may be considered independent. The cross-correlation function between the amplitudes at P and P′ is given by

$$C_{PP'}(\tau) = \frac{1}{T}\int_0^T a_P(t)\,a_{P'}(t - \tau)\,dt = \frac{1}{2}\sum_n A_n^2\,\Delta f\,\cos(2\pi f_n \tau + \phi_n - \phi'_n). \tag{7.4}$$

Since $\phi_n$ and $\phi'_n$ are randomly selected with a uniform distribution over $[0, 2\pi]$, it follows that their difference $\phi_n - \phi'_n$ is also a random variable with the same distribution. The function $C_{PP'}(\tau)$ thus resembles a(t) of Eq. (7.2) (with random phase angles), depicted in Figure 7.1(b). There is, however, a major difference between these functions: whereas $\Delta f$ in Eq. (7.2) appears with a power of 1/2, the corresponding power of $\Delta f$ in Eq. (7.4) is unity. This means that the average of $C_{PP'}^2(\tau)$ is inversely proportional to T, namely,

$$\frac{1}{T}\int_0^T C_{PP'}^2(\tau)\,d\tau = \frac{1}{8T}\sum_n A_n^4\,\Delta f. \tag{7.5}$$

Thus when $T \to \infty$ the magnitude of $C_{PP'}(\tau)$ for essentially all $\tau$ goes to zero, whereas in the same limit the average intensity $\langle I \rangle$ given by Eq. (7.3) remains non-zero. If the fields from P and P′ are brought together in an attempt to create interference fringes, their combined intensity will be the sum of their individual intensities plus the cross-correlation term $C_{PP'}(\tau)$. Since $C_{PP'}(\tau) \to 0$ for sufficiently long T, the intensity of the sum will be the sum of individual intensities and, therefore, no fringes will be observed.
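The scaling expressed by Eq. (7.5) can be made concrete with a small Python sketch: the mean square of the cross-correlation is averaged numerically over one period and compared with the right-hand side of Eq. (7.5), and the spectrum is then sampled twice as finely (doubling T). The units (fs, fs^-1), the Gaussian width `SIGMA`, and the helper names are assumptions made for illustration; the phase differences are drawn at random, so only their statistical independence matters here.

```python
import math
import random

DF0 = 5.454e-3    # original sampling interval, in fs^-1
F0 = 100 * DF0    # central frequency, in fs^-1
SIGMA = 2 * DF0   # ASSUMED Gaussian width of the amplitude spectrum


def spectrum(refine):
    """Sample the truncated Gaussian at interval DF0 / refine."""
    df = DF0 / refine
    m = 4 * refine
    lines = [(F0 + k * df, math.exp(-0.5 * (k * df / SIGMA) ** 2))
             for k in range(-m, m + 1)]
    return df, lines


def cross_correlation(tau, df, lines, dphi):
    """Right-hand side of Eq. (7.4); dphi holds the random phase differences."""
    return 0.5 * df * sum(a * a * math.cos(2 * math.pi * f * tau + d)
                          for (f, a), d in zip(lines, dphi))


def mean_square_numeric(df, lines, dphi, samples=2048):
    """Average of C(tau)^2 over one period T = 1/df."""
    period = 1.0 / df
    return sum(cross_correlation(j * period / samples, df, lines, dphi) ** 2
               for j in range(samples)) / samples


def mean_square_formula(df, lines):
    """Right-hand side of Eq. (7.5), with T = 1/df."""
    period = 1.0 / df
    return sum(a ** 4 for _, a in lines) * df / (8.0 * period)


rng = random.Random(0)
results = {}
for refine in (1, 2):
    df, lines = spectrum(refine)
    dphi = [rng.uniform(0.0, 2 * math.pi) for _ in lines]
    results[refine] = (mean_square_numeric(df, lines, dphi),
                       mean_square_formula(df, lines))

ratio = results[2][1] / results[1][1]
print(results)
print(f"mean square of C after doubling T: x{ratio:.3f}")
```

Doubling the period roughly halves the mean square of $C_{PP'}$, while the average intensity of Eq. (7.3) is left essentially unchanged by the same refinement.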

Interpretation

One may think of the radiation emanating from the two point sources P and P′ in terms of two finite-duration wave packets (see Figure 7.1(c)). However, since the wave packets do not have a random relative phase it is impossible to get their cross-correlation to vanish. Nonetheless, we can assume that the packets are separated in time by an interval much longer than their individual widths and also much longer than any time delay that might occur in a system under consideration. In other words, as far as first-order coherence is concerned, an extended incoherent source emitting continuous radiation from its various points is equivalent to an identical source that emits relatively short bursts of light separated by long intervals. In this model of an incoherent, quasi-monochromatic, extended source each point emits only one pulse, no two points emit overlapping pulses, and all pulses from the various source locations have the same duration and shape. As an example, consider an imaging system where a quasi-monochromatic spatially incoherent light source illuminates a sample, of which an image is formed on a photographic plate. One may imagine the individual points of the source as being independent coherent point sources, each creating a coherent image of the sample on the photographic plate. Because different points radiate at different times, there will be no interference among the various


images. The photographic plate duly records the intensity pattern produced by each point source, automatically adding these images together as they arrive sequentially. The final image is thus the sum of the intensity distributions of all the coherent images produced by the various point sources.

Double pinhole interference

Figure 7.2 shows a quasi-monochromatic point source P, illuminating a screen located at z = 0. The screen is pierced with two small pinholes, which are separated by a distance d along the X-axis; their interference fringes are observed on the $\xi\eta$-plane at $z = z_0$. We assume that P is far enough away to yield equal intensities at the pinholes. Also the path-length difference $\Delta\ell$ from P to the pinholes must be short compared to the coherence length of the source, in order to give the pinhole radiations a high degree of temporal coherence. The relative phase between the fields at the pinholes, however, is not negligible and is given by $\Delta\phi = 2\pi\Delta\ell/\lambda$. This phase difference causes a translation of the fringe pattern along the $\xi$-axis. In the neighborhood of the

Figure 7.2 A pair of pinholes in the XY-plane at z = 0, separated by a distance d along the X-axis, is illuminated by a relatively distant point source at P. The resulting interference fringes are observed at the $\xi\eta$-plane located at $z = z_0$.

origin at the observation plane the intensity distribution is given by

$$I(\xi) = \alpha I(P)\{1 + \cos[(2\pi/\lambda)(d/z_0)\xi + \Delta\phi]\}. \tag{7.6a}$$

Here $\alpha$ is an inconsequential proportionality constant and I(P) is the intensity of the point source at P. Note that the fringe periodicity is independent of the location of the source P; it is determined solely by the values of d, $z_0$, and $\lambda$. The shift of the fringe pattern along the $\xi$-axis, however, is a function of $\Delta\phi$, which does depend on the location of the source. For future reference we rewrite Eq. (7.6a) using complex notation as follows:

$$I(\xi) = \alpha I(P) + \alpha\,\mathrm{Re}\{I(P)\exp(i\Delta\phi)\exp[i2\pi d\xi/(\lambda z_0)]\}. \tag{7.6b}$$

The van Cittert–Zernike theorem

This theorem, which was first discovered by van Cittert5 and later in a simpler form by Zernike,6 relates the intensity distribution of an extended, quasi-monochromatic, planar source to the degree of spatial coherence observed on a parallel plane located at a relatively large distance from the source. Figure 7.3 shows a spatially incoherent, quasi-monochromatic source of wavelength $\lambda$ in

Figure 7.3 An extended, quasi-monochromatic, spatially incoherent, planar source is placed in the X′Y′-plane at $z = -z_s$. The light from each point $(x, y, -z_s)$ of this source reaches the points $(x_1, y_1, 0)$ and $(x_2, y_2, 0)$ on the XY-plane at z = 0. A pair of pinholes placed at the latter locations produces a fringe pattern in the $\xi\eta$ observation plane at $z = z_0$. The superposition of all intensity distributions thus produced by the various points of the source yields the final intensity pattern at the observation plane.

the X′Y′-plane at $z = -z_s$. The distance $z_s$ between the source and the XY-plane at z = 0, on which we seek to determine the degree of coherence, is large enough that all the simplifying assumptions invoked in the previous sections still apply. We wish to determine the first-order coherence properties of the light that reaches the XY-plane at z = 0. We select two points $(x_1, y_1)$ and $(x_2, y_2)$ on this plane and assume that two pinholes are placed at these points. The light reaching the pinholes from a point source at $(x, y, -z_s)$ will have nearly the same amplitude but different phase. The phase difference at the pinholes is given by

$$\Delta\phi = 2\pi\Delta\ell/\lambda \approx \frac{2\pi}{\lambda z_s}\left\{\tfrac{1}{2}(x_1^2 + y_1^2) - \tfrac{1}{2}(x_2^2 + y_2^2) - [(x_1 - x_2)x + (y_1 - y_2)y]\right\}. \tag{7.7}$$

Consider what happens when all the point sources are active. They all act independently, each creating its own fringe pattern at the observation screen. All fringes thus produced will have the same period but different strengths and are shifted by different amounts along the $\xi$-axis. Because the point sources are completely incoherent, their overlapping fringe patterns must simply be added together. In other words, the final intensity distribution is the sum of Eq. (7.6) over all points P. We assume $\alpha$ to be the same for all the point sources. The fringe period $\lambda z_0/d$ is also the same. Therefore, the sum of Eq. (7.6b) over all point sources may be written as follows:

$$I(\xi) = \alpha\int_{\text{source}} I(x, y)\,dx\,dy + \alpha\,\mathrm{Re}\left\{\int_{\text{source}} I(x, y)\exp[i\Delta\phi(x, y)]\,dx\,dy\;\exp[i2\pi d\xi/(\lambda z_0)]\right\}. \tag{7.8}$$

To simplify the notation we define the following parameters:

$$I_0 = \alpha\int_{\text{source}} I(x, y)\,dx\,dy, \tag{7.9a}$$

$$\hat{I}(x, y) = I(x, y)\Big/\int_{\text{source}} I(x, y)\,dx\,dy, \tag{7.9b}$$

$$\gamma(x_1, y_1; x_2, y_2) = \int_{\text{source}} \hat{I}(x, y)\exp[i\Delta\phi(x, y)]\,dx\,dy. \tag{7.9c}$$

Equation (7.8) may then be rewritten as

$$I(\xi) = I_0\left(1 + \mathrm{Re}\{\gamma(x_1, y_1; x_2, y_2)\exp[i2\pi d\xi/(\lambda z_0)]\}\right). \tag{7.10}$$

A comparison of Eqs. (7.10) and (7.6) reveals that the fringe contrast produced by the pinholes at $(x_1, y_1)$ and $(x_2, y_2)$ is equal to $|\gamma|$ and that the phase of $\gamma$ determines the shift of these fringes from the center. The function $\gamma$ is thus described as the complex degree of spatial coherence between $(x_1, y_1)$ and $(x_2, y_2)$. Substituting expression (7.7) for $\Delta\phi$ in Eq. (7.9c) yields

$$\gamma(x_1, y_1; x_2, y_2) = \exp\{i\pi[(x_1^2 + y_1^2) - (x_2^2 + y_2^2)]/(\lambda z_s)\} \times \int_{\text{source}} \hat{I}(x, y)\exp\{-i2\pi[(x_1 - x_2)x + (y_1 - y_2)y]/(\lambda z_s)\}\,dx\,dy. \tag{7.11}$$

Equation (7.11) is a compact statement of the van Cittert–Zernike theorem: aside from a phase factor, the complex degree of spatial coherence is the Fourier transform of the (normalized) intensity distribution at the incoherent source.

Example

Consider the uniform, quasi-monochromatic, incoherent source depicted in Figure 7.4(a). The source’s central wavelength is $\lambda$, and its linear dimensions are $3250\lambda$ on each side. A square array of 13 × 13 independent point sources on a rectangular mesh (with spacing $250\lambda$) is used to simulate this source. A pair of pinholes in an otherwise opaque screen is located at $z_s = 10^7\lambda$ from the source. The square pinholes shown in Figure 7.4(b) are each of side-length $350\lambda$ and separated by a distance d along the X-axis. The light from the source, having gone through the pinholes, arrives at the observation plane located at $z_0 = 10^6\lambda$. Figure 7.5 shows the computed fringe patterns at the observation plane for four different values of d. Note that with increasing d the fringe period decreases. The fringe contrast also declines at first, going to zero when $d = 3333\lambda$. Subsequently, however, the contrast increases as d continues to increase. Whereas in frames (a) and (b) the central fringe is bright, in frame (d) corresponding to $d = 4000\lambda$ the central fringe becomes dark. This is equivalent to a half-period shift of the pattern upon crossing the point of zero contrast. Figure 7.6 shows cross-sections of the fringe patterns of Figure 7.5. The contrast calculated from these plots can be shown to be in good agreement with the values predicted by the van Cittert–Zernike theorem.
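The chain of reasoning from Eq. (7.6) to Eq. (7.11) can be illustrated with a deliberately simplified Python sketch. It treats a single row of 13 equally bright point sources (spacing $250\lambda$), ideal point pinholes at $x = \pm d/2$, and the distances of the example, with all lengths in units of $\lambda$. These simplifications are assumptions of the sketch: the simulation described in the text is two-dimensional and uses finite square pinholes, so the contrast values produced here are not expected to coincide with those quoted for Figure 7.6. Function names are hypothetical.

```python
import cmath
import math

WAVELENGTH = 1.0   # all lengths are expressed in units of lambda
Z_S = 1.0e7        # source-to-pinhole distance
Z_0 = 1.0e6        # pinhole-to-observation-plane distance
SOURCE_X = [250.0 * k for k in range(-6, 7)]  # 13 source points, spacing 250


def coherence(d):
    """Complex degree of coherence for pinholes at x = +d/2 and x = -d/2.

    For symmetric pinholes the quadratic phase factor of Eq. (7.11) is unity,
    and the integral reduces to a sum over equally bright source points."""
    terms = [cmath.exp(-2j * math.pi * d * x / (WAVELENGTH * Z_S))
             for x in SOURCE_X]
    return sum(terms) / len(terms)


def fringe_intensity(xi, d):
    """Incoherent sum of the fringes of Eq. (7.6a), one pattern per source point."""
    total = 0.0
    for x in SOURCE_X:
        dphi = -2 * math.pi * d * x / (WAVELENGTH * Z_S)  # Eq. (7.7), symmetric pinholes
        total += 1.0 + math.cos(2 * math.pi * d * xi / (WAVELENGTH * Z_0) + dphi)
    return total / len(SOURCE_X)


def contrast(d, samples=400):
    """Fringe contrast (Imax - Imin)/(Imax + Imin) scanned over one fringe period."""
    period = WAVELENGTH * Z_0 / d
    values = [fringe_intensity(k * period / samples, d) for k in range(samples)]
    return (max(values) - min(values)) / (max(values) + min(values))


for d in (1250.0, 2500.0, 3333.0, 4000.0):
    gamma = coherence(d)
    print(f"d = {d:6.0f}  contrast = {contrast(d):.3f}  "
          f"|gamma| = {abs(gamma):.3f}  Re(gamma) = {gamma.real:+.3f}")
```

In this idealized model the contrast obtained by adding the individual fringe patterns equals $|\gamma|$, and the sign of $\gamma$ decides whether the central fringe is bright or dark, which is the content of the comparison between Eqs. (7.10) and (7.6).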

Figure 7.4 (a) Intensity distribution over the surface area of a uniform, quasi-monochromatic, incoherent source. The linear dimensions of the source are $3250\lambda$ along each side, where $\lambda$ is the wavelength of its radiation. (b) A pair of square pinholes each measuring $350\lambda$ along each side. The center-to-center spacing d between the pinholes is an adjustable parameter of the simulations. (Horizontal axes: $x/\lambda$ from −3000 to 3000.)

Figure 7.5 Computed intensity distributions in the vicinity of the optical axis at the observation plane of Figure 7.3 for the source and pinholes of Figure 7.4. The distance between the source and the plane of the pinholes is $z_s = 10^7\lambda$, while the distance between the pinholes and the observation screen is $z_0 = 10^6\lambda$. Each frame corresponds to a different spacing d between the pinholes: (a) $d = 1250\lambda$; (b) $d = 2500\lambda$; (c) $d = 3333\lambda$; (d) $d = 4000\lambda$. (Horizontal axes: $x/\lambda$ from −1500 to 1500.)

Figure 7.6 Cross-sectional view of the intensity distributions of Figure 7.5, plotted as normalized intensity versus $x/\lambda$ for (a) $d = 1250\lambda$, (b) $d = 2500\lambda$, (c) $d = 3333\lambda$, and (d) $d = 4000\lambda$. (Note: the scale of the horizontal axis is different for each plot.) The fringe contrast, which is 0.75 in (a), drops to 0.3 in (b), and is effectively zero in (c). As the pinhole separation increases further, the contrast climbs once again to 0.15 in (d). Whereas the central fringe in (a) and (b) is bright, it is dark in (d).

References for Chapter 7

1 M. Born and E. Wolf, Principles of Optics, sixth edition, Pergamon Press, Oxford, 1980.
2 L. Mandel and E. Wolf, Optical Coherence and Quantum Optics, Cambridge University Press, UK, 1995.
3 M. V. Klein, Optics, Wiley, New York, 1970.
4 R. Loudon, The Quantum Theory of Light, second edition, Clarendon Press, Oxford, 1992.
5 P. H. van Cittert, Physica 1, 201 (1934).
6 F. Zernike, Physica 5, 785 (1938).

8 Partial polarization, Stokes parameters, and the Poincaré sphere

A strictly monochromatic plane wave is fully polarized; to obtain partial polarization one must consider a superposition of two or more plane waves of differing wavelengths. A collimated beam of light is considered to be fully polarized if a quarter-wave plate followed by an ideal polarizer can be used to extinguish the beam. Failure at extinction reveals the beam as either fully or partially unpolarized. In the classical literature it is customary to analyze the degree of polarization of a beam of light in terms of the cross-correlation function between two orthogonal components of the beam’s E-field.1,2,3 It is somewhat easier, however, to carry out the same calculations in the frequency domain and so to derive the relevant parameters as integrals over the frequency spectrum of the beam. One advantage of the latter approach is that it applies to beams of arbitrary bandwidth, thus removing from the results the restriction to quasi-monochromaticity. Another advantage is that it avoids the use of mutual coherence, which, at times, tends to confuse discussion of the subject. In the following sections we show how a frequency-domain analysis leads to a compact expression for the degree of polarization of a polychromatic beam of light in terms of its Stokes parameters.

Orthogonal polarization components

Consider a polychromatic beam of light propagating along the Z-axis and possessing polarization components in both the X- and the Y-direction:

$$E_x(z, t) = \sum_n A_n (\Delta f)^{1/2}\cos[2\pi f_n(t - z/c) + \phi_n], \tag{8.1a}$$

$$E_y(z, t) = \sum_n B_n (\Delta f)^{1/2}\cos[2\pi f_n(t - z/c) + \psi_n]. \tag{8.1b}$$

Sir George Gabriel Stokes (1819–1903). Irish-born mathematical physicist; he spent most of his adult life at Cambridge, where he held the Lucasian chair for over half a century. He was an intimate friend of Lord Kelvin and James Clerk Maxwell. Stokes’ most important researches were concerned with hydrodynamics, optics, and geodesy. In optics he was mainly responsible for the explanation of fluorescence, and made significant contributions to the theory of diffraction. He was generous in sharing his ideas with colleagues and students and readily gave credit to others when there were any priority disputes. A few days after his death, The Times of London wrote in an obituary that “Sir G. Stokes was remarkable . . . for his freedom from all personal ambitions and petty jealousies.” (Photo: courtesy of AIP Emilio Segrè Visual Archives, E. Scott Barr Collection.)

Jules Henri Poincaré (1854–1912) received his doctorate in mathematics from the University of Paris in 1879, and was appointed, in 1886, to the chair of mathematical physics at the Sorbonne and to a chair at the École Polytechnique. Having made significant contributions to many aspects of mathematics, physics, and philosophy, Poincaré is often described as one of the great geniuses of all time and as the last universalist in mathematics. In applied mathematics he studied optics, electricity, telegraphy, capillarity, elasticity, thermodynamics, potential theory, the theory of relativity, and cosmology. His studies of the three-body problem in celestial mechanics mark the beginning of modern chaos theory. He is acknowledged as a co-discoverer, with Albert Einstein and Hendrik Lorentz, of the special theory of relativity. (Photo: Percy Bridgman Collection, courtesy of AIP Emilio Segrè Visual Archives.)

This beam’s spectrum consists of individual frequencies $f_n = (N_0 + n)\Delta f$ within a finite bandwidth around the central frequency $f_0 = N_0\Delta f$, as depicted in Figure 8.1. The term associated with $f_n$ along the X-axis has amplitude $A_n$ and phase $\phi_n$; the corresponding term along the Y-axis has amplitude $B_n$ and phase $\psi_n$; c is the

Figure 8.1 Distribution of the amplitudes of a polychromatic beam in the frequency range (5.4054 to 5.5146) × 10^14 Hz. The arrowheads represent the sampled values of the amplitude spectrum at intervals of $\Delta f = 3.64 \times 10^{11}$ Hz. The central frequency $f_0 = 1500\Delta f = 5.46 \times 10^{14}$ Hz corresponds to the wavelength $\lambda = 549.5$ nm; the upper and lower bounds of the spectrum are at $\lambda = 544$ nm and 555 nm, respectively. The sampled amplitudes are representative of $A_n$ and/or $B_n$ in Eqs. (8.1). Associated with each sample is a corresponding phase $\phi_n$ or $\psi_n$ (not shown).

speed of light in vacuum, and the constant multiplier $(\Delta f)^{1/2}$ is for normalization purposes only, its significance becoming clear as the discussion proceeds. As described by Eqs. (8.1), the contribution to the beam of each frequency term $f_n$ is a fully polarized plane wave. For this plane wave, which is elliptically polarized in general, one can determine the ellipticity and the orientation of the ellipse of polarization in terms of $A_n$, $B_n$, and $\phi_n - \psi_n$. However, the superposition of different frequency terms, each having a different state of polarization, results in partially polarized light.

Ideal phase-retarder and polarizer; transmitted power

To determine the degree of polarization of the beam described by Eqs. (8.1), we assume a perfect retarder and a perfect polarizer placed in the path of the beam, orthogonal to the propagation direction, as in Figure 8.2. The variable retarder induces a phase delay $\chi$ between $E_x$ and $E_y$, the value of $\chi$ being adjustable in the range ±180°. It is imperative for the following analysis that $\chi$ be independent of optical frequency within the bandwidth of interest; in other words, for the beam described by Eqs. (8.1) the phase delay $\chi$ must be the same for all $f_n$ contained in

Figure 8.2 A polychromatic beam of light propagating along the Z-axis is sent through a variable retarder and a polarizer. The retarder’s fast and slow axes are fixed along the X- and Y-directions, but its phase shift $\chi$ as well as the polarizer’s orientation angle $\theta$ may be adjusted to minimize the amount of light that is transmitted through the system. $\chi$ must be the same for all the wavelengths contained in the incident beam.

the spectrum. (A retarder based on total internal reflection provides a good approximation to an ideal retarder; this is the same principle of operation as used in a Fresnel rhomb.1) Such a retarder can modify the shape of the cross-correlation function between $E_x$ and $E_y$, but, contrary to what has been asserted in the literature, it cannot destroy the mutual coherence between these components of polarization. The confusion is perhaps rooted in the fact that a time delay between $E_x$ and $E_y$ can destroy the mutual coherence, whereas a frequency-independent phase shift $\chi$ leaves the mutual coherence essentially intact. The beam is subsequently passed through the polarizer, which may be rotated around the Z-axis until the transmitted optical power is minimized. The retardation $\chi$ is then adjusted and the orientation $\theta$ of the polarizer is changed accordingly until the transmitted power reaches its absolute minimum. The optimum retardation $\chi_0$ together with the optimum orientation $\theta_0$ of the polarizer thus obtained determine the state of polarization of that fraction of the beam which is fully polarized. The minimum transmitted power is a measure of the unpolarized content of the beam. In the system of Figure 8.2 the amplitude of the light emerging from the polarizer is

$$E(z = 0, t) = \sum_n \left[A_n\cos(2\pi f_n t + \phi_n)\cos\theta + B_n\cos(2\pi f_n t + \psi_n + \chi)\sin\theta\right](\Delta f)^{1/2}. \tag{8.2}$$


Because all frequencies fn in Eq. (8.2) are integer multiples of Δf, namely, fn = (N0 + n)Δf, the transmitted amplitude E(z = 0, t) is a periodic function of time, with period T = 1/Δf. The time-averaged transmitted intensity as a function of χ and θ is thus given by

\[ I(\chi,\theta) = \frac{1}{T}\int_0^T E^2(z=0,t)\,dt = \frac{1}{2}\sum_n \left[ A_n^2\cos^2\theta + B_n^2\sin^2\theta + A_n B_n \sin(2\theta)\cos(\phi_n - \psi_n - \chi) \right]\Delta f. \tag{8.3} \]

The presence of Δf in the above expression allows a smooth transition from the discrete sum to a continuous integral in the limit Δf → 0; this, of course, is the same limit in which T → ∞.

Stokes parameters

To streamline the calculation of the values of χ and θ that minimize I(χ, θ), we follow Sir George Gabriel Stokes (1819–1903) in defining the four parameters that now bear his name:4

\[ S_0 = \frac{1}{2}\sum_n (A_n^2 + B_n^2)\,\Delta f, \tag{8.4a} \]

\[ S_1 = \frac{1}{2}\sum_n (A_n^2 - B_n^2)\,\Delta f, \tag{8.4b} \]

\[ S_2 = \sum_n A_n B_n \cos(\phi_n - \psi_n)\,\Delta f, \tag{8.4c} \]

\[ S_3 = \sum_n A_n B_n \sin(\phi_n - \psi_n)\,\Delta f. \tag{8.4d} \]
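As an intermediate step that the text leaves implicit, the transmitted intensity of Eq. (8.3) can be rewritten in terms of the four Stokes parameters. The sketch below writes χ for the retarder's phase shift and θ for the polarizer orientation, and uses only the half-angle identities and the expansion of cos(φn − ψn − χ); it is offered as a worked check rather than as a quotation from the original.

```latex
% Use cos^2(theta) = (1 + cos 2theta)/2, sin^2(theta) = (1 - cos 2theta)/2, and
% cos(phi_n - psi_n - chi) = cos(phi_n - psi_n) cos(chi) + sin(phi_n - psi_n) sin(chi).
\begin{align*}
I(\chi,\theta)
 &= \tfrac{1}{2}\sum_n \Big[ \tfrac{1}{2}(A_n^2 + B_n^2)
      + \tfrac{1}{2}(A_n^2 - B_n^2)\cos 2\theta
      + A_n B_n \sin 2\theta \,\cos(\phi_n - \psi_n - \chi) \Big]\Delta f \\
 &= \tfrac{1}{2}\Big[ S_0 + S_1 \cos 2\theta
      + \big( S_2 \cos\chi + S_3 \sin\chi \big)\sin 2\theta \Big].
\end{align*}
% Setting dI/dchi = 0 gives tan(chi_0) = S_3/S_2; setting dI/dtheta = 0 at chi = chi_0
% gives tan(2 theta_0) = (S_2 cos chi_0 + S_3 sin chi_0)/S_1.
```

In this form the dependence on χ enters only through the combination S2 cos χ + S3 sin χ, which is why the optimum retardation can be found independently of θ.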

To minimize the transmitted intensity in Eq. (8.3) we first set the derivative of I(χ, θ) with respect to χ equal to zero. This yields χ0, independently of the value of θ, as follows:

\[ \chi_0 = \arctan(S_3/S_2). \tag{8.5a} \]

Substituting χ0 for χ in Eq. (8.3) and differentiating with respect to θ, we find the optimum θ0 as

\[ \theta_0 = \tfrac{1}{2}\arctan\left[ (S_2/S_1)\cos\chi_0 + (S_3/S_1)\sin\chi_0 \right]. \tag{8.5b} \]

The transmitted intensity thus turns out to have a minimum at (χ0, θ0) and a maximum at (χ0, θ0 + 90°), or vice versa. These values are given by

\[ I_{\min} = \tfrac{1}{2}S_0 - \tfrac{1}{2}(S_1^2 + S_2^2 + S_3^2)^{1/2}, \tag{8.6a} \]

\[ I_{\max} = \tfrac{1}{2}S_0 + \tfrac{1}{2}(S_1^2 + S_2^2 + S_3^2)^{1/2}. \tag{8.6b} \]
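The chain from the sampled spectra to the extreme intensities can be checked numerically. The following is a minimal sketch, assuming NumPy and an arbitrary made-up test spectrum (the arrays `A`, `B`, `phi`, `psi` and the helper names `stokes` and `intensity` are illustrative, not from the text). It evaluates Eqs. (8.4), then Eqs. (8.5) and (8.6), and compares the result with a brute-force scan of Eq. (8.3) over the retardation and the polarizer angle.

```python
import numpy as np

def stokes(A, B, phi, psi, df=1.0):
    """Stokes parameters of Eqs. (8.4a)-(8.4d) from sampled spectra."""
    d = phi - psi
    S0 = 0.5 * np.sum(A**2 + B**2) * df
    S1 = 0.5 * np.sum(A**2 - B**2) * df
    S2 = np.sum(A * B * np.cos(d)) * df
    S3 = np.sum(A * B * np.sin(d)) * df
    return S0, S1, S2, S3

def intensity(A, B, phi, psi, chi, theta, df=1.0):
    """Time-averaged transmitted intensity of Eq. (8.3)."""
    cross = np.sum(A * B * np.cos(phi - psi - chi))
    return 0.5 * df * (np.sum(A**2) * np.cos(theta)**2
                       + np.sum(B**2) * np.sin(theta)**2
                       + cross * np.sin(2 * theta))

# Hypothetical test spectrum: 200 samples with partially correlated phases.
rng = np.random.default_rng(0)
N = 200
A = rng.uniform(0.5, 1.0, N)
B = rng.uniform(0.2, 0.8, N)
phi = rng.uniform(0.0, 2 * np.pi, N)
psi = phi - 0.7 + 0.5 * rng.standard_normal(N)

S0, S1, S2, S3 = stokes(A, B, phi, psi)
chi0 = np.arctan(S3 / S2)                                    # Eq. (8.5a)
theta0 = 0.5 * np.arctan((S2 / S1) * np.cos(chi0)
                         + (S3 / S1) * np.sin(chi0))         # Eq. (8.5b)
pol = np.sqrt(S1**2 + S2**2 + S3**2)
I_min, I_max = 0.5 * S0 - 0.5 * pol, 0.5 * S0 + 0.5 * pol    # Eqs. (8.6a), (8.6b)

# The two stationary points: one is the minimum, the other the maximum.
I_a = intensity(A, B, phi, psi, chi0, theta0)
I_b = intensity(A, B, phi, psi, chi0, theta0 + np.pi / 2)

# Independent brute-force scan of Eq. (8.3) on a 2-degree grid.
chis = np.radians(np.arange(-180, 180, 2))
thetas = np.radians(np.arange(0, 180, 2))
scan = np.array([[intensity(A, B, phi, psi, c, t) for t in thetas] for c in chis])

print("degree of polarization:", pol / S0)
print("formula min/max:", I_min, I_max, " scan min/max:", scan.min(), scan.max())
```

Which of the two stationary points is the minimum depends on the signs of S1 and S2, which is the "or vice versa" in the text.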

Degree of polarization

The minimum transmitted intensity Imin in Eq. (8.6a), being that part of the beam which cannot be extinguished with a retarder and a polarizer, represents the depolarized content of the beam. This, of course, is only half the total amount of depolarized light, because the same amount must also be contained in Imax. The total amount of depolarized light, therefore, is 2Imin, while the remaining part, Imax − Imin, is fully polarized. The degree of polarization P of the beam may thus be defined as

\[ P = \frac{I_{\max} - I_{\min}}{I_{\max} + I_{\min}} = \left[ (S_1/S_0)^2 + (S_2/S_0)^2 + (S_3/S_0)^2 \right]^{1/2}. \tag{8.7} \]

Using the Schwarz inequality,5 it is not difficult to show that S1² + S2² + S3² ≤ S0²; consequently, 0 ≤ P ≤ 1. (See Note 1 at the end of the chapter.)

One may question the generality of the above result because, in deriving it, the fast and slow axes of the wave-plate were fixed along the X- and Y-axes. In other words, one wonders if the result would have been different had the axes of the wave-plate been allowed to rotate around the Z-axis. The result can be shown to be quite general, however, because P of Eq. (8.7) remains invariant under a rotation of the XY-plane around Z. The value of S0, being the total power of the beam, obviously remains the same for arbitrary orientations of the coordinate system. Moreover, with some elementary algebra, the quantity S1² + S2² + S3² may also be shown to be invariant under coordinate rotation. (See Note 2 at the end of the chapter.)

In retrospect the variable retarder of Figure 8.2 could have been replaced by an achromatic quarter-wave plate (e.g., a Fresnel rhomb) in a rotary mount. The axes of the quarter-wave plate could then be made to coincide with the axes of the ellipse of polarization in order to linearize that part of the beam which is fully polarized. This is precisely what the variable retarder accomplishes in that it adjusts the retardation χ while maintaining a fixed orientation in the XY-plane.
The Poincaré sphere

In general, the fraction of the beam that is fully polarized has elliptical polarization, with ellipticity η and orientation angle ρ (this is the angle between the major axis of the ellipse and the X-axis). These parameters may be readily expressed in terms of the Stokes parameters:

\[ \sin(2\eta) = S_3/(S_1^2 + S_2^2 + S_3^2)^{1/2}, \tag{8.8a} \]

\[ \tan(2\rho) = S_2/S_1. \tag{8.8b} \]

Figure 8.3 The Poincaré sphere is the locus of all points S with coordinates (x, y, z) = (S1, S2, S3). The radius of the sphere is PS0, and the latitude and longitude of S specify the ellipticity η and orientation angle ρ of the polarized component of the beam.

Using the above relations, the French mathematical physicist Henri Poincaré (1854–1912) represented the state of polarization as a point S on the surface of a sphere, as shown in Figure 8.3. In this representation the three Cartesian coordinates of S are S1, S2, and S3. Thus, according to Eq. (8.7), the radius of the Poincaré sphere is PS0, the power of that fraction of the beam which is fully polarized. The latitude of S is twice the ellipticity η of the polarized component, in accordance with Eq. (8.8a), while the longitude of S represents twice the orientation angle ρ of the major axis of the ellipse of polarization, as prescribed by Eq. (8.8b).

Unpolarized light

A completely unpolarized beam of light cannot be altered by the wave-plate and polarizer of Figure 8.2. No matter what the phase shift χ of the retarder and the orientation θ of the polarizer may be, the output power will be one-half the input power. For this light S0 will be the total power of the beam, but S1 = S2 = S3 = 0. Since S1 = 0, the relation $\sum_n A_n^2\,\Delta f = \sum_n B_n^2\,\Delta f$ implies that the power along the X-axis equals that along the Y-axis. For natural light, where the polarization components along the X- and Y-axes are independent of each other, the relative phase angles φn − ψn are uniformly distributed over (0, 2π) and tend to be a random function of n. Hence, in the limit Δf → 0, the Stokes parameters S2 and S3 approach zero as well. However, there exist other combinations of φn and ψn that yield totally unpolarized light. For example, a superposition of two equal-magnitude beams of frequencies f1 and f2, where one beam is right- and the other left-circularly polarized, can be readily shown to be fully unpolarized.

Partial depolarization by a glass slab upon reflection or transmission

Figure 8.4 shows a glass slab 100 μm thick and of refractive index n = 1.5, upon which a linearly polarized beam is incident at an oblique angle γ = 75°. The incident beam has equal amounts of p- and s-polarization with equal phase, giving its linear polarization a 45° angle relative to both p- and s-directions. The spectral content of the beam is that depicted in Figure 8.1. Upon reflection from the slab the computed amplitudes of the p- and s-components of the beam as

functions of λ are depicted in Figure 8.5(a). Multiple reflections at the two facets of the slab interfere with each other to produce the fine structure seen in the spectra of Figure 8.5(a). The phase angles of the reflected p- and s-components are shown in Figure 8.5(b), and the resulting polarization rotation angle ρ and ellipticity η appear in Figure 8.5(c). The knowledge of these quantities allows one to compute the Stokes parameters from Eqs. (8.4), yielding S1/S0 = 0.495, S2/S0 = 0.844, S3/S0 = 0.89 × 10⁻⁶. Thus the degree of polarization of the reflected beam is P = 0.978, the wave-plate's required phase shift χ0 is very small, 0.00006°, and the polarizer's angle for minimum transmission must be set to θ0 = 29.8°. It is seen that the polarized content of the reflected beam is essentially linear (η = 0.000026°) and is oriented at ρ = 60.2° relative to the p-direction.

Figure 8.4 A polychromatic plane wave is incident on a glass slab 100 μm thick at γ = 75°. The index of refraction of the glass, n = 1.5, is independent of the wavelength. The incident beam is linearly polarized at 45° to the plane of incidence, that is, it has equal amounts of p- and s-polarization. The reflected and transmitted beams are slightly depolarized.

Figure 8.5 A polychromatic plane wave, having the spectrum of Figure 8.1 and a linear polarization at 45° to the plane of incidence, is reflected from a glass slab at γ = 75° (see Figure 8.4). Shown as functions of λ: (a) the reflected amplitudes |rp| (broken line) and |rs| (solid line); (b) the phase angles of rp (broken line) and rs (solid line); (c) the reflected polarization state, defined by the rotation angle ρ and ellipticity η. For the reflected beam the computed degree of polarization is P = 0.978, the polarized component is essentially linear (η = 0.000026°), and the polarization vector makes an angle ρ = 60.2° with the p-direction.

Similar results may be obtained for the beam transmitted through the slab. The corresponding amplitudes and phases are shown in Figure 8.6, and the Stokes parameters are found to be S1/S0 = 0.306, S2/S0 = 0.907, S3/S0 = 0.7 × 10⁻⁶. Thus the degree of polarization is P = 0.957, the wave-plate's required phase

shift is χ0 = 0.000047°, and the polarizer's angle for minimum transmission is θ0 = 54.3°. Therefore, the polarized content of the transmitted beam, oriented at ρ = 35.7° relative to the p-direction, is essentially linear (η = 0.000022°).

Figure 8.6 The counterpart of Figure 8.5 for the case of transmission through the glass slab 100 μm thick. The computed degree of polarization of the transmitted beam is P = 0.957, the polarized component is essentially linear (η = 0.000022°), and the polarization vector makes an angle ρ = 35.7° with the p-direction.

Partial depolarization upon transmission through a birefringent slab

Figure 8.7 shows the characteristics of a polychromatic beam of light upon transmission through a birefringent slab of calcite. The thickness of the slab is 85 μm and its ordinary and extraordinary refractive indices are no = 1.6613 and ne = 1.488. The normally incident beam, which is linearly polarized at 45° to the crystal axes, has the spectrum of Figure 8.1. The computed Stokes parameters of the transmitted beam are: S1/S0 = 0.023, S2/S0 = 0.259, S3/S0 = 0.902. Thus the degree of polarization of the transmitted beam is P = 0.939, the wave-plate's required phase shift is χ0 = 74°, and the polarizer's angle

of minimum transmission is θ0 = 44.3°. The polarized fraction of the transmitted beam is, therefore, elliptical with η = 37° and ρ = 47.6°, relative to the p-direction.

Figure 8.7 The amplitude, phase, and polarization state of a polychromatic beam, having the spectrum of Figure 8.1, upon transmission through a calcite slab 85 μm thick (no = 1.6613, ne = 1.488). The normally incident plane wave is linearly polarized at 45° to the crystal axes. (a) Transmitted amplitudes of the p-component (broken line) and s-component (solid line) versus λ. (b) Phase angles of the p-component (broken line) and s-component (solid line) versus λ. (c) Polarization rotation angle ρ and ellipticity η of the transmitted beam versus λ. The computed degree of polarization upon transmission is P = 0.939, and the rotation angle and ellipticity of the polarized fraction of the beam are ρ = 47.6°, η = 37°.
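The numbers quoted in the three examples can be cross-checked against Eqs. (8.5a), (8.7), and (8.8a). The sketch below is a hedged consistency check, not the author's computation: it takes only the printed magnitudes of the normalized Stokes parameters (any signs are ignored, since they do not affect P and only affect the signs of the angles), and the dictionary `quoted` and the function `summarize` are names introduced here for illustration.

```python
import math

# Normalized Stokes parameters (S1/S0, S2/S0, S3/S0) as printed in the text.
quoted = {
    "reflected, glass slab": (0.495, 0.844, 0.89e-6),
    "transmitted, glass slab": (0.306, 0.907, 0.7e-6),
    "transmitted, calcite slab": (0.023, 0.259, 0.902),
}

def summarize(s1, s2, s3):
    """Return (P, chi0 in degrees, ellipticity eta in degrees)."""
    P = math.sqrt(s1**2 + s2**2 + s3**2)           # Eq. (8.7)
    chi0 = math.degrees(math.atan(s3 / s2))        # Eq. (8.5a)
    eta = 0.5 * math.degrees(math.asin(s3 / P))    # Eq. (8.8a), normalized by S0
    return P, chi0, eta

results = {name: summarize(*s) for name, s in quoted.items()}
for name, (P, chi0, eta) in results.items():
    print(f"{name:26s} P = {P:.3f}  chi0 = {chi0:.6f} deg  eta = {eta:.6f} deg")
```

Because S3/S0 is printed to only one or two significant figures for the glass slab, the recomputed χ0 and η for those cases agree with the quoted values only to that precision.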

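Note 1

The Schwarz inequality,5 which concerns the integral of the product of two complex functions of the real variable x, is written as follows:

\[ \left| \int f(x)\,g^*(x)\,dx \right|^2 \le \int |f(x)|^2\,dx \int |g(x)|^2\,dx. \]

Defining the complex vectors A and B as

\[ \mathbf{A} = [A_1\exp(i\phi_1),\; A_2\exp(i\phi_2),\; \ldots,\; A_N\exp(i\phi_N)], \]

\[ \mathbf{B} = [B_1\exp(i\psi_1),\; B_2\exp(i\psi_2),\; \ldots,\; B_N\exp(i\psi_N)], \]

we find

\[ S_2^2 + S_3^2 = |S_2 + iS_3|^2 = |\mathbf{A}\mathbf{B}^{*T}|^2 \le \|\mathbf{A}\|^2\|\mathbf{B}\|^2 = S_0^2 - S_1^2, \]

establishing the desired inequality.

Note 2

The 2 × 2 matrix

\[ M = \begin{pmatrix} \mathbf{A} \\ \mathbf{B} \end{pmatrix} \begin{pmatrix} \mathbf{A}^{*T} & \mathbf{B}^{*T} \end{pmatrix} \]

has the following properties:

\[ \tfrac{1}{2}\,\mathrm{Trace}\,M = S_0, \]

\[ \tfrac{1}{4}\,\mathrm{Trace}^2 M - \mathrm{Det}\,M = S_1^2 + S_2^2 + S_3^2. \]

Upon rotating the XY-plane through an angle ζ, the vector $\begin{pmatrix} \mathbf{A} \\ \mathbf{B} \end{pmatrix}$ will be multiplied on the left by the rotation matrix

\[ \begin{pmatrix} \cos\zeta & \sin\zeta \\ -\sin\zeta & \cos\zeta \end{pmatrix}. \]

However, under this unitary transformation both the trace and the determinant of M remain unchanged. Therefore, the beam's total power S0 and the power of its polarized component (S1² + S2² + S3²)^{1/2} are rotation invariant.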
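The invariance argument of Note 2 is easy to confirm numerically. The sketch below assumes NumPy, takes Δf = 1, and uses randomly generated complex spectra as stand-ins for the vectors A and B; the function names `coherency` and `invariants`, the rotation angle, and the sign convention of the rotation matrix are choices made here for illustration. The conclusion does not depend on the sense of rotation.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 64
# Hypothetical complex spectra: A_n exp(i phi_n) and B_n exp(i psi_n).
A = rng.uniform(0, 1, N) * np.exp(1j * rng.uniform(0, 2 * np.pi, N))
B = rng.uniform(0, 1, N) * np.exp(1j * rng.uniform(0, 2 * np.pi, N))

def coherency(A, B):
    """2 x 2 matrix M of Note 2, built from the row vectors A and B."""
    V = np.vstack([A, B])
    return V @ V.conj().T

def invariants(M):
    """Return S0 and S1^2 + S2^2 + S3^2 from the trace and determinant of M."""
    tr = np.trace(M).real
    return 0.5 * tr, 0.25 * tr**2 - np.linalg.det(M).real

zeta = 0.3  # arbitrary rotation angle, in radians
R = np.array([[np.cos(zeta), np.sin(zeta)],
              [-np.sin(zeta), np.cos(zeta)]])
A_rot, B_rot = R @ np.vstack([A, B])

before = invariants(coherency(A, B))
after = invariants(coherency(A_rot, B_rot))

# Direct evaluation from the definitions, for comparison (Delta f = 1).
S1 = 0.5 * (np.sum(np.abs(A)**2) - np.sum(np.abs(B)**2))
S2_plus_iS3 = np.sum(A * B.conj())

print("before rotation:", before)
print("after rotation: ", after)
```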

References for Chapter 8

1 M. Born and E. Wolf, Principles of Optics, sixth edition, Pergamon Press, 1983.
2 L. Mandel and E. Wolf, Optical Coherence and Quantum Optics, Cambridge University Press, UK, 1995.
3 M. V. Klein, Optics, Wiley, New York, 1970.
4 G. G. Stokes, Trans. Camb. Phil. Soc. 9, 399 (1852). Reprinted in his Mathematical and Physical Papers, Vol. III, p. 233, Cambridge University Press, 1901.
5 A. Papoulis, Probability, Random Variables, and Stochastic Processes, McGraw-Hill, New York, 1984.

9 Second-order coherence and the Hanbury Brown–Twiss experiment

Introduction

The degree of first-order temporal coherence, a function denoted by g(1)(τ), provides information about the coherence length and the power spectral density of a light source. However, without additional information, g(1)(τ) has no bearing on intensity fluctuations and higher-order statistics of the emitted light. A quasi-monochromatic laser beam and the beam of light from an incandescent light bulb, provided that the latter is properly filtered to match the spectral lineshape of the former, will have identical degrees of first-order coherence. Any interferometric experiment involving the splitting and superposition of amplitudes would yield identical results for the laser beam and the (properly filtered) thermal light. Therefore, on the basis of such experiments alone, there is no way to distinguish the two light sources. It turns out, however, that the intensity fluctuations of laser light are fundamentally different from those of thermal light. The two sources can, therefore, be distinguished based on their second-order coherence properties.1,2,3

An ideal photodetector produces an electrical signal proportional to the "cycle-averaged intensity" of the E-field (or B-field) of the light beam at the location of the detector. Assuming that the electrical bandwidth of the detector (including all associated circuitry) is greater than the bandwidth of the incident light wave by at least a factor of 2, the output of the detector should accurately represent the intensity fluctuations of the light beam as a function of time. At a given point in space, the light beam's degree of second-order coherence, g(2)(τ), may be defined in terms of the autocorrelation function of the output electric signal from such a detector.

In the next section we derive a general expression for g(2)(τ), discuss the fundamental difference between laser light and thermal light as manifested by this degree of second-order coherence, and obtain a relationship between g(2)(τ) and g(1)(τ) for the case of chaotic (e.g., thermal) light.

Historically, the first experiments involving the correlations of intensity fluctuations of light beams were conducted by Robert Hanbury Brown and Richard Q. Twiss in the 1950s.4,5,6,7,8 The primary goal of these experiments was to determine the diameters of astronomical objects using the correlations of their optical (i.e., visible light) emissions as picked up by a pair of Earth-based photo-multiplier tubes with an adjustable separation. In the third section we discuss the case of two distant point sources having a small angular separation, monitored by a pair of ideal photo-detectors. This simple example, which contains the essence of the Hanbury Brown–Twiss experiment, shows that the angular separation between distant point sources can, in principle, be extracted from the cross-correlation of the output signals obtained from two photodetectors with a variable separation distance.

Intensity fluctuations and the degree of second-order coherence

At a fixed point in space, we consider the narrow-band function a(t), which describes the light amplitude as a function of time, having a discrete spectrum consisting of 2M + 1 frequencies that range from (N0 − M)Δf to (N0 + M)Δf, as follows:

\[ a(t) = \sum_{n=N_0-M}^{N_0+M} A_n \sqrt{\Delta f}\,\cos(2\pi n\Delta f\,t + \phi_n). \tag{9.1} \]

The frequency component nΔf has amplitude An√Δf and phase φn, with √Δf being introduced here to allow a smooth transition to the continuum limit later on. The discrete nature of the spectrum makes the field amplitude a(t) periodic, with period T = 1/Δf. As we add more frequency components to the spectrum to fill the gaps between adjacent frequencies, Δf goes to zero, T goes to infinity (as does the number of discrete frequencies, 2M + 1), and the spectral density approaches a continuous limit denoted by the complex function Â(f) = A(f) exp[iφ(f)]. The instantaneous intensity of the field given by Eq. (9.1) may be written as follows:

\[ a^2(t) = \tfrac{1}{2}\sum_n \sum_{n'} A_n A_{n'}\,\Delta f\,\left\{ \cos[2\pi(n-n')\Delta f\,t + \phi_n - \phi_{n'}] + \cos[2\pi(n+n')\Delta f\,t + \phi_n + \phi_{n'}] \right\}. \tag{9.2} \]

The second term in the above expression has high frequencies, (n + n′)Δf, which can be removed by a low-pass filter. (Such low-pass filtering is inherent to all commonly available photo-detectors.) The low-frequency terms survive the filtering; upon rearranging the double sum in Eq. (9.2), the filtered function a²(t), known as the cycle-averaged intensity I(t), becomes:

\[ I(t) = \tfrac{1}{2}\sum_n A_n^2\,\Delta f + \sum_{m=1}^{2M}\;\sum_{n=N_0-M}^{N_0+M-m} A_n A_{n+m}\,\Delta f\,\cos(2\pi m\Delta f\,t + \phi_{n+m} - \phi_n). \tag{9.3} \]

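The above equation may be written in a compact form:

\[ I(t) = I_0 + \sum_{m=1}^{2M} |\hat{I}_m| \cos(2\pi m\Delta f\,t + \psi_m), \tag{9.4a} \]

where

\[ I_0 = \tfrac{1}{2}\sum_n A_n^2\,\Delta f, \tag{9.4b} \]

\[ \hat{I}_m = |\hat{I}_m| \exp(i\psi_m) = \sum_{n=N_0-M}^{N_0+M-m} A_n A_{n+m} \exp[i(\phi_{n+m} - \phi_n)]\,\Delta f = \sum_{n=N_0-M}^{N_0+M-m} \hat{A}_n^{*}\hat{A}_{n+m}\,\Delta f. \tag{9.4c} \]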

Equation (9.4a) is the Fourier series representation of the time-dependent (filtered) intensity I(t), consisting of the constant term I0, which is half the area under the power spectral density function |Â(f)|², and oscillatory terms having frequency mΔf and (complex) magnitude Îm, obtained from the autocorrelation of the (complex) spectrum Â(f). In the limit Δf → 0, we have

\[ I_0 = \tfrac{1}{2}\int_0^\infty |\hat{A}(f)|^2\,df, \tag{9.5a} \]

\[ \hat{I}(f) = |\hat{I}(f)| \exp[i\psi(f)] = \int_0^\infty \hat{A}(f')\,\hat{A}^{*}(f' - f)\,df'. \tag{9.5b} \]

Next, we exploit the fact that I(t) given by Eq. (9.4) is a periodic function of time with period T = 1/Δf, and define the autocorrelation of the cycle-averaged intensity distribution by averaging over one period T, as follows:

\[ \langle I(t)I(t+\tau)\rangle = \frac{1}{T}\int_0^T I(t)I(t+\tau)\,dt = I_0^2 + \tfrac{1}{2}\sum_{m=1}^{2M} |\hat{I}_m|^2 \cos(2\pi m\Delta f\,\tau). \tag{9.6} \]

Normalization of the intensity autocorrelation function in Eq. (9.6) yields the classical degree of second-order coherence as

\[ g^{(2)}(\tau) = \frac{\langle I(t)I(t+\tau)\rangle}{\langle I(t)\rangle\langle I(t+\tau)\rangle} = 1 + \tfrac{1}{2}\sum_{m=1}^{2M} (|\hat{I}_m|/I_0)^2 \cos(2\pi m\Delta f\,\tau). \tag{9.7} \]

Note that, in deriving the above expression for g(2)(τ), no assumptions were made about the statistical properties of either a(t) or its Fourier transform A(f) exp[iφ(f)]. In particular, there is no need for a(t) to be stationary, although ergodicity will be helpful, as the time-averages obtained over one particular waveform will then be representative of the entire ensemble of such waveforms. The classical degree of second-order coherence given by Eq. (9.7) has three fundamental properties.

(i) g(2)(τ) is an even function of τ.
(ii) g(2)(τ) ≤ g(2)(0), that is, the maximum of g(2)(τ) occurs at τ = 0. This is a consequence of the fact that all the cosines in Eq. (9.7) reach their peak values simultaneously at τ = 0.
(iii) The value of the function at τ = 0 is greater than or equal to unity, namely, g(2)(0) ≥ 1, because, inevitably, Σ(|Îm|/I0)² ≥ 0.

We now study two special cases in some detail.

Case 1: Chaotic light

The main feature of chaotic light is that the spectral phase φn associated with the frequency fn = nΔf in Eq. (9.1) is not a well-behaved function of n. For any particular sample a(t) taken from the ensemble of all possible waveforms, the spectral amplitude An may have some simple frequency dependence, e.g., Gaussian or Lorentzian; however, φn will vary randomly from one frequency fn to the next. This randomness of individual phase values does not influence the value of I0, given by Eq. (9.4b), which is independent of {φn}. It will, however, have a significant effect on the autocorrelation function that yields the values of Îm in Eq. (9.4c). The numerical examples depicted in Figures 9.1 and 9.2 show, respectively, the cases of Gaussian and Lorentzian spectral density functions. We began these calculations by assuming a fixed profile for the spectral density |Â(f)|. We then chose a small sampling interval Δf, assigned random phases φn to each sample, and proceeded to compute the autocorrelation of Â(f), from which the values of I0 and Îm were determined; see Eqs. (9.4b) and (9.4c). Finally, we constructed the degree of second-order coherence g(2)(τ) in accordance with Eq. (9.7). Although the numerical results in each case depended on the specific choice of the random

phase values, we found the computed g(2)(τ) changed only in small and insignificant ways with each choice of {φn}, provided that the chosen Δf was small enough to properly sample the line-shape.

Figure 9.1 (a) Gaussian amplitude spectrum |Â(f)| centered at f = f0, having FWHM equal to w = 1000Δf; the sampling interval is conveniently chosen as Δf = 1.0 in arbitrary units. (b) Logarithmic plot of |Îm|/I0 computed from the Gaussian frequency spectrum depicted in (a) with randomly assigned phase values to each frequency fn; M = 2500, I0 = 376.35. (c) Plot of g(2)(τ) calculated from Eq. (9.7) with τ in units of 1/Δf. (d) Close-up of g(2)(τ) showing its Gaussian central part having FWHM = 4 ln 2/(πw) ≈ 0.88/w.

Note that g(2)(τ) − 1 has a Gaussian form in Figure 9.1(d) and an inverse Lorentzian form in Figure 9.2(d). This, of course, is not a coincidence and a general relationship can be shown to exist between the line-shape of chaotic light and the functional form of g(2)(τ). Recalling that the degree of first-order coherence is defined as

\[ g^{(1)}(\tau) = \frac{\sum_n A_n^2 \exp(i2\pi n\Delta f\,\tau)\,\Delta f}{\sum_n A_n^2\,\Delta f} \;\rightarrow\; \frac{\int_0^\infty |\hat{A}(f)|^2 \exp(i2\pi f\tau)\,df}{\int_0^\infty |\hat{A}(f)|^2\,df}, \tag{9.8} \]

it is not difficult to show that, for chaotic light,

\[ g^{(2)}(\tau) \approx 1 + |g^{(1)}(\tau)|^2. \tag{9.9} \]

Figure 9.2 (a) Lorentzian amplitude spectrum |Â(f)| centered at f = f0, having FWHM equal to w = 250Δf; the sampling interval is conveniently chosen as Δf = 1.0 in arbitrary units. (b) Logarithmic plot of |Îm|/I0 computed from the Lorentzian frequency spectrum depicted in (a) with randomly assigned phase values to each frequency fn; M = 5000, I0 = 112.32. (c) Plot of g(2)(τ) calculated from Eq. (9.7) with τ in units of 1/Δf. (d) Close-up of g(2)(τ) showing its inverse Lorentzian central part having FWHM = √3 ln 2/(πw) ≈ 0.38/w.

To see this, note that Îm in Eq. (9.4c) is the sum of (2M + 1 − m) complex numbers. Therefore, |Îm|² contains the sum of the squared moduli of these numbers plus many cross terms. The cross terms, however, all have random phases and, when a large number of such terms are added together, they tend to cancel out. What remains, therefore, is mainly the sum of the squared moduli, namely,

\[ |\hat{I}_m|^2 \approx \sum_{n=N_0-M}^{N_0+M-m} A_n^2 A_{n+m}^2\,(\Delta f)^2 \;\rightarrow\; \left[ \int_0^\infty |\hat{A}(f')|^2\,|\hat{A}(f'-f)|^2\,df' \right]\Delta f. \tag{9.10} \]

Now, starting with the definition of g(1)(τ) in Eq. (9.8), a straightforward calculation yields

\[ |g^{(1)}(\tau)|^2 = \frac{\displaystyle \sum_{n=N_0-M}^{N_0+M} A_n^4\,\Delta f\,\Delta f + 2\sum_{m=1}^{2M}\;\sum_{n=N_0-M}^{N_0+M-m} A_n^2 A_{n+m}^2\,\Delta f \cos(2\pi m\Delta f\,\tau)\,\Delta f}{\displaystyle \Big( \sum_n A_n^2\,\Delta f \Big)^2}. \tag{9.11} \]

In the limit when Δf → 0 the first term in the numerator of Eq. (9.11) approaches zero. As for the second term, the coefficient of cos(2πmΔfτ) is the same as |Îm|² given by Eq. (9.10). Also, with reference to Eq. (9.4b), the denominator is equal to 4I0². It is thus clear that, in the case of chaotic light, Eq. (9.9) is a direct consequence of Eq. (9.7). Note that, for chaotic light, g(2)(0) = 2, irrespective of the shape of the spectral density function (i.e., the line-shape), simply because g(1)(0) = 1 according to Eq. (9.8). Also, if the linewidth approaches zero, g(2)(τ) becomes very broad; in the limit of zero linewidth, therefore, g(2)(τ) = 2 for all values of τ. The chaotic fluctuations of intensity are, therefore, intrinsic to this type of light and cannot be removed by spectral filtering, no matter how narrow the filter's linewidth may be. This is the fundamental difference between the coherent light from a laser and the chaotic light from a thermal source; whereas the classical degree of second-order coherence for thermal light is equal to 2, that for monochromatic laser light (i.e., single longitudinal mode, narrowband) is always equal to unity, as shown in the following subsection.
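The numerical procedure described for chaotic light can be reproduced in a few lines. The following is a minimal sketch under stated assumptions: NumPy is used, the Gaussian amplitude spectrum has the parameters quoted for Figure 9.1 (w = 1000Δf, M = 2500, Δf = 1), the random seed and the sample delays are arbitrary, and the function name `g2_from_spectrum` is introduced here. It evaluates Eqs. (9.4b), (9.4c), and (9.7) for one random choice of phases and compares the result with 1 + |g(1)(τ)|² from Eqs. (9.8) and (9.9).

```python
import numpy as np

def g2_from_spectrum(A_hat, tau, df=1.0):
    """Return I0 and g2(tau) from complex spectral samples A_n exp(i phi_n)."""
    n = A_hat.size
    I0 = 0.5 * np.sum(np.abs(A_hat)**2) * df                      # Eq. (9.4b)
    m = np.arange(1, n)
    # Autocorrelation of the spectrum at lag m, Eq. (9.4c).
    I_m = np.array([np.sum(A_hat[:n - k].conj() * A_hat[k:]) for k in m]) * df
    weights = (np.abs(I_m) / I0)**2
    g2 = 1.0 + 0.5 * weights @ np.cos(2 * np.pi * np.outer(m, tau) * df)  # Eq. (9.7)
    return I0, g2

rng = np.random.default_rng(2)
w, M = 1000.0, 2500                                  # FWHM and half-width, in units of df
x = np.arange(-M, M + 1)                             # frequency index measured from f0
amp = np.exp(-4 * np.log(2) * (x / w)**2)            # Gaussian |A(f)| with FWHM w
A_hat = amp * np.exp(1j * rng.uniform(0, 2 * np.pi, x.size))   # random spectral phases

tau = np.array([0.0, 0.25, 0.5, 1.0, 2.0]) / w       # delays, in units of 1/df
I0, g2 = g2_from_spectrum(A_hat, tau)

# First-order coherence from the power spectrum alone, Eq. (9.8).
g1 = (amp**2 @ np.exp(2j * np.pi * np.outer(x, tau))) / np.sum(amp**2)
prediction = 1.0 + np.abs(g1)**2                     # Eq. (9.9)

print("I0 =", I0)
print("g2        :", np.round(g2, 3))
print("1 + |g1|^2:", np.round(prediction, 3))
```

The agreement is statistical rather than exact: for a single realization the surviving cross terms leave a residual that shrinks as more spectral samples fall within the linewidth, which is consistent with the remark that the computed g(2)(τ) changes only slightly with each choice of the random phases.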

Case 2: Coherent laser light

Under ideal circumstances, the amplitude of the light generated by a well-stabilized, single-longitudinal-mode laser operating well above threshold exhibits the following time dependence:

\[ a(t) = a_0 \cos[2\pi f_0 t + \chi(t)]. \tag{9.12} \]

Here a0 is a constant amplitude, and χ(t) is a slowly varying function of time, representing the familiar phenomenon of phase drift in well-stabilized lasers. Note that the cycle-averaged intensity of this waveform, a0², is independent of the time t. The waveform, nonetheless, has a finite bandwidth which may be obtained by Fourier transforming its amplitude a(t). For a0 = 1.0 and the particular choice of χ(t) shown in Figure 9.3(a), Figure 9.3(b) shows a typical slice of a(t) over a short time interval. Clearly, the slow variations

of χ(t) guarantee a stable amplitude for the waveform over its entire duration. Figures 9.3(c,d) display the computed amplitude and phase of Â(f), the Fourier transform of a(t), obtained numerically over the time interval [0, T]; here T = 1/Δf is the inverse of the frequency domain sampling interval, which is conveniently chosen as Δf = 1.0 in arbitrary units. As usual, the spectrum is sampled at discrete frequencies, fn = f0 + nΔf = (N0 + n)Δf, then truncated by limiting its frequency content to the range −M ≤ n ≤ M. In Figure 9.3, f0 = 500Δf, and the truncated Â(f) is confined to the frequency range f0 ± 20Δf. The values of |Îm|/I0, computed from the truncated Â(f) in accordance with Eqs. (9.4b) and (9.4c), are shown in Figure 9.3(e), while a plot of g(2)(τ), obtained from Eq. (9.7) using these values of |Îm|/I0, appears in Figure 9.3(f). Aside from minor fluctuations caused by the truncation of Â(f), note that the degree of second-order coherence, g(2)(τ), is essentially equal to 1.0 for all values of τ.

Figure 9.3 (a) Phase profile χ(t) over the time interval t = 0 to t = T = 1.0 in units of 1/Δf. Note that the range of variation of χ is [−π, π]. (b) Plot of the function a(t) = a0 cos[2πf0t + χ(t)] over the brief time interval [0.500, 0.525], with a0 = 1.0, f0 = 500Δf and χ(t) as shown in (a). (c), (d) Amplitude and phase profiles of Â(f), the Fourier transform of a(t), obtained numerically over the entire time interval [0, 1]. The function Â(f) is truncated, with the values covering the range f0 ± 20Δf retained. (e) Plot of |Îm|/I0, computed from the truncated Â(f) using Eqs. (9.4b, 9.4c). (f) Computed plot of g(2)(τ) obtained with the values of |Îm|/I0 inserted into Eq. (9.7). Aside from minor fluctuations caused by the truncation of Â(f), the degree of second-order coherence is equal to 1.0 for all values of τ.

The Hanbury Brown–Twiss experiment

Although the original work of Hanbury Brown and Twiss was aimed at measuring the diameter of Sirius and other astronomical objects, the essence of their idea can be readily explained in terms of the procedure needed to measure the angular distance between the constituents of a binary star. Figure 9.4 shows two independent point sources Sa and Sb, whose angular separation, when observed from the (far away) plane of the detectors D1 and D2, is θ. For simplicity of analysis, the detectors are positioned such that they are equi-distant from Sa, but receive the light from Sb with a delay of τb = θL/c, where L is the distance between D1 and D2, and c is the speed of light in vacuum. Assuming the radiation from both sources is narrowband and centered at the same frequency f0 = N0Δf, one may

write the light amplitudes a1(t) and a2(t) arriving at the two detectors as follows:

\[ a_1(t) = \sum_{n=N_0-M}^{N_0+M} \left\{ A_n\sqrt{\Delta f}\,\cos(2\pi n\Delta f\,t + \phi_n) + B_n\sqrt{\Delta f}\,\cos(2\pi n\Delta f\,t + \chi_n) \right\}, \tag{9.13a} \]

\[ a_2(t) = \sum_{n=N_0-M}^{N_0+M} \left\{ A_n\sqrt{\Delta f}\,\cos(2\pi n\Delta f\,t + \phi_n) + B_n\sqrt{\Delta f}\,\cos[2\pi n\Delta f\,(t + \tau_b) + \chi_n] \right\}. \tag{9.13b} \]

Figure 9.4 The light from two independent point sources, Sa, Sb, is detected by the photo-detectors, D1, D2, located far away from the sources. The radiation from both sources is narrowband and centered at the same frequency f0. The ideal, point-like detectors are separated from each other by an adjustable distance L in the same direction as Sa is separated from Sb. Seen from the detectors' plane, the angular distance between Sa and Sb is θ. Each detector produces an electrical signal proportional to the cycle-averaged intensity of the corresponding incident light.

Here the frequency component fn = nΔf arriving from Sa has the complex amplitude Ân = An√Δf exp(iφn), while that from Sb has the amplitude B̂n = Bn√Δf exp(iχn). The source Sa, being equi-distant from the two detectors, makes equal contributions to a1(t) and a2(t), whereas the contributions of Sb are shifted in time by the relative delay τb. Following the same steps that led from Eq. (9.1) to Eq. (9.4), we now determine the filtered (i.e., cycle-averaged) intensities I1(t) and I2(t) observed at the detectors of Figure 9.4. We find

\[ I_1(t) = I_{10} + \sum_{m=1}^{2M} |\hat{I}_{1m}| \cos(2\pi m\Delta f\,t + \psi_{1m}), \tag{9.14a} \]

\[ I_2(t) = I_{20} + \sum_{m=1}^{2M} |\hat{I}_{2m}| \cos(2\pi m\Delta f\,t + \psi_{2m}). \tag{9.14b} \]

In the above equations

\[ I_{10} = \tfrac{1}{2}\sum_{n=N_0-M}^{N_0+M} \left[ A_n^2 + B_n^2 + 2A_nB_n\cos(\chi_n - \phi_n) \right]\Delta f, \tag{9.14c} \]

\[ \hat{I}_{1m} = |\hat{I}_{1m}| \exp(i\psi_{1m}) = \sum_{n=N_0-M}^{N_0+M-m} (\hat{A}_n^{*} + \hat{B}_n^{*})(\hat{A}_{n+m} + \hat{B}_{n+m})\,\Delta f, \tag{9.14d} \]

\[ I_{20} = \tfrac{1}{2}\sum_{n=N_0-M}^{N_0+M} \left[ A_n^2 + B_n^2 + 2A_nB_n\cos(2\pi n\Delta f\,\tau_b + \chi_n - \phi_n) \right]\Delta f, \tag{9.14e} \]

\[ \hat{I}_{2m} = |\hat{I}_{2m}| \exp(i\psi_{2m}) = \sum_{n=N_0-M}^{N_0+M-m} \left[ \hat{A}_n^{*} + \hat{B}_n^{*}\exp(-i2\pi n\Delta f\,\tau_b) \right] \left\{ \hat{A}_{n+m} + \hat{B}_{n+m}\exp[i2\pi(n+m)\Delta f\,\tau_b] \right\}\Delta f. \tag{9.14f} \]

The normalized cross-correlation function between I1(t) and I2(t) is thus given by

\[ g_{12}^{(2)}(\tau) = \frac{\langle I_1(t)I_2(t+\tau)\rangle}{\langle I_1(t)\rangle\langle I_2(t)\rangle} = 1 + \frac{1}{2}\sum_{m=1}^{2M} \mathrm{Re}\left[ (\hat{I}_{1m}^{*}/I_{10})(\hat{I}_{2m}/I_{20}) \exp(i2\pi m\Delta f\,\tau) \right]. \tag{9.15} \]

So far we have not made any assumptions about the nature of Sa and Sb, beyond the fact that they are distant point sources with narrowband spectra centered at the same frequency f0 = N0Δf. Equations (9.13)–(9.15) are, therefore, valid for any type of light, so long as the sampled spectral amplitude and phase profiles, {(An, φn)} of Sa and {(Bn, χn)} of Sb, are dense enough to provide proper representations of the spectral density functions Â(f) and B̂(f). If the light beams emerging from the independent sources Sa and Sb happen to be chaotic (e.g., thermal), then the phase angles {φn} and {χn} will be random and uncorrelated, thus leading to

\[ I_{10} \approx I_{20} \approx \tfrac{1}{2}\sum_{n=N_0-M}^{N_0+M} (A_n^2 + B_n^2)\,\Delta f \;\rightarrow\; \tfrac{1}{2}\int_0^\infty \left[ |\hat{A}(f)|^2 + |\hat{B}(f)|^2 \right] df, \tag{9.16} \]

\[ \hat{I}_{1m}^{*}\hat{I}_{2m} \approx \sum_{n=N_0-M}^{N_0+M-m} \left\{ A_n^2A_{n+m}^2 + B_n^2B_{n+m}^2\exp(i2\pi m\Delta f\,\tau_b) + A_n^2B_{n+m}^2\exp[i2\pi(n+m)\Delta f\,\tau_b] + B_n^2A_{n+m}^2\exp(-i2\pi n\Delta f\,\tau_b) \right\}(\Delta f)^2. \tag{9.17} \]

Next we relate the various terms appearing in Eq. (9.17) to the first-order degrees of coherence of $S_a$ and $S_b$, defined as follows:

$$g_a^{(1)}(\tau) = \frac{\sum_n A_n^2\exp(i2\pi n\Delta f\,\tau)\,\Delta f}{\sum_n A_n^2\,\Delta f}, \tag{9.18a}$$

$$g_b^{(1)}(\tau) = \frac{\sum_n B_n^2\exp(i2\pi n\Delta f\,\tau)\,\Delta f}{\sum_n B_n^2\,\Delta f}. \tag{9.18b}$$

Straightforward calculations similar to those that led to Eq. (9.11) may now be used to determine the expanded forms of $|g_a^{(1)}(\tau)|^2$, $|g_b^{(1)}(\tau+\tau_b)|^2$, and $g_a^{(1)}(\tau)\,g_b^{(1)*}(\tau+\tau_b)$, which turn out to contain the various terms that appear in Eq. (9.17). One must then substitute for $(\hat{I}_{1m}/I_{10})(\hat{I}_{2m}^*/I_{20})$ in Eq. (9.15) from Eqs. (9.16) and (9.17), and proceed to replace the resulting expressions with their equivalents in terms of the aforementioned degrees of first-order coherence. This would lead, without further approximations, to the following expression for the cross-correlation function between $I_1(t)$ and $I_2(t)$ of Figure 9.4, in the special circumstance that $S_a$ and $S_b$ are both chaotic and independent:

$$g_{12}^{(2)}(\tau) \approx 1 + \frac{\left|\left[\int_0^\infty |\hat{A}(f)|^2\,df\right] g_a^{(1)}(\tau) + \left[\int_0^\infty |\hat{B}(f)|^2\,df\right] g_b^{(1)}(\tau+\tau_b)\right|^2}{\left[\int_0^\infty |\hat{A}(f)|^2\,df + \int_0^\infty |\hat{B}(f)|^2\,df\right]^2}. \tag{9.19}$$

As an example, consider the case of two independent, but otherwise identical, chaotic point sources with angular separation $\theta = 1.0$ μrad. Both sources have the same Gaussian line-shape, centered at $f_0 = 10^{12}\Delta f$, with a FWHM line width $w = 10^{3}\Delta f$. Figure 9.5 shows computed plots of $g_{12}^{(2)}(\tau)$ obtained for several values of $L$ in accordance with Eq. (9.15). In the case of $L = 0$, depicted in Figure 9.5(a), $\tau_b = 0$, and the cross-correlation function reduces to that of a single, equivalent light source. This may be understood by setting $\tau_b = 0$ in Eq. (9.19) and verifying its reduction to Eq. (9.9). In other words, since $g_a^{(1)}(\tau)$ and $g_b^{(1)}(\tau)$ are, respectively, the Fourier transforms of the normalized power spectral density functions $|\hat{A}(f)|^2$ and $|\hat{B}(f)|^2$, the weighted sum on the right-hand side of Eq. (9.19) is equal to the degree of first-order coherence of a new source whose power spectral density function is $|\hat{A}(f)|^2 + |\hat{B}(f)|^2$.

As $L$ increases, the relative delay $\tau_b$ begins to affect the cross-correlation function. This is seen in Figures 9.5(b)–(d), which correspond, respectively, to $L = 75$ m, 150 m, and 300 m. Once again, these results can be explained with reference to Eq. (9.19), which expresses $g_{12}^{(2)}(\tau)$ in terms of a superposition of the (complex) degrees of first-order coherence, $g_a^{(1)}(\tau)$ and $g_b^{(1)}(\tau+\tau_b)$. Note that, according to Eq. (9.8), the envelope of $g^{(1)}(\tau)$, the Fourier transform of the line-shape, is modulated by $\exp(i2\pi f_0\tau)$, where $f_0$ is the central frequency of the light source. Thus, when $\tau_b$ is much less than the width of the envelope, $g_b^{(1)}(\tau+\tau_b)$ essentially equals $g_b^{(1)}(\tau)\exp(i2\pi f_0\tau_b)$. At $L = 150$ m, for instance, $2\pi f_0\tau_b = \pi$, causing the terms containing $g_a^{(1)}$ and $g_b^{(1)}$ in Eq. (9.19) to cancel out; see Figure 9.5(c). At $L = 300$ m, however, the phase shift is $2\pi$, thus restoring the full height of the cross-correlation function; see Figure 9.5(d).

If $L$ is made extremely large, say $L = 5\times10^{12}$ m, then $g_a^{(1)}(\tau)$ and $g_b^{(1)}(\tau+\tau_b)$ no longer overlap, leading to the cross-correlation function depicted in Figure 9.5(e). The fact that, in the present example, the coefficients of $g_a^{(1)}(\tau)$ and $g_b^{(1)}(\tau+\tau_b)$ in Eq. (9.19) are both equal to ½ suffices to explain the reduction of the peak value of $g_{12}^{(2)}(\tau)$ from 2.0 in Figure 9.5(a) to 1.25 in Figure 9.5(e).
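The peak values quoted above can be checked with a short numerical sketch of Eq. (9.19). It is only an illustration under stated assumptions: the Gaussian line-shape is taken to give $g^{(1)}(\tau) = \exp(i2\pi f_0\tau)\exp[-\pi^2w^2\tau^2/(4\ln 2)]$, the relative delay is taken as $\tau_b = L\theta/c$, and the sampling interval is set to $\Delta f = 1$ Hz (a choice that makes $2\pi f_0\tau_b = \pi$ at $L = 150$ m). The function and constant names are hypothetical.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def g1_gaussian(tau, f0, w):
    """First-order coherence of a Gaussian line (FWHM w) centered at f0.

    Assumed closed form: Fourier transform of a normalized Gaussian
    power spectrum, i.e. a carrier exp(i 2 pi f0 tau) times a Gaussian envelope.
    """
    envelope = math.exp(-((math.pi * w * tau) ** 2) / (4.0 * math.log(2.0)))
    return envelope * cmath.exp(2j * math.pi * f0 * tau)


def g2_cross(tau, tau_b, f0, w, power_a=1.0, power_b=1.0):
    """Right-hand side of Eq. (9.19) for two independent chaotic sources.

    power_a and power_b stand for the integrated spectral powers of Sa and Sb.
    """
    numerator = (power_a * g1_gaussian(tau, f0, w)
                 + power_b * g1_gaussian(tau + tau_b, f0, w))
    return 1.0 + abs(numerator) ** 2 / (power_a + power_b) ** 2


DELTA_F = 1.0          # Hz, assumed sampling interval
F0 = 1e12 * DELTA_F    # central frequency
W = 1e3 * DELTA_F      # FWHM line width
THETA = 1.0e-6         # angular separation in radians


def peak_value(detector_separation_m):
    """Value of g12 at tau = 0 for a given detector separation L (meters)."""
    tau_b = detector_separation_m * THETA / C  # assumed small-angle delay
    return g2_cross(0.0, tau_b, F0, W)


if __name__ == "__main__":
    for L in (0.0, 75.0, 150.0, 300.0, 5e12):
        print(f"L = {L:g} m  ->  g12(0) = {peak_value(L):.4f}")
```

Under these assumptions the values at $\tau = 0$ come out close to 2.0, 1.5, 1.0, 2.0, and 1.25 for the five separations of Figure 9.5, consistent with the cancellation at 150 m and the reduced peak at very large $L$.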

Figure 9.5 Computed plots of $g_{12}^{(2)}(\tau)$ for a pair of chaotic point sources, $S_a$ and $S_b$, having angular separation $\theta = 1.0$ μrad. Both sources have a Gaussian line-shape centered at $f_0 = 10^{12}\Delta f$, with FWHM line width $w = 10^{3}\Delta f$; in these calculations $M = 2500$. The distance between the detectors is (a) $L = 0$, (b) $L = 75$ m, (c) $L = 150$ m, (d) $L = 300$ m, and (e) $L = 5\times10^{12}$ m. In each case a close-up of the central part of $g_{12}^{(2)}(\tau)$ is shown on the right-hand side. The horizontal axis is $\tau$ in units of $1/\Delta f$.


Concluding remarks

Practical photodetectors may have a narrower bandwidth than is required for producing an ideal cycle-averaged intensity $I(t)$ in a given application. In other words, the low-pass filtering mentioned in going from Eq. (9.2) to Eq. (9.3) could influence the low-frequency terms that survive the filtering. The transfer function of the detector (including all electronic circuitry) must, therefore, be included in Eq. (9.3) and all subsequent equations. For practical determinations of intensity fluctuations, of course, the effects of electronic filtering must be taken into account. However, as far as the fundamental principles discussed in this chapter are concerned, the consequences of such filtering are irrelevant, and the detection circuit’s transfer function may safely be ignored.

Another practical concern revolves around the question of noise in photodetection. The output signal from a photodetector is, in general, accompanied by several types of noise, such as shot noise, thermal noise, and the noise associated with the photo-multiplication process. Accurate measurement of intensity correlations and fluctuations requires a careful analysis of all relevant sources of noise, elimination or minimization of undesirable signals, and collection of a sufficient number of photons to ensure the adequacy of the available signal-to-noise ratio. In this context it must also be mentioned that, when measuring the intensity autocorrelation $\langle I(t)\,I(t+\tau)\rangle$ at a fixed point in space, it is advantageous to use a 50/50 beam-splitter in conjunction with two identical photodetectors, as shown in Figure 9.6. Whereas the noise or other spurious signals from a single detector could exhibit temporal correlations, a pair of well-isolated detectors is unlikely to suffer from such complications. The splitting of the beam, of course, will halve the signal strength at each detector, but, according to the classical optical theory, it should not otherwise disturb the intensity fluctuations.

Figure 9.6 The degree of second-order coherence $g^{(2)}(\tau)$ of a beam of light may be determined by two identical photodetectors D1 and D2, placed symmetrically with respect to the output ports of a 50/50 beam-splitter. According to the classical optical theory, the intensity fluctuations at the two detectors are identical with those of the light arriving at the splitter. The use of two detectors (instead of one) is thus dictated by the need to mitigate the temporal correlations of the noise (or other spurious signals).

A fundamental issue raised in the wake of the Hanbury Brown–Twiss experiment concerned the quantum nature of light and its role in determining the measured intensity fluctuations and correlations of the various types of radiation. In particular, it was pointed out that a single photon leaving the source in Figure 9.6 could be picked up by either D1 or D2, but not by both, whereas the classical theory allowed the beam-splitter to divide the photon’s energy between the two receivers. Attempts to answer this and many related questions eventually ushered in the modern era of quantum optics.²,³ The results obtained in the present chapter for classical sources of light have been found to retain their validity under a quantum mechanical treatment.³ In the meantime, however, several types of non-classical light have been discovered whose proper treatment requires the full machinery of the quantum theory of radiation and detection. A striking example of quantum-optical phenomena is anti-bunching, where the degree of second-order coherence $g^{(2)}(0)$ for certain non-classical sources is known to be below unity.³,⁴ In fact, the entire range of values between 0.0 and 1.0 is accessible to $g^{(2)}(0)$ in quantum optics. This, of course, is an impossibility in the classical theory, where Eq. (9.7) dictates that $g^{(2)}(0) \geq 1$.

References for Chapter 9

1 J. W. Goodman, Statistical Optics, Wiley, New York, 1985.
2 L. Mandel and E. Wolf, Optical Coherence and Quantum Optics, Cambridge University Press, London, 1995.
3 R. Loudon, The Quantum Theory of Light, third edition, Clarendon Press, Oxford, 2000.
4 R. Hanbury Brown and R. Q. Twiss, Correlation between photons in two coherent beams of light, Nature 177, 27–29 (1956).
5 R. Hanbury Brown and R. Q. Twiss, The question of correlation between photons in coherent light rays, Nature 178, 1447–1448 (1956).
6 R. Hanbury Brown and R. Q. Twiss, Interferometry of the intensity fluctuations in light. I. Basic theory: the correlation between photons in coherent beams of radiation, Proc. Roy. Soc. London A 242, 300–324 (1957).
7 R. Hanbury Brown and R. Q. Twiss, Interferometry of the intensity fluctuations in light. II. An experimental test of the theory for partially coherent light, Proc. Roy. Soc. London A 243, 291–319 (1958).
8 R. Hanbury Brown, The Intensity Interferometer, Taylor and Francis, London, 1974.

10 What in the world are surface plasmons?†

Despite its scary name, a surface plasmon is simply an inhomogeneous plane-wave solution to Maxwell’s equations. Typically, a medium with a large but negative dielectric constant $\varepsilon$ is a good host for surface plasmons. Because in an isotropic medium having refractive index $n$ and absorption coefficient $\kappa$ we have $\varepsilon = (n + i\kappa)^2$, whenever $\kappa \gg n$ the above criterion, large but negative $\varepsilon$, is approximately satisfied; as a result, most common metals such as aluminum, gold, and silver can exhibit resonant absorption by surface plasmon excitation. In order to excite, within a metal, a plane wave that has a large enough amplitude to carry away a significant fraction of the incident optical energy, one must create a situation whereby the metal is “forced” to accept such a wave; otherwise, as normally occurs, the wave within the metal ends up having a small amplitude, causing nearly all of the incident energy to be reflected, diffracted, or scattered from the metallic surface, depending upon the condition of that surface.

In this chapter several practical situations in which surface plasmons play a role will be presented. We begin by describing the results of an experiment that can be readily set up in any optics laboratory, and we give an explanation of the observed phenomenon by scrutinizing the well-known Fresnel’s reflection formula at a metal-to-air interface. We then describe other, slightly more complicated, situations involving the excitation of surface plasmons, in an attempt to convey to the reader the generality of the phenomenon and its various manifestations.

† The coauthor of this chapter is Lifeng Li, now at the Tsinghua University in China.

Surface plasmons in a thin metallic film

Perhaps the simplest arrangement in which one may observe surface plasmons is that shown schematically in Figure 10.1. A thin metal film, coated on the flat face of a glass hemisphere, is illuminated at oblique incidence through the hemisphere. In this example the glass will be assumed to have refractive index 1.5, and the metal film will be assumed to be aluminum, although most common metals coated on just about any type of glass will exhibit a similar behavior. A plane monochromatic beam of red HeNe light is directed at the glass–metal interface, and its reflection is monitored as a function of the angle of incidence $\theta$.

Figure 10.1 Schematic diagram showing a monochromatic plane wave incident on a thin metal film through a hemispherical glass substrate. When the film is sufficiently thin, at a specific incidence angle $\theta$ a surface plasmon is excited within the metal layer, causing a substantial fraction of the incident beam’s energy to be absorbed and converted to heat within the metal layer.

Figure 10.2(a) shows computed plots of the reflection coefficients $|r_p|$ and $|r_s|$ versus $\theta$ for the case of a very thin ($d = 5$ nm) aluminum film. At the critical angle of total internal reflection (TIR) for a glass–air interface, $\theta_{crit} = 41.8°$, the reflection coefficients show a sudden rise, but $|r_p|$ drops sharply above $\theta_{crit}$, attaining a minimum at $\theta \approx 45°$. This sharp reduction in the reflectivity of p-polarized light is due to the excitation of a surface plasmon in the aluminum layer. Figure 10.2(b) shows plots of the magnitude of the Poynting vector $S$ through the thickness of the aluminum layer for both p- and s-polarized light at the incidence angle $\theta = 45°$. Note that for the s-light, approximately 30% of the incident optical power enters the film, which then proceeds to be absorbed (rather uniformly) through the film thickness. With the p-light, however, the fraction of the incident power absorbed by the film is much higher (close to 90%). Evidently, at this particular angle of incidence the p-light has been able to excite a very strong wave in the metal layer.

Similar calculations can be done for other thicknesses of the aluminum layer; the results shown in Figure 10.3 correspond to a film thickness $d = 10$ nm. The minimum reflectivity now occurs at $\theta = 42.95°$, and the percentage of p-light absorbed by the film has climbed to over 98%. If we continue to increase the film thickness, however, the effect begins to decrease (and eventually to disappear), as demonstrated by the plots of Figure 10.4, which correspond to $d = 20$ nm. In fact, aside from the weak, plasmon-related feature in the vicinity of $\theta_{crit}$, the plots of $|r_p|$ and $|r_s|$ in Figure 10.4(a) already resemble those for a very thick aluminum film (i.e., one for which $d \gg$ skin depth). It is thus obvious that the lower interface, between aluminum and air, is responsible for the excitation of surface plasmons: increasing the film thickness prevents the electromagnetic field from reaching the aluminum–air interface, thus suppressing the excitation of the plasma wave. Also note in Figure 10.4(b) that the slope of $S_s$ is greatest near the glass–aluminum interface, and the flux of optical energy contained in the s-polarized beam decays exponentially as it moves away from this interface towards the aluminum–air interface. In contrast, the slope of $S_p$ is greatest at the aluminum–air interface, indicating that most of the energy is deposited at that site. This is yet another indication that the aluminum–air interface is responsible for the excitation of surface plasmons in the system of Figure 10.1.

Figure 10.2 (a) Computed plots of amplitude reflection coefficients for the p- and s-components of polarization versus the angle of incidence $\theta$, for the monochromatic plane wave ($\lambda = 633$ nm) incident at the interface between glass and a thin aluminum layer ($d = 5$ nm) shown in Figure 10.1. The dip in $|r_p|$ at $\theta \approx 45°$ is caused by the excitation of a surface plasmon in the aluminum film. (b) Plots of the magnitude of the Poynting vector $S$ against the depth $z$ within the aluminum layer, at $\theta = 45°$. Note that approximately 90% of the incident power of the p-polarized light enters the aluminum film and is absorbed fairly uniformly within the film’s thickness. In contrast, only 30% of the s-polarized light is absorbed by the film.
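The reflection coefficients of the glass/aluminum/air stack discussed above can be estimated with the standard single-film (Airy) formula. The sketch below is a minimal illustration, not the program used to produce the figures. It assumes the aluminum index $n + i\kappa = 1.38 + 7.6i$ that the chapter quotes for $\lambda_0 = 633$ nm, a glass index of 1.5, an $\exp(-i\omega t)$ time dependence, and square roots chosen with non-negative imaginary part. All function names are hypothetical.

```python
import cmath
import math


def kz(eps, s):
    """Normalized normal wave-vector component sqrt(eps - s^2).

    The root with non-negative imaginary part is selected, so that the wave
    either propagates forward or decays away from the interface.
    """
    root = cmath.sqrt(eps - s * s)
    return -root if root.imag < 0 else root


def film_reflection(theta_deg, d_nm, n_glass=1.5, n_film=1.38 + 7.6j,
                    wavelength_nm=633.0):
    """Return (r_p, r_s) for glass / metal film / air, incidence from the glass."""
    eps1, eps2, eps3 = n_glass ** 2, n_film ** 2, 1.0
    s = n_glass * math.sin(math.radians(theta_deg))  # k_parallel / k0
    k1, k2, k3 = kz(eps1, s), kz(eps2, s), kz(eps3, s)
    # Round-trip phase (and attenuation) factor across the film thickness.
    phase = cmath.exp(2j * (2.0 * math.pi / wavelength_nm) * d_nm * k2)

    def airy(r12, r23):
        # Single-film summation of multiple reflections.
        return (r12 + r23 * phase) / (1.0 + r12 * r23 * phase)

    rp = airy((eps2 * k1 - eps1 * k2) / (eps2 * k1 + eps1 * k2),
              (eps3 * k2 - eps2 * k3) / (eps3 * k2 + eps2 * k3))
    rs = airy((k1 - k2) / (k1 + k2), (k2 - k3) / (k2 + k3))
    return rp, rs


def absorbed_fraction(r):
    """Absorbed power fraction beyond the critical angle (no transmitted beam)."""
    return 1.0 - abs(r) ** 2


if __name__ == "__main__":
    rp, rs = film_reflection(42.95, 10.0)
    print(f"|r_p| = {abs(rp):.3f}, |r_s| = {abs(rs):.3f}")
    print(f"absorbed: p = {absorbed_fraction(rp):.1%}, s = {absorbed_fraction(rs):.1%}")
```

With these assumed material constants, a 10 nm film at $\theta = 42.95°$ absorbs nearly all of the p-polarized light, while the s-polarized beam is mostly reflected, in line with Figure 10.3. The same function could be reused for a glass/air-gap/metal stack by replacing the film index with 1 and the exit medium with the metal, although that variant is not exercised here.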

Figure 10.3 Same as Figure 10.2, except for the thickness of the aluminum film, which is now 10 nm. The resonant absorption in this case occurs at $\theta = 42.95°$, and the fraction of p-polarized light absorbed by the aluminum layer is over 98%.

A simple explanation based on Fresnel’s reflection coefficients

The Fresnel reflection coefficients at the interface between air and metal provide a good starting point for an explanation of the nature of surface plasmons and the conditions under which they occur. Consider the case of a polished metal surface of dielectric constant $\varepsilon$, upon which a monochromatic plane wave of wavelength $\lambda_0$ is incident from air, at the oblique incidence angle of $\theta$. The k-vector of the incident beam (in air) has magnitude $k_0 = 2\pi/\lambda_0$, and its projections parallel and perpendicular to the air–metal interface are denoted by $k_\parallel$ and $k_\perp$. The complex Fresnel reflection coefficients $r_p$ and $r_s$ for p- and s-polarized light are written

$$r_p = \frac{\sqrt{\varepsilon - (k_\parallel/k_0)^2} - \varepsilon k_\perp/k_0}{\sqrt{\varepsilon - (k_\parallel/k_0)^2} + \varepsilon k_\perp/k_0}, \tag{10.1}$$

$$r_s = \frac{k_\perp/k_0 - \sqrt{\varepsilon - (k_\parallel/k_0)^2}}{k_\perp/k_0 + \sqrt{\varepsilon - (k_\parallel/k_0)^2}}. \tag{10.2}$$

Figure 10.4 Same as Figures 10.2 and 10.3, except for the thickness of the aluminum film, which is now 20 nm. The resonant absorption in this case occurs at $\theta = 42.41°$, and the fraction of p-polarized light absorbed within the aluminum layer is just over 60%.

The denominator of the expression for $r_p$ in Eq. (10.1) goes to zero at $k_\parallel/k_0 = \sqrt{\varepsilon/(1+\varepsilon)}$, indicating that $r_p$ has a pole at this point. No such pole, however, exists for $r_s$. In the case of aluminum, $n + i\kappa = 1.38 + 7.6i$ at $\lambda_0 = 633$ nm, yielding $\varepsilon = -55.86 + 20.98i$; this results in a value $1.008 + 0.003i$ for the pole of $r_p$. Under ordinary circumstances, when the metal surface is illuminated in air at an oblique angle $\theta$ we have $k_\parallel/k_0 = \sin\theta$, which is less than unity and, therefore, far from the pole. However, if evanescent waves are somehow created at an air–aluminum interface, then $k_\parallel/k_0$ can exceed unity and, in the neighborhood of the pole, the reflection coefficient $r_p$ at that interface becomes very large. This means that an evanescent p-polarized plane wave of very small amplitude impinging at the metal surface can excite a very strong plane wave within the metal. This plane wave, of course, is the surface plasmon, which is capable of absorbing a good fraction of the energy from the incident beam and converting it to heat within the metallic medium.

In light of the above arguments it is not difficult to see that, in the system of Figure 10.1, the creation of evanescent waves with $k_\parallel/k_0 \approx 1$ at the aluminum–air interface is responsible for the sharp decline in $r_p$ at angles slightly greater than the critical TIR angle. Since the expression for $r_s$ in Eq. (10.2) does not admit a pole, no such behavior could be expected from the s-polarized light.

Attenuated total internal reflection (ATIR)

Another setup in which the excitation of surface plasmons is readily observed is shown in Figure 10.5. (Some results from this type of experiment are also described in Chapter 27, “Some quirks of total internal reflection”.) Here the presence of an air gap between the glass hemisphere and the metal plate guarantees the creation of evanescent waves, whereas in the setup of Figure 10.1 the metal layer had to be sufficiently thin to provide access for the electromagnetic waves to the metal–air interface.

Figure 10.5 Schematic diagram showing a monochromatic plane wave at oblique incidence on the flat surface of a glass hemisphere. When the air gap separating the hemisphere and the polished metal surface is sufficiently thin, and at a specific angle of incidence $\theta$, a substantial fraction of the incident beam will be coupled into a surface plasmon and thus absorbed by the metal plate.

For the system of Figure 10.5, computed reflection coefficients versus the gap-width are plotted in Figure 10.6 for several angles of incidence in the vicinity of $\theta_{crit}$. Because the variations of $r_s$ were imperceptible within the chosen range of incidence angles, $41° \leq \theta \leq 43°$, it was deemed pointless to label the various coinciding $r_s$ curves. In contrast, $r_p$ was very sensitive to changes in $\theta$, and the various $r_p$ curves in Figure 10.6 are clearly labeled to indicate this dependence. A dip in the plots of $r_p$ versus the gap-width begins to appear at angles of incidence just below $\theta_{crit}$; the dip becomes more pronounced with an increasing $\theta$ until the minimum reflectivity actually reaches zero at $\theta = 41.95°$. The dip then decreases with further increases in $\theta$ and, by the time $\theta$ reaches $43°$, it has practically disappeared. As before we observe in the plots of Figure 10.6 the salient features of absorption by surface plasmon excitation, namely, a p-polarized incident beam, the existence of evanescent waves at an air–metal interface, and an angle of incidence in the vicinity of the critical TIR angle for the glass–air interface, that is, when $k_\parallel/k_0 \approx 1$.

Figure 10.6 Computed plots of amplitude reflection coefficients versus the width of the air gap for the experiment depicted in Figure 10.5. Each curve represents a specific incidence angle $\theta$ in the range ($41°$, $43°$). The reflection coefficient $|r_s|$ for the s-polarized light does not change very much in this narrow range of incident angles; thus all the $|r_s|$ curves coincide at this scale. The dip in $|r_p|$ begins to appear at angles of incidence just below the critical TIR angle ($\theta_{crit} = 41.81°$); it is most pronounced at $\theta = 41.95°$ and then decreases again to insignificance at $\theta = 43°$. Note also that the gap-width at which $|r_p|$ reaches a minimum varies with the angle of incidence. (a) $\theta$-values from $41.0°$ to $41.95°$; (b) $\theta$-values from $42.0°$ to $43.0°$.
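The numbers quoted in the Fresnel argument above (the dielectric constant of aluminum, the location of the pole of $r_p$, and the critical TIR angle) are easy to reproduce. The following sketch evaluates Eqs. (10.1) and (10.2) for real or evanescent values of $k_\parallel/k_0$. It assumes that both square roots are taken with non-negative imaginary part, a branch choice that is not spelled out in the text; the helper names are hypothetical.

```python
import cmath
import math


def root_up(z):
    """Square root with non-negative imaginary part (assumed branch choice)."""
    r = cmath.sqrt(z)
    return -r if r.imag < 0 else r


def fresnel_air_metal(s, eps):
    """Evaluate Eqs. (10.1) and (10.2) at an air-metal interface.

    s is k_parallel / k0; values above 1 correspond to evanescent waves in air.
    """
    k_perp = root_up(1.0 - s * s)    # k_perp / k0 on the air side
    k_metal = root_up(eps - s * s)   # sqrt(eps - (k_parallel / k0)^2)
    rp = (k_metal - eps * k_perp) / (k_metal + eps * k_perp)
    rs = (k_perp - k_metal) / (k_perp + k_metal)
    return rp, rs


EPS_AL = (1.38 + 7.6j) ** 2                      # aluminum at 633 nm
POLE = cmath.sqrt(EPS_AL / (1.0 + EPS_AL))       # pole of r_p in k_parallel / k0
THETA_CRIT = math.degrees(math.asin(1.0 / 1.5))  # glass-air critical angle

if __name__ == "__main__":
    print(f"epsilon = {EPS_AL:.2f}")
    print(f"pole of r_p at k_par/k0 = {POLE:.4f}")
    print(f"critical angle = {THETA_CRIT:.2f} degrees")
    rp, rs = fresnel_air_metal(1.008, EPS_AL)
    print(f"near the pole: |r_p| = {abs(rp):.2f}, |r_s| = {abs(rs):.2f}")
```

Evaluating $r_p$ at the real value $k_\parallel/k_0 = 1.008$, next to the complex pole, gives a magnitude well above unity, whereas $|r_s|$ stays below one; for ordinary incidence ($k_\parallel/k_0 = \sin\theta < 1$) both magnitudes remain below one.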

Excitation of surface plasmons in metallized diffraction gratings

A third experiment in which surface plasmons may be observed involves the reflection of light from metallized diffraction gratings. Here, as shown in Figure 10.7, the incident beam excites one or more propagating diffracted orders but also creates non-propagating evanescent waves near the surface. Whenever one of these evanescent waves happens to have the $k_\parallel$ of a surface plasmon, the conditions for its excitation are ripe, and a good fraction of the optical energy will be coupled into the grating medium. As might be guessed from the cases discussed in the preceding examples, the range of parameters over which surface plasmon excitation can be expected is very narrow and, therefore, the angle of incidence at which surface plasmons are excited must be sharply defined. If one directs a focused beam onto a metal grating, as shown in Figure 10.8, then a wide angular spectrum will be present in the beam, and some of the rays will be strongly absorbed. A photograph of the reflected beam at the exit pupil of the lens will show one or more dark lines corresponding to the absorption of surface plasmons within the grating.

Figure 10.7 A monochromatic plane wave incident on a metal grating creates multiple diffracted orders within the reflected beam. At certain angles of incidence, when the polarization of the beam happens to have a component perpendicular to the grooves, surface plasmons are excited within the grating. These plasmons, which can convert a good fraction of the incident optical power into heat, cause a sudden and substantial drop in the diffraction efficiencies of the various orders.

Figure 10.8 A good method of observing surface plasmons in practice involves the reflection of a focused beam of light from the grooved surface of a metal grating. Because the focused cone contains rays within a wide range of angles of incidence, upon reflection from the grating dark bands will appear in the exit pupil of the lens corresponding to those rays that have succeeded in exciting the surface plasmons. A camera, set up to photograph the reflected beam at the exit pupil of the lens, will not only reveal the narrow bands corresponding to surface plasmons but also show the superposition of the various diffracted orders captured by the lens. To observe the surface plasmon bands, one must allow the incident beam to have a component of polarization perpendicular to the grooves of the grating. In this figure, $E_s$ is in the direction that will excite the plasmons.

Figure 10.9 shows a typical set of results obtained in an experiment of this type. When the polarization is parallel to the grooves, as is the case in Figure 10.9(a), there are no surface plasmon bands. However, with the polarization vector perpendicular to the grooves, surface plasmons are clearly excited, as shown in Figure 10.9(b).

Figure 10.9 Photographs showing the intensity distribution at the exit pupil of a 0.8NA microscope objective lens, through which a collimated beam of laser light ($\lambda = 633$ nm) is focused on a gold-coated diffraction grating; $(n, k)_{gold} = (0.13, 3.16)$. The grooves of the grating are oriented along the Y-axis, the grating period is 1.6 μm, and the grooves, which have a trapezoidal cross-section, are 0.5 μm wide at the top and 70 nm deep. The direction of the linear polarization of the incident beam is parallel to the grooves in (a) and perpendicular to the grooves in (b). (From Ronald E. Gerber, Ph.D. dissertation, Optical Sciences Center, University of Arizona, Tucson.)

Results of theoretical calculations confirming these results are shown in Figures 10.10 and 10.11. In these calculations, Maxwell’s equations were solved for about 10 000 plane waves impinging on the metal grating at various angles. These results were then combined to represent the focused cone of light created by a 0.8NA objective lens. In the case of Figure 10.10, where the incident polarization vector was parallel to the grooves, no plasmons were observed. We did the calculations for three different positions of the focused spot over the grooves, however, to show the so-called baseball pattern that results from superposition of the various diffracted orders. Frames (a), (b), and (c) correspond respectively to a beam focused on one groove edge, on the middle of a groove, and on an opposite groove edge. The phase differences between various diffracted orders create constructive and destructive interference among these various orders in their regions of mutual overlap, thus giving rise to black and white areas. When the polarization is perpendicular to the grooves, the pattern in Figure 10.11 is obtained. For this computation the position of the focused spot on the grating was on a groove edge similar to that shown in Figure 10.10(c). The dark bands of Figure 10.11, predicted by this theoretical calculation to arise from surface plasmon excitation, agree quite well with the experimental results of Figure 10.9(b).

Figure 10.10 Computed plots of intensity distribution at the exit pupil of a 0.8NA objective lens through which a uniform plane wave is focused on a diffraction grating. The grooves are oriented at $45°$ relative to the X-axis. The parameters of the grating are the same as those used in the experiment (see the caption to Figure 10.9). The various diffraction orders are clearly visible in these so-called “baseball” patterns. The incident linear polarization is parallel to the grooves, thus explaining the absence of plasmon-related dark bands in these pictures. The center of the focused spot is (a) on a groove edge, (b) in the middle of a groove, and (c) on the opposite groove edge.
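The matching condition described in this section, an evanescent diffracted order acquiring the $k_\parallel$ of a surface plasmon, can be turned into a rough estimate of where dark bands may appear in the pupil. The sketch below is only indicative and rests on several assumptions that are not part of the text: the plane of incidence is taken perpendicular to the grooves, the grating equation $k_{\parallel,m}/k_0 = \sin\theta + m\lambda/p$ is used, and the plasmon index is approximated by the flat-interface value $\mathrm{Re}\sqrt{\varepsilon/(1+\varepsilon)}$, ignoring the effect of the grooves themselves. The function names are hypothetical, and the parameters are those of the caption to Figure 10.9.

```python
import cmath


def spp_index(n_complex):
    """Flat-interface plasmon index sqrt(eps / (1 + eps)) for a metal in air."""
    eps = n_complex ** 2
    return cmath.sqrt(eps / (1.0 + eps))


def coupling_sines(n_metal, wavelength_um, period_um, numerical_aperture,
                   max_order=10):
    """List (order m, sin(theta)) pairs for which order m matches the plasmon.

    Only rays inside the focused cone, |sin(theta)| <= NA, are kept.
    """
    target = spp_index(n_metal).real
    matches = []
    for m in range(-max_order, max_order + 1):
        if m == 0:
            continue  # the zeroth order is never evanescent for |sin(theta)| < 1
        for sign in (+1.0, -1.0):  # plasmon travelling along +/- the grating vector
            s = sign * target - m * wavelength_um / period_um
            if abs(s) <= numerical_aperture:
                matches.append((m, s))
    return sorted(matches, key=lambda pair: pair[1])


if __name__ == "__main__":
    gold = 0.13 + 3.16j
    print(f"plasmon index = {spp_index(gold):.4f}")
    for m, s in coupling_sines(gold, 0.633, 1.6, 0.8):
        print(f"order {m:+d}: sin(theta) = {s:+.4f}")
```

Whether a given candidate produces a visible band depends on the coupling strength of that order, which this estimate does not address; a rigorous grating calculation of the kind cited in the chapter is needed for that.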

Figure 10.11 Logarithmic plot of computed intensity distribution at the exit pupil of a 0.8NA objective lens. The simulation parameters are the same as those used to obtain Figure 10.10(c), with the exception of the direction of incident polarization, which is perpendicular to the grooves. The grooves are oriented at $45°$ to the X-axis, and the focused spot is centered on the edge of a groove. The absorption bands caused by the excitation of surface plasmons are identical to those observed experimentally in Figure 10.9(b).

References for Chapter 10

1 R. E. Gerber, Lifeng Li, and M. Mansuripur, Effects of surface plasmon excitations on the irradiance pattern of the return beam in optical disk data storage, Appl. Opt. 34, 4929–4936 (1995).
2 R. W. Wood, On a remarkable case of uneven distribution of light in a diffraction grating spectrum, Phil. Mag. 4, 396–402 (1902).
3 Lifeng Li, Multilayer-coated diffraction gratings: differential method of Chandezon et al. revisited, J. Opt. Soc. Am. A 11, 2816–2828 (1994).
4 R. H. Ritchie, Plasma losses by fast electrons in thin films, Phys. Rev. 106, 874–881 (1957).
5 For the computations leading to Figures 10.10 and 10.11, reflection coefficients of the grating were first computed by a vector diffraction program developed by Lifeng Li. These coefficients were subsequently imported to DIFFRACT, where they were combined to represent the effects of a focused beam.
6 J. C. Quail, J. G. Rako, and H. J. Simpson, Long-range surface plasmon modes in silver and aluminum, Opt. Lett. 8, 377 (1983).
7 D. Sarid, Long-range surface-plasma waves on very thin metal films, Phys. Rev. Lett. 47, 1927 (1981).
8 A. D. Boardman, ed., Electromagnetic Surface Modes, Wiley, New York, 1982.
9 A. E. Craig, A. Olson, and D. Sarid, Experimental observation of the long-range surface-plasmon polariton, Opt. Lett. 8, 380 (1983).

11 Surface plasmon polaritons on metallic surfaces†

Recent advances in nano-fabrication have enabled a host of nano-photonic experiments involving subwavelength metallic structures.¹,²,³,⁴,⁵ This flurry of activity has, in turn, reawakened interest in surface plasmon polaritons (SPPs) and inspired theoretical research in this area. Although the fundamental properties of SPPs have been known for nearly five decades,⁶,⁷ there remain certain subtle issues that could benefit from further critical analysis. The goal of the present chapter is to use numerical simulations to verify the detailed structure of long-range SPPs. We present field distribution profiles and energy flow patterns aimed at promoting a physical understanding of SPP generation and propagation in ways that mathematical equations alone cannot convey. Thus, beginning with Maxwell’s equations in the next section, we determine the electromagnetic eigenmodes confined to flat metallo-dielectric interfaces. The behavior of these modes will then be examined through computer simulations that show the excitation of SPPs in certain practical settings. Our numerical computations are based on the Finite Difference Time Domain (FDTD) method.⁸

† This chapter is co-authored with Armis R. Zakharian, now with Corning Corp., and Jerome V. Moloney of the University of Arizona.

General Formulation

With reference to Figure 11.1, in a homogeneous medium of dielectric constant $\varepsilon$ the propagation vector is $\mathbf{k} = k_0(\sigma_y\hat{\mathbf{y}} + \sigma_z\hat{\mathbf{z}})$, where $k_0 = 2\pi/\lambda_0$ and $\sigma_y^2 + \sigma_z^2 = \varepsilon$. In general, $\sigma_z = \pm\sqrt{\varepsilon - \sigma_y^2}$, with both plus and minus signs admissible. In each of the semi-infinite cladding media, however, only one value of $\sigma_z$ is allowed, corresponding to the solution that approaches zero when $z \rightarrow \pm\infty$. This is why $\sigma_{z1}$ of the upper cladding in Figure 11.1 is chosen to have a plus sign, whereas that of the lower cladding has a minus sign. ($\sigma_{z1}$, $\sigma_{z2}$ have positive imaginary parts.)

[Figure 11.1: diagram of the slab geometry in the yz-plane. Labels: field components $E_y$, $E_z$, $H_x$; slab thickness $w$; propagation vectors $\mathbf{k}_1 = k_0(\sigma_y\hat{\mathbf{y}} + \sigma_{z1}\hat{\mathbf{z}})$ in the upper cladding ($\varepsilon_1$), $\mathbf{k}_2 = k_0(\sigma_y\hat{\mathbf{y}} \pm \sigma_{z2}\hat{\mathbf{z}})$ in the slab ($\varepsilon_2$), and $\mathbf{k}_3 = k_0(\sigma_y\hat{\mathbf{y}} - \sigma_{z1}\hat{\mathbf{z}})$ in the lower cladding ($\varepsilon_1$).]

Figure 11.1 Slab of thickness $w$ and dielectric constant $\varepsilon_2$, sandwiched between two homogeneous, semi-infinite media of dielectric constant $\varepsilon_1$. An electromagnetic mode of the structure consists of two (generally inhomogeneous) plane-waves within the slab and a single (inhomogeneous) plane-wave in each of the surrounding media. Continuity of the fields at $z = \pm\tfrac{1}{2}w$ requires that $k_y = k_0\sigma_y$ be the same for all these plane-waves. Although the polarization state of the mode can, in general, be either TE or TM, only TM modes are considered in this chapter. The magnetic field, therefore, has a single component $H_x$ along the x-axis, while the electric field has two components ($E_y$, $E_z$) in the yz-plane. Throughout the chapter $\lambda_0 = 650$ nm and the metallic medium is silver, having $\varepsilon = -19.6224 + 0.443i$ (corresponding to $n + i\kappa = 0.05 + 4.43i$).
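The correspondence between the complex refractive index and the dielectric constant of silver quoted in the caption can be checked by hand. The short worked line below assumes only the standard relation $\varepsilon = (n + i\kappa)^2$ for a non-magnetic medium:

$$(n + i\kappa)^2 = n^2 - \kappa^2 + 2in\kappa = (0.05)^2 - (4.43)^2 + 2i(0.05)(4.43) = 0.0025 - 19.6249 + 0.443i = -19.6224 + 0.443i.$$

The large negative real part is what makes the metal surface capable of supporting a bound TM mode, while the small imaginary part sets the propagation loss.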

The E- and H-fields of each plane-wave are related through the Maxwell equation $\nabla \times \mathbf{H} = \partial\mathbf{D}/\partial t$ (where $\mathbf{D} = \varepsilon_0\varepsilon\mathbf{E}$) as follows:

$$H_x(y, z, t) = H_0 \exp\{i[k_0(\sigma_y y \pm \sigma_z z) - \omega t]\}, \tag{11.1a}$$

$$E_y(y, z, t) = \mp(Z_0\sigma_z/\varepsilon)\, H_x(y, z, t), \tag{11.1b}$$

$$E_z(y, z, t) = (Z_0\sigma_y/\varepsilon)\, H_x(y, z, t). \tag{11.1c}$$

Here $H_0$ is the (complex) amplitude of the magnetic field, $Z_0 = \sqrt{\mu_0/\varepsilon_0} \approx 377\,\Omega$ is the impedance of free space, and $\omega = k_0 c = k_0/\sqrt{\mu_0\varepsilon_0}$ is the temporal frequency of the light wave. The time-dependence factor $\exp(-i\omega t)$ will be omitted in the following discussion.

We confine our attention to symmetric structures where both cladding media have the same dielectric constant $\varepsilon_1$. In general, the modal fields are either odd or even with respect to the y-axis, allowing one to express the H-field of a given mode as follows ($\pm$ signs for even and odd modes, respectively):

$$H_x(y, z) = \begin{cases} H_1 \exp(ik_0\sigma_y y)\exp[ik_0\sigma_{z1}(z - \tfrac{1}{2}w)], & z \ge +\tfrac{1}{2}w; \\ H_2 \exp(ik_0\sigma_y y)\,[\exp(ik_0\sigma_{z2}z) \pm \exp(-ik_0\sigma_{z2}z)], & |z| \le \tfrac{1}{2}w; \\ \pm H_1 \exp(ik_0\sigma_y y)\exp[-ik_0\sigma_{z1}(z + \tfrac{1}{2}w)], & z \le -\tfrac{1}{2}w. \end{cases} \tag{11.2}$$

The corresponding E-field for each mode can be found from Eqs. (11.1). Continuity of $H_x$ and $E_y$ at the $z = \pm\tfrac{1}{2}w$ boundaries yields

$$H_2[\exp(ik_0\sigma_{z2}w/2) \pm \exp(-ik_0\sigma_{z2}w/2)] = H_1, \tag{11.3a}$$

$$Z_0 H_2(\sigma_{z2}/\varepsilon_2)[\exp(ik_0\sigma_{z2}w/2) \mp \exp(-ik_0\sigma_{z2}w/2)] = Z_0 H_1(\sigma_{z1}/\varepsilon_1). \tag{11.3b}$$

Substituting for $H_1$ from Eq. (11.3a) into Eq. (11.3b), rearranging the terms, and expressing $\sigma_{z1}$ and $\sigma_{z2}$ in terms of $\sigma_y$, we find:

$$\frac{\varepsilon_1\sqrt{\varepsilon_2 - \sigma_y^2} - \varepsilon_2\sqrt{\varepsilon_1 - \sigma_y^2}}{\varepsilon_1\sqrt{\varepsilon_2 - \sigma_y^2} + \varepsilon_2\sqrt{\varepsilon_1 - \sigma_y^2}}\, \exp\!\left(ik_0\sqrt{\varepsilon_2 - \sigma_y^2}\; w\right) = \pm 1. \tag{11.4}$$

This transcendental equation in $\sigma_y = \sigma_y^{(r)} + i\sigma_y^{(i)}$ is the characteristic equation of the waveguide depicted in Figure 11.1. Each solution $\sigma_y$ of Eq. (11.4) corresponds to a particular mode of the waveguide; when the plus (minus) sign is used on the right-hand side of Eq. (11.4), the solution represents an even (odd) mode. Since we are presently interested in modes that propagate from left to right in Figure 11.1, the imaginary part of $\sigma_y$ must be non-negative (i.e., $\sigma_y^{(i)} \ge 0$), otherwise the mode will grow exponentially as $y \to \infty$. Also, when computing the complex square roots in Eq. (11.4), one must always choose the root which has a positive imaginary part.

Note that the coefficient multiplying the complex exponential on the left-hand side of Eq. (11.4) is the Fresnel reflection coefficient $r_p$ for a p-polarized (TM) plane-wave at the interface between media of dielectric constants $\varepsilon_1$ and $\varepsilon_2$. The Fresnel coefficient has a singularity (pole) at $\sigma_y = \sqrt{\varepsilon_1\varepsilon_2/(\varepsilon_1 + \varepsilon_2)}$, where its denominator vanishes. The function on the left-hand side of Eq. (11.4) thus varies rapidly in the vicinity of this pole, where some of the solutions of the equation are to be found. In particular, when $w \to \infty$, the complex exponential approaches zero and the pole itself becomes a solution. This can be seen most readily with reference to Eqs. (11.3); by allowing $\exp(+ik_0\sigma_{z2}w/2) \to 0$ and substituting for $H_1$ from Eq. (11.3a) into Eq. (11.3b), we find $\sigma_{z2}/\varepsilon_2 = -\sigma_{z1}/\varepsilon_1$, namely,

$$\varepsilon_1\sqrt{\varepsilon_2 - \sigma_y^2} + \varepsilon_2\sqrt{\varepsilon_1 - \sigma_y^2} = 0.$$
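The pole of the Fresnel coefficient can be checked numerically. The sketch below is an illustration rather than the authors' code: the function names (`sigma_z`, `fresnel_rp`, `rp_denominator`, `sigma_spp`) are hypothetical, and it assumes the chapter's silver/free-space values together with the root convention stated above (square roots taken with positive imaginary part).

```python
import cmath

# Chapter values: free-space cladding and silver at 650 nm.
EPS1 = 1.0 + 0j
EPS2 = -19.6224 + 0.443j


def sigma_z(eps, sigma_y):
    """Return sqrt(eps - sigma_y^2), choosing the root with Im >= 0."""
    root = cmath.sqrt(eps - sigma_y ** 2)
    return root if root.imag >= 0 else -root


def rp_denominator(sigma_y, eps1=EPS1, eps2=EPS2):
    """Denominator of the coefficient in Eq. (11.4)."""
    return eps1 * sigma_z(eps2, sigma_y) + eps2 * sigma_z(eps1, sigma_y)


def fresnel_rp(sigma_y, eps1=EPS1, eps2=EPS2):
    """Coefficient multiplying the exponential in Eq. (11.4)."""
    sz1 = sigma_z(eps1, sigma_y)
    sz2 = sigma_z(eps2, sigma_y)
    return (eps1 * sz2 - eps2 * sz1) / (eps1 * sz2 + eps2 * sz1)


def sigma_spp(eps1=EPS1, eps2=EPS2):
    """Location of the pole, sqrt(eps1*eps2/(eps1 + eps2))."""
    return cmath.sqrt(eps1 * eps2 / (eps1 + eps2))


if __name__ == "__main__":
    pole = sigma_spp()
    print("sigma_spp          =", pole)
    print("|denominator|      =", abs(rp_denominator(pole)))
    print("|r_p| near the pole =", abs(fresnel_rp(pole + 1e-6)))
```

Evaluating `rp_denominator` at `sigma_spp()` should give a value at the level of floating-point round-off, while `fresnel_rp` grows without bound as its argument approaches the pole, which is the rapid variation referred to in the text.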

Table 11.1. First few solutions of Eq. (11.4) for a 50 nm-thick silver slab ($\lambda_0 = 650$ nm, $\varepsilon_1 = 1.0$, $\varepsilon_2 = -19.6224 + 0.443i$)

$\sigma_y^{(+)}$ (even modes)          $\sigma_y^{(-)}$ (odd modes)
1.017 + i 0.22 × 10⁻³          1.041 + i 1.39 × 10⁻³
0.171 + i 7.868                0.204 + i 13.738
0.1145 + i 7.860               0.1722 + i 13.729
0.211 + i 20.0012              0.2135 + i 26.3795
0.1892 + i 19.992              0.1965 + i 26.3698

Metallic slab in the free space

Consider the case of $\varepsilon_1 = 1.0$, $\varepsilon_2 = -19.6224 + 0.443i$ (silver at $\lambda_0 = 650$ nm). Fixing the slab’s thickness at $w = 50$ nm and searching the complex plane for solutions of Eq. (11.4) yields the first few values of $\sigma_y^{(\pm)} = \sigma_y^{(r)} + i\sigma_y^{(i)}$ listed in Table 11.1; the $\pm$ superscripts identify the even and odd modes, respectively. (Only solutions having non-negative values of $\sigma_y^{(i)}$ are considered so that, as $y \to +\infty$, the corresponding modes will decay to zero.) Although we will be concerned mainly with the top two (fundamental) solutions in Table 11.1, there exist an infinite number of solutions with large values of $\sigma_y^{(i)}$. The latter are generally needed to match the boundary conditions upon launching an SPP; otherwise, due to their rapid decay along the y-axis, modes with large $\sigma_y^{(i)}$ do not appear to have any practical significance.

As the slab thickness $w$ increases, the fundamental solutions (first row of Table 11.1) approach each other, reaching the common value of $\sigma_{\rm spp} = \sqrt{\varepsilon_1\varepsilon_2/(\varepsilon_1 + \varepsilon_2)} = 1.0265 + i\,0.6217 \times 10^{-3}$. In contrast, reducing the slab thickness causes the fundamental solutions to move apart (and also further away from $\sigma_{\rm spp}$). As $w \to 0$, the even solution approaches $\sigma_y = \sqrt{\varepsilon_1}$, while the odd solution acquires a large $\sigma_y^{(r)}$ and a fairly large $\sigma_y^{(i)}$. Table 11.2 lists the fundamental solutions of Eq. (11.4) for a range of values of $w$.
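The search for solutions of Eq. (11.4) can be illustrated with a small root refiner. The sketch below is not the procedure used to produce the tables (the text does not describe it); it simply applies Newton's method with a finite-difference derivative, starting from the rounded fundamental values tabulated for the 50 nm slab. The function names and the `parity` argument (+1 for even, −1 for odd) are hypothetical.

```python
import cmath

LAMBDA0_NM = 650.0            # vacuum wavelength used throughout the chapter
EPS1 = 1.0 + 0j               # cladding (free space)
EPS2 = -19.6224 + 0.443j      # silver at 650 nm


def sigma_z(eps, sigma_y):
    """sqrt(eps - sigma_y^2) with the positive-imaginary-part root."""
    root = cmath.sqrt(eps - sigma_y ** 2)
    return root if root.imag >= 0 else -root


def characteristic(sigma_y, w_nm, parity):
    """Left side of Eq. (11.4) minus its right side (+1 even, -1 odd)."""
    k0 = 2 * cmath.pi / LAMBDA0_NM
    sz1 = sigma_z(EPS1, sigma_y)
    sz2 = sigma_z(EPS2, sigma_y)
    rp = (EPS1 * sz2 - EPS2 * sz1) / (EPS1 * sz2 + EPS2 * sz1)
    return rp * cmath.exp(1j * k0 * sz2 * w_nm) - parity


def refine_root(guess, w_nm, parity, tol=1e-12, max_iter=50, h=1e-7):
    """Newton iteration; the derivative is a central finite difference."""
    s = complex(guess)
    for _ in range(max_iter):
        f = characteristic(s, w_nm, parity)
        if abs(f) < tol:
            break
        dfds = (characteristic(s + h, w_nm, parity)
                - characteristic(s - h, w_nm, parity)) / (2 * h)
        s -= f / dfds
    return s


if __name__ == "__main__":
    # Starting guesses are the rounded fundamental values for w = 50 nm.
    even = refine_root(1.017 + 0.22e-3j, 50.0, +1)
    odd = refine_root(1.041 + 1.39e-3j, 50.0, -1)
    print("even mode:", even)
    print("odd mode: ", odd)
```

Because the left-hand side of Eq. (11.4) varies rapidly near the pole, Newton's method is only reliable with a good starting guess; a coarse scan of the complex plane, as the text describes, would normally supply one.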

Table 11.2. Fundamental modes of silver slabs of differing thickness ($\lambda_0 = 650$ nm, $\varepsilon_1 = 1.0$, $\varepsilon_2 = -19.6224 + 0.443i$)

w (nm)    $\sigma_y^{(+)}$ (even mode)        $\sigma_y^{(-)}$ (odd mode)
5         1.0003 + i 0.7 × 10⁻⁶       2.3422 + i 0.0432
10        1.0012 + i 3.7 × 10⁻⁶       1.4657 + i 0.01745
50        1.017 + i 0.22 × 10⁻³       1.041 + i 1.39 × 10⁻³
65        1.021 + i 0.35 × 10⁻³       1.033 + i 1.01 × 10⁻³
200       1.026 + i 0.62 × 10⁻³       1.026 + i 0.6235 × 10⁻³
∞         1.026 + i 0.6219 × 10⁻³     1.026 + i 0.6219 × 10⁻³

Prism-coupling

To excite SPPs on the flat surface of a metal slab, one may use the prism-coupling scheme of Figure 11.2, commonly referred to as the Kretschmann or Otto configuration depending on whether the metal is thin or thick.6 The incident beam arrives at the bottom of the prism (refractive index $= n_0$) at an angle $\theta$ slightly greater than the critical angle $\theta_c$ of total internal reflection. Since $k_y = k_0 n_0 \sin\theta$, the waves coupled to the metal slab will have $k_y > k_0$, a basic requirement for SPP excitation. The incident beam, being mildly focused, has a k-space spectrum that spans a few degrees around $\theta_c$. Most of these k-vectors are reflected at the prism’s base; however, a narrow range of incidence angles evanescently couples to the metal surface and proceeds to excite the plasmons.

[Figure 11.2: prism-coupling geometry. Labels: $E_y$, $E_z$, $H_x$, incidence angle $\theta$, prism index $n_0$, gap ($\varepsilon_1$), metal slab of thickness $w$ ($\varepsilon_2$), lower medium ($\varepsilon_1$), axes $y$ and $z$; inset showing a half-prism.]

Figure 11.2 A Gaussian beam of wavelength $\lambda_0$ is focused at the bottom of a glass prism of refractive index $n_0$. The angle of incidence $\theta$ is slightly greater than the critical angle $\theta_c$ of total internal reflection. (In this two-dimensional system, the beam is focused by a cylindrical lens; its shape, therefore, does not vary along the x-axis.) The beam is linearly polarized, with H-field along x and E-field components ($E_y$, $E_z$) confined to the plane of incidence. A small gap separates a metallic slab of thickness $w$ and dielectric constant $\varepsilon_2$ from the prism; the medium immediately above and below the slab has dielectric constant $\varepsilon_1$. The inset shows a half-prism that can be used to eliminate the back-coupling of the surface plasmon(s), excited on the metallic surface(s), into the prism.

Figure 11.3 shows computed plots of the Fresnel reflection coefficient $r_p = |r_p| \exp(i\phi_p)$ for a p-polarized plane-wave versus the incidence angle $\theta$ at the bottom of the prism ($n_0 = 1.5$, $\lambda_0 = 650$ nm). In Figure 11.3(a), corresponding to the case of a thick silver slab separated by a 1078 nm air gap, there is a single resonant absorption at $\theta = 43.15°$. Due to the narrow range of k-vectors that cross the gap, we expect the footprint of the beam on the metal surface to be much wider than the diameter of the focused spot at the prism’s base. The rapid variation of the phase $\phi_p$ in the vicinity of the resonance implies that the footprint on the metal surface will not be centered under the incident spot but, rather, it will be shifted to the right. Figure 11.3(b), corresponding to a 65 nm-thick silver slab separated from the prism by a 950 nm air gap, exhibits two resonant absorptions, representing the odd and even modes of the metallic slab. The first resonance at $\theta_1 = 42.86°$, having the smaller value of $k_y$, excites the even mode, while the second resonance, at $\theta_2 = 43.52°$, excites the odd mode.

[Figure 11.3: two panels, (a) and (b), each plotting the amplitude $|r_p|$ and the phase $\phi_p$ (in degrees) versus the incidence angle $\theta$ from 40° to 50°.]

Figure 11.3 Plots of the Fresnel reflection coefficient for p-polarized (TM) light, $r_p = |r_p| \exp(i\phi_p)$, versus the incidence angle $\theta$ at the bottom of the prism of Figure 11.2; $\lambda_0 = 650$ nm, $n_0 = 1.5$. (a) The case of a thick silver slab separated from the prism by a 1078 nm air gap. (b) The case of a 65 nm-thick silver slab separated by a 950 nm air gap. In each case the gap is optimized to enhance the strength of the excited plasmon(s).

The coupling of a focused beam of light through a glass prism to a thick (semi-infinite) silver slab is depicted in Figures 11.4 and 11.5. At the base of the prism, the Gaussian beam’s full-width-at-half-maximum-amplitude (FWHM) is 4.0 μm, the central ray’s incidence angle is $\theta = 43°$, and the air gap is 1.078 μm. The expected SPP wavelength, $\lambda_0/{\rm Re}[\sigma_{\rm spp}] = 633.22$ nm, is consistent with $\lambda_0/(n_0 \sin\theta) = 633.6$ nm estimated from Figure 11.3(a) at the minimum of $r_p(\theta)$. From Figure 11.4(a), the profile of $H_x(y)$ sampled at $\Delta z = 10$ nm below the metal surface has a period of 634 nm (peak of the function’s Fourier spectrum), in excellent agreement with the theory. The Poynting vector plots of Figure 11.5 show how a fraction of the evanescent field’s energy reaches the metal surface, of which fraction a certain portion immediately returns to the prism, while the remainder turns around and propagates along the metal surface in the y-direction.
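The resonance of Figure 11.3(a) can be reproduced approximately with the standard single-layer (Airy) reflection formula for a prism / air gap / thick metal stack. This is a sketch under stated assumptions, not the program used for the figure: the function names are hypothetical, the metal is treated as semi-infinite, and the same positive-imaginary-part root convention is used in every layer.

```python
import numpy as np

LAMBDA0_NM = 650.0
N_PRISM = 1.5
EPS_GAP = 1.0 + 0j
EPS_AG = -19.6224 + 0.443j
GAP_NM = 1078.0               # air gap quoted for the thick-slab case


def sigma_z(eps, sigma_y):
    """Normalized z-component k_z/k0, root with non-negative imaginary part."""
    root = np.sqrt(eps - sigma_y ** 2 + 0j)
    return np.where(root.imag < 0, -root, root)


def interface_rp(eps_i, eps_j, sz_i, sz_j):
    """p-polarized Fresnel coefficient for incidence from medium i onto j."""
    return (eps_j * sz_i - eps_i * sz_j) / (eps_j * sz_i + eps_i * sz_j)


def prism_gap_metal_rp(theta_deg):
    """Reflection coefficient at the prism base versus incidence angle."""
    theta = np.radians(np.asarray(theta_deg, dtype=float))
    sigma_y = N_PRISM * np.sin(theta)          # k_y / k0, conserved across layers
    eps_prism = N_PRISM ** 2
    s0 = sigma_z(eps_prism, sigma_y)
    s1 = sigma_z(EPS_GAP, sigma_y)
    s2 = sigma_z(EPS_AG, sigma_y)
    r01 = interface_rp(eps_prism, EPS_GAP, s0, s1)
    r12 = interface_rp(EPS_GAP, EPS_AG, s1, s2)
    k0 = 2 * np.pi / LAMBDA0_NM
    round_trip = np.exp(2j * k0 * s1 * GAP_NM)  # evanescent above the critical angle
    return (r01 + r12 * round_trip) / (1 + r01 * r12 * round_trip)


if __name__ == "__main__":
    thetas = np.arange(40.0, 50.0, 0.01)
    magnitude = np.abs(prism_gap_metal_rp(thetas))
    theta_min = thetas[np.argmin(magnitude)]
    print("resonance angle (deg):", theta_min)
    print("lambda0/(n0 sin theta) in nm:",
          LAMBDA0_NM / (N_PRISM * np.sin(np.radians(theta_min))))
```

Scanning the angle and locating the minimum of $|r_p|$ gives a resonance close to the 43.15° quoted in the text; the exact position depends on the angular step and on the rounded material constants.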

[Figure 11.4: four color-map panels (a–d) showing $H_x$ and $|H_x|$ (scale ×10⁻³), $|E_y|$, and $|E_z|$ versus y [μm] (−30 to 20) and z [μm] (−0.4 to 0.8).]

Figure 11.4 Electromagnetic fields in the gap region between the prism and the semi-infinite metal surface. (a) Instantaneous $H_x$. (b–d) Magnitudes of $H_x$, $E_y$, $E_z$. The evanescent field at the bottom of the prism is visible in the upper left-hand corner of each frame. The SPP is launched at the lower left-hand side. Due to back-coupling to the prism, the SPP’s decay rate along the y-axis is nearly twice the expected rate.

[Figure 11.5: three color-map panels (a–c) showing $S_y$, $S_z$, and $|S|$ (scale ×10⁻⁵) versus y [μm] and z [μm]; panel (c) is a close-up over y = −34 to −20 μm.]

Figure 11.5 (a, b) Components $S_y$ and $S_z$ of the Poynting vector $\mathbf{S}$ in the gap region between the prism and the semi-infinite metal. (c) Close-up of $|\mathbf{S}|$; superimposed arrows show the direction of $\mathbf{S}$.

The best fit to ${\rm Re}[H_x]$ of Figure 11.4(a) is $\exp(-0.01367\,y)\,\sin[9.9165(y + 0.245)]$, with $y$ in microns. While $k_0\,{\rm Re}[\sigma_{\rm spp}] = 9.9226$ is quite close to the observed value of 9.9165, the decay rate of 0.01367 is substantially greater than the SPP extinction rate of $k_0\,{\rm Im}[\sigma_{\rm spp}] = 0.006$; this is caused by the SPP’s back-coupling to the prism. We truncated the simulated prism by removing the glass that lies directly above the excited SPP (see the inset in Figure 11.2); the truncated prism thus occupied only the interval (−40 μm, 0) along the y-axis. The simulation results for the truncated prism shown in Figure 11.6 exhibit a period of 634 nm in the $y \ge 0$ region (obtained from the waveform’s Fourier spectrum). The best fit to ${\rm Re}[H_x]$, namely, the function $\exp(-0.006\,y)\,\sin[9.9165(y + 0.056)]$, now yields the expected decay rate as well.

[Figure 11.6: three color-map panels (a–c) showing $|H_x|$ (scale ×10⁻³), $S_y$, and $S_z$ (scale ×10⁻⁵) versus y [μm] (−30 to 20) and z [μm] (−0.4 to 0.8).]

Figure 11.6 (a–c) Plots of $H_x$, $S_y$, $S_z$ in the gap region between the prism and a semi-infinite metallic medium. To eliminate the back-coupling of the SPP to the prism, the part of the prism that lies above the launched SPP has been removed (see the inset in Figure 11.2); the prism thus extends from −40 μm to 0 along the y-axis. The SPP’s decay rate along y now agrees with the theoretical prediction.

Interference between odd and even modes

Figures 11.7 and 11.8 show the results of FDTD simulations pertaining to a Gaussian beam (FWHM at prism’s base = 8.0 μm, $\lambda_0 = 650$ nm, $\theta = 43.55°$), coupled through a truncated prism ($n_0 = 1.5$, $y = -80$ μm to 0, air gap = 950 nm) to a 65 nm-thick silver slab. The Fourier transform of ${\rm Re}[H_x(y)]$, sampled at $\Delta z = 10$ nm below the slab, yields $\lambda_{y1} = 636.9$ nm, $\lambda_{y2} = 629.4$ nm, in excellent agreement with $\lambda_0/(n_0 \sin\theta_{1,2}) = 637.1$ nm, 629.3 nm obtained from the minima of $r_p(\theta)$ of Figure 11.3(b). Computed values of $\sigma_y$ for a 65 nm-thick slab in Table 11.2 yield $\lambda_y^{(\pm)} = \lambda_0/{\rm Re}[\sigma_y^{(\pm)}] = 636.6$ nm, 629.2 nm, and $k_0\,{\rm Im}[\sigma_y^{(\pm)}] = 0.0034$, 0.0098, once again in agreement with the simulated profile of ${\rm Re}[H_x(y)]$ shown in Figure 11.7(a).

[Figure 11.7: four color-map panels (a–d) showing $H_x$ and $|H_x|$ (scale ×10⁻³), $|E_y|$, and $|E_z|$ versus y [μm] (−60 to 60) and z [μm] (−0.5 to 1.5).]

Figure 11.7 Electromagnetic field profiles on both sides of a 65 nm-thick silver slab illuminated through a truncated prism; the slab is centered at z = 0.6 μm, while the prism’s base at z = 1.55 μm extends from −80 μm to 0 along the y-axis. (a) Profile of instantaneous $H_x$. (b–d) Magnitudes of $H_x$, $E_y$, $E_z$. The evanescent field just below the prism appears in the upper left-hand corner of each frame. Both odd and even modes of the slab are excited, their interference causing the peaks and valleys of the field distributions.
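The numbers quoted for the 65 nm slab follow directly from the Table 11.2 entries. The sketch below is illustrative only, with hypothetical function names; it recomputes the modal wavelengths and amplitude decay rates, and also estimates the spatial beat length of the two modes, a quantity the text does not quote but which sets the spacing of the interference peaks and valleys.

```python
import math

LAMBDA0_UM = 0.650                     # vacuum wavelength in microns

# Fundamental solutions for w = 65 nm, copied from Table 11.2.
SIGMA_EVEN = 1.021 + 0.35e-3j
SIGMA_ODD = 1.033 + 1.01e-3j


def mode_summary(sigma_y, lambda0_um=LAMBDA0_UM):
    """Wavelength along y (nm) and amplitude decay rate (1/um) of a mode."""
    k0 = 2 * math.pi / lambda0_um      # free-space wavenumber in 1/um
    return {
        "wavelength_nm": 1e3 * lambda0_um / sigma_y.real,
        "decay_per_um": k0 * sigma_y.imag,
        "amplitude_1e_length_um": 1.0 / (k0 * sigma_y.imag),
    }


def beat_length_um(sigma_a, sigma_b, lambda0_um=LAMBDA0_UM):
    """Distance over which the two modes slip by one full cycle."""
    return lambda0_um / abs(sigma_a.real - sigma_b.real)


if __name__ == "__main__":
    print("even:", mode_summary(SIGMA_EVEN))
    print("odd: ", mode_summary(SIGMA_ODD))
    print("beat length (um):", beat_length_um(SIGMA_EVEN, SIGMA_ODD))
```

The odd mode's decay rate comes out nearly three times that of the even mode, so the interference pattern gradually fades as the odd mode dies away and the even mode is left to propagate alone.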

[Figure 11.8: four color-map panels (a–d) showing $S_y$, $S_z$, and two close-ups of $|S|$ (scale ×10⁻⁵) versus y [μm] and z [μm].]

Figure 11.8 (a, b) Poynting vector components $S_y$, $S_z$ around the 65 nm-thick silver slab illuminated through a truncated prism. (c, d) Close-ups of $|\mathbf{S}|$, showing the flow of energy in the early and late parts of the propagating SPP.

Profiles of the Poynting vector $\mathbf{S}$ in the gap region between the silver slab and the prism and also in the region immediately below the slab are shown in Figure 11.8. The $S_z$ plot shows a fraction of the evanescent field’s energy reaching the metal slab, of which a certain proportion immediately returns to the prism, while the remainder turns around and straddles the slab along the y-axis.

In general, the odd mode, being lossier than the even mode, has a shorter propagation distance along the y-axis. The physics behind the loss mechanism may be understood as follows. With the even mode, the field component $E_z$ has the same sign above and below the slab; therefore, at a given point along y, the electrical charges at the top and bottom surfaces have opposite signs. Inside the metallic slab, the field component $E_z$ (reduced by a factor of $\varepsilon_2/\varepsilon_1$ relative to the $E_z$ immediately outside) helps move the charges back and forth between the top and bottom surfaces. The slab being thin, the transport distance is short; hence the charge velocity and the corresponding electrical current are small. In contrast, the charges of the odd mode have the same sign on opposite sides of the slab. Consequently, positive and negative charges must move laterally (in the $\pm y$-directions) during each period of oscillation. The travel distance is now on the order of the SPP wavelength, which is typically greater than the slab thickness. Therefore, the current densities of the odd mode are relatively large, leading to correspondingly large losses.

Polarization dependence of SPP

A mathematical analysis similar to the one that led to Eq. (11.4) reveals that TE-polarized electromagnetic waves cannot support SPPs at metal–dielectric interfaces. The following argument proves the same point by appealing to the underlying physics of surface plasmons. For the sake of simplicity, consider a thick metal plate in vacuum, as shown in Figure 11.9. An SPP consists of two inhomogeneous plane-waves (one in the free space, the other in the metal), both having the same $\sigma_y$ in the propagation direction (phase velocity $V_p = c/\sigma_y$). The diagram in Figure 11.9(a) represents a true SPP, with the E-fields originating on positive (surface) charges and terminating on negative ones. If the continuity of $H_\parallel$ at the surface is assumed, then a negative $\varepsilon_{\rm metal}$ ensures the continuity of $D_\perp$, and $E_\parallel$ can be made continuous by the proper choice of $\sigma_y$, namely, $\sigma_y = \sigma_{\rm spp}$.

In contrast, the diagram in Figure 11.9(b) represents a physical impossibility; absence of magnetic charges in nature means that the H-field must be divergence-free everywhere and, in particular, at the metal–vacuum interface; however, since $H_\parallel$ will now have opposite directions above and below the surface, it cannot satisfy the requisite boundary condition. This is the reason why SPPs must, of necessity, be TM-polarized.

[Figure 11.9: two panels, (a) and (b), each showing E- and H-field lines above a metal surface, with the period $\lambda_0/\sigma_y$ and the phase velocity $V_p = c/\sigma_y$ marked.]

Figure 11.9 (a) The SPP’s E-fields originate on positive charges and terminate on negative ones. Continuity of $H_\parallel$ and a negative $\varepsilon_{\rm metal}$ ensure the continuity of $D_\perp$, while $E_\parallel$ becomes continuous when $\sigma_y = \sigma_{\rm spp}$. (b) A physical impossibility, since the divergence-free nature of the H-field requires $H_\parallel$ to have opposite directions above and below the surface, thus prohibiting the continuity of $H_\parallel$ at the boundary.

Concluding remarks

In this chapter we analyzed the surface modes of thin and thick metallic slabs. Maxwell’s equations admit many solutions for electromagnetic fields that can be considered localized at and around metallic surfaces (or, in general, confined to the vicinity of metallo-dielectric interfaces). However, only a handful of such solutions extend far enough beyond their point of origination to be considered useful for practical applications. The odd and even waves that propagate along the surfaces of metallic slabs are examples of such long-range surface plasmon polaritons. The remaining solutions, properly classified as short-range or lossy modes, should not be ignored, however, as they participate in the matching of the boundary conditions wherever a long-range SPP is launched, or whenever an existing SPP crosses the boundary from one environment into another.

Our FDTD simulations have verified the validity of the simple theoretical analysis, but they also have provided a physical picture of field distributions and energy flow patterns in realistic systems that are generally inaccessible to exact mathematical analysis. We have seen, for example, that an SPP excited through a glass prism possesses the expected spatial frequency, but that its decay rate is substantially greater than the theoretical value (due to back-coupling and subsequent leakage through the prism). We also argued that the even mode of a thin metallic slab is less lossy (hence longer range) than the odd mode, primarily because the electrical currents that sustain the even mode flow in the thickness direction, whereas those of the odd mode flow laterally, in the plane of the slab. Much insight can be gained from a detailed analysis of the electromagnetic field profiles in their intimate and intricate relationship with the behavior of the conduction electrons of the metallic medium.

References for Chapter 11
1 T. W. Ebbesen, H. J. Lezec, H. F. Ghaemi, T. Thio, and P. A. Wolff, Extraordinary optical transmission through subwavelength hole arrays, Nature 391, 667–669 (1998).
2 R. D. Averitt, S. L. Westcott, and N. J. Halas, Ultrafast electron dynamics in gold nanoshells, Phys. Rev. B 58, R10203–R10206 (1998).
3 J. J. Mock, S. J. Oldenburg, D. R. Smith, D. A. Schultz, and S. Schultz, Composite plasmon resonant nanowires, Nano Letters 2, 465–469 (2002).
4 H. F. Ghaemi, T. Thio, and D. E. Grupp, Surface plasmons enhance optical transmission through subwavelength holes, Phys. Rev. B 58, 6779–6782 (1998).
5 G. Gay, O. Alloschery, B. V. de Lesegno, C. O’Dwyer, J. Weiner, and H. J. Lezec, The optical response of nano-structured surfaces and the composite diffracted evanescent wave model, Nature Phys. 2, 262–267 (2006).
6 H. Raether, Surface Plasmons on Smooth and Rough Surfaces and on Gratings, Springer-Verlag, Berlin, 1986.
7 J. J. Burke, G. I. Stegeman, and T. Tamir, Surface-polariton-like waves guided by thin, lossy metal films, Phys. Rev. B 33, 5186–5201 (1986).
8 A. Taflove and S. C. Hagness, Computational Electrodynamics: The Finite-Difference Time-Domain Method, second edition, Artech House, 2000.

12 The Faraday effect

Michael Faraday (1791–1867) (Photo: National Portrait Gallery, London, courtesy of AIP Emilio Segrè Visual Archives.)

Michael Faraday (1791–1867) was born in a village near London into the family of a blacksmith. His family was too poor to keep him at school and, at the age of 13, he took a job as an errand boy in a bookshop. A year later he was apprenticed as a bookbinder for a term of seven years. Faraday was not only binding the books but was also reading many of them, which excited in him a burning interest in science. When his term of apprenticeship in the bookshop was coming to an end, he applied for the job of assistant to Sir Humphry Davy, the celebrated chemist, whose lectures Faraday was attending during his apprenticeship. When Davy asked the advice of one of the governors of the Royal Institution of Great Britain about the employment of a young bookbinder, the man said: “Let him wash bottles! If he is any good he will accept the work; if he refuses, he is not good for anything.” Faraday accepted, and remained with the Royal Institution for the next fifty years, first as Davy’s assistant, then as his collaborator, and finally, after Davy’s death, as his successor. It has been said that Faraday was Davy’s greatest discovery.

In 1823 Faraday liquefied chlorine and in 1825 he discovered the substance known as benzene. He also did significant work in electrochemistry, discovering the laws of electrolysis. However, his greatest work was with electricity. In 1821 Faraday built two devices to produce what he called electromagnetic rotation, that is, a continuous circular motion from the circular magnetic force around a wire. Ten years later, in 1831, he began his great series of experiments in which he discovered electromagnetic induction. These experiments form the basis of modern electromagnetic technology. Apart from numerous publications in scientific magazines, the most remarkable document pertaining to his studies is his Diary, which he kept continuously from the year 1820 to the year 1862. (This was published in 1932 by the Royal Institution in seven volumes containing a total of 3236 pages, with a few thousand marginal drawings.) Queen Victoria rewarded Faraday’s lifetime of achievement by granting him the use of a house at Hampton Court and a knighthood. Faraday accepted the cottage but gracefully rejected the knighthood.1

On 13 September 1845, Faraday discovered the magneto-optical effect that bears his name. This day’s entry in his Diary reads: “Today worked with lines of magnetic force, passing them across different bodies (transparent in different directions) and at the same time passing a polarized ray of light through them and afterwards examining the ray by a Nichol’s Eyepiece or other means.” After describing several negative results in which the ray of light was passed through air and several other substances, Faraday wrote in the same day’s entry: “A piece of heavy glass which was 2 inches by 1.8 inches, and 0.5 of an inch thick, being silico borate of lead, and polished on the two shortest edges, was experimented with. It gave no effects when the same magnetic poles or the contrary poles were on opposite sides (as respects the course of the polarized ray) – nor when the same poles were on the same side, either with a constant or intermitting current – BUT, when contrary magnetic poles were on the same side, there was an effect produced on the polarized ray, and thus magnetic force and light were proved to have relation to each other. This fact will most likely prove exceedingly fertile and of great value in the investigation of both conditions of natural force.”

Electromagnetic basis of the Faraday effect

Magneto-optical (MO) effects are best described in terms of the dielectric tensor $\varepsilon$ of the medium in which the interaction between the light and the applied magnetic field (or the internal magnetization of the medium) takes place:2

$$\varepsilon = \begin{pmatrix} \varepsilon_{xx} & \varepsilon_{xy} & \varepsilon_{xz} \\ \varepsilon_{yx} & \varepsilon_{yy} & \varepsilon_{yz} \\ \varepsilon_{zx} & \varepsilon_{zy} & \varepsilon_{zz} \end{pmatrix}.$$

154

Classical Optics and its Applications

In an isotropic material (such as ordinary glass) the three diagonal elements are identical and, in the presence of a magnetic field along the Z-axis, there is a nonzero off-diagonal element e0 , which couples the x- and y- components of the optical E-field, that is, 0 1 e e0 0 e ¼ @ e0 e 0 A: 0 0 e In general, e and e0 are wavelength dependent, but over a narrow range of wavelengths they might be treatable as constants. In a transparent material, where there is no optical absorption, e is real and e0 is imaginary. However, in the most general case of an absorbing MO material both e and e0 may be complex numbers. For diamagnetic and paramagnetic media e0 is proportional to the applied magnetic field H, while for ferromagnetic and ferrimagnetic materials spin–orbit coupling is the dominant source of the MO interaction, making e0 proportional to the magnetization M of the medium.2 Since B ¼ H þ 4pM (in CGS units), we consider the B-field inside the medium as the source of the MO effects. Now we discuss the basis of the MO effect. When a polarized beam of light propagates in a medium along the direction of the magnetic field B, the right and left circularly polarized (RCP and LCP) components of the beam experience different refractive indices, n ¼ (e ie0 )1/2. For fused silica glass at a wavelength k ¼ 550 nm, for example, e 2.25, and e0 107i per kOe of applied magnetic field. (Note that both nþ and n in this case are real-valued and, therefore, there is no absorption.) For linearly polarized light passing through a length L of the material under the influence of a B-field, the two circular-polarization components suffer a relative phase shift D ¼ 2pL(nþ– n)/k.3,4 As shown in Figure 12.1, a change in the relative phase of the RCP and LCP components is equivalent to a rotation of the plane of polarization by the Faraday angle hF ¼ 12D. In the above example, hF 0.22 at k ¼ 550 nm for a slab 1 cm thick immersed in a 1 kOe magnetic field. 
The figure of 0.22 /cm kOe is known as the Verdet constant of fused silica at the specified wavelength.3 Certain magnetic materials (e.g., magnetic garnets) are transparent enough to transmit a good fraction of the light while producing a fairly large Faraday rotation. These materials can be magnetized in a given direction and sustain their magnetization when the external field is removed. Therefore, the Faraday effect in these media may be observed in the absence of an external magnetic field. At k ¼ 550 nm, for instance, a typical crystal of bismuth-substituted rare-earth iron garnet may have e 5.5 þ 0.025i and e0 0.002 – 0.01i. The complex refractive indices for RCP and LCP light are thus (n þ ik)þ 2.347 þ 0.006i and

155

12 The Faraday effect

a–E

E

E RCP

LCP

a+E

RCP

LCP

Figure 12.1 A linearly polarized beam of light may be considered as the superposition of equal amounts of right and left circularly polarized beams. In going through a perpendicularly magnetized slab of material at normal incidence, the two components of circular polarization experience different (complex) refractive indices and, therefore, each emerges from the medium with a different phase and amplitude. The amplitudes of the emergent beams may be denoted by aþ and a, and their phase difference by D. The superposition of the emergent circular polarization states yields elliptical polarization. The angle of rotation of the major axis of the ellipse from the horizontal direction (which is the direction of the incident linear polarization) is given by h ¼ 12D, and the

ellipticity g is given by tan g ¼ (aþ– a)/(aþþ a).

(n þ ik) 2.343 þ 0.005i, yielding a Faraday rotation angle hF 1.3 for a micron-thick slab of this crystal. The absorption coefficient of the material is a ¼ 4pkL/k, where k is the imaginary part of the complex refractive index. For the above garnet, therefore, a 0.12 per micron, which is equivalent to 1 dB loss of light for every 2 lm of crystal thickness. In other words, this garnet delivers 2.6 of polarization rotation per dB of loss. These crystals can be grown in a range of thicknesses from a fraction of a micron to about 100 microns. Thicker crystals are useful at longer wavelengths, where the losses are small, but the Faraday rotation generally decreases with increasing wavelength as well. Faraday rotation in a transparent slab For the sake of simplicity we will ignore the effects of absorption in the Faraday medium and consider a transparent slab of magnetic material having a real e and a purely imaginary e0 . Thus we consider a slab 20 lm thick having e ¼ 5.5, e0 ¼ 0.01i. The material is magnetized perpendicularly to the plane of its surface, and a linearly polarized beam of light (with its E-field along the X-axis) is sent at normal incidence through the slab, as in Figure 12.2(a).5,6 Real sources of light, of course, are never perfectly monochromatic and, therefore, we assume a finite spectral bandwidth for the light source, covering the range k ¼ 545 – 555 nm. Figure 12.3 shows computed plots of the transmitted amplitudes, jtxj and jtyj, as well as the polarization rotation and ellipticity angles, hF and gF, versus k. Because


Figure 12.2 Faraday effect in the polar geometry. (a) In going normally through a slab of magnetic material, a linearly polarized beam of light with its E-field along the X-axis acquires a component of polarization along Y. The lines of B-field shown within the medium represent either an externally applied magnetic field or the intrinsic magnetization of the medium. (b) The effect is also observed at oblique incidence. Shown here is a p-polarized incident beam, which acquires an s-component upon transmission through the magnetic medium. (If the incident beam is s-polarized, the magneto-optically induced polarization is then in the p-direction.) In general, upon reversing the B-field from the +Z to the −Z direction, the magneto-optically induced component of polarization changes sign.

of multiple reflections at the front and rear facets of the slab, these functions vary periodically with λ. (The same interference phenomena are responsible for the nonzero values of η_F, which would otherwise be absent in a transparent medium.) The net Faraday rotation angle is the average value of θ_F over the relevant range of wavelengths, but one should also recognize that the wavelength dependence of the direction of emergent polarization produces a certain amount of depolarization in the emergent beam. The Faraday rotation combined with the spectral bandwidth of the light source thus causes partial depolarization as a direct consequence of interference among the multiple reflections.
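The wavelength dependence described above can be reproduced, at least qualitatively, with a few lines of code. The following is a minimal sketch, not the method used to generate Figure 12.3: it assumes that at normal incidence the two circular polarizations see refractive indices n± = (ε ± iε′)^(1/2), that each propagates through the free-standing slab (air on both sides) according to the standard Airy formula for multiple reflections, and that t_x and t_y follow from recombining the two circular components. The function names are illustrative, and the assignment of the ± sign to a given handedness is a convention that affects only the sign of the computed angles.

```python
import numpy as np

EPS = 5.5              # diagonal tensor element (value from the text)
EPS_MO = 0.01j         # off-diagonal element eps' (value from the text)
THICKNESS_NM = 20_000  # slab thickness, 20 um


def slab_transmission(n, wavelength_nm, d_nm=THICKNESS_NM):
    """Airy amplitude transmission of a slab in air at normal incidence."""
    phase = np.exp(2j * np.pi * n * d_nm / wavelength_nm)  # one-way propagation
    r = (n - 1) / (n + 1)                                  # internal facet reflection
    return (4 * n / (n + 1) ** 2) * phase / (1 - r ** 2 * phase ** 2)


def faraday_slab(wavelength_nm):
    """Return t_x, t_y, rotation (deg) and ellipticity (deg) for X-polarized input."""
    n_plus = np.sqrt(EPS + 1j * EPS_MO)
    n_minus = np.sqrt(EPS - 1j * EPS_MO)
    t_plus = slab_transmission(n_plus, wavelength_nm)
    t_minus = slab_transmission(n_minus, wavelength_nm)
    # Recombine the circular components into linear X and Y components.
    t_x = 0.5 * (t_plus + t_minus)
    t_y = 0.5j * (t_plus - t_minus)
    # Orientation and ellipticity of the polarization ellipse via Stokes parameters.
    s0 = np.abs(t_x) ** 2 + np.abs(t_y) ** 2
    s1 = np.abs(t_x) ** 2 - np.abs(t_y) ** 2
    s2 = 2 * np.real(t_x * np.conj(t_y))
    s3 = 2 * np.imag(np.conj(t_x) * t_y)
    rotation = 0.5 * np.degrees(np.arctan2(s2, s1))
    ellipticity = 0.5 * np.degrees(np.arcsin(np.clip(s3 / s0, -1.0, 1.0)))
    return t_x, t_y, rotation, ellipticity


def single_pass_rotation_deg(wavelength_nm, d_nm=THICKNESS_NM):
    """Rotation theta = Delta/2 for one pass, ignoring facet reflections."""
    dn = (np.sqrt(EPS + 1j * EPS_MO) - np.sqrt(EPS - 1j * EPS_MO)).real
    return float(np.degrees(np.pi * d_nm * dn / wavelength_nm))


if __name__ == "__main__":
    wavelengths = np.linspace(545.0, 555.0, 2001)
    _, _, rot, ell = faraday_slab(wavelengths)
    print("single-pass rotation (deg):", single_pass_rotation_deg(550.0))
    print("wavelength-averaged rotation (deg):", rot.mean())
    print("rotation range (deg):", rot.min(), rot.max())
```

Comparing the single-pass value with the wavelength-averaged value gives a feel for how much of the variation is due to interference alone; the spread of the rotation angle across the band is what produces the partial depolarization mentioned above.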

Oblique incidence

Figure 12.4 shows the transmitted amplitudes and polarization angles versus the angle of incidence θ in the case of the slab 20 μm thick magnetized along the Z-axis (ε = 5.5, ε′ = 0.01i) when, as shown in Figure 12.2(b), a p-polarized plane wave at the single wavelength of λ = 550 nm is incident on the slab. The


Figure 12.3 A plane wave, linearly polarized along the X-axis, is normally incident on a slab 20 μm thick, as shown in Figure 12.2(a). The slab (ε = 5.5, ε′ = 0.01i) is magnetized along the Z-axis. (a) Plots of |t_x| and |t_y|, the transmitted polarization components along the X- and Y-axes, as functions of λ. (b) Plots of polarization rotation angle θ_F and ellipticity η_F versus λ.

oscillations in the transmitted amplitudes and polarization angles are caused by interference among the beams multiply reflected from the facets of the slab. Aside from these interference oscillations, however, note that the Faraday effect does not show any signs of abatement with increasing angle of incidence. The reason is that even though the direction of propagation of the beam increasingly deviates from the direction of the B-field, the propagation distance simultaneously increases, keeping the net interaction between the magnetic material and the beam of light at a constant level. Figure 12.5 shows results for the case of oblique incidence, at θ = 85°, on the same slab as above in the range λ = 545–555 nm. As in the case of normal incidence depicted in Figure 12.3, we note a significant variation of the Faraday angles and the amplitudes within this narrow range of wavelengths. Although the


Figure 12.4 A p-polarized plane wave (λ = 550 nm) is incident at oblique angle θ on a slab 20 μm thick, as shown in Figure 12.2(b). The slab (ε = 5.5, ε′ = 0.01i) is magnetized along the Z-axis. (a) Plots of |t_pp| and |t_sp|, the transmitted polarization components along the p- and s-directions, as functions of θ. (b) Plots of θ_F and η_F versus θ.

beam inside the slab travels at 25° relative to the direction of magnetization of the material, the maximum Faraday effect as exemplified by |t_sp| is the same as at normal incidence, because the propagation distance is correspondingly adjusted. The wavelength-averaged Faraday rotation may be lower at larger angles of incidence, but this is just a consequence of interference; it is not caused by any reduction in the intrinsic optical activity of the slab. If, for instance, the facets were antireflection coated, or if the beam entered and exited through index-matched spherical surfaces, then multiple reflections would be eliminated and the Faraday rotation would become independent of the incidence angle. The above discussions were confined to the case of a p-polarized incident beam, but the conclusions remain valid for s-polarized light as well. For example, Figure 12.6 is the counterpart of Figure 12.4, showing the transmitted


Figure 12.5 A p-polarized plane wave is incident at θ = 85° on the slab described in Figures 12.2–12.4. (a) Plots of |t_pp| and |t_sp|, the transmitted p- and s-components of polarization, as functions of λ. (b) Plots of θ_F and η_F versus λ.

amplitudes and polarization angles versus the angle of incidence for an s-polarized incident beam. Note that the magneto-optically generated component of polarization t_ps in Figure 12.6 is identical to t_sp in Figure 12.4. This is an important and completely general result, indicating that the amount of light converted from one polarization state to another is independent of the incident polarization state.

Faraday medium in a Fabry–Pérot resonator

Because the Faraday effect is amplified when the beam propagates back and forth within a magnetized medium, it is interesting to observe the enhancement of the Faraday effect in a Fabry–Pérot resonator. Figure 12.7 shows a system that may be used to monitor such enhancement over a range of angles of incidence.


Figure 12.6 Same as Figure 12.4, except that here the incident beam is s-polarized.

Figure 12.7 A Faraday medium in a Fabry–Pérot resonator is placed in a convergent cone of light. The incident plane wave is linearly polarized along the X-axis, and the 0.8 NA focusing lens is free from aberrations. The Faraday medium, 20 μm thick and with ε = 5.5, ε′ = 0.01i, is uniformly magnetized along the Z-axis. The mirrors coated on the front and back facets of the Faraday slab each consist of 10 alternating layers of high-index (n = 2.0) and low-index (n = 1.5) quarter-wave-thick dielectrics. The collimating lens is identical to the focusing objective, and the emergent beam is observed at the exit pupil of the collimator.


Figure 12.8 Intensity and polarization patterns in the exit pupil of the collimating lens of Figure 12.7. (a) The intensity distribution of the emergent X-polarized component. The bright rings indicate the regions where the conditions of resonance are met and the light passes through the resonator. (b) The intensity distribution of the emergent Y-polarized component. The bright rings coincide with those in (a), indicating that the conditions of resonance for the incident polarization are the same as those for the magneto-optically induced polarization. (c) Polarization rotation angle θ_F of the emergent beam encoded in gray-scale. The range of values of θ_F is −23° (black) to +63° (white). (d) The polarization ellipticity η_F of the emergent beam encoded in gray-scale. The range of values of η_F is −32° (black) to +42° (white).

The first objective lens (NA = 0.8) focuses a linearly polarized beam of light onto the Fabry–Pérot resonator, and the second, identical, lens collimates the transmitted beam, thus allowing observation at the exit pupil. For a slab of transparent magnetic material 20 μm thick sandwiched between a pair of dielectric mirrors, Figure 12.8 shows the computed patterns of intensity and polarization angle at the exit pupil of the collimator. This figure indicates that the rings of maximum transmission also correspond to locations of maximum polarization rotation. The maximum and minimum rotation angles in Figure 12.8(c) are +63° and −23°, respectively, well in excess of the rotations obtained from the bare slab. Also note in Figures 12.8(c), (d) the asymmetrical nature of the polarization angles in the first and third quadrants, on the one hand, and in the second and fourth quadrants on the other hand.
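To get a feel for how reflective the dielectric mirrors of Figure 12.7 might be, one can evaluate a quarter-wave stack with the standard characteristic-matrix method. The sketch below is illustrative only and rests on assumptions the text does not state: normal incidence from air, a design wavelength of 550 nm, dispersionless layers, the high-index layer outermost, and the Faraday medium (n = √5.5) acting as the substrate. The function name `stack_reflectance` is hypothetical.

```python
import numpy as np


def stack_reflectance(layer_indices, n_incident, n_substrate,
                      wavelength_nm, design_nm):
    """Normal-incidence reflectance of a stack of quarter-wave layers.

    layer_indices lists the refractive indices starting from the incidence
    side; every layer is a quarter wave thick at design_nm.
    """
    m = np.eye(2, dtype=complex)
    # Phase thickness of each layer: pi/2 at the design wavelength.
    delta = 0.5 * np.pi * design_nm / wavelength_nm
    for n in layer_indices:
        layer = np.array([[np.cos(delta), -1j * np.sin(delta) / n],
                          [-1j * n * np.sin(delta), np.cos(delta)]])
        m = m @ layer
    # Fields at the front surface for a unit field in the substrate.
    b, c = m @ np.array([1.0, n_substrate], dtype=complex)
    r = (n_incident * b - c) / (n_incident * b + c)
    return float(abs(r) ** 2)


if __name__ == "__main__":
    layers = [2.0, 1.5] * 5          # 10 alternating layers, high index first (assumed)
    n_slab = np.sqrt(5.5)            # index of the Faraday medium
    for lam in (500.0, 550.0, 600.0):
        print(lam, stack_reflectance(layers, 1.0, n_slab, lam, 550.0))
```

Reversing the layer order (low index outermost) changes the result substantially, so the mirror reflectance in the actual simulation depends on details that are not given here.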


Figure 12.9 (a) The longitudinal Faraday effect is observed when the direction of the B-field within the slab of material is parallel both to the surface of the slab and to the plane of incidence. The rotation of polarization in this case occurs only at oblique incidence, where, upon transmission, a p-polarized beam acquires an s-component and vice versa. If the direction of B is reversed, the magneto-optically induced component of polarization will change sign. (b) The transverse effect occurs when the B-field lies in the plane of the sample perpendicular to the plane of incidence. The MO interaction in this case occurs only when the incident beam is p-polarized. Even then there is no polarization rotation; the only effect is that a change in the magnitude of the B-field causes a slight change in the magnitude of the transmitted p-light. The transverse effect is small and is not bipolar, meaning that reversing the direction of B does not affect the emergent beam.

Longitudinal and transverse geometries

When the direction of the B-field is in the plane of the slab as well as in the plane of incidence, as in Figure 12.9(a), one observes the longitudinal Faraday effect. In this case ε′ occupies the position of ε_yz in the dielectric tensor. The transverse effect occurs when the B-field, while in the plane of the sample, is perpendicular to the incidence plane, as in Figure 12.9(b). In this case ε′ occupies the position of ε_xz. In the longitudinal case at normal incidence no polarization rotation occurs, but the effect begins to show with increasing angle of incidence. For a


Figure 12.10 The longitudinal Faraday effect arising when a p-polarized plane wave (λ = 550 nm) is incident at oblique angle θ on a slab 20 μm thick. The slab (ε = 5.5, ε′ = 0.01i) is magnetized along the X-axis, as depicted in Figure 12.9(a). (a) The transmitted amplitudes |t_pp| and |t_sp| versus θ. (b) The polarization rotation angle θ_F and the ellipticity η_F versus θ.

p-polarized plane wave (λ = 550 nm) obliquely incident on a slab of magnetic material 20 μm thick (ε = 5.5, ε′ = 0.01i), Figure 12.10 shows the computed amplitudes of the transmitted p- and s-polarized light as well as the angles of rotation and ellipticity versus the incidence angle θ. One could readily compute similar results for an s-polarized incident beam as well. In both cases the MO effect is bipolar, meaning that a reversal of the direction of the B-field reverses the signs of θ_F and η_F. Moreover, as in the polar case discussed earlier, the magneto-optically generated component of polarization


Figure 12.11 The transverse Faraday effect arising when a p-polarized plane wave (λ = 550 nm) is incident at oblique angle θ on a slab 20 μm thick. The slab (ε = 5.5) is magnetized along the Y-axis, as shown in Figure 12.9(b). In the absence of the B-field, ε′ = 0, and the transmission of the slab for a p-polarized incident beam is denoted by T_p(0). When a strong B-field is introduced (corresponding to ε′ = 0.1i in this case), the transmission changes to T_p. Shown here is the transmission differential ΔT_p = T_p − T_p(0) as a function of θ.

turns out to be the same for both directions of incident polarization; that is, t_sp = t_ps. The transverse effect is very different from both the polar and the longitudinal effects. With s-polarized incident light, where the optical E-field is parallel to the direction of the B-field in the slab, there is no MO effect whatsoever, but for the p-polarized light the medium exhibits an effective refractive index n = [ε + (ε′²/ε)]^(1/2). Thus in the transverse case neither s- nor p-polarized beams undergo polarization rotation, but the magnitude of the transmitted p-light shows a weak dependence on magnetization, that is, T_p = |t_p|² becomes a function of the strength of the B-field. The transverse effect is not bipolar, so that changing the direction of the B-field from +Y to −Y does not alter the magnitude of T_p. For a slab of transparent material 20 μm thick and with a fairly large MO coefficient (ε = 5.5, ε′ = 0.1i), Figure 12.11 shows computed plots of T_p(0) (i.e., transmission in the absence of a B-field, when ε′ = 0) and ΔT_p = T_p − T_p(0) versus the angle of incidence θ. Note, in particular, that ΔT_p ≈ 0 around the Brewster angle θ_B = 66.9°, where a vanishing surface reflectivity results in minimal interference effects.
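The two numbers quoted in this paragraph are easy to check. The short sketch below evaluates the effective index n = [ε + (ε′²/ε)]^(1/2) for the transverse geometry and the Brewster angle tan θ_B = √ε of the unmagnetized slab (the incidence medium is assumed to be air). The function names are illustrative. Because ε′ enters only through its square, reversing the sign of ε′ leaves the effective index unchanged, which is one way to see why the transverse Faraday effect in a transparent slab is not bipolar.

```python
import cmath
import math


def transverse_effective_index(eps, eps_mo):
    """Effective index seen by p-polarized light in the transverse geometry."""
    return cmath.sqrt(eps + eps_mo ** 2 / eps)


def brewster_angle_deg(eps):
    """Brewster angle for incidence from air onto a medium of permittivity eps."""
    return math.degrees(math.atan(math.sqrt(eps)))


if __name__ == "__main__":
    eps = 5.5
    n_off = transverse_effective_index(eps, 0.0)      # B-field off
    n_on = transverse_effective_index(eps, 0.1j)      # strong B-field along +Y
    n_rev = transverse_effective_index(eps, -0.1j)    # B-field reversed to -Y
    print("index change due to B:", (n_on - n_off).real)
    print("change on reversal:", abs(n_on - n_rev))
    print("Brewster angle (deg):", brewster_angle_deg(eps))
```

The index change is of order 10⁻⁴ even for the fairly large ε′ = 0.1i, consistent with the weak dependence of T_p on magnetization described above.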


References for Chapter 12

1 Adapted from George Gamow, The Great Physicists from Galileo to Einstein, Dover Publications, New York, 1961. Some of the historical anecdotes have been compiled from information available on the worldwide web; see, for example, www.phy.uct.ac.za, www.iee.org.uk, www.woodrow.org.
2 P. S. Pershan, Magneto-optical effects, J. Appl. Phys. 38, 1482–1490 (1967).
3 F. A. Jenkins and H. E. White, Fundamentals of Optics, fourth edition, McGraw-Hill, New York, 1976.
4 R. W. Wood, Physical Optics, third edition, Optical Society of America, Washington DC, 1988.
5 D. O. Smith, Magneto-optical scattering from multilayer magnetic and dielectric films, Opt. Acta 12, 13 (1965).
6 M. Mansuripur, The Physical Principles of Magneto-optical Recording, Cambridge University Press, UK, 1995.

13 The magneto-optical Kerr effect

The Scottish physicist John Kerr (1824–1907) discovered the magneto-optical effect named after him in 1888. When linearly polarized light is reflected from the polished surface of a magnetized medium its polarization vector rotates and becomes somewhat elliptical. The direction of rotation and the sense of ellipticity are reversed when the direction of magnetization M of the sample is reversed, thus providing a powerful tool for optically monitoring the state of magnetization of the sample under investigation.¹,²,³ The physical mechanism of the Kerr effect is identical to that of the Faraday effect and, in fact, the same theoretical model can be used to describe both phenomena, one in reflection, the other in transmission (see Chapter 12, “The Faraday effect”).

The Kerr effect can be analyzed under quite general conditions, with the direction of magnetization of the sample oriented arbitrarily relative to the plane of incidence of the light beam. However, the three geometries shown in Figure 13.1 are of particular importance and will be analyzed separately in the present chapter. When the magnetization M is perpendicular to the sample’s surface, the observed phenomenon is referred to as the polar Kerr effect. When M is parallel to the surface and in the plane of incidence, the Kerr effect is longitudinal. Finally, when M is parallel to the surface but perpendicular to the plane of incidence, the observed phenomenon is known as the transverse Kerr effect.⁴,⁵

Electromagnetic basis of the Kerr effect

For convenience, we repeat in this short section the relevant text from Chapter 12. Magneto-optical (MO) effects are best described in terms of the dielectric tensor ε of the medium in which the interaction between the light


Figure 13.1 The MO Kerr effect is polar, longitudinal, or transverse, depending on the orientation of the magnetic moment M relative to the sample’s surface and to the plane of incidence. The incident beam is p- or s-polarized according to whether its E-field is in the plane of incidence (Ep) or perpendicular to it (Es).

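and the applied magnetic field (or the internal magnetization of the medium) takes place:¹

\[
\varepsilon = \begin{pmatrix} \varepsilon_{xx} & \varepsilon_{xy} & \varepsilon_{xz} \\ \varepsilon_{yx} & \varepsilon_{yy} & \varepsilon_{yz} \\ \varepsilon_{zx} & \varepsilon_{zy} & \varepsilon_{zz} \end{pmatrix}.
\]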

In an isotropic material the three diagonal elements are identical and, in the presence of a magnetic field along the Z-axis, there is a non-zero off-diagonal element ε′, which couples the x- and y-components of the optical E-field:

\[
\varepsilon = \begin{pmatrix} \varepsilon & \varepsilon' & 0 \\ -\varepsilon' & \varepsilon & 0 \\ 0 & 0 & \varepsilon \end{pmatrix}.
\]

In general, ε and ε′ are wavelength-dependent, but over a narrow range of wavelengths they might be treatable as constants. In a transparent material, where there is no optical absorption, ε is real and ε′ is imaginary. However, in the most general case of an absorbing MO material both ε and ε′ would be complex numbers. For diamagnetic and paramagnetic media ε′ is proportional to the applied magnetic field H, while for ferromagnetic and ferrimagnetic materials spin–orbit coupling is the dominant source of the MO interaction, making ε′ proportional to the magnetization M of the medium.¹ Since B = H + 4πM (in CGS units), in general the B-field inside the medium may be considered the source of the MO effects. When a polarized beam of light propagates in a medium along the direction of the magnetic field B, the right and left circularly polarized (RCP and LCP) components of the beam experience different refractive indices n± = (ε ± iε′)^(1/2). Since the Fresnel reflection coefficients depend on the refractive index, the two


circular polarizations are reflected with different reflectivities, r₊ and r₋, say. When r₊ and r₋ happen to have a phase difference, the reflected beam exhibits a polarization rotation, and if the magnitudes |r₊| and |r₋| differ from each other, then there will be some degree of ellipticity. When the medium is transparent, n± are real and, therefore, there is no phase difference between r₊ and r₋, although their magnitudes will be different. In this case the reflected light exhibits polarization ellipticity only. However, in the general case of reflection from the surface of an absorbing medium (both ε and ε′ complex), the reflected light exhibits elliptical polarization, with the major axis of the ellipse rotated relative to the direction of incident polarization.

For concreteness, we will confine our attention throughout this chapter to a metallic magnetic material having ε = −8 + 27i and ε′ = 0.6 + 0.2i at the red HeNe wavelength, λ_0 = 633 nm. This is typical of the TbFeCo amorphous alloys used in magneto-optical disks for data storage. The discussion, however, will be kept quite general in nature, and the conclusions drawn from specific examples should be applicable to a wide variety of magnetic materials.

The polar effect

Figure 13.2(a) shows computed plots of the various reflection coefficients versus the angle of incidence θ for the case of a perpendicularly magnetized sample. The conventional reflection coefficients for p- and s-light, r_pp and r_ss, show the behavior expected for a metallic surface. We denote by r_ps the cross-polarization factor from incident p to reflected s, and by r_sp that from incident s to reflected p. These coefficients represent the ability of the magnetic medium to convert, upon reflection, p-polarized light into s, and vice versa. It can be shown quite generally that r_ps = r_sp at all angles of incidence. Thus the power of the magnetic medium to “rotate” the polarization is independent of whether the incident beam is p- or s-polarized. However, the polarization rotation and ellipticity angles, ρ and η, which depend on r_pp and r_ss as well as r_ps, exhibit differing behaviors for p- and s-light (see Figures 13.2(b), (c)). Note also in Figure 13.2(a) that r_ps remains more or less constant up to fairly large angles of incidence.

The longitudinal effect

Plots of the various reflection coefficients versus the angle of incidence θ for the longitudinal geometry appear in Figure 13.3(a). As in the polar case, it turns out that r_sp = r_ps for all values of θ. At normal incidence the interaction between the incident E-field and the magnetization of the medium cannot produce any polarization rotation; therefore, r_sp = 0 at θ = 0. As θ increases, however, the MO


Figure 13.2 A linearly polarized plane wave is reflected from the polished surface of a magnetic material having perpendicular magnetization (the polar case); ε_xx = −8 + 27i, ε_xy = 0.6 + 0.2i. (a) Plots of |r_pp|, |r_ss|, and |r_sp| = |r_ps| versus the angle of incidence θ. (b) The polarization rotation angle ρ and the ellipticity η versus θ for a p-polarized incident beam. (c) Same as (b) for an s-polarized beam.

signal gains strength, peaking at θ = 65°. Again, ρ and η depend on whether the incident polarization is p or s (see Figures 13.3(b), (c)), but the effective MO signal, r_sp, is independent of the incident polarization. The longitudinal MO signal is typically weaker than its polar counterpart by almost one order of magnitude.
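As a reference point for the magnitudes quoted in this chapter, the polar Kerr signal at normal incidence can be estimated directly from the circular-polarization picture of the preceding section. The sketch below is a simplified illustration, not the full oblique-incidence calculation behind Figures 13.2 and 13.3: it assumes incidence from air, uses n± = (ε ± iε′)^(1/2) with the Fresnel coefficients r± = (1 − n±)/(1 + n±), and recombines the two circular components into r_xx and r_xy. The overall signs of r_xy and of the computed angles depend on handedness and time-dependence conventions, so only the magnitudes and the sign reversal under M → −M should be read from it. The function name is illustrative.

```python
import cmath
import math


def polar_kerr_normal_incidence(eps, eps_mo):
    """Return r_xx, r_xy, rotation (deg), ellipticity (deg) at normal incidence."""
    n_plus = cmath.sqrt(eps + 1j * eps_mo)
    n_minus = cmath.sqrt(eps - 1j * eps_mo)
    r_plus = (1 - n_plus) / (1 + n_plus)
    r_minus = (1 - n_minus) / (1 + n_minus)
    # Linear-basis coefficients for X-polarized incident light.
    r_xx = 0.5 * (r_plus + r_minus)
    r_xy = 0.5j * (r_plus - r_minus)
    # Polarization ellipse of the reflected beam via Stokes parameters.
    s0 = abs(r_xx) ** 2 + abs(r_xy) ** 2
    s1 = abs(r_xx) ** 2 - abs(r_xy) ** 2
    s2 = 2 * (r_xx * r_xy.conjugate()).real
    s3 = 2 * (r_xx.conjugate() * r_xy).imag
    rotation = math.degrees(0.5 * math.atan2(s2, s1))
    ellipticity = math.degrees(0.5 * math.asin(max(-1.0, min(1.0, s3 / s0))))
    return r_xx, r_xy, rotation, ellipticity


if __name__ == "__main__":
    eps, eps_mo = -8 + 27j, 0.6 + 0.2j   # material constants used in this chapter
    r_xx, r_xy, rot, ell = polar_kerr_normal_incidence(eps, eps_mo)
    print("reflectivity |r_xx|^2:", abs(r_xx) ** 2)
    print("|r_xy|:", abs(r_xy))
    print("rotation, ellipticity (deg):", rot, ell)
    # Reversing the magnetization flips the sign of eps'.
    print(polar_kerr_normal_incidence(eps, -eps_mo)[2:])
```

With these material constants the cross-polarized amplitude is a few parts in a thousand and the Kerr angles are a fraction of a degree, which sets the scale against which the longitudinal signal is about an order of magnitude smaller.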


Figure 13.3 Same as Figure 13.2 but here for the longitudinal Kerr effect. Again r_sp = r_ps at all angles of incidence. The MO effect is zero at normal incidence, reaching its peak at a fairly large angle. Note that |r_ps|, ρ, η are about an order of magnitude smaller than their counterparts for the polar geometry case. Both polar and longitudinal effects are bipolar, in the sense that a reversal in the direction of M results in a π phase shift of r_ps, leading to a reversal in the signs of both ρ and η.

The transverse effect

The behavior of the reflected light in this case differs fundamentally from that in the other two cases. First, there is no interaction whatsoever between the magnetic moment of the sample and s-polarized light. Here the optical E-field is parallel to M and, therefore, does not “see” the magnetization of the sample.


When the incident beam is p-polarized, the interaction is confined to the plane of incidence, creating an extra E-field component within the same plane. Unlike the polar and longitudinal effects, no E-fields are generated perpendicular to the plane of incidence. Therefore, there are no polarization rotations in the transverse geometry. What is interesting, however, is that the reflectivity of the sample, R_p = |r_pp|², depends on the magnitude and direction of the magnetic moment M. In Figure 13.4(a) the reflectivity in the absence of M is denoted by R_p^(0) (i.e., ε′ is set to zero). With M pointing along +Y the reflectivity changes slightly, becoming R_p^(+); the difference is shown as the solid curve at the bottom of Figure 13.4(a). Similarly, when M is reversed to point along −Y, the corresponding change in R_p is given by the broken curve. The change in R_p is thus seen to depend on the direction of M. This behavior is rather curious and, at first sight, appears to violate the principles of symmetry, although a careful analysis shows it to be correct.¹ It is noteworthy that this bipolar nature of R_p critically depends on the magnetic medium being absorptive; for transparent magnetic media (where ε


Figure 13.4 Variation of the reflectivity R_p with the magnitude and/or direction of M in the transverse geometry. The incident beam is p-polarized in all cases; there are no transverse effects for s-polarized light. (a) The dependence of R_p on the angle of incidence θ; the superscript zero indicates that M = 0. When the medium is fully magnetized in the ±Y direction, the reflectivity is denoted by R_p^(±). (b) The variation of R_p with M at various angles of incidence. At θ = 0° the dependence on M is quadratic, while at θ = 20°, 40°, 60° it is nearly linear. (The off-diagonal element ε′ of the dielectric tensor is assumed to be directly proportional to M.)


is purely real and ε′ purely imaginary), the dependence of R_p on M is quadratic, showing no change with the reversal of the direction of magnetization. Figure 13.4(b) shows the variations in the reflectivity difference R_p^(M) − R_p^(0) with the magnitude of M, as M varies continuously from a maximum value along +Y to zero and then reverses direction and reaches a maximum in the opposite direction. At normal incidence the dependence on M is quadratic, but at larger angles (20°, 40°, 60°) it is almost (but not quite) linear. Like the longitudinal effect, the transverse effect in this case is about an order of magnitude weaker than the polar effect.

Localized probe of the state of magnetization

It is sometimes desirable to probe the local state of a magnetic surface. This can be done by focusing onto the surface a polarized laser beam through a high-NA objective, as shown in Figure 13.5. The lens focuses the beam to a diffraction-limited spot (diameter ≈ λ_0), providing access to the sample’s magnetization within a tiny region. The focused beam, of course, contains many rays arriving at the sample from different directions, making the analysis of the resulting Kerr signal somewhat tedious. To begin with, even in the absence of a magnetic moment M the reflected polarization state is complicated. Figure 13.6 shows the various distributions at the exit pupil of the objective when M is set to zero. The intensity of the x-component


Figure 13.5 A linearly polarized beam of light having its E-field parallel to the X-axis is focused onto the flat surface of a magnetic medium through a diffraction-limited microscope objective lens (NA = 0.95, f = 3158λ). The power of the incident beam – its integrated intensity – is set to unity. The reflected light’s distribution at the exit pupil has a small but important contribution from the magnetization M of the sample.


Figure 13.6 Various distributions at the exit pupil of the objective of Figure 13.5, when M is set to zero (i.e., no Kerr effect). (a) Distribution of intensity for the reflected E_x; the total power = 0.62. (b) Distribution of phase for E_x; min = 0°, max = 55°. (c) Distribution of intensity for the reflected E_y; the total power = 0.011. (d) Distribution of phase for E_y; min = 36°, max = 150°. (e) The polarization rotation angle ρ; ρ_min = −20.5°, ρ_max = 20.5°. (f) The polarization ellipticity η; η_min = −25.1°, η_max = 25.1°.

of the reflected light, I_x = |E_x|², depicted in Figure 13.6(a), shows slight variations across the aperture, in agreement with the r_pp and r_ss curves of Figure 13.2(a). Similar variations are seen in the corresponding phase plot of Figure 13.6(b). In addition to E_x, the reflected light also contains a y-component, E_y, whose intensity and phase plots appear in Figures 13.6(c), (d). While the total power (i.e., the integrated intensity) of E_x is 62% of the incident power, that of E_y is only 1.1%. The reflected E_y in adjacent quadrants of the aperture exhibits a phase shift of π, indicating a sign reversal from one quadrant to the next. The presence of E_y in the reflected beam gives rise to the patterns of polarization rotation and ellipticity depicted in Figures 13.6(e), (f); note the fairly large values of ρ and η in the four corners of the aperture (ρ_min, ρ_max = ±20.5°; η_min, η_max = ±25.1°).
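The origin of the non-magnetic E_y, and the split of reflected power between E_x and E_y, can be illustrated with a simple ray-by-ray estimate. The sketch below is not the diffraction calculation used for Figure 13.6; it assumes a uniformly filled aplanatic objective (pupil radius proportional to sin θ), treats each ray as a plane wave reflected with the ordinary Fresnel coefficients of a non-magnetic surface with ε = −8 + 27i, and projects the reflected p- and s-components back onto the X- and Y-axes at the pupil. The sign convention is chosen so that r_p = −r_s at normal incidence, in which case a ray at azimuth φ returns with E_x ∝ r_s sin²φ − r_p cos²φ and E_y ∝ −(r_p + r_s) sin φ cos φ. All function names are illustrative.

```python
import numpy as np


def fresnel(eps, sin2_theta):
    """Fresnel r_p, r_s for incidence from air; sin2_theta = sin^2(angle)."""
    cos_t = np.sqrt(1.0 - sin2_theta)
    q = np.sqrt(eps - sin2_theta + 0j)      # normal wave-vector component in the medium
    r_s = (cos_t - q) / (cos_t + q)
    r_p = (eps * cos_t - q) / (eps * cos_t + q)
    return r_p, r_s


def pupil_powers(eps, na, n_u=400, n_phi=360):
    """Fractions of the incident power returned in E_x and E_y (M = 0)."""
    # Uniform pupil filling of an aplanatic lens: equal weights in sin^2(theta).
    u = (np.arange(n_u) + 0.5) / n_u * na ** 2
    phi = (np.arange(n_phi) + 0.5) / n_phi * 2.0 * np.pi
    uu, pp = np.meshgrid(u, phi, indexing="ij")
    r_p, r_s = fresnel(eps, uu)
    e_x = r_s * np.sin(pp) ** 2 - r_p * np.cos(pp) ** 2
    e_y = -(r_p + r_s) * np.sin(pp) * np.cos(pp)
    return float(np.mean(np.abs(e_x) ** 2)), float(np.mean(np.abs(e_y) ** 2))


if __name__ == "__main__":
    p_x, p_y = pupil_powers(-8 + 27j, 0.95)
    print("power in E_x:", p_x)
    print("power in E_y:", p_y)
```

The factor sin φ cos φ in E_y changes sign from one quadrant of the aperture to the next, which corresponds to the π phase shift between adjacent quadrants noted above, and E_y vanishes along the X- and Y-axes where the incident ray is purely p- or purely s-polarized.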


To determine the contribution to the reflected E-field by the sample’s magnetization, we compute the complex reflected amplitudes for M up and M down, then subtract one distribution from the other. In the process the x-component of polarization disappears, indicating that E_x is indifferent to the reversal of M. However, the residual y-component shows the distribution depicted in Figure 13.7. The total power of E_y contributed by the MO interaction in this case is 0.0042% of the incident power. Both the phase and intensity of this residual E_y are fairly uniform, with the intensity showing a mild decline towards the edge of the aperture, consistent with the behavior of r_sp in Figure 13.2(a). (Note that, even at NA = 0.95, the largest angle of incidence on the sample is less than 72°.)

A similar calculation for the longitudinal case yields the plots in Figure 13.8. Here the complex amplitude distributions are computed for M along +X and −X, then subtracted from each other. Unlike the polar Kerr signal in Figure 13.7, both the reflected E_x and the reflected E_y in the longitudinal geometry contain some MO contribution. The total power of the MO contribution to E_x is 0.0000065%, which is rather small and concentrated in the four corners of the aperture. Note that the top half of the aperture containing the E_x signal has a π phase shift relative to the bottom half. In contrast, the E_y contribution to the MO signal (see Figures 13.8(c), (d)) contains 0.000054% of the incident power, equally divided between the right and left halves of the aperture with a π phase shift.

Finally, if the magnetization of the sample in Figure 13.5 is aligned with the Y-axis (perpendicular to the plane of the figure) then the MO contributions to the reflected beam will be those shown in Figure 13.9. As before, we obtain these distributions by computing the complex amplitudes at the exit pupil with M along +Y and −Y and then subtracting one from the other. The MO contribution to E_x, having 0.00026% of the incident power, is fairly strong. The contribution to E_y contains 0.000054% of the incident power, exactly as in the longitudinal case


Figure 13.7 Contribution of the magnetic moment M of the sample in Figure 13.5 to the E_y distribution at the objective’s exit pupil; M is assumed to be perpendicular to the sample’s surface. (a), (b) Intensity and phase patterns of E_y; the total power = 0.42 × 10⁻⁴.


Figure 13.8 Contribution of the magnetic moment of the sample to the E-field distribution at the exit pupil of the objective of Figure 13.5; M is assumed to be aligned with the X-axis. (a), (b) Intensity and phase patterns of E_x; the total power = 0.65 × 10⁻⁷. The top and bottom halves of the aperture have a relative phase of π. (c), (d) Intensity and phase patterns of E_y; the total power = 0.54 × 10⁻⁶. Note the π phase difference between the right and left halves of the aperture.

depicted in Figures 13.8(c), (d). Note that, with the exception of a 90° rotation of coordinates, the distributions in Figures 13.9(c), (d) are identical to those in Figures 13.8(c), (d).

Signal detection

The MO contribution to the reflected polarization state can be converted to an electronic signal with the aid of polarization-sensitive optics and photodetectors. For instance, to detect the polar Kerr signal shown in Figure 13.7, one can employ the differential scheme shown in Figure 13.10. Here the reflected beam is directed toward a quarter-wave plate, which helps to eliminate the phase shift between $E_x$ and $E_y$. The quarter-wave plate is followed by a Wollaston prism, which mixes the MO component of polarization contained in $E_y$ with the reflected x-component of polarization, $E_x$. The two mixed beams emerging from the Wollaston are detected by a pair of photodetectors whose difference signal $\Delta S$ conveys information about the sample's magnetic state. A computed plot of the normalized differential signal versus the orientation angle $\psi$ of M is given in Figure 13.11. As M moves away

Figure 13.9 Same as Figure 13.8, but for the transverse geometry, where M is switched between the $+Y$ and $-Y$ directions. $E_x$, depicted in (a), (b), has total power $0.26 \times 10^{-5}$. $E_y$, depicted in (c), (d), has total power $0.54 \times 10^{-6}$. (Horizontal axis in all panels: $x/\lambda$, from −3200 to 3200.)

Figure 13.10 A differential detection scheme is used to probe the direction of M via the state of polarization of the reflected beam. To attain high spatial resolution, the laser beam is focused on the sample surface. The reflected beam goes through a quarter-wave plate, whose fast and slow axes are at 45° to the direction of incident polarization. The Wollaston prism divides the beam between two photodetectors, and the difference $\Delta S$ between the outputs of these detectors is monitored. To maximize the swing of $\Delta S$ one must adjust the orientation of the Wollaston around the optical axis. (Diagram labels: incident beam, splitter, objective, magnetic sample with magnetization M, $\lambda/4$ plate, Wollaston prism, split detector with outputs $S_1$ and $S_2$, differential amplifier with output $\Delta S$.)

Figure 13.11 The normalized differential signal as a function of the orientation angle $\psi$ of M (see Figure 13.10). The detection module has been adjusted for maximum swing of $\Delta S$. This signal is bipolar, in the sense that it switches sign when M is reversed. $\Delta S$ is zero at $\psi = 90°$ (i.e., M in the plane of the sample). (Vertical axis: $100\,\Delta S/(S_1 + S_2)$, from −0.75 to 0.75; horizontal axis: $\psi$ in degrees, from 0 to 180.)

from its initial orientation at $\psi = 0$ toward the plane of the sample at $\psi = 90°$, and continues downward until $\psi = 180°$, $\Delta S$ follows these changes continuously. (We mention in passing that, as M rotates, the sum signal $S_1 + S_2$ undergoes slight variations, but, for all practical purposes, it remains a constant.) Similar systems may be designed to extract the longitudinal and transverse MO signals depicted in Figures 13.8 and 13.9. However, because in these cases the E-field contributions have different signs in opposite halves of the aperture, any viable detection scheme must extract the signals from these half-apertures separately, before combining them with the proper sign at the end.

Enhancing the Kerr signal

To enhance the MO signal one should force the magnetic sample to absorb a greater fraction of the incident beam. As an example of how this can be done, consider the system of Figure 13.12, which consists of a magnetic sample placed under a high-reflectivity dielectric mirror. The majority of the rays in the focused beam are reflected from the mirror without ever reaching the magnetic sample. However, when the direction of the ray is such that the cavity between the mirror and the sample becomes resonant, the ray is strongly absorbed by the magnetic sample. This strong absorption produces in the reflected beam a rather large polarization component perpendicular to the incident E-field, which can then be detected at the exit pupil of the objective.

Figure 13.12 A collimated linearly polarized laser beam ($\lambda = 633$ nm) is focused through a 0.75NA, $f = 4000\lambda$ objective onto a high-reflectivity dielectric mirror sitting on top of a magnetic sample ($\varepsilon = 8 + 27\mathrm{i}$, $\varepsilon' = 0.6 + 0.2\mathrm{i}$). The mirror, consisting of seven pairs of high- and low-index quarter-wave layers ($n_H = 2$, $n_L = 1.5$), is deposited on a glass substrate 10 μm thick and of index $n = 1.5$. The substrate is in direct contact with the magnetic surface. The magnetization M is uniform and perpendicular to the plane of the sample's surface, and the beam entering the lens is polarized along the X-axis. Upon reflection from the sample, the light collected by the objective is photographed at the exit pupil. Most rays within the focused beam are reflected at the mirror, without ever reaching the magnetic sample. At certain angles of incidence, however, where the cavity becomes resonant, the light passes through to the magnetic medium and is absorbed by it. It is only for these resonant rays that the MO effect is observed at the exit pupil.

Figure 13.13 shows the computed distributions at the exit pupil of a 0.75NA lens. The intensity plot for $E_x$ in Figure 13.13(a) shows absorption bands in the angular spectrum of the incident beam. The reflected $E_y$ in Figure 13.13(b) is strong in certain regions of the aperture, but these contributions mostly come from spurious light reflected from the mirror, not from the magnetic sample. To determine the MO signal at the exit pupil, we once again compute the reflected complex amplitudes with M up and M down, then subtract the corresponding distributions. Figure 13.13(c) is the result of this calculation, showing the intensity of the residual $E_y$ contributed by the MO interaction. The peak value of this MO signal is nearly twice that shown in Figure 13.7(a).

Quadrilayer stack

A practical method of enhancing the MO Kerr effect involves the incorporation of a thin magnetic film in a quadrilayer stack structure. Figure 13.14 shows one such stack, consisting of an aluminum reflector, a dielectric underlayer, a thin magnetic film, and a dielectric overlayer. By optimizing the thicknesses of these

Figure 13.13 Various distributions at the exit pupil of the objective of Figure 13.12. (a) Intensity of the reflected $E_x$; the total power = 87% of the incident power. (b) Intensity of $E_y$; the total power = 0.3% of the incident power. Most of this $E_y$, which is primarily produced by oblique reflections from the dielectric mirror, serves only to obscure the MO-generated component of polarization. (c) The true MO signal obtained by subtracting the distributions produced with M up and M down; the total power ≈ 0.0007% of the incident power. (Horizontal axis: $x/\lambda$, from −3200 to 3200.)

layers it is possible to improve the MO signal substantially. In the following example, we will fix the thicknesses of three of the layers and optimize the thickness of the remaining one. This results in a significant gain in the performance of the stack. (It is possible to achieve further improvement by optimizing the other layers as well.) Figure 13.15 shows plots of the reflectivity $R = |r_{pp}|^2 = |r_{ss}|^2$, the MO Kerr signal $|r_{sp}|$, and the polarization rotation and ellipticity, all versus the thickness $t_2$

Figure 13.14 A quadrilayer MO stack consists of an aluminum reflector, an intermediate dielectric layer, a thin magneto-optic film, and an overcoating dielectric layer. The thicknesses of the various layers may be adjusted to maximize the MO signal $|r_{sp}|$ obtained upon reflection. (Diagram labels, top to bottom: incident plane-wave, dielectric overlayer of thickness $t_1$, magnetic film, dielectric underlayer of thickness $t_2$, aluminum, substrate.)

Figure 13.15 The dependence of reflected signals from a quadrilayer MO stack on the thickness of the dielectric underlayer ($\lambda_0 = 633$ nm, normal incidence). The aluminum layer ($n + \mathrm{i}k = 1.4 + 7.6\mathrm{i}$) is 50 nm thick, the MO film is 20 nm thick, and the overcoat layer ($n = 2$) is 80 nm thick ($t_1 = \lambda_0/(4n)$). The underlayer's index of refraction is $n = 2$, but its thickness $t_2$ is adjustable. (a) Computed plots of the reflectivity $R$ and the MO signal $|r_{sp}|$ versus $t_2$. (b) The Kerr rotation angle $\rho$ and the ellipticity $\eta$ versus $t_2$. The maximum of $\rho$ occurs when $\eta$ is nearly zero, and vice versa. (Panel (a) axes: reflectivity $R$ and $50|r_{sp}|$ versus $t_2$ in nm; panel (b) axes: rotation and ellipticity in degrees versus $t_2$ in nm.)

of the dielectric underlayer. (Since the dependence on $t_2$ is periodic, only one period, ranging from zero to $\lambda_0/(2n)$, is shown.) Note that $|r_{sp}|$ peaks when $R$ is at a minimum, and vice versa. The maximum value of $|r_{sp}|$ in this example is about three times greater than that of the bare magnetic sample shown in Figure 13.2.
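The layer thicknesses quoted for Figure 13.15 can be checked with a few lines of arithmetic. The following Python sketch evaluates the quarter-wave thickness $\lambda_0/(4n)$ and the period $\lambda_0/(2n)$ of the underlayer dependence for $\lambda_0 = 633$ nm and $n = 2$; the function names are illustrative choices of ours and do not appear in the original text.

```python
def quarter_wave_thickness(wavelength_nm, n):
    """Thickness of a quarter-wave layer, lambda_0 / (4 n), in nm."""
    return wavelength_nm / (4.0 * n)


def underlayer_period(wavelength_nm, n):
    """Thickness change that adds a full 2*pi of round-trip phase, lambda_0 / (2 n)."""
    return wavelength_nm / (2.0 * n)


t1 = quarter_wave_thickness(633.0, 2.0)   # overcoat layer of Figure 13.15
period = underlayer_period(633.0, 2.0)    # range of t2 that covers one period

print(f"quarter-wave thickness: {t1:.3f} nm")
print(f"period in t2:           {period:.3f} nm")
```

The quarter-wave value of roughly 79 nm is consistent with the 80 nm overcoat listed in the caption of Figure 13.15, and the period of roughly 158 nm sets the range of $t_2$ over which the curves need to be plotted.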


References for Chapter 13

1 P. S. Pershan, Magneto-optical effects, J. Appl. Phys. 38, 1482–1490 (1967).
2 F. A. Jenkins and H. E. White, Fundamentals of Optics, fourth edition, McGraw-Hill, New York, 1976.
3 R. W. Wood, Physical Optics, third edition, Optical Society of America, Washington DC, 1988.
4 D. O. Smith, Magneto-optical scattering from multilayer magnetic and dielectric films, Optica Acta 12, 13 (1965).
5 M. Mansuripur, The Physical Principles of Magneto-optical Recording, Cambridge University Press, UK, 1995.

14 The Sagnac interferometer

The Sagnac effect pertains to the relative phase shift between two beams of light that travel on an identical path in opposite directions within a rotating frame.1,2,3,4 Modern fiber-optic gyroscopes (Sagnac interferometers) used for navigation are based on this effect, allowing highly accurate measurements of rotation rates down to about $10^{-4}$–$10^{-5}$ degrees per hour. Georges Sagnac (1869–1926) was the first to perform a ring interferometry experiment in 1913 aimed at observing the correlation of angular velocity and optical phase-shift. (An experiment conducted in 1911 by Francis Harress, attempting to measure the Fresnel drag of light propagating through rotating glass, was later recognized as actually constituting a Sagnac experiment; Harress had ascribed the observed "unexpected bias" to some other factor.) An ambitious ring interferometry experiment was set up by Albert Michelson and Henry Gale in 1926 to determine whether the Earth's rotation has an effect on the propagation of light in its vicinity. The Michelson–Gale interferometer with a 1.9 km perimeter was large enough to detect the rotation of the Earth, confirming its known value of angular velocity (obtained from astronomical observations). The Michelson–Gale ring interferometer was not calibrated by comparison with an outside reference, an impossible task given that the setup was fixed to the Earth.

Figure 14.1 shows the general design of a triangular Sagnac interferometer consisting of a light source, a beam-splitter S, mirrors M1, M2, and an observation plane, mounted on a base that rotates at a constant angular velocity $\boldsymbol{\Omega}$ around a fixed axis. The rotation axis, not necessarily perpendicular to the plane of the interferometer, crosses that plane at C. The source and the observation plane are mounted on the same rotating base as M1, M2, and S, although, strictly speaking, this is not necessary (that is, either the source or the observation plane or both may be stationary while the rest of the system rotates; this would require synchronizing the light pulses with the rotating base, but would not modify the behavior of the system in any significant way). Between the source

Figure 14.1 Diagram of a Sagnac interferometer consisting of the beamsplitter S and the mirrors M1 and M2. The instruments are mounted on a base that rotates around a fixed axis at a constant angular velocity $\boldsymbol{\Omega}$. The rotation axis, which crosses the plane of the interferometer at C, is not necessarily perpendicular to that plane. In this configuration, the source and the observation plane are mounted on the same (rotating) platform as M1, M2, and S.

and the observation plane, the clockwise-propagating beam undergoes four reflections, whereas the beam that travels counterclockwise suffers two reflections (at M1, M2) and two transmissions (both at S). Given the 90° phase difference between the reflection and transmission coefficients of any beamsplitter, the counter-propagating beams arrive at the observation plane with a relative phase of 180°, thus cancelling each other out and resulting in a dark (or null) fringe. Any phase difference imparted to the two beams as a result of the rotation of the system will therefore change the strength of the signal picked up by a photodetector at the observation plane.

There exist references in the literature to a "fringe shift" at the observation plane resulting from the rotation of the Sagnac interferometer.3 Such fringe shifts are observed only when a small misalignment of the system causes the separation of the counter-propagating beams upon arrival at the beam-splitter S. As a specific example based on typical numerical values, Figure 14.2 shows a Sagnac interferometer incorporating a lens that focuses the incoming laser beam onto the front and back sides of the beam-splitter after the beam has completed its round-trips in the two opposite directions (red: clockwise, blue: counter-clockwise). A slight tilt of one of the mirrors (say, M2) separates the two focused spots at S. The beams emerging from these two spots travel in parallel and interfere at the observation plane; the inset in Figure 14.2 shows the (computed) fringes that result from this interference. Any phase difference of the counter-propagating beams produced in a rotating system will cause a lateral shift of the fringes within the observation plane.

Figure 14.2 A lens (NA = 0.005, $f = 3.0$ m) focuses the incoming laser beam ($\lambda_0 = 0.633$ μm, 1/e amplitude diameter = 1.0 cm) onto the beam-splitter after the completion of a single round-trip in either direction (red: clockwise, blue: counter-clockwise). The 0.9 m-long arms of the interferometer form an equilateral triangle. The observation plane is 1.0 m away from the beam-splitter S, which consists of an 8.0 nm-thick silver film on a glass substrate ($R_s = 50.5\%$, $T_s = 46.8\%$). The mirrors M1 and M2 are 0.5 μm-thick silver films on glass substrates ($R_M = 97\%$). Tilting one of the mirrors (say, M2) by $\Delta\theta = 0.05°$ separates the focused spots at the beam-splitter by 1.57 mm (in the direction parallel to the observation plane); each focused spot is 0.24 mm in diameter. The beams emerging from the focused spots travel in parallel and interfere at the observation plane; the inset shows computed fringes over a 6 × 6 mm² area.

The three main features of a Sagnac interferometer are as follows:

1. The observed relative phase between counter-propagating beams around the Sagnac loop is proportional to $\mathbf{A}\cdot\boldsymbol{\Omega}$, where $\mathbf{A}$ is the loop area and $\boldsymbol{\Omega}$ is the loop's angular velocity, irrespective of the shape of the loop, or the location and orientation of the rotation axis.
2. Doppler shifts produced by reflections from the moving beam-splitter and mirrors do not give rise to different optical frequencies for the counter-propagating beams when these beams arrive at the observation plane, even though the frequencies of the two beams could be different elsewhere along the path.
3. The refractive indices of the media traversed by the counter-propagating beams do not affect the relative phase of these beams, provided that such media co-rotate with the rest of the Sagnac interferometer.


The objective of the present chapter is to explain the physical basis of the above features of the Sagnac interferometer without resort to the principles of general relativity. Our goal is to provide an explanation based on geometry, the theory of special relativity, and the classical theory of optical wave propagation, while maintaining some level of generality.

Fundamental formula of the Sagnac interferometer

Figure 14.3 shows that, in its clockwise path around the Sagnac loop, the light beam propagates the distances $r_1$, $r_2$, $r_3$ along the unit vectors $\boldsymbol{\sigma}_1$, $\boldsymbol{\sigma}_2$, $\boldsymbol{\sigma}_3$. A total of four reflections (at S, M1, M2, and again at S) bring the beam from the source to the observation plane. With respect to the rotation center C, which is in the $r_1 r_2 r_3$ plane, the centers of M1, M2, S are located at $\mathbf{R}_1$, $\mathbf{R}_2$, $\mathbf{R}_3$, respectively. The area $A$ of the triangle $r_1 r_2 r_3$ is equal to the area $A_1$ of $r_1 R_1 R_3$ plus the area $A_2$ of $R_3 R_2 r_3$ minus the area $A_3$ of $R_1 R_2 r_2$. The monochromatic light source (frequency = $f_0$, wavelength $\lambda_0 = c/f_0$, wave-number $k_0 = 2\pi/\lambda_0 = 2\pi f_0/c$) launches the incident beam along the unit vector $\boldsymbol{\sigma}_0$; the emergent beam reaches the observation plane along the unit vector $\boldsymbol{\sigma}_4$.

Note that $\boldsymbol{\sigma}_1 \times \mathbf{R}_1$, a vector perpendicular to the plane of $r_1 R_1 R_3$ with a magnitude equal to the perpendicular distance from C to $r_1$, is exactly equal to $\boldsymbol{\sigma}_1 \times \mathbf{R}_3$. Similarly, $\boldsymbol{\sigma}_2 \times \mathbf{R}_1 = \boldsymbol{\sigma}_2 \times \mathbf{R}_2$ and $\boldsymbol{\sigma}_3 \times \mathbf{R}_2 = \boldsymbol{\sigma}_3 \times \mathbf{R}_3$. It is thus clear that $r_1\boldsymbol{\sigma}_1 \times \mathbf{R}_1 = r_1\boldsymbol{\sigma}_1 \times \mathbf{R}_3 = 2\mathbf{A}_1$; similar identities may be readily established for the areas $A_2$ and $A_3$ of the triangles $R_3 R_2 r_3$ and $R_1 R_2 r_2$ as well.

Consider a plane-wave that leaves the beam-splitter S and propagates toward the mirror M1 along $\boldsymbol{\sigma}_1$; the complex amplitude of this wave may be expressed as $a_0 \exp[\mathrm{i}k_0(\boldsymbol{\sigma}_1\cdot\mathbf{r} - ct)]$. The destination of the beam is the center of M1 located at $\mathbf{r}_1 = r_1\boldsymbol{\sigma}_1$ relative to the center of S, and the travel time is $\Delta t = r_1/c$. However, the rotation of the interferometer causes the center of M1 to shift to a new location $\mathbf{r}_1' = \mathbf{r}_1 - \mathbf{R}_1 \times \boldsymbol{\Omega}\,\Delta t$. Upon arriving at M1 the additional phase of the beam due to this rotation will be

$$\Delta\varphi_1 = k_0\,\boldsymbol{\sigma}_1\cdot(\mathbf{r}_1' - \mathbf{r}_1) = -k_0\,\boldsymbol{\sigma}_1\cdot(\mathbf{R}_1\times\boldsymbol{\Omega})\,\Delta t = -k_0(\boldsymbol{\sigma}_1\times\mathbf{R}_1)\cdot\boldsymbol{\Omega}\,r_1/c = -k_0(\mathbf{r}_1\times\mathbf{R}_1)\cdot\boldsymbol{\Omega}/c = -2k_0\,\mathbf{A}_1\cdot\boldsymbol{\Omega}/c. \tag{14.1}$$

In the above derivation, we have used the vector identity $\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) = (\mathbf{a}\times\mathbf{b})\cdot\mathbf{c}$, which is readily proven by considering the volume of a parallelepiped constructed around $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$. $A_1$ is the area of the triangle with base $r_1$ and vertex C (i.e., the point at which the rotation axis crosses the interferometer's plane). With reference to Figure 14.3, the sign of the cross-product giving rise to $\mathbf{A}_1$ is positive in the present example. When the contributions to the acquired phase of the three legs of the interferometer are added up, the positive and negative areas combine to yield the net area $A$ of the triangle enclosed by the light beam circulating

Figure 14.3 In the clockwise path around the Sagnac loop, the light beam propagates the distances $r_1$, $r_2$, $r_3$ along the unit vectors $\boldsymbol{\sigma}_1$, $\boldsymbol{\sigma}_2$, $\boldsymbol{\sigma}_3$. Four reflections (at S, M1, M2, and again at S) bring the beam from the source to the observation plane. With respect to the rotation center C, the centers of M1, M2, S are at $\mathbf{R}_1$, $\mathbf{R}_2$, $\mathbf{R}_3$, respectively. The monochromatic light source (frequency = $f_0$) launches the incident beam along $\boldsymbol{\sigma}_0$; the emergent beam (frequency = $f_4$) reaches the observation plane along $\boldsymbol{\sigma}_4$. Viewed from an inertial frame outside the rotating system, $f_4$ could differ from $f_0$; from the perspective of a co-rotating observer, however, the two frequencies are identical.

around the loop. The total phase shift must be doubled when the counterclockwise-propagating beam is taken into account as well. The net phase between the two beams at the observation plane is

$$\Delta\varphi = 4k_0\,\mathbf{A}\cdot\boldsymbol{\Omega}/c. \tag{14.2}$$

Equation (14.2) is the fundamental equation of the Sagnac interferometer. Although the present analysis has been limited to a triangular loop, it is obvious that the procedure can be readily extended to loops of arbitrary shape, without modifying the fundamental formula. Note that, in arriving at Eq. (14.2), we ignored the Doppler shifts of the beams produced by the moving reflectors. The reason is that the analysis has been performed from the perspective of a co-rotating observer, i.e., one who resides on the rotating platform. For this observer, the source, the splitter, and the mirrors are stationary and, therefore, the light’s frequency everywhere is the same. The analysis of the next section, conducted from the viewpoint of an observer residing in an inertial frame outside the rotating platform, provides an alternative derivation of Eq. (14.2) using the Doppler shifts produced by the moving reflectors.
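The claim that the result depends only on the enclosed area, and not on where the rotation axis crosses the plane of the loop, can be checked numerically. The sketch below sums the single-leg phases of Eq. (14.1) around a triangle and compares twice that sum with Eq. (14.2). The triangle dimensions and wavelength are those of Figure 14.2; the rotation rate, the two locations of C, the direction of traversal, and all function names are assumptions made for illustration only. Reversing the traversal direction flips the sign of the single-pass phase, so only the magnitude should be compared with a particular sign convention for $\mathbf{A}$.

```python
import numpy as np

C = 299_792_458.0  # speed of light in vacuum (m/s)


def single_pass_phase(k0, vertices, omega):
    """Rotation-induced phase for one trip around a closed polygonal loop.

    Sums -k0 * (r_i x R_i) . Omega / c over the legs, in the manner of
    Eq. (14.1). `vertices` are positions measured from the point C where
    the rotation axis crosses the loop plane, listed in the order visited.
    """
    total = 0.0
    n = len(vertices)
    for i in range(n):
        start, end = vertices[i], vertices[(i + 1) % n]
        leg = end - start  # r_i * sigma_i for this leg
        total += -k0 * np.dot(np.cross(leg, end), omega) / C
    return total


def loop_area(vertices):
    """Magnitude of the area enclosed by a planar polygon."""
    n = len(vertices)
    twice_area = np.zeros(3)
    for i in range(n):
        twice_area += np.cross(vertices[i], vertices[(i + 1) % n])
    return 0.5 * np.linalg.norm(twice_area)


# Equilateral triangle with 0.9 m arms (as in Figure 14.2), in the xy-plane.
side = 0.9
triangle = np.array([[0.0, 0.0, 0.0],
                     [side, 0.0, 0.0],
                     [side / 2, side * np.sqrt(3) / 2, 0.0]])

k0 = 2 * np.pi / 0.633e-6               # wave-number for 0.633 um light
omega = np.array([0.0, 0.0, 7.292e-5])  # assumed rotation rate (rad/s) about z

# Two different (assumed) locations of the rotation centre C.
phase_a = single_pass_phase(k0, triangle - np.array([0.3, 0.2, 0.0]), omega)
phase_b = single_pass_phase(k0, triangle - np.array([5.0, -3.0, 0.0]), omega)

sagnac = 4 * k0 * loop_area(triangle) * omega[2] / C  # Eq. (14.2)
print(phase_a, phase_b, sagnac)
```

Both choices of C give the same single-pass phase, and doubling it (to account for the counter-propagating beam) reproduces the value given by Eq. (14.2).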


Figure 14.4 shows that the counter-propagating beams in each arm of the Sagnac loop interfere with each other, setting up a standing-wave pattern (period = $\tfrac{1}{2}\lambda_0$) that is stationary within the rotating platform. The effect of the rotation on the standing wave is a uniform, longitudinal shift of the fringes by $\mathbf{A}\cdot\boldsymbol{\Omega}/c$. The reason for the fringe shift is that, at any given point around the loop, one beam arrives with its phase advanced, while the other (counterpropagating) beam arrives with its phase retarded. The combined phase-shift of $\Delta\varphi = 2k_0\,\mathbf{A}\cdot\boldsymbol{\Omega}/c$ is the same as the accumulated phase in a single round trip for either beam; the proof follows the same line of argument as that employed in conjunction with Eq. (14.1).

Doppler shift caused by moving reflectors

To analyze the Doppler shift upon reflection from the moving mirrors M1, M2, and the beam-splitter S, we consider the system depicted in Figure 14.5. In the $xyz$ frame of reference, the mirror M moves with constant velocity $v$ along the z-axis. From the perspective of an observer in the $x'y'z'$ frame in which the mirror is stationary, the incident and reflected plane-waves have the same frequency $f'$ (wavelength $\lambda' = c/f'$), and the propagation directions are uniquely identified with unit vectors $\boldsymbol{\sigma}_1'$ and $\boldsymbol{\sigma}_2'$. In the $xyz$ frame, however, the incident plane-wave has frequency $f_1$, wavelength $\lambda_1 = c/f_1$, and propagation direction $\boldsymbol{\sigma}_1$, whereas the reflected wave's parameters are $f_2$, $\lambda_2$, and $\boldsymbol{\sigma}_2$. The Lorentz transformation of the spatio-temporal coordinates from $(x', y', z', t')$ to $(x, y, z, t)$ yields the relationship between the incident and reflected waves in the $xyz$ frame. In the co-moving $x'y'z'$ frame, the complex amplitudes of the incident and reflected beams may be written as follows ($\boldsymbol{\sigma}'$ denotes $\boldsymbol{\sigma}_1'$ for the incident beam and $\boldsymbol{\sigma}_2'$ for the reflected beam):

$$a(x', y', z', t') = a_0' \exp[\mathrm{i}(2\pi f'/c)(\sigma_x' x' + \sigma_y' y' + \sigma_z' z' - ct')]. \tag{14.3}$$

Substituting $x' = x$, $y' = y$, $z' = (z - vt)/\sqrt{1 - v^2/c^2}$, $t' = (t - vz/c^2)/\sqrt{1 - v^2/c^2}$ in accordance with the Lorentz transformation yields

$$a(x, y, z, t) = a_0 \exp[\mathrm{i}(2\pi f/c)(\sigma_x x + \sigma_y y + \sigma_z z - ct)], \tag{14.4a}$$

where

$$f = f'\,[1 + (v/c)\sigma_z']\big/\sqrt{1 - v^2/c^2}, \tag{14.4b}$$

$$\boldsymbol{\sigma} = (\sigma_x, \sigma_y, \sigma_z) = \left(\sigma_x'\sqrt{1 - v^2/c^2},\; \sigma_y'\sqrt{1 - v^2/c^2},\; \sigma_z' + v/c\right)\big/[1 + (v/c)\sigma_z']. \tag{14.4c}$$

Figure 14.4 The counter-propagating beams of the Sagnac loop interfere with each other, setting up a standing wave pattern that remains stationary in the rotating frame. The wavelength everywhere within the rotating platform is $\lambda_0$, and the standing-wave fringes have a period of $\tfrac{1}{2}\lambda_0$. At any given point around the loop, the clockwise beam is delayed and the counter-clockwise beam is advanced, yielding a combined phase-shift of $\Delta\varphi = 2k_0\,\mathbf{A}\cdot\boldsymbol{\Omega}/c$. The phase shift results in a longitudinal translation of the fringe pattern by $\mathbf{A}\cdot\boldsymbol{\Omega}/c$.

Figure 14.5 In the $xyz$ frame, the mirror M moves with constant velocity $v$ along the z-axis. From the perspective of an observer in the $x'y'z'$ frame, the mirror is stationary, and the incident and reflected plane-waves have the same frequency $f'$ and propagation directions identified with unit vectors $\boldsymbol{\sigma}_1'$ and $\boldsymbol{\sigma}_2'$. In the $xyz$ frame, the incident wave has frequency $f_1$ and propagation direction $\boldsymbol{\sigma}_1$, while the reflected wave's parameters are $f_2$ and $\boldsymbol{\sigma}_2$.


It is not difficult to verify that $|\boldsymbol{\sigma}| = 1$. From Eq. (14.4c) one can compute $\sigma_z'$ as a function of $\sigma_z$, then substitute into Eq. (14.4b) to find:

$$f = f'\sqrt{1 - v^2/c^2}\big/(1 - \boldsymbol{\sigma}\cdot\mathbf{v}/c). \tag{14.5}$$

Here $\boldsymbol{\sigma}\cdot\mathbf{v}/c$ is an alternative expression for $(v/c)\sigma_z$. Consequently, in the $xyz$ frame, where the incident beam has frequency $f_1$ and propagation direction $\boldsymbol{\sigma}_1$, while the reflected beam has frequency $f_2$ and propagation direction $\boldsymbol{\sigma}_2$, we have

$$f_2/f_1 = (1 - \boldsymbol{\sigma}_1\cdot\mathbf{v}/c)/(1 - \boldsymbol{\sigma}_2\cdot\mathbf{v}/c). \tag{14.6}$$
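Equations (14.4b) through (14.6) lend themselves to a direct numerical check. The following sketch transforms an incident and a reflected plane-wave from the mirror's rest frame to the $xyz$ frame and confirms that the transformed direction is a unit vector, that Eq. (14.5) reproduces the frequency of Eq. (14.4b), and that the frequency ratio obeys Eq. (14.6). The mirror is assumed, for this example only, to lie in the $x'y'$ plane with a 30° angle of incidence; the function names and numerical values are illustrative.

```python
import numpy as np


def to_xyz_frame(f_prime, sigma_prime, beta):
    """Apply Eqs. (14.4b) and (14.4c); beta = v/c, with v along the z-axis."""
    root = np.sqrt(1.0 - beta**2)
    sx, sy, sz = sigma_prime
    denom = 1.0 + beta * sz
    f = f_prime * denom / root
    sigma = np.array([sx * root, sy * root, sz + beta]) / denom
    return f, sigma


def frequency_from_direction(f_prime, sigma, beta):
    """Eq. (14.5): frequency in the xyz frame in terms of the xyz-frame direction."""
    return f_prime * np.sqrt(1.0 - beta**2) / (1.0 - beta * sigma[2])


# Assumed example: mirror in the x'y' plane, 30-degree angle of incidence.
theta = np.radians(30.0)
sigma1_prime = np.array([np.sin(theta), 0.0, -np.cos(theta)])  # incident
sigma2_prime = np.array([np.sin(theta), 0.0, +np.cos(theta)])  # reflected

f_prime = 4.74e14   # frequency in the mirror's rest frame (Hz), illustrative
beta = 1.0e-3       # v/c, exaggerated so the shift is easy to see

f1, sigma1 = to_xyz_frame(f_prime, sigma1_prime, beta)
f2, sigma2 = to_xyz_frame(f_prime, sigma2_prime, beta)

ratio_direct = f2 / f1
ratio_eq_14_6 = (1.0 - beta * sigma1[2]) / (1.0 - beta * sigma2[2])
print(ratio_direct, ratio_eq_14_6)
```

Because Eq. (14.6) follows solely from the equality of the two frequencies in the rest frame of the reflector, the same check goes through for any other orientation of the mirror.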

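Returning now to the system depicted in Figure 14.3, where, for the beamsplitter and the mirrors, $\boldsymbol{\sigma}\cdot\mathbf{v} = -\boldsymbol{\sigma}\cdot(\mathbf{R}\times\boldsymbol{\Omega}) = -(\boldsymbol{\sigma}\times\mathbf{R})\cdot\boldsymbol{\Omega}$, we note that the magnitude of $\boldsymbol{\sigma}\times\mathbf{R}$ is simply the perpendicular distance from C to a straight line aligned with $\boldsymbol{\sigma}$. The vector $\boldsymbol{\sigma}\times\mathbf{R}$ is perpendicular to the plane of the interferometer, pointing either up or down depending on the direction of $\boldsymbol{\sigma}$. Thus we may write, for the clockwise path in Figure 14.3,

$$f_4/f_0 = (f_4/f_3)(f_3/f_2)(f_2/f_1)(f_1/f_0) = [1 + (\boldsymbol{\sigma}_0\times\mathbf{R}_3)\cdot\boldsymbol{\Omega}/c]\big/[1 + (\boldsymbol{\sigma}_4\times\mathbf{R}_3)\cdot\boldsymbol{\Omega}/c]. \tag{14.7}$$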

In the final analysis, therefore, the ratio of the emergent frequency $f_4$ to the source frequency $f_0$ is a function of the perpendicular distance from C to the incidence vector $\boldsymbol{\sigma}_0$ as well as that from C to the emergent vector $\boldsymbol{\sigma}_4$. We emphasize that $f_4$ and $f_0$ appear to be different only to a stationary observer outside the rotating system; a comparison of Eq. (14.7) with Eq. (14.5) clearly indicates that, from the perspective of a co-rotating observer, the frequency at the observation plane is the same as the source frequency.

In the counter-clockwise direction the beam is transmitted twice through the splitter S; neither passage introduces any Doppler shifts, as the propagation direction remains unaltered before and after transmission through a parallel-plate slab (in other words, $\boldsymbol{\sigma}_1 = \boldsymbol{\sigma}_2$ in conjunction with Eq. (14.6) immediately implies that $f_1 = f_2$). The Doppler shifts produced by reflection from the moving mirrors M1 and M2, however, need to be taken into account. An analysis similar to the one that led to Eq. (14.7) then reveals that, for the counter-clockwise path, the ratio of the emergent frequency $f_4$ to the source frequency $f_0$ is the same as that for the clockwise path. Therefore, at the observation plane, the emerging clockwise and counter-clockwise beams will have the same frequency $f_4$.

We now derive Sagnac's fundamental equation, Eq. (14.2), from the perspective of an observer outside the rotating platform, an observer residing in an inertial frame in which the rotation axis is stationary. Our alternative derivation


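relies on the Doppler-shifted frequencies in the three arms of the loop depicted in Figure 14.3. When the inertial observer considers the clockwise path at a frozen instant in time, the Doppler-shifted frequencies $f_1$, $f_2$, $f_3$ in the three arms of the loop yield the accumulated phase as follows:

$$\begin{aligned}
\Delta\varphi_1 + \Delta\varphi_2 + \Delta\varphi_3 &= 2\pi(f_1 r_1 + f_2 r_2 + f_3 r_3)/c \\
&= (2\pi f_0/c)\,[(f_1/f_0)r_1 + (f_2/f_0)r_2 + (f_3/f_0)r_3] \\
&= (2\pi f_0/c)\,[1 + (\boldsymbol{\sigma}_0\times\mathbf{R}_3)\cdot\boldsymbol{\Omega}/c] \\
&\quad\times\left\{\frac{r_1}{1 + (\boldsymbol{\sigma}_1\times\mathbf{R}_3)\cdot\boldsymbol{\Omega}/c} + \frac{r_2}{1 + (\boldsymbol{\sigma}_2\times\mathbf{R}_1)\cdot\boldsymbol{\Omega}/c} + \frac{r_3}{1 + (\boldsymbol{\sigma}_3\times\mathbf{R}_2)\cdot\boldsymbol{\Omega}/c}\right\} \\
&\approx (2\pi f_0'/c)\,\{(r_1 + r_2 + r_3) - [(\mathbf{r}_1\times\mathbf{R}_3) + (\mathbf{r}_2\times\mathbf{R}_1) + (\mathbf{r}_3\times\mathbf{R}_2)]\cdot\boldsymbol{\Omega}/c\} \\
&= k_0'(r_1 + r_2 + r_3) - 2k_0'\,\mathbf{A}\cdot\boldsymbol{\Omega}/c.
\end{aligned} \tag{14.8}$$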

In the above derivation we have used Eq. (14.5) to relate $f_0$ to $f_0'$, the frequency of the source within the rotating system, while ignoring terms of the order $(v/c)^2$ and higher. A similar treatment of the counter-clockwise path leads to the same result as in Eq. (14.8), except for the minus sign between the two terms being replaced by a plus sign. Thus the net phase-shift $\Delta\varphi$ between the counter-propagating beams is, once again, given by Eq. (14.2). Note that $k_0'$ in Eq. (14.8) is the same as $k_0$ in Eq. (14.2), both symbols representing the light source's wave-number as measured on the rotating platform.

Finally, we must show that the counter-propagating beams in each arm of the Sagnac interferometer produce running fringes that travel with the velocity of the arm itself; this would corroborate our earlier assertion that the fringes co-rotate with the platform. For the sake of concreteness, we denote by $f_2^{+}$ and $f_2^{-}$, respectively, the clockwise and counter-clockwise frequencies in the arm located between the mirrors M1 and M2. (A similar analysis applies to the other arms as well.) In analogy with Eq. (14.7), we write

$$f_2^{\pm}/f_0 = [1 + (\boldsymbol{\sigma}_0\times\mathbf{R}_3)\cdot\boldsymbol{\Omega}/c]\big/[1 \pm (\boldsymbol{\sigma}_2\times\mathbf{R}_1)\cdot\boldsymbol{\Omega}/c]. \tag{14.9}$$

Again, relating $f_0$ to $f_0'$ through Eq. (14.5) and ignoring terms of the order $(v/c)^2$ and higher yields

$$f_2^{\pm} \approx [1 \mp (\boldsymbol{\sigma}_2\times\mathbf{R}_1)\cdot\boldsymbol{\Omega}/c]\,f_0'. \tag{14.10}$$


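Since $\boldsymbol{\sigma}_2\times\mathbf{R}_1$ is a vector perpendicular to the loop, having the magnitude of the perpendicular distance from C to $r_2$, one may replace $\mathbf{R}_1$ in Eq. (14.10) with any vector $\mathbf{R}$ from C to an arbitrary point along the straight line that connects the centers of M1 and M2. The scalar product $(\boldsymbol{\sigma}_2\times\mathbf{R}_1)\cdot\boldsymbol{\Omega}$ is thus equal to $-v_2 = -\boldsymbol{\sigma}_2\cdot(\boldsymbol{\Omega}\times\mathbf{R})$, where $v_2$ is the linear velocity (along $\boldsymbol{\sigma}_2$) of each and every point that resides on the path from the center of M1 to the center of M2. We have

$$f_2^{\pm} \approx [1 \pm (v_2/c)]\,f_0'. \tag{14.11}$$

In the region between M1 and M2 the counter-propagating beam amplitudes are $a_0\cos(\omega_2^{+}t - k_2^{+}x + \varphi_2^{+})$ and $a_0\cos(\omega_2^{-}t + k_2^{-}x + \varphi_2^{-})$; here $\omega_2^{\pm} = 2\pi f_2^{\pm}$, $k_2^{\pm} = \omega_2^{\pm}/c$, $\varphi_2^{+} - \varphi_2^{-}$ is the relative phase, and $x$ is the distance from the center of M1 along the unit vector $\boldsymbol{\sigma}_2$. The total intensity $I(x, t)$ in this region may thus be written

$$I(x, t) = a_0^2\left\langle[\cos(\omega_2^{+}t - k_2^{+}x + \varphi_2^{+}) + \cos(\omega_2^{-}t + k_2^{-}x + \varphi_2^{-})]^2\right\rangle, \tag{14.12}$$

where the angle brackets represent time-averaging to eliminate rapid oscillations. We have

$$\begin{aligned}
I(x, t) &\approx a_0^2\{1 + \cos[(\omega_2^{+} - \omega_2^{-})t - (k_2^{+} + k_2^{-})x + (\varphi_2^{+} - \varphi_2^{-})]\} \\
&\approx a_0^2\{1 + \cos[4\pi(f_0'/c)(x - v_2 t) - (\varphi_2^{+} - \varphi_2^{-})]\}.
\end{aligned} \tag{14.13}$$

The running fringes thus have a period of $\tfrac{1}{2}\lambda_0$ and travel with velocity $v_2$ in the $\boldsymbol{\sigma}_2$ direction.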
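The fringe velocity and period can be read off from the beat between the two Doppler-shifted frequencies of Eq. (14.11). The short sketch below computes the speed $(\omega_2^{+} - \omega_2^{-})/(k_2^{+} + k_2^{-})$ and the period $2\pi/(k_2^{+} + k_2^{-})$ of the pattern for an assumed arm velocity of 1 m/s and a 0.633 μm source; the function name and the numerical values are illustrative.

```python
import math

C = 299_792_458.0  # speed of light in vacuum (m/s)


def running_fringes(f0_prime, v2):
    """Return (fringe velocity, fringe period) from Eqs. (14.11) and (14.13)."""
    f_plus = (1.0 + v2 / C) * f0_prime    # beam co-propagating with the arm
    f_minus = (1.0 - v2 / C) * f0_prime   # counter-propagating beam
    omega_diff = 2.0 * math.pi * (f_plus - f_minus)
    k_sum = 2.0 * math.pi * (f_plus + f_minus) / C
    return omega_diff / k_sum, 2.0 * math.pi / k_sum


wavelength = 0.633e-6           # source wavelength on the platform (m)
f0_prime = C / wavelength       # corresponding frequency (Hz)
velocity, period = running_fringes(f0_prime, v2=1.0)
print(velocity, period)
```

To within rounding error, the pattern moves at the assumed arm velocity and repeats every half wavelength, which is what a co-rotating observer would describe as a stationary standing wave.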

The effect of a co-rotating dielectric medium

With reference to Figure 14.6, a transparent dielectric slab of thickness $L$ moves with constant velocity $v$ along the z-axis in the $xyz$ frame. From the perspective of an observer in the $x'y'z'$ frame, which also moves with velocity $v$ along $z$, the slab is stationary, and the incident, intermediate, and transmitted (or emergent) plane-waves all have the same frequency $f'$ as well as identical propagation directions $\boldsymbol{\sigma}' = \boldsymbol{\sigma}_1' = \boldsymbol{\sigma}_2'$. (The entrance and exit facets of the slab are perpendicular to the common propagation direction.) We denote by $n'$ the refractive index of the (stationary) slab at $f = f'$. In the $xyz$ frame, the incident and transmitted plane-waves have frequency $f_1$, wavelength $\lambda_1 = c/f_1$, and propagation direction $\boldsymbol{\sigma}_1$. (The beam parameters inside the dielectric, $f_2$, $\lambda_2 = c/(n_2 f_2)$, and $\boldsymbol{\sigma}_2$ will not enter the following analysis.)

Figure 14.6 In the $xyz$ frame, a transparent dielectric slab of thickness $L$ moves with constant velocity $v$ along the z-axis. From the perspective of an observer in the $x'y'z'$ frame, the slab is stationary, and the incident, intermediate, and transmitted plane-waves all have the same frequency $f'$ and (identical) propagation directions $\boldsymbol{\sigma}'$. In the $xyz$ frame, both the incident and transmitted waves have frequency $f_1$ and propagation direction $\boldsymbol{\sigma}_1$, while the beam parameters inside the dielectric are $f_2$ and $\boldsymbol{\sigma}_2$.

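The Lorentz transformation of the spatio-temporal coordinates from $(x', y', z', t')$ to $(x, y, z, t)$ yields the relationship between the various plane-waves in the $xyz$ and $x'y'z'$ systems. In particular, one can readily show that

$$f' = (1 - \boldsymbol{\sigma}_1\cdot\mathbf{v}/c)\,f_1\big/\sqrt{1 - v^2/c^2} \approx (1 - \boldsymbol{\sigma}_1\cdot\mathbf{v}/c)\,f_1, \tag{14.14a}$$

$$\boldsymbol{\sigma}' = \left(\sigma_{x1}\sqrt{1 - v^2/c^2},\; \sigma_{y1}\sqrt{1 - v^2/c^2},\; \sigma_{z1} - v/c\right)\big/(1 - \boldsymbol{\sigma}_1\cdot\mathbf{v}/c) \approx (\boldsymbol{\sigma}_1 - \mathbf{v}/c)/(1 - \boldsymbol{\sigma}_1\cdot\mathbf{v}/c). \tag{14.14b}$$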

Let the entrance facet of the slab at $t = t' = 0$ be centered at the (coincident) origin of the two coordinate systems, namely, $(x_0, y_0, z_0) = (x_0', y_0', z_0') = (0, 0, 0)$. The reference phase of the plane-wave entering the slab is thus zero in both coordinate systems. In the absence of any movement, i.e., when $v = 0$, the frequency $f_1$ and the propagation direction $\boldsymbol{\sigma}_1$ of the beam are the same everywhere, inside and outside the slab. Moreover, the propagation direction $\boldsymbol{\sigma}_1$ may be assumed to be perpendicular to the entrance and exit facets, so that at the center $\mathbf{r}_1 = (x_1, y_1, z_1)$ of the exit facet, $\boldsymbol{\sigma}_1\cdot\mathbf{r}_1 = L$. Denoting by $n_1$ the refractive index of the stationary slab at $f = f_1$, one can write the phase $\varphi_{1s}$ of the emergent beam at

14 The Sagnac interferometer

193

the center of the exit facet (relative to the phase at the center of the entrance facet) as follows: 1s ¼ 2pn1 f1 L=c:

ð14:15Þ

With the slab traveling at a constant velocity v, the frequency f′ and the propagation direction σ′ inside the slab (within the co-moving x′y′z′ frame) determine the phase at the center of the exit facet located at r1′ = Lσ1 (relative to that at the center of the entrance facet) as φ1′ = 2πn′f′Lσ1 · σ′/c. The emergent beam may thus be written as

$$a(\mathbf{r}', t') = a_0' \exp(i\phi_0') \exp[i(2\pi f'/c)(\boldsymbol{\sigma}' \cdot \mathbf{r}' - ct')], \tag{14.16a}$$

where

$$\phi_0' = 2\pi (n' - 1) f' L\, \boldsymbol{\sigma}_1 \cdot \boldsymbol{\sigma}'/c \approx (2\pi f_1 L/c)(n' - 1)(1 - \boldsymbol{\sigma}_1 \cdot \mathbf{v}/c). \tag{14.16b}$$

We denote the (group) index of refraction at f = f′ by ng, corresponding to the group velocity of a light pulse inside the stationary slab within the moving x′y′z′ frame, namely,

$$\begin{aligned} n_g(f') &= n(f') + (dn/df)\, f' \approx n' + (n_1 - n')\, f'/(f_1 - f') \\ &\approx (2n' - n_1) + (n_1 - n')/(\boldsymbol{\sigma}_1 \cdot \mathbf{v}/c) \\ &\approx n' + (n_1 - n')/(\boldsymbol{\sigma}_1 \cdot \mathbf{v}/c). \end{aligned} \tag{14.17}$$

In the last line of Eq. (14.17), the near-equality of n′ and n1 has been used to substitute n′ for (2n′ − n1); however, the same approximation cannot be applied to the second term, because of the division by the small quantity σ1 · v/c. When a short pulse launched at the entrance facet at t′ = 0 reaches the slab’s exit facet at t′ = ngL/c, the center of the exit facet, located at r1′ = Lσ1 in the x′y′z′ system, will have reached the point r1 in the xyz frame. Using the Lorentz transformation, we find

$$\mathbf{r}_1 = \left[ L\sigma_{x1},\; L\sigma_{y1},\; (L\sigma_{z1} + v n_g L/c)/\sqrt{1 - v^2/c^2} \right] \approx L(\boldsymbol{\sigma}_1 + n_g \mathbf{v}/c). \tag{14.18}$$

Therefore, in the xyz system, the phase of the emergent beam at the center of the exit facet will be

$$\begin{aligned} \phi_1 = \phi_0' + (2\pi f_1/c)\, \boldsymbol{\sigma}_1 \cdot \mathbf{r}_1 &\approx (2\pi f_1 L/c)\left[ (n' - 1)(1 - \boldsymbol{\sigma}_1 \cdot \mathbf{v}/c) + (1 + n_g\, \boldsymbol{\sigma}_1 \cdot \mathbf{v}/c) \right] \\ &\approx (2\pi f_1 L/c)(n_1 + \boldsymbol{\sigma}_1 \cdot \mathbf{v}/c). \end{aligned} \tag{14.19}$$

The difference between the above phase and that obtained in Eq. (14.15) for a stationary slab is

$$\phi_1 - \phi_{1s} \approx (2\pi f_1 L/c)\, \boldsymbol{\sigma}_1 \cdot \mathbf{v}/c. \tag{14.20}$$
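The chain of approximations leading to Eq. (14.20) can be checked numerically. The following is a minimal sketch, not part of the original analysis: it evaluates the un-approximated forms of Eqs. (14.14)–(14.19) for a slab moving along z and compares the result with the first-order prediction. The linear dispersion model, the slab thickness, the speed, and all function names are assumptions chosen only for illustration.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def moving_slab_phase_shift(n_of_f, dn_df, f1, L, sigma1, v_z):
    """Return (phi1 - phi1s, first-order prediction) for a slab moving along z."""
    beta = v_z / C
    g = math.sqrt(1.0 - beta * beta)              # sqrt(1 - v^2/c^2)
    sb = sigma1[2] * beta                         # sigma1 . v / c
    f_p = (1.0 - sb) * f1 / g                     # Eq. (14.14a), exact form
    sigma_p = (sigma1[0] * g / (1.0 - sb),        # Eq. (14.14b), exact form
               sigma1[1] * g / (1.0 - sb),
               (sigma1[2] - beta) / (1.0 - sb))
    n_p, n1 = n_of_f(f_p), n_of_f(f1)
    n_g = n_p + dn_df(f_p) * f_p                  # group index, Eq. (14.17)
    phi0_p = 2 * math.pi * (n_p - 1.0) * f_p * L * dot(sigma1, sigma_p) / C
    r1 = (L * sigma1[0], L * sigma1[1],
          (L * sigma1[2] + beta * n_g * L) / g)   # Eq. (14.18), exact form
    phi1 = phi0_p + (2 * math.pi * f1 / C) * dot(sigma1, r1)   # Eq. (14.19)
    phi1s = 2 * math.pi * n1 * f1 * L / C         # Eq. (14.15)
    predicted = (2 * math.pi * f1 * L / C) * sb   # Eq. (14.20)
    return phi1 - phi1s, predicted


def make_linear_index(n_ref, slope, f_ref):
    """Hypothetical dispersion model n(f) = n_ref + slope * (f - f_ref)."""
    return (lambda f: n_ref + slope * (f - f_ref)), (lambda f: slope)


f1 = C / 633e-9                                   # assumed optical frequency
sigma1 = (math.sin(0.3), 0.0, math.cos(0.3))      # assumed unit propagation vector
for n_ref in (1.5, 2.0):
    n_of_f, dn_df = make_linear_index(n_ref, 1e-16, f1)
    shift, predicted = moving_slab_phase_shift(n_of_f, dn_df, f1, 0.01, sigma1, 3000.0)
    print(f"n = {n_ref}: phi1 - phi1s = {shift:.6f} rad, Eq. (14.20) gives {predicted:.6f} rad")
```

For these assumed parameters the two refractive indices give essentially the same phase shift, with differences only at second order in v/c.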

Note that the above expression for the phase difference between a rotating Sagnac interferometer (with co-rotating slab) and a stationary one, both computed at the exit facet of the slab (relative to the entrance facet), is independent of the slab’s refractive index. The expression for φ1 − φ1s in Eq. (14.20) is the same as that which would have been obtained had the beam traveled from the entrance to the exit facet in free space (rather than in a dielectric of refractive index n). We have thus proven that the presence of co-moving dielectric media within one or more arms of the device will not alter the fundamental formula, Eq. (14.2), of the Sagnac interferometer.

The laser gyroscope

Figure 14.7 shows a diagram of a ring laser gyroscope. The co-rotating gain medium is now an integral part of the Sagnac loop, and the beam-splitter S is oriented in such a way as to act as a semi-transparent mirror for the ring laser. The beam traveling clockwise around the loop emerges from the splitter S along the direction σ0, and is subsequently directed toward the photodetector array. Similarly, the counterclockwise-traveling beam emerges along σ4, and is directed toward the detector. When the loop is stationary (i.e., Ω = 0), the clockwise and counterclockwise laser modes have identical frequencies f0. Denoting by ℓ the perimeter of the loop, and assuming that the entire loop is uniformly filled with a gain medium of refractive index n0, an integer number of wavelengths λ = c/(n0 f0) must fit within the loop, that is, ℓ/λ = n0 f0 ℓ/c = m, where m is an integer. Any rotation of the loop will introduce a phase shift in the clockwise path, causing the lasing frequency of the clockwise mode to drift from f0 to f′₊. A nearly equal but opposite phase shift in the counter-clockwise path will cause the lasing frequency of the counter-clockwise mode to drift from f0 to f′₋. Note that f′₊ and f′₋ are the frequencies measured by the co-rotating observer.
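Assuming the frequency shifts are not large enough to modify the mode number m, one may write:

$$\begin{aligned} 2\pi m = 2\pi n_0 f_0 \ell/c &= (2\pi n_+ f'_+ \ell/c) - (4\pi f'_+/c)\, \mathbf{A} \cdot \boldsymbol{\Omega}/c \\ &= (2\pi n_- f'_- \ell/c) + (4\pi f'_-/c)\, \mathbf{A} \cdot \boldsymbol{\Omega}/c. \end{aligned} \tag{14.21}$$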


Figure 14.7 An active Sagnac gyroscope is a ring laser whose gain medium is placed within one or more arms of a Sagnac interferometer. The beam-splitter S is now re-oriented to allow the counter-propagating beams to circulate around the loop. Fractions of the clockwise and counter-clockwise modes emerge from the cavity along σ0 and σ4, respectively. When the platform is stationary, the two modes have the same frequency f0; the mode frequencies, however, drift in opposite directions when Ω ≠ 0. The two modes are brought together on an observation plane to form an interference pattern with running fringes. A photodetector array picks up the beat frequency Δf between the modes, which is proportional to the platform’s angular frequency Ω. The detector array is also capable of detecting the sign of Ω by monitoring the direction of travel of the fringes.

For sufficiently small frequency shifts (i.e., Δf = f′₊ − f′₋ ≪ f0), the refractive indices n± = n(f′±) will be nearly the same as n0, and the average frequency f = ½(f′₊ + f′₋) will be essentially equal to f0. Therefore,

$$\Delta f/f \approx (f'_+ - f'_-)/f_0 \approx 4\mathbf{A} \cdot \boldsymbol{\Omega}/(n_0 \ell c). \tag{14.22}$$

When the two beams emerging from the laser cavity are brought together on a photodetector array, as in Figure 14.7, they produce a pattern of running fringes. The beat frequency Δf then yields the magnitude of Ω, while its sign is determined by the direction of fringe travel.
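To get a feel for the magnitudes involved in Eq. (14.22), the sketch below evaluates the beat frequency for a hypothetical square ring laser. The loop size, the choice of Earth's rotation rate, the vacuum-filled loop (n0 = 1), and the function name are all assumptions made for illustration; the rotation axis is taken to be parallel to the area vector A.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def sagnac_beat_frequency(area, perimeter, omega, vacuum_wavelength, n0=1.0):
    """Beat frequency in Hz from Eq. (14.22): delta_f = 4 A Omega f0 / (n0 * l * c).

    Assumes the rotation vector is parallel to the loop's area vector.
    """
    f0 = C / vacuum_wavelength
    return 4.0 * area * omega * f0 / (n0 * perimeter * C)


side = 0.10              # assumed: square loop, 10 cm on a side
earth_rate = 7.292e-5    # rad/s, approximate rotation rate of the Earth
beat = sagnac_beat_frequency(side ** 2, 4 * side, earth_rate, 633e-9)
print(f"beat frequency: {beat:.2f} Hz")
```

Under these assumptions the beat frequency comes out at roughly ten hertz, many orders of magnitude below the optical frequency f0, which is why it is read out as a slow drift of the fringe pattern.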


References for Chapter 14
1 G. Sagnac, Comptes Rendus de l’Académie des Sciences (Paris) 157, 708–710, 1410–1413 (1913).
2 H. Ives, JOSA 28, 296–299 (1938).
3 E. J. Post, Sagnac effect, Rev. Mod. Phys. 39, 475–493 (1967).
4 H. Lefèvre, The Fiber-Optic Gyroscope, Artech House, Boston, 1993.

15 Fabry–Pérot etalons in polarized light

The principles of operation of Fabry–Pérot interferometers are well known, and their application in spectroscopy has established their status as one of the most sensitive instruments ever invented.1,2 However, the behavior of a Fabry–Pérot device in polarized light, especially when birefringence and optical activity are present within the mirrors or in the cavity, is less well known. We devote this chapter to a description of some of these phenomena, in the hope of clarifying their physical origins and perhaps suggesting some new applications.

The dielectric mirror

A multilayer stack mirror is shown schematically in Figure 15.1. The substrate is a transparent slab of glass, and the layers are made of high- and low-index dielectric materials.3 In the examples used in this chapter the low-index layers will have (n, k) = (1.5, 0) and thickness d = 105.5 nm, and the high-index layers will have (n, k) = (2, 0) and d = 79.125 nm. (At the operating wavelength, 633 nm, both these layers will be one quarter-wave thick.) Figure 15.2 shows computed plots of amplitude and phase for the reflection coefficients of a 10-layer mirror. At normal incidence (θ = 0°) both p- and s-components of polarization have an amplitude reflectivity |ρ| = 0.844. The mirror, therefore, reflects about 71% of the incident optical power and transmits the remaining 29%. Ignoring any loss of light at the uncoated facet of the substrate, the amplitude transmission coefficient (outside the substrate) turns out to be |τ| = 0.536. At larger angles of incidence both the amplitude and phase for p- and s-light begin to deviate from their normal-incidence values and from each other, but we are not concerned with these variations here. What is important is to note that, at small angles of incidence (say up to 30°), the reflectivity remains high.
Let us also mention in passing that, when light shines on a dielectric mirror through its substrate, the reflection and transmission coefficients ρ′ and τ′ generally retain the same amplitudes as above, but their phases will differ from those of ρ and τ.
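The quoted normal-incidence reflectivity can be reproduced with the standard characteristic-matrix method for thin-film stacks. The sketch below is an illustration under stated assumptions rather than the program used to generate the book's figures: it handles normal incidence only, takes air (n = 1) as the medium of incidence and n = 1.5 for the substrate, and uses one common sign convention for the layer matrices, so only the magnitude of the result should be compared with the text. The function names are hypothetical.

```python
import math


def stack_reflection(layers, wavelength, n_inc=1.0, n_sub=1.5):
    """Amplitude reflection coefficient of a lossless stack at normal incidence.

    layers: list of (refractive index, thickness) ordered from the incidence
    side down to the substrate; thickness in the same units as wavelength.
    """
    m11, m12, m21, m22 = 1.0 + 0j, 0j, 0j, 1.0 + 0j
    for n, d in layers:
        delta = 2.0 * math.pi * n * d / wavelength       # phase thickness
        c, s = math.cos(delta), math.sin(delta)
        a11, a12, a21, a22 = c, 1j * s / n, 1j * n * s, c
        # multiply the running product by this layer's matrix (on the right)
        m11, m12, m21, m22 = (m11 * a11 + m12 * a21, m11 * a12 + m12 * a22,
                              m21 * a11 + m22 * a21, m21 * a12 + m22 * a22)
    b = m11 + m12 * n_sub
    c_ = m21 + m22 * n_sub
    return (n_inc * b - c_) / (n_inc * b + c_)


def quarter_wave_mirror(num_layers):
    """Layer list for the mirror of Figure 15.1 (thicknesses in nm).

    Odd-numbered layers (layer 1 sits on the substrate) have index 2;
    even-numbered layers have index 1.5. The list runs from the top layer down.
    """
    low, high = (1.5, 105.5), (2.0, 79.125)
    return [high if k % 2 == 1 else low for k in range(num_layers, 0, -1)]


for count in (10, 20):
    rho = stack_reflection(quarter_wave_mirror(count), 633.0)
    print(f"{count} layers: |rho| = {abs(rho):.4f}, power reflectivity = {abs(rho) ** 2:.3f}")
```

With 10 layers the magnitude agrees with the value 0.844 quoted above; with 20 layers it agrees with the value 0.9905 used later in the chapter for the birefringent-mirror example.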


Figure 15.1 A multilayer dielectric mirror and a plane wave at oblique incidence. In the examples used in this chapter, the substrate refractive index n is 1.5, the odd-numbered layers have an index of 2 and are 79.125 nm thick, and the even-numbered layers have an index of 1.5 and are 105.5 nm thick. At the design wavelength, λ = 633 nm, these layer thicknesses correspond to one-quarter of the wavelength.

[Figure 15.2 plot axes: (a) amplitude reflection coefficients for p- and s-light (0 to 1) and (b) phases φp and φs of the reflection coefficients (−180° to 0°), each versus the angle of incidence θ (0° to 90°).]

Figure 15.2 Computed plots of (a) amplitude and (b) phase of the reflection coefficients of a dielectric mirror for p- and s-polarized beams versus the angle of incidence. The assumed mirror is as shown in Figure 15.1, with a total of 10 layers; the medium of incidence is air.

The Fabry–Pérot etalon

Figure 15.3 shows the schematic of a Fabry–Pérot etalon. Two dielectric mirrors, separated by an air gap, are placed face-to-face and parallel to each other. A plane monochromatic beam is shown at oblique incidence θ on one of the mirrors.


Figure 15.3 A Fabry–Pérot etalon consists of two face-to-face dielectric mirrors separated by an air-gap. Also shown is a plane wave at an oblique incidence angle θ.

(In practice the uncoated facets of both substrates are given a slight wedge to eliminate spurious reflections.) For the system of Figure 15.3 the computed plots of reflection amplitude and phase versus θ are shown in Figure 15.4, for mirrors with 10 dielectric layers each, and an air gap 8.229 μm wide, which is exactly 13λ. Note that, within the 0° to 30° range of angles of incidence depicted, the p- and s-reflectivities are nearly the same. Sharp drops in the etalon’s reflectivity occur at θ = 0°, 15.37°, 21.83°, and 26.85°; at these angles (d/λ) cos θ = 13, 12.53, 12.07, and 11.6, respectively. In other words, when the effective gap-width is an integer multiple of a half-wavelength, the etalon becomes transparent to the incident light. To be sure, there are slight deviations from exact half-wavelength multiplicity here, which have to do with the θ-dependence of the phase of the individual mirror reflectivities (see Figure 15.2(b)), but, for our purposes, these differences are small and may be ignored.

Next, we study the setup of Figure 15.5, which is designed to send a focused beam of light onto an etalon and to analyze the resulting reflection. The setup includes a path for a reference beam, so that Twyman–Green interferometry may be used to reveal the reflected phase pattern. It also includes a polarizer before the observation plane to allow selection of the polarization direction of interest. Figure 15.6 shows computed plots of intensity at the observation plane obtained under various conditions. Frames (a) and (b) are obtained when the reference beam is blocked, whereas frames (c) and (d) are interferograms obtained in the presence of the reference beam. The circular area in each frame represents the aperture of the objective lens (NA = 0.5).
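Returning to the reflectivity minima quoted for Figure 15.4, the arithmetic behind the values of (d/λ) cos θ is easy to reproduce. The short sketch below is only an illustration (the function names are hypothetical): it evaluates the effective gap at the quoted angles and, for comparison, the angles at which the effective gap would be an exact multiple of λ/2 if the mirror phases did not depend on θ.

```python
import math

GAP_IN_WAVELENGTHS = 13.0  # d / lambda for the 8.229 um gap at 633 nm


def effective_gap(theta_deg):
    """Effective gap-width (d / lambda) * cos(theta), in wavelengths."""
    return GAP_IN_WAVELENGTHS * math.cos(math.radians(theta_deg))


def ideal_resonance_angle(half_waves):
    """Angle (degrees) at which the effective gap equals half_waves * lambda / 2."""
    return math.degrees(math.acos(half_waves / (2.0 * GAP_IN_WAVELENGTHS)))


for theta in (0.0, 15.37, 21.83, 26.85):
    print(f"theta = {theta:5.2f} deg -> (d/lambda) cos(theta) = {effective_gap(theta):.2f}")

for m in (26, 25, 24, 23):
    print(f"m = {m}: ideal resonance angle = {ideal_resonance_angle(m):.2f} deg")
```

The ideal angles come out slightly larger than the computed minima, which is the small deviation from exact half-wavelength multiplicity mentioned in the text.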

[Figure 15.4 plot axes: (a) amplitude reflection coefficients |rp| and |rs| (0 to 1) and (b) phases φp and φs of the reflection coefficients (−180° to 180°), each versus the angle of incidence θ (0° to 30°).]

Figure 15.4 Computed plots of (a) amplitude and (b) phase of the reflection coefficients of a Fabry–Pérot etalon for p- and s-polarized beams versus the angle of incidence. The assumed etalon is that shown in Figure 15.3 with 10-layer mirrors and an 8.229 μm gap.

In (a) the polarizer is taken to transmit the same direction of polarization as that of the incident beam. The dark rings correspond to the angles of incidence at which the reflectivity plots of Figure 15.4(a) exhibit their minima. In frame (b) the polarizer is turned by 90° so that only a small fraction of the light (about 3 × 10⁻⁴ of the original incident power) passes through to the observation plane. The four corners of this distribution correspond to the four corners of the focused cone of light, which have a mix of p- and s-polarization. Here, the rays incident on the etalon are subject to slightly different reflectivities in their p- and s-components (see Figure 15.4(a)), which gives rise to a small rotation of polarization from its original direction. It is this polarization rotation in the four corners of the lens that is responsible for the four corners of the intensity distribution in Figure 15.6(b). Frames (c) and (d) of Figure 15.6 are obtained by unblocking the reference arm in the system of Figure 15.5, thus allowing the interference pattern (between the beam reflected from the etalon and that reflected from the reference mirror) to impinge on the observation plane. The case for the parallel component of polarization depicted in (c) shows the phase of the pattern to be more or less uniform over the entire aperture; in particular, it shows that there are no phase jumps between adjacent rings. The case for the perpendicular component of polarization depicted in (d) shows a 180° phase shift between adjacent corners. This is caused by the fact that, in adjacent corners of the lens, the polarization vector rotates in opposite directions.

Figure 15.5 Schematic diagram of a system used for observing Fabry–Pérot fringes. A linearly polarized beam of light is focused on an etalon, which reflects the beam and sends it back through the system. The recollimated beam may then be viewed after passing through a polarizer. By setting the transmission axis of the polarizer perpendicular to the direction of incident polarization, one may observe regions of the beam which have suffered a small (but measurable) polarization rotation. A Twyman–Green interferometer is also incorporated into the system for observing the reflected phase pattern. The neutral-density filter is needed to adjust the amplitude of the reference beam in order to obtain high-contrast interferograms. A 45° rotation of the quarter-wave plate around the optical axis causes a 90° rotation of the reference beam’s polarization; this is needed when the polarizer’s transmission axis is set perpendicular to the direction of incident polarization.

Figure 15.6 Computed plots of intensity distribution at the observation plane of the system of Figure 15.5. Frame (a) is obtained when the reference beam is blocked and the polarizer is set to transmit the direction of incident polarization. In the case of frame (b), the reference beam is still blocked, but the polarizer is rotated by 90°. The logarithm of the intensity distribution is plotted here in order to enhance the weak regions of the pattern; this is similar to over-exposing a photographic film placed at the observation plane. Frames (c) and (d) show the corresponding interference patterns obtained when the reference beam is unblocked. In the case of frame (d) the polarizer is rotated by 90° and the quarter-wave plate by 45°, a strong neutral-density filter is used to attenuate the reference beam substantially, and again the logarithm of the intensity distribution is plotted to enhance weak regions.

Mirror birefringence

Next we consider the effects of birefringence in the mirrors of the Fabry–Pérot etalon.4,5 For this analysis we assume that the mirrors have 20 layers each (|ρ| = 0.9905, |τ| = 0.1375) and that the uppermost layer of both mirrors is slightly birefringent. We will suppose that the uppermost layer has a nominal index of 1.5, except along the Y-axis (see Figure 15.1) where the index is 1.505. We also assume a normally incident plane wave and an adjustable gap-width. For this etalon the computed transmission coefficients and the polarization state of the transmitted light versus the gap-width are plotted in Figure 15.7. Two different peaks are observed in transmission, one for the p-polarized, the other for the s-polarized incident beam. (The E-field of the p-light is parallel to X, while that of the s-light is parallel to Y.) The peak separation arises because the mirrors give a slightly different phase upon reflection to the two components of polarization. The gap-width, therefore, must be adjusted to compensate for this phase difference. If the incident beam is linearly polarized at 45° (i.e., halfway between p and s), the transmitted beam will show the rotation angle ψ and the ellipticity ξ depicted in Figure 15.7(b). The maximum ellipticity is close to 40°, which shows that the transmitted light at this point is nearly circularly polarized. A very small amount of birefringence in the mirrors can, therefore, have substantial effects on the polarization state of the transmitted (or reflected) beam.

[Figure 15.7 plot axes: (a) amplitude transmission coefficients |tp| and |ts| (0 to 1) and (b) polarization rotation ψ and ellipticity ξ (degrees), each versus gap width (8210 nm to 8250 nm).]

Figure 15.7 (a) Computed amplitude transmission coefficients and (b) the state of transmitted polarization plotted versus the gap-width for a normally incident beam on the Fabry–Pérot etalon of Figure 15.3. The mirrors are assumed to have 20 layers each and, for both mirrors, the uppermost layer is assumed to be birefringent. With reference to Figure 15.1, the refractive indices of the top layer along the X-, Y-, and Z-axes are 1.500, 1.505, and 1.500, respectively. The normally incident beam is linearly polarized at 45° to the X-axis. In (b) the polarization rotation angle, ψ, is also referred to the X-axis. By definition, the polarization ellipticity ξ is the arctangent of the ratio of the minor axis of the ellipse of polarization to its major axis. Thus ξ = 0° corresponds to linear polarization, whereas ξ = 45° represents circular polarization.

In practice, if the mirrors are known to have the same amount of birefringence, these problems can be avoided by rotating one of the mirrors by 90° relative to the other. Also, it might be of some interest to note that birefringence of the top layer poses the most serious problem for the Fabry–Pérot etalons. In our calculations, the effects diminished as we moved the birefringent layer down the stack (closer to the substrate). By the time the birefringence is moved to layer 14 of both mirrors, its effects are totally negligible.

Enhancement of Faraday rotation

Figure 15.8 shows a Fabry–Pérot etalon with a Faraday rotator inserted in the gap between its mirrors. The Faraday rotator may be a slice of a transparent magnetic crystal (such as an iron garnet) or a transparent piece of glass in which an externally applied magnetic field has induced polarization rotation. In our simulations this medium was 2.11 μm thick, with refractive indices (n, k) = (1.5 ± 0.333 × 10⁻⁴, 0) for the states of right and left circular polarization (RCP and LCP). This small amount of optical activity would produce only 0.04° of polarization rotation in a single pass of the beam through the medium. However, as we shall see shortly, the etalon enhances the rotation because, in effect, it circulates the beam through the Faraday medium. Figure 15.9 shows the transmitted amplitudes and the polarization state of a linearly polarized beam after going through the etalon of Figure 15.8. Since tuning of the cavity may be accomplished by varying the incident wavelength λ, we have plotted the data versus λ in the vicinity of resonance (633 nm). Note that almost all the incident beam is transmitted through the etalon and that its polarization rotation at resonance is close to 11°. (Even more rotation may be obtained if higher-reflectivity mirrors are used.) Absorption within the Faraday medium reduces the quality factor Q of the cavity and, therefore, hampers its ability to enhance the Faraday rotation. If the same medium as above is assumed to have an absorption coefficient k = 10⁻⁴, the characteristics of the etalon shown in Figure 15.9 will change to those in Figure 15.10.

Figure 15.8 Schematic diagram showing a Fabry–Pérot etalon with a Faraday rotator placed in the gap. The normally incident plane wave is linearly polarized along the p-direction, but the optical activity of the Faraday rotator produces a transmitted component of polarization along the s-direction. The mirrors are taken to have 21 layers each and the Faraday medium to be 2.11 μm thick and to have refractive indices (n, k) = (1.5 ± 0.333 × 10⁻⁴, 0) for the states of right and left circular polarization.
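The single-pass rotation of 0.04° quoted above can be checked with a few lines of arithmetic. The sketch below assumes the usual relation for circular birefringence, in which the plane of polarization turns by half the phase difference accumulated between the RCP and LCP components, i.e., by π d (n_R − n_L)/λ; the sign of the rotation depends on conventions that the text does not specify, so only the magnitude is computed. The names are hypothetical.

```python
import math


def single_pass_rotation_deg(thickness_nm, delta_n, wavelength_nm):
    """Magnitude of polarization rotation (degrees) after one pass.

    delta_n is the difference between the RCP and LCP refractive indices.
    """
    return math.degrees(math.pi * thickness_nm * delta_n / wavelength_nm)


THICKNESS_NM = 2110.0       # 2.11 um Faraday medium
WAVELENGTH_NM = 633.0
N_NOMINAL = 1.5
DELTA_N = 2 * 0.333e-4      # indices are 1.5 + 0.333e-4 and 1.5 - 0.333e-4

rotation = single_pass_rotation_deg(THICKNESS_NM, DELTA_N, WAVELENGTH_NM)
thickness_in_waves = N_NOMINAL * THICKNESS_NM / WAVELENGTH_NM

print(f"single-pass rotation: {rotation:.3f} deg")
print(f"optical thickness: {thickness_in_waves:.2f} wavelengths")
print(f"enhancement needed to reach 11 deg: about {11.0 / rotation:.0f}x")
```

The same numbers show that the medium is five wavelengths thick (inside the material), and that the roughly 11° rotation observed at resonance corresponds to an enhancement of a few hundred over a single pass.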

[Figure 15.9 plot axes: (a) amplitude transmission coefficients |tp| and |ts| (0 to 1) and (b) polarization rotation ψ and ellipticity ξ (−6° to 10°), each versus wavelength λ (631 nm to 635 nm).]

Figure 15.9 (a) Computed amplitude transmission coefficients and (b) the state of transmitted polarization plotted versus λ for a plane wave normally incident on the etalon of Figure 15.8. The assumed direction of incident polarization is p. In (a) the transmission coefficient ts is defined as the ratio of the transmitted s-component to the incident p-component. In (b) the polarization rotation angle ψ is relative to the direction of incident polarization.

Note that the transmitted power has dropped by more than 60% and that the peak rotation angle is reduced by about 4°. The plot in Figure 15.10(c) of the magnitude of the Poynting vector along the propagation path shows constant values in the multilayer mirrors but a rapid decline within the Faraday medium. Of the roughly 85% of the optical power that enters the etalon, 46% gets absorbed in the Faraday rotator and only 39% eventually passes out of the device. The small plateaux in the Poynting vector plot of Figure 15.10(c) are caused by the standing-wave pattern of the E-field within the Faraday medium: the absorption rate goes through minima and maxima following the E-field intensity variations. In our example, where the Faraday medium is 5λ thick, there are exactly 10 such plateaux.

[Figure 15.10 plot axes: (a) amplitude transmission coefficients |tp| and |ts| (0 to 1) versus λ (631 nm to 635 nm); (b) polarization rotation ψ and ellipticity ξ (−6° to 10°) versus λ (631 nm to 635 nm); (c) normal component of the Poynting vector (0 to 1) versus Z (0 to 6000 nm).]

Figure 15.10 (a) and (b) are the same as in Figure 15.9, except for the presence of a small absorption coefficient (k = 10⁻⁴) in the Faraday medium. The plot in (c) shows the magnitude of the Poynting vector as a function of position along the beam’s propagation path. The flat parts of the curve indicate that optical energy passes unattenuated through the dielectric mirrors. The steep, staircase-like drop in the curve is caused by absorption within the Faraday medium.

A simple analysis

We now present a simple derivation of the basic properties of the Fabry–Pérot interferometer. Let us assume that the two mirrors are identical, with reflection coefficients ρ and transmission coefficients τ. If the light is incident from the substrate side of the mirror, these coefficients will be denoted by ρ′ and τ′, respectively. For dielectric mirrors, in general, we have |ρ| = |ρ′| and |τ| = |τ′|, and there are simple relations among the corresponding phase factors, but these are not needed here. Also, since there is no absorption in the mirrors, we have |ρ|² + |τ|² = 1. Consider the case of a unit-amplitude beam normally incident on a Fabry–Pérot etalon, such as that shown in Figure 15.3 but with θ = 0. Denote the gap-width by D, and let there be two counter-propagating beams in the cavity, one with amplitude A traveling to the right and the other with amplitude B traveling to the left. Since at the second mirror there are no incoming beams from the outside, and since the beam with amplitude A is reflected with coefficient ρ from this mirror, we must have B = ρA exp(i2πD/λ). At the first mirror, the beam with amplitude B is reflected once again, and its amplitude becomes ρ²A exp(i4πD/λ). Since the incident beam has unit amplitude, its contribution to the field just inside the cavity is τ′. Therefore

$$A = \rho^2 A \exp(i4\pi D/\lambda) + \tau', \tag{15.1}$$

which yields

$$A = \frac{\tau'}{1 - \rho^2 \exp(i4\pi D/\lambda)}. \tag{15.2}$$

Resonance occurs when the phase of ρ² (if any) plus the phase acquired in a round trip through the cavity, 4πD/λ, becomes a multiple of 2π, at which point the denominator in Eq. (15.2) will be at a minimum and the field amplitude A within the cavity at a maximum. The light transmitted through the device will have amplitude

$$t = \tau A \exp(i2\pi D/\lambda) \tag{15.3}$$

and that reflected from the device will have amplitude

$$r = \rho' + \tau\rho A \exp(i4\pi D/\lambda). \tag{15.4}$$
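Equations (15.2)–(15.4) are simple enough to evaluate directly. The sketch below is an illustration only, using mirror coefficients quoted in this chapter: the magnitudes |ρ| = 0.844, |τ| = 0.536 of the 10-layer mirror, and the values ρ = ρ′ = 0.9964, τ = τ′ = 0.0843i given for the symmetric 21-layer mirror. Because only magnitudes are listed for the 10-layer mirror, the reflected amplitude r, which depends on the phase of ρ′, is evaluated here only for the 21-layer case. The function name and the choice of a 13λ gap are assumptions.

```python
import cmath
import math


def etalon(rho, tau, rho_p, tau_p, gap, wavelength):
    """Cavity amplitude A, transmission t and reflection r, Eqs. (15.2)-(15.4)."""
    round_trip = cmath.exp(1j * 4.0 * math.pi * gap / wavelength)
    a = tau_p / (1.0 - rho ** 2 * round_trip)                     # Eq. (15.2)
    t = tau * a * cmath.exp(1j * 2.0 * math.pi * gap / wavelength)  # Eq. (15.3)
    r = rho_p + tau * rho * a * round_trip                         # Eq. (15.4)
    return a, t, r


WAVELENGTH = 633.0           # nm
GAP = 13 * WAVELENGTH        # a resonant gap when rho is real

# 10-layer mirrors: magnitudes only, so report |A| and |t|
a10, t10, _ = etalon(0.844, 0.536, 0.844, 0.536, GAP, WAVELENGTH)
print(f"10-layer: |A| = {abs(a10):.3f}, |t| = {abs(t10):.3f}")

# 21-layer symmetric mirrors: complex coefficients as quoted in the chapter
a21, t21, r21 = etalon(0.9964, 0.0843j, 0.9964, 0.0843j, GAP, WAVELENGTH)
print(f"21-layer: |A| = {abs(a21):.2f}, |t| = {abs(t21):.3f}, |r| = {abs(r21):.3f}")

# Gap change that cancels a 0.9 degree phase on rho (1.8 degrees on rho**2)
delta_gap = WAVELENGTH * (2 * 0.9) / 720.0
print(f"gap shift for a 0.9 deg mirror phase: {delta_gap:.2f} nm")
```

The small departures of |t| from 1 and of |r| from 0 reflect the rounding of the quoted mirror coefficients, for which |ρ|² + |τ|² is not exactly unity.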

The same equations may be used at oblique incidence, provided that the gap-width D is multiplied by cos θ and that ρ, τ, ρ′, and τ′ represent the corresponding quantities at the particular angle of incidence. When the medium of the cavity happens to be absorptive, the same type of analysis may still be used to arrive at the relevant formulas.

We now demonstrate the application of the preceding equations to some of the cases discussed earlier. In the case of the 10-layer stack, the mirror coefficients were ρ = ρ′ = 0.844 and τ = τ′ = 0.536. From Eqs. (15.2)–(15.4) we find that, at resonance, A = 1.863, t = 1 and r = 0. In the case of the 20-layer stack, ρ = ρ′ = 0.9905 and τ = τ′ = 0.1375, yielding at resonance A = 7.27, t = 1 and r = 0. For the component of polarization that sees the higher refractive index of the uppermost layer, ρ acquires a phase angle of 0.9°, which is canceled when the gap-width is reduced by 1.6 nm. This is exactly the peak shift observed in Figure 15.7(a).

The 21-layer stacks used with the Faraday rotator were symmetric, in the sense that their substrate and their medium of incidence had the same refractive index, n = 1.5. For these mirrors ρ = ρ′ = 0.9964 and τ = τ′ = 0.0843i, yielding at resonance A = 11.73i, t = 1 and r = 0. Since the refractive indices of the Faraday medium for RCP and LCP light deviated from the nominal value by 0.0022%, a similar change in wavelength was needed to re-establish the conditions of resonance for each state of circular polarization. At λ = 633 nm, however, both RCP and LCP components of the incident beam were slightly off resonance. This, according to Eq. (15.2), caused a large phase shift between the values of A for RCP and LCP light, which translated into a large phase difference between the transmitted RCP and LCP components, in accordance with Eq. (15.3). The resulting Faraday rotation angle of the transmitted beam was a manifestation of this phase difference. Similar arguments can be advanced to explain the consequences of the absorption observed in Figure 15.10.

Note

In general, it is possible to eliminate from a stack non-absorbing layers whose thicknesses are multiples of λ/2. Now, if the gap happens to be an integer multiple of λ/2 its elimination will bring the top dielectric layers of the two mirrors into contact. These layers, each being a quarter-wave thick, will combine into a half-wave layer that can be subsequently eliminated, paving the way for the elimination of all the remaining layers in similar fashion. At the end, the two substrates will come into direct contact, and the incident light will be fully transmitted, as is expected of a well-tuned etalon.

References for Chapter 15
1 C. Fabry and A. Pérot, Ann. Chim. Phys. (7) 16, p. 115 (1899).
2 R. W. Wood, Physical Optics, third edition, Optical Society of America, Washington, 1988.
3 H. A. Macleod, Thin Film Optical Filters, second edition, Macmillan, New York, 1986.
4 S. C. Johnston and S. F. Jacobs, Some problems caused by birefringence in dielectric mirrors, Appl. Opt. 25, 1878 (1986).
5 C. Wood, S. C. Bennett, J. L. Roberts, D. Cho, and C. E. Wieman, Birefringence, mirrors, and parity violation, Opt. & Phot. News 7, 54 (1996).

16 The Ewald–Oseen extinction theorem

When a beam of light enters a material medium, it sets in motion the resident electrons, whether these electrons are free or bound. The electronic oscillations in turn give rise to electromagnetic radiation which, in the case of linear media, possesses the frequency of the exciting beam. Because Maxwell’s equations are linear, one expects the total field at any point in space to be the sum of the original (exciting) field and the radiation produced by all the oscillating electrons. However, in practice the original beam appears to be absent within the medium, as though it had been replaced by a different beam, one having a shorter wavelength and propagating in a different direction. The Ewald–Oseen theorem1,2 resolves this paradox by showing how the oscillating electrons conspire to produce a field that exactly cancels out the original beam everywhere inside the medium. The net field is indeed the sum of the incident beam and the radiated field of the oscillating electrons, but the latter field completely masks the former.3,4 Although the proof of the Ewald–Oseen theorem is fairly straightforward, it involves complicated integrations over dipolar fields in three-dimensional space, making it a brute-force drill in calculus and devoid of physical insight.5,6 It is possible, however, to prove the theorem using plane waves interacting with thin slabs of material, while invoking no physics beyond Fresnel’s reflection coefficients. (These coefficients, which date back to 1823, predate Maxwell’s equations.) The thin slabs represent sheets of electric dipoles, and the use of Fresnel’s coefficients allows one to derive exact expressions for the electromagnetic field radiated by these dipolar sheets. The integrations involved in this approach are one-dimensional, and the underlying procedures are intuitively appealing to practitioners of optics. 
The goal of the present chapter is to outline a general proof of the Ewald–Oseen theorem using arguments that are based primarily on thin-film optics.


Dielectric slab

Consider the transparent slab of dielectric material of thickness d and refractive index n, shown in Figure 16.1. A normally incident plane wave of vacuum wavelength λ0 produces overall a reflected beam of amplitude r and a transmitted beam of amplitude t. Both r and t are complex numbers in general, having a magnitude and a phase angle. Using Fresnel’s coefficients at each facet of the slab and accounting for multiple reflections, it is fairly straightforward to obtain expressions for r and t. The reflection and transmission coefficients at the front facet of the slab are5,7

$$\rho = (1 - n)/(1 + n), \tag{16.1}$$

$$\tau = 2/(1 + n). \tag{16.2}$$

At the rear facet the corresponding entities are

$$\rho' = (n - 1)/(n + 1), \tag{16.3}$$

$$\tau' = 2n/(n + 1). \tag{16.4}$$

A single path of the beam through the slab causes a phase shift ψ, where

$$\psi = 2\pi n d/\lambda_0. \tag{16.5}$$

Adding up all partial reflections at the front facet yields an expression for the reflection coefficient r of the slab. Similarly, adding all partial transmissions at the rear facet yields the transmission coefficient t. Thus

$$r = \rho + \tau\tau'\rho' \exp(i2\psi) \sum_{m=0}^{\infty} [\rho' \exp(i\psi)]^{2m} = \rho + \frac{\tau\tau'\rho' \exp(i2\psi)}{1 - \rho'^2 \exp(i2\psi)}, \tag{16.6}$$

$$t = \tau\tau' \exp(i\psi) \sum_{m=0}^{\infty} [\rho' \exp(i\psi)]^{2m} = \frac{\tau\tau' \exp(i\psi)}{1 - \rho'^2 \exp(i2\psi)}. \tag{16.7}$$

Figure 16.1 A transparent slab of homogeneous material of thickness d and refractive index n, on which is normally incident a monochromatic plane wave of wavelength λ0. The beam suffers multiple reflections at the two facets of the slab. By adding the various reflected and transmitted amplitudes one obtains the expressions for the total r and t given in Eqs. (16.6) and (16.7).

Rather than try to simplify these complicated functions of n, d and $\lambda_0$, we give numerical results in Figure 16.2 for the specific case of n = 2 and $\lambda_0$ = 633 nm. The magnitudes of r and t are shown in Figure 16.2(a), and their phase angles in Figure 16.2(b), both as functions of the thickness d of the slab. For any given value of d it is possible to represent r and t as complex vectors (see Figure 16.3). Since the phase difference between r and t is always 90°, these complex vectors are orthogonal to each other. Also, the conservation of energy requires that $|r|^2 + |t|^2 = 1$. These observations lead to the conclusion that the hypotenuse of the triangle in Figure 16.3 must have unit length, that is, $|t - r| = 1$, which is also confirmed numerically in Figure 16.2(c).

Within the slab the incident beam sets the atomic dipoles in motion. These dipoles in turn radiate plane waves in both the forward and the backward directions, as shown in Figure 16.4. When the slab is sufficiently thin, symmetry requires forward- and backward-radiated waves to be identical, that is, they must both have the same amplitude r. In the forward direction, however, the incident beam continues to propagate unaltered, except for a phase-shift caused by propagation in free space through a distance d. Thus we must have

$$t = r + \exp(i2\pi d/\lambda_0). \tag{16.8}$$
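The properties quoted above for the lossless slab can be spot-checked numerically. The following is a minimal sketch, not the program used to produce Figure 16.2: it evaluates Eqs. (16.1) to (16.7) directly, and the function name `slab_rt` and the sample thicknesses are illustrative choices. Lengths are in nanometres.

```python
import cmath
import math


def slab_rt(n, d, lam0):
    """Return (r, t) for a slab at normal incidence, following Eqs. (16.1)-(16.7)."""
    rho = (1 - n) / (1 + n)            # front-facet reflection, Eq. (16.1)
    tau = 2 / (1 + n)                  # front-facet transmission, Eq. (16.2)
    rho_p = (n - 1) / (n + 1)          # rear-facet reflection, Eq. (16.3)
    tau_p = 2 * n / (n + 1)            # rear-facet transmission, Eq. (16.4)
    psi = 2 * math.pi * n * d / lam0   # single-pass phase shift, Eq. (16.5)
    denom = 1 - rho_p**2 * cmath.exp(2j * psi)   # closed form of the geometric series
    r = rho + tau * tau_p * rho_p * cmath.exp(2j * psi) / denom   # Eq. (16.6)
    t = tau * tau_p * cmath.exp(1j * psi) / denom                 # Eq. (16.7)
    return r, t


if __name__ == "__main__":
    n, lam0 = 2.0, 633.0               # values quoted for Figure 16.2
    for d in (10.0, 50.0, 100.0, 150.0):
        r, t = slab_rt(n, d, lam0)
        energy = abs(r) ** 2 + abs(t) ** 2    # expected to be 1 for a lossless slab
        overlap = (r * t.conjugate()).real    # vanishes when r and t are 90 degrees apart
        print(f"d = {d:5.1f} nm  |r| = {abs(r):.4f}  |t| = {abs(t):.4f}  "
              f"|r|^2+|t|^2 = {energy:.6f}  Re(r t*) = {overlap:+.1e}  "
              f"|t - r| = {abs(t - r):.6f}")
```

For a real index n, the energy sum and $|t - r|$ should both come out as unity to within rounding, at every thickness and not only for thin slabs.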

It was pointed out earlier in conjunction with the diagram of Figure 16.3 that $t - r$ has unit amplitude, which is in agreement with Eq. (16.8). It is by no means obvious, however, that the phase of $t - r$ must approach $2\pi d/\lambda_0$ as $d \to 0$. Figure 16.2(c) shows computed plots of the phase of $t - r$ normalized by $2\pi d/\lambda_0$. It is seen that in the limit $d \to 0$ the normalized phase approaches unity as well. This confirms that the slab radiates equally in the forward and backward directions, and that the incident beam, having set the dipolar oscillations in motion, continues to propagate undisturbed in free space.

Radiation from a uniform sheet of oscillating dipoles

In the limit of small d Eq. (16.6) reduces to the following simple form:

$$r \approx i[\pi(n^2 - 1)d/\lambda_0] \exp[i\pi(n^2 + 1)d/\lambda_0], \qquad d/\lambda_0 \ll 1. \tag{16.9}$$
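As a rough numerical illustration of this limit, the sketch below compares the exact r of Eq. (16.6) with the thin-slab form of Eq. (16.9), and evaluates the normalized phase of $t - r$. The function names and the sample thicknesses are illustrative assumptions; lengths are in nanometres.

```python
import cmath
import math


def slab_rt(n, d, lam0):
    """Exact (r, t) of the slab at normal incidence, Eqs. (16.6) and (16.7)."""
    rho, tau = (1 - n) / (1 + n), 2 / (1 + n)
    rho_p, tau_p = (n - 1) / (n + 1), 2 * n / (n + 1)
    psi = 2 * math.pi * n * d / lam0
    denom = 1 - rho_p**2 * cmath.exp(2j * psi)
    r = rho + tau * tau_p * rho_p * cmath.exp(2j * psi) / denom
    t = tau * tau_p * cmath.exp(1j * psi) / denom
    return r, t


def thin_sheet_r(n, d, lam0):
    """Small-thickness form of r, Eq. (16.9)."""
    x = math.pi * d / lam0
    return 1j * (n**2 - 1) * x * cmath.exp(1j * (n**2 + 1) * x)


def normalized_phase(n, d, lam0):
    """Phase of t - r divided by the free-space phase 2*pi*d/lam0."""
    r, t = slab_rt(n, d, lam0)
    return cmath.phase(t - r) / (2 * math.pi * d / lam0)


if __name__ == "__main__":
    n, lam0 = 2.0, 633.0
    for d in (20.0, 5.0, 1.0, 0.2):
        r_exact, _ = slab_rt(n, d, lam0)
        rel_err = abs(r_exact - thin_sheet_r(n, d, lam0)) / abs(r_exact)
        print(f"d = {d:5.1f} nm  relative error of Eq. (16.9) = {rel_err:.2e}  "
              f"normalized phase of t - r = {normalized_phase(n, d, lam0):.6f}")
```

Both the relative error of the approximation and the departure of the normalized phase from unity should shrink as d is reduced.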

Figure 16.2 Computed plots of r and t for a slab of thickness d and refractive index n = 2, when a plane wave with $\lambda_0$ = 633 nm is normally incident on the slab: (a) amplitudes |r| and |t|; (b) phase angles of r and t, in degrees; (c) $|t - r|$ and the phase of $t - r$ normalized by $2\pi d/\lambda_0$. The horizontal axis (thickness d, in nm) covers one cycle of variations in r and t, corresponding to a half-wave thickness of the slab.

In this limit the radiated field is slightly more than 90° ahead of the incident field, while its amplitude is proportional to $d/\lambda_0$ and also proportional to $n^2 - 1$, the latter being the coefficient of polarizability of the dielectric material. Note that the small phase angle of r over and above its 90° phase, i.e., the exponential factor in Eq. (16.9), is essential for the conservation of energy among the incident, reflected, and transmitted beams (see Figure 16.4). Equation (16.9) is in fact the exact solution of Maxwell’s equations for the radiation field of a sheet of dipole oscillators. Although derived here as an aid in proving the extinction theorem, it is an important result in its own right. Note, for example, that the amplitude of the radiated field is proportional to $1/\lambda_0$ even though the field of an individual dipole radiator is known to be proportional to $1/\lambda_0^2$. The coherent addition of amplitudes over the sheet of dipoles has thus modified the wavelength dependence of the radiated field [3].

Figure 16.3 A dielectric slab of thickness d and refractive index n, reflecting the unit-amplitude incident beam with coefficient r while transmitting it with coefficient t. The complex-plane diagram on the right shows the relative orientations of r, t and their difference $t - r$. For a non-absorbing slab (i.e., one with a real-valued index n) r and t are orthogonal to each other, and $t - r$ has unit magnitude.

Figure 16.4 Bound electrons within a very thin dielectric slab, when set in motion by a normally incident plane wave of unit amplitude, radiate with equal strength in both the forward and backward directions. The magnitude of the radiated field is the reflection coefficient r of the slab. The incident beam continues to propagate undisturbed as in free space, acquiring a phase shift of $2\pi d/\lambda_0$ upon crossing the slab. The sum of the incident beam and the forward-propagating part of the radiated beam constitutes the transmitted beam.

Figure 16.5 A semi-infinite medium of refractive index n is illuminated by a unit-amplitude plane wave at normal incidence. The medium may be considered as a contiguous sequence of thin slabs, each radiating with equal strength in both the forward and backward directions. Adding coherently the backward-radiated fields yields the reflection coefficient at the front facet of the medium. Similarly, the internal field at $z = z_0$ is obtained by coherent addition of the incident beam, the forward-propagating radiations from the left side of $z_0$, and the backward-propagating radiations from the right side of $z_0$.

The extinction theorem

Having derived Eq. (16.9) for the field radiated by a sheet of dipoles, we are now in a position to outline the proof of the extinction theorem. Consider a semi-infinite, homogeneous medium of refractive index n, bordering with free space at z = 0, as shown in Figure 16.5. A unit-magnitude plane wave of wavelength $\lambda_0$ is directed at this medium at normal incidence from the left side. To determine the reflected amplitude r at the interface, divide the medium into thin slabs of thickness $\Delta z$, then add up (coherently) the reflected fields from each of these slabs. Similarly, the field at an arbitrary plane $z = z_0$ inside the medium may be computed by adding to the incident beam the contributions of the slabs located to the left of $z_0$ as well as those to the right of $z_0$. The simplest way to proceed is by assuming that the field inside the medium has the expected form, $\tau \exp(i2\pi nz/\lambda_0)$, then showing self-consistency. These calculations involve simple one-dimensional integrals, and are in fact so straightforward that there is no need to carry them out here. The interested reader may take a few minutes to evaluate the integrals and convince himself or herself of the validity of the theorem.

Slab of absorbing material

When the material of the slab is absorbing, similar arguments to those above may be advanced to prove the Ewald–Oseen theorem, although the expressions for the reflection and transmission coefficients become more complicated. Numerically, however, it is still possible to describe the situation with great accuracy.
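As one numerical illustration, the one-dimensional sums behind the extinction theorem can be evaluated for a semi-infinite medium. The sketch below rests on several assumptions that are not in the text: the index is given a small imaginary part (n = 2 + 0.2i is an arbitrary choice) so that the sum over the half-space converges, each sheet of thickness `dz` radiates the leading term of Eq. (16.9) times the local field, and lengths are measured in units of the vacuum wavelength. The function name `extinction_sums` is hypothetical.

```python
import cmath
import math


def extinction_sums(n, z0, lam0=1.0, z_max=30.0, dz=0.001):
    """Sum the fields radiated by thin sheets filling the half-space z > 0.

    Returns (reflected amplitude at z = 0, total field at z = z0).
    Assumes Im(n) > 0 so that the sums converge before z_max is reached.
    """
    k0 = 2 * math.pi / lam0
    tau = 2 / (1 + n)                        # assumed internal amplitude, Eq. (16.2)
    sheet = 0.5j * k0 * (n**2 - 1) * dz      # leading term of Eq. (16.9) for one sheet
    reflected = 0j
    internal = cmath.exp(1j * k0 * z0)       # incident beam, propagating as in free space
    for m in range(int(round(z_max / dz))):
        z = (m + 0.5) * dz                   # mid-plane of the m-th sheet
        source = sheet * tau * cmath.exp(1j * n * k0 * z)        # field radiated by this sheet
        reflected += source * cmath.exp(1j * k0 * z)             # backward wave arriving at z = 0
        internal += source * cmath.exp(1j * k0 * abs(z0 - z))    # wave arriving at z0 from either side
    return reflected, internal


if __name__ == "__main__":
    n, z0 = 2 + 0.2j, 0.4
    refl, inside = extinction_sums(n, z0)
    print("summed reflection :", refl)
    print("Fresnel (1-n)/(1+n):", (1 - n) / (1 + n))
    print("summed field at z0:", inside)
    print("tau*exp(i*n*k0*z0) :", 2 / (1 + n) * cmath.exp(1j * n * 2 * math.pi * z0))
```

If the assumed internal field is self-consistent, the summed reflection should reproduce the Fresnel coefficient of Eq. (16.1), and the summed field at $z_0$ should reproduce $\tau \exp(i2\pi n z_0/\lambda_0)$ with no trace of the free-space term $\exp(i2\pi z_0/\lambda_0)$.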

Figure 16.6 Computed plots of r and t for a slab of thickness d and complex refractive index (n, k) = (2, 7), when a plane wave with $\lambda_0$ = 633 nm is normally incident on the slab: (a) amplitudes |r| and |t|; (b) phase angles of r and t, in degrees; (c) $|t - r|$ and the phase of $t - r$ normalized by $2\pi d/\lambda_0$. The horizontal axis (thickness d, in nm) covers the penetration depth of the material.
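Curves of the kind shown in Figure 16.6 can be sketched with the same slab formulas, since Eqs. (16.1) to (16.7) remain valid for a complex index. The following is an illustrative sketch under that assumption, not the program behind the figure; the function name and the sample thicknesses are arbitrary, and lengths are in nanometres.

```python
import cmath
import math


def slab_rt(n, d, lam0):
    """(r, t) at normal incidence from Eqs. (16.6) and (16.7); n may be complex."""
    rho, tau = (1 - n) / (1 + n), 2 / (1 + n)
    rho_p, tau_p = (n - 1) / (n + 1), 2 * n / (n + 1)
    psi = 2 * math.pi * n * d / lam0
    denom = 1 - rho_p**2 * cmath.exp(2j * psi)
    r = rho + tau * tau_p * rho_p * cmath.exp(2j * psi) / denom
    t = tau * tau_p * cmath.exp(1j * psi) / denom
    return r, t


if __name__ == "__main__":
    n, lam0 = 2 + 7j, 633.0      # complex index and wavelength quoted for Figure 16.6
    for d in (0.1, 1.0, 5.0, 20.0, 60.0):
        r, t = slab_rt(n, d, lam0)
        energy = abs(r) ** 2 + abs(t) ** 2                  # below 1 when light is absorbed
        angle = math.degrees(abs(cmath.phase(t / r)))       # angle between r and t
        norm_phase = cmath.phase(t - r) / (2 * math.pi * d / lam0)
        print(f"d = {d:5.1f} nm  |r| = {abs(r):.3f}  |t| = {abs(t):.3f}  "
              f"|r|^2+|t|^2 = {energy:.3f}  angle = {angle:6.1f} deg  "
              f"|t - r| = {abs(t - r):.4f}  normalized phase = {norm_phase:.4f}")
```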

Figure 16.6(a) shows computed plots of r and t for a metal slab having complex index n + ik = 2 + i7. (Compare these plots with the corresponding plots for the dielectric slab in Figure 16.2.) It is seen that the reflectance drops sharply while the transmittance increases as the film thickness is reduced below about 20 nm. The phase plots in Figure 16.6(b) are quite different from those of the dielectric slab, indicating a phase difference greater than 90° between r and t. A complex-plane diagram for this type of material is given in Figure 16.7. The angle between r and t being greater than 90° implies that $|t - r|^2 > |t|^2 + |r|^2$, while the conservation of energy requires that $|t|^2 + |r|^2 < 1$ in the case of absorbing media. The fact that $|t - r|$ can approach unity is borne out by the numerical results depicted in Figure 16.6(c). In the limit $d \to 0$, not only does the magnitude of $t - r$ become unity but also its phase approaches $2\pi d/\lambda_0$. Therefore, in the limit of small d, the transmitted beam may be expressed as the sum of the reflected beam and the phase-shifted incident beam, the phase shift being due to free-space propagation over the distance d. This is all that one needs in order to prove the extinction theorem for absorbing media.

Figure 16.7 A complex-plane diagram showing the reflection coefficient r, transmission coefficient t, and their difference $t - r$ for a thin slab of an absorbing material.

Figure 16.8 An s-polarized plane wave is obliquely incident at an angle $\theta$ on a dielectric slab of thickness d and index n. The electric dipoles of the slab oscillate in a direction perpendicular to the plane of the diagram, radiating identical fields in the forward and backward directions.

Oblique incidence on a dielectric slab

Figure 16.8 shows an s-polarized plane wave at oblique incidence on a dielectric slab of thickness d and index n. The oscillating dipoles are parallel to the

s-direction of polarization and radiate with equal magnitude in the forward and backward directions. The computed plots of $r_s$ and $t_s$ versus d for the specific case of $\lambda_0$ = 633 nm, n = 2, and $\theta$ = 50° are shown in Figure 16.9. The angle of propagation inside the medium is obtained from Snell’s law as $\theta'$ = 22.52°, and the half-wave thickness of the slab is given by $\lambda_0/(2n\cos\theta')$ = 171.3 nm. These curves are again very similar to those of Figure 16.2, showing a 90° phase difference between $r_s$ and $t_s$, unit magnitude for $t_s - r_s$, and a phase for $t_s - r_s$ that approaches $2\pi(d/\lambda_0)\cos\theta$ as $d \to 0$. The Ewald–Oseen theorem for the case of s-polarized light at oblique incidence can therefore be proven along the same lines as described earlier for normal incidence.

Figure 16.9 Computed plots of r and t for a slab of thickness d and index n = 2, when an s-polarized plane wave with $\lambda_0$ = 633 nm illuminates the slab at $\theta$ = 50°: (a) amplitudes $|r_s|$ and $|t_s|$; (b) phase angles of $r_s$ and $t_s$, in degrees; (c) $|t_s - r_s|$ and the phase of $t_s - r_s$ normalized by $2\pi d\cos\theta/\lambda_0$. The horizontal axis (thickness d, in nm) covers one cycle of variations of r and t, corresponding to a half-wave thickness of the slab at this particular angle of incidence.

Figure 16.10 A p-polarized plane wave is obliquely incident at an angle $\theta$ on a dielectric slab of thickness d and index n. The oscillating dipoles make an angle $\theta''$ with the surface of the slab, radiating with different amplitudes in the forward and backward directions.

The case of p-polarized light, depicted in Figure 16.10, is somewhat different, however. Here the directionality of the dipole oscillations within the slab breaks the symmetry between the forward- and backward-radiated beams. The angle $\theta''$ between the direction of oscillation of the dipoles and the plane of the slab may be determined by considering multiple reflections within the slab. For very thin slabs, it is possible to show that

$$\tan\theta'' = (1/n^2)\tan\theta. \tag{16.10}$$
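One way to arrive at Eq. (16.10) is sketched below. It is offered as an alternative to the multiple-reflection argument mentioned above, not as the author's derivation, and it assumes that the slab is thin enough for the field just outside it to be essentially the incident field of amplitude $E_0$ (since $r \to 0$ and $t \to 1$ as $d \to 0$), whose electric vector makes the angle $\theta$ with the surface. The symbols $E_\parallel$ and $E_\perp$, denoting the field components parallel and perpendicular to the surface, are introduced here for illustration. Continuity of the tangential E-field and of the normal D-field across each facet then gives

```latex
\begin{aligned}
E_{\parallel}^{\mathrm{in}} &= E_{\parallel}^{\mathrm{out}} = E_0 \cos\theta, \\
n^{2} E_{\perp}^{\mathrm{in}} &= E_{\perp}^{\mathrm{out}} = E_0 \sin\theta, \\
\tan\theta'' &= \frac{E_{\perp}^{\mathrm{in}}}{E_{\parallel}^{\mathrm{in}}}
             = \frac{1}{n^{2}} \tan\theta .
\end{aligned}
```

The dipoles of an isotropic medium oscillate along the internal field, so the angle of that field with the surface is the angle $\theta''$ of the dipole oscillations.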

Note that at Brewster’s angle, where $\tan\theta = n$, we have $\tan\theta'' = 1/n$, that is, $\theta'' = \theta'$, where $\theta'$ is the propagation angle within the medium as given by Snell’s law. At angles below the Brewster angle $\theta'' < \theta'$, while above the Brewster angle $\theta'' > \theta'$. For the case of p-polarized light of wavelength $\lambda_0$ = 633 nm incident at $\theta$ = 50° on a slab of index n = 2, plots of r and t versus the slab thickness d are shown in Figure 16.11. Although the magnitude of $t_p - r_p$ can still be shown to be unity, its phase does not approach $2\pi(d/\lambda_0)\cos\theta$ as $d \to 0$. This is a manifestation of the breakdown of symmetry between the forward and backward radiations. If the magnitudes of the beams radiated in the two directions are taken into account, however, the preceding arguments can be restored. One may readily observe from Figure 16.10 that the ratio of the forward- and backward-propagating magnitudes must be given by

$$W(\theta) = \cos(\theta - \theta'')/\cos(\theta + \theta''). \tag{16.11}$$

Therefore, for p-polarized light at oblique incidence, it is $t - Wr$ that approaches $\exp(i2\pi d\cos\theta/\lambda_0)$ as $d \to 0$. This is seen to be verified in Figure 16.11(c).

Figure 16.11 Computed plots of r and t for a slab of thickness d and index n = 2, when a p-polarized plane wave with $\lambda_0$ = 633 nm illuminates the slab at $\theta$ = 50°: (a) amplitudes $|r_p|$ and $|t_p|$; (b) phase angles of $r_p$ and $t_p$, in degrees; (c) $|t_p - Wr_p|$ and the phase of $t_p - Wr_p$ normalized by $2\pi d\cos\theta/\lambda_0$. The horizontal axis (thickness d, in nm) covers one cycle of variations in r and t, corresponding to a half-wave thickness of the slab at this particular angle of incidence.
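The oblique-incidence statements for both polarizations can be spot-checked with a short calculation. The sketch below makes one assumption about conventions that the text does not spell out: the reflection and transmission coefficients refer to the electric-field component parallel to the slab, so that the s and p coefficients coincide at normal incidence. The slab formulas are Eqs. (16.6) and (16.7) written with $\rho' = -\rho$ and $\tau\tau' = 1 - \rho^2$; all function names are illustrative, and lengths are in nanometres.

```python
import cmath
import math


def slab_rt_oblique(n, d, lam0, theta_deg, pol):
    """(r, t) of a slab at oblique incidence for pol = 's' or 'p' (theta < 90 deg).

    Assumed convention: coefficients refer to the E-field component parallel
    to the slab, so both polarizations agree at normal incidence.
    """
    theta = math.radians(theta_deg)
    cos_in = cmath.sqrt(1 - (math.sin(theta) / n) ** 2)   # cos(theta') from Snell's law
    if pol == "s":
        eta_out, eta_in = math.cos(theta), n * cos_in
    else:
        eta_out, eta_in = 1 / math.cos(theta), n / cos_in
    rho = (eta_out - eta_in) / (eta_out + eta_in)         # single-facet reflection
    psi = 2 * math.pi * n * d * cos_in / lam0             # single-pass phase shift
    denom = 1 - rho**2 * cmath.exp(2j * psi)
    r = rho * (1 - cmath.exp(2j * psi)) / denom
    t = (1 - rho**2) * cmath.exp(1j * psi) / denom
    return r, t


def w_ratio(n, theta_deg):
    """Forward/backward ratio W of Eq. (16.11), with theta'' from Eq. (16.10); real n."""
    theta = math.radians(theta_deg)
    theta_pp = math.atan(math.tan(theta) / n**2)
    return math.cos(theta - theta_pp) / math.cos(theta + theta_pp)


def normalized_phase(n, d, lam0, theta_deg, pol):
    """Phase of t - W r over 2*pi*d*cos(theta)/lam0, taking W = 1 for s-polarization."""
    r, t = slab_rt_oblique(n, d, lam0, theta_deg, pol)
    w = 1.0 if pol == "s" else w_ratio(n, theta_deg)
    free = 2 * math.pi * d * math.cos(math.radians(theta_deg)) / lam0
    return cmath.phase(t - w * r) / free


if __name__ == "__main__":
    n, lam0, theta = 2.0, 633.0, 50.0
    for pol in ("s", "p"):
        for d in (20.0, 5.0, 1.0):
            print(f"{pol}-pol  d = {d:4.1f} nm  normalized phase = "
                  f"{normalized_phase(n, d, lam0, theta, pol):.5f}")
    for ang in (0.0, 30.0, 60.0, 85.0):
        rp, tp = slab_rt_oblique(n, 10.0, lam0, ang, "p")
        free = cmath.exp(2j * math.pi * 10.0 * math.cos(math.radians(ang)) / lam0)
        ratio = rp / (tp - free)
        print(f"theta = {ang:4.1f} deg  Re[r_p/(t_p - exp(...))] = {ratio.real:+.4f}  "
              f"1/W = {1 / w_ratio(n, ang):+.4f}")
```

With a different sign convention for the p-polarized reflection coefficient (for example, one that changes sign relative to the s case at normal incidence), the computed ratio would come out as $-1/W$ instead of $1/W$.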

Figure 16.12 Computed ratio of the amplitudes of backward-propagating radiation and forward-propagating radiation, $r_p/[t_p - \exp(i2\pi d\cos\theta/\lambda_0)]$, plotted together with $1/W$ versus the angle of incidence (in degrees) for a dielectric slab 10 nm thick and with n = 2. A p-polarized plane wave with $\lambda_0$ = 633 nm is assumed to be obliquely incident on the slab at an angle $\theta$.

As a further test of Eq. (16.11), we show in Figure 16.12 the computed plot versus $\theta$ of $r_p/[t_p - \exp(i2\pi d\cos\theta/\lambda_0)]$ for a slab with d = 10 nm and n = 2, illuminated by a plane wave with $\lambda_0$ = 633 nm. This curve overlaps the plot of the function $1/W(\theta)$ exactly. Taking into account the ratio $W(\theta)$ between the forward and backward radiated beams, one can prove the Ewald–Oseen theorem as before.

Appendix

This chapter, when originally published in Optics & Photonics News, prompted the following criticism and reply.

“Editor: While we are pleased that Masud Mansuripur has called attention in OPN to the rather basic Ewald–Oseen extinction theorem, we wish to take issue with certain parts of his article [1].

“Mansuripur states that the goal of his article is ‘to outline a general proof of the Ewald–Oseen theorem using arguments that are based primarily on thin-film optics.’ We wish to note first that the proof he outlines, based on the field produced by a uniform sheet of dipole oscillators and the assumed form $\exp[2\pi i n z/\lambda_0]$ for the field inside the medium, is essentially the same approach used by Fearn, James, and Milonni [2]. Their proof is more general in that Fresnel coefficients (for normal incidence) are derived rather than assumed. Indeed, the derivation of the Fresnel coefficients assumes the extinction of the incident field inside the dielectric medium: Mansuripur’s starting point implicitly assumes the very theorem he is trying to prove! In this connection we note that it was not claimed by Fearn et al. that they provided a ‘general proof’ of the extinction theorem. A general proof, valid for media bounded by surfaces of arbitrary shape, is given by Born and Wolf [3].

“Mansuripur cites References 2 and 3 in support of his opinion that the proof of the extinction theorem is ‘devoid of physical insight’. While it is true that the proofs given in these references involve ‘complicated integration over dipolar fields in three-dimensional space,’ we do not think it is fair to say it [the proof] is devoid of physical insight. In Reference 3, page 101, the significance of the theorem is described in the following manner that could hardly be more physical: ‘The incident wave may . . . be regarded as extinguished at any point within the medium by interference with the dipole field and replaced by another wave with a different velocity (and generally also a different direction) of propagation.’

“Finally we note that various features of the extinction theorem have been interpreted differently by various authors: some of these differences have been discussed by Fearn et al. [2] It would be unfortunate if readers of Mansuripur’s article were left with the impression that the theorem can somehow be based ‘primarily on thin-film optics.’

[1] M. Mansuripur, The Ewald–Oseen extinction theorem, Opt. & Phot. News 9 (8), 50–55 (1998).
[2] H. Fearn et al., Microscopic approach to reflection, transmission, and the Ewald–Oseen extinction theorem, Am. J. Phys. 64, 986–995 (1996).
[3] M. Born and E. Wolf, Principles of Optics, sixth edition, Cambridge University Press, Cambridge UK 1985, section 2.4.2.

Daniel James and Peter W. Milonni, Los Alamos National Laboratory, Los Alamos NM
Heidi Fearn, California State University at Fullerton, Fullerton CA
Emil Wolf, University of Rochester, Rochester NY”

The author replied:

“It is puzzling that Fearn et al. consider starting from Fresnel’s reflection coefficients a shortcoming of my method of proof. The Fresnel coefficients can be derived directly from Maxwell’s equations without invoking the extinction theorem, they are available in many textbooks (including Born and Wolf, sixth edition, pp. 38–41), and their derivation from first principles does not in any way add to the value of a paper. I used Fresnel’s coefficients to derive the radiation field for a sheet of dipoles (Equation 9 of my article), as this is a simple, accurate, and intuitive way of calculating the field, and also because its underlying principle is familiar to many practitioners of optics. Alternatively, one could derive the radiation field by integrating over individual dipoles within the sheet, as is done, for example, in The Feynman Lectures on Physics (my reference 3). After this step that establishes the radiation field from a dipolar sheet, the method of proof that I proposed (based on demonstrating self-consistency) is similar to that of Fearn et al.

“Although Fresnel’s coefficients are derived from Maxwell’s equations, nowhere in the standard derivation is it assumed that the incident beam is still present within the medium (albeit masked by the dipole radiations). Had the Ewald–Oseen theorem been somehow implicit in the standard derivation of Fresnel’s coefficients, there would have been no need for the paper of Fearn et al. in the first place.

“I strongly disagree with the suggestion that the use of Fresnel’s coefficients somehow renders my proof of the Ewald–Oseen theorem circular. I also dispute the assertion made by Fearn et al. that ‘it would be unfortunate if readers . . . were left with the impression that the theorem can somehow be based primarily on thin film optics.’ Emphatically, the proof of the theorem can be based on thin film optics (this is exactly what I showed in the article), and it is far from ‘unfortunate’ indeed when a valid proof happens to be based on a simple physical picture.

“I erred in stating that I was going to ‘outline a general proof of the . . . theorem.’ Mine was a general proof for the one-dimensional case, where the beam enters from free space through a plane boundary into an isotropic, homogeneous medium. My proof is more general than the proof of Fearn et al., in that it covers both transparent and absorbing media, and also in that it considers the case of oblique incidence with p and s polarized light. The method described in Born and Wolf is obviously more general than both, because it applies to arbitrary boundaries. None of the above methods, however, is sufficiently general to embrace inhomogeneous, anisotropic, and optically active media, for which the theorem is presumably valid as well.

“Finally, my expressed opinion regarding the proof of the extinction theorem being ‘devoid of physical insight’ was meant as a commentary on the nature of the method, not as a reflection on the authors of the cited references. Ultimately, of course, such judgments are subjective and are best left to the readers.”

References for Chapter 16

[1] P. P. Ewald, On the foundations of crystal optics, Air Force Cambridge Research Laboratories Report AFCRL-70-0580, Cambridge MA (1970). This is a translation by L. M. Hollingsworth of Ewald’s 1912 dissertation at the University of Munich.
[2] C. W. Oseen, Über die Wechselwirkung Zwischen Zwei elektrischen Dipolen der Polarisationsebene in Kristallen und Flüssigkeiten, Ann. Phys. 48, 1–56 (1915).
[3] R. P. Feynman, R. B. Leighton, and M. Sands, The Feynman Lectures on Physics, chapters 30 and 31, Addison-Wesley, Reading, Massachusetts, 1963.
[4] V. Weisskopf, How light interacts with matter, in Lasers and Light, Readings from Scientific American, W. H. Freeman, San Francisco, 1969.
[5] M. Born and E. Wolf, Principles of Optics, sixth edition, Pergamon Press, Oxford, 1980.
[6] H. Fearn, D. F. V. James, and P. W. Milonni, Microscopic approach to reflection, transmission, and the Ewald–Oseen extinction theorem, Am. J. Phys. 64, 986–995 (1996).
[7] H. A. Macleod, Thin Film Optical Filters, second edition, Macmillan, New York, 1986.

17 Reciprocity in classical linear optics

An informal survey of some colleagues and students revealed that the notion of reciprocity in optics is not widely appreciated. One colleague even justified the prevailing ignorance by drawing a parallel between reciprocity in optics and complementarity in quantum mechanics: “Both are true statements which have little, if any, practical value in their respective domains.” This chapter is an attempt at explaining the concept of reciprocity, clarifying some associated misconceptions, and pointing out its practical applications.

Non-reciprocity of Faraday rotators

No one disputes that a Faraday rotator is a non-reciprocal element. The usual argument goes as follows. Let a linearly polarized beam of light be fully transmitted through a polarizing beam-splitter (PBS) before being directed through a 45° Faraday rotator, as shown in Figure 17.1. If the beam is reflected back (by an ordinary mirror, for example), it retraces its path through the rotator and emerges with its polarization vector rotated by a full 90°. At the PBS, therefore, the returning beam will be deflected away from its original path. (This, in fact, is a well-known method of isolating laser diodes from spurious reflections within a given system.) Since the reflected light does not return on its original path, and since the PBS is believed to be reciprocal, the argument is taken as proof of the non-reciprocity of the Faraday rotator.

Although it is true that Faraday rotators are non-reciprocal, there is a flaw in the above argument, which will become clear upon inspection of the system of Figure 17.2. In this system, which is similar to that of Figure 17.1, the Faraday rotator is replaced by a quarter-wave plate (QWP). The fast and slow axes of the plate are oriented at 45° to the direction of incident polarization, so that the light emerging from the plate in the forward path is circularly polarized. (The system of Figure 17.2 is used in some optical disk drives with the optical disk acting as a mirror, the purpose being to separate the reflected beam from the incident beam efficiently, as well as to isolate the laser diode.) Although the system of Figure 17.2 behaves very much like that of Figure 17.1, no one claims that a QWP is non-reciprocal. This seeming paradox can be resolved after a careful examination of the concept of reciprocity, to which we now turn.

Figure 17.1 A Faraday rotator as used in an optical isolator. The incident p-polarized beam, having undergone two consecutive 45° rotations in its forward and backward paths through the rotator, becomes s-polarized, enabling the PBS to divert it away from its original direction.

Figure 17.2 The quarter-wave plate as used in this system helps to separate the reflected beam from the incident beam. The key contribution is made by the (conventional) mirror, which converts the incident RCP beam into LCP upon reflection.

Is a polarizer reciprocal?

Consider the simple linear polarizer shown in Figure 17.3. A collimated beam of light entering from the left-hand side emerges from the polarizer linearly polarized along the transmission axis. The polarization state of the incident beam may be decomposed into two linear components, one parallel and the other perpendicular to

the transmission axis. Assuming an ideal polarizer, the entire parallel component is transmitted while the entire perpendicular component is absorbed within the polarizer. If the direction of propagation of the transmitted beam is reversed, it will pass through the polarizer without any change. Since the original state of polarization of the incident beam is not recovered, the polarizer is a non-reciprocal element.

Figure 17.3 An ideal polarizer has a well-defined transmission axis (shown by the vertical double-ended arrow). The component of the incident beam that is polarized along the transmission axis goes through, while the component perpendicular to this axis is fully absorbed within the polarizer.

Figure 17.4 A simple plano-convex lens behaves differently depending on whether the light is incident from its plane side or from its convex side. The particular plano-convex lens used in the simulations had refractive index n = 1.5 at $\lambda_0$ = 633 nm, thickness = 5 mm, radius of curvature = 10 mm, and clear-aperture diameter = 10 mm. The 5 mm diameter incident beam was collimated and uniform. When the beam enters at the convex facet, the best focus (i.e., the circle of least confusion) appears at a distance of 16.49 mm in front of the plane facet. With the lens flipped and the beam entering the plane facet, the best focus occurs at 19.29 mm in front of the lens.

One might argue that in one sense the polarizer is reciprocal because, irrespective of whether the incident beam illuminates it from the left or from the right side, it behaves the same way. However, this turns out to be a poor way to define reciprocity, because it cannot be generalized to cover other optical elements. For example, consider the simple plano-convex lens shown in Figure 17.4. As will be shown below, lenses in general are reciprocal elements. However, a collimated beam of light shining on the convex surface of this lens comes to focus with

less spherical aberration than a beam shining on its flat surface (see Figures 17.5 and 17.6). Therefore, if reciprocity required the identity of behavior from both sides of an element, one would end up with the undesirable result that a plano-convex lens, for example, is non-reciprocal. To avoid this outcome we return to our earlier definition that the beam transmitted through a reciprocal element, when “properly” reversed, must recreate the incident beam in the reverse direction. It is in this sense that the polarizer of Figure 17.3 is non-reciprocal.

Figure 17.5 Plots of intensity and phase corresponding to the plano-convex lens of Figure 17.4, illuminated by a collimated and uniform beam from the convex side. (a) Intensity distribution immediately after the beam leaves the plane facet of the lens. (b) Residual phase distribution immediately after the beam leaves the plane facet. The curvature of the beam has been removed from the phase distribution, leaving only the residual spherical aberration balanced by a small amount of defocus. The r.m.s. value of these residual aberrations over the entire aperture is $0.17\lambda_0$. (c) Intensity distribution in the plane of best focus, i.e., at the circle of least confusion. (d) Same as (c) but on a logarithmic scale and over a larger area.

Figure 17.6 Same as Figure 17.5 for the case when the beam enters from the plane side of the lens. (a) Emergent intensity distribution at a plane tangent to the convex facet of the lens at its vertex. Note the larger diameter of the emergent beam compared with Figure 17.5(a). (b) Residual phase of the emergent beam within the tangent plane to the convex surface. The r.m.s. wavefront aberration over the entire aperture is $0.68\lambda_0$. (c) Distribution of intensity in the plane of best focus. (d) Same as (c) but on a logarithmic scale and over a larger area.

Are lenses reciprocal in the above sense?

Consider an aberration-free lens that brings a collimated beam of light to focus as in Figure 17.7(a). A flat mirror placed in the focal plane of the lens reflects the beam back towards the lens. Upon re-emerging from the lens the beam, now collimated once again, propagates in the reverse direction of the original incident beam. Is this sufficient proof that the lens is reciprocal? The answer is no, for the following reasons. What if the lens has aberrations? What if the incident beam is only illuminating one half of the lens’s aperture, as in Figure 17.7(b)? What if the mirror is displaced from the focal plane of the lens, as in Figure 17.7(c)? In all these examples (and many more that can be conceived) the returning beam does not retrace the path of the incident beam. Does this mean that the lens is non-reciprocal? Again the answer is no. The culprit in all these examples is the mirror, which does not “properly” reverse the path of the beam. What we need in place of the conventional mirror is a phase-conjugate mirror [1] (PCM) to reverse the wavefront properly. Suppose a PCM is placed perpendicular to the Z-axis at $z = z_0$. If the complex-amplitude distribution incident on the PCM is denoted by $A(x, y, z_0)$, then the reflected wavefront at the plane of the mirror will be

229

17 Reciprocity in classical linear optics (a)

Lens

Mirror

Z

(b)

Lens

Mirror

Z

(c)

Lens

Mirror

Z F

Figure 17.7 (a) A collimated and uniform beam, focused by an aberration-free lens and reflected by a plane mirror placed at the focal plane of the lens, retraces its path through the lens. (b) A collimated beam entering the upper half of the lens aperture and reflected from the mirror surface does not return on itself, but emerges from the lower half of the lens aperture. (c) When the mirror is displaced from the focal plane, the returning beam no longer retraces the incidence path.

$A^*(x, y, z_0)$, which propagates along the negative Z-axis and completely retraces the incidence path. Replacing the ordinary mirror with a PCM in Figure 17.7 ensures that the beam is properly reversed in each case, and proves beyond any doubt that lenses are reciprocal.

The quarter-wave plate

Returning now to the system of Figure 17.2, we re-examine the question of reciprocity of the QWP. Once again the mirror is recognized as the culprit: upon reflection from an ordinary mirror, a right circularly polarized (RCP) beam


becomes left circularly polarized (LCP) and vice versa. The result is that the QWP in Figure 17.2 rotates the polarization of the beam by 90° in double pass, forcing it to change its propagation direction at the PBS. If the mirror is replaced by a PCM, the sense of circular polarization does not change upon reflection, and the beam emerges from the QWP with the same linear polarization as it had when it first entered the plate. The returning beam thus retraces its path, proving the reciprocity of the QWP. What happens in the system of Figure 17.1 if the mirror is replaced by a PCM? Since the beam incident on the mirror is linearly polarized, it remains linear whether it is reflected from an ordinary mirror or from a PCM. Therefore, the path of the reflected light in Figure 17.1 does not change as a result of changing the mirror, confirming our earlier conclusion that the Faraday rotator is non-reciprocal.

Reciprocity of conventional mirrors

In general, lossy elements are non-reciprocal by the above definition of reciprocity. The beam going through (or reflecting from) a lossy device becomes attenuated. Reversing the beam by a PCM reverses the propagation direction, but does not recover the losses incurred. A second pass through the lossy element attenuates the beam even further. Thus the returning beam is twice attenuated, which means that it differs from the original incident beam, if not in its direction of propagation or phase or polarization state, at least in its amplitude. With the strict definition of reciprocity, which requires the beam to be fully recovered in the reverse path, attenuation is sufficient grounds for declaring lossy elements non-reciprocal. A conventional mirror, such as a polished metallic surface, is lossy and therefore non-reciprocal. But consider a total internal reflection (TIR) device such as that shown in Figure 17.8.
Here there are no losses and the only effect of the mirror on the incident beam is a change in its state of polarization. Suppose that the p and s components of the incident beam have complex amplitudes $a_p \exp(i\phi_p)$ and $a_s \exp(i\phi_s)$, respectively. Upon reflection from the TIR mirror these components retain their amplitudes but acquire different phases; the first one becomes $a_p \exp[i(\phi_p + \psi_p)]$, say, and the second becomes $a_s \exp[i(\phi_s + \psi_s)]$. If the direction of propagation of the beam is reversed by means of a conventional mirror, the phase angles $\psi_p$ and $\psi_s$ do not disappear from the returning beam; rather, they become twice as large. However, by now we have learned that a conventional mirror is not the proper device for reversing the beam. Instead, one must use a PCM to phase-conjugate the beam and launch it on its way back. When placed in the system of Figure 17.8, the PCM will return the two components of polarization


Figure 17.8 A collimated, uniform, and polarized beam of light is reflected from the rear facet of a TIR prism. If the beam is returned via a conventional mirror, the emergent beam will not, in general, have the same state of polarization as the incident beam. Use of a PCM, however, ensures not only that the beam retraces its path but also that it will have the same state of polarization at any point along the path.

as $a_p \exp[-i(\phi_p + \psi_p)]$ and $a_s \exp[-i(\phi_s + \psi_s)]$. The second reflection from the TIR mirror eliminates the acquired phases $\psi_p$ and $\psi_s$ and returns the conjugate of the original incident beam, which is exactly what is needed. A TIR mirror, therefore, is a reciprocal element.

A regular beam-splitter

There are many different ways of constructing a beam-splitter. For simplicity’s sake, let us consider the specific beam-splitter shown in Figure 17.9. This flat piece of glass of thickness $d$ and refractive index $n$ has no coating layers and is used at a 45° angle of incidence. If the reflected and transmitted beams are returned by conventional mirrors, as shown in the figure, then, in general, a certain fraction of the light returns along the incidence path and the remainder leaves the beam-splitter along a fourth direction. However, if the conventional mirrors in Figure 17.9 are replaced by PCMs, the entire beam will retrace its original path. To see this we must first examine certain properties of the glass slab that forms the beam-splitter. Figure 17.10 shows computed plots of the reflection and transmission coefficients versus the thickness $d$ of the slab. The assumed refractive index is $n = 2$, the angle of incidence is fixed at $\theta = 45^\circ$, and the incident beam is a coherent and monochromatic beam from a red HeNe laser ($\lambda_0 = 633$ nm). Only the range of thicknesses corresponding to one half-wavelength is shown in


Figure 17.9 A parallel plate made of a glass slab of thickness d and refractive index n used as a beam-splitter. The collimated and uniform incident beam is partially reflected and partially transmitted at the slab. If conventional mirrors are used to return the reflected and transmitted beams back to the beam-splitter, in general a fraction of the beam will go back towards the source but the remainder will leave the beam-splitter in a fourth direction. However, if the mirrors are replaced by phase-conjugate mirrors, the entire beam will return along the incidence path.

Figure 17.10, since the reflection and transmission coefficients are periodic with this period. The half-wave thickness of the slab is $d = \lambda_0/(2n\cos\theta') = 169.2$ nm. Here $\theta' = 20.7^\circ$, obtained from Snell’s law, is the angle between the propagation direction within the slab and the slab’s surface normal. The reflection and transmission coefficients for both p- and s-polarized light are shown in the figure. Note in Figure 17.10 that, at any given thickness, $|r|^2 + |t|^2 = 1$ and $\phi_r - \phi_t = \pm 90^\circ$. In fact, it may be shown that these two properties of the slab are quite general and hold not only for all thicknesses but also for all values of the refractive index $n$, angle of incidence $\theta$, and wavelength $\lambda_0$. The first identity is a trivial statement of the principle of conservation of energy. The second, relating the phase angles of the reflected and transmitted beams, is more subtle, but its violation also results in non-conservation of energy, as we shall see shortly. When the transmitted beam returns to the slab via a PCM it will have an amplitude $t^*$. Upon transmission (in the reverse direction) its amplitude becomes $tt^*$; it will then combine with the reversed reflected beam whose amplitude at this point is $rr^*$. The total returning amplitude is therefore $rr^* + tt^* = |r|^2 + |t|^2 = 1$. The remainder of the beam, leaving the beam-splitter in the fourth direction, will have a total amplitude $rt^* + r^*t = 2|rt|\cos(\phi_r - \phi_t)$, which is exactly zero because the phase difference between $r$ and $t$ is 90°. Thus the beams reversed by the two PCMs combine at the beam-splitter to yield the reverse propagating beam along the original path, leaving no other light to go in the fourth direction.
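The two slab properties used in this argument can be checked numerically. The following is a minimal sketch, assuming the standard Airy summation of multiple reflections inside a lossless slab surrounded by air; the function names `slab_coefficients` and `half_wave_thickness` are illustrative and do not come from the text. The phase of r is referred to the front facet and that of t to the rear facet, with the exp(−iωt) time convention.

```python
import cmath
import math


def slab_coefficients(d_nm, n=2.0, theta_deg=45.0, wavelength_nm=633.0, pol="s"):
    """Return (r, t) of a lossless glass slab in air via the Airy formulas."""
    theta = math.radians(theta_deg)
    theta_in = math.asin(math.sin(theta) / n)       # Snell's law
    c1, c2 = math.cos(theta), math.cos(theta_in)
    if pol == "s":
        r12 = (c1 - n * c2) / (c1 + n * c2)         # air-to-glass Fresnel, s
    else:
        r12 = (n * c1 - c2) / (n * c1 + c2)         # air-to-glass Fresnel, p
    delta = 2.0 * math.pi * n * d_nm * c2 / wavelength_nm  # one-pass phase
    phase = cmath.exp(1j * delta)
    denom = 1.0 - r12 ** 2 * phase ** 2
    r = r12 * (1.0 - phase ** 2) / denom
    t = (1.0 - r12 ** 2) * phase / denom            # uses t12 * t21 = 1 - r12^2
    return r, t


def half_wave_thickness(n=2.0, theta_deg=45.0, wavelength_nm=633.0):
    """Thickness period of r and t: lambda0 / (2 n cos(theta'))."""
    theta_in = math.asin(math.sin(math.radians(theta_deg)) / n)
    return wavelength_nm / (2.0 * n * math.cos(theta_in))


for pol in ("p", "s"):
    for d in (25.0, 60.0, 110.0, 150.0):
        r, t = slab_coefficients(d, pol=pol)
        energy = abs(r) ** 2 + abs(t) ** 2            # returning amplitude
        leak = r * t.conjugate() + r.conjugate() * t  # fourth-direction amplitude
        print(pol, d, round(energy, 12), abs(leak) < 1e-12)
```

In this sketch the quantity `leak` vanishes because r/t is purely imaginary for a lossless slab, which is the 90° phase relation invoked above.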

[Figure 17.10 plots. Panels (a) and (c): amplitudes $|r_p|$, $|t_p|$ and $|r_s|$, $|t_s|$ (vertical axis 0 to 1.0). Panels (b) and (d): phases $\phi_{rp}$, $\phi_{tp}$ and $\phi_{rs}$, $\phi_{ts}$ in degrees (vertical axis 0 to 270). Horizontal axes: thickness (nm), 0 to 175.]

Figure 17.10 Computed plots of reflection and transmission coefficients versus slab thickness for the parallel-plate beam-splitter of Figure 17.9. The assumed refractive index of the glass material is $n = 2$, and the angle of incidence is fixed at $\theta = 45^\circ$. The incident beam is a coherent and monochromatic beam from a red HeNe laser ($\lambda_0 = 633$ nm), and it is assumed to be linearly polarized either along the p- or the s-direction. The phase angles are evaluated at the front facet of the slab for the reflection coefficients and at the rear facet for the transmission coefficients. The reference phase angle is that of the incident beam at the front facet.

Although the above proof for reciprocity of the glass slab was given for plane waves, one can show its validity in the general case of a finite-size incident beam as well. To appreciate the effects of finite size, consider the plots of intensity distribution in Figure 17.11, computed for a HeNe beam of diameter $2000\lambda_0$ upon reflection from and transmission through a slab 500 μm thick of $n = 2$ glass. Near the edges of the beam the various reflected (or transmitted) orders do not overlap and, consequently, give rise to varying degrees of brightness in these regions. Instead of considering the edges separately, however, the appropriate proof of reciprocity for a finite-size beam involves the consideration of such beams as a superposition of a large number of plane waves traveling in different directions (i.e., angular spectrum decomposition). Since the reciprocity applies to each such plane wave, it must, of necessity, also apply to their linear superposition.

[Figure 17.11 images: panels (a) to (d); horizontal axes $x/\lambda_0$ from −1500 to 1500.]

Figure 17.11 Plots of intensity distribution upon reflection or transmission of a collimated uniform beam from the beam-splitter of Figure 17.9. (Due to the limited range of the gray-scale, certain weak parts of the distributions are not visible.) The incident beam diameter is $2000\lambda_0$, where $\lambda_0 = 633$ nm. The beam-splitter, oriented at 45° to the propagation direction of the incident beam, has $n = 2$ and $d = 500$ μm. (a) Logarithmic plot of the reflected intensity distribution for p-polarized incident beam. Since reflection from each surface is weak, only the first- and second-order reflected beams are observed. (b) Transmitted intensity distribution for p-polarized incident beam. (c) Logarithmic plot of the reflected intensity distribution for s-polarized incident beam. Since reflection from each surface is strong, the effect of the third-order reflection can also be seen in this figure. (d) Transmitted intensity distribution for s-polarized incident beam.

Reciprocity and Maxwell’s equations

The principle of reciprocity in classical linear optics is rooted in the fact that electromagnetic waves obey Maxwell’s equations and that these equations admit reciprocal solutions. Consider a distribution of electromagnetic waves in a region of space occupied by matter represented by the dielectric tensor $\varepsilon(x, y, z)$. Assume that the fields oscillate harmonically at a given frequency $\omega$, and that the time-dependence factor $\exp(-i\omega t)$ has been eliminated from Maxwell’s equations.² Suppose now that the propagation direction is reversed everywhere, so that any plane-wave component of the field that was propagating along a given k-vector is


now propagating along the negative direction of that same k-vector. If we replace the E-fields by E* and the H-fields by H* everywhere, Maxwell’s equations remain satisfied so long as the dielectric tensor of the material environment obeys the relation $\varepsilon = \varepsilon^*$ at all points of space. This latter relation holds, for example, if the medium is isotropic and lossless (i.e., $\varepsilon$ is a real-valued scalar), or if the medium is birefringent but non-absorptive (i.e., $\varepsilon$ is a real-valued symmetric matrix), or if the medium has optical activity of the type observed in sugar crystals. If, however, the medium is absorptive, or if it has magneto-optical activity such as that exhibited by a Faraday rotator, then $\varepsilon \neq \varepsilon^*$, in which case the reverse-propagating beam(s) violate Maxwell’s equations and, consequently, reciprocity breaks down.
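The condition ε = ε* amounts to requiring that every element of the dielectric tensor be real. A minimal sketch of that test follows. The numerical entries are illustrative assumptions rather than values from the text, and the gyrotropic form assumed here for the Faraday rotator (imaginary, antisymmetric off-diagonal elements for magnetization along z) is the usual textbook form, not something specified in this chapter.

```python
def equals_own_conjugate(eps, tol=1e-12):
    """True when each element of the 3x3 tensor equals its complex conjugate."""
    return all(abs(complex(e).imag) <= tol for row in eps for e in row)


# Hypothetical example tensors (values chosen only for illustration).
isotropic_glass = [[2.25, 0, 0], [0, 2.25, 0], [0, 0, 2.25]]
birefringent = [[2.25, 0, 0], [0, 2.25, 0], [0, 0, 2.40]]
absorber = [[(1.5 + 0.1j) ** 2, 0, 0],
            [0, (1.5 + 0.1j) ** 2, 0],
            [0, 0, (1.5 + 0.1j) ** 2]]
faraday_rotator = [[2.25, 0.01j, 0], [-0.01j, 2.25, 0], [0, 0, 2.25]]

for name, eps in (("isotropic glass", isotropic_glass),
                  ("birefringent crystal", birefringent),
                  ("absorber", absorber),
                  ("Faraday rotator", faraday_rotator)):
    print(name, "satisfies eps = eps*:", equals_own_conjugate(eps))
```

Note that the Faraday-rotator tensor in this sketch is Hermitian (so the medium is lossless) yet still fails the test, which is why a lossless magneto-optical element can be non-reciprocal.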

Multilayer dielectric stack

The power of the reciprocity principle may be demonstrated by the following analysis of a multilayer dielectric stack. Adopting the approach pioneered by Sir George Gabriel Stokes (1819–1903),³ we prove that any stack consisting of an arbitrary number of dielectric (i.e., non-absorbing) layers exhibits symmetric behavior between its front facet and rear facet reflectivity (or transmissivity). To prove this statement consider returning both the reflected beam and the transmitted beam back to the stack via two PCMs, as shown in Figure 17.12. Denoting the front facet reflection and transmission coefficients by $r$ and $t$, and the corresponding rear facet coefficients by $r'$ and $t'$, we must have, by reciprocity, the following identities:

$$ r r^* + t' t^* = 1, \qquad (17.1) $$

$$ t r^* + r' t^* = 0. \qquad (17.2) $$

Equation (17.1), in conjunction with the principle of conservation of energy (which requires $|r|^2 + |t|^2 = 1$ and hence $t' t^* = |t|^2$), yields $t = t'$, proving that the complex transmission coefficient is the same from both facets of the stack. From Eq. (17.2) one obtains $r' = -t r^*/t^*$, which proves that the amplitude of the reflection coefficient is the same from the two facets, that is, $|r| = |r'|$. As for the phase angles we have:

$$ \tfrac{1}{2}(\phi_r + \phi'_r) = \phi_t \pm 90^\circ. \qquad (17.3) $$

These relations are readily verified for the specific quadrilayer stack whose performance characteristics are depicted in Figure 17.13. Needless to say, the symmetry of reflection and transmission from the two facets of a multilayer


Figure 17.12 Multilayer stack consisting of an arbitrary number of dielectric layers. A unit-amplitude beam is partially reflected and partially transmitted at the top facet of the stack. If the reflected and transmitted beams are returned to the stack via phase-conjugate mirrors (PCMs), the principle of reciprocity requires that the beam must retrace its path. Thus the total amplitude along the reverse incidence direction must be unity and the total amplitude emerging from the bottom facet of the stack must be zero.

stack applies quite generally unless one or more layers are absorptive or magneto-optically active. In fact, the media of incidence and emergence on the two sides of the stack do not have to be identical either. Using the method of proof outlined above, one can readily show that the behavior of dielectric stacks remains symmetrical even when the media above and below the stack have arbitrary refractive indices $n_1$ and $n_2$, provided that proper account is taken of the difference in beam cross-section and the dependence of power on the refractive index. Another interesting property of multilayer stacks arises when one or more of the layers happen to be absorptive. Since reciprocity no longer applies to this case, it should come as no surprise that the reflectivities of the two sides of the stack are, in general, different. What is surprising is that, even in the presence of absorption, the transmissivity continues to be the same from both sides. This property can be proven using standard methods of thin-film-stack calculation⁴ and has been verified numerically in several situations. A simple proof for the symmetric behavior of the transmissivity under quite general conditions is given in the following appendix.
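These symmetry statements can be tried out with a small characteristic-matrix calculation. The sketch below makes several assumptions that are not in the text: normal incidence only, air on both sides, a wavelength of 633 nm (the caption of Figure 17.13 does not state one), and a hypothetical lossy stack made of a thin metal-like film on a dielectric layer. The function name `stack_rt` is illustrative. Layer matrices follow the exp(−iωt) convention, so an absorbing layer has index n + ik with k > 0.

```python
import cmath
import math


def stack_rt(layers, wavelength_nm=633.0):
    """Normal-incidence (r, t) of a stack in air; layers = [(thickness_nm, index)].

    The first entry of `layers` is the layer facing the incident beam.
    """
    m11, m12, m21, m22 = 1.0 + 0j, 0j, 0j, 1.0 + 0j
    for thickness_nm, index in layers:
        delta = 2.0 * math.pi * index * thickness_nm / wavelength_nm
        c, s = cmath.cos(delta), cmath.sin(delta)
        a11, a12, a21, a22 = c, -1j * s / index, -1j * index * s, c
        m11, m12, m21, m22 = (
            m11 * a11 + m12 * a21, m11 * a12 + m12 * a22,
            m21 * a11 + m22 * a21, m21 * a12 + m22 * a22,
        )
    top = m11 + m12        # incidence and exit media are both air (n = 1)
    bottom = m21 + m22
    return (top - bottom) / (top + bottom), 2.0 / (top + bottom)


# Quadrilayer of Figure 17.13 (thickness in nm, refractive index).
dielectric = [(140.0, 2.2), (200.0, 1.8), (80.0, 2.0), (100.0, 1.5)]
# Hypothetical absorbing stack: thin metal-like film on a dielectric layer.
lossy = [(20.0, 0.2 + 3.4j), (100.0, 1.5)]

for name, stack in (("dielectric", dielectric), ("lossy", lossy)):
    r, t = stack_rt(stack)                  # incidence from the top
    r_rev, t_rev = stack_rt(stack[::-1])    # incidence from the bottom
    print(name, "|t - t'| =", abs(t - t_rev), " |r| =", abs(r), " |r'| =", abs(r_rev))
```

Under these assumptions the dielectric stack should reproduce t = t′, |r| = |r′|, and r′ = −t r*/t*, while the lossy stack keeps t = t′ but shows different reflectivities from its two sides.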

[Figure 17.13 plots. (a) Amplitudes $|r_p|$, $|r_s|$, $|t_p|$, $|t_s|$ (vertical axis 0 to 1.00). (b) Phases in degrees: $\phi_{tp} = \phi'_{tp}$, $\phi_{rp}$, $\phi'_{rp}$. (c) Phases in degrees: $\phi_{ts} = \phi'_{ts}$, $\phi_{rs}$, $\phi'_{rs}$. Horizontal axes: angle of incidence (degrees), 0 to 90.]

Figure 17.13 Computed plots of reflection and transmission coefficients versus the angle of incidence for a quadrilayer dielectric stack surrounded by free space. The layer thickness $d$ and refractive index $n$ for consecutive layers starting at the top of the stack are as follows: 140 nm, 2.2; 200 nm, 1.8; 80 nm, 2.0; 100 nm, 1.5. The magnitudes of the various reflection and transmission coefficients shown in (a) are the same whether the beam is incident from the top side or from the bottom side of the stack. The phase angles of the transmission coefficients, $\phi_{tp}$ and $\phi_{ts}$, are also the same for top and bottom incidence. The phase angles of the reflection coefficients, however, depend on the side of the stack at which the beam is directed. In (b) and (c) $\phi_{rp}$ and $\phi_{rs}$ are the phase angles for p- and s-reflectivities when the beam is incident from the top of the stack. The corresponding primed quantities refer to incidence from the bottom.


Appendix

We prove that the Fresnel transmission coefficient $t$ for a multilayer stack consisting of metal and dielectric layers does not depend on whether the light is incident from the top or the bottom of the stack. For stacks consisting solely of dielectric layers this property has been proved in the present chapter, using reciprocity. Reciprocity, however, breaks down in the presence of absorptive layers, and one needs to resort to an alternative method of proof, such as that outlined below. A general stack consists of an arbitrary number of layers, each having thickness $d_j$ and complex refractive index $(n + ik)_j$, the subscript $j$ referring to the layer number. For an incident plane wave of wavelength $\lambda$, arriving at the top of the stack at angle $\theta$, the Fresnel reflection and transmission coefficients of the stack are denoted by $r$ and $t$, respectively. Similarly, when the beam is incident from the bottom side on the stack (again at angle $\theta$), the Fresnel coefficients are denoted $r'$ and $t'$. Our goal is to demonstrate the equality of $t$ and $t'$, even though, in general, $r$ and $r'$ may differ from each other. Consider the hypothetical situation shown in Figure A17.1, where the stack is split along an interfacial plane into two smaller stacks separated by an air gap $d$. The upper stack, identified as stack 1, has reflection and transmission coefficients from top and bottom denoted by $r_1$, $t_1$, $r_1'$, $t_1'$. Similarly, the corresponding parameters of the lower stack, stack 2, are $r_2$, $t_2$, $r_2'$, $t_2'$. The transmissivity $t$ of the entire stack (in the presence of the air gap) can be obtained by adding an infinite number of terms corresponding to the beams bouncing back and forth in the gap, namely,

$$ t = t_1 t_2 e^{i\phi} + t_1 r_2 r_1' t_2 e^{i3\phi} + t_1 r_2^2 {r_1'}^2 t_2 e^{i5\phi} + \cdots = \frac{t_1 t_2 e^{i\phi}}{1 - r_1' r_2 e^{i2\phi}}. \qquad (A17.1) $$

Here $\phi = 2\pi d\cos\theta/\lambda$ is the phase delay due to one passage of the beam through the gap. In the limit of a vanishing gap (i.e., $d \to 0$) we find a simple expression for $t$ in terms of the parameters of stacks 1 and 2:

$$ t = t_1 t_2/(1 - r_1' r_2). \qquad (A17.2) $$

In similar fashion, the reverse-direction transmissivity $t'$ of the stack (bottom illumination) is found to be

$$ t' = t_1' t_2'/(1 - r_1' r_2). \qquad (A17.3) $$

The argument for the equality of $t$ and $t'$ flows readily from Eqs. (A17.2) and (A17.3), using proof by induction as follows. It is clear that if the individual


Figure A17.1 A multilayer stack consisting of metal and dielectric layers is split into two sub-stacks along an arbitrary interfacial plane. The upper stack has Fresnel reflection and transmission coefficients $r_1$, $t_1$ when the beam of light is incident from the top. The corresponding coefficients when the light is incident from the bottom are $r_1'$, $t_1'$. Similarly, the lower stack has reflection and transmission coefficients $r_2$, $t_2$, $r_2'$, $t_2'$. The width of the air gap separating the two sub-stacks is $d$. The overall transmission coefficient $t$ of the entire stack can be obtained by adding the contributions of the infinite number of beams that bounce back and forth in the air-gap region.

sub-stacks are such that $t_1 = t_1'$ and $t_2 = t_2'$, then $t = t'$ is guaranteed. For each sub-stack the reduction to a pair of smaller stacks can be repeated until each sub-stack is a single layer, in which case $t_1 = t_1'$ and $t_2 = t_2'$ obviously hold. The proof is thus complete.

References for Chapter 17

1 A. Yariv and D. M. Pepper, Amplified reflection, phase conjugation, and oscillation in degenerate four-wave mixing, Opt. Lett. 1, 16–18 (1977).
2 M. Born and E. Wolf, Principles of Optics, sixth edition, Pergamon Press, Oxford, 1980.
3 E. Hecht, Optics, third edition, Addison-Wesley, Reading, Massachusetts, 1998.
4 H. A. Macleod, Thin Film Optical Filters, second edition, Macmillan, New York, 1986.

18 Optical pulse compression

A variety of methods exist for temporally compressing (shortening) optical pulses. These methods typically start with pulses in the picosecond or femtosecond range, and end up with pulses that can be as short as a few optical cycles. The optical bandwidth of the initial pulse is usually increased using a nonlinear interaction such as self-phase modulation; this leads to a chirped pulse, which sometimes ends up being longer than the original pulse. A well-known technique for generating sub-100 fs pulses is nonlinear compression in a fiber, where the fiber’s nonlinearity is used to broaden the optical spectrum. Thereafter, the pulse duration is reduced using linear dispersive compression, which removes the chirp by flattening the spectral phase. This is accomplished by sending the pulse through an optical element with a suitable amount of dispersion, such as a prism pair, an optical fiber, a grating compressor, or a chirped mirror. In the 1960s, Gires and Tournois¹ and Giordmaine et al.² independently proposed the shortening of optical pulses using compression techniques analogous to those used at microwave frequencies. Fisher et al.³ suggested that femtosecond optical pulses could be obtained by first passing a short pulse through an optical Kerr liquid in order to impress a frequency sweep or “chirp” on the pulse’s carrier. Pulse compression was then to be achieved by compensating the frequency sweep in the pulse frequency spectrum using a dispersive delay line. In 1982, Shank et al.⁴ reported the generation and measurement of an optical pulse of only 30 fs duration at a wavelength of 619 nm, corresponding to 14 optical cycles. In their experiment, 90 fs optical pulses, obtained from a mode-locked, colliding pulse, ring dye laser followed by a dye amplifier, were focused onto a 15 cm-long, single-mode, polarization-preserving optical fiber. For a few nanojoules of pulse energy coupled into the fiber, the optical spectrum was observed to broaden significantly.
(With increasing input energy the spectrum continued to broaden, covering nearly the entire visible range.) The optical energy coupled into the fiber was adjusted to


produce a factor of 3 increase in the frequency spectrum bandwidth; the spectral half-width thus broadened from 6 nm to 20 nm. The light emerging from the fiber was subsequently recollimated with a lens and sent through a grating compressor (i.e., two gratings set 6.4 cm apart, each having 600 lines/mm; angle of incidence on the gratings 30°), to yield clean, 30 fs pulses (compression ratio $R_c = 3.0$). Shank et al. used the second-harmonic autocorrelation method to determine the duration and profile of their compressed pulses. Compression techniques have evolved over the years,⁵⁻⁹ and many impressive applications of the femtosecond pulse technology have been reported. In this chapter we present an elementary theory of optical pulse compression, describe the role of optical nonlinearity in self-phase modulation (which produces chirp and thereby broadens the Fourier spectrum), and analyze methods of chirp cancellation using dispersive (linear) optical instruments.

Pulse propagation in an isotropic, homogeneous, and dispersive medium

Consider a periodic train of light pulses propagating along the z-axis in a Cartesian coordinate system. The amplitude of this pulse train is denoted by $a(t, z)$, a function of the coordinate $z$ and time $t$. At the origin of the coordinates, $z = 0$, the Fourier spectrum of the pulse is given by

$$ A(f) = \sum_{m=-M}^{M} A_m\, \delta(f - f_0 - m\Delta f). \qquad (18.1) $$

Here $A_m = |A_m|\exp(i\phi_m)$ is the complex amplitude of the spectral component at $f = f_0 + m\Delta f$, where $f_0$ is the central frequency of the spectrum. At $z = 0$, the pulse amplitude will be

$$ a(t, z = 0) = \sum_{m=-M}^{M} |A_m| \cos[2\pi(f_0 + m\Delta f)t - \phi_m]. \qquad (18.2) $$

Here the time-dependence factor is assumed to be $\exp(-i2\pi ft)$. In the limit $\Delta f \to 0$, the above sum is replaced by an integral, and one recovers a single pulse of continuous spectrum $A(f)$. Figure 18.1 shows plots of $A(f)$ and $a(t, z = 0)$ for the specific values of $f_0 = 3.75 \times 10^{14}$ Hz (corresponding to $\lambda_0 = c/f_0 = 0.8$ μm), $\Delta f = 0.01f_0$, and $M = 10$. The amplitude function $a(t, z = 0)$ in Figure 18.1 is periodic, with a period $T = 1/\Delta f \approx 267$ fs; a single period of the function is shown in Figure 18.1(b), and a close-up of the pulse appears in Figure 18.1(c).
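The sum in Eq. (18.2) is easy to evaluate directly. The sketch below uses the quoted values of f0, Δf, and M, but the Gaussian line weights (set by the made-up parameter `sigma_lines`) and the zero spectral phases are assumptions, since the text does not list the individual amplitudes; the function names are likewise illustrative.

```python
import math

F0 = 3.75e14        # central frequency (Hz)
DF = 0.01 * F0      # spacing between spectral lines (Hz)
M = 10              # lines run from m = -M to m = +M


def line_amplitudes(sigma_lines=4.0):
    """Hypothetical Gaussian weights |A_m|; sigma_lines is an assumed width."""
    return {m: math.exp(-0.5 * (m / sigma_lines) ** 2) for m in range(-M, M + 1)}


def pulse_train(t, amps, phases=None):
    """Evaluate the sum of Eq. (18.2) at time t (seconds)."""
    total = 0.0
    for m, amplitude in amps.items():
        phi = 0.0 if phases is None else phases[m]
        total += amplitude * math.cos(2.0 * math.pi * (F0 + m * DF) * t - phi)
    return total


amps = line_amplitudes()
period = 1.0 / DF
print("period (fs):", period * 1e15)
print("a(0) and a(T):", pulse_train(0.0, amps), pulse_train(period, amps))
```

Because f0/Δf = 100 is an integer here, every spectral line completes a whole number of cycles in the time 1/Δf, which is why the waveform repeats exactly with that period.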

[Figure 18.1 plots. (a) $A(f)$ versus frequency ($10^{14}$ Hz), 0 to 4.5. (b) $a(t, z = 0)$ ($\times 10^{-7}$) versus time (fs), 0 to 250. (c) Close-up of (b), 10 to 70 fs.]

Figure 18.1 Amplitude $A(f)$ of the Fourier transform of a Gaussian pulse train described by Eq. (18.1), having $f_0 = 3.75 \times 10^{14}$ Hz, $\Delta f = 0.01f_0$, and $M = 10$. The corresponding amplitude profile $a(t, z = 0)$ is a periodic function of time, with period $T = 1/\Delta f \approx 267$ fs. A single period of the pulse train is shown in (b), and a close-up appears in (c).

At any point $z = z_0$ along the propagation path, each Fourier component in Eq. (18.1) will be multiplied by $\exp(i2\pi nfz_0/c)$, where $f = f_0 + m\Delta f$ is the specific frequency for that component of the spectrum, and $n$ is the effective refractive index of the medium; $c$ is the speed of light in vacuum. In free space, $n = 1.0$, and the pulse amplitude at $z = z_0$ simply becomes $a(t, z = z_0) = a(t - z_0/c, z = 0)$. The pulse then propagates at the speed of light, $c$, without any change of shape whatsoever. In a (homogeneous and isotropic) medium of refractive index $n$, however, the dependence of $n$ on the frequency $f$ complicates the propagation process. For a


sufficiently narrow spectrum, one may approximate the dependence of $n$ on $f$ by the first few terms of the Taylor series of $n(f)$, namely,

$$ n(f) \approx n_0 + n_1(f - f_0) + n_2(f - f_0)^2. \qquad (18.3) $$

The propagation factor at $f = f_0 + m\Delta f$, up to and including second-order terms in $(f - f_0)$, may then be written as follows:

$$ \exp(i2\pi nfz_0/c) \approx \exp(-i2\pi n_1 f_0^2 z_0/c)\, \exp[i2\pi(n_1 + n_2 f_0)(f - f_0)^2 z_0/c] \times \exp[i2\pi(n_0 + n_1 f_0)fz_0/c]. \qquad (18.4) $$

In the above equation, the first term on the right-hand side is a constant phase-factor (independent of $f$), which can, for purposes of the present analysis, be ignored. The second term is a quadratic phase-factor in $(f - f_0) = m\Delta f$, which may be combined with the phase $\phi_m$ of $A_m$ in Eq. (18.2); this term is ultimately responsible for the broadening and chirp induced on the pulse by the effects of dispersion. The last term is a linear phase-factor that translates the (dispersed) pulse from $t = 0$ at $z = 0$ to $t = (n_0 + n_1 f_0)z_0/c$ at $z = z_0$. The group velocity $V_g$ is thus found to be

$$ V_g = c/(n_0 + n_1 f_0). \qquad (18.5) $$
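For readers who want the intermediate algebra, the following is a short worked expansion showing how the three factors of Eq. (18.4) arise from Eq. (18.3). The shorthand $u = f - f_0$ is introduced here only for convenience and is not used elsewhere in the chapter.

```latex
\begin{aligned}
n(f)\,f &\approx \bigl[n_0 + n_1 u + n_2 u^2\bigr](f_0 + u), \qquad u = f - f_0 \\
        &= n_0 f + n_1 f_0\,u + (n_1 + n_2 f_0)\,u^2 + O(u^3) \\
        &= n_0 f + n_1 f_0\,(f - f_0) + (n_1 + n_2 f_0)(f - f_0)^2 + O(u^3) \\
        &= -\,n_1 f_0^2 \;+\; (n_1 + n_2 f_0)(f - f_0)^2 \;+\; (n_0 + n_1 f_0)\,f \;+\; O(u^3).
\end{aligned}
```

Multiplying by $i2\pi z_0/c$ and exponentiating gives the constant, quadratic, and linear phase-factors in that order. Since the time dependence is $\exp(-i2\pi ft)$, the linear factor $\exp[i2\pi(n_0 + n_1 f_0)fz_0/c]$ corresponds to a delay $(n_0 + n_1 f_0)z_0/c$, from which Eq. (18.5) follows.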

When $n_1 = 0$, the refractive index is, to first order, independent of the optical frequency $f$, and the group velocity $V_g$ would be equal to the phase velocity $V_{ph} = c/n_0$. In general, the refractive index of a transparent optical material is an increasing function of the frequency $f$, hence $n_1 \ge 0$ and $V_g \le V_{ph}$. For a typical material such as fused silica, where $n_0 = 1.46$ and $n_1 = 4.2 \times 10^{-17}$ s at $f_0 = 5.4546 \times 10^{14}$ Hz (corresponding to $\lambda_0 = 0.55$ μm), $V_g \approx 0.985V_{ph}$. (For fused silica in the wavelength range $\lambda$ = 0.3 μm – 1.6 μm, plots of $n_0$, $n_1$, $n_2$ versus the optical frequency $f$ are shown in Figure 18.2.) Note that the above arguments have been presented in the context of propagation in a homogeneous medium, where $n(f)$ is a characteristic of the material environment. For a beam confined to a waveguide, however, the index $n(f)$ is an effective index that depends not only on the material properties of the core and the cladding, but also on the structure of the waveguide. Equation (18.5) will still be applicable in this case, but the coefficients $n_0$ and $n_1$ must be obtained for the effective index $n_{eff}$ of the waveguide, for the particular mode under consideration. (See the Appendix for a discussion of guided modes and the effective index of a simple slab waveguide.) Figure 18.3 shows the pulse of Figure 18.1 after propagating a distance of 4.0 mm in fused silica ($n_0 = 1.4534$, $n_1 = 3.69 \times 10^{-17}$ s, $n_2 = 0.6 \times 10^{-33}$ s²

[Figure 18.2 plot: $n_0$, $10^{16}n_1$, and $10^{30}n_2$ (vertical axis "Refractive index and derivatives", −0.5 to 1.5) versus frequency ($10^{14}$ Hz), 2 to 10.]

Figure 18.2 Plots of $n_0$, $n_1$, and $n_2$ versus the optical frequency $f$ for fused silica in the wavelength range $\lambda$ = 0.3 μm – 1.6 μm. The refractive index $n_0(f)$ is measured and fitted to the Sellmeier equation, then the derivatives of the equation are obtained analytically to yield the plots of $n_1$ and $n_2$.

[Figure 18.3 plot: $a(t, z = 4\ \mathrm{mm})$ ($\times 10^{-7}$, vertical axis −1.5 to 1.5) versus time (fs), 0 to 250.]

Figure 18.3 The pulse depicted in Figure 18.1 after propagating a distance of 4.0 mm in fused silica ($n_0 = 1.4534$, $n_1 = 3.69 \times 10^{-17}$ s, $n_2 = 0.6 \times 10^{-33}$ s² at $\lambda_0 = c/f_0 = 0.8$ μm).

at $\lambda_0 = c/f_0 = 0.8$ μm). Clearly it does not take much propagation for a short pulse of the given wavelength in the given material to become significantly broadened.

Group velocity dispersion

The group velocity defined by Eq. (18.5) may itself be treated as a function of frequency, namely, $V_g(f) = c/(n + n'f)$, where $n'$ is the derivative of $n$ with respect


to $f$. The variations of $V_g$ in the vicinity of a given frequency $f_0$ may then be analyzed in terms of the derivative of $V_g$ with respect to $f$, evaluated at $f = f_0$, namely,

$$ V_g'(f_0) = -2c(n_1 + n_2 f_0)/(n_0 + n_1 f_0)^2. \qquad (18.6) $$

(Here we have used the fact that $n'' = 2n_2$.) The so-called group velocity dispersion (GVD) defined by Eq. (18.6) is clearly proportional to the coefficient $(n_1 + n_2 f_0)$ appearing in the quadratic phase factor in Eq. (18.4). In particular, the sign of $(n_1 + n_2 f_0)$ determines whether $V_g$ is an increasing or decreasing function of frequency.

Quadratic phase-factor, chirp, and pulse broadening

The spectral amplitude $A(f)$ of a single light pulse may be a Gaussian function of frequency $f$, namely,

$$ A(f) = A_0 \exp[-\pi\alpha(f - f_0)^2], \qquad (18.7) $$

where $A_0$ and $\alpha$ are two complex constants. Whereas $A_0 = |A_0|\exp(i\phi_0)$ may be chosen arbitrarily, the parameter $\alpha = \alpha_1 - i\alpha_2$ is required to have a positive real part, that is, $\alpha_1 > 0$. The units of $A_0$ are volt·second/meter, while $\alpha$ has units of second². The Fourier transform of $A(f)$ is given by

$$ a(t) = \mathrm{Re}\{A_0\,\alpha^{-1/2}\exp(-\pi t^2/\alpha)\exp(-i2\pi f_0 t)\} = |A_0|(\alpha_1^2 + \alpha_2^2)^{-1/4}\exp\{-\pi[\alpha_1/(\alpha_1^2 + \alpha_2^2)]t^2\} \times \cos\{2\pi f_0 t + \pi[\alpha_2/(\alpha_1^2 + \alpha_2^2)]t^2 - \tfrac{1}{2}\tan^{-1}(\alpha_2/\alpha_1) - \phi_0\}. \qquad (18.8) $$

Note that the field amplitude $a(t)$ has units of volt/meter, namely, those of the electric field in the MKSA system of units. The pulse envelope is a Gaussian function whose width is proportional to $\sqrt{(\alpha_1^2 + \alpha_2^2)/\alpha_1}$. (To obtain the pulse’s full width at half-maximum intensity (FWHM), multiply this parameter by $\sqrt{2\ln 2/\pi} \approx 0.664$.) Thus the quadratic phase-factor, having a coefficient $\pi\alpha_2$ in the exponent of the spectral function $A(f)$, causes a broadening of the pulse. For example, the coefficient of the quadratic phase-factor in Eq. (18.4), $\alpha_2 = 2(n_1 + n_2 f_0)z_0/c$, indicates a growing pulse-width with the propagation distance $z_0$. The time-dependent chirp frequency in Eq. (18.8), $f = f_0 + [\alpha_2/(\alpha_1^2 + \alpha_2^2)]t$, varies continuously along the pulse within a range centered at $f_0$ (noting that the pulse of Eq. (18.8) is centered at $t = 0$). If $(n_1 + n_2 f_0)$ happens to be positive, then the chirp frequency will increase with time (up-chirp). Since the GVD, given by Eq. (18.6), is negative in this case, the leading edge of the pulse, having a frequency that is lower than $f_0$, travels faster than the trailing edge, which has a


higher frequency. On the other hand, if $(n_1 + n_2 f_0)$ happens to be negative, the chirp frequency will decrease with time (down-chirp). However, since the GVD is positive in this case, the leading edge, once again, will travel faster than the trailing edge. Either way, the pulse is seen to broaden as a result of propagation in the dispersive medium, which is the same conclusion arrived at earlier, when we argued that the width of the Gaussian pulse of Eq. (18.8) is an increasing function of $|\alpha_2|$. The minimum width occurs at $z_0 = 0$, where $\alpha_2 = 0$; here the pulse is said to be transform-limited, meaning that its width, $\sqrt{\alpha_1}$, cannot be reduced any further, owing to the finite width of its Fourier transform $A(f)$.
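The dependence of the pulse width and chirp slope on $\alpha_2$ can be checked numerically. The following is a minimal sketch that simply evaluates the width parameter and the chirp slope of Eq. (18.8); the function names and the sample value of $\alpha_1$ are assumptions made for illustration, not part of the original text.

```python
import math

def pulse_fwhm(alpha1, alpha2):
    """Intensity FWHM (seconds) of the Gaussian pulse of Eq. (18.8)."""
    width = math.sqrt((alpha1**2 + alpha2**2) / alpha1)
    return math.sqrt(2.0 * math.log(2.0) / math.pi) * width

def chirp_slope(alpha1, alpha2):
    """Slope df/dt (Hz per second) of the chirp frequency in Eq. (18.8)."""
    return alpha2 / (alpha1**2 + alpha2**2)

alpha1 = 5.0e-28  # s^2; assumed value, chosen only for illustration
for ratio in (0.0, 1.0, 3.0):  # ratio alpha2 / alpha1
    alpha2 = ratio * alpha1
    print(f"alpha2/alpha1 = {ratio:.1f}: "
          f"FWHM = {pulse_fwhm(alpha1, alpha2) * 1e15:.2f} fs, "
          f"df/dt = {chirp_slope(alpha1, alpha2):.3e} Hz/s")
```

With $\alpha_2 = 0$ the function returns the transform-limited width; setting $\alpha_2 = \alpha_1$ widens the pulse by a factor of $\sqrt{2}$, as follows directly from the width parameter $\sqrt{(\alpha_1^2 + \alpha_2^2)/\alpha_1}$.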

Propagation in nonlinear media and pulse compression

Consider a chirp-free Gaussian pulse as described by Eq. (18.8) with $\alpha$ set equal to $\alpha_1$ (i.e., $\alpha_2 = 0$). When this pulse is launched into a nonlinear medium having a (linear) refractive index $n_0$, its period-averaged intensity at the origin of the coordinates, $z = 0$, will be

$$I(t) = (n_0/Z_0)\langle a^2(t)\rangle = \tfrac{1}{2} n_0 Z_0^{-1}\alpha_1^{-1}|A_0|^2 \exp(-2\pi t^2/\alpha_1) \approx I_{\text{peak}}\,(1 - 2\pi t^2/\alpha_1), \tag{18.9}$$

where $I_{\text{peak}} = n_0|A_0|^2/(2Z_0\alpha_1)$ is the pulse's peak intensity (i.e., optical power per unit cross-sectional area). Here $Z_0 \approx 377\ \Omega$ is the free-space impedance in the MKSA system of units. The pulse's Gaussian intensity profile is approximated in Eq. (18.9) by the quadratic function $(1 - 2\pi t^2/\alpha_1)$, which provides an accurate description at and around the center of the pulse, but grossly underestimates the intensity distribution further away, in the wings. The only justification for this approximate treatment is that it simplifies the following analysis; more realistic calculations, therefore, must properly account for the actual pulse's intensity profile. If the nonlinear refractive index of the medium happens to be proportional to $I(t)$, namely, $n(I) = n_0 + \gamma I$, which is characteristic of media with the so-called Kerr nonlinearity, then, assuming dispersionless propagation and ignoring time-independent terms, the phase modulation imparted to the pulse after propagating a distance $z_0$ will be

$$\Delta\phi(t) \approx 2\pi\gamma I(t) z_0/\lambda_0 \approx -\left[4\pi^2 f_0 \gamma I_{\text{peak}} z_0/(\alpha_1 c)\right] t^2 = -\pi t^2/\alpha_3, \tag{18.10}$$

where $\alpha_3 = \alpha_1 c/(4\pi f_0 \gamma I_{\text{peak}} z_0)$ is a real-valued constant. For typical values of $\gamma = 2.5 \times 10^{-20}$ m$^2$/W and $n_0 = 1.4534$ in a silica glass fiber at $f_0 = 3.75 \times 10^{14}$ Hz, setting $\alpha_1 = 5 \times 10^{-28}$ s$^2$, $z_0 = 1.0$ m, and $|A_0| = 2.0 \times 10^{-6}$ V·s/m (corresponding to $I_{\text{peak}} \approx 15$ W/μm$^2$ inside the fiber), we find $\alpha_3 = 8.5 \times 10^{-29}$ s$^2$. Figure 18.4 shows

Figure 18.4 The original pulse of Figure 18.1(c) after acquiring a nonlinear phase shift, $\Delta\phi(t) = -\pi t^2/\alpha_3$, with $\alpha_3 = 8.5 \times 10^{-29}$ s$^2$. Other relevant parameters are $|A_0| = 2.0 \times 10^{-6}$ V·s/m, $\alpha_1 = 5.0 \times 10^{-28}$ s$^2$, $f_0 = 3.75 \times 10^{14}$ Hz. The chirp frequency is seen to be a linearly increasing function of time. (Plot axes: $a(t, z = 1\ \text{m}) \times 10^{-7}$ versus time in fs, from 0 to 70 fs.)

a plot of the original pulse depicted in Figure 18.1(c) after acquiring the nonlinear phase shift given by Eq. (18.10) with the above value of $\alpha_3$. The quadratic nature of the phase-shift in Eq. (18.10) is responsible for the chirped behavior of the oscillations in Figure 18.4, where the frequency is seen to be a linearly increasing function of time (the so-called up-chirp).

When the quadratic phase-factor $\exp(-i\pi t^2/\alpha_3)$ is imposed on a transform-limited Gaussian pulse, it does not change the Gaussian nature of the pulse, but modifies the pulse parameter from $\alpha = \alpha_1$ to $\beta$. Defining $1/\beta = (1/\alpha_1) + i/\alpha_3$, the Fourier transform of the chirped pulse becomes

$$A(f) = A_0\sqrt{\beta/\alpha_1}\,\exp\left[-\pi\beta (f - f_0)^2\right]. \tag{18.11a}$$

Writing $\beta = \beta_1 - i\beta_2$, we find

$$\beta_1 = \alpha_1 \Big/ \left[1 + (\alpha_1/\alpha_3)^2\right], \tag{18.11b}$$

$$\beta_2 = \alpha_3 \Big/ \left[1 + (\alpha_3/\alpha_1)^2\right]. \tag{18.11c}$$

The first consequence of imposing a chirp in the time domain, therefore, is a broadening of the spectral function $A(f)$ in the frequency domain; this is because $\beta_1$ is always less than $\alpha_1$. The second consequence of a time-domain chirp is the imposition of the quadratic phase-factor $\exp[i\pi\beta_2 (f - f_0)^2]$ on the spectrum of the pulse. Should this spectral phase somehow be eliminated, the new pulse would become chirp-free. More important, however, is the fact that, by virtue of its broader spectrum, the resulting chirp-free pulse will be a compressed version of the original pulse.
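The numbers quoted for the silica-fiber example can be reproduced with a few lines of arithmetic. The sketch below evaluates $I_{\text{peak}}$ from Eq. (18.9), $\alpha_3$ from Eq. (18.10), and $\beta_1$, $\beta_2$ from Eqs. (18.11b, c); the function names are hypothetical, and the value of $c$ is an assumption. Using the unrounded peak intensity gives $\alpha_3$ slightly below the quoted $8.5 \times 10^{-29}$ s$^2$, which appears to correspond to the rounded value $I_{\text{peak}} = 15$ W/μm$^2$.

```python
import math

C = 2.998e8   # speed of light in vacuum, m/s (assumed value)
Z0 = 377.0    # free-space impedance, ohm

def peak_intensity(n0, a0_mag, alpha1):
    """I_peak = n0 |A0|^2 / (2 Z0 alpha1), in W/m^2 (Eq. 18.9)."""
    return n0 * a0_mag**2 / (2.0 * Z0 * alpha1)

def alpha3(alpha1, f0, gamma, i_peak, z0):
    """alpha3 = alpha1 c / (4 pi f0 gamma I_peak z0), in s^2 (Eq. 18.10)."""
    return alpha1 * C / (4.0 * math.pi * f0 * gamma * i_peak * z0)

def beta_parts(alpha1, a3):
    """beta1 and beta2 of Eqs. (18.11b) and (18.11c)."""
    beta1 = alpha1 / (1.0 + (alpha1 / a3) ** 2)
    beta2 = a3 / (1.0 + (a3 / alpha1) ** 2)
    return beta1, beta2

# Parameters quoted in the text for a silica glass fiber.
n0, gamma, f0 = 1.4534, 2.5e-20, 3.75e14
alpha1, z0, a0_mag = 5.0e-28, 1.0, 2.0e-6

i_peak = peak_intensity(n0, a0_mag, alpha1)
a3_exact = alpha3(alpha1, f0, gamma, i_peak, z0)
a3_rounded = alpha3(alpha1, f0, gamma, 15.0e12, z0)  # I_peak rounded to 15 W/um^2
beta1, beta2 = beta_parts(alpha1, a3_rounded)

print(f"I_peak  = {i_peak * 1e-12:.2f} W/um^2")
print(f"alpha3  = {a3_exact:.3e} s^2 (unrounded I_peak)")
print(f"alpha3  = {a3_rounded:.3e} s^2 (I_peak = 15 W/um^2)")
print(f"beta1   = {beta1:.3e} s^2, beta2 = {beta2:.3e} s^2")
print(f"sqrt(alpha1/beta1) = {math.sqrt(alpha1 / beta1):.2f}")
```

The closed forms for $\beta_1$ and $\beta_2$ can also be cross-checked against the defining relation $1/\beta = (1/\alpha_1) + i/\alpha_3$ using complex arithmetic.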


Note that the spectral broadening factor $\sqrt{\alpha_1/\beta_1}$ depends only on the ratio $\alpha_1/\alpha_3 = 4\pi\gamma I_{\text{peak}} z_0/\lambda_0$, which is independent of the original pulse duration. With a reasonable value of the nonlinear coefficient $\gamma$, and with sufficient intensity $I_{\text{peak}}$ and/or propagation distance $z_0$, it should be possible to broaden the spectra of pulses having durations of the order of picoseconds and perhaps even those that reach into the nanosecond regime. To substantially broaden the spectral width requires that $\alpha_1/\alpha_3$ be much greater than unity, in which case $\beta_2$ will become nearly equal to $\alpha_3$. However, for a given compression ratio, $\alpha_3$ is proportional to $\alpha_1$, which is the square of the original pulse width. For pulses in the picosecond regime and shorter, the value of $\beta_2$ will be small enough that passing the chirped (and spectrally broadened) pulse through a simple dispersive element (e.g., a pair of prisms or gratings) will eliminate the quadratic phase-factor, thus yielding a compressed pulse. However, for longer pulses (say, in the nanosecond range) the quadratic phase coefficient $\beta_2$ will be so large as to render ineffective these simple methods of chirp-cancellation. Under such circumstances, one should resort to resonant linear devices such as Fabry–Pérot etalons and tuned spectral filters to accomplish chirp cancellation.

Eliminating the quadratic phase-factor

One way of removing the quadratic phase-factor $\exp[i\pi\beta_2 (f - f_0)^2]$, imposed on the spectral function $A(f)$ by the effects of nonlinear propagation, is to simply send the chirped pulse through a linear dispersive medium such as a transparent glass slab or a length of fiber. A comparison of Eq. (18.11) with Eq. (18.4) reveals the required propagation distance in the dispersive element to be $z_0 = -\tfrac{1}{2} c\beta_2/(n_1 + n_2 f_0)$, where, obviously, $\beta_2$ and $(n_1 + n_2 f_0)$ must have opposite signs.

Example: For an infra-red laser pulse ($\lambda_0 = 1.5$ μm, $f_0 = 2.0 \times 10^{14}$ Hz), let $\alpha_1 = 10^{-25}$ s$^2$ and $\alpha_3 = 10^{-26}$ s$^2$. From Eq. (18.11c), we find $\beta_2 = 0.99\alpha_3$. The GVD of fused silica, which is negative in the visible and near-infrared, changes sign at $\lambda \approx 1.3$ μm. At $\lambda = 1.5$ μm, we find $n_1 = 9.0 \times 10^{-17}$ s, $n_2 = -5.5 \times 10^{-31}$ s$^2$, yielding $(n_1 + n_2 f_0) = -2.0 \times 10^{-17}$ s. Passing the pulse through a fairly thick plate of fused silica (thickness = 74.25 mm) can, therefore, cancel the chirp and produce a compressed pulse. The compression ratio is $R_c = \sqrt{\alpha_1/\beta_1} = \sqrt{1 + (\alpha_1/\alpha_3)^2} \approx 10$.

Another method of chirp-cancellation utilizes dispersive optical elements such as gratings and prisms. Figure 18.5 shows a pair of identical diffraction gratings of period $p$ and separation $d$. Let the pulse be incident at an angle $\theta$ on the first

Figure 18.5 A pair of identical gratings of period $p$ and separation $d$ is commonly used as a linear dispersive device for chirp cancellation. The incident pulse arrives at the first grating at an angle $\theta$ relative to the grating's surface normal. The $m$th diffracted order leaves the first grating at an angle $\theta_m$, and is subsequently diffracted from the second grating. The emergent beam is parallel to, but laterally displaced from, the direction of incidence.

grating. The (complex) amplitude of a plane-wave of wavelength $\lambda$ illuminating the grating surface will be $\exp[i(2\pi/\lambda)x\sin\theta]$. The grating surface will modulate this wavefront by a phase-factor of period $p/m$, where $m$, an integer, is the diffraction order. Thus the reflected wavefront immediately after the first grating surface will be $\exp[i(2\pi/\lambda)(\sin\theta + m\lambda/p)x]$. This wavefront propagates a distance $z = d$ along the grating's surface normal, acquiring along the way a phase $\phi(\lambda)$, where

$$\phi(\lambda) = (2\pi d/\lambda)\sqrt{1 - (\sin\theta + m\lambda/p)^2}. \tag{18.12}$$

Reflection from the second grating does not modify the acquired phase factor given in Eq. (18.12), but merely cancels the modulation of the wavefront, $\exp(i2\pi mx/p)$, which was added at the first grating. The beam thus returns to propagating in its original direction at an angle $\theta$ to the grating normal, but retains the phase $\phi(\lambda)$ which it acquired while propagating between the two gratings. We mention in passing that, in addition to the above phase, one must take into account the phase-shift imparted to each wavelength upon reflection from the two gratings. The phase shifts of the two gratings, which will be the same if the gratings are identical, must be added to $\phi(\lambda)$, and their wavelength dependence must be fully accounted for when computing the quadratic phase-factor imposed on the emergent beam. The Taylor series expansion of $\phi(\lambda)$ of Eq. (18.12) around the center frequency $f_0$ yields

$$\phi(f) = \Phi_0 + \Phi_1 (f - f_0) + \Phi_2 (f - f_0)^2 + \cdots, \tag{18.13a}$$

where, denoting by $\theta_{m0}$ the $m$th order diffraction angle corresponding to the

Figure 18.6 Plot of the function $\phi(f) - \Phi_0 - \Phi_1(f - f_0)$ (in radians) in the vicinity of $f_0 = 3.75 \times 10^{14}$ Hz. $\phi(f)$ is given by Eq. (18.12), while its first two Taylor series coefficients, $\Phi_0$ and $\Phi_1$, are given by Eqs. (18.13b) and (18.13c). The diffraction order under consideration is $m = -1$, the assumed grating period is $p = 1.0$ μm, the incidence angle is $\theta = 60°$, and the separation between the gratings is $d = 10$ mm. (Horizontal axis: frequency, from $3.55 \times 10^{14}$ Hz to $3.95 \times 10^{14}$ Hz.)

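central wavelength $\lambda_0$, namely, $\sin\theta_{m0} = \sin\theta + m\lambda_0/p$, we have

$$\Phi_0 = 2\pi (d/\lambda_0)\cos\theta_{m0}, \tag{18.13b}$$

$$\Phi_1 = 2\pi (d/c)(1 - \sin\theta\,\sin\theta_{m0})/\cos\theta_{m0}, \tag{18.13c}$$

$$\Phi_2 = -\pi d\, m^2 \lambda_0^3 / (c^2 p^2 \cos^3\theta_{m0}). \tag{18.13d}$$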
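As a consistency check on Eqs. (18.13b)–(18.13d), one can compare the closed-form Taylor coefficients with finite-difference derivatives of the phase function of Eq. (18.12). The sketch below does this for a grating pair with $p = 1.0$ μm, $\theta = 60°$, $d = 10$ mm at $f_0 = 3.75 \times 10^{14}$ Hz. The function names, the rounded value of $c$, the finite-difference step, and the choice $m = -1$ (the first order for which $|\sin\theta_{m0}| < 1$ with these parameters) are assumptions of this illustration.

```python
import math

C = 3.0e8  # m/s; rounded so that f0 = 3.75e14 Hz corresponds to lambda0 = 0.8 um

def grating_phase(f, d, p, theta, m):
    """Phase of Eq. (18.12) acquired between the gratings, as a function of f."""
    lam = C / f
    s = math.sin(theta) + m * lam / p  # sine of the m-th order diffraction angle
    return (2.0 * math.pi * d / lam) * math.sqrt(1.0 - s * s)

def taylor_coefficients(f0, d, p, theta, m):
    """Phi0, Phi1, Phi2 of Eqs. (18.13b)-(18.13d)."""
    lam0 = C / f0
    sin_m0 = math.sin(theta) + m * lam0 / p
    cos_m0 = math.sqrt(1.0 - sin_m0**2)
    phi0 = 2.0 * math.pi * (d / lam0) * cos_m0
    phi1 = 2.0 * math.pi * (d / C) * (1.0 - math.sin(theta) * sin_m0) / cos_m0
    phi2 = -math.pi * d * m**2 * lam0**3 / (C**2 * p**2 * cos_m0**3)
    return phi0, phi1, phi2

f0, d, p, theta, m = 3.75e14, 10e-3, 1.0e-6, math.radians(60.0), -1
phi0, phi1, phi2 = taylor_coefficients(f0, d, p, theta, m)

h = 1.0e12  # Hz; finite-difference step (assumed)
plus = grating_phase(f0 + h, d, p, theta, m)
mid = grating_phase(f0, d, p, theta, m)
minus = grating_phase(f0 - h, d, p, theta, m)
fd1 = (plus - minus) / (2.0 * h)          # approximates d(phi)/df
fd2 = (plus - 2.0 * mid + minus) / h**2   # approximates d2(phi)/df2

print(f"Phi1: closed form {phi1:.4e} s,   finite difference {fd1:.4e} s")
print(f"Phi2: closed form {phi2:.4e} s^2, finite difference {0.5 * fd2:.4e} s^2")
```

Note that $\Phi_2$ is half the second derivative of $\phi(f)$, which is why the second difference is multiplied by 0.5 before the comparison.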

As a typical example, Figure 18.6 shows a plot of the phase function $\phi(f)$ of Eq. (18.12), with the constant and linear terms of Eq. (18.13a) removed. The horizontal axis is centered at $f_0 = 3.75 \times 10^{14}$ Hz ($\lambda_0 = 0.8$ μm), the grating period is $p = 1.0$ μm, the assumed incidence angle is $\theta = 60°$, the separation between the gratings is $d = 10$ mm, and the diffraction order under consideration is $m = -1$. The numerical value of $\Phi_2$, $-1.8 \times 10^{-25}$ s$^2$, provides a good match to the actual curvature of the function plotted in Figure 18.6. This quadratic phase-factor is a linear function of the separation $d$ of the two gratings, and also a strong function of the grating period $p$. Note that the quadratic phase coefficient $\Phi_2$ of Eq. (18.13d) is always negative. Losses due to diffraction orders other than the $m$th order used, as well as polarization dependence of the diffraction efficiency from gratings, can be a disadvantage. Since the various frequencies are shifted laterally upon emerging from the second grating, to the extent that this lateral shift cannot be ignored, one must either employ a second, identical pair of gratings, or return the beam through the same pair, in order to compensate for this lateral spectral shift. In the end, chirp-compensation with a grating pair works well for femtosecond and even a few-picosecond-long pulses,

Figure 18.7 Diagram of a Gires–Tournois resonator. The (collimated) incident beam is fully reflected because both mirrors are lossless and, moreover, the rear mirror is 100% reflective. The spectral shape of the pulse is preserved by virtue of the fact that the device's reflection coefficient $r$ has a magnitude of unity at all frequencies. The phase $\phi(f)$ of $r$ depends on the reflectivity $\rho$ of the front mirror, on the separation $d$ between the mirrors, and on the phase angles $\psi_1$ and $\psi_2$ of the individual mirror reflectivities. (Labels in the diagram: incident beam; dielectric mirror 1, $\rho_1 = \rho\exp(i\psi_1)$; dielectric mirror 2, $\rho_2 = \exp(i\psi_2)$; mirror separation $d$; overall reflection coefficient $r = \exp(i\phi)$.)

but increasing the pulse duration to the sub-nanosecond regime and beyond imposes unrealistic demands on the grating period $p$ and grating separation $d$, which renders impractical this method of chirp-compensation for long pulses.

A third method of chirp-compensation is based on resonant structures, such as Fabry–Pérot etalons. Figure 18.7 is a diagram of a special resonator (the Gires–Tournois interferometer), which is particularly useful for low-level chirp cancellation. For simplicity, let us assume that the front mirror has amplitude reflectivity $\rho_1 = \rho\exp(i\psi_1)$ and transmissivity $\tau_1$ from both sides (i.e., symmetric mirror), and that the second mirror is 100% reflective, that is, $\rho_2 = \exp(i\psi_2)$. The mirrors being lossless, we have $|\rho_1|^2 + |\tau_1|^2 = 1$; also, generally, the phase difference between $\rho_1$ and $\tau_1$ is 90°; therefore, $\tau_1 = i\sqrt{1 - \rho^2}\,\exp(i\psi_1)$. Assuming the separation between the two mirrors is $d$, the amplitude reflectivity of the GT etalon will be

$$r = |r|\exp(i\phi) = \frac{\rho - \exp\{i[\psi_1 + \psi_2 + (4\pi d/\lambda)]\}}{1 - \rho\exp\{i[\psi_1 + \psi_2 + (4\pi d/\lambda)]\}}\,\exp(i\psi_1). \tag{18.14}$$

Clearly, $|r| = 1$ at all wavelengths, and

$$\tan(\phi - \psi_1) = \frac{(\rho^2 - 1)\sin[\psi_1 + \psi_2 + (4\pi d/\lambda)]}{2\rho - (\rho^2 + 1)\cos[\psi_1 + \psi_2 + (4\pi d/\lambda)]}. \tag{18.15}$$


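The above phase can be expanded in a Taylor series around the center frequency $f_0$, as follows:

$$\phi(f) = \Phi_0 + \Phi_1 (f - f_0) + \Phi_2 (f - f_0)^2 + \cdots. \tag{18.16a}$$

Ignoring the frequency dependence of $\rho$, $\psi_1$, and $\psi_2$, we find

$$\Phi_1 = (4\pi d/c)(1 - \rho^2)\Big/\left\{1 + \rho^2 - 2\rho\cos[\psi_1 + \psi_2 + (4\pi d/\lambda_0)]\right\}, \tag{18.16b}$$

$$\Phi_2 = (4\pi d/c)^2\,\frac{\rho(\rho^2 - 1)\sin[\psi_1 + \psi_2 + (4\pi d/\lambda_0)]}{\left\{1 + \rho^2 - 2\rho\cos[\psi_1 + \psi_2 + (4\pi d/\lambda_0)]\right\}^2}. \tag{18.16c}$$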
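Equations (18.14), (18.16b), and (18.16c) are straightforward to evaluate numerically. The following sketch computes the GT reflection coefficient and the two Taylor coefficients for $\rho = 0.9$, $\psi_1 = \psi_2 = 0$, $d = 14.005$ μm at $f_0 = 3.75 \times 10^{14}$ Hz, which are the parameter values used in this chapter's GT example. The function names and the rounded value of $c$ (chosen so that $\lambda_0 = 0.8$ μm exactly) are assumptions of this illustration.

```python
import cmath
import math

C = 3.0e8  # m/s; rounded so that f0 = 3.75e14 Hz corresponds to lambda0 = 0.8 um

def gt_reflection(f, d, rho, psi1=0.0, psi2=0.0):
    """Amplitude reflectivity r of the Gires-Tournois etalon, Eq. (18.14)."""
    psi = psi1 + psi2 + 4.0 * math.pi * d * f / C  # total retardation
    e = cmath.exp(1j * psi)
    return (rho - e) / (1.0 - rho * e) * cmath.exp(1j * psi1)

def gt_taylor(f0, d, rho, psi1=0.0, psi2=0.0):
    """Phi1 and Phi2 of Eqs. (18.16b) and (18.16c)."""
    psi = psi1 + psi2 + 4.0 * math.pi * d * f0 / C
    k = 4.0 * math.pi * d / C
    denom = 1.0 + rho**2 - 2.0 * rho * math.cos(psi)
    phi1 = k * (1.0 - rho**2) / denom
    phi2 = k**2 * rho * (rho**2 - 1.0) * math.sin(psi) / denom**2
    return phi1, phi2

f0, d, rho = 3.75e14, 14.005e-6, 0.9
r0 = gt_reflection(f0, d, rho)
phi1, phi2 = gt_taylor(f0, d, rho)

print(f"|r|  = {abs(r0):.12f}")
print(f"Phi1 = {phi1:.3e} s")
print(f"Phi2 = {phi2:.3e} s^2")
```

The magnitude of $r$ should come out equal to unity to within rounding error, while $\Phi_1$ and $\Phi_2$ can be compared with finite-difference derivatives of the phase of $r$ if an independent check is desired.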

A typical behavior of the GT phase-function $\phi(f)$ for the special case of $\rho = 0.9$ is shown in Figure 18.8. The dependence of $\phi$ on the total retardation $\psi = \psi_1 + \psi_2 + (4\pi d/\lambda)$, shown in Figure 18.8(a), reveals that $\phi$ rises rapidly from 0 to $2\pi$ in the vicinity of resonance, which occurs at $\psi = 0$. Figure 18.8(b) shows the dependence of $\phi'(f) = d\phi/df$ on $\psi$. As can be seen from Eq. (18.16b), the maximum value of $\phi'$, namely, $(4\pi d/c)(1 + \rho)/(1 - \rho)$, occurs on resonance, at $\psi = 0$. Therefore, for the (chirped) incident pulse to experience, upon reflection, the full range of the available phase of the GT resonator, the pulse's spectral width should be $\Delta f \approx (c/4d)(1 - \rho)/(1 + \rho)$. Assuming $\Delta f \approx 3.0 \times 10^{11}$ Hz (corresponding to pulses in the few-picosecond range), a good choice for the separation distance of the GT mirrors would be $d \approx 14$ μm. With reference to Figure 18.8(c), which is a plot of $\phi''(f) = d^2\phi/df^2$ versus $\psi$, it is clear that a slight increase of the mirror separation $d$ (by only 3.8 nm in the present example) will shift the center of the incident spectrum ($f_0 = 3.75 \times 10^{14}$ Hz) to the vicinity of the negative peak of $\phi''(f)$, where a large negative quadratic phase factor is available for chirp cancellation.

Figure 18.9 is a plot of $\phi(f)$ in the vicinity of the center frequency $f_0$, with the first two terms of the Taylor series subtracted. The assumed parameters are $\rho = 0.9$, $\psi_1 = \psi_2 = 0$, $d = 14.005$ μm, and the computed Taylor series coefficients are $\Phi_0 = 1.28$, $\Phi_1 = 7.2 \times 10^{-12}$, $\Phi_2 = -1.9 \times 10^{-23}$. It is easy to verify that the quadratic function $\Phi_2 (f - f_0)^2$ provides a fairly good match to the actual phase depicted in Figure 18.9.

In general, the magnitude of the quadratic phase available from a GT resonator is rather small, thus limiting the applicability of this type of device to situations that involve small compression ratios only. To see this, note in Eq. (18.11) that the compression ratio $R_c = \sqrt{\alpha_1/\beta_1}$ is equal to $\sqrt{1 + (\alpha_1/\alpha_3)^2}$; also, $\beta_2/\beta_1 = \alpha_1/\alpha_3$; therefore, $\beta_2/\beta_1 = \sqrt{R_c^2 - 1}$. Defining the bandwidth $\Delta f$ as the FWHM

Figure 18.8 Characteristics of a GT resonator having $\rho = 0.9$. (a) Phase $\phi$ (in radians) plotted versus the total retardation angle $\psi = \psi_1 + \psi_2 + (4\pi d/\lambda)$. (b) Plot of $d\phi/df$ versus $\psi$; the vertical axis is in units of $4\pi d/c$. (c) Plot of $d^2\phi/df^2$ versus $\psi$; the vertical axis is in units of $(4\pi d/c)^2$. (Horizontal axes: $\psi_1 + \psi_2 + 4\pi d/\lambda$ in degrees.)

of $|A(f)|^2$, the phase of $A(f)$ in Eq. (18.11) varies by about $0.35\,\beta_2/\beta_1$ radians between $f_0$ and $f_0 \pm \tfrac{1}{2}\Delta f$. In practice, the limited amount of quadratic phase available from a GT resonator means that this device can handle only small compression ratios ($R_c \approx 2$–$3$). Operating away from the peak of $\phi''(f)$ could help provide a slight increase in the range of the quadratic phase at the expense of introducing third- and higher-order phase factors into the optical spectrum. It is also possible to

Figure 18.9 Plot of the function $\phi(f) - \Phi_0 - \Phi_1(f - f_0)$ (in radians) in the vicinity of $f_0 = 3.75 \times 10^{14}$ Hz. The GT parameters are $\rho = 0.9$, $\psi_1 = \psi_2 = 0$, $d = 14.005$ μm, while the computed Taylor series coefficients are $\Phi_0 = 1.28$, $\Phi_1 = 7.2 \times 10^{-12}$, $\Phi_2 = -1.9 \times 10^{-23}$. The quadratic function $\Phi_2(f - f_0)^2$ provides a fairly good match to the actual function depicted here. (Horizontal axis: frequency, from $3.748 \times 10^{14}$ Hz to $3.752 \times 10^{14}$ Hz.)

design the GT mirrors such that $\rho$, $\psi_1$, and/or $\psi_2$ exhibit strong dependences on $f$ within the relevant spectral range. Inserting a transparent dielectric slab (or thin film layer) between the mirrors is another degree of freedom that can be (and has been) exploited for the purpose of improving the performance of GT compressors.⁹

Concluding remarks

In this chapter we have attempted to provide an explanation of the fundamental principles of optical pulse compression. We stayed away from the advanced topics, and steered clear of some of the technical difficulties as well as the ingenious methods that have been developed to overcome them. In practice, one must contend with a host of technical problems in order to reliably and efficiently produce high-quality compressed pulses. The nonlinear medium, which imparts the all-important phase modulation to the initial pulse, may introduce significant dispersion of its own. This results in a distorted pulse and, often, it is the mechanism that limits the amount of useful chirp that can be placed on the pulse. In addition, the third- and higher-order terms introduced into the spectral phase profile, either within the nonlinear medium or as a consequence of passage through the chirp compensator, must be identified and corrected, perhaps by sending the pulse through additional (high-order) compensators. Finally, the profile of the compressed pulse must be measured to determine the degree of


compression, and to find out whether the pulse is free from distortions and other imperfections. The interested reader may consult the vast literature of the subject for further details.

Appendix

Slab waveguide and the effective refractive index of guided modes

Consider the slab waveguide depicted in Figure 18.10. The guiding layer has thickness $d$ and refractive index $n_g$. The substrate and the cladding layer, having refractive indices $n_s$ and $n_c$, respectively, may be assumed to be infinitely thick. Within the guiding layer a pair of plane-waves propagate at an angle $\theta$ relative to the surface normal; $\theta$ is greater than the critical angle of total internal reflection at both interfaces, that is, $n_g\sin\theta > \max(n_s, n_c)$. The two plane-waves thus have the following complex amplitudes:

$$E^{\pm}(x, z) = |E_0|\exp(\pm i\phi_0)\exp\left[i(2\pi n_g/\lambda_0)(\pm x\cos\theta + z\sin\theta)\right]. \tag{A18.1}$$

Figure 18.10 Slab waveguide consisting of a guiding layer of thickness $d$ and refractive index $n_g$, sandwiched between a substrate of index $n_s$ and a cladding layer of index $n_c$. Within the guiding layer, a pair of plane-waves propagate at an angle $\theta$ relative to the surface normal. (The diagram shows the $x$ and $z$ axes and marks the evanescent fields in the cladding and in the substrate.)


Here the plus sign refers to the up-going beam, the minus sign to the down-going beam, $\phi_0$ defines the relative phase between the two plane-waves, and $\lambda_0 = c/f_0$ is the vacuum wavelength. At the interface with the cladding, where $x = d/2$, the down-going beam must have the same amplitude as the up-going beam, but its phase must be incremented by the phase of the Fresnel reflection coefficient at this interface. The Fresnel coefficient, depending on whether the beam is s- or p-polarized, is $r_p = \exp(i\phi_p)$ or $r_s = \exp(i\phi_s)$, where

$$\phi_p^{(\text{clad})} = +2\tan^{-1}\left[(n_c^2\cos\theta)\Big/\left(n_g\sqrt{n_g^2\sin^2\theta - n_c^2}\right)\right], \tag{A18.2}$$

$$\phi_s^{(\text{clad})} = -2\tan^{-1}\left[\sqrt{n_g^2\sin^2\theta - n_c^2}\Big/(n_g\cos\theta)\right]. \tag{A18.3}$$

Therefore, at the cladding interface, one must have

$$\phi_0 + (2\pi n_g/\lambda_0)\left(\tfrac{1}{2}d\cos\theta + z\sin\theta\right) + \phi_{p,s}^{(\text{clad})} = -\phi_0 + (2\pi n_g/\lambda_0)\left(-\tfrac{1}{2}d\cos\theta + z\sin\theta\right), \tag{A18.4}$$

which leads to

$$2\phi_0 + 2\pi n_g (d/\lambda_0)\cos\theta + \phi_{p,s}^{(\text{clad})} = 0. \tag{A18.5}$$

A similar relation must hold at the substrate interface ($x = -d/2$), where the down-going beam is incident and the up-going beam is reflected. Therefore,

$$-2\phi_0 + 2\pi n_g (d/\lambda_0)\cos\theta + \phi_{p,s}^{(\text{sub})} = 0. \tag{A18.6}$$

Equations (A18.5) and (A18.6) can be satisfied simultaneously if and only if an integer $m$ exists such that

$$4\pi n_g (d/\lambda_0)\cos\theta + \phi_{p,s}^{(\text{clad})} + \phi_{p,s}^{(\text{sub})} = 2m\pi. \tag{A18.7}$$
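Equation (A18.7) generally has to be solved numerically for $\theta$. The sketch below does this for s-polarized light by bisection between the critical angle and 90°, using the phase of Eq. (A18.3) for both the cladding and the substrate (with $n_c$ replaced by $n_s$ for the latter). The index values, the thickness $d = \lambda_0$, and all function names are hypothetical choices made for illustration only; bisection is used here because the left-hand side of Eq. (A18.7) decreases monotonically with $\theta$ for s-light.

```python
import math

def fresnel_phase_s(theta, ng, n_out):
    """Phase of r_s under total internal reflection, as in Eq. (A18.3)."""
    b = math.sqrt(max(ng**2 * math.sin(theta)**2 - n_out**2, 0.0))
    return -2.0 * math.atan2(b, ng * math.cos(theta))

def mode_mismatch(theta, m, ng, ns, nc, d_over_lambda):
    """Left-hand side of Eq. (A18.7) minus 2*m*pi, for s-polarized light."""
    return (4.0 * math.pi * ng * d_over_lambda * math.cos(theta)
            + fresnel_phase_s(theta, ng, nc)
            + fresnel_phase_s(theta, ng, ns)
            - 2.0 * m * math.pi)

def solve_mode(m, ng, ns, nc, d_over_lambda, tol=1e-12):
    """Return the mode angle theta_m (radians) for m >= 0, or None if no mode exists."""
    lo = math.asin(max(ns, nc) / ng)  # critical angle of the denser outer medium
    hi = math.pi / 2.0
    if mode_mismatch(lo, m, ng, ns, nc, d_over_lambda) <= 0.0:
        return None  # the mismatch only decreases with theta, so no root exists
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mode_mismatch(mid, m, ng, ns, nc, d_over_lambda) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical guide: ng = 1.5 on a substrate ns = 1.45, air cladding, d = lambda0.
ng, ns, nc, d_over_lambda = 1.5, 1.45, 1.0, 1.0
theta0 = solve_mode(0, ng, ns, nc, d_over_lambda)
theta1 = solve_mode(1, ng, ns, nc, d_over_lambda)
print(f"m = 0: theta = {math.degrees(theta0):.3f} deg, ng*sin(theta) = {ng * math.sin(theta0):.5f}")
print("m = 1:", "no guided mode" if theta1 is None else f"{math.degrees(theta1):.3f} deg")
```

For these assumed parameters only the $m = 0$ solution exists for s-light, so the guide would be single-mode for that polarization; increasing `d_over_lambda` eventually admits the $m = 1$ solution as well.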

If the guiding layer's thickness $d$ is sufficiently small, Eq. (A18.7) will have only one solution (i.e., one acceptable value of $\theta$) for s-light, and perhaps another solution for p-light. The guide is then said to be single-mode. Larger values of $d$ lead to more solutions, which correspond to higher-order modes. (Note: $\theta = 90°$ is always an acceptable solution; however, $\phi_0$ in this case turns out to be 0° for p-light and 90° for s-light. Both of these solutions result in the up-going and down-going plane-waves coming into alignment with equal but opposite amplitudes, thereby canceling each other out. The solution corresponding to $\theta = 90°$,


therefore, does not lead to a viable mode.) For a viable mode, denoting the solution of Eq. (A18.7) by $\theta_m$, and with reference to Eq. (A18.1), the E-field amplitude within the guiding layer will be

$$E(x, z) = E^{+}(x, z) + E^{-}(x, z) = 2|E_0|\cos\left[(2\pi n_g\cos\theta_m/\lambda_0)x + \phi_0\right]\exp\left[i(2\pi n_g\sin\theta_m/\lambda_0)z\right]. \tag{A18.8}$$

The cross-sectional profile of the mode along the x-axis is thus determined by the cosine function on the right-hand side of Eq. (A18.8) (and also by the evanescent fields within the cladding and the substrate). The exponential term in Eq. (A18.8) is the propagation phase-factor, from which one can identify an effective refractive index $n_{\text{eff}} = n_g\sin\theta_m$ for the given mode. Considering that, in general, both $n_g$ and the solution $\theta_m$ of Eq. (A18.7) are functions of the frequency $f$, the dispersive properties of the waveguide are seen to arise from the frequency dependence of $n_{\text{eff}}$.

References for Chapter 18

1 F. Gires and P. Tournois, Compt. Rend. 258, 6112 (1964).
2 J. A. Giordmaine, M. A. Duguay, and J. W. Hansen, Quantum Electron. 4, 252 (1968).
3 R. A. Fisher, P. L. Kelley, and T. K. Gustafson, Sub-picosecond pulse generation using the optical Kerr effect, Appl. Phys. Lett. 14, 140 (1969).
4 C. V. Shank, R. L. Fork, R. Yen, R. H. Stolen, and W. J. Tomlinson, Compression of femtosecond optical pulses, Appl. Phys. Lett. 40, 761 (1982).
5 W. J. Tomlinson, R. H. Stolen, and C. V. Shank, Compression of optical pulses chirped by self-phase modulation in fibers, J. Opt. Soc. Am. B 1, 139 (1984).
6 J. Biegert and J.-C. Diels, Compression of pulses of a few optical cycles through harmonic generation, J. Opt. Soc. Am. B 18, 1218 (2001).
7 J. Moses and F. K. Wise, Soliton compression in quadratic media: high-energy few-cycle pulses with a frequency-doubling crystal, Opt. Lett. 31, 1881 (2006).
8 W. Rudolph and B. Wilhelmi, Light Pulse Compression, Harwood Academic, London, 1989.
9 J.-C. Diels and W. Rudolph, Ultrashort Laser Pulse Phenomena, Academic Press, New York, 1996.

19 The uncertainty principle in classical optics

In the classical electromagnetic theory the wave-vector $\mathbf{k} = (2\pi/\lambda)\boldsymbol{\sigma}$ underlies the Fourier space of propagating (or radiative) fields. The k-vector combines into a single entity the wavelength $\lambda$ and the unit vector $\boldsymbol{\sigma}$ that signifies the beam's propagation direction. The Fourier transform relation between the three-dimensional space of everyday experience and the space of the wave-vectors (the so-called k-space) gives rise to relationships between the two domains analogous to Heisenberg's uncertainty relations. Considering that in quantum theory the electromagnetic k-vector is proportional to the photon's momentum¹ ($\mathbf{p} = \hbar\mathbf{k}$, where $\hbar = h/2\pi$, $h$ being the Planck constant), one should not be surprised to find relationships between dimensions of a beam in the XYZ-space and its momentum spread in the k-space. Such relationships impose fundamental limits on the ability of measurement systems to determine the various properties of electromagnetic fields.

In this chapter we address two problems that have widespread applications in optical metrology, spectroscopy, telecommunications, etc., and discuss the constraints imposed by the uncertainty principle on these problems. The first topic of discussion is the separation of two overlapping beams of identical wavelength having slightly different propagation directions. This will be followed by an analysis of the limits of separating co-propagating beams having slightly different wavelengths.

Angular separation and the limit of resolvability

Figure 19.1 shows an aperture of diameter $D$, which transmits two plane waves of the same wavelength $\lambda$ propagating in slightly different directions. Denoting the angular separation between the beams by $\Delta\theta$, we find that the projections of the two k-vectors along the X-axis differ by $\Delta k_x \approx (2\pi/\lambda)\Delta\theta$. In geometrical optics, rays propagate along straight lines and, therefore, the two beams must separate from each other after a certain propagation distance. In wave optics, however, the

Figure 19.1 Two beams of the same wavelength $\lambda$, propagating in slightly different directions, pass through an aperture of diameter $D$. The angle between the two k-vectors is $\Delta\theta$, giving rise to $\Delta k_x \approx (2\pi/\lambda)\Delta\theta$. The beams separate from each other at the observation plane located a distance $z$ from the aperture, provided the uncertainty relation $D\,\Delta k_x \gtrsim 2\pi$ is satisfied.

beams expand as they propagate along Z and, although their centers drift apart, there is the distinct possibility that they will never be completely separated. Roughly speaking, we expect the beams to remain more or less collimated between $z = 0$ and $z = D^2/\lambda$, the Rayleigh range² for a beam of diameter $D$ and wavelength $\lambda$. If at the Rayleigh range the distance between the beam centers is greater than $D$, the beams should be separable; otherwise their drifting apart will go hand in hand with their expansion, and the beams remain entangled as they propagate beyond the Rayleigh range. The necessary condition for separability is thus $(D^2/\lambda)\Delta\theta > D$, or equivalently,

$$D\,\Delta k_x > 2\pi. \tag{19.1}$$

The lower bound $2\pi$ on the product of $D$ and $\Delta k_x$ appearing in Ineq. (19.1) is not exact, but depends on the definition of beam diameter $D$ and the adopted criterion for separability, which are typically imprecise. For all practical purposes, the number appearing on the right-hand side of Ineq. (19.1) should be on the order of unity, say, greater than 1 but less than 10. Invoking the quantum nature of light, if the aperture diameter $D$ is interpreted as a measure of the uncertainty $\Delta x$ about the photon position along X, while $\Delta k_x$ is related (through the relation $\mathbf{p} = \hbar\mathbf{k}$) to the linear momentum uncertainty $\Delta p_x$ along the same axis, then Ineq. (19.1) is equivalent to Heisenberg's uncertainty relation $\Delta x\,\Delta p_x > h$.

Figure 19.2 Plots of intensity (left) and phase (right) at the entrance aperture of the system of Figure 19.1. Two uniform beams, one propagating with a slight tilt toward the upper right, another with a slight tilt toward the lower left, enter a $D = 500\lambda$ aperture. The angular separation of the beams is $\Delta\theta = 0.23°$. The individual beams are shown in the top (a, b) and the middle (c, d) rows; their superposition appears at the bottom (e, f). (Horizontal axes: $x/\lambda$, from −500 to 500.)

Figure 19.2 shows the intensity and phase profiles of two plane waves as well as those of their superposition at the aperture depicted in Figure 19.1 (diameter $D = 500\lambda$). The phase distributions in Figures 19.2(b) and 19.2(d) indicate that one of the beams is slightly tilted towards the upper-right corner of the XY-plane, while the other is tilted by an equal amount towards the lower-left corner. The angular separation between these beams is $\Delta\theta = 0.23° = 0.004$ radians.
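The separability condition of Ineq. (19.1) is easy to evaluate for these numbers. The short sketch below computes $D\,\Delta k_x$, the Rayleigh range $D^2/\lambda$, and the center-to-center drift of the two beams at that distance, with all lengths expressed in units of $\lambda$. The function names are assumptions of this illustration, and the criterion itself is only the order-of-magnitude estimate described in the text.

```python
import math

def separation_product(d_over_lambda, dtheta):
    """D * dk_x, using dk_x ~ (2*pi/lambda) * dtheta; the result is dimensionless."""
    return d_over_lambda * 2.0 * math.pi * dtheta

def rayleigh_range(d_over_lambda):
    """Rayleigh range D^2/lambda, in units of lambda."""
    return d_over_lambda**2

d_over_lambda = 500.0  # aperture diameter, in wavelengths
dtheta = 0.004         # angular separation, in radians

product = separation_product(d_over_lambda, dtheta)
z_r = rayleigh_range(d_over_lambda)
drift = z_r * dtheta   # center-to-center distance at the Rayleigh range

print(f"dtheta        = {math.degrees(dtheta):.3f} deg")
print(f"D * dk_x      = {product / math.pi:.2f} * pi")
print(f"Rayleigh range = {z_r:.0f} lambda")
print(f"drift at z_R   = {drift:.0f} lambda (aperture: {d_over_lambda:.0f} lambda)")
print("separable by Ineq. (19.1):", product > 2.0 * math.pi)
```

Because the drift at the Rayleigh range exceeds the aperture diameter, the two beams of this example are expected to separate on propagation.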


The combined beam's intensity distribution in Figure 19.2(e) reveals the angular separation of the two superimposed beams through a tell-tale fringe pattern. When the composite beam (whose intensity and phase distributions are shown in Figures 19.2(e, f)) is propagated along the Z-axis, one obtains at various distances from the aperture the intensity patterns displayed in Figure 19.3. It is seen in these pictures that the two constituent beams continue to overlap at first, giving rise to interesting interference patterns. After a sufficient propagation distance, however, the beams separate and go their own ways. The assumed value of $D\,\Delta k_x$ in this example is $4\pi$, which satisfies Ineq. (19.1).

Separating two beams by means of a lens

In the preceding section it was demonstrated that separating two beams of a certain angular distance $\Delta\theta$ requires a minimum beam diameter $D$ in accordance with Ineq. (19.1). It may be asked whether a similar limitation exists on the propagation distance $z$ before the individual beams can be resolved. Apparently no physical law limits the required distance $z$, although practical considerations seem to impose certain constraints. In free space, the required propagation distance is typically less than or equal to the Rayleigh range, $D^2/\lambda$, but one can substantially reduce this distance by employing a lens, as shown in Figure 19.4. Here two overlapping beams of diameter $D$ and angular separation $\Delta\theta$ are resolved after going through an aberration-free lens. In the focal plane of the lens the center-to-center spacing of the focused spots is $f\Delta\theta$, which must be greater than the Airy disk³ radius of $0.6\lambda/\text{NA} = 1.2 f\lambda/D$. Note that the resolvability criterion is independent of $f$ and NA, requiring only that $D(\Delta\theta/\lambda) > 1.2$, which is a statement of the uncertainty principle in the present context. The required propagation distance $f$ in this example can be much less than that needed in the case of free-space propagation of Figure 19.1. It must be emphasized that the uncertainty principle does not impose any constraints on $z$, the requirement for resolvability being only a restriction on the product of $D$ and $\Delta\theta$.

An interesting feature of separating two beams by means of a lens is the resulting dependence of the focused spots on the state of polarization. To reduce the required propagation distance $z$, one may use a high-NA lens, thus enhancing the polarization effects. The shortest focal length $f$ is obtained when the NA of the lens is close to unity, that is, $f \approx D/2$. Figure 19.5 shows computed plots of intensity distribution at the focal plane of an NA = 0.99, $f = 250\lambda$ lens, when the incident beam is the two-beam superposition depicted in Figure 19.2. The three columns of Figure 19.5 represent three different polarization states. In (a) both incident beams are linearly polarized along X, which explains the elongation of the spots in this particular direction. In (b) the two beams are linearly polarized

[Figure 19.3: fifteen intensity frames (a)–(o); horizontal axis x/λ, from −500 to 500.]

Figure 19.3 Two overlapping plane waves depicted in Figure 19.2 propagate along the Z-axis. The various intensity patterns in frames (a) to (o) are obtained at z/(10³λ) = 1, 2, 3, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 125, and 150, respectively. Initially the beams strongly interfere with each other, but as propagation proceeds, they separate and exhibit their individual identities.

[Figure 19.4: schematic showing the beam of diameter D, the lens, the focal plane at distance f, and the X and Z axes.]

Figure 19.4 Two identical beams of diameter D and angular separation Δθ may be isolated after going through an aberration-free lens. In the focal plane, the distance between the focused spots is fΔθ, which must be greater than the Airy disk radius of 1.2fλ/D if the individual spots are to be resolved.

at 45° to the X- and Y-axes, i.e., the direction along which the spots are separated from each other. The plot in (c) corresponds to the case when both beams are circularly polarized. Frames (d)–(f) are the logarithmic versions of those in (a)–(c), showing their detailed structure by emphasizing the weaker regions. Since the assumed values of D = 500λ and Δθ = 0.004 rad satisfy the uncertainty relation in Ineq. (19.1), the focused spots are seen to be resolved irrespective of their polarization state.

Angular discrimination by means of a Fabry–Pérot etalon

Another device that can, in principle, accomplish the separation of two beams via angular discrimination is a Fabry–Pérot etalon,³,⁴ such as that shown in Figure 19.6. This particular etalon is tuned to transmit a plane wave of λ = 633 nm at the incidence angle of θ = 45°. Figure 19.7 shows the etalon’s computed reflection and transmission coefficients, r_s = |r_s| exp(iφ_rs) and t_s = |t_s| exp(iφ_ts), versus θ for an s-polarized plane wave of λ = 633 nm. It turns out that the shapes of the transfer functions |r_s(θ)| and |t_s(θ)| are not quite suitable for complete separation of two finite-diameter beams of differing propagation directions. Computed plots of intensity distribution in Figure 19.8 confirm that the etalon of Figure 19.6 can only partially separate two beams of diameter D = 2 × 10⁴λ and angular separation Δθ = 0.115° = 0.002 rad, even though the value

[Figure 19.5: six intensity frames, (a)–(c) in the top row and (d)–(f) in the bottom row; horizontal axes x/λ.]

Figure 19.5 Total electric field intensity distribution (|E|² = |E_x|² + |E_y|² + |E_z|²) at the focal plane of a 0.99NA lens. (Rainbow colors: red = maximum, blue = minimum.) The beam at the entrance pupil is the superposition of two D = 500λ beams of angular separation Δθ = 0.23°, as shown in Figure 19.2. In (a) the assumed polarization state of both incident beams is linear along the X-axis. In (b) the two beams are linearly polarized at 45° to the X-axis, i.e., along the direction of separation of the spots. In (c) one of the beams is right-circularly polarized, while the other is left-circularly polarized. Frames (d)–(f) in the bottom row are the logarithmic versions of frames (a)–(c) in the top row. Like an over-exposed photographic plate, a logarithmic plot reveals weak regions of an intensity distribution.

of D(Δθ/λ) = 40 in this case amply satisfies Ineq. (19.1). Figure 19.8(a) shows the incident pattern of intensity distribution of the superposed beams upon arriving at the etalon. One of these beams propagates along the direction that makes a 45° angle with the etalon’s surface normal, while the other deviates from this direction by Δθ = 0.115°. The reflected intensity profile depicted in Figure 19.8(b) contains mostly the latter beam, plus a small fraction of the former. This is due to the imperfect transfer function of the etalon, which cannot fully transmit the angular spectrum of the 45° beam, nor can it fully reflect the spectrum of the 45.115° beam. Either beam’s angular spectrum has a width of λ/D ≈ 0.003°, which would readily pass through a narrow rectangular transfer function, but is

[Figure 19.6: schematic of the etalon, showing the incident, reflected, and transmitted s-polarized beams (E_s), the two substrates with their dielectric stacks, and the air gap of width d_air.]

Figure 19.6 Fabry–Pérot etalon designed for operation at λ = 633 nm, θ = 45°. Dielectric mirrors each contain six pairs of high/low-index layers (n₁ = 2.0, d₁ = 84.6 nm; n₂ = 1.5, d₂ = 119.6 nm). Both mirror substrates are glass (n_sub = 1.5), and the medium separating the mirrors is air (d_air = 55.95 μm). The incidence angle on the etalon is in the vicinity of θ = 45°; within the substrate, however, the angle of incidence on the stack is close to θ′ = 28.1255° (sin θ = n_sub sin θ′). The etalon can separate two beams of identical λ arriving through an aperture of diameter D, but differing in propagation direction, namely, θ₁ = 45°, θ₂ = 45° + Δθ. One beam is reflected by the etalon while the other is transmitted. Only s-polarized light is considered here, although p-polarized beams exhibit similar behavior.

partially blocked by the sharply peaked transfer functions of the etalon (see Figure 19.7(a)). The same arguments apply to the transmitted intensity distribution shown in Figure 19.8(c) which, although primarily composed of the 45° incident beam, still contains a fraction of the 45.115° beam. To summarize the results of this and the preceding sections, there are several ways of separating two overlapping beams of the same wavelength and differing propagation directions. Some of these methods may be more effective than others, but none could violate the uncertainty relation given by Ineq. (19.1). Moreover, Ineq. (19.1) remains valid even if the beams are observed within a transparent medium of refractive index n > 1. For instance, in Figure 19.1 if the region to the right of the aperture happens to be filled with such a medium, the angular separation Δθ of the beams shrinks by a factor n upon entering the medium, but the length of the k-vector increases by the same factor, thus preserving the magnitude of Δk_x. Similarly, in Figure 19.4 if the index of the medium on the right-hand side of the lens happens to be n, the focused spot diameters will be n times smaller, but their center-to-center spacing will also be reduced by the same factor, resulting once again in the preservation of Ineq. (19.1).

[Figure 19.7: frame (a) plots the amplitudes |r_s| and |t_s|, frame (b) plots the phases φ_rs and φ_ts (in degrees), both versus the angle of incidence from 44.8° to 45.2°.]

Figure 19.7 Computed reflection and transmission coefficients versus the incidence angle θ for the etalon of Figure 19.6 at λ = 633 nm for an s-polarized plane wave. The magnitude and phase of the reflection and transmission coefficients are defined through the relations r_s = |r_s| exp(iφ_rs) and t_s = |t_s| exp(iφ_ts). At λ = 633 nm the stack is tuned to fully transmit at θ = 45°. A small deviation from 45° incidence causes a sharp drop in |t_s| and a corresponding rise in |r_s|.

Co-propagating beams of differing wavelengths

A problem of general interest in spectroscopy is that of separating two beams of slightly different wavelengths, λ₁ and λ₂, propagating in the same direction. In this case the beam diameter D turns out to be irrelevant, but the available propagation distance z is critical for isolating the individual beams.

[Figure 19.8: three intensity frames (a)–(c); horizontal axis x/λ, from −12500 to 12500.]

Figure 19.8 Two overlapping beams of uniform amplitude and circular cross-section (λ = 633 nm, D = 2 × 10⁴λ) arrive at the etalon of Figure 19.6. One beam travels at θ = 45° relative to the etalon’s surface normal, the other at θ = 45.115°. (a) Intensity distribution of the superposed beams at the entrance aperture. (b) Reflected intensity distribution, consisting mainly of the second beam plus a small fraction of the first. (c) Transmitted intensity distribution, consisting mostly of the first beam plus a small fraction of the second.

[Figure 19.9: schematic of the Mach–Zehnder interferometer, showing the incident beams (λ₁, λ₂), the two 50/50 splitters, the 45° mirrors, the reflector, and detectors 1 and 2 with signals S₁ and S₂.]

Figure 19.9 The Mach–Zehnder interferometer can be used to separate two beams of differing wavelengths, λ₁ and λ₂. The beams have identical diameters and arrive in the same direction. The two beams are split equally at the first 50/50 splitter, travel the two arms of the device, and are recombined at the second 50/50 splitter. The lengths of the two arms of the interferometer differ by Δz. When Δz/λ₁ − Δz/λ₂ = 1/2, constructive interference at the second beam-splitter for one of the two wavelengths coincides with destructive interference for the other. The beams are thus separated at the second splitter; one is captured by detector 1, the other by detector 2. The 45° mirrors (three in each arm) have a reflectivity of 90%, resulting in an overall system transmission of 73%. The 50/50 splitters are identical, each consisting of a glass substrate coated with a six-layer dielectric stack as follows: (substrate, n_sub = 1.5)/(d₁ = 30 nm, n₁ = 2.64)/(d₂ = 140 nm, n₂ = 1.76)/(d₃ = 50 nm, n₃ = 2.64)/(d₄ = 105 nm, n₄ = 1.76)/(d₅ = 60 nm, n₅ = 2.64)/(d₆ = 100 nm, n₆ = 1.76)/air. Although the above stack works for both p- and s-polarized light, its splitting ratio is much closer to 50/50 for s-light than for p-light. In our simulations the polarization state of the incident beam was fixed at s.

A straightforward method of separating two beams of differing wavelengths is shown in Figure 19.9. This Mach–Zehnder interferometer³ splits each input beam into two equal halves, provides a separate path for each half, then recombines the halves into a single beam at the output. For one of the wavelengths, say λ₁, the path-length difference Δz between the two arms of the device may be an integer multiple of λ₁, in which case the corresponding half-beams interfere constructively and emerge from one exit channel of the interferometer. For the other wavelength, λ₂, the path-length difference may be a half-integer multiple of λ₂, in which case interference will be destructive and the beam will emerge from a

Figure 19.10 Computed detector signals S₁ and S₂ versus the input wavelength λ in the Mach–Zehnder interferometer of Figure 19.9. The assumed path-length difference between the two arms of the device is Δz = 1.266 mm. In the vicinity of λ = 633 nm the adjacent peaks of S₁ and S₂ are separated by Δλ = 0.158 nm, in agreement with Eq. (19.2).

different exit channel of the device. Therefore, the separability condition for this interferometer is Δz/λ₁ − Δz/λ₂ = ½, or

$$ \Delta z\,\Delta k_z \approx 2\pi\,\Delta z\,\Delta\lambda/\lambda^2 = \pi. \tag{19.2} $$

Figure 19.10 shows computed detector signals S₁, S₂ of the system of Figure 19.9 versus the input wavelength in the vicinity of λ = 633 nm. For the particular path-length difference chosen in this example (Δz = 1.266 mm), it is observed that, in compliance with Eq. (19.2), a pair of beams having Δλ = 0.158 nm can be readily separated from each other. An alternative form of the uncertainty relation may be obtained in this case by invoking the quantum-mechanical relation between the magnitude k of the


wave-vector and the photon energy E = hν, namely, k = 2π/λ = 2πν/c = E/ħc. For two beams of wavelengths λ and λ + Δλ, co-propagating in the Z direction, Δk_z = ΔE/ħc. Also Δz = cΔt, where c is the speed of light and Δt is the time needed for light to travel a distance Δz in free space. The product ΔEΔt is thus proportional to Δz Δk_z, with ħ being the proportionality constant. One may thus reinterpret Eq. (19.2) as a statement of the time-versus-energy uncertainty. When the observations are made in a transparent medium of refractive index n > 1, the increase of the k-vector by a factor of n dictates a corresponding decrease in Δz. This is consistent with the reduced speed of light in the medium of index n, which yields the same travel time Δt for the shorter propagation distance Δz/n. Needless to say, ΔE = hΔν is independent of n.

Wavelength discrimination using a Fabry–Pérot etalon

The etalon of Figure 19.6 may also be used to separate co-propagating beams of slightly different wavelengths, say λ and λ + Δλ. Figure 19.11 shows computed plots of reflection and transmission coefficients versus λ for a resonator having an air gap d_air = 55.95 μm. From Eq. (19.2) at λ = 633 nm, considering that Δz = 2d_air cos(45°) = 79.125 μm, we find Δλ = 2.53 nm, in agreement with the peak-to-valley distance in the simulated results of Figure 19.11. The figure, however, indicates the feasibility of resolving beams with a smaller Δλ as well; this is due to the high finesse of the Fabry–Pérot etalon. In other words, multiple back-and-forth reflections within the etalon’s cavity build up an optical field whose amplitude is G times stronger than that of the incident beam. (In the present example, G is 3.0 for s-light and 1.94 for p-light.) The effective Δz is thus G times the effective gap width, resulting in a corresponding increase in the resolution of the device.
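The two wavelength separations quoted above both follow from Eq. (19.2) solved for Δλ, that is, Δλ = λ²/(2Δz). The short sketch below simply evaluates this expression for the Mach–Zehnder path difference and for the single-pass etalon path; the helper name `resolvable_dlambda` is introduced here for illustration only.

```python
import math

def resolvable_dlambda(wavelength, dz):
    """Solve Eq. (19.2), 2*pi*dz*dlambda/lambda**2 = pi, for dlambda (SI units)."""
    return wavelength ** 2 / (2.0 * dz)

lam = 633e-9                     # wavelength (m)

# Mach-Zehnder interferometer of Figure 19.9: arm-length difference of 1.266 mm.
dz_mz = 1.266e-3
dlam_mz = resolvable_dlambda(lam, dz_mz)

# Etalon of Figure 19.6: path difference 2 * d_air * cos(45 deg).
d_air = 55.95e-6
dz_etalon = 2.0 * d_air * math.cos(math.radians(45.0))
dlam_etalon = resolvable_dlambda(lam, dz_etalon)

print(f"Mach-Zehnder: dlambda = {dlam_mz * 1e9:.3f} nm")
print(f"Etalon: dz = {dz_etalon * 1e6:.3f} um, dlambda = {dlam_etalon * 1e9:.2f} nm")
```

The values agree, to the quoted precision, with the 0.158 nm and 2.53 nm figures given in the text; the finesse-related enhancement by the factor G is not modeled here.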
Spectral analysis using a diffraction grating

Consider two co-propagating beams of wavelengths λ and λ + Δλ, where it is assumed for convenience that Δλ > 0. These beams travel along the Z-axis and pass through an aperture of diameter D. By definition, k_z = 2π/λ and, therefore, Δk_z = 2πΔλ/λ². Figure 19.12 shows the above beams arriving at an incidence angle 0° ≤ θ < 90° on a grating of period P. The Nth diffracted order emerges from the grating at an angle θ′, in accordance with Bragg’s law,³,⁴

$$ \sin\theta' = \sin\theta + N\lambda/P, \tag{19.3a} $$

which yields

$$ \cos\theta'\,\Delta\theta' = (N/P)\,\Delta\lambda. \tag{19.3b} $$

[Figure 19.11: amplitude versus λ (nm) from 630 to 640 nm; frame (a) shows |r_p| and |t_p|, frame (b) shows |r_s| and |t_s|.]

Figure 19.11 Computed plots of amplitude reflection and transmission coefficients versus λ for the Fabry–Pérot etalon depicted in Figure 19.6. The air gap and the incidence angle are fixed at d_air = 55.95 μm and θ = 45°. The incident beam is p-polarized in (a) and s-polarized in (b).

Now, the emergent beam diameter is D′ = D|cos θ′/cos θ|. Since the lens is expected to resolve the two wavelengths, Ineq. (19.1) requires that |Δθ′| ≥ λ/D′, which leads to |cos θ′ Δθ′| ≥ λ cos θ/D, which in turn leads to |N/P| Δλ ≥ λ cos θ/D. In other words,

$$ D/\cos\theta \ge (\lambda/\Delta\lambda)\,|P/N|. \tag{19.4a} $$

From Eq. (19.3a) it is clear that |Nλ/P| ≤ 2, that is, |P/N| ≥ ½λ. Inequality (19.4a) may thus be written as follows:

$$ D/\cos\theta \ge \tfrac{1}{2}\,\lambda^2/\Delta\lambda. \tag{19.4b} $$

[Figure 19.12: schematic showing the aperture of diameter D, the Z-axis, the grating, and the lens of focal length f.]

Figure 19.12 Two beams of wavelengths λ and λ + Δλ, propagating in the same direction Z, arrive at an aperture of diameter D. The beams propagate a distance Δz₁ from the center of the aperture to a grating of period P, shining on the grating at an angle θ. One of the diffracted orders (the Nth order) leaves the grating at an angle θ′, travels a distance Δz₂ (from the center of the grating to the center of the lens), then enters a lens of focal length f and numerical aperture NA ≤ 1. Emerging from the grating, the two wavelengths deviate from each other by an angle Δθ′, thus forming separate focused spots at the focal plane of the lens. From the entrance aperture to the focal plane, the total propagation distance is Δz = Δz₁ + Δz₂ + f.

Inequality (19.4b) places a lower bound for resolvability not on the beam diameter D, but on the illuminated length of the grating, D/cos θ, in the direction perpendicular to the grooves. Next we examine the propagation distance from the center of the entrance aperture to the focal plane of the lens. With reference to Figure 19.12, the shortest possible distance from the entrance aperture to the grating center is Δz₁ = ½ D tan θ. Similarly, the shortest possible distance from the grating to the lens center (ignoring the possibility that the lens might block the incident beam) is Δz₂ = ½ D′|tan θ′| = ½ D|sin θ′|/cos θ. The smallest feasible focal length for the lens is f = ½D′, corresponding to NA = 1. Therefore, the shortest distance Δz from the center of the entrance aperture to the focal plane of the lens is given by

$$ \Delta z = \Delta z_1 + \Delta z_2 + f = \tfrac{1}{2}\,(D/\cos\theta)\,(\sin\theta + |\sin\theta'| + |\cos\theta'|). \tag{19.5a} $$

Since sin θ ≥ 0, and |sin θ′| + |cos θ′| ≥ 1 for any θ′, Eq. (19.5a) yields

$$ \Delta z \ge \tfrac{1}{2}\,D/\cos\theta. \tag{19.5b} $$

Combining Ineqs. (19.4b) and (19.5b) then yields Δz ≥ ¼ λ²/Δλ, that is,

$$ \Delta z\,\Delta k_z \ge \tfrac{1}{2}\,\pi. \tag{19.6} $$
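The chain of inequalities (19.3a) to (19.6) can be exercised numerically for one concrete geometry. The sketch below is illustrative only: the grating period P, the order N, and the incidence angle θ are hypothetical values chosen here, and the beam diameter is set to the smallest value allowed by Ineq. (19.4a).

```python
import math

def min_path(D, theta, theta_out):
    """Eq. (19.5a): shortest aperture-to-focal-plane distance (angles in radians)."""
    return 0.5 * (D / math.cos(theta)) * (
        math.sin(theta) + abs(math.sin(theta_out)) + abs(math.cos(theta_out)))

lam, dlam = 633e-9, 0.158e-9      # wavelength and separation (m), illustrative
P, N = 1.0e-6, -1                 # hypothetical grating period (m) and diffraction order
theta = math.radians(30.0)        # hypothetical incidence angle

theta_out = math.asin(math.sin(theta) + N * lam / P)     # Eq. (19.3a)
D_min = math.cos(theta) * (lam / dlam) * abs(P / N)      # equality in Ineq. (19.4a)
dz = min_path(D_min, theta, theta_out)                   # Eq. (19.5a)
bound = 0.25 * lam ** 2 / dlam                           # lower bound on dz from Ineq. (19.6)
dkz = 2.0 * math.pi * dlam / lam ** 2

print(f"D_min = {D_min * 1e3:.2f} mm, dz = {dz * 1e3:.2f} mm, bound = {bound * 1e3:.2f} mm")
print(f"dz * dkz = {dz * dkz / math.pi:.2f} * pi")
```

For this particular geometry the required path is several times longer than the lower bound, which is expected: the bound is reached only when every inequality in the chain is tight at once.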


Note that the initial beam diameter D in this example is not restricted at all, whereas the propagation distance Δz is required to be greater than a certain minimum, ¼ λ²/Δλ, to ensure resolvability of the wavelengths λ and λ + Δλ.

References for Chapter 19

1 R. P. Feynman, R. B. Leighton, and M. Sands, The Feynman Lectures on Physics, Addison-Wesley, Reading, Massachusetts (1964).
2 A. E. Siegman, Lasers, University Science Books, California (1986).
3 M. Born and E. Wolf, Principles of Optics, sixth edition, Pergamon Press, Oxford (1980).
4 M. V. Klein, Optics, Wiley, New York (1970).

20 Omni-directional dielectric mirrors

An omni-directional dielectric mirror (also known as a one-dimensional photonic bandgap crystal)¹,² exhibits 100% reflectivity at all angles of incidence and for all states of incident polarization.³,⁴ Unlike metallic mirrors, which absorb a small fraction of the incident optical power, dielectric reflectors are lossless. These properties make omni-directional dielectric mirrors ideal candidates for applications in which a beam of light in an unknown or unpredictable polarization state is likely to arrive at the mirror from any direction, and in which loss of light at the mirror, no matter how small, is deemed intolerable. A good example is provided by the walls of an optical waveguide. Since there are numerous reflections from the wall as a beam of light travels through, even small losses at each encounter with a wall rapidly deplete the beam’s energy.

A typical omni-directional reflector is a periodic stack of bilayers, each bilayer consisting of a high-index and a low-index dielectric layer. The larger the refractive indices of the available dielectrics (and also the larger the difference between these indices), the easier it is to design the reflector. For example, if the two materials available for fabricating a stack of bilayers have indices n₁ = 1.5 and n₂ = 2.0, it is impossible to obtain omni-directionality for both p- and s-polarized light. However, with n₁ = 1.5 and n₂ ≥ 2.3, an omni-directional reflector can be designed. When the available dielectrics have reasonably large indices, it is also possible (by properly selecting the layer thicknesses) to achieve omni-directionality over a broad range of wavelengths. In this chapter we describe a theory of omni-directional reflectors, and outline a method of selecting the layer thicknesses for a given pair of indices n₁, n₂.

Although the following discussions are confined to flat mirrors, it is clear that the exterior of a glass cylinder (or the interior of a hollow tube) may also be coated with omni-directional reflectors. As long as the diameters of these cylinders/tubes are not too small (compared to the incident wavelength) the walls will appear flat locally and, therefore, the cylinder/tube may be used as an essentially lossless waveguide.⁴


General properties

Consider a periodic multilayer stack such as that depicted in Figure 20.1. The stack consists of an infinite number of identical blocks, each block having reflection coefficients r = |r| exp(iφ_r) from the top side and r′ = |r′| exp(iφ_r′) from the bottom side, as well as transmission coefficient t = |t| exp(iφ_t) from either side. In general, r, r′ and t are functions of the incidence angle θ and the polarization state (p or s) of the incident light. From the reciprocal properties of electromagnetic waves in non-absorbing media, it is known that t should be the same whether the incidence is from the top or from the bottom side, that |r| = |r′|, and that ½(φ_r + φ_r′) = φ_t ± 90° (see Chapter 17, Reciprocity in classical linear optics). Also, from conservation of energy, |r|² + |t|² = 1. As shown in Figure 20.1, one can express the reflection coefficient r₀ from the top of the stack in terms of the parameters r, r′, t of the individual blocks by assuming a diminishing air-gap between the top unit and the rest of the stack. Denoting the round-trip phase delay within this (artificial) air-gap by δ, and

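[Figure 20.1: schematic of the periodic stack with an artificial air gap below the top block, showing the partial reflections r, r₀t²e^{iδ}, r₀²r′t²e^{i2δ}, … that add up to r₀.]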

Figure 20.1 Method of calculating the reflection coefficient r₀ of a periodic stack in terms of the parameters r, r′, t of the individual blocks that comprise the stack.

recognizing that the reflectivity r₀ of the infinite stack is the same with and without its uppermost block, we write

$$ r_0 = \lim_{\delta\to 0}\left[\, r + r_0 t^2 e^{i\delta} + r_0^2 r' t^2 e^{2i\delta} + r_0^3 r'^2 t^2 e^{3i\delta} + \cdots \right] = \frac{r - r_0\,(r r' - t^2)}{1 - r_0 r'} = \frac{r - r_0 \exp[i(\phi_r + \phi_{r'})]}{1 - r_0 r'}. \tag{20.1} $$

The above formula is a quadratic equation in r₀. A perfect reflector requires that |r₀| = 1; Eq. (20.1) then yields the following expression for the phase φ₀ of r₀ in terms of |r|, φ_r, and φ_r′:

$$ \cos\!\left[\phi_0 - \tfrac{1}{2}(\phi_r - \phi_{r'})\right] = \cos\!\left[\tfrac{1}{2}(\phi_r + \phi_{r'})\right] / |r|. \tag{20.2} $$

Since in practice the actual value of φ₀ is irrelevant, the above equation predicts that the reflectivity R₀ = |r₀|² of the stack will be unity provided that the right-hand side of Eq. (20.2) is confined to the interval [−1, +1]; in other words, the necessary and sufficient condition for the infinite dielectric stack of Figure 20.1 to have 100% reflectivity may be written as follows:

$$ |r| > \left|\cos\!\left[\tfrac{1}{2}(\phi_r + \phi_{r'})\right]\right|. \tag{20.3a} $$

Using the identity |r|² + |t|² = 1 and the relation among φ_r, φ_r′, φ_t mentioned earlier, Ineq. (20.3a) may be written in either of the following alternative forms:

$$ |r| > |\sin\phi_t|, \tag{20.3b} $$

$$ |t| < |\cos\phi_t|. \tag{20.3c} $$

The three inequalities (20.3a), (20.3b), and (20.3c) are equivalent and may be used interchangeably. As an example, consider a unit block consisting of a pair of high-index, low-index layers, each a quarter-wave thick at the free-space wavelength of λ₀ = 633 nm at normal incidence (i.e., θ = 0°). Let n₁ = 2, t₁ = 79.0 nm, n₂ = 1.5, t₂ = 105.5 nm, where t₁ and t₂ here denote the layer thicknesses. Figure 20.2 shows plots of |r| and cos[½(φ_r + φ_r′)] in frame (a), |t| and cos φ_t in frame (b), as functions of θ for both p- and s-polarized incident plane-waves. Note that Ineqs. (20.3) are satisfied for p-light when 0° < θ < 40°, and for s-light when 0° < θ < 52°. The computed p- and s-reflectivities for a quarter-wave stack consisting of twenty repetitions of the above bilayer are shown in Figure 20.3. As expected, R_p0 = |r_p0|² ≈ 1 in the incidence range 0° < θ < 40°, and similarly R_s0 = |r_s0|² ≈ 1 in the range 0° < θ < 52°.

[Figure 20.2: frame (a) plots the reflection coefficients |r_p|, |r_s| together with cos[(φ_rp + φ_rp′)/2] and cos[(φ_rs + φ_rs′)/2]; frame (b) plots the transmission coefficients |t_p|, |t_s| together with cos(φ_tp) and cos(φ_ts); horizontal axes θ from 0° to 90°.]

Figure 20.2 Plots of the various functions appearing in Ineqs. (20.3) for a bilayer consisting of a pair of dielectric layers, each having a quarter-wave thickness at the vacuum wavelength of λ₀ = 633 nm at normal incidence. (n₁ = 2.0, t₁ = 79.0 nm, n₂ = 1.5, t₂ = 105.5 nm.)

Single dielectric layer

In principle, the unit block from which an omni-directional reflector is constructed can be a bilayer or a multilayer, which may even contain gradient-index layers. It should be obvious that a single homogeneous layer (say, having index n and thickness d) will never produce a 100% reflector; therefore, Ineqs. (20.3) cannot be satisfied for such a layer. (We note in passing that, for a single layer, φ_r = φ_r′ = φ_t ± 90°.) Let us examine in some detail the single dielectric layer shown in Figure 20.4. The monochromatic plane wave of wavelength λ₀ is incident on the top surface of the layer at an angle θ; the Fresnel reflection and transmission coefficients of the top surface are ρ and τ. Inside the layer the transmitted wave-vector makes an angle θ′ with the surface normal. The reflection and transmission coefficients at the bottom surface, where the light shines from within the slab onto the glass–air interface, are ρ′ and τ′. Reciprocity may be invoked to show that, for both p- and s-polarization, ρ′ = −ρ and ρ² + ττ′ = 1 at all θ. The dependences of ρ on n and θ

[Figure 20.3: reflectivities R_p0 and R_s0 versus θ from 0° to 90°.]

Figure 20.3 Computed reflectivity R versus θ for p- and s-polarized light for a quarter-wave stack consisting of twenty repetitions of the bilayer depicted in Figure 20.2. R_p0 and R_s0 are 100% in those regions where Ineqs. (20.3) are satisfied.

for p- and s-light are given by the Fresnel formulas⁵

$$ \rho_p = \frac{\sqrt{n^2 - \sin^2\theta} - n^2\cos\theta}{\sqrt{n^2 - \sin^2\theta} + n^2\cos\theta}, \tag{20.4a} $$

$$ \rho_s = \frac{\cos\theta - \sqrt{n^2 - \sin^2\theta}}{\cos\theta + \sqrt{n^2 - \sin^2\theta}}. \tag{20.4b} $$

Also, the single-path phase-shift Δ acquired through the thickness d of the slab is

$$ \Delta = 2\pi\,(nd/\lambda_0)\cos\theta', \tag{20.4c} $$

[Figure 20.4: schematic of the slab of index n and thickness d, showing the multiply reflected rays and their phase factors e^{iΔ} and e^{i2Δ}.]

Figure 20.4 Method of calculating the reflection and transmission coefficients, r and t, for a single-layer dielectric slab of thickness d and refractive index n.

where θ′ is the angle of the refracted ray.⁵ The slab’s reflection and transmission coefficients, r and t, may be obtained by summing the infinite number of rays multiply reflected from its front and rear facets, namely,

$$ r = \rho + \tau\tau'\rho' \exp(i2\Delta) + \tau\tau'\rho'^3 \exp(i4\Delta) + \cdots, \tag{20.5} $$

$$ t = \tau\tau' \exp(i\Delta) + \tau\tau'\rho'^2 \exp(i3\Delta) + \tau\tau'\rho'^4 \exp(i5\Delta) + \cdots. \tag{20.6} $$

When simplified, the above expressions yield

$$ |r| = 2\,|\rho \sin\Delta| \Big/ \sqrt{\rho^4 - 2\rho^2\cos(2\Delta) + 1}, \tag{20.7a} $$

$$ \phi_r = \arctan\{-(1 - \rho^2)/[(1 + \rho^2)\tan\Delta]\}, \tag{20.7b} $$

$$ |t| = (1 - \rho^2) \Big/ \sqrt{\rho^4 - 2\rho^2\cos(2\Delta) + 1}, \tag{20.8a} $$

$$ \phi_t = \arctan\{[(1 + \rho^2)\tan\Delta]/(1 - \rho^2)\}. \tag{20.8b} $$

Equations (20.7) and (20.8) readily confirm that |r|² + |t|² = 1, that φ_r = φ_t ± 90°, and that a single-layer slab does not satisfy Ineq. (20.3).
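These three properties of the single slab can be checked numerically. The sketch below sums the geometric series (20.5) and (20.6) in closed form, using ρ′ = −ρ and ττ′ = 1 − ρ²; the closed-form expressions, the function name `slab_rt`, and the sample values of n and Δ are choices made here for illustration.

```python
import cmath
import math

def slab_rt(rho, delta):
    """Closed-form sums of the series (20.5) and (20.6),
    using rho' = -rho and tau*tau' = 1 - rho**2."""
    e2 = cmath.exp(2j * delta)
    denom = 1 - rho ** 2 * e2
    r = rho * (1 - e2) / denom
    t = (1 - rho ** 2) * cmath.exp(1j * delta) / denom
    return r, t

n = 1.5                          # illustrative refractive index
rho = (1 - n) / (1 + n)          # Fresnel coefficient at normal incidence
results = []
for delta in (0.3, 1.0, math.pi / 2, 2.5):
    r, t = slab_rt(rho, delta)
    energy = abs(r) ** 2 + abs(t) ** 2                 # should equal 1
    phase_gap = math.degrees(cmath.phase(r / t))       # phi_r - phi_t, in degrees
    passes_203c = abs(t) < abs(math.cos(cmath.phase(t)))
    results.append((energy, phase_gap, passes_203c))
    print(f"delta = {delta:.3f}: |r|^2+|t|^2 = {energy:.6f}, "
          f"phi_r - phi_t = {phase_gap:+.1f} deg, Ineq. (20.3c) met: {passes_203c}")
```

For every sampled Δ the energy sum is unity, the phase gap is ±90°, and Ineq. (20.3c) is not met, since for a single slab |cos φ_t| works out to |t| |cos Δ|, which cannot exceed |t|.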


Double layer

Next, consider the bilayer slab depicted in Figure 20.5. The top layer has index n₁, thickness d₁, and reflection and transmission coefficients r₁, t₁ at the incidence angle θ. The corresponding parameters of the second layer are n₂, d₂, r₂, t₂. To determine the bilayer’s overall transmission coefficient t, we assume a small air gap between the two layers and proceed to sum the partial transmission coefficients. We find, in the limit of a vanishing gap,

$$ t = t_1 t_2 + t_1 t_2 r_1 r_2 + t_1 t_2 r_1^2 r_2^2 + \cdots = t_1 t_2/(1 - r_1 r_2). \tag{20.9} $$

It is now easy to apply criterion (20.3c) to the bilayer’s transmission coefficient given by Eq. (20.9), to determine the conditions under which an infinite stack of bilayers becomes a 100% reflector. Both the necessary and sufficient conditions turn out to be

$$ |t_1||t_2| < \big|\,|r_1||r_2| - \cos(\phi_{r1} + \phi_{r2})\,\big|, \tag{20.10} $$

which is actually two distinct inequalities in one, depending on whether the absolute value on the right-hand side is that of a positive or a negative quantity. Substituting in Eq. (20.10) for r, t, in terms of ρ and Δ from Eqs. (20.7, 20.8),

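[Figure 20.5: schematic of the bilayer slab, with the top layer of thickness d₁ and index n₁ above the second layer of thickness d₂ and index n₂.]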

Figure 20.5 Method of calculating the transmission coefficient t of a bilayer slab consisting of two dielectric layers, one having thickness d1, index n1, the other having thickness d2, index n2.


we find the necessary and sufficient conditions for 100% reflectivity to be

$$ G_{p,s}(n_1, n_2, \theta)\,\sin\Delta_1 \sin\Delta_2 > \cos^2\!\left[\tfrac{1}{2}(\Delta_1 + \Delta_2)\right], \tag{20.11a} $$

$$ -G_{p,s}(n_1, n_2, \theta)\,\sin\Delta_1 \sin\Delta_2 > \sin^2\!\left[\tfrac{1}{2}(\Delta_1 + \Delta_2)\right], \tag{20.11b} $$

where

$$ G_{p,s}(n_1, n_2, \theta) = (\rho_1 - \rho_2)^2 / [(1 - \rho_1^2)(1 - \rho_2^2)]. \tag{20.11c} $$

Inequalities (20.11a,b) are the fundamental results of this section, each expressing the condition (both necessary and sufficient) for the attainment of R = 1.0 from a periodic stack of bilayers. Whereas Ineq. (20.11a) leads to bilayer designs in which both layer thicknesses are close to λ/4, Ineq. (20.11b) yields structures in which one layer’s thickness is λ/4 while the other’s is 3λ/4.

Discussion

We begin by examining the behavior of G_{p,s}(n₁, n₂, θ). According to Eq. (20.11c) this function depends only on ρ₁ and ρ₂, which, in turn, are dependent on n₁, n₂, the angle of incidence θ, and the polarization state of the beam, but not on layer thicknesses d₁ and d₂. For fixed values of n₁, n₂ the function depends only on θ and on the polarization state. Substituting from Eqs. (20.4a) and (20.4b) into Eq. (20.11c) yields

$$ G_s(n_1, n_2, \theta) = \tfrac{1}{4}\sqrt{\frac{n_1^2 - \sin^2\theta}{n_2^2 - \sin^2\theta}} + \tfrac{1}{4}\sqrt{\frac{n_2^2 - \sin^2\theta}{n_1^2 - \sin^2\theta}} - \tfrac{1}{2}, \tag{20.12a} $$

$$ G_p(n_1, n_2, \theta) = \tfrac{1}{4}\,(n_2/n_1)^2 \sqrt{\frac{n_1^2 - \sin^2\theta}{n_2^2 - \sin^2\theta}} + \tfrac{1}{4}\,(n_1/n_2)^2 \sqrt{\frac{n_2^2 - \sin^2\theta}{n_1^2 - \sin^2\theta}} - \tfrac{1}{2}. \tag{20.12b} $$

G_{p,s}(n₁, n₂, θ) is plotted versus θ in Figure 20.6 for both p- and s-polarized plane-waves for the specific values (a) n₁ = 1.5, n₂ = 2.0, and (b) n₁ = 1.3, n₂ = 1.2. Although the selected values of n₁, n₂ are specific, the shapes of the functions are quite general. The two functions for p- and s-light are never negative; they both start, at θ = 0° (normal incidence), at the same level, and from there G_s goes up and G_p goes down with increasing θ. The behavior shown in Figure 20.6(a), where G_s increases while G_p decreases monotonically, is typical of situations of interest in this chapter, where the Brewster angle θ_B is inaccessible from outside the multilayer. The behavior depicted in Figure 20.6(b), where G_s increases

[Figure 20.6: G_s(n₁, n₂, θ) and G_p(n₁, n₂, θ) versus θ from 0° to 90°; frame (a) n₁ = 1.50, n₂ = 2.00; frame (b) n₁ = 1.30, n₂ = 1.20.]

Figure 20.6 Plots of the functions G_{p,s}(n₁, n₂, θ) versus θ for specific values of n₁, n₂. In (a) n₁ and n₂ are large enough to satisfy Ineq. (20.13), thus ensuring that the Brewster angle is inaccessible from outside the stack. In (b) the Brewster angle is reached at θ = 61.9°.

monotonically while G_p first drops to zero at θ_B (≈ 61.9° in this example) before rising again, is typical of situations where θ_B can be accessed from outside the stack. Since R_p at θ = θ_B cannot be made equal to unity, the latter situation is of no interest here.
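The behavior of G_p and G_s described above can be reproduced directly from the Fresnel coefficients of Eqs. (20.4a,b) and the definition in Eq. (20.11c). The following is a minimal sketch; the function names are introduced here for illustration, and `brewster_external` uses the standard Brewster condition for the n₁/n₂ interface translated to the external angle through Snell’s law, which is valid only when that angle is accessible from air.

```python
import math

def fresnel(n, theta, pol):
    """Eqs. (20.4a,b): air-to-dielectric Fresnel reflection coefficient (theta in radians)."""
    c = math.cos(theta)
    s = math.sqrt(n ** 2 - math.sin(theta) ** 2)
    if pol == "p":
        return (s - n ** 2 * c) / (s + n ** 2 * c)
    return (c - s) / (c + s)

def G(n1, n2, theta, pol):
    """Eq. (20.11c), evaluated from the two Fresnel coefficients."""
    q1, q2 = fresnel(n1, theta, pol), fresnel(n2, theta, pol)
    return (q1 - q2) ** 2 / ((1 - q1 ** 2) * (1 - q2 ** 2))

def brewster_external(n1, n2):
    """External angle (radians) at which the n1/n2 interface sits at Brewster incidence.
    Assumes n1*n2/sqrt(n1**2 + n2**2) <= 1, i.e. the angle is accessible from air."""
    return math.asin(n1 * n2 / math.hypot(n1, n2))

# Case (a) of Figure 20.6: common starting level at normal incidence.
g0 = G(1.5, 2.0, 0.0, "p")
print(f"G(0) = {g0:.5f}")

# Case (b) of Figure 20.6: G_p vanishes at the Brewster angle.
theta_b = brewster_external(1.3, 1.2)
print(f"theta_B = {math.degrees(theta_b):.1f} deg, G_p(theta_B) = {G(1.3, 1.2, theta_b, 'p'):.2e}")
```

Angles very close to 90° are avoided in this form because 1 − ρ² tends to zero there; the closed forms of Eqs. (20.12a,b) do not have that limitation.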


One can readily show that the slope of G_s versus θ is always positive. G_p, on the other hand, has a negative slope at θ = 0°, which continues to be negative up to where sin²θ = n₁²n₂²/(n₁² + n₂²). At this point G_p achieves its minimum value of zero, then rises until grazing incidence at θ = 90°. The angle θ at which G_p is a minimum corresponds to the Brewster angle θ_B between two media of indices n₁ and n₂. When θ_B is accessible from the incidence medium (air in this case), it will be impossible to achieve 100% reflectivity at this particular angle. Therefore, we impose the following constraint on the indices of the bilayer:

$$ (1/n_1)^2 + (1/n_2)^2 < 1. \tag{20.13} $$

In this way, G_{p,s}(n₁, n₂, θ) will always exhibit the typical behavior depicted in Figure 20.6(a), namely, both functions start at the same level, ¼(n₁/n₂) + ¼(n₂/n₁) − ½, when θ = 0°; from there G_s increases and G_p decreases, both monotonically, with an increasing θ. Inequality (20.11a) can be satisfied over the entire range of θ for both p- and s-light if Δ₁ and Δ₂ are maintained around π/2 throughout the range θ = [0°, 90°]. Likewise, Ineq. (20.11b) can be satisfied if Δ₁ is kept around π/2 while Δ₂ is kept around 3π/2 (or vice versa). When n₁ and n₂ are far apart, G_s and G_p will be fairly large, and choosing d₁ and d₂ to satisfy the requisite inequalities for all θ will not be difficult. When n₁ and n₂ are close together, however, it is easier to maintain Δ₁ and Δ₂ both around π/2 (if at all possible), rather than to keep one of them around 3π/2. This is simply because the variations with θ will be greater for that Δ which stays near 3π/2. We limit the following discussion to stacks that satisfy Ineq. (20.11a), but emphasize that a similar class of reflectors based on Ineq. (20.11b) is feasible as well.

Selecting layer thicknesses

The parameters d₁, d₂ should be chosen to satisfy Ineq. (20.11a) for all θ from 0° to 90°. Since G_s ≥ G_p, one should try to achieve omni-directional reflectivity for p-light only; the s-reflectivity will automatically follow suit. The phase Δ acquired in a single path through a layer of thickness d and index n at incidence angle θ is given in Eq. (20.4c). Δ can be made equal to π/2 at some arbitrary angle of incidence, say, θ = θ₀, by choosing the layer thickness d such that nd cos θ₀′ = d√(n² − sin²θ₀) = λ₀/4. Since θ can be anywhere between 0° and 90°, one should choose θ₀ in such a way as to make Δ vary symmetrically around π/2. This is achieved when

$$ d/\lambda_0 = 0.5/(n + \sqrt{n^2 - 1}). \tag{20.14} $$


The maximum deviation of Δ from π/2 is then given by

\[ \bigl|\Delta - \tfrac{1}{2}\pi\bigr|_{\max} = \tfrac{1}{2}\pi\,\frac{1 - \sqrt{1 - n^{-2}}}{1 + \sqrt{1 - n^{-2}}}. \tag{20.15} \]

The expression on the right-hand side of Eq. (20.15) is a monotonically decreasing function of n, going from π/2 to zero as n goes from 1 to ∞. Therefore, the range of variation of Δ is smaller for larger values of n. If d is chosen in accordance with Eq. (20.14), variations of Δ with θ will be symmetric around Δ = π/2; as θ goes from 0° to 90°, Δ, which is larger than π/2 in the beginning, drops to π/2, then decreases further until its deviation from π/2 at grazing incidence equals, in magnitude, that at normal incidence. In this way, sin Δ remains close to unity, swinging from just below +1 to +1, and then back to its initial value, as the incidence goes from normal to grazing. Similarly, cos²Δ stays close to 0, moving, between normal and grazing incidence, from a small positive value to zero, then back again to its initial (positive) value. The choice of d according to Eq. (20.14) thus minimizes the swing of sin Δ away from its desired value of +1, while simultaneously minimizing the swing of cos²Δ away from its desired value of zero.

In the case of a bilayer, one must select values for d1 and d2; both can be chosen to satisfy Eq. (20.14) with the corresponding value of n. In this way, Ineq. (20.11a) is likely to be satisfied, because sin Δ1 sin Δ2 will remain around unity, with a minimum swing throughout the range of θ, and, similarly, cos²[(Δ1 + Δ2)/2] remains around zero. Figure 20.7 shows plots of these functions for a bilayer consisting of materials with n1 = 1.5, n2 = 2.0. Layer thicknesses obtained from Eq. (20.14) are d1 = 0.191λ0 and d2 = 0.134λ0. With these choices, Figure 20.7 shows that sin Δ1 sin Δ2 remains above 0.97 while cos²[(Δ1 + Δ2)/2] remains below 0.03 throughout the entire range of θ. Unfortunately, the function Gp(n1, n2, θ) shown in Figure 20.6(a) is extremely small, and the aforementioned choice of layer thicknesses is not able to yield a 100% reflector through the entire range of incidence.
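The thickness rule of Eq. (20.14) and the deviation bound of Eq. (20.15) can be checked numerically. The sketch below is a minimal illustration, not the author's code. It assumes that the single-pass phase of Eq. (20.4c), which is not reproduced in this section, has the form Δ = 2π(d/λ0)√(n² − sin²θ); this form is consistent with the quarter-wave condition quoted above. All function names are hypothetical.

```python
import math

def symmetric_thickness(n):
    """Layer thickness d/lambda0 from Eq. (20.14)."""
    return 0.5 / (n + math.sqrt(n * n - 1.0))

def phase(n, d_over_lambda, theta_deg):
    """Single-pass phase; ASSUMED form of Eq. (20.4c):
    2*pi*(d/lambda0)*sqrt(n^2 - sin^2(theta))."""
    s = math.sin(math.radians(theta_deg))
    return 2.0 * math.pi * d_over_lambda * math.sqrt(n * n - s * s)

def max_phase_deviation(n):
    """Right-hand side of Eq. (20.15)."""
    r = math.sqrt(1.0 - 1.0 / (n * n))
    return 0.5 * math.pi * (1.0 - r) / (1.0 + r)

n1, n2 = 1.5, 2.0
d1, d2 = symmetric_thickness(n1), symmetric_thickness(n2)

# Sample the two functions plotted in Figure 20.7 from 0 to 90 degrees.
angles = [0.5 * i for i in range(181)]
sin_product = [math.sin(phase(n1, d1, t)) * math.sin(phase(n2, d2, t)) for t in angles]
cos_sq = [math.cos(0.5 * (phase(n1, d1, t) + phase(n2, d2, t))) ** 2 for t in angles]

print(f"d1/lambda0 = {d1:.3f}, d2/lambda0 = {d2:.3f}")
print(f"min sin(D1)sin(D2) = {min(sin_product):.4f}, max cos^2 = {max(cos_sq):.4f}")
```

Under the assumed phase formula, the deviation of Δ from π/2 at normal incidence equals the value given by Eq. (20.15), and the deviation at grazing incidence has the same magnitude with the opposite sign.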
Because Gp is relatively large near normal incidence but decreases with an increasing θ, it may be helpful to make the curves of sin Δ1 sin Δ2 and cos²[(Δ1 + Δ2)/2] slightly asymmetric. This is done by choosing a somewhat larger thickness for layer 2, as shown in Figure 20.7, where the dashed curve corresponds to d2 = 0.147λ0 and the dotted curve to d2 = 0.161λ0. (d2 is a better choice for this purpose than d1, because the corresponding layer has a larger n and, therefore, its variation with θ is smaller, thus causing a smaller variation in the functions depicted in Figure 20.7.) Note in Figure 20.7 that, as d2 increases, at large θ, the function cos²[½(Δ1 + Δ2)] is depressed, and the function sin Δ1 sin Δ2 moves toward unity (at least initially); both of these trends are helpful in satisfying Ineq. (20.11a). Unfortunately, the other side of the curves (i.e., the side around θ = 0°) moves in the wrong direction, making it harder to satisfy the inequality at and

Figure 20.7 (a) Plots of sin Δ1 sin Δ2 versus θ for a bilayer slab consisting of materials with n1 = 1.5 and n2 = 2.0. The first layer’s thickness is fixed at d1 = 0.191λ0, but the second layer’s assumes one of three different values, d2/λ0 = 0.134, 0.147, and 0.161. (b) Same as (a) for the function cos²[½(Δ1 + Δ2)]. (Horizontal axis: θ in degrees, from 0 to 90.)

around normal incidence. All in all, it turns out that it is impossible to design an omni-directional reflector with bilayers having n1 = 1.5 and n2 = 2.0. Figure 20.8 shows the best that one can achieve with n1 = 1.5, n2 = 2.0, and layer thicknesses d1 = 0.191λ0, d2 = 0.134λ0 (chosen to satisfy Eq. (20.14)).

Figure 20.8 (a) Plots of Gp,s(n1, n2, θ) sin Δ1 sin Δ2 − cos²[½(Δ1 + Δ2)] versus θ for a bilayer having n1 = 1.5, d1 = 0.191λ0 and n2 = 2.0, d2 = 0.134λ0. At small angles of incidence both p- and s-light violate Ineq. (20.11a), while at large angles only p-light is inadequate. (b) Computed plots of p- and s-reflectivity, Rp0, Rs0, versus θ for a twenty-period stack of the above bilayer. The regions of 100% reflectivity coincide with those that satisfy Ineq. (20.11a). (Horizontal axis: θ in degrees, from 0 to 90.)

Figure 20.9 Same as Figure 20.8 for a twenty-period stack consisting of layers with n1 = 1.5, d1 = 0.191λ0 and n2 = 2.3, d2 = 0.122λ0. It is seen in (a) that Ineq. (20.11a) holds for both polarization states throughout the entire range of incidence. In (b) the reflectances are 100% everywhere. The slight drops in Rpo and Rso are due to the fact that the assumed stack consists only of a finite number of bilayers; the reflectivity will rise rapidly if the total number of bilayers comprising the stack is increased. Note that the smaller the functions depicted in (a) become, the harder it is to obtain 100% reflectivity from a finite stack. (Horizontal axis: θ in degrees, from 0 to 90.)


Figure 20.8(a) shows plots of Gp,s(n1, n2, θ) sin Δ1 sin Δ2 − cos²[½(Δ1 + Δ2)] versus θ for both p- and s-polarized light; both functions must stay above zero to satisfy Ineq. (20.11a). Figure 20.8(b) shows computed plots of reflectivity, Rp0 and Rs0, versus θ for a twenty-period stack of this bilayer. It is seen that 100% reflectivity is achieved in exactly those regions where the functions depicted in Figure 20.8(a) are positive-valued.

Designing an omni-directional reflector

To achieve 100% reflectivity over the entire range of incidence requires materials with larger indices (or a larger index difference) than those examined above. For example, by keeping n1 at 1.5 but raising n2 to 2.3 it is possible to satisfy Ineq. (20.11a) with d1 = 0.191λ0 and d2 = 0.122λ0, as shown in Figure 20.9. The plots in Figure 20.9(a) confirm that Ineq. (20.11a) for both p- and s-light is satisfied over the entire range of incidence. Figure 20.9(b) shows plots of reflectivity versus θ for a twenty-period stack; both Rpo and Rso are seen to be greater than 99.5% between normal and grazing incidence. To increase the reflectivity beyond 99.5% one should either increase the number of bilayers comprising the stack, or use materials which have a larger index difference.

Finally, it must be mentioned that omni-directional reflection can be achieved not just for one wavelength but over a continuous range of wavelengths. To the extent that n1 and n2 remain constant over the desired range of λ, the functions Gp and Gs remain the same at all wavelengths of interest. Designing an omni-directional reflector for a range of λ then reduces to choosing thicknesses d1 and d2 that satisfy Ineq. (20.11a) throughout that range. The techniques described in this chapter can be readily extended to allow adjusting layer thicknesses for the desired band of wavelengths.

References for Chapter 20

1 E. Yablonovitch, Phys. Rev. Lett. 58, 2059 (1987).
2 J. D. Joannopoulos, R. D. Meade, and J. N. Winn, Photonic Crystals, Princeton University Press, Princeton, N.J., 1995.
3 J. N. Winn, Y. Fink, S. Fan, and J. D. Joannopoulos, Omnidirectional reflection from a one-dimensional photonic crystal, Optics Letters 23, 1573–1575 (1998).
4 Y. Fink, J. N. Winn, S. Fan, C. Chen, J. Michel, J. D. Joannopoulos, and E. L. Thomas, A dielectric omnidirectional reflector, Science 282, 1679–1682 (1998).
5 M. Born and E. Wolf, Principles of Optics, sixth edition, Pergamon Press, Oxford, 1980.

21 Linear optical vortices†

An optical vortex is a phase singularity nested within the cross-sectional profile of a coherent beam of light.1,2,3 Such vortices occur naturally in the electromagnetic mode structure of certain optical cavities.4 They may also be created artificially by computer-generated holograms designed to impart to an incident beam of light the desired phase and amplitude characteristics of a vortex.5 In recent years the study of vortices has become the focus of several research groups around the world, as potential applications have emerged. Noteworthy examples of such applications are the manipulation of small objects by optical tweezers6 and the control of atomic or molecular beams via the exchange of angular momentum with optical vortices.7

Mathematical description

The complex-amplitude distribution of a simple vortex of order m centered at (x0, y0) in the cross-sectional plane of a Gaussian beam may be written as

\[ A(x, y, z=0) = \bigl[(x - x_0) + \mathrm{i}\,\mathrm{sign}(m)\,(y - y_0)\bigr]^{|m|} \exp\bigl[-(x^2 + y^2)/r_0^2\bigr]. \tag{21.1} \]
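To make Eq. (21.1) concrete, the following minimal Python sketch samples the vortex amplitude on a small circle around the singularity and counts the accumulated phase in units of 2π. The function names, the sampling radius, and the number of samples are illustrative assumptions; lengths are expressed in units of the wavelength, and the default r0 = 10 is a hypothetical beam radius chosen for the example.

```python
import cmath
import math

def vortex_amplitude(x, y, m, x0=0.0, y0=0.0, r0=10.0):
    """Complex amplitude of Eq. (21.1) at z = 0 (lengths in wavelengths)."""
    sign = 1 if m >= 0 else -1
    core = complex(x - x0, sign * (y - y0)) ** abs(m)
    envelope = math.exp(-(x * x + y * y) / (r0 * r0))
    return core * envelope

def winding_number(m, radius=2.0, samples=720):
    """Net phase change, in units of 2*pi, along a counterclockwise circle
    of the given radius centered on a vortex placed at the origin."""
    total = 0.0
    previous = None
    for k in range(samples + 1):
        angle = 2.0 * math.pi * k / samples
        a = vortex_amplitude(radius * math.cos(angle), radius * math.sin(angle), m)
        current = cmath.phase(a)
        if previous is not None:
            step = current - previous
            # Unwrap each increment into [-pi, pi) before accumulating.
            step = (step + math.pi) % (2.0 * math.pi) - math.pi
            total += step
        previous = current
    return round(total / (2.0 * math.pi))

for order in (1, -1, 3):
    print(order, winding_number(order), abs(vortex_amplitude(0.0, 0.0, order)))
```

The accumulated phase reproduces the order m, including its sign, and the amplitude vanishes at the vortex center, as required for a single-valued field.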

The sign of the integer m determines whether the vorticity is clockwise or counterclockwise, and the magnitude of m is the number of 2π phase shifts in one cycle around the singularity. For the amplitude distribution to be single-valued it must go to zero at the center, as is indeed the case in Eq. (21.1). The host is here a circular Gaussian beam having 1/e radius r0 at the waist, and the beam’s propagation direction is along the Z-axis. Figure 21.1 shows an m = +1 vortex centered at (x0, y0) = (0, 0) within a Gaussian beam of radius r0 = 10λ, where λ is the wavelength of the light. The

† This chapter was coauthored with Ewan M. Wright of the College of Optical Sciences, University of Arizona.

Figure 21.1 A Gaussian beam, having 1/e radius r0 = 10λ at the waist, hosts an m = +1 vortex at its center. (a) Intensity and (b) phase distribution at the beam waist. (c) Interferogram with a tilted plane wave. (Horizontal axis: x/λ, from −15 to 15.)

intensity distribution in Figure 21.1(a) has a hole at the center, and the phase distribution in Figure 21.1(b) displays a continuous variation from 0 to 2π around the vortex. If this vortex is made to interfere with a tilted plane wave, the resulting fringe pattern would resemble that in Figure 21.1(c). The fork at the center of the fringe pattern created by the splitting of a single fringe is characteristic of all first-order vortices. An important feature of vortices is that they maintain their identity as they propagate through space. Figure 21.2 shows the intensity and phase

Figure 21.2 The beam of Figure 21.1 propagates to its Rayleigh range at z = 314λ. (a) Intensity, (b) phase. The phase singularity is now mixed with the wavefront curvature. (Horizontal axis: x/λ, from −65 to 65.)

distributions of the vortex of Figure 21.1 after propagating to the Rayleigh range of the host Gaussian beam at z = πr0²/λ = 314λ. Clearly the central hole in the intensity pattern and the 2π phase variation around the singularity are preserved, even though the phase is mixed with the curvature acquired during propagation.

The flow of energy

Figure 21.3 shows another example of an optical vortex, this one of order m = 3. The intensity distribution in Figure 21.3(a) is similar to that of the first-order vortex in Figure 21.1(a). However, the phase distribution depicted in Figure 21.3(b) shows a continuous 6π variation around the center. The fringe pattern in Figure 21.3(c), produced by allowing the vortex to interfere with a tilted plane wave, has the characteristic fork; here one fringe splits into four. Figure 21.4 shows the distribution of the Poynting vector S for the above vortex. Shown from top to bottom are the x-, y-, and z-components of S encoded in gray-scale (black corresponds to a minimum, white to a maximum). The

Figure 21.3 Same as Figure 21.1 for an m = 3 vortex. (Horizontal axis: x/λ, from −20 to 20.)

normalized ranges of values are: −0.04 ≤ Sx ≤ 0.04, −0.04 ≤ Sy ≤ 0.04, and 0 ≤ Sz ≤ 1. So, for example, in Figure 21.4(a) the bright regions indicate that Sx is directed along +X, while in the dark regions Sx is directed along −X. Similarly, Sy on the left-hand side of Figure 21.4(b) is directed along +Y, while on the right-hand side it is directed along −Y. In Figure 21.4(c), where Sz ≥ 0, large positive values appear bright whereas those in the vicinity of zero are dark. Overall, S spirals around the Z-axis. The electromagnetic energy, therefore, does not flow straightforwardly along the optical axis but twists and turns as the beam moves forward.

Figure 21.4 From top to bottom: x-, y-, and z-components of the Poynting vector S for the vortex of Figure 21.3. The minimum value of each function is shown as black and its maximum value as white, the intermediate values being covered by the gray-scale. The depicted ranges of values (in normalized units) are: −0.04 ≤ Sx ≤ 0.04, −0.04 ≤ Sy ≤ 0.04, 0 ≤ Sz ≤ 1. The Poynting vector here has clockwise circulation around the optical axis. (Horizontal axis: x/λ, from −20 to 20.)

When the vortex of Figure 21.3 propagates to the Rayleigh range at z = 314λ, the intensity and phase patterns of Figure 21.5 are obtained. As before, the central hole in the intensity pattern and the singularity of the phase pattern are preserved, but the phase is now mixed with the curvature of the diverging wavefront.

Figure 21.5 The beam of Figure 21.3 is propagated to its Rayleigh range at z = 314λ. (a) Intensity, (b) phase. The phase singularity is mixed with the wavefront curvature. (Horizontal axis: x/λ, from −65 to 65.)

Vortex pair

The complex amplitude of a beam containing multiple vortices may be written as the product of terms similar to those appearing in Eq. (21.1), namely,

\[ A(x, y, z=0) = \Bigl\{ \prod_{n=1}^{N} \bigl[(x - x_n) + \mathrm{i}\,\mathrm{sign}(m_n)\,(y - y_n)\bigr]^{|m_n|} \Bigr\} \exp\bigl[-(x^2 + y^2)/r_0^2\bigr]. \tag{21.2} \]

This equation represents N vortices nested in a Gaussian beam of radius r0; the nth vortex, whose order is mn, is centered at the point (xn, yn) within the XY-plane. Figure 21.6 shows two identical m = +1 vortices, separated by a distance d = 11λ in a Gaussian beam with r0 = 10λ. From left to right, the distributions represent the beam at its waist (z = 0), at the Rayleigh range (z = 314λ), and in the far field (z = 2000λ). Note that the three cross-sections in this figure are

Figure 21.6 A pair of identical m = +1 vortices separated by d = 11λ, nested within a Gaussian beam having r0 = 10λ at the waist. Shown are the logarithmic plots of intensity (top row) and phase (bottom row) at several distances from the waist. (a) At the waist of the beam. (b) At the Rayleigh range, z = 314λ. (c) At z = 2000λ. Note that the vortex pair survives into the far field while rotating by 90°. (Horizontal axes: x/λ, from −30 to 30 in (a), −40 to 40 in (b), and −165 to 165 in (c).)

plotted on different scales. The beam expands along the propagation path, of course, but the vortices maintain their relative shape and position while undergoing a collective 90° rotation around the optical axis between the waist and the far field.8

The case of two vortices of opposite helicity is shown in Figure 21.7. Here m = −1 for one vortex and m = +1 for the other. The initial separation between the vortices at the beam waist is d = 11λ. As the beam propagates through free space, the vortices appear to spread out and combine with each other. Eventually, they carve out a circular niche for themselves, but the phase discontinuity near the beam center survives all the way to the far field. A somewhat different behavior will be observed when two vortices of opposite polarity are separated at the beam waist by d < r0. The corresponding intensity and phase patterns remain more or less the same as those in Figure 21.7 (which are representative of the case d > r0) but, at some distance z from the waist, the phase discontinuity near the beam center disappears.4 This behavior is reminiscent of fluid vortices of opposite chirality, which collide and annihilate when they happen to be within each other’s basin of attraction.

Figure 21.7 A pair of vortices of opposite helicity, m = +1 and m = −1, separated by d = 11λ, nested within a Gaussian beam having r0 = 10λ at the waist. Shown are the logarithmic plots of intensity (top row) and phase (bottom row) at several distances from the waist. (a) At the waist of the beam. (b) At the Rayleigh range, z = 314λ. (c) At z = 2000λ. Note that the phase singularity survives into the far field. (Horizontal axes: x/λ, from −30 to 30 in (a), −40 to 40 in (b), and −165 to 165 in (c).)

Relation to Gauss–Hermite (or Laguerre) polynomials

It can be shown that N vortices of the type described by Eq. (21.2) can be written as a superposition of Gauss–Hermite (or Gauss–Laguerre) polynomials of order N. These polynomials, which describe the eigenmodes of certain waveguides, are also the eigenmodes of free-space propagation in the paraxial regime. The vortices formed by the superposition of such modes propagate in free space but, while individual modes maintain their identities, different modes accrue different phase shifts. As a result of these differing phase shifts the pattern of vortices might change along the optical axis, but the main topological features of their singularities are usually preserved.4,9

Resolving adjacent vortices

An interesting question concerning vortices is whether one can densely pack them on a given waveform (or on a given surface) and use the resulting pattern

Figure 21.8 Densely packed vortices imprinted upon a sample’s flat surface may be observed through a coherent-light microscope. The incident Gaussian beam has 1/e radius r0 = 900λ. The entrance pupil of the 0.95NA objective lens, having a radius of 3000λ, allows the Gaussian beam through with negligible truncation. The beam reflected from the sample picks up the amplitude and phase patterns of the vortices and returns through the objective lens. The phase structure may be extracted by interference with the original Gaussian beam reflected from the reference mirror. (Labeled elements: Gaussian beam, mirror, objective, sample, observation plane.)

for data communication (or for information storage). Figure 21.8 is the schematic of a coherent-light microscope that might be used to retrieve a dense pattern of vortices recorded on a flat surface. The Gaussian beam entering the system is narrow enough that truncation at the objective’s aperture may be considered negligible. Upon focusing through the 0.95NA lens, the FWHM diameter of the focused spot becomes 1.33λ. The focused spot is modulated by the amplitude and phase reflectivity of the sample before returning to the objective lens. At the beam-splitter the returning beam is diverted towards the observation plane, where the intensity pattern may be examined directly, and the phase pattern may be obtained by interference with a reference beam (supplied by the mirror). Figure 21.9 shows the patterns of intensity and phase at the focal plane of the objective lens immediately after the beam is reflected from the sample. There appear here a total of seven vortices within the focused beam area, all with the same helicity, m = +1. The pair in the middle, having a center-to-center spacing of λ/2, is at the resolution limit of conventional optical microscopy. When the reflected beam reaches the observation plane, the patterns shown in Figure 21.10 are obtained. Note that both the intensity and the phase

Figure 21.9 (a) Intensity and (b) phase distribution imparted to the focused Gaussian beam in Figure 21.8 immediately after reflection from the sample’s surface. There is a total of seven m = +1 vortices; the distance between the closest pair, near the center, is 0.5λ. (Horizontal axis: x/λ, from −4 to 4.)

distribution at the observation plane are magnified versions of the original distributions at the sample, albeit with an inconsequential 90° rotation around the optical axis. As was the case for two vortices of the same helicity (see Figure 21.6), the beam in the present case has also preserved the seven equal-helicity vortices all the way to the far field. To observe the phase structure, however, it is necessary to interfere the beam returning from the sample with a reference beam. The resulting interference pattern is shown in Figure 21.10(c). The split fringes corresponding to five of the vortices are clearly distinguishable in this pattern, even for those vortices that are far from the beam center. However, the split fringes for the two adjacent vortices near the center are hard to recognize and, in practice, where the signal-to-noise ratio is limited, it is unlikely that these vortices can be resolved. It appears, therefore, that the Rayleigh criterion for resolution in image-forming systems applies to these vortices as well, even though the image quality here is extremely good and no information seems to have been lost between the sample and the observation plane.

Figure 21.10 Distributions of (a) intensity and (b) phase at the observation plane of Figure 21.8 corresponding to the seven vortices of Figure 21.9. In (a) and (b) the reference beam is blocked. In (c) the reference beam interferes with the beam returning from the sample, thus creating fringes. The vortices may be identified by the forks within these fringes. (Horizontal axis: x/λ, from −3100 to 3100.)

References for Chapter 21

1 J. F. Nye and M. V. Berry, Dislocations in wave trains, Proc. Roy. Soc. London A 336, 165–190 (1974).
2 N. B. Baranova et al., Wavefront dislocations: topological limitations for adaptive systems with phase conjugation, J. Opt. Soc. Am. 73, 525–528 (1983).
3 J. M. Vaughan and D. V. Willetts, Temporal and interference fringe analysis of TEM01* laser modes, J. Opt. Soc. Am. 73, 1018–1021 (1983).
4 G. Indebetouw, Optical vortices and their propagation, J. Mod. Opt. 40, 73–87 (1993).
5 N. R. Heckenberg et al., Laser beams with phase singularities, Opt. Quant. Electronics 24, S951–S962 (1992).
6 K. T. Gahagan and G. A. Swartzlander, Optical vortex trapping of particles, Opt. Lett. 21, 827–829 (1996).
7 H. He et al., Direct observation of transfer of angular momentum to absorptive particles from a laser beam with a phase singularity, Phys. Rev. Lett. 75, 826–829 (1995).
8 D. Rozas, Z. S. Sacks and G. A. Swartzlander, Experimental observation of fluidlike motion of optical vortices, Phys. Rev. Lett. 79, 3399–3402 (1997).
9 M. W. Beijersbergen et al., Astigmatic laser mode converters and transfer of orbital angular momentum, Opt. Comm. 96, 123–132 (1993).

22 Geometric-optical rays, Poynting’s vector, and the field momenta

In isotropic media the rays of geometrical optics are usually obtained from the surfaces of constant phase (i.e., wavefronts) by drawing normals to these surfaces at various points of interest.1 It is also possible to find the rays from the eikonal equation, which is derived from Maxwell’s equations in the limit when the wavelength λ of the light is vanishingly small.2 Both methods provide a fairly accurate picture of beam-propagation and electromagnetic-energy transport in situations where the concepts of geometrical optics and ray-tracing are applicable. The artifact of rays, however, breaks down near caustics and focal points and in the vicinity of sharp boundaries, where diffraction effects and the vectorial nature of the field can no longer be ignored.

It is possible, however, to define the rays in a rigorous manner (consistent with Maxwell’s electromagnetic theory) such that they remain meaningful even in those regimes where the notions of geometrical optics break down. Admittedly, in such regimes the rays are no longer useful for ray-tracing; for instance, the light rays no longer propagate along straight lines even in free space. However, the rays continue to be useful as they convey information about the magnitude and direction of the energy flow, the linear momentum of the field (which is the source of radiation pressure), and the angular momentum of the field. Such properties of light are currently of great practical interest, for example, in developing optical tweezers, where focused laser beams control the movements of small objects.3,4,5,6 Similarly, the manipulation of atoms and molecules with laser beams is presently an active area of research that has tremendous potential for future applications.7

Computing the Poynting vector

For a coherent, monochromatic beam of light the time-averaged Poynting vector S at any point (x, y, z) in space can represent the direction and magnitude of the corresponding ray. Computing S is fairly straightforward and involves only a few


fast-Fourier transformations (FFTs). Outlined below is a method of calculating S for a beam in free space, but the method can readily be generalized to material environments as well. Throughout this chapter the adopted system of units is MKS, c is the speed of light in vacuum, ε0 is the permittivity of free space (ε0c² = 10⁷/4π), and h is Planck’s constant.

With the beam’s propagation direction fixed along the Z-axis, consider the distribution of the E-field in the beam’s cross-sectional plane, XY. The only components of E needed for calculating S are Ex and Ey. To compute the Poynting vector, decompose the beam into its plane-wave spectrum. This requires one FFT for Ex(x, y) and another for Ey(x, y). For each plane wave thus obtained compute the Z-component Ez of the E-field using the requirement k · E = 0. Here k = 2πσ/λ is the wave-vector for the plane wave propagating along the unit vector σ. The knowledge of E and k for each plane wave leads directly to the corresponding magnetic field B = σ × E/c. At this point all six components of E and B for each plane wave are determined; therefore, an inverse FFT on each such component would yield the complete E- and B-fields within the cross-sectional XY-plane. Finally, the time-averaged Poynting vector may be obtained from S = ½ε0c² Real(E × B*).8

Rays of a linearly polarized Gaussian beam

As an example, consider a Gaussian beam of wavelength λ at its waist, having 1/e (amplitude) radii Rx = 15λ along X, Ry = 10λ along Y. Assuming that the beam is linearly polarized in the X-direction, its Poynting vector may be computed by the aforementioned method. Figure 22.1(a) shows the distribution of intensity for the x-component of polarization, Ix = |Ex|². By definition, this beam is linearly polarized along X and has no Ey, but Ez, albeit very small, is not zero, as can be seen in Figure 22.1(b). The phase of Ez (not shown) is +90° on the left-hand side and −90° on the right-hand side of the beam’s cross-section.
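The per-plane-wave step of the method outlined above can be sketched in a few lines of Python. This is only an illustration of that single step under stated assumptions: it omits the forward and inverse FFTs, treats one plane wave whose unit propagation vector is passed in as `sigma` (with a nonzero z-component), and uses hypothetical function names.

```python
import math

C = 299_792_458.0                       # speed of light in vacuum (m/s)
EPS0 = 1.0e7 / (4.0 * math.pi * C * C)  # from eps0 * c^2 = 1e7 / (4 pi), MKS units

def cross(a, b):
    """Cross product of two 3-component vectors (real or complex)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def plane_wave_poynting(sigma, ex, ey):
    """Time-averaged Poynting vector of one plane wave travelling along the
    unit vector sigma, given its complex transverse amplitudes Ex and Ey."""
    sx, sy, sz = sigma
    ez = -(sx * ex + sy * ey) / sz          # transversality: k . E = 0
    e_field = (complex(ex), complex(ey), complex(ez))
    b_field = tuple(v / C for v in cross(sigma, e_field))   # B = sigma x E / c
    b_conj = tuple(v.conjugate() for v in b_field)
    return tuple(0.5 * EPS0 * C * C * v.real for v in cross(e_field, b_conj))

# Linearly polarized wave travelling along Z.
print(plane_wave_poynting((0.0, 0.0, 1.0), 1.0, 0.0))
# Circularly polarized wave tilted by 0.3 rad in the XZ-plane.
tilted = (math.sin(0.3), 0.0, math.cos(0.3))
print(plane_wave_poynting(tilted, 1.0, 1j))
```

For any single plane wave the result is parallel to `sigma`, whatever the polarization. Transverse components of S such as those discussed in this chapter can therefore arise only from the superposition of many plane waves, which is what the inverse FFT step assembles.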
For this beam it turns out that Sx = Sy = 0, and only Sz ≠ 0; a plot of Sz at the waist of the beam is shown in Figure 22.1(c). Clearly Sz is strongest at the beam center and decays with increasing distance from the center, behaving very much like Ix does. The fact that S in this case is everywhere parallel to the Z-axis is consistent with one’s intuitive expectation that, at its waist, the Gaussian beam should be collimated. When the beam propagates away from the waist, it acquires curvature and the rays exhibit behavior characteristic of a divergent beam. The Cartesian components of the Poynting vector (Sx, Sy, Sz) all turn out to be nonzero in this case. Figure 22.2 shows various distributions for the above Gaussian beam at a distance of z = 800λ from the waist. Shown in the left-hand column are the intensity profiles for the three Cartesian components of E. The peak intensities in these

Figure 22.1 Various distributions at the waist of a Gaussian beam having 1/e (amplitude) radii Rx = 15λ, Ry = 10λ; the beam is linearly polarized along the X-axis. (a) Intensity of the x-component of polarization, Ix = |Ex|². (b) Intensity of the z-component of polarization, Iz = |Ez|². In (a) and (b) the peak intensities are in the ratio Ix : Iz = 1.0 : 0.83 × 10⁻⁴. (c) A plot of Sz, the projection of the Poynting vector S along the optical axis. Sz(x, y) ≥ 0 is encoded in gray-scale (black, minimum; white, maximum). The other components of S, namely, Sx and Sy, are exactly zero at this cross-section. (Horizontal axis: x/λ, from −20 to 20.)

figures are in the ratios Ix : Iy : Iz = 1.0 : 0.39 × 10⁻⁸ : 0.83 × 10⁻⁴. Whereas the beam at the waist is elongated along X, at z = 800λ it is elongated along Y; this is a natural consequence of diffractive propagation. The phase plots in Figure 22.2, middle column, reveal the acquired curvature of the beam, as well as a π phase difference between the adjacent quadrants of Ey and the two halves of Ez. The

Figure 22.2 The Gaussian beam of Figure 22.1 after propagating a distance of z = 800λ in free space. The left-hand column shows, from top to bottom, the distributions of intensity for the x-, y-, and z-components of polarization; the peak intensities are in the ratios Ix : Iy : Iz = 1.0 : 0.39 × 10⁻⁸ : 0.83 × 10⁻⁴. The middle column shows the corresponding phase plots for Ex, Ey, Ez; the gray-scale covers the range −180° (black) to +180° (white). The third column shows the Cartesian components of the Poynting vector, Sx, Sy, Sz, in gray-scale (black, minimum; white, maximum). Here the normalized ranges of values are: −0.48 ≤ Sx ≤ 0.48, −0.9 ≤ Sy ≤ 0.9, 0 ≤ Sz ≤ 100. Symmetry with respect to the optical axis ensures that the angular momentum of the field around this axis is zero. Note that the dimensions are not the same in the three columns. (Horizontal axes: x/λ, from −40 to 40, −75 to 75, and −40 to 40 in the left-hand, middle, and right-hand columns, respectively.)

general structure of the intensity and phase patterns depicted here may be readily understood in terms of the symmetries of the Gaussian beam and the basic properties of electromagnetic radiation. Shown in the right-hand column of Figure 22.2 are, from top to bottom, the x-, y-, and z-components of S encoded in gray-scale (black corresponds to a minimum, white to a maximum). The normalized ranges of values


are: −0.48 ≤ Sx ≤ 0.48, −0.9 ≤ Sy ≤ 0.9, 0 ≤ Sz ≤ 100. So, for example, in the top frame the bright regions indicate that Sx is directed along +X, while in the dark regions Sx points toward −X. Similarly, Sy in the upper half of the middle frame points along +Y, while it is directed along −Y in the lower half. In the bottom frame, where Sz ≥ 0, the large positive values appear bright and those in the vicinity of zero appear dark. As expected, these plots of Sx, Sy, Sz represent a divergent beam.

The case of circular polarization

Let us consider once again the Gaussian beam for which the computed Poynting vector at the waist was shown in Figure 22.1(c). This time, however, we assume that the polarization state of the beam is circular rather than linear. The Cartesian components of S for this circularly polarized Gaussian beam at the waist are shown in Figure 22.3. The normalized ranges of values are: −0.96 ≤ Sx ≤ 0.96, −0.64 ≤ Sy ≤ 0.64, 0 ≤ Sz ≤ 100. Although Sx and Sy are nonzero at the waist, they exhibit neither a convergent nor a divergent behavior. Indeed the projection of S in the XY-plane, Sx x̂ + Sy ŷ, shows only a counterclockwise circulation. (Reversing the sense of circular polarization of the beam would reverse the circulation of S as well.) From the standpoint of geometrical optics this behavior of the rays is totally unexpected, since the state of polarization should affect neither the magnitude nor the direction of the rays. Nonetheless, taking into account the full distribution of the fields (especially the components Ez and Bz) yields for circularly polarized light a non-zero projection of S in the XY-plane, in sharp contrast to the case of linear polarization where Sx = Sy = 0.

Linear and angular momenta of the field

It is well known that the momentum density of the field is directly proportional to S. (Feynman et al.8 give a beautiful exposition of the concept of field momentum density and its relation to the Poynting vector, p = S/c².) The field’s angular momentum is then computed by integrating r × p over the volume of interest. Here r is the position vector and p is the field’s momentum density at location r. Since the momentum distribution of the circularly polarized Gaussian beam depicted in Figure 22.3 has a net circulation in the XY-plane, it follows that the beam carries a net angular momentum around the optical axis Z. If, for instance, such a beam is absorbed by a particle, it will exert a torque on the particle due to the transferred angular momentum.9 If one expands the Gaussian beam by enlarging its cross-sectional area (while maintaining its total optical power), Sx and Sy decrease faster than Sz and, in the

Figure 22.3 From top to bottom: plots of S_x, S_y, S_z at the waist of a circularly polarized Gaussian beam having 1/e radii (R_x, R_y) = (15λ, 10λ). The normalized ranges of values are: −0.96 ≤ S_x ≤ 0.96, −0.64 ≤ S_y ≤ 0.64, and 0 ≤ S_z ≤ 100. The counterclockwise circulation of S around the optical axis gives rise to the beam’s angular momentum around this axis. [Panels (a) to (c); horizontal axis x/λ from −20 to 20.]

limit of an infinitely large beam (i.e., a plane wave), S_x and S_y vanish. Does this mean that a circularly polarized plane wave does not carry angular momentum? The answer is no, because while S_x and S_y diminish with the expansion of the beam they also spread over a larger area, yielding the same final value for the integrated r × p over the beam’s cross-section.10 This is also in agreement with the quantum picture of light, where a circularly polarized photon of frequency ν carries energy hν and a unit of angular momentum h/2π.
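The photon picture quoted above lends itself to a quick numerical estimate. The sketch below assumes a fully absorbed, circularly polarized beam in which every photon hands over h/2π of angular momentum; the beam power and wavelength are hypothetical values chosen only for illustration, and the function names are not from the text. Under these assumptions the torque reduces to power divided by angular frequency, independent of the beam's cross-sectional area.

```python
import math

H = 6.62607015e-34      # Planck constant, J*s (exact SI value)
C = 299_792_458.0       # speed of light in vacuum, m/s (exact SI value)

def photon_rate(power_w, wavelength_m):
    """Photons per second carried by a beam of given power and vacuum wavelength."""
    nu = C / wavelength_m               # optical frequency
    return power_w / (H * nu)           # each photon carries energy h*nu

def torque_on_absorber(power_w, wavelength_m):
    """Torque (N*m), assuming every absorbed photon transfers h/(2*pi)."""
    hbar = H / (2.0 * math.pi)
    return photon_rate(power_w, wavelength_m) * hbar

power = 1.0e-3          # hypothetical beam power, W
wavelength = 633e-9     # hypothetical vacuum wavelength, m
tau = torque_on_absorber(power, wavelength)
omega = 2.0 * math.pi * C / wavelength
print(f"torque = {tau:.3e} N*m, power/omega = {power / omega:.3e} N*m")
```

The two printed numbers agree because the photon rate times h/2π is algebraically the same as power/ω; nothing in the estimate depends on how wide the beam is.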


Spin versus orbital angular momentum

In Chapter 21 we discussed optical vortices and showed that their Poynting-vector distribution over the beam’s cross-section exhibits a circulation similar to that seen here in Figure 22.3. In the case of these vortices the state of polarization was linear and the circulation of S arose from the particular phase structure of the beam, whereas in the present case the phase is uniform but the polarization is circular. These two cases have often been compared to the orbital and spin angular momenta of bound electrons but, in reality (unless the beam is treated in the paraxial approximation), the two contributions to angular momentum are intermixed, making it difficult to distinguish one from the other. All one can say in the general case is that the field has a net angular momentum, which is obtained by integrating r × p over the beam’s cross-section.10,11

Rays at the focal plane of a lens

As a final example, we show the complex pattern of ray distribution that can be obtained by focusing a relatively simple beam through a diffraction-limited microscope objective lens. Consider a beam of constant amplitude and phase, but non-uniform polarization, in which one side is linearly polarized at +45° and the other side at −45° relative to the X-axis. The distribution of polarization angle over the beam’s cross-section is shown in Figure 22.4(a). Let this beam be brought to focus by an aberration-free 0.5NA lens. The distribution of total E-field intensity (i.e., |E_x|² + |E_y|² + |E_z|²) at the focal plane is shown in Figure 22.4(b).

Figure 22.4 A coherent, monochromatic, and collimated beam having constant amplitude and phase but nonuniform polarization enters the pupil of an aberration-free, 0.5NA lens. (a) Distribution of the beam’s polarization angle at the entrance pupil. Both halves are linearly polarized, the right at +45° and the left at −45° with respect to the X-axis. (b) Logarithmic plot of total E-field intensity at the focal plane of the lens. [Horizontal axes: (a) x/λ from −3200 to 3200; (b) x/λ from −5 to 5.]

Figure 22.5 Various distributions for the focused spot of Figure 22.4(b). The left-hand column shows, from top to bottom, plots of intensity for the x-, y-, and z-components of polarization. The peak intensities are in the ratios I_x : I_y : I_z = 0.49 : 1.0 : 0.06. The corresponding phase plots appear in the middle column, where the gray-scale covers the range −180° (black) to +180° (white). The right-hand column shows plots of S_x, S_y, S_z in gray-scale (black, minimum; white, maximum). The normalized ranges of values are: −9.5 ≤ S_x ≤ 9.5, −22.6 ≤ S_y ≤ 12.9, 0 ≤ S_z ≤ 100. Note that the dimensions are not the same in the three columns.

Note the elongation of the focused spot along the X-axis, which is a consequence of the particular polarization pattern of the incident beam. Figure 22.5, left column, shows the computed intensity distributions for the x-, y-, and z-components of polarization at the focal plane. The corresponding phase patterns are shown in the middle column. Of particular interest here are the focal-plane distributions of S_x, S_y, S_z, shown in the right-hand column. There are two equal but opposite vortices in this picture, which may be discerned by considering the combined effects of S_x and S_y. A schematic diagram of the projection of S in the focal plane, namely, S_x x̂ + S_y ŷ, is given in Figure 22.6. These, as well as more complex momentum distributions, can now be routinely created in the laboratory and

Figure 22.6 A schematic diagram showing the vortex structure of the Poynting vector for the focused spot depicted in Figures 22.4(b) and 22.5. The arrows represent the projection of S in the focal plane, namely, S_x x̂ + S_y ŷ.

used to trap and manipulate small objects within the confines of the focal region of a microscope.

References for Chapter 22

1 M. V. Klein, Optics, Wiley, New York (1970).
2 M. Born and E. Wolf, Principles of Optics, sixth edition, Pergamon Press, Oxford (1980).
3 A. Ashkin, J. M. Dziedzic, J. E. Bjorkholm, and S. Chu, Observation of a single-beam gradient force optical trap for dielectric particles, Opt. Lett. 11, 288–290 (1986).
4 K. T. Gahagan and G. A. Swartzlander, Optical vortex trapping of particles, Opt. Lett. 21, 827–829 (1996).
5 H. He et al., Direct observation of transfer of angular momentum to absorptive particles from a laser beam with a phase singularity, Phys. Rev. Lett. 75, 826–829 (1995).
6 M. W. Berns, Laser scissors and tweezers, Scientific American 278, 62–67 (April 1998).
7 E. A. Cornell and C. E. Wieman, The Bose–Einstein condensate, Scientific American 278, 40–45 (March 1998).
8 R. P. Feynman, R. B. Leighton, and M. Sands, The Feynman Lectures on Physics, Addison-Wesley, Reading, Massachusetts (1964). See Vol. I, section 34–9, and Vol. II, chapter 27.
9 M. Kristensen and J. P. Woerdman, Is photon angular momentum conserved in a dielectric medium?, Phys. Rev. Lett. 72, 2171–2174 (1994).
10 H. A. Haus and J. L. Pan, Photon spin and the paraxial wave equation, Am. J. Phys. 61, 818–821 (1993).
11 S. M. Barnett and L. Allen, Orbital angular momentum and nonparaxial light beams, Opt. Commun. 110, 670–678 (1994).

23 Doppler shift, stellar aberration, and convection of light by moving media

The characteristics of a beam of light emanating from a source in uniform motion with respect to an observer differ from those measured when the source is stationary. In general, it is irrelevant whether the source is stationary and the observer in motion or vice versa; the observed characteristics depend only on the relative motion. The observed frequency of the light, for example, has been known to depend on this relative motion since the Austrian physicist Christian Doppler (1842) showed the effect to exist both for sound waves and light waves.1 The perceived direction of propagation of a light beam also depends on the relative motion of its source and the observer. The English astronomer James Bradley (1727) was the first to argue that the motion of the Earth in its orbit around the Sun causes a periodic shift of the apparent position of fixed stars as observed from the Earth; a telescope viewing a star must be tilted in the direction of the Earth’s motion. Although this so-called stellar aberration could be explained on the basis of the corpuscular theory of light accepted at the time,1 certain features of it remained poorly understood until the advent of Einstein’s special theory of relativity in 1905. The mid-nineteenth century measurements of the speed of light in moving media could be made to agree with the prevailing theories at the time only if one assumed that the moving medium partially carried the luminiferous ether, the hypothetical medium which filled the Universe and in which the light waves propagated. The magnitude of this ether convection depended on the velocity as well as the refractive index of the moving medium.1 Since the refractive index is wavelength dependent, implicit in these theories was the assumption that different ethers exist for different light colors, each being carried at a different rate by the moving medium. This ad hoc and unsatisfactory state of affairs came to an end with the advent of the special theory of relativity. 
Doppler shift, stellar aberration, and the convection of light by moving media are the various manifestations of the same fundamental phenomenon: different relative perceptions of space and time for observers in motion with respect to one another. In this chapter we derive general formulas for all three phenomena by applying the Lorentz transformation to a plane electromagnetic wave. Examples will be used to clarify the physics behind the formulas.

Plane waves and the Lorentz transformation

A plane electromagnetic wave of frequency f propagating in free space along a direction specified by the polar and azimuthal angles (θ, φ) within a Cartesian XYZ coordinate system has the following form:

$$ a(x, y, z, t) = A_0 \exp\{\, i 2\pi (f/c) [ (\sin\theta \cos\phi)\, x + (\sin\theta \sin\phi)\, y + (\cos\theta)\, z - ct ] \,\}. \tag{23.1a} $$

Here c is the speed of light in free space, and the complex vector A_0 denotes the strength of the field at the origin of the coordinate system. Let us define the coordinates of a point in space-time as p = (x, y, z, ict). The coefficients appearing in the exponent of the plane wave in Eq. (23.1a) can then be grouped together as σ = [sin θ cos φ, sin θ sin φ, cos θ, i], and the equation may be written in compact form,

$$ a(x, y, z, t) = A_0 \exp[\, i 2\pi (f/c)\, \sigma p^{T} \,], \tag{23.1b} $$

where superscript T denotes a transposed vector. A second inertial system, X′Y′Z′, in which the XYZ system moves with uniform velocity V along the common Z-direction, is shown in Figure 23.1. The origins of the two systems coincide at t = t′ = 0. The point p′ = (x′, y′, z′, ict′) in the new coordinate system is related to p by the Lorentz transformation,1,2

$$ p^{T} = L\, p'^{\,T}, \tag{23.2a} $$

where the 4 × 4 transformation matrix L is given by

$$ L = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \dfrac{1}{\sqrt{1-(V/c)^2}} & \dfrac{i(V/c)}{\sqrt{1-(V/c)^2}} \\ 0 & 0 & \dfrac{-i(V/c)}{\sqrt{1-(V/c)^2}} & \dfrac{1}{\sqrt{1-(V/c)^2}} \end{pmatrix}. \tag{23.2b} $$

Figure 23.1 In the Cartesian XYZ coordinate system a plane wave of frequency f (wavelength = λ) propagates along the unit vector u. The polar and azimuthal angles of u are denoted by θ and φ. As seen from another system, X′Y′Z′, the XYZ system moves at a constant velocity V along the Z-axis. From the perspective of an observer stationary in X′Y′Z′, the plane wave is Doppler shifted to a different frequency f′, and the polar angle of its propagation direction has a different value θ′. The azimuthal angle φ, however, remains the same in the two systems.

(Recall that c is a constant, having the same value in any frame of reference in which it is measured.) L is a (complex) orthogonal matrix whose inverse is the same as its transpose, i.e., LL^T equals the 4 × 4 identity matrix. We substitute for p^T in Eq. (23.1b) from Eq. (23.2a), and evaluate σL as follows:

$$ \sigma L = \frac{1 + (V/c)\cos\theta}{\sqrt{1-(V/c)^2}}\, [\sin\theta' \cos\phi,\ \sin\theta' \sin\phi,\ \cos\theta',\ i]. \tag{23.3a} $$

Here

$$ \sin\theta' = \sqrt{1-(V/c)^2}\, \sin\theta \,/\, [1 + (V/c)\cos\theta], \tag{23.3b} $$

$$ \cos\theta' = (\cos\theta + V/c) \,/\, [1 + (V/c)\cos\theta]. \tag{23.3c} $$
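The algebra behind Eqs. (23.3a) to (23.3c) can be checked numerically. The following is a minimal sketch, assuming the sign placement of the off-diagonal terms that makes LL^T the identity and reproduces Eq. (23.3a) (positive imaginary entry in the third row, negative in the fourth); the helper names and the test values of θ, φ, and V/c are arbitrary choices, not taken from the text.

```python
import math

def lorentz_matrix(beta):
    """Matrix L of Eq. (23.2b), acting on (x, y, z, ict); beta stands for V/c."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, g, 1j * beta * g],
        [0.0, 0.0, -1j * beta * g, g],
    ]

def times_transpose(m):
    """Return m multiplied by its plain (non-conjugated) transpose."""
    return [[sum(m[i][k] * m[j][k] for k in range(4)) for j in range(4)]
            for i in range(4)]

def sigma(theta, phi):
    """Row vector of coefficients defined just before Eq. (23.1b)."""
    return [math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta),
            1j]

def sigma_times_l(theta, phi, beta):
    """Left-hand side of Eq. (23.3a): the row vector sigma times L."""
    s, m = sigma(theta, phi), lorentz_matrix(beta)
    return [sum(s[k] * m[k][j] for k in range(4)) for j in range(4)]

def closed_form(theta, phi, beta):
    """Right-hand side of Eq. (23.3a), built from Eqs. (23.3b) and (23.3c)."""
    root = math.sqrt(1.0 - beta * beta)
    denom = 1.0 + beta * math.cos(theta)
    sin_p = root * math.sin(theta) / denom
    cos_p = (math.cos(theta) + beta) / denom
    scale = denom / root
    return [scale * sin_p * math.cos(phi),
            scale * sin_p * math.sin(phi),
            scale * cos_p,
            scale * 1j]

theta, phi, beta = math.radians(70.0), math.radians(25.0), 0.6  # arbitrary test values
lhs = sigma_times_l(theta, phi, beta)
rhs = closed_form(theta, phi, beta)
identity_error = max(abs(v - (1.0 if i == j else 0.0))
                     for i, row in enumerate(times_transpose(lorentz_matrix(beta)))
                     for j, v in enumerate(row))
mismatch = max(abs(a - b) for a, b in zip(lhs, rhs))
print(f"max |L L^T - I| = {identity_error:.2e}, max |sigma L - closed form| = {mismatch:.2e}")
```

Both residuals should sit at the level of floating-point round-off; flipping the sign of either imaginary entry leaves LL^T equal to the identity only if both are flipped together, which corresponds to reversing the direction of V.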

It is readily verified from Eqs. (23.3b) and (23.3c) that sin²θ′ + cos²θ′ = 1, which is needed if the above definition of θ′ is to be meaningful. We conclude that the plane wave in XYZ remains a plane wave in X′Y′Z′, albeit with a different frequency and a different propagation direction.

Doppler shift

Replacing σp^T in Eq. (23.1b) with σLp′^T and using Eq. (23.3a), it becomes clear that the optical frequency f′ of the plane wave as measured in the X′Y′Z′ system is given by

$$ f' = f\, \frac{1 + (V/c)\cos\theta}{\sqrt{1-(V/c)^2}}. \tag{23.4} $$
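As a numerical illustration, Eq. (23.4) can be evaluated for a few propagation directions. The sketch below is only that: the function names and the value V/c = 0.5 are illustrative assumptions. The helper `no_shift_angle_deg` simply solves 1 + (V/c) cos θ = √(1 − (V/c)²) for θ, which is the condition for f′ = f.

```python
import math

def doppler_ratio(beta, theta_deg):
    """f'/f from Eq. (23.4); beta stands for V/c, theta is the polar angle in XYZ."""
    theta = math.radians(theta_deg)
    return (1.0 + beta * math.cos(theta)) / math.sqrt(1.0 - beta * beta)

def no_shift_angle_deg(beta):
    """Polar angle at which Eq. (23.4) gives f' = f (requires 0 < |beta| < 1)."""
    cos_theta = -(1.0 - math.sqrt(1.0 - beta * beta)) / beta
    return math.degrees(math.acos(cos_theta))

beta = 0.5  # illustrative value of V/c
for theta_deg in (0.0, 45.0, 90.0, 135.0, 180.0):
    print(f"theta = {theta_deg:5.1f} deg  ->  f'/f = {doppler_ratio(beta, theta_deg):.4f}")
theta_0 = no_shift_angle_deg(beta)
print(f"no shift at theta = {theta_0:.2f} deg, f'/f = {doppler_ratio(beta, theta_0):.6f}")
```

For positive V/c the ratio exceeds unity at θ = 0°, stays above unity at θ = 90° (a purely relativistic effect), and drops below unity toward θ = 180°, so the no-shift direction necessarily lies beyond 90°.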

This is the relativistic formula for the Doppler shift,2 valid for all propagation directions θ and all speeds V. When θ = 0°, the propagation direction and the motion of the observer are antiparallel. In this case f′ is greater than f (i.e., blue-shifted) according to the following formula:

$$ f' = f \sqrt{[1 + (V/c)] \,/\, [1 - (V/c)]}. \tag{23.5a} $$

When θ = 180°, the propagation direction and the motion of the observer are parallel, in which case f′ is less than f (i.e., red-shifted) as follows:

$$ f' = f \sqrt{[1 - (V/c)] \,/\, [1 + (V/c)]}. \tag{23.5b} $$

When θ = 90°, the observer is moving at right angles to the propagation direction. The classical analysis does not yield any Doppler shift in this case,1 but the relativistic formula yields

$$ f' = f \,/\, \sqrt{1-(V/c)^2}. \tag{23.5c} $$

It is possible to have a direction of propagation with no Doppler shift at all, i.e., f′ = f. From Eq. (23.4) this direction is found to be:

$$ \cos\theta = -\left[ 1 - \sqrt{1-(V/c)^2} \right] \big/ (V/c). \tag{23.6} $$

Substitution into Eqs. (23.3b) and (23.3c) reveals that, when the above condition is satisfied, θ′ = 180° − θ. Based on Eq. (23.4), Figure 23.2(a) shows plots of f′/f versus V/c for several values of θ, while Figure 23.2(b) shows plots of f′/f versus θ for different values of V/c. At a given velocity V, the beam is blue-shifted when θ = 0°, i.e., the observer moves opposite to the direction of propagation, and red-shifted when θ = 180°, i.e., the observer moves along the propagation direction. If θ is varied continuously from 0° to 180°, the frequency changes from blue-shifted to red-shifted, becoming equal to f somewhere after θ = 90°. As V increases, the crossing point occurs at larger angles θ.

Stellar aberration

The direction of propagation of the beam perceived by the observer in the X′Y′Z′ frame has polar angle θ′, given by Eqs. (23.3b) and (23.3c), and azimuthal angle

Figure 23.2 A plane wave of frequency f and propagation direction (θ, φ) in the XYZ coordinates is observed from the X′Y′Z′ system of Figure 23.1. The Doppler-shifted frequency f′ seen by the observer is a function of V and θ, but does not depend on φ. (a) Plots of f′/f versus V/c for several values of θ. (b) Plots of f′/f versus θ for different values of V/c. [Panel (a): curves for θ = 0°, 30°, 60°, 90°, 120°, 150°, 180°, with V/c from 0 to 1. Panel (b): curves for V/c = 0.1, 0.3, 0.5, 0.7, 0.9, with θ from 0° to 180°. Vertical axes: f′/f from 0 to 5.]

φ. Dividing Eq. (23.3b) by Eq. (23.3c) yields2

$$ \tan\theta' = \sqrt{1-(V/c)^2}\, \sin\theta \,/\, (\cos\theta + V/c). \tag{23.7} $$

Figure 23.3(a) shows plots of θ′ as a function of V/c for several values of θ. Similarly, Figure 23.3(b) shows plots of θ′ versus θ for different values of V/c. It is clear that, for a given θ, the apparent direction of propagation depends on the relative velocity between the observer and the light source. For instance, if a telescope is aimed at a distant star far above the plane of the Solar System, then in a reference frame where the star is stationary, θ = 90°. However, for an Earthbound observer, cos θ′ = V/c, where V ≈ 31 km/s is the speed of the Earth in its orbit around the Sun. As the Earth travels in its orbit, the direction of V changes and so does the apparent location of the star. In a six-month period, cos θ′ changes by 2V/c, causing a shift of Δθ′ ≈ 0.012° ≈ 43″ in the star’s apparent location. (In contrast, the size of the parallax for the nearest star as measured from the same location on the Earth over a six-month period is less than 2″.)
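The six-month shift quoted above follows from Eq. (23.7) evaluated at θ = 90° for V/c of opposite signs. The sketch below uses the orbital speed of 31 km/s given in the text; the function name is a hypothetical choice. (A mean orbital speed closer to 29.8 km/s is often quoted, which would lower the result to roughly 41″.)

```python
import math

C_KM_S = 299_792.458   # speed of light in vacuum, km/s

def aberrated_polar_angle_deg(theta_deg, beta):
    """theta' from Eq. (23.7); atan2 keeps the angle in the 0..180 degree range."""
    theta = math.radians(theta_deg)
    numerator = math.sqrt(1.0 - beta * beta) * math.sin(theta)
    denominator = math.cos(theta) + beta
    return math.degrees(math.atan2(numerator, denominator))

v_orbit = 31.0                       # km/s, the value used in the text
beta = v_orbit / C_KM_S
theta_a = aberrated_polar_angle_deg(90.0, +beta)   # one point of the orbit
theta_b = aberrated_polar_angle_deg(90.0, -beta)   # six months later, V reversed
shift_deg = theta_b - theta_a
shift_arcsec = shift_deg * 3600.0
print(f"shift = {shift_deg:.4f} deg = {shift_arcsec:.1f} arcsec")
```

Because V/c is of order 10⁻⁴, the relativistic factor √(1 − (V/c)²) is indistinguishable from unity here, and the shift is essentially 2V/c radians.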

Figure 23.3 A plane wave of frequency f and propagation direction (θ, φ) in the XYZ coordinates is observed from the X′Y′Z′ system of Figure 23.1. The polar angle θ′ of the beam seen by the observer is a function of V and θ, but does not depend on φ. (a) Plots of θ′ versus V/c for several values of θ. (b) Plots of θ′ versus θ for different values of V/c. [Panel (a): curves for θ = 30°, 60°, 90°, 120°, 150°, with V/c from −1 to 1. Panel (b): curves for V/c = −0.9, −0.7, −0.5, −0.3, −0.1, 0.1, 0.3, 0.5, 0.7, 0.9, with θ from 0° to 180°. Vertical axes: θ′ from 0° to 180°.]

Diffraction of light from a grating in uniform motion

Figure 23.4(a) shows a uniform beam of frequency f (and wavelength λ = c/f) focused onto a diffraction grating through an objective lens of numerical aperture NA. The grating’s period P is sufficiently small to allow only the 0th, −1st, and +1st orders of diffraction to appear upon reflection. The three cones of light thus reflected from the grating are captured by the lens in the return path. The partial overlap of the diffracted cones at the lens’s exit pupil (resulting in interference among them) gives rise to the so-called baseball pattern of intensity distribution. When the grating moves at a constant velocity V along the Z-axis, the contrast within the baseball pattern shows a periodic oscillation. This is caused by the variation of the relative phase between the ±1st and the 0th diffracted orders, the phase being dependent on the position of the focused spot relative to the grooves on the grating. As a specific example, Figure 23.4(b) shows the surface profile of a grating with a trapezoidal cross-section having period P = 1.2λ and groove depth = 0.15λ. Figure 23.4(c), a logarithmic plot of the intensity distribution at the focal

Figure 23.4 (a) A monochromatic plane wave of wavelength λ is focused, through an objective lens of numerical aperture NA, onto a diffraction grating. The period P of the grating is small enough to support only the 0th and ±1st diffraction orders upon reflection. The three cones of light returning from the grating partially overlap in the exit pupil, giving rise to a “baseball pattern.” When the grating moves at constant velocity V in the focal plane, the contrast within the baseball pattern shows a periodic oscillation. (b) Trapezoidal profile of a grating having period P = 1.2λ and groove depth = 0.15λ. (c) Logarithmic plot of intensity distribution at the focal plane of a uniformly illuminated 0.6 NA objective. [Panel (c): horizontal axis x/λ from −4 to +4.]

plane of a 0.6NA objective, shows the diameter of the central bright spot – the Airy disk – to be 1.22λ/NA ≈ 2λ. Figure 23.5 shows computed patterns of reflected intensity at the exit pupil of the objective for several positions of the focused spot over the grating. From (a) to (i), the groove center’s distance from the center of the focused spot is 0, 0.2λ, 0.4λ, 0.5λ, 0.6λ, 0.8λ, λ, 1.1λ, and 1.2λ, respectively. In these simulations the grating is assumed to be stationary in its various positions relative to the lens. An alternative (and physically more accurate) explanation of the baseball patterns of Figure 23.5 may be based on the Doppler shift between the 0th-order and the ±1st-order diffracted light cones depicted in Figure 23.4. From the viewpoint of an observer in the grating’s rest frame, the incident cone of light moves with velocity V along the Z-axis. This cone is a superposition of a multitude of plane waves of differing directions and frequencies. With reference to Figure 23.6, consider a plane wave of frequency f and propagation direction (θ, φ) in the XYZ coordinate system in which the lens is stationary. This plane wave, when seen from the grating’s rest frame, has frequency f′ given by Eq. (23.4) and

Figure 23.5 Patterns of intensity distribution observed at the exit pupil of the objective of Figure 23.4. From (a) to (i) the distance between the groove center and the center of the focused spot is 0, 0.2, 0.4, 0.5, 0.6, 0.8, 1.0, 1.1, and 1.2 (in units of λ).

propagation direction (θ′, φ) given by Eq. (23.7). For the 0th-order reflected plane wave, the frequency remains f′ but the propagation direction becomes (θ′, −φ). Viewed from the rest frame of the lens, this reflected 0th-order beam has frequency f and propagation direction (θ, −φ). Thus the 0th-order reflected cone – which is simply a superposition of various reflected 0th-order plane waves – seen by the lens is ignorant of the velocity V of the grating. As for the +1st-order beam, in the grating’s rest frame the diffracted plane wave has frequency f′ and propagation direction (θ′_{+1}, φ′_{+1}), where, in accordance with Bragg’s law,

$$ \cos\theta'_{+1} = \cos\theta' + (\lambda'/P), \tag{23.8a} $$

$$ \sin\theta'_{+1} \cos\phi'_{+1} = \sin\theta' \cos\phi. \tag{23.8b} $$

Figure 23.6 In the reference frame of the lens of Figure 23.4, a plane wave of frequency f incident at an angle θ on a moving grating gives rise to a 0th order diffracted beam of the same frequency and polar angle. The +1st order diffracted beam, however, will have frequency f_{+1} and polar angle θ_{+1}. In the grating’s rest frame, the incident beam has frequency f′ and polar angle θ′. All diffracted orders have the same frequency f′. The polar angle of the 0th order beam is θ′, while that of the +1st order beam is θ′_{+1}.
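The frame-hopping procedure summarized in Figure 23.6 can be sketched numerically: transform the incident plane wave into the grating's rest frame with Eqs. (23.4) and (23.3c), apply Bragg's law (23.8a) there, and transform the +1st order back with V replaced by −V. The code below is an illustration under stated assumptions: the wavelength and the (deliberately large) value V/c = 0.1 are hypothetical, the function names are not from the text, and the sign of the resulting shift follows the convention of Figure 23.1 together with the + sign in Eq. (23.8a).

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def boost(f, cos_theta, beta):
    """Eqs. (23.4) and (23.3c): frequency and cos(polar angle) seen from a frame
    in which the original frame moves with velocity beta*c along Z."""
    denom = 1.0 + beta * cos_theta
    f_new = f * denom / math.sqrt(1.0 - beta * beta)
    cos_new = (cos_theta + beta) / denom
    return f_new, cos_new

def plus_first_order_in_lens_frame(f, theta_deg, beta, period):
    """Lens frame -> grating frame -> Bragg's law (23.8a) -> back to lens frame."""
    f_g, cos_g = boost(f, math.cos(math.radians(theta_deg)), beta)
    cos_g1 = cos_g + (C / f_g) / period          # Eq. (23.8a), lambda' = c / f'
    if abs(cos_g1) > 1.0:
        raise ValueError("the +1st order is evanescent for this geometry")
    return boost(f_g, cos_g1, -beta)             # inverse transformation: V -> -V

wavelength = 0.65e-6            # hypothetical vacuum wavelength, m
period = 1.2 * wavelength       # grating period used in Figure 23.4(b)
beta = 0.1                      # deliberately large V/c so the effect is visible
f = C / wavelength
for theta_deg in (90.0, 100.0, 110.0, 120.0):
    f_1, cos_1 = plus_first_order_in_lens_frame(f, theta_deg, beta, period)
    print(f"theta = {theta_deg:5.1f} deg: (f_+1 - f)/f = {(f_1 - f) / f:+.6f}, "
          f"theta_+1 = {math.degrees(math.acos(cos_1)):.2f} deg")
```

The printed frequency shift is the same for every incidence angle, while the diffracted angle varies, which is the numerical counterpart of the statement that the whole +1st-order cone is shifted by one common amount.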

Back in the rest frame of the lens, the diffracted +1st order plane wave appears to have a new frequency f_{+1}, where

$$ \Delta f = f_{+1} - f = -V \Big/ \left[ P \sqrt{1-(V/c)^2} \right] \tag{23.9} $$

is independent of the incident beam’s propagation direction (θ, φ). The period P of the grating thus appears to have been foreshortened by the Lorentz contraction factor, and the Doppler shift Δf is proportional to the velocity V and inversely proportional to the (contracted) grating period. Since Δf is independent of the direction of the incident plane wave, the entire +1st-order cone will be Doppler shifted by the same amount. This Doppler shift causes a beating at the exit pupil between the 0th-order and the +1st-order cones in their area of overlap. The beat period, 1/|Δf|, is independent of the groove profile as well as the NA of the lens. The same arguments apply to the −1st-order light cone, except that the Doppler shift in this case is −Δf. We mention in passing that, for the plane wave incident at (θ, φ) in the rest frame of the lens, the propagation directions of the ±1st-order reflected plane waves are (θ_{±1}, φ_{±1}), where

$$ \cos\theta_{\pm 1} = \frac{\cos\theta \pm \lambda \Big/ \left[ P \sqrt{1-(V/c)^2} \right]}{1 \mp (V/c)\, \lambda \Big/ \left[ P \sqrt{1-(V/c)^2} \right]} \tag{23.10} $$

and φ_{±1} = φ′_{±1} (see Eq. (23.8b)). Aside from the Lorentz contraction of the grating period P, there is a relativistic correction to Bragg’s law of diffraction from a moving grating. This correction term, which appears in the denominator on the right-hand side of Eq. (23.10), is of the first order in V/c.

Rayleigh range of a moving Gaussian beam

Figure 23.7 shows a Gaussian beam of wavelength λ propagating along the Z-axis in the XYZ coordinate system, in which the source of the beam is at rest. The beam diameter W_0 at the waist increases by a factor of √2 at a distance L_0 = W_0²/λ, the Rayleigh range of the beam.3 To an observer at rest in the X′Y′Z′ frame, the source of the Gaussian beam moves at constant velocity V along the Z-axis. Since the beam diameter at any cross-section is a measurable quantity along Y′, from the perspective of the observer the beam diameters at the waist and at the Rayleigh range remain W_0 and √2 W_0, respectively. However, the distance L_0′ between these two points appears to have shrunk by Lorentz contraction, that is, L_0′ = L_0 √(1 − (V/c)²). At the same time, the observer perceives the wavelength λ of the beam to have shifted in accordance with the Doppler formula to a lower or higher value λ′, depending on whether the motion of the light source is towards or away from the observer. The Gaussian beam formula then yields L_0′ = W_0²/λ′, so that, in the observer’s rest frame, the Rayleigh range L_0′ could be greater or less than L_0 depending on whether the observer moves towards or away from the light source. The two conclusions thus reached are contradictory, leaving one to wonder which prediction, if any, will be borne out by experiment.
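The two candidate predictions for the Rayleigh range can be put side by side numerically. This is a sketch only: the wavelength and waist diameter are hypothetical, the function names are illustrative, and the Doppler-shifted wavelength is taken from Eq. (23.4) at θ = 0, i.e., for the on-axis plane wave alone.

```python
import math

def contracted_range(l0, beta):
    """Rayleigh range after Lorentz contraction along Z; beta stands for V/c."""
    return l0 * math.sqrt(1.0 - beta * beta)

def doppler_formula_range(w0, wavelength, beta):
    """W0^2 / lambda', with lambda' taken from Eq. (23.4) at theta = 0 (on-axis wave)."""
    ratio = (1.0 + beta) / math.sqrt(1.0 - beta * beta)   # f'/f on the axis
    return w0 ** 2 / (wavelength / ratio)

wavelength = 1.0e-6   # hypothetical wavelength, m
w0 = 10.0e-6          # hypothetical waist diameter, m
l0 = w0 ** 2 / wavelength
for beta in (-0.5, 0.5):
    print(f"V/c = {beta:+.1f}: contracted L0' = {contracted_range(l0, beta) / l0:.3f} L0, "
          f"Doppler-formula L0' = {doppler_formula_range(w0, wavelength, beta) / l0:.3f} L0")
```

The contracted value is smaller than L_0 for either sign of V, whereas the Doppler-formula value lands above or below L_0 depending on that sign, which makes the disagreement between the two arguments explicit.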

Figure 23.7 A Gaussian beam of wavelength λ propagates along the Z-axis. The beam diameter is W_0 at the waist and √2 W_0 at the Rayleigh range, which is a distance L_0 from the waist. To an observer moving with constant velocity along the Z-axis, the beam diameters at the waist and at the Rayleigh range remain W_0 and √2 W_0, but the distance between them appears to have shrunk by the Lorentz contraction factor.


Upon closer inspection, the Gaussian beam will be recognized as a superposition of many plane waves of differing propagation directions. In the rest frame of the beam’s source, all these plane waves have the same wavelength λ, but viewed from a moving frame the wavelengths differ for different propagation directions. (The propagation directions of the various plane waves are not the same in the two coordinate systems, thus resulting in a wider or narrower spatial frequency spectrum in X′Y′Z′ depending on the direction of motion of the observer.) The formula relating the Rayleigh range to the beam waist and to the wavelength has been derived with the implicit assumption that the beam is a superposition of plane waves of identical frequency.3 This is true in the rest frame of the beam’s source, but decidedly false in the moving frame. Therefore, our second method of determining L_0′ must be wrong, leaving the Lorentz-contracted L_0 as the correct answer.

Convection of light by moving media

Consider a plane wave of frequency f and propagation direction (θ, φ), propagating in a stationary medium of refractive index n. By definition, the speed of light in this medium is c/n. The expression for the field distribution throughout space and time is similar to that given by Eq. (23.1a), namely,

$$ a(x, y, z, t) = A_0 \exp\{\, i 2\pi (f/c) [ n(\sin\theta \cos\phi)\, x + n(\sin\theta \sin\phi)\, y + n(\cos\theta)\, z - ct ] \,\}. \tag{23.11} $$

From the perspective of an observer whose frame of reference X′Y′Z′ is in uniform motion relative to the stationary medium (see Figure 23.1), the expression for the field is obtained by substituting in Eq. (23.11) the Lorentz transformation of Eq. (23.2). This yields the same expression as in Eq. (23.11) with φ remaining the same but f, n, and θ changing as follows:

$$ f' = f\, [1 + n(V/c)\cos\theta] \,/\, \sqrt{1-(V/c)^2}, \tag{23.12a} $$

$$ n' = \left\{ 1 + \frac{(n^2 - 1)[1 - (V/c)^2]}{[1 + n(V/c)\cos\theta]^2} \right\}^{1/2}, \tag{23.12b} $$

$$ \tan\theta' = n \sqrt{1-(V/c)^2}\, \sin\theta \,/\, [n\cos\theta + (V/c)]. \tag{23.12c} $$

It is clear that the Doppler-shifted frequency f′ and the polar angle θ′ are not only functions of θ and V/c, as before, but they also depend on the refractive index n of

Figure 23.8 The refractive index n′ of water (n = 1.33), moving at constant velocity V along the Z-axis, depends on V and on the propagation direction θ. In the water’s rest frame, the assumed propagation direction of a plane wave of wavelength λ has polar and azimuthal angles (θ, φ). (a) Plots of n′ versus V/c for several values of θ. (b) Plots of n′ versus θ for different values of V/c. [Panel (a): curves for θ between 0° and 180°, with V/c from 0 to 1. Panel (b): curves for V/c = 0.1, 0.3, 0.5, 0.7, 0.9, with θ from 0° to 180°. Vertical axes: n′ from 1.00 to 2.50.]

the propagation medium. Similarly, the apparent index n′ of the moving medium depends on n, V/c, and θ in accordance with Eq. (23.12b). For water of refractive index n = 1.33, Figure 23.8(a) shows plots of n′ versus V/c for several values of θ, while Figure 23.8(b) shows plots of n′ versus θ for different values of V/c. In the special case when the beam moves in the same direction as the medium, θ = 0° and Eq. (23.12b) simplifies to

$$ n' = \frac{n + (V/c)}{|1 + n(V/c)|}. \tag{23.13} $$

When V ≪ c, Eq. (23.13) yields Fresnel’s formula1 for the drag of light by a moving medium,

$$ c/n' \approx (c/n) + (1 - n^{-2})\, V. \tag{23.14} $$
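The relation between the exact expression and Fresnel's approximation can be checked with a few lines of code. The sketch below evaluates Eq. (23.12b), its θ = 0 special case (23.13), and the approximation (23.14) for water; the flow speed of 1000 m/s is a hypothetical value chosen only so that the drag term is easy to read, and the function names are illustrative.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def apparent_index(n, beta, theta_deg):
    """n' from Eq. (23.12b); beta stands for V/c, theta is measured in the medium's rest frame."""
    theta = math.radians(theta_deg)
    return math.sqrt(1.0 + (n * n - 1.0) * (1.0 - beta * beta)
                     / (1.0 + n * beta * math.cos(theta)) ** 2)

def apparent_index_parallel(n, beta):
    """n' from Eq. (23.13), the theta = 0 special case."""
    return (n + beta) / abs(1.0 + n * beta)

def fresnel_speed(n, v):
    """Speed of light in the moving medium according to Eq. (23.14)."""
    return C / n + (1.0 - 1.0 / (n * n)) * v

n = 1.33            # water, as in Figure 23.8
v = 1000.0          # hypothetical flow speed, m/s
beta = v / C
exact = C / apparent_index_parallel(n, beta)
approx = fresnel_speed(n, v)
print(f"drag (exact)   = {exact - C / n:.3f} m/s")
print(f"drag (Fresnel) = {approx - C / n:.3f} m/s")
print(f"n' at V/c = 0.3, theta = 0: {apparent_index(n, 0.3, 0.0):.4f}")
```

At everyday speeds the two drag values differ only in the second order of V/c, so the light gains the fraction 1 − 1/n² of the medium's velocity, a little under half of V for water.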

Thus the speed of light in a moving medium increases by a fraction of V if the light and the medium move in the same direction, whereas the apparent speed of light decreases when the light moves opposite the direction of motion of the medium.

References for Chapter 23

1 Max Born, Einstein’s Theory of Relativity, Dover, New York (1965).
2 J. D. Jackson, Classical Electrodynamics, Wiley, New York (1962).
3 A. E. Siegman, Lasers, University Science Books, Sausalito, CA (1986).

24 Diffraction gratings†

John William Strutt, Lord Rayleigh (1842–1919), graduated from Trinity College Cambridge in 1864. From 1879 to 1884 he was the Cavendish professor of experimental physics at Cambridge, succeeding James Clerk Maxwell. His theory of scattering (1871) provided the first correct explanation of the blue color of the sky. Rayleigh’s discovery of the inert gas argon (1895) earned him the 1904 Nobel Prize for Physics. The Rayleigh–Sommerfeld theory of diffraction is one of the pillars of the classical theory. (Photo: courtesy of AIP Emilio Segrè Visual Archives, Physics Today Collection.)

Diffraction gratings have been used in spectroscopy and other studies of electromagnetic phenomena for nearly two centuries.1,2,3,4 Josef Fraunhofer (1787–1826), the discoverer of the dark lines in the solar spectrum, built the first gratings in 1819 by winding fine wires around two parallel screws.5 Henry Rowland made significant contributions to the fabrication of precise, large-area, high-frequency ruled gratings in the 1880s.6 Robert Wood, who succeeded

† This chapter’s coauthors are Lifeng Li and Wei-Hung Yeh.

Rowland in the chair of experimental physics at Johns Hopkins University in 1901, used these ruled gratings extensively in his researches and discovered, among other things, the “anomalous” behavior of metallic gratings, which he first published in 1902.7 John William Strutt (Lord Rayleigh) developed a theoretical model of these gratings around 1907 and was successful in explaining certain features of Wood’s anomalies.8 However, it is only during the past thirty years or so that a thorough understanding of nearly all aspects of the behavior of diffraction gratings has been achieved through the consistent application of Maxwell’s equations with the help of advanced analytical and numerical techniques.2,9,10 Modern gratings having a few thousand lines per millimeter with near-perfect periodicity are fabricated over fairly large areas (grating diameters of around one meter or so are possible). The groove shapes can be controlled to be sinusoidal, rectangular, triangular, or trapezoidal, and one can obtain shallow or deep grooves (relative to the groove width) by current manufacturing techniques. These gratings can be made on various metal, plastic, and glass substrates and, when necessary, they can be coated with thin-film metal and/or dielectric stacks. The primary applications of diffraction gratings are still in spectroscopy, where they are used for analyzing the frequency content of electromagnetic radiation (visible light, ultraviolet, X-rays, infrared, microwave), but they are also used as wavelength selectors in tunable lasers, beam-sampling mirrors in high-power lasers, band-pass filters, pulse compressors, and polarization-sensitive optics, among other applications. The goal of the present chapter is to describe some of the basic properties of gratings and to point out through several examples the complex behavior of these devices.
These examples are by no means comprehensive, but they should make it amply clear that there is no simple way to predict a grating’s diffraction efficiency. Although the number and the propagation direction of diffracted orders can be readily obtained from simple principles, the computation of diffraction efficiencies requires the complete solution of Maxwell’s equations in conjunction with the appropriate boundary conditions. The results of these calculations are often non-intuitive and depend strongly on a number of factors such as the period of the grating, the geometry of the grooves, the (complex) refractive index of the material(s) comprising the grating, and the wavelength as well as the polarization state and the propagation direction of the incident beam of light. Fortunately, powerful computer programs now exist that take all the relevant factors into account and provide a reliable solution to the electromagnetic equations that govern the behavior of diffraction gratings.

24 Diffraction gratings


Grating theories

The simplest theory of gratings treats them as corrugated structures that modulate the amplitude and/or phase of the incident beam in proportion to the local reflectivity and the height or depth of the surface relief features. The modulated reflected (or transmitted) wavefront is then decomposed into its Fourier spectrum to yield the various diffracted orders. Known as the scalar theory of gratings, this elementary treatment yields the correct number and direction of propagation for the diffracted orders, but it does not provide an accurate estimate of the amplitude, phase, and polarization state of each order.

Rayleigh made a substantial contribution to the understanding of gratings by representing the diffracted field as the superposition of a number of homogeneous (i.e., propagating) and inhomogeneous (i.e., evanescent) plane waves.8 He then determined the complex amplitudes of the various plane waves by imposing the electromagnetic boundary conditions at the grating surface. Although Rayleigh’s method was far superior to the scalar theory – it could account for some of the observed anomalies and, in fact, provided exact solutions to the electromagnetic field equations in certain cases of practical interest – it failed to provide a comprehensive solution that would be applicable under general conditions.

A satisfactory analysis of the diffraction from gratings requires a numerically stable solution to Maxwell’s equations constrained by the relevant boundary conditions. Several such methods have been discovered and elaborated over the past 30 years by a number of researchers from around the world.2,9,10,11 The results presented in this chapter are based on the differential method of Chandezon, which uses the so-called coordinate transformation technique.11
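As a rough illustration of the scalar theory described above, the sketch below treats a perfectly reflecting surface-relief grating at normal incidence as a pure phase mask and takes the Fourier coefficients of the reflected wavefront over one period. The round-trip phase 4πh(x)/λ0, the unit reflectivity, the sampling density, and the function names are all assumptions made for this example; the profile parameters are those of the trapezoidal grating used later in the chapter. This is not the method behind the chapter's results, and, as the text warns, the resulting numbers should not be read as reliable efficiencies.

```python
import cmath
import math

def trapezoid_height(x, period, depth, duty, wall_angle_deg):
    """Height of a symmetric trapezoidal profile; the land is centred at x = 0."""
    x = (x + period / 2) % period - period / 2           # fold into one period
    half_land = 0.5 * duty * period
    run = depth / math.tan(math.radians(wall_angle_deg))  # horizontal extent of a wall
    ax = abs(x)
    if ax <= half_land:
        return depth
    if ax >= half_land + run:
        return 0.0
    return depth * (1.0 - (ax - half_land) / run)

def scalar_order_efficiencies(height, period, wavelength, orders, samples=2048):
    """Scalar estimate: |m-th Fourier coefficient|^2 of exp(i*4*pi*h(x)/wavelength)."""
    xs = [-period / 2 + (j + 0.5) * period / samples for j in range(samples)]
    field = [cmath.exp(4j * math.pi * height(x) / wavelength) for x in xs]
    efficiencies = {}
    for m in orders:
        coeff = sum(u * cmath.exp(-2j * math.pi * m * x / period)
                    for u, x in zip(field, xs)) / samples
        efficiencies[m] = abs(coeff) ** 2
    return efficiencies

lam = 0.633                                   # wavelength in micrometres
profile = lambda x: trapezoid_height(x, 4 * lam, lam / 8, 0.6, 60.0)
eff = scalar_order_efficiencies(profile, 4 * lam, lam, range(-40, 41))
print({m: round(eff[m], 4) for m in (-1, 0, 1)})
```

Because the mask is phase-only, the squared coefficients over all orders sum to one, and the symmetric profile gives equal values for the orders +m and −m. The sketch includes evanescent orders in that sum and ignores polarization and absorption, which is one way to see why the scalar picture cannot supply trustworthy efficiencies.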

Diffraction orders

Figure 24.1 shows the cross-section of a metallized grating with a trapezoidal groove geometry. The grating period is denoted by p, the groove depth by d, and the duty cycle, which is the ratio of the land width to the grating period, by c. In this symmetric grating both side walls make the same angle $\alpha$ with the horizontal plane. The metal layer, specified by its complex refractive index (n, k), is assumed to be thick enough to render the grating opaque. Referring to Figure 24.2, the plane of the grating is XY, and its surface normal is the Z-axis. The plane of incidence is XZ, $\theta$ being the angle of incidence. When the incident E-field is in the plane of incidence, the beam is p-polarized, and when the E-field is along the Y-axis it is s-polarized. In an alternative nomenclature,

[Figure 24.1: drawing of the grating cross-section, with labels Land, Groove, Metal, Substrate, 100 nm, d, α, and Period (p).]

Figure 24.1 Cross-section of a metallized grating. Throughout this chapter the side-wall angle $\alpha = 60^\circ$ and the duty cycle c, which is the ratio of the land width to the grating period, is 60%. At $\lambda_0 = 0.633$ μm the substrate’s refractive index n = 1.5 and the metal layer’s complex index is given by (n, k) = (2, 7).

in Figure 24.2(a), the polarization is transverse electric (TE) when the incident E-field is parallel to the grooves and transverse magnetic (TM) when it is perpendicular to the grooves. Although the grating may be mounted with its grooves in an arbitrary direction within the XY-plane, we shall consider only two situations. In the first case, depicted in Figure 24.2(a) and referred to as “classical mount”, the grooves are perpendicular to the plane of incidence. In this case all diffracted orders remain in the XZ-plane, their propagation vectors $\mathbf{k}$ given by

$$\mathbf{k}^{(m)} = (2\pi/\lambda_0)\,(\sigma_x^{(m)}\hat{\mathbf{x}} + \sigma_z^{(m)}\hat{\mathbf{z}}) = (2\pi/\lambda_0)\,\{[\sin\theta + (m\lambda_0/p)]\,\hat{\mathbf{x}} + \sigma_z^{(m)}\hat{\mathbf{z}}\}. \tag{24.1}$$

Here $\lambda_0$ is the vacuum wavelength of the light, the integer m specifies the diffraction order, the unit vector $\boldsymbol{\sigma} = (\sigma_x, \sigma_y, \sigma_z)$ is along the propagation direction, and the medium of incidence is implicitly assumed to be air. With $\sigma_y = 0$, it is necessary that $\sigma_x^2 + \sigma_z^2 = 1$, from which $\sigma_z$ can be determined once $\sigma_x$ is known. To keep $\sigma_z$ real, $\sigma_x^{(m)} = \sin\theta + m\lambda_0/p$ must be in the range $(-1, +1)$, a constraint that determines the number of propagating orders.

In the second case, depicted in Figure 24.2(b) and referred to as “conical mount”, the grooves are parallel to the plane of incidence. Here all diffracted orders (other than the zeroth) are outside the XZ-plane and their propagation vectors are given by

$$\mathbf{k}^{(m)} = (2\pi/\lambda_0)\,(\sigma_x^{(m)}\hat{\mathbf{x}} + \sigma_y^{(m)}\hat{\mathbf{y}} + \sigma_z^{(m)}\hat{\mathbf{z}}) = (2\pi/\lambda_0)\,[(\sin\theta)\,\hat{\mathbf{x}} + (m\lambda_0/p)\,\hat{\mathbf{y}} + \sigma_z^{(m)}\hat{\mathbf{z}}]. \tag{24.2}$$

Again, the integer m is the diffraction order, the implicitly assumed medium of incidence is air, and the constraint $\sigma_x^2 + \sigma_y^2 + \sigma_z^2 = 1$ specifies $\sigma_z$ once $\sigma_x$ and $\sigma_y$ are identified. The inequality $\sigma_x^2 + \sigma_y^2 = \sin^2\theta + (m\lambda_0/p)^2 \le 1$ determines the number of propagating orders. This mounting is called conical because the

[Figure 24.2: two drawings, (a) and (b), each showing the X and Z axes, the incident field components Ep and Es, and the reflected diffraction orders (−2, −1, 0, +1 in (a); −1, 0, +1 in (b)).]

Figure 24.2 A monochromatic beam of light breaks up into multiple diffraction orders upon reflection from a grating. The incidence angle $\theta$ is measured from the Z-axis. When the incident E-field is in the XZ-plane of incidence, the beam is p-polarized, and when E is perpendicular to XZ, the beam is s-polarized. In (a) the plane of incidence is perpendicular to the direction Y of the grooves. In this so-called classical mount all diffracted orders remain within the XZ-plane. In (b), where the grooves are parallel to the plane of incidence (conical mount), diffracted orders appear on both sides of the XZ-plane.

various diffracted orders reside on the surface of a cone. Technically speaking, the mount is conical whenever the grooves deviate from the normal to the plane of incidence.2 In this chapter, however, whenever the mount is said to be conical, the grooves will be strictly parallel to the plane of incidence.
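The counting rules implied by Eqs. (24.1) and (24.2) can be sketched in a few lines. The function name, the argument `period_over_lambda` (the ratio p/λ0), and the choice to return σz as a non-negative magnitude are assumptions made for this illustration; only the two mounts treated in this chapter are handled, and the medium of incidence is air.

```python
import math

def propagating_orders(theta_deg, period_over_lambda, mount="classical"):
    """Return {m: (sx, sy, sz)} for every propagating reflected order in air."""
    s = math.sin(math.radians(theta_deg))
    g = 1.0 / period_over_lambda                  # lambda0 / p
    m_max = int((1.0 + abs(s)) / g) + 1           # generous search bound
    orders = {}
    for m in range(-m_max, m_max + 1):
        if mount == "classical":                  # Eq. (24.1): grooves along Y
            sx, sy = s + m * g, 0.0
        elif mount == "conical":                  # Eq. (24.2): grooves along X
            sx, sy = s, m * g
        else:
            raise ValueError("mount must be 'classical' or 'conical'")
        transverse = sx * sx + sy * sy
        if transverse <= 1.0:                     # sz stays real: order propagates
            orders[m] = (sx, sy, math.sqrt(1.0 - transverse))
    return orders

# Example: p = 4*lambda0, as in the grating examined in the next section.
print(sorted(propagating_orders(40.0, 4.0, "classical")))
print(sorted(propagating_orders(30.0, 4.0, "conical")))
```

In the conical case every returned direction has the same σx = sin θ, so all orders make the same angle with the groove direction X, which is the cone referred to in the text.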


Location of diffracted beams

A simple experimental setup for observing the beams diffracted from a grating appears in Figure 24.3. The coherent beam of a red HeNe laser ($\lambda_0 = 0.633$ μm) is focused at oblique incidence $\theta$ onto the grating through a long-focal-length lens (NA = 0.065). The diffraction-limited spot diameter is $1.22\lambda_0/\mathrm{NA} \approx 12$ μm, which, if the grating period p is sufficiently small, will cover several land-groove pairs. The various diffracted beams are then collected and collimated by a microscope objective lens (NA = 0.8, $f = 2000\lambda_0$). In the following examples the system is arranged in such a way that the zeroth-order beam always appears at the center of the collimating lens. This lens being aplanatic, if we denote the angle between a diffracted beam and the zeroth order by $\chi$, then the diffracted beam’s distance from the center of the exit pupil will be $f \sin\chi$ rather than $f \tan\chi$.

Figure 24.4 shows computed patterns of intensity distribution at the exit pupil of the collimating lens for the grating of Figure 24.1 in the case of classical mount ($\lambda_0 = 0.633$ μm, $p = 4\lambda_0$, $d = \lambda_0/8$; $\theta = 0^\circ$ in Figures 24.4(a), (b), $\theta = 40^\circ$ in Figures 24.4(c), (d)).12 The incident beam is p-polarized or TM (i.e., E-field parallel to the XZ-plane), but the beams appearing in the exit pupil have both the p- and s-components of polarization. In Figure 24.4 the intensity patterns on the left represent the component of polarization that stays within the XZ-plane, while those on the right correspond to the component along Y. In all cases $E_\perp$ is much weaker than $E_\parallel$, the ratio of the peak intensities $|E_\perp|^2 : |E_\parallel|^2$ being $0.65 \times 10^{-5}$ in the case of normal incidence and 0.009 in the case $\theta = 40^\circ$. Figure 24.5 is similar to Figure 24.4, except that the grating is rotated by $90^\circ$ in the XY-plane to bring the grooves parallel to the plane of incidence (i.e., conical

[Figure 24.3: drawing of the observation setup, with labels Incident beam, Focusing lens, Grating, Diffracted orders, and Collimating lens.]

Figure 24.3 A monochromatic beam of light is focused by a low-NA lens onto a grating. Compared with the grating period, the focused spot is large, covering several land-groove pairs at the grating’s surface. The diffracted orders, collected and collimated by a high-NA aplanatic lens, may be observed at the exit pupil.

[Figure 24.4: four intensity plots, (a)–(d); the horizontal axis is $x/\lambda_0$, running from −1600 to 1600.]

Figure 24.4 Computed plots of intensity distribution at the exit pupil of the collimating lens in Figure 24.3, when the beam is diffracted from the grating of Figure 24.1 ($\lambda_0 = 0.633$ μm, $p = 4\lambda_0$, $d = \lambda_0/8$). The grooves are perpendicular to the plane of incidence, as in Figure 24.2(a), and the incident beam is p-polarized. The frames on the left correspond to the component of polarization parallel to the XZ-plane ($E_\parallel$), while those on the right correspond to the component along the Y-axis ($E_\perp$). In (a) and (b) the incidence is normal, whereas in (c) and (d) $\theta = 40^\circ$. The ratio of the peak intensity in (b) to that in (a) is $0.65 \times 10^{-5}$. Similarly, the peak intensity ratio of (d) to (c) is 0.009. These results are based on full vector-diffraction calculations.

mount). In Figures 24.5(a), (b) the incidence is normal, whereas in (c), (d) it is oblique at $\theta = 30^\circ$. In both cases the incident beam is p-polarized, but the diffracted beams contain a certain amount of s-polarization as well.12 At the exit pupil of the lens, the ratio of the peak intensities perpendicular and parallel to the XZ-plane is fairly small, $|E_\perp|^2 : |E_\parallel|^2$ being $0.97 \times 10^{-4}$ at normal incidence and 0.025 at $\theta = 30^\circ$. In both the above cases if the scalar theory of diffraction is used (instead of the full vector theory), the picture that emerges will show the diffracted orders in their correct locations but the amplitude, phase, and polarization state of the various orders will be substantially incorrect.
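The geometry of the observation system for the classical mount can be checked numerically. The sketch below evaluates the spot diameter 1.22λ0/NA and the position f sin χ of each collected order in the exit pupil. The function names, the sign convention for χ, and the rule that an order is collected when |sin χ| does not exceed the collimator's NA are assumptions of this example; it locates the beams only and says nothing about their intensities.

```python
import math

def spot_diameter(wavelength, na):
    """Diffraction-limited spot diameter, 1.22 * wavelength / NA."""
    return 1.22 * wavelength / na

def pupil_positions(theta_deg, period_over_lambda, focal_over_lambda=2000.0, na=0.8):
    """Classical mount: {m: x / lambda0} for the orders collected by the aplanat.

    The lens axis is assumed to lie along the zeroth-order beam, so an order
    at angle chi from the zeroth order lands at f * sin(chi).
    """
    theta = math.radians(theta_deg)
    positions = {}
    m_limit = int(2 * period_over_lambda) + 1
    for m in range(-m_limit, m_limit + 1):
        sx = math.sin(theta) + m / period_over_lambda
        if abs(sx) > 1.0:
            continue                              # evanescent order
        chi = math.asin(sx) - theta               # angle measured from order 0
        if abs(math.sin(chi)) <= na:              # inside the collection cone
            positions[m] = focal_over_lambda * math.sin(chi)
    return positions

print(round(spot_diameter(0.633, 0.065), 2), "micrometres")
print(pupil_positions(0.0, 4.0))
print(pupil_positions(40.0, 4.0))
```

With NA = 0.8 and f = 2000λ0 the pupil radius is 1600λ0, which is the range of the horizontal axis in the plots of Figures 24.4 and 24.5.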

[Figure 24.5: four intensity plots, (a)–(d); the horizontal axis is $x/\lambda_0$, running from −1600 to 1600.]

Figure 24.5 Computed plots of intensity distribution at the exit pupil of the collimating lens of Figure 24.3, when the beam is diffracted from the grating of Figure 24.1 ($\lambda_0 = 0.633$ μm, $p = 4\lambda_0$, $d = \lambda_0/8$). The grooves are parallel to the plane of incidence, as in Figure 24.2(b), and the incident beam is p-polarized. The frames on the left correspond to the component of polarization parallel to the XZ-plane ($E_\parallel$), while those on the right correspond to the component along the Y-axis ($E_\perp$). In (a) and (b) the incidence is normal, whereas in (c) and (d) $\theta = 30^\circ$. The ratio of the peak intensity in (b) to that in (a) is $0.97 \times 10^{-4}$. Similarly, the peak intensity ratio of (d) to (c) is 0.025. These results are based on full vector-diffraction calculations.

Diffraction efficiency

We denote by E the amplitude of the incident beam at angle $\theta$ and by $E^{(m)}$ the amplitude of the mth-order reflected (or transmitted) beam emerging at $\theta^{(m)}$. It is further assumed that the incidence medium is air and, in the case of a transmission grating, that the transparent medium into which the diffracted orders emerge has refractive index $n_0$. For the mth-order reflected (transmitted) beam the diffraction efficiency $\rho^{(m)}$ ($\tau^{(m)}$) can be written as

$$\rho^{(m)} = |E^{(m)}|^2 \cos\theta^{(m)} / (|E|^2 \cos\theta), \tag{24.3a}$$

$$\tau^{(m)} = n_0\,|E^{(m)}|^2 \cos\theta^{(m)} / (|E|^2 \cos\theta). \tag{24.3b}$$


Here the squared amplitude is the beam’s intensity, and the cosine factor keeps track of the change in the beam’s cross-sectional area upon diffraction. Figure 24.6 shows computed plots of diffraction efficiency versus $\theta$ for the zeroth- and first-order beams for the grating of Figure 24.1 ($\lambda_0 = 0.633$ μm, $p = 3\lambda_0$, $d = \lambda_0/8$).12 In each frame there are four curves, representing the diffraction efficiency of the corresponding order when the incident beam is either p- or s-polarized and when the mount is either classical ($\rho_p$, $\rho_s$) or conical ($\rho'_p$, $\rho'_s$). The sharp peaks and valleys appearing in these plots are caused by the excitation of surface plasmons, which, in the case of metal gratings, exist only when the incident beam has an E-field component perpendicular to the grooves (see Chapter 9, “What in the world are surface plasmons?”). The arrows at the bottom of each figure point to the angles of incidence associated with the Rayleigh anomalies; these are points at which a particular diffraction order appears or disappears. In Figure 24.6(b), for example, $\rho_p$ and $\rho_s$ terminate at $\theta = 41.81^\circ$, which is where the +first-order beam becomes parallel to the surface and subsequently vanishes. In the case of $\rho'_p$ and $\rho'_s$ (conical mount) the cutoff of both the first orders occurs at $\theta = 70.53^\circ$. When the metallic grating has a large conductivity, the surface plasmon features and Rayleigh anomalies are usually located pairwise, close to each other.

Dependence of diffraction efficiency on the grating period

The efficiency curves become somewhat erratic as the period p of the grating decreases, but they approach a limiting behavior with increasing p. Figure 24.7 shows computed plots of the zeroth-order efficiency versus $\theta$ for the grating of Figure 24.1 with (a) $p = \lambda_0$ and (b) $p = 5\lambda_0$ (in both cases $\lambda_0 = 0.633$ μm, $d = \lambda_0/8$).12 These plots should be compared with those of Figure 24.6(a) for which $p = 3\lambda_0$. Notice the substantial departure of the curves in Figure 24.7(a) from those in Figure 24.6(a). However, there are similarities between Figures 24.6(a) and 24.7(b), stemming from the fact that in both cases the grating period is fairly large and the grooves are rather shallow. Figure 24.8 shows plots of $\rho^{(0)}$ versus p at a fixed angle of incidence ($\theta = 30^\circ$, $\lambda_0 = 0.633$ μm, $d = \lambda_0/8$). The solid (broken) arrows at the bottom (top) of the figure indicate the locations of Rayleigh anomalies for the classical (conical) mount. It appears that as the period increases the various zeroth-order efficiencies approach a limiting value in the vicinity of 55%. The remainder of the incident energy in this case is partly absorbed by the metal layer and partly distributed among other diffracted orders. As $p \to \infty$ the orders that carry the bulk of the reflected energy converge towards the zeroth order line. At the same time, the overall reflectance, which becomes more and more concentrated around the direction of specular reflection, approaches the specular reflectivity of the flat metal layer at $30^\circ$ incidence

[Figure 24.6: three plots, (a)–(c), of diffraction efficiency versus $\theta$ (0 to 90 degrees), with curves for p- and s-polarized light in both mounts; panel (a) is annotated $\lambda_0 = 0.633$ μm, $p = 3\lambda_0$, $d = \lambda_0/8$.]

Figure 24.6 Computed plots of diffraction efficiency versus the angle of incidence $\theta$ for $\rho_p$, $\rho_s$ (classical mount) and $\rho'_p$, $\rho'_s$ (conical mount, i.e., grooves parallel to the incidence plane). The solid (broken) arrows indicate the locations of Rayleigh anomalies for the classical (conical) mount. (a) zeroth-order, (b) +first-order, and (c) −first-order diffracted beams upon reflection from the grating of Figure 24.1 ($\lambda_0 = 0.633$ μm, $p = 3\lambda_0$, $d = \lambda_0/8$).

(i.e., 84% for p-light, 88% for s-light). In the opposite extreme, $p \to 0$, the reflectivity curves once again show a limiting behavior. Although there are no other diffracted orders in this case, the limiting value of $\rho^{(0)}$ is not necessarily the same as the specular reflectance of the flat metal layer but should be calculated from an “effective medium” theory.
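Two of the numbers quoted in this section follow from simple formulas and can be reproduced directly: the cutoff (Rayleigh) angles come from the propagation constraints of Eqs. (24.1) and (24.2), and the flat-metal limit comes from the standard Fresnel reflection coefficients for an air-metal interface. In the sketch below the function names are illustrative, the angles are restricted to the range 0 to 90 degrees, and the complex index is written as n + ik, which is an assumed convention (the reflectance does not depend on the sign chosen for k).

```python
import cmath
import math

def rayleigh_angles_deg(m, period_over_lambda, mount="classical"):
    """Incidence angles in [0, 90] degrees at which order m grazes the surface."""
    g = m / period_over_lambda
    if mount == "classical":                       # sin(theta) + g = +1 or -1
        sines = [sign - g for sign in (1.0, -1.0)]
    else:                                          # sin^2(theta) + g^2 = 1
        sines = [math.sqrt(1.0 - g * g)] if g * g <= 1.0 else []
    return [math.degrees(math.asin(s)) for s in sines if 0.0 <= s <= 1.0]

def flat_metal_reflectance(theta_deg, n_complex):
    """Fresnel power reflectances (Rp, Rs) of a flat interface between air and metal."""
    c = math.cos(math.radians(theta_deg))
    s = math.sin(math.radians(theta_deg))
    n2 = n_complex * n_complex
    w = cmath.sqrt(n2 - s * s)                     # n * cos(refraction angle)
    r_s = (c - w) / (c + w)
    r_p = (n2 * c - w) / (n2 * c + w)
    return abs(r_p) ** 2, abs(r_s) ** 2

print(rayleigh_angles_deg(1, 3.0, "classical"))    # p = 3*lambda0, +first order
print(rayleigh_angles_deg(1, 3.0, "conical"))
print(flat_metal_reflectance(30.0, 2 + 7j))        # (n, k) = (2, 7)
```

For p = 3λ0 the first call returns the 41.81° cutoff of the +first order and the second the 70.53° cutoff in the conical mount; the reflectances round to the 84% and 88% quoted for the flat metal layer at 30° incidence.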

Effect of the groove depth

Another factor that complicates the behavior of a grating is the dependence of its efficiency on the groove depth d. Figure 24.9 shows plots of $\rho^{(0)}$ versus $\theta$ for reflection from the grating of Figure 24.1 when the groove depth d = 0.2 μm ($\lambda_0 = 0.633$ μm, $p = 3\lambda_0$). These curves are quite different from those of Figure 24.6(a), which correspond to a similar grating with shallower grooves. The lower values of $\rho^{(0)}$ in the case of a deep-groove grating indicate that more light is being channeled into other diffracted orders.

Reciprocity theorem

There exists a powerful and quite unexpected reciprocity relation between the beam incident on a grating and any of the resulting diffracted orders. Suppose the incident beam arrives at the grating at an angle $\theta$ and the mth diffracted order emerges at an angle $\theta^{(m)}$, having diffraction efficiency $\rho^{(m)}$ or, in the case of a transmitted order,

[Figure 24.7: two plots, (a) and (b), of diffraction efficiency (0 to 1.0) versus $\theta$ (0 to 90 degrees), with zeroth-order curves for p- and s-polarized light in both mounts; annotations: $\lambda_0 = 0.633$ μm, $d = \lambda_0/8$, $p = \lambda_0$ in (a), $p = 5\lambda_0$ in (b).]

Figure 24.7 Computed plots of diffraction efficiency versus $\theta$ for the zeroth-order diffracted beam upon reflection from the grating of Figure 24.1 ($\lambda_0 = 0.633$ μm, $d = \lambda_0/8$). In (a) the grating period $p = \lambda_0$ while in (b) $p = 5\lambda_0$. The solid (broken) arrows indicate the locations of Rayleigh anomalies for the classical (conical) mount.

$\tau^{(m)}$. If the direction of incidence is now changed in such a way that the incident beam is along the path of the mth-order beam (in the reverse direction, of course), there emerges an mth diffracted order along the path of the original incident beam (again in the reverse direction). The reciprocity theorem states that the

[Figure 24.8: plot of diffraction efficiency versus the grating period p (0 to 6 μm), with zeroth-order curves for p- and s-polarized light in both mounts.]

Figure 24.8 Computed plots of the zeroth-order efficiency versus the grating period p for the grating of Figure 24.1 ($\lambda_0 = 0.633$ μm, $d = \lambda_0/8$, $\theta = 30^\circ$). The solid (broken) arrows indicate the locations of Rayleigh anomalies for the classical (conical) mount.

[Figure 24.9: plot of diffraction efficiency (0 to 1.0) versus $\theta$ (0 to 90 degrees), with zeroth-order curves for p- and s-polarized light in both mounts; annotations: $\lambda_0 = 0.633$ μm, $p = 3\lambda_0$, d = 0.2 μm.]

Figure 24.9 Computed plots of the zeroth-order diffraction efficiency versus the angle of incidence for the grating of Figure 24.1 ($\lambda_0 = 0.633$ μm, $p = 3\lambda_0$, d = 0.2 μm). The solid (broken) arrows indicate the locations of Rayleigh anomalies for the classical (conical) mount.


efficiency of this particular diffracted order will be exactly equal to $\rho^{(m)}$ (or $\tau^{(m)}$). This theorem can be rigorously proved under general conditions.2 In Figure 24.6 the first-order efficiency curves in the classical mount, i.e., $\rho_s^{(\pm 1)}$ and $\rho_p^{(\pm 1)}$, show several manifestations of the reciprocity theorem. A few more consequences of reciprocity will be pointed out in the examples that follow.
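For the classical mount, the pairing of incidence directions described by the reciprocity theorem follows from Eq. (24.1) alone. If the incident beam has sin θ = s, its mth order leaves with σx = s + mλ0/p; reversing that beam gives a new incidence with sin θ′ = −(s + mλ0/p), whose mth order then has σx = −s, the reverse of the original incident beam. The sketch below encodes this bookkeeping; the function names are illustrative, and the sign convention (incidence angles labelled by the σx of their specular reflection) is the one assumed here on the basis of Eq. (24.1).

```python
import math

def order_sine(sin_theta, m, period_over_lambda):
    """x-component of the unit propagation vector of order m, Eq. (24.1)."""
    return sin_theta + m / period_over_lambda

def reciprocal_incidence(sin_theta, m, period_over_lambda):
    """sin(theta') of the incidence that retraces order m in reverse."""
    return -order_sine(sin_theta, m, period_over_lambda)

def symmetry_axis(m, period_over_lambda):
    """Value of sin(theta) about which the order-m efficiency curve is symmetric."""
    return -m / (2.0 * period_over_lambda)

# Example values for p = 2*lambda0 and the +first order.
s = 0.1
partner = reciprocal_incidence(s, 1, 2.0)
print(partner, symmetry_axis(1, 2.0))
print(math.degrees(math.asin(abs(symmetry_axis(1, 2.0)))))
```

Since the two paired incidences share the same efficiency, a plot of the mth-order efficiency against sin θ is mirror-symmetric about sin θ = −mλ0/(2p). At that value the mth order retraces the incident beam, so the pair collapses to a single direction.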

Resolving power

Consider a grating of period p having a total of N grooves. The width of the mth-order diffracted beam that covers the entire grating is $Np\cos\theta^{(m)}$. If this beam is brought to diffraction-limited focus by a lens of focal length f, the focused spot diameter D will be1

$$D \approx \lambda_0 f / (Np\cos\theta^{(m)}). \tag{24.4}$$

Spectroscopists are interested in the focused spots formed by two nearby wavelengths, $\lambda_0$ and $\lambda_0 + \Delta\lambda$. According to Eq. (24.1) the diffraction angle $\theta^{(m)}$ in the classical mount is given by $\sin\theta^{(m)} = \sin\theta + m\lambda_0/p$, in which case for a small change of wavelength $\Delta\lambda$ we have

$$\cos\theta^{(m)}\,\Delta\theta^{(m)} \approx (m/p)\,\Delta\lambda. \tag{24.5}$$

Therefore, in the focal plane of the lens, a shift of the wavelength from $\lambda_0$ to $\lambda_0 + \Delta\lambda$ causes a shift of the focused spot by the following amount:

$$f\,\Delta\theta^{(m)} \approx m f\,\Delta\lambda / (p\cos\theta^{(m)}). \tag{24.6}$$

The two wavelengths are just resolved when the above shift equals the spot diameter D in Eq. (24.4), that is, when $f\,\Delta\theta^{(m)} \approx D$. This leads to the following expression for the resolving power:

$$\lambda_0/\Delta\lambda \approx mN. \tag{24.7}$$

It is thus seen that the resolving power of a grating is directly proportional to N, its total number of illuminated grooves, and to m, the order of diffraction. The resolving power is completely independent of such seemingly relevant factors as the groove period, the groove geometry, and the incidence angle.
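The chain of Eqs. (24.4) to (24.7) can be verified numerically: for the wavelength step Δλ = λ0/(|m|N), the focal-plane shift of Eq. (24.6) should equal the spot diameter of Eq. (24.4), whatever the period, incidence angle, or focal length. The function names and the sample numbers below are illustrative assumptions, not values taken from the chapter.

```python
import math

def resolving_power(m, n_grooves):
    """lambda0 / delta-lambda from Eq. (24.7); the sign of m is irrelevant."""
    return abs(m) * n_grooves

def spot_and_shift(lambda0, period, n_grooves, m, theta_deg, focal_length):
    """Return (D, shift) at the just-resolved wavelength step.

    All lengths share one unit; the order m must be a propagating one.
    """
    sin_m = math.sin(math.radians(theta_deg)) + m * lambda0 / period
    cos_m = math.sqrt(1.0 - sin_m ** 2)
    spot = lambda0 * focal_length / (n_grooves * period * cos_m)      # Eq. (24.4)
    d_lambda = lambda0 / resolving_power(m, n_grooves)                # Eq. (24.7)
    shift = abs(m) * focal_length * d_lambda / (period * cos_m)       # Eq. (24.6)
    return spot, shift

# Hypothetical grating: 1 um period, 10000 illuminated grooves, first order.
D, shift = spot_and_shift(0.633, 1.0, 10000, 1, 10.0, 1.0e5)
print(D, shift, resolving_power(1, 10000))
```

Changing the period, the angle, or the focal length rescales the spot and the shift by the same factor, which is why none of them appears in Eq. (24.7).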

Littrow mount and blazed gratings

To build compact spectrometers, it is desirable that one of the diffracted orders should return along (or almost along) the direction of incidence. In the so-called Littrow mount, the nth-order beam, where n is negative, returns along the direction of incidence. For instance, in the −first-order Littrow mount, we find from Eq. (24.1)

$$2\sin\theta = \lambda_0/p. \tag{24.8}$$

Under this condition, if $p < 1.5\lambda_0$, then the only possible diffracted orders are the zeroth and the −first. Furthermore, if the efficiency for the zeroth order can be reduced to zero, all the available power that is not absorbed by the grating will return along the −first reflected order, thus maximizing the sensitivity of the spectrometer. Gratings that direct all or most of the incident optical power into a single diffracted order are known as blazed gratings. Although in the early days ruled gratings having a triangular groove profile satisfied the blaze condition, a triangular cross-section is no longer a prerequisite to the blazing property. Gratings with triangular cross-section and a $90^\circ$ apex angle are now more appropriately referred to as “echelette” gratings. Figure 24.10 shows a metallic prism with an inclination angle $\alpha$. When a plane wave is normally incident on the inclined facet of this prism, the specularly

[Figure 24.10: drawing of the metallic prism (top) and the echelette grating obtained from it (bottom). Labels: incident beam at $\theta = \alpha$; cut lengths $m\lambda_0/2$, $2m\lambda_0/2$, $3m\lambda_0/2$, $4m\lambda_0/2$; $d = \tfrac{1}{2} m\lambda_0 \cos\alpha$; $p = m\lambda_0/(2\sin\alpha)$.]

Figure 24.10 A normally incident beam of light is specularly reflected from the inclined facet of a metallic prism (inclination angle $\alpha$). For a given integer m, imagine cutting the prism along the broken and dotted lines, which are parallel to the direction of incidence and have lengths that are multiples of $m\lambda_0/2$. The various sections are then rearranged to form the echelette grating shown in the lower part of the figure. If the grating is similarly illuminated at $\theta = \alpha$, the diffracted order that retraces the incidence path in the reverse direction will be quite strong, which is why this kind of grating has come to be known as a blazed grating.

reflected light returns along the direction of incidence. Let the lengths of the equidistant lines drawn on the prism parallel to the direction of incidence be integer multiples of $m\lambda_0/2$, where m is an arbitrary (but fixed) integer. If the metal prism is cut along these lines and its segments rearranged, one obtains an echelette grating with period $p = m\lambda_0/(2\sin\alpha)$, as shown in the lower part of the figure. With an incidence angle $\theta = \alpha$ on this grating, Littrow’s condition for the negative mth diffracted order will be satisfied. In the geometric-optical approximation, this grating should be equivalent to the original prism, because the various reflected rays from its individual facets suffer phase delays in multiples of $2\pi$ only, making the grating’s reflected wavefront indistinguishable from that of the prism. In reality, however, the electromagnetic field “feels” the groove structure, and the actual diffraction efficiency of the beam returning along the direction of incidence will not always be the same as the specular reflectance of the polished metal prism, although they are usually close.

Figure 24.11 shows computed efficiency curves in the classical mount for the echelette grating of Figure 24.10 having $\alpha = 30^\circ$, $p = 2\lambda_0$, and (n, k) = (2, 7) at $\lambda_0 = 0.633$ μm.12 The horizontal axis depicts $\sin\theta$, the incidence angle $\theta$ being positive (negative) when incidence is from the side of the large (small) facet of the triangular grooves. The arrows at the top of each frame indicate the locations of Rayleigh anomalies, in the neighborhood of which resonance features and slope discontinuities are seen to occur. The zeroth-order efficiency curves for p- and s-polarized light are shown in Figure 24.11(a). Despite the asymmetrical groove geometry, the plots of $\rho_p^{(0)}$ and $\rho_s^{(0)}$ are perfectly symmetric around $\theta = 0$, which is a manifestation of the reciprocity theorem mentioned earlier. The +first-order efficiency curves in Figure 24.11(b) show the same kind of symmetry around $\theta = -14.48^\circ$ (i.e., $\sin\theta = -0.25$), which is the angle of incidence for the +first-order Littrow mount. Similarly, the −first-order curves in Figure 24.11(c) show the reciprocity theorem at work around $\theta = 14.48^\circ$, the angle of incidence for the −first-order Littrow mount. The Rayleigh anomalies at $\theta = \pm 30^\circ$ (i.e., $\sin\theta = \pm 0.5$) mark the disappearance of the first-order beams beyond these angles, as may be seen clearly in Figures 24.11(b) and 24.11(c). The second-order efficiency curves are shown in Figure 24.11(d). These curves peak at, and are symmetrical around, $\theta = \pm 30^\circ$, where the Littrow condition for the second-order beams is satisfied. Reciprocity between the incident beam and the second-order reflected beams is evident in the symmetrical values of efficiency around $\theta = \pm 30^\circ$. Note in the case of the p-polarized beam incident at $\theta = 30^\circ$, where the −second-order efficiency reaches 80% while that of all other orders essentially vanishes, that the remaining 20% of the incident power must have been absorbed by the grating. A similar consideration applies to both $\rho_p^{(+2)}$ and $\rho_s^{(+2)}$ at $\theta = -30^\circ$. The third-order beams exist only at large angles

[Figure 24.11: plots (a)–(d) of diffraction efficiency versus $\sin\theta$ (−1.0 to 1.0) for p- and s-polarized light; panel (a) is annotated $\lambda_0 = 0.633$ μm, $p = 2\lambda_0$, $\alpha = 30^\circ$; panel (d) shows the curves for the orders −2 and +2.]

Figure 24.11 Computed plots of diffraction efficiency versus $\sin\theta$, where $\theta$ is the angle of incidence on the echelette grating of Figure 24.10 ($\lambda_0 = 0.633$ μm, $\alpha = 30^\circ$, $p = 2\lambda_0$, (n, k) = (2, 7)). When $\theta > 0$, incidence is from the large-facet side of the triangular grooves while when $\theta < 0$ incidence is from the small-facet side. The displayed efficiencies are for p- and s-polarized incident light in the classical mount. (a) Zeroth order, (b) +first order, (c) −first order, (d) second order, (e) third order. The arrows at the top of each frame indicate the locations of Rayleigh anomalies.

of incidence, as may be inferred from Figure 24.11(e). Again note the symmetry of these curves (due to reciprocity) around $\sin\theta = \pm 0.75$; these values of $\theta$ correspond to the Littrow mount in the third-order. For the sake of completeness we present in Figure 24.12 computed efficiency curves in the case of conical mount for the same echelette grating as discussed

[Figure 24.11 (continued): plot (e) of diffraction efficiency versus $\sin\theta$ (−1.0 to 1.0) for the orders +3 and −3; the curves for the order −3 carry the label 50, apparently a scale factor.]

above.12 Here the grooves are parallel to the plane of incidence, and symmetry with respect to $\theta = 0$ obviates the need for displaying the results for negative values of $\theta$. In this conical mount only the zeroth and ±first diffracted orders are allowed; even then, the first-order beams disappear beyond $\theta = 60^\circ$. Note that, because of the asymmetrical groove shape, the +first-order efficiency curves are quite different from those of the −first-order. Also note that, beyond $\theta = 60^\circ$, where the zeroth-order beam is the only beam reflected from the grating, the relatively small values of $\rho_p'^{(0)}$ and $\rho_s'^{(0)}$ indicate substantial absorption within the grating medium.

Transmission grating

Consider a grooved glass plate such as that depicted in Figure 24.13(a). When a plane wave is incident at $\theta$ on this grating, the directions of the reflected orders may be found from Eqs. (24.1) and (24.2), but the transmitted orders inside the glass plate obey different equations. In the classical mount the transmitted orders emerge at angles $\theta^{(m)}$, where

$$n_0 \sin\theta^{(m)} = \sin\theta + m\lambda_0/p. \tag{24.9}$$

Here n0 is the refractive index of the substrate. The number of diffracted orders in the substrate could, therefore, be greater than the number reflected into the air.

[Figure 24.12: three plots, (a)–(c), of diffraction efficiency versus $\theta$ (0 to 90 degrees) for p- and s-polarized light; panel (a) is annotated $\lambda_0 = 0.633$ μm, $p = 2\lambda_0$, $\alpha = 30^\circ$.]

Figure 24.12 Computed plots of diffraction efficiency versus the angle of incidence on the echelette grating of Figure 24.10 ($\lambda_0 = 0.633$ μm, $\alpha = 30^\circ$, $p = 2\lambda_0$, (n, k) = (2, 7)). The displayed efficiencies are for p- and s-polarized incident light in the conical mount. (a) zeroth order, (b) +first order, (c) −first order. The arrows at the bottom of each frame indicate the locations of Rayleigh anomalies.

However, when the transmitted orders attempt to exit the bottom of the substrate, those incident at an angle higher than the critical angle for total internal reflection will be fully reflected. The beams that do exit the substrate will emerge at angles greater than $\theta^{(m)}$, in accordance with Snell’s law; the coefficient $n_0$ on the left-hand side of Eq. (24.9) is effectively canceled. Consequently, the beams emerging from the bottom of the substrate have exactly the same number and (aside from being mirror images) the same directions as those reflected from the top of the grating. Nonetheless, the transmitted diffracted orders may be observed in their native form by using a hemispherical substrate, as shown in Figure 24.13(b).

In the case of conical mount similar arguments apply, so that the mth-order beam inside the substrate will have a propagation direction given by the unit vector $\boldsymbol{\sigma}^{(m)}$, where

$$\boldsymbol{\sigma}^{(m)} = \sigma_x^{(m)}\hat{\mathbf{x}} + \sigma_y^{(m)}\hat{\mathbf{y}} + \sigma_z^{(m)}\hat{\mathbf{z}} = (1/n_0)\,[(\sin\theta)\,\hat{\mathbf{x}} + (m\lambda_0/p)\,\hat{\mathbf{y}}] + \sigma_z^{(m)}\hat{\mathbf{z}}. \tag{24.10}$$

Again, $\sigma_z$ is determined from the relation $\sigma_x^2 + \sigma_y^2 + \sigma_z^2 = 1$. As above, when this beam emerges into air from the bottom of a flat substrate, Snell’s law multiplies $\sigma_x$ and $\sigma_y$ by the refractive index $n_0$, ensuring that the emergent beams (aside from being mirror images) have the same propagation directions as the corresponding beams reflected from the top of the grating.

[Figure 24.13: two drawings, (a) and (b), of a grating on a glass substrate, with the transmitted orders labelled +1, 0, −1, −2, −3.]

Figure 24.13 A simple transmission grating may be obtained by ruling or etching a glass substrate, or by a holographic method. The substrate’s refractive index being greater than unity, the diffraction angles inside the substrate are smaller than those observed upon reflection from the same grating into the air. (a) When the substrate bottom is flat, Snell’s law of refraction reorients the beams as they emerge into the air, making the diffraction angles equal to those observed in reflection. However, one or more diffracted orders may be missing, owing to total internal reflection at the substrate bottom. (b) If the grating is made on the flat surface of a glass hemisphere, the transmitted orders emerge into the air undisturbed.

Figure 24.14 shows the location of the transmitted diffracted orders from a glass grating [12]. The assumed grating in this case is similar to that of Figure 24.1, except that the metal layer is absent. The observation system is also similar to that in Figure 24.3, except for the position of the collimating lens, which is moved to the opposite side of the grating to collect the transmitted orders. The incident beam, arriving at $\theta = 30^\circ$ in the conical mount, is p-polarized. The pictures on the left-hand side of Figure 24.14 represent the component of polarization in the XZ-plane ($E_\parallel$), while those on the right correspond to polarization along the Y-axis ($E_\perp$). The top row shows the intensity distribution at the exit pupil of the collimating lens when the substrate bottom is flat; the bottom row corresponds to the case of a hemispherical substrate. As expected, in the latter case there are more diffracted orders, the orders are more closely spaced, and the individual beam diameters are smaller. For the flat substrate the peak-intensity ratio $|E_\perp|^2 : |E_\parallel|^2 = 0.21 \times 10^{-4}$, while for the hemispherical substrate $|E_\perp|^2 : |E_\parallel|^2 = 0.89 \times 10^{-4}$.

Figure 24.14 Computed plots of intensity distribution at the exit pupil of the collimating lens of Figure 24.3, when the system is rearranged to allow observation of transmitted orders from the grating of Figure 24.1, from which the metal layer has been removed ($\lambda_0 = 0.633\ \mu$m, $p = 4\lambda_0$, $d = \lambda_0/8$). In this case of conical mount at $30^\circ$ incidence the grooves are parallel to the plane of incidence, as in Figure 24.2(b), and the incident beam is p-polarized. The pictures on the left, (a) and (c), correspond to the component of polarization in the XZ-plane, while those on the right, (b) and (d), represent the polarization component along the Y-axis. In (a) and (b) the substrate bottom is flat, as in Figure 24.13(a), whereas in (c) and (d) it is hemispherical, as in Figure 24.13(b). The horizontal axis in each frame is $x/\lambda_0$, ranging from −1600 to 1600. The ratio of the peak intensity in (b) to that in (a) is $0.21 \times 10^{-4}$. Similarly, the peak-intensity ratio of (d) to (c) is $0.89 \times 10^{-4}$. These results are based on full vector-diffraction calculations.

Dielectric-coated grating

Figure 24.15 is a diagram of a dielectric-coated transmission grating on a hemispherical glass substrate. In the example that follows it is assumed that $\lambda_0 = 0.633\ \mu$m, the grating period $p = \lambda_0$, the groove depth $d = \lambda_0/8$, the side-wall inclination angle $\alpha = 60^\circ$, and the duty cycle $c = 60\%$. The coatings are conformal to the grating surface, both dielectric layers are 100 nm thick, and their refractive indices are 2.1 and 1.5, as indicated. Because there are no metallic layers in this case there will be no surface plasmon excitations, but there is the possibility of guided-mode coupling to the dielectric waveguide formed by the coating layers. The hemispherical substrate allows all transmitted orders to exit and be measured in air. The bottom of the hemisphere is antireflection coated, to avoid losses as the beams exit the substrate.

Figure 24.15 Cross-section of a dielectric-coated diffraction grating. The side-wall angle $\alpha = 60^\circ$, and the duty cycle $c$, which is the ratio of the land width to the grating period, is 60%. Both coating layers are 100 nm thick and (at $\lambda_0 = 0.633\ \mu$m) their refractive indices are $n_1 = 2.1$ and $n_2 = 1.5$. For the substrate, which is also transparent, $n_0 = 1.5$.

Figure 24.16 shows computed plots of diffraction efficiency versus $\theta$ for the grating of Figure 24.15 [12]. The case of conical mount does not show interesting phenomena, as evidenced by the featureless plots of $\rho'$ and $\tau'$ for the various orders. This is not surprising, considering that no guided modes can be launched in the dielectric layers in this case. However, for the classical mount $\rho_p$, $\rho_s$, $\tau_p$ and $\tau_s$ show peaks and valleys that are indicative of resonant behavior. Figure 24.16(b) shows plots of $\rho_p$ and $\rho_s$ for the −first-order reflected beam, which carries as much as 8% of the incident beam into this particular direction at several angles of incidence. Reciprocity between the incident beam and the −first-order reflected beam is evident in Figure 24.16(b), in the symmetrical values of efficiency before and after $\theta = 30^\circ$. Note that, unlike surface plasmon excitations in metals, which occur in p-polarization only, the waveguide modes of dielectric layers can be excited by both p- and s-polarized light. For the classical mount, Figure 24.16(d) shows that the +first-order transmitted beam is cut off beyond $\theta = 30^\circ$. In its place the −second-order transmitted beam shown in Figure 24.16(f) appears and shows fairly high efficiency for p-polarized light in a narrow range of angles around $\theta = 33^\circ$.

Figure 24.16 Computed diffraction efficiencies versus $\theta$ (0° to 90°) for the dielectric-coated grating of Figure 24.15 ($\lambda_0 = 0.633\ \mu$m, $p = \lambda_0$, $d = \lambda_0/8$). Reflected beams: (a) zeroth order, (b) −first order. Transmitted beams: (c) zeroth order, (d) +first order, (e) −first order, (f) −second order (classical mount only). The arrows at the top or the bottom of each frame indicate the locations of Rayleigh anomalies in the classical mount.

It is impossible to describe in a brief survey the entire range of physical phenomena that occur in diffraction gratings and their potential applications. We hope, however, to have brought to the reader's attention the richness and complexity of the physics of gratings, and to have encouraged further exploration of this fascinating subject.

References for Chapter 24

[1] M. Born and E. Wolf, Principles of Optics, sixth edition, Pergamon Press, Oxford, 1980.
[2] R. Petit, editor, Electromagnetic Theory of Gratings, Vol. 22 of Topics in Current Physics, Springer-Verlag, Berlin, 1980.
[3] M. C. Hutley, Diffraction Gratings, Academic Press, New York, 1982.
[4] E. G. Loewen and E. Popov, Diffraction Gratings and Applications, Marcel Dekker, New York, 1997.
[5] J. Fraunhofer, Ann. d. Physik 74, 337 (1823), reprinted in his collected works, 117 (Munich, 1888).
[6] H. A. Rowland, Phil. Mag. (5), 13, 469 (1882).
[7] R. W. Wood, On a remarkable case of uneven distribution of light in a diffraction grating spectrum, Phil. Mag. 4, 396–402 (1902).
[8] J. W. S. Rayleigh, Proc. Roy. Soc. London A 79, 399 (1907).
[9] D. Maystre, Rigorous vector theories of diffraction gratings, in Progress in Optics, Vol. 21, 1–67, ed. E. Wolf, Elsevier, Amsterdam, 1984.
[10] D. Maystre, ed., Selected Papers on Diffraction Gratings, SPIE Milestone Series, Vol. MS 83, SPIE, Bellingham, 1993.
[11] Lifeng Li, Multilayer-coated diffraction gratings: differential method of Chandezon et al. revisited, J. Opt. Soc. Am. A 11, 2816–2828 (1994).
[12] The simulations in this chapter were performed by DELTA, a program developed by Lifeng Li for grating calculations, and by DIFFRACT™, a product of MM Research Inc., Tucson, Arizona.

Chapter 25 Diffractive optical elements

Diffractive optical elements (DOEs), which are relatively new additions to the toolbox of optical engineering, can function as lenses, gratings, prisms, aspherics, and many other types of optical element. Typically formed in a film only a few microns thick, a DOE may be fabricated on an arbitrarily shaped substrate. Flexible functionality, a wide range of available optical apertures, light weight, and low manufacturing cost are among the advantages of DOEs. They can be fabricated in a broad range of materials such as aluminum, silicon, silica, and plastics, thus providing flexibility in selecting the base material for specific applications. The effects of temperature change, thermal gradients, shock, and stress in thin-film optical devices, however, can cause deformation of the substrate and ultimately alter the behavior of a DOE [1–6]. DOEs are wavelength sensitive; for instance, the focal length and aberration characteristics of a diffractive lens can vary substantially if the wavelength of the incident light is changed. DOEs can duplicate most of the functions of conventional glass optics, provided that the optical system operates over a narrow spectral bandwidth or that the operation of the system requires chromatic dispersion. To date, DOEs have found widespread application in beam-combiners, head-mounted displays, beam-shaping optics, laser collimators, spectral filters, compact spectrometers, diode laser couplers, projection displays, compact disk (CD) and digital versatile disk (DVD) players, laser resonators, computer interconnects, solar concentrators, laser material processing, and wavelength division multiplexers/demultiplexers. Optimal design of advanced optical systems requires a thorough understanding of the interaction between the light beam and the various elements located between the light source and the detectors.
In this chapter we use a combination of polarization ray-tracing and quasi-vector diffraction modeling to analyze the behavior of a laser beam as it propagates through various diffractive optical elements.


Transmissive diffractive optical element

Figure 25.1(a) shows a geometric-optical ray (vacuum wavelength $= \lambda_0$) arriving through a medium of refractive index $n_1$ at the surface of a substrate (refractive index $= n_2$) coated with a variable-thickness layer; the angle and the azimuth of incidence are $\theta_1, \phi_1$, those of the transmitted ray are $\theta_2, \phi_2$. The incident wavefront at the front facet of the substrate may be written as $A(x, y) = A_0 \exp[i(2\pi n_1/\lambda_0)(x\sigma_x + y\sigma_y)]$, where $\sigma_x = \sin\theta_1\cos\phi_1$ and $\sigma_y = \sin\theta_1\sin\phi_1$. The coating layer has thickness $t(x, y)$ and refractive index $n$. To avoid certain complications in the following analysis we shall assume that $n$ is very large and $t(x, y)$ very small, so that only the product $(n - n_1)\,t(x, y)$, known as the optical path difference (OPD), has a finite value. The characteristic function of the coating layer is thus the dimensionless function $F(x, y) = (n - n_1)\,t(x, y)/\lambda_c$, where $\lambda_c$ is some fixed "construction wavelength." The characteristic function is generally specified by a polynomial such as

\[ F(x, y) = \sum_{m=0}^{N}\ \sum_{n=0}^{N-m} a_{mn}\, x^m y^n. \qquad (25.1a) \]

Figure 25.1 (a) A ray of light (vacuum wavelength $= \lambda_0$) is incident at an oblique angle $(\theta_1, \phi_1)$ from a medium of refractive index $n_1$ onto a substrate of index $n_2$. The substrate is coated with a layer of index $n$ and variable thickness $t(x, y)$, where $n$ is assumed to be large and $t(x, y)$ very small, so that only the optical path difference, OPD $= (n - n_1)\,t(x, y)$, has a finite value. (b) The variable-thickness layer is converted to a DOE by reducing the coating layer's thickness wherever the OPD contains an integer multiple of the construction wavelength $\lambda_c$. The characteristic function of the DOE is thus the fractional part $f(x, y)$ of the characteristic function of the coating layer in (a), defined as $F(x, y) = (n - n_1)\,t(x, y)/\lambda_c$.

$F(x, y)$ must be greater than or equal to zero across the surface since $n - n_1$, $t(x, y)$ and $\lambda_c$ are all non-negative. For later reference, the gradient of $F(x, y)$ is written below:

\[ \nabla F(x, y) = (\partial F/\partial x,\ \partial F/\partial y) = \left( \sum_{m=1}^{N}\ \sum_{n=0}^{N-m} m\, a_{mn}\, x^{m-1} y^{n},\ \ \sum_{n=1}^{N}\ \sum_{m=0}^{N-n} n\, a_{mn}\, x^{m} y^{n-1} \right). \qquad (25.1b) \]
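As a minimal sketch of how Eqs. (25.1a) and (25.1b) might be evaluated numerically, the Python fragment below stores the coefficients $a_{mn}$ in a dictionary keyed by $(m, n)$ and differentiates term by term. The function names and the sample coefficients are hypothetical and chosen only for illustration.

```python
def char_function(coeffs, x, y):
    """F(x, y) of Eq. (25.1a); coeffs maps (m, n) -> a_mn."""
    return sum(a * x**m * y**n for (m, n), a in coeffs.items())

def char_gradient(coeffs, x, y):
    """(dF/dx, dF/dy) of Eq. (25.1b), differentiating term by term."""
    dFdx = sum(m * a * x**(m - 1) * y**n for (m, n), a in coeffs.items() if m >= 1)
    dFdy = sum(n * a * x**m * y**(n - 1) for (m, n), a in coeffs.items() if n >= 1)
    return dFdx, dFdy

# Illustrative coefficients (hypothetical): F = 3x^2 + 3y^2 + 0.5xy, x and y in mm.
coeffs = {(2, 0): 3.0, (0, 2): 3.0, (1, 1): 0.5}
x0, y0 = 0.5, -0.2
F0 = char_function(coeffs, x0, y0)
gx, gy = char_gradient(coeffs, x0, y0)
print(F0, gx, gy)
```

Because $F$ is dimensionless, the gradient carries units of inverse length (here mm⁻¹); any wavelength multiplied by it later must be expressed in the same length unit.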


A diffractive optical element (DOE) is constructed from the above coating layer by reducing the layer's thickness whenever $F(x, y)$ happens to be greater than unity. By removing from $t(x, y)$ all integer multiples of $\lambda_c/(n - n_1)$, one obtains a coating such as that in Figure 25.1(b), for which the integer part of $F(x, y)$, if any, has been eliminated in all locations. The characteristic function $f(x, y)$ of the DOE, with values confined to the interval [0, 1], is simply the fractional part of $F(x, y)$.

Figure 25.2 Diagram of a DOE showing the slicing contours where the function $F(x, y)$ assumes integer values. The DOE's characteristic function $f(x, y)$ is the fractional part of $F(x, y)$. Thus, while $F(x, y)$ is continuous across the XY-plane, $f(x, y)$ jumps by one unit at each contour. The space between each pair of adjacent contours contains a single groove of the DOE, where $f(x, y)$ varies continuously between the values of 0 and 1. At an arbitrary location $(x_0, y_0)$ in the XY-plane, the separation between adjacent contours is given by $(\Delta x, \Delta y) = \nabla F/|\nabla F|^2$, which is a vector of magnitude $1/|\nabla F|$ oriented orthogonal to the contours.

As shown in Figure 25.2, the coating layer's $F(x, y)$ is truncated at contours where the function acquires integer values, so the local period $(\Delta x, \Delta y)$ of the DOE at a point such as $(x_0, y_0)$ is the shortest line segment through $(x_0, y_0)$ that satisfies the equation

\[ \nabla F(x, y) \cdot (\Delta x\, \hat{\mathbf{x}} + \Delta y\, \hat{\mathbf{y}}) = (\partial F/\partial x)\,\Delta x + (\partial F/\partial y)\,\Delta y = 1. \qquad (25.2) \]

In Eq. (25.2) $\hat{\mathbf{x}}$ and $\hat{\mathbf{y}}$ are unit vectors along the coordinate axes. Noting that $|\nabla F|^2 = (\partial F/\partial x)^2 + (\partial F/\partial y)^2$, we find $(\Delta x, \Delta y) = \nabla F/|\nabla F|^2$. This is the local period of the grating at $(x_0, y_0)$, which is directed along $\nabla F$ and has magnitude $1/|\nabla F|$. In the linear approximation, a single period of the grating begins at $(x, y) = (x_0, y_0) - f(x_0, y_0)\,\nabla F/|\nabla F|^2$, where $f(x, y) = 0$, and ends at $(x, y) = (x_0, y_0) + [1 - f(x_0, y_0)]\,\nabla F/|\nabla F|^2$, where $f(x, y) = 1$.
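The local-period construction can be illustrated with a short Python sketch. The characteristic function used here, $F = 3(x^2 + y^2)$ with $x$ and $y$ in millimeters, is chosen purely for illustration, and the helper name `local_period` is hypothetical. The code returns the fractional part $f(x_0, y_0)$, the period vector $\nabla F/|\nabla F|^2$, and the linearized end points of the groove that contains $(x_0, y_0)$.

```python
import math

def F(x, y):
    # Illustrative characteristic function: F = 3(x^2 + y^2), x and y in mm.
    return 3.0 * (x * x + y * y)

def grad_F(x, y):
    # Analytic gradient of the illustrative F above.
    return 6.0 * x, 6.0 * y

def local_period(x0, y0):
    """Return f(x0, y0), the period vector, and the linearized ends of the groove."""
    gx, gy = grad_F(x0, y0)
    g2 = gx * gx + gy * gy                       # |grad F|^2
    f0 = F(x0, y0) - math.floor(F(x0, y0))       # fractional part of F
    period = (gx / g2, gy / g2)                  # (Delta x, Delta y) of Eq. (25.2)
    start = (x0 - f0 * period[0], y0 - f0 * period[1])                 # where f = 0
    end = (x0 + (1.0 - f0) * period[0], y0 + (1.0 - f0) * period[1])   # where f = 1
    return f0, period, start, end

f0, period, start, end = local_period(1.5, 0.0)
print(f0, period, start, end)
print(F(*start), F(*end))   # close to, but not exactly, consecutive integers
```

The values of $F$ at the computed end points differ slightly from integers because the construction is only a first-order (linear) approximation of $F$ around $(x_0, y_0)$; the discrepancy shrinks as the grooves become narrow compared with the scale over which $\nabla F$ varies.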

Since $n$ is assumed to be large, inside the coating layer of Figure 25.1(a) the ray travels along the Z-axis and acquires an extra phase $\Psi(x, y) = 2\pi(n - n_1)\,t(x, y)/\lambda_0 = 2\pi(\lambda_c/\lambda_0)F(x, y)$. As long as $\lambda_0 = \lambda_c$, the truncation of $F(x, y)$, i.e., removal of its integer part, does not affect the acquired phase shift $\Psi(x, y)$; in other words, eliminating $2\pi$ multiples does not change the transmitted beam's phase profile. However, when $\lambda_0 \neq \lambda_c$, the XY-plane may be divided into segments, defined by the contours of truncation, where the phase of the transmitted beam over each segment differs from $\Psi(x, y)$ by some integer multiple of $2\pi(\lambda_c/\lambda_0)$; the DOE thus modulates the incident phase by $\psi(x, y) = 2\pi(\lambda_c/\lambda_0) f(x, y)$. In the vicinity of an arbitrary point $(x_0, y_0)$, considering the local periodicity of the grating along the direction $\nabla F$, the modulating phase function $\exp[i\psi(x, y)]$ may be expanded in the following (one-dimensional) Fourier series:

\[ \exp[i2\pi(\lambda_c/\lambda_0) f(x, y)] = \sum_m C_m \exp\{ i2\pi m\,[(\partial F/\partial x)(x - x_0) + (\partial F/\partial y)(y - y_0)] \}, \qquad (25.3a) \]

where the Fourier coefficients are given by

\[ C_m = |\nabla F| \int \exp[i2\pi(\lambda_c/\lambda_0) f(x, y)]\, \exp(-i2\pi m |\nabla F| s)\, ds. \qquad (25.3b) \]

In Eq. (25.3b), the one-dimensional integral is taken in the XY-plane along a straight line segment drawn parallel to $\nabla F$ through $(x_0, y_0)$; the range of integration, starting at $(x, y) = (x_0, y_0) - f(x_0, y_0)\,\nabla F/|\nabla F|^2$ and ending at $(x, y) = (x_0, y_0) + [1 - f(x_0, y_0)]\,\nabla F/|\nabla F|^2$, covers one full period of the grating; see Figure 25.2. Expanding $f(x, y)$ to first order in Taylor series yields

\[ f(x, y) = f(x_0, y_0) + (\partial F/\partial x)(x - x_0) + (\partial F/\partial y)(y - y_0). \qquad (25.4) \]

Substituting for $f(x, y)$ in Eq. (25.3b) from Eq. (25.4) and carrying out the integration, we find

\[ C_m = \exp[i2\pi m f(x_0, y_0)]\, \exp\{ i\pi[(\lambda_c/\lambda_0) - m] \}\, \mathrm{sinc}[(\lambda_c/\lambda_0) - m], \qquad (25.5) \]

where $\mathrm{sinc}(x) = \sin(\pi x)/\pi x$. The $m$th order diffracted beam is thus found to have the constant amplitude $|C_m| = |\mathrm{sinc}[(\lambda_c/\lambda_0) - m]|$ across the XY-plane for any given $\lambda_0$. When $\lambda_0$ happens to be the same as the construction wavelength $\lambda_c$, the first order beam will have 100% efficiency while all other orders vanish. Also, if $\lambda_c$ is an integer multiple of $\lambda_0$, only one order will emerge, unattenuated, from the DOE. For all other values of $\lambda_0$, the various orders $m = 0, \pm 1, \pm 2$, etc. will coexist. The second term in Eq. (25.5) corresponds to a constant phase, $\pi[(\lambda_c/\lambda_0) - m]$, which is independent of $(x_0, y_0)$ and may thus be ignored in practice. The remaining phase, $2\pi m f(x_0, y_0)$, varies continuously across the XY-plane with absolutely no dependence on $\lambda_0$. Since $f(x_0, y_0)$ is the fractional part of $F(x_0, y_0)$, the two functions may be exchanged and the phase acquired by the $m$th order rays written as $2\pi m F(x_0, y_0)$. In practice the lack of any discontinuous jumps in this phase profile of the $m$th order beam is extremely important, since it means that the wavefront associated with each and every diffraction order is well-behaved. In other words, if one assembles all the $m$th order rays from across the DOE to construct the $m$th order transmitted beam, the beam will have a continuous wavefront.

The transmitted wavefront around $(x_0, y_0)$, the foot of the incident ray, can now be written

\[ A(x, y) = \sum_m A'_m \exp[i(2\pi n_2/\lambda_0)(x\sigma'_{xm} + y\sigma'_{ym})] = A_0 \exp[i(2\pi n_1/\lambda_0)(x\sigma_x + y\sigma_y)]\, \exp[i\psi(x, y)] \]
\[ \qquad = \sum_m C_m A_0 \exp\{ i(2\pi/\lambda_0)\,[(n_1\sigma_x + m\lambda_0\, \partial F/\partial x)\,x + (n_1\sigma_y + m\lambda_0\, \partial F/\partial y)\,y] \}. \qquad (25.6) \]

The (complex) amplitude and the direction of the $m$th order transmitted ray are thus given by

\[ A'_m = C_m A_0, \qquad (25.7a) \]
\[ (\sigma'_x, \sigma'_y)_m = (n_1\sigma_x + m\lambda_0\, \partial F/\partial x,\ \ n_1\sigma_y + m\lambda_0\, \partial F/\partial y)/n_2. \qquad (25.7b) \]

Note that the mismatch between the refractive indices $n_1$, $n$, and $n_2$ is not taken into consideration in Eq. (25.7a) as far as reflection losses at the various interfaces are concerned. Also ignored in this analysis are the effects of incident polarization on the transmission coefficient $C_m$, which would have required a rigorous vector diffraction treatment. For $m \neq 0$, the direction of the $m$th order transmitted ray, $(\sigma'_x, \sigma'_y)_m$, is seen from Eq. (25.7b) to depend on the illumination wavelength $\lambda_0$ in a way that gives rise to a substantial amount of chromatic aberration; this provides the basis for correcting the chromatic aberrations of conventional refractive lenses by incorporating diffractive optical elements in the so-called hybrid designs. In going from medium 1 to medium 2 of Figure 25.1, the undiffracted 0th-order ray follows Snell's law since, according to Eq. (25.7b), $(n_2\sigma'_x, n_2\sigma'_y)_0 = (n_1\sigma_x, n_1\sigma_y)$. For other diffraction orders, one must add $m\lambda_0\nabla F$ to the incident beam's $(n_1\sigma_x, n_1\sigma_y)$ in order to obtain the transmitted beam's $(n_2\sigma'_x, n_2\sigma'_y)_m$.
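The closed form of Eq. (25.5) can be checked numerically against the defining integral of Eq. (25.3b). The Python sketch below does this with a simple midpoint rule, using the substitution $u = f(x, y)$ so that one period corresponds to $u$ running from 0 to 1. The function names are hypothetical, and the wavelength pair ($\lambda_c = 0.55\ \mu$m, $\lambda_0 = 0.65\ \mu$m) is used here only as a convenient example of $\lambda_0 \neq \lambda_c$.

```python
import cmath
import math

def sinc(x):
    """sinc(x) = sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def order_coefficient(m, ratio, f0=0.0):
    """C_m of Eq. (25.5); ratio = lambda_c / lambda_0, f0 = f(x0, y0)."""
    d = ratio - m
    return cmath.exp(2j * math.pi * m * f0) * cmath.exp(1j * math.pi * d) * sinc(d)

def order_coefficient_numeric(m, ratio, f0=0.0, steps=20000):
    """Midpoint-rule evaluation of Eq. (25.3b) over one period (u = f runs 0..1)."""
    total = 0j
    for k in range(steps):
        u = (k + 0.5) / steps
        total += cmath.exp(2j * math.pi * ratio * u) * cmath.exp(-2j * math.pi * m * (u - f0))
    return total / steps

ratio = 0.55 / 0.65          # lambda_c / lambda_0
for m in (0, 1, 2):
    efficiency = abs(order_coefficient(m, ratio)) ** 2
    print(m, round(efficiency, 4))
```

In this example the +first order carries most of the power while the zeroth and second orders pick up the remainder; summing $|C_m|^2$ over all orders returns unity, as expected for a pure phase modulation.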


Having exploited the localized ray picture to build the transmitted wavefront(s) across the DOE surface, we now abandon the rays and concentrate instead on the transmitted wavefronts (one for each diffracted order). When the incident wavelength $\lambda_0$ differs from the construction wavelength $\lambda_c$, the various orders will be present in the mix in different amounts, with the magnitude of the $m$th beam, $|C_m|$, being a function of $m$ and the wavelength ratio $\lambda_c/\lambda_0$. Although the phase profile of each diffracted order is independent of the incident wavelength $\lambda_0$, this does not imply that a given diffracted order behaves identically in response to different incident wavelengths. Remember that the $m$th order phase profile is $\exp[i2\pi m F(x, y)]$, so, for simplicity's sake, let us assume that $F(x, y) = ax + by$, where $a$ and $b$ are arbitrary constants. This phase profile may then be written as $\exp[i(2\pi/\lambda)(m\lambda a x + m\lambda b y)]$, where $\lambda = \lambda_0/n_2$ is the wavelength within the medium of refractive index $n_2$. This represents a plane wave having direction cosines $(\sigma_x, \sigma_y) = (m\lambda a, m\lambda b)$, whose propagation direction evidently depends on $\lambda_0$, even though its phase profile is independent of the incident wavelength. The bottom line is that the rays and the wavefronts that emerge from the above analysis paint a consistent picture, both leading to the same conclusions concerning the diffraction efficiency and the chromatic aberrations associated with each diffracted order of the transmitted beam.

Reflective diffractive optical element

The arguments of the preceding section may be extended to cover the case of an ideal reflective DOE shown in Figure 25.3. As before, the incidence medium has refractive index $n_1$, but the DOE's substrate is a perfect reflector. We assume once again that the variable-thickness layer has a large refractive index $n$ and a correspondingly small thickness $t(x, y)$. The optical path difference upon transmission through the layer and reflection at the substrate interface is thus given by OPD $= 2(n - n_1)\,t(x, y)$, which yields the characteristic function $F(x, y) = 2(n - n_1)\,t(x, y)/\lambda_c$, with $\lambda_c$ being the construction wavelength. Once again, the DOE is constructed from the above coating layer by reducing the layer's thickness whenever $F(x, y)$ exceeds unity. Note that the above factor of 2 in the expression for the OPD, which represents the effect of the double pass through the coating layer, does not affect any of the subsequent results, since the starting point of our derivations is the function $F(x, y)$, which already incorporates this factor.

Figure 25.3 The case of a reflective DOE differs from that of a transmissive DOE in that the transparent substrate is now replaced with a perfect reflector. The incident rays, after traveling through the coating layer, bouncing back at the substrate interface, and returning through the same thickness of the coating layer, re-emerge into the incidence medium (refractive index $= n_1$). The DOE is constructed from the coating layer by removing from $t(x, y)$ all integer multiples of $\tfrac{1}{2}\lambda_c/(n - n_1)$.

Figure 25.4 A surface of revolution around the z-axis is defined by its sag $h(r)$, which is the distance of the surface (along z) from the plane tangent to the surface at its vertex. The curvilinear coordinate $s$ follows the tangent to the surface in the rz-plane. The value of $s$ at each point is the length of the curve measured from some point of reference, such as the vertex at $(r, z) = (0, 0)$. Also shown is a pair of incident and refracted rays at the surface.

The formal derivations for a reflective DOE parallel those of the transmissive DOE in the preceding section, until we reach Eq. (25.6), at which point the refractive index $n_2$ of the medium into which the beam emerges (upon transmission through the DOE) must be replaced with $n_1$, reflecting the fact that the incidence and emergence media are now the same. Therefore, for reflective DOEs, the only equation that needs to be modified is Eq. (25.7b), which assumes the following form:

\[ (\sigma'_x, \sigma'_y)_m = (\sigma_x + (m\lambda_0/n_1)\, \partial F/\partial x,\ \ \sigma_y + (m\lambda_0/n_1)\, \partial F/\partial y). \qquad (25.8) \]
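To make the wavelength dependence of the diffracted ray directions concrete, the following Python sketch evaluates Eq. (25.7b) for a transmissive DOE and Eq. (25.8) for a reflective one. The function names, the local gradient of 400 mm⁻¹, and the refractive indices are all hypothetical values picked for illustration; the wavelength must be expressed in the same length unit as the coordinates of $F$.

```python
import math

def transmitted_direction(sx, sy, grad_F, m, lam0, n1, n2):
    """Eq. (25.7b): (sigma'_x, sigma'_y) of the mth transmitted order."""
    return ((n1 * sx + m * lam0 * grad_F[0]) / n2,
            (n1 * sy + m * lam0 * grad_F[1]) / n2)

def reflected_direction(sx, sy, grad_F, m, lam0, n1):
    """Eq. (25.8): (sigma'_x, sigma'_y) of the mth reflected order."""
    return (sx + m * lam0 * grad_F[0] / n1,
            sy + m * lam0 * grad_F[1] / n1)

# Hypothetical local gradient: 400 grooves per mm along x (local period 2.5 um).
grad_F = (400.0, 0.0)          # in 1/mm
for lam0_um in (0.55, 0.65):
    lam0 = lam0_um * 1e-3      # wavelength in mm, to match the units of grad_F
    tx, _ = transmitted_direction(0.0, 0.0, grad_F, 1, lam0, n1=1.0, n2=1.5)
    rx, _ = reflected_direction(0.0, 0.0, grad_F, 1, lam0, n1=1.0)
    print(lam0_um, math.degrees(math.asin(tx)), math.degrees(math.asin(rx)))
```

For a normally incident ray the first-order direction cosine scales linearly with $\lambda_0$, which is the chromatic dependence discussed above; setting $m = 0$ in `transmitted_direction` reproduces Snell's law.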

All the considerations discussed in the case of transmissive DOEs apply equally to reflective elements.

Figure 25.5 Gaussian beam ($\lambda_0 = 0.66\ \mu$m, $e^{-1}$ radius $R_0 = 2.5$ mm, diameter $D = 4.0$ mm) is focused by a 4.0 mm diameter lens (thickness = 1.7 mm, refractive index = 1.540 44, first surface: radius of curvature $R_c = 11.4$ mm, conic constant $\kappa = 0.733$, aspheric coefficients $A_4 = 2.82 \times 10^{-7}$, $A_6 = 3.75 \times 10^{-8}$, $A_8 = 1.5 \times 10^{-9}$; second surface: $R_c = 98$ mm). The incident beam, linearly polarized along the x-axis, has the intensity profile shown on the left-hand side. The glass plate ($d_1 = 0.61$ mm), the cover slip ($d_2 = 0.5$ mm), and the substrate ($d_3 = 2.0$ mm) all have the same refractive index $n = 1.520\,168$. The glass plate is 1.0 mm away from the lens and 14.38 mm away from the cover slip. The destination plane is at $z = 10.0$ mm (measured from the first vertex of the lens), and is tilted by $\theta = 6.03^\circ$, as shown. The beam is subsequently propagated a distance of 10.468 mm along the normal to the destination plane, which brings the beam to its plane of best focus.

Figure 25.6 Distributions of intensity (top) and phase (bottom) at the destination plane in the system of Figure 25.5; from left to right, x-, y-, and z-components of polarization. Note that the emergent beam is centered at $x = -3.6$ mm. The peak intensities are in the ratio of $I_x : I_y : I_z = 1.0 : 0.39 \times 10^{-3} : 0.13$. In the residual phase profiles $\phi_x$, $\phi_y$, $\phi_z$, where the wavefront curvature and tilt are factored out, the color spectrum in each plot covers the range from minimum (blue) to maximum (red); here (min : max) is (0° : 39°) for $\phi_x$, (−147° : 39°) for $\phi_y$, and (−146° : 0°) for $\phi_z$.

Figure 25.7 Plots of log-intensity (top), intensity (middle), and phase (bottom) at the plane of best focus in the system of Figure 25.5. From left to right: x-, y-, and z-components of polarization. The peak intensities are in the ratio of $I_x : I_y : I_z = 1.0 : 0.15 \times 10^{-3} : 0.115$. The phase profiles' range (blue to red) is (min : max) = (−180° : 180°).

DOE on a curved surface

Curved surfaces may also be coated with DOEs, and the method of calculating reflected/transmitted rays is essentially the same as that described in conjunction with flat surfaces in the preceding sections. The reason is that all such calculations are based on the properties of the surface and of the incident and emergent rays over small patches, where curved surfaces are flat locally. The only complication arises from the fact that the DOE's characteristic function is usually defined with respect to a coordinate system whose axes do not follow the profile of the surface. We limit the present discussion to the case of a curved surface of revolution, such as that in Figure 25.4, where the axis of symmetry is z, and the sag is a given function $h(r)$ of $r$. The characteristic function of such a DOE is usually defined by a radial polynomial,

\[ F(r) = \sum_{n=0}^{N} a_n r^n. \qquad (25.9) \]

Consider the local surface coordinate $s$ shown in Figure 25.4. The value of $s$ at each point on the surface is the length of the curve measured from some point of reference such as the vertex at $(r, z) = (0, 0)$. What we need is the characteristic function's gradient over a short distance $\Delta s$, namely, $\Delta F/\Delta s$. But

\[ \Delta s = \sqrt{(\Delta r)^2 + (\Delta h)^2} = \Delta r\, \sqrt{1 + (dh/dr)^2}. \qquad (25.10) \]

Therefore,

\[ \partial F/\partial s = \frac{\partial F/\partial r}{\sqrt{1 + (dh/dr)^2}}. \qquad (25.11) \]

Equation (25.11), in conjunction with the equations derived previously for flat surfaces, is all that one needs in order to compute the various diffracted rays and wavefronts associated with DOEs on curved substrates.
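A minimal Python sketch of Eq. (25.11) follows. The radial polynomial and the sag are illustrative assumptions: the surface is taken to be a simple paraboloid $h(r) = r^2/(2R_c)$ with $R_c = 1.93$ mm, and the function name `dF_ds` is hypothetical.

```python
import math

def dF_ds(radial_coeffs, sag_slope, r):
    """Eq. (25.11): gradient of F(r) = sum a_n r^n along the surface coordinate s."""
    dF_dr = sum(n * a * r**(n - 1) for n, a in enumerate(radial_coeffs) if n >= 1)
    return dF_dr / math.sqrt(1.0 + sag_slope(r)**2)

# Illustrative radial polynomial: F(r) = 4.2 r^2 - 2.5 r^4 + 0.25 r^6 (r in mm).
coeffs = [0.0, 0.0, 4.2, 0.0, -2.5, 0.0, 0.25]

# Assumed sag: a paraboloid h(r) = r^2 / (2 Rc), so dh/dr = r / Rc.
Rc = 1.93
slope = lambda r: r / Rc

r = 0.5
value = dF_ds(coeffs, slope, r)
print(value)
```

The division by $\sqrt{1 + (dh/dr)^2}$ simply accounts for the fact that a step $\Delta r$ in the radial direction corresponds to a longer step $\Delta s$ along the tilted surface, so the grooves appear slightly wider when measured along the surface than when projected onto the plane normal to the axis.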

Figure 25.8 A linearly polarized Gaussian beam enters a glass prism of refractive index $n = 1.65$ whose rear facet is coated with a DOE. The incident beam's intensity profile is shown on the left-hand side. The emergent diffracted beam is the +first order. The entrance and exit facets of the prism are antireflection-coated, and the destination plane is a distance $\Delta y = 10$ mm below the prism's exit facet.

Figure 25.9 Plots of intensity (top), phase (middle) and phase minus curvature (bottom) at the destination plane of the system of Figure 25.8. From left to right: x-, y-, and z-components of polarization. The peak intensities are in the ratio of $I_x : I_y : I_z = 10^5 : 0.4 : 1.08$. The range of the phase profiles (blue to red) is (min : max) = (−180° : 180°).

Transmissive DOE sandwiched between a pair of flat substrates

Figure 25.5 shows an aspheric lens illuminated with a Gaussian beam ($\lambda_0 = 0.66\ \mu$m, $e^{-1}$ radius $R_0 = 2.5$ mm, diameter $D = 4.0$ mm). The emerging convergent beam passes through a glass plate on its way to a flat DOE sandwiched between a substrate and a cover slip. The DOE's construction wavelength $\lambda_c$ is the same as $\lambda_0$ (hence the emergent beam is the first diffracted order), and its phase profile is given by ($x$ and $y$ in millimeters):

\[ F(x, y) = 639.77\,x + 17.47\,x^2 - 19.76\,y^2 - 30.18\,x^3 - 0.0042\,x^2 y - 33.69\,x y^2 + 0.0021\,y^3 - 3.25\,x^4. \]

The incident rays are traced through the entire system, then back-traced to the so-called destination plane, located at $z = 10$ mm from the first vertex of the lens and tilted by $\theta = 6.03^\circ$, as shown. At the destination plane, the magnitude, phase, and polarization state of the rays are used to reconstruct the wavefront. Figure 25.6 shows the reconstructed wavefront's intensity and phase distribution at the destination plane. The wavefront's curvature and tilt are factored out; otherwise the phase variations across the cross-sectional profiles would be too great to display. Note that the y-component is nearly four orders of magnitude weaker than the x-component, whereas the z-component's power content is non-negligible. The phase profiles of Figure 25.6 are quite uniform, corresponding to a small residual aberration with an r.m.s. wavefront error of $0.003\lambda_0$. Figure 25.7 shows plots of log-intensity, intensity, and phase in the plane of best focus for the x-, y-, and z-components of polarization. Note that the y-component is nearly four orders of magnitude weaker than the x-component, whereas the z-component is fairly strong. The observed linear phase profile is due to the 6.03° tilt of the focal plane relative to the incident beam coordinates (see the focal plane coordinates in Figure 25.5).
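As a rough order-of-magnitude check on this example, one may evaluate the local grating period and the first-order deflection at the center of the DOE, where only the linear term of the polynomial contributes to $\nabla F$. The estimate below assumes the linear coefficient of 639.77 mm⁻¹ as printed and uses Eqs. (25.2) and (25.7b); it is a sketch of the scale involved, not a substitute for the ray trace:

\[ |\nabla F|_{(0,0)} = 639.77\ \mathrm{mm}^{-1} \ \Rightarrow\ \frac{1}{|\nabla F|} \approx 1.563\ \mu\mathrm{m}, \qquad m\lambda_0\,|\nabla F| = 1 \times (0.66 \times 10^{-3}\ \mathrm{mm}) \times 639.77\ \mathrm{mm}^{-1} \approx 0.422. \]

The grooves near the center of this DOE are therefore about 1.56 μm wide, and the first-order ray passing through the center has the x-component of $n\boldsymbol{\sigma}$ changed by approximately 0.42.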

Figure 25.10 A linearly polarized Gaussian beam is focused via a DOE-coated bi-aspheric lens through a glass cover slip ($d = 1.2$ mm, $n = 1.573\,456$), which is separated from the lens by 1.0 mm. The incident beam's intensity profile is shown on the left-hand side. The 3 mm diameter lens has thickness = 1.8256 mm, refractive index = 1.597 075, first surface parameters: radius of curvature $R_c = 1.93$ mm, conic constant $\kappa = 0.655\,844$, aspheric coefficients $A_4 = 2.833 \times 10^{-3}$, $A_6 = 4.389 \times 10^{-5}$, $A_8 = 1.524 \times 10^{-4}$, $A_{10} = 1.177 \times 10^{-4}$; and second surface parameters: $R_c = 6.744$ mm, $\kappa = 31.754$, $A_4 = 7.358 \times 10^{-3}$, $A_6 = 2.5077 \times 10^{-3}$, $A_8 = 1.106 \times 10^{-3}$, $A_{10} = 3.871 \times 10^{-4}$. The destination plane is at the exit pupil of the aspheric singlet, and the beam is subsequently propagated to the focal plane.

Reflective DOE on flat substrate Figure 25.8 shows a flat DOE on the rear facet of a glass prism, illuminated by a Gaussian beam (k0 ¼ 0.65 lm, e1 radius R0 ¼ 2.0 mm, diameter D ¼ 3.0 mm, linearly polarized along x). The only emergent beam is the þfirst diffracted order, as the DOE’s construction wavelength kc is the same as k0. The DOE’s aperture diameter is 5.0 mm, and its phase profile within its own plane is F(x, y) ¼ 3.0(x2 þ y2); here both x and y are in millimeters. Figure 25.9 shows the reflected intensity and phase profiles at the destination plane. These plots depict intensity (top), phase (middle), and phase-minus-curvature (bottom), with the x-, y-, z-components of polarization shown from left to right. Note that the y- and z-components are several orders of magnitude weaker than the x-component. The DOE’s 45 tilt produces the astigmatism seen in the phase plots. Transmissive DOE on an aspheric glass lens Figure 25.10 shows a DOE-coated aspheric lens illuminated with a Gaussian beam (k0 ¼ 0.78 lm, e1 radius R0 ¼ 2.0 mm, diameter D ¼ 3.0 mm, linearly polarized along x). The DOE’s phase profile is F(r) ¼ 4.2r2 – 2.5r4 þ 0.25r6 (r in mm),

Figure 25.11 Plots of intensity (top) and phase (bottom) at the exit pupil of the aspheric lens in the system of Figure 25.10. From left to right: x-, y-, and z-components of polarization. The peak intensities are in the ratio Ix : Iy : Iz = 1000 : 0.6 : 70. The range of the phase profiles (blue to red) is (min : max) = (−180° : 180°).

and its construction wavelength λc is the same as λ₀; hence the emergent beam is the +first diffracted order. The incident rays are first traced through the entire system, then back-traced to the destination plane located at the exit pupil of the objective lens; the emergent wavefront is subsequently reconstructed from the traced rays. Figure 25.11 shows plots of intensity and phase at the destination plane. Shown from left to right are the x-, y-, and z-components of polarization. The curvature of the wavefront has been factored out, so what is displayed is the residual phase or aberrations. Note that the y-component is nearly three orders of magnitude weaker than the x-component, but the z-component is not so weak. The wavefront at the exit pupil is then propagated to the focal plane and shown in Figure 25.12, where the y-component of polarization is seen to be more than three orders of magnitude weaker than the x-component.

Reflective DOE on a parabolic mirror

Figure 25.13 shows the diagram of a DOE-coated parabolic mirror illuminated with a Gaussian beam (λ₀ = 0.65 μm, 1/e radius R₀ = 2.0 mm, diameter D = 3.0 mm,

Figure 25.12 Intensity distribution at the focal plane of the lens in the system of Figure 25.10; (left) x-component, (right) y-component of polarization. The peak intensities are in the ratio Ix : Iy = 1000 : 0.27.

Figure 25.13 A Gaussian beam is reflected from a DOE-coated parabolic mirror. The incident beam, linearly polarized along the x-axis, has the intensity profile shown on the left-hand side. Since the DOE’s construction wavelength λc is 0.55 μm, various diffracted orders exist, although the most intense beam, shown in Figure 25.14, is the +first order. The destination plane is a distance Δz = 10.0 mm from the vertex of the paraboloid.

linearly polarized along the x-axis). The paraboloid has radius of curvature Rc = 40 mm, conic constant κ = −1, and aperture diameter D = 3.0 mm; the DOE’s phase profile is given by Φ(r) = r² − 1.25r⁴ + 0.35r⁶ + 0.1r⁸ (r in millimeters). Since the DOE’s construction wavelength is λc = 0.55 μm, various diffracted orders exist, although the most intense beam, shown in Figure 25.14, is the +first order. Figure 25.14 shows the reflected intensity and phase profiles at the destination plane, located 10.0 mm away from the mirror’s vertex; this also happens to be 10.0 mm before the mirror’s nominal focal plane. From left to right,

Figure 25.14 Plots of intensity (top) and phase (bottom) at the destination plane of the system of Figure 25.13. From left to right: x-, y-, and z-components of polarization. The peak intensities are in the ratio Ix : Iy : Iz = 1.0 : 0.33 × 10⁻⁶ : 0.177 × 10⁻². The range of the phase profiles (blue to red) is (min : max) = (−180° : 180°). For display purposes the curvature phase factor has been taken out of the mesh.

these plots represent the x-, y-, and z-components of polarization. Note that the y-component is nearly six orders of magnitude weaker than the x-component, whereas the z-component is only 600 times weaker.

References for Chapter 25
1 J. Turunen and F. Wyrowski, Diffractive Optics for Industrial and Commercial Applications, Akademie Verlag, Berlin, 1977.
2 W. Veldkamp and T. J. McHugh, Binary optics, Scientific American, May 1992, 50.
3 W. C. Sweatt, Describing holographic optical element as lens, J. Opt. Soc. Am. 67, 803 (1977).
4 M. W. Farn, Quantitative comparison of the general Sweatt model for the grating equation, Appl. Opt. 31, 5312 (1992).
5 L. N. Harza, Kinoform lenses: Sweatt model and phase function, Optics Communications 117, 31 (1995).
6 F. Wyrowski, Diffractive optical elements: iterative calculation of quantized, blazed phase structures, J. Opt. Soc. Am. A 7, 961 (1990).

26 The Talbot effect

The Talbot effect, also referred to as self-imaging or lensless imaging, was originally discovered in the 1830s by H. F. Talbot [1]. Over the years, investigators have come to understand different aspects of this phenomenon, and a theory of the Talbot effect based on classical diffraction theory has emerged which is capable of explaining the various observations [2,3,4]. For a detailed description of the Talbot effect and related phenomena, as well as a historical perspective on the subject, the reader may consult references 3 and 4 and further references cited therein. Since many of the standard optics textbooks do not even mention the Talbot effect, it is worthwhile to bring to the reader’s attention the essential features of this phenomenon.

Lensless imaging of a periodic pattern

The Talbot effect is observed when, under appropriate conditions, a beam of light is reflected from (or transmitted through) a periodic pattern. The pattern may have one-dimensional periodicity (as in traditional gratings), or it may exhibit periodicity in two dimensions (e.g., a surface relief structure or a photographic plate imprinted with identical features on a regular lattice). In what follows we shall present the diffraction patterns obtained from a periodic array of cross-shaped apertures in an otherwise opaque screen. Because the diffraction pattern of a single aperture differs markedly from that of a periodic array of such apertures, we begin by examining the behavior of an individual aperture under coherent illumination.

Consider the cross-shaped opening in an opaque screen shown in Figure 26.1(a). A collimated beam of coherent light, wavelength λ, illuminates the screen at normal incidence; the assumed length and height of the aperture are each 20λ. Logarithmic plots of intensity distribution at distances z = 100λ, 200λ, and 600λ beyond the screen are computed and shown in Figures 26.1(b)–(d), respectively (note the different scales of these figures). For z > 600λ the

Figure 26.1 (a) A cross-shaped aperture in an opaque screen, illuminated by a normally incident plane wave of wavelength λ. The length and the height of the cross are each 20λ. Also shown are the computed plots of intensity distribution (logarithmic) at various distances z from the aperture: (b) z = 100λ, (c) z = 200λ, (d) z = 600λ. Note that the scale varies.

intensity distribution will have the far-field pattern of Figure 26.1(d), although its size will scale with distance from the screen. Under no circumstances do we obtain an intensity pattern that closely resembles the cross shape of the aperture itself.

Now consider the periodic array of cross-shaped apertures shown in Figure 26.2(a); each aperture is identical to that in Figure 26.1(a). The center-to-center spacing between adjacent apertures along the X- and Y-directions is p = 60λ. (For simplicity we have assumed the periodic pattern to extend to infinity, although, for practical purposes, a finite number of apertures in a periodic arrangement will suffice.) When the pattern in Figure 26.2(a) is illuminated by a normally incident, coherent beam of light, the cross shape of the apertures is abundantly reproduced in the intensity patterns obtained at certain distances from the screen. Figures 26.2(b)–(f) show the computed patterns of intensity distribution at distances z = 600λ, 1200λ, 1800λ, 2700λ, and 3600λ, respectively. (Note that all pictures in Figure 26.2 have the same scale.) When the distance from object to image is z = p²/λ, as is the case in Figure 26.2(f), the original pattern of the apertures

Figure 26.2 (a) A periodic array of cross-shaped apertures in an opaque screen, illuminated by a normally incident plane wave of wavelength λ. As in Figure 26.1, the crosses are 20λ wide on each side. Also shown are the computed plots of intensity distribution at various distances z from the aperture: (b) z = 600λ, (c) z = 1200λ, (d) z = 1800λ, (e) z = 2700λ, (f) z = 3600λ. Note that the scale is the same for all the pictures.

is reproduced, albeit with a half-period shift in both the X- and the Y-direction. In Figure 26.2(d), the distance to the image is z = p²/(2λ), and not only is the original pattern replicated but also its frequency (along both X and Y) has doubled. In Figure 26.2(c), where the distance to the image is z = p²/(3λ), the pattern is repeated with three times the original frequency along both the X- and Y-axes. By showing the intensity distribution at other distances from the object, Figures 26.2(b) and 26.2(e) emphasize that perfect reproduction of the shapes in the original pattern does not occur everywhere but only at certain special planes.

A hint as to why these periodic patterns are reproduced at certain intervals may be gleaned from the following argument. A plane wave normally incident on a periodic structure creates a discrete spectrum of plane waves propagating along the directions

$$\mathbf{k} = (k_x, k_y, k_z) = 2\pi\left(m/p,\; n/p,\; \sqrt{(1/\lambda)^2 - (m/p)^2 - (n/p)^2}\,\right). \tag{26.1}$$

The z-component of this vector may be approximated as follows:

$$k_z \approx (2\pi/\lambda)\left[1 - \tfrac{1}{2}(m\lambda/p)^2 - \tfrac{1}{2}(n\lambda/p)^2\right], \tag{26.2}$$

provided that p/λ is large enough that, for all m, n values of interest, the above Taylor-series expansion to first order suffices. The acquired phase after a propagation distance of z will then be

$$k_z z \approx (2\pi z/\lambda) - \pi z (m^2 + n^2)\lambda/p^2. \tag{26.3}$$

Now, since m, n are integers, if z happens to be an even-integer multiple of p²/λ then the above phase will differ from the constant value 2πz/λ by a multiple of 2π only. Since all plane waves emanating from the object will thus arrive at the image plane with the same phase factor, their superposition will recreate the original pattern. It turns out that z does not need to be an even-integer multiple of p²/λ for self-imaging to occur. At odd-integer multiples of p²/λ, for instance, a replica of the original pattern will also emerge, but with a half-period shift. Multiple images of the pattern will appear at certain non-integer multiples of p²/λ as well. These aspects of the Talbot effect will be further clarified below, when we present a more rigorous analysis.

Although the mathematical argument supporting the Talbot effect depends on periodicity of the object in the XY-plane, certain patterns that are not globally periodic, but appear to be so locally, will also produce self-images. For example, the concentric ring pattern shown in Figure 26.3(a), when illuminated by a normally incident coherent beam, will yield the patterns of Figures 26.3(b)–(d) at distances z = 18λ, 27λ, and 36λ, respectively. The period p of the rings is 6λ and the width of the bright rings is 2λ. Clearly, the self-images break down near the center and near the outer edge, because (local) periodicity is no longer valid in these regions. But a near self-image at z = p²/λ and a frequency-doubled image at z = p²/(2λ) are clearly observed. Another example is shown in Figure 26.4, where a spiral pattern with period p = 9λ is propagated to distances z = p²/(2λ), 3p²/(4λ), and p²/λ. Again in Figures 26.4(b), (d) the center and the outer rings are not well reproduced, but nearly everything else is.

The Talbot effect is much more general than the above limited exposition may indicate. The pattern periodicities may be in one or two dimensions; the object may modulate both the amplitude and the phase of the light beam; certain applications rely on the use of incoherent light sources; in the case of two-dimensional periodic patterns, the underlying lattice may be square, rectangular, hexagonal, etc.; the incident beam may be a plane wave or a spherical wavefront originating at a point source; applications are not limited to visible light but extend to X-rays and microwaves, as well as to electron and atom optics. To appreciate the variety of arrangements that lead to useful and interesting images the reader is encouraged to consult the published literature [2].
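The phase argument of Eq. (26.3) can be checked numerically with a short sketch. The code below is a one-dimensional illustration only, not the simulation used for the figures: it assumes a hypothetical slit grating (width 20λ, period 60λ), keeps only the paraxial phase of each Fourier order, and uses NumPy's FFT routines. The function and variable names are our own.

```python
import numpy as np

def talbot_propagate(field, period, wavelength, z):
    """Propagate one period of a 1-D periodic field by a distance z.

    Each Fourier order m is multiplied by the paraxial phase of Eq. (26.3),
    exp(-i*pi*z*m^2*wavelength/period^2); the common factor
    exp(i*2*pi*z/wavelength) is dropped because it does not affect intensity.
    """
    n = field.size
    coeffs = np.fft.fft(field)            # unnormalized Fourier coefficients
    m = np.fft.fftfreq(n, d=1.0 / n)      # integer diffraction orders
    phase = np.exp(-1j * np.pi * z * m**2 * wavelength / period**2)
    return np.fft.ifft(coeffs * phase)

# Hypothetical 1-D analogue of Figure 26.2 (lengths in units of the wavelength).
wavelength, period, n = 1.0, 60.0, 240
x = np.arange(n) * period / n
mask = (np.abs(x - period / 2) < 10.0).astype(float)   # slit of width 20

z_T = period**2 / wavelength
full = talbot_propagate(mask, period, wavelength, 2 * z_T)     # even multiple
shifted = talbot_propagate(mask, period, wavelength, z_T)      # odd multiple
doubled = talbot_propagate(mask, period, wavelength, z_T / 2)  # half multiple
```

Under these assumptions, `full` reproduces the mask, `shifted` is the mask displaced by half a period, and the intensity of `doubled` contains two half-strength slits per period, which is the one-dimensional counterpart of the frequency doubling seen at z = p²/(2λ).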

Figure 26.3 (a) A mask consisting of eight concentric rings (width = 2λ, spacing = 6λ) is illuminated by a normally incident plane wave of wavelength λ. The computed intensity distributions shown here are at distances of (b) z = 18λ, (c) z = 27λ, and (d) z = 36λ from the mask. A bright spike appearing in the central region of each image has been blocked off in order to improve the image contrast.

A simple analysis

Consider the point source shown in Figure 26.5, located at (x, y, z) = (x₀, y₀, 0) and radiating a spherical wavefront into the region z > 0 of space. In this analysis we assume that all spatial dimensions are normalized by the vacuum wavelength λ of the light; as a result, λ will not appear explicitly in any of the following equations. In the z = z₀ plane, the complex-amplitude distribution may be written

$$\begin{aligned}
A(x, y, z = z_0) &= (1/r)\exp(i2\pi r) \\
&= \frac{1}{\sqrt{(x - x_0)^2 + (y - y_0)^2 + z_0^2}}\,\exp\!\left[i2\pi\sqrt{(x - x_0)^2 + (y - y_0)^2 + z_0^2}\,\right] \\
&\approx (1/z_0)\exp(i2\pi z_0) \times \exp[i\pi(x^2 + y^2)/z_0] \times \exp[i\pi(x_0^2 + y_0^2)/z_0] \times \exp[-i2\pi(xx_0 + yy_0)/z_0].
\end{aligned} \tag{26.4}$$

Figure 26.4 (a) A mask consisting of a spiral aperture (width 3λ, spacing 9λ) is illuminated by a normally incident plane wave of wavelength λ. The computed intensity distributions shown here are at distances of (b) z = 40.5λ, (c) z = 60.75λ, and (d) z = 81λ from the aperture. As in the previous figure, a bright spike appearing in the central region of each image has been blocked off in order to improve the image contrast.

Figure 26.5 A quasi-monochromatic point source located at (x, y, z) = (x₀, y₀, 0) radiates a cone of light into the half-space z > 0.

In deriving the above approximate expression we have used, for the exponent, the leading terms of the Taylor series expansion

$$\sqrt{1 + x^2} = 1 + \tfrac{1}{2}x^2 + \cdots \tag{26.5}$$
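For readers who want the intermediate algebra, the expansion leading to Eq. (26.4) may be sketched as follows. The shorthand ρ is introduced here for convenience only, and the approximation assumes ρ ≪ z₀; the minus sign of the cross term follows directly from expanding the squares.

```latex
\begin{aligned}
r &= \sqrt{z_0^2 + \rho^2} = z_0\sqrt{1 + \rho^2/z_0^2} \approx z_0 + \frac{\rho^2}{2z_0},
   \qquad \rho^2 = (x - x_0)^2 + (y - y_0)^2, \\
r &\approx z_0 + \frac{x^2 + y^2}{2z_0} + \frac{x_0^2 + y_0^2}{2z_0} - \frac{xx_0 + yy_0}{z_0}, \\
\exp(i2\pi r) &\approx \exp(i2\pi z_0)\,\exp[i\pi(x^2 + y^2)/z_0]\,
   \exp[i\pi(x_0^2 + y_0^2)/z_0]\,\exp[-i2\pi(xx_0 + yy_0)/z_0].
\end{aligned}
```

In the amplitude factor 1/r the correction is neglected altogether, so that 1/r ≈ 1/z₀.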

Now, the first two terms on the right-hand side of Eq. (26.4) are the approximate form of the spherical wavefront emanating from a point source at the origin of the plane z = 0. The next term is a constant phase factor that depends on the position (x₀, y₀) of the point source within the XY-plane, and the last term is a linear phase factor in x and y.

Next, let us assume that a periodic mask, having periods $a_x$ and $a_y$ along the X- and Y-axes, is placed at z = z₀ (see Figure 26.6). In the general case, where the mask modulates the phase and/or the amplitude of the light beam, its complex-amplitude transmission function may be written

$$t(x, y) = \sum_m \sum_n C_{mn} \exp[i2\pi(mx/a_x + ny/a_y)]. \tag{26.6}$$

When the incident spherical wavefront is multiplied by t(x, y), each Fourier component of t(x, y) will create a different spherical wavefront which, according to Eq. (26.4), appears to originate at a different point $(x_0, y_0) = (-mz_0/a_x, -nz_0/a_y)$ within the XY-plane. In addition, each such point source appears to have the following phase factor:

$$\exp(i\varphi_{mn}) = \exp[-i\pi(x_0^2 + y_0^2)/z_0] = \exp[-i\pi z_0(m^2/a_x^2 + n^2/a_y^2)]. \tag{26.7}$$

The net effect of the mask, therefore, is to replace the single point source with a periodic array of point sources, as shown in Figure 26.7, where the complex amplitude of each point source is $C_{mn}\exp(i\varphi_{mn})$. At the observation plane, each point source will give rise to a spherical wavefront that will obey Eq. (26.4), except that the

Figure 26.6 A quasi-monochromatic point source located at the origin of the coordinate system illuminates a periodic phase and/or amplitude mask placed parallel to the XY-plane at z = z₀. The periods of the mask’s pattern are $a_x$ along the X-axis and $a_y$ along the Y-axis.

Figure 26.7 Interaction between the periodic mask and the cone of light shown in Figure 26.6 gives rise to an array of (virtual) point sources, each having a certain phase and amplitude depending on the structure of the mask and its location z₀ along the Z-axis. To determine the light distribution at the observation plane (located at z = z₀ + z₁) one may replace the mask by this “equivalent” array of point sources.

distance z₀ is replaced by z₀ + z₁. We thus have

$$\begin{aligned}
A(x, y, z = z_0 + z_1) \approx{}& [1/(z_0 + z_1)]\exp[i2\pi(z_0 + z_1)] \times \exp[i\pi(x^2 + y^2)/(z_0 + z_1)] \\
&\times \sum_m \sum_n C_{mn} \exp[-i\pi z_0(m^2/a_x^2 + n^2/a_y^2)] \times \exp[i\pi(m^2/a_x^2 + n^2/a_y^2)\,z_0^2/(z_0 + z_1)] \\
&\times \exp\{i2\pi[x(mz_0/a_x) + y(nz_0/a_y)]/(z_0 + z_1)\}.
\end{aligned} \tag{26.8}$$

The first two factors in the above equation correspond to a spherical wavefront with radius of curvature z₀ + z₁; we need not keep track of them any longer. The last factor can be simplified if we define a magnification factor M = (z₀ + z₁)/z₀, in which case it is written as

$$\exp\{i2\pi[mx/(Ma_x) + ny/(Ma_y)]\}. \tag{26.9}$$

This is just the (m, n)th plane-wave component of the spectrum, whose periods $a_x$, $a_y$ are magnified by a factor M. Except for this scale factor, the Fourier basis functions have not changed in going from the plane of the mask (z = z₀) to the observation plane (z = z₀ + z₁). The main factors in Eq. (26.8), therefore, are the first two factors in the double sum; these can be written as follows:

$$\exp[-i\pi(m^2/a_x^2 + n^2/a_y^2)\,z_0 z_1/(z_0 + z_1)] = \exp[-i\pi(z_1/M)(m^2/a_x^2 + n^2/a_y^2)]. \tag{26.10}$$

Let us now assume that $a_x^2$ and $a_y^2$ have a least common multiple in the following sense:

$$\mu a_x^2 = \nu a_y^2 = a^2, \tag{26.11}$$

where both μ and ν are integers. Then the phase factor in Eq. (26.10) may be written

$$\exp\{-i\pi[z_1/(Ma^2)](\mu m^2 + \nu n^2)\}. \tag{26.12}$$

Since μm² + νn² is an integer, if z₁ is chosen to be 2jMa² with j an integer, then the phase factor in Eq. (26.12) will become unity for all values of m and n and can therefore be ignored. Under such circumstances Eq. (26.8) will yield a magnified image of the mask at the observation plane. This is the essence of the Talbot effect.

By allowing z₀ to approach infinity, the above results can be readily extended to the case of plane-wave illumination. The magnification factor M will become unity in this case, but no other change will be necessary in the preceding equations.
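As a quick arithmetic check of Eqs. (26.10)–(26.12), the following sketch evaluates the phase factor for a rectangular lattice. It is an illustration under stated assumptions, not part of the original analysis: the squared periods are taken to be integers (in units of λ²) so that a common multiple a² exists, the example values are hypothetical, and the function names are our own. Because M itself depends on z₁, the condition z₁ = 2jMa² is solved for z₁ at a finite source distance z₀.

```python
import math
import numpy as np

def common_period_sq(ax2: int, ay2: int):
    """Return (a2, mu, nu) with mu*ax2 == nu*ay2 == a2, as in Eq. (26.11)."""
    a2 = ax2 * ay2 // math.gcd(ax2, ay2)
    return a2, a2 // ax2, a2 // ay2

def self_image_distance(a2, z0=math.inf, j=1):
    """Solve z1 = 2*j*M*a2 with M = (z0 + z1)/z0 for z1 (lengths in wavelengths)."""
    if math.isinf(z0):
        return 2.0 * j * a2               # plane-wave illumination, M = 1
    if z0 <= 2 * j * a2:
        raise ValueError("no real self-image: need z0 > 2*j*a2")
    return 2.0 * j * a2 * z0 / (z0 - 2 * j * a2)

def talbot_phase(m, n, ax2, ay2, z0, z1):
    """Phase factor of Eq. (26.10) for the diffraction order (m, n)."""
    eff = z1 if math.isinf(z0) else z0 * z1 / (z0 + z1)   # equals z1/M
    return np.exp(-1j * np.pi * eff * (m**2 / ax2 + n**2 / ay2))

# Hypothetical lattice: a_x = 3, a_y = 2 wavelengths, so a_x^2 = 9 and a_y^2 = 4.
a2, mu, nu = common_period_sq(9, 4)
m, n = np.meshgrid(np.arange(-5, 6), np.arange(-5, 6))
z1_plane = self_image_distance(a2)               # plane wave
z1_point = self_image_distance(a2, z0=144.0)     # point source at z0 = 144
phases_plane = talbot_phase(m, n, 9, 4, math.inf, z1_plane)
phases_point = talbot_phase(m, n, 9, 4, 144.0, z1_point)
```

For this example a² = 36 with μ = 4 and ν = 9; the self-image lies at z₁ = 72 for plane-wave illumination and at z₁ = 144 (magnification M = 2) for a point source at z₀ = 144, and in both cases every computed phase factor equals unity to within rounding.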

Image multiplicity

The appearance of multiple images at the observation plane may be readily explained in the special case where the periodicity is one dimensional and the frequency of the image is twice that of the object. The explanation, nonetheless, captures the essence of the phenomenon and can be easily extended to periodicity in two dimensions and to higher multiplicities.

Consider the periodic function f(x) shown in Figure 26.8(a). Note that the period $a_x$ is much larger than the width of the individual “features” of the function, so that there is plenty of space to insert additional features. Let the Fourier-series representation of this function be

$$f(x) = \sum_m C_m \exp(i2\pi mx/a_x). \tag{26.13}$$

In the Fourier domain, the Fourier transform F(m) of f(x) is a “comb” function with period $1/a_x$, where the delta function at position m is multiplied by the corresponding Fourier coefficient $C_m$, as shown in Figure 26.8(b). Now, let us assume that the odd coefficients of F(m) are multiplied by a complex constant β. (This would happen in Eq. (26.12), for instance, if μ = 1, ν = 0, and z₁ = ½Ma², in which case β = −i.) We can then separate the Fourier coefficients of f(x) into even and odd terms, as shown in Figure 26.9. Both the resulting comb functions in the Fourier domain will have twice the period of the

Figure 26.8 (a) A periodic function f(x) in one-dimensional space; the individual “features” of the function are much narrower than its period $a_x$. (b) The Fourier transform of f(x) consists of a sequence of delta functions located at integer multiples of $1/a_x$ in the Fourier domain.

Figure 26.9 In Figure 26.8(b), when the odd components of the Fourier-transformed function F(m) are multiplied by a constant β, the function may be resolved into two “comb” functions, Feven(m) and Fodd(m). In these new functions the spacing between adjacent delta functions is $2/a_x$ and, in the case of Fodd(m), the function is shifted by a half-period.

original comb function; therefore, their inverse transforms in the x-domain will have twice the frequency. The second comb function in Figure 26.9 is also shifted by a half-period, which means that its inverse transform must be multiplied by $\exp(i2\pi x/a_x)$. The resulting comb functions in the x-domain are shown in Figure 26.10. The net result is that when we add the two comb functions of Figure 26.10 and convolve the resultant with the unit-period function f₀(x), we will find the function shown in Figure 26.11. Because the width of f₀(x) is less than half the period $a_x$, the new features added to the function will not overlap with the old ones, yielding a function with an apparently increased frequency.

Figure 26.10 The comb function corresponding to Feven(m), when inverse-transformed to the x-domain, will yield a comb function that has twice the frequency of the original function f(x). Likewise, the inverse transform of the comb function corresponding to Fodd(m) will have a spacing of ½$a_x$ between its adjacent delta functions but, because of the half-period shift in the Fourier domain, every other delta function is flipped over.

Figure 26.11 When the sum of the two comb functions in Figure 26.10 is convolved with the individual features f₀(x) of f(x), the resulting function appears to have twice the frequency of the original f(x). Note, however, that the “features” of the new function are alternately multiplied by ½(1 + β) and ½(1 − β).
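The alternating weights ½(1 + β) and ½(1 − β) can be verified numerically. The sketch below is only an illustration: it assumes a hypothetical Gaussian feature much narrower than half the period, takes β = −i, and uses the discrete Fourier transform of one sampled period in place of the Fourier series.

```python
import numpy as np

n = 256                                   # samples per period a_x
x = np.arange(n) / n                      # x in units of a_x
f = np.exp(-((x - 0.25) / 0.03) ** 2)     # hypothetical narrow feature
beta = -1j

c = np.fft.fft(f)                                  # coefficients C_m (unnormalized)
m = np.fft.fftfreq(n, d=1.0 / n).astype(int)       # integer orders m
c_mod = np.where(m % 2 == 0, c, beta * c)          # multiply odd orders by beta
g = np.fft.ifft(c_mod)

f_shift = np.roll(f, -n // 2)                      # f(x + a_x/2)
expected = 0.5 * (1 + beta) * f + 0.5 * (1 - beta) * f_shift
```

Here `g` coincides with `expected`, and for β = −i the two weights have equal magnitude 1/√2, so that |g|² = ½[f²(x) + f²(x + ½a_x)]: the intensity shows two identical features per period even though their phases differ.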

The periodicity of this new function, however, is only in its amplitude, since the phase of each feature differs from the phase of its neighbors. In any event, this description explains why the apparent periodicity of the pattern in Figure 26.2 increases at certain distances between the object and the image.

References for Chapter 26
1 H. F. Talbot, Phil. Mag. 9, 401 (1836).
2 Lord Rayleigh, Phil. Mag. 11, 196 (1881).
3 O. Bryngdahl, Image formation using self-imaging techniques, J. Opt. Soc. Am. 63, 416–419 (1973).
4 J. F. Clauser and M. W. Reinsch, New theoretical and experimental results in Fresnel optics with applications to matter-wave and X-ray interferometry, Appl. Phys. B 54, 380–395 (1992).

27 Some quirks of total internal reflection

Readers are undoubtedly familiar with the phenomenon of total internal reflection (TIR), which occurs when a beam of light within a high-index medium arrives with a sufficiently great angle of incidence at an interface with a lower-index medium. What is generally not appreciated is the complexity of phenomena that accompany TIR. For instance, consider the simple optical setup shown in Figure 27.1, where a uniform beam of light is brought to focus by a positive lens, being reflected, somewhere along the way, at the rear facet of a glass prism. Assuming a refractive index n = 1.65 for the prism material, the critical angle of incidence is readily found to be θcrit = sin⁻¹(1/n) = 37.3°. Let the lens have numerical aperture NA = 0.2 (i.e., f-number = 2.5). Then the range of angles of incidence on the prism’s rear facet will be (33.5°, 56.5°). The majority of the rays thus suffer total internal reflection and converge, as depicted in Figure 27.1, towards a common focus in the observation plane.

Figure 27.2 shows computed plots of intensity and phase at the observation plane, indicating that the focused spot essentially has the Airy pattern, albeit with minor deviations from the ideal. The diameter of the first dark ring, for example, is approximately 6λ, which is close to the theoretical value of 1.22λ/NA for the Airy disk [1]. The coma-like tail appearing on the right-hand side of the focused spot is caused by those rays that strike the prism in the neighborhood of the critical TIR angle, θcrit, thus introducing apodization and aberration. (Apodization is due to a reduction of the reflectivity of the prism below the critical angle, and aberration is caused by deviations from linearity of phase as a function of angle of incidence.)

One noteworthy feature of the focused spot of Figure 27.2 is that it is not centered on the optical axis, but is shifted to the right by about one wavelength. This shift is known as the Goos–Hänchen effect [2,3,4], and its cause will become clear in the course of the following discussion. For the prism of Figure 27.1 the computed amplitude and phase of Fresnel’s reflection coefficients at the glass-to-air interface are presented in Figure 27.3 [1].
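The numbers quoted above follow from elementary relations, as the short sketch below shows. It assumes that the central ray strikes the rear facet at 45° (inferred from the symmetric range of incidence angles; the text does not state it explicitly), that the cone half-angle inside the prism is sin⁻¹(NA) because the spherical entrance facet does not bend the rays, and that the f-number is 1/(2NA). The variable names are our own.

```python
import math

n_glass = 1.65      # refractive index of the prism
NA = 0.2            # numerical aperture of the focusing lens

theta_crit = math.degrees(math.asin(1.0 / n_glass))   # critical angle
half_angle = math.degrees(math.asin(NA))              # half-angle of the focused cone
theta_min = 45.0 - half_angle                          # assumed 45-degree central ray
theta_max = 45.0 + half_angle
f_number = 1.0 / (2.0 * NA)
airy_diameter = 1.22 / NA     # diameter of first dark ring, in wavelengths

print(f"critical angle      = {theta_crit:.1f} deg")
print(f"incidence range     = ({theta_min:.1f}, {theta_max:.1f}) deg")
print(f"f-number            = {f_number:.1f}")
print(f"first dark ring dia = {airy_diameter:.1f} wavelengths")
```

Since θmin lies about 3.8° below θcrit, only a thin slice of the cone on one side of the beam escapes total internal reflection.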

Figure 27.1 Focusing of a uniform beam through a TIR prism. The incident beam is linearly polarized along the X-axis, the numerical aperture of the lens is 0.2, and the refractive index of the prism material is 1.65. The entrance and exit facets of the prism are assumed to be spherical so that ray-bending by Snell’s law at these surfaces is avoided, thus eliminating the corresponding spherical aberrations.

Figure 27.2 Plots of (a) logarithmic intensity distribution and (b) phase, at the focal plane of the lens. The center of the bright spot is shifted to the right by about one wavelength in consequence of the Goos–Hänchen effect. The light and dark rings in the phase plot correspond to regions of 0° and 180° phase, respectively.

The curves for both p- and s-components of polarization are shown, even though in our example we are primarily concerned with p-polarized light. Note that beyond the critical angle the phase of the reflected p-light has a very large slope. To the extent that this phase may be approximated by a straight line (within the range of incidence angles of interest) it imparts a linear phase shift to the beam upon reflection from the prism’s rear facet. This linear phase shift is nothing other than a wavefront tilt, which causes a displacement of the focused spot; in other words, it gives rise to the Goos–Hänchen effect. One might phrase the same explanation in the language of Fourier-transform theory by stating that when a function is multiplied by a linear phase factor, its Fourier transform is displaced by an amount proportional to the slope of that phase factor. Note that the largest slopes of the phase plots in Figure 27.3(b) occur immediately after the critical angle; therefore, the greatest effects would be observed

Figure 27.3 Plots of (a) amplitude and (b) phase for the reflection coefficients of the p- and s-components of polarization at a glass–air interface, as functions of the angle of incidence θ. The assumed index of the glass is n = 1.65. The critical angle for TIR is θcrit = sin⁻¹(1/n) = 37.3°, and the Brewster angle is θB = tan⁻¹(1/n) = 31.2°.
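Curves of this kind can be reproduced from the Fresnel formulas. The sketch below is a minimal version under one common sign convention (the sign of r_p, and hence its phase, differs between textbooks, so the computed phases need not match the plotted ones by more than a constant offset or sign). The function name is our own, and the refraction angle is allowed to become complex beyond the critical angle.

```python
import numpy as np

def fresnel_internal(theta_deg, n=1.65):
    """Return (r_p, r_s) for light travelling from glass (index n) into air."""
    theta = np.radians(theta_deg)
    cos_i = np.cos(theta)
    # Cosine of the refraction angle; purely imaginary beyond the critical angle.
    cos_t = np.sqrt(1.0 - (n * np.sin(theta)) ** 2 + 0j)
    r_s = (n * cos_i - cos_t) / (n * cos_i + cos_t)
    r_p = (cos_i - n * cos_t) / (cos_i + n * cos_t)
    return r_p, r_s

theta_B = np.degrees(np.arctan(1.0 / 1.65))      # Brewster angle, about 31.2 deg
rp_B, rs_B = fresnel_internal(theta_B)           # r_p vanishes here
rp_40, rs_40 = fresnel_internal(40.0)            # just above the critical angle
```

Beyond θcrit both coefficients have unit magnitude and only their phases vary; at any given angle in that range the phase of r_p is larger in magnitude than that of r_s, consistent with the steeper p-curve described in the text.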

when the incident beam’s angular spectrum is confined to the vicinity of θcrit. In our example, of course, the range of incident angles is fairly large (33.5° to 56.5°), and deviations from linearity of the phase function show up as higher-order aberrations (e.g., coma, astigmatism, spherical aberration, defocus). It is this deviation from linearity that is mainly responsible for the aberration of the focused spot seen in Figure 27.2.

A question frequently asked about TIR concerns the balance of energy among the incident beam, the reflected beam, and the evanescent waves that exist in the medium beyond the prism. If all the light is reflected at the glass–air interface, then how can there be any energy in the form of electromagnetic fields in the region immediately beyond the interface? To answer this question one must distinguish between the steady state of the system, which prevails once the waves have established themselves throughout space, and the transient state, which exists in the earlier stage immediately after the light source has been turned on. In the transient state, some of the incident energy goes into developing the evanescent waves, which are established early on and remain for as long as the system remains undisturbed. If one calculated for the evanescent field the component of the Poynting vector perpendicular to the interface, one would find that the electric and magnetic components of

this field are exactly 90° out of phase and, therefore, that the perpendicular component of the Poynting vector is zero. In other words, no energy is carried away from the interface by these evanescent waves. Consequently, all the incident optical energy in the steady state is carried away by the reflected beam.

Next, we consider the effect of a collimating lens (identical to the original focusing lens), placed so as to capture the radiation emanating from the focused spot. (In the system of Figure 27.1, this lens would be placed one focal length above the observation plane and parallel to it.) The resulting collimated beam is depicted in Figure 27.4, which shows computed plots of intensity and phase at the

Figure 27.4 (a) Plot of intensity distribution at the exit pupil of the collimating lens. The low-contrast rings are caused by diffraction effects during propagation and by loss of the high-spatial-frequency content of the spectrum. The rays on the left side of the beam, having been below the critical angle for TIR, have been partially transmitted through the prism. (b) Distribution of phase at the exit pupil of the collimating lens. The small linear slope is responsible for the Goos–Hänchen displacement of the focused spot. The plateau on the left-hand side is caused by the (partially reflected) rays that fall below the critical angle. The sharp rise immediately before reaching the plateau is due to the rapidly decreasing phase of the reflected rays just above the critical angle.

exit pupil of the collimator. Note, in particular, the strong attenuation of the left edge of the beam (owing to a loss of rays below θcrit), and also the near-linearity of the phase plot in regions far from the critical angle. As the light rays approach θcrit from above, the phase pattern in Figure 27.4(b) rises rather sharply and then flattens. This is precisely what one would expect based on the behavior of φp in the interval (33.5°, 56.5°), shown in Figure 27.3(b).

One cannot leave the subject of TIR without at least mentioning the fascinating phenomena associated with frustrated TIR, which occur when a second prism is brought to the vicinity of the interface at which TIR occurs. Consider a pair of identical glass hemispheres separated by an air gap of width D, as shown in Figure 27.5. Displayed in Figure 27.6 are computed plots of amplitude reflection coefficients |rp| and |rs| versus the angle of incidence θ for three different values of D. In Figure 27.6(a), where D = 100 nm, one can see close similarities to Figure 27.3(a), albeit with TIR completely suppressed: the Brewster angle at θB = 31.2° is still there, but there are no sharp transitions to 100% reflectivity. In Figure 27.6(b), D is set to 300 nm and the curves are beginning to look more like those in Figure 27.3(a); it appears that by increasing D one can make a rather smooth transition to TIR. But wait! In Figure 27.6(c), where D = 400 nm, there is a radical departure from the presumed “smooth transition”. Specifically, at θ = 20.7° both rp and rs vanish identically. What is going on here? What will happen if the gap width D keeps increasing? These questions are not difficult to answer but require some thought. Essentially, at a certain gap width D and at some angle of incidence θ both rp and rs vanish. The gap width is such that, at this angle, D cos θ′ = λ/2 exactly, where θ′ is the propagation angle inside the air gap, related to θ by Snell’s law (sin θ′ = n sin θ). Now, whenever a non-absorbing layer’s thickness

[Figure 27.5 schematic labels: incident beam (Es, Ep), reflected beam, air gap, glass hemispheres, transmitted beam.]

Figure 27.5 A pair of glass hemispheres separated by an air gap may be used to demonstrate the phenomenon of frustrated TIR. The coherent beam of light is directed at the center of the upper hemisphere at incidence angle θ. The width D of the air gap is adjustable.

[Figure 27.6 plot panels: (a) air gap = 100 nm, (b) air gap = 300 nm, (c) air gap = 400 nm. Vertical axis: amplitude reflection coefficient (curves |rs| and |rp|), 0.0 to 1.0. Horizontal axis: θ (degrees), 0 to 90.]

Figure 27.6 Computed amplitude reflection coefficients, |rp| and |rs|, for p- and s-polarized light in the system of Figure 27.5. The refractive index of the glass hemispheres is n = 1.65, the wavelength of the incident beam is λ = 650 nm, and the width D of the air gap is (a) 100 nm, (b) 300 nm, (c) 400 nm.
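The curves of Figure 27.6 can be approximated with the standard single-layer (Airy) reflection formula. The sketch below is only an illustration under stated assumptions, not the method used to produce the figure: it treats a single plane wave rather than a focused beam, adopts an exp(−iωt) time dependence, and uses hypothetical helper names (`kz`, `interface`, `gap_reflection`, `gap_path_nm`). With the caption's values (n = 1.65, λ = 650 nm), it shows that both coefficients nearly vanish at 20.7° for a 400 nm gap, where the gap is half a wavelength thick along the direction of propagation in air.

```python
import cmath
import math

def kz(eps, s):
    """Normal wave-vector component (in units of 2*pi/lambda) in a medium of
    permittivity eps, for tangential component s = n_glass * sin(theta)."""
    root = cmath.sqrt(complex(eps) - s * s)
    # Keep the root with Im >= 0 (decaying wave for exp(-i*omega*t)).
    return -root if root.imag < 0 else root

def interface(eps_a, eps_b, s, pol):
    """Fresnel amplitude reflection coefficient going from medium a to medium b."""
    ka, kb = kz(eps_a, s), kz(eps_b, s)
    if pol == "s":
        return (ka - kb) / (ka + kb)
    return (eps_b * ka - eps_a * kb) / (eps_b * ka + eps_a * kb)

def gap_reflection(theta_deg, gap_nm, pol, n_glass=1.65, wavelength_nm=650.0):
    """Reflection coefficient of glass / air gap / glass for a plane wave."""
    eps_g = n_glass ** 2
    s = n_glass * math.sin(math.radians(theta_deg))
    r12 = interface(eps_g, 1.0, s, pol)   # glass -> air
    r23 = interface(1.0, eps_g, s, pol)   # air -> glass
    round_trip = cmath.exp(2j * (2 * math.pi / wavelength_nm) * gap_nm * kz(1.0, s))
    return (r12 + r23 * round_trip) / (1 + r12 * r23 * round_trip)

def gap_path_nm(theta_deg, gap_nm, n_glass=1.65):
    """D * cos(theta'), with theta' the refracted angle in the air gap
    (meaningful only below the critical angle)."""
    s = n_glass * math.sin(math.radians(theta_deg))
    return gap_nm * kz(1.0, s).real

for gap in (100, 300, 400):
    for theta in (20.7, 31.2, 45.0, 60.0):
        rp = abs(gap_reflection(theta, gap, "p"))
        rs = abs(gap_reflection(theta, gap, "s"))
        print(f"D = {gap} nm, theta = {theta:5.1f} deg: |rp| = {rp:.3f}, |rs| = {rs:.3f}")

print(f"D*cos(theta') at 20.7 deg for D = 400 nm: {gap_path_nm(20.7, 400):.1f} nm")
```

The sign convention chosen for the p-polarized coefficient differs between textbooks; only the magnitudes are compared with the figure here.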


becomes an integer multiple of a half-wavelength, that layer will have no effect on the multiple-beam interferences and can, therefore, be eliminated from consideration. Removing the air gap would bring the two hemispheres into contact, in which case all the incident light will naturally pass from one hemisphere to the other, leaving no reflected light whatsoever for either type of polarization.

If the flat surface of the bottom hemisphere in Figure 27.5 is coated with a metallic layer, one would observe the phenomenon of attenuated TIR.⁵ Figure 27.7(a) shows plots of |rp| and |rs| versus θ for the case of an aluminum-coated surface separated from the top hemisphere by an 875 nm air gap. The s-polarized light does not exhibit any interesting effects, but the drop in p-light reflectivity around θ = 37.4° (just 0.1° above θcrit) is quite impressive. In fact,

[Figure 27.7 plot panels: (a) amplitude reflection coefficient (curves |rs| and |rp|), 0.0 to 1.0, versus θ (degrees), 0 to 90; (b) normal component of the Poynting vector, 0.0 to 1.0, versus Z (nm), 0 to 1000.]

Figure 27.7 (a) Computed amplitude reflection coefficients, |rp| and |rs|, for p- and s-polarized light in the system of Figure 27.5, when the flat surface of the bottom hemisphere is coated with a thick layer of aluminum: (n, k) = (1.47, 7.8), thickness = 200 nm. The top glass hemisphere is assumed to have refractive index n = 1.65, the wavelength of the incident beam is λ = 650 nm, and the width D of the air gap is 875 nm. This particular gap width was chosen because it brought the minimum in rp close to zero. At other gap widths the behavior is qualitatively the same but the minimum of reflectivity is higher. (b) Component of the Poynting vector perpendicular to the gap, computed at θ = 37.4°. The horizontal axis is the distance measured from the top of the air gap towards the aluminized surface at the bottom. The optical energy flows unattenuated through the air before being fully absorbed in the top 30 nm of the aluminum layer.
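The reflectivity dip of Figure 27.7(a) can be explored with the same three-layer formula used for a dielectric gap. The following is a rough sketch under assumptions introduced here: a single plane wave, an exp(−iωt) time dependence, and the 200 nm aluminum layer replaced by a semi-infinite metal (reasonable only because such a layer is opaque). The helper names are hypothetical; the material values are those quoted in the caption.

```python
import cmath
import math

def kz(eps, s):
    """Normal wave-vector component (units of 2*pi/lambda); root with Im >= 0."""
    root = cmath.sqrt(complex(eps) - s * s)
    return -root if root.imag < 0 else root

def interface(eps_a, eps_b, s, pol):
    """Fresnel amplitude reflection coefficient going from medium a to medium b."""
    ka, kb = kz(eps_a, s), kz(eps_b, s)
    if pol == "s":
        return (ka - kb) / (ka + kb)
    return (eps_b * ka - eps_a * kb) / (eps_b * ka + eps_a * kb)

def attenuated_tir(theta_deg, pol, n_glass=1.65, n_metal=1.47 + 7.8j,
                   gap_nm=875.0, wavelength_nm=650.0):
    """Reflection coefficient of glass / air gap / (semi-infinite) metal."""
    eps_g, eps_m = n_glass ** 2, n_metal ** 2
    s = n_glass * math.sin(math.radians(theta_deg))
    r12 = interface(eps_g, 1.0, s, pol)   # glass -> air
    r23 = interface(1.0, eps_m, s, pol)   # air -> metal
    round_trip = cmath.exp(2j * (2 * math.pi / wavelength_nm) * gap_nm * kz(1.0, s))
    return (r12 + r23 * round_trip) / (1 + r12 * r23 * round_trip)

# Scan a narrow range of angles just above the critical angle (about 37.3 deg).
angles = [37.31 + 0.001 * i for i in range(700)]
rp_values = [abs(attenuated_tir(a, "p")) for a in angles]
rp_min = min(rp_values)
theta_min = angles[rp_values.index(rp_min)]

print(f"critical angle: {math.degrees(math.asin(1 / 1.65)):.2f} deg")
print(f"minimum |rp| = {rp_min:.3f} at theta = {theta_min:.2f} deg")
print(f"|rs| at the same angle = {abs(attenuated_tir(theta_min, 's')):.3f}")
```

Changing `gap_nm` in this sketch is a simple way to see how the depth of the p-polarized minimum depends on the gap width, which is the effect exploited by the modulator mentioned in the text.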


when the angle of incidence θ is properly selected, it is possible to modulate the reflectivity of the p-light from essentially 0% to 100% by adjusting the gap width, without ever bringing the two surfaces into contact (or near contact). This has provided the mechanism for a novel light-intensity modulator, which was patented some years ago.⁶ Figure 27.7(b) shows the component of the Poynting vector perpendicular to the gap as a function of the vertical distance from the top surface into the gap; the assumed angle of incidence within the top hemisphere is 37.4°. The optical energy is seen to propagate unattenuated through the 875 nm gap, before being fully absorbed within the top 30 nm of the aluminum layer.

The physics of attenuated TIR involves the excitation of surface plasmons in the metallic layer by p-polarized evanescent waves in the air gap. Surface plasmons are fairly easy to describe and to understand; in fact, they are just inhomogeneous plane-wave solutions to Maxwell’s equations in absorptive media. See Chapter 10, “What in the world are surface plasmons?”, for a more comprehensive discussion of this subject.

References for Chapter 27
1 M. Born and E. Wolf, Principles of Optics, sixth edition, Pergamon Press, New York, 1983.
2 F. Goos and H. Hänchen, Ann. Phys. Lpz. (6) 1, 333 (1947).
3 F. Goos and H. Lindberg-Hänchen, Ann. Phys. Lpz. (6) 5, 251 (1949).
4 H. K. V. Lotsch, Beam displacement at total reflection: the Goos–Hänchen effect, Optik 32: part I, 116–137, part II, 189–204 (1970); part III, 299–319, part IV, 553–569 (1971).
5 A. Otto, Zeitschrift für Physik 216, 398 (1968).
6 G. T. Sincerbox and J. G. Gordon, Appl. Opt. 20, 1491–1494 (1981).

28 Evanescent coupling

Evanescent electromagnetic waves abound in the vicinity of luminous objects. These waves, which consist of oscillating electric and magnetic fields in regions of space immediately surrounding an object, do not transfer their stored energy to other regions and, therefore, remain localized in space. Like all electromagnetic waves, evanescent waves are governed by Maxwell’s equations, and their presence in the vicinity of an object helps to satisfy the requirements of field continuity at the object’s boundaries. Evanescent fields decay exponentially with distance away from the object’s surface, making them exceedingly difficult to detect at distances much greater than a wavelength.¹ When a beam of light shines on a diffraction grating, for example, various diffracted orders partake of the energy of the incident beam and carry it away in different directions. At the same time, evanescent waves are created around the grating, which ensure the continuity of the field at the grating’s corrugated surface. Similarly, a beam of light shining on an aperture or on a small particle sets up evanescent fields around the boundaries of these objects. Perhaps the best-known example of evanescence, however, is provided by total internal reflection (TIR) from an internal facet of a prism (see Figure 28.1). Here the evanescent field is formed in the free-space region behind the prism, and remains distinct and isolated from the propagating (i.e., incident and reflected) beams; this phenomenon was discussed briefly in Chapter 27. Bringing an object to the vicinity of another object that has an established evanescent field in its neighborhood could change the distribution of the electromagnetic field throughout the entire space. For example, if a material object is placed behind the prism of Figure 28.1, close enough to sense the evanescent field but not close enough for the two to make physical contact, photons will tunnel through the small gap thus created, diverting a fraction of the incident beam across the gap and into the latter object. This is the essence of evanescent coupling, of which we present several examples in this chapter. The well-known

[Figure 28.1 schematic labels: incident beam, reflected beam, glass prism, evanescent field.]

Figure 28.1 A beam of light is totally internally reflected from the rear facet of a glass prism. The electromagnetic field lurking in the free space region behind the prism is evanescent; both its electric and magnetic components decay exponentially with distance from the interface, and the projection of its Poynting vector perpendicular to the interface is zero. The energy stored in the evanescent field is deposited there at the time when the light source is first turned on. In the steady state, energy is neither added to nor removed from the evanescent field; all the incoming optical energy is reflected at the rear facet of the prism.
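The exponential decay described in this caption can be written out explicitly. The expression below is the standard plane-wave result, given here as an illustration; the symbols n (prism index), θ (internal angle of incidence), κ, and the axes x (along the interface) and z (normal to it, pointing into free space) are notation assumed for this sketch rather than taken from the figure.

```latex
E(x,z,t) \;\propto\; \exp\!\big[\, i k_0 (n \sin\theta)\, x \;-\; \kappa z \;-\; i\omega t \,\big],
\qquad
\kappa = k_0 \sqrt{n^2 \sin^2\theta - 1},
\qquad
k_0 = \frac{2\pi}{\lambda_0},
\qquad
\theta > \theta_c = \arcsin(1/n).
```

As a numerical example, taking n = 2 and λ0 = 633 nm (the values used later in this chapter) together with an illustrative angle θ = 45°, one finds κ = k0, so the field amplitude falls by a factor of e over a distance 1/κ = λ0/2π ≈ 101 nm.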

phenomena of frustrated TIR and attenuated TIR, which are of relevance here, were discussed in previous chapters (see Chapter 10, “What in the world are surface plasmons?”, and Chapter 27, “Some quirks of total internal reflection”).

Focusing through a glass hemisphere

We begin by considering the system of Figure 28.2, in which a uniform, collimated, linearly polarized beam of light (vacuum wavelength λ0 = 633 nm) is brought to focus by an aberration-free 0.8 NA objective lens. A glass hemisphere of refractive index n = 2, also referred to as a solid immersion lens (SIL), is placed over the focal plane so that the focused spot rests at its flat facet.² For simplicity, we assume that the objective lens and the spherical surface of the SIL are antireflection coated; thus the only reflected light originates at the flat facet of the SIL. For rays that arrive at this flat facet at an angle below the critical TIR angle (θc = arcsin(1/n) = 30°) the reflectance is fairly small (about 11% at normal incidence). For θ > θc, however, reflectivity is 100%, so that the cone of light covering the range of ray angles from critical to marginal is fully reflected. The computed intensity distribution for the reflected light at the exit pupil of the objective is shown in Figure 28.3(a). The bright ring resulting from TIR is clearly visible in this plot. The central region of the aperture is not totally dark either, but

[Figure 28.2 schematic labels: objective, glass hemisphere (SIL); axes X, Y, Z.]

Figure 28.2 A collimated beam of light, uniform, monochromatic (λ0 = 633 nm), and linearly polarized along X, enters an aplanatic 0.8 NA objective lens (f = 3750λ0). A glass hemisphere (also known as a SIL) of refractive index n = 2 is placed so that its flat facet coincides with the objective’s focal plane. The surfaces of the objective as well as the spherical surface of the SIL are antireflection coated, but the flat facet of the SIL is bare. The light reflected from this flat facet returns to the objective, is collimated by it, and appears at the exit pupil.

to discern it requires a picture with better contrast. Figure 28.3(b), a logarithmic plot of the same distribution as in Figure 28.3(a), shows the structure of the central region. The two dark spots inside the ring along the horizontal axis arise from low reflectance at and around the Brewster angle. The overall reflectivity at the flat facet of the hemisphere (as measured at the objective’s exit pupil) is 66%. Because Fresnel’s reflection coefficients at the flat facet differ for the p- and s-polarized light, the reflected beam appearing at the exit pupil is no longer linearly polarized. Figures 28.3(c), (d) show distributions of the x- and y-components of polarization, respectively. Ex contains about two-thirds of the reflected optical power, while Ey contains the remaining one-third. What is more, the relative phase of Ex and Ey varies over the aperture, thus creating a non-uniform state of polarization. The computed distribution of the polarization ellipticity η is shown in Figure 28.3(e); here the gray-scale encodes angles from −37° (black) to +37° (white). The distribution of the polarization rotation angle ρ is shown in Figure 28.3(f), where the gray-scale represents angles from −90° (black) to +90° (white). Clearly the state of polarization in the TIR region is quite complex.

Suppose now that an identical hemisphere is placed in front of the SIL and separated from it by a narrow air gap (see Figure 28.4). Under these circumstances, evanescent coupling causes a good fraction of the beam to be transmitted through to the second hemisphere. Figure 28.5 shows the computed distributions of the reflected light at the exit pupil of the objective lens for a 100 nm air gap. These distributions should be compared directly with those in Figure 28.3. The overall reflectance is now 43%, of which two-thirds is again in the x-component of polarization (Figure 28.5(c)) and one-third in the y-component (Figure 28.5(d)).
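For the bare flat facet of the n = 2 hemisphere in Figure 28.2, the critical angle, the normal-incidence reflectance, and the Brewster angle responsible for the two dark spots all follow from the Fresnel formulas for a single glass-to-air interface. The short sketch below illustrates this for individual plane waves; the function name `fresnel_internal` and the sign convention for the p-polarized coefficient are choices made here, not taken from the text.

```python
import cmath
import math

def fresnel_internal(n, theta_deg):
    """Amplitude reflection coefficients (r_p, r_s) for a plane wave inside
    glass of index n striking a glass-to-air interface at angle theta."""
    theta = math.radians(theta_deg)
    cos_i = math.cos(theta)
    s = n * math.sin(theta)
    # cos of the refracted angle; purely imaginary beyond the critical angle.
    cos_t = cmath.sqrt(complex(1.0 - s * s, 0.0))
    r_s = (n * cos_i - cos_t) / (n * cos_i + cos_t)
    r_p = (cos_i - n * cos_t) / (cos_i + n * cos_t)
    return r_p, r_s

n_sil = 2.0
theta_c = math.degrees(math.asin(1.0 / n_sil))         # critical angle
theta_brewster = math.degrees(math.atan(1.0 / n_sil))  # internal Brewster angle
reflectance_normal = abs(fresnel_internal(n_sil, 0.0)[1]) ** 2

print(f"critical angle: {theta_c:.2f} deg")
print(f"Brewster angle: {theta_brewster:.2f} deg")
print(f"reflectance at normal incidence: {100 * reflectance_normal:.1f}%")
for theta in (10.0, 20.0, theta_brewster, 29.0, 35.0, 50.0):
    r_p, r_s = fresnel_internal(n_sil, theta)
    print(f"theta = {theta:5.2f} deg: |rp| = {abs(r_p):.3f}, |rs| = {abs(r_s):.3f}")
```

The Brewster angle falls below the critical angle, which is consistent with the dark spots appearing inside the bright TIR ring.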

[Figure 28.3 panels (a) to (f); horizontal axis: x/λ0, from −3200 to 3200.]

Figure 28.3 Various distributions of the reflected light at the exit pupil of the objective lens of Figure 28.2. (a) Plot of reflected intensity corresponding to a 66% overall reflectivity at the flat facet of the hemisphere. (b) Logarithmic plot of the reflected intensity. (c) Intensity distribution for the x-component of polarization, Ex. (d) Intensity distribution for the y-component of polarization, Ey. (e) The polarization ellipticity η encoded by gray-scale, covering a range from −37° (black) to +37° (white). (f) The polarization rotation angle ρ encoded by gray-scale, covering a range from −90° (black) to +90° (white).

Where there was a bright ring of light at the exit pupil in Figure 28.3(a), now there is a gradual brightening toward the margins in Figure 28.5(a), indicating the gradual decrease in evanescent coupling with increasing angle of incidence. The two dark spots in the vicinity of the Brewster angle are clearly visible in the logarithmic plot of Figure 28.5(b). The ellipticity η shown in Figure 28.5(e) varies over the aperture in the range ±29.5°, while the polarization rotation angle ρ has the distribution shown in Figure 28.5(f).
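The ellipticity η and rotation angle ρ plotted in these figures are properties of the polarization ellipse traced by the complex amplitudes Ex and Ey at each point of the aperture. One common way to obtain them is through the Stokes parameters, as in the sketch below. The function name `ellipse_angles` and the sign conventions (in particular, which handedness counts as positive ellipticity) are assumptions made for illustration and may differ from those used in the computations behind the figures.

```python
import math

def ellipse_angles(ex, ey):
    """Return (rho, eta) in degrees for complex field amplitudes ex, ey.
    rho: orientation of the major axis from the x-axis, in (-90, 90].
    eta: ellipticity angle in [-45, 45]; 0 is linear, +/-45 is circular."""
    ex, ey = complex(ex), complex(ey)
    s0 = abs(ex) ** 2 + abs(ey) ** 2
    s1 = abs(ex) ** 2 - abs(ey) ** 2
    s2 = 2.0 * (ex * ey.conjugate()).real
    s3 = 2.0 * (ex.conjugate() * ey).imag
    rho = 0.5 * math.atan2(s2, s1)
    eta = 0.5 * math.asin(max(-1.0, min(1.0, s3 / s0)))
    return math.degrees(rho), math.degrees(eta)

small = math.tan(math.radians(0.66))
print(ellipse_angles(1.0, 0.0))         # linear polarization along x
print(ellipse_angles(1.0, small))       # in-phase Ey: pure rotation
print(ellipse_angles(1.0, 1j * small))  # quadrature Ey: pure ellipticity
print(ellipse_angles(1.0, 1j))          # circular polarization
```

The output ranges of this function, ±90° for ρ and ±45° for η, match the widest gray-scale ranges quoted in the figure captions of this chapter.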

[Figure 28.4 schematic labels: objective, air gap, glass hemispheres; axes X, Y, Z.]

Figure 28.4 A collimated beam of light, uniform, monochromatic (λ0 = 633 nm), and linearly polarized along X, enters an aplanatic 0.8 NA objective lens (f = 3750λ0). Two glass hemispheres of refractive index n = 2, separated by an air gap, are arranged in such a way that the focal plane of the objective coincides with the mid-plane of the air gap. Both hemispheres are antireflection coated on their spherical surfaces, but are left bare on their flat surfaces. The light reflected at the air gap returns to the objective, is collimated by it, and appears at the exit pupil.

Up to this point we have considered the effects of evanescent coupling upon a full cone of light, which contains rays both below and above the critical TIR angle, θc. Let us now place a circular mask in the path of the incident beam in Figure 28.4 in order to block those rays that arrive at the air gap at angles below θc. The semi-hollow cone of light thus formed by the objective lens contains only rays with θ > θc. Figure 28.6 is the computed plot of reflectance versus gap width for the system of Figure 28.4 augmented with a mask that completely blocks the central part of the beam (i.e., the region with no contribution to evanescent coupling). The reflectance curve is seen to start at zero, when the hemispheres are in contact and the light is fully transmitted. With a widening gap, however, the reflectance increases rapidly and saturates at 100% before the gap width reaches even one wavelength of the light.

Evanescent coupling to a metallic film

Suppose now that the flat facet of the second hemisphere of Figure 28.4 is coated with a metallic layer, say, a layer of aluminum 50 nm thick (n = 1.4, k = 7.6). For a 100 nm air gap, Figure 28.7 shows the various distributions of the reflected light at the exit pupil of the objective lens. The plot of reflected intensity in Figure 28.7(a) shows high reflectance everywhere except in two crescent-shaped areas, which correspond to a dip in the Fresnel reflection coefficient for p-polarized light. The overall reflectance is 92%, of which 66% has x-polarization and 26% has y-polarization. The remaining 8% of the light has been absorbed by the

[Figure 28.5 panels (a) to (f); horizontal axis: x/λ0, from −3200 to 3200.]

Figure 28.5 Various distributions of the reflected light at the exit pupil of the objective lens of Figure 28.4; the gap width is fixed at 100 nm. (a) Plot of the reflected intensity corresponding to a 43% overall reflectivity at the air gap. (b) Logarithmic plot of the reflected intensity. (c) Intensity distribution for Ex. (d) Intensity distribution for Ey. (e) The polarization ellipticity η encoded by gray-scale, covering a range from −29.5° (black) to +29.5° (white). (f) The polarization rotation angle ρ encoded by gray-scale, covering a range from −90° (black) to +90° (white).

aluminum layer. (At 50 nm thickness, the aluminum film is opaque; the incident light is partly absorbed and partly reflected from the film’s surface, practically no light being transmitted through the film.) The absorbed light comes partly from the central region of the incident beam, which is transported through the gap by ordinary (i.e., propagating) waves, and partly from the remaining annular region, which is transported by evanescent coupling. As before, the reflected

[Figure 28.6 plot: reflectance (0.0 to 1.0) versus gap width (0 to 600 nm).]

Figure 28.6 Computed plot of reflectance versus gap width in the system of Figure 28.4, when a circular mask is placed in the incident beam’s path to block the rays that arrive at the interface between the hemispheres at or below the critical TIR angle.

polarization state is quite complex, regions near the critical angle being RCP in two quadrants and LCP in the other two quadrants. (The coarseness of the mesh used in these calculations does not reveal the resonant absorption by surface-plasmon excitation. This type of absorption occurs within a narrow range of angles just above the critical angle for p-polarized light. Because the angular range of resonant absorption is extremely narrow, however, its contribution to the overall absorption within the aluminum film may be neglected.)

To compute the fraction of light absorbed by the aluminum film through evanescent coupling, we place once again a circular mask in the central region of the incident beam, blocking all the rays that would arrive at the gap below the critical angle. The results of calculations in this case are shown in Figure 28.8 for a bare aluminum film (solid curve) as well as a coated film (broken curve). With increasing gap width the reflectance of the bare film drops slightly at first, then rises rapidly to saturate at 100%. When the aluminum film is in contact with the SIL (i.e., zero gap width) it absorbs about 16% of the light, but by the time the gap widens to 150 nm the absorption is down to a mere 3%. One can improve upon this situation by applying a dielectric coating over the aluminum layer. The broken curve in Figure 28.8 is a plot of reflectance versus gap width for an aluminum film 50 nm thick coated with a layer of SiO 100 nm thick. It is clear that evanescent coupling now takes place over a wider range of gaps; in

[Figure 28.7 panels (a) to (f); horizontal axis: x/λ0, from −3200 to 3200.]

Figure 28.7 Various distributions of the reflected light at the exit pupil of the objective lens of Figure 28.4, when the flat facet of the second hemisphere is coated with a layer of aluminum 50 nm thick. The gap width is fixed at 100 nm. (a) Plot of the reflected intensity, showing a 92% overall reflectance. (If the SIL is removed, the reflectivity drops to 90%.) (b) Logarithmic plot of the reflected intensity. (c) Intensity distribution for Ex. (d) Intensity distribution for Ey. (e) The polarization ellipticity η ranging from −43.4° (black) to +43.4° (white). (f) The polarization rotation angle ρ, ranging from −90° (black) to +90° (white).

particular, between gap widths of 150 nm and 200 nm there is a plateau of about 10% absorption. (In the absence of the mask blocking the center of the beam, the dielectric-coated aluminum film absorbs a total of 19% of the incident power at a gap width of 100 nm.) The above aluminum–SiO bilayer is discussed for illustration purposes only; it does not represent the most efficient multilayering scheme for coupling the light

[Figure 28.8 plot: reflectance (0.85 to 1.00) versus gap width (0 to 600 nm); solid curve: 50 nm aluminum; broken curve: 50 nm aluminum + 100 nm SiO.]

Figure 28.8 Computed plots of reflectivity versus gap width for two different samples. The solid curve shows the reflectance when an aluminum layer 50 nm thick coats the flat facet of the second hemisphere of Figure 28.4. The broken curve corresponds to the case where a layer of SiO 100 nm thick and with n = 2 coats the 50 nm thick aluminum film. A mask blocks the central region of the beam in both cases.

to the aluminum layer. One can do somewhat better by adding additional layers on top of the aluminum film and/or beneath the flat facet of the SIL and by optimizing the thickness and refractive index of each such layer.

Magneto-optical disk

A major application area for evanescent coupling is the field of magneto-optical (MO) disk data storage.²,³,⁴ Here a disk, which is typically a multilayer stack of metallic and dielectric layers on a glass or plastic substrate, is placed under a solid immersion lens (SIL). As the disk spins, the SIL rides on an air cushion, which separates the two by a fixed gap width. Two of the most important questions in this area are: (1) how much of the focused optical energy is absorbed within the optical disk? (2) how does the reflected MO signal depend on the air gap? Before answering these questions, however, we must give a brief overview of the physical mechanisms involved in MO recording and readout. The disk consists of a thin magnetic layer sandwiched between two dielectric layers coated atop a reflector such as an aluminum-coated substrate (see Figure 28.9). The layer thicknesses and refractive indices shown in the figure are typical but in fact can

[Figure 28.9 layer stack, top to bottom, as read from the figure labels: incident beam (λ0 = 633 nm); dielectric (n = 2), 100 nm; magnetic layer, 25 nm; dielectric (n = 2), 30 nm; aluminum, (n, k) = (1.4, 7.6), 25 nm; substrate.]

Figure 28.9 Quadrilayer structure of a typical MO disk used in conjunction with the SIL. Within the drive, the disk rotates at several thousand r.p.m., and the SIL flies over the top dielectric layer of the disk, supported on a cushion of air several tens of nanometers thick. The local state of magnetization (up or down) represents the state of the recorded bit (0 or 1). The size of the focused spot directed through the SIL at the magnetic layer determines the minimum mark size that can be recorded and read out.

vary somewhat, depending on the configuration of the drive for which the disk is intended. The disk is used in reflection, and the multilayer stack is designed to take advantage of optical interference in order to maximize the coupling of the laser beam to the magnetic layer.³ The aluminum reflector is an important component of this optical interference device, but it also serves as a heat sink to remove from the magnetic layer the thermal energy deposited there by the focused laser beam. The dielectric layers protect the magnetic film from the environment and, through their thickness and refractive index, provide the necessary degrees of freedom for adjusting the optical characteristics of the stack. Also, the dielectric layer between the magnetic film and the aluminum layer controls the flow of heat between these two metallic layers.

The optical properties of the magnetic film are fully specified by its dielectric tensor, namely,

\[
\boldsymbol{\varepsilon} =
\begin{pmatrix}
\varepsilon & \varepsilon' & 0 \\
-\varepsilon' & \varepsilon & 0 \\
0 & 0 & \varepsilon
\end{pmatrix}.
\tag{28.1}
\]


For conventional MO media, typical values of the diagonal and off-diagonal elements of this tensor at λ0 = 633 nm are ε = −8 + 27i and ε′ = 0.6 − 0.2i. The off-diagonal elements are responsible for cross-coupling the x- and y-components of polarization, this being the origin of optical activity in these media. If ε′ is set to zero, MO activity vanishes, as if the magnetization of the material had disappeared. Reversing the direction of magnetization causes a sign reversal of ε′.

Suppose a plane wave, linearly polarized along the X-axis, is directed at normal incidence onto the MO stack of Figure 28.9. Upon reflection from the stack the beam will have two components of polarization: Ex along the original X-axis and Ey along the Y-axis. The y-component of the reflected polarization is created by the optical activity of the magnetic layer. If the magnetization of the sample is reversed, Ex does not change at all, and Ey undergoes a sign change only. If Ex and Ey happen to be in phase, the reflected polarization appears to have been rotated from its original direction; this is referred to as MO Kerr rotation. If, however, Ey is 90° out of phase with respect to Ex, the returning beam will have pure Kerr ellipticity. In general, the polarization components have an arbitrary relative phase and, therefore, the reflected beam exhibits both Kerr rotation and ellipticity. Both Kerr angles convey information about the state of magnetization of the disk: when the magnetization reverses, the sign of Ey is switched, in which case both angles (i.e., rotation and ellipticity) change sign.³

In practice a disk is both recorded and read out using a focused laser beam. The writing involves local heating of the magnetic film in the presence of an external magnetic field. At high enough temperature the external magnetic field succeeds in reversing the direction of local magnetization. Obviously, a small focused spot yields a small recorded mark (i.e., a small magnetic domain). The SIL is valued in optical recording precisely because it does produce a small focused spot⁴ (see Chapter 37, “Scanning optical microscopy”). During readout, the same focused spot is used, albeit at a lower power to avoid heating the media. The sign of the Kerr rotation (or ellipticity) then provides the read signal for the detection system. Once again, the usefulness of the SIL for this application becomes apparent when one recognizes that by producing a small focused spot the SIL helps to resolve small recorded marks.²,⁴

Differential detection

The standard method of detecting the MO signal in conventional optical disk drives is shown in Figure 28.10.³ The beam reflected from the disk and collected by the objective lens is sent through a Wollaston prism and thus divided between two identical detectors. Typically the detection module is oriented at 45° with respect to the original direction of polarization of the laser, so that Ex and Ey are

[Figure 28.10 schematic labels: Wollaston prism; split-detector with outputs S1 and S2; differential output ΔS; axes X, Y, Z.]

Figure 28.10 Schematic diagram of a differential detection module consisting of a Wollaston prism and two identical photodetectors. Since it can rotate around the optical axis Z, the module may be oriented arbitrarily relative to the ellipse of polarization of the incident beam. In particular, if the original linear polarization of the laser beam is along the X-axis and the magneto-optically generated component of polarization is along the Y-axis, then the module may be aligned in such a way that the transmission axes of the Wollaston prism make 45° angles with X and Y.

equally split between the two detectors. Whereas both the magnitude and phase of Ex at the two detectors are identical, Ey arrives at each detector with a different sign. The total light amplitude arriving at the detectors is thus (Ex ± Ey)/√2, the plus sign corresponding to one detector and the minus sign corresponding to the other. If the phase difference between Ex and Ey is denoted φx − φy then the net differential signal may be written as

\[
\Delta S = S_1 - S_2
= \gamma \int \left( \tfrac{1}{2}\,|E_x + E_y|^2 - \tfrac{1}{2}\,|E_x - E_y|^2 \right) dx\,dy
= 2\gamma \int |E_x E_y| \cos(\phi_x - \phi_y)\, dx\,dy.
\tag{28.2}
\]

In this equation, γ is the responsivity of the detectors (in volts per watt of optical power) and the integrals are over the individual detector areas. Note that when the magnetization direction at the disk is reversed the sign of Ey reverses, resulting in a sign reversal for ΔS. Also note that any phase difference between Ex and Ey reduces the output signal by the cosine factor in the above equation. In principle, this phase difference may be eliminated by a properly patterned phase plate placed immediately before the Wollaston prism. In practice, however, unless φx − φy is fairly uniform over the aperture, it is difficult to correct the effects of this relative phase.

Evanescent coupling to an optical disk

Consider the typical MO disk structure of Figure 28.9 placed under the SIL of Figure 28.2. When in contact with the SIL and at normal incidence, the disk has


reflectance 36%, Kerr rotation angle 0.66°, and Kerr ellipticity 0.05°. Focusing the beam by the 0.8 NA objective through the SIL changes these parameters only slightly, as long as the disk and the SIL remain in contact. However, a small air gap between the disk and the SIL can change the disk’s performance drastically. Figure 28.11 shows computed distributions at the exit pupil of the objective lens for a 100 nm gap width. Figure 28.11(a) is the intensity distribution for the reflected

[Figure 28.11 panels (a) to (f); horizontal axis: x/λ0, from −3200 to 3200.]

Figure 28.11 Various distributions of the reflected light at the exit pupil of the objective lens of Figure 28.2, when the MO stack of Figure 28.9 is placed in front of the SIL with a 100 nm air gap. (a) Intensity distribution for Ex, containing 36% of the incident optical power. (b) Phase distribution for Ex; the gray-scale covers the range −180° (black) to +180° (white). (c) Intensity distribution for Ey, containing 11.5% of the incident power. (d) Phase distribution for Ey; the phase difference between adjacent quadrants is nearly 180°. (e) The polarization ellipticity η ranging from −45° (black) to +45° (white). (f) The polarization rotation angle ρ ranging from −90° (black) to +90° (white).


Ex, containing 36% of the incident power. The dark oval-shaped region in the middle indicates an area of strong absorption by the disk. The phase of Ex, shown in Figure 28.11(b), is non-uniform over the aperture, ranging in value from −180° (black) to +180° (white). At the center the phase is about +100°, and drops continuously along the X-axis to −150° at the edge. Figure 28.11(c) is a plot of intensity distribution for the reflected Ey, which contains 11.5% of the incident power. This y-component is due mainly to the Fresnel reflection coefficients at the interface between the SIL and the multilayer stack. The fraction of Ey created by MO activity is relatively small and, although embedded in Figure 28.11(c), is difficult to recognize at this point. The phase of Ey depicted in Figure 28.11(d) shows the well-known π shift between adjacent quadrants. The polarization distribution over the exit pupil (see Figures 28.11(e), (f)) is highly non-uniform and contains all possible states of polarization, i.e., linear, elliptical, and circular.

To observe the contribution to Ey by MO activity, we eliminate the magnetization of the disk by setting to zero the off-diagonal element ε′ of the tensor, then subtract the complex-amplitude distributions at the exit pupil with and without the magnetization. In doing so the x-component cancels out exactly, showing that there are no magnetic contributions to the reflected Ex. However, the y-component shows a residual distribution ΔEy. Figure 28.12 shows the intensity and phase plots for ΔEy at the exit pupil of the objective for a 100 nm gap width. Notice that this MO contribution to Ey has circular symmetry; moreover, it is large in the region where absorption by the disk is strong (compare the position of the bright ring in Figure 28.12(a) with that of the dark oval-shaped region in Figure 28.11(a)).

[Figure 28.12 panels (a) and (b); horizontal axis: x/λ0, from −3200 to 3200.]

Figure 28.12 Plots of intensity and phase for the MO contribution to the reflected light, ΔEy, at the exit pupil of the objective lens of Figure 28.2. The multilayer stack of Figure 28.9 is assumed to be in front of the SIL with a 100 nm gap. (a) Intensity distribution, containing a fraction 0.37 × 10⁻⁴ of the incident optical power. (b) Phase distribution, ranging from −70° (black) to +246° (white).

The total optical power contained in the distribution of Figure 28.12 is 0.37 × 10⁻⁴ of the incident power; the corresponding value for the case of zero gap width is 0.44 × 10⁻⁴. Despite the fact that interference effects at the air gap have boosted ΔEy in the central region of the aperture, a substantial reduction in evanescent coupling has caused an overall reduction of ΔEy. Also notice the phase non-uniformity of ΔEy in Figure 28.12(b), ranging from 246° at the center to −70° at the edge of the lens. Variations over the aperture of the relative phase between Ex and ΔEy (see Figures 28.11(b) and 28.12(b)) have negative implications for the readout signal from the disk, as will be discussed shortly.

To isolate the contribution to the MO signal made by evanescent coupling, we place a mask in the central region of the beam, blocking all the rays below the critical angle. Figure 28.13 shows computed plots of the total reflectivity (i.e., |Ex|² + |Ey|² integrated over the aperture) versus the gap width, and the total contribution to Ey by the MO activity (i.e., the integrated value of |ΔEy|²) versus the gap width. With increasing gap width the reflectance increases, leaving less light to be coupled to the magnetic film. In consequence of this reduced coupling, the MO content of Ey progressively decreases; by the time the gap width reaches 200 nm, there is hardly any Ey left from the evanescently coupled MO interaction. A similar trend may be seen in the normalized differential signal, (S1 − S2)/(S1 + S2), which is plotted versus the gap width in Figure 28.14. (See Figure 28.10

Reflectance

0.8

0.6

0.4 20000 |ΔEy|2 0.2

0.0 0

150

300

450

600

Gap width (nm)

Figure 28.13 Total reflectivity (solid line) and the integrated intensity of the MO signal (broken line) at the exit pupil as functions of the gap width. These calculations correspond to the system of Figure 28.2 in conjunction with the quadrilayer MO stack of Figure 28.9, when a mask blocks the central region of the beam.

402

Classical Optics and its Applications 0.020

(S1 – S2)/(S1 + S2)

0.015

0.010

0.005

0.000 0

150

300

450

600

Gap width (nm)

Figure 28.14 A computed plot of the normalized differential signal, (S1 S2)/(S1þ S2), versus the gap width. This result corresponds to the system of Figure 28.2 in conjunction with the quadrilayer MO stack of Figure 28.9 and the differential detector of Figure 28.10. It is assumed that a mask blocks the central region of the incoming laser beam, thus eliminating all the rays below the critical angle.

and Eq. (28.2) for the definition of S1, S2.) Again we have blocked the central region of the incident beam in order to concentrate on the effects of evanescent coupling. With the SIL and the disk in contact, the normalized differential signal is close to its ideal value, which is twice the tangent of the Kerr rotation angle, namely, 2 tan 0.66 ¼ 0.023. As the gap widens, the differential signal drops sharply: at 100 nm gap width, for instance, the signal is down by a factor of four. Roughly one-half of this drop may be attributed to the reduction in DEy and the corresponding rise in reflectivity (see Figure 28.13). The remaining half, however, is due to variations over the beam’s cross-section of the relative phase x y of Ex and DEy. It must be emphasized that the quadrilayer stack of Figure 28.9 is not specifically optimized for operation with the system of Figure 28.2. By changing the thicknesses and the refractive indices of the various layers and/or by introducing dielectric coatings at the bottom of the SIL, it might be possible to improve upon the aforementioned performance figures. It is highly unlikely, however, that one can achieve significant gains in terms of the coupling efficiency and the magnitude of the MO Kerr signal over what we have already reported.
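As a quick arithmetic check on the ideal differential signal quoted in this discussion, the following minimal sketch evaluates twice the tangent of the 0.66° Kerr rotation angle and applies the factor-of-four drop reported at a 100 nm gap. The function and variable names are illustrative only; they are not part of the simulations that produced Figure 28.14.

```python
import math

# Kerr rotation angle quoted in the text, in degrees.
KERR_ROTATION_DEG = 0.66


def ideal_differential_signal(kerr_rotation_deg: float) -> float:
    """Ideal normalized differential signal, 2*tan(theta_K), as stated in the text."""
    return 2.0 * math.tan(math.radians(kerr_rotation_deg))


ideal = ideal_differential_signal(KERR_ROTATION_DEG)

# The text reports that the signal is "down by a factor of four" at a 100 nm gap.
at_100_nm = ideal / 4.0

print(f"ideal signal (SIL in contact): {ideal:.4f}")
print(f"approx. signal at 100 nm gap:  {at_100_nm:.4f}")
```

For such a small rotation angle the tangent is essentially equal to the angle in radians, which is why the ideal signal is so close to twice the Kerr angle itself.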


References for Chapter 28

1. M. Born and E. Wolf, Principles of Optics, 6th edition, Pergamon Press, Oxford, 1980.
2. S. M. Mansfield, W. R. Studenmund, G. S. Kino, and K. Osato, High numerical aperture lens system for optical storage, Opt. Lett. 18, 305–307 (1993).
3. T. W. McDaniel and R. H. Victora, eds., Handbook of Magneto-optical Recording, Noyes Publications, Westwood, New Jersey, 1997.
4. B. D. Terris, H. J. Mamin, and D. Rugar, Near-field optical data storage using a solid immersion lens, Appl. Phys. Lett. 65, 388–390 (1994).

29 Internal and external conical refraction

Sir William Rowan Hamilton (1805–1865). Irish mathematician and astronomer who put forward the theory of quaternions, a landmark in the development of algebra, and discovered the phenomenon of conical refraction. His unification of dynamics and optics has had a lasting influence on mathematical physics, even though the significance of this work was not fully appreciated until after the rise of quantum mechanics. Hamilton had learned Latin, Greek, and Hebrew by the time he was five years old and learned many more languages afterwards. In 1827, while still an undergraduate, he was appointed Professor of Astronomy at Dublin’s Trinity College. Hamilton published his third supplement to Theory of Systems of Rays in 1832. Near the end of this work he applied the characteristic function to study Fresnel’s wave surface. From this he predicted conical refraction and asked Humphrey Lloyd, a professor of physics at Trinity College, to try to verify his prediction experimentally. Lloyd’s confirmation two months later of conical refraction brought great fame to Hamilton. (Photo: courtesy of AIP Emilio Segrè Visual Archives.)


The phenomenon of conical refraction was predicted by Sir William Rowan Hamilton in 1832 and its existence was confirmed experimentally two months later by Humphrey Lloyd [1,2]. (James Clerk Maxwell was only a toddler at the time.) The success of this experiment contributed greatly to the general acceptance of Fresnel’s wave theory of light. Conical refraction has been known for nearly 170 years now [1,2], and a complete explanation based on Maxwell’s electromagnetic theory has emerged, which is accessible through the published literature [3,4]. The complexity of the physics involved, however, is such that it prevents us from attempting to give a simple explanation. We shall, therefore, confine our efforts to presenting a descriptive picture of internal and external conical refraction by way of computer simulations based on Maxwell’s equations.

Overview

To observe internal conical refraction one must obtain a slab of biaxial birefringent crystal, such as aragonite, that has been cut with one of its optic axes perpendicular to the polished parallel surfaces of the slab (see Figure 29.1). When a collimated beam of light (say, from a HeNe laser) is directed at normal incidence towards the front facet of the slab, the beam enters the crystal and spreads out in the form of a hollow cone of light. Upon reaching the opposite facet, the beam emerges as two concentric hollow cylinders, propagating in the same direction as the original, incident beam.

External conical refraction is, in a way, the above phenomenon in reverse. Specifically, a hollow cone of light, converging towards a point on the surface of a biaxial crystal slab, becomes collimated along the optic-ray axis of the crystal and continues to propagate along that axis for as long as the beam remains within the crystal slab (see Figure 29.2). When the beam reaches the opposite facet of the slab, it emerges as an expanding cone of light. The focused cone thus “remains in focus” in its entire path through the crystal and diverges only after exiting the slab. There are certain subtle differences between internal and external conical refraction; for instance, the optic axis of wave normals along which the beam propagates in the former case is not the same as the optic-ray axis in the latter. This and other differences will become clear in the course of the following discussions.

[Figure 29.1 diagram labels: incident beam, slab of biaxial crystal, optic axis, emergent beams.]

Figure 29.1 Internal conical refraction. A normally incident coherent beam arriving at the front facet of a slab of biaxial birefringent crystal propagates inside the slab in the form of a cone of light, and emerges from the rear facet as two hollow concentric cylinders. The crystal is cut so that one of its optic axes is perpendicular to the polished parallel surfaces of the slab.

[Figure 29.2 diagram labels: incident beam, focusing lens, slab of birefringent crystal, opaque mask with pinhole, collecting lens, observation plane.]

Figure 29.2 External conical refraction. A coherent monochromatic beam of light (wavelength λ0) is focused by a lens onto a biaxial birefringent crystal slab, which is cut with its polished surfaces perpendicular to one of its optic-ray axes. The exit facet of the crystal is painted black, except for a small aperture in the middle that is left clear to allow rays that propagate near the optic-ray axis to exit the crystal. The exiting rays propagate to a second lens where they are collected and recollimated. In our simulations the incident beam is uniform over the entrance pupil of the focusing lens, both lenses have NA = 0.075 and f = 46667λ0, the crystal slab has thickness = 5000λ0 and principal refractive indices nx = 1.533, ny = 1.500, nz = 1.565, and the pinhole diameter is d = 100λ0.

Biaxial birefringent crystals and their optic axes

In general, a birefringent crystal has three different refractive indices along the directions of its three principal axes. Assuming that the principal axes are the X-, Y-, and Z-axes of a Cartesian coordinate system, the principal indices may be denoted nx, ny, nz. The index ellipsoid of this crystal has semi-axis lengths nx, ny, nz along the coordinate axes, as shown in Figure 29.3. For a plane wave propagating along a given wave-vector k, the plane passing through the center of the ellipsoid and perpendicular to k will, in general, have an elliptical cross-section with the index ellipsoid. The semi-axes of this cross-sectional ellipse yield the refractive indices associated with the two orthogonal polarizations of the beam. If the wave-vector k happens to be in such a direction that its corresponding cross-sectional ellipse becomes a circle, then the beam will “see” a single refractive index, irrespective of its state of polarization. The propagation direction corresponding to this circular cross-section is known as the optic axis. Crystals in which the three principal indices of refraction nx, ny, nz are all different exhibit two such optic axes and are, therefore, referred to as biaxial. A crystal in which one index of refraction differs from the other two exhibits one optic axis and is known as a uniaxial birefringent crystal. Conical refraction occurs only in biaxial birefringent crystals.

[Figure 29.3 diagram labels: axes Y and Z, optic axis 1, optic axis 2.]

Figure 29.3 The index ellipsoid has semi-axes of length nx, ny, nz along the principal axes X, Y, Z of the crystal. For a plane wave propagating in a given direction, a plane through the center of the ellipsoid and perpendicular to the wave normal will have an elliptical cross-section with the index ellipsoid. A propagation direction for which the cross-sectional ellipse becomes a circle is known as an optic axis. Similarly, the ray ellipsoid has semi-axes of length 1/nx, 1/ny, 1/nz along the principal axes. For a given ray direction, a plane through the center of the ray ellipsoid and perpendicular to the ray will have an elliptical cross-section with the ray ellipsoid. A propagation direction for which the cross-sectional ellipse becomes a circle is known as an optic-ray axis. In general, biaxial crystals have two optic axes and two optic-ray axes.

Birefringent crystals also have a ray ellipsoid with semi-axis lengths 1/nx, 1/ny, 1/nz along the principal axes. The ray ellipsoid, therefore, is different from the index ellipsoid, whose semi-axis lengths are the refractive indices themselves. While the index ellipsoid is relevant to the discussion of internal conical refraction, it is the ray ellipsoid that plays the central role in the case of external conical refraction. For a ray propagating along a given direction, the plane passing through the center of the ray ellipsoid and perpendicular to the ray will, in general, have an elliptical cross-section with the ellipsoid. If a ray happens to be in such a direction that its corresponding cross-sectional ellipse becomes a circle, then the direction of that ray defines an optic-ray axis. In general, the optic-ray axis is different from the optic axis, which is obtained in a similar fashion from the index ellipsoid. Biaxial birefringent crystals thus possess two optic-ray axes in addition to their two optic axes. Assuming ny < nx < nz, it is not difficult to show that the optic-ray axis is in the YZ-plane and makes an angle θ with the Z-axis, where

$\tan\theta = \sqrt{(n_x^2 - n_y^2)/(n_z^2 - n_x^2)}$.

Internal conical refraction

To give a specific example, let us consider a slab of crystal having three principal refractive indices nx = 1.533, ny = 1.500, nz = 1.565, and thickness = 25000λ0, where λ0 is the vacuum wavelength of the incident beam. (For the red HeNe wavelength of 633 nm, for example, the assumed thickness of the slab would be about 1.6 cm.) It is not difficult to show that the optic axes of this crystal are in the YZ-plane, located symmetrically with respect to the Z-axis at angles of 46.35° from it. We assume the slab is cut with one of its optic axes perpendicular to its polished surfaces. Next, we assume that a collimated beam of coherent monochromatic light is normally incident on this slab; the beam has a Gaussian profile and its 1/e diameter is 150λ0. The intensity distribution for this beam is shown as a small bright spot in Figure 29.4(a). (The coordinate system is now redefined in such a way that the incident beam propagates along the Z-axis, and the polished surfaces of the crystal are parallel to the XY-plane.)
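The angles quoted for this crystal can be reproduced with a few lines of arithmetic. The sketch below is only an illustration under stated assumptions: the optic-ray-axis formula is the one given in the text, whereas the optic-axis formula (with the extra factor nz/ny) is our own derivation from the requirement that the section of the index ellipsoid be circular; it is not quoted in the text. All function names are hypothetical. The code also checks that the two semi-axes of each cross-sectional ellipse do become equal along the computed directions.

```python
import math

# Principal refractive indices used in the text (ny < nx < nz).
NX, NY, NZ = 1.533, 1.500, 1.565


def optic_ray_axis_angle_deg(nx, ny, nz):
    """Angle from Z of the optic-ray axes: tan(theta) = sqrt((nx^2-ny^2)/(nz^2-nx^2))."""
    return math.degrees(math.atan(math.sqrt((nx**2 - ny**2) / (nz**2 - nx**2))))


def optic_axis_angle_deg(nx, ny, nz):
    """Angle from Z of the optic axes (wave normals).

    Assumption: derived here by demanding a circular section of the index
    ellipsoid, which multiplies the ray-axis tangent by nz/ny.
    """
    return math.degrees(
        math.atan((nz / ny) * math.sqrt((nx**2 - ny**2) / (nz**2 - nx**2)))
    )


def index_section_semi_axes(nx, ny, nz, theta_deg):
    """Semi-axes of the index-ellipsoid section normal to a wave vector that
    lies in the YZ-plane at angle theta from Z. One semi-axis is along X."""
    t = math.radians(theta_deg)
    in_plane = 1.0 / math.sqrt(math.cos(t) ** 2 / ny**2 + math.sin(t) ** 2 / nz**2)
    return nx, in_plane


def ray_section_semi_axes(nx, ny, nz, theta_deg):
    """Same construction for the ray ellipsoid (semi-axes 1/nx, 1/ny, 1/nz)."""
    t = math.radians(theta_deg)
    in_plane = 1.0 / math.sqrt((ny * math.cos(t)) ** 2 + (nz * math.sin(t)) ** 2)
    return 1.0 / nx, in_plane


theta_axis = optic_axis_angle_deg(NX, NY, NZ)
theta_ray = optic_ray_axis_angle_deg(NX, NY, NZ)
slab_thickness_cm = 25000 * 633e-9 * 100.0  # 25000 wavelengths at 633 nm

print(f"optic axis angle from Z:     {theta_axis:.2f} deg")
print(f"optic-ray axis angle from Z: {theta_ray:.2f} deg")
print(f"index section semi-axes:     {index_section_semi_axes(NX, NY, NZ, theta_axis)}")
print(f"ray section semi-axes:       {ray_section_semi_axes(NX, NY, NZ, theta_ray)}")
print(f"slab thickness at 633 nm:    {slab_thickness_cm:.2f} cm")
```

Evaluating the section semi-axes at any other angle returns two unequal values, namely the two refractive indices seen by the orthogonally polarized waves travelling in that direction.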
The incident beam, upon entering the crystal, breaks up into a multitude of rays that propagate as a cone of light through the crystal, and emerge from the opposite facet of the slab in the form of two concentric hollow cylinders; the plot in Figure 29.4(b) shows the computed intensity distribution immediately after the beam exits the crystal. The scale of Figure 29.4(b) is the same as that of Figure 29.4(a), so one can compare the size and position of the bright rings with those of the incident beam. Note, in particular, that the incident beam is not at the center of the emerging cylinder, but at its bottom; see Figure 29.1. (Had the crystal been cut in such a way that its other optic axis was perpendicular to the facets, the incident beam would have been at the top of the cylinder.) To obtain the full rings seen in the present example we have assumed the state of polarization of the incident beam to be circular; states of both right and left circular polarization (RCP and LCP) yield the same results. Alternatively, the incident beam may be assumed to be unpolarized for the full rings to emerge. As we shall see later, with linearly polarized light a certain part of the rings will be missing.

[Figure 29.4: three panels (a)–(c); horizontal axis x/λ0 from −700 to 700, vertical axis y/λ0 from −150 to 1250.]

Figure 29.4 (a) The incident intensity distribution at the front facet of the crystal slab in Figure 29.1. (b) Emergent intensity distribution at the rear facet of the slab, corresponding to the circularly polarized incident beam shown in (a). (c) Distribution of the angle of the emergent polarization vector to the X-axis. The gray-scale is such that a white pixel represents a +90° angle while a black pixel corresponds to a −90° angle. The emergent polarization is linear at any given point on the beam’s cross-section, but its direction varies from point to point. At the top of the rings there is an apparent 180° discontinuity in the direction of polarization. The jaggedness of the discontinuity is caused by small numerical errors that are inevitable when computing the state of polarization in the dark regions around the rings.

Polarization and phase patterns of the refracted beam

The polarization and phase distributions emerging in internal conical refraction are quite interesting. At any given point on the beam’s cross-section, the state of polarization is linear, but the direction of the E-field varies as one scans the cross-section of the beam. The gray-scale plot in Figure 29.4(c) shows the distribution of the angle between the polarization vector and the X-axis at the exit facet of the crystal. In this picture a black pixel represents a −90° angle, a white pixel corresponds to +90°, and the gray pixels represent angles in between. At the bottom of the emergent rings of light the polarization angle is 0°, that is, the E-field is parallel to the X-axis. The angle increases continuously from 0° to 90° as one moves from the bottom to the top on the right side of the rings. Similarly, on the left side, the orientation angle of the E-field varies continuously from 0° at the bottom to −90° at the top. Thus the polarization vector rotates by 180° as the point of observation moves a full circle around the beam’s cross-section. This seems to imply a discontinuity in the E-field distribution at the top of Figure 29.4(c). In reality, however, this discontinuity does not occur, because the phase of the E-field (not shown here) also undergoes a 180° change in a full circle around the rings. Thus the polarization vector rotates by 180° and, at the same time, its phase changes by 180° in a round trip of the circumference of the rings, so that the E-field distribution is continuous at all locations. We emphasize once again that the incident beam in the case shown in Figure 29.4 is circularly polarized.
Whether this beam is RCP or LCP, however, is immaterial because the chirality of the incident beam affects neither the intensity distribution nor the polarization state of the emergent beam. In other words, if an observer facing the beam scans the rings in a clockwise sense, the polarization vector also appears to rotate clockwise, whether the incident beam is RCP or LCP. The only way to determine the state of incident polarization is by examining the phase distribution of the emergent beam, which increases clockwise in one case and counterclockwise in the other. It is also interesting to note that unpolarized light (i.e., light containing equal amounts of RCP and LCP that have randomly varying amplitudes and phases) gives exactly the same distributions as in Figure 29.4. In the case of unpolarized light, however, the phase distribution is meaningless, because it varies randomly with time and with location on the beam’s cross-section.
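The interplay of polarization rotation and phase described above can be mimicked by a toy projection model, sketched below. This is an assumption made for illustration and is not the Maxwell-equation simulation behind Figure 29.4: at azimuth φ around the ring (measured from the bottom, with −π < φ ≤ π) the emergent field is taken to be linearly polarized at angle φ/2 to the X-axis, with a complex amplitude equal to the projection of the incident Jones vector onto that direction. The function names, and the labels attached to the two circular states, are hypothetical; which of the two is called RCP depends on the sign convention in use.

```python
import cmath
import math


def projection(phi, jones_in):
    """Complex amplitude: projection of the incident Jones vector onto the
    local linear polarization direction at angle phi/2 to the X-axis."""
    return math.cos(phi / 2.0) * jones_in[0] + math.sin(phi / 2.0) * jones_in[1]


def ring_field(phi, jones_in):
    """Toy-model E-field (Ex, Ey) at azimuth phi around the ring."""
    amplitude = projection(phi, jones_in)
    return (amplitude * math.cos(phi / 2.0), amplitude * math.sin(phi / 2.0))


def intensity(field):
    return abs(field[0]) ** 2 + abs(field[1]) ** 2


S = 1.0 / math.sqrt(2.0)
CIRCULAR_PLUS = (S, 1j * S)    # one circular handedness
CIRCULAR_MINUS = (S, -1j * S)  # the opposite handedness
LINEAR_X = (1.0, 0.0)
LINEAR_Y = (0.0, 1.0)

# Circular input: uniform intensity, phase varying as +phi/2 or -phi/2.
for phi_deg in (-180, -90, 0, 90, 180):
    phi = math.radians(phi_deg)
    amp_plus = projection(phi, CIRCULAR_PLUS)
    amp_minus = projection(phi, CIRCULAR_MINUS)
    print(
        f"phi = {phi_deg:5d} deg   I = {intensity(ring_field(phi, CIRCULAR_PLUS)):.3f}"
        f"   phase(+) = {math.degrees(cmath.phase(amp_plus)):7.1f} deg"
        f"   phase(-) = {math.degrees(cmath.phase(amp_minus)):7.1f} deg"
    )

# The field at the top of the ring is the same whether approached from the
# right (phi = +pi) or from the left (phi = -pi): no physical discontinuity.
print(ring_field(math.pi, CIRCULAR_PLUS), ring_field(-math.pi, CIRCULAR_PLUS))

# Linear input: part of the ring is missing (zero projection).
print(intensity(ring_field(math.pi, LINEAR_X)), intensity(ring_field(0.0, LINEAR_Y)))
```

In this model the 180° rotation of the polarization direction and the 180° change of phase over one circuit cancel, so the field is single-valued. The intensity and polarization direction are identical for the two circular states and only the sense of the phase variation differs, while a linearly polarized input leaves a dark segment where its projection vanishes.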

[Figure 29.5: two panels (a) and (b); horizontal axis x/λ0 from −700 to 700, vertical axis y/λ0 from −150 to 1250.]

Figure 29.5 Interferograms showing the intensity distribution resulting from the superposition of the emergent beam (at the rear facet of the crystal slab) with a uniform reference beam. The beam incident on the crystal is RCP in both cases, but the reference beam is RCP in (a) and LCP in (b).

To gain an appreciation for the phase distribution over the beam’s cross-section, we show in Figure 29.5 two computed interferograms corresponding to the superposition of the beam emerging from the exit facet of the crystal and a uniform reference beam. The beam entering the crystal is assumed to be RCP in both cases, but the reference beam is RCP in Figure 29.5(a) and LCP in Figure 29.5(b). We notice that in Figure 29.5(a) the outer ring has interfered constructively with the reference beam, whereas the inner ring shows destructive interference. As a general rule, there is a 180° phase shift between the inner and outer rings at radially adjacent locations, irrespective of the state of incident polarization. This phase difference aside, the two rings are identical in their polarization and phase distributions. The interferogram of Figure 29.5(b) is more complicated than that of Figure 29.5(a); nevertheless, it can be fully explained in terms of the states of polarization and the distribution of phase over the rings, which we have already described.

Effect of linear incident polarization

Figure 29.6 shows the computed intensity distributions at the exit facet of the crystal for three cases of linear incident polarization: (a) parallel to the X-axis; (b) at 45° to X; and (c) parallel to Y. In all three cases the emergent state of polarization (not shown) is similar to that in Figure 29.4(c). A segment from the top of the rings is missing in Figure 29.6(a); this is the region that would have had polarization parallel to Y had the incident beam contained the corresponding E-field component. Similarly, the bottom of the rings is missing in Figure 29.6(c); this region would have had polarization along the X-axis. Unlike the distribution of polarization over the rings, which is independent of the state of incident polarization, the phase of the rings is very much a function of the polarization of the incident beam. When the incident beam is linearly polarized, as in Figure 29.6, the emergent phase (not shown) will have a constant value over the entire area of each ring. (As before, the two rings will have a 180° phase difference.) One may verify the above statements by considering the various linearly polarized incident beams as superpositions of RCP and LCP beams and by analyzing the corresponding superpositions at the exit facet of the crystal.

[Figure 29.6: three panels (a)–(c); horizontal axis x/λ0 from −700 to 700, vertical axis y/λ0 from −150 to 1250.]

Figure 29.6 When the incident beam is linearly polarized, the emergent rings of light will be incomplete. This figure shows the intensity distribution at the rear facet of the slab in the cases where the incident E-field is (a) parallel to the X-axis, (b) at 45° to X, and (c) parallel to the Y-axis.

External conical refraction

Consider the system of Figure 29.2, which consists of a focusing lens, a slab of biaxial birefringent crystal, a pinhole, and a collimating lens. The incident beam is uniform, coherent, and monochromatic with a vacuum wavelength of λ0. The crystal slab has refractive indices nx = 1.533, ny = 1.500, nz = 1.565, and its thickness is 5000λ0. (For the red HeNe wavelength of 633 nm, for example, this slab would be 3.165 mm thick.) The optic-ray axes of this crystal are located symmetrically in the YZ-plane at angles θ = 45.14° from Z. The slab is cut with its polished flat surfaces perpendicular to one of its optic-ray axes. (The coordinate system is now redefined to be such that the incident beam propagates along the Z-axis and the polished surfaces of the crystal are parallel to the XY-plane.) When the incident rays enter the crystal slab they will propagate, in general, in various directions, but the rays that happen to be on a special cone, namely, the cone of external conical refraction, propagate strictly along the optic-ray axis and will emerge from a point opposite the point of entry into the crystal. A small pinhole (of diameter 100λ0 in the present example) on the exit facet of the slab allows only these axial rays to emerge. The emergent rays diverge in a hollow cone as they propagate towards a collecting lens, where they are recollimated and directed towards the observation plane. Figure 29.7 shows computed plots of the intensity distribution, polarization ellipticity, and polarization rotation angle at the observation plane, corresponding to a circularly polarized incident beam.
Note that the emergent rings of light in Figure 29.7(a) are in the bottom half of the exit pupil. (Had the crystal been cut with its other optic-ray axis perpendicular to the polished surfaces, the rings would have appeared in the top half of the exit pupil instead.) The ellipticity plot in Figure 29.7(b) is coded in gray-scale, black corresponding to −45° (i.e., LCP) and white to +45° (i.e., RCP). The relevant part of the plot, which is the region in the bottom half of the pupil where the emergent beam’s intensity is non-vanishing, shows zero ellipticity. The emergent rings of light, therefore, are linearly polarized. The direction of this linear polarization varies over the rings, however, as the plot of polarization rotation angle in Figure 29.7(c) indicates. (The gray-scale used here assigns black to −90° and white to +90°.)
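To put the dimensionless parameters of this simulation in physical units, the short sketch below converts them for the red HeNe wavelength mentioned in the text. It assumes, as is usual for a lens obeying the sine condition, that the pupil radius equals the focal length times the numerical aperture and that the NA is specified in air; these are our assumptions rather than statements from the text, and the variable names are illustrative.

```python
import math

WAVELENGTH_NM = 633.0            # red HeNe line, as in the text
NA = 0.075                       # numerical aperture of both lenses
FOCAL_LENGTH_WAVES = 46667.0     # focal length in units of the wavelength
SLAB_THICKNESS_WAVES = 5000.0    # slab thickness in units of the wavelength
PINHOLE_DIAMETER_WAVES = 100.0   # pinhole diameter in units of the wavelength

# Assumption: sine-condition lens, so pupil radius = f * NA.
pupil_radius_waves = FOCAL_LENGTH_WAVES * NA

# Half-angle of the focused cone, assuming the NA refers to air (n = 1).
cone_half_angle_deg = math.degrees(math.asin(NA))

slab_thickness_mm = SLAB_THICKNESS_WAVES * WAVELENGTH_NM * 1e-6
pinhole_diameter_um = PINHOLE_DIAMETER_WAVES * WAVELENGTH_NM * 1e-3
focal_length_mm = FOCAL_LENGTH_WAVES * WAVELENGTH_NM * 1e-6

print(f"pupil radius:       {pupil_radius_waves:.0f} wavelengths")
print(f"cone half-angle:    {cone_half_angle_deg:.2f} deg")
print(f"focal length:       {focal_length_mm:.2f} mm")
print(f"slab thickness:     {slab_thickness_mm:.3f} mm")
print(f"pinhole diameter:   {pinhole_diameter_um:.1f} um")
```

Under these assumptions the pupil radius comes out close to 3500λ0, which fits within the ±3750λ0 window used for the observation-plane plots.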

[Figure 29.7: three panels (a)–(c); horizontal axis x/λ0 from −3750 to 3750, vertical axis y/λ0 from −3750 to 3750.]

Figure 29.7 The distributions of (a) intensity, (b) polarization ellipticity, and (c) polarization rotation angle at the observation plane. The incident beam at the entrance pupil of the focusing lens is assumed to be circularly polarized. The ellipticity plot in (b) is coded in gray-scale, black corresponding to −45° (i.e., LCP) and white to +45° (i.e., RCP). The distribution of polarization rotation angle depicted in (c) is also coded in gray-scale, but the black pixels in this case represent −90° rotation from the X-axis and the white pixels represent +90° rotation. As before, the jaggedness of the transition from black to white in the lower part of (c) is caused by small numerical errors; since the discontinuity represented by this transition is not a physical discontinuity, this jaggedness has no physical significance.

According to Figure 29.7(c), over the circumference of the rings the polarization vector rotates from −90° at the bottom (i.e., E-field antiparallel to the Y-axis) to 0° at the top (E-field parallel to X) and back to +90° at the bottom (E-field parallel to Y). As before, the apparent discontinuity of polarization direction at the bottom of the rings does not signify a physical discontinuity, because the phase of the rings (not shown here) also exhibits a 180° change during one full cycle around the rings. The overall E-field distribution therefore turns out to be continuous.

Character of the emergent beam at the pinhole and the effect of incident polarization

The beam emerging from the pinhole in the system of Figure 29.2 possesses certain interesting features. Figure 29.8 shows computed plots of (a) intensity distribution, (b) polarization ellipticity, and (c) polarization rotation angle at the pinhole, for a circularly polarized incident beam. The intensity plot in Figure 29.8(a) shows a bright spot at the center of the pinhole, surrounded by a diffuse, more or less uniform background distribution. The origin of the diffuse light may be traced back to those incident rays that were outside the cone of external refraction and, therefore, once inside the crystal, did not become aligned with the optic-ray axis. The plot of polarization ellipticity in Figure 29.8(b) shows that the state of polarization varies from RCP in the bright white rings to LCP in the dark rings, covering the full gamut of elliptical polarization in the intervening regions. The plot of polarization rotation angle in Figure 29.8(c) indicates that the orientation of the ellipse of polarization is not uniform over the aperture but rotates through 180° around certain circular bands. All in all, this is a complex and fascinating state of affairs compared to the dull, uniform polarization state of the focused spot that first entered the crystal.
As was the case with internal conical refraction, the full cone of external refraction appears only when the incident beam contains all possible polarization directions. This is the case with RCP or LCP as well as with unpolarized light. When the incident beam happens to have linear polarization, however, certain parts of the emergent cone of light will be missing. This is shown in Figure 29.9 for an incident beam that is linearly polarized along the X-axis. The distributions of intensity, polarization ellipticity, and polarization rotation angle at the observation plane shown in Figure 29.9 are analogous to those in Figure 29.7, where the incident beam is circularly polarized. The lower part of the rings in Figure 29.9(a), however, is missing simply because the corresponding polarization, linear along Y, is not present in the incident beam. Aside from this missing segment, other features of the emergent beam shown in Figure 29.9 are quite similar to those in Figure 29.7.

[Figure 29.8: three panels (a)–(c); horizontal axis x/λ0 from −60 to 60, vertical axis y/λ0 from −60 to 60.]

Figure 29.8 Distributions of (a) intensity, (b) polarization ellipticity, and (c) polarization rotation angle within the pinhole at the exit facet of the crystal slab. The incident beam at the entrance pupil of the focusing lens is assumed to be circularly polarized. The ellipticity plot in (b) is coded in gray-scale, black corresponding to −45° (i.e., LCP) and white to +45° (i.e., RCP). The distribution of polarization rotation angle depicted in (c) is also coded in gray-scale, but the black pixels in this case represent −90° rotation from the X-axis and the white pixels represent +90° rotation.

[Figure 29.9: three panels (a)–(c); horizontal axis x/λ0 from −3750 to 3750, vertical axis y/λ0 from −3750 to 3750.]

Figure 29.9 Same as Figure 29.7 except for the state of polarization of the incident beam, which is linear along X in the present case.

References for Chapter 29

1. W. R. Hamilton, Trans. Roy. Irish Acad. 17, 1 (1833).
2. H. Lloyd, Trans. Roy. Irish Acad. 17, 145 (1833).
3. M. Born and E. Wolf, Principles of Optics, 6th edition, chapter 14, Pergamon Press, Oxford, 1980.
4. M. V. Klein, Optics, Wiley, New York, 1970.

30 Transmission of light through small elliptical apertures†

The apertures of classical optics simply block those parts of an incident wavefront that fall outside the aperture, allowing everything else to go through intact. Moreover, multiple apertures act upon an incident beam independently of each other, polarization effects are usually negligible (i.e., scalar diffraction), and it is not necessary to keep track of both the electric- and the magnetic-field components of the beam [1]. All of the above assumptions break down when apertures shrink to dimensions comparable to or smaller than a wavelength [2,3]. For example, transmission through two small adjacent apertures cannot be treated by assuming that only one aperture is open at a time, then adding the fields transmitted by the individual apertures. (This is because the electric charge and current distributions in the vicinity of one aperture are influenced by the radiation pattern of the other aperture.) Polarization effects are extremely important for small apertures, as exemplified by the case of a normally incident beam going through an elliptical aperture in a thin metal film: when the polarization (i.e., E-field) is parallel to the long axis of the ellipse there is negligible transmission, whereas when the incident polarization is rotated by 90° to point along the ellipse’s minor axis, the aperture transmits a substantial fraction of the incident light. Finally, to analyze the interaction of light with small apertures, it is generally necessary to keep track of both E and B components of the electromagnetic wave, as the modification of one of these fields produces non-trivial changes in the other field’s distribution [4].

This chapter presents the results of computer simulations based on the Finite Difference Time Domain (FDTD) method [5] for an elliptical aperture in a thin metal film illuminated by a normally incident, monochromatic plane wave. Both cases of incident polarization, parallel and perpendicular to the long axis of the ellipse, will be considered. We begin by developing an intuitive description of the behavior of the electromagnetic fields in each case, then present simulation results that exhibit patterns similar to those expected from this qualitative analysis. The simulations reveal, in quantitative detail, the amplitude and phase behavior of the E- and B-fields in and around the aperture.

† The co-authors of this chapter are Armis R. Zakharian, now at Corning Corp., and Jerome V. Moloney of the University of Arizona.

Maxwell’s equations

In developing an intuitive understanding of the electromagnetic field distribution around an aperture, we rely heavily on Maxwell’s divergence equations, $\nabla \cdot \mathbf{D} = \rho$ and $\nabla \cdot \mathbf{B} = 0$, where $\mathbf{D} = \varepsilon_0 \varepsilon \mathbf{E}$, $\mathbf{B} = \mu_0 \mathbf{H}$, and $\rho$ is the electric charge density [1,4]. ($\varepsilon_0$ and $\mu_0$ are the permittivity and permeability of free space, while $\varepsilon$ is the relative permittivity of the local environment.) The divergence-free nature of the magnetic field simply means that the B-field lines cannot be interrupted; they can go around in loops or they can form unbroken infinite lines, but they cannot originate, nor can they terminate, at specific points in space. A similar argument applies to D-field lines, except in locations where electric charges exist. When charges are present, lines of D originate on positive charges and terminate on negative charges; everywhere else the D-lines can twist and turn in space, but they cannot start or stop. The other two of Maxwell’s equations, $\nabla \times \mathbf{H} = \mathbf{J} + \partial \mathbf{D}/\partial t$ and $\nabla \times \mathbf{E} = -\partial \mathbf{B}/\partial t$, are necessary not only for generating the E and B fields from electrical currents ($\mathbf{J}$ is the local current density), but also to sustain these fields in source-free regions of space [4]. When highly conducting media (e.g., metallic bodies) are present in a system, surface currents $I_s$ develop that support the magnetic field H immediately outside the conducting surfaces. Aside from these electrical currents that act as sources of the H-field, time variations of the E-field are needed at each point of space to maintain the local B-field.
In a similar vein, aside from electric charges that act as sources and sinks for the D-field, time variations of B are necessary to maintain the local E-field. The lines of the current density J remain divergence-free, except in those locations where they deposit electrical charges, that is, r · J ¼ – @q/@t.4,6 Inside an electrical conductor J ¼ rE, where r is the conductivity of the material. Good conductors (e.g. metals) have large conductivities, which means that the E-field must all but vanish from the interior of such bodies. When the fields are oscillatory, any magnetic fields inside a good conductor will produce, by virtue of the Faraday law, r · E ¼ – @B/@t, a local electric field. Since E-fields are not allowed inside a conductor, time-varying magnetic fields, being intimately associated with the electric fields, must also be absent. The interior of good conductors thus remains free of charges, currents, and time-varying electromagnetic fields.
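To get a feel for how quickly the fields "all but vanish" inside a real metal, the following minimal sketch estimates the 1/e decay length of the field amplitude in silver. It uses the refractive index quoted later in this chapter for the simulated film (n + ik = 0.226 + i6.99 at a wavelength of 1.0 μm). The plane-wave decay law exp(−2πkz/λ) for a semi-infinite medium at normal incidence, and the function names, are assumptions made for illustration; reflections inside a finite film are ignored.

```python
import math

# Values quoted later in this chapter for the simulated silver film.
WAVELENGTH_NM = 1000.0     # vacuum wavelength, 1.0 um
K_SILVER = 6.99            # imaginary part of the complex index n + ik
FILM_THICKNESS_NM = 124.0  # thickness of the simulated film


def amplitude_decay_length_nm(wavelength_nm: float, k: float) -> float:
    """Depth at which the field amplitude falls to 1/e: lambda / (2*pi*k)."""
    return wavelength_nm / (2.0 * math.pi * k)


def residual_amplitude(depth_nm: float, wavelength_nm: float, k: float) -> float:
    """Fraction of the surface field amplitude that survives at a given depth."""
    return math.exp(-2.0 * math.pi * k * depth_nm / wavelength_nm)


decay_length = amplitude_decay_length_nm(WAVELENGTH_NM, K_SILVER)
residual = residual_amplitude(FILM_THICKNESS_NM, WAVELENGTH_NM, K_SILVER)

print(f"1/e amplitude decay length: {decay_length:.1f} nm")
print(f"amplitude left after {FILM_THICKNESS_NM:.0f} nm: {residual:.4f}")
```

Under these assumptions the decay length is a few tens of nanometers, so a film several decay lengths thick passes well under one percent of the surface field amplitude.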

Charges and currents, however, can and do develop on the conductor’s surface, where they give rise to E- and B-fields in the vicinity of the surface outside the conductor. The fifth equation of classical electrodynamics, the Lorentz law of force, F = q(E + V × B), expresses the force F experienced by a particle of charge q and velocity V [4]. This equation is occasionally useful in developing a qualitative picture of the current distribution in the vicinity of small apertures. For example, within the skin depth of a conductor, the directions of E and B would indicate the sense in which local surface currents are affected by the Lorentz force acting on the charge carriers. Typically, the E-field is the dominant factor in this regard, as evidenced by the constitutive relation J = σE. Any transverse deflections of the current by the B-field are generally neglected, unless the Hall conductivity of the medium is explicitly included in the constitutive relations.

Radiation by an oscillating dipole

With reference to Figure 30.1(a), a static electric dipole p creates, in its surrounding environment, electric-field lines that emerge from the positive pole and disappear into the negative pole. A periodically oscillating electric dipole emanates E-field lines that reverse direction at half-period intervals. The constant speed of light in all directions in space then dictates that these E-field reversals occur on spherical shells separated by a half-wavelength (λ/2) from their adjacent shells. The zero-divergence requirement imposed on the E-lines by the first Maxwell equation thus requires the existence of the closed lines of field depicted in Figure 30.1(b). The curl of the E-field gives rise to B-field lines that encircle the dipole in closed loops, sustaining the E-field oscillations while simultaneously being generated by them.
In the space between adjacent spherical shells separated by λ/2, the E-lines are not parallel to these shell surfaces, but bend inward or outward as shown to maintain the divergence-free condition of the E-field [6]. A static magnetic dipole m, shown in Figure 30.1(c), is a closed loop of electrical current whose B-field pattern is similar to the E-field of an electric dipole. Figure 30.1(d) shows an oscillating magnetic dipole, which behaves in much the same way as an electric dipole does, albeit with a role reversal for E and B [6]. These examples indicate that by direct appeal to Maxwell’s equations, especially the divergence laws, it is possible to obtain an intuitive picture of the electromagnetic field distribution. In the discussions that follow, we will use the dipole radiation patterns sketched in Figures 30.1(b) and 30.1(d) to elucidate the nature of transmission through subwavelength apertures in a thin metal film.
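The half-wavelength spacing of the field-reversal shells can be checked numerically. The sketch below assumes the standard far-zone form of the radiated dipole field at a fixed polar angle, proportional to cos(kr − ωt)/r, takes a snapshot at t = 0, and locates the radii at which the field changes sign. The unit wavelength, the radial window, and the variable names are illustrative choices, not taken from the text.

```python
import numpy as np

wavelength = 1.0                  # arbitrary length unit
k = 2.0 * np.pi / wavelength      # wavenumber

# Radial samples in the far zone (several wavelengths from the dipole).
r = np.linspace(2.0, 6.0, 4000)

# Snapshot (t = 0) of the far-zone field at a fixed polar angle.
e_theta = np.cos(k * r) / r

# Indices where the field changes sign between neighboring samples.
idx = np.where(np.sign(e_theta[:-1]) * np.sign(e_theta[1:]) < 0)[0]

# Refine each crossing by linear interpolation between the two samples.
r_zero = r[idx] - e_theta[idx] * (r[idx + 1] - r[idx]) / (e_theta[idx + 1] - e_theta[idx])

spacing = np.diff(r_zero)
print("reversal radii:", np.round(r_zero, 3))
print("spacing between reversals:", np.round(spacing, 3))
```

Successive reversals come out half a wavelength apart, independent of the 1/r envelope, which only scales the field strength.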

Figure 30.1 (a) E-field lines of a static electric dipole p emerge from the positive pole and disappear into the negative pole. (b) An oscillating electric dipole emanates E-lines that reverse direction on spherical shells separated by λ/2. The curl of the E-field creates B-field lines that surround the dipole in closed circular loops. (c) A static magnetic dipole m is a closed loop of electrical current whose B-field pattern is similar to the E-field of an electric dipole. (d) An oscillating magnetic dipole behaves similarly to an electric dipole, albeit with the roles of E and B reversed.

Plane wave reflection from a (highly conducting) flat mirror

Figure 30.2 shows the case of a normally incident plane wave on a perfect conductor (yellow slab at the bottom). The incident beam induces a surface current Is in the conductor, which creates equal-amplitude plane waves propagating in the ±Z-directions [4,6]. In the half-space below the conductor, the induced and incident plane-waves cancel each other out. In the half-space above the conductor, interference between the incident and reflected beams creates standing-wave fringes of the electric field E and the magnetic field B. The B-field is strongest at the surface of the conductor, reversing sign at intervals of Δz = λ/2, where its adjacent peaks are located. The peaks of the E-field, also located at λ/2 intervals, are staggered relative to the B-field peaks, thus coinciding with planes of vanishing magnetic field. At the upper surface of the conductor, where the E-field is zero, the B-field is sustained by the surface current Is. (Although Is is shown antiparallel to the standing-wave’s E-field at Δz = λ/4, in reality Is is 90° behind this E-field, reaching maximum when the E-field directly above the surface is going through zero on its way to the peak.) In the half-space above the conductor, in the absence of any electrical charges and currents, the E-field is sustained by the time-variations of the B-field (∇ × E = −∂B/∂t), and vice versa (∇ × H = ∂D/∂t). In an imperfect conductor, where conductivity is large but finite, the E- and B-fields penetrate slightly beneath the surface, producing a Lorentz force on the moving charges that comprise the surface current. While the E-field provides the current’s driving force, the magnetic component of the Lorentz force attempts to drive the surface current further down into the conductor (radiation pressure). In general, the surface current Is need not be in-phase with the penetrating E-field, since, at optical frequencies, the electrical conductivity σ is a complex number.

Figure 30.2 Normally incident plane-wave on a perfect conductor (yellow slab) induces a surface current Is, which radiates two equal-amplitude plane waves in the ±Z-directions. In the lower half-space the induced beam cancels out the incident beam. In the upper half-space, the incident and reflected beams interfere, creating standing-wave fringes of both E- and B-fields.

Elliptical aperture illuminated with plane-wave polarized along the long axis

The presence of a small (subwavelength-sized) elliptical aperture in the system of Figure 30.2 distorts the surface current Is in the vicinity of the aperture by diverting the current’s path to avoid the hole, as shown in Figure 30.3. The B-lines within the fringe immediately above the mirror surface reorient in such a way as to remain perpendicular to the lines of Is, thus bending toward the center of the aperture. The B-lines directly above the aperture, lacking support from an underlying surface current, drop into the hole on the left side and re-emerge on the right side. (The B-lines, of course, cannot break up because ∇ · B = 0 everywhere; they can only bend locally and change direction, but must remain continuous at all times.) The lines of surface current Is that begin and end on the ellipse’s sharp corners deposit electric charges around these corners, and these charges then act as sources and sinks for the E-lines in their neighborhood. Elsewhere, lack of any significant amount of charge means that the E-lines cannot break up, but rather they must twist and turn continuously as they adjust to the new environment created by the presence of the hole. The E-field in and around the aperture must be distributed in a way that would support the B-field (through the curl equations), but, because a parallel E-field cannot exist on conducting surfaces, it must also stay away from the interior walls of the hole. Figure 30.3 shows a possible way for the E-lines just above the aperture to dodge the side-walls and concentrate near the center, as they drop into the hole from above. The bundle of E-lines in the middle of the hole (parallel to the ellipse’s long axis) then acts as a source of circulating magnetic fields that wrap around the long axis (∇ × H = ∂D/∂t), thus supporting the B-field above, below, and inside the aperture.

Figure 30.3 A small elliptical aperture in the system of Figure 30.2, with its major axis parallel to the surface current Is, distorts the current distribution by diverting its path to avoid the hole. The B-lines immediately above the surface bend toward and into the aperture, without breaking up. The E-field in and around the aperture gets redistributed in a way that supports the B-field while staying away from the long side-walls of the hole. The surface currents in the vicinity of the aperture deposit opposite charges around the sharp corners of the ellipse, causing the E-lines to break up at these corners.

Figure 30.4(a) shows that, in the central XZ cross-section of the aperture, the B-lines above the aperture, without breaking up, thin down and sag toward and into the hole. Magnetic energy thus leaves the mid-section of the strong B-fringe above the hole and leaks into the hole and beyond. The behavior of the E-field in the central YZ-plane is depicted in Figure 30.4(b). Here the strong fringe, which is not immediately above the aperture but a distance of Δz = λ/4 away, is squeezed laterally toward the hole’s center, while, at the same time, leaking some of its energy into the aperture. Some of the E-lines originate or terminate on the charges deposited by the surface current Is on the sharp corners of the ellipse. (The dashed lines in Figure 30.4(b) represent the bending of the E-field out of the YZ-plane toward charges that reside on the side-walls near these sharp corners.) Note that the charge polarity is such that the E-lines above have the same direction as those inside and below the aperture.
It is important to recognize that the surface current Is lags 90° behind the E-field of the first fringe. Thus, when the E-field directly above the aperture reaches its maximum along the negative Y-axis, Is, which has been traveling in the positive Y-direction until that moment, has stopped and is beginning to reverse direction. This explains why the charges reach their maximum strength when the E-field immediately above the aperture is at a peak, and also clarifies the reasoning behind the polarity chosen for the charges in Figure 30.4(b).

Figure 30.4 (a) The B-field above the aperture of Figure 30.3, without breaking up, thins down and sags into the hole. (b) The E-field, whose strong fringe is not immediately above the aperture but a distance of Δz = λ/4 away, is squeezed toward the center of the hole, while, at the same time, leaking some of its energy into the aperture. The E-lines can originate or terminate on the charges deposited by the surface current Is on the sharp corners of the ellipse. (Dashed lines represent the bending of some of the E-field out of the YZ-plane toward charges that reside on the side-walls near these sharp corners.) Note that the charge polarity is such that the E-lines above have the same direction as those inside and below the aperture.

Aside from the incident beam, which is fixed at the outset, all other radiation in the system of Figure 30.3 is generated by the surface currents Is (and the charges deposited by Is around the sharp corners of the aperture). The same is true of the system of Figure 30.2, with its uniform current confined to the upper surface of the conductor. Any differences between the radiation fields in the systems of Figure 30.2 and Figure 30.3 must therefore arise from the difference between the two surface current distributions. Subtracting the (uniform) surface current of Figure 30.2 from that of Figure 30.3 yields the distribution sketched in Figure 30.5(a). Far from the aperture, of course, the perturbation caused by the aperture is small and the two surface currents must cancel out. In the vicinity of the aperture we find two loops of current circulating in opposite directions, as well as positive and negative charges in those regions where the divergence of the local current is non-zero. As shown in Figure 30.5(b), these circulating currents are equivalent to a pair of oppositely oriented magnetic dipoles +m and −m (i.e., a magnetic quadrupole, assuming their separation is much less than a wavelength); the charges localized on the aperture’s sharp corners give rise to an oscillating electric dipole p. Thus, adding the dipoles p and m to the system of Figure 30.2 should transform it over to the system of Figure 30.3.

Figure 30.5 (a) Surface current distribution obtained when the (uniform) surface current of Figure 30.2 is subtracted from that of Figure 30.3. Charges appear in regions where the current’s divergence is non-zero. (b) The net effect of the aperture on the uniform surface current of Figure 30.2 is the addition of an electric dipole p and two loops of current that circulate in opposite directions; each current loop is a magnetic dipole m. (c) In the vicinity of the aperture, the combined radiation pattern of the electric and magnetic dipoles consists of a circulating B-field around the major axis of the ellipse and an E-field pattern that tends to stay away from the long side-walls of the aperture.

Figure 30.5(c) shows that, in the vicinity of the aperture, the combined radiation pattern of the electric dipole and the magnetic quadrupole consists of a circulating B-field around the major axis of the ellipse, and an E-field distribution that tends to stay away from the long side-walls of the aperture. These fields, when added to the E- and B-fringes of Figure 30.2, produce the field profiles of Figures 30.3 and 30.4. The circulating magnetic field around the ellipse’s major axis in Figure 30.5(c) is responsible for the bending of the B-lines toward and into the hole, as sketched in Figures 30.3 and 30.4(a). Similarly, superposition of the E-field pattern of Figure 30.5(c) with the uniform E-fringe that exists above an apertureless mirror gives rise to the E-field pattern of Figure 30.3 in the XY-plane immediately above the aperture. In practice, the metallic film has a finite thickness, and the combined radiation by the dipole p and quadrupole m of Figure 30.5(b) must vanish within the body of the film. To this end, the magnetic dipoles may have to tilt sideways, one to the right and the other to the left, so that everywhere inside the metal film their E- and B-fields will be canceled by the corresponding fields of the electric dipole. Physically, the sideways tilt of the m dipoles is a consequence of the induced surface currents on the interior side-walls of the aperture, which currents also help to support the B-field adjacent to these side-walls; see Figure 30.4(a). All in all, the primary source of radiation through the aperture of Figure 30.3 seems to be the m quadrupole depicted in Figure 30.5(b); the induced dipole p in this system is relatively weak and plays a secondary role, namely, canceling the quadrupole’s radiation inside the metal film. In general, quadrupolar sources are weak radiators, thus accounting for the weakness of transmission through an elliptical aperture illuminated by a plane wave whose polarization direction coincides with the major axis of the ellipse.
Figure 30.6 shows computed plots of Ex, Ey, Ez in the XY-plane located 20 nm above the surface of the conductor in the system of Figure 30.3. The simulated conductor is a 124 nm-thick film of silver (n + ik = 0.226 + i6.99 at λ = 1.0 μm) having an 800 nm-long, 100 nm-wide elliptical aperture [7]. The magnitude of each field component is plotted in the top row of Figure 30.6, and the corresponding phase profile appears below it. For our purposes, the main utility of the phase distribution is to indicate the relative orientation of the various field components. For instance, suppose the phase of Ey at a given location is φ0. If the phase of Ex at that location is equal (or nearly equal) to φ0, then the vector Ex x̂ + Ey ŷ oscillates back and forth between the first and third quadrants of the XY-plane. However, if the phase of Ex hovers around φ0 ± 180°, then Ex x̂ + Ey ŷ oscillates between the second and fourth quadrants. The E-field distribution of Figure 30.6 is consistent with the qualitative behavior sketched in Figures 30.3, 30.4(b), and 30.5(c). The Ex component bends the central field lines toward the middle of the aperture, and pushes the peripheral lines farther away, thus ensuring that the long side-walls repel the parallel E-field.
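The rule for reading relative orientation from the phase plots can be illustrated with a short sketch. It samples the real, time-dependent components Ex(t) and Ey(t) over one period and reports which quadrants of the XY-plane the field vector visits. The cos(ωt − φ) time dependence, the amplitudes, and the helper name `quadrants_visited` are illustrative assumptions rather than anything taken from the simulations.

```python
import numpy as np


def quadrants_visited(amp_x, phase_x_deg, amp_y, phase_y_deg, samples=360):
    """Return the sorted list of XY-plane quadrants visited over one period.

    Each component is modeled as amp * cos(wt - phase), with positive amp.
    Samples where either component is numerically zero are skipped.
    """
    wt = np.linspace(0.0, 2.0 * np.pi, samples, endpoint=False)
    ex = amp_x * np.cos(wt - np.radians(phase_x_deg))
    ey = amp_y * np.cos(wt - np.radians(phase_y_deg))

    # Ignore instants at which the vector sits on a coordinate axis.
    keep = (np.abs(ex) > 1e-9 * amp_x) & (np.abs(ey) > 1e-9 * amp_y)

    # Quadrant numbering: 1 (+,+), 2 (-,+), 3 (-,-), 4 (+,-).
    quadrant = np.where(ex > 0, np.where(ey > 0, 1, 4), np.where(ey > 0, 2, 3))
    return sorted({int(q) for q in quadrant[keep]})


in_phase = quadrants_visited(1.0, 30.0, 2.0, 30.0)       # equal phases
out_of_phase = quadrants_visited(1.0, 210.0, 2.0, 30.0)  # phases differ by 180 degrees
quadrature = quadrants_visited(1.0, 0.0, 1.0, 90.0)      # phases differ by 90 degrees

print(in_phase, out_of_phase, quadrature)
```

Equal phases confine the vector to the first and third quadrants, a 180° difference confines it to the second and fourth, and intermediate phase differences (such as 90°) make the vector sweep through all four.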

Figure 30.6 Computed plots of Ex, Ey, Ez in an XY-plane located a short distance (Δz = 20 nm) above the surface of the conductor in the system of Figure 30.3. Top row: amplitude, bottom row: phase. The silver film is 124 nm thick, the aperture is 800 nm long and 100 nm wide, and the radiation wavelength is λ = 1 μm.

The Ey component is strengthened near the center of the aperture because the field lines are pushed upward and squeezed laterally toward the center. Finally, the Ez component confirms the presence of charges of opposite sign at and around the sharp corners of the aperture (∇ · D = ρ). These pictures are consistent with the presence of a weak electric dipole and a magnetic quadrupole in and around the elliptical aperture.

Computed amplitude and phase plots of Ey, Ez in the central YZ-plane of the aperture are shown in Figure 30.7. The bands of Ey above the aperture are the standing-wave fringes created by the interference between the incident and reflected beams. The weak nature of transmission through the aperture is evident in the very small perturbation of the fringes, as they sag ever so slightly to fill the top of the aperture. The profile of Ez, once again, confirms the accumulation of electric charges around the sharp corners of the hole. Moreover, it shows that the charges on the top facet of the metal film, while much stronger than those on the bottom facet, have the same sign as the charges on the bottom; in other words, the top and bottom charges are both positive at one end of the ellipse, and both negative at the opposite end.

Figure 30.7 Computed amplitude and phase plots of Ey, Ez in the central YZ-plane in the system of Figure 30.3. The silver film’s cross-section is indicated with dashed lines. The standing wave fringes are only slightly perturbed by the aperture.

Figure 30.8 shows plots of Hx, Hy, Hz in the XY-plane 20 nm above the surface of the conductor, while amplitude and phase plots of Hx and Hz in the central XZ-plane appear in Figure 30.9. As expected from the preceding discussion of Figures 30.3 and 30.4, the magnetic fringe nearest the surface is seen to leak into the aperture by bending the H-lines near the corners of the ellipse toward the center and down into the hole. Computed plots of Ex, Ey, Ez in the XY-plane 20 nm below the conductor are shown in Figure 30.10, and the corresponding H-field distributions appear in Figure 30.11. While the profiles of these fields confirm the behavior expected from our earlier qualitative analysis, their small magnitudes testify to the weak nature of radiation by the m quadrupole (and the accompanying p dipole) induced by the incident beam in the vicinity of the aperture of Figure 30.3.

Figure 30.8 Computed plots of Hx, Hy, Hz in the XY-plane 20 nm above the conductor surface in the system of Figure 30.3. Top row: amplitude; bottom row: phase.

Figure 30.9 Amplitude and phase plots of Hx, Hz in the central XZ-plane in the system of Figure 30.3. The silver film’s cross-section is indicated with dashed lines.

Figure 30.12 shows distributions of the magnitude |S| of the Poynting vector in various cross-sections of the system of Figure 30.3. The superimposed arrows on each plot show the projection of S in the corresponding plane [7]. For instance, in the XZ cross-section depicted in Figure 30.12(a), the arrows represent Sx x̂ + Sz ẑ, whereas in the YZ cross-section of Figure 30.12(b) the arrows correspond to the projection Sy ŷ + Sz ẑ of the Poynting vector on the YZ-plane. The plots in Figure 30.12(c) and (d) show the distributions of |S| in the XY-planes immediately above and below the aperture. In the absence of an aperture, S is essentially zero everywhere, as the reflected beam cancels out the incident beam’s energy flux. When the aperture is present, however, the fields are redistributed in such a way as to draw the incident optical energy toward the aperture. In the present case, the energy flows in from the periphery, fails to find a way through the aperture, bounces back and returns toward the source in the region directly above the aperture. In the process, several vortices are formed, where the incoming energy makes a sharp turnaround and heads back toward the source.
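The statement that S vanishes above an apertureless mirror can be verified with a small sketch of the standing wave over a perfect conductor. The geometry assumed here is a mirror at z = 0, a wave incident from above and polarized along Y, and an exp(−iωt) time convention. These choices, together with the function names, are assumptions made for illustration. The time-averaged flux is computed from the complex amplitudes as S = ½ Re(E × H*).

```python
import numpy as np

Z0 = 376.73              # free-space impedance in ohms (approximate)
E0 = 1.0                 # incident E-field amplitude in V/m
wavelength = 1.0e-6      # 1.0 um, as in the simulations of this chapter
k = 2.0 * np.pi / wavelength


def fields(z, with_reflection=True):
    """Complex amplitudes (Ey, Hx) above a perfect mirror located at z = 0.

    The incident wave travels toward -z; the reflected wave travels toward +z
    with its E-field reversed, so that the total tangential E vanishes at z = 0.
    """
    e_y = E0 * np.exp(-1j * k * z)
    h_x = (E0 / Z0) * np.exp(-1j * k * z)
    if with_reflection:
        e_y = e_y - E0 * np.exp(1j * k * z)
        h_x = h_x + (E0 / Z0) * np.exp(1j * k * z)
    return e_y, h_x


def poynting_z(e_y, h_x):
    """Z-component of 0.5 * Re(E x H*) for E along y and H along x."""
    # y-hat cross x-hat equals minus z-hat, hence the leading minus sign.
    return -0.5 * np.real(e_y * np.conj(h_x))


z = np.linspace(0.0, wavelength, 201)   # index 50 corresponds to z = wavelength / 4
e_tot, h_tot = fields(z)
s_total = poynting_z(e_tot, h_tot)
s_incident = poynting_z(*fields(z, with_reflection=False))

print("max |S_z| in the standing wave:", np.max(np.abs(s_total)))
print("S_z of the incident beam alone:", s_incident[0])
print("|E| at z = lambda/4:", abs(e_tot[50]), " |H| at z = 0:", abs(h_tot[0]))
```

In this model the incident beam alone carries a flux of E0²/(2Z0) toward the mirror, while the total field has zero net flux at every height. The same arrays show the E-field peaking at λ/4 above the surface and the H-field peaking at the surface itself.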

Figure 30.10 Computed plots of Ex, Ey, Ez in the XY-plane 20 nm below the bottom facet of the conductor in the system of Figure 30.3. Top row: amplitude; bottom row: phase.

In Figure 30.12(d) the Poynting vector S = ½ Re(E × H*) at the bottom of the hole has a magnitude |S| ≈ 2.5 × 10⁻⁶ W/m², consistent with the transmitted E-field of 0.06 V/m and H-field of 2.3 × 10⁻⁴ A/m, considering the large phase difference of Δφ ≈ 70° between the E- and H-fields near the bottom of the aperture; see Figures 30.10 and 30.11. The incident plane-wave is assumed to have Ey = 1.0 V/m and Hx = Ey/Z0 = 2.65 × 10⁻³ A/m (free-space impedance Z0 ≈ 377 Ω), which correspond to an incident power density of 1.32 × 10⁻³ W/m². The power transmission efficiency η at the center of the elliptical aperture of Figure 30.3 is therefore seen to be just under 0.2%. We will see in the next section that when the incident polarization is rotated 90° (to point along the minor axis of the ellipse), the transmission efficiency through the aperture increases to η ≈ 93%, a nearly 500-fold improvement.

Figure 30.11 Computed plots of Hx, Hy, Hz in the XY-plane 20 nm below the bottom facet of the conductor in the system of Figure 30.3. Top row: amplitude; bottom row: phase.

Elliptical aperture illuminated with plane-wave polarized along the short axis

When the incident E-field is parallel to the minor axis of an elliptical aperture, the surface currents Is deposit charges at and around the long side-walls of the aperture, as shown in Figure 30.13. These oscillating charges radiate as an electric dipole flanked by a pair of magnetic dipoles, creating circulating magnetic fields around the ellipse’s minor axis that push the incident B-lines upward and sideways. In the area surrounding the hole, the E-field produced by these dipoles bends the Is lines toward the mid-section of the aperture, as shown in Figure 30.13, and as required for self-consistency. Aside from the incident beam, all the radiation in the system of Figure 30.13 is produced by the surface currents Is and the charges created by these currents. Subtracting the (uniform) surface current in the system of Figure 30.2 from that in Figure 30.13 thus yields the current distribution of Figure 30.14(a), which is responsible for the difference between the radiation patterns in the two systems. When added to the uniform current of Figure 30.2, the currents of Figure 30.14(a) produce the Is pattern shown in Figure 30.13. The current loops of Figure 30.14(a) are equivalent to a pair of oppositely oriented magnetic dipoles, +m and −m, while the charges deposited on the long sides of the aperture constitute an electric dipole p; see Figure 30.14(b). Figure 30.14(c) shows that, in the XY-plane immediately above the aperture, the E-field is dominated by the electric dipole p.

Figure 30.12 Profiles of the magnitude |S| of the Poynting vector in various cross-sections of the system of Figure 30.3. The superimposed arrows show the projection of S in the corresponding plane. (a) Central XZ-plane. (b) Central YZ-plane. (c, d) XY-planes located 20 nm above and below the aperture.
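The transmission-efficiency figures quoted for the long-axis case, whose Poynting-vector profiles appear in Figure 30.12, can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text. The estimate ½|E||H|cos(Δφ) additionally assumes that the transmitted E- and H-fields are mutually perpendicular, which is an assumption made here for illustration; the variable names are likewise illustrative.

```python
import math

Z0 = 377.0                      # free-space impedance in ohms, as quoted

# Incident plane wave.
e_inc = 1.0                     # V/m
h_inc = e_inc / Z0              # A/m
s_inc = 0.5 * e_inc * h_inc     # time-averaged incident power density, W/m^2

# Transmitted field values read near the bottom of the aperture.
e_tx = 0.06                     # V/m
h_tx = 2.3e-4                   # A/m
dphi = math.radians(70.0)       # phase difference between E and H

# Estimate of |S| assuming E and H are perpendicular to each other.
s_tx_estimate = 0.5 * e_tx * h_tx * math.cos(dphi)

# Efficiency based on the magnitude quoted from Figure 30.12(d).
s_tx_quoted = 2.5e-6            # W/m^2
efficiency = s_tx_quoted / s_inc

# Ratio of the short-axis efficiency (93%) to the long-axis efficiency.
improvement = 0.93 / efficiency

print(f"H_inc = {h_inc:.3e} A/m, S_inc = {s_inc:.3e} W/m^2")
print(f"estimated |S| at the aperture = {s_tx_estimate:.2e} W/m^2")
print(f"efficiency = {100.0 * efficiency:.2f} %, improvement factor = {improvement:.0f}")
```

The estimate from the field amplitudes and phase difference lands close to the quoted 2.5 × 10⁻⁶ W/m², and the resulting efficiency and improvement factor agree with the "just under 0.2%" and "nearly 500-fold" statements.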

The contribution of the magnetic dipoles is to enhance the E-field at the center of the aperture, while weakening it in the outer regions. Figure 30.14(d) shows that in the XY-plane directly above the aperture, the B-field profile is shaped by competition between the electric dipole p and the magnetic dipoles m. The electric dipole dominates near the center but, further

435

30 Light transmission through small elliptical apertures Y Is

+ Z

–

– +

+

–

B

X

Figure 30.13 When the incident E-field is parallel to the minor axis of an elliptical aperture, the surface currents Is deposit charges at and around the long side-walls of the aperture. These oscillating charges radiate as an electric dipole flanked by a pair of magnetic dipoles, creating circulating magnetic fields around the ellipse’s minor axis that push the incident B-lines upward and sideways.

away, the magnetic dipoles dictate the B-field’s behavior. The dotted B-lines near the sharp corners of the ellipse in Figure 30.14(d) show the field leaving the XY-plane to enter/exit the hole vertically (i.e. in the Z-direction). Although not shown in this figure, B-lines that enter the hole from above close the loop by circling beneath the metal film and returning through the hole to reconnect with the B-lines above the film; see Figure 30.15. The surface charges and currents of Figure 30.14(a) create magnetic fields in the free-space regions inside the hole as well as those above and below the metal surface. The B-field of the electric dipole p combines with that of the magnetic dipoles m to produce closed loops in the vicinity of the aperture, as shown in Figure 30.15. The solid B-lines in this figure bulge above and below the metal surface, while the dashed lines hug the conductor’s top and bottom surfaces. (The B-field cannot penetrate into the conductor, but, as it emerges from the hole, it bends above and below the surface in such a way as to bring the field lines close to the metallic surface.) In all cases, the lines of B must form closed loops to guarantee the divergence-free nature of the field. Since neither E- nor B-fields can exist within the conductor, the fields radiated by the electric dipole p must cancel out those of the magnetic dipoles m everywhere inside the metallic medium. The radiation emanating from these dipoles, however, permeates the interior of the hole as well as the free-space regions on both sides of the conductor. To get in and out of the hole, the B-lines of Figure 30.15 appear to descend through one of the current loops that constitutes a magnetic dipole in Figure 30.14(b), then return through the other loop. Note the change of direction of the magnetic field at the upper surface of the elliptical aperture: the direction of B just above the hole is

436

Classical Optics and its Applications (b)

(a) Y

Is –

– – –

+

+

+ +

– – –

+

m

-m p

+ + X (d)

(c)

B

E –– –

+ + +

Figure 30.14 (a) Surface currents and the accompanying charge distribution produced by the elliptical aperture of Figure 30.13. When added to the uniform current of Figure 30.2, these currents produce the Is pattern shown in Figure 30.13. (b) The current loops in (a) are equivalent to a pair of magnetic dipoles, m, while the charges deposited on opposite sides of the aperture constitute an electric dipole p. (c) In the XY-plane immediately above the aperture, the E-field is dominated by the electric dipole p. (d) In the XY-plane immediately above the aperture, the B-field profile is shaped by competition between the electric dipole p and the magnetic dipoles m. Dotted B-lines near the sharp corners of the ellipse show the B-field leaving the XY-plane to enter/exit the hole.

opposite to that beneath the hole’s upper surface. This 180 phase shift, dictated by the presence of the (uniform) Is on the top surface of the elliptical aperture in Figure 30.14(a), will disappear when the fringes of Figure 30.2 are added to the fields produced by p, m, and –m to yield the total field in and around the aperture. The induced electric charges on the surfaces surrounding the aperture produce an oscillating E-field in the short gap between the long side-walls as well as in the regions immediately above and below the aperture. The time rate of change of


Figure 30.15 With reference to Figure 30.14(b), the B-field of the electric dipole p combines with that of the magnetic dipoles m to produce closed loops in and around the aperture. The solid B-lines bulge above and below the metal film, while the dashed B-lines hug the conductor’s top and bottom surfaces.

this field, ∂D/∂t, which is equivalent to an electric current density J across the gap, creates circulating magnetic fields around the short axis of the ellipse.4 These B-fields by themselves, however, are not sufficient to explain the field profile depicted in Figure 30.15, and must be augmented by the fields produced by the circulating currents around the ellipse’s sharp corners (i.e., the m dipoles) to yield a complete picture. Moreover, inside the metallic medium, the E- and B-fields of the p dipole cannot vanish without the compensating contributions of the m dipoles. Figure 30.16 shows cross-sections of the system of Figure 30.13 in YZ- and XZ-planes. Since Is lags 90° behind the incident E-field immediately above the aperture, the accumulating charges on and around the side-walls produce electric fields opposite in direction to the incident E-field. The E-lines may now start on positive charges and end on negative charges (∇ · D = ρ), as shown in Figure 30.16(a). This change of direction of the E-field causes a 180° phase shift in Ey from above to below the aperture. The E-fringe just above the aperture thus becomes weaker, sharing some of its energy with the E-field inside and below the aperture. The XZ cross-section of the system of Figure 30.13 depicted in Figure 30.16(b) shows how the oscillating electric dipole p and magnetic dipoles m push the B-fringe above the aperture upward and sideways to make room for circulating B-fields that surround the short axis of the elliptical aperture. The resulting redistribution of the magnetic energy of the B-fringe above the hole thus makes it


Figure 30.16 Cross-sections of the system of Figure 30.13 in YZ- and XZ-planes. (a) The charges accumulating on the aperture’s side-walls produce an E-field opposite in direction to the incident field. The lines of E may now start on positive charges and end on negative charges. (b) The dipoles p, m, and – m of Figure 30.14(b) push the B-fringe above the aperture upward and sideways to make room for circulating B-fields that surround the short axis of the elliptical aperture.

possible for some of the energy stored in this fringe to leak into the hole as well as the space below the hole. (The B-field distribution inside the aperture and in the region below the metal film is the same as that in Figure 30.15, since the added fringes contribute only to the half-space above the conductor.) The divergence-free nature of the B-lines requires their continuity, which is evident in Figure 30.16(b), in contrast to the E-lines of Figure 30.16(a), which break up whenever they meet electrical charges. Figure 30.17 shows computed plots of Ex, Ey, Ez in the XY-plane 20 nm above the surface of the conductor in the system of Figure 30.13 (top row: magnitude; bottom row: phase). As before, the 124 nm-thick silver film used in these simulations has n + ik = 0.226 + i6.99 at λ = 1.0 μm, and the ellipse’s diameters along its major and minor axes are 800 nm and 100 nm, respectively.7 The strong

Figure 30.17 Computed plots of Ex, Ey, Ez in the XY-plane 20 nm above the conductor’s surface in the system of Figure 30.13. Top row: amplitude; bottom row: phase. The silver film is 124 nm thick, the aperture is 800 nm long and 100 nm wide, and the radiation wavelength is λ = 1 μm. The aperture boundaries are indicated with dashed lines.

z-component of E indicates the presence of significant amounts of electric charge on the conducting surfaces in the vicinity of the hole; the sign-reversal of Ez from one side of the hole to the other shows that the charges on the two sides have opposite signs. Figure 30.18, left panel, shows the amplitude and phase of Ey in the central XZ-plane, while the right panel shows Ey, Ez in the central YZ-plane. Inside and below the aperture Ey is seen to be strong, and to have reversed direction relative to the E-field immediately above the aperture; its energy appears to have been extracted from the E-fringe directly above the hole. The distribution of Ez shows, once again, the presence of electric charges on the top and bottom surfaces of the conductor; these charges have the same sign on the top and bottom surfaces on either side of the hole, but their sign is reversed in going from the left-side to the right-side. Computed plots of Hx, Hy, Hz in the XY-plane 20 nm above the conductor’s surface appear in Figure 30.19. Figure 30.20, left panel, depicts the amplitude and


Figure 30.18 (Left) amplitude and phase of Ey in the central XZ-plane; (right) amplitude and phase of Ey, Ez in the central YZ-plane in the system of Figure 30.13. The silver film’s cross-section is indicated with dashed lines. The fringes in the two panels are differently colored because the color scale for Ey in the YZ-plane has been greatly expanded by two (barely visible) hot spots on the sidewalls near the bottom of the hole.

phase of Hx, Hz in the central XZ cross-section, while the right panel shows the distribution of Hx in the central YZ-plane. The magnetic field’s behavior in these pictures is in accord with the qualitative behavior sketched in Figure 30.16(b). Note, in particular, that the profile of Hz in Figure 30.20 resembles the z-component of the circulating B-field in Figure 30.16(b). Note also the draining of magnetic energy out of the B-fringe above the hole, and its redistribution not only in the form of magnetic fields inside and below the aperture, but also in the enhanced values of Hx directly above the conductor’s surface. Plots of Ex, Ey, Ez in the XY-plane 20 nm below the bottom surface of the conductor are shown in Figure 30.21, and the corresponding magnetic-field plots appear in Figure 30.22. These pictures are in full agreement with the qualitative diagrams of Figures 30.14–30.16. Figure 30.23 shows distributions of the magnitude |S| of the Poynting vector in various cross-sections of the system of Figure 30.13. The superimposed arrows on each plot show the projection of S in the corresponding plane.7 For instance, in


Figure 30.19 Computed plots of Hx, Hy, Hz in the XY-plane 20 nm above the surface of the conductor in the system of Figure 30.13. Top row: amplitude; bottom row: phase. The aperture boundaries are indicated with dashed lines.

the XZ cross-section depicted in (a) the arrows represent Sx x̂ + Sz ẑ, whereas in the YZ cross-section of (b) the arrows correspond to the projection of the Poynting vector on the YZ-plane, namely, Sy ŷ + Sz ẑ. The plots in Figures 30.23(c) and (d) show the distributions of |S| in the XY-planes 20 nm above and below the aperture. In the absence of an aperture, S is essentially zero everywhere, as the reflected beam cancels out the incident beam’s energy flux. When the aperture is present, however, the fields are redistributed in such a way as to draw the incident optical energy toward the aperture. The energy flows in from the region directly above as well as from the periphery of the hole in every direction. In addition to the straight-ahead energy, some of the peripheral energy also goes through the hole, thus enhancing the overall transmission. Further away from the aperture, especially in the YZ-plane (which contains the ellipse’s short axis), the peripheral incoming energy turns away from the aperture and returns to the source. The magnitude of the Poynting vector in the center at the bottom of the hole is |S| ≈ 1.23 × 10⁻³ W/m², which is consistent with the transmitted E- and H-fields of 1.6 V/m and 2.14 × 10⁻³ A/m, with a phase difference Δφ = φE – φH ≈ 45° (see Figures 30.21 and 30.22). The transmission efficiency of the optical power


Figure 30.20 (Left) amplitude and phase of Hx, Hz in the central XZ-plane; (right) amplitude and phase of Hx in the central YZ-plane in the system of Figure 30.13. The silver film’s cross-section is indicated with dashed lines.

density at the center of this aperture is thus η ≈ 93%, which is nearly 500 times greater than that obtained when the incident polarization was parallel to the ellipse’s long axis. (η is the ratio of |S| at the aperture’s center just below the conductor to the incident plane-wave’s optical power density, |Sinc| ≈ 1.32 × 10⁻³ W/m².) Several factors appear to have contributed to this strong performance (compared to the case of parallel polarization), among them, more electrical charges and stronger surface currents (especially on the bottom facet of the conductor), and a greater separation between the m magnetic dipoles, which tend to cancel each other out when they are close together.

Concluding remarks

We have analyzed the transmission of light through a small elliptical aperture in a thin silver film at λ = 1.0 μm. Both cases of incident polarization parallel and perpendicular to the major axis of the ellipse were considered. The transmission efficiency η was found to be low for parallel polarization and high for perpendicular polarization.
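As a rough cross-check of the numbers quoted for the perpendicular-polarization case, the sketch below recomputes the time-averaged Poynting magnitude from the transmitted field amplitudes (1.6 V/m, 2.14 × 10⁻³ A/m, 45° phase difference) and compares it with the incident power density. It assumes that the two field components are mutually perpendicular and that the incident plane wave has an amplitude of 1 V/m in vacuum; the latter is an inference from the quoted incident power density, not something the chapter states explicitly. The function name is illustrative only.

```python
import math

Z0 = 376.730313668  # impedance of free space, in ohms


def time_averaged_poynting(e_amplitude, h_amplitude, phase_difference_deg):
    """Time-averaged |S| = 0.5 |E| |H| cos(phase difference), for perpendicular E and H."""
    return 0.5 * e_amplitude * h_amplitude * math.cos(math.radians(phase_difference_deg))


# Transmitted fields quoted in the text (center of the aperture, just below the film).
s_transmitted = time_averaged_poynting(1.6, 2.14e-3, 45.0)

# Assumption: unit-amplitude (1 V/m) incident plane wave in vacuum.
s_incident = 1.0 ** 2 / (2.0 * Z0)

eta_from_fields = s_transmitted / s_incident   # efficiency recomputed from the field values
eta_quoted = 1.23e-3 / 1.32e-3                 # efficiency from the quoted |S| values

print(f"|S| from fields : {s_transmitted:.3e} W/m^2")
print(f"|S_inc| (1 V/m) : {s_incident:.3e} W/m^2")
print(f"eta from fields : {eta_from_fields:.1%}, eta from quoted |S|: {eta_quoted:.1%}")
```

Under these assumptions the recomputed value lands within a few percent of the quoted one, which is the level of agreement one would expect from amplitudes and a phase difference read off color-coded plots.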


Figure 30.21 Computed plots of Ex, Ey, Ez in the XY-plane 20 nm below the bottom facet of the conductor in the system of Figure 30.13. Top row: amplitude; bottom row: phase.

The hallmark of the low-transmission case was a weak excitation of electric and magnetic dipoles on the upper surface of the metal film, which produced even weaker excitations on the lower surface. Although not described here, we have observed similar behavior for a circular aperture (diameter = 100 nm, silver film thickness = 124 nm, η = 0.06% at the center of the aperture 20 nm below the conductor), and also for an infinitely long, 100 nm-wide slit (η = 0.14% at the center of the slit 36 nm below the bottom facet; incident polarization parallel to the slit). For the elliptical hole under low-transmission conditions, η drops rapidly with an increasing film thickness h, from 0.2% at h = 124 nm, to 0.008% at h = 186 nm, and to below 0.001% at h = 248 nm. It appears that the elliptical hole, when considered as a waveguide,8,9 does not support any guided mode whose E-field is predominantly aligned with the ellipse’s long axis. The high-transmission ellipse revealed the excitation of fairly strong electric and magnetic dipoles on the upper surface of the metal film, which induced even stronger dipoles on the film’s lower facet. In this case η remains high for thicker films as well (η = 93% for h = 124 nm, 86% for h = 186 nm,


Figure 30.22 Computed plots of Hx, Hy, Hz in the XY-plane 20 nm below the bottom facet of the conductor in the system of Figure 30.13. Top row: amplitude; bottom row: phase.

and 136% for h = 248 nm), indicating propagation through the hole (along the Z-axis) of a guided mode whose E-field is largely parallel to the ellipse’s short axis. We also found that an infinitely long, 100 nm-wide slit exhibits strong transmission for an incident polarization aligned with the narrow dimension of the slit (η ≈ 69% at the center of the slit, 36 nm below a 124 nm-thick silver film). It thus appears that achieving a large η requires an aperture that can excite strong oscillator(s) on the upper surface of the film, which would then induce strong oscillations on the lower facet, thereby creating the conditions for the passage of a substantial amount of electromagnetic energy through the subwavelength opening in the metal film. The ability of a hole (or slit) to support a guided mode that can be excited by the incident polarization appears to be critical for achieving large transmission, especially for thicker films. Recent reports of various aperture designs that have significant throughputs (compared with simple circular or square-shaped apertures)3,10 indicate that the aforementioned principles, far from being specific to elliptical holes in thin metal films, have a broad range of application.
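The rapid fall of η with film thickness under low-transmission conditions (0.2% at h = 124 nm, 0.008% at h = 186 nm) is what one would expect if the field inside the hole decays exponentially, as in a waveguide below cutoff. The sketch below is only a plausibility check under that single-exponential assumption, which the chapter suggests but does not state as a formula; it fits a decay constant to the two quoted values and extrapolates to h = 248 nm.

```python
import math

# Quoted low-transmission efficiencies (percent) at two film thicknesses (nm).
thickness_nm = (124.0, 186.0)
eta_percent = (0.2, 0.008)

# Assumption: eta(h) ~ exp(-alpha * h), i.e. a single evanescent (cutoff) mode.
alpha_per_nm = math.log(eta_percent[0] / eta_percent[1]) / (thickness_nm[1] - thickness_nm[0])
decay_length_nm = 1.0 / alpha_per_nm

# Extrapolate one more 62 nm step, to h = 248 nm.
eta_248_percent = eta_percent[1] * math.exp(-alpha_per_nm * (248.0 - thickness_nm[1]))

print(f"power decay constant: {alpha_per_nm:.4f} per nm (1/e length {decay_length_nm:.1f} nm)")
print(f"extrapolated eta at 248 nm: {eta_248_percent:.5f} %")
```

The extrapolated value falls below 0.001%, in line with the figure quoted for h = 248 nm, so the three data points are at least compatible with a simple evanescent decay.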

Figure 30.23 Profiles of the magnitude |S| of the Poynting vector in various cross-sections of the system of Figure 30.13. The superimposed arrows show the projection of S in the corresponding plane. (a) Central XZ-plane. (b) Central YZ-plane. (c, d) XY-planes located 20 nm above and below the aperture.

References for Chapter 30

1 M. Born and E. Wolf, Principles of Optics, seventh edition, Cambridge University Press, UK (1999).
2 H. A. Bethe, Theory of diffraction by small holes, Physical Review 66, 163 (1944).
3 T. Thio, K. M. Pellerin, R. A. Linke, H. J. Lezec, and T. W. Ebbesen, Enhanced light transmission through a single subwavelength aperture, Optics Letters 26, 1972–1974 (2001).
4 J. D. Jackson, Classical Electrodynamics, second edition, Wiley, New York (1975).
5 A. Taflove and S. C. Hagness, Computational Electrodynamics, Artech House, Norwood, MA (2000).
6 Jin Au Kong, Electromagnetic Wave Theory, EMW Publishing, Cambridge, MA (2000).
7 The computer simulations reported in this chapter were performed by Sim3D_Max, a product of MM Research, Inc., Tucson, AZ.
8 J. A. Porto, F. J. Garcia-Vidal, and J. B. Pendry, Transmission resonances on metallic gratings with very narrow slits, Phys. Rev. Lett. 83, No. 14, 2845–48 (1999).
9 Q. Cao and P. Lalanne, Negative role of surface plasmons in the transmission of metallic gratings with very narrow slits, Phys. Rev. Lett. 88, No. 5, 57403 (2002).
10 X. Shi, L. Hesselink, and R. L. Thornton, Ultrahigh light transmission through a C-shaped nanoaperture, Optics Letters 28, No. 15, 1320–22 (2003).

31 The method of Fox and Li

The electromagnetic fields within a waveguide or a resonator cannot have arbitrary distributions. The requirements of satisfying Maxwell’s equations as well as the boundary conditions specific to the waveguide (or the resonator) confine the distribution to certain shapes and forms. The electromagnetic field distributions that can be sustained within a device are known as its stable modes of oscillation.1,2 When the device and its geometry are simple, the stable modes can be determined analytically. For complex systems and complicated geometries, however, numerical methods must be used to solve Maxwell’s equations in the presence of the relevant boundary conditions. The method of Fox and Li is an elegant numerical technique that can be applied to certain waveguides and resonators in order to obtain the operating mode of the device. Instead of solving Maxwell’s equations explicitly, the method of Fox and Li uses the Fresnel–Kirchhoff diffraction integral to mimic the physical process of wavefront propagation within the device, thus arriving at its stable mode of operation after several iterations.3,4 To illustrate the method of Fox and Li we focus our attention on the confocal resonator shown in Figure 31.1(a). Let us assume that the two mirrors are aberration-free parabolas with an effective numerical aperture NA = 0.01 and focal length f = 62 500λ0 (λ0 is the vacuum wavelength of the light confined within the cavity). The clear aperture of each mirror will therefore have a diameter of 1250λ0. (For the HeNe wavelength of λ0 = 0.633 μm, for example, this resonator will be a filament 8 cm long and 0.8 mm wide.) The resonator of Figure 31.1(a) may be modeled as the periodic-lens waveguide depicted in Figure 31.1(b). The beam starts at the focal plane of the first lens, becomes collimated, reaches the second lens, is focused by the second lens, and the process repeats itself over and over again.
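The dimensions quoted above for the confocal resonator follow from simple arithmetic, sketched here. Two assumptions are made that the text does not spell out: the small-angle relation D ≈ 2·f·NA for the clear aperture, and a mirror separation of 2f (the two focal points coinciding at the center of the cavity). The variable names are illustrative.

```python
# Resonator parameters from the text, with lengths in vacuum wavelengths.
focal_length_waves = 62_500
numerical_aperture = 0.01
wavelength_um = 0.633  # HeNe vacuum wavelength in micrometers

# Assumption: small-angle relation between aperture diameter, focal length and NA.
aperture_diameter_waves = 2 * focal_length_waves * numerical_aperture

# Assumption: confocal geometry, so the mirrors are separated by 2f.
cavity_length_cm = 2 * focal_length_waves * wavelength_um * 1e-4   # micrometers -> cm
aperture_diameter_mm = aperture_diameter_waves * wavelength_um * 1e-3  # micrometers -> mm

print(f"clear aperture: {aperture_diameter_waves:.0f} wavelengths = {aperture_diameter_mm:.2f} mm")
print(f"cavity length : {cavity_length_cm:.2f} cm")
```

With these assumptions the cavity comes out at about 7.9 cm long and 0.79 mm wide, matching the rounded figures of 8 cm and 0.8 mm given in the text.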
The essence of the method of Fox and Li for the computation of the stable mode within this cavity may now be described as follows. An initial distribution is propagated through the periodic-lens waveguide until a steady-state

Figure 31.1 (a) Schematic diagram of a confocal optical resonator consisting of two parabolic mirrors. The mirrors are identical, each having numerical aperture NA and focal length f. (b) A periodic-lens waveguide that can be used to simulate the behavior of the resonator.

distribution is reached. In the steady state the shape of the complex-amplitude distribution within the cavity will no longer change with successive iterations, but its power content will decline at a constant rate due to losses in the cavity. These ideas may best be explained by several examples.

The lowest-order mode

The mode of the cavity that is easiest to obtain by the method of Fox and Li is the lowest-order mode. Typically, just about any arbitrary initial distribution that one picks will converge to the lowest-order mode. From a practical standpoint this is very useful, because the lowest-order mode is also the mode in which the resonator operates, under most practical conditions. In Figure 31.2(a) we show a uniform initial distribution within a fairly large circular aperture. After going through about 80 iterations, this distribution settles into the mode known as the 0,0 mode of the cavity and shown in Figures 31.2(b)–(d). The 0,0 mode is essentially Gaussian in character, although, as the logarithmic plot of intensity in Figure 31.2(c) and the phase plot of Figure 31.2(d) show, it has an oscillating tail. The oscillation is caused by the finite apertures of the mirrors, which truncate the ideal, Gaussian mode.

Figure 31.2 Computing the 0,0 mode of the resonator shown in Figure 31.1 using the method of Fox and Li. (a) The assumed initial distribution, having uniform amplitude and constant phase across a wide circular aperture. (b) Computed intensity distribution at the mid-plane of the cavity, obtained after 80 iterations. (c) Same as (b) but showing the logarithm of intensity on a scale of 1 (white) to 10⁻⁵ (black). (d) Distribution of the phase in the mid-plane of the cavity corresponding to the steady-state intensity distribution shown in (b) and (c). In this picture a white pixel represents a +180° phase angle, a black pixel represents a −180° phase angle, and the gray pixels represent the continuum of values in between.

For the above simulation a plot of the power attenuation coefficient γ versus iteration number is shown in Figure 31.3. γ is the ratio of the optical power contained in the beam after a given iteration to the same quantity before the iteration. Its complement, 1 − γ, thus represents, for the particular mode under consideration, the fractional loss of the cavity during one round trip of the beam. The steady-state value of γ is also related to the eigenvalue of the mode under consideration; the mode itself is an eigenfunction of the cavity. In the present example, where the steady-state value of γ is 0.97, the losses for the lowest-order mode (about 3% per iteration) are indeed very small.
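The iteration described above can be sketched in a few lines. The example below is a deliberately simplified stand-in, not the simulation behind Figures 31.2 and 31.3: it treats a one-dimensional (strip-mirror) symmetric confocal cavity, for which the mirror-to-mirror Fresnel–Kirchhoff integral reduces to a finite Fourier transform, and it counts one mirror-to-mirror transit as one iteration. The Fresnel number of 0.5, the sampling, and the function name are all assumptions chosen for illustration, so the resulting γ should not be expected to reproduce the value of 0.97 quoted in the text.

```python
import numpy as np


def fox_li_confocal_strip(fresnel_number=0.5, samples=256, iterations=200):
    """Fox-Li iteration for a 1D symmetric confocal resonator (illustrative model)."""
    # Normalized mirror coordinate xi = x/a on [-1, 1], sampled at midpoints.
    d_xi = 2.0 / samples
    xi = (np.arange(samples) + 0.5) * d_xi - 1.0

    # Mirror-to-mirror kernel: in a confocal cavity the mirror curvature cancels
    # the quadratic phase terms, leaving a band-limited Fourier kernel.
    # A constant phase factor is omitted; it does not affect the power ratio.
    kernel = (np.sqrt(fresnel_number) * d_xi
              * np.exp(-2j * np.pi * fresnel_number * np.outer(xi, xi)))

    field = np.ones(samples, dtype=complex)  # uniform amplitude, constant phase
    gammas = []
    for _ in range(iterations):
        power_before = np.sum(np.abs(field) ** 2)
        field = kernel @ field                 # propagate to the opposite mirror
        power_after = np.sum(np.abs(field) ** 2)
        gammas.append(power_after / power_before)
        field /= np.sqrt(power_after)          # renormalize so only the shape evolves
    return xi, field, np.array(gammas)


xi, mode, gammas = fox_li_confocal_strip()
print(f"steady-state gamma per transit: {gammas[-1]:.4f}")
```

Renormalizing after every pass is a convenience: it keeps the numbers well scaled while leaving the ratio γ, and hence the loss estimate 1 − γ, unaffected.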

Figure 31.3 Evolution of the power attenuation coefficient γ during the simulation that led to the 0,0 mode shown in Figure 31.2. The computation stabilizes after about 80 iterations, and the steady-state value of γ is close to 0.97.

Higher-order modes

Although the method of Fox and Li is ideally suited for computation of the lowest-order mode of the cavity, under special circumstances (and sometimes with the aid of special tricks) it is possible to compute some of the higher-order modes as well. As an example, consider the initial distribution shown in Figure 31.4(a), which consists of four identical lobes, each having the same uniform intensity distribution. Although not shown, it is also assumed that the phase is 0° for the pair of lobes along one diagonal and 180° for the opposite pair. The stage is thus set for excitation of the so-called 1,1 mode of the cavity. Figures 31.4(b)–(d) show the computed 1,1 mode obtained from the initial distribution of Figure 31.4(a) after 64 iterations. The plot of attenuation coefficient γ versus iteration number shown in Figure 31.5 reveals that the steady-state is reached after only about 40 iterations and that the final value of γ is 0.87. That this value of γ is less than that for the 0,0 mode is consistent with the observation that the 1,1 mode is more spread out and, therefore, must suffer higher truncation losses at the apertures of the mirrors. For comparison with the steady-state distribution, two of the intermediate distributions obtained in this simulation are shown in Figure 31.6. In this figure the intensity plots appear on the left-hand side and the corresponding log(intensity) plots appear on the right-hand side. The patterns in Figures 31.6(a), (b) are obtained after 6 and 17 iterations, respectively.

Figure 31.4 Computation of the 1,1 mode of the confocal resonator in Figure 31.1 begins with the initial distribution shown in (a). Here the four lobes of the initial pattern have uniform and equal intensities, but the phase of each lobe (not shown) differs from that of its adjacent lobes by 180°. The steady-state distribution in the mid-plane of the cavity is obtained after 64 iterations. (b) Plot of the intensity distribution in the steady state. (c) The same as (b) but showing the logarithm of intensity on a scale of 1 (white) to 10⁻⁴ (black). (d) The distribution of phase in the steady state. (For a description of the gray-scale see the caption to Figure 31.2(d).)

Another example of a high-order mode is shown in Figure 31.7. Here the starting distribution of Figure 31.7(a) has eight lobes, each having the same uniform amplitude; the phase of the adjacent lobes alternates between 0° and 180°. After 16 iterations the distribution of Figures 31.7(b)–(d) is obtained. Although this is very close to one of the high-order modes of the cavity, the simulation does not converge at this point, but continues to evolve towards the 0,0 mode. Figure 31.8, which is the corresponding plot of attenuation coefficient γ versus iteration number, clearly demonstrates the situation. Although after about 20 iterations the simulation appears to be stabilizing, small numerical errors disturb the system and push it away from the high-order mode. We confirmed that the

Figure 31.5 Evolution of the power attenuation coefficient γ during the simulation that led to the 1,1 mode shown in Figure 31.4. The computation stabilizes after about 40 iterations, and the final value of γ is approximately 0.87.

steady-state distribution in this case was the same as the 0,0 mode of Figure 31.2; also notice that the steady-state value of γ in Figure 31.8 is 0.97, in agreement with our previous estimate of γ for the lowest-order mode. To get an idea as to how the pattern in Figure 31.7 reconfigures itself to resemble that of the 0,0 mode, we show in Figure 31.9 an intermediate state obtained from the initial state of Figure 31.7(a) after 33 iterations. Notice that some of the lobes have moved towards the center and have begun to merge, giving rise to a central bright spot which, thanks to its lower losses, will eventually overtake the higher-order mode.

Effect of misalignments and aberrations

One of the great advantages of the method of Fox and Li is that in the presence of misalignments and other imperfections, when analytical methods become intractable, this numerical scheme continues to be effective in calculating the stable mode of the resonator. As an example consider the same resonant cavity as that of Figure 31.1(a), but now suppose that one of the mirrors has two waves of primary coma. The results of computer simulations pertaining to this case are shown in Figures 31.10 and 31.11. Note that the stable mode in this case is a somewhat elongated version of the 0,0 mode, exhibiting a comatic tail. Note also that the steady-state attenuation coefficient γ is slightly reduced from its value in the unaberrated case.

Figure 31.6 Intermediate patterns of intensity distribution in the cavity’s midplane during the computation of the 1,1 mode shown in Figure 31.4. For each intensity plot on the left-hand side the corresponding logarithmic plot is shown on the right-hand side. The scale of the logarithmic plots is from 1 (white) to 10⁻² (black). (a) After six iterations; (b) after 17 iterations.

Figure 31.7 Computed results for a high-order mode of the confocal resonator shown in Figure 31.1. The assumed initial distribution in the cavity’s mid-plane has eight lobes of uniform amplitude, as shown in (a), but its phase distribution (not shown) alternates between 0° and 180° from lobe to adjacent lobe. (b) Intensity distribution in the cavity’s mid-plane after 16 iterations. (c) Same as (b), but showing the logarithm of intensity on a scale of 1 (white) to 10⁻³ (black). (d) Distribution of phase in the mid-plane of the cavity after 16 iterations, corresponding to the intensity patterns in (b) and (c). (For a description of the gray-scale see the caption to Figure 31.2(d).)

Figure 31.8 Evolution of the power attenuation coefficient γ during the simulation that started with the distribution of Figure 31.7(a) and went through the state shown in Figures 31.7(b)–(d). At first the simulation appears to stabilize with γ around 0.73, but instability sets in after about 25 iterations, forcing the system towards the 0,0 mode and a value of γ ≈ 0.97.
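The drift from the eight-lobe pattern towards the 0,0 mode can be understood with a simple mode-competition estimate. The sketch below assumes that the field after n iterations can be expanded in the cavity's eigenmodes ψ_m with per-iteration amplitude eigenvalues σ_m, so that γ_m = |σ_m|²; the symbols σ_m, ψ_m, and c_m are notation introduced here for illustration, not taken from the chapter. Any small admixture c_0 of the 0,0 mode (from numerical error or an imperfect starting pattern) then grows relative to the high-order component c_h as

```latex
u_n \;=\; \sum_m c_m\,\sigma_m^{\,n}\,\psi_m ,
\qquad \gamma_m = |\sigma_m|^2 ,
\qquad
\frac{\bigl|c_0\,\sigma_0^{\,n}\bigr|}{\bigl|c_h\,\sigma_h^{\,n}\bigr|}
 \;=\; \left|\frac{c_0}{c_h}\right| \left(\frac{\gamma_0}{\gamma_h}\right)^{n/2}
 \;\approx\; \left|\frac{c_0}{c_h}\right| \left(\frac{0.97}{0.73}\right)^{n/2}
 \;\approx\; \left|\frac{c_0}{c_h}\right| (1.15)^{\,n} .
```

With the two steady-state values of γ read from Figure 31.8, the relative amplitude of the 0,0 contamination grows by about 15% per iteration, or roughly a factor of ten every 16 iterations, which is why a nearly stabilized high-order pattern cannot persist for long unless the low-order component is suppressed exactly.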

Figure 31.9 Distributions of (a) intensity, (b) log (intensity), and (c) phase at the cavity’s mid-plane after a total of 33 iterations, starting in the initial state of Figure 31.7(a). This is a snap-shot from an intermediate state in the simulation whose other results are depicted in Figures 31.7 and 31.8. Note that four of the lobes have moved towards the center and started to merge into a bright central spot. This is the spot that will eventually become the dominant 0,0 mode.

Figure 31.10 Computing the lowest-order mode of the confocal resonator of Figure 31.1 when one of the mirrors has two waves of primary coma. The assumed initial distribution has uniform amplitude and constant phase across a wide, circular aperture, as shown in Figure 31.2(a). (a) Computed intensity distribution at the mid-plane of the cavity, obtained after 64 iterations. (b) Same as (a) but showing the logarithm of intensity on a scale of 1 (white) to 10⁻⁵ (black). (c) Distribution of phase in the mid-plane of the cavity corresponding to the steady-state intensity distribution shown in (a) and (b). (For a description of the gray-scale see the caption to Figure 31.2(d).)

Figure 31.11 Evolution of the power attenuation coefficient γ during the simulation that led to the stable mode shown in Figure 31.10. The computation stabilizes after about 30 iterations, and the steady-state value of γ is close to 0.95.

References to Chapter 31

1 A. E. Siegman, An Introduction to Lasers and Masers, McGraw-Hill, New York (1971).
2 H. Kogelnik and T. Li, Laser beams and resonators, Proc. IEEE 54, 1312 (1966).
3 A. G. Fox and T. Li, Resonant modes in a maser interferometer, Bell Syst. Tech. J. 40, 453 (1961).
4 A. G. Fox and T. Li, Modes in a maser interferometer with curved and tilted mirrors, Proc. IEEE 51, 80 (1964).

32 The beam propagation method†

The beam propagation method (BPM) is a simple numerical algorithm for simulating the propagation of a coherent beam of light through a dielectric waveguide (or other structure).1 Figure 32.1 shows the split-step technique used in the BPM, in which the diffraction of the beam and the phase-shifting action of the guide are separated from each other in repeated sequential steps, of separation Δz. One starts a BPM simulation by defining an initial cross-sectional beam profile in the XY-plane. The beam is then propagated (using classical diffraction formulas) a short distance Δz along the Z-axis before being sent through a phase/amplitude mask. The properties of the mask are derived from the cross-sectional profile of the waveguide (or other structure) in which the beam resides. The above steps of diffraction followed by transmission through a mask are repeated until the beam reaches its destination or until one or more excited modes of the guide become stabilized.2,3 Instead of propagating continuously along the length of the guide, the beam in BPM travels for a short distance in a homogeneous isotropic medium, which has the average refractive index of the guide but lacks the guide’s features (e.g., core, cladding, etc.). After this diffraction step, a phase/amplitude mask is introduced in the beam path. To account for the refractive index profile of the guide, the mask must phase-shift certain regions of the beam relative to others. The mask must also adjust the beam’s amplitude distribution to simulate the effects of regions that absorb or amplify, when the guide happens to contain such regions. A good approximation to the real physical situation is obtained in the limit when Δz → 0 and the phase/amplitude modulation imparted by the mask is scaled in proportion to Δz. In practice, the BPM works quite well without the need to make Δz excessively small. The various examples presented in this chapter should make the capabilities of the BPM abundantly clear.
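One common way to write a single split step, given here as a sketch rather than as the specific formulation used for the simulations in this chapter, is to carry out the diffraction step in the spatial-frequency domain and follow it with a multiplicative mask. The symbols below are assumptions introduced for illustration: u is the complex amplitude, n̄ the average refractive index, n(x, y) the local (possibly complex) index, and F the two-dimensional Fourier transform over x and y.

```latex
u(x,y,z+\Delta z) \;\approx\;
M(x,y)\;\mathcal{F}^{-1}\!\Bigl\{
  \exp\!\Bigl[\,i\,\Delta z\,\sqrt{k^{2}-k_x^{2}-k_y^{2}}\;\Bigr]\,
  \mathcal{F}\bigl\{u(x,y,z)\bigr\}
\Bigr\},
\qquad k=\frac{2\pi\bar{n}}{\lambda_0},
\qquad
M(x,y)=\exp\!\Bigl[\,i\,\frac{2\pi}{\lambda_0}\,\bigl(n(x,y)-\bar{n}\bigr)\,\Delta z\Bigr].
```

The exponent of the mask is proportional to Δz, as the text requires; a positive imaginary part of n(x, y) attenuates the beam and a negative one amplifies it.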

The coauthors of this chapter are Ewan M. Wright and Mahmoud Fallahi of the College of Optical Sciences, University of Arizona.

[Figure 32.1 diagram: a beam travelling along the Z-axis through alternating "Diffraction" steps of length Δz and "Mask" planes; axes X, Y, Z.]

Figure 32.1 The split-step technique used in the BPM. Instead of continuously propagating the beam in an inhomogeneous environment, the method alternates between diffracting the beam a short distance through a homogeneous medium and then modulating its phase/amplitude through a mask. The mask imparts to the incident beam the cumulative effect of phase shifts and amplitude attenuations (or amplifications) during each propagation step.
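The alternation of diffraction and masking described above can be sketched in a few lines of code. The following is only a minimal one-dimensional illustration (a slab guide rather than the two-dimensional cross-sections treated in this chapter), written with the Python standard library. All function names, the cosine shape of the absorber taper, the Gaussian starting profile, and the mesh parameters are assumptions introduced for the example; lengths are in units of the wavelength λ inside the medium.

```python
import cmath
import math

def fft(a, inverse=False):
    """Recursive radix-2 FFT, unnormalized; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even, odd = fft(a[0::2], inverse), fft(a[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out

def diffract(field, dx, dz):
    """Propagate a distance dz in a homogeneous medium (angular spectrum).
    All lengths are in units of the wavelength inside the medium."""
    n = len(field)
    spec = fft(field)
    for k in range(n):
        fx = (k if k < n // 2 else k - n) / (n * dx)  # cycles per wavelength
        arg = 1.0 - fx * fx
        if arg >= 0.0:   # propagating plane wave: pure phase factor
            spec[k] *= cmath.exp(2j * math.pi * dz * math.sqrt(arg))
        else:            # evanescent component: exponential decay
            spec[k] *= math.exp(-2.0 * math.pi * dz * math.sqrt(-arg))
    return [v / n for v in fft(spec, inverse=True)]

def make_mask(n, dx, core_half, clad_half, phase_deg, taper):
    """Complex transmission: phase advance in the core, plus an absorber whose
    transmission falls to zero at |x| = clad_half (cosine taper is an assumption)."""
    mask = []
    for j in range(n):
        x = abs((j - n // 2) * dx)
        phase = math.radians(phase_deg) if x <= core_half else 0.0
        if x <= clad_half - taper:
            amp = 1.0
        elif x >= clad_half:
            amp = 0.0
        else:
            amp = 0.5 * (1.0 + math.cos(math.pi * (x - (clad_half - taper)) / taper))
        mask.append(amp * cmath.exp(1j * phase))
    return mask

def bpm(field, mask, dx, dz, steps):
    """Alternate diffraction over dz with multiplication by the mask."""
    for _ in range(steps):
        field = diffract(field, dx, dz)
        field = [f * m for f, m in zip(field, mask)]
    return field

def power(field, dx, half_width=float("inf")):
    """Integrated intensity over |x| <= half_width."""
    n = len(field)
    return sum(abs(f) ** 2 for j, f in enumerate(field)
               if abs((j - n // 2) * dx) <= half_width) * dx

# Hypothetical demo: 256-point mesh, 0.25λ pixels, Δz = 2.5λ, 7.5° core phase per step.
N, DX, DZ = 256, 0.25, 2.5
start = [cmath.exp(-((j - N // 2) * DX / 3.0) ** 2) for j in range(N)]
guide = make_mask(N, DX, core_half=2.5, clad_half=15.0, phase_deg=7.5, taper=5.0)
out = bpm(start, guide, DX, DZ, steps=40)
print("fraction of power left in core:", power(out, DX, 2.5) / power(start, DX))
```

Setting `phase_deg` to zero turns the same loop into free diffraction inside an absorbing frame, which gives a convenient comparison for seeing the guiding action of the core.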

Single-mode step-index fiber

Figure 32.2 shows a basic setup for injecting a laser beam into an optical fiber. The Gaussian beam of the laser diode is captured and truncated by the lens, then focused onto the entrance facet of the fiber. The focused beam typically loses about 4% of its power to reflection at the cleaved facet, but the remaining 96% enters the fiber. A fraction of this optical energy couples into the fiber’s propagating mode and travels along the axis in the Z-direction; the rest radiates away from the axis and disappears in the region beyond the cladding.

For a 0.2NA diffraction-limited lens, Figure 32.3(a) shows the computed intensity profile of the focused spot in the XY-plane immediately after entering the fiber. The average refractive index n of the silica glass fiber is assumed to be 1.5, and so the wavelength λ of the light in this medium is λ0/n, where λ0 is the vacuum wavelength. The light amplitude distribution of Figure 32.3(a) serves in this example as the initial distribution for the BPM. To simulate a single-mode step-index fiber, we use the phase mask of Figure 32.3(b) and choose Δz = 2.5λ. The assumed core and cladding diameters are 5λ and 30λ, respectively, and the index difference of Δn = ncore − nclad = 0.0125 results in a 3° phase shift per distance λ of propagation. The mask depicted in Figure 32.3(b) is therefore required to advance the phase by 7.5° in its core region (relative to the cladding) during each BPM step. The light amplitude distribution inside the fiber reaches a steady state after a few hundred iterations; the intensity profile shown in Figure 32.3(c) is obtained at z = 1500λ. The mesh used in this simulation had 512 × 512 pixels, and the entire computation on a modern personal computer took less than one hour.
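The phase values quoted above follow directly from Δn, the average index, and the step length, as the short calculation below shows. The function names are hypothetical. The V-number check additionally assumes that the core and cladding indices lie symmetrically about the average index of 1.5, and uses the standard single-mode condition V < 2.405 for a step-index fiber.

```python
import math

def mask_phase_deg(delta_n, n_avg, dz_in_wavelengths):
    """Core-to-cladding phase advance (degrees) accumulated over one step.
    dz is given in units of the wavelength inside the medium (λ = λ0 / n_avg)."""
    dz_over_lambda0 = dz_in_wavelengths / n_avg
    return 360.0 * delta_n * dz_over_lambda0

def v_number(core_radius_in_wavelengths, delta_n, n_avg):
    """Normalized frequency V = (2π a / λ0) sqrt(n_core² − n_clad²).
    Assumes n_core and n_clad sit symmetrically about n_avg."""
    n_core = n_avg + delta_n / 2.0
    n_clad = n_avg - delta_n / 2.0
    a_over_lambda0 = core_radius_in_wavelengths / n_avg
    return 2.0 * math.pi * a_over_lambda0 * math.sqrt(n_core ** 2 - n_clad ** 2)

print("phase per λ (deg):   ", mask_phase_deg(0.0125, 1.5, 1.0))
print("phase per step (deg):", mask_phase_deg(0.0125, 1.5, 2.5))
print("V-number:            ", v_number(2.5, 0.0125, 1.5))
```

Under these assumptions V comes out a little above 2, below the 2.405 cutoff, which is consistent with the fiber being described as single-mode.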

[Figure 32.2 diagram: laser diode, lens, and fiber arranged along the Z-axis; axes X, Y, Z.]

Figure 32.2 The emergent beam from a semiconductor laser diode is captured by a lens and focused onto the cleaved facet of a fiber. The numerical aperture of the focused cone of light is NA = sin θ, where θ is the half-angle of the cone of light arriving at the fiber. A small fraction of the incident beam is typically lost by reflection from the facet, while the remaining light penetrates the fiber, entering the core and cladding. Depending on the modal structure of the fiber and the cross-sectional profile of the injected beam, a certain fraction of the input optical power is coupled into the guided mode, which will then propagate along the fiber’s axis. The remaining (uncoupled) light radiates away from the core and is lost in the surrounding regions.
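Two numbers in this setup are easy to check: the half-angle of a 0.2NA cone and the roughly 4% facet reflection. The sketch below uses the normal-incidence Fresnel formula as an approximation for the shallow focused cone, and takes the surrounding medium to be air (n = 1); both are assumptions of the example, as are the function names.

```python
import math

def half_angle_deg(na):
    """Half-angle θ of the focused cone, from NA = sin θ (cone in air assumed)."""
    return math.degrees(math.asin(na))

def fresnel_reflectance_normal(n1, n2):
    """Power reflectance at normal incidence between media of index n1 and n2."""
    return ((n1 - n2) / (n1 + n2)) ** 2

print("cone half-angle (deg):", half_angle_deg(0.2))
print("facet reflectance:    ", fresnel_reflectance_normal(1.0, 1.5))
```

For n = 1.5 the reflectance is (0.5/2.5)² = 0.04, in line with the 4% loss and 96% transmission quoted for the cleaved facet.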

The fraction of the beam that radiates away from the core during a BPM simulation should not be allowed to reach the mesh boundary. The reason is that the periodic boundary condition imposed on the mesh by the fast Fourier transform (FFT) algorithm used in diffraction calculations tends to return the radiation modes into the computational region, via aliasing. In the present simulation we solved this problem by our choice of the mask, which, in addition to a core and cladding, contains a strongly absorbing region beyond the cladding (transmission coefficient zero for r > 15λ). The transition from cladding to absorber is tapered to minimize back-reflections into the core and cladding.

Figure 32.4(a) is a cross-sectional plot of the stabilized light amplitude distribution in the fiber; the vertical broken lines mark the core–cladding boundary. This guided mode is essentially trapped in the core but has evanescent tails in the cladding. The significant penetration of the evanescent waves into the cladding is a direct consequence of the small index-contrast Δn chosen for this particular example.2,3 Figure 32.4(b) shows the power content of the beam versus z throughout the simulation. (The power at any given point along the Z-axis is obtained by integrating the beam’s intensity over the entire cross-sectional area in the XY-plane.) The power of the beam, which is set to unity at the entrance pupil of the lens (see Figure 32.2), drops to 0.96 immediately after the beam enters the fiber. It then drops rapidly until z ≈ 1000λ while the beam adjusts itself to

[Figure 32.3: panels (a)–(c); horizontal axis x/λ from −20 to 20.]

Figure 32.3 (a) Logarithmic plot of intensity distribution immediately after the front facet of a silica glass fiber, produced by a laser beam of wavelength λ0 focused through a 0.2NA lens. (b) Phase mask used in the BPM simulation of a single-mode step-index fiber. The core and cladding diameters are 5λ and 30λ, respectively, and the phase shift imparted by the core relative to that imparted by the cladding is 3° per λ; here λ is the wavelength inside the fiber. The region beyond the cladding is absorptive. (c) Logarithmic plot of intensity distribution in the cross-section of the fiber at a distance z = 1500λ from the fiber’s front facet.

the fiber, shedding some of its energy by radiating into the absorption region. The slight decline of optical power for z > 1000λ is partly due to the slow decay of the radiative modes. Nonetheless, the curve in Figure 32.4(b) continues to exhibit a small negative slope even after a long propagation distance. This

[Figure 32.4: (a) normalized amplitude (0.00 to 1.00) versus x/λ from −15 to 15; (b) power (0.80 to 1.00) versus propagation distance z/λ from 0 to 5000.]

Figure 32.4 (a) Computed amplitude of the stable mode along the X-axis for the single-mode fiber of Figures 32.2 and 32.3 (the vertical lines denote the boundary between core and cladding). The mode stabilizes at about z = 1500λ and does not change afterwards. (b) Computed optical power along the length of the fiber when the incident power at the front facet is set to unity.

behavior is indicative of the presence of a small loss factor despite the fact that the simulated guide is lossless. So long as the diffraction step is treated nonparaxially, this small loss factor, which is a consequence of the discrete approximation to an inherently continuous problem, will remain an unavoidable feature of the BPM.3

Fiber with a complex core structure

Figure 32.5 shows the cross-section of a special fiber with a core that contains 19 low-index filaments symmetrically arranged around its axis. Such structures, known as photonic crystals or photonic bandgap structures, currently command worldwide attention because of their unique optical properties. For our BPM simulation of the fiber depicted in Figure 32.5 we chose core and cladding diameters of 50λ and 100λ, respectively, and an index contrast of Δn = 0.0125, while placing a tapered absorber outside the cladding region. The core filaments each had diameter 4λ and the same index of refraction as the cladding. A uniform beam, having a circular cross-section of diameter 40λ and unit optical power, was used as the initial distribution. Figure 32.6 shows the various cross-sectional patterns of intensity obtained along a propagation path 5000λ long. It is clear that several modes of the fiber have been excited and that interference among these modes gives rise to the observed patterns. Note also that the light tends to avoid the low-index filaments at all times. The beam’s power content, plotted versus z in Figure 32.7, is seen to decline slowly with the

[Figure 32.5: phase mask; horizontal axis x/λ from −60 to 60.]

Figure 32.5 Phase mask used in simulating a special fiber containing 19 low-index filaments within its core. The core and cladding radii are 25λ and 50λ, respectively, and the region beyond the cladding is absorptive. The filaments each have a diameter of 4λ and the same refractive index as the cladding. All transition regions are tapered. The phase shift imparted to the beam by the high-index regions of the core (relative to the low-index cladding and the filaments) is 3° per λ.

propagation distance. The power stabilizes when the radiative modes leave the core and cladding to disappear into the surrounding absorber; however, a small negative slope similar to the one mentioned in connection with Figure 32.4(b) remains even after stabilization.

Y-branch beam-splitter

Figure 32.8 is a diagram of a Y-branch channel waveguide. This structure is typically embedded in a lower-index medium that plays the role of cladding.2 The beam, injected on the left-hand side, establishes in the initial section a guided mode that propagates along the Z-axis; in Figure 32.8 the length of this initial section of the guide is z1. The waveguide then slowly opens up over a distance z2, and the beam follows this expansion adiabatically (i.e., without significant loss of power and without exciting higher-order modes). Once the guide has been sufficiently broadened, it splits into two channels that slowly recede from each other over a distance z3 until they are optically isolated. Afterwards the two channels may remain parallel for a distance z4. Thus a beam injected into the initial section of the guide will split in two, each of which may be extracted from a separate channel. A set of phase masks for simulating a symmetric Y-splitter is shown in Figure 32.9(a). From top to bottom these masks represent the initial section of

[Figure 32.6: panels (a)–(i); horizontal axis x/λ from −30 to 30.]

Figure 32.6 Computed intensity profiles in the core region of the fiber of Figure 32.5 obtained in a BPM simulation. The assumed split steps are of length Δz = 2.5λ. The initial distribution is a uniform circularly symmetric beam of diameter 40λ, which enters the fiber at z = 0. The distributions in (a) to (i) correspond to propagation distances z/λ = 500, 1250, 1750, 2500, 3250, 3750, 4000, 4500, 5000.

the guide (5λ × 5λ square), the end of the expanded region in which the width of the mask increases to 10λ, and the length along the split section where the center-to-center separation of the two channels slowly increases from zero to 45λ (branching angle = 1.15°). The assumed lengths of the various sections are z1 = z2 = 1000λ and z3 = z4 = 2000λ (see Figure 32.8). Each mask imparts a 15° phase shift to the incident beam after a propagation step of Δz = 5λ; this corresponds to a 3° phase shift per λ. For the initial distribution at z = 0 we chose a uniform beam having a circular cross-section of diameter 14λ. The output of the device at z = 6000λ is shown in Figure 32.9(b). The intensity profile in the broad section of the guide at z = 2000λ (just before branching) is shown in Figure 32.9(c), while a plot of the phase distribution at the same location appears in Figure 32.9(d).
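The bookkeeping for this simulation (total length, number of split steps, phase per step, and branching angle) can be tabulated as follows. The names are hypothetical. The quoted 1.15° is reproduced when the channel separation grows by 40λ over z3, for instance if two adjoining 5λ-wide channels start with their centers 5λ apart; that reading is an assumption of this sketch, since a growth from 0 to 45λ over the same length corresponds to about 1.29°.

```python
import math

# Section lengths of the Y-splitter in units of λ (values from the text).
SECTIONS = {"z1": 1000, "z2": 1000, "z3": 2000, "z4": 2000}
STEP = 5.0                 # Δz in units of λ
PHASE_PER_LAMBDA = 3.0     # degrees per λ, for Δn = 0.0125 and n = 1.5

total_length = sum(SECTIONS.values())
n_steps = total_length / STEP
phase_per_step = PHASE_PER_LAMBDA * STEP

def branching_angle_deg(sep_start, sep_end, length):
    """Full angle between two straight channels whose center-to-center
    separation changes from sep_start to sep_end over the given length."""
    return math.degrees(math.atan((sep_end - sep_start) / length))

print("total length (λ):", total_length)
print("number of steps: ", n_steps)
print("phase per step:  ", phase_per_step, "deg")
print("angle, 0 -> 45λ: ", branching_angle_deg(0.0, 45.0, SECTIONS["z3"]))
print("angle, 5 -> 45λ: ", branching_angle_deg(5.0, 45.0, SECTIONS["z3"]))
```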

[Figure 32.7: power (0.90 to 1.00) versus propagation distance z/λ from 0 to 5000.]

Figure 32.7 Total power of the beam versus propagation distance along the length of the fiber for the BPM simulation depicted in Figure 32.6. The incident power at z = 0 is set to unity.

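[Figure 32.8 diagram: incident beam entering the Y-branch guide from the left; successive sections of length z1, z2, z3, z4.]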

Figure 32.8 Core region of a Y-branch channel waveguide, which is typically embedded in a lower-index cladding. For adiabatic operation the initial section of the guide (length z1) is slowly broadened over a distance z2 before splitting into two branches. The branches then move apart at a small angle (typically less than 1°) over a distance z3 until they are optically isolated from each other. Afterwards the two channels remain parallel for a distance z4.

The phase plot indicates the propagation of the radiative modes away from the core and into the absorbing region beyond the cladding.

Figure 32.10 shows the computed amplitude distributions at several cross-sections of the Y-splitter of Figure 32.8. Evident in this picture is the evolution of the guided mode from a narrow beam in the initial section of the guide to a wider beam in the broadened section, and onwards to a pair of well-confined beams in the divided channel. The power content of the entire beam is plotted versus z in Figure 32.11, indicating the losses in various sections of the guide and confirming the stabilization of power in the output channels once they are sufficiently separated from each other.

[Figure 32.9: panels (a)–(d); horizontal axes x/λ spanning −30 to 30 and −12 to 12.]

Figure 32.9 (a) A set of phase masks used to simulate the Y-splitter of Figure 32.8. From top to bottom: at the start of the guide; at the end of the expanded region; at three locations in the split section. (b) The computed intensity pattern at the end of the guide, z = 6000λ, showing two output beams confined to their respective channels. (c), (d) Intensity and phase distributions in the broad section of the guide, just before branching. In the phase plot the remnants of the incident beam, which are not coupled into a guided mode, are seen to be radiating away from the core.

Directional coupler

Figure 32.12 shows a channel waveguide known as a directional coupler, which has applications such as switching in optical communication systems.4 A beam of light injected into channel 1 propagates along that channel until it reaches a point where channel 2 is close enough to sense the evanescent tail of the guided mode. At this point the beam leaks into channel 2 and, after a certain distance, moves entirely into the second channel. If the parallel section of the guide is long enough, the back and forth coupling between the two channels may be repeated many times. In this region of strong coupling, the lowest-order modes of the guide are the even and odd modes depicted in Figure 32.12. Because these modes travel at different speeds, their relative phase φ1 − φ2 varies with distance along

[Figure 32.10: amplitude (0.00 to 0.15) versus x/λ from −40 to 40; (a) z = 0, (b) z = 1000λ and z = 2000λ, (c) z = 6000λ.]

Figure 32.10 Plots of amplitude distribution at various cross-sections of the Y-splitter depicted in Figures 32.8 and 32.9. (a) The initial distribution at z = 0 is uniform, having a circular cross-section of diameter 14λ. (b) At z = 1000λ (solid line), the end of the single-mode input channel, the beam is confined to the core region. At z = 2000λ (broken line), just before branching, the broadened beam is seen to fit into the wider channel. (c) Emerging from the guide at z = 6000λ are two identical beams.

the guide. The beam resides entirely in channel 1 or channel 2 when φ1 − φ2 is 0° or 180°. When φ1 − φ2 = 90° the two channels contain equal amounts of light, albeit with a 90° relative phase. Eventually the two channels recede from each other, and the beam stays in the guide in which it was residing just before the separation.
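The dependence of the channel powers on the relative phase of the even and odd modes can be illustrated with a two-element model. The sketch below is an idealization, not the BPM calculation itself. It assumes that the two modes are excited with equal weight, that the even mode has equal amplitudes in the two channels, and that the odd mode has equal and opposite amplitudes. The function name is hypothetical.

```python
import cmath
import math

def channel_amplitudes(rel_phase_deg):
    """Complex amplitudes in channels 1 and 2 for an equal-weight superposition
    of the even mode (+1, +1) and the odd mode (+1, -1), with the odd mode
    carrying the given phase relative to the even mode."""
    odd = cmath.exp(1j * math.radians(rel_phase_deg))
    a1 = (1.0 + odd) / 2.0
    a2 = (1.0 - odd) / 2.0
    return a1, a2

for phase in (0.0, 90.0, 180.0):
    a1, a2 = channel_amplitudes(phase)
    print(f"relative phase {phase:5.1f} deg: "
          f"P1 = {abs(a1) ** 2:.3f}, P2 = {abs(a2) ** 2:.3f}")
```

In this model the powers are cos² and sin² of half the relative phase, and at 90° the two channel amplitudes are (1 + i)/2 and (1 − i)/2, which have equal magnitude and differ in phase by 90°, as stated in the text.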

[Figure 32.11: power (0.4 to 1.0) versus propagation distance z/λ from 0 to 6000.]

Figure 32.11 Power content of the beam versus z/λ in the BPM simulation of the Y-splitter depicted in Figures 32.9 and 32.10. The arrows indicate the beginning of the split section, the end of the split section, and the location where the split channels stop receding from each other and become parallel.

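[Figure 32.12 diagram: channels 1 and 2 along the Z-axis, with the even and odd mode profiles plotted against X; labeled dimensions: guide width 5, separations 9 and 3, angles 0.23°, section lengths 750, 1500, 7750, 1500 (lengths in units of λ).]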

Figure 32.12 A directional coupler allows the coupling of light between adjacent waveguides. Each guide’s cross-section in this example is a 5λ × 5λ square, and both channels are embedded in a cladding of lower refractive index. Between z = 0 and z = 750λ the separation between the guides is fixed at 9λ; it then decreases continuously to 3λ by z = 2250λ, and remains fixed at 3λ until z = 10⁴λ. In the coupling region, where the guides are close together, the lowest-order modes of the two channels (taken together as a single waveguide) are the displayed even and odd modes.
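The geometry given in this caption can be cross-checked against the dimensions marked in the figure. The short calculation below assumes that the 0.23° label refers to the relative inclination of the two guides while their separation changes from 9λ to 3λ, and that the four section lengths marked along Z are 750λ, 1500λ, 7750λ, and 1500λ; both are readings of the figure rather than statements from the text, and the names are hypothetical.

```python
import math

# Section lengths along Z in units of λ, as read from Figure 32.12.
SECTION_LENGTHS = [750, 1500, 7750, 1500]

def relative_angle_deg(sep_start, sep_end, length):
    """Angle between the guides while their separation changes linearly."""
    return math.degrees(math.atan(abs(sep_end - sep_start) / length))

approach_end = SECTION_LENGTHS[0] + SECTION_LENGTHS[1]    # separation reaches 3λ
coupling_end = approach_end + SECTION_LENGTHS[2]          # separation leaves 3λ

print("approach ends at z/λ =", approach_end)
print("coupling ends at z/λ =", coupling_end)
print("relative angle (deg) =", relative_angle_deg(9.0, 3.0, SECTION_LENGTHS[1]))
```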

[Figure 32.13: panels (a) and (b); horizontal axis x/λ from −25 to 25.]

Figure 32.13 (a) Phase masks used in the BPM simulation of the directional coupler of Figure 32.12. Each mask, which consists of a pair of 5λ × 5λ square apertures, imparts a phase shift of 7.5° to the beam at the end of each propagation step of Δz = 2.5λ. (b) Logarithmic plot of the intensity distribution created by a 0.2NA lens at the entrance to channel 1. The wavelength λ is that inside the cladding material, and the effect of losses incurred upon reflection from the front facet of the waveguide is included in this picture.

Figure 32.13(a) displays the phase masks used in our BPM simulation of the directional coupler shown in Figure 32.12. The initial intensity distribution produced by a 0.2NA lens at the entrance to channel 1 is shown in Figure 32.13(b). Figure 32.14 shows several intensity plots at various cross-sections of the guide, demonstrating the transfer of light between the two channels. Figure 32.15 is a plot of the phase distribution at a location along the guide where the two channels carry equal amounts of optical power; the plot indicates the existence of a 90° phase difference between the two channels. Figure 32.16 shows the amplitude distributions at several cross-sections of the guide. Figure 32.17 shows the power content of each channel versus z, indicating several oscillations of the power between the two channels.

[Figure 32.14: horizontal axis x/λ from −15 to 15.]

Figure 32.14 Computed plots of intensity distribution at various cross-sections along the directional coupler depicted in Figures 32.12 and 32.13. From top to bottom: z/λ = 750, 2250, 2500, 2750, 3250, 3750, 4000, 4250.

Multimode interference device

Figure 32.18 is a diagram of a multimode interference (MMI) device used as a three-way power splitter. This device consists of an input channel, a wide (multimode) section, and three output channels. The single-mode input guide carries the incident beam to the multimode region, where the beam suddenly expands, exciting the various modes of the broad waveguide. The ensuing interference among these modes creates periodic patterns of intensity and phase

[Figure 32.15: phase distribution; horizontal axis x/λ from −15 to 15.]

Figure 32.15 The computed phase distribution at z = 4000λ in the BPM simulation of the directional coupler depicted in Figure 32.14, showing a 90° phase difference between the two channels at this location.

[Figure 32.16: amplitude (0.00 to 0.25) versus x/λ from −15 to 15, with curves labeled z = 750, 2250, 2500, and 3250.]

Figure 32.16 Plots of the light amplitude distribution at various cross-sections of the directional coupler depicted in Figures 32.12–32.15. At z = 750λ (solid line) a single guided mode is established in channel 1. At z = 2250λ (broken and double dotted line) the two channels have come close to each other, and some of the light has already leaked into channel 2. At z = 2500λ (dotted line) the power contents of the two channels are nearly equal. At z = 3250λ (broken line) the beam has all but moved into channel 2.

[Figure 32.17: power (0.00 to 1.00) in channel 1 and channel 2 versus propagation distance z/λ from 0 to 10000.]

Figure 32.17 Power content versus z for each channel in the directional coupler of Figure 32.12. The incident focused laser beam loses about 4% of its power upon entering the front facet of the guide, and another 14% while establishing itself in channel 1. When the two channels slowly approach each other, the total power in the guide does not change appreciably, but it begins to couple out of channel 1 and into channel 2. By the time the separation of the channels has reached 3λ, a fraction of the beam already resides in channel 2. The oscillation of optical power between the two channels continues as long as they remain close to each other. The arrows at the top of the figure mark the locations of the intensity plots of Figure 32.14.

at specific locations along the Z-axis. This behavior is reminiscent of the Talbot effect, and in fact its explanation rests on the same principles (see Chapter 24, “The Talbot effect”). At a particular distance L from the port of entry, the beam breaks up into several bright spots of equal intensity. If access channels are placed at this location they carry away the resulting isolated beams.5,6

Figure 32.19 shows the phase masks used in the BPM simulation of the MMI device depicted in Figure 32.18. At the top of the figure is the cross-section of the 5λ × 5λ input channel (length 750λ), in the middle is the 45λ × 5λ multimode section of the guide (length 3000λ), and at the bottom are the cross-sections of

[Figure 32.18 diagram: MMI device with axes X, Y, Z; multimode section of length L, width W, and thickness D.]

Figure 32.18 In a multimode interference (MMI) device the beam carried by a single-mode channel suddenly expands into a broad, multimode section of length L, width W, and thickness D. The many modes of the broad waveguide thus excited propagate at different speeds along the Z-axis, their interference giving rise to complex patterns of intensity distribution confined within the guide’s cross-section in the XY-plane. Access guides placed at the end of the multimode section carry away the concentrated optical energy localized in isolated bright spots at z = L.

[Figure 32.19: phase masks; horizontal axis x/λ from −30 to 30.]

Figure 32.19 Phase masks used in the BPM simulation of the 1 × 3 splitter depicted in Figure 32.18. Each mask imparts a phase shift of 7.5° to the beam at the end of each propagation step of Δz = 2.5λ.

the three 5λ × 5λ output channels (length 1250λ). Computed intensity profiles at several cross-sections of this device are shown in Figure 32.20. Depending on the distance from the port of entry, the guide’s width W, and the wavelength λ, interference among the excited modes can give rise to a number of different intensity patterns. In the present example, the chosen parameters of the multimode section (L = 3000λ, W = 45λ) result in a three-way splitting of the input optical power. The computed intensity pattern at the end of the output channels appears in the bottom frame of Figure 32.20.

[Figure 32.20: horizontal axis x/λ from −30 to 30.]

Figure 32.20 Computed plots of intensity distribution in the MMI device of Figures 32.18 and 32.19, showing (from top to bottom) the single-mode beam in the input channel just before entering the multimode section at z = 0, the distribution of light in the multimode region at z/λ = 250, 750, 1125, 1775, 2250, and at the end of the multimode section at z/λ = 3000. The bottom frame shows the intensity distribution emerging from the three output channels. (The initial distribution entering the input channel at z = −750λ was uniform and had a circular cross-section of diameter 10λ.)

References for Chapter 32

1 M. D. Feit and J. A. Fleck, Computation of mode properties in optical fiber waveguides by the propagating beam method, Applied Optics 19, 1154 (1980); Analysis of rib waveguides and couplers by the propagating beam method, J. Opt. Soc. Am. A 7, 73–79 (1990).
2 T. Tamir, ed., Guided-wave Optoelectronics, 2nd edition, Springer-Verlag, Berlin, 1990.
3 D. Marcuse, Theory of Dielectric Optical Waveguides, 2nd edition, Academic Press, New York, 1991.
4 C. R. Pollock, Fundamentals of Optoelectronics, R. D. Irwin, Chicago, 1995.
5 O. Bryngdahl, Image formation using self-imaging techniques, J. Opt. Soc. Am. 63, 416–419 (1973).
6 R. Ulrich, Image formation by phase coincidences in optical waveguides, Optics Communications 13, 259–264 (1975).

33 Launching light into a fiber

A typical single-mode silica glass fiber has a mode profile that is well approximated by a Gaussian beam. At λ = 1.55 μm, this Gaussian mode has a (1/e² intensity) diameter of 10 μm. One method of launching light into a fiber calls for placing the polished end of the fiber in contact with (or close proximity to) the polished end of another, signal-carrying fiber that has a matching mode profile. Alternatively, a coherent beam of light may be focused directly onto the polished end of the fiber. If the focused spot is well aligned with the fiber’s core and has the same amplitude and phase distribution as the fiber’s mode profile, then the launched mode will carry the entire incident optical power into the fiber. In general, however, the focused spot is neither perfectly matched to the fiber’s mode, nor is it completely aligned with the core. Under these circumstances, only a certain fraction of the incident optical power will be launched into the fiber. The numerical value of this fraction, commonly referred to as the coupling efficiency, will be denoted by η throughout this chapter. It is well-known that the strength of the launched mode may be computed by evaluating the overlap integral between the mode profile and the (complex) light amplitude distribution that arrives at the polished facet of the fiber.1,2,3 The problem of computing the coupling efficiency η is thus reduced to determining the light amplitude distribution immediately in front of the fiber. In what follows, we will evaluate the performance and tolerances of three different lenses designed for coupling a collimated beam of light into a single-mode fiber.

Radial GRIN lens

The first lens to be studied is a radial gradient-index (GRIN) lens, shown schematically in Figure 33.1. This lens has plane surfaces on its front and rear sides, which may be antireflection coated to reduce ordinary reflection losses at both facets. The lens diameter = 3.0 mm, its length L = 7.89 mm, and its

[Figure 33.1 diagram: collimated incident beam, GRIN lens of length L, and attached fiber.]

Figure 33.1 Radial gradient-index (GRIN) lens designed to focus a collimated beam of light into a single-mode fiber attached to its rear facet. In our simulations the lens has diameter = 3.0 mm and length L = 7.89 mm. The single-mode silica glass fiber has a Gaussian mode profile with 1/e² (intensity) diameter of 10 μm.

refractive index profile n(r) = n0[1 − q(r/rmax)²], where n0 = 1.5901, q = 0.04455, rmax = 1.5 mm. The lens is permanently affixed to a single-mode fiber whose guided mode diameter (at the 1/e² intensity point) is 10 μm. A collimated Gaussian beam, having radius R0 (at the 1/e² intensity point) and some wavefront distortion, is incident on the front facet of the lens. Figure 33.2, top row, shows cross-sectional plots of intensity, log intensity, and phase for this λ = 1.55 μm beam arriving at the entrance facet of the GRIN lens. The intensity profile has R0 = 500λ, full-width-at-half-maximum-intensity diameter DFWHM = 1.1774R0 = 0.912 mm, and full-aperture diameter D = 2.325 mm. The Poynting vector distribution (representing geometric-optical rays) in the cross-sectional plane of the beam is computed, and its x-, y-, z-components are shown in Figures 33.2(d)–(f).

Method of computation

With reference to Figure 33.3, we describe a method of computing the (complex) light amplitude distribution at the focal plane of the lens. From the incident beam profile one derives a large number of rays (i.e., Poynting vectors) for subsequent tracing through the system. Ray-tracing begins at the entrance facet of the GRIN lens, and continues through the focal plane to the destination plane, which is in the far field of the focused spot. Note that, after traversing the GRIN lens, the rays emerge into a homogeneous medium of refractive index n = 1.5; this region is

[Figure 33.2: panels (a)–(f); axes x/λ and y/λ from −800 to 800.]

Figure 33.2 Cross-sectional plots of (a) intensity, (b) log intensity, (c) phase of a λ = 1.55 μm beam arriving at the entrance facet of the GRIN lens of Figure 33.1. The intensity distribution is Gaussian, having DFWHM = 589λ = 0.912 mm, and full-aperture diameter D = 1500λ = 2.325 mm. The Poynting vector distribution S(x, y) – representing geometric-optical rays – is readily computed from the beam profile. Frames (d)–(f) show the x-, y-, z-components of the Poynting vector, namely, Sx(x, y), Sy(x, y), Sz(x, y). In (d) the values of Sx range from 0.18 to 0.39. Similarly, Sy in (e) ranges from 0.22 to 0.32, and Sz in (f) ranges from 0 to 100 (black = minimum, white = maximum).

[Figure 33.3 diagram: incident beam, GRIN lens, and homogeneous medium along the Z-axis, with the focal plane and the destination plane marked.]

Figure 33.3 Method of computing the light-amplitude distribution at the focal plane of the GRIN lens. Ray-tracing begins at the entrance facet, and continues through the focal plane to the destination plane, which is in the far field of the focused spot. At the destination plane the traced rays are used to construct the emergent wavefront, which is subsequently back-propagated to the focal plane at the exit facet of the GRIN lens.

intended to simulate the medium of the fiber (ignoring the slight difference between the core and cladding indices). At the destination plane the traced rays are used to construct the wavefront of the emerging (divergent) beam. This wavefront is then propagated backwards, to the focal plane of the GRIN


lens (located at its exit facet), where the focused spot’s diffraction pattern is computed. The reason for tracing the rays all the way to the destination plane (in the far field of the focused spot) and then back-propagating to the focal plane is that geometric-optical ray-tracing does not yield valid results when the rays terminate in focal (or caustic) regions. Figure 33.4 shows the results of two different computations for the incident beam depicted in Figure 33.2. Shown are the intensity and phase distributions at the focal plane of the GRIN lens of Figure 33.3. The incident wavefront is initially converted to a set of geometric-optical rays, using the association between a ray and the local Poynting vector of the electromagnetic field. In Figure 33.4(a, b) the incident rays are traced directly to the focal plane, and the emergent wavefront has been reconstructed from the traced rays. In Figure 33.4 (c, d) the rays are traced from the entrance facet to the destination plane (see

[Figure 33.4: panels (a)–(d); horizontal axis x (μm) from −36 to 36.]

Figure 33.4 Using two different methods, the intensity (left) and phase (right) distributions at the focal plane of the GRIN lens of Figure 33.3 have been computed for the incident beam shown in Figure 33.2. In (a) and (b) the incident rays are traced directly to the focal plane, and the emergent wavefront is constructed from traced rays. In (c) and (d) the rays are traced from the entrance facet of the lens to the destination plane, where the emergent wavefront is constructed and subsequently back-propagated to the focal plane (i.e., rear facet) of the GRIN lens.


Figure 33.3), at which point the emergent wavefront is constructed. This wavefront is subsequently back-propagated to the rear facet of the lens using the far field (Fraunhofer) diffraction formula. Since the incident beam in this particular example is highly aberrated, the two methods of calculation yield similar results. As a general rule, however, the incident rays should not be traced to the vicinity of the focal plane, where, due to significant diffraction, geometric-optical methods are inadmissible.
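The statement that the focal plane lies at the exit facet can be checked against the lens parameters given earlier in this chapter (q = 0.04455, rmax = 1.5 mm, L = 7.89 mm). In the paraxial approximation a ray in a parabolic index profile n(r) ≈ n0(1 − g²r²/2) follows r(z) = r0 cos(gz), so a collimated beam comes to a focus after a quarter pitch, z = π/(2g). The sketch below is such a paraxial estimate only; the function names are hypothetical.

```python
import math

def gradient_constant_per_mm(q, r_max_mm):
    """g such that n0 (1 − g² r² / 2) equals n0 [1 − q (r / r_max)²]."""
    return math.sqrt(2.0 * q) / r_max_mm

def quarter_pitch_mm(q, r_max_mm):
    """Paraxial distance at which an incident collimated beam is focused."""
    return math.pi / (2.0 * gradient_constant_per_mm(q, r_max_mm))

g = gradient_constant_per_mm(0.04455, 1.5)
print("gradient constant g (1/mm):", g)
print("quarter-pitch length (mm): ", quarter_pitch_mm(0.04455, 1.5))
```

The quarter-pitch length obtained this way is within a few micrometers of the stated lens length of 7.89 mm, which is consistent with the focus falling on the rear facet.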

Effect of beam tilt and wavefront curvature

Figure 33.5 shows cross-sectional plots of intensity, log intensity, and phase for a λ = 1.55 μm Gaussian beam arriving at the entrance pupil of the GRIN lens of Figure 33.3. The incident beam’s FWHM and full-aperture diameters are DFWHM = 0.639 mm and D = 2.17 mm, respectively. The phase plot in Figure 33.5(c) contains 2λ of linear distortion (corresponding to 0.164° of tilt), and 3λ of Seidel curvature (corresponding to a radius of curvature Rc ≈ 127 mm). After tracing the incident rays to the destination plane (located 2.0 mm away from the exit facet of the GRIN lens, within a homogeneous medium having n = 1.5) we obtain the plots of intensity, log intensity, and phase displayed in Figure 33.6. The emerging beam at the destination plane is divergent, and its curvature phase-factor (Rc = 2.046 mm) has been subtracted from the phase plot in Figure 33.6(c). For the full aperture of the incident beam (D = 2.17 mm), the emergent beam diameter of 0.86 mm at the destination plane represents a divergence cone angle θ = 24.3°, yielding an effective numerical aperture NA = n sin(θ/2) = 0.32.
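The conversions quoted in this paragraph can be reproduced with elementary geometry. The sketch below assumes that the aberration coefficients (2λ of tilt, 3λ of curvature) are specified at the edge of the full aperture, and uses the sagitta approximation W = r²/(2Rc) for the curvature; these conventions and the function names are assumptions of the example.

```python
import math

WAVELENGTH_MM = 1.55e-3           # λ = 1.55 μm
APERTURE_RADIUS_MM = 2.17 / 2.0   # half of the full-aperture diameter D

def tilt_deg(waves_at_edge):
    """Wavefront tilt for a linear phase reaching the given number of waves at the aperture edge."""
    return math.degrees(math.atan(waves_at_edge * WAVELENGTH_MM / APERTURE_RADIUS_MM))

def curvature_radius_mm(waves_at_edge):
    """Radius of curvature from the sagitta W = r² / (2 Rc) at the aperture edge."""
    return APERTURE_RADIUS_MM ** 2 / (2.0 * waves_at_edge * WAVELENGTH_MM)

def cone_angle_deg(beam_diameter_mm, distance_mm):
    """Full divergence angle of a cone reaching the given diameter at the given distance."""
    return 2.0 * math.degrees(math.atan(beam_diameter_mm / (2.0 * distance_mm)))

def numerical_aperture(n, cone_deg):
    return n * math.sin(math.radians(cone_deg / 2.0))

theta = cone_angle_deg(0.86, 2.0)
print("tilt for 2λ (deg):      ", tilt_deg(2.0))
print("Rc for 3λ (mm):         ", curvature_radius_mm(3.0))
print("cone angle θ (deg):     ", theta)
print("NA = n sin(θ/2), n=1.5: ", numerical_aperture(1.5, theta))
```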

Figure 33.5 Plots of (a) intensity, (b) log intensity, and (c) phase of a λ = 1.55 μm Gaussian beam arriving at the entrance facet of the GRIN lens. The beam has FWHM diameter D_FWHM = 0.64 mm, full aperture diameter D = 2.17 mm, 2λ of linear distortion (i.e., 0.16° of tilt), and 3λ of Seidel curvature (i.e., R_c ≈ 127 mm). [Each panel spans x = −1.16 mm to +1.16 mm.]

Figure 33.6 Plots of (a) intensity, (b) log intensity, and (c) phase of the emergent beam at the destination plane, located 2.0 mm beyond the exit facet of the GRIN lens of Figure 33.3. Since the beam is highly divergent at this point, its curvature phase-factor (R_c = 2.046 mm) has been subtracted from the phase plot. [Each panel spans x = −517 μm to +517 μm.]

Figure 33.7 Plots of (a) intensity, (b) log intensity, and (c) phase of the focused spot at the rear facet of the GRIN lens of Figure 33.3. To compute these distributions, the beam displayed in Figure 33.6 has been back-propagated a distance of 2.0 mm (i.e., from the destination plane to the rear facet of the GRIN lens). [Each panel spans x = −25.8 μm to +25.8 μm.]

When the light amplitude distribution of Figure 33.6 is back-propagated (from the destination plane to the rear facet of the GRIN lens), one obtains the focused spot distribution shown in Figure 33.7. These cross-sectional plots show intensity, log intensity, and phase of the focused beam at the rear facet of the GRIN lens. The 9.2 μm shift of the beam center away from the center of coordinates is a consequence of the 0.164° tilt of the incident beam. Also, the 3λ curvature of the incident beam is seen to have resulted in a substantial enlargement of the focused spot.

Effect of beam size and astigmatism

Figure 33.8 shows plots of intensity (left column), log intensity (middle column), and phase (right column) at the rear facet of the GRIN lens of Figure 33.3 under three different conditions. In the first row of Figure 33.8 the assumed incident Gaussian beam is fairly large, having D_FWHM = 1.37 mm, full aperture D = 3.0 mm, and no wavefront aberrations. The focused spot, however, is affected by the spherical aberration of the lens and by nearly −105 μm of defocus, both of which are consequences of the wide aperture of the incident beam. (The GRIN’s parabolic index profile is not optimum for diffraction-limited focusing at large aperture, nor is the selected length of the lens appropriate for wide-aperture applications.) The large NA of the lens is responsible for the poor coupling efficiency into the fiber obtained in this case (η ≈ 27%).

Figure 33.8 Plots of intensity (left), log intensity (middle), and phase (right) at the rear facet of the GRIN lens of Figure 33.3. Top row: incident beam diameter D_FWHM = 1.37 mm, D = 3.0 mm, no aberrations other than the spherical aberration and 105 μm of defocus introduced by the lens itself. Middle row: incident beam diameter D_FWHM = 0.365 mm, D = 2.17 mm, no aberrations. Bottom row: same as the middle row, except for the presence of 4λ of Seidel astigmatism (i.e., cylinder) on the incident wavefront. [Panels (a)–(i); each spans x = −25.8 μm to +25.8 μm.]


The second row of Figure 33.8 shows profiles of the focused spot for a smaller incident beam, having D_FWHM = 0.365 mm, D = 2.17 mm, and no aberrations. This focused spot is well matched to the fiber’s mode profile, yielding a large coupling efficiency (η ≈ 99%). Finally, the third row of Figure 33.8 shows the focused spot profile computed for the same incident beam as above (D_FWHM = 0.365 mm, D = 2.17 mm) to which 4λ of Seidel astigmatism (i.e., wavefront cylinder) has been added. Astigmatism reduces the computed coupling efficiency to η ≈ 69%.

Tolerance for beam decenter, tilt, and defocus

We computed the coupling efficiency (into a single-mode fiber) of the GRIN lens of Figure 33.1 for an incident Gaussian beam as a function of the beam diameter D_FWHM. From the resulting plot the optimum beam size that yielded the largest possible η was identified. Subsequently, we studied tolerances of the lens (for the optimum beam size) by computing η as a function of the incident beam decenter, tilt, and wavefront curvature (i.e., defocus). These results demonstrate the sensitivity of the lens-fiber combination to alignment errors.

Figure 33.9 shows the various performance curves of the GRIN lens of Figure 33.1. Shown in Figure 33.9(a) is a plot of η versus D_FWHM; the optimum beam diameter is 365 μm. The remaining frames in Figure 33.9 are computed at this optimum beam size. Figure 33.9(b) shows the sensitivity of η to beam decenter. Note that a decenter of about 250 μm is sufficient to reduce η by about 50%. The plot of η versus beam tilt in Figure 33.9(c) shows that a 0.14° tilt can reduce η more than tenfold. Finally, Figure 33.9(d) shows that a few waves of Seidel curvature (i.e., defocus) can substantially reduce the efficiency of coupling into the fiber.

Figure 33.9 Characteristics of the GRIN lens of Figure 33.3, computed for a λ = 1.55 μm incident Gaussian beam. (a) Dependence of the coupling efficiency η on the FWHM diameter of the incident beam; optimum diameter is 365 μm. (b) Dependence of η on incident beam decenter relative to the optic axis. (c) Variation of η with incident beam tilt. (d) Effect on η of Seidel curvature (i.e., defocus); the horizontal axis depicts the departure of the wavefront at the edge of the beam, where the assumed beam’s full-aperture diameter is D = 2.17 mm. In (b), (c), and (d) the incident beam has D_FWHM = 365 μm. [Panel annotations: GRIN lens, φ = 3 mm, L = 7.89 mm, λ = 1.55 μm. Vertical axes: coupling efficiency η, 0 to 1. Horizontal axes: (a) FWHM beam diameter, 0 to 1800 μm; (b) decenter, 0 to 600 μm; (c) tilt angle, 0 to 0.25°; (d) Seidel curvature, −4.5λ to +4.5λ.]

Plano-aspheric lens

Another design for a lens that launches a collimated beam into a single-mode fiber is the plano-aspheric aplanat depicted in Figure 33.10. This lens is designed to bring a λ = 1.55 μm beam to diffraction-limited focus at its rear facet (a plane facet to which the fiber is attached). The lens has diameter φ = 3.0 mm, length L = 5.8826 mm, and refractive index n = 1.673286. The asphere parameters are: R_c = 2.367 mm, K = 0.66723, A₄ = 2.91125 × 10⁻³, A₆ = 2.52286 × 10⁻⁴, and A₈ = 2.93078 × 10⁻⁵. Figure 33.11(a) shows the dependence of η on incident beam diameter. Clearly, maximum efficiency is achieved with D_FWHM ≈ 411 μm. Figure 33.11(b) shows the dependence of η on beam decenter, when the incident beam diameter is fixed at its optimum value of 411 μm. Similarly, sensitivity to tilt for the optimum beam size is shown in Figure 33.11(c), and the dependence of η on Seidel curvature is shown in Figure 33.11(d). A comparison of Figure 33.9 with Figure 33.11 shows that the GRIN lens is nearly as good as the plano-aspheric lens, at least as far as the particular alignment tolerances studied here are concerned.
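The coupling efficiencies discussed in this chapter are, in essence, normalized overlap integrals between the focused field and the fiber's guided mode. The sketch below illustrates that overlap integral for two ideal Gaussian fields and compares it with the standard closed-form result for laterally offset Gaussians. It is not the diffraction calculation used to produce the figures; the mode radii, offsets, grid, and function names are hypothetical values chosen for illustration.

```python
import numpy as np


def overlap_efficiency(field_a, field_b):
    """Power coupling |<a|b>|^2 / (<a|a><b|b>) for two fields sampled on the same grid."""
    numerator = np.abs(np.sum(np.conj(field_a) * field_b)) ** 2
    denominator = np.sum(np.abs(field_a) ** 2) * np.sum(np.abs(field_b) ** 2)
    return float(numerator / denominator)


def gaussian_field(x, y, w, x0=0.0):
    """Gaussian amplitude with 1/e^2 intensity radius w, centred at (x0, 0)."""
    return np.exp(-((x - x0) ** 2 + y ** 2) / w ** 2)


def analytic_gaussian_coupling(w1, w2, offset):
    """Closed-form overlap of two coaxial-waist Gaussians with a lateral offset."""
    s = w1 ** 2 + w2 ** 2
    return (2.0 * w1 * w2 / s) ** 2 * np.exp(-2.0 * offset ** 2 / s)


if __name__ == "__main__":
    # Hypothetical grid (micrometres) and mode radii, for illustration only.
    axis = np.linspace(-40.0, 40.0, 401)
    x, y = np.meshgrid(axis, axis)
    w_spot, w_fiber = 5.0, 5.0
    for offset in (0.0, 2.0, 5.0):
        numeric = overlap_efficiency(gaussian_field(x, y, w_spot, offset),
                                     gaussian_field(x, y, w_fiber))
        exact = analytic_gaussian_coupling(w_spot, w_fiber, offset)
        print(f"offset {offset:4.1f} um: numeric {numeric:.4f}, analytic {exact:.4f}")
```

For matched radii the efficiency falls as exp(−d²/w²) with lateral offset d of the focused spot, which gives a feel for why small alignment errors at the fiber are costly.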

Figure 33.10 Plano-aspheric lens, having diameter φ = 3.0 mm, length L = 5.88 mm, and refractive index n = 1.673286. The single-mode fiber is attached to the rear facet of the lens. [Diagram labels: incident beam (collimated), aspheric surface, fiber.]

Figure 33.11 Characteristics of the plano-aspheric lens of Figure 33.10, computed for a λ = 1.55 μm incident Gaussian beam. (a) Dependence of η on the FWHM diameter of the incident beam; optimum diameter is 411 μm. (b) Dependence of η on incident beam decenter relative to the optic axis. (c) Variation of η with incident beam tilt. (d) Effect on η of Seidel curvature (i.e., defocus); the horizontal axis depicts the departure of the wavefront at the edge of the beam, where the assumed beam’s full-aperture diameter is D = 2.17 mm. In (b), (c), and (d) the incident beam has D_FWHM = 411 μm. [Panel annotations: plano-aspheric lens, φ = 3.0 mm, L = 5.88 mm, λ = 1.55 μm. Vertical axes: coupling efficiency η, 0 to 1. Horizontal axes: (a) FWHM beam diameter, 0 to 1600 μm; (b) decenter, 0 to 600 μm; (c) tilt angle, 0 to 0.25°; (d) Seidel curvature, −4.5λ to +4.5λ.]


Plano-convex lens made of Gradium™ glass

Lenses made from Gradium glass have a refractive index gradient along their optic axis. Although this type of index gradient does not by itself produce focusing power, it has the ability to correct aberrations and field curvature introduced by curved surfaces.

Figure 33.12 (a) Plano-convex lens made of Gradium glass is used to focus a collimated beam of light into a single-mode fiber. The front facet of the lens is spherical, having R_c = 3.715 mm, and the focal point is 3.535 mm beyond the rear (plane) facet of the lens. (b) The lens is fabricated by polishing into spherical shape one end of a cylindrical rod cut from a slab of Gradium glass. (c) Index profile of G14SF Gradium glass at λ = 1.55 μm. The refractive index is highest at the front vertex, decreasing nonlinearly with z as one moves toward the plane facet of the lens. When the same lens is made of homogeneous glass of refractive index n = 1.7, the focal point shifts by about 30 μm to the right, and the spherical aberration increases slightly. [Diagram labels: (a) incident beam (collimated), plano-convex lens, fiber; (b) Gradium glass slab with Δz, L, φ, and z_max marked along Z; (c) refractive index n(z) of GRADIUM G14SF, 1.60 to 1.80, versus z from 0 to 6 mm.]
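As a rough sanity check on the focal distance of the homogeneous (n = 1.7) version of this lens, one can apply paraxial refraction at the spherical front surface and at the plane rear facet. The sketch below does only that; it assumes the lens length L = 2.9 mm quoted for this lens in the chapter, ignores spherical aberration and diffraction, and uses hypothetical function names.

```python
def paraxial_back_focal_distance_mm(radius_mm, index, thickness_mm):
    """Distance from the plane rear facet to the paraxial focus of a plano-convex lens.

    Collimated light enters through the spherical surface (radius `radius_mm`),
    travels `thickness_mm` inside glass of refractive index `index`, and exits
    into air through the plane facet.
    """
    # Single refracting surface, object at infinity: focus inside the glass.
    focus_in_glass = index * radius_mm / (index - 1.0)
    # Remaining distance to that focus, measured from the plane facet (still in glass).
    remaining_in_glass = focus_in_glass - thickness_mm
    # Refraction at a plane glass-to-air interface shortens the distance by 1/index.
    return remaining_in_glass / index


if __name__ == "__main__":
    bfd = paraxial_back_focal_distance_mm(radius_mm=3.715, index=1.7, thickness_mm=2.9)
    print(f"paraxial focus: {bfd:.3f} mm beyond the plane facet")
```

The paraxial estimate is about 3.60 mm, within roughly 1% of the value implied by the caption of Figure 33.12 for the homogeneous lens (3.535 mm plus about 30 μm). A small residual difference is not surprising, since the paraxial formula neglects the spherical aberration of the finite-aperture beam.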

Figure 33.13 Characteristics of the plano-convex Gradium lens of Figure 33.12, computed for a λ = 1.55 μm incident Gaussian beam. (a) Dependence of η on the FWHM diameter of the incident beam; optimum diameter is 593 μm. (b) Dependence of η on incident beam decenter relative to the optic axis. (c) Variation of η with incident beam tilt. (d) Effect on η of Seidel curvature (i.e., defocus); the horizontal axis depicts the departure of the wavefront at the edge of the beam, where the assumed beam’s full-aperture diameter is D = 2.17 mm. In (b), (c), and (d) the incident beam has D_FWHM = 593 μm. [Panel annotations: GRADIUM lens, φ = 2.6 mm, L = 2.9 mm, λ = 1.55 μm. Vertical axes: coupling efficiency η, 0 to 1. Horizontal axes: (a) FWHM beam diameter, 0 to 1800 μm; (b) decenter, 0 to 800 μm; (c) tilt angle, 0 to 0.25°; (d) Seidel curvature, −4.5λ to +4.5λ.]

A plano-convex lens made of Gradium glass for focusing a collimated beam into a fiber is shown in Figure 33.12. The various lens parameters are: R_c = 3.715 mm, φ = 2.6 mm, L = 2.9 mm, Gradium glass = G14SF, z_max = 5.8 mm, and Δz = 2.85 mm. The focal point is a distance of 3.535 mm beyond the plane facet of the lens. For this lens, the dependence of coupling efficiency η on beam size as well as its sensitivity to misalignment are shown in Figure 33.13. The optimum η is obtained for a beam diameter D_FWHM ≈ 593 μm. For this beam size the amount of decenter that results in 50% reduction of η is 420 μm, and the beam tilt that causes a 50% drop in η is 0.045°. The dependence of η on wavefront curvature may be seen in Figure 33.13(d).

We also computed the various performance curves for a plano-convex lens similar to that depicted in Figure 33.12, but made of homogeneous glass (n = 1.7) instead of the Gradium material. The focal point of this homogeneous plano-convex lens was found to be 3.565 mm beyond the plane facet of the lens, namely, 30 μm greater than that of the Gradium lens. Once again, the optimum beam size was found to be D_FWHM ≈ 593 μm; the other characteristics of the lens were also very similar to those of the Gradium lens. Apparently, the use of Gradium glass in this particular application does not result in any substantial improvements.

References for Chapter 33

1 H. Kogelnik, Coupling and conversion coefficients for optical modes, Proc. Symposium on Quasi-Optics, series VR14, Polytechnic Press, Brooklyn, N.Y., 333–347, 1964.
2 W. L. Emkey, Optical coupling between single-mode semiconductor lasers and strip waveguides, J. Lightwave Technology LT-1, No. 2, 436–443, 1983.
3 D. T. Neilson, Tolerance of optical interconnections to misalignment, Applied Optics 38, 2282–2286, 1999.

34 The optics of semiconductor diode lasers†

Robert N. Hall, born in New Haven, Connecticut in 1919, joined General Electric’s Research and Development Center after graduating from the California Institute of Technology. In 1962, having realized that a semiconductor junction could support population inversion, Hall built the first semiconductor injection laser. This device, based on a specially designed p-n junction, operated when an electric current injected the electrons directly into the junction, thus allowing for highly efficient generation of coherent light from a compact source. Today, diode lasers based on Hall’s original idea are used, among other places, in CD and DVD players, laser printers, and fiber-optic communication systems.¹ In this chapter we describe the basic features of the beam of light emitted by a diode laser, and discuss methods to analyze and manipulate this beam. Collimation and beam-shaping with a pair of cylindrical lenses will be shown to be a simple and flexible method that may be applied not only to diode lasers but also to beams emerging from optical fibers.

† This chapter is co-authored with Ewan M. Wright, Professor of Optical Sciences at the University of Arizona.

Characteristics of diode lasers

A semiconductor diode laser, shown schematically in Figure 34.1, consists of a gain layer (only a few tens of nanometers thick) surrounded by guiding layers for confining the laser mode. The guiding layers’ index of refraction is somewhat greater than that of the surrounding regions (substrate and cladding), thus permitting confinement by total internal reflection. The electrical current is injected through the positive electrode, a metallic stripe several microns wide, and collected at the base-plate on the opposite side of the junction (ground electrode). The population inversion and optical gain are strongest beneath the positive electrode, tapering off laterally with increasing distance from the electrode’s center line along Z. In gain-guided lasers, this tapering off of the gain is responsible for lateral beam confinement. (By contrast, in index-guided lasers the regions adjacent to the guiding stripe are selectively etched away, then replaced by a lower-index cladding material.) In general, the gain layer is highly absorptive in regions that are not directly underneath the electrode and, therefore, experience weak pumping or no pumping at all. The guiding layers are essentially transparent, except for losses due to scattering at impurities and at the interfaces. The substrate and the cladding are also highly transparent.

Figure 34.1 A semiconductor diode laser consisting of an active layer surrounded by guiding layers for confinement of the laser mode. The electrical current, injected through the positive electrode, is collected on the opposite side of the junction by the ground electrode. [Diagram labels: positive electrode, cladding, guiding layers, gain medium, substrate, ground electrode; axes X, Y, Z.]

Figure 34.2 shows plots of intensity and phase at the front facet of a single-transverse-mode diode laser (λ₀ = 980 nm). The assumed beam divergence angles (full-width-at-half-maximum intensity or FWHM) are θ∥ = 7° in the plane of the junction and θ⊥ = 35° perpendicular to the junction. In the top row of the figure, (a), (b), where the assumed beam has no astigmatism, the phase distribution at the laser’s front facet is uniform. In the middle row, (c), (d), the astigmatic distance (defined as an equivalent distance in free space between horizontal and vertical beam waists) is Δz = 10 μm, resulting in a slightly wider beam along the X-axis, and a divergent phase front whose peak-to-valley variation (i.e., from the edge to the center of the beam) is 120°. In the bottom row, (e), (f), the assumed astigmatism is Δz = 25 μm. Again the beam is broader (in the horizontal direction) than the one without astigmatism, and the phase distribution exhibits a peak-to-valley variation of 190°.
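To get a feel for the emitting-spot dimensions implied by these divergence angles, one can treat each axis as an ideal Gaussian beam and convert the FWHM far-field angle into a 1/e² waist radius. This is only a paraxial, idealized estimate (the 35° axis is well outside the paraxial regime, and a real laser mode is not exactly Gaussian); the function name is hypothetical.

```python
import math


def gaussian_waist_radius_um(fwhm_divergence_deg, wavelength_um=0.98):
    """1/e^2 waist radius of an ideal Gaussian beam from its full FWHM divergence angle.

    Far-field intensity is taken as exp(-2 theta^2 / theta0^2), so that the full
    width at half maximum equals theta0 * sqrt(2 ln 2), and w0 = lambda / (pi theta0).
    """
    fwhm_rad = math.radians(fwhm_divergence_deg)
    theta0 = fwhm_rad / math.sqrt(2.0 * math.log(2.0))
    return wavelength_um / (math.pi * theta0)


if __name__ == "__main__":
    w_parallel = gaussian_waist_radius_um(7.0)        # in the plane of the junction
    w_perpendicular = gaussian_waist_radius_um(35.0)  # perpendicular to the junction
    print(f"w_parallel      = {w_parallel:.2f} um")
    print(f"w_perpendicular = {w_perpendicular:.2f} um")
    print(f"ratio           = {w_parallel / w_perpendicular:.1f}")
```

In this approximation the waist is roughly 3 μm in the plane of the junction and 0.6 μm perpendicular to it: a five-to-one elliptical spot, with the smaller dimension producing the larger divergence.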

Figure 34.2 Plots of the logarithm of intensity (left) and phase (right) at the front facet of a λ₀ = 980 nm diode laser having θ∥ = 7°, θ⊥ = 35°. The range of variations of intensity between the maximum (red) and minimum (blue) is I_max : I_min = 10⁴. In (a), (b) the beam has no astigmatism. In (c), (d) the astigmatic distance (in free space) between the horizontal and vertical beam waists is Δz = 10 μm, resulting in a wider beam along X, and a divergent phase pattern whose variation from the center (blue) to the edge (red) is 120°. In (e), (f), where the assumed astigmatism is Δz = 25 μm, the beam is further broadened and the phase distribution exhibits a peak-to-valley variation of 190°. [Each panel spans x = −10 μm to +10 μm.]

The elliptical cross-section of the beam emerging at the front facet of the laser is responsible (through diffraction) for θ∥ being much smaller than θ⊥. The cause of astigmatism is the non-uniform gain profile (along the X-axis) within the active region of the laser. As the gain is strongest near the cavity’s central axis, the beam, while propagating in the cavity along Z, experiences a “gain focusing” effect toward this axis, a direct consequence of stronger amplification on-axis than in the wings.² Consequently, a divergent phase profile automatically evolves for countering this tendency of the beam to collapse to the center. We will have more to say about this property in the following section.

Another interesting property of a diode laser beam is its polarization state, which is typically linear, having the E-field parallel to the plane of the junction. This property may be traced back to the fact that, for light polarized parallel to the junction (i.e., E∥) the gain is somewhat greater than that for perpendicularly polarized light (hereinafter E⊥). The guided mode associated with E⊥ is slightly broader in the Y-direction than the mode associated with E∥. Since a broad mode has less overlap with the gain layer than a more compact mode, it stays behind while the compact mode surpasses the threshold and begins to lase. Moreover, confinement of electrons and holes to a thin (quantum well) active layer makes it easier for E∥ (relative to E⊥) to stimulate the excited electrons and holes into surrendering their photons and returning to the ground state. In practice a combination of both effects is responsible for promoting the selection of E∥ polarization over E⊥.

Origin of diode laser astigmatism

The non-uniformity of the gain profile along X has a focusing effect on the guided mode that is countered automatically by a divergent phase front imposed on the beam as it propagates along the Z-axis of the cavity. An easy demonstration is provided by studying the propagation of a beam of light through a gain medium using the Beam Propagation Method (BPM); see Chapter 32. Figure 34.3(a) shows the distribution of gain (red) and loss (blue) in the cross-sectional plane of a typical diode laser. The gain is significant only in the middle section of the gain layer, dropping off in a Gaussian fashion along the X-axis and becoming a loss in the regions remote from the central axis, Z. During propagation along the cavity’s Z-axis, the phase profile imparted to the beam’s cross-section is similar to that shown in Figure 34.3(b). Here the high-index guiding layers (orange) advance the phase relative to the lower-index substrate and cladding (blue). The gain medium (red) with its slightly higher index advances the phase even more than the guiding layers do, but the gain layer is thin, and its contribution to mode confinement along the Y-axis is fairly insignificant.

Together, the cross-sectional intensity and phase distributions shown in Figures 34.3(a), (b) define the profile of an amplitude–phase mask that can be used in a BPM simulation of a diode laser. This particular mask is placed at intervals of Δz = 0.1λ between propagation steps in a medium of refractive index n₀ = 3.3. (The wavelength λ in this environment is λ₀/n₀, where λ₀ is the free-space wavelength of the laser beam.)
For a uniform beam incident on the mask, the transmitted intensity and phase distributions appear in Figures 34.3(a), (b), respectively. The assumed gain medium is a 0.25λ-thick layer sandwiched between two 0.5λ-thick layers of slightly lower refractive index, which constitute the guiding slab. The amplitude gain at the center of the active layer is 1.025 (per 0.1λ of propagation), tapering off in a Gaussian fashion along the X-axis while remaining uniform in the Y-direction. The background loss of the medium outside the active layer is small (mask amplitude transmission = 0.995), but within the gain layer and far from the central axis, the light amplitude is attenuated by a factor of 0.95 (per 0.1λ of propagation). In Figure 34.3(b), the phase of the mask is 6.12° within the active layer, 5.4° in the two adjacent (guiding) layers, and 0° in the substrate and cladding regions. This means, for example, that if the index of the substrate and cladding materials is n₀ = 3.3, then the guiding layers have index n₁ = 3.45 and the active medium has index n₂ = 3.47. (The quoted values are consistent with a mask phase of 36° per unit of index difference relative to n₀.)

In practice, pumping the gain medium causes a decline in its local index, so that a more realistic phase mask would be similar to that shown in Figure 34.3(c), where the phase at the center of the active region has dropped to 4.5° (corresponding to n₂ = 3.425). From this minimum, the phase increases in a Gaussian fashion along the X-axis, reaching the value of 6.12° in the highly absorptive regions of the active layer. This results in index anti-guiding along the X-axis (due to index inhomogeneity within the active layer). The Gaussian phase profile inside the active layer imposes on the laser beam a divergence (along X) above and beyond that imposed by the gain profile alone.³,⁴ The rest of the mask is identical to that in Figure 34.3(b).

Figure 34.3 Profiles of two amplitude–phase masks used in BPM simulations of a diode laser beam. (a), (b) represent the amplitude and phase profiles for Mask 1, (a), (c) the corresponding profiles for Mask 2. The gain medium is a 0.25λ-thick layer sandwiched between two 0.5λ-thick guiding layers. (a) Transmitted intensity distribution for a uniform incident beam. The amplitude gain at the center is 1.025, tapering off along X to a (lossy) value of 0.95. Outside the active layer, the 0.995 amplitude transmissivity represents a weak background loss. (b) The phase of Mask 1 is 6.12° within the active layer, 5.4° in the guiding layers, and 0° in the substrate and cladding. (c) The phase of Mask 2 (having the same amplitude profile as in (a)) is 4.5° at the center of the active layer, 6.12° within the active layer far from the axis, 5.4° in the guiding layers, and 0° in the substrate and cladding. [Each panel spans x/λ = −20 to +20 and y/λ = −1.5 to +1.5.]

In what follows, we will first show results of BPM simulations obtained with the amplitude–phase Mask 1, depicted in Figures 34.3(a), (b), confirming that the gain profile alone can give rise to astigmatism. We then show results of simulations obtained with the amplitude–phase Mask 2, shown in Figures 34.3(a), (c), which reveal that the index “anti-guiding” of the gain medium (caused by population inversion) can further enhance the induced astigmatism.

Figure 34.4, top row, shows plots of (a) intensity, (b) logarithm of intensity, and (c) phase after 600 steps of BPM using Mask 1. Since each step corresponds to a propagation distance of 0.1λ inside a medium of refractive index n₀ = 3.3, the total propagation distance in this simulation is 18 μm. The light is seen to be well confined to the guiding layers, with only a small fraction leaking (i.e., evanescent) into the substrate and cladding. The light that escapes the guiding layers is eventually lost by scattering or diffraction out of the system. In Figure 34.4(c), the peak-to-valley phase variation along X is 160°, corresponding to a divergent beam with a few microns of astigmatism.

The bottom row in Figure 34.4 is similar to the top row, except that it is obtained with Mask 2. The beam is somewhat broader along the X-axis when compared to that obtained without an index anti-guiding of the gain medium. Also, the peak-to-valley phase variation in Figure 34.4(f) is 175° (along X), corresponding to a divergent beam with somewhat more astigmatism than the one depicted in Figure 34.4(c).
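The mechanism described above (a gain profile alone producing a divergent phase front) can be explored with a much-reduced, one-dimensional version of such a simulation. The sketch below is not the two-dimensional calculation behind Figure 34.4: it keeps only the X-axis, applies the Gaussian gain/loss taper (1.025 on axis, 0.95 in the wings) after every 0.1λ step, and omits the guiding layers and the phase mask entirely. The gain half-width, grid, starting beam, and function names are assumptions made for illustration.

```python
import numpy as np


def gain_guided_bpm(n_steps=600, n_x=512, half_width=20.0, dz=0.1,
                    peak_gain=1.025, wing_loss=0.95, gain_halfwidth=4.0):
    """1-D BPM through a Gaussian gain/loss mask; lengths in units of the in-medium wavelength."""
    x = np.linspace(-half_width, half_width, n_x, endpoint=False)
    dx = x[1] - x[0]
    # Amplitude mask applied after every step (assumed 1/e half-width: gain_halfwidth).
    mask = wing_loss + (peak_gain - wing_loss) * np.exp(-(x / gain_halfwidth) ** 2)
    # Angular-spectrum transfer function for one step; evanescent components decay.
    fx = np.fft.fftfreq(n_x, d=dx)                      # cycles per wavelength
    kz = np.sqrt((1.0 - fx ** 2).astype(complex))
    step = np.exp(2j * np.pi * kz * dz)
    # Arbitrary starting beam with a flat phase front.
    field = np.exp(-(x / 5.0) ** 2).astype(complex)
    for _ in range(n_steps):
        field = np.fft.ifft(np.fft.fft(field) * step)   # diffraction over dz
        field = field * mask                            # gain on axis, loss in the wings
        field = field / np.abs(field).max()             # renormalize to avoid overflow
    return x, field


def edge_to_center_phase_deg(field, level=0.01):
    """Phase at the point where the intensity falls to `level` of its peak, minus the on-axis phase."""
    intensity = np.abs(field) ** 2
    intensity = intensity / intensity.max()
    center = int(np.argmax(intensity))
    edge = int(np.where(intensity >= level)[0][-1])
    phase = np.unwrap(np.angle(field[center:edge + 1]))
    return float(np.degrees(phase[-1] - phase[0]))


if __name__ == "__main__":
    x, field = gain_guided_bpm()
    print(f"edge-to-center phase: {edge_to_center_phase_deg(field):.0f} deg")
```

With the exp(+ikz) convention used here, a positive edge-to-center phase difference corresponds to a divergent wavefront. For reference, 600 steps of 0.1λ with λ = λ₀/n₀ = 0.98 μm/3.3 add up to about 17.8 μm, the 18 μm quoted in the text.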
Figure 34.4(g) shows plots of the power content of the beam versus propagation distance z, as obtained in the above simulations. At first, the power decreases as the initial beam adjusts itself to the guiding structure, shedding excess light that does not match the guided mode profile. The gain medium then takes over and raises the power content exponentially, as the confined mode propagates along the optical axis.

Shearing interferometry

The beam of a single-transverse-mode diode laser may be captured and collimated by an aberration-free lens, then analyzed using a shear-plate interferometer, as shown in Figure 34.5. The shear plate creates two identical copies of the collimated beam shifted relative to each other along the X- and/or Y-axes. Superposition of these two copies of the same beam at the observation plane creates an interferogram that reveals the phase structure of the (collimated) beam.

Figure 34.4 Plots of (a) intensity, (b) logarithm of intensity, and (c) phase after 600 BPM steps. The amplitude–phase Mask 1 used in these simulations is depicted in Figures 34.3(a), (b). The light is seen to be confined to the guiding layers, with a weak evanescent tail leaking into the substrate and cladding. In (c) the peak-to-valley phase variation along X is 160°. The bottom row (d)–(f) is similar to the top row (a)–(c), except that it is obtained with Mask 2 depicted in Figures 34.3(a), (c). In (f) the peak-to-valley phase variation along X is 175°. (g) Power content of the beam versus propagation distance in the BPM simulations. [Panels (a)–(f) span x/λ = −8 to +8; panel (g) plots optical power, 0.5 to 3.5, versus propagation distance, 0 to 18 μm, for Mask 1 and Mask 2.]

Any phase non-uniformities at the exit pupil of the collimator show up as intensity variations (i.e., fringes) in the interferogram. For a 7° × 35° beam emerging from a λ₀ = 980 nm diode laser, Figure 34.6 (left column) shows plots of intensity (top) and phase in a plane located 10 mm past the exit pupil of a 0.6 NA collimator lens. The lens is at a distance of f = 4.9 mm from the mid-point between the two waists of the laser beam, and the displayed phase patterns correspond to Δz = 10, 20, 30 μm of astigmatism. For a fixed shear of Δx = 0.7 mm (horizontal) and Δy = 2.0 mm (vertical), the right column in Figure 34.6 shows the observed interference patterns in the viewing window of the shear plate; from top to bottom, the assumed astigmatism of the laser is Δz = 0, 10, 20, 30 μm.
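A back-of-the-envelope estimate shows why a few tens of microns of astigmatism are readily visible in such an interferogram. The sketch below assumes an ideal thin lens, treats each beam waist as a point source displaced by Δz/2 from the focal plane, and uses the fact that a quadratic wavefront of radius R, sheared laterally by s, produces straight fringes of period λR/s. These are simplifying assumptions for illustration, not the diffraction calculation behind Figure 34.6, and the function names are hypothetical.

```python
def collimated_wavefront_radius_m(focal_length_m, source_offset_m):
    """Radius of curvature after a thin lens for a point source displaced axially from focus.

    Valid for |source_offset| << focal_length, where R is approximately f^2 / offset.
    """
    return focal_length_m ** 2 / source_offset_m


def shear_fringe_period_m(wavelength_m, radius_m, shear_m):
    """Fringe period for a wavefront phase pi x^2 / (lambda R) sheared laterally by `shear_m`.

    The phase difference between the two copies is 2 pi x s / (lambda R),
    i.e. straight fringes with period lambda R / s.
    """
    return wavelength_m * radius_m / shear_m


if __name__ == "__main__":
    wavelength = 0.98e-6   # lambda_0 = 980 nm
    focal_length = 4.9e-3  # f = 4.9 mm
    for astigmatism_um in (10, 20, 30):
        offset = 0.5 * astigmatism_um * 1e-6   # each waist sits dz/2 from the mid-point
        radius = collimated_wavefront_radius_m(focal_length, offset)
        period_x = shear_fringe_period_m(wavelength, radius, 0.7e-3)
        period_y = shear_fringe_period_m(wavelength, radius, 2.0e-3)
        print(f"dz = {astigmatism_um} um: R = {radius:.2f} m, "
              f"fringe period {period_x * 1e3:.2f} mm (X shear), {period_y * 1e3:.2f} mm (Y shear)")
```

In this approximation the fringe period is inversely proportional to Δz, so the number of fringes across the beam grows in proportion to the astigmatism.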

Figure 34.5 A single-transverse-mode beam from a diode laser is captured and collimated by an aberration-free lens, then analyzed with a shear-plate interferometer. Any phase variations in the cross-section of the beam show up as fringes in the shearing interferogram. [Diagram labels: diode laser, collimator, shear plate, interferogram; axes X, Y, Z.]

Figure 34.6 Left column: plots of intensity (top) and phase in a plane located 10 mm beyond the exit pupil of the collimator of Figure 34.5. The 0.6 NA lens is one focal length (f = 4.9 mm) away from the mid-point between the two waists of the laser beam (λ₀ = 980 nm, θ∥ = 7°, θ⊥ = 35°). Right column: intensity patterns at the viewing window of the shear plate (Δx = 0.7 mm, Δy = 2.0 mm). From top to bottom, the assumed astigmatism of the laser is 0, 10, 20, 30 μm. [Each panel spans x = −3.1 mm to +3.1 mm.]

Beam collimation using a cylindrical lens pair

A diode laser’s beam may be collimated by a pair of cylindrical lenses, as shown in Figure 34.7. In this scheme the first lens has the responsibility of collimating the beam along its fast divergence axis, while the second lens arrests the expansion of the beam along its slow axis. When a divergence angle is large, a gradient index (GRIN) cylindrical lens provides more collimation power as well as better correction for residual aberrations.

As a concrete example, consider a single-transverse-mode beam having λ₀ = 980 nm, θ∥ = 7°, θ⊥ = 35°, and astigmatism Δz = 0. The first lens is a 5 mm-long cylindrical rod of radius r₀ = 1.5 mm, made of GRIN material having n(r) = 1.59[1 − 0.04455(r/r₀)²]; the clear aperture diameters of the lens are D_x = 5.0 mm, D_y = 1.2 mm. The distance between the front facet of the laser and the first surface of this lens is 0.348 mm. The second lens, a plano-cylinder made of homogeneous glass of index n = 1.65, is separated by 0.397 mm from the first lens; its thickness (along the optical axis) is 3.2 mm, it is 5 mm long, has a 3 mm radius of curvature, and its clear aperture diameter is 1.5 mm. All lens surfaces are anti-reflection coated.

Figure 34.8 shows computed plots of intensity and phase at an observation plane located 0.3 mm beyond the plano-cylindrical lens of Figure 34.7. The fraction of the optical power captured by the lens pair is nearly 0.8, and the r.m.s. wavefront aberration at the observation plane is 0.19λ₀. The same lens pair (with a slight adjustment of the separation between its two elements) may be used for collimation in the presence of astigmatism on the laser beam, without any degradation of the wavefront quality.

By allowing the slow axis of the beam to propagate further before being collimated, the cylindrical lens pair enables one to adjust the degree of ellipticity of the beam’s cross-section. Of course, the requisite physical parameters of the second lens depend on the desired minor-to-major-axis ratio of the collimated beam, but, in principle, any degree of ellipticity can be achieved. Thus, the cylindrical lens pair not only collimates the divergent beam of a diode laser, but it also allows shaping (in particular, circularization) of the beam’s cross-section.

Figure 34.7 Collimation of a diode laser beam by a pair of cylindrical lenses. The first lens collimates the beam along the fast axis, while the second lens arrests the expansion of the beam along the slow axis. [Diagram labels: diode laser, cylindrical lens (radial GRIN), plano-cylindrical lens (homogeneous glass).]

Figure 34.8 Plots of (a) intensity, (b) phase of a laser beam (λ₀ = 980 nm, θ∥ = 7°, θ⊥ = 35°, astigmatism Δz = 0) upon emerging from the lens pair of Figure 34.7. For the particular lenses chosen in this simulation, the optical power throughput is 80%, the r.m.s. wavefront aberration is 0.19λ, and the peak-to-valley phase variation across the aperture is 280°. [Each panel spans x = −750 μm to +750 μm.]

Anamorphic magnification and beam compression

Figure 34.9 shows an aberration-free lens collimating a diode laser’s beam, followed by a pair of anamorphic prisms that expand the beam along the X-axis. This collimated and anamorphically magnified beam is subsequently focused by an aberration-free lens identical with the one used initially for beam collimation. Because the laser beam’s divergence angles parallel and perpendicular to the plane of the junction are widely different (i.e., θ∥ << θ⊥), the collimated beam’s diameter along X is typically much less than that along Y. Expanding the beam along X until it fills the entrance pupil of the focusing lens enables one to obtain a focused spot substantially smaller (in one dimension) than the bright spot appearing at the front facet of the laser.

Figure 34.9 A diode laser’s beam is collimated, then shaped by a prism pair that expands the beam’s diameter along the X-axis. The collimated and anamorphically magnified beam is subsequently focused by an aberration-free lens identical to the one used for collimation. [Diagram labels: diode laser, collimator, anamorphic prism pair, focusing lens; axes X, Y, Z.]

Figure 34.10 shows computed plots of intensity and phase at several cross-sections of the system of Figure 34.9. The assumed parameters of the laser are λ₀ = 980 nm, θ∥ = 7°, θ⊥ = 35°, astigmatism Δz = 0. Both the collimator and the focusing lens have NA = 0.6, f = 4.9 mm, and the prism pair’s magnification factor M = 5.5 (along X) is sufficient to circularize the beam’s cross-section. The top

500

Classical Optics and its Applications (a)

–26

(b)

x (μm)

(c)

–3.1

x (mm)

+26

+3.1 –3.1

x (mm)

+3.1

x (mm)

+3.1

x (μm)

+8

(f)

x (mm)

(g)

–8

x (μm)

(d)

(e)

–3.1

+26 –26

+3.1 –3.1 (h)

x (μm)

+8 –8

Figure 34.10 Distributions of the logarithm of intensity (left) and phase (right) at several cross-sections of the system of Figure 34.9. The lenses have NA ¼ 0.6, f ¼ 4.9 mm. The prisms are made of n ¼ 1.72 glass, and have an apex angle of 69 . (a), (b) Front facet of the laser. (c), (d) Exit pupil of the collimator, just before entering the prism pair. (e), (f) Emerging from the second prism. (g), (h) Focal plane of the focusing lens.

row in Figure 34.10 shows the beam at the front facet of the laser. The second row shows that before entering the prisms the beam has an elliptical crosssection with an aspect ratio of 5.5. Emerging from the prism pair (third row) the beam is circularized. The bottom row shows the focused spot at the focal plane of the focusing lens; this compressed image of the bright elliptical spot at the front facet of the laser has circular symmetry and a much reduced diameter along the X-axis.

Figure 34.11 Same as Figure 34.10 but with the diode laser shifted 20 μm to the left along the X-axis. Comparing (d) with (f), note that the tilt angle of the collimated beam is reduced substantially after going through the prism pair. Thus the image of the laser beam, in addition to being circularized, moves closer to the optical axis at x = 0, as shown in (g), (h).
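The geometry behind Figure 34.11 can be checked with a few lines of paraxial arithmetic. The sketch below is only an illustration under simple assumptions (thin, aberration-free lenses; the prism pair treated as an ideal angular demagnifier by the factor M); the function and variable names are hypothetical and are not taken from the simulations described in the text.

```python
import math

# Values quoted in the text for the system of Figure 34.9.
F_LENS_MM = 4.9        # focal length of both the collimator and the focusing lens
M_PRISMS = 5.5         # anamorphic magnification of the prism pair along X
X_SOURCE_UM = -20.0    # lateral shift of the laser (20 um to the left)


def collimated_tilt_deg(x_source_um: float, f_mm: float) -> float:
    """Magnitude of the tilt (degrees) of the collimated beam for an off-axis source."""
    return abs(math.degrees(math.atan((x_source_um * 1e-3) / f_mm)))


def image_offset_um(tilt_deg: float, f_mm: float) -> float:
    """Distance (um) of the focused spot from the axis for a beam tilted by tilt_deg."""
    return f_mm * 1e3 * math.tan(math.radians(tilt_deg))


tilt_before = collimated_tilt_deg(X_SOURCE_UM, F_LENS_MM)
# Assumption: expanding the beam by M along X reduces its tilt by the same factor.
tilt_after = tilt_before / M_PRISMS
offset_after = image_offset_um(tilt_after, F_LENS_MM)

print(f"tilt after collimator : {tilt_before:.3f} deg")
print(f"tilt after prism pair : {tilt_after:.4f} deg")
print(f"image offset from axis: {offset_after:.2f} um")
```

Under these assumptions the tilt comes out near 0.23° and the image lands about 3.6 μm from the axis, on the side opposite to the source displacement because the image is inverted.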

Figure 34.11 is similar to Figure 34.10, except for the position of the diode laser along the X-axis, which is shifted to x = −20 μm. Collimation and anamorphic magnification work as before, but the beam emerging from the collimator is tilted by about 0.23° away from the Z-axis. The prism pair magnifies the beam along X by 5.5, but it also reduces the tilt of the beam by the same factor. The net result is that the image of the bright elliptical spot at the front facet of the laser, in addition to being compressed in size, is brought closer to the optical axis at x = +3.6 μm. This is an important result which may be applied, for instance, to compressing incoherent laser beams.

A typical high-power, multi-transverse-mode diode laser may have the same divergence angles as above (i.e., 7° × 35°), but its bright area at the front facet is much larger, say 1 × 50 μm². The radiation profile of such lasers may be considered (approximately) to consist of a number of mutually incoherent filaments, each similar to the beam of a coherent (i.e., single-transverse-mode) diode laser. If, therefore, the central filament of an incoherent laser is identified with the coherent beam depicted in Figure 34.10, then the marginal filament will be represented by Figure 34.11. The system of Figure 34.9 can thus collimate the incoherent laser’s various filaments (simultaneously and independently of each other), perform anamorphic magnification on each and every one of them, then create (at the focal plane of the last lens) a string of closely packed focused spots along the X-axis. In doing so the system of Figure 34.9 creates a compressed image of the elongated bright spot at the front facet of the incoherent laser. (In the present example the image size will be 1 × 10 μm².) The achievable compression is roughly equal to the ratio θ⊥/θ∥ of the divergence angles.

Cylindrical lenses for collimation and beam-shaping in fiber optics systems

A pair of cylindrical lenses (similar to those shown in Figure 34.7) can be used in similar fashion to capture the beam emerging from an optical fiber. For example, a pair of GRIN cylinders can collimate and anamorphically magnify the beam emerging from a single-mode silica glass fiber. Unlike the beam from a diode laser, the emergent beam of a fiber is usually unpolarized, making cylindrical lenses superior to prism pairs in applications that demand anamorphic magnification with a low level of polarization-dependent loss. Consider a cylindrical and a plano-cylindrical lens oriented at right angles to each other, as in Figure 34.7. The GRIN profile of both lenses is n(r) = 1.59[1 − 0.04455(r/r₀)²], r being the radial distance from the cylinder axis, and r₀ the cylinder radius. The 1/e (amplitude) diameter of the single-mode Gaussian beam emerging from the fiber is 10 μm (λ₀ = 1.544 μm). The distance from the fiber facet to the first lens is 0.501 mm, the radius of the first lens is 2 mm, separation between the lenses is 3.525 mm, and the radius of the second lens is 6.7 mm, with its center of curvature located on the plane facet of the lens. The cylinder lengths should be large enough to accommodate the beam, but are otherwise arbitrary. All surfaces are anti-reflection coated.
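To get a feel for the size of the fiber beam when it reaches the first lens, one can apply the standard free-space Gaussian-beam formulas to the numbers quoted above. This is a rough sketch that assumes an ideal Gaussian mode with its waist at the fiber facet; it ignores the lenses entirely, and the variable names are illustrative only.

```python
import math

# Values quoted in the text.
WAVELENGTH_UM = 1.544       # free-space wavelength
MODE_DIAMETER_UM = 10.0     # 1/e (amplitude) diameter at the fiber facet
Z_FIRST_LENS_UM = 501.0     # distance from the facet to the first lens (0.501 mm)

# Assumption: the 1/e amplitude radius is the usual Gaussian waist parameter w0.
w0_um = MODE_DIAMETER_UM / 2.0
rayleigh_range_um = math.pi * w0_um**2 / WAVELENGTH_UM
half_angle_deg = math.degrees(WAVELENGTH_UM / (math.pi * w0_um))


def beam_radius_um(z_um: float) -> float:
    """1/e amplitude radius of the Gaussian beam at distance z_um from the waist."""
    return w0_um * math.sqrt(1.0 + (z_um / rayleigh_range_um) ** 2)


w_at_lens_um = beam_radius_um(Z_FIRST_LENS_UM)

print(f"Rayleigh range           : {rayleigh_range_um:.1f} um")
print(f"far-field half-angle     : {half_angle_deg:.2f} deg")
print(f"1/e radius at first lens : {w_at_lens_um:.1f} um")
```

With these inputs the beam radius at the first lens is close to 50 μm, i.e., the lens sits roughly ten Rayleigh ranges from the facet and therefore sees an essentially spherical wavefront.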

Figure 34.12 Plots of (a) intensity, (b) logarithm of intensity, (c) phase at the exit pupil of a cylindrical lens pair used as collimator and anamorphic magnifier for the beam emerging from a single-mode fiber (λ₀ = 1.544 μm). A 0.775 × 3.1 mm² elliptical aperture is placed at the exit pupil to clip the edges of the beam. The overall transmission of the system is 95.5%, and the ratio of the FWHM beam diameters is 4.1. The peak-to-valley phase variation in (c) is 56°. [Horizontal axes: x from −1.65 mm to +1.65 mm.]

Figure 34.12 shows computed plots of intensity, logarithm of intensity, and phase at the exit pupil of the lens pair. The first lens arrests the spread of the beam in the vertical direction, while the beam continues to spread in the horizontal direction (and become elongated) until its capture by the second lens. The overall transmission of the system is 95.5%, and the ratio of FWHM beam diameters at the exit pupil is 4.1. (The anamorphic magnification factor can be further increased if the first lens is made proportionately smaller or the second lens larger.) The phase plot shows a small amount of astigmatism, with a peak-to-valley variation of 56° (r.m.s. wavefront aberrations = 0.035). The aberrations may be reduced if the plano-cylindrical lens is replaced by a full cylinder. Alternatively, a different index profile of the GRIN rods or a smaller aperture stop could help reduce these aberrations.

When used in conjunction with a one-dimensional array of fibers, one of the lenses can be shared among the various fibers. Similarly, in a 2-D square array of fibers, the fibers in each row can share the first lens, while the fibers in each column can share the second lens. Collimation of a 10 × 10 fiber array will thus require only 10 cylinders of each type.

References for Chapter 34

1 Adapted from <web.mit.edu/invent/www/inventors>.
2 L. W. Casperson and A. Yariv, Gain and dispersion focusing in a high gain laser, Applied Optics 11, 462–466 (1972).
3 F. R. Nash, Mode guidance parallel to the junction plane of double-heterostructure GaAs lasers, J. Appl. Phys. 44, 4696–4707 (1973).
4 D. D. Cook and F. R. Nash, Gain-induced guiding and astigmatic output beam of GaAs lasers, J. Appl. Phys. 46, 1660–1672 (1975).

35 Michelson’s stellar interferometer

The essential idea behind the stellar interferometer is that of a double-slit interferometer, such as that shown in Figure 35.1. This type of instrument dates back to 1868 when Fizeau¹ proposed using it to measure the diameters of the fixed stars. Some modern textbooks² describe the stellar interferometer in the language of coherence theory, which tends to obscure its fundamental simplicity. This chapter attempts to present the original concept in its simplest form while providing a historical perspective.

The double-slit interferometer

With reference to Figure 35.1, let us assume that a quasi-monochromatic point source of wavelength λ is placed at the origin of the XY-plane, which is the focal point of the collimator lens. The beam emerges from the lens collimated along the optical axis, effectively placing the point source at infinity. A double-slit mask blocks most of the light, allowing only the rays within two narrow slits to pass through to the focusing lens. The slits have a separation d along the X-axis, their widths being inconsequential as long as a sufficient amount of light gets through and a reasonable number of fringes appear at the observation plane. The focusing lens of focal length f brings together the rays that emerge from the two slits, causing them to interfere and produce a fringe pattern. The simple geometrical construction in Figure 35.2 shows that at the observation plane the fringe period p may be written as³,⁴

$$p \approx f\lambda/d. \qquad (35.1)$$

Figure 35.3 shows computed plots of intensity distribution in the system of Figure 35.1 for a point source centered on the optical axis; Figure 35.3(a) is the distribution immediately after the slits, while Figures 35.3(b) and 35.3(c) present the intensity pattern within the observation plane. For the assumed system

Figure 35.1 Schematic diagram of a double-slit interferometer. The collimator lens has NA = 0.8, f = 6000λ. The mask has two slits each of width 500λ, the centres of which are separated by d = 6500λ. The focusing lens has NA = 0.008 and f = 6 × 10⁵λ. [Labeled elements: point source, collimator, double-slit mask (slit separation d), focusing lens (focal length f), observation plane, optical axis; transverse axes X and Y.]

Figure 35.2 Geometrical construction showing the relation between the fringe period p, the distance d between the slits, the focal length f of the focusing lens, and the wavelength λ of the light. [Diagram labels: S1, S2, d, f, p, Z; annotated relation λ/d ≈ p/f.]

parameters (f = 6 × 10⁵λ, d = 6500λ) the fringe period found from Eq. (35.1) is p ≈ 92.3λ, in agreement with the simulated results.

Next, assume that a second point source is placed in the focal plane of the collimator lens, slightly displaced from the first one located at the origin. Assume further that the two sources, although both quasi-monochromatic with wavelength λ, are completely uncorrelated and independent, so that their radiation may be considered to be spatially incoherent. The collimated beam arriving at the plane of the slits from this second source will make a small angle ψ with the optical axis. As shown in Figure 35.4, this angle causes the phase of the light arriving at the two slits to differ by Δφ, where³,⁴

$$\Delta\varphi \approx 2\pi\psi d/\lambda. \qquad (35.2)$$
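Equations (35.1) and (35.2) are easy to evaluate for the system of Figure 35.1. The short sketch below measures all lengths in units of the wavelength; the function names are illustrative and not part of the original text.

```python
import math


def fringe_period(f: float, d: float, wavelength: float = 1.0) -> float:
    """Eq. (35.1): fringe period p = f * lambda / d at the observation plane."""
    return f * wavelength / d


def slit_phase_difference(psi: float, d: float, wavelength: float = 1.0) -> float:
    """Eq. (35.2): phase difference (radians) between the slits for a beam tilted by psi."""
    return 2.0 * math.pi * psi * d / wavelength


# Parameters of Figure 35.1, in units of the wavelength.
F_FOCUS = 6.0e5
D_SLITS = 6500.0

p = fringe_period(F_FOCUS, D_SLITS)

# A tilt of lambda/d radians produces a 2*pi phase difference,
# i.e. a fringe translation by exactly one period.
psi_one_period = 1.0 / D_SLITS
shift = slit_phase_difference(psi_one_period, D_SLITS) / (2.0 * math.pi) * p

print(f"fringe period p = {p:.1f} wavelengths")
print(f"fringe shift for psi = lambda/d: {shift:.1f} wavelengths")
```

The first printed value reproduces the 92.3λ period quoted in the text; the second illustrates that a tilt of λ/d radians shifts the fringes by one full period.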

A 2π phase difference at the slits corresponds to a fringe translation by one period at the observation plane. Thus the two sets of fringes arising from the

Figure 35.3 Results of a computer simulation involving a single point source, centered on the optical axis of the system of Figure 35.1. (a) Intensity distribution immediately after the mask. (b) The fringe pattern at the focal plane of the focusing lens. (c) Cross-section of the fringe pattern along the X-axis. [Horizontal axes: x/λ from −5000 to 5000 in (a), from −800 to 800 in (b) and (c); vertical axis in (c): normalized intensity from 0 to 1.]

two point sources will be shifted relative to each other by an amount that will depend on ψ. Since the point sources are uncorrelated, it is their corresponding intensity distributions at the observation plane that will be added together. This results in a ψ-dependence of the fringe visibility V, a quantity defined by Michelson as

$$V = (I_{\max} - I_{\min})/(I_{\max} + I_{\min}), \qquad (35.3)$$

Figure 35.4 A collimated beam arriving at the double-slit mask at an angle ψ relative to the optical axis will exhibit a phase difference of Δφ between the two slits. [Diagram labels: S1, S2, d; phase shift Δφ = 2πψd/λ.]

where I_min and I_max are the minimum and maximum values of intensity within one fringe period.

As an example, let us assume that two point sources of equal strength, separated by 0.5λ along the X-axis, are placed in the focal plane of the collimator lens of Figure 35.1. The angular separation between the two sources as viewed from the plane of the slits, therefore, is ψ = 17.2 seconds of arc.⁵ In the absence of the double-slit mask, the images of the two point sources will be unresolvable at the observation plane; see Figure 35.5(a). With the mask in place, however, the plots of intensity distribution in Figures 35.5(b), (c) indicate that the fringe visibility will be substantially altered from that for a single point source; the latter can be deduced from Figure 35.3(c) to be 100%. Close inspection of the fringes, therefore, enables one to infer the presence of a second source of light in the system.

As a practical matter, one may adjust the distance d between the two slits until the phase shift Δφ of Eq. (35.2) becomes equal to π, at which point the two fringe systems will be shifted by half a period. Under these conditions the maxima of one set of fringes will overlap the minima of the other, resulting in a complete “washing out” of the interference pattern. Equation (35.2) can then be used to determine the angular separation ψ between the two point sources from the knowledge of the slit separation d that resulted in minimum fringe visibility. (Note that changing d will have the undesirable effect of changing the fringe period p according to Eq. (35.1), but, as long as the fringes remain visible to the observer, this change should be inconsequential.)

Dependence of fringe visibility on d

Of course one may not know a priori whether the source is an extended object (such as a large star) or consists of a number of distinct point sources (e.g., a double star) and, in the latter case, whether the point sources are of equal

Figure 35.5 Results of a simulation of the system of Figure 35.1 involving two independent point sources, one centered on the optical axis (i.e., at the origin), the other shifted by 0.5λ along the X-axis. (a) Logarithmic plot of the intensity distribution at the observation plane in the absence of the double-slit mask. (b) Fringe pattern at the observation plane with the double-slit mask present. (c) Cross-section of the fringe pattern along the X-axis. [Horizontal axes: x/λ from −300 to 300 in (a), from −800 to 800 in (b) and (c); vertical axis in (c): normalized intensity from 0 to 1.]
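The loss of visibility for the two-source case of Figure 35.5 can be estimated with a much cruder model than the full diffraction simulation. The sketch below simply adds two ideal cosine fringe patterns (one per source) with the relative shift given by Eq. (35.2) and then applies Eq. (35.3). It assumes equal-strength sources and ignores the diffraction envelope of the finite-width slits; all names are hypothetical, and lengths are in units of the wavelength.

```python
import math

# Parameters of Figure 35.1, in units of the wavelength.
F_COLLIMATOR = 6000.0
F_FOCUS = 6.0e5
D_SLITS = 6500.0
PERIOD = F_FOCUS / D_SLITS          # Eq. (35.1)


def total_intensity(x: float, source_offsets: list[float]) -> float:
    """Sum of the (mutually incoherent) fringe intensities of all sources at position x."""
    total = 0.0
    for x_s in source_offsets:
        psi = x_s / F_COLLIMATOR                    # angle of the collimated beam
        dphi = 2.0 * math.pi * psi * D_SLITS        # Eq. (35.2)
        total += 1.0 + math.cos(2.0 * math.pi * x / PERIOD + dphi)
    return total


def visibility(source_offsets: list[float], samples: int = 2000) -> float:
    """Eq. (35.3), evaluated by sampling one fringe period."""
    values = [total_intensity(k * PERIOD / samples, source_offsets)
              for k in range(samples)]
    i_max, i_min = max(values), min(values)
    return (i_max - i_min) / (i_max + i_min)


psi_arcsec = math.degrees(0.5 / F_COLLIMATOR) * 3600.0
v_single = visibility([0.0])
v_double = visibility([0.0, 0.5])

print(f"angular separation: {psi_arcsec:.1f} arcsec")
print(f"visibility, one source : {v_single:.3f}")
print(f"visibility, two sources: {v_double:.3f}")
```

In this idealized model the angular separation evaluates to about 17.2 seconds of arc, and the visibility drops from 1 for a single source to roughly 0.13 for the pair, because the phase shift between the two fringe systems is then close to (but not exactly) π.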

intensity. It turns out that a measurement of fringe visibility V as a function of the separation d between the slits can provide ample information about the intensity distribution of the source. A pair of equal-intensity stars, for instance, will make the visibility versus d a periodic function, whereas the more-or-less uniform disk of a giant star will give rise to an oscillating V(d) whose magnitude declines with increasing d. Calculations show that in the former case the first zero of V(d)

Figure 35.6 Results of a simulation of the system of Figure 35.1 involving three independent point sources, one centered on the optical axis, the others shifted by ±0.25λ along the X-axis. (a) Logarithmic plot of the intensity distribution at the observation plane in the absence of the double-slit mask. (b) Fringe pattern at the observation plane with the double-slit mask present. (c) Cross-section of the fringe pattern along the X-axis. [Horizontal axes: x/λ from −300 to 300 in (a), from −800 to 800 in (b) and (c); vertical axis in (c): normalized intensity from 0 to 1.]
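The dependence of visibility on the slit separation d can be explored with a compact formula: for mutually incoherent sources and equally illuminated slits, V(d) is the magnitude of the normalized sum of the phase factors exp(i2πdψ/λ) over the source. The sketch below applies this to point sources and, by brute-force numerical integration, to a uniform disk. It is an idealized illustration (no slit-width effects), with hypothetical function names and d expressed in units of the wavelength.

```python
import cmath
import math


def visibility_points(d: float, angles: list[float]) -> float:
    """Visibility for equal-intensity point sources at the given angles (radians)."""
    total = sum(cmath.exp(2j * math.pi * d * a) for a in angles)
    return abs(total) / len(angles)


def visibility_disk(d: float, angular_diameter: float, n: int = 4000) -> float:
    """Visibility for a uniform disk, integrating over strips parallel to the slits."""
    r = angular_diameter / 2.0
    num, den = 0j, 0.0
    for k in range(n):
        a = -r + (k + 0.5) * (2.0 * r / n)      # midpoint of the k-th strip
        chord = math.sqrt(r * r - a * a)        # strip length is proportional to this
        num += chord * cmath.exp(2j * math.pi * d * a)
        den += chord
    return abs(num) / den


# Three sources of Figure 35.6 seen through the slits of Figure 35.1.
three = [-0.25 / 6000.0, 0.0, 0.25 / 6000.0]
v_three = visibility_points(6500.0, three)

# First zeros for a double star and for a uniform disk of angular size psi.
psi = 1.0e-4
v_double_zero = visibility_points(0.5 / psi, [0.0, psi])
v_disk_zero = visibility_disk(1.22 / psi, psi)

print(f"three sources, d = 6500 wavelengths: V = {v_three:.3f}")
print(f"double star at d = 0.5/psi : V = {v_double_zero:.1e}")
print(f"uniform disk at d = 1.22/psi: V = {v_disk_zero:.1e}")
```

For the three-source arrangement this simple model gives V of about 0.25 at d = 6500λ, and it shows the visibility essentially vanishing at d = 0.5λ/ψ for a pair of equal stars and near d = 1.22λ/ψ for a uniform disk.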

appears at d = 0.5λ/ψ, whereas in the latter the first zero occurs at d = 1.22λ/ψ; here ψ is the angle subtended by the diameter of the giant star’s disk.²,³,⁴ In any event, it is clear that a measurement of fringe visibility for several different separations of the slits will provide much information about the distribution of intensity at the source. As another example, we show in Figure 35.6 the case of three point sources of equal intensity placed at x = −0.25λ, 0, and +0.25λ in the system of Figure 35.1. Again the image obtained at the observation plane without the double-slit mask does not resolve the sources of light, but the fringe visibility obtained as a function of d carries enough information to allow one to make a fairly accurate statement about the distribution of intensity at the source.

A historical perspective

R. W. Wood⁶ sums up the origins of the interferometer: “This method was proposed by Fizeau¹ in 1868 for measuring the diameters of the fixed stars. In 1874 Stefan made an attempt to carry out Fizeau’s plan, placing two slits in front of the objective of the Marseilles telescope, the largest available at the time. The fringes remained visible even when the slits were separated by the full diameter of the objective. In 1890 Michelson measured the diameters of the four moons of Jupiter, using the 36 inch telescope of the Lick observatory.⁷ The method can also be used for determining the distance between the components of a double star.

“In 1920 Michelson took up the problem of the determination of stellar diameters.⁸ Even the great 100 inch telescope of the Mount Wilson Observatory is not large enough to allow of a sufficient separation of the slits; consequently Michelson designed a ‘periscopic’ arrangement of four mirrors, the two outer ones, twenty feet apart, reflecting the light to two inner ones which in turn reflected the beams down upon the mirror of the 100 inch telescope. The mirrors were mounted on a metal beam attached to the top of the telescope tube. The instrument was constructed in collaboration with F. G. Pease of the Mount Wilson Observatory.”

A schematic diagram of the stellar interferometer constructed by Michelson (and mentioned by R. W. Wood in the preceding paragraph) is shown in Figure 35.7. In this instrument the distance ℓ between mirrors M1 and M2 was varied to effect a change of fringe visibility; one must therefore substitute ℓ for d in Eq. (35.2) in order to make it applicable to the new instrument. The fringe period p, however, is still determined by the distance d between the slits, and Eq. (35.1) applies to Michelson’s interferometer without any modifications. Thus there is the further advantage that the fringe spacing remains constant as the separation of the movable mirrors is varied.

The interferometer was mounted on the 100 inch reflecting telescope of the Mount Wilson Observatory in California, which was used because of its mechanical strength. The apertures S1 and S2 were 114 cm apart, giving a fringe spacing of about 20 μm in the focal plane. The maximum separation of the outer mirrors was 6.1 m, so that the smallest measurable angular diameter (with λ = 550 nm) was about 0.02 seconds of arc.³ Again quoting R. W. Wood:⁶ “The bright star Betelgeuse was the first investigated. This star shows evidence of its diameter with the 100 inch telescope

Albert Abraham Michelson (1852–1931) was born in what was then Germany (now Poland) and emigrated with his family to the United States in 1855. He became professor of physics at the Case School of Applied Science (Cleveland, Ohio), then at Clark University (Worcester, Massachusetts), and then at the University of Chicago. In 1907 he became the first American to receive a Nobel prize; the prize citation reads: “For his optical precision instruments and the spectroscopic and metrological investigations carried out with their aid.” (Photo: courtesy of AIP Emilio Segrè Visual Archives.)

if a canvas cover is placed over the instrument, provided with two holes 7 inches in diameter and 94 inches apart, the diffraction disk of the star being crossed with faint interference bands. If either hole is covered the bands disappear. If the telescope is pointed at Rigel, however, the bands are clear and strong, showing that its angular diameter is smaller than that of Betelgeuse. With the twenty-foot interferometer the bands disappeared entirely in the case of Betelgeuse when the mirrors were separated by a distance of 120 inches, while Rigel showed very distinct bands. The angular diameter of Betelgeuse was computed as 0.047 seconds of arc. From the known distance of the star [determined by triangulation], its

Figure 35.7 Michelson’s stellar interferometer. The apertures S1 and S2 are fixed, and the light reaches them after reflection at mirrors M1, M2, M3 and M4. The inner mirrors M3 and M4 are fixed, while the outer mirrors M1 and M2 can be moved symmetrically in the direction joining S1 and S2. If the optical paths M1M3S1 and M2M4S2 are maintained equal, the optical path difference for light from a distant point source is the same at S1 and S2 as at M1 and M2, so that the outer mirrors play the part of the movable apertures in the Fizeau method. A plane-parallel glass plate C1, which can be inclined in any direction, is used to maintain the geometrical pencils from S1 and S2 in coincidence in the focal plane. A second plane-parallel glass plate C2, of variable thickness, is used to compensate inequalities of the optical paths M1M3S1 and M2M4S2. (Adapted from Born and Wolf.³) [Additional diagram labels: d, ℓ, observation plane.]

actual diameter was calculated as 250 million miles [i.e., 300 times the diameter of the sun] or greater than the earth’s orbit about the sun [180 million miles across]. Its diameter has been found to vary, however, for at times the mirrors must be separated by a distance of 14 feet before the fringes disappear. Antares was found to be still larger, having a diameter of 400 million miles. The minimum angular diameter measurable with the 20 foot instrument is 0.024 seconds of arc.” The majority of stars are either too distant or too small for the Michelson interferometer to measure their diameter. For example, at the distance of the nearest star (Alpha Centauri) the sun’s disk would subtend an angle of only 0.007 seconds of arc, and to observe the first disappearance of the fringes a mirror separation of 20 m would be necessary. The construction of such a large interferometer would be a difficult undertaking because of the requirement of rigid mechanical connection between the collecting mirrors and the eyepiece.3 In recent years, the method of Hanbury Brown and Twiss as well as extensions of Michelson’s method to radio astronomy have been used for measurements of some of the smaller astronomical objects.2,3,4
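Several of the numbers in this historical account follow from the uniform-disk criterion d = 1.22λ/ψ quoted earlier, with the outer-mirror separation ℓ playing the role of d. The sketch below redoes the arithmetic at λ = 550 nm; the helper names are illustrative, and the single wavelength is an assumption, since the effective wavelength of starlight depends on the star and on the observer.

```python
import math

ARCSEC_PER_RADIAN = math.degrees(1.0) * 3600.0
WAVELENGTH_M = 550e-9      # wavelength assumed in the text for the resolution estimate


def first_null_diameter_arcsec(separation_m: float) -> float:
    """Angular diameter (arcsec) of a uniform disk whose fringes first vanish at this separation."""
    return 1.22 * WAVELENGTH_M / separation_m * ARCSEC_PER_RADIAN


def separation_for_first_null_m(diameter_arcsec: float) -> float:
    """Mirror separation (m) at which fringes first vanish for a disk of this angular diameter."""
    return 1.22 * WAVELENGTH_M / (diameter_arcsec / ARCSEC_PER_RADIAN)


smallest = first_null_diameter_arcsec(6.1)              # maximum mirror separation
betelgeuse = first_null_diameter_arcsec(120 * 0.0254)   # fringes vanished at 120 inches
sun_like = separation_for_first_null_m(0.007)           # sun at the distance of Alpha Centauri

print(f"smallest measurable diameter: {smallest:.4f} arcsec")
print(f"Betelgeuse estimate         : {betelgeuse:.4f} arcsec")
print(f"separation for 0.007 arcsec : {sun_like:.1f} m")
```

At 550 nm the Betelgeuse estimate comes out near 0.045 seconds of arc, slightly below the 0.047 quoted by Wood; the small difference may simply reflect a somewhat longer effective wavelength in the original reduction. The other two results agree with the 0.02 seconds of arc and the 20 m separation stated in the text.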

References for Chapter 35

1 H. Fizeau, C. R. Acad. Sci. Paris 66, 934 (1868).
2 For example, L. Mandel and E. Wolf, Optical Coherence and Quantum Optics, Cambridge University Press, London, 1995.
3 M. Born and E. Wolf, Principles of Optics, 6th edition, Pergamon Press, Oxford, 1980.
4 M. V. Klein, Optics, Wiley, New York, 1970.
5 One degree is 60 minutes, and one minute is 60 seconds of arc. One second of arc is the angle subtended by a small coin at a distance of about 3.5 km.
6 R. W. Wood, Physical Optics, third edition, reprinted by the Optical Society of America, 1988.
7 A. A. Michelson, Phil. Mag. 30, 1 (1890); A. A. Michelson, Nature (London), 45, 160 (1891).
8 A. A. Michelson, Astrophys. J. 51, 257 (1920); A. A. Michelson and F. G. Pease, Astrophys. J. 53, 249 (1921).

36 Bracewell’s interferometric telescope

There are countless suns and countless earths all rotating around their suns in exactly the same way as the seven planets of our system. We see only the suns because they are the largest bodies and are luminous, but their planets remain invisible to us because they are smaller and non-luminous. The countless worlds in the universe are no worse and no less inhabited than our Earth.

Giordano Bruno (1584) in De L’Infinito Universo E Mondi

In 1978 Ronald Bracewell of Stanford University proposed the use of a nulling interferometer to cancel the image of a bright star in order to observe the relatively faint planets which might be in orbit around the star.¹ This idea, which has been expounded and further extended by others,²,³,⁴ is presently the most promising method of detecting terrestrial planets (i.e., small, rocky planets similar to Venus, Earth, and Mars) orbiting in habitable zones around our neighboring stars. Because atmospheric turbulence distorts the stellar wavefronts and limits the resolution of ground-based observations, an interferometric telescope capable of detecting planets in other solar systems must, of necessity, be stationed in space. The National Aeronautics and Space Administration (NASA) is currently working on a program called the Terrestrial Planet Finder (TPF), and has tentatively scheduled the launch of a nulling interferometer into orbit in about the year 2010.⁵

Nulling interferometer

Figure 36.1 is a diagram of a basic Bracewell telescope intended for operation in the infrared range of wavelengths λ ≈ 7–20 μm. The reason for working in the infrared is that the expected brightness of the star in this region is only 10⁶ times that of the planet, which is much better than the 10⁹ brightness ratio in the visible. Moreover, several signature absorption lines corresponding to ozone, water vapor, methane,

Figure 36.1 (adapted from reference 4). Diagram of a nulling interferometric telescope. The primary mirrors have diameter Dp and baseline B. Unmatched reflections are made at nearly normal incidence to minimize polarization differences. The folded beams are combined at the beam-splitter (BS), which is designed for equal transmission and reflection in the desired range of wavelengths. The beam from the left-hand side, after crossing the axis of the telescope, is folded down at M1 and transmitted by BS before coming to a focus. The beam from the right-hand side is folded downward at M2 before reaching the axis of the telescope, and then passes through a compensator plate CP and is reflected back up at M3 to equalize the path lengths before being reflected from the underside of BS. An achromatic 180° phase difference is realized by balancing a slight difference in the air path with the path difference between BS and CP, fine-tuned by a slight rotation of CP. [Additional diagram labels: telescope axis, detector.]

and carbon dioxide reside in this band, which can be exploited in the spectroscopic analysis of these planets to determine whether they harbor life as we know it.⁵

In the following discussion we confine our attention to a single infrared wavelength of λ = 10 μm, even though the interferometric telescope can operate over a fairly broad range of wavelengths. We assume the primaries each have an aperture diameter Dp = 1 m and focal length fp = 2.5 m. (The angular resolution of the individual mirrors is thus λ/Dp = 10⁻⁵ radians.) The assumed baseline (i.e., center-to-center separation of primary mirrors) is B = 5 m.† With the telescope pointing at a star, an angular separation of θ = 10⁻⁶ radians between the star and its planet results in a relative phase of 2πBθ/λ = 180° between the light rays arriving at the two mirrors from the planet. (1 μrad ≈ 0.2 arcsec is the separation between the Sun and the Earth observed from 16 light years away.) The separation of the planet from its parent star in this case is an order of magnitude below the resolution of the individual mirrors, yet the assumed nulling interferometer is capable of detecting the planet in the vicinity of the star.

The secondary mirrors in the system of Figure 36.1, placed at z = 25 cm before the primary focus, bring the reflected beam to a final focus at z′ = 5 m in front of the secondary. These negative mirrors, designed for a 20:1 conjugate ratio, have aperture diameter Ds = 10 cm, focal length fs = 26.32 cm, and magnification Ms = 20. The focused cone of light emerging from each secondary is an f/50 beam (i.e., numerical aperture NA′ = 0.01), giving rise to an Airy disk diameter of 1.22λ/NA′ = 1.22 mm at the image plane. The light from the planet, entering the primary at the oblique angle of θ = 10⁻⁶ radians, emerges from the secondary at θ′ = 10θ. (The secondary is ten times closer than the primary to the virtual image of the sky at the primary’s focal plane.) The final image of the planet, therefore, is shifted by Δr = z′θ′ = 50 μm from the image of the star at the center of the image plane. This separation, being more than an order of magnitude below the Airy disk diameter of 1220 μm, is clearly insufficient to resolve the planet’s image from the parent star’s, confirming once again the inadequacy of the individual mirrors for the task.

† These parameters, chosen for the sole purpose of demonstrating the basic concepts, are not representative of the planned systems. A typical design under consideration by the TPF program, for example, has four primary mirrors, each 2.5 m in diameter and separated by 100 m baselines. It is envisioned that these free-flying mirrors would collect and forward the beam of light to a local combiner and controller unit (also free-flying). The planned system will be capable of executing nulling interferometry over the broad band of λ = 7–20 μm. The adjustment of mirror positions and their distances from each other as well as from the combiner would allow the configuration to be optimized on the spot in accordance with the characteristics of the particular solar system under consideration.⁵

The case against a conventional telescope

Even a conventional (filled-aperture) space telescope 25 m in diameter will fail to detect the planet in the preceding example. The problem in this case is not resolution but photon noise.
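The figures quoted for the two-mirror example (relative phase, Airy disk diameter, image shift, secondary focal length) can be reproduced with elementary paraxial relations. The sketch below is only a consistency check under thin-mirror, small-angle assumptions; the symbol θ for the star-planet separation and all variable names are choices made here for illustration.

```python
import math

# Parameters quoted in the text (SI units).
WAVELENGTH = 10e-6     # operating wavelength
BASELINE = 5.0         # center-to-center separation B of the primaries
D_PRIMARY = 1.0        # primary aperture diameter
F_PRIMARY = 2.5        # primary focal length
Z_BEFORE_FOCUS = 0.25  # secondary position ahead of the primary focus
Z_IMAGE = 5.0          # distance from secondary to the final focus
D_SECONDARY = 0.10     # secondary aperture diameter
THETA = 1e-6           # assumed star-planet angular separation (radians)

phase_deg = math.degrees(2.0 * math.pi * BASELINE * THETA / WAVELENGTH)
single_mirror_resolution = WAVELENGTH / D_PRIMARY

# Output cone of the secondary and the resulting Airy disk diameter.
na_out = (D_SECONDARY / 2.0) / Z_IMAGE
airy_diameter = 1.22 * WAVELENGTH / na_out

# Angular magnification of the secondary and the planet's image displacement.
theta_out = THETA * F_PRIMARY / Z_BEFORE_FOCUS
delta_r = Z_IMAGE * theta_out

# Magnitude of the secondary focal length from the mirror equation
# (virtual object at 0.25 m, real image at 5 m).
f_secondary = 1.0 / (1.0 / Z_BEFORE_FOCUS - 1.0 / Z_IMAGE)

print(f"relative phase across baseline: {phase_deg:.1f} deg")
print(f"single-mirror resolution      : {single_mirror_resolution:.1e} rad")
print(f"Airy disk diameter            : {airy_diameter * 1e3:.2f} mm")
print(f"planet image shift            : {delta_r * 1e6:.1f} um")
print(f"|f_s| of the secondary        : {f_secondary * 100:.2f} cm")
```

The results match the values in the text: a 180° phase difference across the baseline, a 1.22 mm Airy disk, a 50 μm image shift, and a secondary focal length of 26.32 cm in magnitude.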
The image of the star, being about 106 times brighter than that of the planet, floods the detectors and obscures the planet’s signal. The nulling interferometer, however, yields an acceptable signal-to-noise ratio at the detector output by canceling the light of the star arriving from the two mirrors while, at the same time, enhancing the image of the planet by constructive interference. Not only does the nulling interferometer eliminate the complete Airy pattern of the star (i.e., the central disk as well as the rings), it does so without requiring any significant displacement of the planet’s Airy pattern. What is important for the nulling interferometer is not how much the two Airy disks in the image plane are separated from each other but how much the wavefront arriving from the planet at one mirror is


delayed relative to the time of arrival of the same wavefront at the other. This delay or phase shift, being a function of the baseline B, is independent of the mirror diameter Dp.

Destructive and constructive interference

As a specific example consider the system of Figure 36.1 with the aforementioned parameters. The beam-splitter (BS) is an important component of this system; to simulate its behavior we used a six-layer stack on a 1 mm substrate, as shown in Figure 36.2. The alternate layers are high- and low-index dielectrics, their thicknesses chosen to yield a 50/50 beam-splitter at the operating wavelength of λ = 10 μm. The top of the substrate is anti-reflection coated with a low-index layer to minimize undesirable reflections. This stack design, although adequate for demonstration purposes, is not suitable for broad-band applications requiring cancellation of the star light with high accuracy. Such applications require alternative designs or more complex multilayer stacks.

Figure 36.3 shows computed images of a single star obtained with the above telescope. The Airy pattern in Figure 36.3(a) is obtained when the light from one arm of the telescope is blocked. When both channels are open and properly balanced, destructive interference at the beam-splitter yields the null image in Figure 36.3(b); here the peak intensity is only 1.4 × 10⁻⁴ times that in Figure 36.3(a). The weak residual image of the star in Figure 36.3(b) is due to a slight imbalance of the two channels brought about by the beam-splitter’s minute departure from

Figure 36.2 A simple design for the beam-splitter (BS) in the system of Figure 36.1, having a 50/50 reflection to transmission ratio at λ = 10 μm. The 1 mm-thick substrate has refractive index n = 2, and is antireflection coated (AR) on the top surface with a t = 1.785 μm layer having n = 1.4. Deposited on the substrate bottom is a six-layer stack. Numbered in increasing order starting at the substrate interface, these layers have the following parameters: layers 1, 3, 5, t = 1.7 μm, n = 1.5; layers 2, 4, t = 1.25 μm, n = 2.0; layer 6, t = 0.475 μm, n = 2.0.
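As a quick plausibility check on the anti-reflection layer quoted in the caption, the sketch below assumes normal incidence from air (n = 1) and a lossless single layer; neither assumption is stated in the text, and the function names are illustrative only. Under those assumptions a quarter-wave layer has thickness λ/(4n), and its residual reflectance follows from the standard single-layer formula.

```python
import math

def quarter_wave_thickness(wavelength_um, n_layer):
    """Physical thickness of a quarter-wave layer at normal incidence."""
    return wavelength_um / (4.0 * n_layer)

def quarter_wave_reflectance(n_in, n_layer, n_sub):
    """Reflectance of a lossless quarter-wave layer between two media
    (normal incidence): ((n_in*n_sub - n_layer^2)/(n_in*n_sub + n_layer^2))^2."""
    num = n_in * n_sub - n_layer ** 2
    den = n_in * n_sub + n_layer ** 2
    return (num / den) ** 2

wavelength_um = 10.0    # operating wavelength from the caption
n_ar, n_sub = 1.4, 2.0  # AR-layer and substrate indices from the caption

t_ar = quarter_wave_thickness(wavelength_um, n_ar)
r_bare = ((1.0 - n_sub) / (1.0 + n_sub)) ** 2  # uncoated air/substrate surface
r_coated = quarter_wave_reflectance(1.0, n_ar, n_sub)

print(f"quarter-wave thickness:   {t_ar:.3f} um")
print(f"bare-surface reflectance: {r_bare:.4f}")
print(f"coated reflectance:       {r_coated:.2e}")
```

The computed thickness of about 1.786 μm is close to the 1.785 μm quoted in the caption, and the layer index of 1.4 is close to the ideal value √2 ≈ 1.414 for a substrate of index 2.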

[Figure 36.3, panels (a) and (b); horizontal axes: x/λ from −300 to 300.]

Figure 36.3 Logarithmic intensity distributions in the image plane corresponding to a single star with no planets: (a) when the light from either arm of the interferometer is blocked; (b) with both channels open and the path lengths properly balanced to allow interferometric cancellation of the star’s image. The ratio of the peak intensity in (b) to that in (a) is 1.4 × 10⁻⁴. The non-zero values of the residual intensity in (b) are due to imperfect balance between the two channels.

the ideal 50/50 ratio. Although this four-orders-of-magnitude reduction of intensity in the null image is sufficient for the present discussion, it is totally unacceptable for the observation of actual terrestrial planets. Because the radiation levels of these planets are expected to be at least a million times weaker than their parent star’s, it is imperative to design the telescope components with much higher accuracy, for maximum rejection of the star light.

Consider next a planet only 100 times weaker than its star, with an angular separation of 25 μrad. With one channel of the telescope blocked the image in Figure 36.4(a) is obtained, whereas in a balanced interferometer one obtains the image in Figure 36.4(b). It is obvious in this example that the single-channel output, corresponding to a conventional telescope’s image, shows a faintly visible planet next to a bright star, whereas in the interferometric image the light from the star is all but eliminated. Note that for clarity of presentation we have chosen a relatively bright planet with a large separation from its parent star (25 μrad ≈ 5 arcsec). Both these assumptions are much too optimistic and, in practice, one must substantially improve the sensitivity of the assumed telescope in order to detect terrestrial planets in our neighboring solar systems [5].

The fringe pattern and the spinning telescope

Figure 36.5 shows a Bracewell telescope oriented with its baseline in the XY-plane at an angle θ from X, pointing at a star along the Z-axis. Each point in

[Figure 36.4, panels (a) and (b); horizontal axes: x/λ from −300 to 300.]

Figure 36.4 Images of star and planet, when the assumed brightness of the planet is 1% of the star’s and their angular separation is 5.16 arcsec. (a) When either beam is blocked the intensity distribution is essentially that of the star; the planet is barely visible. (b) With both channels open and the phase difference between them adjusted to 180°, the bright star is canceled out and the planet becomes visible. The center-to-center spacing between the images of the star and the planet is 1.25 mm, and the ratio of the peak intensity in (b) to that in (a) is 0.038.
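The figures quoted in this caption and in the earlier two-mirror example can be reproduced with a few lines of arithmetic. The sketch below is only a consistency check: the primary focal length of 2.5 m is an inference from the statement that the secondary, 25 cm from the primary focus, is ten times closer to it than the primary; the small-angle approximation is used throughout, and the variable names are illustrative.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

wavelength = 10e-6   # m, operating wavelength
baseline = 5.0       # m, separation B of the primary mirrors
f_primary = 2.5      # m, inferred (assumption): 10 x the 25 cm secondary offset
m_secondary = 20.0   # magnification Ms of the secondary
na_image = 0.01      # numerical aperture NA' of the f/50 output cone

def relative_phase_deg(phi):
    """Phase difference between the two arms for a source at angle phi
    (radians) measured along the baseline direction."""
    return math.degrees(2.0 * math.pi * baseline * phi / wavelength)

def image_offset(phi):
    """Image-plane displacement Ms * fP * phi (small-angle form of Eq. 36.1)."""
    return m_secondary * f_primary * phi

airy_diameter = 1.22 * wavelength / na_image

for phi in (1e-6, 25e-6):
    print(f"phi = {phi * 1e6:4.0f} urad = {phi * ARCSEC_PER_RAD:.3f} arcsec, "
          f"phase = {relative_phase_deg(phi):.0f} deg, "
          f"offset = {image_offset(phi) * 1e6:.0f} um")
print(f"Airy disk diameter = {airy_diameter * 1e3:.2f} mm")
```

With these inputs the 1 μrad planet gives a 180° phase and a 50 μm image offset, while the 25 μrad planet gives 5.16 arcsec and a 1.25 mm offset. Its phase of 4500° equals 180° modulo 360°, so along the baseline direction it also falls on a bright fringe.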

[Figure 36.5: axes X, Y, Z with the angles φ, ψ, and θ marked.]

Figure 36.5 The Bracewell telescope, with its baseline in the XY-plane and oriented at an angle θ from X, targeting a star along the Z-axis. The points in the vicinity of the star are identified by their polar and azimuthal coordinates φ, ψ. The wavefront, arriving from an oblique direction, reaches one mirror later than the other, producing a path-length difference of B sin φ cos(ψ − θ).


the star’s neighborhood may be identified either by its angular coordinates φ, ψ or by the Cartesian coordinates x, y of its image in the telescope. The image location is related to the polar coordinates through the equation

$$ (x, y) = M_s f_P \tan\phi \, (\cos\psi, \sin\psi). \tag{36.1} $$

Here fP is the focal length of the primary mirror and Ms is the magnification of the secondary. For the light arriving at the two primaries from the direction (φ, ψ) in the sky the relative phase is

$$ \Delta\Phi = 2\pi (B/\lambda) \sin\phi \cos(\psi - \theta). \tag{36.2} $$

The two arms of the telescope are adjusted in such a way that when ΔΦ = 0 the two beams interfere destructively whereas a 180° phase shift results in constructive interference. The corresponding light amplitude at the image plane is thus given by

$$ A = \tfrac{1}{2} A_0 \left[ 1 - \exp(i \Delta\Phi) \right], \tag{36.3} $$

and the resulting intensity may be written

$$ |A|^2 = |A_0|^2 \sin^2(\tfrac{1}{2} \Delta\Phi) = |A_0|^2 \sin^2\left[ \pi (B/\lambda) \sin\phi \cos(\psi - \theta) \right]. \tag{36.4} $$

Figure 36.6 is a gray-scale plot of |A/A0|² in the image plane of the telescope (black and white represent 0 and 1, respectively). Each point (x, y) in this plane corresponds to a point (φ, ψ) in the sky in accordance with Eq. (36.1); it is also assumed that the primary mirrors are separated by B = 5 m along the θ = 45° line.

[Figure 36.6; horizontal axis: x/λ from −100 to 100.]

Figure 36.6 Gray-scale plot of |A/A0|² (Eq. 36.4) in the image plane of the telescope (black and white represent 0 and 1, respectively). The primary mirrors are separated by B = 5 m along the θ = 45° line; λ = 10 μm.
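Equation (36.4) is simple enough to evaluate directly. The following sketch, whose function and parameter names are illustrative, computes the normalized transmission for a point source at sky coordinates (φ, ψ) and the angular fringe period λ/B that follows from the small-angle form of Eq. (36.2).

```python
import math

def null_transmission(phi, psi, theta, baseline, wavelength):
    """Normalized intensity |A/A0|^2 of Eq. (36.4) for a source at polar
    angle phi and azimuth psi, with the baseline at angle theta (radians)."""
    half_phase = (math.pi * (baseline / wavelength)
                  * math.sin(phi) * math.cos(psi - theta))
    return math.sin(half_phase) ** 2

B = 5.0                     # m, baseline
LAM = 10e-6                 # m, wavelength
THETA = math.radians(45.0)  # baseline orientation used for Figure 36.6

# Sky angle between adjacent bright fringes, measured along the baseline
fringe_period = LAM / B

on_axis = null_transmission(0.0, 0.0, THETA, B, LAM)
along = null_transmission(1e-6, THETA, THETA, B, LAM)
across = null_transmission(1e-6, THETA + math.pi / 2, THETA, B, LAM)

print(f"on-axis star:                 {on_axis:.3f}")
print(f"1 urad along the baseline:    {along:.3f}")
print(f"1 urad across the baseline:   {across:.3f}")
print(f"fringe period: {fringe_period * 1e6:.1f} urad")
```

The on-axis star is nulled, a source 1 μrad away along the baseline is fully transmitted, and a source at the same separation perpendicular to the baseline stays dark, which is the striped pattern of Figure 36.6.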


The field of view is centered at (x, y) = (φ, ψ) = (0, 0), which is the target star’s location. Only a circle of radius 100λ within the field of view (corresponding to 0 ≤ φ ≤ 20 μrad) is shown in Figure 36.6, but the same pattern could extend over a much larger patch of sky around the targeted star. Any planet (or other source of radiation) located in the bright fringes of Figure 36.6 will produce a bright Airy pattern in the image plane at that location. However, planets located in the dark fringes disappear from the image because destructive interference cancels them out.

If the telescope is rotated around the Z-axis while maintaining a tight fix on the target star, θ will change continuously and the pattern of Figure 36.6 will rotate around its center. The image of a planet within the field of view, however, will remain fixed while the fringes rotate. The planet’s image thus waxes and wanes as the bright and dark fringes cross it one after the other. The number of times that the planet’s image appears and disappears in a single revolution of the telescope depends on the polar coordinate φ of the planet; specifically, the frequency of the planet’s signal at the detector output increases in proportion to its separation from its parent star. In this way it is possible to modulate the signal of a given planet and, by integration over time, to reduce noise components residing outside the specific frequency of the planet’s signal [1].

Interplanetary dust and zodiacal light

In addition to the Sun and the nine planets and their moons, our solar system is home to countless rocks, pebbles, and dust particles floating in interplanetary space. The light of the Sun scattered from these dust particles (the so-called zodiacal light) will enter a space-based telescope and create a background noise.
A similar diffuse radiation from the targeted solar system (exo-zodiacal light) will also be imaged as a fairly uniform distribution across the telescope’s field of view [4,5]. It is true, of course, that the ideal image of a broad, uniform source of light in the Bracewell telescope should resemble the striped pattern of Figure 36.6. However, the Airy disk produced by the finite aperture of each mirror is typically many times larger than the fringe spacing in Figure 36.6, and, therefore, the image of the zodiacal light will be the convolution between the Airy pattern of Figure 36.3(a) and the stripes of Figure 36.6. The zodiacal emissions, therefore, appear as a fairly uniform distribution in the image plane of the telescope. The shot noise from this captured background radiation is mainly responsible for the unavoidable noise in the photodetector output, its elimination requiring the spinning of the telescope followed by integration of the signal over time, as discussed in the preceding section.
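To make the modulation argument of the spinning telescope concrete, the sketch below rotates the baseline through one revolution and counts how many times the signal of Eq. (36.4) rises through half its maximum, for planets at several separations. The planet positions, the sampling density, and the half-maximum counting rule are choices made here for illustration; they are not parameters from the text.

```python
import math

def null_transmission(phi, psi, theta, baseline=5.0, wavelength=10e-6):
    """Normalized intensity of Eq. (36.4); angles in radians, lengths in m."""
    half_phase = (math.pi * (baseline / wavelength)
                  * math.sin(phi) * math.cos(psi - theta))
    return math.sin(half_phase) ** 2

def bright_crossings_per_turn(phi, psi=0.3, samples=36000):
    """Count upward crossings of the 0.5 level while theta sweeps 0..2*pi."""
    signal = [null_transmission(phi, psi, 2.0 * math.pi * k / samples)
              for k in range(samples)]
    count = 0
    for k in range(samples):
        nxt = signal[(k + 1) % samples]  # wrap around to close the revolution
        if signal[k] < 0.5 <= nxt:
            count += 1
    return count

for phi_urad in (3.0, 6.0, 12.0):
    n = bright_crossings_per_turn(phi_urad * 1e-6)
    print(f"planet at {phi_urad:4.1f} urad: {n} bright-fringe crossings per turn")
```

The counts double each time the separation doubles, in line with the statement that the frequency of the planet's signal grows in proportion to its separation from the star.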


Effect of star’s finite diameter

In the presence of pointing errors or when the star has a finite angular diameter, the star light leaks out of the null and swamps the planet’s image. Figure 36.7(a) shows the computed image of a star having an angular diameter of 0.05 arcsec, obtained in the nulling interferometer of the previous examples. To simulate this finite-size star we assumed 25 equally bright point sources spread over the surface of the star and superimposed the intensities of their Airy patterns at the image plane. The peak intensity of this image is about 230 times stronger than that in Figure 36.3(b), which was obtained under identical conditions except for neglect of the star’s diameter. It is clear that the interferometer’s null must be made broader if such effects of the finite diameter (as well as any pointing errors) are to be avoided. A proposed solution to this problem involves the use of several telescopes instead of just two, as in the original Bracewell concept. With the beams from four or more telescopes combined in a nulling interferometer, it is possible to broaden the central null of the fringe pattern [2,3].

Achromatic path-length equalization

The compensator plate CP in the system of Figure 36.1 is used to balance the path lengths of the two interferometer arms over a range of wavelengths. In the particular case studied in reference 4, CP was 42 μm thicker than BS, and achromaticity was achieved for λ = 10–14 μm.

[Figure 36.7, panels (a) and (b); horizontal axes: x/λ from −300 to 300.]

Figure 36.7 (a) Null image of a star of finite diameter (0.05 arcsec). The peak intensity is about 230 times greater than that in Figure 36.3(b). (b) Image of the finite-diameter star and its planet. This image should be compared with Figure 36.4(b), which was obtained under identical conditions except for neglect of the angular diameter of the star. The peak intensities of the planet and star in the present image are nearly the same.
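A rough feel for the leakage caused by the star's finite size can be had by averaging Eq. (36.4) over a uniformly bright disk. The sketch below is not the simulation used for Figure 36.7, which superimposes the Airy patterns of 25 point sources; it merely averages the fringe transmission over an equal-area polar grid, and the grid density and function name are assumptions of this example.

```python
import math

RAD_PER_ARCSEC = math.pi / (180.0 * 3600.0)

def mean_null_leakage(star_diameter_arcsec, baseline=5.0, wavelength=10e-6,
                      rings=50, spokes=72):
    """Average of sin^2(pi*(B/lambda)*phi*cos(psi)) over a uniform disk.
    Ring radii are chosen so that every sample represents the same area."""
    phi_max = 0.5 * star_diameter_arcsec * RAD_PER_ARCSEC
    total = 0.0
    for i in range(rings):
        phi = phi_max * math.sqrt((i + 0.5) / rings)
        for j in range(spokes):
            psi = 2.0 * math.pi * (j + 0.5) / spokes
            half_phase = (math.pi * (baseline / wavelength)
                          * phi * math.cos(psi))
            total += math.sin(half_phase) ** 2
    return total / (rings * spokes)

leak = mean_null_leakage(0.05)
print(f"mean transmission of the null over the stellar disk: {leak:.4f}")
```

A point-like star on axis would be transmitted with exactly zero intensity, so the average of just under 1% obtained here is, in this simplified picture, the fraction of the star light that escapes the null.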


Let d1 and d2 be the optical path lengths of the two channels in air, and denote by t1 and t2 the thicknesses of CP and BS, respectively. These plates are made of the same material, whose refractive index within the wavelength range of interest may be approximated by n(λ) ≈ a + bλ (a and b are material constants). Also, one must take into account the 90° phase shift introduced by the (symmetric) beam-splitter between the reflected and transmitted beams. The overall optical phase difference between the two channels is thus given by

$$ \Delta\Phi = \tfrac{1}{2}\pi + 2\pi \left[ d_1 + t_1 n(\lambda) - d_2 - t_2 n(\lambda) \right] / \lambda \approx \tfrac{1}{2}\pi + 2\pi \left\{ (t_1 - t_2) b + \left[ d_1 - d_2 + (t_1 - t_2) a \right] / \lambda \right\}. \tag{36.5} $$

In this equation the first bracketed term can be chosen to yield a 90° phase shift by selecting the plate thicknesses such that (t1 − t2)b = ¼. The second bracketed term is dependent on λ and must therefore be set to zero. Since t1 − t2 is already fixed, elimination of the second term requires an adjustment of d1 − d2, the path-length difference in air. In practice these adjustments are made iteratively by changing d1 − d2 while rotating CP by small amounts until the desired null is achieved.

References for Chapter 36

[1] R. N. Bracewell, Detecting nonsolar planets by spinning infrared interferometer, Nature 274, 780–781 (1978).
[2] J. R. P. Angel and N. J. Woolf, Searching for life on other planets, Scientific American 274, 60–66 (April 1996).
[3] N. Woolf and J. R. Angel, Astronomical searches for Earth-like planets and signs of life, Ann. Rev. Astron. Astrophys. 36, 507–537 (1998).
[4] P. M. Hinz et al., Imaging circumstellar environments with a nulling interferometer, Nature 395, 251–253 (1998).
[5] J. R. Angel et al., TPF: Terrestrial Planet Finder, JPL publication 99-3, May 1999. For more information visit the worldwide web at http://tpf.jpl.nasa.gov.

37 Scanning optical microscopy†

† The coauthors of this chapter are Lifeng Li and Wei-Hung Yeh.

The diffraction-limited focusing of a laser beam to either explore or modify a surface is the basis of several important technologies. Examples include scanning optical microscopy, optical disk data storage, and laser printing. The size of the focused spot and the corresponding depth of focus are important factors in determining the performance characteristics of these systems. In this chapter we examine methods of forming the focused spot, and clarify the relation between spot size and depth of focus.

Principle of operation

The essential features of a scanning optical microscope are shown in Figure 37.1. A laser beam is sent through an objective lens to form a focused spot on the sample. Ideally, the objective is corrected for all aberrations, yielding a diffraction-limited focused spot. The light reflected from the sample returns through the objective and is redirected by the beam-splitter to a detection module. The detection module may be designed to monitor the power, the phase, or the polarization state of the returning beam. The electrical signal S(x, y) produced by the detector is thus representative of the small area of the sample illuminated by the focused spot at and around the point (x, y). The sample is moved to different locations by the XY stage on which it is mounted; the signal S(x, y), plotted against the sample’s position, yields an image of the sample’s surface over the desired area.

Spot size at best focus

The most important component of any optical microscope is its objective lens. The quality of the focused spot produced by the objective determines the resolution of the

[Figure 37.1: laser, collimator, beam-splitter, objective, sample on XY scanner, and detector module producing S(x, y); axes X, Y, Z.]

Figure 37.1 Schematic diagram of a scanning optical microscope. The objective lens focuses the laser beam at the point (x, y) on the sample. The XY scanner on which the sample is mounted moves the sample in small steps along both X- and Y-directions, covering the area of interest. At each point the reflected light is picked up by the detector module and converted to a signal S(x, y). A plot of S(x, y) constitutes the image of the scanned area.

images obtained, so it is important to have a very small, aberration-free spot at the focal plane of the objective. Figure 37.2, a schematic drawing of an objective lens, defines some of its important characteristics. The converging cone of light has a half-angle θ. The numerical aperture NA of the lens is defined in terms of this half-angle and the refractive index n of the medium in which the sample is immersed:

$$ \mathrm{NA} = n \sin\theta. \tag{37.1} $$

When the sample is in air (n = 1) the numerical aperture is less than unity. However, if the sample is embedded in a liquid or solid of refractive index n > 1, the numerical aperture can be as large as n. The diameter D of an aberration-free focused spot is given by diffraction theory as [1]

$$ D \sim \lambda_0 / \mathrm{NA}, \tag{37.2} $$

where λ0 is the vacuum wavelength of the laser beam. The above equation gives only a rough estimate of the spot diameter, the exact value depending on how the diameter is defined [e.g., the diameter of the first dark ring of the Airy disk, the full width at half maximum (FWHM) of the intensity distribution, etc.], on the distribution of light at the entrance pupil of the lens (e.g., uniform, truncated Gaussian, etc.), and on the state of polarization of the laser beam. The proportionality constant between D and λ0/NA is typically between 0.5 and 1.5, depending on the circumstances. Figure 37.3 shows plots of intensity distribution at the focal plane of the 0.615NA objective shown in Figure 37.2. The incident beam is assumed to be uniform and

[Figure 37.2: objective lens with axes X, Y, Z, the cone half-angle θ, and the depth of focus 2δ marked.]

Figure 37.2 A polarized beam of light is brought to diffraction-limited focus by a microscope objective lens. Since scanning microscopy is typically done with a monochromatic laser beam, chromatic aberrations of the lens are of no concern. Bending of the polarization vector, however, is significant and must be taken into consideration. The half-angle θ of the focused cone is used to define the NA-value of the lens. The depth of focus is within ±δ of the focal plane. For a high-NA singlet, such as the plano-convex lens shown here, diffraction-limited performance over a flat field can be achieved only with an aspheric surface. This particular lens, designed for operation at λ0 = 633 nm, has the following set of parameters: n = 1.806092, Rc = 0.9846 mm, K = 1.00938, A4 = 6.16672 × 10⁻², A6 = 1.42948 × 10⁻², A8 = 2.14376 × 10⁻², A10 = 8.12147 × 10⁻³, aperture radius = 1 mm, thickness = 1.142 mm. The lens NA-value is 0.615 and its focal length is 1.2315 mm.

[Figure 37.3, three panels; horizontal axes: x from −2 μm to 2 μm.]

Figure 37.3 Logarithmic plots of intensity distribution at the focal plane of the 0.615NA objective shown in Figure 37.2. The incident beam is uniform and has linear polarization along the X-axis. From left to right: X-, Y-, and Z-components of polarization at best focus. The integrated intensities of the three components are in the ratios 1 : 0.002 : 0.113.

linearly polarized along the X-axis. The bending of the rays by the lens produces E-field components along the Y- and Z-axes as well; the distributions of these components, which carry only a small fraction of the total optical energy, are shown in Figure 37.3(b), (c). The logarithmic scale of these plots enhances the rings of light around the central bright spot; in fact these rings are typically weak and do not contribute much to the scanning signal. The central bright spot in Figure 37.3(a)


is the most important contributor to the signal, but for accurate measurements the effects of the entire focused spot should be taken into consideration.
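The dependence of the constant in Eq. (37.2) on the definition of "diameter" can be illustrated with the scalar Airy pattern of a uniformly illuminated aperture. This is a sketch under the scalar approximation only, so it ignores the polarization effects discussed above; the helper names and the numerical method (Bessel's integral plus bisection, standard library only) are choices made for this example.

```python
import math

def bessel_j1(x, steps=400):
    """J1(x) from Bessel's integral (1/pi) * int_0^pi cos(t - x*sin t) dt,
    evaluated with the midpoint rule."""
    h = math.pi / steps
    return sum(math.cos((k + 0.5) * h - x * math.sin((k + 0.5) * h))
               for k in range(steps)) * h / math.pi

def airy(x):
    """Normalized Airy intensity [2*J1(x)/x]^2, with x = 2*pi*NA*r/lambda0."""
    return 1.0 if x == 0.0 else (2.0 * bessel_j1(x) / x) ** 2

def bisect(func, lo, hi, tol=1e-10):
    """Root of func on [lo, hi]; func(lo) and func(hi) must differ in sign."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x_half = bisect(lambda x: airy(x) - 0.5, 1.0, 2.0)  # half-maximum radius
x_zero = bisect(bessel_j1, 3.0, 4.5)                # first dark ring

# Diameters in units of lambda0/NA: D = 2r = (x / pi) * (lambda0 / NA)
c_fwhm = x_half / math.pi
c_ring = x_zero / math.pi

lambda0_um, na = 0.633, 0.615
print(f"FWHM            = {c_fwhm:.3f} lambda0/NA = {c_fwhm * lambda0_um / na:.3f} um")
print(f"first dark ring = {c_ring:.3f} lambda0/NA = {c_ring * lambda0_um / na:.3f} um")
```

The two definitions give constants of roughly 0.51 and 1.22, both within the range of 0.5 to 1.5 quoted in the text.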

Depth of focus

Another important characteristic of the focused spot is its depth of focus. Typically, for high-NA objectives the range over which the spot size can be considered to be small is quite limited. As shown in Figure 37.2, if the sample moves by ±δ along the Z-axis, deviations from perfect focus may be tolerable; for larger movements, the quality of the scanning signal suffers. The order of magnitude of the depth of focus is given by the theory of diffraction as δ/λ ∼ (D/λ)², which is an expression for the Rayleigh range [2] of the beam in a medium in which the wavelength is λ. This expression may be written as

$$ \delta \sim D^2 / \lambda. \tag{37.3} $$

The proportionality constant between δ and D²/λ depends on the performance criteria of the system and may be anywhere in the range 0.1 to 1. For the 0.615NA lens of Figure 37.2, plots of total intensity distribution (i.e., the X-, Y-, and Z-components of polarization combined) at several distances from focus are shown in Figure 37.4. At best focus a small elongation of the spot along the X-axis may be observed. This is characteristic of the focused spots obtained with linearly polarized light at high NA: the spot is always elongated along the direction of incident polarization. The FWHM of the spot at this point is 0.57 μm along X and 0.51 μm along Y. Equations (37.2) and (37.3) predict δ ≈ λ0/NA² = 1.67 μm, in agreement with the distributions of Figure 37.4. The spot diameter is substantially enlarged if the depth of focus is exceeded.
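The arithmetic behind these estimates is short enough to write out. The sketch below takes both proportionality constants as unity, as the text does when quoting 1.67 μm; the variable names are illustrative.

```python
lambda0_um = 0.633  # vacuum wavelength
na = 0.615          # numerical aperture in air (n = 1)

spot_scale = lambda0_um / na                # Eq. (37.2) with unit constant
depth_scale = spot_scale ** 2 / lambda0_um  # Eq. (37.3), lambda = lambda0 in air

# Proportionality constants implied by the quoted FWHM values
c_x = 0.57 / spot_scale
c_y = 0.51 / spot_scale

print(f"lambda0/NA   = {spot_scale:.3f} um")
print(f"lambda0/NA^2 = {depth_scale:.2f} um")
print(f"FWHM constants: {c_x:.2f} (along X), {c_y:.2f} (along Y)")
```

The implied constants of about 0.55 and 0.50 sit at the lower end of the range quoted for Eq. (37.2), as expected when the diameter is defined as the FWHM.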

Oil immersion objective

To obtain improved resolution one may use an oil-immersion objective. As shown in Figure 37.5, the front element of this type of lens is in contact with a fluid having a specific refractive index n. (The front element is typically an aplanatic sphere; for a discussion of aplanatism see Chapter 1, “Abbe’s sine condition”.) The front element of the lens, the fluid, and the cover plate protecting the sample (if any) should all have the same or nearly the same refractive index. Thus, upon emerging from the objective the rays go directly to the sample’s surface without further bending. Under such conditions the wavelength of the light within the immersion oil is reduced by a factor n, in consequence of which the effective NA of the lens increases by the same factor. Equations (37.1)–(37.3) apply to this case as well,

[Figure 37.4, five panels; horizontal axes: x from −2 μm to 2 μm.]

Figure 37.4 Logarithmic plots of total intensity distribution at and near the focus of the 0.615NA objective shown in Figure 37.2. From top to bottom Δz = 2 μm, 1.5 μm, 1 μm, 0.5 μm, and 0. Because of the symmetry between the two sides of focus, the distributions for ±Δz are the same. At best focus the spot’s FWHM is 0.57 μm along X and 0.51 μm along Y.

showing that for a given cone angle θ both the spot size D and the depth of focus δ shrink by a factor n, compared with an objective designed for operation in air. For an oil-immersion lens having sin θ = 0.615 and n = 2, Figure 37.6 shows plots of the total intensity distribution at 1 μm defocus (top) and at best focus

[Figure 37.5: objective, index-matching fluid, and sample along the Z-axis.]

Figure 37.5 An oil-immersion objective focuses the beam onto the sample through an index-matched fluid of refractive index n. The fluid is in contact with both the sample and the front element of the lens. The rays that emerge from the objective do not bend on their way to the sample, thus forming a high-NA cone of light. For a given half-angle θ of the cone, the NA of an oil-immersion objective is superior to that of an air-incidence objective by a factor n.

[Figure 37.6, two panels; horizontal axes: x from −2 μm to 2 μm.]

Figure 37.6 Logarithmic plots of intensity distribution at and near the focus of an oil-immersion objective. The objective consists of the 0.615NA lens of Figure 37.2 in conjunction with a hemispherical glass cap. Both the cap and the immersion oil have index n = 2, resulting in an overall NA of 1.23. Top: Δz is 1 μm away from the focal plane. Bottom: the position of best focus; FWHM = 0.28 μm along X, 0.25 μm along Y.


(bottom). Compared to Figure 37.4, which corresponds to the same value of θ in air, it is apparent that both the spot diameter and the depth of focus have decreased by a factor n = 2.

Line scans across a grating

Figure 37.7 shows the cross-section of a diffraction grating. The grating is coated with a thick layer of gold, whose complex refractive index has real and imaginary parts n = 0.14 and k = 3.37; the grating has a groove depth of 170 nm and a period of 1.5 μm, of which 0.5 μm is the groove width, 0.66 μm is the land width, and the remaining 0.34 μm is taken up by the two side walls, pitched at 45°. For the purpose of imaging this grating, the assumed detector module in the system of Figure 37.1 is a split detector, oriented with its splitting line parallel to the grooves. As will be described below, the outputs S1, S2 of the split detector may be combined in different ways to yield the scanning signal.

Figure 37.8 shows plots of a single line-scan of the grating in the direction perpendicular to the grooves; the scalar theory of diffraction has been used to compute these plots. The dashed curves correspond to a 0.6NA air-incidence objective, while the solid curves represent a 1.2NA oil-immersion objective. The scans in Figure 37.8(a), obtained by adding S1 and S2, represent the total optical power returning from the sample. The monitored signal in Figure 37.8(b) is the so-called push–pull signal, (S1 − S2)/(S1 + S2), which is sensitive to the position of the groove edges. Clearly, the oil-immersion objective with its superior NA-value provides a better resolution in both cases.

The origin of the push–pull signal used for sensing the groove edges may be understood by considering Figure 37.9, which shows the intensity distribution in the exit pupil of the objective lens for three cases: from top to bottom, the spot is focused on the land center, on the groove edge, and on the groove center.
The symmetry of this so-called “baseball pattern” is such that, with the beam focused on the land or on the groove, the split detector receives equal amounts of light on both its halves. However, on the groove edge the diffraction orders appearing on

[Figure 37.7: grating profile in gold, with land width 0.66 μm, groove depth 0.17 μm, and groove width 0.5 μm marked.]

Figure 37.7 Cross-section of a diffraction grating used in computer simulations (period 1.5 μm). The gold coating is thick enough to prevent the light from penetrating through to the other side.

[Figure 37.8: (a) sum signal S1 + S2 and (b) differential signal (S1 − S2)/(S1 + S2), each plotted against distance from −0.75 μm to 0.75 μm, for the oil-immersion (NA = 1.2) and air-incidence (NA = 0.6) objectives; scalar theory.]

Figure 37.8 Scalar diffraction theory applied to the grating of Figure 37.7 yields single-line scans in the direction perpendicular to the grooves. The scanned period extends from the center of the land at −0.75 μm to the center of the adjacent land, at +0.75 μm, with the groove center at 0. The broken line corresponds to a 0.6NA air-incidence objective, while the solid line represents a 1.2NA oil-immersion objective. The detector module consists of a split detector aligned with the grooves, yielding signals S1 and S2. (a) Sum signal scans corresponding to the total reflected power. (b) Differential signal scans, corresponding to the “push–pull” method.

[Figure 37.9, three panels; horizontal axes: x from −2000 μm to 2000 μm.]

Figure 37.9 Computed baseball patterns at the exit pupil of the 1.2NA oil-immersion objective during the scans depicted in Figure 37.8. From top to bottom, the focused spot is on the land center, on the groove edge, and on the groove center.

one side of the baseball pattern have a different phase from those appearing on the opposite side and, therefore, the asymmetry between the two halves of the baseball pattern yields a fairly large differential signal. Since these calculations are based on the scalar theory of diffraction, anomalous effects due to surface plasmon excitation and dependence on the beam’s polarization state are not observed. Such effects will show up later in our full vector diffraction calculations.

Focusing through a cover plate

At times it is necessary to observe a sample through a transparent cover plate. Biological samples, for instance, are usually prepared between a pair of thin glass


plates, and the storage layer of compact disks is protected from dust and fingerprints by a plastic substrate 1.2 mm thick. In either case the objective lens must be corrected for the specific thickness and refractive index of the cover plate. As shown in Figure 37.10, a cone of light focused through a parallel plate becomes compressed toward the optical axis, its value of sin θ shrinking by the refractive index n of the plate. At the same time, the wavelength of the light inside the plate also shrinks by the same factor, to give λ = λ0/n. The net effect is that the spot diameter D does not change as a result of focusing through the cover plate. However, Eq. (37.3) implies that the depth of focus will improve. This would be true, of course, if one interpreted the depth of focus as the depth of the sample interrogated by the focused beam while the sample remained at rest. But what happens if one moves the sample in the Z-direction and determines the distance Δz over which the image of the sample remains sharp? One finds in the latter case that focusing through the cover plate does not improve the depth of focus at all. In other words, the depths of focus with and without the cover plate are exactly the same. (Keep in mind that the objective lens is corrected for each case separately.)

The reason for the above apparent discrepancy is as follows. If one moves the sample and the cover plate together by Δz along the positive Z-axis, the top of the cover plate also moves away from the lens by the same distance. Consequently the focused spot recedes from the sample’s surface by nΔz, which is greater than the actual travel of the sample. (This analysis, which ignores residual spherical aberrations, is quite straightforward and requires only the use of Snell’s law and simple geometry. It also applies to the case where the sample and the cover plate move along the negative Z-axis.)
Thus, as long as the lens remains stationary while the sample and the cover plate travel together along Z, the cover plate does

[Figure 37.10: objective, cover plate (or substrate), and sample along the Z-axis.]

Figure 37.10 Focusing through a transparent cover plate of refractive index n. The cone angle shrinks by a factor of n, but the spot size and the depth of focus are not affected.


not increase, nor does it decrease, the depth of focus. Aside from protecting the sample, focusing through the cover plate has no obvious advantages.

The solid immersion lens

A transparent hemisphere of refractive index n may be placed over the sample in such a way as to bring the cone of light to focus at the center of the hemisphere, as shown in Figure 37.11. The use of this type of hemisphere, often referred to as a solid immersion lens (SIL) [3], improves the resolution of the system by a factor of n. To establish smooth and seamless contact without the use of an index-matching fluid, the bottom of the hemisphere and the top of the sample must both be flat and free from dirt, dust and scratches. The resolution gain thus achieved is a consequence of the fact that in going through the hemisphere the cone angle θ remains the same while the wavelength of the light shrinks by a factor of n.

What is remarkable about the SIL is that, unlike in oil-immersion microscopy, the depth of focus does not suffer as a result of the improved resolution. As long as the SIL and the sample move together along Z, whether towards or away from the objective, the bending of the light rays at the spherical surface of the SIL (governed by Snell’s law) makes the focused spot move in the direction of the sample, thereby helping to increase the depth of focus. The net effect is that the depth of focus of the system remains the same whether or not the SIL is placed on the sample. Figure 37.12 shows computed plots of intensity distribution at the sample’s surface when the assembly of the sample and the SIL travels by distances Δz = 2 μm, 1 μm, and 0 away from the position of best focus. Comparing Figure 37.12 with Figure 37.4, one concludes that the use of the SIL has reduced the spot diameter by a factor of n = 2 but has not changed the system’s depth of focus.
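The scaling arguments for the four focusing arrangements discussed so far can be collected in one place. The sketch below uses Eqs. (37.2) and (37.3) with unit proportionality constants and the chapter's example values (λ0 = 0.633 μm, sin θ = 0.615, n = 2). The `Focus` record and `compare` function are hypothetical names, and the travel tolerance for the SIL is taken from the text's stated conclusion rather than derived.

```python
from dataclasses import dataclass

@dataclass
class Focus:
    name: str
    spot_um: float    # spot-size scale D ~ lambda0 / NA_eff
    travel_um: float  # sample travel along Z over which the spot stays small

def compare(lambda0_um=0.633, sin_theta=0.615, n=2.0):
    d_air = lambda0_um / sin_theta   # Eq. (37.2) in air
    z_air = d_air ** 2 / lambda0_um  # Eq. (37.3) in air
    return [
        Focus("air", d_air, z_air),
        # Oil immersion: both D and the depth of focus shrink by n.
        Focus("oil immersion", d_air / n, z_air / n),
        # Cover plate: sin(theta) and the wavelength both shrink by n, so D is
        # unchanged; the in-plate depth D^2/(lambda0/n) is n times larger, but
        # the spot moves n*dz for a travel dz, so the tolerance is unchanged.
        Focus("cover plate", d_air, (d_air ** 2 / (lambda0_um / n)) / n),
        # SIL: D shrinks by n; travel tolerance as stated in the text
        # (same as without the SIL).
        Focus("solid immersion lens", d_air / n, z_air),
    ]

rows = compare()
for row in rows:
    print(f"{row.name:22s} D ~ {row.spot_um:.3f} um   travel ~ {row.travel_um:.2f} um")
```

Only the SIL combines the smaller spot of the immersion objective with the travel tolerance of the air-incidence objective, which is the point made in the preceding paragraph.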

Figure 37.11 Focusing through a solid immersion lens (SIL) of refractive index n. The spot size shrinks by a factor of n, but, assuming the SIL and the sample move together along Z, the depth of focus remains the same.


Classical Optics and its Applications

Figure 37.12 Logarithmic plots of intensity distribution when a SIL (radius = 0.5 mm, n = 2) is placed in front of a 0.615NA objective lens. From top to bottom, the SIL and the sample, moving together along the Z-axis, deviate from the position of best focus by 2 μm, 1 μm, and 0. At best focus the spot’s FWHM is 0.28 μm along X and 0.25 μm along Y. The effective NA is 1.23, but the depth of focus is the same as it was prior to inserting the SIL.

Effect of the air gap

In applications of SIL to microscopy, where the sample is stationary, the SIL and the sample remain in contact, keeping the width of the air gap at Wg = 0. However, in optical disk systems, where the disk spins under the SIL at a rapid rate, a small air gap develops between the bottom of the SIL and the top of the disk surface.⁴ Under such circumstances the light must jump through the gap in order to interact with the storage layer of the disk. This is not a serious problem for those rays that propagate along the Z-axis, or at a small inclination with respect to it, since they are readily transmitted through the bottom of the SIL. However, for those rays that make a large angle with the Z-axis, the Fresnel transmission coefficients become small; in particular, when the incidence angle exceeds the critical angle of total internal reflection the transmissivity drops to zero.¹ Fortunately, the phenomenon of frustrated total internal reflection allows photons to tunnel through the gap and reach the storage layer of the disk. For this to happen efficiently, the gap width Wg must be a small fraction of λ₀. (See Chapter 27, “Some quirks of total internal reflection”.) The effects of the air gap on the signal level can be seen in Figure 37.13, which shows results of computer simulations based on the full vector theory of

Figure 37.13 Computed line-scans in the direction perpendicular to the grooves of the grating of Figure 37.7, based on vector diffraction theory. The scanned range extends from the center of a land at −0.75 μm to the center of an adjacent land at +0.75 μm. The detector module consists of a split detector aligned with the grooves, yielding the signals S1, S2. (a), (c) Sum signal (S1 + S2) scans corresponding to the total reflected power collected by the detector: upper solid line, gap-width Wg = 200 nm; dotted line, Wg = 50 nm; lower solid line, Wg = 0. The small oscillations riding over the signals are caused by numerical errors. (b), (d) Differential signal (S1 − S2)/(S1 + S2) scans corresponding to the push–pull method of detection: solid line of smaller amplitude, Wg = 200 nm; dotted line, Wg = 50 nm; solid line of greater amplitude, Wg = 0. In each figure the broken-and-dotted curve is obtained in the absence of the SIL with a 0.6NA objective lens, while the other three curves correspond to a SIL with n = 2 and an overall objective NA of 1.2. The linear incident polarization in (a) and (b) is parallel to the grooves, while in (c) and (d) it is perpendicular to the grooves.


diffraction.⁵,⁶ The grating used in these simulations is that of Figure 37.7, the assumed refractive index of the SIL is n = 2, and the incident beam is assumed to be linearly polarized. The direction of incident polarization is parallel to the grooves in Figures 37.13(a), (b), and perpendicular to the grooves in Figures 37.13(c), (d). Several line-scans across a single period of the grating are shown; the plots in Figures 37.13(a), (c) correspond to the total returned optical power while those in Figures 37.13(b), (d) represent the differential (or push–pull) signal.⁵ The signal amplitude is highest when Wg = 0, but it has dropped considerably at Wg = 50 nm


and even further at Wg = 200 nm. In practice a gap-width below about λ₀/10 is usually acceptable; going beyond this value causes a sharp reduction in the signal level.

The scanning signals are sensitive to the direction of incident polarization. In general the two polarization directions parallel and perpendicular to the grooves are not equivalent and yield different results, as may be readily observed in Figure 37.13. To emphasize further the significance of polarization, Figure 37.14 shows the light intensity pattern at the exit pupil of the objective lens for polarization directions parallel and perpendicular to the grooves, all other things being kept equal. The two baseball patterns show clear differences.

The super SIL

It is well known that a converging cone of light aimed at a point a distance nR below the center of a glass sphere of radius R and refractive index n comes to diffraction-limited focus within the sphere at a distance of R/n below the center.¹ This fact has been exploited in the design of the super SIL shown in Figure 37.15. The effective NA of the objective thus increases by a factor n², not only because the wavelength within the super SIL is shortened by a factor n but also because the super SIL increases the sine of the cone angle θ by a factor n. The super SIL is aplanatic, and placing it in front of any aplanatic objective renders the combination aplanatic as well.
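The two factors of n can be combined in a short calculation. The following is a minimal sketch, with a hypothetical function name, that multiplies the sine of the cone angle by n, caps it at 1 (a sine cannot exceed unity), and then multiplies by n again to account for the shortened wavelength.

```python
def super_sil_effective_na(objective_na: float, n: float) -> float:
    """Effective NA of an objective followed by a super SIL of index n.

    The super SIL raises the sine of the cone angle by a factor n
    (capped at 1), and the wavelength inside the glass is lambda0 / n.
    """
    sin_theta_inside = min(n * objective_na, 1.0)
    return n * sin_theta_inside


# A 0.4NA objective with n = 2 realizes the full n^2 gain.
print(super_sil_effective_na(0.4, 2.0))
# A 0.6NA objective with n = 2 is limited by the cap on the sine.
print(super_sil_effective_na(0.6, 2.0))
```

For the 0.4NA objective and n = 2 the sketch gives an effective NA of 1.6, the value quoted in the caption of Figure 37.17.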

Figure 37.14 Computed baseball patterns at the exit pupil of a 0.6NA objective lens when the grating of Figure 37.7 is placed beneath a SIL of refractive index n = 2 (total NA = 1.2). The focused spot is at the center of the land and the assumed gap width is 100 nm. In (a) the polarization vector is parallel to the grooves, while in (b) it is perpendicular to the grooves.

Figure 37.15 Focusing through a “super SIL” of refractive index n and radius R. In the absence of this SIL the cone of light from the objective comes to focus at a distance of nR from the center C of the sphere. The bending of the rays at the sphere’s surface shifts the focal point to a distance of R/n from C, without introducing any aberrations.
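The aplanatic property described in the caption of Figure 37.15 can be checked with a simple two-dimensional ray trace. The sketch below is an illustration only: the coordinate system (Z pointing down, sphere centered at the origin), the function names, and the choice of sample rays are all assumptions. Each ray is aimed at the point nR below the center, refracted at the sphere's surface using the vector form of Snell's law, and followed until it crosses the Z-axis.

```python
import math


def refract(d, normal, n1, n2):
    """Vector form of Snell's law in two dimensions.

    d is the unit propagation direction; normal is the unit surface
    normal pointing back into the incident medium (index n1).
    """
    eta = n1 / n2
    cos_i = -(d[0] * normal[0] + d[1] * normal[1])
    sin2_t = eta * eta * (1.0 - cos_i * cos_i)
    if sin2_t > 1.0:
        raise ValueError("total internal reflection")
    k = eta * cos_i - math.sqrt(1.0 - sin2_t)
    return (eta * d[0] + k * normal[0], eta * d[1] + k * normal[1])


def axis_crossing(polar_angle_deg, radius=1.0, n=2.0):
    """Z at which a refracted ray crosses the axis (Z points down).

    The ray enters the sphere at the given polar angle (measured from
    the top of the sphere, must be non-zero) and is aimed at the point
    a distance n * radius below the center.
    """
    a = math.radians(polar_angle_deg)
    sx, sz = radius * math.sin(a), -radius * math.cos(a)  # entry point
    tx, tz = -sx, n * radius - sz                         # toward (0, nR)
    length = math.hypot(tx, tz)
    d = (tx / length, tz / length)
    outward_normal = (sx / radius, sz / radius)
    t = refract(d, outward_normal, 1.0, n)
    s = -sx / t[0]                                        # path length to x = 0
    return sz + s * t[1]


for angle in (10, 30, 50, 70):
    print(angle, axis_crossing(angle))
```

With R = 1 and n = 2, every ray should cross the axis at the same depth, R/n = 0.5, to within rounding error, which is what the absence of spherical aberration means here.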

The full factor n² mentioned above may not be realized in practice, because the bending of the rays within the super SIL works only up to a point, stopping when the marginal rays become orthogonal to the Z-axis. If the objective happens to have a large NA to begin with, the super SIL can only increase the value of its sin θ up to 1, at which point the remaining rays will miss the super SIL. The other improvement by a factor of n, however, is always realized in practice because the wavelength always shrinks by this factor. To see the effect of the super SIL on the focused spot, computed light intensity distributions at and near the focus of a 0.4NA objective are shown in Figure 37.16. The FWHM spot diameter at best focus is 0.84 μm along X and 0.8 μm along Y,

Figure 37.16 Logarithmic plots of intensity distribution at and near the focus of a 0.4NA objective lens operating at λ₀ = 633 nm. Top: Δz = 5 μm defocus. Bottom: best focus (Δz = 0). The spot’s FWHM at best focus is 0.84 μm along X and 0.80 μm along Y.

while the depth of focus according to Eqs. (37.2) and (37.3) is around 4 μm. When a super SIL of index n = 2 is placed in front of this objective the plots of Figure 37.17 are obtained. Clearly the spot size has shrunk by a factor of n², but the depth of focus is nearly the same as it was before the super SIL was introduced. Once again, it is observed that focusing through the super SIL does not reduce the depth of focus, as long as the sample and the super SIL move together along the Z-axis. (This statement ignores the effects of a small amount of spherical aberration introduced by the departure of the super SIL from its ideal location.)

A catadioptric SIL

A design that combines the objective and the SIL into one catadioptric element is shown in Figure 37.18.⁷ (A catadioptric element is one that involves both the reflection and the refraction of light.) A collimated beam enters the concave facet of the lens, is reflected first at a flat internal mirror, then at an aspheric internal mirror, and is finally brought to focus at the bottom of a plateau that is in contact (or near contact) with the surface of the sample under investigation. This particular lens, which is also aplanatic, has a reasonably large field of view, with

Figure 37.17 Logarithmic plots of intensity distribution obtained when a super SIL (R = 0.5 mm, n = 2) is placed in front of the 0.4NA objective depicted in Figure 37.16. From top to bottom, the super SIL and the sample, moving together along the Z-axis, deviate from the position of best focus by Δz = 5 μm, 2.5 μm, and 0. At best focus the FWHM of the spot is 0.24 μm along X and 0.20 μm along Y. The effective NA is 1.6, and the spot size has shrunk accordingly, but the depth of focus is essentially the same as it was without the super SIL.

NA = 1.1. Because the central portion of the incident beam is not used, the lens effectively has an annular aperture, which makes the central spot even smaller than the Airy disk but at the expense of increasing the brightness of the rings. Figure 37.19 shows plots of intensity distribution at the focus of the lens for the three components of polarization. The FWHM of the total intensity distribution is

Figure 37.18 A catadioptric optical element molded from glass of refractive index n = 1.813. The spherical entrance facet has a radius of curvature Rc = 0.7 mm and an aperture radius of 0.41 mm. Once inside the glass, the light rays are reflected first from the flat aluminized surface and then from the aspheric surface (also aluminized), before arriving at the flat exit surface of the plateau. The aspheric parameters are as follows: Rc = 2.5308 mm, K = 1.7076, A1 = 0.01233, A2 = 0.209 × 10⁻³, A3 = 0.4476 × 10⁻⁴, A4 = 0.8797 × 10⁻⁵, aperture radius = 1.8 mm. The vertex of the aspheric surface is at z = 0, that of the spherical surface is at z = 0.3 mm, the flat mirror is at z = 1.5 mm, and the exit facet is at z = 1.8 mm.

Figure 37.19 Logarithmic plots of the intensity distribution at the focal plane of the catadioptric lens of Figure 37.18. The incident beam is collimated and linearly polarized along the X-axis. From left to right are shown the X-, Y-, and Z-components of polarization. The integrated intensities of the three components are in the ratios 1 : 0.003 : 0.128. The effective NA-value of the lens is 1.1, but its annular shape of aperture gives rise to a spot size slightly less than that of the Airy disk. The enhanced rings are also caused by the annular shape of the aperture.

0.3 μm along X and 0.26 μm along Y. Depth of focus is not a very useful concept for this particular element because the incident beam is collimated.

References for Chapter 37

1 M. Born and E. Wolf, Principles of Optics, 6th edition, Pergamon Press, Oxford, 1980.
2 A. E. Siegman, An Introduction to Lasers and Masers, McGraw-Hill, New York, 1971.
3 S. M. Mansfield, W. R. Studenmund, G. S. Kino, and K. Osato, Opt. Lett. 18, 305–307 (1993).
4 B. D. Terris, H. J. Mamin, and D. Rugar, Appl. Phys. Lett. 65, 388–390 (1994).
5 For the computations that led to Figures 37.13 and 37.14, the reflection coefficients of the grating were first calculated using DELTA, a vector diffraction code developed by Lifeng Li. These coefficients were subsequently imported to DIFFRACT where they were combined to represent the effects of a focused beam.
6 Lifeng Li, Multilayer-coated diffraction gratings: differential method of Chandezon et al. revisited, J. Opt. Soc. Am. A 11, 2816–2828 (1994).
7 C. W. Lee et al., Feasibility study on near field optical memory using a catadioptric optical system, Optical Data Storage Conference, Aspen, Colorado, May 1998.

38 Zernike’s method of phase contrast

Frederik (Fritz) Zernike (1888–1966)

Zernike invented the phase-contrast microscope in 1935, and was awarded the 1953 Nobel prize in physics for this achievement.¹ In an ordinary optical microscope, an object that imparts a phase modulation to the incident light will produce only a faint image. This faint image may be attributed to the diffraction of a small amount of the light out of the entrance pupil of the objective lens. To improve this image, Zernike in effect extracted a reference beam from the light collected by the objective lens and produced an interferogram of the object at the image plane of the microscope, thus converting phase information into amplitude (or intensity) modulation. The principles of operation of the phase-contrast microscope are by now fully understood.¹⁻⁶ Both spatially coherent and spatially incoherent light may be used in this type of microscopy. For best results, a quasi-monochromatic light source with a reasonable coherence time must be employed. Our goal in the present chapter is to give a simple explanation of the main ideas behind the method and to provide a pictorial survey of this important branch of modern optical microscopy.


The phase-contrast microscope

The diagram in Figure 38.1 shows the main elements of a phase-contrast microscope. The light source may be a coherent source (e.g., a laser) or an incoherent one (e.g., a tungsten lamp or an arc lamp); monochromaticity may be achieved by means of a colored glass filter. The condenser lens projects the source onto the object, whose image is formed by the objective lens. Although the system depicted in Figure 38.1 appears as transmissive, it could just as well represent the unfolded view of a reflective system. In the latter case, the condenser and the objective lens are physically the same element, and a means of separating the incident path from the reflected path (such as a beam-splitter) must be provided. The main difference between an ordinary microscope and a phase-contrast microscope is the presence, in the latter, of a spatial filter (or mask) within the rear focal plane of the objective lens (see Figure 38.1). To appreciate the action of this filter, we note that the light emerging from the object over its XY-plane has a complex-amplitude distribution that may be assumed to be proportional to exp[iφ(x, y)]. Here φ(x, y) is the phase distribution imparted to a uniform incident beam upon transmission through (or reflection from) a sample that may have surface-relief structure and non-uniform thickness, or perhaps even an inhomogeneous refractive index profile. Assuming that φ(x, y) is sufficiently small, we may use a Taylor series expansion to arrive at the following approximation:

$$\exp[i\phi(x, y)] \approx 1 + i\phi(x, y). \tag{38.1}$$
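To get a feel for how small the phase must be for Eq. (38.1) to hold, the sketch below compares exp(iφ) with 1 + iφ at a few phase values. The function name and the sample phases are chosen for illustration; the error is bounded by φ²/2, the size of the first neglected term.

```python
import cmath
import math


def taylor_error(phi: float) -> float:
    """Magnitude of exp(i*phi) - (1 + i*phi), the first-order truncation error."""
    return abs(cmath.exp(1j * phi) - (1 + 1j * phi))


for degrees in (5, 18, 36):
    phi = math.radians(degrees)
    print(degrees, "deg: error =", taylor_error(phi), " bound =", phi ** 2 / 2)
```

The error grows quadratically with φ, so the approximation is noticeably rougher at 36° than at a few degrees.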

In the Fourier domain, the first term in the above expansion, being the constant or d.c. term, appears at the center of the plane of spatial frequencies. In

Figure 38.1 Schematic diagram of a simple phase-contrast imaging system. The light source is projected by the condenser lens onto a phase object, allowing the objective lens to form an image of this object at the image plane. The main component is the mask in the Fourier plane, which imparts a uniform phase shift (and possibly some amplitude attenuation) to the undiffracted component of the beam.

the rear focal plane of the objective lens, therefore, the d.c. term appears as a bright spot centered at and around the optical axis. Zernike realized that by placing a 90° phase shift on this d.c. term (i.e., multiplying it by i), he could bring it in phase with the second term in Eq. (38.1). In this way he enabled beams corresponding to the two terms in the above expansion to interfere with each other when they overlapped within the image plane of the system. The primary function of the spatial filter, therefore, is to delay, by one quarter of a wavelength, the central region of the beam within the rear focal plane of the objective lens.

The source and the illumination optics

Two types of illumination will be considered. To provide collimated coherent illumination we assume that a monochromatic laser beam is brought to focus on the object by a condenser lens of a very small numerical aperture (NA). Figure 38.2(a) is the logarithmic intensity distribution at the object plane produced by a 0.03NA condenser. This distribution has the shape of an Airy pattern, with a central lobe diameter of 1.22λ/NA ≈ 41λ, where λ is the wavelength of the light source. Since the objects of interest will be small compared to the Airy disk diameter, and since they will be placed near the center of the Airy disk, this illumination qualifies as coherent, fairly uniform, and nearly collimated. The second type of illumination to be considered is incoherent. Our concern, of course, is solely with spatial incoherence. However, to ensure that the phase is meaningfully defined throughout the system and that the coherence time is long enough for interference to occur, we assume a quasi-monochromatic source with a sufficiently narrow bandwidth. With this type of illumination, the source can be modeled as a collection of independent point sources extending over the luminous area of the lamp. We compute the image obtained with each such point source independently, and add up the intensities of the resulting images to obtain the final image.

The imaging optics

The objective lens used in the simulations described below is free from aberrations and, therefore, its performance is diffraction-limited. The objective is a finite-conjugate lens with a numerical aperture of 0.25 (on the side of the object), a focal length of 5000λ, and a magnification of 10. The object used throughout this chapter is a transparent piece of flat glass or plastic, embossed with seven marks of various sizes and shapes, as shown

Figure 38.2 Computed distributions at various cross-sections of the system of Figure 38.1 (without the phase-contrast mask) for the case of coherent illumination. (a) Logarithmic plot (α = 4) of the intensity distribution at the object plane, obtained when a collimated, coherent source is brought to focus by a 0.03NA condenser lens. (b) Pattern of phase objects (marks) with different sizes and separations, on a uniform background. The transmissivity is 100% over the entire area of this object, but the marks impart to the incident beam a 36° phase shift (i.e., one-tenth of a wavelength) relative to the background. (c) Logarithmic plot (α = 6) of the intensity distribution at the exit pupil of the objective lens. (d) Distribution of intensity in the image plane of the system in the absence of a phase-contrast filter; the outlines of the marks are barely visible in this image.

in Figure 38.2(b). The largest mark is 10λ long and the smallest mark is 3λ wide. These marks are large enough to yield a reasonably clear image with both coherent and incoherent illumination, in conjunction with an appropriate phase-contrast filter. All the marks impart to the incident beam a phase shift of 36° relative to the background (corresponding to an optical path-length difference of λ/10). For coherent illumination of the object by the beam depicted in Figure 38.2(a), the logarithmic plot of intensity distribution at the Fourier plane is shown in


Figure 38.2(c). The bright central spot in this figure is the d.c. term mentioned earlier. Note that the cutoff point of this logarithmic plot is at α = 6 and, therefore, the light diffracted by the object and spread throughout the aperture of the objective lens is quite weak. In the absence of any phase-contrast mechanism the computed image of the object is as shown in Figure 38.2(d). This obviously is a very poor image, one in which the boundaries of the marks are barely perceptible. We will see below how the action of the phase-contrast filter dramatically improves the quality of this image.

Contrast enhancement with coherent illumination

With a disk-shaped spatial filter (diameter = 550λ) placed in the Fourier transform plane of the object, the image shown in Figure 38.3 will be obtained. The filter in this case is a simple 90° phase-shifter, affecting the bright, central region of the beam shown in Figure 38.2(c). The images of the marks are now clearly visible, but the contrast is not remarkable. To study the effect of amplitude filtering on image quality, we replace the phase-shifting filter with one that simply blocks the central region of the beam within the Fourier plane. The resulting image is shown in Figure 38.4. Note that, by eliminating the d.c. component of the phase modulation function φ(x, y), use of this filter has emphasized the boundaries of the marks. The best choice for the phase-contrast filter is generally a phase/amplitude mask that shifts the d.c. component of the beam by 90°, while attenuating

Figure 38.3 Image of the phase object of Figure 38.2(b), obtained with the coherent illumination of Figure 38.2(a) when a phase-contrast mask is placed in the Fourier plane. The mask is a small disk of radius 275λ, imparting a +90° phase shift to the central region of the beam. (a) Intensity distribution in the image plane. (b) Same as (a) but on a logarithmic scale (α = 1.65).

Figure 38.4 Image of the phase object of Figure 38.2(b), obtained with the coherent illumination of Figure 38.2(a), when an amplitude mask is placed in the Fourier plane. The mask, a small disk of radius 275λ, blocks the central region of the beam. (a) Intensity distribution in the image plane. (b) Same as (a) but on a logarithmic scale (α = 3).

Figure 38.5 Image of the phase object of Figure 38.2(b) obtained with the coherent illumination of Figure 38.2(a) when a phase/amplitude mask is placed in the Fourier plane. The mask, a small disk of radius 275λ, imparts a +90° phase shift to the central region of the beam while attenuating its amplitude by 50%. (a) Intensity distribution in the image plane. (b) Same as (a) but on a logarithmic scale (α = 1.65).

its amplitude to bring it in line with the magnitude of φ(x, y). Figure 38.5 shows the image obtained with a filter that cuts the amplitude in half while shifting the phase by 90°. The resulting contrast enhancement is quite impressive. Finally we consider the effect of changing the phase shift from +90° to −90°. This is shown in Figure 38.6, where the images of the marks are now brighter than their background. A similar situation will arise, of course, if instead of

Figure 38.6 Same as Figure 38.5 except for the phase shift of the mask, which is −90° in the present case. (a) Intensity distribution in the image plane. (b) Same as (a) but on a logarithmic scale (α = 2).

reversing the sign of the phase at the filter we reverse the phase of the marks at the object. In practice, most phase objects contain a number of positive as well as negative features, and their images will appear to be darker than the background in some regions and brighter in other regions.
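The qualitative behavior described in this section can be reproduced with a highly idealized one-dimensional model. In the sketch below (an illustration under stated assumptions, not the diffraction calculation behind the figures) the filter acts only on the exact d.c. component of the field, the objective's aperture is ignored, and the object is a single 36° mark on a flat background. The function names, the sampling, and the sign convention (a "+90°" filter is taken to multiply the d.c. term by +i, with the mark's phase positive) are all assumptions; which sign yields bright marks depends on these conventions, so only the reversal of contrast with the sign of the filter should be read from the output.

```python
import cmath
import math


def phase_contrast_image(phase, dc_filter):
    """Intensity of a pure phase object after filtering only its d.c. term.

    phase     : list of phase values (radians) across the object
    dc_filter : complex transmission applied to the d.c. component
    """
    field = [cmath.exp(1j * p) for p in phase]
    dc = sum(field) / len(field)  # undiffracted (d.c.) component
    return [abs(u - dc + dc_filter * dc) ** 2 for u in field]


def mark_contrast(image, mark, background):
    """Signed contrast (I_mark - I_bg) / (I_mark + I_bg); positive = bright mark."""
    m, b = image[mark], image[background]
    return (m - b) / (m + b)


# One narrow 36-degree mark on a uniform background (assumed object).
phase = [0.0] * 1000
for k in range(490, 510):
    phase[k] = math.radians(36.0)

filters = {
    "no filter": 1.0,
    "+90 deg": 1j,
    "+90 deg, half amplitude": 0.5j,
    "-90 deg, half amplitude": -0.5j,
}
results = {name: mark_contrast(phase_contrast_image(phase, t), 500, 0)
           for name, t in filters.items()}
for name, c in results.items():
    print(f"{name:26s} contrast = {c:+.3f}")
```

In this model the unfiltered image has no contrast at all, halving the d.c. amplitude raises the contrast relative to the pure phase shifter, and flipping the sign of the 90° shift flips the sign of the contrast.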

Contrast enhancement with incoherent illumination

To obtain better resolution in optical microscopy one must illuminate the object with a cone of light (as opposed to a cylindrical collimated beam). This point was discussed in Chapter 5, “Coherent and incoherent imaging”. The best results are typically achieved when the numerical apertures of the illumination cone and the objective lens are identical because, under such circumstances, twice as many spatial frequencies of the object are captured by the objective lens. It turns out that the cone of light does not have to be solid in order to achieve high resolution; the same benefits also derive from a hollow cone of light. In Figure 38.7(a) we show the annular source of incoherent light that is used in our calculations to illuminate the condenser lens. In this annulus there are 36 independent “point sources”, which provide a good approximation to a homogeneous ring of incoherent light. A 0.25NA condenser lens projects the annulus to a bright spot at its focal plane, as shown in Figure 38.7(b). This spot is large enough to cover the phase object depicted in Figure 38.2(b). The logarithmic plot of intensity distribution at the Fourier plane, Figure 38.7(c), shows a bright annulus as well as a fairly uniform disk of diffracted light within the

Figure 38.7 Imaging of the phase object of Figure 38.2(b), obtained with an incoherent, annular illuminator. (a) The simulated homogeneous, annular light source consists of 36 independent, quasi-monochromatic point sources. These point sources are arranged uniformly around the circumference of the entrance pupil of the 0.25NA condenser lens. (b) Computed intensity distribution at the focal plane of the condenser, which is also the location of the object. (c) Distribution of the logarithm of intensity (α = 6) at the exit pupil of the 0.25NA objective lens. The annular phase mask placed at this pupil has a width of 300λ; it imparts a +90° phase shift and a 50% (amplitude) attenuation to the beam at the outer periphery of the exit pupil. (d) Computed intensity distribution at the image plane of the system.

exit pupil of the objective lens. Evidently, the phase-contrast filter must also be in the form of an annular ring, covering the circumference of the objective’s exit pupil and capable of delivering a 90° phase shift as well as a reasonable attenuation factor to the incident beam. The resulting image shown in Figure 38.7(d) is obviously of high quality, both in terms of resolution and contrast.
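The resolution benefit of illuminating with a cone whose NA matches that of the objective can be estimated from the grating equation. The sketch below is a simplified illustration with a hypothetical function name: it assumes that an object frequency is captured whenever one of its first diffracted orders, produced by the most oblique illuminating ray, still falls inside the objective's pupil.

```python
def cutoff_frequency(na_objective, na_illumination=0.0, wavelength=1.0):
    """Highest captured object spatial frequency, in cycles per unit length.

    A grating of frequency f deflects a ray incident at sin(theta) = NA_ill
    by wavelength * f, so the deflected order enters the pupil as long as
    f <= (NA_obj + NA_ill) / wavelength. The illumination NA is assumed
    not to exceed the objective NA, so the undiffracted light is collected.
    """
    if not 0.0 <= na_illumination <= na_objective:
        raise ValueError("illumination NA must lie between 0 and the objective NA")
    return (na_objective + na_illumination) / wavelength


collimated = cutoff_frequency(0.25)     # on-axis, collimated illumination
matched = cutoff_frequency(0.25, 0.25)  # illumination cone matched to objective
print("cutoff (cycles per wavelength):", collimated, matched)
print("finest period (wavelengths):", 1.0 / collimated, 1.0 / matched)
```

With both numerical apertures equal to 0.25 the cutoff doubles relative to collimated illumination, and the finest resolvable period drops from 4λ to 2λ under these assumptions.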

References for Chapter 38

1 F. Zernike, Z. Tech. Phys. 16, 454 (1935); Phys. Z. 36, 848 (1935); Physica 9, 686, 974 (1942).
2 M. Françon, Le contraste de phase en optique et en microscopie, Revue d’Optique, Paris (1950).
3 A. H. Bennett, H. Jupnik, H. Osterberg, and O. W. Richards, Phase Microscopy, Wiley, New York, 1952.
4 F. D. Kahn, Proc. Phys. Soc. B 68, 1073 (1955).
5 M. Born and E. Wolf, Principles of Optics, 6th edition, Pergamon Press, Oxford, 1980.
6 M. V. Klein, Optics, Wiley, New York, 1970.

39 Polarization microscopy

The state of polarization of a given beam of light is modified upon reflection from (or transmission through) an object. The resulting change in polarization state conveys information about the structure and certain physical properties of the illuminated region. Polarization microscopy is a variant of conventional optical microscopy that enables one to monitor these changes over a small area of a specimen. Such observations then allow the user to identify and analyze the specimen’s structural and other physical features.¹,² Traditionally, observations with a polarization microscope have been categorized as “orthoscopic” or “conoscopic.” Orthoscopic observations involve direct imaging of the sample itself, thus allowing one to view the indentations, striations, variations of optical activity and birefringence, etc., over the sample’s surface. Conoscopic observations, however, involve illuminating a crystalline surface with a cone of light and then imaging the exit pupil of the objective lens. This mode of observation is used in characterizing the crystal’s ellipsoid of birefringence and identifying its optical axes.

The polarization microscope

Figure 39.1 is a simplified diagram of a polarization microscope. The light source is typically an extended white light source, such as a halogen lamp or an arc lamp. The collected and collimated beam from the source is linearly polarized as a result of passage through a polarizer. In metallurgical microscopes, such as the one shown here, the objective lens is used both for illuminating the sample and for collecting the reflected light. Typically the source is imaged onto the entrance pupil of the objective lens, which provides for maximum light-collection efficiency while producing a highly defocused image of the source at the sample.¹,³ Any non-uniformities of the source are thereby averaged, to yield a more uniform light intensity distribution at the sample’s surface.

Figure 39.1 Diagram of a conventional polarization microscope. The spatially incoherent light source is linearly polarized and imaged onto the entrance pupil of the objective lens. The reflected light returns through the objective and, after passage through the analyzer, arrives at the image plane. The analyzer is in a rotatable mount, and its transmission axis is adjusted to yield maximum image contrast. If the analyzer is replaced with a Wollaston prism, two images will appear, side by side, on the camera’s CCD plate. The computer downloads both images simultaneously and subtracts one from the other in order to produce a differential image.

Although the source is spatially incoherent, the projected beam at the sample’s surface is, in general, partially coherent. As for the degree of temporal coherence of the light source, it does not play a role in polarization microscopy and is, therefore, ignored throughout this chapter. All one needs to assume is that the light source is quasi-monochromatic, with a bandwidth that is sufficiently narrow to allow one to restrict attention to a single wavelength. The bandwidth must be wide enough, however, to render the source spatially incoherent. (An extended but purely monochromatic source is, of necessity, spatially coherent because the radiated fields from any two locations on the source maintain their relative phase at all times.)


Throughout this chapter we assume a quasi-monochromatic source of wavelength k0, consisting of a fixed number of independent and mutually incoherent point sources arranged on a tightly packed square lattice. The contribution of each such point source to the final image is computed independently of those of all the other point sources. The sum of the intensity distributions thus produced at the image plane by the individual point sources constitutes the image of the object. This method of computing the image takes full account of the partial spatial coherence of the illuminating beam without ever having to introduce the corresponding correlation functions explicitly. The light reflected from the sample is collected by the microscope’s objective lens, then passed through another linear polarizer (usually referred to as the analyzer), and finally brought to focus at the image plane. This image plane coincides with the front focal plane of the eyepiece (not shown) or the plane of the detectors within a TV camera. Modern optical microscopes are usually equipped with a charge-coupled device (CCD) camera, which picks up the image and displays it on a computer monitor. The possibility of digital image processing afforded by this electronic acquisition allows new methods of microscopy, such as the differential method to be described shortly. The analyzer is rotated about the optical axis until its transmission axis is crossed (or nearly crossed) with that of the polarizer. The image contrast is primarily determined by the action of the object on the state of polarization of the incident beam. In regions where the sample does not affect the polarization, the reflected light is blocked by the analyzer, making the corresponding regions of the image dark. 
However, in those regions that rotate the polarization vector, a fraction of the light goes through the analyzer, the transmitted optical power being proportional to the degree of rotation of the polarization as well as to the actual reflectivity of the sample at the given spot. The resulting image thus provides a map and a measure of the ability of the sample to rotate the direction of incident polarization at its various locations. This has been the basis of orthoscopic polarization microscopy for many years. The conoscopic approach, which involves the imaging of the exit pupil of the objective lens, will be discussed towards the end of this chapter.

The four-corners problem

A limitation of polarization microscopy is rooted in the fact that the beam’s state of polarization is affected by ordinary reflections and refractions at the various surfaces throughout the optical path [1,4,5]. This usually results in polarization rotation and/or ellipticity in the four corner areas of the objective’s exit pupil, as shown in Figure 39.2. The four-corners problem allows transmission of spurious

Figure 39.2 Various distributions of the reflected light at the exit pupil of the objective when a single monochromatic point source is used to illuminate the sample. The intensity plots in (a) and (b) correspond, respectively, to the components of polarization parallel and perpendicular to the polarizer’s transmission axis. The polarization rotation angle ρ is depicted in (c) and the polarization ellipticity η is shown in (d). The gray-scale of the latter plots depicts positive values of ρ and η as bright and negative values as dark. (Horizontal axes: x/λ0, from −3200 to 3200.)

light through the analyzer, thereby reducing the contrast of the image. When the problem is caused by reflections and refractions at the various surfaces of the objective (or condenser) lens, a viable solution is to use a specialty objective that incorporates a half-wave plate in the midst of its optical train [1,6]. The half-wave plate rotates the polarization direction by 90°, allowing the four-corner rotations before and after the plate to cancel each other out. This solution was offered by objective-lens manufacturers in the early days, before the advent of powerful antireflection coatings. Nowadays the various surfaces of the objective and the condenser are antireflection coated, and the four-corners problem caused by these surfaces is negligible. The problem still remains, however, that Fresnel’s reflection coefficients at the sample’s surface differ for p- and s-polarized rays, causing a polarization rotation problem that is aggravated with increasing angle of incidence. Moreover, if the sample is observed through a birefringent substrate, the resulting polarization


variations over the beam’s cross-section give rise to spurious light transmission through the analyzer, which, once again, reduces the image contrast [5]. These problems can no longer be solved by the incorporation of a half-wave plate within the objective lens, because they are sample dependent. The differential method of microscopy described below solves the four-corners problem by splitting the spurious light between two images of the sample and then eliminating it by subtracting one image from the other.

Differential method†

A simple modification of the conventional microscope of Figure 39.1 involves replacing the analyzer with a Wollaston prism. The Wollaston splits the image of the sample into two and transmits both images, side by side, to the camera. With the transmission axes of the Wollaston fixed at 45° relative to the polarizer’s axis, the unrotated light is split equally between the two images. When there is polarization rotation, however, one image receives more light than the other, the sense of rotation of the polarization determining which image gets the larger share. The two images are then subtracted from each other (within the computer) to produce a single differential image of the sample. The differential image is superior in many respects to the conventional image, as will be seen in the examples that follow. The main advantage of differential polarization microscopy is that it does not suffer from the four-corners problem. Another advantage is that a map of reflectivity variations across the sample can be readily constructed by adding the two images together; normalizing the differential image by the sum image then provides a pure map of polarization rotation at the sample.

The sample

In general, the polarization image of a sample is mixed with its other images, say, those produced by reflectivity variations or optical phase variations across the sample.
To avoid such complications, we consider a smooth sample having uniform amplitude and phase reflectivity everywhere, but one that rotates the polarization of the incident beam as a result of optical activity. A perpendicularly magnetized thin-film sample provides a good example in this case. By changing the direction of magnetization (from up to down) in different locations, one can create a pattern of magnetic domains such as that shown in Figure 39.3. Here the smallest domain (shown at the center) is one wavelength in diameter. The black

† To the author’s best knowledge the concept of differential polarization microscopy has not been described previously in the technical and patent literature and may therefore be novel.
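As a rough numerical illustration of the differential scheme described above, the following sketch models a single ray of reflectance R whose linear polarization has been rotated by an angle rho, and passes it through an ideal Wollaston prism whose axes lie at ±45° to the polarizer axis. The function names and the idealized model (Malus's law only, no ellipticity, no four-corners contribution) are assumptions made for illustration; they are not taken from the simulations reported in this chapter.

```python
import math


def wollaston_split(reflectance, rho_deg):
    """Intensities of the two images behind an ideal Wollaston prism.

    The reflected polarization makes an angle rho_deg with the polarizer
    axis; the Wollaston axes are assumed to sit at +45 and -45 degrees.
    """
    rho = math.radians(rho_deg)
    # Malus's law: project the field onto each Wollaston axis.
    i_plus = reflectance * math.cos(math.radians(45.0) - rho) ** 2
    i_minus = reflectance * math.cos(math.radians(-45.0) - rho) ** 2
    return i_plus, i_minus


def differential_signal(reflectance, rho_deg):
    """Return (difference, sum, normalized difference) of the two images."""
    i_plus, i_minus = wollaston_split(reflectance, rho_deg)
    diff = i_plus - i_minus      # equals R * sin(2 * rho) in this model
    total = i_plus + i_minus     # equals R: a pure reflectivity map
    return diff, total, diff / total


if __name__ == "__main__":
    for rho_deg in (-0.5, 0.0, 0.5):
        diff, total, norm = differential_signal(0.62, rho_deg)
        print(f"rho = {rho_deg:+.1f} deg: diff = {diff:+.5f}, "
              f"sum = {total:.5f}, normalized = {norm:+.5f}")
```

In this idealized picture the sum image depends only on the reflectance and the normalized difference depends only on the rotation angle, which is the separation of the two maps that the text attributes to the differential method.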

Figure 39.3 Pattern of magnetic domains on a perpendicularly magnetized sample. The magnetic material rotates the polarization of a linearly polarized beam at normal incidence by ±0.5°. The domains are chosen to represent a wide range of sizes and shapes; the smallest domain appearing in the center is one wavelength (λ0) in diameter. (Horizontal axis: x/λ0, from −6 to 6.)

and white regions are magnetized in opposite directions and rotate the incident (linear) polarization by +0.5° and −0.5°, respectively. The material of the sample used in the following examples is assumed to have complex index of refraction (n, k) = (3.35, 4.03), which gives it a reflectivity of 62% at normal incidence. At oblique incidence the Fresnel reflection coefficients for p- and s-polarized light differ from each other, thus inducing some rotation and ellipticity into the reflected polarization state. For instance, at a 53° angle of incidence, the linear polarization of a ray originally directed at 45° with respect to the p-direction rotates by 7.4° and acquires 8.7° of ellipticity. This change of the polarization state upon reflection is caused solely by the Fresnel coefficients of the sample, independently of its optical activity.

Low-resolution imaging

Figure 39.4 shows computed images, both conventional and differential, of the magnetic marks of Figure 39.3 obtained with a 50×, 0.4NA objective. In these calculations the source was defocused by a distance of 35λ0 below the object plane, and the images from a total of 361 point sources were superimposed to simulate the (spatially incoherent) light source. For the conventional image shown in Figure 39.4(a) the analyzer axis was set 0.5° away from the cross position, nearly the optimum setting for achieving maximum contrast in this case. (The contrast may be reversed by rotating the analyzer to the opposite side of the cross position.) The resolution of these images is not great, as evidenced by the near-disappearance of the small mark in the center. The contrast, however, is

Figure 39.4 Images of the sample of Figure 39.3 in a polarization microscope having a 50×, 0.4NA objective lens. (a) Conventional image obtained with the analyzer set 0.5° away from extinction. (b) Differential image obtained with the Wollaston prism. (Horizontal axes: x/λ0, from −300 to 300.)

quite good, and there is little difference between the conventional and differential methods of imaging. The reason is that at 0.4NA the half-angle of the focused cone of light is only 23.6°, which is not large enough to cause a significant four-corners problem.

High-resolution imaging

Obtaining images with high resolution requires a high-NA objective lens. Figure 39.5 shows both conventional (a), (b) and differential (c), (d) images of the sample of Figure 39.3 obtained with a 50×, 0.8NA objective. The images on the left show dark domains on a bright background, while the reverse-contrast counterpart of each image is shown to its right. In these calculations the source was defocused by a distance of 10λ0 below the object plane, and the images from a total of 361 point sources were superimposed to simulate the (spatially incoherent) light source. Inspection of Figure 39.5 reveals that the resolution has improved over that of Figure 39.4. The contrast, however, is quite poor for the conventional images in

Figure 39.5 Images of the sample of Figure 39.3 in a polarization microscope having a 50×, 0.8NA objective lens. (a) Conventional image obtained with the analyzer set +1.5° away from extinction. (b) Same as (a) but now the analyzer is set −1.5° from extinction to reverse the contrast. (c) Differential image. (d) Same as (c) but with the order of subtraction reversed. (Horizontal axes: x/λ0, from −300 to 300.)

Figures 39.5(a), (b), even though the analyzer has been set optimally at ±1.5° from the crossed position. This poor contrast is a manifestation of the four-corners problem. In comparison, the differential images of Figures 39.5(c), (d) show excellent contrast, which is not surprising considering that the four-corners contributions to individual images (before subtraction) are identical and can therefore be removed by subtraction.

To gain a better appreciation of the four-corners problem, consider the intensity distribution at the plane of the sample, Figure 39.6, corresponding to a single point source defocused by 10λ0. Although the incident beam entering the objective lens is linearly polarized along the X-axis, the defocused spot, in consequence of the bending of the rays by the lens, contains all three components of polarization, along the X-, Y-, and Z-axes; these are shown respectively from top to bottom in Figure 39.6. The peak intensities of the three components in Figure 39.6 are in the ratios Ix : Iy : Iz = 1 : 0.007 : 0.185. Upon reflection from the sample the distributions remain qualitatively the same, but the peak-intensity ratios change to 1 : 0.017 : 0.142. Thus the relative content of the Y-component

Figure 39.6 Distribution of incident intensity at the plane of the sample corresponding to a single point source defocused by 10λ0 through a 0.8NA objective. The incident beam entering the lens is linearly polarized along the X-axis. Top to bottom: intensity distributions corresponding to polarization components along the X-, Y-, and Z-axes. (Horizontal axis: x/λ0, from −12 to 12.)

increases upon reflection while that of the Z-component decreases. When this distribution returns to the objective lens, it gives rise to patterns of intensity and polarization similar to those shown in Figure 39.2. At the exit pupil the values of the polarization rotation angle ρ range from −7.0° to +8.1°, while the polarization ellipticity η ranges from −8.8° to +8.6°. The slight asymmetry between positive and negative values is caused by the presence of magnetization in the sample. In the absence of magneto-optical activity, ρ and η vary between ±7.4° and ±8.7°, respectively.
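The magnitudes of 7.4 and 8.7 degrees quoted for this sample can be checked with a short Fresnel calculation at the marginal ray of a 0.8NA cone (about 53° incidence). The sketch below assumes a bare, non-magnetic surface of complex index 3.35 + 4.03i, incidence from air, and a reflected-beam coordinate frame chosen so that an ideal mirror would leave a 45° linear polarization unchanged (hence the sign flip applied to the s-component). These conventions and the function names are assumptions for illustration, not a description of the program used to produce the figures.

```python
import cmath
import math


def fresnel_rp_rs(n_complex, theta_deg):
    """Amplitude reflection coefficients (r_p, r_s) for incidence from air."""
    theta = math.radians(theta_deg)
    cos_t = math.cos(theta)
    # Principal root; its imaginary part is positive for an absorbing medium.
    root = cmath.sqrt(n_complex ** 2 - math.sin(theta) ** 2)
    r_s = (cos_t - root) / (cos_t + root)
    r_p = (n_complex ** 2 * cos_t - root) / (n_complex ** 2 * cos_t + root)
    return r_p, r_s


def ellipse_angles(e_p, e_s):
    """Orientation and ellipticity angles (degrees) from Stokes parameters."""
    cross = e_p * e_s.conjugate()
    s0 = abs(e_p) ** 2 + abs(e_s) ** 2
    s1 = abs(e_p) ** 2 - abs(e_s) ** 2
    s2 = 2.0 * cross.real
    s3 = 2.0 * cross.imag
    orientation = 0.5 * math.degrees(math.atan2(s2, s1))
    ellipticity = 0.5 * math.degrees(math.asin(s3 / s0))
    return orientation, ellipticity


N_SAMPLE = complex(3.35, 4.03)  # (n, k) of the sample, as given in the text

# Reflectivity at normal incidence.
_, r_s_normal = fresnel_rp_rs(N_SAMPLE, 0.0)
reflectivity_normal = abs(r_s_normal) ** 2

# Ray polarized at 45 degrees to the p-direction, incident at 53 degrees.
r_p, r_s = fresnel_rp_rs(N_SAMPLE, 53.0)
# Assumed frame: flip the s-component so an ideal mirror gives no rotation.
orientation, ellipticity = ellipse_angles(r_p, -r_s)
rotation = orientation - 45.0

if __name__ == "__main__":
    print(f"normal-incidence reflectivity: {reflectivity_normal:.3f}")
    print(f"rotation at 53 deg: {rotation:.2f} deg")
    print(f"ellipticity at 53 deg: {ellipticity:.2f} deg")
```

Under these assumptions the computed values should come out close to the 62% reflectivity and the 7.4° and 8.7° figures cited in the text; the signs of the two angles depend on the frame and handedness conventions chosen.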


Substrate birefringence

Sometimes it is necessary to observe a sample through an intervening medium, such as a coating layer or a substrate. If this medium happens to be birefringent, it creates a four-corners problem of its own [5]. As a typical example, assume that the sample of Figure 39.3 is coated with a birefringent layer 500 nm thick whose principal refractive indices along the coordinate axes are (nx, ny, nz) = (1.5, 1.6, 1.7). For this sample, conventional microscopy yields the image shown in Figure 39.7(a),

Figure 39.7 Images of the sample of Figure 39.3, coated with a birefringent layer and placed in a microscope having a 50×, 0.8NA objective. (a) Conventional image, obtained with the analyzer set optimally at 5° away from extinction. (b) Differential image. (c) Same as (b) but with the order of subtraction reversed. (Horizontal axis: x/λ0, from −300 to 300.)


while differential microscopy produces the normal and reverse-contrast images of Figures 39.7(b), (c). Clearly, in the presence of birefringence differential polarization microscopy is far superior to the conventional method. For this sample, the reflected polarization pattern at the exit pupil for a single illuminating point source (see Figure 39.2) exhibits ρ-values ranging from −20.4° to +22.0°, and η-values ranging from −23.3° to +23.0°. In the absence of magnetic activity ρ and η would vary between ±21.3° and ±23.2°, respectively.

Conoscopic observations

The system depicted in Figure 39.8 captures the essence of conoscopic polarization microscopy. Here a coherent, monochromatic beam of light is linearly polarized and sent through an objective lens to be focused on a birefringent crystal. The reflected light is re-collimated by the objective and observed after going through a crossed analyzer. For the specific example described below, the objective’s NA-value is 0.375 and its focal length f is 20 000λ0. The sample is in the XY-plane, the Z-axis being perpendicular to its surface. The crystal slab’s thickness is 430λ0, its principal refractive indices are (nx, ny, nz) = (1.686, 1.682, 1.531), and its ellipsoid of birefringence is rotated around the Z-axis by 13°. The computed intensity distribution at the observation plane of Figure 39.8 is shown in Figure 39.9(a), and the corresponding logarithmic plot appears in Figure 39.9(b). Within the focused cone there are two rays that propagate along the two optical axes of the crystal; these rays return without any change in their state of polarization and are therefore blocked by the analyzer. There are also groups of rays whose polarization vectors undergo rotation by integer multiples of 180° in double passage through the slab. These rays are also blocked by the analyzer, giving rise to the various dark regions in the intensity patterns of

Figure 39.8 Schematic diagram of a simplified conoscopic microscope. The double passage of the focused beam through the birefringent crystal causes varying degrees of polarization rotation over the beam’s cross-section. The crossed analyzer converts these rotations into an intensity pattern. (Components labeled in the diagram: polarizer, beam-splitter, lens, birefringent crystal, aluminum mirror, analyzer, observation plane.)

Figure 39.9 (a) Intensity and (b) logarithmic intensity distributions at the observation plane in the system of Figure 39.8 with a biaxially birefringent crystal. (Horizontal axes: x/λ0, from −8000 to 8000.)

Figure 39.9. A systematic analysis of the exit-pupil distribution can, therefore, provide detailed information about the sample’s ellipsoid of birefringence.

References for Chapter 39

1 S. Inoué and R. Oldenbourg, Microscopes, in Handbook of Optics, Vol. II, second edition, McGraw-Hill, New York, 1995.
2 J. R. Benford and H. E. Rosenberger, Microscopes, in Applied Optics and Optical Engineering, Vol. IV, ed. R. Kingslake, Academic Press, New York, 1967.
3 M. Born and E. Wolf, Principles of Optics, 6th edition, Pergamon Press, Oxford, 1980.
4 H. Kubota and S. Inoué, Diffraction images in the polarizing microscope, J. Opt. Soc. Am. 49, 191–198 (1959).
5 Y. C. Hsieh and M. Mansuripur, Image contrast in polarization microscopy of magneto-optical disk data-storage media through birefringent plastic substrates, Applied Optics 36, 4839–4852 (1997).
6 J. R. Benford, Microscope objectives, in Applied Optics and Optical Engineering, Vol. III, ed. R. Kingslake, Academic Press, New York, 1965.

40 Nomarski’s differential interference contrast microscope

George Nomarski invented the method of differential interference contrast for the microscopic observation of phase objects in 1953 [1,2,3]. The features on a phase object typically modulate the phase of an incident beam without significantly affecting the beam’s amplitude. Examples include unstained biological samples having differing refractive indices from their surroundings, and reflective (as well as transmissive) surfaces containing digs, scratches, bumps, pits, or other surface-relief features that are smooth enough to reflect specularly the incident rays of light. A conventional microscope image of a phase object is usually faint, showing at best the effects of diffraction near the corners and sharp edges but revealing little information about the detailed structure of the sample [4]. Nomarski’s method creates two slightly shifted, overlapping images of the same surface. The two images, being temporally coherent with respect to one another, optically interfere, producing contrast variations that contain useful information about the phase gradients across the sample’s surface. In particular, a feature that has a slope in the direction of the imposed shear appears with a specific level of brightness that is distinct from other, differently sloping regions of the same sample [4,5,6]. The Nomarski microscope uses a Wollaston prism in the illumination path to produce two orthogonally polarized, slightly shifted bright spots at the sample’s surface. Upon reflection from (or transmission through) the sample, the two beams are collected by the objective lens, then sent through the same (or, in the case of a transmission microscope, a similar) Wollaston prism, which recombines the two beams by sliding them back over each other. The two beams subsequently arrive coincidentally in the image plane of the microscope, but the two images of the sample which they carry will be relatively displaced.
A linear analyzer, placed after the Wollaston prism in the reflected (transmitted) path, brings the polarization vectors of the two images into alignment, enabling the two to interfere with each other. A sheared interferogram of the sample’s surface is thus formed at the image plane of the microscope.


Wollaston prism

Because Nomarski’s method of microscopy is fundamentally dependent on the action of the Wollaston prism, a brief description of this polarizing beam-splitter is in order. The Wollaston prism, depicted in Figure 40.1, consists of two cemented wedges from the same uniaxial birefringent crystal (e.g., quartz or calcite). The individual wedges are precisely cut and polished, then aligned with their optic axes orthogonal to each other [4]. In Figure 40.1 the optic axis of the upper wedge is horizontal within the plane of the page, while that of the lower wedge is perpendicular to the plane. The crystal’s ordinary and extraordinary refractive indices, no and ne, interact with the E-field components perpendicular and parallel to the optic axis, respectively. The incident beam, in general, has both s- and p-components of polarization. In going through the upper half of the Wollaston, the p-component interacts with ne and the s-component with no, but the propagation direction remains the same for both the p- and s-beams. In the lower half the roles of no and ne are exchanged, with the result that the p-component is deflected to one side and the s-component to the other (one beam enters a denser, the other a rarer medium). The angular separation of the beams is further enhanced by Snell’s law when they exit the prism. Emerging from the Wollaston, therefore, are two beams, propagating in different directions and having mutually orthogonal directions of polarization.
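A rough estimate of the beam separation follows from a small-angle argument: at the internal interface each polarization sees an index step of magnitude |ne − no|, with opposite signs for the two beams, so after exiting the prism the beams are separated by approximately 2|ne − no| tan α, where α is the wedge angle. The sketch below evaluates this estimate, together with the resulting shear at the sample and the double-pass bias phase per unit lateral shift of the prism, using the quartz parameters quoted later in this chapter (wedge angle 0.84°, no = 1.54467, ne = 1.55379, objective focal length of 3750 wavelengths). The formula is a paraxial approximation and the function names are illustrative assumptions.

```python
import math


def wollaston_separation_rad(n_o, n_e, wedge_deg):
    """Paraxial angular separation (radians) of the two emergent beams."""
    return 2.0 * abs(n_e - n_o) * math.tan(math.radians(wedge_deg))


def shear_in_wavelengths(separation_rad, focal_length_wl):
    """Lateral displacement of the two spots at the sample, in wavelengths."""
    return focal_length_wl * separation_rad


def bias_phase_deg(separation_rad, lateral_shift_wl):
    """Double-pass bias phase (degrees) for a lateral shift of the prism.

    Shifting the prism by x thickens one wedge and thins the other by
    x * tan(alpha), so the single-pass path difference between the beams is
    separation * x; the round trip doubles it.
    """
    path_difference_wl = 2.0 * separation_rad * lateral_shift_wl
    return 360.0 * path_difference_wl


separation = wollaston_separation_rad(1.54467, 1.55379, 0.84)
separation_deg = math.degrees(separation)
shear = shear_in_wavelengths(separation, 3750.0)
bias_per_100 = bias_phase_deg(separation, 100.0)

if __name__ == "__main__":
    print(f"angular separation: {separation_deg:.4f} deg")
    print(f"shear at the sample: {shear:.3f} wavelengths")
    print(f"bias phase per 100-wavelength shift: {bias_per_100:.2f} deg")
```

Within this approximation the three quantities should land close to the values the chapter quotes for the same parameters: about 0.0153° of separation, about one wavelength of shear, and about 19.26° of bias phase per 100 wavelengths of lateral shift.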

Figure 40.1 The Wollaston prism consists of two cemented wedges of the same uniaxial birefringent crystal, aligned with their optic axes in different directions. The incident beam, with its p- and s-components of polarization, is split at the interface between the wedges. Emerging from the Wollaston are two orthogonally polarized beams that propagate in different directions. (Labels in the diagram: incident beam with p- and s-components, optic axes of the two wedges, wedge angle α, emergent beams 1 and 2.)


Figure 40.2 shows a thin bundle of rays arriving at a Wollaston prism and splitting into two orthogonally polarized beams. The p- and s-beams go through a microscope objective and illuminate the sample in two small, slightly displaced patches that cover the objective’s field of view. Upon reflection from the sample the beams return through the objective and come together again as they emerge from the Wollaston. Note that, in a round trip through this system, the optical path lengths of the p- and s-beams will be the same only if the Wollaston is centered on the Z-axis. In particular, if the Wollaston is translated along the X-axis then, during a round trip, one beam sees a longer optical path than the other. The relative phase of the p- and s-beams, referred to as the bias phase B, can therefore be adjusted by sliding the Wollaston along the X-axis. Note that, for a given lateral position of the Wollaston, the bias phase B is constant for all the ray bundles that go through the system: it is independent of their initial distance from the Z-axis. Assuming α = 0.84° for the wedge angles and no = 1.54467, ne = 1.55379 for the ordinary and extraordinary refractive indices of the crystal (quartz), the angular separation of the two beams emerging from the Wollaston (in the forward

Figure 40.2 A bundle of rays entering a Wollaston prism is split into p- and s-polarized beams. The beams go through a microscope objective and illuminate the sample in two small, slightly displaced patches that cover the objective’s field of view. Upon reflection from the sample, the beams return through the objective and come together as they exit the Wollaston prism. The bias phase B between the two beams may be adjusted by sliding the Wollaston in the horizontal direction. (Labels in the diagram: X- and Z-axes, Wollaston, objective, sample.)

path) will be 0.0153°. For an objective lens having f = 3750λ, where λ is the wavelength of the quasi-monochromatic light source, this angular separation results in one λ of displacement between the two spots that illuminate the sample. Moreover, for every lateral shift by 100λ of the Wollaston, there occurs a bias phase B = 19.26° between the p- and s-beams in a double pass through the system. So, for example, if the lateral shift is 1870λ then one beam will be retarded by a full 2π relative to the other.

Differential interference contrast microscope

Figure 40.3 is a diagram of an epi-illumination Nomarski differential interference contrast microscope. For the computer simulations reported in this chapter the

Figure 40.3 Schematic diagram of an epi-illumination Nomarski microscope. The spatially incoherent light source is quasi-monochromatic (wavelength λ), the polarizer renders the illuminating beam linearly polarized, and the Wollaston prism, with axes at 45° to the direction of incident polarization, creates two slightly displaced, orthogonally polarized patches of light at the sample. The light reflected from the sample returns through the objective and the Wollaston, arriving at the crossed analyzer with its two components of polarization relatively phase-shifted. The light that gets through the analyzer forms an image of the sample at the observation plane. (Components labeled in the diagram: light source, lenses, polarizer at +45°, beam-splitter, Wollaston prism, objective, sample, analyzer at −45°, observation plane.)


spatially incoherent light source is assumed to be quasi-monochromatic (wavelength λ), consisting of 529 point sources arranged in a square array. These point sources are projected onto the mid-plane of the Wollaston prism, which sits at the entrance pupil of the objective lens. The entrance pupil being at the back focal plane of the objective, uniform illumination at the sample’s surface is achieved (Köhler illumination). The illumination is called “critical” if the source is imaged directly onto the sample. In practice Köhler illumination is preferred over critical illumination because of its superior uniformity, but coherence-related properties of the system (such as resolution) are not affected by this choice of illumination. In this chapter, for reasons having to do with nuances of the computer simulation, we have chosen to illuminate the sample with a somewhat defocused image of the source. The polarizer renders the illuminating beam linearly polarized, and the Wollaston prism, whose axes are at 45° relative to the transmission axis of the polarizer, creates two orthogonally polarized, slightly displaced patches of light at the sample. The light reflected from the sample returns through the objective and the Wollaston but, as it arrives at the crossed analyzer, its two components of polarization are no longer in phase. The phase difference between the p- and s-beams at this point is B + Δ, where B is the constant bias phase produced by the Wollaston’s displacement from the center and Δ is the imparted phase retardation at the sample’s surface. The amount of light that gets through the analyzer depends on the above phase shift, with more light going through as the phase shift increases from 0° to 180°. Each bright point within the light source illuminates the entire field of view of the objective and creates an image at the observation plane.
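The dependence of the transmitted light on the phase shift can be made explicit with a simple Jones-vector argument: with the polarizer at +45° and the analyzer at −45° to the Wollaston axes, equal-amplitude p- and s-beams whose relative phase is the bias phase plus the sample-induced retardation emerge from the analyzer with a normalized intensity equal to the squared sine of half that total phase. The sketch below applies this to a one-dimensional surface profile, taking the retardation as the round-trip phase difference between two points one shear distance apart. The idealized model (geometric optics, no diffraction, uniform reflectance), the Gaussian pit, and all names are assumptions for illustration only.

```python
import math


def dic_intensity(bias_deg, retardation_deg):
    """Normalized intensity behind the crossed analyzer: sin^2(phase / 2)."""
    phase = math.radians(bias_deg + retardation_deg)
    return math.sin(phase / 2.0) ** 2


def surface_height(x):
    """Hypothetical Gaussian pit, 0.25 wavelengths deep (x in wavelengths)."""
    return -0.25 * math.exp(-(x / 3.0) ** 2)


def retardation_deg(x, shear=1.0):
    """Round-trip phase difference (degrees) between the two sheared spots."""
    h_plus = surface_height(x + shear / 2.0)
    h_minus = surface_height(x - shear / 2.0)
    # Reflection doubles the path, so one wavelength of height is 720 degrees.
    return 720.0 * (h_plus - h_minus)


if __name__ == "__main__":
    for bias in (0.0, 90.0):
        row = [dic_intensity(bias, retardation_deg(x)) for x in (-2, 0, 2, 20)]
        print(f"bias = {bias:5.1f} deg:",
              "  ".join(f"{value:.3f}" for value in row))
```

With zero bias the two opposite slopes of the pit give the same brightness and the flat background stays dark, whereas a 90° bias lifts the background to half brightness and makes the two slopes differ, which is the qualitative behavior described for the simulated images in this chapter.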
The various point sources thus create overlapping images, which add up in intensity by virtue of the (spatial) incoherence of the light source.

Examples

Figure 40.4(a) shows the distribution of phase on a uniformly reflecting surface having several sphero-cylindrical pits with varying depths. The nose feature has a depth of 0.5λ, and the mouth, eyes, and eyebrows are respectively 0.25λ, 0.375λ, and 0.75λ deep. The computed image of this phase object in a conventional optical microscope (i.e., like that in Figure 40.3 but without the polarizer, analyzer, and Wollaston) is shown in Figure 40.4(b). Note that diffraction of light from the edges of the various features of the face creates dark borders in the corresponding image regions, but this conventional image lacks information about the slope and depth distribution within those features. The computed Nomarski image of the phase object of Figure 40.4(a), obtained with one λ of shear along the X-axis, is shown in Figure 40.5. The intensity

Figure 40.4 (a) The distribution of phase at an object’s surface and (b) the distribution of intensity in the image of the same object, as observed in a conventional optical microscope. In (a) the various features of the “face” have the same reflectance but different depth, resulting in phase modulation of the incident light. The nose, mouth, eyes, and the eyebrows are respectively 0.5λ, 0.25λ, 0.375λ, and 0.75λ deep. The image in (b) is formed by a 0.8NA, 50× objective. The simulated light source consisted of 529 spatially incoherent point sources, each defocused by 10λ above the sample’s surface. The observed contrast is purely due to diffraction effects, as the phase object does not give rise to any contrast in geometric-optical terms. (Horizontal axes: x/λ, from −11 to 11 in (a) and from −550 to 550 in (b).)

Figure 40.5 Nomarski images of the phase object in Figure 40.4(a), when the Wollaston produces one λ of shear along the X-axis. The microscope is that shown in Figure 40.3, having a 50×, 0.8NA objective, and the Wollaston’s horizontal position is adjusted for B = 0°. (a) Intensity distribution in the image plane; (b) logarithmic plot of the intensity distribution. (Horizontal axes: x/λ, from −550 to 550.)

distribution in the image plane is shown in Figure 40.5(a), while a logarithmic plot of intensity (resembling an over-exposed photographic plate) is shown in Figure 40.5(b). In these calculations the assumed bias phase B = 0; this results in identical image brightness for regions with equal but opposite slopes, and also yields a completely dark image background. Since the assumed shear in Figure 40.5 is along the X-direction, vertical features (such as the nose) are clearly visible in the


Nomarski image, while horizontal features (such as the mouth) are hidden. The reverse is true when the shear is along the Y-axis, as in Figure 40.6, where horizontal features become visible while vertical features disappear. Figure 40.7 shows the Nomarski image of the object in Figure 40.4(a), but with a bias phase B = 90°. The background of the image is now bright, because the analyzer no longer blocks the light reflected from flat regions of the sample. Moreover there is an asymmetry between regions with positive and negative slope, as can be seen by comparing the right and left sides of the nose feature.

Another example of a phase object is shown in Figure 40.8(a). Here a ridge having height λ runs along the 45° direction in the XY-plane. The two edges of the ridge have differing slopes, the lower edge being 4λ wide while the upper edge is 2λ wide. In the middle of the ridge there is a pit of depth λ in the shape

Figure 40.6 Same as Figure 40.5, except for the direction of shear, which is along the Y-axis. (Horizontal axes: x/λ, from −550 to 550.)

Figure 40.7 Nomarski image of the phase object in Figure 40.4(a), when the Wollaston produces one λ of shear along the X-axis. The microscope is that shown in Figure 40.3, having a 50×, 0.8NA objective, and the Wollaston’s horizontal position is adjusted for B = 90°. (Horizontal axis: x/λ, from −550 to 550.)

Figure 40.8 (a) Phase object and (b) its conventional microscope image. The object consists of a ridge with a height of λ, running at 45° to the X- and Y-axes, and a pit in the middle of the ridge whose depth is also λ. The ridge’s sidewalls have different slopes: the lower wall is 4λ wide, while the upper wall is 2λ wide. The flat-bottomed pit has the shape of a football stadium. The image in (b) is formed through a 50×, 0.8NA microscope objective. The simulated light source consisted of 529 spatially incoherent point sources, each defocused by 10λ above the sample’s surface. The observed image contrast is purely due to diffraction effects, as the phase object does not give rise to any contrast in geometric-optical terms. (Horizontal axes: x/λ, from −11 to 11 in (a) and from −550 to 550 in (b).)

of a football stadium. The conventional image of this sample is shown in Figure 40.8(b). Again diffraction from the various edges renders certain features visible in the image, but specific information about the slopes is lacking. In contrast, two Nomarski images of the same object obtained with one λ of horizontal shear are shown in Figure 40.9. The bias phase B = 0 in Figure 40.9(a), whereas B = 90° in Figure 40.9(b). Different slopes produce different intensity levels in these images. Also note that the symmetry present in Figure 40.9(a) between equal but opposite slopes is broken in Figure 40.9(b), where B ≠ 0°.

Practical considerations

The back focal plane of high-NA objectives is usually inaccessible from outside the lens, so the Wollaston prism cannot be directly inserted at the entrance pupil. By choosing a somewhat different orientation for the optic axes of the crystal wedges, Nomarski modified the Wollaston prism in such a way that the p- and s-beams appeared to be separating from each other in a plane external to the prism [3]. In this way the light source could be imaged onto the entrance pupil of the objective through the Nomarski-modified Wollaston prism, allowing both Köhler illumination and the separation and recombination of the p- and s-beams at the entrance pupil.

Figure 40.9 Nomarski images of the phase object of Figure 40.8(a), when the Wollaston produces one λ of shear along the X-axis. The microscope is that shown in Figure 40.3, having a 50×, 0.8NA objective. The Wollaston’s horizontal position is adjusted to yield a bias phase B between the p- and s-polarized beams. (a) B = 0°, (b) B = 90°. [Panels (a) and (b): horizontal axis x/λ, from −550 to 550.]
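The role of the bias phase in the two images of Figure 40.9 can be illustrated with an idealized two-beam model. The sketch below assumes (this is a standard textbook model for a polarization interferometer with crossed polarizer and analyzer, not an equation taken from this chapter) that the detected intensity is proportional to sin²[(Δ + B)/2], where Δ is the phase difference that the local slope introduces between the two sheared beams. The function name `dic_intensity` and the sample value Δ = ±30° are illustrative only.

```python
import math

def dic_intensity(delta_deg, bias_deg):
    """Normalized two-beam intensity for slope-induced phase delta and bias B (degrees).

    Assumed model: crossed polarizer/analyzer and equal-amplitude p- and s-beams.
    """
    half_angle = math.radians(delta_deg + bias_deg) / 2.0
    return math.sin(half_angle) ** 2

for bias in (0.0, 90.0):
    up = dic_intensity(+30.0, bias)    # slope of one sign
    down = dic_intensity(-30.0, bias)  # equal but opposite slope
    flat = dic_intensity(0.0, bias)    # flat region of the sample
    print(f"B = {bias:5.1f} deg: up = {up:.3f}, flat = {flat:.3f}, down = {down:.3f}")
```

In this model, B = 0° gives equal intensities for the two opposite slopes, whereas B = 90° makes one slope brighter and the other darker than the flat background, which is the kind of asymmetry described for Figure 40.9(b).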

Another practical consideration involves the use of broadband light sources. The sources used in practice are not always monochromatic and, in fact, may have a fairly broad spectrum. The analysis offered in this chapter applies to multi-color sources as well, provided that the individual wavelengths are treated independently and their corresponding images are eventually superimposed. In any given region of the sample, interference causes certain colors to fade while strengthening others. The color or hue observed through a broadband Nomarski microscope at a given location is thus a qualitative measure of the slope of the sample at that location. For quantitative measurements, however, it is best to use quasi-monochromatic light in conjunction with some form of phase-shifting interferometry.7,8,9 This may be achieved, for instance, by sliding the Wollaston prism along the shear direction while monitoring (with a CCD camera) the variations in intensity at specific locations of the image.

References for Chapter 40

1 G. Nomarski, Diapositif interferentiel à polarisation pour l’étude des objects transparents ou opaques appartenant à la classe des objects de phase, French patent No. 1059 124, 1953.
2 G. Nomarski, Microinterféromètre différential à ondes polarisées, J. Phys. Radium 16, 9S–11S (1955).
3 R. D. Allen, G. B. David, and G. Nomarski, The Zeiss–Nomarski differential interference equipment for transmitted light microscopy, Z. Wiss. Mikroskopie 69 (4), 193–221 (1969).
4 M. V. Klein, Optics, Wiley, New York, 1970.
5 S. Inoué and R. Oldenbourg, Microscopes, chapter 17 in Handbook of Optics, Vol. II, McGraw-Hill, New York, 1995.
6 M. Pluta, Advanced Light Microscopy, Vol. 2: Specialized Methods, Elsevier, Amsterdam; Polish Scientific Publishers, Warszawa, 1989.
7 D. L. Lessor, J. S. Hartman, and R. L. Gordon, Quantitative surface topography determination by Nomarski reflection microscopy. I. Theory, J. Opt. Soc. Am. 69, 357–366 (1979).
8 J. S. Hartman, R. L. Gordon, and D. L. Lessor, Quantitative surface topography determination by Nomarski reflection microscopy. II. Microscope modification, calibration, and planar sample experiments, Applied Optics 19, 2998–3009 (1980).
9 W. Shimada, T. Sato, and T. Yatagai, Optical Surface Microtopography using phase-shifting Nomarski microscope, SPIE 1332, Optical Testing and Metrology, 525–529 (1990).

41 The van Leeuwenhoek microscope

Antoni van Leeuwenhoek (1632–1723), a fabric merchant from Delft, the Netherlands, used tiny glass spheres to study various microscopic objects at high magnification with surprisingly good resolution. A contemporary of Sir Isaac Newton, Christiaan Huygens, and Robert Hooke, he is said to have made over 400 microscopes and bequeathed 26 of them to the Royal Society of London. (A handful of these microscopes are extant in various European museums.) Using his single-lens microscope, van Leeuwenhoek observed what he called animalcules – or micro-organisms, to use the modern terminology – and made the first drawing of a bacterium in 1683. He kept detailed records of what he saw and wrote about his findings to the Royal Society of London and the Paris Academy of Science. His contributions have made him the father of scientific microscopy.1,2,3

Van Leeuwenhoek was an amateur in science and lacked formal training. He seems to have been inspired to take up microscopy by Robert Hooke’s illustrated book, Micrographia, which depicted Hooke’s own observations with the microscope. In basic design, van Leeuwenhoek’s instruments were simply powerful magnifying glasses, not compound microscopes of the type used today. An entire instrument was only 3–4 inches (8–10 cm) long, and had to be held up close to the eye; its use required good lighting and great patience.4 Van Leeuwenhoek devised tiny, double-convex lenses to be mounted between brass plates. Through them, he was able to peer at objects mounted on pinheads, magnifying them up to 300 times, a power that far exceeded that of early compound microscopes.

Compound microscopes had been invented around 1595. Several of van Leeuwenhoek’s contemporaries, notably Robert Hooke in England and Jan Swammerdam in the Netherlands, had built compound microscopes and were making important discoveries with them.
However, because of various technical difficulties, early compound microscopes were not practical for magnifications beyond 20× or 30×. Van Leeuwenhoek’s skill at grinding lenses, together with his naturally acute eyesight and great care in adjusting the lighting, enabled him to build microscopes with clearer and brighter images than any of his contemporaries could achieve. Van Leeuwenhoek used his invention to confirm the discovery of capillary systems, to describe the life cycle of ants, and to observe plant and muscle tissue, protozoa and bacteria, and the spermatozoa of insects and humans.

In 1673, van Leeuwenhoek began writing letters to the newly formed Royal Society of London, describing his findings – his first letter contained some observations on the stings of bees. For the next 50 years he corresponded with the Royal Society; his letters, written in Dutch, were translated into English or Latin and printed in the Philosophical Transactions of the Royal Society, and often reprinted separately. His experiments with microscope design and function made him an international authority on microscopy, and in 1680 he was made a Fellow of the Royal Society.

It is suspected that van Leeuwenhoek produced his lenses by chipping away the excess glass from the thickened droplet that forms on the bottom of a blown-glass bulb. These lenses probably had a thickness of 1 mm and a radius of curvature of 0.75 mm. They had superior magnification and resolution when compared to other microscopes of the time. The Utrecht museum has one of van Leeuwenhoek’s microscopes in its collection. This amazing instrument has a magnification of about 275× with a resolution approaching one micron (in spite of a scratch on the lens).5

Towards the end of his life van Leeuwenhoek wrote: “ . . . my work, which I’ve done for a long time, was not pursued in order to gain the praise I now enjoy, but chiefly from a craving after knowledge, which I notice resides in me more than in most other men.
And therewithal, whenever I found out anything remarkable, I have thought it my duty to put down my discovery on paper, so that all ingenious people might be informed thereof.”

Elementary optics of glass spheres

Figure 41.1 shows a ray of light parallel to the optic axis at height h, going through a glass sphere of radius R and refractive index n. The angle of incidence on the sphere is denoted by θ, and the refracted ray inside the glass makes an angle θ′ with the surface normal. According to Snell’s law, sin θ = n sin θ′, and from simple geometry

$$CA = \frac{R \sin\theta}{\sin(2\theta - 2\theta')}. \qquad (41.1)$$

When the ray height h is much smaller than the radius R of the sphere, the angles θ and θ′ will be small, in which case the small-angle approximation yields

$$CA \approx \frac{nR}{2(n-1)}. \qquad (41.2)$$

Figure 41.1 A ray of height h traveling parallel to the optic axis is refracted by a glass sphere of radius R and refractive index n. Upon emerging from the sphere, the ray crosses the optic axis at point A. When h becomes very small, the point A approaches the paraxial rear focus F′ of the lens. In (a) n < 2.0 and the emergent ray crosses the axis outside the sphere, whereas in (b), where n > 2.0, only the backward extension of the ray crosses the axis. (When n = 2.0, the paraxial rays come to focus on the rear facet of the sphere.) [Diagram labels in panels (a) and (b): ray height h, angles θ and θ′, sphere center C, radius R, axis crossing A.]

Thus, for example, if n = 1.5 then the paraxial focus of the lens is at a distance CA = 1.5R from the lens center, or if n = 2 then the paraxial focus coincides with the rear vertex of the sphere, that is, CA = R. Depending on the values of n and h, the proper path of the ray may be that shown in Figure 41.1(a) or (b), but equations (41.1) and (41.2) apply to both cases. The paraxial focus, of course, is relevant only for rays with a small height h; when h increases beyond the paraxial regime, the point A moves closer to the center C, giving rise (for a beam of wide cross-section) to spherical aberrations.
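The way the crossing point A drifts toward C as the ray height grows can be checked numerically. The following is a minimal sketch that evaluates equation (41.1) with Snell's law and compares it with the paraxial limit (41.2); it assumes sin θ = h/R for a ray parallel to the axis (which follows from the geometry of Figure 41.1), and the function names and sample ray heights are illustrative choices.

```python
import math

def axis_crossing_parallel_ray(h, R=1.0, n=1.5):
    """Distance CA from the sphere center to the axis crossing, equation (41.1)."""
    theta = math.asin(h / R)                     # angle of incidence, sin(theta) = h/R
    theta_p = math.asin(math.sin(theta) / n)     # Snell's law inside the glass
    return R * math.sin(theta) / math.sin(2.0 * theta - 2.0 * theta_p)

def paraxial_focus(R=1.0, n=1.5):
    """Small-angle limit of CA, equation (41.2)."""
    return n * R / (2.0 * (n - 1.0))

for h in (0.01, 0.1, 0.3, 0.5):
    print(f"h = {h:4.2f} mm  ->  CA = {axis_crossing_parallel_ray(h):.4f} mm")
print(f"paraxial limit: CA = {paraxial_focus():.4f} mm")
```

For R = 1 mm and n = 1.5 the computed CA approaches 1.5 mm for small h and shrinks as h increases, which is the longitudinal spherical aberration mentioned above.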

Confining our attention to a glass sphere having R = 1 mm and n = 1.5 – typical of what van Leeuwenhoek used for his microscopes – we suppose that a point source of light is placed at the front (paraxial) focus F of the lens, as in Figure 41.2. A ray that leaves the source at an angle φ relative to the optic axis will emerge parallel to the axis only in the paraxial regime, i.e., when φ is small. For larger values of φ the emergent ray crosses the optic axis at the point A, where

$$CA = \frac{R \sin\theta}{\sin(2\theta - 2\theta' - \phi)}. \qquad (41.3)$$

Here φ and θ are related through sin φ = R sin θ / FC. Thus a point source located at the front focus F and radiating into a reasonably large cone will produce a real image on the opposite side at some finite distance from C. To be sure, this image has a certain amount of spherical aberration and, to obtain a good image, one must limit the angular range of the cone of light accepted by the lens. This may be achieved by closing down the aperture stop, which may be located either on the object side or the image side of the lens. In Figure 41.2 the stop is in the image space and may thus be referred to as the exit pupil of the lens.

Figure 41.3 shows computed distributions pertaining to the system of Figure 41.2. The point source is located at the paraxial focus of the lens (R = 1 mm, n = 1.5, CF = 1.5 mm), and the assumed aperture radius is Ra = 0.55 mm. Figure 41.3(a) shows that the emergent intensity at the exit pupil is somewhat brighter near the rim compared with that at the center of the aperture. Figure 41.3(b), a plot of phase distribution at the exit pupil, shows a significant amount of spherical aberration. (The curvature of the emergent beam has been removed from the phase plot; only the residual aberrations are shown.) The emergent beam comes to best focus at a distance CA = 27.36 mm behind the lens. Figure 41.3(c), a logarithmic plot of intensity distribution in the plane of best focus, also shows the substantial rings of light caused by spherical aberration. These clearly indicate that the image quality of a wide-aperture system would be poor.4

Figure 41.2 A glass sphere of radius R = 1 mm and refractive index n = 1.5. The aperture stop, of radius Ra, is also the exit pupil of the lens in this case. A monochromatic point source (λ = 0.5 μm) placed at the paraxial front focus F is approximately imaged to the point A, at a finite distance from the lens center. [Diagram labels: axes X, Y, Z; angles φ, θ, θ′; points F, C, A; sphere radius R; aperture radius Ra.]

Figure 41.3 Various distributions in the system of Figure 41.2 when Ra = 0.55 mm. (a) Emerging intensity distribution at the exit pupil. (b) Distribution of residual phase at the exit pupil when the curvature of the emergent beam is taken out (r.m.s. aberrations = 0.96λ). The gray-scale encodes values of phase from −180° (black) to +180° (white). (c) Logarithmic plot of intensity in the plane of best focus, located a distance of 27.36 mm from the lens center. The logarithmic scale emphasizes the weak rings. [Horizontal axes: x/λ from −1200 to 1200 in (a) and (b); from −400 to 400 in (c).]

When the aperture is further closed down to Ra = 0.4 mm the distributions of Figure 41.4 are obtained. The intensity distribution at the exit pupil is now fairly uniform, and the phase plot shows convergent behavior towards the point of best focus at CA = 59.3 mm behind the lens. (Notice that in Figure 41.4(b), unlike Figure 41.3(b), the curvature has not been subtracted from the phase plot.) The best-focused spot is shown in Figure 41.4(c). In addition to a relatively small spherical aberration, this system also has a fairly large field of view, as may be inferred from the plots of Figure 41.5. Here a number of identical point sources are placed in the front focal plane of the lens, and their corresponding images are computed in the plane of best focus, at CA = 59.3 mm. All imaged points show spherical aberration similar to that of the central spot, but there is very little coma and astigmatism, owing to the fact that the system is essentially monocentric.

Glass sphere as a magnifier

Up to this point we have studied the properties of real images formed by point sources placed in the (paraxial) focal plane of a spherical lens. Now we will consider the spherical lens as a magnifying glass, placing the object somewhat closer to the lens than its front focus and examining the properties of the virtual image thus formed.
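Before turning to the magnifier, the real-image geometry studied so far can be checked with a few lines of ray arithmetic. The sketch below evaluates equation (41.3) for a source at the paraxial front focus, taking the emergent ray to cross the axis at the angle 2θ − 2θ′ − φ, with sin φ = R sin θ / FC and FC = nR/[2(n − 1)]. The function name and the sample incidence angles are illustrative. Note also that these are crossing distances of individual geometric rays; they are not the best-focus distances of 27.36 mm and 59.3 mm quoted above, which come from the full diffraction calculation.

```python
import math

def axis_crossing_from_focus(theta, R=1.0, n=1.5):
    """Axis crossing CA, equation (41.3), for a ray meeting the sphere at incidence angle theta.

    The point source is assumed to sit at the paraxial front focus, FC = nR / [2(n - 1)].
    """
    FC = n * R / (2.0 * (n - 1.0))
    theta_p = math.asin(math.sin(theta) / n)       # Snell's law inside the glass
    phi = math.asin(R * math.sin(theta) / FC)      # launch angle of the ray at the source
    return R * math.sin(theta) / math.sin(2.0 * theta - 2.0 * theta_p - phi)

for theta_deg in (5, 10, 15, 20, 25):
    ca = axis_crossing_from_focus(math.radians(theta_deg))
    print(f"theta = {theta_deg:2d} deg  ->  CA = {ca:8.1f} mm")
```

Rays with small θ cross the axis very far from the lens (they emerge nearly collimated), while steeper rays cross much closer, so widening the aperture pulls the focus toward the lens and adds spherical aberration.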

Figure 41.4 Various distributions in the system of Figure 41.2 when Ra = 0.4 mm. (a) Intensity distribution at the exit pupil. (b) Total phase distribution at the exit pupil; the r.m.s. value of residual aberrations (with the curvature taken out) is 0.22λ. The gray-scale encodes values of phase from −180° (black) to +180° (white). (c) Intensity distribution in the plane of best focus, located a distance of 59.3 mm from the lens center. [Horizontal axes: x/λ from −850 to 850 in (a) and (b); from −400 to 400 in (c).]

Figure 41.5 Five point sources placed in the front focal plane of the spherical lens shown in Figure 41.2. The exit-pupil radius Ra = 0.4 mm, and the best image (with 45× magnification) appears in a plane 59.3 mm away from the lens center. (a) Intensity distribution in the object plane. (b) Intensity distribution in the image plane. All imaged points show spherical aberration, but there is very little coma or astigmatism. [Horizontal axes: x/λ from −45 to 45 in (a); from −2000 to 2000 in (b).]

The diagram of Figure 41.6 is a representation of a van Leeuwenhoek microscope with a spherical glass lens having R = 1 mm, n = 1.5. To achieve high-resolution imaging with this system the aperture is closed down to Ra = 0.25 mm, and the object is displaced from the paraxial focus F by 20 μm towards the lens. The observer’s eye is placed very close to the lens, so that the pupil of the eye essentially coincides with the exit pupil of the lens.

The object used in the following calculations is shown in Figure 41.7. This is a transmissive object with several micron-sized features that impart phase and amplitude modulation to the incident beam. With this object we demonstrate both coherent and incoherent imaging through the system of Figure 41.6. The illumination in both cases is monochromatic at a wavelength λ = 0.5 μm, although white light or other broadband sources can also be used to illuminate the object. The simplicity of this single-lens microscope keeps chromatic aberrations to a minimum.1

In the case of coherent imaging, the incident beam is collimated, uniform, and propagates along the Z-axis. The computed distributions of intensity and phase at the exit pupil of the lens are shown in Figure 41.8. The intensity plot in Figure 41.8(a) is drawn on a logarithmic scale to emphasize the spatial-frequency content of the image-carrying beam. It is found numerically that the best focus of this system is at a distance CA = 316 mm from the lens center; the computed coherent image at this distance, having a magnification close to 200, is shown in Figure 41.9(a).

Figure 41.6 The simulated van Leeuwenhoek microscope. The lens radius R = 1 mm, its refractive index n = 1.5, the object is 20 μm to the right of the paraxial focus F (i.e., 0.48 mm away from the lens), and the exit-pupil radius Ra = 0.25 mm. The virtual image, formed 316 mm to the left of the lens center, can be comfortably viewed when the eye is placed at or near the exit pupil. [Diagram labels: axes X, Y, Z; paraxial focus F; lens center C; object; virtual image; exit pupil; observer.]

Figure 41.7 Distributions of (a) intensity and (b) phase immediately in front of the object. The object is trans-illuminated with a uniform, coherent, and monochromatic plane wave (λ = 0.5 μm). The smallest feature in the lower right-hand side is 1 μm in diameter. The phase values in (b) range from −144° (black) to +108° (white). [Horizontal axes: x/λ from −21 to 21.]

Figure 41.8 Distributions of (a) intensity and (b) phase at the exit pupil of the microscope of Figure 41.6 with the coherently illuminated object of Figure 41.7. The intensity is shown on a logarithmic scale to emphasize its weak regions. The phase ranges from −180° (black) to +180° (white). [Horizontal axes: x/λ from −550 to 550.]

Figure 41.9 Distributions of intensity in the virtual image seen through the microscope of Figure 41.6 with the object of Figure 41.7. The image in (a) is computed for a coherent, monochromatic beam of light normally incident on the object. The incoherent image in (b) is obtained by illuminating the object with 225 point sources through a 0.15NA condenser lens. These virtual images have a magnification of 200× and appear at a distance of 316 mm behind the lens center. [Horizontal axes: x/λ from −4500 to 4500.]

To compute the incoherent image, we illuminate the object with 225 monochromatic point sources (λ = 0.5 μm, NA = 0.15) and superimpose the resulting intensity distributions in the image plane obtained for individual point sources. Figure 41.9(b) is the computed incoherent image of the object shown in Figure 41.7 through the system of Figure 41.6. The magnification is about 200, and the image exhibits a fairly accurate reproduction of the various features present in the object, except perhaps the spot 1 μm in diameter on the lower right-hand side. Thus the microscope depicted in Figure 41.6, having numerical aperture NA ≈ 0.16, is nearly diffraction-limited, over at least a 20 μm field of view, with a resolution of 2 μm at λ = 0.5 μm. (In reality, the field of view of the microscope is several times greater than that demonstrated in this particular example.)

Method of computation

The results presented in this chapter were obtained by a combination of ray-tracing and diffraction calculations. The light emanating from the object was propagated to the vicinity of the lens using far-field (Fraunhofer) diffraction formulas. The complex-amplitude distribution at this point was converted into a set of geometric-optical rays, using the local Poynting vector to represent the ray. The rays were traced from the entrance pupil to the exit pupil of the lens using standard methods of ray-tracing. At the exit pupil the ray magnitude and phase information was converted into a complex wavefront, and the wavefront was propagated to the image plane using near-field (Fresnel) diffraction formulas.

Other applications of glass spheres

Glass balls have found application in other areas as well. A simple method of coupling the light from a diode laser (or a light-emitting diode) into an optical fiber uses a spherical glass ball between the source and the fiber’s entrance facet. This may not be the most efficient coupling mechanism, but it is simple, inexpensive, and easy to implement in conjunction with multimode fibers. Tiny glass beads are often mixed with ordinary paint for use on the streets, on automobile license plates, etc., to enhance retro-reflectivity. My colleague Stephen Jacobs of the University of Arizona has made a fused silica ball six inches in diameter, through which one can look toward the sun and observe beautiful optical phenomena.6 Looking through this glass sphere, one cannot help but remember that Nature has employed spherical droplets of water to create the magnificent rainbow.7,8

References for Chapter 41

1 B. J. Ford, The earliest views, Scientific American, 50–53, April 1998.
2 B. J. Ford, Leeuwenhoek Legacy, Bristol, Biopress; London, Farrand Press; 1991.
3 L. Yount, Antoni van Leeuwenhoek: First to See Microscopic Life, Enslow Publishers, 1996.
4 J. A. Mahaffey, Making Leeuwenhoek proud: building simple microscopes, Opt. & Phot. News 10, 62–63, March 1999.
5 These historical anecdotes have been compiled from information available on the worldwide web. See, for example, encarta.msn.com, www.hcs.ohio-state.edu, www.letsfindout.com, www.feic.com, www.ucmp.berkeley.edu, www.utmem.edu.
6 S. F. Jacobs and S. C. Johnston, Unusual optical effects of a solid glass sphere, Opt. & Phot. News 8, 44–45, October 1997.
7 H. M. Nussenzveig, The theory of the rainbow, Scientific American, 116–127, April 1977.
8 C. B. Boyer, The Rainbow, From Myth to Mathematics, Sagamore Press, Thomas Yoseloff, New York, 1959.

42 Projection photolithography†

Photolithography is the technology of reproducing patterns using light. Developed originally for reproducing engravings and photographs and later used to make printing plates, photolithography was found ideal in the 1960s for mass-producing integrated circuits.1 Projection exposure tools, which are now used routinely in the semiconductor industry, have continually improved over the past several decades in order to satisfy the insatiable demand for reduced feature size, increased chip size, improved reliability and production yield, and lower overall cost. High-numerical-aperture lenses, short-wavelength light sources, and complex photoresist chemistry have been developed to achieve fabrication of fine patterns over fairly large areas. Research and development efforts in recent years have been directed at improving the resolution and depth of focus of the photolithographic process by using phase-shifting masks (PSMs) in place of the conventional binary intensity masks (BIMs). In this chapter we describe briefly the principles of projection photolithography and explore the range of possibilities opened up by the introduction of PSMs.

Basic principles

Figure 42.1 is a diagram of a typical projection system used in optical lithography. A quasi-monochromatic, spatially incoherent light source (wavelength λ) is used to illuminate the mask. Steps are usually taken to homogenize the source, thus ensuring a highly uniform intensity distribution at the plane of the mask. The condenser stop may be controlled to adjust the degree of coherence of the illuminating beam; this control of partial coherence is especially important when PSMs are used to improve the performance of optical lithography beyond what is achievable with the traditional BIMs.

† The coauthor of this chapter is Rongguang Liang.

Figure 42.1 Essential elements of a photolithographic “stepper” used for exposing semiconductor wafers. The condenser stop controls the degree of coherence of the illumination. The numerical aperture NA0 of the projection lens is defined as sin θ, where θ is the half-angle of the cone subtended by the clear aperture of the projection lens at the wafer. The uniformly illuminated mask is imaged onto the wafer with a magnification M that is typically around 1/5. [Diagram labels, from top to bottom: light source, homogenizer, condenser stop, condenser lens, mask and stage, projection lens, half-angle θ, wafer and stage.]

The light transmitted through the mask is collected by the projection lens, which images the mask onto the wafer, typically with a magnification M = 1/5. Thus, if the numerical aperture of the projection lens is defined as NA0 = sin θ, its angular aperture on the mask side will be sin θ′ = M NA0. If the condenser’s numerical aperture NAc happens to be much less than sin θ′ then the illumination is coherent, while if NAc ≥ sin θ′ then the illumination is essentially incoherent. In practice the ratio σ = NAc/(M NA0) is used as a measure of the incoherence of illumination. For example, if M = 1/5 and NA0 = 0.6, then NAc = 0.084 yields σ = 0.7, while NAc = 0.06 yields σ = 0.5. For a given projection lens, therefore, the incoherence of illumination is proportional to the condenser’s stop diameter.1,2,3

Over the past decade, photolithographic systems have evolved through several generations. The wavelength of the light source has steadily decreased from 365 nm (i-line of mercury) to 257 nm (high-pressure mercury arc lamp) to 248 nm (KrF laser), and is presently at 193 nm (ArF excimer laser). The numerical aperture NA0 of the projection lens, having increased from its value of 0.16 in the early days to 0.6 in present-day systems, is likely to increase still further. The illumination systems have also improved, taking advantage of off-axis illumination and related configurations.1,2 Other improvements have occurred in the area of photoresists and the control of their exposure and development processes and also in the control of the flatness of the wafer, which reduces the need for a large depth of focus, etc. These topics are beyond the scope of the present chapter, and we refer the interested reader to the published literature for further information.1,2,3,4,5,6 In the remainder of this chapter we present computed images of various masks obtained in a typical projection system (NA0 = 0.6, M = 1/5) and compare the resulting image contrasts and resolutions.

PSM versus BIM

Traditional “binary intensity” masks (BIMs) consist of opaque chromium lines on transparent glass substrates; these masks modulate the intensity of the incident light without affecting its phase. Modern masks have begun to take advantage of optical phase by changing the thickness of the transparent regions of the mask, either by depositing additional transparent material where needed or by removing a thin layer from the substrate at specific locations, thereby selectively adjusting the transmitted optical phase.1,2 The basic idea of an optical phase-shifting mask for lithography originated in the early 1980s with M. D. Levenson6 in the US and, independently and almost simultaneously, with M. Shibuya7 in Japan. Figure 42.2 shows several different mask designs that exploit optical phase to improve the resolution of the photolithographic process.
In addition to improved resolution, these PSMs also increase the effective depth of focus and provide a wider process window (i.e., a wider range of acceptable focuses and exposures).1

Figure 42.2 Several mask structures and, below each structure, the corresponding E-field patterns immediately after transmission through the mask. (a) Conventional transmission mask. (b) Alternating-aperture phase mask with etched substrate. (c) A chromeless phase-edge mask produces dark lines in the image solely through destructive interference at the phase transitions. (d) A shifter–shutter mask is similar to (c) except that each dark line is produced by a pair of adjacent phase-edges. (e) A rim-shifter mask contains chrome lines bracketed by 180° phase-edges. (f) An attenuated phase-shift mask; here the shaded regions represent partially transmissive material with a 180° phase shift. (Adapted from reference 1.) [Panels (a)–(f); vertical axis of each field plot: amplitude (+, 0, −).]

Alternating-aperture phase-shifting mask

Consider the simple mask consisting of three bright lines on a dark background shown in Figure 42.3. Each bright line is 3λ wide, and the separation between adjacent lines is also 3λ. (Note that these are the mask dimensions; at the wafer the features are demagnified by a factor 1/M = 5.) We assume two different designs for the mask. In the first, the mask is a conventional BIM, the same phase being imparted to the light transmitted through each aperture. In the second, the mask is a PSM in which the upper and lower bright lines are phase-shifted by 180° relative to the central bright line. In Figures 42.4(a), (b) we compare the intensity patterns of the images obtained at the wafer for these two types of mask. The assumed projection system is that of Figure 42.1, with NA0 = 0.6, M = 1/5, and σ = 0.7. Clearly the PSM is better at resolving the dark spaces between adjacent bright lines. For direct comparison, a cross-section through these two intensity distributions is shown in Figure 42.4(c). Increasing the coherence of the illumination by closing down the aperture of the condenser to σ = 0.5 improves the image contrast of the PSM but degrades that of the BIM image, as can be readily observed in Figures 42.4(d)–(f).
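The trend described above can be reproduced qualitatively with a very reduced model. The sketch below is not the partially coherent calculation behind Figure 42.4; it assumes the fully coherent limit (σ → 0), a one-dimensional scalar field, and an ideal aberration-free pupil whose amplitude cutoff frequency is NA0/λ. Lengths are wafer-side distances in units of λ, so the 3λ lines and spaces on the mask become 0.6λ at the wafer. All function and variable names are illustrative.

```python
import math

NA0 = 0.6            # wafer-side numerical aperture quoted in the text
M = 1.0 / 5.0        # magnification quoted in the text
WIDTH = 3.0 * M      # line width at the wafer, in units of the wavelength
PITCH = 6.0 * M      # center-to-center line spacing at the wafer, in wavelengths

def coherent_kernel(u, cutoff):
    """Amplitude point-spread function of an ideal 1-D pupil (assumed model)."""
    if abs(u) < 1e-12:
        return 2.0 * cutoff
    return math.sin(2.0 * math.pi * cutoff * u) / (math.pi * u)

def slit_amplitude(x, center, width, cutoff, steps=200):
    """Image amplitude at x from one uniformly lit slit, by Simpson's rule."""
    a = center - width / 2.0
    h = width / steps
    total = coherent_kernel(x - a, cutoff) + coherent_kernel(x - a - width, cutoff)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * coherent_kernel(x - (a + i * h), cutoff)
    return total * h / 3.0

def image_intensity(x, signs):
    """Coherent image intensity of three slits with amplitude signs (+1 or -1)."""
    centers = (-PITCH, 0.0, PITCH)
    amplitude = sum(s * slit_amplitude(x, c, WIDTH, NA0)
                    for s, c in zip(signs, centers))
    return amplitude ** 2

BIM = (+1, +1, +1)   # same phase through every aperture
PSM = (-1, +1, -1)   # outer apertures shifted by 180 degrees

for label, signs in (("BIM", BIM), ("PSM", PSM)):
    on_line = image_intensity(0.0, signs)            # center of the middle line
    in_gap = image_intensity(PITCH / 2.0, signs)     # middle of a dark space
    print(f"{label}: line center = {on_line:.3f}, gap center = {in_gap:.3f}")
```

In this coherent limit the sign reversal makes the fields of neighboring apertures cancel in the gap, so the PSM gap is nearly black while the BIM gap is not. With σ = 0.7 or 0.5, as in Figure 42.4, the contrast difference is less extreme but goes in the same direction.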

Figure 42.3 A simple mask containing three transparent apertures on an opaque background. The apertures as well as the spaces between apertures are 3λ wide. When the apertures impart a uniform phase to the transmitted beam, the mask is a BIM. When the upper and lower apertures impose on the transmitted beam a 180° phase shift relative to the middle aperture, the mask is an alternating-aperture PSM. [Horizontal axis: x/λ from −10.5 to 10.5.]

Isolated bright line

As our second example we consider the case of an isolated bright line. Figures 42.5(a), (b) show respectively a BIM and a PSM for a line of width 4λ. (Again, this is the dimension at the mask; the projected line at the wafer is only 0.8λ wide.) The PSM of Figure 42.5(b) contains two 0.8λ-wide side-riggers, each imparting a 180° phase shift to the incident beam relative to the central bright line.4 Figures 42.6(a), (b) show the computed intensity patterns at the wafer for the two masks, and Figure 42.6(c) shows cross-sections of both patterns (the assumed coherence factor σ is 0.7). The side-riggers produce small bumps in the intensity pattern of the PSM, but these are usually below the resist threshold and are not printed. In Figure 42.6 it can be seen that the computed image of the bright line using the PSM is about 10% narrower than that obtained with the BIM. This modest reduction in the printed line-width can be slightly improved upon if the side-riggers’ location and width are properly optimized and also if the condenser stop is further closed down to increase the coherence of illumination (σ-values as low as 0.3 have been suggested in the literature4,5).

Contact hole

Figure 42.7(a) shows a simple 4λ × 4λ square aperture on a dark background. This feature has uniform phase across the aperture and, therefore, represents the BIM for a contact hole. A corresponding PSM for the same hole is shown in Figure 42.7(b). Here four side-rigger lines of width 0.5λ and 180° phase shift (relative to the central aperture) are placed around the hole.4 The computed intensity patterns of the images of these masks at the wafer appear in Figures 42.8(a), (b), respectively. The side-rigger features are too small to be printed, but their destructive interference with the central aperture results in a smaller projected hole, as revealed in the cross-sectional intensity profiles at the wafer shown in Figure 42.8(c). As before, the printed feature size can be further optimized by adjusting the dimensions of the side-riggers as well as by closing the condenser stop to reduce the value of σ.

Figure 42.4 Computed plots of intensity distribution at the wafer for the mask of Figure 42.3 placed in the system of Figure 42.1 (NA0 = 0.6, M = 1/5). (a) Image of the BIM obtained with σ = 0.7. (b) Image of the PSM obtained with σ = 0.7. (c) Cross-sections of the intensity patterns for the BIM (broken line) and the PSM (solid line). (d)–(f) Same as the patterns in the left-hand column, but for σ = 0.5. [Image panels: horizontal axis x/λ from −2.1 to 2.1. Cross-sections (c) and (f): horizontal axis y/λ from −2.1 to 2.1, vertical axis intensity from 0 to 1, with BIM and PSM curves labeled.]

Figure 42.5 Masks designed for creating an isolated bright line at the wafer. (a) BIM containing a 4λ-wide line on an opaque background. (b) PSM featuring the same 4λ-wide line flanked by a pair of 0.8λ-wide side-riggers. Each side-rigger imparts to the incident beam a 180° phase shift relative to the central line. The separation between the central line and each side-rigger is 2λ. [Plot axes: x/λ from −13 to 13.]

More complicated patterns

Figure 42.9(a) shows a mask with five transparent apertures. The widths of line (bright) and space (dark) on this mask are both equal to 4.8λ. If the mask is used without any phase shifts, the intensity pattern of Figure 42.9(b) will be obtained at the wafer. Placing 180° phase-shifters on alternate bright apertures results in the image intensity distribution shown in Figure 42.9(c). Two different cross-sections of these patterns are also given in Figures 42.9(d), (e). In this case of relatively large features, there are apparently no significant differences between a BIM and a PSM. With shrinking feature size, however, the advantages of the PSM become apparent. Figure 42.10 is the counterpart of Figure 42.9 for the case where the line- and space-widths (at the mask) are both reduced to 3λ. The BIM is now seen to yield a fairly low-contrast image at the wafer, while the PSM provides better resolution and sharper contrast. Reducing the feature size still further to 2.4λ (at the mask) results in the patterns of Figure 42.11. Here the PSM still performs reasonably well, while the image quality of the BIM has been substantially degraded.

Phase-shifters on a transparent background

As a final example, consider the fully transparent (i.e., chromeless) PSM shown in Figure 42.12(a). Each of the three rectangular features on this mask is 4λ wide and is phase-shifted by 180° relative to the background. Also, the spaces separating adjacent rectangular features are each 4λ wide. The computed intensity distribution at the wafer in a system having NA0 = 0.6, M = 1/5, σ = 0.7 is shown

Figure 42.6 Computed intensity patterns at the wafer for the masks of Figure 42.5 in the system of Figure 42.1 (NA0 = 0.6, M = 1/5, σ = 0.7). (a) Using the BIM; (b) using the PSM; (c) the cross-sections of the intensity patterns in the images of the BIM (broken line) and the PSM (solid line). [Plot axes: x/λ and y/λ from −2.6 to 2.6; intensity from 0 to 1.]

in Figure 42.12(b), and a cross-sectional view is provided in Figure 42.12(c). Depending on the intended application, this image may or may not be acceptable. For instance, suppose the long edges of the rectangular features of the mask are meant to produce dark lines at the wafer. This they do quite well, as is evident from the presence of four horizontal dark lines in Figure 42.12(b). However, if the ends of these dark lines are required to be disconnected from each other, then the PSM has failed in providing the necessary isolation. The problem is rooted in the sharp 0°–180° phase-edge occurring at the short end of each rectangular feature. This problem can be remedied in principle by softening the phase transition at these short ends by providing a gradual transition from 180° to 120° to 60° and eventually to 0°. Such phase stair-steps, however, are usually

Figure 42.7 Mask patterns for creating a contact hole. (a) BIM containing a 4λ × 4λ square aperture on an opaque background. (b) PSM featuring the same 4λ × 4λ aperture surrounded by 0.5λ-wide side-riggers. Each side-rigger imparts to the incident beam a 180° phase shift relative to the central aperture. [Plot axes: x/λ from −5 to 5.]

Figure 42.8 Computed intensity patterns at the wafer for the masks of Figure 42.7 in the system of Figure 42.1 (NA0 = 0.6, M = 1/5, σ = 0.7). (a) Using the BIM; (b) using the PSM; (c) the cross-sections of the intensity patterns in the images of the BIM (broken line) and the PSM (solid line). [Plot axes: x/λ from −1.8 to 1.8; intensity from 0 to 1.]

Figure 42.9 (a) Mask pattern containing five transparent apertures on an opaque background. The lines and spaces are all 4.8λ wide. When used as a BIM, all apertures impart the same uniform phase to the incident beam. When used as a PSM, the apertures are alternately phase-shifted by 0° and 180°. The assumed projection-system parameters are NA0 = 0.6, M = 1/5, σ = 0.7. (b) Computed intensity pattern in the image of the BIM. (c) Computed intensity pattern in the image of the PSM; the arrows mark the cross-sections displayed in (d) and (e). (d) Cross-sectional plots of intensity distributions in the images of the BIM (broken line) and the PSM (solid line). (e) A different cross-section of the two images. [Plot axes: x/λ from −27.5 to 27.5 for the mask and from −5.5 to 5.5 for the images; intensity from 0 to 1.]

impractical because they are costly and, moreover, they produce masks that are difficult to inspect and to repair. In today's practice, such unwanted dark lines are erased by a second exposure through a different mask.

Concluding remarks

Incorporating the advantages of optical phase in the design, manufacture, and testing of photomasks is still very much a research topic; many potential benefits of the PSM remain to be realized. The type of PSM in common use today is the attenuated PSM depicted in Figure 42.2(f), where the traditional opaque chrome is replaced by a material that transmits 8% with a 180° phase shift. This is useful for printing bright spaces and contact holes, and has essentially replaced the shifter–shutter type of mask (see Figure 42.2(d)). Also, the more recent

Figure 42.10 Same as Figure 42.9 but for smaller mask features. The lines and spaces on the mask are now 3λ wide. [Plot axes: x/λ from −15 to 15 for the mask and from −3 to 3 for the images; intensity from 0 to 1.]

Figure 42.11 Same as Figure 42.9 but for very small mask features. The lines and spaces on the mask are now 2.4λ wide. [Plot axes: x/λ from −13 to 13 for the mask and from −2.6 to 2.6 for the images; intensity from 0 to 1.]
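The trend across Figures 42.9 to 42.11 can be illustrated with a much simpler model than the one used for the figures. The sketch below is a one-dimensional, fully coherent calculation (on-axis illumination, i.e., a hypothetical σ = 0 rather than the σ = 0.7 of the figures), with all widths expressed in wavelengths at the wafer (mask widths multiplied by M = 1/5). The function names, the grid spacing, and the number of periods are assumptions made for illustration only. In this idealized limit the lens passes spatial frequencies up to NA/λ, so a line/space pattern loses all contrast once its fundamental frequency exceeds that cutoff, whereas alternating 180° shifters halve the fundamental frequency and suppress the zero order.

```python
import numpy as np

def line_space_mask(width, n_pairs=8, dx=0.01, alternate_phase=False):
    """Amplitude of a periodic line/space pattern (widths in wavelengths, wafer scale).

    Each repeat unit is line, space, line, space. With alternate_phase=True
    the second line carries a 180-degree phase shift (amplitude -1).
    """
    n = int(round(width / dx))               # samples per line or per space
    second = -1.0 if alternate_phase else 1.0
    unit = np.concatenate([np.ones(n), np.zeros(n),
                           second * np.ones(n), np.zeros(n)])
    return np.tile(unit, n_pairs)

def coherent_image(mask, na=0.6, dx=0.01):
    """Image intensity for on-axis coherent illumination (assumed sigma = 0)."""
    freqs = np.fft.fftfreq(mask.size, d=dx)  # cycles per wavelength
    spectrum = np.fft.fft(mask)
    spectrum[np.abs(freqs) > na] = 0.0       # the lens passes |f| <= NA / lambda
    return np.abs(np.fft.ifft(spectrum)) ** 2

def contrast(intensity):
    """Michelson contrast (max - min) / (max + min)."""
    return (intensity.max() - intensity.min()) / (intensity.max() + intensity.min())

def compare(width):
    """Return (BIM contrast, PSM contrast) for a given wafer-side line width."""
    bim = contrast(coherent_image(line_space_mask(width)))
    psm = contrast(coherent_image(line_space_mask(width, alternate_phase=True)))
    return bim, psm

# Wafer-side widths corresponding to mask widths of 4.8, 3 and 2.4 wavelengths.
for w in (0.96, 0.60, 0.48):
    bim, psm = compare(w)
    print(f"width {w:.2f} wavelengths: BIM contrast {bim:.3f}, PSM contrast {psm:.3f}")
```

Because partial coherence (σ > 0) lets obliquely incident light carry higher spatial frequencies through the lens, the BIM images in the figures retain some contrast where this coherent sketch predicts none; the sketch is meant only to show why the alternating-phase mask degrades more slowly.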

Figure 42.12 (a) Transparent PSM containing three rectangular regions of width 4λ, each imparting a 180° phase shift to the incident beam. Like the background, the spaces between adjacent apertures (also 4λ wide) are fully transparent and impart a 0° phase to the beam. (b) Computed intensity distribution in the image plane of the system of Figure 42.1 having NA0 = 0.6, M = 1/5, σ = 0.7. (c) Central cross-section of the intensity pattern of the image seen in (b). [Plot axes: x/λ from −15 to 15 for the mask; x/λ and y/λ from −3 to 3 for the image.]

high-transmission tri-tone PSM, where the phase-shifted material transmits 18% and there is a separately patterned opaque layer, has superseded the rim-shifters (see Figure 42.2(e)).⁸

References for Chapter 42

1 M. D. Levenson, Wavefront engineering for photolithography, Physics Today, 28–36, July 1993.
2 M. D. Levenson, Extending the lifetime of optical lithography technologies with wavefront engineering, Jpn. J. Appl. Phys. 33, 6765–6773 (1994).
3 M. D. Levenson, Wavefront engineering from 500 nm to 100 nm CD, in Emerging Lithographic Technologies, SPIE 3048, 2–13 (1997).
4 T. Terasawa, N. Hasegawa, T. Kurosaki, and T. Tanaka, 0.3-micron optical lithography using a phase-shifting mask, SPIE 1088, 25–33 (1989).
5 N. Hasegawa, T. Terasawa, T. Tanaka, and T. Kurosaki, Submicron optical lithography using phase-shifting mask, Electro-chem. Ind. Phys. Chem. 58, 330–335 (1990).
6 M. D. Levenson, N. S. Viswanathan, and R. A. Simpson, Improving resolution in photolithography with a phase-shifting mask, IEEE Trans. Electron Devices ED-29, 1828–1836 (1982).
7 M. Shibuya, Projection master for transmitted illumination, Japanese Patent Gazette # Showa 62-50811, application dated 9/30/80, issued 10/27/87.
8 M. D. Levenson, private communication.

43 Interaction of light with subwavelength structures†

When a light field interacts with structures that have complex geometric features comparable in size to the wavelength of the light, it is not permissible to invoke the assumptions of the classical diffraction theory, which simplify the problem and allow for approximate solutions. For such cases, direct numerical solutions of the governing equations are sought through approximating the continuous time and space derivatives by the appropriate difference operators. The Finite Difference Time Domain (FDTD) method discretizes Maxwell's equations by using a central difference operator in both the time and space variables.¹ The E- and B-fields are then represented by their discrete values on the spatial grid, and are advanced in time in steps of Δt. The numerical solution thus obtained to Maxwell's equations (in conjunction with the relevant constitutive relations) provides a highly reliable representation of the electromagnetic field distribution in the space-time region under consideration. This chapter presents examples of application of the FDTD method to problems involving the interaction between a focused beam of light and certain subwavelength structures of practical interest. A few general remarks concerning the nature of the FDTD method appear in the next section. This is followed by a description of the simulated system and two examples in which comparison is possible between the FDTD method and an alternative method of calculation. We then present simulation results for the case of a focused beam interacting with small pits and apertures in a thin film supported by a transparent substrate.

The FDTD method

The spatial unit cell used in three-dimensional FDTD simulations is shown in Figure 43.1. The components of the vector fields E and B are located at different

† The co-authors of this chapter are Armis R. Zakharian, now with Corning Corp., and Jerome V. Moloney of the University of Arizona.

Figure 43.1 The unit cell of the FDTD mesh has dimensions Δx × Δy × Δz. The various components of the E and B fields are assigned to different locations on the unit cell. The staggered field components are shifted by a half-pixel in various directions. [Diagram labels: Ex, Ey, Ez, Bx, By, Bz, Δx, Δy, Δz.]
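The staggering of Figure 43.1 is easiest to see in one dimension, where only Ez and By survive. The following is a minimal sketch, not the code used for the simulations in this chapter: it assumes vacuum, normalized units (c = 1, with By standing for cBy), periodic boundaries instead of a PML, and a Gaussian initial pulse; the function name and all parameter values are illustrative. Ez is stored at integer grid points and integer time steps, By at half-integer points and half-integer time steps, and each field is advanced from the central difference of the other.

```python
import numpy as np

def yee_1d(n_cells=600, n_steps=200, courant=0.5, pulse_sigma=15.0):
    """Leapfrog update of Ez and By on a staggered, periodic 1-D grid.

    Ez[i] lives at x = i*dx and integer time steps; By[i] lives at
    x = (i + 1/2)*dx and half-integer time steps. The Courant number
    courant = c*dt/dx must not exceed 1 for stability in one dimension.
    """
    x = np.arange(n_cells)
    ez = np.exp(-0.5 * ((x - n_cells // 2) / pulse_sigma) ** 2)
    by = np.zeros(n_cells)
    energy = [np.sum(ez**2) + np.sum(by**2)]
    for _ in range(n_steps):
        # Faraday's law, dBy/dt = dEz/dx: central difference about i + 1/2
        by += courant * (np.roll(ez, -1) - ez)
        # Ampere's law, dEz/dt = dBy/dx: central difference about i
        ez += courant * (by - np.roll(by, 1))
        energy.append(np.sum(ez**2) + np.sum(by**2))
    return ez, by, np.array(energy)

ez, by, energy = yee_1d()
fluctuation = energy.max() / energy.min() - 1.0
print(f"relative fluctuation of the summed field energy: {fluctuation:.2e}")
```

With the default settings the initial pulse splits into two half-amplitude pulses that travel in opposite directions at one cell per two time steps, and the summed quantity Ez² + By² stays close to its initial value; since Ez and By are sampled half a time step apart, this simple sum fluctuates slightly even though the scheme itself adds no dissipation.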

positions with respect to the cell center, so that every component of the electric field is surrounded by four circulating components of the magnetic field, and vice versa. Such a staggered mesh is motivated by the integral form of Maxwell's curl equations. The contour integrals of E (B) along the edges of the cell in Faraday's law (Ampere's law) circulate around the corresponding magnetic (electric) field component at the center of the cell face. In 3-D simulations at least six field components must be stored and updated at each grid point, which leads to considerable memory and CPU requirements for FDTD simulations. Fortunately, the time update of any field component involves only nearby fields located one or two cells away on the grid. This kind of locality in the physical space translates into computer memory access locality and allows for efficient implementation of the FDTD algorithm on many types of shared and distributed memory parallel platforms. Low-reflection absorbing boundary conditions that terminate the computational domain by a Perfectly Matched Layer (PML) allow the simulation of physical problems with open boundaries.² Since the FDTD algorithm solves Maxwell's equations in the time domain, calculation for a broad range of frequencies is possible in a single simulation using a time-pulsed excitation. Other advantages include the possibility of modeling dispersive and nonlinear materials. An important property of the FDTD method is that it introduces no additional dissipation into the physical problem due to numerical discretization, and hence energy is conserved. However, the finite difference method contributes to a dispersion error. In the commonly used second-order-accurate implementation of FDTD, this error diminishes with cell size h as O(h²). In practice, therefore, to keep the numerical dispersion errors under control, a grid with about 30 points per wavelength is desired. The rather large number of

Figure 43.2 3-D computational domain for simulating the interaction between a focused beam of light and various marks (i.e., bumps or pits) on the surface of a multilayer data storage medium. (a) Non-uniform conformal grid; the grid-line density is higher near the center, where the focused beam and the multilayer stack are located. (b) Nested rectangular cells forming a non-conformal hierarchical grid. [Plot axes: x [μm] and y [μm] from −2 to 2; z [μm] from −0.4 to 0.4.]

points and iterations thus required for accurate results may render the solution impractical for a problem with a large spatial and/or temporal domain. In many cases it is desirable to retain the efficiency of the FDTD scheme on rectangular grids, but achieve higher resolution only in those regions of the computational domain where it is needed. Non-uniform grids allow the cell size to vary in each coordinate direction while keeping the grid structured and conformal, as in Figure 43.2(a). A more efficient approach is to employ a collection of nested rectangular cells that form a non-conformal hierarchical grid, as in Figure 43.2(b). Each successive nested level has a higher resolution, e.g., by a factor of two, than the previous level, allowing smaller cell sizes to be "focused" in the regions of interest (e.g., sub-wavelength features, photonic crystal microcavity, etc.). Inside


each rectangular region the standard FDTD algorithm is applied, while at the boundaries between the grids an update scheme and interpolation must be employed to keep the method stable and accurate. In FDTD the time step Δt is proportional to the cell size, and hence the smallest time step is required on the grids with the highest resolution. Each grid can be updated with its own time-step, the grids with cell size 2h doing half as many iterations as grids with cell size h.

The simulated system

The FDTD algorithm is quite powerful and can be applied to a wide variety of problems in electromagnetics. For demonstration purposes in this chapter, however, we confine our attention to a simple system involving the interaction between a focused beam of light and small (subwavelength) structures located in the focal region. Figure 43.3 shows a coherent, monochromatic beam of light (free-space wavelength = λ0), brought to focus by an aberration-free objective lens (numerical aperture NA = 0.6, focal length f = 5000λ0). The incident beam at the entrance pupil is linearly polarized along the X-axis, and the total optical power (i.e., integrated intensity) at the entrance pupil of the lens is set to unity. The sample typically consists of a thin film (or thin-film stack) coated on a transparent substrate; the various samples used in our simulations are depicted in Figures 43.3(b)–(f). Detailed descriptions of these samples will be given in the context of the relevant simulations in the following sections. The focused spot may illuminate the sample directly, as in Figures 43.3(b), (d), and (f), or through a glass hemisphere (i.e., solid immersion lens) placed in contact with the sample, as in Figure 43.3(c). When the hemispherical lens is present, the thin film(s) may be coated directly on its flat facet, in which case the hemisphere acts as the sample substrate as well.
Figure 43.4 shows computed plots of intensity (top) and phase (bottom) at the focal plane of the lens depicted in Figure 43.3(a). From left to right, these distributions represent the E-field components along the X-, Y-, and Z-axes. At the focal plane the peak intensities are in the ratio of |Ex|² : |Ey|² : |Ez|² = 1000 : 0.4 : 45. The various rings of the focused spot are phase-shifted by 180° relative to their adjacent neighbors, and the Z-component of the field is 90° out of phase relative to Ex and Ey. (In the remainder of this chapter we will omit the plots of Ez distribution, since Ez can always be computed from a knowledge of the Ex and Ey distributions.) The intensity and phase distributions in this chapter are plotted in an interval xmin ≤ x ≤ xmax and ymin ≤ y ≤ ymax of the XY-plane by assigning the color red to the maximum value of the function, blue to the minimum value, and the continuum of the white light spectrum to the values in between. The phase plots cover the range from −180° (blue) to +180° (red). The intensity distributions are first normalized by the peak value of the corresponding function, say, Ix-peak = max(|Ex|²) within the displayed interval. The base 10 logarithm

Figure 43.3 (a) Diagram of the simulated system, in which an aberration-free objective lens brings a coherent, monochromatic beam of light to focus. The bottom row shows the various samples used in the simulations. In (b), (d), and (e) a 50 nm-thick metal film is coated over a transparent substrate. In (c) the bilayer at the bottom of the glass hemispherical lens consists of quarter-wave-thick dielectric films. The sphero-cylindrical pits in (d) and (e) are 500 nm long, 300 nm wide, and 100 nm deep. In (f) the aperture in the 20 nm-thick metal film is circular in one simulation and bowtie-shaped in another. The circular hole's diameter is 400 nm, while the bowtie aperture is 400 nm long, 300 nm wide on each side, and 60 nm wide at the neck. [Diagram labels: incident beam, objective, substrate, axes X, Y, Z; metal film, thin film(s), glass hemisphere.]

of the normalized function is then evaluated, and all pixel values below a certain level, say, −α, are set equal to −α. Displayed plots of log_intensity_α thus cover the range from 10^(−α) Ipeak (blue) to Ipeak (red). When the beam is focused through a hemispherical lens of refractive index n, as in Figure 43.3(c), the same distribution as in Figure 43.4 is found at the focal plane, but the spatial coordinates must shrink by a factor of n to account for the reduced wavelength (λ = λ0/n) within the medium of the hemispherical lens. At the bottom of the hemisphere, therefore, the focused spot diameter is reduced by a factor of n compared to that shown in Figure 43.4.
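The log_intensity display convention described above amounts to a normalize, floor, and take-the-logarithm operation. A minimal sketch follows; the function name and the choice to apply the floor before the logarithm (which avoids evaluating log10 of zero at dark pixels) are assumptions of this illustration, not details given in the text.

```python
import numpy as np

def log_intensity(field, alpha):
    """Return log10 of the peak-normalized intensity, floored at -alpha.

    field : array of complex (or real) field amplitudes, e.g. Ex on the XY grid
    alpha : positive number; plotted values then span [-alpha, 0]
    """
    intensity = np.abs(field) ** 2
    normalized = intensity / intensity.max()
    # Flooring before the logarithm avoids log10(0) at dark pixels.
    return np.log10(np.maximum(normalized, 10.0 ** (-alpha)))

# Illustrative use on a few sample amplitudes with alpha = 4.
sample = np.array([1.0, 0.1, 1e-3, 0.0])
print(log_intensity(sample, 4))
```

Mapping the returned values linearly onto a blue-to-red color scale reproduces the stated display range, from 10^(−α) times the peak intensity up to the peak itself.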

Figure 43.4 Plots of log_intensity_4 (top) and phase (bottom) at the focal plane of the lens of Figure 43.3(a). Left to right: E-field components along the X-, Y-, and Z-axes. At the entrance pupil the incident beam (wavelength = λ0) is Gaussian with 1/e (amplitude) radius of 4000λ0, truncated at the lens aperture (radius = 3000λ0). The incident beam is linearly polarized along the X-axis, and its total optical power captured by the lens is unity, that is, Px = 1.0, Py = 0.0. (The power content of the Ez component at the focal plane is 8.3% of total power.) [Plot axes: x/λ0 from −4.0 to +4.0.]

In some cases the beam must be focused onto the object of interest through a parallel plate cover glass or through the sample's substrate, as is the case, for instance, in Figure 43.3(e). Under such circumstances, to obtain a focused spot free from spherical aberration, the objective lens must be designed for the specific thickness and refractive index of the substrate. Unlike focusing through a glass hemisphere, however, the focused spot inside a cover plate (or flat substrate, as the case may be) has exactly the same dimensions as that obtained by focusing in air through an objective of the same NA. The reason is that, in passing from the air to the substrate through a flat interface, the effect of the reduced wavelength on the focused beam is exactly canceled out by the reduced angle of the focused cone (Snell's law). The spot that illuminates the concave pit of Figure 43.3(e) through the sample's substrate, therefore, has exactly the same size as that which directly illuminates the convex pit of Figure 43.3(d).

Reflection from a metallic mirror

The mirror depicted in Figure 43.3(b) is a 50-nm-thick metal film of complex refractive index n + ik = 2.0 + i7.0, coated over a transparent substrate of index


n = 1.5. The large absorption coefficient k of the metal film ensures that the light does not reach the substrate; most of the incident light is therefore reflected, while a small fraction is absorbed in the metal. For the incident beam depicted in Figure 43.4 at λ0 = 650 nm, Figure 43.5 shows computed plots of reflected intensity (top) and phase (bottom) obtained with the FDTD method. (The FDTD mesh size was Lx = Ly = 12λ0, and the mirror's front facet was a distance z = 180 nm beyond the focal plane of the lens.) The integrated intensity of the reflected light over the XY-plane for the X- and Y-components of polarization may be defined as follows:

$$P_x = \iint |E_x|^2 \, dx \, dy, \qquad P_y = \iint |E_y|^2 \, dx \, dy.$$

Using the FDTD method, we found Px = 0.85, Py = 0.0016 for the mirror of Figure 43.3(b) illuminated with the focused spot of Figure 43.4. To verify the

Figure 43.5 Plots of reflected log_intensity_4 (top) and phase (bottom) from the metallic mirror depicted in Figure 43.3(b) at λ0 = 650 nm. The panels on the left-hand side correspond to Ex, while those on the right-hand side represent the Ey component of the reflected field. The front facet of the mirror is located a distance z = 180 nm beyond the focal plane. [Plot axes: x (μm) from −2.6 to +2.6.]


accuracy of the FDTD method, we simulated the same system using an alternative method based on the superposition of plane-wave solutions to Maxwell's equations with matching boundary conditions at the various interfaces. The intensity and phase distributions thus obtained were visually indistinguishable from those shown in Figure 43.5, and the corresponding integrated intensities were found to be Px = 0.86, Py = 0.0018. The slight differences between the two methods of computation reflect the cumulative effect of numerical errors inherent to the FDTD algorithm. Similar simulations were performed for the sample of Figure 43.3(b) illuminated through a glass hemisphere of index n = 1.5. (The FDTD mesh size in this case was Lx = Ly = 8λ0, but the mirror's front facet remained at z = 180 nm beyond the focal plane of the lens.) The computed values of integrated intensity were Px = 0.78, Py = 0.0019. The corresponding quantities obtained with the alternative (and more accurate) method of plane-wave superposition were Px = 0.80, Py = 0.0022. Once again, comparison against a benchmark has shown the effect of small but cumulative numerical errors on the results of FDTD calculations.
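On a sampled grid, the integrated intensities Px and Py quoted above reduce to sums of |E|² over the pixels multiplied by the pixel area. The sketch below shows this for a hypothetical test field, a Gaussian amplitude normalized analytically to unit power; the function name, the grid, and the Gaussian itself are assumptions chosen only so that the expected answer is known in advance.

```python
import numpy as np

def integrated_intensity(field, dx, dy):
    """Riemann-sum estimate of the double integral of |field|^2 over the XY-plane."""
    return float(np.sum(np.abs(field) ** 2) * dx * dy)

# Hypothetical test field: Gaussian amplitude with 1/e radius w, scaled so that
# the integral of |Ex|^2 over the plane equals 1 (arbitrary length units).
w = 0.5
x = np.linspace(-4.0, 4.0, 401)
dx = x[1] - x[0]
xx, yy = np.meshgrid(x, x)
ex = np.sqrt(2.0 / (np.pi * w**2)) * np.exp(-(xx**2 + yy**2) / w**2)
ey = np.zeros_like(ex)

px = integrated_intensity(ex, dx, dx)
py = integrated_intensity(ey, dx, dx)
print(f"Px = {px:.6f}, Py = {py:.6f}")
```

Applied to the numbers in the text, the FDTD and plane-wave values of Px for the mirror in air (0.85 and 0.86) differ by roughly 1%, and those for illumination through the hemisphere (0.78 and 0.80) by roughly 2.5%.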

Figure 43.6 Plots of reflected log_intensity_3 (top) and phase (bottom) from the dielectric bilayer depicted in Figure 43.3(c) at λ0 = 400 nm. The panels on the left-hand side correspond to Ex, while those on the right-hand side represent the Ey component of the reflected field. The front facet of the stack is at z1 = 230 nm beyond the focal plane. [Plot axes: x (μm) from −1.067 to +1.067.]


Although the alternative method employed in the above examples is faster and more accurate than FDTD, it has the disadvantage of being restricted to geometries such as those in Figures 43.3(b) and 43.3(c), where the sample consists of one or more homogeneous layers with flat surfaces/interfaces. As soon as inhomogeneities or non-uniformities are introduced, the computation method based on plane-wave superposition fails, and the FDTD method becomes an attractive (though costly) candidate for numerical solution of Maxwell's equations.

Reflection and transmission at a dielectric bilayer

The sample depicted in Figure 43.3(c) consists of two quarter-wave-thick dielectric layers coated at the bottom of a glass hemisphere of index n = 1.5. The layer directly in contact with the hemisphere has n = 2.0, d = 50 nm, while the other layer has n = 1.5, d = 67 nm. Since the layers are homogeneous and the interfaces are flat, the method of computation based on plane-wave superposition may be used once again to check the accuracy of the FDTD simulations.
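The two layer thicknesses quoted above can be checked against the quarter-wave condition d = λ0/(4n), taking λ0 = 400 nm, the wavelength stated for the bilayer simulations (Figure 43.6). The helper below is a hypothetical name introduced for this check only.

```python
def quarter_wave_thickness(wavelength_nm, index):
    """Physical thickness (nm) of a layer whose optical thickness is a quarter wave."""
    return wavelength_nm / (4.0 * index)

for n in (2.0, 1.5):
    d = quarter_wave_thickness(400.0, n)
    print(f"n = {n}: d = {d:.1f} nm")
```

The results, 50 nm for n = 2.0 and about 66.7 nm for n = 1.5, agree with the thicknesses given in the text once the second value is rounded to 67 nm.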

Figure 43.7 Same as Figure 43.6 but for the transmitted beam. The distance from the rear of the stack to the plane where the transmitted beam is observed is z2 = 30 nm. [Plot axes: x (μm) from −1.067 to +1.067.]


In our FDTD calculations of the bilayer stack of Figure 43.3(c) the incident focused beam had λ0 = 400 nm, the mesh size was Lx = Ly = 8.08λ0, and the distance from the focal plane to the top of the stack was z1 = 230 nm, while that from the bottom of the stack to the plane in which the transmitted beam is observed was z2 = 30 nm. Figures 43.6 and 43.7 show computed plots of intensity and phase for the reflected and transmitted fields, respectively. The corresponding distributions obtained with the alternative method of plane-wave superposition were visually indistinguishable from those in Figures 43.6 and 43.7. The integrated values of reflected intensity are Px = 0.022 (0.019 with the alternative method) and Py = 0.0026 (both methods). The corresponding quantities for the transmitted beam are Px = 0.97 (1.01 with the alternative method) and Py = 0.01 (both methods). Once again the FDTD method is seen to be adequate for these types of calculation, provided that a deviation of a few percentage points from the exact solution (caused by discretization and numerical errors) is deemed acceptable.

Figure 43.8 Plots of reflected log_intensity_3 (top) and phase (bottom) from the convex pit in the sample depicted in Figure 43.3(d) at λ0 = 650 nm. The panels on the left-hand side correspond to Ex, while those on the right-hand side represent the Ey component of the reflected field. The pit center is 250 nm to the left of the focused spot center. [Plot axes: x (μm) from −1.73 to +1.73.]


Reflection from convex and concave pits

The substrate shown in cross-section in Figure 43.3(d) is embossed with a sphero-cylindrical pit having a length of 500 nm along X, width of 300 nm along Y, and depth of 100 nm along Z (the profile of the pit in the XY-plane can also be seen in Figure 43.2). The substrate's index is n = 1.5, and the metal film's thickness and complex index are d = 50 nm, n + ik = 2.0 + 7.0i. In our FDTD simulations the incident wavelength was λ0 = 650 nm, the mesh size was Lx = Ly = 12λ0, and the front facet of the metal film was at z = 280 nm beyond the focal plane. Figure 43.8 shows computed plots of reflected intensity and phase from a pit whose center has been displaced by Δx = 250 nm from the center of the focused spot. The integrated values of reflected intensity are Px = 0.82, Py = 0.0025. The pit in the above example is similar to those embossed on the plastic substrate of a compact disk (CD) or a digital versatile disk (DVD). However, the focused laser beam in a CD or DVD player does not shine directly onto the pit; rather, the beam arrives through the plastic disk substrate as in Figure 43.3(e).

Figure 43.9 Same as Figure 43.8 but for the sample of Figure 43.3(e). The objective is now corrected for the thickness and refractive index of the substrate, so the beam focused on this concave pit continues to be the diffraction-limited spot shown in Figure 43.4. [Plot axes: x (μm) from −1.73 to +1.73.]

Figure 43.10 Plots of transmitted log_intensity_3 (top) and phase (bottom) at λ0 = 650 nm through the thin metal film depicted in Figure 43.3(f). The film contains a 400 nm-diameter circular aperture at its center. The panels on the left-hand side correspond to Ex, while those on the right-hand side represent the Ey component of the field. The observation plane is 20 nm past the interface between the film and the substrate. [Plot axes: x (μm) from −1.3 to +1.3.]

We simulated this case at λ0 = 650 nm with an FDTD mesh of dimensions Lx = Ly = 8λ0; the front facet of the metal film was at z = 280 nm beyond the focal plane. Figure 43.9 shows computed plots of reflected intensity and phase from the pit of Figure 43.3(e) when the pit center is displaced by Δx = 250 nm from the center of the focused spot. The computed values of integrated intensity in this case are Px = 0.77, Py = 0.022. A comparison of |Ex|² distributions in Figures 43.8 and 43.9 reveals that, whereas the convex pit of Figure 43.3(d) tends to concentrate the incoming rays toward the pit center, the concave pit of Figure 43.3(e) disperses these rays away from the center.

Transmission through small apertures

Figure 43.3(f) shows a 20 nm-thick metal film (n + ik = 2.0 + i7.0) with an air-filled hole at the center, coated on a glass substrate of index n = 1.5. The

Figure 43.11 Same as Figure 43.10 but with evanescent field components filtered out. [Plot axes: x (μm) from −1.3 to +1.3.]

hole is either a 400 nm-diameter circular aperture, or a bowtie-shaped aperture 400 nm long and 300 nm wide. In our FDTD simulations of these apertures the wavelength was λ0 = 650 nm, the mesh size was Lx = Ly = 12λ0, the front facet of the metal film was at z1 = 77 nm beyond the focal plane, and the distance from the rear facet of the film to the plane in which the transmitted beam is observed was z2 = 20 nm. Figure 43.10 shows computed plots of the transmitted intensity and phase for the sample of Figure 43.3(f) containing a circular aperture. Note that, despite its large absorption coefficient, the metal film is not thick enough to completely block the incident beam. Thus, in addition to the light that passes through the aperture, a weak ring of light is also transmitted through the film. The integrated values of transmitted intensity are Px = 0.35, Py = 0.0086. Since the focused cone of light consists of p- as well as s-polarized rays, the difference in sample reflectivity for these differently polarized rays at oblique incidence is partly responsible for the elongated shape of the transmitted intensity profile in Figure 43.10(a). The proximity of the observation plane to the aperture ensures that the transmitted field contains a mixture of propagating as well as evanescent

Figure 43.12 Plots of transmitted log_intensity_3 (top) and phase (bottom) through the 20 nm-thick metal film depicted in Figure 43.3(f). The bowtie aperture at the center of the film is 400 nm long along X and 300 nm wide along Y; the rectangular neck of the bowtie is 60 nm wide. The panels on the left-hand side correspond to Ex, while those on the right-hand side represent the Ey component of the transmitted field. [Plot axes: x (μm) from −1.3 to +1.3.]

plane waves. If these evanescent components are filtered out, then the remaining field will propagate undiminished to the far field. The filtered field in the same observation plane (i.e., at z2 = 20 nm beyond the interface between the metal film and the substrate) is shown in Figure 43.11. The integrated intensities of the X- and Y-components of polarization in these calculations are found to be Px = 0.31, Py = 0.002. For the bowtie aperture in the thin-film sample of Figure 43.3(f), computed plots of transmitted intensity and phase are shown in Figure 43.12. The computed integrated intensities in this case are Px = 0.194, Py = 0.04. When the evanescent content of the transmitted field is filtered out, the distributions shown in Figure 43.13 are obtained. (The integrated intensity values now drop to Px = 0.093, Py = 0.005.) Note that the bowtie shape of the aperture is no longer discernible in the filtered transmitted beam, ostensibly because the fine features of this aperture contribute primarily to the evanescent field.
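One common way to filter out evanescent components, sketched below, is to Fourier transform the sampled field in the observation plane, discard every plane-wave component whose transverse spatial frequency exceeds n/λ0 (equivalently, kx² + ky² > (2πn/λ0)²), and transform back. Whether the filtering in this chapter was done in exactly this way, and which refractive index n sets the cutoff, is not stated in the text; the function name, the default n = 1, and the two-component test field are assumptions of this illustration.

```python
import numpy as np

def remove_evanescent(field, dx, wavelength, index=1.0):
    """Zero every plane-wave component with fx^2 + fy^2 > (index / wavelength)^2.

    field      : 2-D complex array sampled on a square grid of spacing dx
    wavelength : free-space wavelength, in the same length unit as dx
    index      : refractive index of the medium that defines the cutoff (assumed)
    """
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, d=dx)          # cycles per unit length along X
    fy = np.fft.fftfreq(ny, d=dx)          # cycles per unit length along Y
    fxx, fyy = np.meshgrid(fx, fy)
    propagating = fxx**2 + fyy**2 <= (index / wavelength) ** 2
    return np.fft.ifft2(np.fft.fft2(field) * propagating)

# Illustrative field on a 6.4 um x 6.4 um grid at 650 nm: one propagating
# component (0.625 cycles/um) plus one evanescent component (3.125 cycles/um).
n_pix, dx = 128, 0.05
x = np.arange(n_pix) * dx
xx, yy = np.meshgrid(x, x)
propagating_part = np.exp(2j * np.pi * 0.625 * xx)
evanescent_part = 0.5 * np.exp(2j * np.pi * 3.125 * xx)
filtered = remove_evanescent(propagating_part + evanescent_part, dx, wavelength=0.65)

power_before = np.mean(np.abs(propagating_part + evanescent_part) ** 2)
power_after = np.mean(np.abs(filtered) ** 2)
print(f"fraction of power retained: {power_after / power_before:.3f}")
```

Fine spatial detail, such as the narrow neck of the bowtie, resides in the high spatial frequencies that this mask removes, which is consistent with the aperture shape disappearing from the filtered images.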

Figure 43.13 Same as Figure 43.12 but with evanescent field components filtered out. (Panels (a) to (d); the horizontal axis of each panel is x, spanning −1.3 μm to +1.3 μm.)

When the bowtie aperture was rotated 90° in the plane of the metallic film (to make the incident E-field perpendicular to the line that connects the sharp ends of the triangles), the computed integrated intensities dropped to Px = 0.1, Py = 0.012 before filtering and Px = 0.047, Py = 0.0035 after filtering. The transmission efficiency of the bowtie aperture is thus seen to drop by nearly a factor of 2.0 when the incident polarization goes from being parallel to the line that connects the sharp ends of the triangles to being perpendicular to it.

References for Chapter 43

1 K. S. Yee, IEEE Trans. Antennas and Prop. 14, 302–307 (1966).
2 A. Taflove and S. C. Hagness, Computational Electrodynamics, Artech House, Norwood, MA (2000).

44 The Ronchi test

In the 1920s Vasco Ronchi developed the well-known method of testing optical systems now named after him.1,2 The essential features of the Ronchi test may be described by reference to Figure 44.1. A lens (or more generally, an optical system consisting of a number of lenses and mirrors) is placed in the position of the “object under test”. The lens is then illuminated with a beam of light, which, for the purposes of the present chapter, will be assumed to be coherent and quasi-monochromatic. These restrictions on the beam may be substantially relaxed in practice.3 The lens brings the incident beam to a focus in the vicinity of a diffraction grating, which is placed perpendicular to the optical axis, i.e., the Z-axis. The grating, also referred to as a Ronchi ruling, may be as simple as a low-frequency wire grid or as sophisticated as a modern short-pitched, phase/amplitude grating. The position of the grating should be adjustable in the vicinity of focus, so that it may be shifted back and forth along the optical axis. The grating breaks up the incident beam into multiple diffracted orders, which will subsequently propagate along Z and reach the lens labeled “pupil relay” in Figure 44.1. The pupil relay may simply be the lens of the eye, which projects the exit pupil of the object under test onto the retina of the observer. Alternatively, it may be a conventional lens that creates a real image of the exit pupil on a screen or on a CCD camera. The diffracted orders from the grating will be collected by the relay lens and, within their overlap areas, will create interference fringes characteristic of the aberrations of the optical system under consideration. By analyzing these fringes, one can determine the type and, with some effort, the magnitude of the aberrations present at the exit pupil of the system. 
The above description of the Ronchi test relies on its modern interpretation; this is based on our current understanding of physical optics and the theory of diffraction gratings. Historically, however, the gratings used in the early days

Figure 44.1 A beam of coherent, quasi-monochromatic light is brought to focus by an optical system that is undergoing tests to determine its aberrations. A diffraction grating, placed perpendicular to the optical axis in the vicinity of focus, breaks up the incident beam into several diffraction orders. The diffracted orders propagate, independently of each other, and are collected by a pupil relay lens, which forms an image of the exit pupil of the object under test at the observation plane. (Diagram labels, in order along the optical axis: object under test, grating (Ronchi ruling), pupil relay, observation plane.)

were quite coarse, and the results obtained with them required no more than a simple geometric-optical theory for their interpretation. Typically, one would place the eye at the focus of the lens and hold a grating (e.g., a wire grid) in front of the eye, moving the grating in and out until a clear pattern became visible. At this point the beam would be illuminating several of the wires simultaneously. By looking through the grating and observing the shadows that the wires cast on the exit pupil, one could determine the type of aberration present in the system. The coarseness of the grating, of course, caused several of the diffracted orders (as we understand them today) to overlap each other, thus resulting in reduced contrast and smearing of the patterns near the boundaries. These problems were eventually overcome when finer gratings became available and the diffraction theory of the Ronchi test was better understood.

Choosing an appropriate grating

For best results the pitch of the grating should be chosen such that, as shown in Figure 44.2, no more than two diffraction orders will overlap at any given point. To determine the appropriate grating period P, one needs to know the wavelength λ0 of the beam used for testing, and the numerical aperture NA of the focused cone of light. (By definition, NA = sin θ, where θ is the half-angle subtended by the exit pupil of the lens at its focal point. If the lens under test is being used at full aperture, NA will also be equal to 0.5 divided by the lens’s f-number.) To avoid multiple overlaps among diffracted orders, the angular separation between adjacent orders must be no smaller than the focused cone’s half-angle. Now, it is well known in the theory of diffraction gratings that, at normal incidence, sin θn = nλ0/P, where n, an integer, is the order of diffraction and θn is the corresponding deviation angle from the surface normal. Adjacent orders are therefore separated by λ0/P in sin θ, and requiring this separation to be no less than NA leads to the conclusion that P should be less than or equal to λ0/NA.
For example, assume that the lens under test has a

Figure 44.2 Several diffracted orders in the far field of the grating of Figure 44.1. When the grating’s period is chosen properly, each diffracted order (i.e., emergent cone of light) will overlap only with its nearest neighbors. Except for a lateral shift in position, the various orders are identical, carrying the amplitude and phase distribution of the beam as it appears at the exit pupil of the object under test. (The emergent cones in the diagram are labeled −2, −1, 0, 1, 2.)

Figure 44.3 Distribution of intensity at the observation plane of Figure 44.1 in the absence of aberrations. The pupil relay lens is chosen to have the same numerical aperture as the object under test, thereby limiting the collected light to the zeroth-order beam and to those portions of the first-order beams that overlap the zeroth-order beam.

numerical aperture NA = 0.5. Then, if the grating period is chosen to be 2λ0, the first-order beams will deviate from the zero-order beam by 30°, making the +first-order beam just touch the −first-order beam in the far field. Figure 44.3 shows the computed intensity distribution at the observation plane of an aberration-free system in which the relay lens has the same numerical aperture as the lens under test (NA = 0.5). This equality of the numerical apertures means that only the zeroth-order diffracted beam will be fully transmitted to the observation plane. Of the first-order beams, only those portions that overlap the zero order will reach the observation plane. The period of the grating in this example was chosen to be a little less than λ0/NA, leaving a small gap between the +first order and the −first order. The absence of aberrations means that the phase distribution over the cross-sections of the various diffracted orders is uniform and, therefore, no interference fringes are to be expected.
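The grating-period rule and the numerical example above can be checked with a few lines of arithmetic. This is a minimal sketch; the function names max_grating_period and order_angle_deg are introduced here for illustration, and lengths are expressed in units of λ0.

```python
import math

def max_grating_period(wavelength, na):
    """Largest period P for which adjacent orders are separated by at least NA in sin(theta)."""
    return wavelength / na

def order_angle_deg(n, wavelength, period):
    """Deviation angle of order n at normal incidence, from sin(theta_n) = n*wavelength/P.

    Returns None when |n*wavelength/P| > 1, i.e. when the order does not propagate.
    """
    s = n * wavelength / period
    if abs(s) > 1.0:
        return None
    return math.degrees(math.asin(s))

wavelength = 1.0                                   # work in units of lambda_0
na = 0.5                                           # numerical aperture of the lens under test
period = max_grating_period(wavelength, na)        # equals 2 wavelengths
theta_1 = order_angle_deg(1, wavelength, period)   # first-order deviation, about 30 degrees
half_angle = math.degrees(math.asin(na))           # half-angle of the focused cone

# With P = 2*lambda_0 the first-order deviation equals the cone half-angle,
# so the +1 and -1 cones just touch on the axis.
orders_touch = math.isclose(theta_1, half_angle)
```

Choosing a period slightly below max_grating_period, as in Figure 44.3, makes theta_1 a little larger than the cone half-angle and opens a small gap between the two first-order cones.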


Ronchigrams for primary or Seidel aberrations

Figure 44.4 shows the computed patterns of intensity distribution at the observation plane of Figure 44.1 corresponding to different types of primary (Seidel) aberrations of the lens. For these calculations we fixed the distance between the lens under test and the relay lens and then placed the grating at the paraxial focus of the converging wavefront. The pattern in Figure 44.4(a) was obtained when we assumed the presence of three waves of curvature (or defocus) at the exit pupil of the lens. Different amounts of defocus would create essentially the same pattern, albeit with a different number of fringes. In Figure 44.4(b) we observe the fringes arising from the presence of three waves of third-order spherical aberration in the test system. The shapes of these fringes depend not only on the magnitude of the aberration but also on the position of the grating relative to the focal plane. (We will have more to say about this point later.) Figure 44.4(c) shows the fringes that would arise when three waves of primary astigmatism are present. When the orientation of the astigmatism changes, the fringes will remain straight lines but their orientation within the observation plane will change accordingly. The last three frames in Figure 44.4 represent the effects of third-order coma. A change in orientation of this aberration causes the interference pattern to change drastically. Figures 44.4(d)–(f) correspond to three waves of coma oriented at 0°, 45°, and 90°, respectively.

Figure 44.4 Computed plots of intensity distribution at the observation plane of Figure 44.1. The lens under test is assumed to have three waves of primary (Seidel) aberrations; the grating is at the nominal focal plane of the lens. (a) Defocus, (b) spherical aberration, (c) astigmatism oriented at 45°, (d) coma at 0°, (e) coma at 45°, (f) coma at 90°.
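The qualitative shape of such Ronchigrams can be reproduced with a simple scalar model, sketched below. It is not necessarily the method used to compute Figure 44.4. The sketch assumes that each diffracted order is a laterally shifted copy of the exit-pupil distribution (as stated in the caption of Figure 44.2), that the grating sits at focus with its lines parallel to Y, and that it is an amplitude grating with a 50% duty cycle, so that the order amplitudes are 1/2 and 1/π for the zeroth and first orders. The grating type, the sign convention of the phase, the function names, and the parameter shear (the order spacing in units of the pupil radius, equal to λ0/(P·NA)) are assumptions of this sketch.

```python
import numpy as np

def seidel(x, y, kind, waves=3.0, angle_deg=0.0):
    """Primary aberration, in waves, over normalized pupil coordinates (rho <= 1)."""
    rho2 = x**2 + y**2
    a = np.radians(angle_deg)
    u = x * np.cos(a) + y * np.sin(a)        # rho * cos(phi - angle)
    if kind == "defocus":
        return waves * rho2
    if kind == "spherical":
        return waves * rho2**2
    if kind == "astigmatism":
        return waves * u**2
    if kind == "coma":
        return waves * rho2 * u
    raise ValueError(f"unknown aberration: {kind}")

def ronchigram(kind, waves=3.0, angle_deg=0.0, shear=1.05, n=257):
    """Intensity at the observation plane for orders -1, 0, +1 of the grating."""
    t = np.linspace(-1.0, 1.0, n)
    x, y = np.meshgrid(t, t)
    relay = x**2 + y**2 <= 1.0               # relay lens with the same NA as the test lens
    field = np.zeros_like(x, dtype=complex)
    for order in (-1, 0, 1):
        amplitude = 0.5 * np.sinc(order / 2.0)   # 50% duty-cycle amplitude grating (assumed)
        xs = x - order * shear                   # laterally shifted copy of the pupil
        inside = xs**2 + y**2 <= 1.0
        phase = 2.0 * np.pi * seidel(xs, y, kind, waves, angle_deg)
        field += amplitude * inside * np.exp(1j * phase)
    return relay * np.abs(field) ** 2

# The six cases listed in the caption of Figure 44.4, three waves each.
panels = {
    "a": ronchigram("defocus"),
    "b": ronchigram("spherical"),
    "c": ronchigram("astigmatism", angle_deg=45.0),
    "d": ronchigram("coma", angle_deg=0.0),
    "e": ronchigram("coma", angle_deg=45.0),
    "f": ronchigram("coma", angle_deg=90.0),
}
```

A shear slightly larger than 1 corresponds to a grating period a little less than λ0/NA, so that at most two orders overlap at any point of the observation plane. Setting waves to zero yields uniform intensity within each overlap region, in agreement with the fringe-free pattern described for Figure 44.3.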


Sliding the grating along the optical axis

A change in the position of the grating relative to the focal plane influences the observed fringe pattern. We limit our discussion to the case of spherical aberration, although similar analyses could be performed for other aberrations. Assuming three waves of spherical aberration as before, we obtain the patterns displayed in Figure 44.5 as we slide the grating along the optical axis in the system of Figure 44.1. Once again, we have taken the lens under test to have NA = 0.5 and f = 6000λ0. The paraxial focus of the lens under test coincides with the front focal point of the relay lens, and the grating is shifted by different amounts Δz relative to this common focus. Frames (a)–(f) in Figure 44.5 correspond to different values of Δz, starting at Δz = −10λ0 in (a) and moving forward to Δz = +25λ0 in (f). In the process, as the grating moves through paraxial focus and towards marginal focus, we observe a rich variety of patterns that aid us in determining the nature and the magnitude of the aberration. To be sure, the Ronchi test is not the only scheme used during the fabrication and evaluation of optical systems; several other tests exist and their relative

Figure 44.5 Computed plots of intensity distribution at the observation plane of Figure 44.1, showing the patterns obtained by sliding the grating along the optical axis. The lens under test (NA = 0.5, f = 6000λ0) is assumed to have three waves of primary spherical aberration, and its paraxial focus is coincident with the focal point of the relay lens. The grating is moved along the optical axis by an amount Δz relative to the (common) focal plane; positive distances are towards the marginal focus. (a) Δz = −10λ0, (b) Δz = 0, (c) Δz = 10λ0, (d) Δz = 15λ0, (e) Δz = 20λ0, (f) Δz = 25λ0.


merits have been expounded in the literature.3 It is useful here to examine some of these alternative methods and to compare the resulting patterns (interferograms or otherwise) with those obtained with the Ronchi test.

Testing by interfering with a reference plane wave

Figure 44.6 shows the schematic diagram of a Mach–Zehnder interferometer, which is one among many that can be used to evaluate the aberrated wavefronts directly. In this system a coherent monochromatic beam of light is sent through the lens under test, is collected and recollimated by a well-corrected lens, and is made to interfere with a reference beam that has been split off the incident wavefront. The flat mirror shown in the lower left side of the interferometer is mounted on a tip–tilt stage that allows the introduction of a small amount of tilt in the reference beam. Figure 44.7 shows the computed patterns of intensity distribution at the observation plane of the Mach–Zehnder interferometer corresponding to three waves of primary coma. In obtaining the various frames of Figure 44.7 we have fixed all the system parameters and only varied the tilt of the reference beam. Note that the characteristic fringes of coma in Figure 44.7 are quite different from those of coma in the Ronchi test, shown in Figures 44.4(d)–(f). Incidentally, the patterns of Figure 44.7 show similarities with the Ronchigrams of spherical aberration displayed in Figure 44.5. This is not a coincidence; it is rooted in the algebraic forms of the aberration function for third-order coma (ρ³ cos φ) and

Figure 44.6 Schematic diagram of a Mach–Zehnder interferometer that might be set up for a direct measurement of wavefront aberrations. The pupil relay lens (itself free from aberrations) forms at the observation plane an image of the exit pupil of the lens under test. A fraction of the incident beam is diverted from its original path and sent to the observation plane by means of the various mirrors and beam-splitters. The observed fringes are characteristic of the aberrations present at the exit pupil of the lens under test. A small tilt of the mirror shown at the lower left side of the figure would introduce a linear phase shift on the reference beam. This tilt is generally useful in producing signature fringe patterns at the observation plane. (Diagram labels: two beam-splitters, object under test, pupil relay, two mirrors, observation plane.)

Figure 44.7 Computed plots of intensity distribution at the observation plane of Figure 44.6. The lens under test (NA = 0.5, f = 6000λ0) is assumed to have three waves of primary coma, and its nominal focus is coincident with the focal point of the relay lens. The tilt angle ψ of the reference beam increases progressively from (a) to (f): (a) ψ = −0.1°, (b) ψ = 0°, (c) ψ = 0.05°, (d) ψ = 0.07°, (e) ψ = 0.1°, (f) ψ = 0.18°.

spherical aberration (ρ⁴) and also in the fact that a Ronchigram, being a kind of shearing interferogram (albeit with a large shear), is related to the derivative of the wavefront aberration function.

Knife-edge and wire tests

A schematic diagram of the knife-edge method of testing optical systems is shown in Figure 44.8. A geometric-optical interpretation of this test suffices for most practical purposes: the knife-edge blocks different groups of rays in its various positions along the optical axis, allowing the remaining rays to reach the observation plane.3 Another method of testing, known as the wire test, is quite similar to the knife-edge method, being obtained from it by substituting for the knife-edge a length of fine wire.3 Since the grating in the Ronchi test may be thought of as a series of parallel knife-edges or, more aptly, a series of parallel wires, it should not come as a surprise that similarities exist between Ronchigrams and the patterns observed in these other tests. In fact, early attempts at explaining the results of Ronchi’s method were based on geometrical optics, and considered the grating as a set of parallel wires whose shadows produced the observed patterns.4 We will not delve into these matters, but simply draw the reader’s attention to Figures 44.9 and 44.10, where we

Figure 44.8 In the knife-edge test a certain region in the vicinity of focus is blocked by a knife-edge; the nature and the magnitude of the aberrations are then inferred from the resulting patterns of intensity distribution at the observation plane. (The knife-edge may be moved both along and perpendicular to the optical axis.) The wire test is similar to the knife-edge test except that a fine wire is used instead, to block certain groups of rays. (Diagram labels, in order along the optical axis: object under test, knife-edge, pupil relay, observation plane.)

Figure 44.9 Computed plots of intensity distribution at the observation plane of Figure 44.8 corresponding to the knife-edge test carried out with a laser beam. The lens under test (NA = 0.5, f = 6000λ0) and the pupil relay lens (NA = 0.5) are assumed to be fixed in their respective positions, while the knife-edge moves along the optical axis. (The tip of the knife remains on the axis at all times.) The lens under test is assumed to have three waves of primary spherical aberration. In frames (a) to (d) the distance of the knife-edge from paraxial focus is Δz = −15λ0, 0, +15λ0, and +20λ0, respectively. (Positive distances are in the direction of the marginal focus.)

Figure 44.10 Computed plots of intensity distribution at the observation plane of Figure 44.8 corresponding to the wire test with an extended, quasi-monochromatic light source. The lens under test (NA = 0.5, f = 6000λ0) has three waves of primary spherical aberration. The assumed wire diameter is 15λ0, which is comparable to the size of the image of the extended light source, as measured in the vicinity of focus. In (a) the wire is centered on axis and is 25λ0 away from paraxial focus (in the direction of the marginal focus). In (b) the wire is again centered on the axis, but is 20λ0 away from paraxial focus. In (c) the wire has been shifted 0.5λ0 off-axis while its distance from paraxial focus remains at 20λ0.

show several computed patterns of intensity distribution for the knife-edge and wire tests, respectively. The results of the simulated knife-edge test depicted in Figure 44.9 assume a laser as the light source. Consequently, frames (a) and (b) of Figure 44.9 exhibit


several dark lines which, with a less coherent light source, would have been absent. The results of the simulated wire test shown in Figure 44.10 assume an extended light source, since the small amount of spherical aberration present in the system under consideration would render the test useless with a wire which, fine as it may be, will still be wider than the focused spot produced by a laser beam. Note the similarities between the patterns of Figures 44.9 and 44.10 on the one hand, and those of Figures 44.5(d)–(f) on the other.

Extensions of the Ronchi test

Several modifications and extensions of the Ronchi test have appeared over the years, and have helped to solve specific problems in testing of optical systems.3 As an example we mention the double-frequency grating lateral-shear interferometer invented by James Wyant in the early 1970s.5 The grating in this device has two slightly different frequencies, which give rise to two +first-order beams as well as two −first-order beams; the beams in each pair are slightly shifted relative to each other. Moreover, the (average) pitch of the grating is such that there is no overlap between the zeroth, +first and −first orders. Consequently, interference occurs between the two +first-order beams (and, likewise, between the two −first-order beams). One can thus obtain an arbitrarily small lateral shear of the wavefront under test and use the results to achieve accurate quantitative measurements. A two-dimensional version of the double-frequency grating has also been employed to generate lateral wavefront shear simultaneously along the X- and Y-axes. (Remember that beam propagation is along Z and, therefore, X and Y are orthogonal axes in the plane of the grating.) In the absence of a two-dimensional grating, one must rotate a one-dimensional grating by 90° to obtain wavefront shear first along the X- and then along the Y-axis.

References for Chapter 44

1 V. Ronchi, Le Frange di Combinazioni Nello Studio delle Superficie e dei Sistemi Ottici, Riv. Ottica Mecc. Precis. 2, 9 (1923).
2 V. Ronchi, Due Nuovi Metodi per lo Studio delle Superficie e dei Sistemi Ottici, Ann. Sc. Norm. Super. Pisa 15 (1923).
3 D. Malacara, ed., Optical Shop Testing, second edition, Wiley, New York, 1992.
4 G. Toraldo di Francia, Geometrical and interferential aspects of the Ronchi test, in Optical Image Evaluation, National Bureau of Standards Circular 526, issued April 29, 1954.
5 J. C. Wyant, Double frequency grating lateral shear interferometer, Appl. Opt. 12, 2057 (1973).

45 The Shack–Hartmann wavefront sensor

Roland Shack invented the device now known as the Shack–Hartmann wavefront sensor in the early 1970s.1,2 This sensor, which in recent years has been commercialized, measures the phase distribution over the cross-section of a given beam of light without relying on interference and, therefore, does not require a reference beam. The standard method of wavefront analysis is interferometry, where one brings together on an observation plane the beam under investigation (hereinafter the test beam) and a reference beam in order to form tell-tale fringes.3 The trouble with interferometry is that it requires a reference beam, which is not always readily available. Moreover, the coherence length of the light used in these measurements must be long compared with the path-length difference between the reference and test beams. Thus, when the available light source happens to be broad-band, it becomes difficult (though by no means impossible) to produce high-contrast fringes. The Shack–Hartmann instrument solves these problems by eliminating altogether the need for the reference beam.

Wavefront analysis by interferometry

Before embarking on a discussion of the Shack–Hartmann wavefront sensor, it will be instructive to describe the operation of a conventional interferometer. Consider, for instance, the system of Figure 45.1, where a spherical mirror is under investigation. While grinding and polishing the glass blank, the optician frequently performs this type of test to determine departures of the surface from the desired figure. A point source reflected from a 50/50 beam-splitter is used to illuminate the test mirror. Before arriving at the mirror, however, the beam is partially reflected from the spherical surface of a plano-convex lens attached to the front facet of the beam-splitter cube (i.e., the spherical cap). The center of curvature of this spherical cap is at C, which is also the virtual image of the point

Figure 45.1 The Shack cube is used here to measure the surface quality of a spherical mirror. The cube is a 50/50 beam-splitter capped by an index-matched plano-convex lens. The light from the point source is partially reflected from this spherical cap, producing a reference beam that comes to focus at C. The beam that passes through the cap illuminates the test mirror, then returns and crosses the cube and is focused at C′. The interference pattern between the test and reference beams is viewed at the observation plane. The cube’s axis is slightly displaced from the axis of the mirror in order to separate C from C′, which is needed for producing straight-line fringes. (Diagram labels: monochromatic light source, condenser, pinhole, spherical cap, C, C′, beam-splitter cube, observation plane, test mirror.)

source in the beam-splitter’s half-silvered mirror. The light reflected from the spherical cap (and focused at C) forms the reference beam. (Incidentally, this interferometer was also invented by Roland Shack in the 1970s, and is now known as the Shack cube.4) Note in Figure 45.1 that the pinhole is placed directly on the face of the beam-splitter to eliminate possible aberrations of the beam upon entering and exiting the cube. The reflectivity of the spherical cap is about 4%, which is similar to that of the uncoated test mirror. The equal-strength test and reference beams thus produce a high-contrast fringe pattern. Figure 45.2(a) shows a typical phase distribution over the cross-section of a test beam reflected from a mirror having several waves of aberration. The computed interference pattern between this and an equal-strength reference beam is shown in Figure 45.2(b). Needless to say, the fringe contrast is excellent and the observed fringes may be related directly to the wavefront aberrations. In general, the coherence length of the light source must be long enough to ensure that, at the observation plane, the test and reference

Figure 45.2 (a) A typical phase distribution in the cross-section of a monochromatic test beam (wavelength λ). The gray-scale covers the range from −180° (black) to +180° (white). (b) The interference pattern obtained by adding the test beam in (a) to a reference plane wave of equal magnitude. (The horizontal axis of each panel is x/λ, spanning −3100 to 3100.)

beams remain mutually coherent. For testing small mirrors having a short focal length (say, less than 10 cm) a single radiation line of an arc lamp may suffice, but for larger mirrors a long-coherence-length laser is usually necessary. In practice, the center of curvature of the test mirror is slightly displaced from C, as shown in Figure 45.1, so that the rays bouncing off the mirror and arriving at the exit facet of the cube would converge not to C, but to a nearby point C′. This small lateral displacement of the test beam relative to the reference beam produces straight-line fringes in the interferogram at the observation plane. Such fringes are very sensitive to small aberrations of the mirror, and their deviation from linearity can be related easily to minute surface errors. If the errors are large, however, there is no need for straight-line fringes, and the center of the test mirror can coincide with C. The combination of the cube beam-splitter and the spherical cap may be considered a thick lens. This lens projects a real image of the test mirror in the


space behind the cube. The best place to observe the fringes, therefore, is at the location of this image, where the fringes are localized on the mirror, and the observer can readily identify areas that need further grinding and polishing. Another advantage is that scratches and dust particles on the mirror come to focus at its image, thus eliminating spurious fringes of the scattered light that downgrade the quality of the interferograms obtained at other locations in the image space. (The spherical cap, of course, must be kept clean at all times to prevent dust particles that have collected there from producing their own spurious fringes.) The test mirror depicted in Figure 45.1 does not have to be spherical, but may be a mild paraboloid or hyperboloid whose center of curvature is, as before, placed at or near the point C. The departure of the mirror’s figure from sphericity imparts a certain amount of spherical aberration to the test beam, which may be calculated in advance. The optician then looks for aberrations above and beyond this expected amount of spherical aberration in order to determine the necessary corrections. Large telescope mirrors may also be tested with a Shack cube, but they require the use of an additional lens system known as a null-corrector.3 A telescope’s primary mirror is generally a large paraboloid or hyperboloid designed for operation at “infinite conjugate”, that is, it brings the collimated beam of a distant star to focus within the mirror’s focal plane. Testing such a large mirror with a collimated beam is impractical, however, and its actual departure from sphericity is too severe to be simply subtracted from the interferograms obtained in the system of Figure 45.1. In such situations, a null-corrector is designed to cancel the spherical aberrations of a test beam originating from a point source located at the mirror’s center of curvature. 
When a properly calibrated null-corrector is inserted between the Shack cube and the test mirror, the observed interferogram registers only the departure of the mirror from its desired figure.

The Shack–Hartmann wavefront sensor

Figure 45.3 shows a schematic of the Shack–Hartmann wavefront sensor. This device is in many ways superior to conventional interferometers: since it does not require a reference beam, it is simpler to align and to operate and, because it does not rely on interference, it may be used with a white light source. At the heart of the instrument are a lenslet array and a charge-coupled device (CCD) camera. The lenslets are identical and have a fairly large f-number, and the CCD detector, whose number of pixels is much greater than the number of lenslets, is placed at the focal plane of the lenslet array. Upon entering the system, the (aberrated) test beam is collimated and expanded or reduced, as necessary, to match the dimensions of the lenslet array. The array

Figure 45.3 The basic elements of a Shack–Hartmann wavefront sensor. Upon entering the system, the (aberrated) test beam is processed to yield a collimated beam with a diameter that matches the dimensions of the lenslet array. Each lenslet captures a fraction of the beam and brings it to focus within its focal plane, where a CCD camera monitors the intensity distribution. The CCD has many more pixels than there are lenslets, so the location of each focused spot is determined simply by identifying the illuminated pixel within the relevant subarray of pixels. The complete wavefront may be reconstructed from knowledge of the positions of individual focused spots. (Diagram labels, in order along the beam path: test beam, beam expander, lenslet array, CCD array.)

typically consists of n × n identical lenslets, where n is around 100. Each lenslet thus acts on a small patch of the wavefront and brings it to focus within its focal plane. Located at this focal plane is the CCD array with m × m pixels, where m is typically around 1000. Consequently, each focused spot is assigned an exclusive sub-array of the CCD containing (m/n) × (m/n) pixels. If the patch of the wavefront captured by a given lenslet happens to have uniform phase, it forms a bright spot at the focal point of the lenslet, that is, at the center of the corresponding CCD sub-array. However, if the phase distribution at the lenslet’s pupil happens to be nonuniform, the focused spot appears at a different location in the focal plane (but still within the associated CCD sub-array). Figure 45.4 shows the computed distribution of intensity at the focal plane of a 6 × 6 lenslet array illuminated by the test beam of Figure 45.2(a). It may be verified that the center of each focused spot is shifted away from its respective frame’s center by an amount proportional to the slope of the corresponding segment of the incident wavefront. In practice, individual lenslets are very small compared to the test beam’s diameter. Consequently, the incident phase distribution at the pupil of any given lenslet may be approximated by a linear function of the pupil coordinates. It is thus clear that the aberrations of each focused spot are negligible; all one needs to know is the shift of the spot from the focal point of its associated lenslet, which is readily obtained by examining the CCD’s output. So long as the focused spots remain within their allotted sub-array of detectors, the system can compute the local slope of the wavefront at the entrance pupil of each and every lenslet.
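The conversion from spot positions to local slopes described above can be sketched as follows. This is a minimal illustration under stated assumptions: the spot position is taken as the intensity-weighted centroid of each sub-array (a common refinement, used here in place of the single illuminated pixel), and the slope follows from the small-angle relation that the spot shift equals the lenslet focal length times the local wavefront slope. The function names, the 11-pixel sub-arrays, the 10 μm pixel size, and the 150 mm focal length in the demonstration are hypothetical values chosen for the example.

```python
import numpy as np

def spot_centroids(image, n_lenslets):
    """Centroid of the spot in each sub-array, in pixels, relative to the sub-array center."""
    m = image.shape[0] // n_lenslets            # pixels per lenslet along each axis
    c = np.arange(m) - (m - 1) / 2.0            # pixel coordinates measured from the center
    dx = np.zeros((n_lenslets, n_lenslets))
    dy = np.zeros((n_lenslets, n_lenslets))
    for i in range(n_lenslets):
        for j in range(n_lenslets):
            sub = image[i * m:(i + 1) * m, j * m:(j + 1) * m]
            total = sub.sum()
            if total > 0:
                dx[i, j] = (sub.sum(axis=0) * c).sum() / total   # shift along columns (x)
                dy[i, j] = (sub.sum(axis=1) * c).sum() / total   # shift along rows (y)
    return dx, dy

def local_slopes(dx_pix, dy_pix, pixel_size, focal_length):
    """Small-angle wavefront slopes (radians): spot shift divided by the focal length."""
    return dx_pix * pixel_size / focal_length, dy_pix * pixel_size / focal_length

# Hypothetical 2 x 2 lenslet sensor with 11 x 11 pixels behind each lenslet.
image = np.zeros((22, 22))
image[5, 8] = 1.0        # lenslet (0, 0): spot shifted 3 pixels along x
image[13, 16] = 1.0      # lenslet (1, 1): spot shifted -3 pixels along y
dx_pix, dy_pix = spot_centroids(image, n_lenslets=2)
slope_x, slope_y = local_slopes(dx_pix, dy_pix, pixel_size=10e-6, focal_length=0.15)
```

Sub-arrays that receive no light are left with zero shift in this sketch; a practical implementation would flag them as invalid instead.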

Figure 45.4 Intensity distribution at the focal plane of a 6 × 6 lenslet array, when the incident beam is assumed to have the phase distribution of Figure 45.2(a). Each square lenslet is 1000λ × 1000λ in size, and has focal length f = 25 000λ. The logarithmic plot of intensity shown here reveals the fine detail of the distribution at the CCD array. In practice, the fine detail is rather faint and only the center of each spot is detected by the CCD. (The horizontal axis is x/λ, spanning −3000 to 3000.)
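Once a local slope is available for every lenslet, the slopes can be combined into a wavefront map. One common way of doing so, sketched below, is a least-squares fit in which the difference between neighboring wavefront samples is set equal to the lenslet pitch times the average of the two neighboring slopes. The text does not specify a reconstruction algorithm, so this choice, the function name reconstruct_wavefront, and the synthetic test wavefront are assumptions of the sketch; only the 6 × 6 array and the 1000λ lenslet pitch are taken from the caption of Figure 45.4.

```python
import numpy as np

def reconstruct_wavefront(sx, sy, pitch):
    """Least-squares wavefront (up to a constant) from slopes sampled on a square grid.

    sx, sy : arrays of local slopes dW/dx and dW/dy, one value per lenslet
    pitch  : center-to-center spacing of the lenslets
    """
    n_rows, n_cols = sx.shape
    index = np.arange(n_rows * n_cols).reshape(n_rows, n_cols)
    equations, rhs = [], []
    for i in range(n_rows):
        for j in range(n_cols):
            if j + 1 < n_cols:                       # neighbor along x
                row = np.zeros(n_rows * n_cols)
                row[index[i, j + 1]] = 1.0
                row[index[i, j]] = -1.0
                equations.append(row)
                rhs.append(0.5 * pitch * (sx[i, j] + sx[i, j + 1]))
            if i + 1 < n_rows:                       # neighbor along y
                row = np.zeros(n_rows * n_cols)
                row[index[i + 1, j]] = 1.0
                row[index[i, j]] = -1.0
                equations.append(row)
                rhs.append(0.5 * pitch * (sy[i, j] + sy[i + 1, j]))
    w, *_ = np.linalg.lstsq(np.array(equations), np.array(rhs), rcond=None)
    return w.reshape(n_rows, n_cols)

# Synthetic check on a 6 x 6 grid with pitch 1000 (lengths in units of the wavelength).
pitch = 1000.0
j, i = np.meshgrid(np.arange(6), np.arange(6))
x, y = j * pitch, i * pitch
true_w = 2e-3 * x - 1e-3 * y + 1e-6 * (x**2 + y**2)   # tilt plus defocus, in wavelengths
sx = 2e-3 + 2e-6 * x                                  # exact dW/dx of true_w
sy = -1e-3 + 2e-6 * y                                 # exact dW/dy of true_w
estimate = reconstruct_wavefront(sx, sy, pitch)
error = (estimate - estimate.mean()) - (true_w - true_w.mean())
```

The constant (piston) term cannot be recovered from slopes, which is why the comparison is made after subtracting the mean of each map. For the quadratic test wavefront used here the averaged-slope differences are exact, so the residual error is at the level of rounding.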

The local slopes are then patched together to reconstruct the complete phase distribution over the cross-section of the test beam.

Historical notes

The predecessor to the Shack–Hartmann sensor was Hartmann’s screen test, which used an array of holes in place of the lenslets.3,5,6,7 Shack realized the advantages of using a lenslet array and set out to fabricate one, since no such array with the characteristics he desired was available at the time. He made a mold by using a cutting tool to carve parallel grooves in a piece of flat glass, as shown in Figure 45.5. Two such pieces of grooved glass, oriented at right angles to each other, were clamped to an acrylic sheet and heated in an oven to mold


Figure 45.5 A piece of flat glass on which identical grooves were carved served as a mold for the early lenslet arrays. Two such pieces were prepared and placed face-to-face across a plastic sheet at right angles to each other. The assembly was then heated in an oven to transfer the pattern of the mold to the plastic sheet. The 1 mm-wide grooves had a depth of only a few micrometers.

convex ribs on each side of the acrylic sheet, thus forming an array of crossed cylindrical lenses. The first such array had 50 × 50 lenslets, each with an area of 1 × 1 mm² and a focal length of 150 mm. Before the advent of CCD detectors in the 1980s, wavefront analysis was done by examining a photographic plate exposed to the array of focused spots. The plate was also exposed (simultaneously and through the same array of lenslets) to a parallel, aberration-free reference beam. The spots formed by this reference beam marked the center of each frame, thus providing reference points for measuring the displacement of the spots formed by the test beam. The tedious task of exposing and developing the photographic plate, followed by painstakingly measuring the positions of individual spots, was rewarding nonetheless; it allowed astronomers to measure the aberrations of their telescopes in the field using unfiltered star light. Even atmospheric turbulence did not pose a serious problem for this method, since its effects were simply averaged over during the relatively long exposure time of the photographic plate.

References for Chapter 45

1 R. V. Shack and B. C. Platt, Production and use of a lenticular Hartmann screen (abstract only), J. Opt. Soc. Am. 61, 656 (1971).
2 R. Riekher, Fernrohre und ihre Meister, Verlag Technik, Berlin, 1990.
3 D. Malacara, Optical Shop Testing, second edition, Wiley, New York, 1992.
4 R. V. Shack and G. W. Hopkins, The Shack interferometer, SPIE 126, Clever Optics, 139–142 (1977).
5 J. Hartmann, Bemerkungen über den Bau und die Justirung von Spektrographen, Z. Instrum. 20, 47 (1900).
6 J. Hartmann, Objektivuntersuchungen, Z. Instrum. 24, 33 (1904).
7 R. Kingslake, The absolute Hartmann test, Trans. Opt. Soc. 29, 133 (1927–1928).

46 Ellipsometry

The goal of ellipsometry is to determine the optical and structural constants of thin films and flat surfaces from the measurements of the ellipse of polarization in reflected or transmitted light [1,2,3,4,5]. In the absence of birefringence and optical activity, a flat surface, a single-layer film, or a thin-film stack may be characterized by the complex reflection coefficients r_p = |r_p| exp(iφ_rp) and r_s = |r_s| exp(iφ_rs) for p- and s-polarized incident beams, as well as by the corresponding transmission coefficients t_p = |t_p| exp(iφ_tp) and t_s = |t_s| exp(iφ_ts) [6]. Strictly speaking, an ellipsometer is a device that measures the complex ratios r_p/r_s and/or t_p/t_s. The amplitude ratios are usually deduced from the angles ψ_r and ψ_t, which are defined by tan ψ_r = |r_p|/|r_s| and tan ψ_t = |t_p|/|t_s|. In practice, measuring the individual reflectivities R_p = |r_p|², R_s = |r_s|² or transmissivities T_p = |t_p|², T_s = |t_s|² does not require much additional effort. Measuring the individual phases, of course, is difficult, but the relative phase angles φ_rp − φ_rs and φ_tp − φ_ts can be readily obtained by ellipsometric methods. The values of R_p, R_s, φ_rp − φ_rs, ψ_r, T_p, T_s, φ_tp − φ_ts and ψ_t may be measured as functions of the angle of incidence, θ, or as functions of the wavelength of the light, λ, or both. The results of ellipsometric measurements are fed to a computer program that searches the space of unknown parameters to find agreement between the measured data points and theoretical calculations [5]. The unknown parameters of the sample usually include thickness, refractive index, and absorption coefficient of one or more layers. In general, the larger the collected data set, the more accurate the estimates of the unknown parameters will be, or the greater the number of unknowns that can be estimated.
The relationship between the measurables and the unknowns is usually nonlinear, and there is no a priori guarantee that the various measurements on a given sample are independent of each other, nor that a given set of measurements is sufficient for determining the unknowns. Powerful numerical algorithms exist that search the space of unknown parameters and find estimates that closely reproduce the measured data.
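The definitions above can be collected into the two complex ratios that an ellipsometer measures. The second expression is only one common way of posing the parameter search, given here as an assumption: the text does not say which merit function is minimized, and the symbols u (vector of unknown sample parameters), M_j (measured values), C_j (calculated values) and σ_j (measurement uncertainties) are introduced purely for illustration.

```latex
\[
\frac{r_p}{r_s} = \tan\psi_r \,\exp\!\bigl[i(\phi_{rp}-\phi_{rs})\bigr],
\qquad
\frac{t_p}{t_s} = \tan\psi_t \,\exp\!\bigl[i(\phi_{tp}-\phi_{ts})\bigr]
\]
\[
\chi^2(\mathbf{u}) \;=\; \sum_j \left[\frac{M_j - C_j(\mathbf{u})}{\sigma_j}\right]^2
\]
```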


The nulling ellipsometer


Figure 46.1 is the diagram of a conventional nulling ellipsometer [4,5]. The quasi-monochromatic light of wavelength λ enters a rotatable polarizer, whose transmission axis may be oriented at an arbitrary angle ρ_p relative to the X-axis. The polarizer’s output is thus a collimated, linearly polarized beam of light with an adjustable E-field orientation. This beam goes through a quarter-wave plate (QWP) whose fast and slow axes are fixed at 45° to the X-axis. (The QWP imparts a relative 90° phase shift to the E-field components along its axes.) The beam emerging from the QWP has equal amplitudes along X and Y, that is, |E_x| = |E_y|. The phase difference between these E-field components is adjustable in accordance with the following relation: φ_x − φ_y = 2(ρ_p − 45°). Reflection from the sample imparts a phase difference φ_rp − φ_rs to the p- and s-components of the beam, which may be cancelled out by properly selecting the

Figure 46.1 In a nulling ellipsometer the collimated beam of light emerging from the source (wavelength λ) is linearly polarized along the direction ρ_p by a rotatable polarizer. The quarter-wave plate’s axes are typically at 45° to the XZ-plane of incidence. Thus the beam incident on the sample has equal amounts of p- and s-polarization, the relative phase between these two components depending on ρ_p. Reflection from the sample induces a phase shift φ_rp − φ_rs between the p- and s-components, which may be cancelled out by adjusting the polarizer’s orientation. Subsequently, the analyzer in the detection arm is rotated to extinguish the light transmitted to the detector. In the null condition, the value of ρ_p yields the sample’s phase shift φ_rp − φ_rs while the analyzer angle ρ_a yields the ellipsometric parameter ψ_r, which is related to the amplitude ratio |r_p|/|r_s| of the reflection coefficients. (Labeled elements: light source, polarizer, quarter-wave plate, lenses, sample, analyzer, detector with output signal S; angle of incidence θ.)


polarizer angle ρ_p. At this point the reflected beam is linearly polarized, its E-field components along X and Y being proportional to |r_p| and |r_s|, respectively. In the reflected path the analyzer, whose transmission axis is also adjustable, is rotated through an angle ρ_a = tan⁻¹(|r_p|/|r_s|) = ψ_r to block the light that would otherwise reach the detector. Thus by measuring the values of ρ_a and ρ_p that null the detector’s signal, one obtains the amplitude ratio |r_p|/|r_s| and the relative phase φ_rp − φ_rs of the sample’s reflection coefficients.

Measuring the sample reflectivities R_p, R_s using a nulling ellipsometer is straightforward; all one needs to do is monitor the detector signal S at ρ_a = 0° and 90°. Calibration requires removing the sample and aligning the arms of the ellipsometer with each other (i.e., θ = 90°), in which case the light from the source goes through the entire system and yields a detector signal corresponding to a 100% sample reflectivity. Optical power fluctuations could be countered by splitting off a small fraction of the beam at the source and monitoring its variations with an auxiliary detector. The signal from the auxiliary detector is subsequently used to normalize the reflectivity signals. Needless to say, the same types of measurement as discussed above, when performed on the transmitted beam, yield the values of T_p, T_s, φ_tp − φ_ts and ψ_t.

Thin film on transparent substrate

Figure 46.2 shows a sample consisting of a thin absorbing layer on a glass substrate. To allow the transmitted beam to exit the substrate without a change in its state of

Figure 46.2 A 25 nm-thick film of complex refractive index n + ik = 4.5 + 1.75i is deposited on a hemispherical glass substrate (n_0 = 1.5). The probe beam has λ = 633 nm and is incident at θ = 60°. To avoid complications arising from reflections or losses at the substrate bottom, the hemispherical surface is antireflection coated. (Labeled in the diagram: incident beam with components E_p and E_s, reflected beam, transmitted beam, angle of incidence θ, film thickness d, substrate.)


polarization, and also to eliminate spurious reflections, an antireflection-coated hemispherical substrate is assumed. The film, which is 25 nm thick, has a complex index of refraction n + ik = 4.5 + 1.75i, and the substrate’s refractive index is n_0 = 1.5. Computed values of the sample’s reflection and transmission characteristics at λ = 633 nm and θ = 60° are: R_p = 29.63%, R_s = 74.83%, φ_rp − φ_rs = 3.95°, ψ_r = 32.18°, T_p = 24.13%, T_s = 6.96%, φ_tp − φ_ts = 1.50°, ψ_t = 61.76°. We now examine the sensitivity of ellipsometric measurements to variations in the sample parameters. For example, if the refractive index n of the film is varied in the range 4.0 to 5.0, the various characteristics of the sample vary as in Figure 46.3. (The variations shown here are relative to the nominal sample characteristics evaluated at n = 4.5.) It is seen that R_p, R_s, ψ_r, ψ_t are more sensitive to changes of n than T_p, T_s, φ_rp − φ_rs and φ_tp − φ_ts. Similarly, Figure 46.4 shows variations of the sample characteristics with changes in k. Here T_p, φ_rp − φ_rs and φ_tp − φ_ts are seen to be more sensitive to k than R_p, R_s, T_s, ψ_r, ψ_t. Figure 46.5 shows variations of the sample characteristics with a changing film thickness d in the range from 20 nm to 30 nm. In this case ψ_t and, to some extent, ψ_r are insensitive to d, but the remaining characteristics are quite sensitive. When all the components of the system are assumed to be perfect, the ellipsometer is sensitive enough to determine accurately the unknown sample parameters. In practice, however, no measurement system is perfect: the polarizer and the analyzer have a finite extinction ratio, allowing a small fraction of

Figure 46.3 Variations of the reflection and transmission characteristics of the sample of Figure 46.2 at λ = 633 nm, θ = 60°, when the film’s refractive index n is varied from 4.0 to 5.0. The changes are relative to the nominal values obtained with n = 4.5. (a) R and T variation (%): ΔR_p, ΔR_s, ΔT_p, ΔT_s versus n. (b) Angle variation (degrees): Δψ_r, Δψ_t, Δ(φ_rp − φ_rs), Δ(φ_tp − φ_ts) versus n.

Figure 46.4 Variations of the reflection and transmission characteristics of the sample of Figure 46.2 at λ = 633 nm, θ = 60°, when the film’s absorption coefficient k is varied from 1.5 to 2. The changes are relative to the nominal values obtained with k = 1.75. (a) R and T variation (%): ΔR_p, ΔR_s, ΔT_p, ΔT_s versus k. (b) Angle variation (degrees): Δψ_r, Δψ_t, Δ(φ_rp − φ_rs), Δ(φ_tp − φ_ts) versus k.

Figure 46.5 Variations of the reflection and transmission characteristics of the sample of Figure 46.2 at λ = 633 nm, θ = 60°, when the film thickness d is varied from 20 nm to 30 nm. The changes are relative to the nominal values obtained with d = 25 nm. (a) R and T variation (%): ΔR_p, ΔR_s, ΔT_p, ΔT_s versus thickness (nm). (b) Angle variation (degrees): Δψ_r, Δψ_t, Δ(φ_rp − φ_rs), Δ(φ_tp − φ_ts) versus thickness (nm).


the undesirable E-field component to pass through; the quarter-wave plate’s retardation deviates from 90°; and the beam that illuminates the sample is not an ideal plane wave but has a finite diameter. Moreover, when the beam is focused on the sample to provide a reasonable spatial resolution, the focused cone of light contains a range of incidence angles, resulting in measured values that are averages over these angles. One consequence of such system imperfections is that, in the “null condition,” a minimum amount of light would still reach the detector. Another consequence is the limited accuracy with which the various reflection and transmission characteristics of the sample are measured.

Performance of the nulling ellipsometer

For the system depicted in Figure 46.1 we show in Figure 46.6 computed plots of the detector signal S versus the angle ρ_a of the analyzer for several values of the polarizer angle ρ_p. The assumed focusing and collimating lenses are identical, having NA = 0.025, which corresponds to a 3° focused cone at the sample.
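As a cross-check on the nominal reflection characteristics quoted for the sample of Figure 46.2, the standard single-film (Airy) reflection formula can be evaluated for a plane wave. The sketch below is not the program used to generate the figures: it ignores the finite cone of the focused beam, treats the substrate as semi-infinite, and assumes a sign convention in which r_p = r_s at normal incidence together with an absorbing index written as n + ik. The function name is hypothetical. Under these assumptions the results should land close to the quoted R_p, R_s, ψ_r and φ_rp − φ_rs.

```python
import cmath
import math

def film_reflection(n_film, d, n_sub, wavelength, theta_deg, n_inc=1.0):
    """Plane-wave (r_p, r_s) for one film on a semi-infinite substrate.

    d and wavelength must share the same length unit.
    Convention (an assumption): r_p = r_s at normal incidence.
    """
    s2 = (n_inc * math.sin(math.radians(theta_deg))) ** 2     # conserved (N sin θ)²
    indices = (complex(n_inc), complex(n_film), complex(n_sub))
    q = [cmath.sqrt(N * N - s2) for N in indices]             # N cos θ in each medium
    phase = cmath.exp(2j * (2 * math.pi * d / wavelength) * q[1])  # round trip in film

    def airy(y):
        # y holds the effective admittance of each medium for one polarization
        r01 = (y[0] - y[1]) / (y[0] + y[1])
        r12 = (y[1] - y[2]) / (y[1] + y[2])
        return (r01 + r12 * phase) / (1 + r01 * r12 * phase)

    r_s = airy(q)                                             # s: admittance N cos θ
    r_p = airy([N * N / qi for N, qi in zip(indices, q)])     # p: admittance N / cos θ
    return r_p, r_s

# Sample of Figure 46.2: 25 nm film, n + ik = 4.5 + 1.75i, substrate index 1.5,
# wavelength 633 nm, 60 degree incidence from air.
r_p, r_s = film_reflection(4.5 + 1.75j, 25.0, 1.5, 633.0, 60.0)
R_p, R_s = abs(r_p) ** 2, abs(r_s) ** 2
psi_r = math.degrees(math.atan2(abs(r_p), abs(r_s)))
dphi_r = math.degrees(cmath.phase(r_p / r_s))
print(f"R_p={R_p:.4f}  R_s={R_s:.4f}  psi_r={psi_r:.2f}  dphi_r={dphi_r:.2f}")
```

The same function can be called over a range of n, k or d to explore the kind of sensitivity shown in Figures 46.3 to 46.5, with the caveat that those figures may include effects not modeled here.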

Figure 46.6 The detector signal S versus the orientation angle ρ_a of the analyzer in the nulling ellipsometer of Figure 46.1 with the sample of Figure 46.2. Different curves correspond to different values of the polarizer angle ρ_p. The total optical power of the unpolarized (or circularly polarized) beam emerging from the source is unity, the detector’s conversion factor is 4, the incidence angle is θ = 60°, and the focusing and collimating lenses have NA = 0.025. In (a) the assumed system is perfect. In (b) there are departures from ideal behavior, namely, the polarizer and analyzer have a 1:100 extinction ratio, the angle of incidence deviates by 1°, and the quarter-wave plate’s retardation is 87° while its axes are 1° away from the ideal 45° orientation. (Both panels: detector signal versus analyzer angle ρ_a from 0° to 180°, with curves labeled by ρ_p.)


In Figure 46.6(a) the assumed system is perfect, while in Figure 46.6(b) errors are incorporated into the various components, namely, the assumed polarizer and analyzer have a 1:100 extinction ratio, the angle of incidence on the sample is θ = 61°, and the QWP’s retardation is 87° while its axes are 1° away from the ideal 45° orientation. The null in Figure 46.6(a) is achieved with ρ_p = 47° and ρ_a = 32.2°, yielding φ_rp − φ_rs = 4° and ψ_r = 32.2°, as expected. Also the detector signals at ρ_a = 0° and 90° are 0.296 and 0.748, which correspond to the correct values of R_p and R_s. In practice, even in this ideal case with perfect components the exact location of the null may not be easy to determine. This produces a certain degree of inaccuracy, depending on the available signal-to-noise ratio at the detector. In the case of Figure 46.6(b), where the assumed components have substantial errors, the minimum signal occurs at ρ_p = 54° and ρ_a = 30°, yielding φ_rp − φ_rs = 18° and ψ_r = 30°. The reflectivities in this case (obtained at ρ_p = 9°, and ρ_a = 0° and 90°) are R_p = 0.308, R_s = 0.727. If we consider the sensitivity curves in Figures 46.3–46.5, such huge errors are clearly unacceptable. A more realistic situation might correspond to small system errors; suppose, for instance, that the polarizer and the analyzer have extinction ratios of 1:1000, the angle of incidence on the sample has a 0.25° error (θ = 60.25°), and the QWP’s retardation is 90.5° while its axes are misaligned by only 0.25°. In this case the minimum signal occurs at ρ_p = 49° and ρ_a = 31.8°, yielding φ_rp − φ_rs = 8° and ψ_r = 31.8°. The reflectivities (obtained at ρ_p = 4°, and ρ_a = 0° and 90°) are R_p = 0.291 and R_s = 0.757. It is thus clear that the nulling ellipsometer requires a high degree of accuracy in its components in order to achieve a reasonable level of confidence in its estimates of sample parameters.
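The nulling principle and the effect of a finite analyzer extinction ratio can be illustrated with a small Jones-vector sketch. It is a simplified plane-wave model under stated assumptions, not a reproduction of Figure 46.6: the beam incident on the sample is taken to have |E_x| = |E_y| = 1/2 (unit-power unpolarized source after an ideal polarizer) with a freely adjustable phase difference delta = φ_x − φ_y, so the mapping from delta to the polarizer angle ρ_p, which depends on sign conventions, is deliberately left out. The function names and the `leak` parameter (power fraction of the blocked component that passes the analyzer) are hypothetical. The analyzer search is restricted to 0° to 90° to select one of two mathematically equivalent nulls.

```python
import cmath
import math

def detector_signal(delta_deg, rho_a_deg, r_p, r_s, leak=0.0, gain=4.0):
    """Detector signal for analyzer angle rho_a and incident phase difference delta."""
    e_x = 0.5 * cmath.exp(1j * math.radians(delta_deg)) * r_p   # p-component after sample
    e_y = 0.5 * r_s                                             # s-component after sample
    c = math.cos(math.radians(rho_a_deg))
    s = math.sin(math.radians(rho_a_deg))
    passed = abs(c * e_x + s * e_y) ** 2       # along the analyzer transmission axis
    blocked = abs(-s * e_x + c * e_y) ** 2     # orthogonal component
    return gain * (passed + leak * blocked)

def find_null(r_p, r_s, leak=0.0):
    """Brute-force search on a 1-degree grid; returns (signal, delta, rho_a)."""
    return min(
        (detector_signal(d, a, r_p, r_s, leak), d, a)
        for d in range(360)
        for a in range(91)
    )

# Nominal sample values quoted in the text for the film of Figure 46.2.
R_p, R_s, dphi = 0.2963, 0.7483, 3.95
r_s = complex(math.sqrt(R_s))
r_p = math.sqrt(R_p) * cmath.exp(1j * math.radians(dphi))

s_min, delta_null, rho_a_null = find_null(r_p, r_s)
phase_estimate = 180 - delta_null      # estimate of phi_rp - phi_rs, in degrees
psi_estimate = rho_a_null              # estimate of psi_r, in degrees
s_min_leaky, _, _ = find_null(r_p, r_s, leak=0.01)
```

With an ideal analyzer the minimum signal is limited only by the grid spacing, whereas any nonzero `leak` leaves a floor of about `leak * (R_p + R_s)` in this model, which is one reason the null becomes harder to locate with imperfect components.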
Ellipsometry with a variable retarder

Figure 46.7 shows a different kind of ellipsometer, consisting of a fixed polarizer, a variable retarder (e.g., a liquid crystal cell or a photoelastic modulator), and a fixed differential detection module. None of these components needs to be rotated or otherwise adjusted during measurements. The variable retarder provides a range of polarization states at the sample. For instance, the incident beam is p-polarized when the retardation Δ is 0°, circularly polarized when Δ = 90°, and s-polarized when Δ = 180°. The detection module consists of a Wollaston prism with transmission axes fixed at 45° to the plane of incidence, followed by a pair of identical photodetectors. When the relative phase Δ imparted by the retarder to the incident beam is continuously varied from 0° to 360°, the sum signal S1 + S2 oscillates between a maximum and a minimum value; these correspond to R_p and R_s, although not

Figure 46.7 Diagram of an ellipsometer based on a variable retarder and a differential detection module. The beam emerging from the polarizer is collimated and linearly polarized along the X-axis. The variable retarder’s axes are fixed at 45° to the XZ-plane of incidence, while its phase is varied continuously from 0° to 360°. The light beam is focused on the sample through a low-NA lens, and the reflected beam is recollimated by an identical lens in the reflection path. The reflected beam is monitored by a differential detector consisting of a Wollaston prism (oriented at 45° to the plane of incidence) and two identical photodetectors. The sum of the detector signals S1 + S2 contains information about the sample reflectivities R_p and R_s, while their normalized difference (S1 − S2)/(S1 + S2) yields the relative phase φ_rp − φ_rs. (Labeled elements: light source, polarizer, variable retarder, lenses, sample, Wollaston prism, photodetectors with outputs S1 and S2; angle of incidence θ.)

necessarily in that order. At the same time, the normalized difference signal (S1 − S2)/(S1 + S2) exhibits a peak-to-valley variation equal to 2 sin(φ_rp − φ_rs). The system of Figure 46.7 does not provide an independent measure of the other ellipsometric parameter, ψ_r. However, since R_p and R_s are directly measurable, ψ_r is redundant. In operating the system of Figure 46.7 it is not necessary to know the time dependence of the retardation Δ, nor in fact does one need to know the specific value of Δ at any point during the measurement. The maximum and minimum values of the sum signal and of the normalized difference signal contain all the necessary information. Unlike the nulling ellipsometer, this system does not require any adjustment of angles around a broad minimum; therefore, there is much less uncertainty about the measured data points. For the ideal system depicted in Figure 46.7, Figure 46.8(a) shows computed plots of the sum signal and the normalized difference signal versus the retardation Δ. The maximum and minimum values of the sum signal are 0.748 and 0.296,

Figure 46.8 Computed plots of the sum and (normalized) difference signals in the system of Figure 46.7 for the sample shown in Figure 46.2. The horizontal axis depicts the relative phase imparted to the beam by the variable retarder. The beam emerging from the polarizer has unit optical power, the detectors’ conversion factor is unity, the incidence angle is θ = 60°, and the focusing and collimating lenses have NA = 0.025. (a) The assumed system is perfect. (b) Two instances (solid lines, broken lines) of imperfect system behavior. (Both panels: S1 + S2 and (S1 − S2)/(S1 + S2) versus retardation from 0° to 360°.)

corresponding to R_s and R_p. The normalized difference signal has a peak-to-valley variation of 0.1375, yielding φ_rp − φ_rs = 3.94°. In Figure 46.8(b) we have assumed some imperfection in the system components. Two cases are examined, one leading to the solid curves and the other to the broken curves. In the former case the polarizer’s extinction ratio is 1:1000, the retarder axes are misaligned by 1°, the Wollaston prism has a 1:100 leak ratio between its two channels, and the angle of incidence θ is in error by 0.5°. From the computed sum and difference signals we find R_p = 0.290, R_s = 0.750, and φ_rp − φ_rs = 4.23°. In the case of the broken curves in Figure 46.8(b) the assumed imperfections are large. Here the polarizer’s extinction ratio is 1:100, the retarder’s orientation angle is 43°, the angle of incidence is θ = 60.5°, and the Wollaston prism leaks 2% of the wrong polarization into each channel. From the computed sum and difference signals the values of R_p = 0.290, R_s = 0.749, and φ_rp − φ_rs = 4.6° are obtained. Obviously, the system of Figure 46.7 is quite tolerant of imperfections and misalignments; therefore, it is suitable for accurate determination of the sample parameters.

References for Chapter 46

1 A. Rothen, The ellipsometer, an apparatus to measure thicknesses of thin surface films, Review of Scientific Instruments 16, 26–30 (1945).
2 A. B. Winterbottom, Optical methods of studying films on reflecting bases depending on polarization and interference phenomena, Trans. Faraday Society 42, 487–495 (1946).
3 R. H. Muller, Definitions and conventions in ellipsometry, Surface Science 16, 14–33 (1969).
4 R. M. A. Azzam and N. M. Bashara, Ellipsometry and Polarized Light, North-Holland, Amsterdam, 1977.
5 R. M. A. Azzam, Ellipsometry, chapter 27 in Handbook of Optics, Vol. 2, McGraw-Hill, New York, 1995.
6 O. S. Heavens, Optical Properties of Thin Solid Films, Butterworths, London, 1955.

47 Holography and holographic interferometry

Dennis Gabor (1900–1979). His life-long love of physics started at the age of 15. Fascinated by Abbe’s theory of the microscope and by Lippmann’s method of color photography, he and his brother built up a home laboratory and began experimenting with X-rays and radioactivity. Gabor entered the Technische Hochschule Berlin and acquired a diploma in 1924 and an electrical engineering doctorate in 1927. His thesis work involved the development of high-speed cathode ray oscillographs, in the course of which he built the first iron-shrouded magnetic electron lens. In 1927 he joined Siemens & Halske AG, where he invented a high-pressure quartz mercury lamp, since used in millions of street lamps. With the rise of Hitler in 1933, Gabor left for England and obtained employment with the British firm Thomson–Houston. At Thomson–Houston he developed a system of stereoscopic cinematography, and in his last year there carried out basic experiments in holography. In 1949 he joined the Imperial College of Science and Technology (London) and remained there as Professor of Applied Electron Physics until his retirement in 1967. (Photo: courtesy of AIP Emilio Segrè Visual Archives, W. F. Meggers Collection.)


Holography dates from 1947, when the Hungarian-born British scientist Dennis Gabor (1900–1979) developed the theory of holography while working to improve electron microscopy [1,2]. Gabor coined the term “hologram” from the Greek words holos, meaning whole, and gramma, meaning message. The 1971 Nobel prize in physics was awarded to Gabor for his invention of holography. Further progress in the field was prevented during the following decade because the light sources available at the time were not truly coherent. This barrier was overcome in 1960, with the invention of the laser. In 1962 Emmett Leith and Juris Upatnieks of the University of Michigan recognized, from their work in side-looking radar, that holography could be used as a three-dimensional visual medium. They improved upon Gabor’s original idea by using a laser and an off-axis technique [3]. The result was the first laser transmission hologram of three-dimensional objects. The basic off-axis technique of Leith and Upatnieks is still the staple of holographic methodology. These transmission holograms produce images with clarity and realistic depth, but require laser light to view the holographic image. The Russian physicist Yuri Denisyuk combined holography with Lippmann’s method of color photography. In 1962 Denisyuk’s approach produced a white-light reflection hologram, which could be viewed in the light from an ordinary light bulb. In 1968 Stephen Benton, then at Polaroid Corporation, invented white-light transmission holography [4,5,6]. This type of hologram can be viewed in ordinary white light and is commonly known as the rainbow hologram. These holograms, which are “printed” by direct stamping of the interference pattern onto plastic, can be mass produced rather inexpensively [7].

Basic principles

A setup for recording a simple transmission hologram is shown in Figure 47.1. The coherent beam of the laser, after being expanded to cover the area of interest, is split into an object beam and a reference beam. The object beam passes through (or reflects from) the object before arriving at the photographic plate; the reference beam is directed toward the photographic plate at an oblique angle θ. At the XY-plane of the plate the complex-amplitude distribution of the object beam is A_O(x, y). The reference beam’s amplitude, A_R(x, y), is proportional to exp[i(2π/λ)(xS_x + yS_y)], where S_x and S_y are the direction cosines of the beam. The two beams interfere at the plate, upon which their interference fringes are recorded. When the plate is properly processed and developed, its amplitude transmissivity τ(x, y) becomes proportional to the incident intensity pattern, that is [8,9,10],

\[
\tau(x, y) = I(x, y) = |A_O(x, y) + A_R(x, y)|^2 . \tag{47.1}
\]

Figure 47.1 The basic optical system used for recording a simple hologram. The laser beam is expanded to accommodate the size of the object. The beam-splitter separates a fraction of the light to be used as a reference beam and sends it along a path that reaches the photographic plate at an oblique angle. The rest of the beam continues along the Z-axis, interacts with the object, and arrives at the photographic plate while carrying the phase/amplitude information about the object. The two beams interfere and the plate records the resulting fringes of the interference pattern. The film is subsequently developed into a positive (or negative) transparency and becomes a permanent record of the object wave. (Labeled elements: laser, beam expander, beam-splitter, object beam, object, mirror, reference beam, photographic plate.)

To reconstruct the object wave, the developed plate is returned to its original position and illuminated with the reference beam, as shown in Figure 47.2. The transmitted beam’s complex amplitude may thus be written

\[
\begin{aligned}
A(x, y) &= \tau(x, y)\,A_R(x, y) \\
&= \{|A_O(x, y)|^2 + |A_R(x, y)|^2\}\,A_R(x, y) + |A_R(x, y)|^2 A_O(x, y) + A_R^2(x, y)\,A_O^*(x, y).
\end{aligned}
\tag{47.2}
\]

Note in the above equation that |A_R(x, y)| is a constant, independent of x and y, and that A_R²(x, y) is a plane wave with direction cosines 2S_x and 2S_y. (When θ is small, the propagation direction of this plane wave makes an angle 2θ with the Z-axis.) Thus in addition to the reference beam A_R(x, y), which is modulated by the squared modulus of the object wave, the wavefront emerging from the hologram contains the original object wave A_O(x, y), as well as its complex conjugate A_O*(x, y). The reconstructed object wave travels in its original direction (i.e., along the Z-axis, in the case of Figure 47.2), but the conjugate wave rides on a plane wave whose deviation angle from the Z-axis is nearly twice that of the original reference beam.
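The three terms of Eq. (47.2) can be told apart numerically by their spatial frequencies. The one-dimensional sketch below uses a deliberately trivial object wave (an on-axis plane wave of amplitude a), a reference wave with an integer number m of fringes across the sampled window, and a direct discrete Fourier transform; N, m, a and the helper name `dft_bin` are all illustrative choices, not values taken from the text.

```python
import cmath
import math

N = 64     # samples across the plate (hypothetical)
m = 8      # reference-beam carrier, in cycles per window (hypothetical)
a = 0.5    # amplitude of the toy object wave, an on-axis plane wave

A_R = [cmath.exp(2j * math.pi * m * j / N) for j in range(N)]   # tilted reference wave
A_O = [complex(a)] * N                                          # object wave along Z
tau = [abs(o + r) ** 2 for o, r in zip(A_O, A_R)]               # Eq. (47.1)
A = [t * r for t, r in zip(tau, A_R)]                           # tau * A_R of Eq. (47.2)

def dft_bin(signal, k):
    """Normalized discrete Fourier coefficient of signal at spatial frequency k."""
    n = len(signal)
    return sum(s * cmath.exp(-2j * math.pi * k * j / n) for j, s in enumerate(signal)) / n

spectrum = [abs(dft_bin(A, k)) for k in range(N)]
peaks = [k for k, v in enumerate(spectrum) if v > 1e-9]
# Frequency 0  : reconstructed object wave, weight |A_R|^2 * a
# Frequency m  : transmitted reconstruction beam, weight |A_O|^2 + |A_R|^2
# Frequency 2m : carrier of the conjugate wave, weight a
```

In this toy case the spectrum has exactly three nonzero components, at 0, m and 2m, which mirrors the statement that the conjugate term rides on a plane wave with twice the direction cosines of the reference beam.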

Figure 47.2 To reconstruct the recorded wavefront one places the hologram in front of the same reference beam as used for recording. Upon transmission through the hologram several reconstructed waves emerge. If the hologram is in the same position as it was during recording, the virtual image of the object will be carried by the component of the emergent beam traveling along the Z-axis. However, if the hologram is flipped then a real image of the object emerges along the Z-axis. (The flipping is such that the reconstruction beam becomes the conjugate of the original reference beam with respect to the hologram.) (Labeled elements: laser, beam expander, beam-splitter, mirror, reconstruction beam, hologram, reconstructed wavefront.)

Behind the hologram, the reconstructed object wave yields the virtual image of the recorded object; this image may be viewed through the lens of an eye or photographed through the lens of a camera. The conjugate wave yields a real image of the object, which can be visually inspected or photographed by placing a photographic plate directly in its path. The transmitted portion of the reconstruction beam itself does not carry any useful information and is generally ignored.

Hologram of a simple phase-amplitude object

As an example, consider the phase–amplitude object shown in Figure 47.3. The featureless areas of the face are transparent to the incident light, but the eyes, nose, and mouth alter both the amplitude and the phase of the beam. The eyes are partially transmissive depressions with a 50% transmittance and a maximum phase depth of 5π at the center. The nose and the mouth are also 50% transmissive, but they are raised above the surface of the face and their corresponding phase depth at the center is −5π. Figure 47.3(a) shows the pattern of transmitted intensity for a uniform incident beam. Figure 47.3(b), an interferogram between the beam transmitted through the face and a collinear plane wave, shows the fringes caused by the phase modulation imparted to the beam by the various features of the face.

Figure 47.3 This face is a partially transmissive phase/amplitude object. The intensity pattern shown in (a) is obtained when the face is illuminated by a coherent, collimated, and uniform laser beam (i.e., a plane wave). The amplitude transmission coefficient of the facial features (eyes, nose, mouth) is 0.7. The interferogram in (b) is obtained when the transmitted beam is made to interfere with a plane wave. The features of the face modulate the phase of the transmitted beam in a continuous fashion by an amount that rises to 5π at the center of the eyes and falls to −5π at the center of the nose and the mouth. (Horizontal axes: x/λ, from −105 to 105.)

When a plane wave (wavelength λ) is transmitted through the face at z = 0 and propagated to a photographic plate at z = 3500λ, one obtains the intensity and phase distributions shown in Figures 47.4(a), (b), respectively. Figure 47.4(c) is the interference pattern formed with a reference plane wave traveling at an oblique angle θ = 8°. The photographic plate is exposed to this interference pattern and subsequently developed into a positive transparency, that is, one in which the amplitude transmissivity is proportional to the incident intensity distribution during exposure. This transparency is a coherent-light hologram of the face. Note in Figure 47.4(c) that the chosen diameter of the reference beam is not large enough to cover the regions of the object wave far away from the Z-axis. This is simply due to the limited computer memory available for these calculations and is not a limitation in holography. Whereas in practice the reference beam is usually large enough to record all significant spatial frequencies of the object onto the hologram, in the present calculations the small diameter of the reference beam limits the range of admissible spatial frequencies, resulting in the loss of fine detail in the reconstructed images of the original object. When the developed hologram is placed in the system of Figure 47.2 and illuminated with the reconstruction beam, the original object wave and its complex conjugate appear among the transmitted waves, in accordance with Eq. (47.2). Figure 47.4(d) shows the transmitted intensity pattern immediately behind the hologram. At this point the overlapping components of the emergent

647

47 Holography and holographic interferometry a

b

c

d

–210

x/

210

–210

x/

210

Figure 47.4 A plane wave traveling along the Z-axis and transmitted through the face at z ¼ 0 arrives at the photographic plate at z ¼ 3500k. (a) Distribution of the logarithm of intensity of the object wave at the plate. (b) Object wave’s phase distribution at the plate. (c) Interference pattern (logarithm of intensity) between the object wave and a reference plane wave traveling at h ¼ 8 relative to the Z-axis. (d) Distribution of the logarithm of intensity immediately after the hologram, when the exposed plate is developed into a positive transparency and placed in front of the reconstruction beam.

beam are all mixed together and, therefore, difficult to identify separately. Since these components are traveling in different directions, propagation over a short distance is all that is required to disentangle them from each other. Holographic images of the recorded object When the above hologram is placed in the same position as during recording and illuminated with the same reference beam (now called the reconstruction beam) one obtains, at z ¼ 3500k behind the hologram, the reconstructed intensity and phase patterns of Figures 47.5(a), (b). The central region of this figure contains the reconstructed object wave, AO(x, y), carrying the virtual image of the face. The transmitted fraction of the reconstruction beam – modulated by the squared modulus of the object wave – appears to the right and above the central region.
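As a rough check on the geometry described above, the following sketch estimates the period of the carrier fringes for a reference beam tilted at 8° and the lateral displacement of the off-axis components after 3500 wavelengths of propagation. It is an illustration only, not part of the original calculations. It assumes ideal plane-wave carriers, and it uses the standard result that the conjugate term rides on twice the carrier spatial frequency when the hologram is replayed with the original reference beam. The function names are hypothetical.

```python
import math


def fringe_period(theta_deg):
    """Fringe period (in wavelengths) between an on-axis wave and a plane wave tilted by theta."""
    return 1.0 / math.sin(math.radians(theta_deg))


def lateral_shift(direction_sine, distance):
    """Transverse displacement (in wavelengths) of a beam with the given direction sine."""
    return distance * math.tan(math.asin(direction_sine))


THETA_DEG = 8.0    # reference-beam tilt quoted in the text
DISTANCE = 3500.0  # propagation distance behind the hologram, in wavelengths

carrier = math.sin(math.radians(THETA_DEG))
period = fringe_period(THETA_DEG)
shift_direct = lateral_shift(carrier, DISTANCE)           # directly transmitted beam
shift_conjugate = lateral_shift(2.0 * carrier, DISTANCE)  # conjugate (real-image) wave

print(f"fringe period           : {period:.2f} wavelengths")
print(f"shift of direct beam    : {shift_direct:.0f} wavelengths")
print(f"shift of conjugate wave : {shift_conjugate:.0f} wavelengths")
```

Under these assumptions the conjugate wave is displaced roughly twice as far as the directly transmitted beam, which agrees with the real image appearing further off-axis than the transmitted reconstruction beam in Figure 47.5(a).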

Figure 47.5 The reconstructed wavefront at z = 3500λ behind the hologram. The incident beam is the same as the reference beam used in creating the hologram. (a), (b) Distributions of the logarithm of intensity and the phase over the entire reconstructed field. The central region of this field carries a virtual image of the face. (c), (d) Distributions of the logarithm of intensity and the phase in the image plane of a unit-magnification lens that captures the central portion of the field and creates a real image of the face from the reconstructed object wave. [Panels (a), (b): horizontal axes x/λ from −1000 to 1000; panels (c), (d): x/λ from −105 to 105.]

The real image of the face, produced by the conjugate wave A_O*(x, y), is shifted further off-axis, and appears in the upper right corner of Figure 47.5(a). Holographic reconstruction produces not only the amplitude of the original object but also its phase pattern, as is evident from Figure 47.5(b). Unlike regular photography, which maintains a record of the intensity profile but loses all trace of phase, the holographic process preserves both the amplitude and the phase information, and faithfully reproduces the entire object wave upon reconstruction. A comparison of the central regions of Figures 47.5(a), (b) with the original object wave of Figures 47.4(a), (b) might be worthwhile here, although one should note that the reconstructed wave in Figures 47.5(a), (b) is captured at an effective distance of 7000λ from the original object, whereas the patterns of Figures 47.4(a), (b) correspond to a propagation distance of only 3500λ.


To observe the virtual image, one should place an imaging lens in the central region of the field and produce a real image from the reconstructed object wave. (Alternatively, one could propagate the reconstructed object wave backwards in space by 7000λ to reproduce the object wave at its point of origination.) A one-to-one imaging lens (NA = 0.04, f = 3500λ) placed in the central region of Figures 47.5(a), (b) will create an inverted real image of the face at z = 7000λ behind the lens. The resulting intensity and phase patterns are shown in Figures 47.5(c), (d). The loss of resolution due to the small size of the hologram is visible at the edges of the various facial features, from which the high-spatial-frequency content of the original face is obviously missing (compare with Figure 47.3(b)).

If the hologram is flipped during playback, the reconstruction beam, being a plane wave in this example, becomes the conjugate of the original reference beam, namely, A_R*(x, y). (Alternatively, the reference beam may be conjugated and brought in from the opposite side of the hologram.) Under such circumstances the transmitted wave along the original direction of the object wave (i.e., the Z-axis in the present example) becomes the conjugated object wave, A_O*(x, y), and the reconstructed object wave moves off-axis. This situation is depicted in Figure 47.6, where, after propagating 3500λ beyond the hologram, the various components of the transmitted beam have separated from each other. The intensity distribution in Figure 47.6(a) reveals at the center the real image of the face, slightly to the lower left the directly transmitted reconstruction beam, and close to the lower left corner the beam containing the virtual image. There is also a weaker image of the face on the right-hand side of the real image; this “second harmonic” of the face is created by the nonlinearity of the photographic process. Figures 47.6(c), (d) are close-ups of the intensity and phase patterns in the real image produced by the conjugated object wave.

Holographic interferometry

Suppose the face shown in Figure 47.3 is somehow distorted at a later time or has undergone changes in its optical properties such that the beam transmitted through the face has acquired a certain degree of phase modulation. To render this phase modulation visible by converting it to intensity variations, it is necessary to interfere the beam transmitted through the face with a reference beam. If a collinear plane wave is chosen as reference, the resulting interferogram will resemble that in Figure 47.7(a). Here the deformation contours appear as black and white fringes superimposed on the face. One can also see in this figure the fringes caused by the phase structure of the facial features, namely, the eyes, the nose, and the mouth.

Figure 47.6 The reconstructed wavefront at z = 3500λ behind the hologram. The incident beam is the conjugate of the reference beam used in creating the hologram. (a), (b) Distributions of the logarithm of intensity and the phase over the entire reconstructed field. (c), (d) Close-ups of the central region of the reconstructed field, showing the logarithm of intensity and the phase distribution of the real image of the face. [Panels (a), (b): horizontal axes x/λ from −1000 to 1000; panels (c), (d): x/λ from −105 to 105.]

Figure 47.7 Two interferograms of the distorted face. In (a) the reference beam is a plane wave, whereas in (b) the distorted face is made to interfere with its own undistorted version. [Panels (a), (b); horizontal axes: x/λ from −105 to 105.]


An alternative “reference beam” is provided by the original, undistorted wave from the face itself. If the wave transmitted through the distorted face is made to interfere with that from the original face, the resulting fringe pattern will look like that in Figure 47.7(b). Here the features of the face itself do not appear in the interferogram; only the distortion fringes are visible. This is a clear advantage, of course, because one is usually interested in the changes induced in the object, not in the features of the object itself. The problem in most cases, however, is that the distorted and the undistorted objects are not simultaneously available and, therefore, creating an interferogram between the two using traditional methods of interferometry is not a viable option.

Holographic interferometry provides a solution to this problem by allowing the original wavefront, while still available, to be stored on a photographic plate. Later, when the object is distorted, a second recording of its wavefront is made; then the two wavefronts are reconstructed and allowed to interfere with each other. Interestingly enough, these two recordings can be made on the same photographic plate by double exposure. Moreover, the two wavefronts are automatically superimposed during reconstruction.¹⁰

The essential idea behind holographic interferometry may be readily grasped by reference to Eqs. (47.1) and (47.2) above. If the distorted wavefront is denoted by A′_O(x, y), it is clear that, upon reconstructing the double-exposure hologram, the emergent object wave will be A_O(x, y) + A′_O(x, y), while the emergent conjugate wave will be A_O*(x, y) + A′_O*(x, y). In this way both the virtual image and the real image show fringe patterns corresponding to contours of constant phase shift between the original object and its distorted version. Figures 47.8(a), (b) show the intensity and phase patterns at the photographic plate corresponding to the distorted face.
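The way the sum of the two object waves turns a phase change into fringes can be made explicit with a short worked step. This is a sketch under a simplifying assumption of ours: the distortion leaves the amplitude untouched and only adds a phase δ(x, y) to the object wave. The symbol δ is introduced here for illustration and does not appear in the original text.

```latex
\begin{aligned}
A'_O(x,y) &= A_O(x,y)\,e^{i\delta(x,y)},\\[2pt]
\bigl|A_O(x,y) + A'_O(x,y)\bigr|^2
  &= |A_O(x,y)|^2\,\bigl|1 + e^{i\delta(x,y)}\bigr|^2
   = 2\,|A_O(x,y)|^2\,\bigl[1 + \cos\delta(x,y)\bigr].
\end{aligned}
```

Under this assumption the intensity is brightest where δ is an even multiple of π and vanishes where it is an odd multiple, so each fringe follows a contour of constant δ. The phase of the undistorted object cancels out of the fringe term, which is why the facial features are absent from Figure 47.7(b). The same expression holds for the conjugate pair A_O* + A′_O*.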
When this beam is combined with a reference plane wave traveling at 8° to the Z-axis, the fringe pattern of Figure 47.8(c) is obtained. This fringe pattern is recorded on the same film that had previously recorded the hologram of the original face. When the resulting double-exposure hologram is developed into a positive transparency and placed in front of the reconstruction beam, the intensity distribution of Figure 47.8(d) appears immediately behind the hologram. Assuming that the reconstruction beam is the conjugate of the reference beam used in recording both holograms, the emergent beam along the Z-axis will be the conjugate of the combined object waves, namely, A_O*(x, y) + A′_O*(x, y). The intensity and phase patterns in Figures 47.9(a), (b) are obtained after propagating the emergent beam a distance of 3500λ beyond the hologram. The fringe pattern caused by the distorted face is clearly visible in this holographic interferogram. In an ideal situation, where the hologram is large enough to capture all significant spatial frequencies of both object waves, the features of the original object will be invisible in the interferogram. However, in these calculations, the hologram is of necessity small and, therefore, the features are not completely absent from the final image. In any event, if the reference beam is large enough to capture the high-spatial-frequency content of the object waves, the interferogram of Figure 47.9(a) will approach the ideal one shown in Figure 47.7(b).

Figure 47.8 A plane wave traveling along the Z-axis and transmitted through the distorted face at z = 0 arrives at the photographic plate at z = 3500λ. In this double-exposure experiment a hologram of the undistorted face has already been recorded on the plate. (a) Logarithmic plot of the object wave’s intensity distribution at the plate. (b) The object wave’s phase distribution at the plate. (c) Pattern of interference between the object wave and a reference plane wave traveling at θ = 8° relative to the Z-axis. (d) Distribution of the logarithm of intensity immediately after the hologram, when the twice-exposed film is developed into a positive transparency and placed in front of the reconstruction beam. [Panels (a)–(d); horizontal axes: x/λ from −210 to 210.]

Figure 47.9 The reconstructed wavefront at z = 3500λ behind the double-exposure hologram, showing the interference pattern between the real images of the distorted and undistorted face. The incident beam is the conjugate of the original reference beam used in both exposures, and the component of the reconstructed wave traveling along the Z-axis carries the real images. (a) Logarithmic plot of intensity and (b) plot of phase distribution over the area of the real image. [Panels (a), (b); horizontal axes: x/λ from −105 to 105.]

Real-time interferometry using a holographic image

If a hologram of an object in a given state is made, the reconstructed image can be made to interfere in real time with the “live” images of the same object in different states. Hence deformations that are dynamic in nature can be observed directly. This also provides a natural and very sensitive method of aligning the hologram to the original position after it has been removed for processing.

References for Chapter 47

1 D. Gabor, A new microscopic principle, Nature 161, 777–778 (1948).
2 D. Gabor, Microscopy by reconstructed wavefronts, Proc. Roy. Soc. London A 197, 454–487 (1949).
3 E. N. Leith and J. Upatnieks, Reconstructed wavefronts and communication theory, J. Opt. Soc. Am. 52, 1123–1130 (1962).
4 S. A. Benton, Hologram reconstruction with extended incoherent sources, J. Opt. Soc. Am. 59, 1454A (1969).
5 S. A. Benton, The mathematical optics of white light transmission holograms, in Proceedings of the First International Symposium on Display Holography, ed. T. H. Jeong, Lake Forest College, July 1982.
6 S. A. Benton, Survey of holographic stereograms, in Processing and Display of Three-Dimensional Data, SPIE 367, 15–19 (1983).
7 The introductory section is adapted from Holophile, Inc.’s website at www.holophile.com.
8 J. W. Goodman, Introduction to Fourier Optics, McGraw-Hill, New York, 1968.
9 P. Hariharan, Optical Holography, Cambridge University Press, UK, 1984.
10 C. M. Vest, Holographic Interferometry, Wiley, New York, 1979.

48 Self-focusing in nonlinear optical media†

† The coauthor of this chapter is Ewan M. Wright of the College of Optical Sciences, University of Arizona.

Self-focusing and self-trapping in nonlinear optical media were discovered soon after the invention of the laser in the early 1960s.¹,²,³,⁴,⁵ These phenomena provided an explanation for the appearance of hot spots and associated optical damage in media irradiated by high-power laser pulses. The very high intensities achievable with the laser made it possible to observe these and other nonlinear effects, which depend upon the change in refractive index of the medium in response to the local electric field intensity.

The physics of optical nonlinearity

In a medium exhibiting third-order nonlinearity, the index of refraction n depends on the local E-field intensity I(x, y, z) as follows:²

$$n(x, y, z) = n_0 + n_2\, I(x, y, z). \tag{48.1}$$

Here n_0 is the medium’s background index of refraction (observed at low optical intensities) and n_2 is the nonlinear coefficient of the material. Whereas n_0 is a dimensionless quantity, the nonlinear coefficient n_2 has inverse intensity units, i.e. units of area/power. Several physical mechanisms can cause the refractive index of a given medium to depend on the E-field intensity; notable among them are the anharmonic motion of electrons in crystals, electrostriction, and the molecular orientation known as the Kerr effect.² Electrostriction is caused by the volume force of an inhomogeneous electric field within a dielectric medium. The volume force draws the material into the high-field region, increasing its local density and, consequently, its refractive index. Optical glasses such as fused silica exhibit both electronic and electrostrictive nonlinearities, their n_2-values being in the range 5 × 10⁻¹⁶ to 5 × 10⁻¹⁵ cm²/W. The Kerr effect is observed in materials whose molecules possess anisotropic polarizability and so tend to be aligned by the E-field, thus causing a change in the local refractive index. The liquid carbon disulfide (CS₂), which has a fairly large n_2-value, 2.6 × 10⁻¹⁴ cm²/W, is a good example of this class of materials.

When n_2 is positive, the index of refraction in regions of high intensity tends to be larger than that in regions where the E-field is weak. Consequently, for an initially collimated and localized beam profile (such as a Gaussian), the wavefront propagating through the medium develops a phase pattern that resembles the curvature of a converging beam. While diffraction effects tend to broaden the cross-section of the beam, wavefront curvature (caused by nonlinearity) attempts to pull the beam towards regions of higher intensity. As long as the nonlinear effect is weak, diffraction predominates; however, as one increases the beam’s power a point is reached where the tendency of the beam to become focused balances the effects of diffraction. The beam can then propagate over long distances without any noticeable expansion or contraction. Physically, the field has built an effective waveguide for itself, which enables it to propagate without spreading. This phenomenon, known as self-trapping, occurs at the critical input power P_cr = 0.146λ²/(n_0 n_2). Typical values of P_cr are 33 kW for CS₂ at λ = 1 μm, and 0.2–2 MW for common optical glasses in the visible and near-infrared range. Self-trapping is inherently unstable and is readily destroyed by slight perturbations of the wavefront; nonetheless, it is possible to arrange well-controlled experiments to demonstrate the phenomenon. If the laser power is further increased beyond the threshold of self-trapping, the phenomenon known as self-focusing collapse is observed.
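The critical-power formula quoted above is easy to evaluate numerically. The sketch below does so for CS₂ and for the range of n_2 values given for optical glasses. The background indices (1.63 for CS₂, 1.5 for glass) and the choice of a 1 μm wavelength for the glasses are our assumptions for illustration, not values stated in the text, and the function name is hypothetical.

```python
def critical_power_watts(wavelength_cm, n0, n2_cm2_per_watt):
    """Self-trapping power P_cr = 0.146 * lambda^2 / (n0 * n2), in watts."""
    return 0.146 * wavelength_cm ** 2 / (n0 * n2_cm2_per_watt)


WAVELENGTH_CM = 1.0e-4  # 1 micron, expressed in cm

# CS2: n2 from the text; n0 = 1.63 is an assumed background index.
p_cs2 = critical_power_watts(WAVELENGTH_CM, 1.63, 2.6e-14)

# Optical glasses: n2 range from the text; n0 = 1.5 is an assumed typical index.
p_glass_max = critical_power_watts(WAVELENGTH_CM, 1.5, 5.0e-16)
p_glass_min = critical_power_watts(WAVELENGTH_CM, 1.5, 5.0e-15)

print(f"CS2   : {p_cs2 / 1e3:.0f} kW")
print(f"glass : {p_glass_min / 1e6:.2f} to {p_glass_max / 1e6:.2f} MW")
```

With these assumed indices the estimates fall close to the typical values quoted in the text; the small difference for CS₂ reflects the choice of n_0.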
During collapse, not only does the nonlinear effect counter the natural tendency of the beam to diverge, but it also forces the beam to collapse under its own weight and come to a sharp focus (a singularity, in the approximate paraxial theory) within a finite distance.³ Further increases in laser power break up the beam into multiple filaments, each of which carries enough power to exhibit self-focusing in its own right. Our goal in this chapter is to demonstrate some interesting examples of self-focusing in nonlinear media, both to elucidate the fundamental physics and to highlight the key effects produced by self-focusing in bulk media.

Gaussian beam profile

Figure 48.1(a) shows the distribution of intensity in the cross-section of a Gaussian beam of wavelength λ having a 1/e radius of 1000λ. The beam is linearly polarized along the X-axis and propagates along the Z-axis. The beam’s waist is at z = 0, so the phase distribution in this plane is uniform over the beam’s cross-section. The full-width at half-maximum (FWHM) intensity of the beam at the waist equals 1177λ, and its peak intensity is I_max = 0.64 I_0. Here I_0 is an arbitrary scale factor used to normalize all intensity profiles throughout this chapter. The beam cannot satisfy Maxwell’s equations unless it has a component of polarization E_z along the Z-axis; the computed intensity profile |E_z|² for this Z-component is shown in Figure 48.1(b). For a beam whose cross-section is substantially larger than a wavelength, the power content of E_z is typically much less than that of E_x. For example, in the present case the fraction of the total optical power carried by the Z-component is only 0.25 × 10⁻⁷. We will see below that E_z gains in strength as the beam converges towards focus.

Figure 48.1 Plots of intensity distribution for (a) the X-component and (b) the Z-component of polarization. These plots represent the cross-section of a Gaussian beam having a 1/e radius of 1000λ. [Panels (a), (b); axes: x/λ and y/λ from −1100 to 1100.]

Self-focusing by transmission through a thin slab

Consider the transmission of the Gaussian beam depicted in Figure 48.1 through a thin slab of transparent material. (By thin we mean that the medium thickness is much less than the Rayleigh range of the incident beam, so that diffractive effects in the medium may be neglected.) Let the thickness d and the nonlinear coefficient n_2 of the slab be chosen to yield a phase shift Δφ = 2πn_2 I_max d/λ = 10π at the beam center, where the intensity is at its peak. Upon transmission through the slab the beam acquires the intensity and phase distributions shown in Figure 48.2. The distribution of |E_x|² in Figure 48.2(a) is the same as that in Figure 48.1(a), but the intensity profile of E_z in Figure 48.2(b) is somewhat different from that in Figure 48.1(b). The fractional power of the Z-component is now 111 × 10⁻⁷, which, small as it may be, is substantially greater than the corresponding value before entering the nonlinear medium. This behavior may be understood by observing that the emergent beam has acquired a fairly large curvature and, consequently, its polarization vector has bent further toward the Z-axis.
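Two of the numbers quoted above can be reproduced with a few lines of arithmetic. The sketch below assumes that the 1/e radius refers to the field amplitude, so that the intensity varies as exp(−2r²/w²); this is our reading, chosen because it reproduces the quoted FWHM. It also evaluates the nonlinear phase profile that a thin slab with a 10π on-axis phase shift imprints on the beam. The function names are hypothetical.

```python
import math

W = 1000.0                   # 1/e radius of the field amplitude, in wavelengths (assumed reading)
PEAK_PHASE = 10.0 * math.pi  # on-axis nonlinear phase shift quoted in the text


def intensity_fwhm(w):
    """FWHM of the intensity exp(-2 r^2 / w^2) for a field amplitude exp(-r^2 / w^2)."""
    return w * math.sqrt(2.0 * math.log(2.0))


def slab_phase(r, w, peak_phase):
    """Phase imprinted by the thin slab at radius r; proportional to the local intensity."""
    return peak_phase * math.exp(-2.0 * (r / w) ** 2)


fwhm = intensity_fwhm(W)
# From 2*pi*n2*I_max*d/lambda = 10*pi, the extra optical path n2*I_max*d in wavelengths:
path_excess = PEAK_PHASE / (2.0 * math.pi)
phase_at_half_max = slab_phase(fwhm / 2.0, W, PEAK_PHASE)

print(f"FWHM of intensity       : {fwhm:.1f} wavelengths")
print(f"n2 * I_max * d          : {path_excess:.1f} wavelengths")
print(f"phase at half-max radius: {phase_at_half_max / math.pi:.2f} pi")
```

Because the imprinted phase follows the Gaussian intensity rather than a parabola, it matches a lens-like quadratic phase only near the axis.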

Figure 48.2 The beam of Figure 48.1 goes through a thin slab of a nonlinear material, creating a change in the index of refraction in proportion to its intensity. At the center, where the beam is brightest, the self-induced phase shift is 10π. The intensity of the beam upon emerging from the slab is shown in (a), (b), and its phase distribution in (c), (d). The plots on the left-hand side correspond to the X-component of polarization, while those on the right-hand side represent the Z-component. [Panels (a)–(d); horizontal axes: x/λ from −1100 to 1100.]

Figure 48.2(c) shows the phase profile of the emergent wavefront for the X-component of polarization. The gray-scale ranges from −π (black) to π (white), and the number of rings indicates a total phase shift of 10π from the center to the rim. The phase profile for the Z-component of polarization in Figure 48.2(d) shows, in addition to the curvature, a π phase shift between the right and left halves of the beam. Again this is a simple geometrical consequence of the bending of the rays toward the optical axis. The above example clearly demonstrates that a nonlinear medium can impart a curvature phase factor to a beam during transmission. When the curvature is negative the beam becomes divergent and expands upon further propagation. Conversely, a positive curvature causes the beam to converge towards a focus. This is the underlying physical mechanism of self-focusing in thick nonlinear media, to which we now turn.


Self-focusing through a thick slab

Let us now consider propagation of the Gaussian beam of Figure 48.1 through a thick slab of a nonlinear material, where the effects of diffraction during propagation within the medium must be retained. For simulation purposes we divide the thick slab into 60 thin slabs (in which we place the nonlinearity), and propagate the beam between pairs of adjacent slabs through a linear medium of refractive index n_0, which fills the gap between the slabs. We choose a separation of 5000λ between adjacent slabs, compute the incident intensity profile at each slab (using Fresnel’s diffraction formula), and allow the nonlinear medium of each slab to impart to the beam a phase pattern φ(x, y) in proportion to the incident intensity distribution I(x, y). The specific phase shift assumed is 5° at the reference intensity of I_0. The above procedure is repeated 60 times for a total propagation distance of 300 000λ. (This numerical scheme of breaking the propagation into alternate sections of linear propagation followed by a nonlinear phase mask is equivalent to the split-step beam propagation method commonly employed in optics.) For the above choice of parameters the input power is about 20 times greater than the critical power for self-trapping, P_cr, and we expect the simulation to display self-focusing collapse.³

The results of this simulation appear in Table 48.1 and Figure 48.3. The left-hand column in the figure shows the cross-sectional profile of |E_x|², while the right-hand column shows the corresponding plots of |E_z|². From top to bottom, the intensity profiles are obtained after 20, 30, 40, 50, and 60 steps in the simulation. Note that the beam is converging towards a focus and that E_z is becoming stronger as the beam gets smaller. The FWHM of the beam drops from 1177λ in the beginning to 196λ after 60 iterations. The focusing, of course, is not diffraction-limited, because the curvature imparted to the beam by the nonlinear medium does not exactly constitute a spherical wavefront. The departure of the wavefront from perfect sphericity saddles the beam with primary and higher-order spherical aberrations.

It is clear physically that self-focusing collapse cannot proceed indefinitely. Some mechanisms that can arrest the collapse are saturation of the nonlinear refractive-index change, nonlinear absorption arising from multi-photon ionization, and optical breakdown.
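The statement that the input power is about 20 times the critical power can be checked against the simulation parameters. The sketch below is an illustration under stated assumptions, not the authors' own calculation. It takes the per-slab phase shift at I_0 to be 5 degrees, treats that phase as the nonlinear phase 2πn_2 I Δz/λ accumulated over one slab spacing, uses a Gaussian intensity exp(−2r²/w²) with w = 1000λ and peak 0.64 I_0, and sets n_0 = 1. The function names are hypothetical.

```python
import math


def gaussian_power(peak_intensity, w):
    """Power of I(r) = I_peak * exp(-2 r^2 / w^2), in units of I_0 * wavelength^2."""
    return peak_intensity * math.pi * w ** 2 / 2.0


def n2_times_i0(phase_deg, slab_spacing):
    """Dimensionless n2 * I_0 implied by phase = 2*pi*n2*I_0*dz/lambda over one spacing."""
    return math.radians(phase_deg) / (2.0 * math.pi * slab_spacing)


def power_over_critical(power, n2_i0, n0=1.0):
    """P / P_cr with P_cr = 0.146 * lambda^2 / (n0 * n2); n0 = 1 is an assumption."""
    return power * n0 * n2_i0 / 0.146


STEPS = 60
SPACING = 5000.0  # distance between thin slabs, in wavelengths

total_distance = STEPS * SPACING
power_in = gaussian_power(0.64, 1000.0)
ratio = power_over_critical(power_in, n2_times_i0(5.0, SPACING))

print(f"total distance: {total_distance:.0f} wavelengths")
print(f"P / P_cr      : {ratio:.1f}")
```

Under these assumptions the ratio comes out close to the factor of about 20 stated in the text; a background index different from 1 would scale it proportionally.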

Table 48.1. Various properties of the beam during propagation through a nonlinear slab. The corresponding intensity profiles are shown in Figure 48.3.

Number of steps    I_max/I_0    Fractional power of Z-component (×10⁻⁷)    FWHM (×λ)
10                 0.65         0.29                                       1160
20                 0.68         0.39                                       1103
30                 0.76         0.58                                       998
40                 0.92         0.91                                       824
50                 1.30         1.52                                       545
60                 3.10         3.21                                       196

Asymmetric intensity profile and self-deflection

Our next example is similar to the previous one, except that now the beam launched into the thick nonlinear medium has an asymmetric profile.⁴ The asymmetry is produced by blocking off half the incident Gaussian beam. To maintain the total optical power, we multiply the beam’s amplitude by √2, thus preserving the integrated intensity over the clear aperture of the beam. As before, the distance between adjacent thin slabs is 5000λ, and the beam is propagated in 60 steps for a total distance of 300 000λ. Figure 48.4 shows, from top to bottom, the initial half-Gaussian beam as well as the patterns of intensity distribution within the medium after 20, 40, 50, and 60 propagation steps. Both columns show the profile of |E_x|², the intensity distribution being on the left-hand side and its logarithm on the right-hand side. (The logarithmic plot enhances weak features of the distribution, just like an over-exposed photograph.)

We note several new features in this example. First, the beam comes to a focus in the narrow dimension before it collapses in the wide dimension. Second, the center of the beam shifts to the right as it propagates. This self-deflection is caused by the prism-like phase factor that the nonlinear medium imparts to the beam.⁴ An ideal prism imparts a phase factor that is linear in the spatial coordinate x, namely, exp(i2πrx/λ), deflecting the beam by an angle θ = sin⁻¹ r. One can explain the observed self-deflection in Figure 48.4 by noting the similarity between the ideal phase factor of a prism and the phase factor exp[iφ(x, y)] imposed on the half-Gaussian beam by the nonlinear medium. Finally, note in Figure 48.4 that the beam breaks up into multiple branches after coming to focus. In practice the intensity at the focal point may be large enough to damage the material. Even if damage does not occur, small material inhomogeneities can cause substantial aberrations, distorting and breaking up the beam in unpredictable ways. The fact that computer simulations also show this type of breakup is due to small numerical errors incurred during computation. Usually these numerical errors are insignificant, but when the intensity begins to build up in the vicinity of a focal point, they cause the breakup of the beam in a random-looking fashion.

Figure 48.3 Top to bottom: plots of intensity distribution after 20, 30, 40, 50, 60 steps of propagation through a nonlinear medium. The X-component of polarization is on the left, and the Z-component on the right. The incident beam is the Gaussian shown in Figure 48.1. [Horizontal axes: x/λ from −1100 to 1100.]

Figure 48.4 Top to bottom: distributions of intensity (left) and logarithm of intensity (right) after 0, 20, 40, 50, 60 propagation steps through a nonlinear medium. The incident beam is the Gaussian of Figure 48.1, with its left half blocked but its intensity doubled to preserve the total power. In units of the reference intensity I_0, the peak intensity I_max starts at 1.28 and increases to 1.51 after 10 steps, 1.95 after 20 steps, 3.24 after 30 steps, 6.67 after 40 steps, and 14.95 after 50 steps, then drops to 7.76 after 60 steps. [Horizontal axes: x/λ from −1100 to 1100.]

Beam filamentation

As mentioned above, if the beam’s power is large enough the beam breaks up into many cells, each of which contains several critical powers and comes independently to focus. Our final example concerns a uniform beam of diameter 2000λ with a constant intensity equal to 0.32 I_0 across the aperture. In this simulation we placed 40 thin slabs of a nonlinear material at intervals of 15 000λ along the Z-axis. Each slab imparts a phase shift of 15° at the reference intensity of I_0, which is equivalent to an incident optical power of 60 P_cr. Shown in Figure 48.5 are the results of simulation after 10, 20, 30, 35, and 40 steps. At first, as a result of diffraction during propagation, the beam breaks up into multiple rings. After 30 iterations the central region of the beam comes to a focus. Afterwards, the central spot goes out of focus, but one of the rings breaks into multiple filaments.⁵ Small perturbations are necessary to break up a ring; as mentioned above these are provided by material inhomogeneities in practice, and by small numerical errors inherent to computer simulations in these calculations. The number of filaments depends on the power of the beam as well as on the strength of nonlinearity of the material.

Figure 48.5 Top to bottom: distributions of intensity (left) and logarithm of intensity (right) after 10, 20, 30, 35, 40 propagation steps through a nonlinear medium. The incident beam is uniform, having a circular cross-section of radius 1000λ. In units of the reference intensity I_0, the peak intensity I_max starts at 0.32 and then fluctuates as follows: 1.18 after 10 steps, 0.9 after 20 steps, 7.85 after 30 steps, 3.00 after 35 steps, and 6.87 after 40 steps. [Horizontal axes: x/λ from −1100 to 1100.]

Concluding remarks

Another mechanism that can couple the refractive index to the beam intensity profile is absorption of the light followed by heating and thermal diffusion. Variation of the refractive index in response to thermal expansion (or contraction) of the material is a frequently observed source of nonlinear optical behavior. Thermal effects usually produce negative values of n_2, thus causing defocusing of the beam. Heat diffusion further complicates the relation between n(x, y, z) and I(x, y, z), by removing the local nature of their interdependence. In this chapter we have confined our attention to the simple case of local nonlinearity with a positive value for n_2 and have shown examples of self-focusing and beam filamentation. Similar studies can be carried out for thermally induced nonlinearities, provided that heat diffusion is taken into consideration properly.

References for Chapter 48

1 R. Y. Chiao, E. Garmire, and C. H. Townes, Self-trapping of optical beams, Phys. Rev. Lett. 13, 479–482 (1964).
2 R. W. Boyd, Nonlinear Optics, chapter 4, Academic Press, Boston, 1992.
3 P. L. Kelley, Self-focusing of optical beams, Phys. Rev. Lett. 15, 1005–1008 (1965).
4 G. A. Swartzlander and A. E. Kaplan, Self-deflection of laser beams in a thin nonlinear film, J. Opt. Soc. Am. B 5, 765–768 (1988).
5 A. J. Campillo, S. L. Shapiro, and B. R. Suydam, Periodic breakup of optical beams due to self-focusing, Appl. Phys. Lett. 23, 628–630 (1973); also, Relationship of self-focusing to spatial instability modes, Appl. Phys. Lett. 24, 178–180 (1974).

49 Spatial optical solitons†

The possibility of self-trapping of optical beams due to an intensity-dependent refractive index was recognized in the early days of nonlinear optics.1 However, it was soon realized that in a three-dimensional medium, in which light diffracts in two transverse dimensions, self-trapping is not stable and leads to catastrophic collapse and filamentation. Stable self-trapping was then found to be feasible in two-dimensional media, in which the optical beam diffracts only in one transverse direction. Subsequently, the connection between selftrapping and soliton theory,2 and a complete analogy between spatial and temporal solitons were established. Whereas the formation of temporal solitons requires a balance between dispersion and nonlinear phase modulation, spatial solitons owe their existence to the balancing of diffraction with wavefront curvature induced by the nonlinear refractive index profile of the propagation medium. To observe a spatial soliton one must limit diffraction to one transverse direction, which can be achieved in a planar optical waveguide. The first experiments of this type were conducted using a multimode liquid waveguide (CS2 confined between a pair of glass slides).3 Formation of spatial optical solitons in single-mode planar glass waveguides was reported shortly afterwards.4 Kerr nonlinearity The simplest nonlinearity capable of producing self-trapping (leading to soliton formation in a planar waveguide) is a Kerr nonlinearity, obtained when the refractive index of the medium has an intensity-dependent term of the form nðx, y, zÞ ¼ n0 þ n2 Iðx, y, zÞ; †

† This chapter is co-authored with Ewan M. Wright, Professor of Optical Sciences at the University of Arizona.


where $I = |E|^2$ is the electric field intensity of the optical beam. Since diffraction tends to expand the spatial dimensions of a beam, the requisite nonlinearity must produce self-focusing, which translates into a positive coefficient n2 for the Kerr medium. (In contrast, temporal solitons can exist in media having either negative or positive nonlinear indices, depending on whether the dispersion of the medium is normal or anomalous.) The only spatial solitons that could exist in media with negative n2 are dark solitons, which are localized depressions in a cw background. Although both bright and dark solitons (spatial as well as temporal) have been observed experimentally, we limit the discussion in this chapter to bright spatial solitons in planar waveguides.

The beam propagation method (BPM)

We use the BPM to simulate the propagation of a beam of light through an optical waveguide exhibiting nonlinear effects. (See Chapter 32, The beam propagation method.) The BPM consists of a series of diffractive propagation steps in an isotropic, homogeneous medium, with each step followed by passage of the beam through one or more phase masks. The mask(s) impart to the beam’s cross-section a pattern of phase modulation that accounts for the cumulative effects of propagation in the inhomogeneous medium of the waveguide. For example, a mask can represent the phase modulation caused by the differing indices of refraction of core and cladding or, by becoming dependent on the local intensity distribution, a mask can mimic the phase modulation induced by the nonlinear index of the medium. The BPM can thus simulate many systems of practical interest provided that (i) the propagation steps are sufficiently small and (ii) the phase masks embody realistic effects of interaction between the beam and the waveguide during each propagation step.
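The alternation of diffractive steps and intensity-dependent phase masks described above can be illustrated with a minimal one-dimensional sketch. It is not the program used for the simulations in this chapter: it assumes a paraxial beam that diffracts along X only, lengths measured in units of the in-medium wavelength λ, and a field normalized so that unit amplitude corresponds to the peak index change dn_peak. The function names and the relation Δn = n0 (λ/2πw)² between the peak index change and the width w of a sech-shaped soliton are our own assumptions, the latter following from the paraxial wave equation with a Kerr term.

```python
import numpy as np

def soliton_width(dn_peak, n0=1.5, wavelength=1.0):
    """Width w of the sech(x/w) beam whose peak index change is dn_peak
    (paraxial, one transverse dimension; an assumed textbook relation)."""
    k = 2.0 * np.pi / wavelength              # wavenumber inside the medium
    return np.sqrt(n0 / dn_peak) / k

def bpm_kerr_1d(field, dx, dz, n_steps, dn_peak, n0=1.5, wavelength=1.0):
    """Propagate a 1D field by alternating a diffraction step (applied in
    the Fourier domain) with a Kerr phase mask (applied in real space)."""
    k = 2.0 * np.pi / wavelength
    kx = 2.0 * np.pi * np.fft.fftfreq(field.size, d=dx)
    diffraction = np.exp(-1j * kx**2 * dz / (2.0 * k))   # paraxial free-space step
    a = np.asarray(field, dtype=complex)
    for _ in range(n_steps):
        a = np.fft.ifft(np.fft.fft(a) * diffraction)     # diffractive propagation
        # phase mask: local index change dn_peak*|a|^2 acting over the distance dz
        a = a * np.exp(1j * k * dz * dn_peak * np.abs(a)**2 / n0)
    return a
```

Launching sech(x/w) with w = soliton_width(0.0022) and dn_peak = 0.0022 should leave the profile essentially unchanged, whereas setting dn_peak = 0 lets the same beam spread by diffraction.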
Slab waveguide with and without nonlinearity

Figure 49.1 is the diagram of a slab waveguide consisting of a guiding layer of refractive index n1 = 1.5056, sandwiched between cladding layers of index n0 = 1.50. A Gaussian beam of elliptical cross-section (free-space wavelength = λ0), launched into the guide from the left side, propagates along the positive Z-axis. The thickness of the guiding layer is assumed to be 5λ, where λ = λ0/n0 is the wavelength of the light within the glass medium. In the absence of nonlinearity, the injected beam spreads in the lateral direction X, but its diffraction along the Y-axis is arrested by the action of the waveguide. Figure 49.2 shows cross-sectional plots of the beam’s intensity profile at various locations along the Z-axis (all spatial dimensions are in units of λ). The

[Figure image: slab waveguide schematic showing the input beam, the guiding layer, and the X, Y, and Z axes.]

Figure 49.1 A slab waveguide confines the beam in one spatial dimension (Y), so that nonlinearity can act in the second dimension (X) to produce self-confinement. The incident beam has wavelength λ0 in free space. In our simulations the cladding glass material has refractive index n0 = 1.5, the guiding layer has index n1 = 1.5056, and the thickness of the guiding layer is 5λ, where λ = λ0/n0 is the wavelength of the guided beam within the glass medium. The core and cladding materials have the same nonlinear (Kerr) coefficient n2.

top frame shows the intensity profile of the injected beam upon entering the waveguide. From top to bottom: z/λ = 0, 200, 500, 800. It is seen that the injected beam initially expands to fill the guiding region in the Y-direction. The beam subsequently broadens along X as it propagates in the Z-direction, but its width along Y remains constant.

If a Kerr nonlinearity is introduced in the above waveguide, the broadening along X will be countered by a self-focusing phase factor imposed on the propagating beam’s cross-section. The Kerr medium’s refractive index responds to the light by increasing in proportion to the local intensity, namely, $n(x, y, z) = n_{0,1} + n_2 I(x, y, z)$, where $n_2 > 0$. Thus the bright central region of the beam is phase shifted more than its tail sections, which are less bright, resulting in a lens-like phase pattern that tends to focus the beam towards the center. If the beam’s intensity is weak, this self-focusing effect will not be sufficient to counter diffraction broadening. However, once the optical power density exceeds a certain critical value, the index modulation becomes strong enough to balance the effects of diffraction, resulting in an unchanging, stable beam profile along the propagation path.

Figure 49.3 shows computed cross-sectional profiles of intensity along the Z-axis for the same waveguide and the same injected beam as depicted in Figure 49.2. The difference is that in the present case the nonlinear index n2 is no longer zero, but chosen to yield a stable, non-diffracting guided beam. (The peak intensity reached in this simulation raises the refractive index, locally and instantaneously, by Δn = 0.0022.) The confined beam is a spatial soliton whose properties can be readily derived from Maxwell’s equations in conjunction with the nonlinear index of the medium. Self-trapping in this one-dimensional case (along the X-axis) is highly stable, and slight inhomogeneities of the guiding

[Figure image: cross-sectional intensity frames; horizontal axis x/λ from −25 to 25, vertical axis y/λ from −5 to 5.]

Figure 49.2 Confinement and propagation of an elliptically shaped Gaussian beam through the slab waveguide depicted in Figure 49.1. The top frame shows the intensity profile of the injected beam upon entering the waveguide. From top to bottom: z/λ = 0, 200, 500, 800. All spatial dimensions are in units of λ, the wavelength of the guided beam in the glass medium. The beam propagation method (BPM) has been used to obtain these images of the guided beam at various cross sections. The propagation step-size was Δz = 2.5λ, and the 5λ-wide guiding layer was simulated by a rectangular aperture which advanced the phase of the incident beam by Δφ = 5° in each step.

medium or variations of the input optical power do not destabilize the trapped beam. (Note that the second transverse dimension Y is essentially taken out of the equations by the action of the slab waveguide.) In contrast, two-dimensional self-focusing in a Kerr medium (i.e., in the absence of the slab waveguide) would be highly unstable, resulting in catastrophic collapse and subsequent filamentation of the beam.5,6 (See Chapter 48, Self-focusing in nonlinear optical media.)

Figure 49.4 shows the total optical power P (as a fraction of the input power P0) plotted versus z/λ in the linear and nonlinear waveguides whose behaviors are

[Figure image: cross-sectional intensity frames; horizontal axis x/λ from −15 to 15, vertical axis y/λ from −5 to 5.]

Figure 49.3 Same as Figure 49.2, but with nonlinearity added to the waveguide, $n = n_{0,1} + n_2 I$. (Note that the horizontal scale differs from that in Figure 49.2.) The injected Gaussian beam initially expands in the Y-direction to fill the guiding layer, but in the X-direction self-focusing combats the natural tendency of the beam to expand by diffraction. From top to bottom, z/λ = 0, 200, 500, 800. The net result is a stable, non-diffracting beam that propagates along the Z-axis, confined in the Y-direction by the waveguide, and in the X-direction by the self-focusing action of the nonlinear medium.

depicted in Figures 49.2 and 49.3, respectively. The power P is computed at each step of the simulation by integrating the guided beam’s intensity in the cross-sectional plane of the waveguide. The initial steep drop in P/P0 is caused by radiation into the cladding, at a time when the injected beam is still adjusting to the waveguide; the guided mode is seen to stabilize after a fairly short propagation distance. P(z) in the nonlinear guide behaves more or less the same as it does in the linear guide, except for the steady-state value of the guided optical power, which is somewhat greater in the presence of nonlinearity.

[Figure image: normalized optical power P/P0 (0.5 to 1.0) versus propagation distance z/λ (0 to 800), with curves labeled “Nonlinear waveguide” and “Linear waveguide”.]

Figure 49.4 Total optical power P (as a fraction of the input power P0) plotted versus z/λ in the linear and nonlinear waveguides whose behaviors are depicted in Figures 49.2 and 49.3, respectively.
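The slab parameters quoted for Figure 49.1 can be checked with a few lines of arithmetic. The sketch below assumes the standard result for a symmetric step-index slab, namely that the number of guided modes of one polarization is floor(V/π) + 1 with V = (2π/λ0) d (n1² − n0²)^(1/2) and d the full thickness of the guiding layer; the function names are ours. It also evaluates the phase 2π(n1 − n0)Δz/λ accumulated in one propagation step, which is our reading (not stated in the text) of how the mask phase quoted in the caption of Figure 49.2 was obtained.

```python
import math

def slab_parameters(n_core=1.5056, n_clad=1.50, thickness_in_wavelengths=5.0):
    """Numerical aperture, V-number, and mode count of a symmetric slab.
    The thickness is given in units of the in-glass wavelength lambda0/n_clad."""
    d_over_lambda0 = thickness_in_wavelengths / n_clad   # thickness in free-space wavelengths
    na = math.sqrt(n_core**2 - n_clad**2)
    v = 2.0 * math.pi * d_over_lambda0 * na
    n_modes = math.floor(v / math.pi) + 1                # guided modes per polarization
    return na, v, n_modes

def mask_phase_deg(delta_n, dz_in_wavelengths):
    """Phase advance (degrees) of 2*pi*delta_n*dz/lambda for one step of length dz."""
    return 360.0 * delta_n * dz_in_wavelengths
```

With the values of Figure 49.1 the V-number comes out below π, so under these assumptions the guide supports a single mode in the Y-direction, consistent with the rapid settling of the guided power seen in Figure 49.4.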

Adjacent pair of out-of-phase solitons

When two solitons propagate side by side in the same slab waveguide, they interact and affect each other’s behavior. Figure 49.5 shows the case of two identical solitons, launched side by side with a relative phase of 180°. The left column in Figure 49.5 displays several cross-sectional intensity profiles throughout the guide, while the right column shows the corresponding phase distributions (light gray = 0°, dark gray = 180°). From top to bottom, the frames represent propagation distances z/λ = 0, 200, 500, 800, 1100, and 1600. At first, the beams expand to fill the guide in the Y-direction. Self-focusing of each beam follows, with the result that two identical solitons (aside from their 180° phase shift) form in the same neighborhood. The tails of the two solitons overlap, however, and this overlap is strong enough to cause their mutual repulsion via the induced nonlinear phase. Note that the 180° phase difference between the left and right halves of the guide is preserved throughout propagation.
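The dependence of the interaction on relative phase can be explored with a one-dimensional sketch of the same split-step scheme. The parameters below (sech-shaped inputs, 16λ separation, Δn = 0.0022, propagation over 300λ) and all function names are illustrative assumptions and do not reproduce the two-dimensional simulations of this chapter. The intensity-weighted mean of |x| serves as a crude measure of the separation: it is expected to grow for a relative phase of 180° and to shrink for a pair launched in phase.

```python
import numpy as np

def propagate(a, dx, dz, n_steps, dn_peak, n0=1.5):
    """Split-step propagation in one transverse dimension; lengths are in
    units of the in-medium wavelength, so the wavenumber is 2*pi."""
    k = 2.0 * np.pi
    kx = 2.0 * np.pi * np.fft.fftfreq(a.size, d=dx)
    diffraction = np.exp(-1j * kx**2 * dz / (2.0 * k))
    a = a.astype(complex)
    for _ in range(n_steps):
        a = np.fft.ifft(np.fft.fft(a) * diffraction)                 # diffraction
        a = a * np.exp(1j * k * dz * dn_peak * np.abs(a)**2 / n0)    # Kerr phase mask
    return a

def mean_abs_x(a, x):
    """Intensity-weighted mean distance from the center of the guide."""
    intensity = np.abs(a)**2
    return np.sum(np.abs(x) * intensity) / np.sum(intensity)

def pair_drift(relative_phase, separation=16.0, dn_peak=0.0022, z=300.0):
    """Return the mean |x| of a two-beam input before and after propagation."""
    dx, dz = 0.125, 2.5
    x = np.arange(-1024, 1024) * dx
    w = np.sqrt(1.5 / dn_peak) / (2.0 * np.pi)       # sech width matched to dn_peak
    a0 = (1.0 / np.cosh((x + separation / 2) / w)
          + np.exp(1j * relative_phase) / np.cosh((x - separation / 2) / w))
    a = propagate(a0, dx, dz, int(round(z / dz)), dn_peak)
    return mean_abs_x(a0, x), mean_abs_x(a, x)
```

Calling pair_drift(np.pi) and pair_drift(0.0) gives the initial and final values of the separation measure for the out-of-phase and in-phase cases, respectively.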

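[Figure image: two columns of cross-sectional frames (intensity on the left, phase on the right); horizontal axis x/λ from −25 to 25 in each column, vertical axis y/λ from −5 to 5.]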

Figure 49.5 Two Gaussian beams, separated along the X-axis and having a relative phase of 180°, are simultaneously launched into the slab waveguide of Figure 49.1. The beams initially expand in the Y-direction to fill the width of the guide; self-focusing then confines each Gaussian beam in the X-direction, and the interaction between the two pushes them apart. (The peak intensity reached in this simulation raises the refractive index, locally and instantaneously, by Δn = 0.003.) The left column displays cross-sectional intensity profiles, while the right column shows the corresponding phase distributions (light gray = 0°, dark gray = 180°). From top to bottom, the frames represent propagation distances z/λ = 0, 200, 500, 800, 1100, and 1600.

[Figure image: normalized optical power P/P0 (0.5 to 1.0) versus propagation distance z/λ (0 to 1600) for the out-of-phase soliton pair.]

Figure 49.6 Total optical power P (as a fraction of the input power P0) plotted versus z/λ for the pair of out-of-phase solitons depicted in Figure 49.5.

Figure 49.6 is a plot of the total optical power versus propagation distance for the pair of out-of-phase solitons depicted in Figure 49.5. The steep initial drop is caused by radiation into the cladding, during the time when the injected beams are still adjusting to the waveguide. Once the solitons are established, however, their power content remains essentially constant.

Adjacent pair of in-phase solitons

Figure 49.7 shows the case of two Gaussian beams launched into the slab waveguide of Figure 49.1 simultaneously and with identical phase. The two solitons thus formed within the nonlinear guide begin to attract each other. In Figure 49.7, from top to bottom, the propagation distance from the input port is z/λ = 0, 100, 400, 550, 700, 850, 1100 (left column), and z/λ = 1250, 1550, 2150, 2300, 2500, 2650, 3050 (right column). At first it appears that the two solitons fuse together, but soon they separate and move apart, only to return and collide once again. The two solitons thus engage in a periodic dance. In a truly

[Figure image: two columns of cross-sectional intensity frames; horizontal axis x/λ from −20 to 20 in each column, vertical axis y/λ from −5 to 5.]

Figure 49.7 Two identical Gaussian beams, separated along the X-axis by 20λ and having a constant, uniform phase in their cross-sectional plane, are simultaneously launched into the slab waveguide of Figure 49.1. The various frames display the patterns of intensity distribution in the waveguide’s cross section along the Z-axis. From top to bottom, the propagation distance from the input port is z/λ = 0, 100, 400, 550, 700, 850, 1100 (left column), and z/λ = 1250, 1550, 2150, 2300, 2500, 2650, 3050 (right column). The peak intensity reached in this simulation raises the local refractive index by Δn = 0.01. Initially, the beams expand in the Y-direction and fill the width of the guide, while they self-focus in the X-direction. Thus confined, the two beams move toward each other and collide, appearing for a brief period to have fused together. Following collision, the two solitons re-appear and move apart, but their mutual attraction brings them back together again.

[Figure image: normalized optical power P/P0 (0.5 to 1.0) versus propagation distance z/λ (0 to 3200) for the in-phase soliton pair.]

Figure 49.8 Total optical power P (as a fraction of the input power P0) plotted versus z/λ for the in-phase soliton pair whose behavior is depicted in Figure 49.7.

one-dimensional system the dance would continue forever, but in this quasi-one-dimensional case the solitons appear to get somewhat closer together after each oscillation period.

Figure 49.8 is a plot of the total optical power versus propagation distance for the in-phase soliton pair depicted in Figure 49.7. It is seen that, once the solitons are established, their power content remains constant despite repeated collisions. It is a well-known property of solitons that, upon collision, they pass through each other unscathed. The above behavior of the in-phase soliton pair is a clear confirmation of this property, even in a non-ideal (i.e., quasi-one-dimensional) situation.

Bouncing soliton

Consider the rectangular channel waveguide depicted in Figure 49.9. The guiding channel has length = 40λ, width = 5λ, and refractive index n1 = 1.5056, while the index of the cladding glass is n0 = 1.50. The core and cladding materials are

[Figure image: channel waveguide schematic showing the input beam, the channel waveguide, and the X, Y, and Z axes.]

Figure 49.9 Rectangular channel waveguide having length = 40λ and width = 5λ. The cladding glass has index of refraction n0 = 1.50, whereas the index of the guiding channel is n1 = 1.5056. The core and cladding materials have the same nonlinear (Kerr) coefficient n2.

assumed to have the same nonlinear (Kerr) coefficient n2. An elliptically shaped Gaussian beam is launched with a slight sideways tilt into this waveguide. (The tilt is simulated by imposing on the beam a linearly varying phase along the X-axis.) As before, the injected beam initially expands to fill the channel in the Y-direction, while simultaneously contracting along X to form a soliton (see Figure 49.10). However, the sideways tilt of the injected beam propels the soliton towards the right-hand side. In Figure 49.10, from top to bottom, the displayed cross-sectional intensity patterns correspond to propagation distances z/λ = 0, 100, 200, 300, 500, 650, 800 (left column), and z/λ = 1050, 1100, 1350, 1550, 1800, 2100, 2400 (right column). When the soliton encounters the channel wall on the right-hand side, it is squeezed against the wall, then bounces back. Subsequently, it moves towards the left wall, gets squeezed, and bounces back again. This pattern of behavior is repeated indefinitely as the beam propagates along the Z-axis. (The peak intensity reached in this simulation raises the local refractive index by Δn = 0.0022.) Thus spatial solitons exhibit a particle-like behavior, retaining their identity even after interactions with each other or with the channel walls. This property underlies their potential utility as information-carrying bits in all-optical switching applications.7,8

Figure 49.11 is a plot of the total optical power along the Z-axis for the bouncing soliton depicted in Figure 49.10. Note in particular that no loss of power occurs when the soliton encounters the side walls of the channel. This is what one would expect based on the principle of total internal reflection.

Concluding remarks

The simulations reported in this chapter are quite stable and yield similar solitonic behaviors under diverse conditions. For example, in all cases considered,

[Figure image: two columns of cross-sectional intensity frames; horizontal axis x/λ from −25 to 25 in each column, vertical axis y/λ from −5 to 5.]

Figure 49.10 Elliptically shaped Gaussian beam launched sideways into the channel waveguide of Figure 49.9 forms a bouncing soliton. From top to bottom, the cross-sectional intensity patterns correspond to propagation distances z/λ = 0, 100, 200, 300, 500, 650, 800 (left column), and z/λ = 1050, 1100, 1350, 1550, 1800, 2100, 2400 (right column). The injected beam (left column, top) has a Gaussian amplitude profile, similar to that in Figure 49.3, but it is also modulated by a linear phase along the X-axis, which gives its motion a slight tilt toward the right-hand side. As before, the soliton forms and propagates along the Z-axis, but it slowly drifts to the right. Upon encountering a channel wall, the soliton is squeezed against the wall, then bounces back.

[Figure image: normalized optical power P/P0 (0.5 to 1.0) versus propagation distance z/λ (0 to 2400) for the bouncing soliton.]

Figure 49.11 Total optical power P (as a fraction of the input power P0) plotted versus z/λ for the bouncing soliton whose behavior is depicted in Figure 49.10.

the optical nonlinearity was placed uniformly in the entire waveguide, that is, the guide and the cladding layers had the same coefficient of nonlinearity (n2). In general, this is not necessary and one can simulate situations where, for instance, nonlinearity is present in the guiding layer only, without causing any significant modification of the results.

References for Chapter 49

1 R. Y. Chiao, E. Garmire, and C. H. Townes, Phys. Rev. Lett. 13, 479 (1964).
2 V. E. Zakharov and A. B. Shabat, Sov. Phys. JETP 34, 62 (1972).
3 S. Maneuf, R. Desailly, and C. Froehly, Stable self-trapping of laser beams: Observation in a nonlinear planar waveguide, Optics Communications 65, 193–198 (1988).
4 J. S. Aitchison, Y. Silberberg, A. M. Weiner, et al., Spatial optical solitons in planar glass waveguides, J. Opt. Soc. Am. B 8, 1290–1297 (1991).
5 G. I. Stegeman, The growing family of spatial solitons, Optica Applicata 26, 239–248 (1996).
6 Special issue of Optical and Quantum Electronics on Spatial Solitons, Vol. 30, No. 10, published by Chapman & Hall, London, October 1998.
7 A. Aceves, P. Varatharajah, A. C. Newell, et al., Particle aspects of collimated light channel propagation at nonlinear interfaces and in waveguides, J. Opt. Soc. Am. B 7, 963–974 (1990).
8 A. C. Newell and J. V. Moloney, Nonlinear Optics, Addison-Wesley, Redwood City, California (1992).

50 Laser heating of multilayer stacks

Laser beams can deliver controlled doses of optical energy to specific locations on an object, thereby creating hot spots that can melt, anneal, ablate, or otherwise modify the local properties of a given substance. Applications include laser cutting, micro-machining, selective annealing, surface texturing, biological tissue treatment, laser surgery, and optical recording. There are also situations, as in the case of laser mirrors, where the temperature rise is an unavoidable consequence of the system’s operating conditions. In all the above cases the processes of light absorption and heat diffusion must be fully analyzed in order to optimize the performance of the system and/or to avoid catastrophic failure.

The physics of laser heating involves the absorption of optical energy and its conversion to heat by the sample, followed by diffusion and redistribution of this thermal energy through the volume of the material. When the sample is inhomogeneous (as when it consists of several layers having different optical and thermal properties) the absorption and diffusion processes become quite complex, giving rise to interesting temperature profiles throughout the body of the sample. This chapter describes some of the phenomena that occur in thin-film stacks subjected to localized irradiation. We confine our attention to examples from the field of optical data storage but the selected examples have many features in common with problems in other areas, and it is hoped that the reader will find this analysis useful in understanding a variety of similar situations.

Magneto-optical disk

The cross-section of a quadrilayer magneto-optical (MO) disk, optimized for operation at λ = 400 nm, is shown in Figure 50.1. (GaN-based semiconductor diode lasers operating at these blue and violet wavelengths are becoming commercially available, and optical disk systems are expected to take advantage of

[Figure image: cross-section of the quadrilayer stack, from top to bottom: electromagnet; aluminum alloy (30 nm); dielectric SiN (30 nm); magnetic film TbFeCo (10 nm); dielectric SiN (45 nm); polycarbonate substrate; focused laser beam incident through the substrate. Axes: r and Z.]

Figure 50.1 Quadrilayer stack of a magneto-optical disk. The electromagnet applies a magnetic field Bz(t) in the Z-direction, which is also the easy axis of magnetization of the magnetic layer. The laser beam, focused on the magnetic film through the substrate, acts as a heat source during recording.

this development by switching to blue or violet lasers within the next two to three years.) The quadrilayer of Figure 50.1 is deposited on a plastic substrate and consists of a thin magnetic film sandwiched between two transparent dielectric layers, capped by a thin layer of an aluminum alloy.1,2 The optical and thermal constants of the various layers of this stack are listed in Table 50.1. The focused laser beam arrives at the magnetic layer from the substrate side. This quadrilayer is designed to have a reflectivity of 9%, and has a fairly large polar MO Kerr signal (polarization ellipticity ηK = ±1.55° and Kerr rotation angle θK = ±0.24°, where the signs correspond to the up and down directions of magnetization of the storage layer). Aside from contributing to the optical properties of the stack, the aluminum layer acts as a heat sink, and the upper dielectric layer is thin enough to provide good thermal coupling between the two metallic layers.1,2

Table 50.1. Optical and thermal constants of the various materials used in the calculations

| Material | Refractive index n + ik (λ = 0.4 μm) | Dielectric tensor ε, ε′ (λ = 0.4 μm) | Specific heat C (J/cm³ °C) | Thermal conductivity K (J/cm s °C) |
|---|---|---|---|---|
| Polycarbonate (substrate) | 1.6 | — | 1.4 | 0.0025 |
| Aluminum alloy | 0.50 + 4.85i | — | 2.4 | 0.75 |
| Tb21Fe72Co7 (amorphous ferrimagnet) | 2.33 + 3.45i | ε = −6.46 + 16.11i, ε′ = 0.185 − 0.233i | 2.9 | 0.10 |
| SiN (dielectric) | 2.2 | — | 2.5 | 0.030 |
| Ge2Sb2Te5 (amorphous) | 2.9 + 2.5i | — | 1.3 | 0.002 |
| Ge2Sb2Te5 (polycrystal) | 2.0 + 3.6i | — | 1.3 | 0.005 |
| ZnS–SiO2 (dielectric) | 2.2 | — | 2 | 0.006 |

Figure 50.2 shows the intensity profile of the focused spot at the storage layer of the disk. The assumed objective lens that brings the laser light to focus in this case is free from all aberrations, is corrected for the thickness of the substrate, and has NA = 0.8, f = 1.5 mm. The collimated Gaussian beam entering the lens has 1/e (amplitude) radius r0 = 1.2 mm, which is the same as the radius of the objective’s entrance pupil. The distribution of Figure 50.2(a) is displayed on a logarithmic scale to enhance the diffraction rings caused by truncation of the beam at the objective’s aperture. The radial profile of the spot, depicted on a linear scale in Figure 50.2(b), reveals, however, that the rings are quite weak and thus incapable of producing much heat at the periphery of the central bright spot.

Figure 50.3 is a plot of the magnitude of the Poynting vector, S, along the Z-axis for a plane wave normally incident on the quadrilayer stack of Figure 50.1 through the substrate.2 The horizontal axis depicts the distance from the top of the stack. Thus S is seen to be constant in the two dielectric layers (30 < z < 60 nm and 70 < z < 115 nm), indicating no optical absorption in these regions. Most of the absorption takes place in the magnetic film (60 < z < 70 nm); a very small fraction of the incident energy goes to the aluminum layer (0 < z < 30 nm). The optical energy thus deposited in the magnetic film raises the local temperature immediately, but soon thermal diffusion takes over and carries the heat to other regions of the stack.

[Figure image: (a) intensity distribution of the focused spot, x from −1 to +1 μm (λ = 0.4 μm, r0 = 1.2 mm, NA = 0.8, f = 1.5 mm); (b) normalized intensity (0 to 1.0) versus radius (0 to 0.6 μm).]

Figure 50.2 Distribution of total E-field intensity, $|E_x|^2 + |E_y|^2 + |E_z|^2$, at the focal plane of a 0.8NA objective. The incident Gaussian beam (λ = 0.4 μm) is truncated by the lens aperture at its 1/e (amplitude) radius. For simplicity’s sake, the beam is assumed to be circularly polarized, so that it would yield a circularly symmetric spot at the focal plane. (a) Logarithmic plot of intensity, showing an Airy disk diameter ≈ 0.6 μm and FWHM ≈ 0.3 μm. (b) Radial intensity profile.
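The numbers quoted for the focused spot can be cross-checked against simple scalar estimates. The sketch below assumes an aplanatic lens (pupil radius = f × NA) and a uniformly filled aperture, for which the Airy disk diameter is 1.22λ/NA and the full width at half maximum is about 0.514λ/NA; these are textbook approximations, not the vector diffraction calculation behind Figure 50.2, and the function name is ours.

```python
def focused_spot(wavelength_um=0.4, na=0.8, focal_length_mm=1.5):
    """Scalar, uniform-illumination estimates for a focused spot."""
    pupil_radius_mm = focal_length_mm * na          # aplanatic (sine-condition) lens
    airy_diameter_um = 1.22 * wavelength_um / na    # diameter of the first dark ring
    fwhm_um = 0.514 * wavelength_um / na            # FWHM of the Airy intensity pattern
    return pupil_radius_mm, airy_diameter_um, fwhm_um
```

The pupil radius and Airy diameter obtained this way agree with the values given in the text; the uniform-illumination FWHM is somewhat smaller than the quoted one, which is plausible given that the Gaussian beam underfills the edge of the aperture.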

[Figure image: S (0 to 1.0) versus z (0 to 100 nm).]

Figure 50.3 The magnitude S of the Poynting vector along the Z-axis, plotted through the thickness of the quadrilayer of Figure 50.1. The incident beam (λ = 0.4 μm) is assumed to have unit power. Upon entering the stack S = 0.91, which indicates that 9% of the incident optical energy is reflected at the substrate interface. Approximately 3% of the energy goes to the aluminum layer and the remaining 88% is absorbed by the magnetic film.
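A plot such as Figure 50.3 can be approximated with the characteristic-matrix method for normal incidence. The following is a sketch under stated assumptions, not the calculation used for the figure: the magnetic film is treated as an isotropic absorber with the index n + ik of Table 50.1 (its magneto-optical term ε′ is ignored), the medium above the aluminum layer is taken to be air, and the function name and argument order are ours. The function returns the reflectance, the transmittance, and the fraction of the incident power absorbed in each layer, obtained from the drop in Poynting flux across that layer.

```python
import numpy as np

def stack_absorption(layers, n_in, n_out, wavelength):
    """layers: list of (complex index n+ik, thickness), ordered from the
    incidence side. Thickness and wavelength share the same unit."""
    # (E, H) at the exit interface for a unit-amplitude transmitted wave
    field = np.array([1.0 + 0j, n_out + 0j])
    fields = [field]
    for n, d in reversed(layers):
        delta = 2.0 * np.pi * n * d / wavelength          # complex phase thickness
        m = np.array([[np.cos(delta), -1j * np.sin(delta) / n],
                      [-1j * n * np.sin(delta), np.cos(delta)]])
        field = m @ field                                 # fields at the front of this layer
        fields.append(field)
    fields.reverse()            # fields[j] is at the front of layer j; fields[-1] at the exit
    b, c = fields[0]
    norm = abs(n_in * b + c)**2
    reflectance = abs(n_in * b - c)**2 / norm
    # Poynting flux at each interface, normalized to the incident flux
    flux = [4.0 * n_in * (e * np.conj(h)).real / norm for e, h in fields]
    absorbed = [flux[j] - flux[j + 1] for j in range(len(layers))]
    return reflectance, flux[-1], absorbed
```

For the quadrilayer of Figure 50.1 one would call stack_absorption([(2.2, 45.0), (2.33 + 3.45j, 10.0), (2.2, 30.0), (0.50 + 4.85j, 30.0)], 1.6, 1.0, 400.0), with thicknesses and wavelength in nanometers, and compare the result with the fractions quoted in the caption above.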

Heat diffusion in the stationary stack

To describe the temperature distribution within the stack, we need to specify the time dependence of the incident laser power, P(t). Figure 50.4 shows three such functions used in the examples throughout this chapter. The first function, P1(t), is a 1 mW trapezoidal pulse with 55 ns duration and 5 ns rise and fall times. We examine the effect of this pulse on the quadrilayer of Figure 50.1, when the stack is stationary. Figure 50.5 shows the profiles of temperature versus z at the beam center, r = 0; the various curves correspond to different instants of time. Early on, at t = 10 ns, the magnetic film is at a relatively high temperature, the aluminum layer has uniform temperature through its thickness, and there is a large thermal gradient between the magnetic film and the aluminum layer. It is through this temperature gradient that heat is transferred from the magnetic layer to the aluminum heat sink.3 A gradient has been established also in the lower dielectric layer between the magnetic film and the substrate. The temperature in the substrate is seen to decay exponentially with z.
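A rough feel for these profiles can be obtained from the thermal diffusion length (Dt)^(1/2), where D = K/C is the diffusivity formed from the constants of Table 50.1. This is only an order-of-magnitude sketch (definitions of the diffusion length differ by factors of order unity, and the dictionary and function names are ours), but it suggests why the aluminum layer is nearly isothermal through its 30 nm thickness while the heat penetrates the substrate by only about 100 nm during the pulse.

```python
import math

# (specific heat C in J/cm^3 C, thermal conductivity K in J/cm s C), from Table 50.1
MATERIALS = {
    "polycarbonate": (1.4, 0.0025),
    "aluminum alloy": (2.4, 0.75),
    "TbFeCo": (2.9, 0.10),
    "SiN": (2.5, 0.030),
}

def diffusion_length_nm(material, t_ns):
    """Order-of-magnitude diffusion length sqrt(D*t), returned in nanometers."""
    c, k = MATERIALS[material]
    diffusivity = k / c                                   # cm^2/s
    return math.sqrt(diffusivity * t_ns * 1e-9) * 1e7     # cm converted to nm
```

For a 50 ns heating period this gives roughly 95 nm in polycarbonate and more than 1 μm in the aluminum alloy.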

[Figure image: three plots of laser power versus time (0 to 75 ns): P1 (0 to 1 mW), P2 (0 to 3 mW), and P3 (0 to 6 mW).]

Figure 50.4 The functions representing laser power versus time that are used in the various examples: P1(t) is a 55 ns trapezoidal pulse with 5 ns rise and fall times; P2(t) is a sequence of five identical pulses, each with a 5 ns duration, 1 ns rise and fall times, and a center-to-center spacing of 10 ns; P3(t) is a fairly complex pattern of three-level pulses, used in phase-change recording.
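The first two power functions are simple enough to write down explicitly. The sketch below assumes that each quoted duration includes the rise and fall edges (so that P1 reaches its plateau at 5 ns and starts to fall at 50 ns) and that the first pulse of P2 begins at t = 0; neither assumption is stated in the caption, and the function names are ours. Peak powers of 1 mW and 3 mW are taken from the text. The three-level pattern P3 is not specified in enough detail to reproduce.

```python
import numpy as np

def trapezoid(t, start, duration, edge, peak):
    """Trapezoidal pulse; 'duration' is assumed to include both edges."""
    times = [start, start + edge, start + duration - edge, start + duration]
    powers = [0.0, peak, peak, 0.0]
    return np.interp(t, times, powers, left=0.0, right=0.0)

def p1(t):
    """Single 55 ns pulse, 5 ns rise and fall, 1 mW peak (time in ns, power in mW)."""
    return trapezoid(t, 0.0, 55.0, 5.0, 1.0)

def p2(t, n_pulses=5, spacing=10.0):
    """Five 5 ns pulses, 1 ns rise and fall, 3 mW peak, 10 ns center-to-center."""
    t = np.asarray(t, dtype=float)
    return sum(trapezoid(t, j * spacing, 5.0, 1.0, 3.0) for j in range(n_pulses))
```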

At later times during the heating cycle (i.e., t = 30 ns and 50 ns) the patterns are similar to that at t = 10 ns, but the temperatures are higher. Once the laser is turned off the temperatures drop abruptly. At t = 55 ns, the magnetic film is already cooling down, and the heat is moving to the substrate. The hottest spot at this point is somewhere in the substrate, close to the interface with the lower dielectric layer. At t = 60 ns, the cooling has progressed further, and the heat is rapidly spreading through the substrate. By t = 100 ns, the temperature everywhere is essentially back

[Figure image: temperature above ambient (0 to 150 °C) versus Z (0 to 400 nm), with curves labeled 10, 30, 50, 55, 60, and 100 ns.]

Figure 50.5 Computed temperature profiles along the Z-axis at the beam center, r = 0, for the stack of Figure 50.1 illuminated by the focused beam of Figure 50.2 and the pulse P1(t) of Figure 50.4. The profiles at t = 10, 30, and 50 ns represent the heating period; cooling-period profiles are shown for t = 55, 60, and 100 ns. By virtue of its strong absorption of the incident optical energy, the magnetic film is the hottest region during heating. The high thermal conductivity of the aluminum layer gives it a fairly uniform temperature through its thickness. As soon as the laser is turned off, the temperatures drop rapidly, and the peak temperature shifts to the substrate.

to the ambient temperature. Although only the z-profiles are shown here, it should be remembered that the heat diffuses radially as well; not only does the heat move to the substrate, but it also spreads radially throughout the entire stack.3

Next we consider the profiles of temperature versus time in the magnetic layer. Figure 50.6 shows several profiles at different distances r from the beam’s center, starting at r = 0 and increasing in steps of Δr = 50 nm to r = 1 μm. At r = 0, the temperature reaches its highest value at the end of the pulse, then decays quickly and, in the span of a few nanoseconds after the laser turn-off, goes down by almost an order of magnitude. At larger radii, the temperature is slow to rise, and it also peaks somewhat after the laser is turned off. The reason for this behavior is that the focused spot at these radii is rather weak, and the heat does not arrive there directly from the laser, but by radial diffusion from the central region, which is under intense illumination.

[Figure image: temperature above ambient (0 to 150 °C) versus time (0 to 100 ns), with curves ranging from r = 0 (highest) to r = 1 μm (lowest).]

Figure 50.6 Computed profiles of temperature versus time in the magnetic layer under the same conditions as in Figure 50.5. Different curves correspond to different radial distances from the beam center (in steps of Δr = 50 nm), the largest temperature occurring in the center at r = 0 and the lowest temperatures belonging to r = 1 μm. As soon as the laser is turned down at t = 50 ns, temperatures near the beam center drop sharply, but at larger radii, because of radial heat diffusion from the center, T continues to rise for a while after the laser is turned off.

Recording by magnetic field modulation

The electromagnet (EM) above the quadrilayer stack of Figure 50.1 provides a switched magnetic field Bz(t) between ±Bmax, thus helping to set the direction of magnetization of the hot spot within the magnetic layer.4,5 (Bmax is typically of the order of a hundred oersteds.) To ensure proper alignment of the focused spot with the EM’s pole piece, the EM is rigidly attached across the disk to the optical head. (The optical head is the assembly of the laser and other optical, mechanical, and electronic elements that guide the laser beam to the disk and back to the detectors.) The disk moves at a constant velocity V in the space between the EM and the optical head. Typically, it spins at a fixed angular velocity, say, 6000 rpm, which, at a radius of 40 mm on a 3.5 inch diameter disk, corresponds to a linear track velocity V = 25 m/s. Since the information track is usually a continuous spiral from the inner to the outer radius of the disk, the combined optical and magnetic head assembly must follow this track by slow, continuous travel along the disk’s radial direction.1,2

In a currently popular recording scheme, the laser is pulsed at a fixed rate to produce a sequence of identical hot spots in the magnetic layer.6 The binary


information to be recorded on the disk is fed to the EM, which switches the magnetization of the hot spot between the “up” and “down” stable states. The switching rate must be rapid enough to provide a high data-transfer rate into the recording medium. This requires a compact EM capable of flying very close to the magnetic layer, lest its inductance become too large. If the recorded marks are to be 0.25 μm long in the direction of the track, the laser must be pulsed at 10 ns intervals, in which case Bz(t) must switch between ±Bmax with rise and fall times of only a few nanoseconds. Such fast magnetic heads are currently at the forefront of conventional magnetic recording technology (i.e., hard-disk drives), but they require further development in order to be suitable for future generations of MO drives.

Consider the quadrilayer disk of Figure 50.1 moving at V = 25 m/s under the focused beam of Figure 50.2, modulated with the pulse sequence P2(t) of Figure 50.4. These five pulses are each 5 ns wide, have 1 ns rise and fall times, and reach a peak power of 3 mW. The assumed ambient temperature is 25 °C. Figure 50.7 shows several isotherms at the critical temperature Tcrit = 175 °C in the magnetic film during the period 0 ≤ t ≤ 50 ns. (The maximum temperature of the magnetic film during the same period is Tmax = 300 °C.) To a good approximation, the magnetic dipoles of the storage layer align with the field of the EM in those regions where T ≥ Tcrit, but the EM is unable to reorient these dipoles where T < Tcrit.4,5,6

[Figure 50.7: isotherms at 175 °C in the XY-plane of the magnetic layer; X from 0 to 1200 nm, Y from −200 to 200 nm; Tmax = 300 °C.]

Figure 50.7 Computed isotherms in the magnetic layer of Figure 50.1, when the multilayer is subjected to the focused spot of Figure 50.2 and the pulse sequence P2(t) of Figure 50.4. The ambient temperature is 25 °C, and the disk moves at V = 25 m/s along the X-axis. (In the reference frame of the disk, the focused spot moves from left to right.) The maximum temperature during this period is Tmax = 300 °C, reached at t = 44 ns. All depicted isotherms are at T = 175 °C, plotted at Δt = 1 ns intervals whenever T ≥ 175 °C. The solid (broken) curves represent the heating (cooling) phase of each pulse. Because of the lateral heat diffusion, each pulse produces slightly larger isotherms than the preceding one, but by the end of the fifth pulse this process reaches a steady state. During the 10 ns period of each pulse the disk moves by Δx = 0.25 µm, which is the minimum mark-length that can be recorded in this example.
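The numbers quoted for this example (6000 rpm at a radius of 40 mm, V = 25 m/s, 10 ns pulse period, 0.25 µm minimum mark length) are tied together by elementary kinematics. The short check below is a sketch of that arithmetic; the function names are ours and do not come from the text.

```python
import math


def track_velocity_m_s(rpm, radius_m):
    """Linear track velocity V = omega * r for a disk spinning at `rpm`."""
    omega_rad_s = 2.0 * math.pi * rpm / 60.0
    return omega_rad_s * radius_m


def pulse_interval_ns(mark_length_um, velocity_m_s):
    """Pulse period needed so that the disk advances one mark length per pulse."""
    return mark_length_um * 1e-6 / velocity_m_s * 1e9


def displacement_nm(velocity_m_s, interval_ns):
    """Distance travelled by the disk during one pulse period; (m/s) * ns = nm."""
    return velocity_m_s * interval_ns


print(f"V  = {track_velocity_m_s(6000, 0.040):.1f} m/s")   # close to the quoted 25 m/s
print(f"dt = {pulse_interval_ns(0.25, 25.0):.1f} ns")
print(f"dx = {displacement_nm(25.0, 10.0):.0f} nm")
```

The computed velocity at a 40 mm radius is slightly above 25 m/s, so the value used in the text is a rounded figure.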


The isotherms in Figure 50.7 are plotted at Δt = 1 ns intervals whenever T ≥ Tcrit in the MO film. The isotherms are shown as solid lines when the temperature is rising and as broken lines when it is declining. By the end of the fifth pulse the temperature profiles reach the steady state (i.e., there are no significant variations from one set of isotherms to the next). In between adjacent pulses the temperatures everywhere drop below Tcrit, so that for about 4 ns before and after each pulse the entire magnetic film is at T < Tcrit. This cooling period is crucial for thermomagnetic recording by magnetic-field modulation. As long as Bz(t) is saturated at ±Bmax, the magnetization state of the disk is well defined (either up or down). But when the field is in transition, there is a short time interval during which Bz(t) is weak and, therefore, the magnetization orientation is uncertain. This problem is overcome by keeping the temperature below Tcrit during the transition period, in which case no changes occur in the disk’s magnetic state and, consequently, the recorded domains acquire sharp boundaries [6]. For these reasons it is imperative to have a quadrilayer design, such as that of Figure 50.1, that cools down below Tcrit in between adjacent laser pulses.

Phase-change optical recording

The general structure of a phase-change (PC) disk is similar to that of an MO disk. Figure 50.8 shows the cross-section of a quadrilayer PC stack optimized for operation at λ = 400 nm. The optical and thermal constants of this stack are listed in Table 50.1.

[Figure 50.8: layer stack, top to bottom: aluminum alloy (50 nm), dielectric ZnS–SiO2 (25 nm), phase-change GeSbTe (20 nm), dielectric ZnS–SiO2 (50 nm), polycarbonate substrate.]

Figure 50.8 A quadrilayer stack designed for through-substrate phase-change recording at λ = 400 nm. The reflectance of the stack is Ra ≈ 8% when the Ge2Sb2Te5 film is amorphous, and Rc ≈ 30% when the film is crystalline. The absorbed optical power in the aluminum layer in either case is only about 1%. Thus, for all practical purposes, the fraction of the incident power that is not reflected is entirely absorbed in the PC layer.

The Ge2Sb2Te5 material can be switched between amorphous and (poly)crystalline states by the laser beam: melting at Tmelt ≈ 625 °C followed by rapid quenching results in an amorphous mark, whereas annealing for a reasonable length of time above the glass transition temperature (Tglass ≈ 150 °C) returns the material to the crystalline state [7,8,9]. The stack shown in Figure 50.8 has reflectivities Rc = 30% and Ra = 8% for the crystalline and amorphous phases of the PC film. Note also that the thermal constants for the two phases are somewhat different. In the following analysis we ignore these differences by assuming the PC layer to be crystalline at all times. Also, to simplify the calculations further, we ignore the heats of melting and crystallization. These are reasonable approximations, but the final results may need slight corrections if more accuracy is desired.

The laser pulse sequence applied to this sample is P3(t), shown in Figure 50.4. Here the laser operates at three different power levels. At the highest level the pulse is strong enough to melt the PC film. In the low-power regime, occurring immediately after the melting pulse, the temperatures drop rapidly, causing the quenching of the molten material into an amorphous state. The intermediate power level is for annealing the pre-existing amorphous marks, which is required when overwriting a previously written track. (Such tracks contain both amorphous and crystalline regions, and it is necessary that all amorphous regions that are not being melted be annealed into the crystalline state.) Figure 50.9(a) shows the computed isotherms in the PC layer at T = Tmelt for a disk speed V = 25 m/s and an ambient temperature of 25 °C. As before, the solid and broken isotherms represent the heating and cooling cycles, respectively. The maximum temperature reached in this sample is Tmax = 1153 °C at t = 59 ns.
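The melt, quench, and anneal logic described above can be summarized as a simple decision rule applied to the temperature history of a single point in the PC film. The sketch below is an illustration under loudly stated assumptions, not the model used to compute Figure 50.9: it takes Tmelt = 625 °C from the text, uses 325 °C (the temperature of the isotherms in Figure 50.9(b)) as the annealing threshold, and introduces a hypothetical minimum dwell time `MIN_ANNEAL_NS` above that threshold for crystallization to occur. The function name `final_phase` is likewise ours.

```python
import numpy as np

T_MELT_C = 625.0       # melting point quoted in the text
T_ANNEAL_C = 325.0     # presumed annealing temperature of Figure 50.9(b)
MIN_ANNEAL_NS = 5.0    # hypothetical dwell time needed to crystallize


def final_phase(t_ns, temp_c, initial="crystalline"):
    """Classify one point of the PC film as 'amorphous' or 'crystalline'.

    Rule of thumb: melting followed by a fast quench leaves an amorphous
    mark; spending at least MIN_ANNEAL_NS above T_ANNEAL_C after the last
    melting event (or without any melting) leaves the point crystalline.
    """
    t_ns = np.asarray(t_ns, dtype=float)
    temp_c = np.asarray(temp_c, dtype=float)
    # Duration credited to each sample (the last sample gets zero).
    dt = np.diff(t_ns, append=t_ns[-1])

    phase = initial
    start = 0
    melted = np.flatnonzero(temp_c >= T_MELT_C)
    if melted.size:
        phase = "amorphous"          # quenched, unless annealed afterwards
        start = melted[-1] + 1       # only the history after the last melt counts

    hot = temp_c[start:] >= T_ANNEAL_C
    dwell_ns = float(np.sum(dt[start:][hot]))
    if dwell_ns >= MIN_ANNEAL_NS:
        phase = "crystalline"
    return phase
```

With this rule, heat arriving from a later melting pulse matters only through the extra time it keeps an already quenched region above the annealing threshold, which is the mechanism behind the partial recrystallization of long marks.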
The isotherms are plotted at intervals Δt = 0.5 ns whenever T ≥ Tmelt in the PC film. The first two molten regions are well separated from each other and from other molten pools; these will eventually quench to form two small amorphous marks. The cooling in these regions is rapid, and the temperatures return to the vicinity of Tglass in about 5 ns. The third and fourth molten pools, however, have some degree of overlap. (In practice two or more short overlapping marks such as these are used to create a long mark.) The heat generated by the fourth pulse flows backward and affects the amorphous region being formed in the wake of the third pulse. In general, heat diffusion from the tail end of any long mark can anneal the leading edge as well as the mid-sections of the same mark, causing partial crystallization.

This problem may be better appreciated by examining the T = 325 °C isotherms of the same system, shown in Figure 50.9(b). Here the annealing pulses (i.e., the medium-power levels of P3(t)) appear behind the first two marks well after they have cooled. By then the annealed region is far enough from the previously molten pools that there is no danger of recrystallization. In contrast, the two large isotherms in Figure 50.9(b) corresponding to the third and fourth melting pulses partially overlap, causing the formation of a small, undesirable crystallite in the middle of the long amorphous mark. These are some of the issues with which the designers of optical disk drives must grapple in order to create robust and reliable data storage systems.

[Figure 50.9: two panels in the XY-plane, X from 0 to 1800 nm and Y from −300 to 300 nm, V = 25 m/s. (a) Isotherms at 625 °C, Tmax = 1153 °C. (b) Isotherms at 325 °C.]

Figure 50.9 Computed isotherms in the GeSbTe film of Figure 50.8, subjected to the focused beam of Figure 50.2 and the pulse sequence P3(t) of Figure 50.4. The ambient temperature is 25 °C, and the quadrilayer disk moves at V = 25 m/s along the X-axis. The peak temperature Tmax = 1153 °C in the film is reached at t = 59 ns. The solid (broken) isotherms correspond to the heating (cooling) phase of each pulse. (a) Isotherms at T = 625 °C, the melting point of the PC material. (b) Isotherms at T = 325 °C, the presumed (elevated) annealing temperature given the short annealing time.

References for Chapter 50

1 T. W. McDaniel and R. H. Victora, eds., Handbook of Magneto-optical Recording, Noyes Publications, Westwood, New Jersey, 1997.
2 M. Mansuripur, The Physical Principles of Magneto-optical Recording, Cambridge University Press, UK, 1995.
3 H. S. Carslaw and J. C. Jaeger, Conduction of Heat in Solids, Oxford University Press, UK, 1954.
4 D. Chen, G. N. Otto, and F. M. Schmit, MnBi films for magneto-optic recording, IEEE Trans. Magnet. MAG-9, 66–83 (1973).
5 Y. Mimura, N. Imamura, and T. Kobayashi, Magnetic properties and Curie point writing in amorphous metallic films, IEEE Trans. Magnet. MAG-12, 779–781 (1976).
6 S. Yonezawa and M. Takahashi, Thermodynamic simulation of magnetic field modulation methods for pulsed laser irradiation in magneto-optical disks, Appl. Opt. 33, 2333–2337 (1994).
7 S. R. Ovshinsky, Reversible electrical switching phenomena in disordered structures, Phys. Rev. Lett. 21, 1450–1453 (1968).
8 J. Feinleib, J. deNeufville, S. C. Moss, and S. R. Ovshinsky, Rapid reversible light-induced crystallization of amorphous semiconductors, Appl. Phys. Lett. 18, 254–257 (1971).
9 T. Ohta, M. Takenaga, N. Akahira, and T. Yamashita, Thermal change of optical properties in some sub-oxide thin films, J. Appl. Phys. 53, 8497–8500 (1982).

Index

Abbe’s sine condition 9, 10, 14, 16, 20, 33, 39, 46, 528 Abbe’s theory of image formation 23, 38 aberrated wavefront 619 aberration 9, 16, 20, 33, 45, 64, 160, 227, 310, 313, 351, 355, 362, 379, 381, 452, 482, 503, 525, 578, 617, 658, 680 aberration-free 10, 11, 16, 18, 19, 34, 48, 57, 227, 229, 261, 307, 388, 447, 494, 496, 499, 526, 602, 616, 630 primary spherical aberration 618, 621, 622 spherical aberration 227, 380, 381, 482, 486, 534, 541, 578, 604, 617–623, 627, 658 absorbing media 216, 222 absorption coefficient 128, 155, 204, 206, 605, 611, 632 air gap 133, 144, 198, 238, 270, 275, 280, 383, 389, 395, 536 Airy disk 62, 261, 316, 379, 517, 522, 526, 542, 547, 681 Airy function 35, 41 Airy pattern 10, 19, 33–35, 37, 45, 62, 69, 70, 379, 517, 518, 522, 547 aluminum mirror 83, 564 ambient temperature 684, 686, 688, 689 amorphous 168, 680, 687, 688, 689 amorphous mark 688, 689 amplitude mask 373, 459, 549 amplitude spectrum 102, 117, 118 amplitude transmission function 70, 373 analyzer 555–570, 633–638 anamorphic magnification 499, 501–503 anamorphic magnification factor 503 anamorphic prisms 499 angular discrimination 263 angular momentum 289, 301–309 angular resolution 37, 516 angular separation 114, 121, 124, 258–265, 508, 517–520, 567–569 angular spectrum 135, 178, 233, 264, 381 angular spectrum decomposition 233 anisotropic polarizability 655 annular light source 552

annular phase mask 552 anti-bunching 127 anti-guiding 494 antireflection coated 11, 158, 346, 361, 388–391, 476, 518, 557, 634–635 aperture annular aperture 32, 542 aperture stop 503, 579 circular aperture 26–28, 34, 41, 443, 448–449, 457, 610–611 clear aperture 10, 16, 447, 496, 587, 659 spiral aperture 372 aplanat 18–19, 483 aplanatic 16, 18, 21, 33–35, 46–51, 328, 389, 391, 528, 539, 541 aplanatic sphere 528 aplanatic system 16 aplanatism 33, 528 apodization 379 apparent position of fixed star 310 aragonite 405 arc lamp 62, 546, 554, 588, 626 aspheric mirror 543 aspherics 351 aspheric surface 485, 527, 543 astigmatic 59, 490–491 astigmatic distance 490, 491 astigmatism 363, 381, 481–483, 490–499, 503, 580–581, 617 atomic dipole 211 atom optics 370 attenuated total internal reflection (TIR) 133, 385–388 attenuation coefficient 449–458 autocorrelation 78, 113, 115–116, 126, 241 autocorrelation function 78, 113, 116 average intensity 77, 91–92 Babinet’s principle 69 backward propagated 478 backward propagation 53 baseball pattern 136–137, 315–316, 531, 533, 539–540 baseline 516–520


692 beam decenter 483–487 beam propagation method (BPM) 459, 492, 658, 665, 667 split-step BPM 459–460, 658 beam-splitter (BS) 79, 126, 182–187, 194, 201, 224, 231–234, 268, 297, 464, 516, 525, 546, 564, 567, 619, 624, 644 beam tilt 480–488 beam waist 55–56, 290, 295, 320, 490 beat frequency 195 bending of polarization vector 46 Bessel beam 32 best focus 226–228, 358–362, 525–530, 535–542, 580–583 bias phase 568–574 biaxial birefringent crystal 405–408, 413 binary intensity mask (BIM) 586, 588 binary star 121 birefringence 197, 201–203, 554, 563–565, 632 birefringent 110, 201–203, 235, 405–408, 413, 557, 563–565, 567 birefringent slab 110 birefringent substrate 557 boundary condition 142, 149, 324–325, 447, 461, 600, 606 Bracewell, Ronald 61, 515, 524 Bracewell telescope 515, 519–522 Bradley, James 310 Bragg’s law 270, 317, 318 bright space 595 Brewster’s angle 164, 218, 281–283, 381, 383, 389–390 Calcite 110, 567 catadioptric solid immersion lens (SIL) 541, 543 caustic 301, 479 cavity 177–178, 195, 197, 204, 207, 270, 447–457, 491–492, 602 central fringe 96, 98 channel waveguide 464–467, 673–675 chaotic light 113, 116–119 chaotic point source 124, 125 characteristic equation 141 charge coupled device 556, 627 chirp cancellation 241, 248, 249, 252 chirp-compensation 251 chirped mirror 240 chirped pulse 240, 247, 248 chromatic aberration 355, 356, 527, 582 chromatic dispersion 351 chromeless 589, 592 circle of least confusion 226–227 circular aperture 26–28, 34, 41, 443, 448, 449, 457, 610, 611 circularization 499 circularly polarized 5, 12, 35, 107, 154, 167, 202, 224, 229, 263, 305, 409, 413–416, 637, 681 circular polarization 154–155, 168, 203–204, 208, 230, 305, 410 cladding 139, 243, 255–257, 459–470, 478, 489–495, 665–668, 671–676

Index classical mount 326–328, 332, 336, 338–341, 346–349 classical source of light 127 coefficient of nonlinearity 676 coherence coherence factor 590 coherence length 74, 78, 80–81, 93, 113, 624–626 coherence theory 88, 505 coherence time 64, 545, 547 first-order coherence 77, 89, 91–92, 95, 113, 117, 124 degree of coherence 88–89, 95, 586, 587 mutual coherence 100, 103 partial coherence 586 partial spatial coherence 556 temporal coherence 74–79, 88, 93, 113, 555 temporally coherent 64, 566 coherent addition 213–214 coherent illumination 62, 64, 66–69, 88, 367, 547–550 coherent and incoherent imaging 62, 88, 551, 582 coherent image 68, 92, 93, 583 coherent imaging 41, 42, 65, 68, 70, 582 coherent imaging system 42, 65 coherent monochromatic light 408 coherent point source 64, 92 coherent source 63, 64, 546, 548 colliding pulse ring laser 240 collimated beam 5, 10, 21, 28, 65, 78, 100, 136, 201, 225, 307, 367, 382, 405, 476, 494, 506, 541, 551, 627, 633 collimator 160, 161, 351, 383, 495–503, 505–508, 526 collimated coherent illumination 67–69, 547 coma 5, 9–10, 16, 19, 379–381, 452, 457, 580–581, 617–620 third-order coma 5, 617, 619 primary coma 19, 452, 457, 619, 620 comatic tail 452 comb function 375–377 compact disk 351, 534, 609 complex amplitude distribution 16, 17, 25–28, 34, 52, 174, 228, 289, 371, 400, 448, 546, 584, 643 complex degree of spatial coherence 96 compound microscope 9, 576 compression 46, 86, 240–257, 499, 502 compression ratio 241, 248, 252, 253 concave pit 604, 609, 610 concentric ring pattern 370 condenser 29, 30, 62–67, 546–557, 583–591, 625 condenser stop 586, 587, 590, 591 conduction electrons 150 cone of light 11–18, 65–67, 136, 160, 200, 316, 372, 388, 405, 415, 461, 517, 526, 530–540, 551, 560, 579, 611, 637 confocal resonator 447, 451, 454, 457 conical mount 326, 327, 331–335, 340–346 conical refraction 404–408, 410, 413, 415 conjugate plane 11, 14, 17 conjugate wave 644, 645, 658, 651 conjugated object wave 649 conoscopic 554, 556, 
564 conoscopic polarization microscopy 564

Index conservation of energy 211, 213, 215, 232, 235, 275 constitutive relation(s) 420, 599 construction wavelength 352, 354, 357, 361, 363, 364, 365 contact hole 590, 594, 595 contrast 62, 88, 96, 201, 315, 371, 389, 549–561, 564, 566, 571, 573, 588, 592, 615, 624, 625 contrast enhancement 549–551 convection of light 310, 311, 320 convex pit 604, 608, 610 convolution 35, 41, 522 core 243, 459–468, 476, 478, 665, 666, 673, 674 co-rotating dielectric 191 coupling efficiency 402, 476–488 cover plate 528, 533–535, 604 critical angle 129, 142, 143, 255, 343, 379–383, 393, 401, 402, 537 critical illumination 570 critical TIR angle 133, 134, 379, 388, 391, 393 cross-correlation function 91, 100, 103, 123, 124 crossed analyzer 564, 569, 570 crystalline 554, 687, 688 current loop 426, 433, 435, 436 curvature 10, 21, 31–33, 55, 59, 226, 250, 291–294, 302, 358–366, 374, 480–488, 496, 502, 543, 577–581, 617, 624–627, 655–658, 664 curvature phase factor 31, 32, 55, 366, 480, 481, 557 cutoff frequency 64–67 cycle-averaged intensity 113, 115, 119, 121, 126 cylindrical lens pair 496, 498, 499, 503 dark soliton 665 defocus 45, 48, 64, 227, 381, 482–487, 529, 541, 554, 559–562, 570–573, 617, 663 degree of first-order coherence 77, 117, 124 degree of second-order coherence 113, 114, 116, 119, 127 DELTA 350, 544 delta function 27, 375–377 depolarization 107, 110, 156 depolarized 105, 107 depth of focus 525–536, 541, 542, 586, 588 detection module 177, 397, 398, 525, 638, 639 detector 78, 86, 113, 121–127, 176, 194, 268, 351, 397, 402, 517, 522, 525, 526, 531, 532, 537, 556, 627–640, 685 diagonal element 154, 167 diamagnetic 154, 167 dielectric constant 128, 131, 139–143 dielectric mirror 161, 177–179, 197–199, 206, 207, 274 dielectric slab 191, 192, 213, 215, 216, 218, 220, 254, 279 dielectric stack 82, 235–237, 268, 276, 324 dielectric tensor 153, 162, 166, 171, 234, 235, 396 differential detection 176, 398, 638, 639 differential detection module 398, 638, 639 differential detector 
402, 639 differential image 555, 558, 561 differential interference contrast 566, 569 differential interference contrast microscope 566, 569


differential method of Chandezon 8, 138, 325, 350, 544 differential polarization microscopy 558, 564 differential signal 175, 177, 398, 401, 402, 533 differentiation theorem 60 DIFFRACT ix, xi, 2, 138, 350, 544 diffracted order 65, 66, 134–136, 249, 270, 272, 315, 318, 324–338, 341–345, 356, 361–365, 387, 614–616 diffracted ray 360 diffraction 1, 9, 16, 23, 28, 44–52, 70, 336, 344, 355, 382, 445, 447, 459, 461, 526, 545, 566, 570 classical diffraction 367, 459, 599 classical theory of diffraction 23, 26, 45, 47 diffraction-free beam 36, 44 diffraction effect 301, 382, 571, 573, 655 diffraction efficiency 250, 324, 330–349, 354, 356 diffraction-limited 10, 35, 64, 172, 307, 460, 527, 547, 584, 658 diffraction-limited focus 10, 36, 52, 336, 483, 525, 527, 539 diffraction-limited spot 37, 328, 609 diffraction order 21, 137, 249, 250, 316, 325–327, 331, 355, 531, 615 diffraction theory 25, 47, 367, 526, 532, 537, 599, 615 diffraction rings 680 scalar diffraction theory 25, 532 vector diffraction 8, 47, 51, 138, 330, 345, 355, 533, 537, 544 diffractive lens 351 diffractive optical element (DOE) 351, 352, 353, 355, 356, 366 diffractive propagation 303, 665 diffuse radiation 522 diffusion 678, 684 heat diffusion 663, 678, 682, 685, 686, 688 lateral heat diffusion 686 radial diffusion 684, 685 thermal diffusion 663, 680 diode laser 351, 489–502, 584, 678 dipolar oscillation 211 dipole radiation pattern 420 Dirac’s delta function 27 directional coupler 467, 469–473 dispersion 83, 240, 243–246, 254, 351, 600, 664, 665 dispersive 81, 240, 241, 246–249, 257, 600 dispersive element 81, 248 dispersive optical element 248 divergence-free 150, 419, 410, 435 divergence laws 420 Doppler shift 184–190, 310–320 double exposure 651, 652 double-slit mask 505, 508–511 double star 508, 511 down-chirp 246 duty cycle 64, 65, 325, 326, 346 Earth’s rotation 182 effective index 243 effective medium theory 333

694 E-field energy density 50, 51 eigenfunction 23, 60, 449 eigenfunction of propagation in free space 60 eigenvalue 449 electric charge 418, 419, 423, 429, 436, 439 electric current 437, 489 electric dipole 209, 216, 420, 421, 425–427, 429, 433–437 electric field intensity 2, 6, 205, 264, 307, 654, 665, 681 electromagnet 679, 685 electromagnetic energy 292, 301 electromagnetic field 23, 45, 88, 130, 145, 147, 149, 150, 209, 258, 325, 338, 381, 387, 388, 419, 447, 479, 599 electromagnetic radiation 209, 304, 324 electromagnetic waves 47, 52, 133, 149, 234, 275, 387 elegant solution of wave equation 60 ellipse of polarization 5, 7, 102, 105, 106, 203, 398, 415, 632 ellipsoid of birefringence 554, 564, 565 ellipsometry 632, 638 elliptical aperture 418–442, 503 ellipticity 5–7, 102–110, 155–170, 173, 179, 180, 202–206, 389–399, 413–416, 498, 499, 556, 557, 559, 562, 679 emergent wavefront 10, 16, 19, 45, 364, 478, 479, 480, 657 energy flow pattern 139, 150 ensemble 75, 116 ensemble average 75 ergodic 88 ergodicity 116 evanescent 25–27, 132, 143, 145, 147, 148, 255, 257, 325, 381, 387, 388, 461, 467, 494, 495, 611–613 evanescent beam 27 evanescent coupling 1, 387–393, 395, 398, 401, 402 evanescent wave 26, 132–134, 381, 382, 386, 387, 461 even mode 144–150, 467, 469 Ewald-Oseen theorem 209, 214, 218, 220, 222 extended incoherent source 92 extended source 88, 91, 92 extended waveform 75, 77, 81, 91 extended white light source 554 external conical refraction 404–407, 413 extinction rate 145 extinction ratio 635, 637, 638, 640 extinction theorem 1, 209, 213, 214, 216, 220–222 Fabry–Pe´rot etalon 1, 197–204, 207, 248, 251, 263, 265, 270, 271 Fabry–Pe´rot interferometer 197, 205 Fabry–Pe´rot resonator 159–161 Faraday angle 154, 157 Faraday effect 152–159, 162–164, 166 longitudinal Faraday effect 162, 163 Faraday medium 155, 156, 159, 160, 204–206, 208 Faraday rotation angle 155, 156, 208 Faraday rotator 203–205, 208, 224, 225, 230, 235

Index far field 23, 30–32, 41, 52, 55, 294–298, 368, 477–480, 584, 612, 616 far field (Fraunhofer) diffraction formula 480, 584 far field pattern 31, 32, 368 fast axis 498 fast Fourier transform (FFT) 26, 45, 302, 461 femtosecond range 240 ferrimagnetic 154, 167 ferromagnetic 154, 167 fiber 64, 182, 240, 241, 246, 248, 460–466, 475–478, 482, 483, 485–489, 502, 503, 584 fiber bundle 64 fiber-optic gyroscope 182, 196 field momenta 301 field momentum density 305 field of view 17, 40, 41, 62, 522, 541, 568, 570, 580, 584 filament 447, 463, 464, 502, 655, 662, 663 filamentation 660, 663, 664, 667 Finite Difference Time Domain (FDTD) 139, 151, 418, 599 first-order beam 66, 67, 331, 338, 341, 616, 623 first order field coherence function 78 Fizeau 505, 511, 513 flow of heat 396 focal-shift phenomenon 46 f-number 379, 615, 627 focused cone 135, 136, 200, 406, 461, 517, 527, 560, 564, 604, 611, 615, 637 focused laser beam 301, 396, 397, 473, 609, 679 focused spot 10, 19, 20, 35, 37, 45, 136–138, 144, 183, 184, 261, 263, 265, 272, 297, 308, 309, 315–317, 328, 336, 379, 380–382, 388, 396, 397, 415, 460, 476–483, 499–502, 525, 526, 528, 533–535, 540, 561, 580, 602–605, 608–610, 623, 628, 630, 680, 684–686 forward propagation 53 four-corners problem 556–558, 560, 561, 563 Fourier coefficients 354 Fourier component 42, 242, 373 Fourier domain 27, 28, 41, 52, 89, 375–377, 546 Fourier plane 27, 546, 548–551 Fourier optics 1, 23, 44, 45, 53, 73, 653 Fourier series 78, 115, 354, 375 Fourier spectrum 144, 146, 241, 325 Fourier transform 23, 25–28, 31–38, 45, 48, 53, 60, 76, 90, 96, 116, 119–121, 124, 147, 242, 245–247, 258, 302, 375, 376, 380, 461, 549 Fourier transform lens 36 Fourier transform plane 38, 549 Fraunhofer (far field) distribution 31 free-space impedance 140, 246, 432 frequency domain 100, 121, 247 frequency spectrum 75–77, 89, 90, 100, 117, 118, 240, 241, 320 frequency sweep 240 Fresnel’s coefficient(s) 141, 209, 210, 221, 222, 238, 256, 559 Fresnel drag 182 Fresnel’s 
formula for the drag of light 321 Fresnel-Kirchhoff diffraction integral 447


Fresnel number 26 Fresnel’s reflection coefficient 131, 141, 143, 167, 209, 221, 256, 379, 389, 391, 400, 557, 559 Fresnel’s reflection formula 128 Fresnel rhomb 103, 105 Fresnel transmission coefficient 238, 537 fringes 68, 74, 88, 92–95, 183, 187, 190, 191, 195, 201, 298, 422, 427, 429, 436, 438, 440, 495, 505, 506, 508, 511, 513, 522, 614, 616, 617, 619, 624–627, 643–645, 649, 651 fringe contrast 88, 96, 98, 625 fringe pattern 88, 93–96, 188, 261, 290, 291, 505–510, 519, 523, 618, 619, 625, 651 fringe periodicity 94 fringe shift 183, 187 fringe visibility 507–511 frustrated total internal reflection (FTIR) 383, 388, 537 fused silica 154, 243, 244, 248, 584, 654

grating period 21, 136, 250, 251, 318, 325–328, 331, 334–346, 615, 616 metal grating 135, 136, 331 metallic grating 324, 331, 446 metallized grating 325, 326 ruled grating 323, 324, 337 transmission grating 330, 341, 344, 346 two-dimensional grating 623 grating compressor 240, 241 grazing incidence 283, 284, 288 groove 135–138, 272, 315–318, 324–333, 336–341, 345, 346, 353, 531–533, 537–540, 629, 630 groove depth 315, 316, 325, 333, 346, 531 groove edge 136, 137, 531, 533 group velocity, 193, 243–245 group velocity dispersion (GVD) 244, 245, 246, 248 guided mode 243, 255, 346, 443, 444, 461, 464, 466, 467, 472, 477, 491, 492, 494, 668 guiding layer 255–257, 489–495, 665–668, 676

gain-guided laser 490 gain layer 489–492 gain medium 194, 195, 490–494 Gale, Henry 182 Gauss–Hermite polynomials 296 Gauss–Laguerre polynomials 296 Gaussian beam 3–6, 29, 52–60, 143–146, 289–298, 302–306, 319, 320, 358, 361–365, 460, 476–487, 502, 655–659, 665–675, 680, 681 Gaussian optics 12 generalized Gaussian beam 52, 58 geometrical optics 14, 258, 301, 305, 620 geometric-optical ray 46, 301, 352, 477–479, 584 geometric-optical theory 615 geometrical optics 14, 258, 301, 305, 620 giant star 509, 510 Gires–Tournois (GT) resonator 251 guided mode 243, 255, 346, 443, 444, 461, 464, 466, 467, 472, 477, 491, 492, 494, 668 glass ball 584 glass cylinder 274 glass hemisphere 129, 133, 344, 383–391, 602–607 glass prism 143, 144, 150, 361, 363, 379, 388 glass sphere 539, 576–580, 584 Goos–Ha¨nchen effect 379, 380, 382 Gouy phase 52, 56–60 gradient-index (GRIN) 277, 476, 477 Gradium glass 486–488 grating 1, 21, 48, 64–67, 134, 151, 240, 241, 248, 249, 270, 272, 315, 316, 318, 323, 325, 327, 328, 333, 336, 341, 350, 351, 354, 387, 531, 540, 615–618, 623 amplitude grating 64, 65, 614 blazed grating 336, 337 dielectric-coated grating 346, 347 diffraction grating 1, 2, 21, 24, 48, 134–138, 248, 270, 315, 316, 323, 324, 346, 350, 387, 531, 544, 614, 615 double-frequency grating 623 echelette grating 337–341

half-wave plate 557, 558 half-wave layer 208 half-wave thickness 212, 217, 219, 232 Hall conductivity 420 Hall, Robert N. 489 halogen lamp 554 Hamilton, Sir William Rowan 23, 404, 405 Hanbury Brown, Robert 114, 121, 127, 513 Harress, Francis 182 Hanbury Brown-Twiss experiment 113, 114, 127 heat sink 396, 679, 682 heat source 679 Heisenberg’s uncertainty relations 258 helicity 295–298 hemispherical glass cap 530 hemispherical glass substrate 129, 346, 634 hemispherical 129, 343–346, 530, 602, 603, 634, 635 Hermite–Gaussian beam 60 HeNe laser 74, 231, 233, 328, 405 Hermite polynomial 60 higher-order Gaussian beam 59 higher-order mode 256, 450, 452, 464 high-resolution imaging 560 hollow cone of light 391, 405, 551 hologram 289, 643–653 double exposure hologram 651, 653 rainbow hologram 643 holography 1, 642, 643, 646, 653 Huygens principle 23 hybrid design 355 illumination optics 62, 63, 547 image contrast 371, 372, 555–558, 565, 573, 588, 589 image-forming system 9, 16, 38, 298 image plane 11–13, 16, 19, 21, 39–42, 63–71, 370, 517–523, 545–552, 555, 556, 566, 571, 581, 584, 597, 648 image quality 68, 298, 549, 580, 592 imaging system 10, 11, 13, 16, 39, 41, 42, 62, 63, 65, 92, 546 immersion-oil microscopy 11

696 incandescent lamp 62, 91 incidence medium 283, 330, 356 incoherent illumination 64–70, 548, 551 incoherent image 69, 583, 584 incoherent light source 63, 92, 370, 555, 559, 560, 569, 570, 586 index ellipsoid 406–408 index-guided laser 490 index-matched fluid 530, 535 infinite conjugate 18, 20, 33, 35, 627 information storage 297 infrared 248, 324, 515, 516, 524, 655 inhomogeneous plane wave 140, 149, 386 injection laser 489 in-phase soliton pair 673 instantaneous intensity 114 integrated intensity 46, 53, 172, 173, 401, 602, 605, 606, 610, 612, 659 intensity autocorrelation 116, 126 intensity fluctuation 91, 113, 114, 126, 127 interfere 78, 79, 108, 183–188, 262, 268, 290, 291, 298, 299, 411, 422, 505, 521, 547, 566, 619, 643–646, 649–652 interference 1, 29, 74, 75, 80, 92, 146, 147, 156, 158, 164, 183, 221, 268, 297, 315, 385, 396, 401, 422, 429, 463, 471, 474, 512, 547, 566, 574, 623–627, 652 constructive interference 58, 268, 517, 518, 521 destructive interference 58, 74, 136, 268, 411, 518, 522, 589, 591 double pinhole interference 93 interference fringe 88, 92, 93, 299, 614, 616, 643 interference pattern 74, 195, 200, 202, 261, 298, 495, 508, 617, 625, 626, 643–647, 653 interferogram 199, 201, 290, 411, 494–496, 545, 566, 619, 620, 626, 627, 645–652 holographic interferogram 651 sheared interferogram 566 interferometer 78–81, 86, 182–186, 189, 190, 194–197, 201, 205, 251, 268, 269, 494, 496, 505, 506, 511–519, 523, 619, 623–627 double-slit interferometer 505, 506 nulling interferometer 515, 517, 523, 524 interferometric telescope 515, 516 interferometry 127, 182, 199, 494, 516, 574, 624, 642, 651–653 holographic interferometry 642, 649, 651, 653 phase-shift interferometry 574 real-time interferometry 652 stellar interferometry 88, 505, 511, 513 internal conical refraction 405, 407, 408, 410, 415 inverse Fourier transform 26, 60 iron garnet 154, 204 isolated bright line 590, 592 isolator 225 isotherm 686–689 iteration 447–458, 460, 601, 602, 658, 662 
k-space 143, 258 k-vector 131, 143, 144, 234, 235, 258, 259, 265, 270 Kerr effect 2, 166, 167, 170, 173, 178, 654

Index longitudinal Kerr effect 168–172 polar Kerr effect 166, 168, 172 polar Kerr signal 174, 175 magneto-optical (MO) Kerr effect 2, 166, 167, 178, 179, 397 transverse Kerr effect 166 Kerr ellipticity 397, 399 Kerr nonlinearity 246, 664, 666 Kerr rotation angle 180, 399, 402, 679 MO Kerr rotation 397 Kerr signal 172, 174, 175, 177 MO Kerr signal 179, 402, 679 Kerr liquid 240 Ko¨hler illumination 570, 573 knife-edge method 620 knife-edge test 621, 622 Kretschmann configuration 142 land 325–328, 535, 532, 533, 537, 540 land-groove 328 laser 62, 74, 76, 113, 172, 195, 231, 248, 328, 402, 461, 489, 491, 501, 588, 626, 655, 679, 684, 688 laser beam 62, 113, 172, 176, 183, 301, 351, 396, 460, 491, 497, 525, 547, 609, 621, 644, 678, 685, 688 laser diode 224, 225, 460, 461 laser gyroscope 194 laser heating 678 lateral wavefront shear 623 launching light 476 lens 3, 11, 19, 50, 318, 502, 558, 580, 602, 617, 626, 633, 645, 680 aplanatic lens 18, 33, 34, 46–51, 328 collimating lens 32, 160, 161, 328–330, 344, 345, 382, 413, 637, 640 collimator lens 495, 505, 506, 508 condenser lens 29, 62–64, 67, 546–548, 551, 552, 557, 583, 587 cylindrical lens 143, 489, 496, 498, 499, 502, 503, 630 focusing lens 160, 328, 382, 406, 413, 414, 416, 499, 500, 505–507 dark lens 28–30 diffraction-limited lens 35, 460 finite-conjugate lens 64, 547 high-NA lens 42, 47, 261 microscope objective lens 136, 172, 201, 307, 308, 527 plano-convex lens 10–12, 226, 227, 486–488, 527, 624, 625 objective lens 42, 62, 66, 135, 161, 199, 201, 297, 315, 328, 364, 388, 397, 525, 534, 540, 545, 551, 554, 560, 564, 566, 569, 602, 680 oil-immersion lens 529 split lens 57 thick lens 626 lensless imaging 367 left circularly polarized (LCP) 5, 107, 154, 167, 230, 264 lenslet 627–630 lenslet array 627–630 light emitting diode 584 light source 62, 63, 113, 124, 155, 182, 186, 188, 190, 314, 319, 351, 370, 381, 388, 545–547, 552, 554, 555, 560, 569, 586, 587, 622, 624, 625, 633, 639, 643

Index linearly polarized 3, 35, 47, 49, 107, 143, 154, 161, 166, 178, 201, 224, 230, 261, 302, 307, 358, 363, 380, 388, 397, 410, 415, 527, 538, 543, 554, 559, 569, 602, 633, 639, 655 linear momentum 259, 301 linear phase shift 85, 247, 380, 619 linear system 81 line-shape 117, 119, 124 liquid crystal cell 638 liquid waveguide 664 Littrow mount 336–340 localized irradiation 678 local nonlinearity 663 long-range SSP 138, 139, 149 Lorentz contraction 318, 319 Lorentz force 420, 422 Lorentz transformation 187, 192, 193, 311, 320 Lorentzian 116–118 lowest-order mode 448, 449, 450, 452, 457, 467 low-pass filtering 114, 126 luminiferous ether 310 Mach–Zehnder interferometer 78–80, 268, 269, 619 magnetic charge 149 magnetic dipole 420–443, 686 magnetic domain 397, 558, 559 magnetic energy 424, 437, 440 magnetic field 140, 153–156, 167, 204, 302, 387, 397, 418–440, 600, 679, 685, 687 magnetic-field modulation 687 magnetic film 178, 396, 397, 401, 679–687 magnetic layer 395–397, 679–686 magnetic medium 156, 168, 171, 172, 178 magnetic moment 167, 170–175 magnetic recording 686 magnetization 153–158, 164, 166–178, 396–400, 558, 562, 679, 685–687 magneto-optical (MO) activity 235, 562 MO contribution 174, 175, 400 magneto-optical (MO) disk 168, 395, 396, 398, 565, 678, 679, 687, 690 magneto-optical film 687 magneto-optically induced polarization 156, 161 MO signal 169, 174, 177–180, 395, 397, 401 magnification 9, 12, 14, 16, 18, 20, 21, 39, 40, 64, 374, 375, 499, 501–503, 517, 521, 547, 576, 577, 581, 583, 584, 587, 648 magnification factor 374, 375, 499, 503 magnifying glass 576, 580 marginal focus 618, 621, 622 mask 65, 68, 70, 72, 209, 371, 373, 374, 391, 401, 406, 459, 460, 465, 474, 492, 495, 506, 546, 550, 586, 588, 590, 595, 665 material inhomogeneity 659, 663 Maxwell’s equations 3, 23, 25, 26, 47, 128, 136, 139, 149, 209, 213, 222, 234, 235, 301, 324, 325, 386, 387, 405, 419, 420, 447, 599, 600, 606, 607, 656, 666 metallo-dielectric interface 139, 149 metal slab 142, 
143, 149, 215 method of Fox and Li 2, 447–452

Michelson 78, 88, 182, 505, 507, 511–514 Michelson-Gale interferometer 182 Michelson interferometer 78, 511, 513 Michelson’s stellar interferometer 88, 505, 513 microscope 2, 9, 50, 297, 309, 525, 526, 545, 546, 554, 560, 564, 566, 569, 574, 576, 581, 584, 642 microscope objective 136, 172, 201, 307, 328, 527, 546, 565, 568, 573 misalignment 183, 452, 488, 640 mode 74, 76, 77, 91, 119, 139, 140–150, 194, 195, 240, 255–257, 289, 296, 346, 443, 447–458, 459–475, 476, 477, 483, 489–496, 502, 584, 664, 668 mode-locked laser 76, 240 modulation transfer function (MTF) 64–67 monochromatic 21, 29, 74, 92, 97, 129, 135, 231, 327, 408, 546, 557, 574, 614, 625, 633 monochromatic point source 88, 93, 372, 373, 505, 552, 557, 579, 584 MTF cutoff frequency 65, 66 MULTILAYER 2 multilayer dielectric mirror 198 multilayer stack 75, 84, 86, 197, 236, 238, 239, 275, 395, 396, 400, 518, 601, 678 multimode 471, 473–475, 584, 664 multimode interference (MMI) device 471–475 multimode fiber 584 multi-photon ionization 658 multiple reflection 108, 156, 158, 210, 218 multi-transverse-mode 502 nano-photonic 139 narrowband spectra 123 natural light 107 near field 52, 403, 544, 584 neutral density filter 201, 202 non-classical light 127 nonlinear absorption 658 nonlinear coefficient 248, 654, 656 nonlinear medium 254, 656–662, 668 nonlinear refractive index 246, 658, 664 nonuniform grid 601 nonuniform polarization 307 non-reciprocal 224–227, 230 normalized difference signal 639, 640 Nomarski, George 566, 574 Nomarski microscope 566, 569, 574, 575 broadband Nomarski microscope 574 null-corrector 627 nulling ellipsometer 633, 634, 637–639 nulling telescope 516 numerical algorithm 2, 459, 632 numerical aperture 11, 16, 33, 45, 46, 62, 66, 272, 315, 379, 447, 461, 480, 517, 526, 547, 551, 584–588, 602, 615 numerical error 6, 409, 414, 451, 537, 606, 608, 659, 663 numerical method 2, 447 object beam 643, 644 objective lens 42, 62, 66, 136, 161, 201, 297, 316, 364, 388, 397, 525, 534, 546, 551, 555, 
566, 569, 602, 680

object wave 644–652 oblique incidence 19, 20, 129, 131, 133, 156, 157, 162, 198, 199, 207, 216, 218, 219, 222, 328, 559, 611 odd mode 141, 142, 144, 148–150, 467, 469 off-diagonal element 154, 167, 171, 397, 400 offense against the sine condition 19 oil-immersion microscopy 535 oil-immersion objective 528–532 omni-directional dielectric mirror 274 omni-directional reflector 274, 277, 285, 288 optical activity 158, 197, 204, 235, 397, 554, 558, 559, 562, 632 optical axis 10–14, 21, 36, 46, 58, 97, 176, 201, 292, 295, 303, 379, 398, 494, 501, 534, 547, 556, 614, 618, 657 optical data storage 403, 544, 678 optical disk 138, 168, 224, 395, 397, 398, 525, 536, 678, 679, 689 optical disk drive 224, 397, 689 optical filter 208, 223, 239 optical head 685 optical path difference (OPD) 352, 357, 513 optical path-length 58, 79, 524, 548, 568 optical path-length difference 79, 548 optical power density 442, 666 optical recording 397, 678, 687 optical tweezers 289, 301 optical vortex 289, 291, 300, 309 optic axis of wave normals 406 optic-ray axis 406–408, 413, 415 orthoscopic 554, 556 oscillating dipole 211, 213, 216, 218, 420 Otto configuration 142 out-of-phase soliton pair 671 paraboloid 365, 627 parallax 314 parallel plate 232, 233, 534, 604 paramagnetic 154, 167 paraxial approximation 14, 16, 34, 52, 53, 59, 60, 307 paraxial focus 578, 579, 581, 582, 617, 618, 621, 622 paraxial ray 12, 578 paraxial ray-tracing 12 paraxial regime 13–17, 296, 578, 579 path-length difference 18, 74, 79, 93, 268, 269, 520, 524, 548, 624 parabolic mirror 364, 365, 448 partially coherent 78, 88, 127, 555 partial depolarization 107, 110, 156 partially coherent illumination 88 partial polarization 1, 100 penetration depth 83, 215 perfectly matched layer (PML) 600 period-averaged intensity 246 periodic boundary condition 461 periodic mask 373, 374 periodic stack 274, 275, 281 phase-amplitude information 644 phase-amplitude mask 373, 459, 549, 550 phase-amplitude modulation 17, 459

phase-amplitude object 31, 38, 645, 646 phase-change 683, 687 phase-change (PC) disk 687 phase-conjugate mirror (PCM) 228–232, 235, 236 phase-contrast filter 548, 549, 551 phase-contrast mask 548, 549 phase-contrast mechanism 70, 549 phase-contrast microscope 545, 546 phase discontinuity 295 phase-edge 589, 593 phase factor 18, 26, 30, 38, 41, 59, 96, 207, 243, 248, 252, 257, 366, 373, 380, 480, 657, 659, 666 phase gradient 566 phase mask 460, 462, 464, 467, 470, 473, 474, 492–495, 552, 589, 558, 665 phase object 65, 70–73, 546, 548–552, 566, 570–574 phase-shifter 549, 592 phase-shifting mask (PSM) 486, 488, 498 phase singularity 289, 291, 294, 296, 300, 309 phase velocity 149, 243 photodetector 78, 79, 113, 114, 126, 175, 176, 183, 194, 195, 398, 522, 638, 639 photodetector array 194, 195 photoelastic modulator 638 photographic plate 3, 62, 92, 264, 367, 571, 630, 643–652 photolithography 1, 586, 597, 598 photomask 595 photo-multiplication 126 photo-multiplier tube 114 photonic bandgap crystal 274 photonic bandgap structure 463 photonic crystal 288, 463 photon momentum 258 photon noise 517 picosecond range 240, 248, 252 pinhole 88, 93–98, 406, 413, 415, 416, 625 planar glass waveguide 664, 676 Planck’s constant 258, 302 plane of best focus 227, 228, 358, 360, 362, 580, 581 plane monochromatic beam 129, 198 plane wave 3, 17, 23, 25, 31, 35, 40, 46, 75, 100, 107, 131, 140, 159, 169, 180, 192, 211, 214, 220, 233, 238, 255, 263, 277, 290, 302, 311, 315, 325, 337, 341, 356, 368, 372, 397, 407, 418, 421, 423, 427, 432, 442, 582, 606, 608, 619, 626, 637, 644, 647, 650, 680 fully-polarized plane wave 102 inhomogeneous plane wave 140, 149, 386 monochromatic plane wave 100, 129–131, 133, 135, 210, 277, 316, 418, 582 plane-wave spectrum 46, 302 polychromatic plane wave 107, 108 spectrum of plane waves 369 superposition of plane waves 7, 23, 26, 31, 35, 45, 320 uniform plane wave 26, 27, 49, 137 plano-aspheric lens 483–485 plano-convex lens 10–12, 226, 227, 486–488,
527, 624, 625 plano-cylindrical lens 496, 498, 502, 503

plasmon excitation 128, 133, 135, 137, 138, 346, 349, 393, 533 plastic substrate 395, 534, 565, 609, 679 p–n junction 489 Poincaré, Henri 101, 106 Poincaré sphere 100, 105, 106 point source 30, 63, 64, 89, 93, 114, 121, 125, 370, 505, 510, 523, 547, 552, 559, 562, 570, 579, 583, 624–627 Poisson’s bright spot 28 polarizability 212, 655 polarization 1, 5, 25, 45, 48, 88, 100, 108, 130, 135, 140, 149, 154, 159, 164, 173, 179, 197, 202, 208, 217, 224, 230, 250, 263, 281, 302, 307, 329, 345, 360, 385, 390, 397, 400, 407, 410, 415, 418, 427, 442, 495, 502, 516, 525, 538, 554, 557, 564, 570, 605, 632, 640, 656, 660, 679 polarization ellipticity (see also ellipticity) 7, 161, 168, 173, 203, 389, 390, 392, 394, 399, 413–416, 557, 562, 679 polarization microscope 554, 555, 560, 561 polarization rotation 155, 161, 164, 168, 171, 173, 179, 200, 203–206, 556–558, 564 polarization rotation angle 6, 7, 108, 110, 157, 161, 163, 169, 173, 203, 205, 389, 390–394, 399, 413–516, 557, 562 polarization state 5, 7, 45, 48, 108, 140, 155, 159, 172, 202, 225, 230, 261, 264, 268, 274, 281, 305, 324, 361, 393, 410, 491, 525, 533, 554, 559, 638 polarization vector 46, 48, 108, 136, 166, 201, 224, 409, 410, 415, 527, 540, 556, 564, 566, 656 polarizer 100, 103, 108, 199, 200, 225, 554, 555, 569, 633, 638, 640 polarizing beam splitter (PBS) 224, 225, 230, 567 polychromatic 100–110 polychromatic beam 100, 102, 103, 110 population inversion 489, 494 power attenuation coefficient 449–458 power content 53, 361, 448, 461–473, 494, 495, 604, 656, 671, 673 power spectral density 113, 115, 124 Poynting vector 2, 129, 144, 148, 205, 291, 301–309, 381, 388, 431, 441, 445, 477, 584, 680, 682 primary astigmatism 617 primary coma 19, 452, 457, 619, 620 primary mirror 516, 521, 627 principal axes 406, 407 principle of conservation of energy 232, 235 principle of superposition 81 principal refractive index 406, 408, 563, 564 principal plane(s) 10–21, 33–39 prism 47, 81, 142, 150, 175, 231, 240, 248,
337, 351, 361, 379, 387, 397, 499, 555, 560, 566, 569, 574, 638, 659 prism-coupling 142 prism pair 240, 499, 500–502 propagating mode 460 propagating order 326 propagation through nonlinear medium 660–662 pulse broadening 245

pulse compression 240, 241, 246, 254, 257 pulse train 76, 241, 242 entrance pupil 33–41, 45, 46, 65, 264, 297, 307, 406, 414, 416, 461, 480, 499, 526, 545, 552, 554, 555, 570, 573, 584, 602, 604, 628, 680 exit pupil 14, 32, 36, 40, 47, 66, 71, 135, 160, 172–179, 315, 328, 345, 363, 382, 388, 400, 413, 495, 500, 531, 540, 551, 562, 565, 579, 584, 614–619 push–pull method 532, 537 quadratic phase factor 38, 40, 41, 59, 243–252 quadrilayer 178, 180, 235, 237, 396, 401, 402, 678–689 quadrilayer stack 178, 235, 402, 679, 680, 685, 687 quadrupole 425, 427, 429, 430 quality factor 204 quantum nature of light 127, 259 quantum optics 127 quantum well 492 quarter-wave plate 100, 105, 175, 176, 201, 202, 224, 225, 229, 633, 637 quarter-wave stack 83, 204, 276, 278 quartz 82, 83, 87, 567, 568, 642 quasi-monochromatic 29, 64, 88–96, 100, 113, 372, 505, 545, 547, 552, 555, 569, 574, 586, 615, 622 quasi-monochromatic light source 545, 569, 622 quasi-monochromatic point source 88, 93, 372, 373, 505, 552 radiated field 209, 212–214, 555 radiation pressure 301, 423 radiative mode 462, 464, 466 random phase 92, 116, 118 rare-earth iron garnet 154 ray-bending 50, 380 ray ellipsoid 407, 408 ray distribution 307 ray-tracing 12, 301, 351, 477–479, 584 Rayleigh, Lord 23, 323–325, 378 Rayleigh anomaly 331–347 Rayleigh criterion 298 Rayleigh range 58, 59, 259, 261, 291–296, 319, 320, 528, 656 readout signal 401 real image 29, 579, 580, 614, 626, 645–653 reciprocity 2, 224–238, 275, 277, 333–340, 348 reconstructed wavefront 361, 645–653 reconstruction beam 645–652 reference beam 199–202, 297–299, 411, 545, 619, 620, 624–627, 630, 643–653 reference mirror 200, 201, 297 reflecting telescope 511 reflection coefficient 83, 129, 141, 167, 197, 205, 216, 221, 233, 251, 256, 275, 379, 385, 391, 400, 557, 632 reflection loss 355, 476 reflective diffractive optical element (DOE) 356 refraction 46, 48, 107, 180, 344, 415, 541, 556, 557 refractive index profile 459, 546, 664

residual phase 227, 228, 359, 364, 580 resist threshold 590 resolution 9, 37, 41, 50, 62, 65, 67, 68, 176, 270, 297, 298, 515–517, 525, 528, 531, 535, 551, 552, 559, 560, 570, 576, 577, 581, 584, 586, 588, 592, 601, 602, 637, 649 resolution of imaging system 41, 65 resolvability 258, 261, 272, 273 resolving power 336 resonance 144, 161, 204, 207, 208, 252, 338 resonant absorption 128, 131, 132, 144, 393 resonant behavior 346 resonant cavity 452 resonator 159–161, 251–253, 270, 351, 447–458 rest frame 316–321 retardation 103, 105, 252, 570, 637–640 retarder 102–106, 638–640 reverse-contrast 560, 564 right circularly polarized (RCP) 5, 154, 167, 204, 225, 229, 264, 393, 410–416 ring interferometer 182 ring laser 194, 195 r.m.s. wavefront aberration 228, 496, 498, 503 Ronchigram 617–620 Ronchi ruling 614, 615 Ronchi test 614–623 rotating frame 182, 188 running fringes 190, 191, 195 Sagnac, George 182, 189 Sagnac interferometer 182–186, 190, 194, 195 Sagnac loop 184–188, 194 scalar theory 45, 325, 329, 531, 533 scanning optical microscope 525, 526 Schwartz inequality 105, 111 secondary mirror 517 second-harmonic 241 second-order coherence 113, 114, 116, 119, 127 Seidel aberration(s) 5, 617 Seidel astigmatism 482, 483 Seidel curvature 480–487 self-focusing 654–663, 665–670 self-focusing collapse 655, 658 self-imaging 367, 378, 475 self-induced phase shift 657 self-phase modulation 240, 241, 257 self-trapping 654, 655, 658, 663, 664, 666, 676 semiconductor junction 489 semiconductor laser diode 461, 489, 490, 678 Shack cube 625, 627 Shack–Hartmann wavefront sensor 624, 627, 628 shearing interferometry 494 shear plate 494–497 shifter-shutter mask 589, 595 short-range or lossy mode 149 shot noise 126, 522 side-rigger 590–594 signal-to-noise ratio 298, 517, 638 silica glass fiber 246, 460, 462, 476, 477, 502 single-mode beam 475

single-mode fiber 463, 476, 477, 483, 485, 486, 503 single-transverse-mode 494, 496, 502 Sirius 121 skin depth 130, 420 slab waveguide 243, 255, 665–672 slow axis 496, 498 Snell’s law 217, 218, 232, 343, 344, 355, 380, 534, 535, 567, 577, 604 solid immersion lens (SIL) 388, 395, 535, 602 soliton theory 664 spatial incoherence 64, 547, 570 spatial filter 546–549 spatial frequency 64, 150, 320, 382, 649, 652 spatial frequency-content 382, 649, 652 spatially coherent 88, 545, 555 spatially incoherent 92, 94, 506, 545, 555, 559, 560, 569–573, 586 spatially incoherent point source 571, 573 spatial optical soliton 664, 676 spatial resolution 176, 637 special theory of relativity 84, 101, 310 spectral bandwidth 74, 81, 151, 156, 351 spectral broadening 85, 86, 248 spectral filtering 119 spectral width 76, 252 specular reflection 331 spherical aberration spherical cap 35, 39, 624–627 spherical wavefront 17, 18, 370–374, 658 split detector 176, 398, 531, 532, 537 split-step beam propagation method 658 split-step technique 459, 460 splitter 78, 80, 126, 182, 187, 195, 224, 231, 268, 297, 464, 471, 516, 525, 546, 564, 569, 619, 624, 625, 644, 645 s-polarization 47, 107, 200, 277, 286, 329, 633 spot size 525–543 square-shaped aperture 444, 470, 590, 594 stable self-trapping 664, 676 standing-wave 187, 188, 205, 422, 429 state of polarization 5, 25, 102, 106, 176, 226, 230, 261, 305, 389, 407–417, 526, 554, 556, 564, 634 stationary 75, 78, 116 stationary-phase approximation 23, 33, 42, 45, 46, 48 stationary point 30, 34, 43, 44 stationary process 75, 78 stellar aberration 310, 313 Stokes, Sir George Gabriel 101, 104, 112, 235 Stokes’ parameters 100, 104–110 storage layer 534, 537, 679, 680, 686 straight-line fringes 625, 626 subwavelength 139, 420, 423, 599, 602 subwavelength structure 599, 602 successive iteration 448 sum signal 177, 532, 537, 538, 638, 639 superposition 23, 31, 48, 67, 75, 81, 94, 107, 124, 136, 155, 233, 261, 296, 317, 320, 325, 370, 411, 413, 427,
494, 606–608 superposition integral 48

super solid immersion lens (super SIL) 539–542 surface charge 149, 435 surface current 419–442 surface plasmon 1, 128–138, 139, 143, 149, 331, 346, 349, 386, 388, 533 surface plasmon excitation 128, 133, 135, 137, 138, 346, 349, 533 surface plasmon polariton (SPP) 138, 139, 149 surface relief feature 325 surface relief structure 367, 546 switching 467, 674, 679, 686, 690 Talbot effect 1, 367, 370, 375, 473 telescope 2, 9, 50, 310, 314, 511–523, 627, 630 temperature distribution 682 temperature gradient 682 temporal soliton 664, 665 TEMPROFILE 2 test beam 624–630 thermal conductivity 680, 684 thermal diffusion 663, 680 thermal source 91, 119 thermomagnetic recording 687 thin-film optics 209, 221 thin-film stack 236, 602, 632, 678 thin magnetic film 178, 679 thin magnetic layer 395 third-order nonlinearity 654 time average 75, 77, 78, 104, 116, 301, 302 time-averaged intensity 77 total E-field intensity 307, 681 total intensity distribution 528, 529, 542 total internal reflection (TIR) 2, 103, 129, 133, 142, 143, 230, 255, 343, 344, 379, 387, 388, 489, 537, 674 TIR mirror 230, 231 TIR prism 231, 380 transcendental equation 141 transfer function of propagation 53, 60 transform-limited 246 transmission axis 201, 225, 226, 555–557, 570, 633, 634 transmission coefficient 83, 183, 197, 202, 205, 210, 214, 216, 231–239, 263, 266, 270, 275, 277, 279, 280, 355, 461, 537, 632, 646 transmission efficiency 432, 441, 442, 613 transmission function 59, 70, 373 transmissive DOE 356, 357, 358, 361, 363 transmitted order 233, 333, 341–346 transparent hemisphere 535 transverse effect 162, 164, 170–172 transverse electric (TE) 140, 149, 326 transverse Faraday effect 164

transverse magnetic (TM) 140, 141, 144, 149, 326, 328 transverse magnification 12, 14, 16, 18, 21 triangulation 512 truncated Bessel beam 32 truncated Gaussian 76, 90, 526 truncated prism 146–148 tungsten lamp 546 Twiss, Richard Q. 114, 121, 127, 513 Twyman–Green interferometry 199 uncertainty principle 258, 261 uniaxial birefringent crystal 407, 567 unpolarized 35, 100, 103, 106, 107, 410, 415, 502, 637 up-chirp 245, 247 van Cittert–Zernike theorem 88, 94, 96 van Leeuwenhoek, Antoni 576, 577, 579, 585 van Leeuwenhoek microscope 576, 581, 582 variable retarder 102, 103, 105, 638–640 Verdet constant 154 virtual image 517, 580, 582, 583, 624, 645–651 vortex structure 309 waist 55–60, 289–296, 302–306, 319, 320, 490–497, 655, 656 wavefront curvature 291, 294, 359, 480, 483, 488, 655, 664 wavefront cylinder 483 wavefront tilt 380 waveguide 141, 243, 255, 274, 296, 346, 443, 447, 459, 464, 469, 474, 655, 664–677 waveguide mode 141, 349 wavelength discrimination 270 wave optics 16, 258 wave packet 74–86, 91, 92 wave-plate 105, 106, 108, 110 wide-aperture system 17, 580 white light 74, 554, 582, 602, 627, 643, 653 wire test 620–623 Wollaston 175, 397, 555, 560, 566–568 Wollaston prism 175, 397, 555, 558, 560, 566–574, 638–640 Wood, R.W. 138, 165, 181, 208, 323, 350, 511, 514 Wood’s anomaly 324 Y-branch beam splitter 464, 466 Young’s interference fringes 88 Zernike, Frederick 88, 94, 99, 545, 547, 552 zodiacal light 522 zeroth-order 66, 328–341, 616