Adaptive Antennas for Wireless Communications


Author: George V. Tsoulos



IEEE Press
445 Hoes Lane, P.O. Box 1331
Piscataway, NJ 08855-1331

IEEE Press Editorial Board
Robert J. Herrick, Editor in Chief
M. Akay, J. B. Anderson, P. M. Anderson, J. E. Brewer, M. Eden, M. E. El-Hawary, R. F. Hoyt, S. V. Kartalopoulos, D. Kirk, M. S. Newman, M. Padgett, W. D. Reeve, G. Zobrist

Kenneth Moore, Director of IEEE Press
Catherine Faduska, Senior Acquisitions Editor
Linda Matarazzo, Associate Acquisitions Editor
Mark Morrell, Associate Production Editor

IEEE Antennas & Propagation Society, Sponsor
AP-S Liaison to IEEE Press, Robert Mailloux

Cover design: William T. Donnelly, WT Design

Technical Reviewers
Hans Steyskal, Rome Laboratory/ERA, Hanscom AFB, MA
Robert Mailloux, Rome Laboratory/ERI, Hanscom AFB, MA

Books of Related Interest from the IEEE Press

PERSPECTIVES IN CONTROL ENGINEERING: Technologies, Applications, and New Directions
Edited by Tariq Samad
2001, Hardcover, 536 pp, IEEE Order No. PC5798, ISBN 0-7803-5356-0

PHYSIOLOGICAL CONTROL SYSTEMS: Analysis, Simulation, and Estimation
Michael C. K. Khoo
2000, Hardcover, 344 pp, IEEE Order No. PC5680, ISBN 0-7803-3408-6

THE CONTROL HANDBOOK
Edited by William S. Levine
A CRC Handbook published in cooperation with IEEE Press
1996, Hardcover, 1566 pp, IEEE Order No. PC5649, ISBN 0-8493-8570-9

ROBUST VISION FOR VISION-BASED CONTROL MOTION
Edited by Gregory D. Hager and Markus Vincze
2000, Hardcover, 264 pp, IEEE Order No. PC5403, ISBN 0-7803-5378-1

ADAPTIVE ANTENNAS FOR WIRELESS COMMUNICATIONS

Edited by

George V. Tsoulos PA Consulting Group Cambridge, U.K.

IEEE Antennas & Propagation Society, Sponsor

A Selected Reprint Volume

IEEE PRESS

The Institute of Electrical and Electronics Engineers, Inc., New York

This book and other books may be purchased at a discount from the publisher when ordered in bulk quantities. Contact: IEEE Press Marketing Attn: Special Sales 445 Hoes Lane, P.O. Box 1331 Piscataway, NJ 08855-1331 Fax: +1 732 981 9334 For more information about IEEE Press products, visit the IEEE Online Catalog & Store at http://www.ieee.org/store.

© 2001 by the Institute of Electrical and Electronics Engineers, Inc.

3 Park Avenue, 17th Floor, New York, NY 10016-5997.

All rights reserved. No part of this book may be reproduced in any form, nor may it be stored in a retrieval system or transmitted in any form, without written permission from the publisher. Printed in the United States of America.

10 9 8 7 6 5 4 3 2 1

ISBN 0-7803-6016-8 IEEE Order No. PP5866

Library of Congress Cataloging-in-Publication Data

Adaptive antennas for wireless communications / edited by George V. Tsoulos.
p. cm.
Includes bibliographical references and index.
ISBN 0-7803-6016-8
1. Adaptive antennas. 2. Wireless communication systems--Equipment and supplies. I. Tsoulos, George V., 1968-
TK7871.67.A33 A33 2000
621.382'4--dc21
00-040990

Contents

Preface

Chapter 1: Introduction and Channel Models

Adaptive Antenna Systems. B. Widrow, P. E. Mantey, L. J. Griffiths, and B. B. Goode (IEEE Proceedings, December 1967).
Overview of Spatial Channel Models for Antenna Array Communication Systems. R. B. Ertel, P. Cardieri, K. W. Sowerby, T. S. Rappaport, and J. H. Reed (IEEE Personal Communications Magazine, February 1998).
Antenna Systems for Base Station Diversity in Urban Small and Micro Cells. P. C. F. Eggers, J. Toftgard, and A. M. Oprea (Journal on Selected Areas of Communication, September 1993).
A Statistical Model for Angle of Arrival in Indoor Multipath Propagation. Q. Spencer, M. Rice, B. Jeffs, and M. Jensen (IEEE Vehicular Technology Conference, May 1997).

Chapter 2: Adaptive Algorithms

Highlights of Statistical Signal and Array Processing. A. Hero (IEEE Signal Processing Magazine, September 1998).
Application of Antenna Arrays to Mobile Communications, Part II: Beamforming and Direction-of-Arrival Considerations. L. C. Godara (Proceedings of the IEEE, August 1997).
High-Resolution Frequency-Wavenumber Spectrum Analysis. J. Capon (Proceedings of the IEEE, August 1969).
An Algorithm for Linearly Constrained Adaptive Array Processing. O. L. Frost III (Proceedings of the IEEE, August 1972).
The Application of Spectral Estimation Methods to Bearing Estimation Problems. D. H. Johnson (Proceedings of the IEEE, September 1982).
On Spatial Smoothing for Direction-of-Arrival Estimation of Coherent Signals. T-J. Shan, M. Wax, and T. Kailath (IEEE Transactions on Acoustics, Speech, and Signal Processing, August 1985).
Detection of Signals by Information Theoretic Criteria. M. Wax and T. Kailath (IEEE Transactions on Acoustics, Speech, and Signal Processing, April 1985).
Multiple Emitter Location and Signal Parameter Estimation. R. O. Schmidt (IEEE Transactions on Antennas and Propagation, March 1986).
Using Spectral Estimation Techniques in Adaptive Processing Antenna Systems. W. F. Gabriel (IEEE Transactions on Antennas and Propagation, March 1986).
Implementation of Adaptive Array Algorithms. R. Schreiber (IEEE Transactions on Acoustics, Speech, and Signal Processing, October 1986).
Steady State Analysis of the Generalized Sidelobe Canceller by Adaptive Noise Cancelling Techniques. N. K. Jablon (IEEE Transactions on Antennas and Propagation, March 1986).
Adaptive-Adaptive Array Processing. E. Brookner and J. M. Howell (Proceedings of the IEEE, April 1986).
ESPRIT-Estimation of Signal Parameters via Rotational Invariance Techniques. R. Roy and T. Kailath (IEEE Transactions on Acoustics, Speech, and Signal Processing, July 1989).
Spectral Self-Coherence Restoral: A New Approach to Blind Adaptive Signal Extraction Using Antenna Arrays. B. G. Agee, S. V. Schell, and W. A. Gardner (Proceedings of the IEEE, April 1990).
Sensor Array Processing Based on Subspace Fitting. M. Viberg and B. Ottersten (IEEE Transactions on Signal Processing, May 1991).
Direction-of-Arrival Estimation via Exploitation of Cyclostationarity-A Combination of Temporal and Spatial Processing. G. Xu and T. Kailath (IEEE Transactions on Signal Processing, July 1992).
Space-Alternating Generalized Expectation Maximization Algorithm. J. A. Fessler and A. O. Hero (IEEE Transactions on Signal Processing, October 1994).
Unitary ESPRIT: How to Obtain Increased Estimation Accuracy with a Reduced Computational Burden. M. Haardt and J. A. Nossek (IEEE Transactions on Signal Processing, May 1995).
Joint Angle and Delay Estimation (JADE) for Multipath Signals Arriving at an Antenna Array. M. C. Vanderveen, C. B. Papadias, and A. Paulraj (IEEE Communications Letters, January 1997).

Chapter 3: Performance Issues

Smart Antennas for Mobile Communication Systems: Benefits and Challenges. G. V. Tsoulos (Electronics & Communication Engineering Journal, April 1999).
An Adaptive Array in a Spread-Spectrum Communication System. R. T. Compton, Jr. (Proceedings of the IEEE, March 1978).
On the Performance of a Polarization Sensitive Adaptive Array. R. T. Compton, Jr. (IEEE Transactions on Antennas and Propagation, September 1981).
Effect of Mutual Coupling on the Performance of Adaptive Arrays. I. J. Gupta and A. A. Ksienski (IEEE Transactions on Antennas and Propagation, September 1983).
Optimum Combining in Digital Mobile Radio with Co-channel Interference. J. H. Winters (IEEE Transactions on Vehicular Technology, August 1984).
On Optimum Combining at the Mobile. R. G. Vaughan (IEEE Transactions on Vehicular Technology, November 1988).
The Performance of an LMS Adaptive Array with Frequency Hopped Signals. L. Acar and R. T. Compton, Jr. (IEEE Transactions on Aerospace and Electronic Systems, May 1985).
An LMS Adaptive Array for Multipath Fading Reduction. Y. Ogawa, M. Ohmiya, and K. Itoh (IEEE Transactions on Aerospace and Electronic Systems, January 1987).
Optimum Combining for Indoor Radio Systems with Multiple Users. J. H. Winters (IEEE Transactions on Communications, November 1987).
The Performance Enhancement of Multibeam Adaptive Base-Station Antennas for Cellular Land Mobile Radio Systems. S. C. Swales, M. A. Beach, D. J. Edwards, and J. P. McGeehan (IEEE Transactions on Vehicular Technology, February 1990).
Combination of an Adaptive Array Antenna and a Canceller of Interference for Direct-Sequence Spread-Spectrum Multiple-Access System. R. Kohno, H. Imai, M. Hatori, and S. Pasupathy (IEEE Journal on Selected Areas in Communications, May 1990).
Direction Finding in the Presence of Mutual Coupling. B. Friedlander and A. J. Weiss (IEEE Transactions on Antennas and Propagation, March 1991).
Improving the Performance of a Slotted ALOHA Packet Radio Network with an Adaptive Array. J. Ward and R. T. Compton, Jr. (IEEE Transactions on Communications, February 1992).
Signal Acquisition and Tracking with Adaptive Arrays in the Digital Mobile Radio System IS-54 with Flat Fading. J. H. Winters (IEEE Transactions on Vehicular Technology, November 1993).
Effect of Fading Correlation on Adaptive Arrays in Digital Mobile Radio. J. Salz and J. H. Winters (IEEE Transactions on Vehicular Technology, November 1994).
Capacity Improvement with Base-Station Antenna Arrays in Cellular CDMA. A. F. Naguib, A. Paulraj, and T. Kailath (IEEE Transactions on Vehicular Technology, August 1994).
Analytical Results for Capacity Improvements in CDMA. J. C. Liberti and T. S. Rappaport (IEEE Transactions on Vehicular Technology, August 1994).
Adaptive Transmitting Antenna Arrays with Feedback. D. Gerlach and A. Paulraj (IEEE Signal Processing Letters, October 1994).
Adaptive Antennas for Third Generation DS-CDMA Cellular Systems. G. V. Tsoulos, M. A. Beach, and S. C. Swales (Proceedings of 45th Vehicular Technology Conference, July 1995).
The Spectrum Efficiency of Base Station Antenna Array System for Spatially Selective Transmission. P. Zetterberg and B. Ottersten (IEEE Transactions on Vehicular Technology, August 1995).
Capacity Enhancement and BER in a Combined SDMA/TDMA System. J. Fuhl and A. F. Molisch (Proceedings of the 46th Vehicular Technology Conference, April 1996).
Performance of Wireless CDMA with M-ary Orthogonal Modulation and Cell Site Antenna Arrays. A. F. Naguib and A. Paulraj (IEEE Journal of Selected Areas in Communications, December 1996).
Smart Antenna Arrays for CDMA Systems. J. S. Thompson, P. M. Grant, and B. Mulgrew (IEEE Personal Communications Magazine, October 1996).
Efficient Direction and Polarization Estimation with a COLD Array. J. Li, P. Stoica, and D. Zheng (IEEE Transactions on Antennas and Propagation, April 1996).
Upper Bounds on the Bit-Error Rate of Optimum Combining in Wireless Systems. J. H. Winters and J. Salz (IEEE Transactions on Communications, December 1998).
The Range Increase of Adaptive Versus Phased Arrays in Mobile Radio Systems. J. H. Winters and M. J. Gans (IEEE Transactions on Vehicular Technology, March 1999).
A Comparison of Two Systems for Downlink Communication with Base Station Antenna Arrays. P. Zetterberg (IEEE Transactions on Vehicular Technology, September 1999).

Chapter 4: Implementation Issues

Fundamentals of Digital Array Processing. D. E. Dudgeon (Proceedings of the IEEE, June 1977).
A Novel Algorithm and Architecture for Adaptive Digital Beamforming. C. P. Ward, P. J. Hargrave, and J. G. McWhirter (IEEE Transactions on Antennas and Propagation, March 1986).
Nonlinearities in Digital Manifold Phased Arrays. B. D. Mathews (IEEE Transactions on Antennas and Propagation, November 1986).
Adaptive Beamforming with the Generalized Sidelobe Canceller in the Presence of Array Imperfections. N. K. Jablon (IEEE Transactions on Antennas and Propagation, August 1986).
An Efficient Algorithm and Systolic Architecture for Multiple Channel Adaptive Filtering. S. M. Yuen, K. Abend, and R. S. Berkowitz (IEEE Transactions on Antennas and Propagation, May 1988).
Mutual Coupling Compensation in Small Array Antennas. H. Steyskal and J. S. Herd (IEEE Transactions on Antennas and Propagation, December 1995).
A Unified Approach to the Design of Robust Narrow-Band Antenna Array Processors. M-H. Er and A. Cantoni (IEEE Transactions on Antennas and Propagation, January 1990).
Design Trades for Rotman Lenses. R. C. Hansen (IEEE Transactions on Antennas and Propagation, April 1991).
Optimum Networks for Simultaneous Multiple Beam Antennas. E. C. DuFort (IEEE Transactions on Antennas and Propagation, January 1992).
Direction Finding in Phased Arrays with a Neural Network Beamformer. H. L. Southall, J. A. Simmers, and T. H. O'Donnell (IEEE Transactions on Antennas and Propagation, December 1995).
Application of Orthogonal Codes to the Calibration of Active Phased Array Antennas for Communication Satellites. S. D. Silverstein (IEEE Transactions on Signal Processing, January 1997).
The Analogy Between the Butler Matrix and the Neural-Network Direction-Finding Array. R. J. Mailloux and H. L. Southall (IEEE Antennas and Propagation Magazine, December 1997).
Forward-Backward Averaging in the Presence of Array Manifold Errors. M. Zatman and D. Marshall (IEEE Transactions on Antennas and Propagation, November 1998).

Chapter 5: Experiments

Multiple Source DF Signal Processing: An Experimental System. R. O. Schmidt and R. E. Franks (IEEE Transactions on Antennas and Propagation, March 1986).
An Implementation of a CMA Adaptive Array for High Speed GMSK Transmission in Mobile Communications. T. Ohgane, T. Shimura, N. Matsuzawa, and H. Sasaoka (IEEE Transactions on Vehicular Technology, August 1993).
A Four-Element Adaptive Antenna Array for IS-136 PCS Base Stations. R. L. Cupo, G. D. Golden, C. C. Martin, K. L. Sherman, N. R. Sollenberger, J. H. Winters, and P. W. Wolniansky (IEEE 46th Vehicular Technology Conference, May 1996).
Ericsson/Mannesmann GSM Field-Trials with Adaptive Antennas. S. Anderson, U. Forssen, J. Karlsson, T. Witzschel, P. Fischer, and A. Krug (IEEE 46th Vehicular Technology Conference, May 1996).
Preliminary Measurement Results from an Adaptive Antenna Array Testbed for GSM/UMTS. P. E. Mogensen, K. I. Pedersen, P. Leth-Espensen, B. Fleury, F. Frederiksen, K. Olesen, and S. L. Larsen (IEEE Vehicular Technology Conference, May 1997).
Performance Evaluation of a Cellular Base Station Multibeam Antenna. Y. Li, M. Feuerstein, and D. O. Reudink (IEEE Transactions on Vehicular Technology, February 1997).
Space Division Multiple Access (SDMA) Field Trials. Part 1: Tracking and BER Performance. G. V. Tsoulos, J. McGeehan, and M. Beach (IEE Proceedings of Radar, Sonar, and Navigation, February 1998).
Space Division Multiple Access (SDMA) Field Trials. Part 2: Calibration and Linearity Issues. G. V. Tsoulos, J. McGeehan, and M. Beach (IEE Proceedings of Radar, Sonar, and Navigation, February 1998).

Chapter 6: Applications and Planning Issues

High Data Rate Indoor Wireless Communications Using Antenna Arrays. M. J. Gans, R. A. Valenzuela, J. H. Winters, and M. J. Carloni (6th International Symposium on Personal, Indoor and Mobile Radio Communications, September 1995).
On Optimizing Base Station Antenna Array Topology for Coverage Extension in Cellular Radio Networks. J-W. Liang and A. J. Paulraj (IEEE 45th Vehicular Technology Conference, July 1995).
Usage of Adaptive Arrays to Solve Resource Planning Problems. M. Frullone, P. Grazioso, C. Passerini, and G. Riva (Proceedings of the 46th Vehicular Technology Conference, April 1996).
Subscriber Location in CDMA Cellular Networks. J. Caffery, Jr. and G. L. Stuber (IEEE Transactions on Vehicular Technology, May 1998).
On the Capacity Formula for Multiple Input-Multiple Output Wireless Channels: A Geometric Interpretation. P. F. Driessen and G. J. Foschini (IEEE Transactions on Communications, February 1999).
Optimum Space-Time Processors with Dispersive Interference: Unified Analysis and Required Filter Span. S. L. Ariyavisitakul, J. H. Winters, and I. Lee (IEEE Transactions on Communications, July 1999).

Author Index

Subject Index

Preface

Over the last few years, the demand for service provision via the wireless communication bearer has risen beyond all expectations. This fact introduces one of the most demanding technological challenges: the need to increase the spectrum efficiency of wireless networks. Whereas great effort until today has been focused toward the development of modulation methods, coding techniques, communication protocols, and so forth, the antenna-related technology has received significantly less attention up to now. Nevertheless, in order to achieve the ambitious requirements introduced for future wireless systems, new "intelligent" or "self-configured" and highly efficient systems will most certainly be required. In the pursuit of schemes that will solve these problems, attention has recently turned to spatial filtering methods using advanced antenna techniques: adaptive or "smart" antennas. Filtering in the space domain can separate spectrally and temporally overlapping signals from multiple mobile units, and hence the spatial dimension can be exploited as a hybrid multiple access technique complementing the basic underlying multiple access technique.

Adaptive antennas have been studied for many years by the sonar and radar research communities as interference-resistant aids (the first known case of an adaptive antenna dates back to 1959: L. C. Van Atta, Electromagnetic Reflection, U.S. Patent 2,908,002, October 6, 1959), and their main application until recently has been military. Advances in processor cost and speed have only recently made it possible to overcome the major obstacle of hardware cost and complexity and start considering the possibility of applying this technique to commercial communications.

This book targets a very wide audience. It can be used as a reference source (e.g., in conjunction with other texts on signal processing, antennas, mobile communications) for students at the undergraduate and/or postgraduate level, academics, researchers, professionals, and managers who either are specifically interested or want to understand general aspects of this technology. In order to achieve these goals, a large number of published works on a variety of issues related to adaptive antennas have been gathered. The papers included in this volume, along with the cited references, constitute a very detailed source of information dealing with almost all the important issues related to this technology. The key areas are separated into six chapters as follows:

Chapter 1: One introductory paper that provides important background information on adaptive antennas, followed by three papers with the channel models necessary for simulations and material dealing with the spatial characteristics of the radio channel for different operational environments.

Chapter 2: Nineteen papers with the most representative, widely used and researched adaptive methods and algorithms such as MUSIC, ESPRIT, and SAGE.

Chapter 3: Twenty-seven papers dealing with the issue that has attracted most of the attention in terms of research up to now, the performance of adaptive antennas with different adaptive methods and algorithms under a variety of conditions in mobile communication environments.

Chapter 4: Thirteen papers dealing with implementation issues for adaptive antennas: beamforming techniques, calibration, mutual coupling, nonlinearity problems, and so forth.

Chapter 5: Eight papers presenting experimental results for issues related to this technology, mainly from adaptive antenna test beds.

Chapter 6: Six papers that deal with more general issues related to adaptive antennas such as specific applications for user location, indoor wireless high data rate networks, planning issues for adaptive antennas, and novel techniques that seem promising to open new directions for this technology in the future.

The work included in the different chapters is sorted chronologically except for papers that present overviews or comparisons for the issues of focus in a chapter. The latter are always at the beginning of the chapter.

I sincerely hope that you find this source of reference useful. If it manages to stimulate you and as a result opens new horizons for you in this very exciting and promising area, then it will have succeeded in its purpose.

George V. Tsoulos

Chapter One

Introduction and Channel Models

Adaptive antenna arrays have long been an attractive solution to a plethora of problems related to signal detection and estimation. An array of antenna elements can overcome the directivity and beamwidth limitations of a single antenna element, and when it is combined with methods from statistical detection and estimation and control theory, a self-adjusting or adaptive system emerges. This key capability was recognized in 1967 by Widrow and his colleagues in their publication in the IEEE Proceedings, with which this book opens. This paper offers a valuable introduction to the adaptive antenna concepts.

A smart antenna system relies heavily on the spatial characteristics of the operational environment to improve the output signal. In order to study the performance of adaptive algorithms in radio operational environments (Chapters 2 and 3), it is essential to employ suitable channel models that provide both spatial and temporal information. For that reason, three papers are included in this chapter. There is still a lot of work to be done in terms of characterizing the radio channel and producing propagation models capable of providing all the information needed to efficiently study wideband systems that also exploit the spatial dimension. This need was recently underlined by the international standardization organisations, and several research activities are already under way (e.g., the subgroup on spatial propagation models of COST, the European Union Forum for Cooperative Scientific Research, Action 259).

Adaptive Antenna Systems

B. WIDROW, MEMBER, IEEE, P. E. MANTEY, MEMBER, IEEE, L. J. GRIFFITHS, STUDENT MEMBER, IEEE, AND B. B. GOODE, STUDENT MEMBER, IEEE

Abstract: A system consisting of an antenna array and an adaptive processor can perform filtering in both the space and the frequency domains, thus reducing the sensitivity of the signal-receiving system to interfering directional noise sources. Variable weights of a signal processor can be automatically adjusted by a simple adaptive technique based on the least-mean-squares (LMS) algorithm. During the adaptive process an injected pilot signal simulates a received signal from a desired "look" direction. This allows the array to be "trained" so that its directivity pattern has a main lobe in the previously specified look direction. At the same time, the array processing system can reject any incident noises, whose directions of propagation are different from the desired look direction, by forming appropriate nulls in the antenna directivity pattern. The array adapts itself to form a main lobe, with its direction and bandwidth determined by the pilot signal, and to reject signals or noises occurring outside the main lobe as well as possible in the minimum mean-square error sense. Several examples illustrate the convergence of the LMS adaptation procedure toward the corresponding Wiener optimum solutions. Rates of adaptation and misadjustments of the solutions are predicted theoretically and checked experimentally. Substantial reductions in noise reception are demonstrated in computer-simulated experiments. The techniques described are applicable to signal-receiving arrays for use over a wide range of frequencies.

INTRODUCTION

The sensitivity of a signal-receiving array to interfering noise sources can be reduced by suitable processing of the outputs of the individual array elements. The combination of array and processing acts as a filter in both space and frequency. This paper describes a method of applying the techniques of adaptive filtering[1] to the design of a receiving antenna system which can extract directional signals from the medium with minimum distortion due to noise. This system will be called an adaptive array. The adaptation process is based on minimization of mean-square error by the LMS algorithm.[2]-[4] The system operates with knowledge of the direction of arrival and spectrum of the signal, but with no knowledge of the noise field. The adaptive array promises to be useful whenever there is interference that possesses some degree of spatial correlation; such conditions manifest themselves over the entire spectrum, from seismic to radar frequencies.

Manuscript received May 29, 1967; revised September 5, 1967. B. Widrow and L. J. Griffiths are with the Department of Electrical Engineering, Stanford University, Stanford, Calif. P. E. Mantey was formerly with the Department of Electrical Engineering, Stanford University. He is presently with the Control and Dynamical Systems Group, IBM Research Laboratories, San Jose, Calif. B. B. Goode is with the Department of Electrical Engineering, Stanford University, Stanford, Calif., and the Navy Electronics Laboratory, San Diego, Calif.

Reprinted from IEEE Proceedings, vol. 55, no. 12, pp. 2143-2159, December 1967.

The term "adaptive antenna" has previously been used by Van Atta[5] and others[6] to describe a self-phasing antenna system which reradiates a signal in the direction from which it was received. This type of system is called adaptive because it performs without any prior knowledge of the direction in which it is to transmit. For clarity, such a system might be called an adaptive transmitting array, whereas the system described in this paper might be called an adaptive receiving array.

The term "adaptive filter" has been used by Jakowatz, Shuey, and White[7] to describe a system which extracts an unknown signal from noise, where the signal waveform recurs frequently at random intervals. Davisson[8] has described a method for estimating an unknown signal waveform in the presence of white noise of unknown variance. Glaser[9] has described an adaptive system suitable for the detection of a pulse signal of fixed but unknown waveform.

Previous work on array signal processing directly related to the present paper was done by Bryn, Mermoz, and Shor. The problem of detecting Gaussian signals in additive Gaussian noise fields was studied by Bryn,[10] who showed that, assuming K antenna elements in the array, the Bayes optimum detector could be implemented by either K^2 linear filters followed by "conventional" beam-forming for each possible signal direction, or by K linear filters for each possible signal direction. In either case, the measurement and inversion of a 2K by 2K correlation matrix was required at a large number of frequencies in the band of the signal. Mermoz[11] proposed a similar scheme for narrowband known signals, using the signal-to-noise ratio as a performance criterion. Shor[12] also used a signal-to-noise-ratio criterion to detect narrowband pulse signals. He proposed that the sensors be switched off when the signal was known to be absent, and a pilot signal injected as if it were a noise-free signal impinging on the array from a specified direction. The need for specific matrix inversion was circumvented by calculating the gradient of the ratio between the output power due to pilot signal and the output power due to noise, and using the method of steepest descent. At the same time, the number of correlation measurements required was reduced, by Shor's procedure, to 4K at each step in the adjustment of the processor. Both Mermoz and Shor have suggested the possibility of real-time adaptation.

This paper presents a potentially simpler scheme for obtaining the desired array processing improvement in real time. The performance criterion used is minimum mean-square error. The statistics of the signal are assumed to be known, but no prior knowledge or direct measurements of the noise field are required in this scheme. The adaptive array processor considered in the study may be automatically adjusted (adapted) according to a simple iterative algorithm, and the procedure does not directly involve the computation of any correlation coefficients or the inversion of matrices. The input signals are used only once, as they occur, in the adaptation process. There is no need to store past input data; but there is a need to store the processor adjustment values, i.e., the processor weighting coefficients ("weights"). Methods of adaptation are presented here, which may be implemented with either analog or digital adaptive circuits, or by digital-computer realization.
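The properties listed above (each input used once, no stored history, no correlation estimates or matrix inversion) can be illustrated with a minimal sketch. It assumes the standard Widrow-Hoff form of the LMS update, w <- w + 2*mu*e*x, which is not spelled out in the passage above. The function name `lms`, the step size `mu`, the synthetic Gaussian inputs, and the target vector `w_true` are all hypothetical choices made for illustration, not taken from the paper.

```python
import numpy as np

def lms(inputs, desired, mu):
    """One pass of LMS over the data; only the weight vector is carried along.

    inputs  : (n_samples, n_taps) array, one row of tap signals per time step
    desired : (n_samples,) array of desired responses
    mu      : step size (assumed small enough for stable adaptation)
    """
    n_samples, n_taps = inputs.shape
    w = np.zeros(n_taps)                 # the only state that must be stored
    errors = np.empty(n_samples)
    for k in range(n_samples):
        x = inputs[k]                    # current input vector, used once
        e = desired[k] - w @ x           # error between desired and actual output
        w = w + 2.0 * mu * e * x         # assumed Widrow-Hoff LMS update
        errors[k] = e
    return w, errors

# Synthetic illustration: the desired response is an exact linear
# combination of the inputs, so the error can be driven to zero.
rng = np.random.default_rng(0)
w_true = np.array([0.5, 0.5, 0.5, -0.5])    # hypothetical target weights
X = rng.standard_normal((5000, 4))          # hypothetical tap signals
d = X @ w_true
w_hat, err = lms(X, d, mu=0.01)
```

In this noiseless setup the weights approach `w_true` and the error decays toward zero; with a noisy desired response the weights would instead fluctuate around the minimum mean-square-error solution.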

DIRECTIONAL AND SPATIAL FILTERING

An example of a linear-array receiving antenna is shown in Fig. 1(a) and (b). The antenna of Fig. 1(a) consists of seven isotropic elements spaced $\lambda_0/2$ apart along a straight line, where $\lambda_0$ is the wavelength of the center frequency $f_0$ of the array. The received signals are summed to produce an array output signal. The directivity pattern, i.e., the relative sensitivity of response to signals from various directions, is plotted in this figure in a plane over an angular range of $-\pi/2 < \theta < \pi/2$ for frequency $f_0$. This pattern is symmetric about the vertical line $\theta = 0$. The main lobe is centered at $\theta = 0$. The largest-amplitude side lobe, at $\theta = 24°$, has a maximum sensitivity which is 12.5 dB below the maximum main-lobe sensitivity. This pattern would be different if it were plotted at frequencies other than $f_0$.

The same array configuration is shown in Fig. 1(b); however, in this case the output of each element is delayed in time before being summed. The resulting directivity pattern now has its main lobe at an angle of $\psi$ radians, where

$$\psi = \sin^{-1}\left(\frac{\lambda_0 \delta f_0}{d}\right) = \sin^{-1}\left(\frac{c\delta}{d}\right) \tag{1}$$

in which

$f_0$ = frequency of received signal
$\lambda_0$ = wavelength at frequency $f_0$
$\delta$ = time-delay difference between neighboring-element outputs
$d$ = spacing between antenna elements
$c$ = signal propagation velocity = $\lambda_0 f_0$.

The sensitivity is maximum at angle $\psi$ because signals received from a plane wave source incident at this angle, and delayed as in Fig. 1(b), are in phase with one another and produce the maximum output signal. For the example illustrated in the figure, $d = \lambda_0/2$, $\delta = 0.12941/f_0$, and therefore $\psi = \sin^{-1}(2\delta f_0) = 15°$.

There are many possible configurations for phased arrays. Fig. 2(a) shows one such configuration where each of the antenna-element outputs is weighted by two weights in parallel, one being preceded by a time delay of a quarter of a cycle at frequency $f_0$ (i.e., a 90° phase shift), denoted by $1/(4f_0)$. The output signal is the sum of all the weighted signals, and since all weights are set to unit values, the directivity pattern at frequency $f_0$ is by symmetry the same as that of Fig. 1(a). For purposes of illustration, an interfering directional sinusoidal "noise" of frequency $f_0$ incident on the array is shown in Fig. 2(a), indicated by the dotted arrow. The angle of incidence (45.5°) of this noise is such that it would be received on one of the side lobes of the directivity pattern with a sensitivity only 17 dB less than that of the main lobe at $\theta = 0°$.

If the weights are now set as indicated in Fig. 2(b), the directivity pattern at frequency $f_0$ becomes as shown in that figure. In this case, the main lobe is almost unchanged from that shown in Figs. 1(a) and 2(a), while the particular side lobe that previously intercepted the sinusoidal noise in Fig. 2(a) has been shifted so that a null is now placed in the direction of that noise. The sensitivity in the noise direction is 77 dB below the main lobe sensitivity, improving the noise rejection by 60 dB.

A simple example follows which illustrates the existence and calculation of a set of weights which will cause a signal from a desired direction to be accepted while a "noise" from a different direction is rejected. Such an example is illustrated in Fig. 3. Let the signal arriving from the desired direction $\theta = 0$ be called the "pilot" signal $p(t) = P \sin \omega_0 t$, where $\omega_0 \triangleq 2\pi f_0$, and let the other signal, the noise, be chosen as $n(t) = N \sin \omega_0 t$, incident to the receiving array at an angle $\theta = \pi/6$ radians. Both the pilot signal and the noise signal are assumed for this example to be at exactly the same frequency $f_0$. At a point in space midway between the antenna array elements, the signal and the noise are assumed to be in phase. In the example shown, there are two identical omnidirectional array elements, spaced $\lambda_0/2$ apart. The signals received by each element are fed to two variable weights, one weight being preceded by a quarter-wave time delay of $1/(4f_0)$. The four weighted signals are then summed to form the array output.

The problem of obtaining a set of weights to accept $p(t)$ and reject $n(t)$ can now be studied. Note that with any set of nonzero weights, the output is of the form $A \sin(\omega_0 t + \phi)$, and a number of solutions exist which will make the output be $p(t)$. However, the output of the array must be independent of the amplitude and phase of the noise signal if the array is to be regarded as rejecting the noise. Satisfaction of this constraint leads to a unique set of weights determined as follows. The array output due to the pilot signal is

$$P\left[(w_1 + w_3)\sin \omega_0 t + (w_2 + w_4)\sin\left(\omega_0 t - \frac{\pi}{2}\right)\right]. \tag{2}$$

For this output to be equal to the desired output of $p(t) = P \sin \omega_0 t$ (which is the pilot signal itself), it is necessary that

$$w_1 + w_3 = 1, \qquad w_2 + w_4 = 0. \tag{3}$$

Fig. 1. Directivity pattern for a linear array. (a) Simple array. (b) Delays added.

Fig. 2. Directivity pattern of linear array. (a) With equal weighting. (b) With weighting for noise elimination.

Fig. 3. Array configuration for noise elimination example.
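The numbers quoted for the seven-element array of Fig. 1 can be checked with a short numerical sketch. The function names below are illustrative and not taken from the paper; the sketch assumes ideal isotropic elements and expresses the inter-element delay in cycles of $f_0$.

```python
import cmath
import math


def array_factor(theta, n_elements=7, spacing_wl=0.5, delay_cycles=0.0):
    """Normalized response of a uniform linear array of isotropic elements.

    theta        -- arrival angle in radians, measured from broadside
    spacing_wl   -- element spacing d in wavelengths (d / lambda_0)
    delay_cycles -- inter-element delay delta in cycles (delta * f_0)
    """
    # Phase step between neighbours: propagation phase minus inserted delay.
    step = 2.0 * math.pi * (spacing_wl * math.sin(theta) - delay_cycles)
    total = sum(cmath.exp(1j * k * step) for k in range(n_elements))
    return abs(total) / n_elements


def steering_angle_deg(spacing_wl, delay_cycles):
    """Main-lobe angle from (1): psi = asin(lambda_0 * delta * f_0 / d)."""
    return math.degrees(math.asin(delay_cycles / spacing_wl))


def largest_side_lobe(n_elements=7, spacing_wl=0.5, steps=9000):
    """Scan from the first null to 90 degrees; return (angle_deg, level_dB)."""
    first_null = math.asin(1.0 / (n_elements * spacing_wl))
    best_angle, best_level = first_null, 0.0
    for i in range(steps + 1):
        theta = first_null + (math.pi / 2 - first_null) * i / steps
        level = array_factor(theta, n_elements, spacing_wl)
        if level > best_level:
            best_angle, best_level = theta, level
    return math.degrees(best_angle), 20.0 * math.log10(best_level)


psi = steering_angle_deg(0.5, 0.12941)
side_lobe_angle, side_lobe_db = largest_side_lobe()
```

Under these assumptions the steering angle comes out at 15° and the largest side lobe falls near 24°, between 12 and 13 dB below the main lobe, in line with the values stated in the text.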

With respect to the midpoint between the antenna elements, the relative time delays of the noise at the two antenna elements are $\pm[1/(4f_0)]\sin(\pi/6) = \pm 1/(8f_0) = \pm\lambda_0/(8c)$, which corresponds to phase shifts of $\pm\pi/4$ at frequency $f_0$. The array output due to the incident noise at $\theta = \pi/6$ is then

$$N\left[w_1\sin\left(\omega_0 t - \frac{\pi}{4}\right) + w_2\sin\left(\omega_0 t - \frac{3\pi}{4}\right) + w_3\sin\left(\omega_0 t + \frac{\pi}{4}\right) + w_4\sin\left(\omega_0 t - \frac{\pi}{4}\right)\right]. \tag{4}$$

For this response to equal zero, it is necessary that

$$w_1 + w_4 = 0, \qquad w_2 - w_3 = 0. \tag{5}$$

Fig. 4. Adaptive array configuration for receiving narrowband signals.

Thus the set of weights that satisfies the signal and noise response requirements can be found by solving (3) and (5) simultaneously. The solution is

$$w_1 = \tfrac{1}{2}, \quad w_2 = \tfrac{1}{2}, \quad w_3 = \tfrac{1}{2}, \quad w_4 = -\tfrac{1}{2}. \tag{6}$$

With these weights, the array will have the desired properties in that it will accept a signal from the desired direction, while rejecting a noise, even a noise which is at the same frequency $f_0$ as the signal, because the noise comes from a different direction than does the signal.

The foregoing method of calculating the weights is more illustrative than practical. This method is usable when there are only a small number of directional noise sources, when the noises are monochromatic, and when the directions of the noises are known a priori. A practical processor should not require detailed information about the number and the nature of the noises. The adaptive processor described in the following meets this requirement. It recursively solves a sequence of simultaneous equations, which are generally overspecified, and it finds solutions which minimize the mean-square error between the pilot signal and the total array output.
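As a quick check of the two-element example, the sketch below evaluates the array response in phasor form, where $\sin(\omega_0 t + \phi)$ is represented by $e^{j\phi}$. It assumes the weight values $w_1 = w_2 = w_3 = 1/2$, $w_4 = -1/2$ and the phase shifts stated for Fig. 3; the helper name `phasor_response` is hypothetical.

```python
import cmath
import math

QUARTER = -math.pi / 2  # phase shift of the quarter-period delay at f_0


def phasor_response(weights, element_phases):
    """Complex gain of the two-element, four-weight array of Fig. 3.

    weights        -- (w1, w2, w3, w4); w2 and w4 follow the quarter-period delays
    element_phases -- signal phase at element 1 and element 2, in radians
    """
    w1, w2, w3, w4 = weights
    ph1, ph2 = element_phases
    return (w1 * cmath.exp(1j * ph1)
            + w2 * cmath.exp(1j * (ph1 + QUARTER))
            + w3 * cmath.exp(1j * ph2)
            + w4 * cmath.exp(1j * (ph2 + QUARTER)))


weights = (0.5, 0.5, 0.5, -0.5)

# Pilot from theta = 0: in phase at both elements.
pilot_gain = phasor_response(weights, (0.0, 0.0))

# Noise from theta = pi/6: phases -pi/4 and +pi/4 at the two elements.
noise_gain = phasor_response(weights, (-math.pi / 4, math.pi / 4))
```

The pilot gain evaluates to unity (the pilot is reproduced exactly) and the noise gain to zero, independently of the noise amplitude $N$.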

Fig. 5. Adaptive array configuration for receiving broadband signals.

CONFIGURATIONS OF ADAPTIVE ARRAYS

Before discussing methods of adaptive filtering and signal processing to be used in the adaptive array, various spatial and electrical configurations of antenna arrays will be considered. An adaptive array configuration for processing narrowband signals is shown in Fig. 4. Each individual antenna element is shown connected to a variable weight and to a quarter-period time delay whose output is in turn connected to another variable weight. The weighted signals are summed, as shown in the figure. The signal, assumed to be either monochromatic or narrowband, is received by the antenna element and is thus weighted by a complex gain factor $Ae^{j\phi}$. Any phase angle ...

Thus the two weights and the $1/(4f_0)$ time delay provide completely adjustable linear processing for narrowband signals received by each individual antenna element.

The full array of Fig. 4 represents a completely general way of combining the antenna-element signals in an adjustable linear structure when the received signals and noises are narrowband. It should be realized that the same generality (for narrowband signals) can be achieved even when the time delays do not result in a phase shift of exactly $\pi/2$ at the center frequency $f_0$. Keeping the phase shifts close to $\pi/2$ is desirable for keeping required weight values small, but is not necessary in principle.

When one is interested in receiving signals over a wide band of frequencies, each of the phase shifters in Fig. 4 can be replaced by a tapped-delay-line network as shown in Fig. 5. This tapped delay line permits adjustment of gain and phase as desired at a number of frequencies over the band of interest. If the tap spacing is sufficiently close, this network approximates the ideal filter which would allow complete control of the gain and phase at each frequency in the passband.

ADAPTIVE SIGNAL PROCESSORS

The procedure should produce a given array gain in the specified look direction while simultaneously nulling out interfering noise sources. Fig. 6 shows an adaptive signal-processing element. If this element were combined with an output-signal quantizer, it would then comprise an adaptive threshold logic unit. Such an element has been called an "Adaline" [13] or a threshold logic unit (TLU) [14]. Applications of the adaptive threshold element have been made in pattern-recognition systems and in experimental adaptive control systems [2], [3], [14]-[17].

Fig. 6. Basic adaptive element.

In Fig. 6 the input signals $x_1(t), \ldots, x_i(t), \ldots, x_n(t)$ are the same signals that are applied to the multiplying weights $w_1, \ldots, w_i, \ldots, w_n$ shown in Fig. 4 or Fig. 5. The heavy lines show the paths of signal flow; the lighter lines show functions related to weight-changing or adaptation processes. The output signal $s(t)$ in Fig. 6 is the weighted sum

$$s(t) = \sum_{i=1}^{n} x_i(t)\, w_i \tag{7}$$

where $n$ is the number of weights; or, using vector notation

$$s(t) = W^T X(t) \tag{8}$$

where $W^T$ is the transpose of the weight vector

$$W = [w_1, \ldots, w_i, \ldots, w_n]^T$$

and the signal-input vector is

$$X(t) = [x_1(t), \ldots, x_i(t), \ldots, x_n(t)]^T.$$

For digital systems, the input signals are in discrete-time sampled-data form and the output is written

$$s(j) = W^T X(j) \tag{9}$$

where the index $j$ indicates the $j$th sampling instant.

In order that adaptation take place, a "desired response" signal, $d(t)$ when continuous or $d(j)$ when sampled, must be supplied to the adaptive element. A method for obtaining this signal for adaptive antenna array processing will be discussed in a following section.

The difference between the desired response and the output response forms the error signal $\varepsilon(j)$:

$$\varepsilon(j) = d(j) - W^T X(j). \tag{10}$$

This signal is used as a control signal for the "weight adjustment circuits" of Fig. 6.

Solving Simultaneous Equations

The purpose of the adaptation or weight-changing processes is to find a set of weights that will permit the output response of the adaptive element at each instant of time to be equal to or as close as possible to the desired response. For each input-signal vector $X(j)$, the error $\varepsilon(j)$ of (10) should be made as small as possible. Consider the finite set of linear simultaneous equations

$$\begin{aligned} W^T X(1) &= d(1) \\ W^T X(2) &= d(2) \\ &\;\;\vdots \\ W^T X(N) &= d(N) \end{aligned} \tag{11}$$

where $N$ is the total number of input-signal vectors; each vector is a measurement of an underlying $n$-dimensional random process. There are $N$ equations, corresponding to $N$ instants of time at which the output response values are of concern; there are $n$ "unknowns," the $n$ weight values which form the components of $W$. The set of equations (11) will usually be overspecified and inconsistent, since in the present application, with an ample supply of input data, it is usual that $N \gg n$. [These equations did have a solution in the simple example represented in Fig. 3. The solution is given in (6). Although the simultaneous equations (3) in that example appear to be different from (11), they are really the same, since those in (3) are in a specialized form for the case when all inputs are deterministic sinusoids which can be easily specified over all time in terms of amplitudes, phases, and frequencies.]

When $N$ is very large compared to $n$, one is generally interested in obtaining a solution of a set of $N$ equations [each equation in the form of (10)] which minimizes the sum of the squares of the errors. That is, a set of weights $W$ is found to minimize

$$\sum_{j=1}^{N} \varepsilon^2(j). \tag{12}$$
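The following sketch illustrates the weighted sum, the error signal, and the sum of squared errors for a small overspecified set of equations. The three sample equations and the two-weight normal-equation solver are invented for illustration only; they are not data from the paper.

```python
def element_output(weights, inputs):
    """Weighted sum s(j) = W^T X(j), as in (9)."""
    return sum(w * x for w, x in zip(weights, inputs))


def error(weights, inputs, desired):
    """Error signal eps(j) = d(j) - W^T X(j), as in (10)."""
    return desired - element_output(weights, inputs)


def sum_squared_error(weights, samples):
    """Sum of squared errors (12) over a list of (X(j), d(j)) pairs."""
    return sum(error(weights, x, d) ** 2 for x, d in samples)


def least_squares_two_weights(samples):
    """Minimize (12) for n = 2 by solving the 2 x 2 normal equations."""
    a11 = sum(x[0] * x[0] for x, _ in samples)
    a12 = sum(x[0] * x[1] for x, _ in samples)
    a22 = sum(x[1] * x[1] for x, _ in samples)
    b1 = sum(x[0] * d for x, d in samples)
    b2 = sum(x[1] * d for x, d in samples)
    det = a11 * a22 - a12 * a12
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


# Hypothetical overspecified set: N = 3 equations, n = 2 weights.
# w1 = 1, w2 = 2 and w1 + w2 = 2 cannot all hold at once.
samples = [((1.0, 0.0), 1.0), ((0.0, 1.0), 2.0), ((1.0, 1.0), 2.0)]

best = least_squares_two_weights(samples)
best_sse = sum_squared_error(best, samples)
naive_sse = sum_squared_error((1.0, 2.0), samples)
```

No weight pair satisfies all three equations here, so the least-squares weights spread the error over all of them and give a smaller sum of squares than the pair that satisfies only the first two.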

When the input signals can be regarded as stationary stochastic variables, one is usually interested in finding a set of weights to minimize mean-square error. The quantity of interest is then the expected value of the square of the error, i.e., the mean-square error, given by

$$E[\varepsilon^2(j)] \triangleq \overline{\varepsilon^2}. \tag{13}$$

The set of weights that minimizes mean-square error can be calculated by squaring both sides of (10) which yields

$$\varepsilon^2(j) = d^2(j) + W^T X(j) X^T(j) W - 2 d(j) W^T X(j) \tag{14}$$

and then taking the expected value of both sides of (14)

$$E[\varepsilon^2(j)] = E[d^2 + W^T X(j) X^T(j) W - 2 W^T d(j) X(j)] = E[d^2] + W^T \Phi(x,x) W - 2 W^T \Phi(x,d) \tag{15}$$

where

$$\Phi(x,x) \triangleq E[X(j) X^T(j)] = E\begin{bmatrix} x_1 x_1 & x_1 x_2 & \cdots & x_1 x_n \\ x_2 x_1 & x_2 x_2 & \cdots & x_2 x_n \\ \vdots & & & \vdots \\ x_n x_1 & x_n x_2 & \cdots & x_n x_n \end{bmatrix} \tag{16}$$

and

$$\Phi(x,d) \triangleq E[X(j) d(j)] = E\begin{bmatrix} x_1 d \\ x_2 d \\ \vdots \\ x_n d \end{bmatrix}. \tag{17}$$

The symmetric matrix $\Phi(x,x)$ is a matrix of cross correlations and autocorrelations of the input signals to the adaptive element, and the column matrix $\Phi(x,d)$ is the set of cross correlations between the $n$ input signals and the desired response signal.

The mean-square error (15) is a quadratic function of the weight values. Differentiating (15) with respect to $W$ gives its gradient

$$\nabla E[\varepsilon^2] = 2\Phi(x,x) W - 2\Phi(x,d). \tag{18}$$

When the choice of the weights is optimized, the gradient is zero. Then

$$\Phi(x,x) W_{\mathrm{LMS}} = \Phi(x,d), \qquad W_{\mathrm{LMS}} = \Phi^{-1}(x,x)\Phi(x,d). \tag{19}$$

The optimum weight vector $W_{\mathrm{LMS}}$ is the one that gives the least mean-square error. Equation (19) is the Wiener-Hopf equation, and is the equation for the multichannel least-squares filter used by Burg [18] and Claerbout [19] in the processing of digital seismic array data.

One way of finding the optimum set of weight values is to solve (19). This solution is generally straightforward, but presents serious computational problems when the number of weights $n$ is large and when data rates are high. In addition to the necessity of inverting an $n \times n$ matrix, this method may require as many as $n(n+1)/2$ autocorrelation and crosscorrelation measurements to obtain the elements of $\Phi(x,x)$. Furthermore, this process generally needs to be continually repeated in most practical situations where the input signal statistics change slowly. No perfect solution of (19) is possible in practice because of the fact that an infinite statistical sample would be required to estimate perfectly the elements of the correlation matrices.

Two methods for finding approximate solutions to (19) will be presented in the following. Their accuracy is limited by statistical sample size, since they find weight values based on finite-time measurements of input-data signals. These methods do not require explicit measurements of correlation functions or matrix inversion. They are based on gradient-search techniques applied to mean-square-error functions. One of these methods, the LMS algorithm, does not even require squaring, averaging, or differentiation in order to make use of gradients of mean-square-error functions. The second method, a relaxation method, will be discussed later.

The LMS Algorithm

A number of weight-adjustment procedures or algorithms exist which minimize the mean-square error. Minimization is usually accomplished by gradient-search techniques. One method that has proven to be very useful is the LMS algorithm [1]-[3], [17]. This algorithm is based on the method of steepest descent. Changes in the weight vector are made along the direction of the estimated gradient vector. Accordingly,

$$W(j+1) = W(j) + k_s \hat{\nabla}(j) \tag{20}$$

where $W(j)$ and $W(j+1)$ are the weight vectors before and after adaptation, $k_s$ is a scalar constant controlling the rate of convergence and stability ($k_s < 0$), and $\hat{\nabla}(j)$ is the estimated gradient vector of the mean-square error with respect to $W$.

One method for obtaining the estimated gradient of the mean-square-error function is to take the gradient of a single time sample of the squared error

$$\hat{\nabla}(j) = \nabla[\varepsilon^2(j)] = 2\varepsilon(j)\nabla[\varepsilon(j)].$$

From (10)

$$\nabla[\varepsilon(j)] = \nabla[d(j) - W^T(j) X(j)] = -X(j).$$

Thus

$$\hat{\nabla}(j) = -2\varepsilon(j) X(j). \tag{21}$$
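The relation between the single-sample gradient estimate (21) and the gradient (18) can be checked numerically. In the sketch below the data set, the weight vector, and the helper names are all invented for illustration; the correlation matrices are replaced by averages over the same finite sample, so the averaged estimate and the gradient formula agree to rounding error.

```python
import random


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def gradient_estimate(weights, x, d):
    """Single-sample estimate (21): -2 * eps(j) * X(j)."""
    eps = d - dot(weights, x)
    return [-2.0 * eps * xi for xi in x]


def true_gradient(weights, phi_xx, phi_xd):
    """Gradient (18): 2 * Phi(x,x) * W - 2 * Phi(x,d)."""
    return [2.0 * dot(row, weights) - 2.0 * pd
            for row, pd in zip(phi_xx, phi_xd)]


# Hypothetical data: two inputs and a noisy linear desired response.
rng = random.Random(0)
samples = []
for _ in range(500):
    x = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
    d = 2.0 * x[0] - 1.0 * x[1] + 0.1 * rng.gauss(0.0, 1.0)
    samples.append((x, d))

n = len(samples)
# Sample estimates of the correlation matrices (16) and (17).
phi_xx = [[sum(x[i] * x[k] for x, _ in samples) / n for k in range(2)]
          for i in range(2)]
phi_xd = [sum(x[i] * d for x, d in samples) / n for i in range(2)]

w = [0.3, -0.7]  # arbitrary fixed weight vector
avg_estimate = [sum(gradient_estimate(w, x, d)[i] for x, d in samples) / n
                for i in range(2)]
exact = true_gradient(w, phi_xx, phi_xd)
```

Any single estimate is noisy, but its average over the sample reproduces the gradient computed from the sample correlations.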

The gradient estimate of (21) is unbiased, as will be shown by the following argument. For a given weight vector $W(j)$, the expected value of the gradient estimate is

$$E[\hat{\nabla}(j)] = -2E[\{d(j) - W^T(j) X(j)\} X(j)] = -2[\Phi(x,d) - \Phi(x,x) W(j)]. \tag{22}$$

Comparing (18) and (22), we see that

$$E[\hat{\nabla}(j)] = \nabla E[\varepsilon^2]$$

and therefore, for a given weight vector, the expected value of the estimate equals the true value.

Using the gradient estimation formula given in (21), the weight iteration rule (20) becomes

$$W(j+1) = W(j) - 2k_s \varepsilon(j) X(j) \tag{23}$$

and the next weight vector is obtained by adding to the present weight vector the input vector scaled by the value of the error.

Fig. 7. Block diagram representation of LMS algorithm. (a) Digital realization. (b) Analog realization.

The LMS algorithm is given by (23). It is directly usable as a weight-adaptation formula for digital systems. Fig. 7(a) shows a block-diagram representation of this equation in terms of one component $w_i$ of the weight vector $W$. An equivalent differential equation which can be used in analog implementation of continuous systems (see Fig. 7(b)) is given by

$$\frac{d}{dt} W(t) = -2k_s \varepsilon(t) X(t).$$

This equation can also be written as

$$W(t) = -2k_s \int_0^t \varepsilon(\xi) X(\xi)\, d\xi.$$
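A minimal sketch of the digital iteration (23) is given below. It keeps the paper's sign convention, in which $k_s$ is negative. The input statistics, the "true" weights, and the value $k_s = -0.05$ are assumptions chosen only so that the example converges quickly; the desired response is taken to be noise-free.

```python
import random


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def lms_step(weights, x, d, k_s):
    """One iteration of (23): W(j+1) = W(j) - 2 * k_s * eps(j) * X(j), k_s < 0."""
    eps = d - dot(weights, x)
    return [w - 2.0 * k_s * eps * xi for w, xi in zip(weights, x)]


def run_lms(samples, k_s, n_weights):
    """Apply lms_step to each (X(j), d(j)) pair, starting from zero weights."""
    w = [0.0] * n_weights
    for x, d in samples:
        w = lms_step(w, x, d, k_s)
    return w


# Hypothetical system to be learned: d(j) = 2 * x1(j) - 1 * x2(j).
rng = random.Random(1)
true_w = [2.0, -1.0]
samples = []
for _ in range(3000):
    x = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
    samples.append((x, dot(true_w, x)))

# Total input power is 2 * (1/3) for these inputs, so k_s = -0.05 is small
# compared with -1 / (total input power).
w_final = run_lms(samples, k_s=-0.05, n_weights=2)
```

Because the desired response here is an exact linear function of the inputs, the error can be driven to zero and the weights settle on the generating values; with noisy data they would instead fluctuate about the solution of (19).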

Fig. 8. Analog/digital implementation of LMS weight-adjustment algorithm.

Fig. 8 shows how circuitry of the type indicated in Fig. 7(a) or (b) might be incorporated into the implementation of the basic adaptive element of Fig. 6.

Convergence of the Mean of the Weight Vector

For the purpose of the following discussion, we assume that the time between successive iterations of the LMS algorithm is sufficiently long so that the sample input vectors $X(j)$ and $X(j+1)$ are uncorrelated. This assumption is common in the field of stochastic approximation [20]-[22].

Because the weight vector $W(j)$ is a function only of the input vectors $X(j-1), X(j-2), \ldots, X(0)$ [see (23)] and because the successive input vectors are uncorrelated, $W(j)$ is independent of $X(j)$. For stationary input processes meeting this condition, the expected value $E[W(j)]$ of the weight vector after a large number of iterations can then be shown to converge to the Wiener solution given by (19). Taking the expected value of both sides of (23), we obtain a difference equation in the expected value of the weight vector

$$E[W(j+1)] = E[W(j)] - 2k_s E[\{d(j) - W^T(j) X(j)\} X(j)] = [I + 2k_s\Phi(x,x)]\, E[W(j)] - 2k_s\Phi(x,d) \tag{24}$$

where $I$ is the identity matrix. With an initial weight vector $W(0)$, $j+1$ iterations of (24) yield

$$E[W(j+1)] = [I + 2k_s\Phi(x,x)]^{j+1} W(0) - 2k_s \sum_{i=0}^{j} [I + 2k_s\Phi(x,x)]^{i}\, \Phi(x,d). \tag{25}$$

Equation (25) may be put in diagonal form by using the appropriate similarity transformation $Q$ for the matrix $\Phi(x,x)$, that is,

$$\Phi(x,x) = Q^{-1} E Q$$

where

$$E = \begin{bmatrix} e_1 & 0 & \cdots & 0 \\ 0 & e_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & e_n \end{bmatrix}$$

is the diagonal matrix of eigenvalues. The eigenvalues are all positive, since $\Phi(x,x)$ is positive definite [see (16)]. Equation (25) may now be expressed as

$$\begin{aligned} E[W(j+1)] &= [I + 2k_s Q^{-1} E Q]^{j+1} W(0) - 2k_s \sum_{i=0}^{j} [I + 2k_s Q^{-1} E Q]^{i}\, \Phi(x,d) \\ &= Q^{-1}[I + 2k_s E]^{j+1} Q W(0) - 2k_s Q^{-1} \sum_{i=0}^{j} [I + 2k_s E]^{i}\, Q \Phi(x,d). \end{aligned} \tag{26}$$

Consider the diagonal matrix $[I + 2k_s E]$. As long as its diagonal terms are all of magnitude less than unity

$$\lim_{j \to \infty} [I + 2k_s E]^{j+1} \to 0$$

and the first term of (26) vanishes as the number of iterations increases. The second term in (26) generally converges to a nonzero limit. The summation factor $\sum_{i=0}^{j} [I + 2k_s E]^{i}$ becomes

$$\lim_{j \to \infty} \sum_{i=0}^{j} [I + 2k_s E]^{i} = -\frac{1}{2k_s} E^{-1}$$

where the formula for the sum of a geometric series has been used, that is,

$$\sum_{i=0}^{\infty} (1 + 2k_s e_p)^{i} = \frac{1}{1 - (1 + 2k_s e_p)} = \frac{-1}{2k_s e_p}.$$

Thus, in the limit, (26) becomes

$$\lim_{j \to \infty} E[W(j+1)] = Q^{-1} E^{-1} Q\, \Phi(x,d) = \Phi^{-1}(x,x)\Phi(x,d).$$

Comparison of this result with (19) shows that as the number of iterations increases without limit, the expected value of the weight vector converges to the Wiener solution.

Convergence of the mean of the weight vector to the Wiener solution is insured if and only if the proportionality constant $k_s$ is set within certain bounds. Since the diagonal terms of $[I + 2k_s E]$ must all have magnitude less than unity, and since all eigenvalues in $E$ are positive, the bounds on $k_s$ are given by

$$\frac{-1}{e_{\max}} < k_s < 0 \tag{27}$$

where $e_{\max}$ is the maximum eigenvalue of $\Phi(x,x)$. Since

$$e_{\max} \le \operatorname{trace}[\Phi(x,x)] \tag{28}$$

where

$$\operatorname{trace}[\Phi(x,x)] \triangleq E[X^T(j) X(j)] = \sum_{i=1}^{n} E[x_i^2] \triangleq \text{total input power},$$

it follows that satisfactory convergence can be obtained with

$$\frac{-1}{\sum_{i=1}^{n} E[x_i^2]} < k_s < 0. \tag{29}$$

In practice, when slow, precise adaptation is desired, $k_s$ is usually chosen such that

$$\frac{-1}{\sum_{i=1}^{n} E[x_i^2]} \ll k_s < 0.$$

It is the opinion of the authors that the assumption of independent successive input samples used in the foregoing convergence proof is overly restrictive. That is, convergence of the mean of the weight vector to the LMS solution can be achieved under conditions of highly correlated input samples. In fact, the computer-simulation experiments described in this paper do not satisfy the condition of independence.

Time Constants and Learning Curve with LMS Adaptation

State-variable methods, which are widely used in modern control theory, have been applied by Widrow [1] and Koford and Groner [2] to the analysis of stability and time constants (related to rate of convergence) of the LMS algorithm. Considerable simplifications in the analysis have been realized by expressing transient phenomena of the system adjustments (which take place during the adaptation process) in terms of the normal coordinates of the system. As shown by Widrow [1], the weight values undergo transients during adaptation. The transients consist of sums of exponentials with time constants given by

$$\tau_p = \frac{1}{2(-k_s) e_p}, \qquad p = 1, 2, \ldots, n \tag{30}$$

where $e_p$ is the $p$th eigenvalue of the input-signal correlation matrix $\Phi(x,x)$.

In the special case when all eigenvalues are equal, all time constants are equal. Accordingly,

$$\tau = \frac{1}{2(-k_s) e}.$$

One very useful way to monitor the progress of an adaptive process is to plot or display its "learning curve." When mean-square error is the performance criterion being used, one can plot the expected mean-square error at each stage of the learning process as a function of the number of adaptation cycles. Since the underlying relaxation phenomenon which takes place in the weight values is of exponential nature, and since from (15) the mean-square error is a quadratic form in the weight values, the transients in the mean-square-error function must also be exponential in nature.

When all the time constants are equal, the mean-square-error learning curve is a pure exponential with a time constant

$$\tau_{\mathrm{mse}} = \frac{\tau}{2} = \frac{1}{4(-k_s) e}.$$

The basic reason for this is that the square of an exponential function is an exponential with half the time constant.

Estimation of the rate of adaptation is more complex when the eigenvalues are unequal. When actual experimental learning curves are plotted, they are generally of the form of noisy exponentials because of the inherent noise in the adaptation process. The slower the adaptation, the smaller will be the amplitude of the noise apparent in the learning curve.
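The convergence bound on $k_s$ and the time-constant formula can be illustrated by iterating the recursion for the expected weight vector directly. The sketch below assumes a hypothetical diagonal correlation matrix, so that its eigenvalues can be read off the diagonal; all numerical values are invented for illustration.

```python
import math


def mean_weight_recursion(phi_xx, phi_xd, k_s, iterations):
    """Iterate E[W(j+1)] = [I + 2 k_s Phi(x,x)] E[W(j)] - 2 k_s Phi(x,d), W(0) = 0."""
    n = len(phi_xd)
    w = [0.0] * n
    for _ in range(iterations):
        w = [w[i] + 2.0 * k_s * (sum(phi_xx[i][k] * w[k] for k in range(n))
                                 - phi_xd[i])
             for i in range(n)]
    return w


def time_constant(k_s, eigenvalue):
    """Time constant in iterations: 1 / (2 * (-k_s) * e_p)."""
    return 1.0 / (2.0 * (-k_s) * eigenvalue)


def k_s_lower_bound(phi_xx):
    """Conservative lower bound -1 / trace(Phi(x,x)); k_s must also be negative."""
    return -1.0 / sum(phi_xx[i][i] for i in range(len(phi_xx)))


# Hypothetical statistics: eigenvalues 2.0 and 0.5, Wiener solution [1, -2].
phi_xx = [[2.0, 0.0], [0.0, 0.5]]
phi_xd = [2.0, -1.0]

bound = k_s_lower_bound(phi_xx)                                  # -0.4
converged = mean_weight_recursion(phi_xx, phi_xd, -0.05, 400)
diverged = mean_weight_recursion(phi_xx, phi_xd, -1.2, 50)       # outside bounds

# The slow mode (e = 0.5) has a time constant of about 20 iterations.
tau_slow = time_constant(-0.05, 0.5)
after_tau = mean_weight_recursion(phi_xx, phi_xd, -0.05, int(round(tau_slow)))
remaining_fraction = abs(after_tau[1] - (-2.0)) / 2.0
```

With $k_s$ inside the bounds the mean weights approach the Wiener solution, and after one time constant the slow component of the weight error has decayed to roughly $e^{-1}$ of its initial value. With $k_s$ outside the bounds the recursion diverges.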

Misadjustment with LMS Adaptation

All adaptive or learning systems capable of adapting at real-time rates experience losses in performance because their system adjustments are based on statistical averages taken with limited sample sizes. The faster a system adapts, in general, the poorer will be its expected performance.

When the LMS algorithm is used with the basic adaptive element of Fig. 8, the expected level of mean-square error will be greater than that of the Wiener optimum system whose weights are set in accordance with (19). The longer the time constants of adaptation, however, the closer the expected performance comes to the Wiener optimum performance. To get the Wiener performance, i.e., to achieve the minimum mean-square error, one would have to know the input statistics a priori, or, if (as is usual) these statistics are unknown, they would have to be measured with an arbitrarily large statistical sample.

When the LMS adaptation algorithm is used, an excess mean-square error therefore develops. A measure of the extent to which the adaptive system is misadjusted as compared to the Wiener optimum system is determined in a performance sense by the ratio of the excess mean-square error to the minimum mean-square error. This dimensionless measure of the loss in performance is defined as the "misadjustment" $M$. For LMS adaptation of the basic adaptive element, it is shown by Widrow [1] that

$$\text{Misadjustment } M = \frac{1}{2} \sum_{p=1}^{n} \frac{1}{\tau_p}. \tag{31}$$

The value of the misadjustment depends on the time constants (settling times) of the filter adjustment weights. Again, in the special case when all the time constants are equal, $M$ is proportional to the number of weights and inversely proportional to the time constant. That is,

$$M = \frac{n}{2\tau}. \tag{32}$$
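The trade-off expressed by (31) and (32) is easy to evaluate. The sketch below is illustrative only: the choice of 24 weights corresponds to a twelve-element narrowband array with two weights per element, and the 10 percent target is an arbitrary example value.

```python
def misadjustment(time_constants):
    """Misadjustment (31): M = (1/2) * sum over p of 1 / tau_p."""
    return 0.5 * sum(1.0 / tau for tau in time_constants)


def time_constant_for(target_misadjustment, n_weights):
    """Invert (32), M = n / (2 * tau), for the common time constant tau."""
    return n_weights / (2.0 * target_misadjustment)


n_weights = 24          # assumed: 12 elements, 2 weights each
target = 0.10           # assumed: 10 percent excess mean-square error

tau = time_constant_for(target, n_weights)       # iterations
check = misadjustment([tau] * n_weights)
```

For these assumed values the common time constant comes out at 120 iterations, which shows the general pattern: halving the misadjustment doubles the required settling time, and adding weights lengthens it in proportion.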

Although the foregoing results specifically apply to statistically stationary processes, the LMS algorithm can also be used with nonstationary processes. It is shown by Widrow [23] that, under certain assumed conditions, the rate of adaptation is optimized when the loss of performance resulting from adapting too rapidly equals twice the loss in performance resulting from adapting too slowly.

ADAPTIVE SPATIAL FILTERING

If the radiated signals received by the elements of an adaptive antenna array were to consist of signal components plus undesired noise, the signal would be reproduced (and noise eliminated) as best possible in the least-squares sense if the desired response of the adaptive processor were made to be the signal itself. This signal is not generally available for adaptation purposes, however. If it were available, there would be no need for a receiver and a receiving array.

In the adaptive antenna systems to be described here, the desired response signal is provided through the use of an artificially injected signal, the "pilot signal," which is completely known at the receiver and usually generated there. The pilot signal is constructed to have spectral and directional characteristics similar to those of the incoming signal of interest. These characteristics may, in some cases, be known a priori but, in general, represent estimates of the parameters of the signal of interest.

Adaptation with the pilot signal causes the array to form a beam in the pilot-signal direction having essentially flat spectral response and linear phase shift within the passband of the pilot signal. Moreover, directional noises impinging on the antenna array will cause reduced array response (nulling) in their directions within their passbands. These notions are demonstrated by experiments which will be described in the following.

Injection of the pilot signal could block the receiver and render useless its output. To circumvent this difficulty, two adaptation algorithms have been devised, the "one-mode" and the "two-mode." The two-mode process alternately adapts on the pilot signal to form the beam and then adapts on the natural inputs with the pilot signal off to eliminate noise. The array output is usable during the second mode, while the pilot signal is off. The one-mode algorithm permits listening at all times, but requires more equipment for its implementation.

The Two-Mode Adaptation Algorithm

Fig. 9 illustrates a method for providing the pilot signal wherein the latter is actually transmitted by an antenna located some distance from the array in the desired look direction. Fig. 10 shows a more practical method for providing the pilot signal. The inputs to the processor are connected either to the actual antenna element outputs (during "mode II"), or to a set of delayed signals derived from the pilot-signal generator (during "mode I"). The filters $\delta_1, \ldots, \delta_K$ (ideal time delays if the array elements are identical) are chosen to result in a set of input signals identical with those that would appear if the array were actually receiving a radiated plane-wave pilot signal from the desired "look" direction, the direction intended for the main lobe of the antenna directivity pattern.

Fig. 9. Adaptation with external pilot-signal generator. Mode I: adaptation with pilot signal present; Mode II: adaptation with pilot signal absent.

Fig. 10. Two-mode adaptation with internal pilot-signal generator. Mode I: adaptation with pilot signal present; Mode II: adaptation with pilot signal absent.

During adaptation in mode I, the input signals to the adaptive processor derive from the pilot signal, and the desired response of the adaptive processor is the pilot signal itself. If a sinusoidal pilot signal at frequency $f_0$ is used, for example, adapting the weights to minimize mean-square error will force the gain of the antenna array in the look direction to have a specific amplitude and a specific phase shift at frequency $f_0$.

During adaptation in mode II, all signals applied to the adaptive processor are received by the antenna elements from the actual noise field. In this mode, the adaptation process proceeds to eliminate all received signals, since the desired response is set to zero. Continuous operation in mode II would cause all the weight values to tend to zero, and the system would shut itself off. However, by alternating frequently between mode I and mode II and causing only small changes in the weight vector during each mode of adaptation, it is possible to maintain a beam in the desired look direction and, in addition, to minimize the reception of incident-noise power.

The pilot signal can be chosen as the sum of several sinusoids of differing frequencies. Then adaptation in mode I will constrain the antenna gain and phase in the look direction to have specific values at each of the pilot-signal frequencies. Furthermore, if several pilot signals of different simulated directions are added together, it will be possible to constrain the array gain simultaneously at various frequencies and angles when adapting in mode I. This feature affords some control of the bandwidth and beamwidth in the look direction. The two-mode adaptive process essentially minimizes the mean-square value (the total power) of all signals received by the antenna elements which are uncorrelated with the pilot signals, subject to the constraint that the gain and phase in the beam approximate predetermined values at the frequencies and angles dictated by the pilot-signal components.

The One-Mode Adaptation Algorithm

In the two-mode adaptation algorithm the beam is formed during mode I, and the noises are eliminated in the least-squares sense (subject to the pilot-signal constraints) in mode II. Signal reception during mode I is impossible because the processor is connected to the pilot-signal generator. Reception can therefore take place only during mode II. This difficulty is eliminated in the system of Fig. 11, in which the actions of both mode I and mode II can be accomplished simultaneously. The pilot signals and the received signals enter into an auxiliary adaptive processor, just as described previously. For this processor, the desired response is the pilot signal $p(t)$. A second weighted processor (linear element) generates the actual array output signal, but it performs no adaptation. Its input signals do not contain the pilot signal. It is slaved to the adaptive processor in such a way that its weights track the corresponding weights of the adapting system, so that it never needs to receive the pilot signal.

Fig. 11. Single-mode adaptation with pilot signal.

In the single-mode system of Fig. 11, the pilot signal is on continuously. Adaptation to minimize mean-square error will force the adaptive processor to reproduce the pilot signal as closely as possible, and, at the same time, to reject as well as possible (in the mean-square sense) all signals received by the antenna elements which are uncorrelated with the pilot signal. Thus the adaptive process forces a directivity pattern having the proper main lobe in the look direction in the passband of the pilot signal (satisfying the pilot signal constraints), and it forces nulls in the directions of the noises and in their frequency bands. Usually, the stronger the noises, the deeper are the corresponding nulls.
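The two-mode procedure described earlier can be sketched on the two-element, four-weight example of Fig. 3. The sketch below makes several assumptions that are not stated in the paper: the modes alternate on every sample, there are 20 samples per cycle of $f_0$, the pilot and noise have unit amplitude, no random noise is present, and $k_s = -0.025$. It is an illustration of the idea, not a reproduction of the authors' simulations.

```python
import math

SAMPLES_PER_CYCLE = 20      # assumed sampling rate
QUARTER = -math.pi / 2      # phase of the quarter-period delay at f_0


def element_inputs(phase, element_phases):
    """Inputs to the four weights (w1..w4) for a sinusoid of given phase."""
    out = []
    for ph in element_phases:
        out.append(math.sin(phase + ph))             # direct path
        out.append(math.sin(phase + ph + QUARTER))   # quarter-period delayed path
    return out


def two_mode_lms(iterations, k_s):
    """Alternate mode I (pilot, d = pilot) and mode II (noise field, d = 0)."""
    w = [0.0] * 4
    for j in range(iterations):
        phase = 2.0 * math.pi * j / SAMPLES_PER_CYCLE
        if j % 2 == 0:
            # Mode I: simulated pilot from the look direction theta = 0.
            x = element_inputs(phase, (0.0, 0.0))
            d = math.sin(phase)
        else:
            # Mode II: noise from theta = pi/6, desired response set to zero.
            x = element_inputs(phase, (-math.pi / 4, math.pi / 4))
            d = 0.0
        eps = d - sum(wi * xi for wi, xi in zip(w, x))
        w = [wi - 2.0 * k_s * eps * xi for wi, xi in zip(w, x)]
    return w


weights = two_mode_lms(iterations=6000, k_s=-0.025)
```

Under these assumptions the adapted weights approach $(1/2, 1/2, 1/2, -1/2)$, the set that passes the pilot with unit gain and places a null on the noise direction. In this noise-free case both requirements can be met exactly, so the alternation settles on a single weight vector rather than a compromise.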


COMPUTER SIMULATION OF ADAPTIVE ANTENNA SYSTEMS

To demonstrate the performance characteristics of adaptive antenna systems, many simulation experiments, involving a wide variety of array geometries and signal- and noise-field configurations, have been carried out using an IBM 1620-II computer equipped with a digital output plotter. For simplicity of presentation, the examples outlined in the following are restricted to planar arrays composed of ideal isotropic radiators. In every case, the LMS adaptation algorithm was used. All experiments were begun with the initial condition that all weight values were equal.

Narrowband Processor Experiments

Fig. 12 shows a twelve-element circular array and signal processor which was used to demonstrate the performance of the narrowband system shown in Fig. 4. In the first computer simulation, the two-mode adaptation algorithm was used. The pilot signal was a unit-amplitude sine wave (power = 0.5, frequency $f_0$) which was used to train the array to look in the $\theta = 0°$ direction. The noise field consisted of a sinusoidal noise signal (of the same frequency and power as the pilot signal) incident at angle $\theta = 40°$, and a small amount of random, uncorrelated, zero-mean, "white" Gaussian noise of variance (power) = 0.1 at each antenna element. In this simulation, the weights were adapted using the LMS two-mode algorithm.

Fig. 12. Array configuration and processing for narrowband experiments.

Fig. 13. Evolution of the directivity pattern while learning to eliminate a directional noise and uncorrelated noises. (Array configuration of Fig. 12.) T = number of elapsed cycles of frequency $f_0$ (total number of adaptations = 20T).

Fig. 13 shows the sequence of directivity patterns which evolved during the "learning" process. These computer-plotted patterns represent the decibel sensitivity of the array at frequency $f_0$. Each directivity pattern is computed from the set of weights resulting at various stages of adaptation. The solid arrow indicates the direction of arrival of the interfering sine-wave noise source. Notice that the initial directivity pattern is essentially circular. This is due to the symmetry of the antenna array elements and of the initial weight values. A timing indicator $T$, the number of elapsed cycles of frequency $f_0$, is presented with each directivity pattern. The total number of adaptations equals $20T$ in these experiments. Note that if $f_0 = 1$ kHz, $T = 1$ corresponds to 1 ms real time; if $f_0 = 1$ MHz, $T = 1$ corresponds to 1 μs, etc.

Several observations can be made from the series of directivity patterns of Fig. 13. Notice that the sensitivity of the array in the look direction is essentially constant during the adaptation process. Also notice that the array sensitivity drops very rapidly in the direction of the sinusoidal noise source; a deep notch in the directivity pattern forms in the noise direction as the adaptation process progresses. After the adaptive transients died out, the array sensitivity in the noise direction was 27 dB below that of the array in the desired look direction.

The total noise power in the array output is the sum of the sinusoidal noise power due to the directional noise source plus the power due to the "white" Gaussian, mutually uncorrelated noise-input signals. The total noise power generally drops as the adaptation process commences, until it reaches an irreducible level. A plot of the total received noise power as a function of $T$ is shown in Fig. 14. This curve may be called a "learning curve." Starting with the initial weights, the total output noise power was 0.65, as shown in the figure. After adaptation, the total output noise power was 0.01. In this noise field, the signal-to-noise ratio of the array¹ after adaptation was better than that of a single isotropic receiving element by a factor of about 60.

¹ Signal-to-noise ratio is defined as SNR = (array output power due to signal) / (array output power due to noise).

A second experiment using the same array configuration and the two-mode adaptive process was performed to investigate adaptive array performance in the presence of several interfering directional noise sources. In this example, the noise field was composed of five directional sinus-

1.0

DESIRED

'. r -. .....___" LOOK " DIRECT ION . - :.-- "

a: O.B

INITIAL NOISE POWER

~

\ 'J - •

;

w !!l06

-

!

...

RECTION CF

SlNUSC1DAL NOISES

o

z

T- 150

...::>

a. ~O.4

o

'.= - 0 0 0 25

...J

~ 0 .2

T-IO

'- ---

T_300

o L_---I.-=======±==~ 400 10 0 200 300 o TIME . T ( cycles of f.)

Fig. 14.

Learning curve for narrowband system of Fig . 12. with noise from one direction onl y.

I( T.5 0 0

TABLE I

I'

SENSITIVITIES OF ARRAY IN DIRECTlO:-;S Of HIE FIVE :-JOISE SOl'R(,[~~ OF FIG . 15. AFTER ADAPTATlo:-;

Noise Direction td egrecs )

Noise Frequency (times 10 )

67 134 191 236 338

1.\0 0 .95 1.00 O.l)O I.OS

Array Sensitivity in Noise Direcu on , Relative to Sensitivit y in De sired Lo ok Directi on (dB)

n

24 4r ms e

T-68 2

r-

( a)

-30

- 2S -30

- .~ X

6 14

= - - = - - = -00 = 0.43 percent. 4r ms e

T-70

-26

oidal noises , each of amplitude 0.5 and power 0.125. acting simultaneously, and, in addition. superposed uncorrelated "white" Gaussian noises of power 0,5 at each of the antenna elements. The frequencies of the five directional noises are shown in Table 1. Fig. 15(a) shows the evolution of the directivity pattern. plotted at frequency j ;j, from the initial conditions to the finally converged (adapted) state. The latter was achieved after 682 cycles of the frequency j~. The learning curve for this experiment is shown in Fig . 15(b). The final array sensitivities in the five noise directions relative to the array sensitivity in the desired look direction are shown in Table 1. The signal-to-noise ratio was improved by a factor of about IS over that of a single isotropic radiator, In Fig. 15(b), one can roughly discern a time constant approximately equal to 70 cycles of the frequency fo. Since there were 20 adaptations per cycle of j~, the learn ing curve time constant was approximately 'mse = 1400 adaptations. Within about 400 cycles of 10, the adaptive process virtually converges to steady state. If fo were 1 MHz, 400 J1s would be the realtime settling time . The misadjustment for this process can be roughly estimated by using (32), although actually all eigenvalues were not equal as required by this equation : M

,.

This is a very low value of misadjustment, indicating a very slow, precise adaptive process. This is evidenced by the

F iu, 15. Evoluti on o f the direcriv uv pattern whr lc lcurrung III elim inate - live direction al noise s and uncorrelutcd no ises. I.·\rra~ configuration o f Fig . 12.) (a) Sequence of dircctiv ity patterns Juring adaptation. (b) Learning curve uotal number o f adaptations = 20T) .

learning curve Fig. 15(b) for this experiment. which is very smoo th and noise-free. Broadband Processor Experiments
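As a brief aside before the broadband results, the bookkeeping behind the narrowband figures quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the misadjustment formula is the equal-eigenvalue estimate that the text attributes to (32), applied with n = 24 weights, 20 adaptations per cycle of f_0, and the 70-cycle time constant read off Fig. 15(b).

```python
def time_constant_in_adaptations(cycles, adaptations_per_cycle=20):
    """Convert a learning-curve time constant from cycles of f0 to adaptations."""
    return cycles * adaptations_per_cycle


def misadjustment_estimate(num_weights, tau_mse):
    """Equal-eigenvalue estimate M = n / (4 * tau_mse), tau_mse in adaptations."""
    return num_weights / (4.0 * tau_mse)


def settling_time_seconds(cycles, f0_hz):
    """Real-time duration of a given number of cycles of f0."""
    return cycles / f0_hz


tau_mse = time_constant_in_adaptations(70)   # 70 cycles of f0, as quoted in the text
m = misadjustment_estimate(24, tau_mse)      # 24 adaptive weights, as quoted in the text

print(tau_mse)                               # 1400 adaptations
print(round(100 * m, 2))                     # 0.43 (percent)
print(settling_time_seconds(400, 1e6))       # 0.0004 s, i.e. 400 microseconds at f0 = 1 MHz
```

The same three relations can be reused with other values of f_0 or other time constants; they say nothing about whether the equal-eigenvalue assumption holds, which the text itself notes it does not strictly do here.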

Fig. 16 shows the antenna array configuration and signal processor used in a series of computer-simulated broadband experiments. In these experiments, the one-mode or simultaneous adaptation process was used to adjust the weights. Each antenna element in a five-element circular array was connected to a tapped delay line having five variable weights, as shown in the figure. A broadband pilot signal was used, and the desired look direction was chosen (arbitrarily, for purposes of example) to be θ = -13°. The frequency spectrum of the pilot signal is shown in Fig. 17(a). This spectrum is approximately one octave wide and is centered at frequency f_0. A time-delay increment of 1/(4f_0) was used in the tapped delay line, thus providing a delay between adjacent weights of a quarter cycle at frequency f_0, and a total delay-line length of one wavelength at this frequency.

Fig. 16. Array configuration and processing for broadband experiments. (a) Array geometry. (b) Individual element signal processor.

Fig. 17. Frequency spectra for broadband experiments. (a) Pilot signal at θ = -13°. (b) Incident noises at θ = 50° and θ = -70°.

The computer-simulated noise field consisted of two wideband directional noise sources² incident on the array at angles θ = 50° and θ = -70°. Each source of noise had power 0.5. The noise at θ = 50° had the same frequency spectrum as the pilot signal (though with reduced power), while the noise at θ = -70° was narrower and centered at a slightly higher frequency. The noise sources were uncorrelated with the pilot signal. Fig. 17(b) shows these frequency spectra. Additive "white" Gaussian noises (mutually uncorrelated) of power 0.0625 were also present in each of the antenna-element signals.

² Broadband directional noises were computer-simulated by first generating a series of uncorrelated ("white") pseudorandom numbers, applying them to an appropriate sampled-data (discrete, digital) filter to achieve the proper spectral characteristics, and then applying the resulting correlated noise waveform to each of the simulated antenna elements with the appropriate delays to simulate the effect of a propagating wavefront.

To demonstrate the effects of adaptation rate, the experiments were performed twice, using two different values (-0.0025 and -0.00025) for k_s, the scalar constant in (23). Fig. 18(a) and (b) shows the learning curves obtained under these conditions. The abscissa of each curve is expressed in cycles of f_0, the array center frequency; and, as before, the array was adapted at a rate of twenty times per cycle of f_0. Note that the faster learning curve is a much more noisy one.

Fig. 18. Learning curves for broadband experiments. (a) Rapid learning (M = 13 percent). (b) Slow learning (M = 1.3 percent).

Since the statistics of the pilot signal and directional noises in this example are known (having been generated in the computer simulation), it is possible to check measured values of misadjustment against theoretical values. Thus the Φ(x, x) matrix is known, and its eigenvalues have been computed.³

³ They are: 10.65, 9.83, 5.65, 5.43, 3.59, 3.44, 2.68, 2.13, 1.45, 1.35, 1.20, 0.99, 0.66, 0.60, 0.46, 0.29, 0.24, 0.20, 0.16, 0.12, 0.10, 0.087, 0.083, 0.075, 0.069.

Using (30) and (31) and the known eigenvalues, the misadjustment for the two values of k_s is calculated to give the following values:

k_s         Theoretical Value of M   Experimental Value of M
-0.0025     0.1288                   0.134
-0.00025    0.0129                   0.0170

The theoretical values of misadjustment check quite well with corresponding measured values.

From the known statistics the optimum (in the least-squares sense) weight vector W_LMS can be computed, using (19). The antenna directivity pattern for this optimum weight vector W_LMS is shown in Fig. 19(a). This is a broadband directivity pattern, in which the relative sensitivity of the array versus angle of incidence is plotted for a broadband received signal having the same frequency spectrum as the pilot signal. This form of directivity pattern has few side lobes, and nulls which are generally not very deep. In Fig. 19(b), the broadband directivity pattern which resulted from adaptation (after 625 cycles of f_0, with k_s = -0.0025) is plotted for comparison with the optimum broadband pattern. Note that the patterns are almost indistinguishable from each other.

Fig. 19. Comparison of optimum broadband directivity pattern with experimental pattern after former has been adapted during 625 cycles of f_0. (Plotted at frequency f_0.) (a) Optimum pattern. (b) Adapted with k_s = -0.00025.

The learning curves of Fig. 18(a) and (b) are composed of decaying exponentials of various time constants. When k_s is set to -0.00025, in Fig. 18(b), the misadjustment is about 1.3 percent, which is a quite small, but practical value. With this rate of adaptation, it can be seen from Fig. 18(b) that adapting transients are essentially finished after about 500 cycles of f_0. If f_0 is 1 MHz, for example, adaptation could be completed (if the adaptation circuitry is fast enough) in about 500 μs. If f_0 is 1 kHz, adaptation could be completed in about one-half second. Faster adaptation is possible, but there will be more misadjustment. These figures are typical for an adaptive antenna with broadband noise inputs with 25 adaptive weights. For the same level of misadjustment, convergence times increase approximately linearly with the number of weights.[1]

The ability of this adaptive antenna array to obtain "frequency tuning" is shown in Fig. 20. This figure gives the sensitivities of the adapted array (after 1250 cycles of f_0 at k_s = -0.00025) as a function of frequency for the desired look direction, Fig. 20(a), and for the two noise directions, Fig. 20(b) and (c). The spectra of the pilot signal and noises are also shown in the figures.

In Fig. 20(a), the adaptive process tends to make the sensitivity of this simple array configuration as close as possible to unity over the band of frequencies where the pilot signal has finite power density. Improved performance might be attained by adding antenna elements and by adding more taps to each delay line; or, more simply, by band-limiting the output to the passband of the pilot signal. Fig. 20(b) and (c) shows the sensitivities of the array in the directions of the noises. Illustrated in this figure is the very striking reduction of the array sensitivity in the directions of the noises, within their specific passbands. The same idea is illustrated by the nulls in the broadband directivity patterns which occur in the noise directions, as shown in Fig. 19. After the adaptive transients subsided in this experiment, the signal-to-noise ratio was improved by the array over that of a single isotropic sensor by a factor of 56.

Fig. 20. Array sensitivity versus frequency, for broadband experiment of Fig. 19. (a) Desired look direction, θ = -13°. (b) Sensitivity in one noise direction, θ = 50°. (c) Sensitivity in the other noise direction, θ = -70°.

IMPLEMENTATION

The discrete adaptive processor shown in Figs. 7(a) and 8 could be realized by either a special-purpose digital apparatus or a suitably programmed general-purpose machine. The antenna signals would need analog-to-digital conversion, and then they would be applied to shift registers or computer memory to realize the effects of the tapped delay lines as illustrated in Fig. 5. If the narrowband scheme shown in Fig. 4 is to be realized, the time delays can be implemented either digitally or by analog means (phase shifters) before the analog-to-digital conversion process.

The analog adaptive processor shown in Figs. 7(b) and 8 could be realized by using conventional analog-computer apparatus, such as multipliers, integrators, summers, etc. More economical realizations that would, in addition, be more suitable for high-frequency operation might use field-effect transistors as the variable-gain multipliers, whose control (gate) signals could come from capacitors used as integrators to form and store the weight values. On the other hand, instead of using a variable resistance structure to form the vector dot products, the same function could be achieved using variable-voltage capacitors, with ordinary capacitors again storing the weight values. The resulting structure would be a capacitive voltage divider rather than a resistive one. Other possible realizations of analog weights include the use of a Hall-effect multiplier combiner with magnetic storage[24] and also the electrochemical memistor of Widrow and Hoff.[25] Further efforts will be required to improve existing weighting elements and to develop new ones which are simple, cheap, and adaptable according to the requirements of the various adaptation algorithms. The realization of the processor ultimately found to be useful in certain applications may be composed of a combination of analog and digital techniques.

RELAXATION ALGORITHMS AND THEIR IMPLEMENTATION

Algorithms other than the LMS procedure described in the foregoing exist that may permit considerable decrease in complexity with specific adaptive circuit implementations. One method of adaptation which may be easy to implement electronically is based on a relaxation algorithm described by Southwell.[26] This algorithm uses the same error signal as used in the LMS technique. An estimated mean-square error formed by squaring and averaging this error signal over a finite time interval is used in determining the proper weight adjustment. The relaxation algorithm adjusts one weight at a time in a cyclic sequence. Each weight in its turn is adjusted to minimize the measured mean-square error. This method is in contrast to the simultaneous adjustment procedure of the LMS steepest-descent algorithm. The relaxation procedure can be shown to produce a misadjustment that increases with the square of the number of weights, as opposed to the LMS algorithm whose misadjustment increases only linearly with the number of weights. For a given level of misadjustment, the adaptation settling time of the relaxation process increases with the square of the number of weights.

For implementation of the Southwell relaxation algorithm, the configurations of the array and adaptive processor remain the same, as does the use of the pilot signal. The relaxation algorithm will work with either the two-mode or the one-mode adaptation process. Savings in circuitry may result, in that changes in the adjustments of the weight values depend only upon error measurements and not upon configurations of error measurements and simultaneous input-signal measurements. Circuitry for implementing the LMS systems as shown in Fig. 7(a) and (b) may be more complicated.

The relaxation method may be applicable in cases where the adjustments are not obvious "weight" settings. For example, in a microwave system, the adjustments might be a system of motor-driven apertures or tuning stubs in a waveguide or a network of waveguides feeding an antenna. Or the adjustments may be in the antenna geometry itself. In such cases, the mean-square error can still be measured, but it is likely that it would not be a simple quadratic function of the adjustment parameters. In any event, some very interesting possibilities in automatic optimization are presented by relaxation adaptation methods.
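The one-weight-at-a-time idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure of [26] or of any circuit in the paper: the names are hypothetical, the data are synthetic and noise-free, and each weight is moved to the vertex of a parabola fitted through three measured mean-square errors, which is exact only when the error is quadratic in that weight. Note that the adjustment uses error measurements alone, with no simultaneous use of the input signals.

```python
import random


def measured_mse(weights, samples):
    """Average squared error between desired response and weighted sum."""
    total = 0.0
    for inputs, desired in samples:
        output = sum(w * x for w, x in zip(weights, inputs))
        total += (desired - output) ** 2
    return total / len(samples)


def relax_weight(weights, index, samples, probe=0.1):
    """Move one weight to the minimum of the measured MSE, others held fixed.

    Three MSE measurements (at -probe, 0, +probe) define a parabola in this
    weight; the weight is set to the parabola's vertex.
    """
    def mse_with(value):
        trial = list(weights)
        trial[index] = value
        return measured_mse(trial, samples)

    center = weights[index]
    low = mse_with(center - probe)
    mid = mse_with(center)
    high = mse_with(center + probe)
    curvature = high - 2.0 * mid + low
    updated = list(weights)
    if curvature > 0.0:  # only step when the fitted parabola has a minimum
        updated[index] = center - probe * (high - low) / (2.0 * curvature)
    return updated


def relaxation_adapt(weights, samples, sweeps=20):
    """Adjust the weights one at a time, in a cyclic sequence."""
    for _ in range(sweeps):
        for index in range(len(weights)):
            weights = relax_weight(weights, index, samples)
    return weights


random.seed(0)
true_weights = [0.5, -1.0, 2.0]  # hypothetical target, for illustration only
samples = []
for _ in range(200):
    inputs = [random.uniform(-1.0, 1.0) for _ in true_weights]
    desired = sum(w * x for w, x in zip(true_weights, inputs))
    samples.append((inputs, desired))

adapted = relaxation_adapt([0.0, 0.0, 0.0], samples)
print([round(w, 3) for w in adapted])  # should recover the target weights
```

With a non-quadratic error surface, such as the waveguide or geometry adjustments mentioned above, the parabolic step would no longer land on the minimum in one move, and a line search per adjustment would be the natural substitute.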

OTHER APPLICATIONS AND FURTHER WORK ON ADAPTIVE ANTENNAS

Work is continuing on the proper choice of pilot signals to achieve the best trade-off between response in the desired look direction and rejection of noises. The subject of "null-steering," where the adaptive algorithm causes the nulls of the directivity pattern to track moving noise sources, is also being studied.

The LMS criterion used as the performance measure in this paper minimizes the mean-square error between the array output and the pilot signal waveform. It is a useful performance measure for signal extraction purposes. For signal detection, however, maximization of array output signal-to-noise ratio is desirable. Algorithms which achieve the maximum SNR solution are also being studied. Goode[27] has described a method for synthesizing the optimal Bayes detector for continuous waveforms using Wiener (LMS) filters. A third criterion under investigation has been discussed by Kelley and Levin[28] and, more recently, applied by Capon et al.[29] to the processing of large aperture seismic array (LASA) data. This filter, the maximum-likelihood array processor, is constrained to provide a distortionless signal estimate and simultaneously minimize output noise power. Griffiths[30] has discussed the relationship between the maximum-likelihood array processor and the Wiener filter for discrete systems.

The examples given have illustrated the ability of the adaptive antenna system to counteract directional interfering noises, whether they are monochromatic, narrowband, or broadband. Although adaptation processes have been applied here exclusively to receiving arrays, they may also be applied to transmitting systems. Consider, for example, an application to aid a low-power transmitter. If a fixed amplitude and frequency pilot signal is transmitted from the receiving site on a slightly different frequency than that of the carrier of the low-power information transmitter, the transmitter array could be adapted (in a receiving mode) to place a beam in the direction of this pilot signal, and, therefore, by reciprocity the transmitting beam would be directed toward the receiving site. The performance of such a system would be very similar to that of the retrodirective antenna systems,[5],[6] although the methods of achieving such performance would be quite different. These systems may be useful in satellite communications.

An additional application of interest is that of "signal seeking." The problem is to find a coherent signal of unknown direction in space, and to find this signal by adapting the weights so that the array directivity pattern receives this signal while rejecting all other noise sources. The desired response or pilot signal for this application is the received signal itself processed through a narrowband filter. The use of the output signal of the adaptive processor to provide its own desired response is a form of unsupervised learning that has been referred to as "bootstrap learning."[31] Use of this adaptation algorithm yields a set of weights which accepts all correlated signals (in the desired passband) and rejects all other received signals. This system has been computer simulated and shown to operate as expected. However, much work of a theoretical and experimental nature needs to be done on capture and rejection phenomena in such systems before they can be reported in detail.
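Of the criteria surveyed above, the distortionless, minimum-noise-power constraint lends itself to a compact numerical illustration. The sketch below is an assumption-laden example rather than the formulation of [28] or [29]: it uses the standard closed form w = R^-1 a / (a^H R^-1 a) for minimizing output noise power subject to unit gain in the look direction, a hypothetical two-element line array with half-wavelength spacing, one interferer at 40°, and illustrative noise powers.

```python
import cmath
import math


def steering_vector(angle_deg, spacing_wavelengths=0.5, elements=2):
    """Narrowband response of a uniform line array to a plane wave."""
    phase = 2.0 * math.pi * spacing_wavelengths * math.sin(math.radians(angle_deg))
    return [cmath.exp(-1j * phase * k) for k in range(elements)]


def noise_covariance(interferer, interferer_power, white_power):
    """R = white_power * I + interferer_power * a_i a_i^H (two elements)."""
    return [[interferer_power * interferer[r] * interferer[c].conjugate()
             + (white_power if r == c else 0.0)
             for c in range(2)] for r in range(2)]


def solve_2x2(matrix, rhs):
    """Solve a 2x2 complex linear system by Cramer's rule."""
    (a, b), (c, d) = matrix
    det = a * d - b * c
    return [(d * rhs[0] - b * rhs[1]) / det, (a * rhs[1] - c * rhs[0]) / det]


def distortionless_weights(covariance, look):
    """w = R^-1 a / (a^H R^-1 a): unit gain toward `look`, minimum noise power."""
    r_inv_a = solve_2x2(covariance, look)
    gain = sum(x.conjugate() * y for x, y in zip(look, r_inv_a))
    return [value / gain for value in r_inv_a]


def array_response(weights, direction):
    """Complex gain w^H a of the weighted array toward `direction`."""
    return sum(w.conjugate() * a for w, a in zip(weights, direction))


look = steering_vector(0.0)            # desired look direction
interferer = steering_vector(40.0)     # directional noise source
weights = distortionless_weights(noise_covariance(interferer, 1.0, 0.1), look)

print(abs(array_response(weights, look)))        # unit gain in the look direction
print(abs(array_response(weights, interferer)))  # much smaller gain toward the interferer
```

Unlike the pilot-signal LMS approach, this form needs the noise covariance and the look-direction response explicitly, which is one way to see why the relationship between the two criteria is worth studying.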

CONCLUSION

It has been shown that the techniques of adaptive filtering can be applied to processing the output of the individual elements in a receiving antenna array. This processing results in reduced sensitivity of the array to interfering noise sources whose characteristics may be unknown a priori. The combination of array and processor has been shown to act as an automatically tunable filter in both space and frequency.

ACKNOWLEDGMENT

The authors are indebted to Dr. M. E. Hoff, Jr., for a number of useful discussions in the early development of these ideas, and to Mrs. Mabel Rockwell who edited the manuscript.

REFERENCES

[1] B. Widrow, "Adaptive filters I: Fundamentals," Stanford Electronics Labs., Stanford, Calif., Rept. SEL-66-126 (Tech. Rept. 6764-6), December 1966.
[2] J. S. Koford and G. F. Groner, "The use of an adaptive threshold element to design a linear optimal pattern classifier," IEEE Trans. Information Theory, vol. IT-12, pp. 42-50, January 1966.
[3] K. Steinbuch and B. Widrow, "A critical comparison of two kinds of adaptive classification networks," IEEE Trans. Electronic Computers (Short Notes), vol. EC-14, pp. 737-740, October 1965.
[4] C. H. Mays, "The relationship of algorithms used with adjustable threshold elements to differential equations," IEEE Trans. Electronic Computers (Short Notes), vol. EC-14, pp. 62-63, February 1965.
[5] L. C. Van Atta, "Electromagnetic reflection," U.S. Patent 2 908 002, October 6, 1959.
[6] "Special Issue on Active and Adaptive Antennas," IEEE Trans. Antennas and Propagation, vol. AP-12, March 1964.
[7] C. V. Jakowatz, R. L. Shuey, and G. M. White, "Adaptive waveform recognition," 4th London Symp. on Information Theory. London: Butterworths, September 1960, pp. 317-326.
[8] L. D. Davisson, "A theory of adaptive filtering," IEEE Trans. Information Theory, vol. IT-12, pp. 97-102, April 1966.
[9] E. M. Glaser, "Signal detection by adaptive filters," IRE Trans. Information Theory, vol. IT-7, pp. 87-98, April 1961.
[10] F. Bryn, "Optimum signal processing of three-dimensional arrays operating on gaussian signals and noise," J. Acoust. Soc. Am., vol. 34, pp. 289-297, March 1962.
[11] H. Mermoz, "Adaptive filtering and optimal utilization of an antenna," U.S. Navy Bureau of Ships (translation 903 of Ph.D. thesis, Institut Polytechnique, Grenoble, France), October 4, 1965.
[12] S. W. W. Shor, "Adaptive technique to discriminate against coherent noise in a narrow-band system," J. Acoust. Soc. Am., vol. 39, pp. 74-78, January 1966.
[13] B. Widrow and M. E. Hoff, Jr., "Adaptive switching circuits," IRE WESCON Conv. Rec., pt. 4, pp. 96-104, 1960.
[14] N. G. Nilsson, Learning Machines. New York: McGraw-Hill, 1965.
[15] B. Widrow and F. W. Smith, "Pattern-recognizing control systems," 1963 Computer and Information Sciences (COINS) Symp. Proc. Washington, D.C.: Spartan, 1964.
[16] L. R. Talbert et al., "A real-time adaptive speech-recognition system," Stanford Electronics Labs., Stanford University, Stanford, Calif., Rept. SEL 63-064 (Tech. Rept. 6760-1), May 1963.
[17] F. W. Smith, "Design of quasi-optimal minimum time controllers," IEEE Trans. Automatic Control, vol. AC-11, pp. 71-77, January 1966.
[18] J. P. Burg, "Three-dimensional filtering with an array of seismometers," Geophysics, vol. 29, pp. 693-713, October 1964.
[19] J. F. Claerbout, "Detection of P waves from weak sources at great distances," Geophysics, vol. 29, pp. 197-211, April 1964.
[20] H. Robbins and S. Monro, "A stochastic approximation method," Ann. Math. Stat., vol. 22, pp. 400-407, March 1951.
[21] J. Kiefer and J. Wolfowitz, "Stochastic estimation of the maximum of a regression function," Ann. Math. Stat., vol. 23, pp. 462-466, March 1952.
[22] A. Dvoretzky, "On stochastic approximation," Proc. 3rd Berkeley Symp. on Math. Stat. and Prob., J. Neyman, Ed. Berkeley, Calif.: University of California Press, 1956, pp. 39-55.
[23] B. Widrow, "Adaptive sampled-data systems," Proc. 1st Internat'l Congress of the Internat'l Federation of Automatic Control (Moscow, 1960). London: Butterworths, 1960.
[24] D. Gabor, W. P. L. Wilby, and R. Woodcock, "A universal nonlinear filter predictor and simulator which optimizes itself by a learning process," Proc. IEE (London), vol. 108 B, July 1960.
[25] B. Widrow and M. E. Hoff, Jr., "Generalization and information storage in networks of adaline 'neurons'," in Self-Organizing Systems 1962, M. C. Yovits, G. T. Jacobi, and G. D. Goldstein, Eds. Washington, D.C.: Spartan, 1962, pp. 435-461.
[26] R. V. Southwell, Relaxation Methods in Engineering Science. London: Oxford University Press, 1940.
[27] B. B. Goode, "Synthesis of a nonlinear Bayes detector for Gaussian signal and noise fields using Wiener filters," IEEE Trans. Information Theory (Correspondence), vol. IT-13, pp. 116-118, January 1967.
[28] E. J. Kelley and M. J. Levin, "Signal parameter estimation for seismometer arrays," M.I.T. Lincoln Lab., Lexington, Mass., Tech. Rept. 339, January 8, 1964.
[29] J. Capon, R. J. Greenfield, and R. J. Kolker, "Multidimensional maximum-likelihood processing of a large aperture seismic array," Proc. IEEE, vol. 55, pp. 192-211, February 1967.
[30] L. J. Griffiths, "A comparison of multidimensional Wiener and maximum-likelihood filters for antenna arrays," Proc. IEEE (Letters), vol. 55, pp. 2045-2047, November 1967.
[31] B. Widrow, "Bootstrap learning in threshold logic systems," presented at the American Automatic Control Council (Theory Committee), IFAC Meeting, London, England, June 1966.

Overview of Spatial Channel Models for Antenna Array Communication Systems

RICHARD B. ERTEL AND PAULO CARDIERI, VIRGINIA POLYTECHNIC INSTITUTE
KEVIN W. SOWERBY, UNIVERSITY OF AUCKLAND, NEW ZEALAND
THEODORE S. RAPPAPORT AND JEFFREY H. REED, VIRGINIA POLYTECHNIC INSTITUTE

Abstract

Throughout the history of wireless communications, spatial antenna diversity has been important in improving the radio link between wireless users. Historically, microscopic antenna diversity has been used to reduce the fading seen by a radio receiver, whereas macroscopic diversity provides multiple listening posts to ensure that mobile communication links remain intact over a wide geographic area. In recent years, the concepts of spatial diversity have been expanded to build foundations for emerging technologies, such as smart (adaptive) antennas and position location systems. Smart antennas hold great promise for increasing the capacity of wireless communications because they radiate and receive energy only in the intended directions, thereby greatly reducing interference. To properly design, analyze, and implement smart antennas and to exploit spatial processing in emerging wireless systems, accurate radio channel models that incorporate spatial characteristics are necessary. In this tutorial, we review the key concepts in spatial channel modeling and present emerging approaches. We also review the research issues in developing and using spatial channel models for adaptive antennas.

With the advent of antenna array systems for both interference cancellation and position location applications comes the need to better understand the spatial properties of the wireless communications channel. These spatial properties of the channel will have an enormous impact on the performance of antenna array systems; hence, an understanding of these properties is paramount to effective system design and evaluation.

The challenge facing communications engineers is to develop realistic channel models that can efficiently and accurately predict the performance of a wireless system. It is important to stress here that the level of detail about the environment a channel model must provide is highly dependent on the type of system under consideration. To predict the performance of single-sensor narrowband receivers, it may be acceptable to consider only the received signal power and/or time-varying amplitude (fading) distribution of the channel. However, for emerging wideband multisensor arrays, in addition to signal power level, information regarding the signal multipath delay and angle of arrival (AOA) is needed.

Classical models provide information about signal power level distributions and Doppler shifts of the received signals. These models have their origins in the early days of cellular radio when wideband digital modulation techniques were not readily available. As shown subsequently, many of the emerging spatial models in the literature utilize the fundamental principles of the classical channel models. However, modern spatial channel models build on the classical understanding of fading and Doppler spread, and incorporate additional concepts such as time delay spread, AOA, and adaptive array antenna geometries.

In this article, we review the fundamental channel models that have led to the present-day theories of spatial diversity from both mobile user and base station perspectives. The evolution of these models has paralleled that of cellular systems. Early models only accounted for amplitude and time-varying properties of the channel. These models were then enhanced by adding time delay spread information, which is important when dealing with digital transmission performance. Now, with the introduction of techniques and features that depend on the spatial distribution of the mobiles, spatial information is required in the channel models. As shown in the next sections, more accurate models for the distribution of the scatterers surrounding the mobile and base station are needed.

The differentiation between the mobile and base station is important. Classical work has demonstrated that models must account for the physical geometry of scattering objects in the vicinity of the antenna of interest. The number and locations of these scattering objects are dependent on the heights of the antennas, particularly regarding the local environment.

This article, then, explores some of the emerging models for spatial diversity and adaptive antennas, and includes the physical mechanisms and motivations behind the models. A literature survey of existing RF channel measurements with AOA information is also included. The article concludes with a summary and suggestions for future research.

Wireless Multipath Channel Models

This section describes the physical properties of the wireless communication channel that must be modeled. In a wireless system, a signal transmitted into the channel interacts with the environment in a very complex way. There are reflections from large objects, diffraction of the electromagnetic waves around objects, and signal scattering. The result of these complex interactions is the presence of many signal components, or multipath signals, at the receiver. Another property of wireless channels is the presence of Doppler shift, which is caused by the motion of the receiver, the transmitter, and/or any other objects in the channel. A simplified pictorial of the multipath environment with two mobile stations is shown in Fig. 1. Each signal component experiences a different path environment, which will determine the amplitude A_{l,k}, carrier phase shift φ_{l,k}, time delay τ_{l,k}, AOA θ_{l,k}, and Doppler shift f_d of the lth signal component of the kth mobile. In general, each of these signal parameters will be time-varying.

This work was partially supported by the DARPA GloMo program, Virginia Tech's Federal Highways Research Center of Excellence, Virginia Tech's Bradley Foundation, the Brazilian National Science Council (CNPq), and NSF Presidential Faculty Fellowship.

Reprinted from IEEE Personal Communications Magazine, Vol. 5, No. 1, pp. 10-22, February 1998.

Figure 1. Multipath propagation channel: a) side view; b) top view.

The early classical models, which were developed for narrowband transmission systems, only provide information about signal amplitude level distributions and Doppler shifts of the received signals. These models have their origins in the early days of cellular radio [1-4] when wideband digital modulation techniques were not readily available. As cellular systems became more complex and more accurate models were required, additional concepts, such as time delay spread, were incorporated into the model.

Representing the RF channel as a time-variant channel and using a baseband complex envelope representation, the channel impulse response for mobile 1 has classically been represented as [5]

$$h_1(t,\tau) = \sum_{l=0}^{L(t)-1} A_{l,1}(t)\, e^{j\varphi_{l,1}(t)}\, \delta\big(\tau - \tau_{l,1}(t)\big) \qquad (1)$$

$$S(f) = \frac{1.5}{\pi f_m \sqrt{1 - \left(\dfrac{f - f_c}{f_m}\right)^2}}$$

where f_m is the maximum Doppler shift given by v/λ, where λ is the wavelength of the transmitted signal at frequency f_c. Figure 2 shows the received signals at the base station, assuming that mobiles 1 and 2 have transmitted narrow pulses at the same time. Also shown is the output of an antenna array system adapted to mobile 1.

The channel model in Eq. 1 does not consider the AOA of each multipath component shown in Figs. 1 and 2. For narrowband signals, the AOA may be included into the vector channel impulse response using

$$\vec{h}_1(t,\tau) = \sum_{l=0}^{L(t)-1} A_{l,1}(t)\, e^{j\varphi_{l,1}(t)}\, \vec{a}\big(\theta_l(t)\big)\, \delta\big(\tau - \tau_{l,1}(t)\big) \qquad (2)$$

where $\vec{a}(\theta_l(t))$ is the array response vector. The array response vector is a function of the array geometry and AOA. Figure 3 shows the case for an arbitrary array geometry when the array and signal are restricted to two-dimensional space. The resulting array response vector is given by

Ir exp(- J.lf/ I I .) ii I

i exp]- i lf/ I.2 )

l.•

£1(8,(1 )) = i e xpl - i lf/u )

(I )

N ,,,, )

where L(t) is the number of rnultipath compon ents a nd th e other variables have already be en defined. The am p litude A u of the multipath co m po ne n ts is usually modeled as a Ra yleigh distributed rand om vari able . while the ph ase shift <J>/.k is uniformly distributed. The time-varying nature of a wireless c ha nnel is caused by the motion of obj ects in the channel. A measure of the time rate of change of the c h a n ne l is the Doppler pow er spec trum. introduced by M . J. Gans in 1972 [2J . The Doppler power spectru m provides us with sta tistical information on th e variation of the frequency of a tone rec eiv ed by a mobile tr aveling at speed v . Based on th e tlat fading channel model dev eloped by R . H . Clarke in 1968. G an s ass ume d that the received signal a t the mobile s t a t io n ca m e from all directions a n d wa s un iformly distributed . U nd e r the se assumptions and for a A/4 ve rt ica l antenna , the Doppler power sp ectrum is give n by [5]

where 'I'{.i(t) =[x,cos(8 {(t)) + y ,s i n ( 8 , (T )) ] ~ a nd ~ = 2reiA. is the wave numbe r. The spa tia l c h a nn e l impulse re sponse give n in Eq . :2 is a su m m a t io n of several multip ath compon ents. e ach of which has its own amplitude . ph ase. a nd AOA. The distribution of these parameters is dependent o n the type of en vironment. In particular . the angle sp re ad o f th e channel is known to be a fu nctio n of both the en vironment and the base sta tio n anten na hei ght s . In the ne xt se c t io n . we de scribe macrocell a nd rnicr ocell e nv iro n m e nt s and di scu ss how the e nvir o n m e n t affects the signal parameters.

,Vlacrocell V5. Microcell

,\iIQcrocell Environment - Figure 4 sho ws the c ha n ne l on the forward link for a m acro c ~ll en v ironm ent. It is usuallv assumed that the scatt e re rs surrounding the mobile station are a bo ut the same hei ght as o r are higher than the mobile . This implies th at the received signal a t the mobile antenna a rr ives from all directions afte r bouncing fro m the su rro und ing scat terers as illustra ted in Fig. -I. Under these conditions, Gans ' assum ptio n that the AOA is uniformly distributed o ve r [0 . :2re] is va lid . Th e clas si cal

If - J;·I<J;n

1.5

I

/= 1)

1= 0

nj

L IIi - 1

2

else where

21

Rayleigh fading envelope with deep fades approximately $\lambda/2$ apart emanates from this model [5].

However, the AOA of the received signal at the base station is quite different. In a macrocell environment, typically, the base station is deployed higher than the surrounding scatterers. Hence, the received signals at the base station result from the scattering process in the vicinity of the mobile station, as shown in Fig. 5. The multipath components at the base station are restricted to a smaller angular region, $\theta_{BW}$, and the distribution of the AOA is no longer uniform over $[0, 2\pi]$. Other AOA distributions are considered later in this article. The base station model of Fig. 5 was used to develop the theory and practice of base station diversity in today's cellular system and has led to rules of thumb for the spacing of diversity antennas on cellular towers [3].

Figure 2. Channel impulse responses for mobiles 1 and 2: a) received signal from mobile 1 to the base station; b) received signal from mobile 2 to the base station; c) combined received signal from mobiles 1 and 2 at the base station; d) received signal at the base station when a beam steered toward mobile 1 is employed.

Figure 3. Arbitrary antenna array configuration.

Microcell Environment - In the microcell environment, the base station antenna is usually mounted at the same height as the surrounding objects. This implies that the scattering spread of the AOA of the received signal at the base station is larger than in the macrocell case since the scattering process also happens in the vicinity of the base station. Thus, as the base station antenna is lowered, the tendency is for the multipath AOA spread to increase. This change in the behavior of the received signal is very important as far as antenna array applications are concerned. Studies have shown that statistical characteristics of the received signal are functions of the angle spread. Lee [3] and Adachi [6] found that the correlation between the signals received at two base station antennas increases as the angle spread decreases.

This section has presented some of the physical properties of a wireless communication channel. A mathematical expression that describes the time-varying spatial channel impulse response was given in Eq. 2. In the next section, several models that provide varying levels of information about the spatial channel are presented.
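As a rough illustration of the array response vector and the summation in Eq. 2, the following minimal sketch evaluates the array response for arbitrary two-dimensional element positions and sums the path contributions for a single snapshot. It is not taken from the article: the function names, the example element positions, and the choice to ignore the path delays (a flat-fading snapshot) are assumptions made here for illustration.

```python
import cmath
import math


def array_response(theta, positions, wavelength):
    """Array response vector for elements at (x_i, y_i) and a plane wave with AOA theta (radians).

    Each entry is exp(-j * psi_i) with psi_i = beta * (x_i*cos(theta) + y_i*sin(theta)).
    """
    beta = 2.0 * math.pi / wavelength  # wavenumber
    return [
        cmath.exp(-1j * beta * (x * math.cos(theta) + y * math.sin(theta)))
        for x, y in positions
    ]


def vector_channel_snapshot(paths, positions, wavelength):
    """Sum A * exp(j*phi) * a(theta) over all paths.

    Assumption: path delays are ignored, so this is the narrowband
    (flat-fading) version of Eq. 2 at one time instant.
    """
    total = [0j] * len(positions)
    for amplitude, phase, theta in paths:
        a = array_response(theta, positions, wavelength)
        for i, a_i in enumerate(a):
            total[i] += amplitude * cmath.exp(1j * phase) * a_i
    return total


# Hypothetical example: three elements along the y axis, half-wavelength spacing.
wavelength = 1.0
positions = [(0.0, 0.0), (0.0, 0.5), (0.0, 1.0)]
# Each path is (amplitude, carrier phase in radians, AOA in radians).
paths = [(1.0, 0.0, 0.0), (0.5, math.pi / 2, math.radians(30.0))]
h = vector_channel_snapshot(paths, positions, wavelength)
```

Every entry of the array response has unit magnitude; only the phase carries the AOA information, which is why the angle spread of the paths controls the correlation between elements.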

Space: The Final Frontier

Details of the Spatial Channel Models

In the past when the distribution of angle of arrival of multipath signals was unknown, researchers assumed uniform distribution over $[0, 2\pi]$ [7]. In this section, a number of more realistic spatial channel models are introduced. The defining equations (or geometry) and the key results for the models are described. Also provided is an extensive list of references.

Table 1 lists some representative active research groups in the field and their Web site addresses where more information on the subject can be found. (Note that this is by no means an exhaustive list.) The Gaussian Wide Sense Stationary Uncorrelated Scattering (GWSSUS), Gaussian Angle of Arrival (GAA), Typical Urban (TU), and Bad Urban (BU) models described below were developed in a series of papers at the Royal Institute of Technology and may be downloaded from the Web site. Further details of the Geometrically Based Single Bounce (GBSB) models are given in theses at Virginia Tech, which are available at http://etd.vt.edu/etd/index.html.

These various models were developed and used for different applications. Some of the models were intended to provide information about only a single channel characteristic, such as angle spread, while others attempt to capture all the properties of the wireless channel. In the discussion of the models, an effort is made to identify the original motivation of the model and to convey the information the model is intended to provide.

Center for Communications Research - University of Bristol    http://www.fen.bris.ac.uk/elec/research/ccr/ccr.html
Center for Personkommunikation - Aalborg University    http://www.kom.auc.dk/CPK/
Center for Wireless Telecommunications - Virginia Tech    http://www.cwt.vt.edu/
Mobile and Portable Radio Research Group - Virginia Tech    http://www.mprg.ee.vt.edu
Research Group for RF Communications - University of Kaiserslautern    http://www.e-technik.uni-kl.de/
Royal Institute of Technology    http://www.s3.kth.se
Smart Antenna Research Group - Stanford University    http://www-isl.stanford.edu/groups/SARG
Telecommunications and Information Systems Engineering - University of Texas at Austin    http://www.ece.utexas.edu/projects/tise/
Wireless Technology Group - McMaster University    http://www.crl.mcmaster.ca

Table 1. Some active research groups in the field of adaptive antenna arrays.

Figure 4. Macrocell environment - the mobile station perspective.

Figure 5. Macrocell - base station perspective.

Lee's Model

In Lee's model, scatterers are evenly spaced on a circular ring about the mobile as shown in Fig. 6. Each of the scatterers is intended to represent the effect of many scatterers within the region, and hence they are referred to as effective scatterers. The model was originally used to predict the correlation between the signals received by two sensors as a function of element spacing. However, since the correlation matrix of the received signal vector of an antenna array can be determined by considering the correlation between each pair of elements, the model has application to any arbitrary array size.

Figure 6. Lee's model.

The level of correlation will determine the performance of spatial diversity methods [3, 9]. In general, larger angle spreads and element spacings result in lower correlations, which provide an increased diversity gain. Measurements of the correlation observed at both the base station and the mobile are consistent with a narrow angle spread at the base station and a large angle spread at the mobile. Correlation measurements made at the base station indicate that the typical radius of scatterers is from 100 to 200 wavelengths [3].

Assuming that N scatterers are uniformly placed on the circle with radius R and oriented such that a scatterer is located on the line of sight, the discrete AOAs are [9]

$$\theta_i \approx \frac{R}{D} \sin\left(\frac{2\pi}{N} i\right) \quad \text{for } i = 0, 1, \ldots, N-1.$$

From the discrete AOAs, the correlation of the signals between any two elements of the array can be found using [9]

$$\rho(d, \theta_0, R, D) = \frac{1}{N} \sum_{i=0}^{N-1} \exp[-j 2\pi d \cos(\theta_0 + \theta_i)],$$

where d is the element spacing and $\theta_0$ is measured with respect to the line between the two elements as shown in Fig. 6.

The original model provided information regarding only signal correlations. Motivated by the need to consider small-scale fading in diversity systems, Stapleton et al. proposed an extension to Lee's model that accounts for Doppler shift by imposing an angular velocity on the ring of scatterers [10, 11]. For the model to give the appropriate maximum Doppler shift, the angular velocity of the scatterers must equal $v/R$, where $v$ is the vehicle velocity and R is the radius of the scatterer ring [11]. Using this model to simulate a Rayleigh fading spatial channel model, the BER for a $\pi/4$ differential quadrature phase shift keyed (DQPSK) signal was simulated. The results were compared with measurements taken in a typical suburban environment. The resulting BER estimates were within a factor of two of the actual measured BER, indicating a reasonable degree of accuracy for the model [10].

When the model is used to provide joint AOA and time of arrival (TOA) channel information, one finds that the resulting power delay profile is "U-shaped" [12]. By considering the intersections of the effective scatterers by ellipses of constant delay, one finds that there is a high concentration of scatterers in ellipses with minimum delay, a high concentration of scatterers in ellipses with maximum delay, and a lower concentration of scatterers between. Higher concentrations of scatterers with a given delay correspond with larger powers, and hence larger values on the power delay profile. The "U-shaped" power delay profile is not consistent with measurements. Therefore, an extension to Lee's model is proposed in [11] in which additional scatterer rings are added to provide different power delay profiles.

While the model is quite useful in predicting the correlation between any two elements of the array, and hence the array correlation matrix, it is not well suited for simulations requiring a complete model of the wireless channel.

Discrete Uniform Distribution

A model similar to Lee's model in terms of both motivation and analysis was proposed in [9]. The model (referred to here as the discrete uniform distribution) evenly spaces N scatterers within a narrow beamwidth centered about the line of sight to the mobile as shown in Fig. 7. The discrete possible AOAs, assuming N is odd, are given by [9]

$$\theta_i = \frac{1}{N-1}\, \theta_{BW}\, i, \quad i = -\frac{N-1}{2}, \ldots, \frac{N-1}{2}.$$

From this, the correlation of the signals present at two antenna elements with a separation of d is found to be

$$\rho(d, \theta_0, \theta_{BW}) = \frac{1}{N} \sum_{i=-\frac{N-1}{2}}^{\frac{N-1}{2}} \exp[-j 2\pi d \cos(\theta_0 + \theta_i)].$$

Figure 7. Discrete uniform geometry.

Measurements reported in [9] suggest that the AOA statistics in rural and suburban environments are Gaussian distributed (see the discussion of the GAA model later). However, in practice the AOA will be discrete (i.e., a finite number of samples from a Gaussian distribution), and therefore it is not valid to use a continuous AOA distribution to estimate the correlation present between different antenna elements in the array. The correlation that results from a continuous AOA distribution decreases monotonically with element spacing, whereas the correlation that results from a discrete AOA has damped oscillations present. Therefore, a continuous AOA distribution will underestimate the correlation that exists between the elements in the array [9].

In [9], a comparison is made between the correlation obtained using the discrete uniform distribution model, Lee's model, and a continuous Gaussian AOA as a function of element spacing. The comparisons indicate that, for small element separations (two wavelengths), the three models have nearly identical correlations. For larger element separations (greater than two wavelengths), the correlation values using the continuous Gaussian AOA are close to zero, while the two discrete models have oscillation peaks with correlations as high as 0.2 even beyond four wavelengths. Additionally, it was found that the correlation of the discrete uniform distribution falls off more quickly than the correlation in Lee's model.

Again, while the model is useful for predicting the correlation between any pair of elements in the array (which can be used to calculate the array correlation matrix), it fails to include all the phenomena, such as delay spread and Doppler spread, required for certain types of simulations.

Figure 8. Circular scatterer density geometry.

Geometrically Based Single-Bounce Statistical Channel Models

Geometrically Based Single-Bounce (GBSB) Statistical Channel Models are defined by a spatial scatterer density function. These models are useful for both simulation and analysis purposes. Use of the models for simulation involves randomly placing scatterers in the scatterer region according to the form

of the spatial scatterer density function. From the location of each of the scatterers, the AOA, TOA, and signal amplitude are determined.

From the spatial scatterer density function, it is possible to derive the joint and marginal TOA and AOA probability density functions. Knowledge of these statistics can be used to predict the performance of an adaptive array. Furthermore, knowledge of the underlying structure of the resulting array response vector may be exploited by beamforming and position location algorithms.

The shape and size of the spatial scatterer density function required to provide an accurate model of the channel is subject to debate. Validation of these models through extensive measurements remains an active area of research.

Geometrically Based Circular Model (Macrocell Model) - The geometry of the Geometrically Based Single Bounce Circular Model (GBSBCM) is shown in Fig. 8. It assumes that the scatterers lie within radius $R_m$ about the mobile. Often the requirement that $R_m < D$ is imposed. The model is based on the assumption that in macrocell environments where antenna heights are relatively large, there will be no signal scattering from locations near the base station. The idea of a circular region of scatterers centered about the mobile was originally proposed by Jakes [13] to derive theoretical results for the correlation observed between two antenna elements. Later, it was used to determine the effects of beamforming on the Doppler spectrum [14, 15] for narrowband signals. It was shown that the rate and the depth of the envelope fades are significantly reduced when a narrow-beam beamformer is used.

The joint TOA and AOA density function obtained from the model provides some insights into the properties of the model. Using a Jacobian transformation, it is easy to derive the joint TOA and AOA density function at both the base station and the mobile. The resulting joint probability density functions (PDFs) at the base station and the mobile are [16]:

At the base station:

$$f_{\tau,\theta_b}(\tau, \theta_b) = \begin{cases} \dfrac{(D^2 - \tau^2 c^2)(D^2 c + \tau^2 c^3 - 2\tau c^2 D \cos(\theta_b))}{4\pi R_m^2 (D\cos(\theta_b) - \tau c)^3}, & \dfrac{D^2 - 2\tau c D\cos(\theta_b) + \tau^2 c^2}{\tau c - D\cos(\theta_b)} \le 2R_m \\ 0, & \text{else.} \end{cases}$$

At the mobile:

$$f_{\tau,\theta_m}(\tau, \theta_m) = \begin{cases} \dfrac{(D^2 - \tau^2 c^2)(D^2 c + \tau^2 c^3 - 2\tau c^2 D \cos(\theta_m))}{4\pi R_m^2 (D\cos(\theta_m) - \tau c)^3}, & \dfrac{D^2 - \tau^2 c^2}{D\cos(\theta_m) - \tau c} \le 2R_m \\ 0, & \text{else,} \end{cases}$$

where $\theta_b$ and $\theta_m$ are the angles of arrival measured relative to the line of sight from the base station and the mobile, respectively.

The joint TOA and AOA PDFs for the GBSBCM are shown in Figs. 9 and 10 for the case of D = 1 km and $R_m$ = 100 m from the base station and mobile perspectives, respectively. The circular model predicts a relatively high probability of multipath components with small excess delays along the line of sight. From the base-station perspective, all of the multipath components are restricted to lie within a small range of angles.

Figure 9. Joint TOA and AOA probability density function at the base station, circular model (log-scale).

Figure 10. Joint TOA and AOA probability density function at the mobile, circular model (log-scale).

The appropriate values for the radius of scatterers can be determined by equating the angle spread predicted by the model (which is a function of $R_m$) with measured values. Measurements reported in [9] suggest that typical angle spreads for macrocell environments with a T-R separation of 1 km are approximately two to six degrees. Also, it is stated that the angle spread is inversely proportional to the T-R separation, which leads to a radius of scatterers that ranges from 30 to 200 m [16]. In [3], it is stated that the active scattering region around the mobile is about 100-200 wavelengths for 900 MHz, which provides a range of 30-60 m, roughly the width of wide urban streets.

The GBSBCM can be used to generate random channels for simulation purposes. Generation of samples from the GBSBCM is accomplished by uniformly placing scatterers in the circular scatterer region about the mobile and then calculating the corresponding AOA, TOA, and power levels.

Geometrically Based Elliptical Model (Microcell Wideband Model) - The Geometrically Based Single Bounce Elliptical Model (GBSBEM) assumes that scatterers are uniformly distributed within an ellipse, as shown in Fig. 11, where the base station and mobile are the foci of the ellipse. The model was proposed for microcell environments where antenna heights are relatively low, and therefore multipath scattering near the base station is just as likely as multipath scattering near the mobile [17, 18].

Figure 11. Elliptical scatterer density geometry.

A nice attribute of the elliptical model is the physical interpretation that only multipath signals which arrive with an absolute delay $\le \tau_m$ are accounted for by the model. Ignoring components with larger delays is possible since signals with longer delays will experience greater path loss, and hence have relatively low power compared to those with shorter delays. Therefore, provided that $\tau_m$ is chosen sufficiently large, the model will account for nearly all the power and AOA of the multipath signals.

The parameters $a_m$ and $b_m$ are the semimajor axis and semiminor axis values, which are given by

$$a_m = \frac{c\,\tau_m}{2},$$

$$b_m = \frac{1}{2}\sqrt{c^2 \tau_m^2 - D^2},$$

where c is the speed of light and $\tau_m$ is the maximum TOA to be considered.

To gain some insight into the properties of this model, consider the resulting joint TOA and AOA density function. Using a transformation of variables of the original uniform scatterer spatial density function, it can be shown that the joint TOA and AOA density function observed at the base station is given by [16]

$$f_{\tau,\theta_b}(\tau, \theta_b) = \begin{cases} \dfrac{(D^2 - \tau^2 c^2)(D^2 c + \tau^2 c^3 - 2\tau c^2 D \cos(\theta_b))}{4\pi a_m b_m (D\cos(\theta_b) - \tau c)^3}, & \dfrac{D}{c} < \tau \le \tau_m \\ 0, & \text{elsewhere,} \end{cases}$$

where $\theta_b$ is the AOA observed at the base station. A plot of the joint TOA and AOA PDF is shown in Fig. 12 for the case of D = 1 km and $\tau_m$ = 5 μs. From the plot of the joint TOA and AOA PDF, it is apparent that the GBSBEM results in a high probability of scatterers with minimum excess delay along the line of sight.

Figure 12. Joint TOA and AOA probability density function, elliptical model (log-scale).

The choice of $\tau_m$ will determine both the delay spread and angle spread of the channel. Methods for selecting an appropriate value of $\tau_m$ are given in [18]. Table 2 summarizes the techniques for selecting $\tau_m$, where $L_r$ is the reflection loss in dB, n is the path loss exponent, and $\tau_0$ is the minimum path delay.

Criterion    Expression
Fixed maximum delay, $\tau_m$    $\tau_m$ = constant
Fixed threshold, T in dB    $\tau_m = \tau_0\, 10^{(T - L_r)/10n}$
Fixed delay spread, $\sigma_\tau$    $\tau_m = 3.244\,\sigma_\tau + \tau_0$
Fixed max. excess delay, $\tau_e$    $\tau_m = \tau_0 + \tau_e$

Table 2. Methods for selecting $\tau_m$.

To generate multipath profiles using the GBSBEM, the most efficient method is to uniformly place scatterers in the ellipse and then calculate the corresponding AOA, TOA, and power levels from the coordinates of the scatterer. Uniformly placing scatterers in an ellipse may be accomplished by first uniformly placing the scatterers in a unit circle and then scaling each x and y coordinate by $a_m$ and $b_m$, respectively [16].

Gaussian Wide Sense Stationary Uncorrelated Scattering

The GWSSUS is a statistical channel model that makes assumptions about the form of the received signal vector [19-22]. The primary motivation of the model is to provide a general equation for the received signal correlation matrix. In the GWSSUS model, scatterers are grouped into clusters in space. The clusters are such that the delay differences within each cluster are not resolvable within the transmission signal bandwidth. By including multiple clusters, frequency-selective fading channels can be modeled using the GWSSUS. Figure 13 shows the geometry assumed for the GWSSUS model corresponding to d = 3 clusters. The mean AOA for the kth cluster is denoted $\theta_{0k}$. It is assumed that the location and delay associated with each cluster remain constant over several data bursts, b. The form of the received signal vector is

$$x_b(t) = \sum_{k=1}^{d} v_{k,b}\, s(t - \tau_k),$$

where $v_{k,b}$ is the superposition of the steering vectors during the bth data burst within the kth cluster, which may be expressed as

$$v_{k,b} = \sum_{i=1}^{N_k} \alpha_{k,i}\, e^{j\varphi_{k,i}}\, a(\theta_{0k} - \theta_{k,i}),$$

where $N_k$ denotes the number of scatterers in the kth cluster, $\alpha_{k,i}$ is the amplitude, $\varphi_{k,i}$ is the phase, $\theta_{k,i}$ is the angle of arrival of the ith reflected scatterer of the kth cluster, and $a(\theta)$ is the array response vector in the direction of $\theta$ [9]. It is assumed that the steering vectors are independent for different k.

If $N_k$ is sufficiently large (approximately 10 or more [19]) for each cluster of scatterers, the central limit theorem may be applied to the elements of $v_{k,b}$. Under this condition, the elements of $v_{k,b}$ are Gaussian distributed. Additionally, it is assumed that $v_{k,b}$ is wide sense stationary. The time delays $\tau_k$ are assumed to be constant over several bursts, b, whereas the phases [...]

$$R_k = E\{v_{k,b}\, v_{k,b}^H\}$$

The model provides a fairly general result for the form of the covariance matrix. However, it does not indicate the number or location of the scattering clusters, and hence requires some additional information for application to typical environments.

Gaussian Angle of Arrival

The Gaussian Angle of Arrival (GAA) channel model is a special case of the GWSSUS model described above where only a single cluster is considered (d = 1), and the AOA statistics are assumed to be Gaussian distributed about some nominal angle, $\theta_0$, as shown in Fig. 14. Since only a single cluster is considered, the model is a narrowband channel model that is valid when the time spread of the channel is small compared to the inverse of the signal bandwidth; hence, time shifts may be modeled as simple phase shifts [23].

The statistics of the steering vector are distributed as a multivariate Gaussian random variable. Similar to the GWSSUS model, if no line of

sight is present, then $E\{v_{k,b}\} = 0$; otherwise, the mean is proportional to the array response vector $a(\theta_{0k})$. For the special case of uniform linear arrays, the covariance matrix may be described by

$$R(\theta_0, \sigma_\theta) \approx p\, a(\theta_0)\, a^H(\theta_0) \odot B(\theta_0, \sigma_\theta),$$

where the (k, l) element of $B(\theta_0, \sigma_\theta)$ is given by

$$B(\theta_0, \sigma_\theta)_{kl} = \exp\left[-2(\pi\Delta(k - l))^2 \sigma_\theta^2 \cos^2\theta_0\right],$$

p is the receiver signal power, $\Delta$ is the element spacing, and $\odot$ denotes element-wise multiplication [23].

Figure 13. GWSSUS geometry.

Figure 14. GAA geometry.

Time-Varying Vector Channel Model (Raleigh's Model)

Raleigh's time-varying vector channel model was developed to provide both small scale Rayleigh fading and theoretical spatial correlation properties [24]. The propagation environment considered is densely populated with large dominant reflectors (Fig. 15). It is assumed that at a particular time the channel is characterized by L dominant reflectors. The received signal vector is then modeled as

$$x(t) = \sum_{l=0}^{L(t)-1} a(\theta_l)\, \alpha_l(t)\, s(t - \tau_l) + n(t),$$

where a is the array response vector, $\alpha_l(t)$ is the complex path amplitude, s(t) is the modulated signal, and n(t) is additive noise. This is equivalent to the impulse response given in Eq. 2.

Figure 15. Raleigh's model signal environment.

The unique feature of the model is in the calculation of the complex amplitude term, $\alpha_l(t)$, which is expressed as

$$\alpha_l(t) = \beta_l(t) \cdot \sqrt{\Gamma_l\, \psi(\tau_l)},$$

where $\Gamma_l$ accounts for log-normal fading, $\psi(\tau_l)$ describes the power delay profile, and $\beta_l(t)$ is the complex intensity of the radiation pattern as a function of time. The complex intensity is described by

$$\beta_l(t) = K \sum_{n=1}^{N_l} C_n(\theta_l) \exp(j\omega_d \cos(\Omega_{n,l})\, t),$$

where $N_l$ is the number of signal components contributing to the lth dominant reflecting surface, K accounts for the antenna gains and transmit signal power, $C_n(\theta_l)$ is the complex radiation of the nth component of the lth dominant reflecting surface in the direction of $\theta_l$, $\omega_d$ is the maximum Doppler shift, and $\Omega_{n,l}$ is the angle toward the nth component of the lth dominant reflector with respect to the motion of the mobile. The resulting complex intensity, $\beta_l(t)$, exhibits a complex Gaussian distribution in all directions away from the mobile [24].

Both the time and spatial correlation properties of the model are compared to theoretical results in [24]. The comparison shows that there is good agreement between the two.

Two Simulation Models (TU and BU)

Next we describe two spatial channel models that have been developed for simulation purposes. The Typical Urban (TU) model is designed to have time properties similar to the GSM-TU defined in GSM 05.05, while the Bad Urban (BU) model was developed to model environments with large reflectors that are not in the vicinity of the mobile. Although the models are designed for GSM, DCS1800, and PCS1900 formats, extensions to other formats are possible [25].

Both of these models obtain the received signal vector using

$$x(t) = \sum_{n=1}^{N} \alpha_n(t) \exp\left(-j\left(2\pi f_c \frac{l_n(t)}{c} + \beta_n\right)\right) \delta\left(t - \frac{l_n(t)}{c} + \Delta\tau\right) a(\theta_n(t)),$$

where N is the number of scatterers, $f_c$ the carrier frequency, c is the speed of light, $l_n(t)$ the path propagation distance, $\beta_n$ a random phase, and $\Delta\tau$ a random delay. In general, the path propagation distance $l_n(t)$ will vary continuously with time; hence, Doppler fading occurs naturally in the model.

Typical Urban (TU) - In the TU model, 120 scatterers are randomly placed within a 1 km radius about the mobile [25]. The position of the scatterers is held fixed over the duration in which the mobile travels a distance of 5 m. At the end of the 5 m, the scatterers are returned to their original position with respect to the mobile. At each 5-m interval, random phases are assigned to the scatterers as well as randomized shadowing effects, which are modeled as log-normal with distance with a standard deviation of 5-10 dB [25]. The received signal is determined by brute force from the location of each of the scatterers. An exponential path loss law is also applied to account for large-scale fading [21]. Simulations have shown that the TU model and the GSM-TU model have nearly identical power delay profiles, Doppler spectrums, and delay spreads [25]. Furthermore, the AOA statistics are approximately Gaussian and similar to those of the GAA model described above.

Bad Urban (BU) - The BU is identical to the TU model with the addition of a second scatterer cluster with another 120 scatterers offset 45° from the first, as shown in Fig. 16. The scatterers in the second cluster are assigned 5 dB less average power than the original cluster [25]. The presence of the second cluster results in an increased angle spread, which in turn reduces the off-diagonal elements of the array covariance matrix. The presence of the second cluster also causes an increase in the delay spread.

Uniform Sectored Distribution

The defining geometry of Uniform Sectored Distribution (USD) is shown in Fig. 17 [26]. The model assumes that scatterers are uniformly distributed within an angle distribution of $\theta_{BW}$ and a radial

(

range of L1R centered about the mobile . The magnitude and phase associated with each scatterer is selected at random from a uniform distribution of [0,1] and [0, 2lt], respectively. As the number of scatterers approaches infinity , the signal fading envelope becomes Rayleigh with uniform phase [26]. In [26], the model is used to study the effect of angle spread on spatial diversity techniques. A key re sult is that beam-steering techniques are most suitable for scatterer distributions with widths slightly larger than the beamwidths.
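The USD description above can be illustrated with a small simulation. The following is a minimal sketch, not the procedure used in [26]: it assumes a narrowband channel, plane-wave arrivals at a two-element base-station array, and a sector centered on broadside. The function names, the element spacing in wavelengths, and the default parameter values are hypothetical choices made for illustration, and the radial range is ignored because it only affects delay in a narrowband setting.

```python
import cmath
import math
import random


def usd_snapshot(n_scatterers, theta_bw_deg, spacing_wl, rng):
    """Return one narrowband channel snapshot (h0, h1) at two array elements.

    Each scatterer gets a magnitude uniform on [0, 1], a phase uniform on
    [0, 2*pi], and an arrival angle uniform within the sector theta_bw_deg
    (assumption: sector centered on array broadside).
    """
    half_bw = math.radians(theta_bw_deg) / 2.0
    h0 = 0j
    h1 = 0j
    for _ in range(n_scatterers):
        magnitude = rng.uniform(0.0, 1.0)
        phase = rng.uniform(0.0, 2.0 * math.pi)
        theta = rng.uniform(-half_bw, half_bw)
        gain = magnitude * cmath.exp(1j * phase)
        h0 += gain
        # Plane-wave phase shift at the second element (spacing in wavelengths).
        h1 += gain * cmath.exp(-2j * math.pi * spacing_wl * math.sin(theta))
    # E[magnitude^2] = 1/3, so this scaling gives unit mean power per element.
    scale = math.sqrt(n_scatterers / 3.0)
    return h0 / scale, h1 / scale


def usd_statistics(theta_bw_deg, spacing_wl=2.0, n_scatterers=30,
                   n_snapshots=2000, seed=1):
    """Estimate mean envelope, mean power, and inter-element correlation."""
    rng = random.Random(seed)
    env_sum = 0.0
    p0 = 0.0
    p1 = 0.0
    cross = 0j
    for _ in range(n_snapshots):
        h0, h1 = usd_snapshot(n_scatterers, theta_bw_deg, spacing_wl, rng)
        env_sum += abs(h0)
        p0 += abs(h0) ** 2
        p1 += abs(h1) ** 2
        cross += h0 * h1.conjugate()
    mean_envelope = env_sum / n_snapshots
    mean_power = p0 / n_snapshots
    correlation = abs(cross) / math.sqrt(p0 * p1)
    return mean_envelope, mean_power, correlation


if __name__ == "__main__":
    for bw in (2.0, 60.0):
        print(bw, usd_statistics(bw))
```

With many scatterers the mean envelope should approach the unit-power Rayleigh value of sqrt(pi)/2, and widening the sector should lower the correlation between the two elements, which is the angle-spread effect discussed in the text.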

Figure 16. Bad Urban vector channel model geometry.

Figure 17. Geometry of the uniform sectored distribution.

Modified Saleh-Valenzuela's Model

Saleh and Valenzuela developed a multipath channel model for indoor environments based on the clustering phenomenon observed in experimental data [27]. The clustering phenomenon refers to the observation that multipath components arrive at the antenna in groups. It was found that both the clusters and the rays within a cluster decayed in amplitude with time. The impulse response of this model is given by

$$h(t) = \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} \alpha_{ij}\,\delta(t - T_i - \tau_{ij}),$$

where the sum over i corresponds to the clusters and the sum over j represents the rays within a cluster. The variables $\alpha_{ij}$ are Rayleigh distributed with the mean square value described by a double-exponential decay given by

$$\overline{\alpha_{ij}^2} = \overline{\alpha_{00}^2}\,\exp(-T_i/\Gamma)\,\exp(-\tau_{ij}/\gamma),$$

where $\Gamma$ and $\gamma$ are the cluster and ray time decay constants, respectively. Motivated by the need to include AOA in the channel model, Spencer et al. proposed an extension to the Saleh-Valenzuela model [28], assuming that time and angle are statistically independent, or

$$h(t,\theta) = h(t)\,h(\theta).$$

Similar to the time impulse response in Eq. 3, the proposed angular impulse response is given by

$$h(\theta) = \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} \alpha_{ij}\,\delta(\theta - \Theta_i - \omega_{ij}),$$

where $\alpha_{ij}$ is the amplitude of the jth ray in the ith cluster. The variable $\Theta_i$ is the mean angle of the ith cluster and is assumed to be uniformly distributed over [0, 2π). The variable $\omega_{ij}$ corresponds to the ray angle within a cluster and is modeled as a Laplacian distributed random variable with zero mean and standard deviation $\sigma$:

$$f(\omega) = \frac{1}{\sqrt{2}\,\sigma}\exp\!\left(-\left|\frac{\sqrt{2}\,\omega}{\sigma}\right|\right).$$

This model was proposed based on indoor measurements, which will be discussed in the fourth section.

Extended Tap-Delay-Line Model

A wideband channel model that is an extension of the traditional statistical tap-delay-line model and includes AOA information was developed by Klein and Mohr [29]. The channel impulse response is represented by

$$h(\tau,t,\theta) = \sum_{w=1}^{W} \alpha_w(t)\,\delta(\tau-\tau_w)\,\delta(\theta-\theta_w).$$

This model is composed of W taps, each with an associated time delay $\tau_w$, complex amplitude $\alpha_w$, and AOA $\theta_w$. The joint density functions of the model parameters should be determined from measurements. As shown in [29], measurements can provide histograms of the joint distribution of $|\alpha|$, $\tau$, and $\theta$, and density functions proportional to these histograms can be chosen.

Elliptical Subregions Model (Lu, Lo, and Litva's Model)

Lu et al. [30] proposed a model of multipath propagation based on the distribution of the scatterers in elliptical subregions, as shown in Fig. 18. Each subregion (shown in a different shade) corresponds to one range of the excess delay time. This approach is similar to the GBSBEM proposed by Liberti and Rappaport [18] in that an ellipse of scatterers is considered. The primary difference between the two models is in the selection of the number of scatterers and the distribution of those scatterers. In the GBSBEM, the scatterers were uniformly distributed within the entire ellipse. In Lu, Lo, and Litva's model, the ellipse is first subdivided into a number of elliptical subregions. The number of scatterers within each subregion is then selected from a Poisson random variable, the mean of which is chosen to match the measured time delay profile data.

It was also assumed that the multipath components arrive in clusters due to the multiple reflecting points of the scatterers. Thus, assuming that there are L scatterers with $K_i$ reflecting points each, the model proposed is represented by

$$h(t,\theta) = \sum_{i=1}^{L} E_t\!\left(\theta_i^{(t)}\right) \sum_{k=0}^{K_i} \alpha_{ik}\,\exp\!\left(-j\,(2\pi f_{ik} t + \gamma_{ik})\right)\,\delta(t-\tau_{ik})\,E_r(\theta_{ik}),$$

where $\alpha_{ik}$, $\tau_{ik}$, and $\gamma_{ik}$ correspond to the amplitude, time delay, and phase of the signal component from the ikth reflecting point, respectively. $f_{ik}$ is the Doppler frequency shift of each individual path, $\theta_{ik}$ is the angle between the ikth path and the receiver-to-transmitter direction, and $\theta_i^{(t)}$ is the angle of the ith scatterer as seen from the transmitter. $E_t(\theta)$ and $E_r(\theta)$ are the radiation patterns of the transmit and receive antennas, respectively. The variable $\theta_{ik}$ was assumed to be Gaussian distributed. Simulation results using this model were presented in [30], showing that a 60° beamwidth antenna reduces the mean RMS delay spread by about 30-43 percent. These results are consistent with similar measurements made in Toronto using a sectorized antenna [31].
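The modified Saleh-Valenzuela model (uniform cluster mean angles, Laplacian ray angles within a cluster, Rayleigh amplitudes with double-exponential mean-square decay, and independent time and angle) lends itself to a short sampling sketch. The code below is illustrative only. Drawing cluster and ray arrival times from Poisson processes is an assumption borrowed from the classical indoor model and is not stated in the text above. The decay constants and mean arrival gaps are placeholder values, and the function names are hypothetical. The 25° default for the ray-angle standard deviation follows the indoor measurement figure quoted elsewhere in this article.

```python
import math
import random


def sample_laplacian(sigma, rng):
    """Zero-mean Laplacian sample with standard deviation sigma.

    The difference of two unit exponentials is Laplace(0, 1) with variance 2,
    so scaling by sigma / sqrt(2) gives the requested standard deviation.
    """
    b = sigma / math.sqrt(2.0)
    return b * (rng.expovariate(1.0) - rng.expovariate(1.0))


def sv_angular_rays(n_clusters, rays_per_cluster,
                    cluster_decay_ns=60.0, ray_decay_ns=20.0,
                    cluster_gap_ns=300.0, ray_gap_ns=5.0,
                    sigma_deg=25.0, rng=None):
    """Return a list of (delay_ns, angle_deg, amplitude) rays.

    Assumptions (illustrative): Poisson cluster/ray arrivals with the given
    mean gaps, and the mean-square amplitude of the first ray normalized to 1.
    """
    rng = rng or random.Random()
    rays = []
    cluster_time = 0.0
    for i in range(n_clusters):
        if i > 0:
            cluster_time += rng.expovariate(1.0 / cluster_gap_ns)
        cluster_angle = rng.uniform(0.0, 360.0)   # mean angle, uniform
        ray_time = 0.0
        for j in range(rays_per_cluster):
            if j > 0:
                ray_time += rng.expovariate(1.0 / ray_gap_ns)
            # Double-exponential decay of the mean-square amplitude.
            mean_square = (math.exp(-cluster_time / cluster_decay_ns)
                           * math.exp(-ray_time / ray_decay_ns))
            # Rayleigh amplitude: square root of an exponential power.
            amplitude = math.sqrt(mean_square * rng.expovariate(1.0))
            angle = (cluster_angle + sample_laplacian(sigma_deg, rng)) % 360.0
            rays.append((cluster_time + ray_time, angle, amplitude))
    return rays


if __name__ == "__main__":
    for ray in sv_angular_rays(2, 3, rng=random.Random(0)):
        print(ray)
```

Because time and angle are drawn independently, the same routine can be split into separate delay and angle generators if only one marginal is needed.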

Figure 18. Elliptical subregions spatial scatterer density.

Measurement-Based Channel Model

A channel model in which the parameters are based on measurement was proposed by Blanz et al. [32]. The idea behind this approach is to characterize the propagation environment, in terms of scattering points, based on measurement data. The time-variant impulse response takes the form

$$h(\tau,t) = \int_0^{2\pi} \vartheta(\tau,t,\theta) * g(\tau,\theta) * f(\tau)\,d\theta,$$

where $f(\tau)$ is the impulse response representing the joint transfer characteristic of the transmission system components (modulator, demodulator, filters, etc.), and $g(\tau,\theta)$ is the characteristic of the base station antenna. The term $\vartheta(\tau,t,\theta)$ is the time-variant directional distribution of the channel impulse response seen from the base station. This distribution is time-variant due to mobile motion and depends on the location, orientation, and velocity of the mobile station antenna and on the topographical and morphographical properties of the propagation area as well. Measurement is used to determine the distribution $\vartheta(\tau,t,\theta)$.

Ray Tracing Models

The models presented so far are based on statistical analysis and measurements, and provide us with the average path loss and delay spread, adjusting some parameters according to the environment (indoor, outdoor, obstructed, etc.). In the past few years, a deterministic model, called ray tracing, has been proposed based on the geometric theory and reflection, diffraction, and scattering models. By using site-specific information, such as building databases or architecture drawings, this technique deterministically models the propagation channel [33-36], including the path loss exponent and the delay spread. However, the high computational burden and lack of detailed terrain and building databases make ray tracing models difficult to use. Although some progress has been made in overcoming the computational burden, the development of an effective and efficient procedure for generating terrain and building data for ray tracing is still necessary.

Channel Model Summary

Table 3 summarizes each of the spatial models presented above.

Spatial Signal Measurements

There have been only a few publications relating to spatial channel measurements. In this section, references are given to these papers, and the key results are described.

In [38], TOA and AOA measurements are presented for outdoor macrocellular environments. The measurements were made using a rotating 9° azimuth beam directional receiver antenna with a 10 MHz bandwidth centered at 1840 MHz. Three environments near Munich were considered, including rural, suburban, and urban areas with base station antenna heights of 12.3 m, 25.8 m, and 37.5 m, respectively. The key observations made include [38]:
• Most of the signal energy is concentrated in a small interval of delay and within a small AOA in rural, suburban, and even many urban environments.
• By using directional antennas, it is possible to reduce the time dispersion.

Another set of TOA and AOA measurements is reported in [39] for urban areas. The measurements were made using a two-element receiver that was mounted on the test vehicle with an elevation of 2.6 m. The transmitting antenna was placed 30 m high on the side of a building. A bandwidth of 10 MHz with a carrier frequency of 2.33 GHz was used. The delay-Doppler spectra observed at the mobile were used to obtain the delay-AOA spectra. The second antenna element is used to remove the ambiguity in AOA that would occur if only the Doppler spectra were known. The results indicate that it is possible to account for most of the major features of the delay-AOA spectra by considering the large buildings in the environment.

Motivated by diversity combining methods, earlier measurements were concerned primarily with determining the correlation between the signals at two antenna elements as a function of the element separation distance. These studies found that, at the mobile, relatively small separation distances were required to obtain a small degree of correlation between the elements, whereas at the base station very large spacing was needed. These findings indicate that there is a relatively small angle spread observed at the base station [6].

Previously, an extension to Saleh-Valenzuela's indoor model, including AOA information, was presented. This extension was proposed based on indoor measurements of delay spread and AOA at 7 GHz made at Brigham Young University [40]. The AOAs were measured using a 60 cm parabolic dish antenna that had a 3 dB beamwidth of 6°. The results showed a clustering pattern in both the time and angle domains, which led to the proposed channel model described in [28]. Also, it was observed that the cluster mean angle of arrival was uniformly distributed over [0, 2π]. The distribution of the angle of arrival of the rays within a cluster presented a sharp peak at the mean, leading to the Laplacian distribution modeling. The standard deviation found for this distribution was around 25°. Based on these measurements, a channel model including delay spread and AOA information was proposed, supposing that time and angle were independent variables.

In [41], two-dimensional AOA and delay spread measurement and estimation were presented. The measurements were made in downtown Paris using a channel sounder at 900 MHz and a horizontal rectangular planar array at the receiver. The estimation of AOA, including azimuth and elevation angle, was performed using 2-D unitary ESPRIT [42] with a time resolution of 0.1 µs and an angle resolution of 5°. The results presented confirmed assumptions made in urban propagation, such as the wave-guiding mechanism of streets and the exponential decay of the power delay profile. Also, it was observed that 90 percent of the received power was contained in the paths with elevation between 0° and 40°, with the low elevated paths contributing a larger amount.

Finally, in [43] measurements are used to show the variation in the spatial signature with both time and frequency. Two measures of change are given, the relative angle change given by

| Model | Description | References |
|---|---|---|
| Lee's Model | Effective scatterers are evenly spaced on a circular ring about the mobile. Predicts correlation coefficient using a discrete AOA model. Extension accounts for Doppler shift. | [3, 9, 11] |
| Discrete Uniform Distribution | N scatterers are evenly spaced over an AOA range. Predicts correlation coefficient using a discrete AOA model. Correlation predicted by this model falls off more quickly than the correlation in Lee's model. | [9] |
| Geometrically Based Circular Model (Macrocell Model) | Assumes that the scatterers lie within a circular ring about the mobile. AOA, TOA, joint TOA and AOA, Doppler shift, and signal amplitude information is provided. Intended for macrocell environments where antenna heights are relatively large. | [12-14, 16, 37] |
| Geometrically Based Elliptical Model (Microcell Wideband Model) | Scatterers are uniformly distributed in an ellipse where the base station and the mobile are the foci of the ellipse. AOA, TOA, joint TOA and AOA, Doppler shift, and signal amplitude information is provided. Intended for microcell environments where antenna heights are relatively low. | [17, 18] |
| Gaussian Wide Sense Stationary Uncorrelated Scattering (GWSSUS) | N scatterers are grouped into clusters in space such that the delay differences within each cluster are not resolvable within the transmission signal BW. Provides an analytical model for the array covariance matrix. | [19-22] |
| Gaussian Angle of Arrival (GAA) | Special case of the GWSSUS model with a single cluster and angle of arrival statistics assumed to be Gaussian distributed about some nominal angle. Narrowband channel model. Provides an analytical model for the array covariance matrix. | [23] |
| Time-Varying Vector Channel Model (Raleigh's Model) | Assumes that the signal energy leaving the region of the mobile is Rayleigh faded. Angle spread is accounted for by dominant reflectors. Provides both Rayleigh fading and theoretical spatial correlation properties. | [20] |
| Typical Urban | Simulation model for GSM, DCS 1800, and PCS 1900. Time domain properties are similar to the GSM-TU defined in GSM 05.05. 120 scatterers are randomly placed within a 1 km radius about the mobile. Received signal is determined by brute force from the location of each of the scatterers and the time-varying location of the mobile. AOA statistics are approximately Gaussian. | [21, 22, 25] |
| Bad Urban | Simulation model for GSM, DCS 1800, and PCS 1900. Accounts for large reflectors not in the vicinity of the mobile. Identical to the TU model with the addition of a second scatterer cluster offset 45° from the first. | [21, 22, 25] |
| Uniform Sectored Distribution | Assumes that scatterers are uniformly distributed within an angle distribution of θ_BW and a radial range of ΔR centered about the mobile. Magnitude and phase associated with each scatterer are selected at random from a uniform distribution of [0, 1] and [0, 2π], respectively. | [26] |
| Modified Saleh-Valenzuela's Model | An extension to the Saleh-Valenzuela model, including AOA information in the channel model. Assumes that time and the angle are statistically independent. Based on indoor measurements. | [28] |
| Extended Tap-Delay-Line Model | Wideband channel model. Extension of the traditional statistical tap-delay-line model which includes AOA information. The joint density functions of the model parameters should be determined from measurements. | [29] |
| Spatio-Temporal Model (Lu, Lo, and Litva's Model) | Model of multipath propagation based on the distribution of the scatterers in elliptical subregions, corresponding to a range of excess delay time. Similar to the GBSBEM. | [30] |
| Measurement-Based Channel Model | Parameters are based on measurement. Characterizes the propagation environment in terms of scattering points. | [32] |
| Ray Tracing Models | Deterministic model based on the geometric theory and reflection, diffraction, and scattering models. Uses site-specific information, such as building databases or architecture drawings. | [33-36] |

Table 3. Summary of spatial channel models.

Relative Angle Change (%)

and the relative amplitude change, found using

$$\text{Relative Amplitude Change (dB)} = 20\log_{10}\frac{\lVert \mathbf{a}_i\rVert}{\lVert \mathbf{a}_j\rVert},$$

where $\mathbf{a}_i$ and $\mathbf{a}_j$ are the two spatial signatures (array response vectors) being compared. The measurements indicate that when the mobile and surroundings are stationary, there are relatively small changes in the spatial signature. Likewise, there are moderate changes when objects and the environment are in motion and large changes when the mobile itself is moving. Also, it was found that the spatial signature changes significantly with a change in carrier frequency. In particular, the measurements found that the relative amplitude change in the spatial signal could exceed 10 dB with a frequency change of only 10 MHz. This result indicates that the uplink spatial signature cannot be directly applied for downlink beamforming in most of today's cellular and PCS systems, which have 45 MHz and 80 MHz separation between the uplink and downlink frequencies, respectively.

Application of Spatial Channel Models

The effect that classical channel properties such as delay spread and Doppler spread have on system performance has been an active area of research for several years and hence is fairly well understood. The spatial channel models include the AOA properties of the channel, which are often characterized by the angle spread. The angle spread has a major impact on the correlation observed between the pairs of elements in the array. These correlation values specify the received signal vector covariance matrix, which is known to determine the performance of linear combining arrays [44]. In general, the higher the angle spread, the lower the correlation observed between any pair of elements in the array. The various spatial channel models provide different angle spreads and hence will predict different levels of system performance.

The channel models presented here have various applications in the analysis and design of systems that utilize adaptive antenna arrays. Some of the models were developed to provide analytical models of the spatial correlation function, while others are intended primarily for simulation purposes.

Simulation

More and more companies are relying on detailed simulations to help design and develop today's wireless networks. The application of adaptive antennas is no exception. However, to obtain reliable results, accurate spatial channel models are needed. With accurate simulations of adaptive antenna array systems, researchers will be able to predict the capacity improvement, range extension, and other performance measures of the system, which in turn will determine the cost effectiveness of adaptive array technologies.

Algorithm Development

The availability of channel models also opens up the possibility of developing new maximum likelihood smart antennas and AOA estimation algorithms based on these channel models. Good analytical models that will provide insights into the structure of the spatial channel are needed.

Conclusions

As antenna technology advances, radio system engineers are increasingly able to utilize the spatial domain to enhance system performance by rejecting interfering signals and boosting desired signal levels. However, to make effective use of the spatial domain, design engineers need to understand and appropriately model spatial domain characteristics, particularly the distribution of scatterers, angles of arrival, and the Doppler spectrum. These characteristics tend to be dependent on the height of the transmitting and receiving antennas relative to the local environment. For example, the distributions expected in a microcellular environment with relatively low base station antenna heights are usually quite different from those found in traditional macrocellular systems with elevated base station antennas.

This article has provided a review of a number of spatial propagation models. These models can be divided into three groups:
• General statistically based models
• More site-specific models based on measurement data
• Entirely site-specific models

The first group of models (Lee's Model, Discrete Uniform Distribution Model, Geometrically Based Single Bounce Statistical Model, Gaussian Wide Sense Stationary Uncorrelated Scattering Model, Gaussian Angle of Arrival Model, Uniform Sectored Distributed Model, Modified Saleh-Valenzuela's Model, Spatio-Temporal Model) are useful for general system performance analysis. The models in the second group (Extended Tap Delay Line Model and Measurement-Based Channel Model) can be expected to yield greater accuracy but require measurement data as an input. An example from the third group of models is Ray Tracing, which has the potential to be extremely accurate but requires a comprehensive description of the physical propagation environment as well as measurements to validate the models.

Further research is required to validate and enhance the models described in this article. Bearing in mind that an objective of modeling is to substantially reduce the amount of physical measurement required in the system planning process, it is important for design engineers to have reliable models of AOA, TDOA, delay spread, and the power of the multipath components. Further measurement programs that focus on spatial domain signal characteristics are required. These programs would greatly benefit from the development of improved measurement equipment. Armed with improved spatial channel modeling tools and a greater understanding of signal propagation, engineers can begin to meet the challenges inherent in designing future high-capacity/high-quality wireless communication systems, including the effective use of smart antennas.

References

[1] R. H. Clarke, "A Statistical Theory of Mobile-Radio Reception," Bell Sys. Tech. J., vol. 47, 1968, pp. 957-1000.
[2] M. J. Gans, "A Power Spectral Theory of Propagation in the Mobile Radio Environment," IEEE Trans. Vehic. Tech., vol. VT-21, Feb. 1972, pp. 27-38.
[3] W. C. Y. Lee, Mobile Communications Engineering, New York: McGraw Hill, 1982.
[4] Turin, "A Statistical Model for Urban Multipath Propagation," IEEE Trans. Vehic. Tech., vol. VT-21, no. 1, Feb. 1972, pp. 1-11.
[5] T. S. Rappaport, Wireless Communications: Principles & Practice, Upper Saddle River, NJ: Prentice Hall PTR, 1996.
[6] F. Adachi et al., "Crosscorrelation Between the Envelopes of 900 MHz Signals Received at a Mobile Radio Base Station Site," IEE Proc., vol. 133, Pt. F, no. 6, Oct. 1986, pp. 506-12.
[7] J. D. Parsons, The Mobile Radio Propagation Channel, New York: John Wiley & Sons, 1992.
[8] W. C. Y. Lee, Mobile Cellular Telecommunications Systems, New York: McGraw Hill, 1989.


[9] D. Asztély, "On Antenna Arrays in Mobile Communication Systems: Fast Fading and GSM Base Station Receiver Algorithms," Ph.D. dissertation, Royal Inst. Technology, Mar. 1996.
[10] S. P. Stapleton, X. Carbo, and T. McKeen, "Spatial Channel Simulator for Phased Arrays," Proc. IEEE VTC, 1994, pp. 1789-92.
[11] S. P. Stapleton, X. Carbo, and T. McKeen, "Tracking and Diversity for a Mobile Communications Base Station Array Antenna," Proc. IEEE VTC, 1996, pp. 1695-99.
[12] R. B. Ertel, "Vector Channel Model Evaluation," Tech. rep., SW Bell Tech. Resources, Aug. 1997.
[13] J. William C. Jakes, ed., Microwave Mobile Communications, New York: John Wiley & Sons, 1974.
[14] P. Petrus, "Novel Adaptive Array Algorithms and Their Impact on Cellular System Capacity," Ph.D. dissertation, Virginia Polytechnic Inst. and State Univ., Mar. 1997.
[15] P. Petrus, J. H. Reed, and T. S. Rappaport, "Effects of Directional Antennas at the Base Station on the Doppler Spectrum," IEEE Commun. Lett., vol. 1, no. 2, Mar. 1997.
[16] R. B. Ertel, "Statistical Analysis of the Geometrically Based Single Bounce Channel Models," unpublished notes, May 1997.
[17] J. C. Liberti, "Analysis of CDMA Cellular Radio Systems Employing Adaptive Antennas," Ph.D. dissertation, Virginia Polytechnic Inst. and State Univ., Sept. 1995.
[18] J. C. Liberti and T. S. Rappaport, "A Geometrically Based Model for Line of Sight Multipath Radio Channels," IEEE VTC, Apr. 1996, pp. 844-48.
[19] P. Zetterberg and B. Ottersten, "The Spectrum Efficiency of a Basestation Antenna Array System for Spatially Selective Transmission," IEEE VTC, 1994.
[20] P. Zetterberg, "Mobile Communication with Base Station Antenna Arrays: Propagation Modeling and System Capacity," Tech. rep., Royal Inst. Technology, Jan. 1995.
[21] P. Zetterberg and P. L. Espensen, "A Downlink Beam Steering Technique for GSM/DCS 1800/PCS 1900," IEEE PIMRC, Taipei, Taiwan, Oct. 1996.
[22] P. Zetterberg, P. L. Espensen, and P. Mogensen, "Propagation, Beamsteering and Uplink Combining Algorithms for Cellular Systems," ACTS Mobile Commun. Summit, Granada, Spain, Nov. 1996.
[23] B. Ottersten, "Spatial Division Multiple Access (SDMA) in Wireless Communications," Proc. Nordic Radio Symp., 1995.
[24] G. G. Raleigh and A. Paulraj, "Time Varying Vector Channel Estimation for Adaptive Spatial Equalization," Proc. IEEE Globecom, 1995, pp. 218-24.
[25] P. Mogensen et al., "Algorithms and Antenna Array Recommendations," Tech. rep. A020/AUC/A12/DR/P/1/xx-D2.1.2, Tsunami (II), Sept. 1996.
[26] O. Norklit and J. B. Anderson, "Mobile Radio Environments and Adaptive Arrays," Proc. IEEE PIMRC, 1994, pp. 725-28.
[27] A. M. Saleh and R. A. Valenzuela, "A Statistical Model for Indoor Multipath Propagation," IEEE JSAC, vol. SAC-5, Feb. 1987.
[28] Q. Spencer et al., "A Statistical Model for Angle of Arrival in Indoor Multipath Propagation," IEEE VTC, 1997, pp. 1415-19.
[29] A. Klein and W. Mohr, "A Statistical Wideband Mobile Radio Channel Model Including the Direction of Arrival," Proc. IEEE 4th Int'l. Symp. Spread Spectrum Techniques & Applications, 1996, pp. 102-06.
[30] M. Lu, T. Lo, and J. Litva, "A Physical Spatio-Temporal Model of Multipath Propagation Channels," Proc. IEEE VTC, 1997, pp. 180-814.
[31] E. S. Sousa et al., "Delay Spread Measurements for the Digital Cellular Channel in Toronto," IEEE Trans. Vehic. Tech., vol. 43, no. 4, Nov. 1994, pp. 837-47.
[32] J. J. Blanz, A. Klein, and W. Mohr, "Measurement-Based Parameter Adaptation of Wideband Spatial Mobile Radio Channel Models," Proc. IEEE 4th Int'l. Symp. Spread Spectrum Techniques & Applications, 1996, pp. 91-97.
[33] K. R. Schaubach, N. J. Davis IV, and T. S. Rappaport, "A Ray Tracing Method for Predicting Path Loss and Delay Spread in Microcellular Environment," IEEE VTC, May 1992, pp. 932-35.
[34] J. Rossi and A. Levi, "A Ray Model for Decimetric Radiowave Propagation in an Urban Area," Radio Science, vol. 27, no. 6, 1993, pp. 971-79.
[35] R. A. Valenzuela, "A Ray Tracing Approach for Predicting Indoor Wireless Transmission," IEEE VTC, 1993, pp. 214-18.
[36] S. Y. Seidel and T. S. Rappaport, "Site-Specific Propagation Prediction for Wireless In-Building Personal Communication System Design," IEEE Trans. Vehic. Tech., vol. 43, no. 4, Nov. 1994.
[37] P. Petrus, J. H. Reed, and T. S. Rappaport, "Geometrically Based Statistical Channel Model for Macrocellular Mobile Environments," IEEE Proc. GLOBECOM, 1996, pp. 1197-1201.
[38] A. Klein et al., "Direction-of-Arrival of Partial Waves in Wideband Mobile Radio Channels for Intelligent Antenna Concepts," IEEE VTC, 1996, pp. 849-53.
[39] H. J. Thomas, T. Ohgane, and M. Mizuno, "A Novel Dual Antenna Measurement of the Angular Distribution of Received Waves in the Mobile Radio Environment as a Function of Position and Delay Time," Proc. IEEE VTC, 1992, pp. 546-49.
[40] Q. Spencer et al., "Indoor Wideband Time/Angle of Arrival Multipath Propagation Results," IEEE VTC, 1997, pp. 1410-14.
[41] J. Fuhl, J-P Rossi, and E. Bonek, "High Resolution 3-D Direction-of-Arrival Determination for Urban Mobile Radio," IEEE Trans. Antennas and Propagation, vol. 45, no. 4, Apr. 1997, pp. 672-81.
[42] M. D. Zoltowski, M. Haardt, and C. P. Mathews, "Closed-Form 2-D Angle Estimation With Rectangular Arrays in Element Space or Beamspace via Unitary ESPRIT," IEEE Trans. Signal Processing, vol. 44, Feb. 1996, pp. 316-28.
[43] S. S. Jeng et al., "Measurements of Spatial Signature of an Antenna Array," Proc. IEEE 6th PIMRC, vol. 2, Sept. 1995, pp. 669-72.
[44] R. A. Monzingo and T. W. Miller, Introduction to Adaptive Arrays, New York: John Wiley & Sons, 1980.

Additional Reading

[1] M. J. Devasirvatham, "A Comparison of Time Delay Spread and Signal Level Measurements With Two Dissimilar Office Buildings," IEEE Trans. Antennas and Propagation, vol. AP-35, no. 3, Mar. 1994, pp. 319-24.
[2] M. J. Feuerstein et al., "Path Loss, Delay Spread, and Outage Models as Functions of Antenna for Microcellular Systems Design," IEEE Trans. Vehic. Tech., vol. 43, no. 3, Aug. 1994, pp. 487-98.
[3] J. C. Liberti and T. S. Rappaport, "Analytical Results for Capacity Improvements in CDMA," IEEE Trans. Vehic. Tech., VT-43, no. 3, Aug. 1994.
[4] D. Molkdar, "Review on Radio Propagation into and Within Buildings," IEE Proc., vol. 138, no. 1, Feb. 1991.
[5] A. F. Naguib, A. Paulraj, and T. Kailath, "Capacity Improvement With Base-Station Antenna Array," IEEE Trans. Vehic. Tech., VT-43, no. 3, Aug. 1994, pp. 691-98.
[6] A. F. Naguib and A. Paulraj, "Performance Enhancement and Trade-offs of a Smart Antenna in CDMA Cellular Network," IEEE VTC, 1995, pp. 225-29.
[7] S. Y. Seidel et al., "Path Loss, Scattering and Multipath Propagation Statistics for European Cities for Digital Cellular and Microcellular Radiotelephone," IEEE Trans. Vehic. Tech., vol. VT-40, no. 4, Nov. 1991, pp. 721-30.
[8] S. C. Swales, M. A. Beach, and J. P. MacGeehan, "The Performance Enhancement of Multi-beam Adaptive Base-station Antenna for Cellular Land Mobile Radio System," IEEE Trans. Vehic. Tech., vol. VT-39, Feb. 1990, pp. 56-67.


Antenna Systems for Base Station Diversity in Urban Small and Micro Cells

Patrick C. F. Eggers, Jørn Toftgård, and Alex M. Oprea

Abstract-This paper describes cross-correlation properties for compact urban base station antenna configurations, nearly all resulting in very low envelope cross-correlation coefficients of about 0.1 to 0.3. A focus is set on polarization diversity systems for their potential in improving link quality when hand-held terminals are involved. An expression is given for the correlation function of compound space and polarization diversity systems. Dispersion and envelope dynamic statistics are presented for the measured environments. For microcell applications, it is found that systems such as GSM having a bandwidth of 200 kHz or less can use narrowband cross-correlation analysis directly.

Manuscript received March 1992; revised December 1992. This work was supported by the Danish Technical Research Board. This work was presented in part at the Fifth Digital Mobile Radio Conference, Helsinki, Finland, December 1-3, 1992. The authors are with the Mobile Communications Group, Aalborg University, DK-9220 Aalborg, Denmark. IEEE Log Number 9210927.

I. INTRODUCTION

As the bandwidths of newer digital land/mobile and personal communications systems increase, radio channel time dispersion can produce noticeable frequency-selective fading within the band. The capacity demand on such systems is also increasing, leading to network layouts with smaller cell sizes. A way to assist with cell layout, and to combat interference and fading degradation, is to use base station diversity. A normal figure of merit for narrowband antenna diversity systems is the envelope cross-correlation coefficient $\rho_{12,\mathrm{env}}$. For phase-modulated systems, the phase decorrelation is also relevant, especially for predetection combining, and here the complex cross-correlation coefficient $\rho_{12}$ can be used.

The work presented in this paper falls in two areas. The paper describes investigations of the overall diversity potential for antenna systems used in urban small and microcells (predominantly range lengths up to 3000 m and 300 m, respectively), and the degree of overall radio channel dispersion by considering the temporal and spatial domain separately.

Two base station sites have been investigated. One microcell site has antenna heights of a few meters above rooftops and range lengths from 50 to 800 m. On the other site, the antennas were elevated approximately 30 m above rooftops and ranges from 200 to 3000 m were used. Antenna configurations with vertical and horizontal separations have been used, as well as compound space and polarization diversity configurations.

Hand-held terminals in existing mobile systems and upcoming personal systems will most likely have low polarization and will be subject to varying handset orientation. Under such conditions, polarization diversity is an obvious way of improving link quality.

Digital transmission using antenna diversity in frequency-selective fading channels has been investigated in [1]. The investigations were based on simulations, assuming fully decorrelated branch signals with uncorrelated scattering [(US), i.e., full decorrelation along the temporal axis] and the same average power delay profile (PDP) at each branch. This paper presents an extensive measurement campaign performed in an urban environment. The radio channel has been sampled simultaneously in the spatial and temporal domains on two antenna branches. Achievable figures of the degree of antenna signal correlation, dispersion, and fading for practical systems are presented, giving a base of comparison to some of the assumptions used in [1].

II. MOBILE RADIO CHANNEL CHARACTERIZATION

The mobile radio channel is often modeled by a superposition of the following three major effects:
a) Large variations, i.e., range dependency (median pathloss)
b) Local variations, i.e., shadowing (generally considered log-normal distributed)
c) Short-term variations, i.e., multipath fading (shown to follow Rayleigh or Rician distributions)

In general, the mobile radio channel is both time variant and nonstationary. The measurements reported here have mostly been performed in low-density and low-velocity traffic situations. The roof of the measuring van (2.2 m) was higher than most of the other moving vehicles (scatterers). Thus, a time-invariant channel situation is assumed during the measurement, although channel condition details may have changed for a later measurement at the same location (note that it would be very difficult to find exactly the same location in practice). This assumption allows a comparison of the statistics. The measurement runs used have been short, so that global variation statistics (pathloss) are assumed "frozen." Thus, we consider the multipath fading to be superimposed with local variations (shadowing) that affect only the mean values of the envelope signal. A removal of the estimated shadowing component is possible. This will stabilize the mean to a quasiconstant value. Thus, a wide sense stationary (WSS) short-term fading channel is extracted for analysis (described in more detail later). Since the WSS mobile radio channel may be visualized as a linear filter at a given time, it can be described either by its impulse response or by its transfer function. Alternatively, the spatially variant behavior of the radio channel can be investigated. This has the advantage of providing a stable channel, invariant to change in mobile terminal velocity.
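The separation of shadowing from multipath fading described above can be sketched numerically. The following is a minimal illustration and not the authors' processing chain: the synthetic signal, the single rectangular window, and the function names `moving_average` and `remove_local_mean` are assumptions introduced here. The run length (51 m), the 6 cm sample spacing, and the 10 m window are taken from figures quoted elsewhere in the paper.

```python
import math
import random


def moving_average(values, window):
    """Centered rectangular moving average; the window shrinks at the edges."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def remove_local_mean(power, window):
    """Divide the received power by its local mean (the shadowing estimate)."""
    local_mean = moving_average(power, window)
    return [p / m for p, m in zip(power, local_mean)]


# Synthetic example (assumed values): slow shadowing times Rayleigh fading.
rng = random.Random(1)
n = 850                                   # 51 m sampled every 6 cm
shadow = [10 ** (0.3 * math.sin(2 * math.pi * i / n)) for i in range(n)]
fading = [rng.expovariate(1.0) for _ in range(n)]  # Rayleigh envelope -> exponential power
power = [s * f for s, f in zip(shadow, fading)]

normalized = remove_local_mean(power, window=167)  # about 10 m at 6 cm spacing
mean_db = 10 * math.log10(sum(normalized) / n)     # close to 0 dB after removal
```

After the local mean is divided out, the remaining signal has a mean near 0 dB along the whole run, which is the quasiconstant mean that the WSS short-term fading analysis relies on.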

Reprinted from IEEE Journal on Selected Areas in Communications, Vol. 11, No.7, pp. 1046-1057, September 1993.


The spatially variant impulse response $h(\tau, x)$ provides a direct illustration of the multipath phenomenon, where $\tau$ is the temporal lag (delay) and $x$ is the spatial position along the travelled path. The frequency Doppler transfer function $H(f, f_d)$ gives an appropriate picture of the spatially varying behavior via Doppler shifts. These two functions, which adequately describe a deterministic spatially variant channel, form a two-dimensional Fourier pair. The following subsections describe the statistical parameters used for evaluation of the radio channel, which is considered WSS in the spatial domain and fully stationary in the temporal domain.
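The two-dimensional Fourier relation between the impulse response and the frequency Doppler transfer function can be illustrated with a small discrete example. This is a sketch under stated assumptions: the channel is a single synthetic path, the transform sign conventions are chosen for convenience, and the names `dft` and `channel_to_frequency_doppler` are hypothetical helpers, not notation from the paper.

```python
import cmath


def dft(seq):
    """Plain O(N^2) discrete Fourier transform of a complex sequence."""
    n = len(seq)
    return [sum(seq[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]


def channel_to_frequency_doppler(h):
    """Transform h[delay][position] into H[frequency][doppler] by a 2-D DFT."""
    rows = [dft(row) for row in h]                        # position -> Doppler
    cols = [dft([rows[t][d] for t in range(len(rows))])   # delay -> frequency
            for d in range(len(rows[0]))]
    return [[cols[d][f] for d in range(len(cols))] for f in range(len(h))]


n_delay, n_pos = 8, 16
delay_bin, doppler_bin = 2, 3      # assumed single path: one delay, one Doppler shift
h = [[cmath.exp(2j * cmath.pi * doppler_bin * x / n_pos) if t == delay_bin else 0j
      for x in range(n_pos)]
     for t in range(n_delay)]

H = channel_to_frequency_doppler(h)
# Doppler bin holding the largest magnitude at frequency index 0.
peak = max(range(n_pos), key=lambda d: abs(H[0][d]))
```

A path that appears as a single spike along the delay axis with a steady phase rotation along the track shows up as a single Doppler line of constant magnitude over all frequencies, which is the picture "via Doppler shifts" referred to in the text.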

A. Spatial Domain

Often, a CW measurement has been used to determine the correlation between fading signals from two diversity antennas. This corresponds to the DC response along the temporal axis, i.e., the complex envelope of the radio channel is found by integration as

$$g(x) = \int_{-\infty}^{\infty} h(\tau, x)\, d\tau \qquad (1)$$

corresponding to the DC component of the spatially variant frequency transfer function $H(0, x)$. For wide bandwidth systems, a Received Signal Strength Indicator (RSSI) signal may also be used for driving a diversity combiner. The RSSI signal represents the total power over the band and will, in frequency-selective channels, exhibit reduced fading dynamics. The RSSI signal is real and is found by integration of the spatially variant frequency transfer function weighted with the receiver filter, i.e.,

$$g(x) = \int_{-\infty}^{\infty} |H(f, x)|^2\, |H_{\mathrm{filter}}(f)|^2\, df. \qquad (2)$$

The normalized spatial complex correlation coefficient function is

$$\rho_{g_1 g_2}(\Delta x) = \frac{R_{g_1 g_2}(\Delta x) - \mu_{g_1}^{*}\,\mu_{g_2}}{\sqrt{\left(E[|g_1(x)|^2] - |\mu_{g_1}|^2\right)\left(E[|g_2(x)|^2] - |\mu_{g_2}|^2\right)}} \qquad (3)$$

where, for lag $\Delta x$,

$$R_{g_1 g_2}(\Delta x) = E[g_1^{*}(x) \cdot g_2(x + \Delta x)], \qquad \mu_g = E[g(x)]. \qquad (4)$$

$R_{g_1 g_2}$ and $\mu_g$ are the complex cross-correlation function and complex mean of the CW or RSSI signals following (1) or (2). $E[\cdot]$ is the expectation value (mean value) operator. For a continuous random variable (RV) $V$ with probability density function (pdf) $f(v)$, $E[V] = \int_{-\infty}^{\infty} v f(v)\, dv$. Similarly, a discrete RV of sampled data $V = [v_1, \ldots, v_N]$ is treated with ensemble averaging $E[V] = \langle V \rangle = (1/N) \sum_{n=1}^{N} v_n$. The parameter of most interest when considering antenna systems is the correlation function with zero lag, i.e., $\Delta x = 0$. The value of (3) at zero lag is normally just referred to as the correlation coefficient, i.e., $\rho_{g_1 g_2}(0) = \rho_{12}$. The widely used envelope correlation coefficient $\rho_{12\,\mathrm{env}}$ is a specialization of (3) when the real envelope functions $|g(\tau, x)|$ are used instead of the complex functions $g(\tau, x)$. Thus, $\rho_{12\,\mathrm{env}}$ becomes a real function. For analysis purposes, the power correlation coefficient $\rho_{12\,\mathrm{pow}}$ has also been used (see Appendix A).

B. Temporal Domain

For the temporal domain, the power delay profile (PDP) is used as the basis for overall dispersion analysis

$$\mathrm{PDP}(\tau) = \frac{1}{\Delta x} \int_{\Delta x} |h(\tau, x)|^2\, dx. \qquad (5)$$

The PDP is an average power density impulse response. The frequency correlation function (FCF) follows by inverse Fourier transform via the Wiener-Khintchine theorem as

$$\mathrm{FCF}(\Delta f) = F^{-1}[\mathrm{PDP}(\tau)]. \qquad (6)$$

The coherence bandwidth is found at a given correlation level from the normalized FCF and describes the overall dispersion strength represented in the frequency domain. The RMS delay spread ($S$) is widely used to describe the overall degree of dispersion in the time domain [6], and is given as the square root of the second central moment of the PDP

$$S = \sqrt{\frac{1}{P_t} \int_{-\infty}^{\infty} (\tau - \bar{\tau})^2\, \mathrm{PDP}(\tau)\, d\tau} \qquad (7)$$

where $P_t$ is the total power in the PDP and $\bar{\tau}$ is the mean delay. For high temporal resolution systems, the absolute delay function must be taken into account in (5) through (7). The temporal resolution in our experiments is approximately 1 μs, corresponding to a 300 m spatial resolution. As each measurement run was limited to 51 m, we consider the absolute delay from base to mobile to be constant over the integration of the PDP in (5). We can then use (7) without modifications.

III. THE MEASUREMENT SETUP

Usually, the choice of channel sounding technique depends upon the application foreseen for the collected data. To permit a study of the instantaneous behavior of the mobile radio channel later on, a direct pulse technique was chosen. The channel sounder, having a time resolution capability of approximately 1 μs and an instantaneous dynamic range of about 26 dB, was operated at 970 MHz. To achieve adequate resolution of Doppler shifts, a sampling interval of 6 cm was used. The following subsections describe the hardware implementation of the measurement system.
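As a quick plausibility check of the sampling figures quoted above, the spatial sampling interval can be compared with the carrier wavelength. The sketch below is an illustration only; it assumes a mobile moving through a static scattering field, so that the Doppler spectrum along the track is limited to plus or minus one cycle per wavelength. The variable names are introduced here for the example.

```python
C = 299_792_458.0            # speed of light [m/s]

carrier_hz = 970e6           # sounder carrier frequency quoted in the text
sample_spacing_m = 0.06      # spatial sampling interval quoted in the text

wavelength_m = C / carrier_hz
# Assumption: static scatterers, so spatial Doppler lies within +/- 1/wavelength
# cycles per metre and the Nyquist spacing is half a wavelength.
max_spatial_freq = 1.0 / wavelength_m
nyquist_spacing_m = 1.0 / (2.0 * max_spatial_freq)
oversampling = nyquist_spacing_m / sample_spacing_m
samples_per_wavelength = wavelength_m / sample_spacing_m
```

Under this assumption, the 6 cm interval corresponds to roughly five samples per wavelength, i.e., somewhat more than twice the minimum spatial sampling density.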

A. Transmitter

All frequencies used in the transmitter are derived from a stable 10 MHz frequency standard. The maximum transmitted power was 5 W peak, the signal being radiated from a quarter-wavelength monopole mounted on the metal roof of the measuring vehicle. The cross-polarization discrimination (XPD) of the transmitter was difficult to measure, but field tests indicated a figure of up to 20 dB. This is quite high compared to XPD figures for normal vehicle-mounted whip antennas, but in our case the measuring van (Ford Transit-L) has a large flat roof, providing a better ground plane than most personal cars.

B. Receiver

A dual-branch receiver is used to investigate the complex cross-correlation between the two antenna signals. Each branch is a conventional quadrature detection receiver and, as in the transmitter, all frequencies used are synthesized from a stable 10 MHz frequency standard. The dominating output filters limit the bandwidth (-3 dB single sided) to 0.6 MHz. A linear input signal dynamic range of -45 to -105 dBm is available. The I and Q signals from the two branches of the receiver are simultaneously sampled at a 3.3 MHz rate and stored on a hard disk. Two identical planar array directional antennas with an azimuth and elevation 3 dB beam width of 60° were used. Thus, the same illumination of the environment is achieved for both the copolarized and cross-polarized cases. The antennas have a gain of 8 dBi and an XPD better than 30 dB, ensuring sufficient suppression of the unwanted polarization to provide usable figures for the XPD of the radio environment.

C. Measurement Data

The length of each test run was 51 meters. The data, consisting of three consecutive complex impulse responses for both receiver branches at every spatial sampling point, were stored on a PC. The data transfer rate of the PC limited the vehicle speed to 3 ta]«.

IV. MEASUREMENT SCENARIOS

Two series of field trials were conducted in Aalborg, a medium-sized European city. The principal variable during these trials was the receiver antenna configuration at the base station. Fig. 1 shows a map of the test area with the base station and measurement site locations.

In the first experiment, a typical urban environment was covered, corresponding to a microcell. The base station (BS1) was located on the roof of one of Aalborg University's buildings, some 18 m above ground level. The antenna position was just above the rooftop level of the surrounding buildings. The test route was chosen in a relatively flat and heavily built-up urban area consisting mostly of five-story buildings in close proximity to each other. A square of 0.75 by 0.75 km was covered by 14 measurement sites (shown by numbered circles on the map). Data were recorded on each of the 14 sites for 12 antenna configurations at the base station (Table I).

In the second experiment, a small cell was considered. The base station (BS2) antennas were located on Hotel Hvide Hus, a 15-story building in Aalborg, some 30 m higher than the rooftop level of the surrounding buildings. The first part of the route was similar to that of the first experiment. The second part was chosen in an undulating environment, starting some 1.5 km from the base station, across a 500 m wide river which splits the town in two. On the Nørre Sundby side of the river, the ground slopes upward, reaching a height of 45 m at 2.6 km (site 14) from the base station, and the area changes gradually into a suburban one. There were 18 measurement sites chosen (shown by numbered triangles on the map), half in Aalborg and half in Nørre Sundby, the furthest point being around 3 km away from the base station. Data were collected for each of the 18 measurement sites for six antenna configurations (Table II). The higher elevation and longer range lengths at BS2 will lower the effective beam width of the incoming scattered signal. Thus, larger antenna separations have been used at BS2 to obtain decorrelation comparable to that of the configurations used at BS1.

Fig. 1. Map

V. DATA PROCESSING

The measured diversity data have been processed collectively for a block of measurements with the same base station antenna configuration. A measurement series consists of a measurement file for each location. The measured data are stored in the measurement files with a recorded impulse response for each diversity branch. Three consecutive instantaneous impulse responses were recorded for each spatial position. Through coherent (complex) averaging of these three impulse responses, a noise suppression of almost 5 dB was obtained. To retrieve the fast fading components of the recorded signal, the log-normal shadowing component was compensated for. The log-normal component of the data was extracted by performing a moving average filtering in the spatial domain. Two consecutive rectangular windows with lengths of 10 and 6.7 m were used.

The PDPs for each branch were calculated by averaging the power of instantaneous impulse responses over the full 51 m run length. The statistical parameters extracted from the PDPs are delay spread ($S$), mean delay ($\bar{\tau}$), and total power ($P_t$). The FCFs were found by inverse Fourier transform of the PDPs. The coherence bandwidths have been found for 0.9, 0.7, and 0.5 correlation levels of the normalized FCF.

Three fading envelope signals have been generated from each data set: one CW signal and two RSSI signals. The RSSI signals have been extracted for the full bandwidth of the measuring equipment (1.2 MHz, -3 dB double sided) and for a GSM-like bandwidth (200 kHz, -3 dB double sided). A squared cosine filter $|H_{\mathrm{filter}}(f)|^2 = \cos^2\left(0.5\pi (f - f_0)/200\ \mathrm{kHz}\right)$ has been used for the GSM-like signal (truncated to 0 outside $(f - f_0) = \pm 200$ kHz). This filter provides a power spectrum closely resembling the main lobe of a GSM/GMSK power spectrum.

For the CW signals, several correlations between the antenna signals have been calculated: $\rho_{12}$, $\rho_{12\,\mathrm{env}}$, $\rho_{12\,\mathrm{pow}}$, and $\rho_{II}$, $\rho_{IQ}$ (see Appendix A). For the RSSI signals, only envelope correlations have been calculated. For all three envelope signal types and each antenna, the 1% level of the cumulative envelope distribution (cdf) is found. This information is valuable in assessing necessary fading margins in radio network design. Also, assumptions of Rayleigh fading signals (see Appendix A) can be coarsely validated in a compact manner. For a true Rayleigh fading signal, the 1% cdf level corresponds to a signal level 20 dB below the mean of the signal.

TABLE I
ANTENNA CONFIGURATIONS USED AT BS1. THE NUMBERS IN THE BRACKETS REFER TO THE ELEVATION OF THE LOWEST ANTENNA WITH RESPECT TO THE BASE STATION ROOF.
[Diagrams of BS1 antenna configurations M1 to M12; for example, (3m) = 3 m above BS1 roof.]

TABLE II
ANTENNA CONFIGURATIONS USED AT BS2. SYMBOLS USED FOLLOW SAME NOMENCLATURE AS USED IN TABLE I.
[Diagrams of BS2 antenna configurations M21 to M26, with antenna separations of 2.9 m and 3.8 m.]

VI. RESULTS OF THE ANALYSIS

In the following discussions, all data from all antenna configurations are considered in the figures (unless specifically noted otherwise).

A. Radio Environment

From [2] and [3], it is found that short-term fading statistics are independent of impulse response shape and only depend on RMS delay spread ($S$) [as long as the delay spread bandwidth product is much smaller than unity, i.e., $S \cdot BW \ll 1$]. This is referred to as a low-dispersive channel situation. For low-dispersive situations, the envelope correlation coefficient will be fully relevant, as all large phase and timing jitter variations are associated with fades in the instantaneous total power (RSSI signal). Thus, a classical envelope-controlled diversity combining scheme can also suppress the phase and timing jitter introduced by the radio channel. Previous investigations [4] have shown log-normal shadowing with a standard deviation of 8 dB in the area covered by BS1. Short-term effect investigations in [5] indicate Rayleigh-distributed fading behavior for most of the strong power components of the PDP, for the areas covered by BS2. Line-of-sight (LOS) peaks, however, show distinctly Rician-distributed envelope fading.

Fig. 2 shows the 1% level of the cumulative envelope distribution versus normalized delay spread, i.e., $S \cdot BW$. For the plotting of the CW case, a 25 kHz bandwidth has been used. As expected, the CW signals are insensitive to radio dispersion, whereas the RSSI signals show reduced dynamics with higher dispersion and larger bandwidth. From Table IV, it follows that the mean 1% cumulative CW envelope levels are indeed close to -20 dB, corresponding to Rayleigh fading. For the RSSI signals, it follows that there is a strong increase in the 1% envelope level for higher bandwidths, and that this increase is approximately 2 dB stronger for BS2 than for BS1. This is due to a higher degree of radio dispersion ($S$) at BS2 compared to BS1. The higher the dispersion, the more frequency-selective the channel becomes, and the probability of the total power vanishing within the band is decreased.
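The -20 dB figure for Rayleigh fading, and the tendency of a wideband power signal to fade less deeply, can be reproduced with a short calculation. The sketch below is illustrative and rests on assumptions that are not from the paper: the wideband case is modeled as the sum of a few independent, equal-power Rayleigh components, and the function names are introduced here.

```python
import math
import random


def rayleigh_level_db(probability):
    """Power level (dB relative to mean power) not exceeded with the given probability."""
    # Rayleigh envelope -> exponentially distributed power: P(p <= x) = 1 - exp(-x).
    return 10 * math.log10(-math.log(1.0 - probability))


def simulated_level_db(n_components, probability, n_samples=50_000, seed=0):
    """Same level for the mean power of n independent equal-power Rayleigh components."""
    rng = random.Random(seed)
    powers = sorted(
        sum(rng.expovariate(1.0) for _ in range(n_components)) / n_components
        for _ in range(n_samples)
    )
    return 10 * math.log10(powers[int(probability * n_samples)])


cw_level = rayleigh_level_db(0.01)                         # narrowband (CW) case
levels = [simulated_level_db(n, 0.01) for n in (1, 2, 4)]  # more components, shallower fades
```

The closed-form value for the narrowband case lies very close to 20 dB below the mean, and the simulated 1% level rises as more independent components contribute to the total power, which is the qualitative behavior reported for the RSSI signals.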
The transmitter/receiver equipment filters limit the resolution of the measured channel impulse response. The RMS delay spreads found from the PDPs were compensated for this by simple variance subtraction [6], i.e.,

$$S = \sqrt{S_{\mathrm{meas}}^2 - S_{\mathrm{eq}}^2} \qquad (8)$$

where $S_{\mathrm{meas}}$ is the RMS delay spread of the measured PDP and $S_{\mathrm{eq}}$ is that of the equipment response. It is, thus, possible to obtain RMS delay spread figures with finer resolution than the time resolution of the transmitter/receiver equipment initially would permit. This makes the RMS delay spread figure also relevant to systems having higher bandwidths than the 1.2 MHz used by the sounding system.

Fig. 2. Cumulative envelope distribution at 1% level versus normalized delay spread $S \cdot BW$ (CW, 200 kHz, and 1.2 MHz signals).

Fig. 3 indicates a dependence between excess path loss and RMS delay spread, i.e., larger delay spreads for increasing excess path loss (where the excess path loss is the path loss relative to the free-space path loss). The indicated dependence can be explained by a lower probability of LOS paths with an increased excess path loss, as LOS paths will generally have lower excess attenuation than scattered and reflected paths. Thus, LOS paths tend to dominate (increase) the total power and reduce the effective width (dispersion) of the PDP.

Fig. 3. RMS delay spread versus normalized (excess) path loss (x: BS1, o: BS2).

Fig. 4 shows the cumulative coherence bandwidth for all measurement locations and antenna systems for BS1 and BS2. For BS1 [Fig. 4(a)], it is observed that the 0.7 coherence bandwidth is larger than 200 kHz for 90% of the locations. In most cases, this justifies using narrowband considerations for a GSM-like system in this environment. The coherence bandwidth function for BS2 [Fig. 4(b)] shows a large discontinuity around the 70-80% level. This is due to the rather harsh environment picked for this cell type. A river splits the town in two and allows free propagation for strong echoes across the river. Many PDPs exhibit a double-spike shape with about 4 μs separation (corresponding to twice the river width). This shows up as a strong periodicity in the FCF and, thus, gives rise to the plateau in the cumulative coherence bandwidth function. For bandwidths over 500 kHz, a direct narrowband description alone cannot give a satisfactory description of the propagation conditions for the micro or the small cell environment investigated.

Fig. 4. Coherence bandwidth at 0.5, 0.7, and 0.9 correlation level for (a) BS1 and (b) BS2.

Fig. 5. Cross polarization discrimination versus range length (x: BS1, o: BS2).

Fig. 6. Equation (9) with $\rho_{\mathrm{env\,Xpol}}$ only ($\rho_{\mathrm{env}V} = 1$) and with measured $\rho_{\mathrm{env}V}$ versus envelope correlation coefficient for physically slant antenna systems (o: Xpol. only, cc = 0.3632; x: Xpol. and space, cc = 0.8723).

The GSM system has incorporated a slow frequency hopping option over 5 MHz. From Fig. 4, it follows that if the hopping frequency is just over 400 kHz, neither the micro nor the small cell environment will have locations with an FCF having levels greater than 0.5. This indicates effective frequency diversity gain by use of the slow frequency hopping option.
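The link between a double-spike PDP, the RMS delay spread, and a periodic FCF can be made concrete with a small discrete example. This is a sketch under assumptions: the PDP is reduced to two ideal spikes 4 μs apart, the relative spike powers are chosen arbitrarily, and the helper names are hypothetical. It is not a reconstruction of the measured profiles.

```python
import cmath
import math


def rms_delay_spread(delays_s, powers):
    """Square root of the second central moment of a discrete PDP."""
    total = sum(powers)
    mean = sum(t * p for t, p in zip(delays_s, powers)) / total
    var = sum((t - mean) ** 2 * p for t, p in zip(delays_s, powers)) / total
    return math.sqrt(var)


def fcf_magnitude(delays_s, powers, delta_f_hz):
    """Magnitude of the normalized frequency correlation function at delta_f."""
    total = sum(powers)
    value = sum(p * cmath.exp(2j * math.pi * delta_f_hz * t)
                for t, p in zip(delays_s, powers))
    return abs(value) / total


def coherence_bandwidth(delays_s, powers, level, step_hz=1e3, max_hz=5e6):
    """Smallest frequency separation at which |FCF| first drops below level."""
    delta_f = 0.0
    while delta_f <= max_hz:
        if fcf_magnitude(delays_s, powers, delta_f) < level:
            return delta_f
        delta_f += step_hz
    return None  # the level is never crossed within max_hz


delays = [0.0, 4e-6]   # double-spike PDP with 4 us separation
powers = [1.0, 0.5]    # assumed relative powers (second echo 3 dB down)

spread = rms_delay_spread(delays, powers)
bw_07 = coherence_bandwidth(delays, powers, 0.7)
bw_05 = coherence_bandwidth(delays, powers, 0.5)
fcf_at_250k = fcf_magnitude(delays, powers, 250e3)   # one full FCF period later
```

With two spikes separated by 4 μs, the FCF magnitude repeats every 250 kHz, so the correlation returns to unity at that separation. With a weaker second spike, e.g. `powers = [1.0, 0.2]`, the FCF never falls below 0.5 and `coherence_bandwidth` returns `None` for that level, which hints at how a plateau can arise when such locations are collected into a cumulative distribution.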

B. Cross Polarization

The possibility of using polarization decorrelation ($\rho_{\mathrm{pow\,Xpol}}$) at a base station has been reported by [9], where the XPD was found in the range around 6 dB for urban areas at 920 MHz. Compact, compound, horizontal, and vertical space diversity antenna systems and polarization decorrelation have also been reported in [10], where macrocells (10-20 km cell radius) were investigated. For urban areas (partly the same area as used for the experiments in this paper), the XPD was around 4 dB in the 900 MHz band. For suburban areas, XPD was found in the range around 12 dB. Fig. 5 shows the XPD versus the range for all antenna configurations at BS1 and BS2. It follows from Fig. 5 that there is a large spread of XPD for both BS1 and BS2. From Table IV, it follows that the urban (BS1) mean XPD is higher (7.4 dB) for small and microcells compared to the macrocells previously investigated. This is expected, as the depolarization is related to the number of reflections, diffractions, etc., that each path is subject to. For small cells, fewer possible perturbing objects will be in the signal path. For longer range lengths, the mean XPD is 11.4 dB (BS2), which can be explained by the environment having a more suburban character with few obstructions between base and mobile. The higher XPDs found for small and microcells mean a reduced diversity gain by using polarization decorrelation compared to the macrocell case.

Appendix A gives a simple extension of the work of Vaughan [12]. The result of the analysis shows that polarization and space decorrelation effects are independent and multiplicative for compound antenna systems, as given in (9). The assumptions are noncorrelated Rayleigh fading polarizations, equal XPD ($\Gamma$) at the antennas, and that the quadrature components of the vertical and horizontal polarizations exhibit approximately equal space decorrelation properties. In practice, this means that the radiation patterns of the antennas must be identical and rotationally symmetric. Experimental test cases are given in Appendix A to support these assumptions. The power correlations are related as:

$$\rho_{\mathrm{pow}\,12} = \rho_{\mathrm{pow}V} \cdot \rho_{\mathrm{pow\,Xpol}} \qquad (9)$$

to couple more power into the cross-polarized component, thereby increasing the polarization diversity gain potential. The PDPs observed for both the co- and cross polarization showed very similar shapes except in the case of a dominant LOS peak in the copolarized case. Fig. 7 shows the RMS delay spread for the co- (vertical) and cross polarization. It is seen that most of the data lie around the 1:1 line, although some stray points are present. This indicates equal strength of the mean radio dispersion effects on the two polarizations.

Fig. 7. RMS delay spread for decoupled (horizontal) polarization versus RMS delay spread for the co-polarized (vertical) signal.

TABLE III
AVERAGE (<>) AND STANDARD DEVIATION (σ) VALUES OF CW ENVELOPE CORRELATION COEFFICIENT AND BRANCH POWER DIFFERENCE.
[Table body: one row per antenna configuration, M1 to M12 (BS1) and M21 to M26 (BS2), with columns for the envelope correlation coefficient and the branch power difference P1 - P2 in dB.]

Fig. 8. Envelope correlation coefficient versus range length.

TABLE IV
AVERAGE (<>) AND STANDARD DEVIATION (σ) VALUES OF S, XPD, AND 1% LEVEL OF CUMULATIVE RSSI ENVELOPE DISTRIBUTION (cdf) FOR CW, 200 kHz, AND 1.2 MHz BANDWIDTHS.

Base   S [μs]         XPD [dB]       cdf CW [dB]     cdf 200 [dB]    cdf 1.2 [dB]
       <>     σ       <>     σ       <>      σ       <>      σ       <>      σ
BS1    0.44   0.36    7.4    2.0     -19.8   1.6     -15.7   2.2     -11.0   2.•
BS2    0.82   0.81    11.4   3.0     -19.6   1.9     -13.8   3.1     -9.1    3.1

C. Antenna Systems

Th e overall decorrelation properties of each antenna system are given in Table III. Recent theoretical work into base station cross correlation [11] has shown a model that can predict decorrelation of horizontal as well as vertical spaced antenna combinations. The model shows how the correlation increases with range length. The analysis in [11] is, though, mainl y concentrated on longer range lengths (5 km) and larger antenna sepa rations (10-20 A) than used in our experiments. However, some comparisons can be made. From Table III, it follows that most antenna configurations have an average envelope correlation around 0.1 to 0.3 with a standard deviation aro und 0.2. Only the pure vertical space diversity systems show a significantly higher correlation of 0.6 to 0.7. For BS2, somewhat lower correlations ar e found compared with the examples given in [11] (cases with approximately 10 A horizontal or vertical separation), with the horizontal separation being very sensitive to model parameter variation. For BSl , the base station antenna height is so low that the model assumptions in [11] are questionable and lead to much lower measured correlations than the model examples given . Th e modeling of compact spaced configurations, thus, seems to pose higher difficulty for cross correlation prediction, especially for very low base antenna heights as found in rnicrocell impleme ntations. A trend of cross correlation depending on range (as exp ected from [11]) is clearl y visible in Fig. 8, where cross correlation versus range is shown for some selected antenna systems. Local variations in environment may mask this effect (i.e., make it less pronounced), although the measurement area covering the locations around BSI was chosen for its fairly homogenous nature with respect to building heights, street width s, and building density. Fig. 
9(a) shows the envelope correlation coefficient increase for the RSSI signals , compared to the CW signals , versus normalized delay spread S . BW. There docs not seem to be an y clear dependence on rad io dispersion for the overall dat a as expected from the high frequency selectivity of the channel, apart from a slight increase in correlation variability

39

o

x : 200kHz 0 : 1.2MHz

0.4

o

0

o o

0.2

0

0

o

~

I

i

I

o

o

10-1

10°

10 1

Normalized Delay spread S BW ( ,' I

1

.~

0.8

-:

-

:J

~

tl

CIl CIl ~

r I

r ;

,i

~

!)

!I

0.6 0.4

r-

:) ., L \ '._ 1 !

_..x : 200 kHz --.0 :

o

1.2 MHz 0.5

CW envelope correlation I hi

i) For the microcell area, the 0.7 coherence bandwidth exceeds 200 kHz at 90% of the locations. This justifies the use of narrowband considerations for GSM, in most cases . ii) For the micro - and small cell areas, no location shows a 0.5 coherence bandwidth exceeding 400 kHz. Thus, the frequency hopping option of GSM will have a high frequency diversity gain potential. iii) The cross polarization discrimination shows a large variance with location and mean values of 7.4 and 11.4 dB for the micro- and small cell areas. These means are larger than previously reported for macrocells in the same type environment. 2) The general evaluation of the antenna systems show : i) The pure vertical space separation of an antenna system can yield acceptable decorrelation properties (mean PenvV around 0.6 to 0.7) for both micro- and small cell areas. ii) All other antenna systems provide very high decorrelation efficiency (mean Penv around 0.1 to 0.3) and, thus, high diversity gain potential. iii) Polarization diversity provides extra decorrelation in the compound antenna systems with about equal strength as the space decorrelation effect. 3) The analysis and experimental treatment of antenna system cross-correlation dependencies show that: i) The polarization and space decorrelation effects in compound antenna systems are independent and multiplicative. ii) RSSI signal decorrelation are largely insensitive to relative dispersion (bandwidth and RMS delay spread) for GSMtype systems. The information contained in this paper verifies the effectiveness (diversity gain potential) of very compact antenna systems in micro- and small cell environments. The dimensions of the antenna systems lead to much easier mounting and site considerations than for standard mast-mounted macrocell implementations .

Fig. 9. (a) Envelope correlation increase with normalized delay spread . (b) RSSI versus CW envelope correlation coefficients.

ApPENDIX DERIVATION AND EXPERIMENTAL VERIFICATION

with increasing dispersion. A very close agreement between CW and RSSI envelope correlations is seen in Fig. 9(b) . Here it -is shown that there is a slight mean and standard deviation increase of correlation for increasing bandwidth Thus. the analysis for compound space and polarization systems, given in Appendix A, may also apply for wideband systems using RSSI driven combining even though the RSSI signals do not exh ibit Rayleigh fading as assumed in the development of (9). Generally, the antenna systems with pure vertical space separation diversity exhibit marginal decorrelation properties. All the other antenna systems exhibit very strong decorrelation properties, promising a high diversity gain .

VII. CONCLUSION

The main findings in this paper are in three areas. 1) From the general radio environment related investigations of the micro- and small cell areas, it follows that:

OF COMPOUND ANTENNA SYSTEM CROSS-CORRELATION FUNCTION

We will now develop an expression for the CW envelope correlation coefficient for two tilted (orthogonal) antennas with respect to the CW envelope correlation coefficient for co-polarized antennas [(9) in Section VI] . In this analysis, however, power correlations are considered as they are easier to handle than envelope correlations. This is analytically justified by a close relation in value between the two [7]. This can also be seen experimentally in Fig. 10, where a very strong correlation (ee) of 0.99 is found between power and envelope correlation coefficients for the data presented in this paper. This relation has also been experimentally displayed in [8] for two vertically separated antennas with a slightly poorer fit, although Ipd 2 was used instead of Ppow ' While theoretically IPJ21 2 = Ppow [7] for Rayleigh fading signals, it follows from Fig. 13 that experimentally this equality is slightly poorer than the approximate relation between the envelope and power correlation as shown in Fig. 10. Higher variances for the

[Fig. 10. Envelope correlation coefficient versus power correlation coefficient. Plot annotation: cc = 0.9924.]

[Fig. 12 plot residue. Horizontal axis: Vertical 1% cdf [dB]. Legend: x: CW, o: 200 kHz, +: 1.2 MHz. Other annotations in the residue: x: BS1, o: BS2; cc = 0.9261.]

(A-1)

with subscripts 1 and 2 denoting the field at antennas 1 and 2, respectively. Assuming the two antennas are horizontally and/or vertically displaced (space diversity), we have two power correlation coefficients ($\rho_{powV}$ and $\rho_{powH}$ for the vertical and horizontal components, respectively). From [10] and Fig. 11 (showing envelope correlations), it follows that: $\rho_{powV} \approx \rho_{powH}$

Fig. 11. Envelope correlation coefficient for decoupled (horizontal) polarization versus envelope correlation coefficient for the co-polarized (vertical) signal.

$|\rho_{12}|^2 = \rho_{pow}$ relation can be explained by two factors. First, for the relation to hold, the signals must be truly Rayleigh fading, while both power and envelope correlations have proven to be fairly insensitive to changes in envelope dynamics (not all experimental results being Rayleigh distributed). Second, the experimental results allow for negative correlations but $|\rho_{12}|^2$ does not. The field components are assumed to have Rayleigh-distributed envelopes ($r$) with random uniformly distributed phases ($\theta$) and Gaussian-distributed quadrature components. The field components of the two polarizations at both antennas are assumed uncorrelated in envelope and phase. For the envelope, this assumption is supported by [9], [12], and Table III. Following Vaughan's [12] procedure, we find each antenna has vertical and horizontal Rayleigh fading field components as

$$E_{V1,2} = r_{V1,2}\cos(\omega t + \theta_{V1,2}); \qquad E_{H1,2} = r_{H1,2}\cos(\omega t + \theta_{H1,2})$$

[Fig. 10 plot residue. Horizontal axis: Power correlation.]

[Fig. 11 plot residue. Horizontal axis: Vertical envelope correlation. Plot annotation: cc = 0.7665.]

Fig. 12. 1% cumulative envelope levels for decoupled (horizontal) polarization versus co-polarized (vertical) signal.

polarizations. Furthermore, it follows from [10] and Table III that it is reasonable to assume approximately equal mean power at the two antennas for each polarization.

$$E[P_{V1}] \approx E[P_{V2}]; \qquad E[P_{H1}] \approx E[P_{H2}] \qquad \text{(A-3)}$$

Thus, the XPD ($\Gamma = E[P_{V1}]/E[P_{H1}] \approx E[P_{V2}]/E[P_{H2}]$) is assumed approximately equal at the two antennas. The powers are $P_{V1,2} = r_{V1,2}^2$, $P_{H1,2} = r_{H1,2}^2$. As the envelopes of both the polarizations at each antenna ($r_{V,H1,2}$) are assumed Rayleigh distributed, the following relationship holds:

(A-2)

is a reasonable assumption. The correlation (cc) between the two is larger for macrocells (cc = 0.88) [10] than for the small and microcells considered here (cc = 0.77). In practice, this relation implies that the antennas have approximately equal radiation patterns, and that the radiation patterns in the vertical and horizontal plane are approximately equal, i.e., rotational symmetry of the radiation pattern around the axis of the main lobe. Fig. 12 shows the 1% cumulative envelope level for the cross- and co-polarized signals. It is seen that the envelope dynamics are also approximately equal for the two

$$E[P_{V,H1,2}^2] = 2E[P_{V,H1,2}]^2. \qquad \text{(A-4)}$$

Using (A-4) and (A-3), we get the correlations

$$\rho_{powV,H} = \frac{E[P_{V,H1}P_{V,H2}] - E[P_{V,H1}]^2}{E[P_{V,H1}]^2}. \qquad \text{(A-5)}$$

Comparing the two polarization correlations by substituting (A-5) into (A-2), we have

$$E[P_{V1}P_{V2}] \approx \Gamma^2 E[P_{H1}P_{H2}]. \qquad \text{(A-6)}$$

Now consider the two antennas being rotated to an angle $\alpha$ to the vertical polarization axis. Then, the voltages received at antennas 1 and 2 are proportional to

$$V_1 = aE_{V1} + bE_{H1}; \qquad V_2 = bE_{V2} - aE_{H2} \qquad \text{(A-7)}$$

with $a = \cos(\alpha)$ and $b = \sin(\alpha)$ as the antenna signal quadrature weights. Following Vaughan's [12] procedure and substituting (A-1) into (A-7) and squaring, the antenna branch powers can be found as

$$P_1 = [a r_{V1}\cos(\theta_{V1}) + b r_{H1}\cos(\theta_{H1})]^2 + [a r_{V1}\sin(\theta_{V1}) + b r_{H1}\sin(\theta_{H1})]^2 = a^2P_{V1} + b^2P_{H1} + 2ab\,r_{V1}r_{H1}\cos(\theta_{V1} - \theta_{H1})$$

$$P_2 = b^2P_{V2} + a^2P_{H2} - 2ab\,r_{V2}r_{H2}\cos(\theta_{V2} - \theta_{H2}). \qquad \text{(A-8)}$$

The antenna branch voltages in (A-7) are a sum of two Rayleigh fading signals, which produces a new Rayleigh fading signal (as the quadrature components are Gaussian distributed). Then, the power relationship of (A-4) applies for $P_1$ and $P_2$ as well, and the power correlation coefficient $\rho_{pow12}$ for the two tilted antennas is

$$\rho_{pow12} = \frac{E[P_1P_2] - E[P_1]E[P_2]}{E[P_1]E[P_2]}. \qquad \text{(A-9)}$$

The mean power moments are found as [and reduced via (A-3)]

$$E[P_1] = a^2E[P_{V1}] + b^2E[P_{H1}] + 2abE[r_{V1}]E[r_{H1}]E[\cos(\theta_{V1} - \theta_{H1})] = a^2E[P_{V1}] + b^2E[P_{H1}] = E[P_{V1}]\left(a^2 + \frac{b^2}{\Gamma}\right)$$

$$E[P_2] = b^2E[P_{V2}] + a^2E[P_{H2}] \approx E[P_{V1}]\left(b^2 + \frac{a^2}{\Gamma}\right) \qquad \text{(A-10)}$$

due to the phases being random, uniformly distributed, and uncorrelated between the polarizations. The cross-power moment is found as [and reduced via (A-3) and (A-6)]

$$E[P_1P_2] = a^2b^2\left(E[P_{V1}P_{V2}] + E[P_{H1}P_{H2}]\right) + a^4E[P_{V1}]E[P_{H2}] + b^4E[P_{V2}]E[P_{H1}] + R(V_1, V_2)$$

$$\approx a^2b^2\,E[P_{V1}P_{V2}]\left(1 + \frac{1}{\Gamma^2}\right) + \frac{(a^4 + b^4)}{\Gamma}E[P_{V1}]^2 + R(V_1, V_2). \qquad \text{(A-11)}$$

$R(V_1, V_2)$ can be expressed directly by the quadrature components of the field components given in (A-1) as

$$R(V_1, V_2) = -4a^2b^2\left(E[I_{V1}I_{V2}]E[I_{H1}I_{H2}] + E[Q_{V1}Q_{V2}]E[Q_{H1}Q_{H2}] + E[I_{V1}Q_{V2}]E[I_{H1}Q_{H2}] + E[I_{V2}Q_{V1}]E[I_{H2}Q_{H1}]\right) \qquad \text{(A-12)}$$

where the in-phase components are $I_{V,H1,2} = r_{V,H1,2}\cos(\theta_{V,H1,2})$ and the quadrature components $Q_{V,H1,2} = r_{V,H1,2}\sin(\theta_{V,H1,2})$. For each polarization, the signals are assumed Rayleigh fading and the in-phase and quadrature components are Gaussian distributed with zero mean and equal variances ($\sigma_I^2 = \sigma_Q^2$). Using (A-3), the ratio between the variances for each polarization is

$$E[P] = \sigma_I^2 + \sigma_Q^2 = 2\sigma^2; \qquad \sigma_{I_{V,H1}}^2 = \sigma_{Q_{V,H1}}^2 = \sigma_{I_{V,H2}}^2 = \sigma_{Q_{V,H2}}^2; \qquad \Gamma = \frac{\sigma_V^2}{\sigma_H^2}. \qquad \text{(A-13)}$$

The quadrature component correlation coefficient is then found as

$$\rho_{II_{V,H}} = \frac{E[I_{V,H1}I_{V,H2}]}{E[I_{V,H1}^2]} = \frac{E[Q_{V,H1}Q_{V,H2}]}{E[Q_{V,H1}^2]} = \frac{2E[I_{V,H1}I_{V,H2}]}{E[P_{V,H1}]}. \qquad \text{(A-14)}$$

Similarly, the cross quadrature expectation becomes

$$E[I_{V,H1}Q_{V,H2}] = -E[Q_{V,H1}I_{V,H2}] = \rho_{IQ_{V,H}}\frac{E[P_{V,H1}]}{2}. \qquad \text{(A-15)}$$

[Fig. 13 plot residue. Horizontal axis: Square complex correlation. Legend: o: quadrature (cc = 0.9951), x: power (cc = 0.9559).]

Fig. 13. Square quadrature component correlation coefficients ($\rho_{II}^2 + \rho_{IQ}^2$) and power correlation coefficient versus square module of complex correlation coefficient ($|\rho_{12}|^2$).

For Rayleigh fading signals, it can be found that $\rho_{IQ} = -\rho_{QI}$ and $\rho_{II} = \rho_{QQ}$ [13]. Furthermore, the complex correlation is related as $\rho_{12} = \frac{1}{2}(\rho_{II} + \rho_{QQ}) + j\frac{1}{2}(\rho_{IQ} - \rho_{QI}) = \rho_{II} + j\rho_{IQ}$. From [7], it follows that $\rho_{II}^2 + \rho_{IQ}^2 = |\rho_{12}|^2 = \rho_{powV}$. This is shown in Fig. 13 for the data presented in this paper. The quadrature component correlations (offset by 0.4 for clarity) have an extremely high correlation (cc = 0.995) with the squared complex correlation coefficients. The power

[Fig. 14 plot residue. Plot annotation: cc = 0.909.]

ACKNOWLEDGMENT


into (A-11), the power correlation coefficient in (A-9) reduces to the expression (9) shown in Section VI.


The authors acknowledge the participation of C. Jensen in the experimental part of this work during his employment at Aalborg University, and H. Fredskild from Telecom Denmark Telelaboratoriet for providing a second matching base-station antenna. The comments of the reviewers, who helped enhance the quality of this paper, are greatly appreciated.

REFERENCES


[1] B. Glance and L. J. Greenstein, "Frequency-selective fading effects in digital mobile radio with diversity combining," IEEE Trans. Commun., vol. 31, no. 2, pp. 1085-1094, Sept. 1983.
[2] J. B. Andersen, P. C. F. Eggers, and B. L. Andersen, "Propagation aspects of datacommunications over the radio channel-A tutorial," in Proc. EUROCON '88, Stockholm, Sweden, pp. 301-307.
[3] B. L. Andersen, P. C. F. Eggers, and J. B. Andersen, "Time and phase variations in the mobile channel," in Proc. Nordic Radio Symp. '89, Saltsjobaden, Sweden, pp. 297-304.
[4] P. E. Mogensen, P. C. F. Eggers, C. Jensen, and J. B. Andersen, "Urban area radio propagation measurements at 955 and 1845 MHz for small and micro cells," in Proc. IEEE GLOBECOM'91, pp. 1297-1302.
[5] P. E. Mogensen, "Preliminary results from short-term measurements in urban area," COST 231 TD(90)-88, Paris, France, Oct. 8-11, 1990.
[6] A. A. M. Saleh and R. A. Valenzuela, "A statistical model for indoor multipath propagation," IEEE J. Select. Areas Commun., vol. 5, no. 2, pp. 128-137, Feb. 1987.
[7] J. N. Pierce and S. Stein, "Multipath diversity with nonindependent fading," IRE Proc., pp. 89-104, Jan. 1960.
[8] M. T. Feeney and J. P. Parsons, "Cross-correlation between 900 MHz signals received on vertically separated antennas in small-cell mobile radio systems," IEE Proc., vol. 138, no. 2, pp. 81-86, Apr. 1991.
[9] S. Kozono, T. Tsuruhara, and M. Sakamoto, "Base station polarization diversity reception for mobile radio," IEEE Trans. Vehic. Technol., vol. 33, no. 4, pp. 301-306, Nov. 1984.
[10] P. C. F. Eggers and J. B. Andersen, "Base station diversity for NMT900," in Proc. Nordic Radio Symp., Saltsjobaden, Sweden, pp. 77-85.
[11] A. M. D. Turkmani and J. P. Parsons, "Characterisation of mobile radio signals: base station crosscorrelation," IEE Proc., vol. 138, no. 6, pp. 557-565, Dec. 1991.
[12] R. G. Vaughan, "Polarization diversity in mobile communications," IEEE Trans. Vehic. Technol., vol. 39, no. 3, pp. 177-186, Aug. 1990.
[13] D. E. Kerr, Propagation of Short Radio Waves. Boston, MA: Boston Tech. Pub. Inc., 1964.

[Fig. 14 plot residue. Horizontal axes: (a) Vertical inter-in-phase correlation, (b) Vertical cross-quadrature correlation. Plot annotation: cc = 0.8809.]

Fig. 14. Horizontal versus vertical (a) inter-in-phase ($\rho_{II}$) and (b) cross-quadrature ($\rho_{IQ}$) correlation coefficients for BS2.

correlation coefficients have a slightly poorer correlation (cc = 0.96), but are still very high. With the previous discussion, we can reduce (A-12) [with the use of (A-14) and (A-15) and rearranging] to

(A-16)

It follows from Fig. 14 that it is reasonable to assume that the quadrature component correlations are approximately equal for the two polarizations, though the data shown only represent BS2. At BS1, the antenna cables were changed between the vertical and horizontal polarization experiments. Thus, the phase offset on the complex correlations changed from the vertical to the horizontal experiment. Consequently, the $\rho_{II}$ and $\rho_{IQ}$ relations between the two polarizations at BS1 are altered and not usable. With this discussion and inserting (A-16)

A Statistical Model for Angle of Arrival in Indoor Multipath Propagation

Quentin Spencer, Michael Rice, Brian Jeffs, and Michael Jensen
Department of Electrical & Computer Engineering
Brigham Young University
Provo, Utah 84602

[1], whose work was based on the work of Turin. Their work consisted of collecting temporal data on indoor propagation, from which they proposed a time domain model for indoor propagation. Most indoor propagation research has dealt with the time of arrival and paid little attention to the angle of arrival. In order to predict the performance of adaptive array systems, the angle of arrival is very important information. Some recent papers have begun to address the angle of arrival. Lo and Litva [3] found that multipath arrivals tend to occur at varying angles indoors, but were not able to arrive at any conclusions based on their limited data. Guerin [4] collected angular and temporal data separately, but did not correlate the two. Wang et al. [5] used a rectangular array to estimate both the elevation and azimuth angles of arrival for major multipaths, but did not measure the corresponding time of arrival. Litva et al. [6] collected simultaneous time and angle of arrival data, similar to the format of the data used in this paper. They came to the preliminary conclusion that it is possible to make accurate measurements of this type and learn more about what is happening in the indoor multipath channel. However, their experiment was not extensive enough to draw any conclusions about the channel. This paper presents an extension to the Saleh-Valenzuela model which accounts for the angle of arrival. This is based on data that includes information about both the time and angle of arrival, presented in [7]. The Saleh-Valenzuela model is explained, and the new data is discussed. Model parameters based on the new data are derived and compared to the parameters found by Saleh and Valenzuela at a lower frequency.

Abstract- Multiple antenna systems are a useful way of overcoming the effects of multipath interference, and can allow more efficient use of spectrum. In order to test the effectiveness of various algorithms such as diversity combining, phased array processing, and adaptive array processing in an indoor environment, a channel model is needed which models both the time and angle of arrival in indoor environments. Some data has been collected indoors and some temporal models have been proposed, but no existing model accounts for both time and angle of arrival. This paper discusses existing models for the time of arrival, experimental data that were collected indoors, and a proposed extension of the Saleh-Valenzuela model [1], which accounts for the angle of arrival. Model parameters measured in two different buildings are compared with the parameters presented in the paper by Saleh and Valenzuela, and some statistical validation of the model is presented.

I. INTRODUCTION

There have been many different approaches for overcoming the problem of multipath interference, both in outdoor and indoor applications. Some of them include channel equalization, directional antennas, and multiple antenna systems, each being more particularly suited to different applications. The use of multiple antenna systems can be particularly useful for indoor applications such as local area networks, because they allow the possibility of communicating with multiple users simultaneously over a single frequency band, increasing throughput and making efficient use of frequency spectrum. The signals from different antennas can be combined in various ways, including diversity combining, phased array processing, and adaptive array algorithms. Adaptive array systems are becoming increasingly feasible for high bandwidth applications with continuing improvements in digital signal processors. In addition, the availability of new, higher frequency bands has made wireless networks an increasingly attractive and feasible option. The effects of multipath interference have been studied extensively in various outdoor scenarios. However, the study of the indoor multipath channel is relatively new. In order to be able to predict the performance of indoor communications systems, models are needed that accurately model the behavior of radio transmissions in indoor environments. Several other researchers have already collected various types of data on indoor multipath propagation. The foundation for much of today's work was by Turin et al. [2], which was a study of outdoor multipath propagation in an urban environment. The first model for indoor multipath propagation was proposed by Saleh and Valenzuela

II. THE SALEH-VALENZUELA MODEL

The model proposed by Saleh and Valenzuela is based on a clustering phenomenon observed in their experimental data. In all of their observations, the arrivals came in one or two large groups within a 200 ns observation window. It was observed that the second clusters were attenuated in amplitude, and that rays, or arrivals within a single cluster, also decayed with time. Their model proposes that both of these decaying patterns are exponential with time, and are controlled by two time constants: $\Gamma$, the cluster arrival decay time constant, and $\gamma$, the ray arrival decay time constant. Fig. 1 illustrates this, showing the mean envelope of a three cluster channel.

Reprinted from IEEE Vehicular Technology Conference, pp. 1415-1419, May 1997.


Crabtree Building, constructed mostly of steel and gypsum board. Each data set can be viewed as an image plot, with angle as one axis, and time as the second axis. A typical data set is pictured in Fig. 2. The images were processed to remove blurring effects so that the precise time, angle and amplitude of each major multipath arrival is known. The data collection and processing is discussed in greater detail in [7]. Visual observation of the data showed that clustering like that observed by Saleh and Valenzuela was present in the data. The nature of the clustering tended to follow the model of Saleh and Valenzuela quite well. In general, the strength of clusters tended to decay with increasing delay times, and arrivals within each cluster showed a similar pattern of decay. One difference from the Saleh-Valenzuela data is the higher average number of clusters per data set.

The impulse response of the channel is given by:

$$h(t) = \sum_{l=0}^{\infty}\sum_{k=0}^{\infty} \beta_{kl}\,\delta(t - T_l - \tau_{kl}), \qquad (1)$$

where the sum over $l$ represents the clusters, and the sum over $k$ represents the arrivals within each cluster. The amplitude of each arrival is given by $\beta_{kl}$, which is a Rayleigh distributed random variable, whose mean square value is described by the double-exponential decay illustrated in Fig. 1. Mathematically it is given by:

$$\overline{\beta_{kl}^2} = \overline{\beta^2(T_l, \tau_{kl})} \qquad (2)$$

$$= \overline{\beta^2(0, 0)}\, e^{-T_l/\Gamma}\, e^{-\tau_{kl}/\gamma}, \qquad (3)$$

where $\overline{\beta^2(0, 0)}$ is the average power of the first arrival of the first cluster. This average power is determined by the separation distance of transmitter and receiver. The time of arrival is described by two Poisson processes which model the arrival times of clusters and the arrival times of rays within clusters. The time of arrival of each cluster is an exponentially distributed random variable conditioned on the time of arrival of the previous cluster. The case is the same for each ray, or arrival within a cluster. Following the terminology used by Saleh and Valenzuela, rays shall refer to arrivals within clusters, so that the cluster arrival rate implies the parameter for the intercluster arrival times and the ray arrival rate refers to the parameter for the intracluster arrival times. The distributions of these arrival times are shown in equations (4) and (5):

$$p(T_l \mid T_{l-1}) = \Lambda e^{-\Lambda(T_l - T_{l-1})} \qquad (4)$$

$$p(\tau_{kl} \mid \tau_{(k-1)l}) = \lambda e^{-\lambda(\tau_{kl} - \tau_{(k-1)l})}, \qquad (5)$$

IV. A PROPOSED TIME/ANGLE MODEL FOR INDOOR MULTIPATH PROPAGATION

In this section we propose a statistical model for the indoor multipath channel that includes a modified version of the Saleh-Valenzuela model, and incorporates an angle-of-arrival model. In addition, methods of estimating parameters from the data are discussed.

A. Time of Arrival

The time and amplitude of arrival portion of the combined model is represented by $h(t)$ in equation (1), where, as before, $\overline{\beta_{kl}^2}$ is the mean square value of the $k$th arrival of the $l$th cluster. This mean square value is described by the exponential decay given in equation (3) and illustrated in Fig. 1. As before, the ray arrival time within a cluster is given by the Poisson distribution of equation (5), and the first arrival of each cluster is given by $T_l$, described by the Poisson distribution of (4). The inter-ray arrival times, $\tau_{kl}$, are dependent on the time of the first arrival in the cluster, $T_l$. In the Saleh-Valenzuela model, the first cluster time $T_1$ was dependent on $T_0$, which was assumed to be zero. With the estimated parameter in [1] of $1/\Lambda \approx 300$ ns, the first arrival time will typically be in the range of 200 to 300 ns, which is a reasonable figure. However, a problem with this was found when the $1/\Lambda$ parameter in the new data was discovered to be very low, but the delay time to the first arrival was often still on the order of 200 ns. Under the Saleh-Valenzuela model, this would make any long delays which would occur at larger separation distances between transmitter and receiver highly improbable. To remedy this problem, it is proposed that $T_0$ be the line of sight propagation time:

where $\Lambda$ is the cluster arrival rate, and $\lambda$ is the ray arrival rate. In their data, Saleh and Valenzuela did not have any information on angle of arrival, and assumed that the angles of arrival were uniformly distributed over the interval $[0, 2\pi)$.
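The arrival-time and mean-power relations of equations (3) to (5) can be sketched in a few lines. This is only an illustrative sketch, not code from the paper: the function names, the truncation of the (in principle unbounded) arrival processes at `t_max`, and the use of nanoseconds for all times are assumptions.

```python
import math
import random

def sv_arrival_times(cluster_rate, ray_rate, t_max, t0=0.0, rng=random):
    """Draw cluster times T_l and relative ray times tau_kl.

    Interarrival gaps are exponential, as in equations (4) and (5).
    cluster_rate and ray_rate are in 1/ns; arrivals later than t_max
    are discarded (a truncation assumed here for practicality).
    Returns a list of (T_l, [tau_0l, tau_1l, ...]) tuples.
    """
    clusters = []
    t_cluster = t0 + rng.expovariate(cluster_rate)   # first cluster, conditioned on T_0
    while t_cluster < t_max:
        taus = [0.0]                                  # first ray arrives with the cluster
        tau = rng.expovariate(ray_rate)
        while t_cluster + tau < t_max:
            taus.append(tau)
            tau += rng.expovariate(ray_rate)
        clusters.append((t_cluster, taus))
        t_cluster += rng.expovariate(cluster_rate)
    return clusters

def mean_square_amplitude(T_l, tau_kl, beta2_00, Gamma, gamma):
    """Double-exponential decay of the mean square amplitude, equation (3)."""
    return beta2_00 * math.exp(-T_l / Gamma) * math.exp(-tau_kl / gamma)

if __name__ == "__main__":
    rng = random.Random(1)
    # Saleh-Valenzuela style values: 1/Lambda = 300 ns, 1/lambda = 5 ns
    for T_l, taus in sv_arrival_times(1 / 300.0, 1 / 5.0, 600.0, rng=rng):
        print(f"cluster at {T_l:6.1f} ns with {len(taus)} rays")
```

Passing a seeded `random.Random` instance as `rng` makes a simulated channel reproducible.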

Other indoor multipath models have been proposed, such as the model proposed by Ganesh and Pahlavan [8], but they will not be discussed here. The data used in this paper fit the Saleh-Valenzuela model well, and as a result the model was chosen as the basis for the extended model presented here.

III. EXPERIMENTAL DATA

In order to analyze and model the indoor multipath channel, a data gathering apparatus was designed which was able to take simultaneous measurements of the time and angle of arrival. The frequency band was from 6.75 to 7.25 GHz. Using the system, a total of 65 data sets were collected in two buildings on the Brigham Young University Campus. In the Clyde building, a reinforced concrete and cinder block building, 55 data sets were collected. For comparison, ten additional data sets were collected in the

$$T_0 = \frac{r}{c}, \qquad (6)$$

where c is the speed of light, and r is the separation distance. This allows for the time of the first arrival to be more directly dependent on the separation distance.
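As a small worked example of equation (6), the sketch below converts a separation distance into the line-of-sight delay $T_0$. The function name and the sample distances are illustrative assumptions, not values taken from the measurements.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0

def line_of_sight_delay_ns(separation_m):
    """T_0 = r / c from equation (6), returned in nanoseconds."""
    return separation_m / SPEED_OF_LIGHT_M_PER_S * 1e9

if __name__ == "__main__":
    for r in (1.0, 10.0, 60.0):   # hypothetical separation distances in metres
        print(f"r = {r:5.1f} m -> T_0 = {line_of_sight_delay_ns(r):6.1f} ns")
```

A separation of about 60 m corresponds to a delay of roughly 200 ns, the order of first-arrival delay discussed above.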


B. Angle of Arrival

It will be assumed that time and angle are statistically independent. If there were a correlation, it would be expected that a longer time delay would correspond to a larger angular variance from the mean of a cluster. This was not observed in the data, so at this point an assumption of independence is reasonable, but further study of the correlation structure may be warranted. The consequence of this independence is that the complete impulse response with respect to both time and angle, which we will call $h(t, \theta)$, becomes a separable function:

$$h(t, \theta) = h(t)\,h(\theta). \qquad (7)$$

As a result, $h(\theta)$ can be addressed separately from $h(t)$. We propose an independent angular impulse response of the system, similar to the time impulse response of the channel given in (1):

$$h(\theta) = \sum_{l=0}^{\infty}\sum_{k=0}^{\infty} \beta_{kl}\,\delta(\theta - \Theta_l - \omega_{kl}), \qquad (8)$$

where, as before, $\beta_{kl}$ is the ray amplitude for the $k$th arrival in the $l$th cluster, given in equations (2) and (3). $\Theta_l$ is the mean angle of each cluster, which is distributed uniformly on the interval $[0, 2\pi)$. We propose that the ray angle within a cluster, $\omega_{kl}$, be modeled as a zero mean Laplacian distribution with standard deviation $\sigma$:

$$p(\theta) = \frac{1}{\sqrt{2}\,\sigma}\, e^{-|\sqrt{2}\,\theta/\sigma|}. \qquad (9)$$

The correlation of these distributions to the data is shown in the next section.

D. Using the Model

The extended model for $h(t, \theta)$ is useful for analysis or simulation of array processing algorithms that might be used in an indoor environment. In order, for example, to conduct a Monte Carlo simulation of an array antenna processor, it is necessary to generate a random channel using the statistical model. This section outlines the procedure for doing so. The first step is to choose the transmitter/receiver separation distance $r$, which can be chosen either randomly or arbitrarily. Knowing $r$, the next step is to determine $\overline{\beta^2(0, 0)}$, the mean power of the first arrival, which is given by

(10)

where $G(1\,\mathrm{m})$ is the channel gain at $r = 1$ meter, and $\alpha$ is a channel loss parameter. $\gamma$ and $\lambda$ are respectively the ray decay parameter and ray arrival rate in the model for $h(t)$. Equation (10) is derived and the characteristics of $\alpha$ in the indoor environment are discussed in greater detail in [1]. After $\overline{\beta^2(0, 0)}$ is determined, the next step is to determine the cluster and ray arrival times. The corresponding distributions are given in equations (4) and (5), where $T_0 = r/c$. After the times are determined, the mean square amplitudes $\overline{\beta_{kl}^2}$ are determined by equation (3). The actual amplitudes for each arrival, $\beta_{kl}$, are determined by sampling a Rayleigh distribution with that mean square value. The angles are determined by first randomly choosing the cluster angles, which are uniformly distributed from 0 to $2\pi$. Relative ray angles are then determined by sampling a Laplacian distribution as given in equation (9).
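The channel-generation procedure described in this section (separation distance, arrival times, mean square amplitudes, Rayleigh amplitudes, uniform cluster angles, Laplacian ray angles) can be sketched as follows. This is a minimal sketch under stated assumptions rather than the authors' simulator: `generate_channel` and its parameter names are hypothetical, the mean power of the first arrival `beta2_00` is passed in directly instead of being computed from equation (10), arrivals are truncated to an assumed observation window, and the cluster decay is applied to the delay relative to $T_0$.

```python
import math
import random

C_M_PER_NS = 0.299792458  # speed of light in metres per nanosecond

def sample_laplacian(sigma, rng):
    """Zero-mean Laplacian sample with standard deviation sigma.

    Uses the fact that the difference of two unit exponentials is Laplacian
    with scale 1; the scale b = sigma / sqrt(2) gives variance sigma**2.
    """
    b = sigma / math.sqrt(2.0)
    return b * (rng.expovariate(1.0) - rng.expovariate(1.0))

def sample_rayleigh(mean_square, rng):
    """Rayleigh amplitude whose mean square value is mean_square."""
    return math.sqrt(mean_square * rng.expovariate(1.0))

def generate_channel(r_m, beta2_00, Gamma, gamma, Lambda, lam, sigma_deg,
                     window_ns=200.0, rng=None):
    """Return a list of (arrival time in ns, angle in degrees, amplitude)."""
    rng = rng or random.Random()
    t0 = r_m / C_M_PER_NS                      # line-of-sight delay, T_0 = r / c
    t_end = t0 + window_ns                     # assumed observation window
    arrivals = []
    T = t0 + rng.expovariate(Lambda)           # first cluster, conditioned on T_0
    while T < t_end:
        cluster_angle = 360.0 * rng.random()   # uniform mean cluster angle
        tau = 0.0
        while T + tau < t_end:
            # Assumption: decay measured from T_0; use T itself to follow (3) literally.
            mean_sq = beta2_00 * math.exp(-(T - t0) / Gamma) * math.exp(-tau / gamma)
            amplitude = sample_rayleigh(mean_sq, rng)
            angle = (cluster_angle + sample_laplacian(sigma_deg, rng)) % 360.0
            arrivals.append((T + tau, angle, amplitude))
            tau += rng.expovariate(lam)        # next ray within this cluster
        T += rng.expovariate(Lambda)           # next cluster
    return arrivals

if __name__ == "__main__":
    # Illustrative parameter values (times in ns, rates in 1/ns, angle spread in degrees).
    channel = generate_channel(10.0, 1.0, 33.6, 28.6, 1 / 16.8, 1 / 5.1, 25.5,
                               rng=random.Random(0))
    print(len(channel), "arrivals; first:", channel[0])
```

Each call yields one random channel realization, so a Monte Carlo study of an array processor would call `generate_channel` repeatedly and feed each realization to the algorithm under test.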

C. Parameter Estimation

This section outlines methods of deriving the distributions and estimating the parameter $\sigma$ given in the previous section. The distribution of the cluster means, $\Theta_l$, is found by identifying each of the clusters in a given data set. The mean angle of arrival for each cluster is calculated. In order to remove the specific room geometry and orientation, the first arrival (in time) for each data set is taken as the reference. The relative cluster means are calculated by subtracting the mean of the reference cluster from all other cluster means. To estimate the distribution of cluster means over the ensemble of all data sets, a histogram can be generated of all relative cluster means, disregarding the first clusters (since their relative mean is always 0). The procedure to estimate $\sigma$ is similar. The cluster mean is subtracted from the absolute angle of each ray in the cluster to give a relative arrival angle with respect to the cluster mean. The relative arrivals are collected over the ensemble of all data sets, and a histogram can be generated. Using a least mean square algorithm, the histogram is fit to the closest Laplacian distribution, which gives the value for $\sigma$.
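The histogram-fitting step for $\sigma$ can be sketched as below. The paper does not give its fitting code, so everything here is an assumption made for illustration: the 10 degree bin width, the coarse grid search over candidate values of $\sigma$ (standing in for whatever least mean square routine was actually used), and the function names. The candidate Laplacian is integrated over each bin before comparison with the histogram.

```python
import math
import random

def laplacian_cdf(x, sigma):
    """CDF of a zero-mean Laplacian with standard deviation sigma."""
    b = sigma / math.sqrt(2.0)
    if x < 0.0:
        return 0.5 * math.exp(x / b)
    return 1.0 - 0.5 * math.exp(-x / b)

def fit_laplacian_sigma(relative_angles, bin_width=10.0, limit=180.0):
    """Least-squares fit of a Laplacian to a histogram of relative ray angles.

    Angles are in degrees relative to the cluster mean. Candidate sigma
    values from 1.0 to 100.0 degrees in 0.5 degree steps are tried, and the
    one with the smallest squared error against the normalized histogram wins.
    """
    n_bins = int(round(2.0 * limit / bin_width))
    edges = [-limit + i * bin_width for i in range(n_bins + 1)]
    counts = [0] * n_bins
    for a in relative_angles:
        i = int((a + limit) // bin_width)
        if 0 <= i < n_bins:
            counts[i] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no angles fall inside the histogram range")
    freqs = [c / total for c in counts]

    best_sigma, best_err = None, float("inf")
    for step in range(2, 201):
        sigma = 0.5 * step
        err = 0.0
        for i in range(n_bins):
            # Probability mass of the candidate Laplacian inside bin i.
            p = laplacian_cdf(edges[i + 1], sigma) - laplacian_cdf(edges[i], sigma)
            err += (freqs[i] - p) ** 2
        if err < best_err:
            best_sigma, best_err = sigma, err
    return best_sigma

if __name__ == "__main__":
    rng = random.Random(2)
    b = 25.5 / math.sqrt(2.0)
    # Synthetic relative angles drawn from a Laplacian with sigma = 25.5 degrees.
    angles = [b * (rng.expovariate(1.0) - rng.expovariate(1.0)) for _ in range(20000)]
    print("estimated sigma:", fit_laplacian_sigma(angles))
```

With measured data, `relative_angles` would be the ray angles minus their cluster means, pooled over all data sets as described above.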

V. MODEL PARAMETERS FROM THE DATA

The intercluster time decay constant, $\Gamma$, was estimated by normalizing the cluster amplitudes (the amplitude of the first arrival) so that the first one had an amplitude of 1 and a time delay of 0. All of the cluster amplitudes were superimposed as shown in Fig. 3. The estimate for $\Gamma$ was found by curve fitting the line (representing an exponential curve) to minimize the mean squared error. The values for $\Gamma$ and $\gamma$ were estimated for both buildings in a similar manner. In this particular example, the fit is less than ideal, but it was better in the other cases, especially when there were more data points. In their data, Saleh and Valenzuela did not have exact amplitudes available, and as a result were not able to use curve fitting or generate plots as in Fig. 3. Their parameters were as a result very rough estimates, but they did observe the same general decay trend as in this data, which supports the exponential decay model. The Poisson parameters, $\Lambda$ and $\lambda$, representing the intercluster and intracluster arrival rates, were estimated by subtracting from each arrival time that of its predecessor to produce a set of conditional arrival times $p(\tau_{kl} \mid \tau_{(k-1)l})$. The

probability distribution of these with the best fitting pdf (for the Clyde Building) is shown in Fig. 4. Fig. 5 shows a CDF of the relative cluster angles for the Clyde Building, illustrating the relatively uniform distribution of clusters in angle. The same was true in the Crabtree Building. The distribution of the ray arrivals with respect to the cluster mean is shown in Fig. 6. The sharp peak at the mean is characteristic of a Laplacian distribution. The superimposed curve is a Laplacian distribution that was fit by integrating a Laplacian PDF over each bin, and matching the curves using a least mean square goodness of fit measure. The Laplacian distribution turns out to be a very close fit in both buildings. Table 1 shows a comparison of the model parameters estimated for the Clyde Building, the Crabtree Building, and those estimated by Saleh and Valenzuela from their data. The most obvious discrepancy is in the estimates for the value of $\Lambda$. This is due to the fact that there were significantly more clusters observed in both the Clyde and Crabtree Buildings compared to an average of 1-2 clusters observed by Saleh and Valenzuela. This may be partly due to the higher RF frequency, but the more likely cause is the ability of our testbed to see clusters that were close together in time, but separated in angle. Another interesting phenomenon is that $\Gamma$ is very low in the Clyde Building, and $\gamma$ is larger than $\Gamma$ in the Crabtree Building, meaning that the Clyde Building tends to attenuate more than the Crabtree Building. The values of $\sigma$ were close in both buildings, and there is no precedent for comparison with other data.

Table 1. A comparison of model parameters for the two buildings and from the Saleh-Valenzuela paper [1]

Parameter      Clyde Building   Crabtree Building   Saleh-Valenzuela
$\Gamma$       33.6 ns          78.0 ns             60 ns
$\gamma$       28.6 ns          82.2 ns             20 ns
$1/\Lambda$    16.8 ns          17.3 ns             300 ns
$1/\lambda$    5.1 ns           6.6 ns              5 ns
$\sigma$       25.5°            21.5°               -

VI. CONCLUSION

Many aspects of the model have plausible physical explanations. Because an absolute angular reference was maintained during the collection of the data, it was possible to compare the processed data with the geometry of each configuration. The strongest cluster was almost always associated with the direct line of sight, even when there were walls blocking the line of sight path. Apparent causes of weaker clusters were back wall reflections and doorway openings. It is likely that each cluster corresponds to a major path to the receiver, and the arrivals within each cluster are likely the result of smaller, closely associated objects that are part of a very similar group of paths to the receiver. These paths will take slightly longer to arrive than the first arrival in the cluster, and are usually attenuated relative to this first arrival. The amplitudes of clusters and rays within clusters both follow the same pattern of exponential decay over time observed by Saleh and Valenzuela. The differences in model parameters are likely due to the difference in frequency (Saleh and Valenzuela used 1.5 GHz). The other discrepancy is in the markedly faster cluster arrival rate, which is most likely explained by the larger overall number of clusters resulting from a more sensitive data gathering apparatus. The model parameters for the Clyde and Crabtree Buildings were in general very similar. The most notable exception is the extremely slow amplitude decay of rays within a cluster in the Crabtree Building. In general, the model seemed to be able to accurately describe the differing multipath characteristics in both buildings, regardless of their very different construction. This implies that the model could possibly provide a general representation for many different types of buildings, and model parameters could therefore be found for other types of buildings. The angle-of-arrival model presented here, though as yet unconfirmed, is a strong alternative to the only previous options for simulation: random assignment of angles or guessing at the angular properties of the channel. The most important area for continued research is applying the model for its intended purpose: comparison of array processing algorithms. This can be done either by mathematical analysis or Monte Carlo simulation. A mathematical analysis is likely intractable due to the large number of variables in the model, but the model can be a very useful tool for the generation of random multipath channels for simulation.

REFERENCES

[1] Adel A. M. Saleh and Reinaldo A. Valenzuela. A statistical model for indoor multipath propagation. IEEE Journal on Selected Areas of Communications, SAC-5:128-137, February 1987.
[2] George L. Turin et al. A statistical model of urban multipath propagation. IEEE Transactions on Vehicular Technology, VT-21(1):1-9, February 1972.
[3] T. Lo and J. Litva. Angles of arrival of indoor multipath. Electronics Letters, 28(18):1687-1689, August 27 1992.
[4] Stephane Guerin. Indoor wideband and narrowband propagation measurements around 60.5 GHz in an empty and furnished room. In IEEE Vehicular Technology Conference, pages 160-164, 1996.
[5] Jian-Guo Wang, Ananda S. Mohan, and Tim A. Aubrey. Angles-of-arrival of multipath signals in indoor environments. In IEEE Vehicular Technology Conference, pages 155-159. IEEE, 1996.
[6] John Litva, Amir Ghaforian, and Vytas Kezys. High-resolution measurements of AOA and time-delay for characterizing indoor propagation environments. In IEEE Antennas and Propagation Society International Symposium 1996 Digest, volume 2, pages 1490-1493. IEEE, 1996.
[7] Quentin Spencer, Michael Rice, Brian Jeffs, and Michael Jensen. Indoor wideband time/angle of arrival multipath propagation results. In IEEE Vehicular Technology Conference. IEEE, 1997.
[8] R. Ganesh and K. Pahlavan. Statistical modeling and computer simulation of indoor radio channel. IEE Proceedings-I, 138(3):153-161, June 1991.

[Figures]

Fig. 1. An illustration of exponential decay of mean cluster power and ray power within clusters.

Fig. 2. A typical raw data set.

Fig. 3. Plot of normalized cluster amplitude vs. relative delay for the Clyde Building, with the curve for Γ = 33.6 ns superimposed.

Fig. 4. CDF of relative arrival times within clusters in the Clyde Building (1/λ = 5.1 ns).

Fig. 5. CDF of relative mean cluster angles in the Clyde Building with respect to the first cluster in each set.

Fig. 6. Histogram of relative ray arrivals with respect to the cluster mean for the Clyde Building. Superimposed is the best fit Laplacian distribution (σ = 25.5°).

Chapter 2 Adaptive Algorithms

Of paramount importance for an adaptive antenna is the manipulation of the signals induced on its elements. Two overview papers on array processing set the scene at the beginning of this chapter and provide the reader with valuable background information on many signal processing issues related to adaptive antennas. They provide a comprehensive and detailed treatment of different beamforming methods, adaptive algorithms to adjust the weights of the antenna array elements according to some optimization criterion, and direction of arrival methods. Some of the adaptive methods that are included here are conventional beamforming, conjugate gradient, least squares, recursive least squares, linear prediction, maximum likelihood, maximum entropy, minimum norm, constrained optimization, spectral estimation and eigenstructure methods, weighted subspace fitting, and well-known algorithms such as MUSIC (multiple signal classification), ESPRIT (estimation of signal parameters via rotational invariance techniques), SCORE (spectral self-coherence restoral), and MVDR (minimum variance distortionless response). Also, among the topics presented in this chapter are issues related to some of these algorithms, such as spatial smoothing and the estimation of the number of signals impinging on the antenna array.
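As a hedged illustration of what "adjusting the weights according to some optimization criterion" can mean, the sketch below computes MVDR weights, which minimize the output power w^H R w subject to unit gain w^H a = 1 in the look direction, giving w = R^{-1} a / (a^H R^{-1} a). The scenario is an assumption made for this example only and is not taken from the reprinted papers: an 8-element uniform linear array with half-wavelength spacing, a desired source at 0°, one strong interferer at 20°, and unit-power white noise. The helper names `steering_vector` and `mvdr_weights` are hypothetical.

```python
import numpy as np

def steering_vector(n_elements, theta_deg, spacing=0.5):
    """Steering vector of a uniform linear array (spacing in wavelengths)."""
    k = np.arange(n_elements)
    return np.exp(2j * np.pi * spacing * k * np.sin(np.deg2rad(theta_deg)))

def mvdr_weights(R, a):
    """Minimize w^H R w subject to w^H a = 1: w = R^-1 a / (a^H R^-1 a)."""
    Ri_a = np.linalg.solve(R, a)          # R^-1 a without forming the inverse
    return Ri_a / (a.conj() @ Ri_a)

n = 8
a_sig = steering_vector(n, 0.0)           # assumed look direction
a_int = steering_vector(n, 20.0)          # assumed interferer direction

# Assumed covariance: interferer of power 100 plus unit-power white noise.
R = 100.0 * np.outer(a_int, a_int.conj()) + np.eye(n)

w_mvdr = mvdr_weights(R, a_sig)
w_conv = a_sig / n                        # conventional (delay-and-sum) weights

gain_sig = abs(w_mvdr.conj() @ a_sig)     # response toward the desired source
gain_int = abs(w_mvdr.conj() @ a_int)     # response toward the interferer
gain_int_conv = abs(w_conv.conj() @ a_int)

print(f"MVDR gain toward source:      {gain_sig:.4f}")
print(f"MVDR gain toward interferer:  {gain_int:.2e}")
print(f"Conventional gain, interferer: {gain_int_conv:.2e}")
```

In practice R is not known and is replaced by a sample covariance estimate, which is where adaptive algorithms such as least squares and recursive least squares enter.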


Celebrating a Half Century of Signal Processing

Highlights of Statistical Signal and Array Processing

Unlike most other technical committees of the Signal Processing Society, which deal with signals of deterministic nature and process signals one at a time, the Statistical Signal and Array Processing (SSAP) Technical Committee deals with signals that are random and processes an array of signals simultaneously. This issue features the SSAP-TC's contribution to the Anniversary series, which covers this special field of random signals and array processing. The field of SSAP represents both solid theory and practical applications. Starting with research in spectrum estimation and statistical modeling, study in this field is always full of elegant mathematical tools such as statistical analysis and matrix theory. The area of statistical signal processing expands into estimation and detection algorithms, time-frequency domain analysis, system identification, and channel modeling and equalization. The area of array signal processing also extends into multichannel filtering, source localization and separation, and so on. Work in SSAP areas has already made an impact in a large variety of applications, ranging from communication and radar/sonar processing to many medical imaging technologies, and even econometrics. This article represents an endeavor by the members of the SSAP-TC to review all these significant developments in the field of SSAP. To provide readers with pointers for further study of the field, this article includes a very impressive bibliography: close to 500 references are cited. This is just one of the indications that the field of statistical signals has been an extremely active one in the signal-processing community.

This article also introduces the recent reorganization of three technical committees of the Signal Processing Society. During the reorganization, the SSAP, Digital Signal Processing, and Underwater Acoustics Signal Processing technical committees were restructured to form three new committees: Signal Processing Theory and Methods, Signal Processing for Communications, and Sensor Arrays and Multichannel Signal Processing. After the reorganization, research topics that used to belong to the SSAP TC are now distributed to the three new TCs. Therefore, although the name "SSAP" does not exist anymore, the research activities related to it have been given a new life and will continue to thrive in the Signal Processing Society.

Now, let me invite you to enjoy this article, which will give you a quick but comprehensive tour of the highlights of statistical signal and array processing.

Tsuhan Chen, Guest Editor
Carnegie Mellon University

Reprinted from IEEE Signal Processing Magazine, Vol. 15, No. 5, pp. 21-64, September 1998.


Many engineering applications require extraction of a signal or parameter of interest from degraded measurements. To accomplish this it is often useful to deploy fine-grained statistical models; diverse sensors that acquire extra spatial, temporal, or polarization information; or multidimensional signal representations, e.g., time-frequency or time-scale. When applied in combination these approaches can be used to develop highly sensitive signal estimation, detection, or tracking algorithms that can exploit small but persistent differences between signals, interferences, and noise. Conversely, these approaches can be used to develop algorithms to identify a channel or system producing a signal in additive noise and interference, even when the channel input is unknown but has known statistical properties.

Broadly stated, the statistical signal and array processing (SSAP) area is concerned with reliable estimation, detection, and classification of signals that are subject to random fluctuations. Opening a recent issue of the IEEE Transactions on Signal Processing to a SSAP paper the reader will probably see one or more of the following: (1) description of a mathematical and statistical model for measured data, including models for sensor, signal, and noise; (2) careful statistical analysis of the fundamental limitations of the data including deriving benchmarks on performance, e.g., the Cramér-Rao, Ziv-Zakai, Barankin, Rate Distortion, Chernoff, or other lower bounds on average estimator/detector error; (3) development of mathematically optimal or suboptimal estimation/detection algorithms; (4) asymptotic analysis of error performance establishing that the proposed algorithm comes close to reaching a benchmark derived in (2); and (5) simulations or experiments that compare algorithm performance to the lower bound and to other competing algorithms.

Depending on the specific application, a SSAP algorithm may also have to be adaptive to changing signal and noise environments. This requires incorporating flexible statistical models, implementing low-complexity real-time estimation and filtering algorithms, and on-line performance monitoring.

Until recently the statistical signal and array processing area was covered by the SSAP Technical Committee, which grew out of the Spectrum Estimation and Modeling Technical Committee (discontinued in 1991). At ICASSP-98 in Seattle, an administrative restructuring took place that eliminated the SSAP, Digital Signal Processing (DSP), and Underwater Acoustics Signal Processing (UASP) Technical Committees, replacing them by three new committees: Signal Processing Theory and Methods (SPTM), Signal Processing for Communications (SPCOM), and Sensor Arrays and Multichannel Signal Processing (SAM). The SSAP areas described in this article have migrated to these new Technical Committees and remain very active within the Signal Processing Society. In particular, the following workshops sponsored or co-sponsored by SSAP will continue to provide forums for researchers in the area: the Workshop on Higher Order Statistics (to be held in Caesaria, Israel, in 1999 (http://sig.ensr.ti-/-hos99)), the Workshop on Statistical Signal and Array Processing (to be held in the Poconos, Pennsylvania, in 2000), and the Workshop on Signal Processing Advances in Communications (to be held in Annapolis, Maryland, in 2000).

Similar to other Technical Committees, SSAP ran workshops, recommended paper awards, and reviewed papers for ICASSP. To facilitate the paper review process and provide focus for award nominations, the scope of SSAP was divided into several subareas, called "SP EDICS" categories. These categories were spectral analysis; higher-order statistical analysis; cyclostationary signal analysis; statistical multichannel filtering; statistical modeling; parameter estimation; detection; performance analysis; system identification; computational algorithms; and applications. These categories are covered in this article and continue to be represented in the aggregated EDICS of the SPTM, SPCOM, and SAM Technical Committees.

SSAP TC Members (as of May 1998)
Alfred O. Hero III (SPTM), University of Michigan, Ann Arbor (chairman)
Georgios B. Giannakis (SPCOM), University of Virginia (vice-chairman)
Moeness Amin (SPTM), Villanova University
Kevin Buckley, Villanova University
Jean-François Cardoso (SPTM), École Nationale Supérieure des Télécommunications
Zhi Ding (SPCOM), Auburn University
Petar M. Djuric (SPTM), State University of New York at Stony Brook
Hamid Krim (SPTM), North Carolina State University
Jeffrey Krolik (SAM), Duke University
Fu Li, Portland State University
Hagit Messer-Yaron (SPTM), Tel-Aviv University
Eric Moulines (SPTM), École Nationale Supérieure des Télécommunications
Arye Nehorai (SAM), Univ. of Illinois at Chicago
Louis Scharf (SPTM), University of Colorado, Boulder
Ananthram Swami (SPCOM), Army Research Laboratory
A. Lee Swindlehurst (SPCOM), Brigham Young University
David J. Thomson (SPTM), Lucent Technologies
Jitendra Tugnait (SPCOM), Auburn University
Raghuveer M. Rao, Rochester Inst. of Technology
Lang Tong (SPCOM), Cornell University
Mats Viberg (current Vice-Chair SPTM), Chalmers University of Technology
Mati Wax (SAM), Rafael, Israel
Guanghan Xu (SPCOM), The University of Texas at Austin
Michael Zoltowski (SPCOM), Purdue University
(New TC affiliations based on restructuring are indicated in parentheses.)


As the reader will see from this article, SSAP impacts a very wide range of applications. Among the applications mentioned in the sequel are: radar signal processing; sonar signal processing; geophysics and climate; radar and optical remote sensing; electrocardiography (ECG); electroencephalography (EEG); magnetoencephalography (MEG); nuclear magnetic resonance (NMR) imaging; radio-isotope imaging (PET and SPECT); chemical sensing of the environment; physical oceanography; fractal internet traffic modeling; astronomy; biology; econometrics; speech; and music analysis/synthesis.

Over the past several years the application of signal processing to communications has become a prevalent theme in SSAP. The pre-existence of many relevant core SSAP areas made communications a very ripe applications area. In particular, research in cyclostationarity, higher-order statistics, and system identification was a springboard to the development of novel methods for channel equalization in digital communications. Likewise, work in detection and estimation led naturally to iterative multiuser detection, source separation, and high-performance modulation classification algorithms. As another example, deployment of phased antenna arrays and the associated signal processing has spearheaded much recent activity in spatial diversity reception for wireless communications. The sections by Giannakis, Tong, and others highlight some of these communications applications of SSAP.

Our article begins with a group of two sections on recent developments in detection/estimation algorithms written by Alfred Hero and Petar Djuric, respectively. The section by Hero focuses on two areas of significant activity: constant-false-alarm-rate (CFAR) detection and iterative maximum-likelihood (ML) parameter estimation using the expectation-maximization (EM) algorithm. The section by Djuric describes the emerging area of Bayesian signal processing including estimation, detection, tracking and Markov chain Monte Carlo (MCMC) sampling, which is a technique that was largely impractical before the current generation of high-speed computers. The article continues with a section on time-delay estimation written by Hagit Messer and Jason Goldberg and a section on multiwindow spectral estimation by David Thomson. From a historical perspective, time-delay estimation and spectral estimation are two of the oldest areas of statistical signal processing, dating back at least to the late 19th century (see [42]), yet they remain two of the most active areas today. Continuing along these lines are sections on the increasingly important problems of detection and estimation in the time-frequency domain, written by Moeness Amin, and the time-scale or multiresolution domain, written by Hamid Krim and Jean-Christophe Pesquet. Next comes a section written by Georgios Giannakis on recent SSAP activity in channel estimation and equalization for digital communications. This is followed by two sections dealing with the critical problems of modeling, system identification, and the often overlooked area of data validation. Ananthram Swami starts off with a broad overview of non-Gaussian measurement models and higher-order statistical methods, followed by a section by Jitendra Tugnait on advances in multichannel system identification and testing random processes for non-Gaussian or nonlinear behavior. These are followed by a section written by Arye Nehorai on exciting opportunities in SSAP due to recent advances in sensor technology. Finally, the article turns to array signal processing with four sections written by Lee Swindlehurst, Jeff Krolik, Jean-François Cardoso, and Lang Tong, respectively. Swindlehurst provides a bird's-eye view of sensor-array processing and its applications to source localization, source separation, and channel estimation. Cardoso follows up with a section focusing on developments in blind-source-separation algorithms. Tong discusses the increasing importance of blind-source separation and diversity in multiuser communications systems design. The final section, written by Krolik, discusses the use of computational propagation models for processing sonar and radar array data.

It is essential to point out that, in a limited overview article such as this, one cannot possibly do justice to the large number of areas that comprise SSAP. Neither can we hope to cover but a fraction of the contributions of individuals who have had a role in the development of SSAP through the years. We offer our sincere apologies to any individuals who feel omitted from this overview.

WWW links relevant to the area of SSAP:

The (old) SSAP home page: http://www.eng.auburn.edu/~ding/SSAP/SSAP.html

A database of "selected papers" that appeared in the IEEE Transactions on Signal Processing, 1988-1995: http://www.eng.auburn.edu/~ding/SSAP/1ntp.html

The SPTM, SPCOM, and SAM Technical Committee home pages can be accessed through the IEEE Signal Processing Society home page: http://www.ieee.org/society/sp/index.html

A clearinghouse for information on many aspects of signal processing is the Signal Processing Information Base at: http://spib.rice.edu/spib.html

Some other web pages of interest to those working in SSAP:

-The IEEE Societies on Computers, Antennas and Propagation, Communications, Aerospace and Electronic Systems, Information Theory, and the IEEE Neural Network Council, all have SSAP related activities and links can be found on the IEEE page:

http://www.ieee.org/tab/clll"_sub_soc_sub_bps.html

-The American Statistical Association: http://www.amstat.org/

-The Institute of Mathematical Statistics: http://www.imstat.org/

-The International Association for Statistical Computing: http://www.stat.unipg.it/iasc.html

-The Royal Statistical Society: http://maths.ntu.ac.uk/rss/index2.html

-The Acoustical Society of America: http://asa.aip.org/

-The International Union of Radio Science: http://www.intec.rug.ac.be/Research/Projects/ursi/welcome.html

-The Institution of Electrical Engineers (UK): http://www.iee.org.uk/Welcome.html

Advances in Detection and Estimation Algorithms for Signal Processing

Alfred Hero, University of Michigan

The fundamental theory behind detection, classification, and estimation has its home in mathematical statistics and decision theory [109, 227]. In the context of statistical signal processing one must also contend with additional constraints: the exceedingly large size of signal-processing datasets; the absence of reliable and tractable signal models; the associated requirement of fast algorithms; and the requirement of real-time unsupervised algorithms. Two statistical signal-processing areas will be discussed in this section: algorithms for robust CFAR detection, and advances in iterative parameter estimation using the EM algorithm.

CFAR Detection

One of the most challenging problems in automated target detection and recognition is reliable detection of targets in high clutter backgrounds. When the clutter statistics are unknown or highly variable, the false-alarm rate of classical detection algorithms, e.g., the matched filter, cannot be controlled and target detection decisions become unreliable. The reason for this is lack of robustness of the test statistics to clutter variations. The objective of CFAR detection is to produce a test statistic whose probability distribution does not depend on the unknown noise parameters, e.g., noise power or clutter spectrum, while ensuring a high probability of signal detection. Such a detector is also sometimes referred to as a noise-adaptive detector. For such a test statistic the detection threshold can be set to guarantee a prespecified false-alarm rate. There is a wide range of different strategies available for designing CFAR detectors including: min-max hypothesis testing [109], similar and unbiased hypothesis testing [227], invariant hypothesis testing [287], and the generalized likelihood ratio (GLR) test [205]. For lack of space we focus only on CFAR detection using the min-max, GLR, and invariant testing approaches. We regretfully must omit work in adaptive detection for assumed known noise backgrounds, nonparametric techniques, distributed detection, Huber robust detection, sequential detection, signal classification, and detection of number of signals.

Min-max CFAR hypothesis testing seeks to maximize detection probability subject to a constraint on maximum false-alarm rate. The min-max approach was recently adopted in [20] and [21] in the context of simultaneous detection and classification of multiple signals. This produced optimal detectors that took the form of a weighted likelihood ratio (LR) test. It was also shown in [20] that this min-max CFAR test implicitly implements a variant of Rissanen's minimum description length (MDL) signal selection criterion, establishing that MDL is min-max optimal. It is sometimes possible to arrive at min-max optimal detectors through the method of similar tests [366]. Finally, the min-max CFAR optimal detector can be viewed from the point of view of Bayesian detection implemented with a least favorable prior on the unknown noise density. Thus, in principle, the Bayesian methods developed in [119], [30], and more recently in [61], can be manipulated to provide CFAR tests.

In many cases direct min-max optimization is difficult, and simpler suboptimal CFAR alternatives are of interest. The conceptually simplest approach is the GLR "estimate and plug" procedure, which requires computing ML estimates for the unknown noise parameters. In [204] the GLR principle produced an adaptive detector for detecting spatio-temporal signals or targets in Gaussian noise with unknown spatial covariance. A different GLR adaptive target detector was derived in [57] for the case of optical images. The GLR for a general multichannel measurement was derived in [205], which specializes to the cases derived in [204] and [57] by applying suitable coordinate transformations. A related and important result was presented in [333] where exact confidence regions for the GLR-maximizing signal vector were derived for unknown spatial covariance. Additional applications of the GLR strategy to multispectral infrared images were presented in [330] and [464]. In [48] the GLR test was applied to arbitrary subspace projections of the data under similar assumptions as [205]. Other notable CFAR applications of GLR have appeared in the following areas: signal detection in noise of slowly fluctuating power [100]; transient signal detection in Gaussian noise of unknown power [316]; signal detection in unknown Gaussian-Gaussian mixture noise [39]; colored autoregressive noise [202, 367]; spatio-temporal signal detection in Gaussian noise with unknown spatial covariance [120, 266, 341]; signal detection in unknown impulsive noise [59]; multiwindow/GLR sinusoid detection [187, 296]; tests

for presence of cyclostationary signals [82]; and detection of sampled signals having sampling jitter [371]. When the GLR is intractable, e.g., for non-Gaussian signals, and the noise covariance is known up to a scale factor, CFAR tests have been proposed based on maximizing alternative criteria such as: deflection or contrast [102, 310]; circular correlation coefficient for sinusoid detection [300]; and coherence [125] and generalized coherence [68] for multichannel signal detection. CFAR detectors have also been derived based on summary statistics such as: order statistical filter outputs [452]; higher-order spectra [167, 209]; matched-filter multiple-correlation lag products [134]; weighted subspace fitting residuals [444]; and adaptive filtering followed by subspace projections [228]. Finally, when the noise covariance matrix is unknown, CFAR detection has been proposed using maximum signal-to-noise ratio (SNR) criteria and covariance estimates [101]; group delay statistics [231]; integrated-bispectrum non-Gaussianity tests [428] (corrections in [429]); and higher-order cumulants [140, 345, 346].

One of the main justifications of the GLR principle is its asymptotic optimality under broad conditions, e.g., [200, 201]. However, there are two factors that can make the GLR test unworkable in applications: 1) the GLR may not be of closed form when the clutter covariance has special structure, e.g., block diagonal; 2) use of the GLR principle entails a loss in efficiency [206, 331] which can severely impact finite sample performance. An alternative that can frequently lead to better finite sample performance is the application of the principle of invariance [355], also called exact robustness [192]. The method of invariance involves expressing uncertainty in the unknown clutter covariance as resulting from a set of algebraic actions on the image by an appropriate group of transformations. Once the uncertainty has been mapped to group actions, one can often identify statistics whose statistical distributions are functionally invariant to unknown noise parameters yet entail minimum loss of target discrimination capability. On the basis of these statistics, optimal CFAR likelihood ratio tests can often be specified. These statistics are called "maximal invariants," and the resultant LR tests are called CFAR invariant tests. Many simple examples exist for which invariance principles give CFAR tests with higher power than the GLR test (for a simple but nontrivial example see [227, Ex. 6.18]). Despite the difficulty in finding maximal invariants and their statistical distributions the payoff for the extra effort in signal-processing applications can be high [35, 36, 354, 357, 356], where often the invariant LR test significantly outperforms the GLR or approximate GLR test.

The EM Algorithm for Parametric Estimation

The EM algorithm has generated much recent interest in the signal-processing community due to its ability to reliably compute iterative ML and penalized-ML estimates of signal parameters for cases where direct maximization is intractable. While the origins of the algorithm are decades older, it was only after the unified overview by Dempster, Laird, and Rubin (DLR) [89] that the wide applicability of the EM algorithm became recognized. Twenty years after DLR published their paper Meng and Van Dyk [265] published an update which cogently described the considerable recent advances in the EM algorithm. A review of some signal-processing applications of EM recently appeared in this magazine [273]. Here we will focus on important developments that were not covered in [273].

The intuition behind EM is simple to state. Based on an observation, $y$, it is desired to maximize the log-likelihood $l_y(\theta) = \ln f_Y(y; \theta)$ over an unknown parameter, $\theta$. However, either due to missing data or to a complicated form of the log-likelihood, one would much rather maximize the simpler log-likelihood, $l_x(\theta) = \ln f_X(x; \theta)$, of a more informative data sample $X$, called the "complete data." As the complete data, $X$, is not available one strikes a compromise by iteratively maximizing the best estimate of the simpler log-likelihood given $y$ and the previous estimate of $\theta$. Here the "best estimate" is the one that minimizes mean-squared error: the conditional mean. Remarkably, this simple recipe leads to an algorithm that has many attractive properties such as stable convergence, monotone increasing likelihood, and flexible implementation.

One of the first signal-processing applications of EM after DLR appeared was to the problem of emission tomography [372] and shortly thereafter to transmission tomography [221]. Many follow-up papers appeared on this topic in the medical-imaging community (see, e.g., [110] for a partial list) before the EM algorithm was applied to other signal-processing problems such as parameter estimation for multiple superimposed signals [106, 107] and direction finding [271].

Wide adoption of the EM algorithm was hindered by its disappointingly slow convergence speed. Efforts to improve the convergence of EM include: Aitken's acceleration [247]; over-relaxation [229]; conjugate gradient [180, 196]; Newton methods [38, 262]; quasi-Newton methods [220]; and ordered subsets EM (OSEM) [178]. Unfortunately, these methods do not automatically guarantee the monotone increasing likelihood property of standard EM, leading to the additional burden of monitoring the iterations for instability [222].

It has been established that the EM algorithm converges for bounded unimodal log-likelihood $l_y(\theta) = \ln f_Y(y; \theta)$ [461]. When the likelihood function is twice differentiable the asymptotic speed of convergence of the EM algorithm is proportional to the maximum eigenvalue of $[F_X - F_Y] F_X^{-1}$ [89]. Here $F_Y = -\nabla^2 l_y(\theta)$ and $F_X = E_\theta[-\nabla^2 l_x(\theta) \mid y; \theta]$ are the observed Fisher information matrices evaluated at the ML estimate $\hat{\theta}$. Furthermore, in [165] the monotonic rate of convergence of EM was shown to be equal to the matrix $l_2$ norm

$\| F_X^{-1/2} [F_X - F_Y] F_X^{-1/2} \|$. Thus, the speed of convergence of the EM algorithm increases as the complete data, $X$, become less informative; i.e., as $F_X$ approaches $F_Y$ and $X$ gets closer to the actual measurements, $y$. However, there is a tradeoff between speed of convergence and implementation complexity: the M step of the standard EM algorithm usually becomes more difficult as $X$ becomes less informative. It was discovered by Fessler and Hero [110] that this tradeoff can be eased by reformulation of the EM algorithm with "hidden data" sets, which are less informative than complete data sets and which can vary at each iteration. This led to the "space alternating generalized expectation maximization" (SAGE) algorithm, its main feature being that it only updates small groups of the parameters, e.g., a single coordinate, at each iteration yet preserves monotonicity. In this respect, SAGE resembles the "expectation conditional maximization either" (ECME) [239] but SAGE generally has faster convergence [265]. In Meng and Van Dyk's paper [265] a generalization of SAGE and ECME was introduced called "alternating expectation conditional maximization" (AECM), which is a SAGE algorithm with a "design parameter" similar to SAGE-3 introduced in [111]. Another recent generalization, called parameter expansion EM (PX-EM), allows one to augment the parameter space, in addition to the data space, in order to obtain further convergence acceleration [240]. Interestingly, for the case of the superimposed signals problem PX-EM reduces to SAGE. In addition to the examples shown in [110], [111], and [265] the SAGE algorithm and its variants have been applied to angle of arrival estimation [115], multiuser detection [81, 289], estimation of constrained covariance matrices [364], and speckle interferometry [363].

For cases where the likelihood function is nonconvex neither the EM algorithm nor its variants are guaranteed to converge. Furthermore, grouped coordinate updating techniques like ECME, SAGE, and AECM may not converge even when the likelihood is convex but is nondifferentiable. In these cases it is still possible to improve on the plain-vanilla EM algorithm. Two recent advances are the method of Lavielle [51, 225] and the method of Chretien and Hero [63, 64]. The former allows implementation of the E step via stochastic approximation (or simulated annealing [226]) when a closed form for the E step is not available. This permits much less informative complete data sets to be used, for which the conditional expectation in the E step is intractable, thereby improving the asymptotic convergence speed. Furthermore, under appropriate conditions on the annealing schedule, this method guarantees (w.p. 1) convergence to the global maximum for any initialization. On the other hand, the deterministic method of Chretien and Hero [63] was developed for nondifferentiable nonconvex likelihood functions and iteratively approximates the ML estimate via cutting plane methods, specifically proximal point iterations with Kullback-Leibler penalty. When the relaxation parameter in the proximal point algorithm is equal to one, this method reduces to the standard EM algorithm. By decreasing the relaxation parameter towards zero, a more rapidly convergent algorithm is obtained, all the while preserving the monotonic likelihood property.

A WWW link to the author of the above section: http://www.eecs.umich.edu/~hero/hero.html
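The EM recipe described in this section (take the conditional mean of the complete-data log-likelihood given the observations and the current parameter estimate, then maximize it) can be sketched on a standard textbook problem. The example below is an illustration chosen for this purpose, not a method from the cited references: the observations y come from a two-component Gaussian mixture, and the "complete data" X are taken to be the observations together with their unobserved component labels. The function names, the initialization, the synthetic data, and the iteration count are all assumptions. The recorded log-likelihood history lets the monotone increasing likelihood property be checked numerically.

```python
import numpy as np

def weighted_densities(y, w, mu, var):
    """Return w_k * N(y_i; mu_k, var_k) as an (n_samples, 2) array."""
    diff = y[:, None] - mu
    return w * np.exp(-0.5 * diff**2 / var) / np.sqrt(2.0 * np.pi * var)

def em_two_gaussians(y, n_iter=50):
    """EM for a two-component 1-D Gaussian mixture (illustrative sketch)."""
    # Assumed initialization: equal weights, means at the data extremes.
    w = np.array([0.5, 0.5])
    mu = np.array([y.min(), y.max()])
    var = np.array([y.var(), y.var()])
    history = []
    for _ in range(n_iter):
        # E step: conditional probability of each hidden label given y
        # and the current parameter estimate.
        dens = weighted_densities(y, w, mu, var)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M step: closed-form maximization of the expected complete-data
        # log-likelihood.
        nk = resp.sum(axis=0)
        w = nk / y.size
        mu = resp.T @ y / nk
        var = (resp * (y[:, None] - mu) ** 2).sum(axis=0) / nk
        # Incomplete-data log-likelihood l_y(theta) after this iteration.
        history.append(np.log(weighted_densities(y, w, mu, var).sum(axis=1)).sum())
    return w, mu, var, history

# Synthetic data (assumed): two well-separated unit-variance components.
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 1.0, 300)])

w, mu, var, history = em_two_gaussians(y)
print("weights:", w)
print("means:  ", mu)
print("log-likelihood never decreased:", bool(np.all(np.diff(history) >= -1e-6)))
```

Because the hidden labels make the M step available in closed form, each iteration is cheap, which is the implementation-complexity side of the tradeoff against convergence speed discussed above.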

Bayesian Methods for Signal Processing
Petar M. Djuric, State University of New York at Stony Brook
A typical signal-processing task is to extract desired information about a signal from observed data. The sought information might be, for example, related to unknown signal parameters, number of signals in the data, distribution of signal power as a function of time and frequency, hidden states of a system producing a signal, or prediction of system and signal behavior. The Bayesian approach to making the required inference relies on the use of probability models for the observed data and the application of probability theory, where a key role is played by Bayes' theorem [28, 41, 123, 198, 339, 355, 379]. The inference is made in terms of probability statements, and the methodology, overall, is coherent and on conceptually sound and indisputable grounds. Its framework has considerable practical advantages including substantial flexibility and generality that allow coping with very complex models.
The main distinction between Bayesian signal processing and non-Bayesian methods is in the use of prior densities that quantify uncertainty about the unknowns. Use of priors has important implications both in explanation of results and the repertoire of methods, and in the statistical community their use has been a controversial subject for many years. The controversy seems to have subsided lately, partially due to improved interpretations of the priors in many applications and also to the maturation of the theory of Bayes methods, which has greatly clarified the impact of mismatched priors on the final results [28, 41].

Basics
To put things in perspective, consider first a problem where estimation is required. Let the observed data be denoted by $y$ and the set of unknown parameters that have to be estimated by $\theta$. The probability model is described by the joint probability density function, $f(y, \theta)$, which can be expressed as

\[ f(y, \theta) = f(y \mid \theta)\, f(\theta) \]


where $f(y \mid \theta)$ is the conditional density of the data given the parameters $\theta$, and $f(\theta)$ is the prior density of the parameters. Given the data, $y$, and the model $f(y, \theta)$, where $\theta$ is unknown, all the information about $\theta$ is summarized in the (normalized) posterior density $f(\theta \mid y)$, which by Bayes' theorem is given by

\[ f(\theta \mid y) = \frac{f(y \mid \theta)\, f(\theta)}{f(y)} \qquad (1) \]

Although the entire trajectory, $\{f(\theta \mid y)\}_{\theta}$, of the posterior is sometimes of interest, very often in practice a point estimate of $\theta$ is preferred. For example, the value of $\theta$ that maximizes $f(\theta \mid y)$, termed the maximum a posteriori (MAP) estimate, is routinely used. The search for the MAP estimate clearly represents a multivariable optimization problem. Note that $f(y \mid \theta)$, when viewed as a function of $\theta$ for given $y$, is called a likelihood function. In the non-Bayesian literature, $f(\theta)$ is not available and the ML estimate, i.e., the value of $\theta$ that maximizes $f(y \mid \theta)$, is often used.

Estimation, Detection, and Tracking
The best estimate of $\theta$ is specified as that value, $\hat{\theta}$, that minimizes a prespecified cost function. When the cost function is quadratic, the estimate $\hat{\theta}$ is the conditional mean, $E[\theta \mid y]$, that is,

\[ \hat{\theta} = \int \theta\, f(\theta \mid y)\, d\theta \]

and it represents the minimum mean-squared error (MMSE) estimate. To implement the MMSE estimate, two multidimensional integrations are required: one for computing $\hat{\theta}$ given the posterior, and one for computing the normalization factor $f(y) = \int f(y \mid \theta) f(\theta)\, d\theta$ in Eq. (1). For an interesting application see [92], which treats the problem of Bayesian power spectral density estimation. Multidimensional integration is also necessary when some of the signal parameters are not of interest. These uninteresting parameters, referred to as nuisance parameters, are then integrated out of the posterior, which decreases the complexity of the problem in some cases.
In signal detection, the primary objective is to determine whether or not a signal is present in the observed data. The signal may be one of many hypothesized signals, so the goal is to decide which one is in the data. In the Bayesian literature this problem is known as model selection, and it is addressed by first defining a model, $M_k$, for the $k$-th signal, followed by associating it with a joint probability distribution, $f(y, \theta_k, M_k)$, where $\theta_k$ denotes the parameters of the model $M_k$. Then one proceeds with the use of Bayes' theorem and evaluation of the posterior $f(M_k \mid y)$, which again requires multidimensional integrations.
When it is of interest to track signal values that continuously change with time, the signal model is frequently described by a state-space representation. In the case of a linear model and additive Gaussian noise, the Bayesian solution is the well known Kalman filter [245]. If the model is nonlinear and/or the noise is non-Gaussian, the Bayesian formulation leads to a nonlinear filtering problem whose solution again requires multidimensional integration.

Some Difficulties
The conceptual simplicity of the Bayesian methodology notwithstanding, Bayesian methods have been underused by the signal-processing community mainly because of their high implementation complexity. Indeed, there are only a small number of scenarios where the needed optimizations or integrations can be carried out analytically. Generally, analytical or numerical approximations are required that often give discouragingly complicated mathematical functions. This picture has gradually changed with the emergence of increasingly powerful computers. Fast numerical techniques for implementation of the necessary operations have revolutionized the practice of Bayesian estimation, detection, and tracking over the past several years.

Monte Carlo Methods
The multidimensional integrations and optimizations involved in Bayesian methodology can be approximated accurately by Monte Carlo simulation if one is able to sample from the posterior distributions. In most cases of interest, however, such sampling is impossible. A more practical method is to generate samples from simpler distributions followed by approximation of the integrals by sample averages. One of these methods is importance sampling [28, 123, 297]. The generating distributions are called importance sampling functions, and their choice is crucial because the variance of the estimated integrals is critically dependent on them. A related procedure is sampling-importance resampling, which generates samples from the posterior distribution by repeated sampling from simpler distributions [342]. In this method, samples are first generated from a suitable approximation of $f(\theta \mid y)$, say $g(\theta)$, yielding samples $\theta_1, \theta_2, \ldots, \theta_L$. Then to each sample a probability mass proportional to the weight $w_l = f(\theta_l \mid y) / g(\theta_l)$ is associated, where $f(\theta \mid y)$ is the un-normalized posterior density of $\theta$. Finally, the posterior density is simulated by drawing samples from $\{\theta_1, \theta_2, \ldots, \theta_L\}$ with probabilities proportional to $(w_1, w_2, \ldots, w_L)$. Another approach is to use Markov chain Monte Carlo (MCMC) methods [28, 123, 137, 297, 392]. These methods have extended the Bayesian methodology to many previously intractable applications. The key idea is to generate samples by running an ergodic Markov chain whose distribution after convergence is the desired posterior distribution. Similarly to importance sampling, samples are drawn from a simple posterior-approximating distribution and are subsequently corrected to improve the approximation. The


samples are generated sequentially from distributions dependent on the samples most recently drawn, thereby forming a Markov chain. The first MCMC method was proposed by Metropolis in the early fifties [268] and was used in computational physics. The Metropolis algorithm draws samples from a symmetric distribution and they are accepted or rejected according to a prescribed acceptance probability. This procedure is repeated a sufficiently large number of times. The Metropolis algorithm was later generalized by Hastings to nonsymmetric sampling distributions, which is known as the Metropolis-Hastings algorithm [155]. A third MCMC method is the Gibbs sampler, which employs conditional sampling and may be considered as a special case of the Metropolis-Hastings algorithm. The parameter vector is divided into subvectors, and each of them is drawn conditional on the remaining subvectors. It turns out that with this scheme the probability of acceptance is equal to one and there are no rejections. The Gibbs sampler has been widely and successfully applied to problems in image processing where the number of unknowns is very large [124]. The MCMC sampling methods were further generalized to allow for sampling from parameter spaces corresponding to different models [148]. The new method, called the reversible jump MCMC sampler, jumps from one parameter space to another based on transition probabilities of another Markov chain. Once the sampler has converged, the time it spent in a specific parameter space is proportional to the posterior probability of the associated model. Thus, the reversible-jump MCMC can implement simultaneous Bayesian detection and estimation.
There are several important issues related to the use of MCMC methods. First of all, these methods are iterative, so the question of convergence is critical. Typically, the first N iterations are thrown away, a period called burn-in, and determining N demands convergence diagnostics. Of practical importance, too, is the stopping time of the chain. One would like to run the chain long enough to obtain desired accuracy. How many parallel chains to run is another important question. When there are many chains, their comparison may allow for easier determination of their convergence.

tion probabilities, it is often important to determine the filter coefficients, the input, and its model parameters from a set of distorted observations. It was shown that the estimates of all the unknowns in this problem can be obtained straightforwardly by Gibbs sampling [58]. In a different setting, blind deconvolution of Bernoulli-Gaussian processes was implemented by an MCMC approach in [96]. Optimal filtering is another area of major activity in signal processing. When the signal model is nonlinear or the signal is non-Gaussian the Kalman or extended Kalman filters can give very poor performance. A powerful alternative is to use Bayesian filters based on sequential importance or Gibbs samplings [52, 146, 391]. A signal-processing application of the Metropolis algorithm is parameter estimation of damped sinusoids [16]. When joint detection of sinusoids and the estimation of their parameters are of interest, the reversible jump MCMC has been successfully applied [7, 91]. MCMC methods have been successfully applied to the selection of model order of a time series [15, 22, 145, 417]. In some of these papers nonstandard assumptions were made, which make the direct estimation problem quite intractable. Analysis of mixed spectra within a hierarchical Bayesian framework was proposed in [53] via an MCMC algorithm. MCMC methods have also been very useful in enhancing speech and music signals, which are degraded by non-Gaussian noise characterized by impulses superimposed on a Gaussian background [144]. Finally, in addition to its extensive application to static images, Gibbs sampling has been employed in enhancement of degraded video images [210]. With the ever growing power of modern computers, MCMC methods are becoming very useful tools for tackling even the most difficult signal-processing problems. In the years to come these methods will become more sophisticated, efficient and accurate, undoubtedly leading to many new and effective signal processing algorithms.

WWW links relevant to the above section:
A WWW link to the author of the section: http://www.ee.sunysb.edu/~djuric/index.html
A WWW link to topics in the area of MCMC: http://www.stats.bris.ac.uk/MCMC/
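The Metropolis algorithm and the burn-in period described in this section can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and does not come from the original text: the toy model (observations equal to an unknown constant theta plus unit-variance Gaussian noise, with a zero-mean Gaussian prior), the function names, the proposal step size, and the chain length. The model is chosen because its posterior mean, the MMSE estimate, is available in closed form, so the sample average of the chain can be checked against it.

```python
import math
import random


def log_posterior(theta, data, prior_sd=10.0, noise_sd=1.0):
    """Un-normalized log posterior for the toy model y_i = theta + Gaussian noise."""
    log_prior = -0.5 * (theta / prior_sd) ** 2
    log_likelihood = -0.5 * sum(((y - theta) / noise_sd) ** 2 for y in data)
    return log_prior + log_likelihood


def metropolis(data, n_iter=20000, burn_in=2000, step=0.5, seed=0):
    """Random-walk Metropolis sampler; returns only the post-burn-in samples."""
    rng = random.Random(seed)
    theta = 0.0
    current = log_posterior(theta, data)
    samples = []
    for i in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)  # symmetric proposal distribution
        candidate = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio); 1 - random() lies in (0, 1].
        if math.log(1.0 - rng.random()) < candidate - current:
            theta, current = proposal, candidate
        if i >= burn_in:  # the first burn_in iterations are thrown away
            samples.append(theta)
    return samples


def exact_posterior_mean(data, prior_sd=10.0, noise_sd=1.0):
    """Closed-form posterior mean (MMSE estimate) for this conjugate Gaussian model."""
    precision = len(data) / noise_sd ** 2 + 1.0 / prior_sd ** 2
    return (sum(data) / noise_sd ** 2) / precision


def make_data(n=50, true_theta=2.0, noise_sd=1.0, seed=1):
    """Hypothetical observations used only for this illustration."""
    rng = random.Random(seed)
    return [true_theta + rng.gauss(0.0, noise_sd) for _ in range(n)]


if __name__ == "__main__":
    data = make_data()
    samples = metropolis(data)
    mmse = sum(samples) / len(samples)  # sample average approximates E[theta | y]
    print("MCMC estimate:", mmse, "closed form:", exact_posterior_mean(data))
```

In a realistic problem no closed form would be available, and the fixed burn-in length used here would have to be replaced by the convergence diagnostics mentioned above.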

MCMC Sampling for Signal Processing
MCMC methods have the potential to yield iterative solutions to many important but difficult signal-processing problems. An increasingly large number of papers on the subject have appeared in signal processing conference proceedings and journals. The following is a short list of recent contributions; for a more detailed overview, see [8]. One standard problem in signal processing is blind deconvolution of noisy data that represent an output of a linear system excited by an unknown input. For instance, if a system is modeled as an FIR filter with unknown coefficients, and the input is a hidden Markov model with discrete and known state space but unknown initial state and transi-

Time-Delay Estimation: Past, Present, and Future
Hagit Messer and Jason Goldberg, Tel Aviv University
Time-delay estimation (TDE), or time-of-arrival (TOA) estimation, is a basic tool in statistical signal processing. Applications of TDE follow from the simple relationship $\Delta r = v \cdot \Delta t$, where $\Delta r$ is the distance an object or a wavefield travels at some constant speed, $v$, over some time interval, $\Delta t$. For example, in range measurements for radar or sonar, $v$ is assumed known, and the target's range is determined by measuring $\Delta t$, the time required for the transmitted signal to propagate to a target and be reflected back to the point of transmission. Also, for velocity


measurements (e.g., biomedical [55, pp. 469-476], or nuclear engineering applications [55, pp. 363-366]), $\Delta r$ is assumed known, and $\Delta t$, the time required for a signal to travel the distance, $\Delta r$, is measured. TDE is also important for more complicated, nonlinear problems such as source direction of arrival (DOA) or bearing measurement. If (assuming free-space propagation conditions) the signal due to a source is received by two sensors separated by distance $l$, then the differential delay between the signals received by the sensors is given by $\Delta t = l \sin\theta / v$, where $\theta$ is the source DOA [55, pp. 403-409]. Determination of source DOA is often based on a measurement of differential time delay, $\Delta t$, or, for a narrow-band source, a measurement of differential phase shift, $\Delta\phi = \omega_0 \Delta t = (2\pi l / \lambda) \sin\theta$, where the source frequency is written in terms of the source wavelength as $\omega_0 = 2\pi v / \lambda$.

tween the cases of so-called "active" and "passive" TDE [55, pp. 442-448].
• In active TDE, where $x_1(t)$ and $x_2(t)$ correspond to the transmitted and received signals, respectively, it is appropriate to assume that $n_1(t) = 0$ and that $s(t)$ is a known, deterministic signal. It is well known that for the "nominal active scenario," where $n_2(t)$ is a realization of a white, Gaussian random process, the (asymptotically optimum) ML TDE processor is the matched filter, which cross-correlates $x_1(t)$ and $x_2(t)$. The estimate, $\widehat{\Delta t}$, is the time that corresponds to the maximum of the matched filter output.
• In passive TDE, where $x_1(t)$ and $x_2(t)$ are two versions of the received signal, it is usually assumed that $\alpha = 1$, $s(t)$ is a realization of a stationary, Gaussian random process, and $n_1(t)$ and $n_2(t)$ are realizations of mutually uncorrelated, zero mean, white, stationary, Gaussian random processes that are also uncorrelated with the signal. This well known "nominal passive scenario" was addressed and "solved" some 20 years ago. In particular, it was shown that the (asymptotically optimum) ML processor for this scenario is the generalized cross-correlator (GCC) [55, pp. 138-144]. As shown in Fig. 1, the GCC cross-correlates appropriately prefiltered versions of the sensor outputs, forming $\widehat{\Delta t}$ as the time corresponding to the maximum GCC output. Moreover, the GCC has been shown to be optimum even in nonasymptotic conditions [55, pp. 126-129].
Intuitively, an estimator for $\Delta t$ in Eq. (2) should seek the best "match" between $x_2(t)$ and a delayed version of $x_1(t)$. In both active and passive TDE, some form of cross correlation has been proven to be the optimum measure for matching under Gaussian conditions. Another possible measure for matching could be the "error signal," $e(t) = x_2(t) - x_1(t - \tau)$. An appropriate estimate of time delay, $\widehat{\Delta t}$, would be the $\tau$ which minimizes this error in some sense. It can easily be shown that for the Gaussian scenarios described above, the optimum TDE processor minimizes the mean-square error (MSE), $E[|e(t)|^2]$. However, straightforward manipulation of the expression for the MSE shows that the minimum MSE and the correlator-based processors are equivalent.

Fig. 1. The generalized cross correlator for passive TDE. For active TDE (when $s(t)$ is known and $n_1(t) = 0$), the matched filter is obtained by setting $H_1(\omega) = H_2(\omega) = 1$.
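To make the correlation-based delay estimate and the delay-to-bearing relationship $\Delta t = l \sin\theta / v$ concrete, here is a minimal sketch. It is only an illustration under stated assumptions: the plain cross-correlator corresponds to unit prefilters rather than a full GCC, the delay is restricted to a whole number of samples, and the function names, signal length, noise level, sampling rate, sensor spacing, and propagation speed are all hypothetical values chosen for the example.

```python
import math
import random


def simulate_sensors(n=400, delay=7, alpha=0.8, noise_sd=0.3, seed=0):
    """Return (x1, x2) with x1 = s + n1 and x2 = alpha * s(t - delay) + n2."""
    rng = random.Random(seed)
    s = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x1 = [s[t] + rng.gauss(0.0, noise_sd) for t in range(n)]
    x2 = [(alpha * s[t - delay] if t >= delay else 0.0) + rng.gauss(0.0, noise_sd)
          for t in range(n)]
    return x1, x2


def cross_correlation_delay(x1, x2, max_lag):
    """Lag (in samples) that maximizes sum_t x2[t] * x1[t - lag]."""
    n = len(x1)
    best_lag, best_value = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Only use indices where both x2[t] and x1[t - lag] exist.
        value = sum(x2[t] * x1[t - lag]
                    for t in range(max(0, lag), min(n, n + lag)))
        if value > best_value:
            best_lag, best_value = lag, value
    return best_lag


def doa_from_delay(delta_t, spacing, speed):
    """Invert delta_t = spacing * sin(theta) / speed; returns theta in degrees."""
    return math.degrees(math.asin(speed * delta_t / spacing))


if __name__ == "__main__":
    sample_rate = 48000.0  # Hz, assumed for illustration
    x1, x2 = simulate_sensors()
    lag = cross_correlation_delay(x1, x2, max_lag=20)
    theta = doa_from_delay(lag / sample_rate, spacing=0.1, speed=343.0)
    print("estimated delay (samples):", lag, "bearing (degrees):", theta)
```

Because the search is over integer lags, the resolution of the delay estimate is one sample; finer estimates would require interpolating the correlation peak.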

In practice, one seeks to measure the delay between two noisy versions of a signal (which itself may even be unknown). Unfortunately, there is no single measurement procedure appropriate for all TDE scenarios. This fact, combined with the practical importance of measuring time delay in so many different applications, is why TDE has received so much attention.

Basic TDE
The most general TDE problem is now formulated. The noise-corrupted signals received by the two sensors over some time interval can be modeled as:

\[ x_1(t) = s(t) + n_1(t), \qquad x_2(t) = \alpha\, s(t - \Delta t) + n_2(t) \qquad (2) \]

The problem to be solved is that of using the measured data to determine $\widehat{\Delta t}$, an estimate of the time-delay parameter, $\Delta t$. Depending on the application, different modeling assumptions are made on the signal waveform, $s(t)$; the noise waveforms, $n_1(t)$ and $n_2(t)$; and the parameters, $\Delta t$ and $\alpha$. It is convenient to distinguish be-

Advanced TDE
The intuitive basis on which both correlation and MSE-based processors match the two versions of the signal (Eq. (2)) leads one to believe that they may also function in "non-nominal" scenarios. However, if the assumed conditions under which these procedures were derived are violated, they are no longer optimum. This implies that better estimation performance can be achieved using other processors. Most of the TDE research carried out over the last 20 years has been devoted to the design and analysis of processors for non-nominal scenarios that are known to commonly arise in practice. Current and future TDE research focuses on both the theoretical and practical


issues associated with such scenarios. Some examples are given below:
• For a moving source and/or moving sensors, the delay to be estimated is time varying. Much research has been directed toward adaptive TDE, tracking, and the development of parametric techniques for simultaneously estimating delay and Doppler shift, e.g., [55, Part 4].
• The implementation of the GCC requires prior knowledge of the power spectra of the source signal and the noise processes. It is well known that the asymptotic delay estimation error does not increase if the spectra are unknown. The nonasymptotic error for the case where the source spectrum is known has been studied [267] using the Barankin bound. It has been shown that while the shape of the spectrum determines the threshold SNR (under which the estimation error is larger than the Cramer-Rao bound), prior knowledge of the shape yields negligible benefit in performance.
• Implementation of modern TDE processors requires the use of digital signal processing. The adaptation of classical TDE techniques to modern, software-based realizations raises both practical and theoretical problems. In particular, interpolation of the discrete delay estimate [55, pp. 343-350] and the effect of sampling on the achievable TOA error [18] have both been studied.
Perhaps the most important deviation from the nominal scenarios is that of non-Gaussian statistical scenarios. Some of the main results for non-Gaussian TDE are now reviewed.

Partially Correlated Gaussian Noise
Correlation-based TDE procedures are based on the assumption that the additive noise processes are uncorrelated. In cases where the signal, $s(t)$, is non-Gaussian while the additive noise processes are Gaussian (and possibly correlated), high-order statistics (HOS) techniques have been proposed for TDE [55, pp. 168-171], [166], [427]. Since the Gaussian components of the received data are attenuated in the high- (i.e., greater than second-) order cumulants and spectra, processing the data in these domains can result in improved TDE performance when the noise processes are highly correlated.

Independent, Non-Gaussian Noise
While non-Gaussian signal and Gaussian noise processes are successfully handled by HOS techniques, a different approach is required when the noise processes are non-Gaussian. In [362] passive TDE has been studied where the additive, non-Gaussian noise processes are assumed to be statistically independent. Fig. 2 shows that under such an assumption the GCC continues to function but that procedures better matched to the distribution of the noise statistics (e.g., ML) can improve performance.

Fig. 2. TDE performance vs. signal to noise ratio (SNR) in white, mixed-Gaussian noise (i.e., the noise probability density function is an average of two zero-mean Gaussian density functions of different variances). The ratio between the variances, $\alpha$, indicates the deviation from Gaussianity (the noise becomes "more Gaussian" as $\alpha$ approaches unity). Note that the leveling off of estimator performance as SNR is decreased is in fact an artifact introduced by the limited range over which the search for $\Delta t$ is carried out.

Multiple-Window Spectrum Estimates
David J. Thomson, Bell Labs
One of the basic problems in signal processing is to estimate the spectral density function, or power spectrum, from a finite data sample. Given $N$ observations, $x(t)$ for $t = 0, 1, \ldots, N - 1$, equally spaced in time at $\Delta t = 1$, how does one estimate the spectrum? As the inventor of multitaper estimates [398], I may have a biased opinion on the answer to this question, but, taking a direct quote from [394]:
"Spectral analysis has recently undergone a revolution with the development at Bell Labs of sophisticated techniques in which the data are multiplied in turn by a set of tapers which are designed to maximize resolution and minimize bias [Thomson 1982]. In addition to minimizing the bias while maintaining a given resolution, the multitaper approach allows an estimate of the statistical significance of certain features (such as spectral lines) in the power spectrum by comparing the character of the DFT's of different data windows. These techniques are now in routine use..."
To understand the origins of this process, remember that the first commonly used estimate of the spectrum was Schuster's periodogram, introduced in 1894. It is the square of the discrete Fourier transform of the observations, scaled by $1/N$ and, as an estimate of the spectrum, is both biased and inconsistent. In practice this means it gets


an unstable wrong answer. There are statements in the statistical spectrum estimation literature that, as a function of sample size, the periodogram is "asymptotically unbiased." Ignore these. Engineers learn that the only reason anyone goes to the trouble and expense of collecting more data is because they are going to ask more difficult questions, so one is always working with small-sample problems, not with asymptotics. In engineering data this bias can overwhelm the signal of interest; in [397] I showed data where the periodogram was in error by more than a factor of $10^{10}$ over most of the frequency range. The periodogram is inconsistent because its variance, approximately $E\{P(f)\}^2$, does not decrease with sample size and, in the cited example, was too large by a factor of more than $10^{20}$. Although these problems with the periodogram were known before World War II, many researchers persist in using periodograms. (Be cautioned that estimates that are based on the periodogram or raw DFTs have similar bias problems. Sample autocorrelations are just the Fourier transform of the periodogram so Blackman-Tukey, autoregressive, maximum-entropy, and other spectrum estimates that depend directly on sample autocorrelations should not be used. Similarly, estimates of the analytic signal derived from Hilbert transforms using unwindowed DFTs have periodogram bias.) A glance at the current literature on spectrum estimation theory and practice confirms that evolution is a slow process.
Fortunately, most signal processors follow the recipe in Tukey's 1966 paper [436] for computing an estimate of the spectrum: choose a suitable data window, $D(t)$, compute

\[ S_D(f) = \left| \sum_{t=0}^{N-1} x(t)\, D(t)\, e^{-i 2\pi f t} \right|^2 \qquad (3) \]

and smooth, often by convolving $S_D(f)$ with a second window. Good data windows give a much less biased estimate of the spectrum than the periodogram. Because $S_D(f)$ is the sum of two squares (the real and imaginary parts of the DFT at frequency $f$) it has a chi-squared distribution with two degrees of freedom. Thus $S_D(f)$ is still inconsistent and the smoothing part of the recipe is necessary to obtain a useful estimate. This estimate, however, poses a new problem: where, apart from John Tukey saying it was a good idea, did $D$, the data window, come from?

centration in this band is the best possible. These sequences are the discrete prolate spheroidal sequences, or Slepian sequences [375]. Now compute the eigencoefficients:

\[ y_k(f) = \sum_{t=0}^{N-1} x(t)\, v_t^{(k)}(N, W)\, e^{-i 2\pi f t} \qquad (4) \]

where $v_t^{(k)}(N, W)$ is the $k$th Slepian sequence with parameters $N$ and $W$ for $k = 0, 1, \ldots, K - 1$. From these the crudest multiple window spectrum estimate is

\[ \bar{S}(f) = \frac{1}{K} \sum_{k=0}^{K-1} |y_k(f)|^2 \qquad (5) \]

which is simply an average of $K$ estimates of the form (Eq. (3)) made using the same data but with different tapers. Because the different tapers are orthogonal, the different terms in Eq. (5) are approximately uncorrelated. Each contributes two degrees-of-freedom, so Eq. (5) has a $\chi^2$ distribution with about $4NW$ df so the estimate is consistent. In practice, an adaptively weighted average of the $|y_k(f)|^2$ is preferred to Eq. (5), see [398], [400], and [403].
The multiple-window theory explained the origins of the data window, or taper. One is finding an approximate solution of the integral equation connecting the observations and the spectral representation of the process. The windows are the eigenfunctions of the kernel. From this perspective, Tukey's estimate (Eq. (3)) was approximately the first term of the series solution (Eq. (5)). As part of this theoretical development an effort was made to separate the deterministic (commonly periodic) components of the process from the nondeterministic background, which the older estimates had lumped together. With multiple windows, detection of sinusoids is commonly done with an F-test, which is a ratio of the energy explained by a periodic component at frequency $f$ to the remainder of the energy in the frequency band $(f - W, f + W)$; see [398], [401], [400], [187], and [408]. In the original multitaper estimate, an approximate linear inversion of the integral equation was used, and the spectrum was obtained by local averages of its magnitude; quadratic-inverse theory [400] gives minimum-variance unbiased expansions of the spectrum and represents a step in the process of eliminating the dependence on the choice of the bandwidth $W$. This has been extended [402] to nonstationary problems.
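The averaging of several orthogonally tapered estimates, as in Eqs. (4) and (5), can be sketched in a few lines. Note the substitution: computing Slepian sequences requires an eigenvalue solver, so this sketch uses the sine tapers of Riedel and Sidorenko [334, 335], which this article describes as easier to compute, as a stand-in. The function names, the number of tapers, and the test signal are assumptions made for illustration, and the unweighted average is the crudest form of the estimate, not the adaptively weighted version.

```python
import cmath
import math
import random


def sine_tapers(n, k):
    """First k orthonormal sine tapers of length n (stand-in for Slepian sequences)."""
    scale = math.sqrt(2.0 / (n + 1))
    return [[scale * math.sin(math.pi * (j + 1) * (t + 1) / (n + 1))
             for t in range(n)]
            for j in range(k)]


def eigencoefficient(x, taper, f):
    """Tapered DFT of x at frequency f (cycles per sample), in the form of Eq. (4)."""
    return sum(x[t] * taper[t] * cmath.exp(-2j * math.pi * f * t)
               for t in range(len(x)))


def multitaper_spectrum(x, f, k=5):
    """Unweighted average of k single-taper estimates, in the form of Eq. (5)."""
    tapers = sine_tapers(len(x), k)
    return sum(abs(eigencoefficient(x, v, f)) ** 2 for v in tapers) / k


def make_signal(n=128, f0=0.2, noise_sd=0.1, seed=0):
    """Hypothetical test signal: one sinusoid at f0 plus white Gaussian noise."""
    rng = random.Random(seed)
    return [math.cos(2 * math.pi * f0 * t) + rng.gauss(0.0, noise_sd)
            for t in range(n)]


if __name__ == "__main__":
    x = make_signal()
    for f in (0.1, 0.2, 0.3, 0.4):
        print("f =", f, "estimate =", multitaper_spectrum(x, f))
```

Because the tapers have unit energy and are mutually orthogonal, each term of the average is a separate single-taper estimate of the same spectrum, which is what allows the averaging to reduce variance.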

Multlple-JPi1ttiOlP, or 7nultiplc-tapcr, spectrU111 esti111ates \vere introduced in [398 J in an atten1pt to correct 111a11)' of the shortc0111ings \'lith stan dard" spcctrU111 estin1atioll I."

When Do You Use Multitaper Estimates? Generally, ll1ultitaper 11lethods have beC0111e the estin1ate

procedures. Here one chooses an ~lnalvsis bandvvidth, J1I , for the esti111ate 0 < TV ~ 1-+ \vith NU/ :::: 4 or 5 a typical choice. The dilllcnsionalitv of J signal \vith band\vidth W and ~l tinle duration of N 'sanlple~ is !( = 2 NW. Because our goal is to estinlate the energy in a frequency band Cf - W., f"' + W) as ~lccllrately ~lS possible, \ve nlust choose the !( sequences of Juration N \vhose energy CO]1-

of choice tt)r serious spectrun1 estinlatioll problen1s, are becon1ing I.l.routine" in geophysics, [393 J") and, \vith an excellent text on the subject [307J and availability in MA TLAB, appear to be beco111ing so in other fields, [208, 238]. A quick survey of papers describing \vork llS-

61

study the relationship between atmospheric COl and global warming [193,218, 405, 406]~ the warming is rnostlv CO 2 and, 1110re disturbing, the seasonal cycle is also being disrupted by hU111an use of fossil fuels. To summarize: if you are estimating spectra, you should be using multitaper estimates. If the data is expensive, of limited duration, or if ditficult questions are being asked, multitaper estimates are mandatorv.

ing multitaper methods shows that 1110st of the early applications were scientific. Special windows [305], and combined time and space F-tests [235, 236] were developed for normal- mode seismologv, estimates of polarization [303 ], attenuation [186, 468] and other geophysical quantities [169,191,304,393, 394J. These, augnlented with coincidence tests, have been applied to processes with 1'1tan.y line COlllpOllcnts in [340 J and [ 408]. Theory and examples of coherence estimation and some multivariate applications are in [218]'1 [398], [407], and [442]. Several papers ([398J, [235], [237], [47-], [369], [337], [448], [153]., and [259]) find multitaper methods outperform "classical' alternatives. Because sample autocorrclations are just the Fourier transform of the periodogram (and hence undesirable), multitaper correlation estimates, the Fourier transform of Eq. (5), have been studied in [440] and [261]. For "traditional" stationary, Gaussian processes spectrum estimates have chi-square distributions. In practice, however, confidence intervals based on these were often wildly optimistic, and resarnpling estimates based on the jackknife [407J and bootstrap [472] are becoming cornmon, For a comparison of resampling methods, see [116]. For explicitly, non-Gaussian data, rnultiraper bispectrum estimates have been developed, [279, 399], and work well. (A few of the difficulties encountered with nonstationarv and non-Gaussian data are discussed elsewhere in this articlc.)

Time-Frequency Distributions in

Statistical Signal and Array Processing

Moeness G. Amin, Villanova University The 50th anniversary of the IEEE Signal Processing Society also marks 50 years since Ville [446] applied Wigncr distribution (WD) to signal analysis. The WI), which was the first distribution introduced in the context of quanturn mechanics [451], has paved the \vay to several key contributions to advances in the area of time-frequency distributions (Tl-Ds) as well as representations of signals with time-varying characteristics. These contributions have aimed at overcoming the drawbacks of the WI) and sought new, more effective tools for nonstationary signal analysis, synthesis, and processing. In the limited space provided, \VC will highlight only S0111e of the advances made in the time-frequcncv distributions over the past half a century with more emphasis on immediate than distant past contributions and on the articles that are relevant to themes embraced by the SSAP technical community. Sixteen years after Wigner had introduced his distribution, Cohen [69] provided a consistent set of definitions for a desirable class of TFDs, often referred as Cohen's class. This class has been of great value in guiding efforts in this area of research. Cohen's class of time-frequency (t ,(0) distributions for the signal x(t) nlay be presented in different forms, including

Variations on the Multiple-Window Theme

Using a different error norm for solving the integral equation, Riedel and Sidorenko [334, 335] developed a multitaper "sine" estimate. While the 60 dB range of these windows lacks the crushing sidelobe performance of the Slepian tapers, they are adequate for many applications and easier to compute. Other choices for tapers are discussed in [280] and [17], and efficient methods for computing the Slepian sequences are given in [400] and [150] and elsewhere. The theory has been extended to arbitrarily sampled spatial data [46, 243] but, for irregularly sampled time series, interpolation may be a viable alternative [409, 410]. Turning to nonstationary processes, multitaper spectrograms have been in use since shortly after the invention of multitaper estimates; see [401], [224], [191], [336], [312], and [164]. Multitaper wavelets [80, 85, 234] have also been used. For weakly nonstationary processes, quadratic-inverse theory [402, 403, 406] works well (the first few coefficients are nearly the time-derivatives of the spectrum) while, in violently nonstationary examples, estimates of the Loève, or two-frequency, spectrum [164, 263, 359, 398] often give more insight. Taking a singular value decomposition of a log multitaper spectrogram is often useful [401]. Narrowband tracking filters [223], projection filters [404, 405], and inverse-theory reconstructions [302] are in use. These have been used in a series of papers applying signal-processing methods to

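$$P(t, \omega; \phi) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \phi(t - u, \tau)\, x(u + \tau/2)\, x^{*}(u - \tau/2)\, e^{-j\omega\tau}\, du\, d\tau \qquad (6)$$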

Different distributions are obtained by selecting different kernels, $\phi(t, \tau)$. Both the WD (also known as the Wigner-Ville (WV) distribution) and the spectrogram are prominent members of Cohen's class. Extensive investigation of the desirable properties of a distribution and associated kernel requirements was done by Claasen and Mecklenbräuker [67]. Their three-part paper published in 1980 drew much attention to the limitations and offerings of the Wigner distribution and marked the first comprehensive treatment of the subject using familiar continuous and discrete signal-analysis methods. Other important contributions that have given a timely overall "big picture" view of the state of this dynamically changing and rapidly growing area in signal processing are the review articles by Cohen [70], Boashash [32], and


fluctuations brings about a clear manifestation of the signal signature in the time-frequency domain. Step-by-step design for kernels leading to reduced interference distributions (RIDs) was given by Jeong and Williams [183]. Indeed, the above two influential papers ([62], [183]) on the reduction of cross-terms through low-pass filtering in the ambiguity domain have made the technical community more attentive to the flexibility and generalization underlying Cohen's class of TFDs and have set the stage for a surge of activities in this area in the last decade. Other papers that have given valuable insights and important perspectives to TFDs include the
Hlawatsch and Boudreaux-Bartels [170]. In some respects, these articles have elaborated on a single-channel deterministic aspect of TFDs and did not fully address the problem from a statistical signal-processing perspective. The focus on deterministic signals stemmed from the fact that TFDs have clear and well-understood properties when dealing with noiseless and nonstochastic environments. Further, TFDs have been successfully applied to areas where signals are localizable in the time-frequency domain and have fixed distinct signatures that permit their classification and separation. Many of these applications are discussed in the book by Cohen [71] and also in the book by Qian and Chen [325]. The paper by Boudreaux-Bartels and Parks [37] was the first to recognize that by devising a method to synthesize the signal from the time-frequency domain, the WD may be cast as a tool for signal enhancement and noise suppression. In the case of a signal in additive white noise, the WD of the noise is scattered over the entire time-frequency domain, whereas that of the signal is confined to a much smaller region. If only the signal in that region is synthesized, the desired signal can be retrieved with reduced noise contamination and improved SNR. Several papers have since appeared in the literature as alternatives to the least-squares approximation technique employed by the synthesis method. For multicomponent signals, the cross-terms (also referred to as interference terms), which are introduced by the bilinear nature of the TFDs (Eq. (6)), intrude into the time-frequency regions containing true signal power concentrations, known as "auto-term regions." In such a case, and also in low-SNR environments, the signal auto-terms may not be identifiable using the WD, which renders signal classification and synthesis difficult and sometimes impossible.
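The cross-term behavior described above can be made concrete with a small numerical sketch. The code below is only an illustration under stated assumptions, not the formulation used in the cited papers: the function name `wigner_ville`, the record length, and the two test tones are hypothetical choices, and the discrete WD is computed with the simplest symmetric-lag product and no smoothing kernel. For two complex tones, the midpoint frequency column carries an interference term that, at the chosen time instant, is larger than either auto-term.

```python
import numpy as np

def wigner_ville(x):
    """Discrete Wigner-Ville distribution of a complex 1-D signal.

    Row n is the time index. Column k corresponds to the normalized
    frequency k / (2 * N) cycles per sample, because the lag product
    x[n + m] * conj(x[n - m]) advances the phase by two samples per lag.
    """
    x = np.asarray(x, dtype=complex)
    n_samples = x.size
    wvd = np.zeros((n_samples, n_samples))
    for n in range(n_samples):
        # Largest symmetric lag that stays inside the record.
        max_lag = min(n, n_samples - 1 - n)
        lags = np.arange(-max_lag, max_lag + 1)
        kernel = np.zeros(n_samples, dtype=complex)
        kernel[lags % n_samples] = x[n + lags] * np.conj(x[n - lags])
        # The lag product is conjugate-symmetric, so its FFT is real.
        wvd[n] = np.fft.fft(kernel).real
    return wvd

# Hypothetical two-component test signal (illustrative values only).
N = 128
n = np.arange(N)
tone_low = np.exp(2j * np.pi * (8 / N) * n)     # auto-term at column 16
tone_high = np.exp(2j * np.pi * (24 / N) * n)   # auto-term at column 48
wvd_two = wigner_ville(tone_low + tone_high)

mid = N // 2
# Columns 16 and 48 hold the auto-terms; column 32 holds the cross-term.
print(wvd_two[mid, [16, 32, 48]])
```

The cross-term oscillates along the time axis at the difference frequency of the two tones, which is why low-pass filtering in the ambiguity domain, the idea behind the reduced interference kernels, attenuates it while largely preserving the auto-terms.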
Choi and Williams [62] and Zhao, Atlas, and Marks [465] have proposed t-f kernels, which make such identification much more feasible than is attainable using the WD. The distributions corresponding to these two kernels have come to be known as the Choi-Williams (CW) and the ZAM TFDs. In both distributions, the kernel is characterized by one parameter whose value may be adjusted to achieve a tradeoff between resolution and cross-term suppression. A fully signal-dependent kernel was proposed by Baraniuk and Jones [12], where the kernel self-adapts its shape based on the underlying signal characteristics. In this respect, unlike both the CW and the ZAM TFDs, the signal-dependent

nal, x(n), is a collection of uncorrelated sinusoids with random time-varying amplitudes, namely:

$$x(n) = \int_{-\pi}^{\pi} A(n, \omega)\, e^{j\omega n}\, dZ(\omega)$$

Multiresolution Analysis

Hamid Krim, North Carolina State University, and Jean-Christophe Pesquet, UPS/LSS

Solutions to many engineering problems traditionally invoke (at least partly) spectral domain techniques. The simplicity often provided by the Fourier domain in analyzing a process is in large part due to an implied linearity and time-invariance (whether by assumption for tractability or otherwise), which yields complex exponentials as the eigenfunctions of the process. Equivalently, in a stochastic setting, stationary and linear processes have led to high-performance techniques for estimation and detection. The demand for increased performance in statistical signal applications and the emergence of a whole class of nonstationary problems have resulted in a new active area of research, namely time-frequency methodology (described in the section of this article by Amin) and, more recently, time-scale (multiresolution) techniques.

where $A(n, \omega)$ is the time-dependent amplitude and $Z(\omega)$ is an orthogonal increments process. For a given signal that admits this representation, the ES is defined as the magnitude squared of $A(n, \omega)$. The work in this area has successfully led to the generalization, estimation, and the linkage of ES to TFDs. For processes with restricted time-frequency correlation, referred to as underspread nonstationary random processes, it has been shown by Matz, Hlawatsch, and Kozek in [258] that the most popular definitions of time-varying spectra, such as the generalized WV spectrum and generalized ES, yield effectively equivalent results. The concept of underspread processes proves useful in analyzing Doppler shifts and fading communication channels using time-frequency structures. TFDs have also been examined for detection of nonstationary signals. Almost one decade ago, Flandrin [113] provided a coherent framework for WV time-frequency receivers. It was shown that classical receiver structures designed for optimum detection of Gaussian signals in Gaussian noise admit an equivalent formulation in the time-frequency domain. The work by Sayeed and Jones [350] has gone past the mere equivalence of classical optimum detectors to exploit the time-frequency structures in composite hypothesis tests where an optimum quadratic detector is implemented at each time-frequency point. Time-frequency distributions have recently found applications to radar and sensor-array processing. Barbarossa and Farina [13] have combined conventional space-time processing with TFDs and demonstrated the advantage of joint processing for clutter suppression and target detection. Belouchrani and Amin [23] have introduced the spatial time-frequency distribution and used it to solve blind source separation and direction finding problems. They derived the fundamental equation

$$D_{xx}(t, f) = A\, D_{ss}(t, f)\, A^{H}$$

Complementing Fourier Analysis Multiresolution Analysis

It is well known that many physical phenomena may be distinguished by characteristics present at different scales [254]. The statistics associated with these characteristics and/or their evolution across scales may provide unique and powerful signatures. The fractal structure of many natural phenomena, e.g., the coastline of a continent and the patterns on a tree leaf, may be viewed as the result of statistically self-similar patterns [254, 460]. The ubiquity of such phenomena in physical processes has called for a systematic and efficient methodology to capture the multiscale trend and preserve the statistics at the various scales for further processing. The wavelet theory [86] provided a powerful framework to meet this challenge and gained further prominence upon the ingenious connection made by Mallat [253] with the then well-known filter-bank theory [437, 443]. The orthonormal wavelet analysis afforded, in many cases, analytical tractability and, equally important, an efficient tracking of the statistical properties at different scales.

Scale Refinement

(7)

An additional scale refinement may be obtained by iterating the basic wavelet analysis on the coarse scales as well as the fine scales of the signals [72]. Referred to as the wavelet packet dictionary, this representation consists of an overcomplete set of functions, out of which an orthonormal basis is selected via an efficient dynamic programming procedure and an additive criterion [72]. Other extensions of the wavelet/wavelet packet decompositions are obtained by imposing some shift-invariance properties [308]. In the interest of space, we classify the multiresolution statistical developments into two major thrusts, the first of which is centered around the analysis with an impact

which relates the TFDs of the sensors to those of the sources. In Eq. (7), A is the spatial signature matrix. The elements of $D_{xx}(t, f)$ and $D_{ss}(t, f)$ are not the commonly used matrix correlation functions, but rather the self- and cross-TFDs of the sensors and sources, respectively. With their role in advancing knowledge, theory, and applications in the statistical signal and array processing area, TFDs have been established as an integral part of this area, and will remain so for many years to come.
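The relation in Eq. (7) follows from the bilinearity of the distribution, and a short numerical check may help make this tangible. The sketch below is an assumption-laden illustration rather than the authors' algorithm: it uses a discrete Wigner-Ville sum at a single time-frequency point, a noise-free mixing model x(t) = A s(t), and a hypothetical helper named `spatial_wvd_point`; the source waveforms and the 3-by-2 mixing matrix are arbitrary.

```python
import numpy as np

def spatial_wvd_point(signals, t, f, max_lag):
    """Matrix of auto- and cross-Wigner-Ville terms at one (t, f) point.

    signals has shape (channels, samples); f is in cycles per sample.
    Entry (i, k) is sum_m x_i[t + m] * conj(x_k[t - m]) * exp(-j 4 pi f m).
    """
    signals = np.asarray(signals, dtype=complex)
    lags = np.arange(-max_lag, max_lag + 1)
    phase = np.exp(-4j * np.pi * f * lags)
    ahead = signals[:, t + lags]
    behind = signals[:, t - lags]
    return (ahead * phase) @ behind.conj().T

rng = np.random.default_rng(0)
n = np.arange(128)
sources = np.vstack([
    np.exp(2j * np.pi * 0.10 * n),                    # stationary tone
    np.exp(2j * np.pi * (0.05 * n + 0.001 * n**2)),   # linear chirp
])
# Hypothetical spatial signature matrix A (3 sensors, 2 sources).
mixing = rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))
sensors = mixing @ sources   # noise-free model x(t) = A s(t)

d_ss = spatial_wvd_point(sources, t=64, f=0.10, max_lag=30)
d_xx = spatial_wvd_point(sensors, t=64, f=0.10, max_lag=30)

# Compare the sensor TFD matrix with A D_ss A^H.
print(np.allclose(d_xx, mixing @ d_ss @ mixing.conj().T))
```

With additive sensor noise the identity holds only approximately, which is one reason practical methods typically combine several time-frequency points.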

A WWW link to the author of the above section: http://www.ece.vill.edu/user/moeness


on compression and nonlinear estimation, and the second focusing on the multiresolution modeling aimed more at large-scale estimation and classification problems.

Multiscale Statistical Signal Analysis and Modeling

On the analysis front, the good localization properties of wavelets have played a key role in the development of various applications such as compression [253, 328] and signal reconstruction [94, 211, 253, 275, 347] (also referred to as denoising). The tree-like structure of the wavelet analysis framework has also led to efficient multiresolution stochastic modeling techniques with a remarkable impact on large-scale physics-based estimation and classification problems [19, 249].

Fig. 3. [Figure not reproduced; surviving caption text: "... hard, soft, as well as robust thresholding."]

Fractal Analysis

Fractal analysis has been a focus of interest in signal processing for almost two decades and continues to play an important role in application fields such as compression, analysis of turbulence, and communications. The characterizing scale-invariance of fractal signals, as noted earlier, makes wavelet analysis the tool of choice. The orthogonal wavelet representations of fractal processes entirely reflect the properties (statistical or otherwise) residing at different scales. Such analysis led to significant advances in fully and accurately identifying fractal and multifractal signals [179], and in other related synthesis problems of fractional Brownian motion [114, 460, 395, 257]. Using the parsimony-achieving potential of wavelets, other well-adapted analyses were proposed [84] for identification/synthesis of turbulent velocity signals. In particular, these were shown to provide a considerable simplification and usefulness for turbulence signals. Other applications where fractal signals have shown promise include biomedical, biochemistry [1], and communications signals [460]. Other multiscale analyses of related processes, such as nonstationary processes with stationary increments [49, 212], have resulted in stationarizing properties, thus allowing the application of classical statistical techniques and, equally important, a deeper understanding of other nonstationary parametric processes [212]. (Fractional Brownian motion is a special case of a nonstationary process with stationary increments.) A particularly interesting extension to two-dimensional signals also resulted in analyzing and synthesizing fields such as the ocean floor [314].

Signal Reconstruction

In an attempt to filter the observation noise from a signal, and using the ability of wavelets to concentrate signal energy along a few directions, researchers have tried to separate the directions containing signal energy from the orthogonal directions containing mostly noise energy. These noise directions are then discarded and the former used in the reconstruction. The development of this thrust is marked by three phases that took place in sequence.

Signal in Gaussian Noise: Using to advantage the energy compaction property of wavelets, Mallat and Hwang [253] first showed that effective noise suppression may be achieved by transforming the noisy signal into the wavelet domain and preserving only the local maxima of the transform for subsequent reconstruction. While such techniques performed amazingly well, a sound statistical formalism for the various deterministic thresholding techniques was first proposed by Donoho and Johnstone [94], who considered the observed signal

$$x(t) = s(t) + n(t)$$

with $t = 1, \ldots, K$, $n(t) \sim N(0, \sigma^2)$, and $s(t)$ unknown. They showed that a certain optimality was achieved by thresholding all wavelet coefficients of $x(t)$ below $T = \sqrt{2\sigma^2 \log_e K}$. Specifically, given that a wavelet basis is an unconditional basis for a great many smoothness spaces [253], they showed that the reconstruction error was within a scalar multiple of the minimum worst-case error over these signal spaces. Further developments ensued as other interpretations of signal enhancement or denoising were adopted. In [275], [347], and [211], the notion of coding was independently used to lead to algorithms with more or less the same type of nonlinear thresholding. The information-theoretic criterion of minimum description length (MDL), developed in the late 1970s [338, 365], proved to be very useful not only in providing the threshold, T, upon its minimization in the wavelet domain, but also in clarifying the relationships between the various wavelet properties. In Fig. 3, a summary of the nonlinear thresholding technique (otherwise referred to as hard thresholding) together with its various modifications is presented, which for the purpose of this overview are only worth noting. Other extensions in-


Bayesian Approach


The above thresholding approaches have been demonstrated to lead to good results in relatively moderate noise scenarios and have been successfully applied in a variety of settings. They are, however, based upon threshold values, which present two drawbacks: (1) they are directly dependent upon the noise variance without regard to the signal characteristics; and (2) they grow without bound with the data record length. To address these limitations, particularly when prior information about the underlying signal is available (quite reasonable in practice), a purely Bayesian approach was adopted in [309]. The prior knowledge about the signal in essence regularized the estimation problem and led to MAP estimates of the signal coefficients. In contrast to the previous approaches, this approach takes a more elaborate form, allowing one to account for probabilistic prior information one may have about the signal of interest. The Bayesian thresholding filter that results is independent of the data length, K, and is dependent only on the signal and noise variance. Interestingly, it has been shown that many thresholding rules may be included within this framework [445]. For instance, if the noise components are i.i.d. Gaussian and the signal components are i.i.d., zero-mean, and have a Laplacian distribution, a soft thresholding policy allows us to recover the signal. To better account for the expected sparsity of the components of the signal of interest (parsimonious wavelet representation), a prior equal to a Bernoulli-Gaussian distribution (which is a degenerate Gaussian mixture) in the presence of i.i.d. Gaussian noise leads to an estimate that is a tradeoff between a Wiener and a thresholding estimator [309].
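The link between a Laplacian prior and soft thresholding can be checked numerically for a single coefficient. The sketch below is a minimal illustration under assumptions that are not taken from [309] or [445]: the observation is y = s + n with Gaussian noise of variance `noise_var`, the prior density is proportional to exp(-|s| / laplace_scale), and the function names are hypothetical. Under these assumptions the MAP estimate minimizes (y - s)^2 / (2 noise_var) + |s| / laplace_scale, whose minimizer is a soft threshold at noise_var / laplace_scale; the code compares that closed form with a brute-force grid search.

```python
import numpy as np

def soft_threshold(y, threshold):
    """Shrink y toward zero by `threshold`, clipping at zero."""
    return np.sign(y) * np.maximum(np.abs(y) - threshold, 0.0)

def map_laplacian(y, noise_var, laplace_scale):
    """Closed-form MAP estimate for Gaussian noise and a Laplacian prior."""
    return soft_threshold(y, noise_var / laplace_scale)

def map_by_grid_search(y, noise_var, laplace_scale, grid):
    """Brute-force minimizer of the negative log-posterior over `grid`."""
    cost = (y - grid) ** 2 / (2.0 * noise_var) + np.abs(grid) / laplace_scale
    return grid[np.argmin(cost)]

grid = np.linspace(-6.0, 6.0, 12001)   # candidate values, step 0.001
for y in (-3.0, -0.4, 0.0, 0.7, 2.5):
    closed_form = map_laplacian(y, noise_var=1.0, laplace_scale=2.0)
    brute_force = map_by_grid_search(y, 1.0, 2.0, grid)
    print(y, closed_form, brute_force)
```

Note that the resulting threshold depends on both the noise variance and the prior scale, and not on the record length, which matches the property of the Bayesian filter stated above.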

Fig. 4. A least favorable noise distribution: Gaussian in the middle and Laplacian on the tails.

clude techniques that account for a simple correlation structure among the coefficients, which, if it agrees with the signal structure, should lead to better performance [79]. This, however, also results in a higher computational cost. A statistical approach to optimize a process representation in a wavelet packet set that also led to denoising techniques has been proposed in [211], [213], [309], and [95].
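The basic recipe that these extensions build on, transforming to the wavelet domain, discarding coefficients below a threshold that scales with the noise level and the logarithm of the record length, and inverting, can be sketched in a few lines. The example below is a simplified illustration and not the procedure of [94]: it assumes a Haar wavelet, a record length that is a power of two, a known noise standard deviation, and hard thresholding of the detail coefficients only; all function names and the piecewise-constant test signal are hypothetical.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def haar_forward(x):
    """Full orthonormal Haar decomposition (length must be a power of two)."""
    approx = np.asarray(x, dtype=float)
    details = []                      # ordered from finest to coarsest scale
    while approx.size > 1:
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / SQRT2)
        approx = (even + odd) / SQRT2
    return approx, details

def haar_inverse(approx, details):
    """Invert haar_forward by rebuilding from the coarsest scale up."""
    for detail in reversed(details):
        rebuilt = np.empty(2 * approx.size)
        rebuilt[0::2] = (approx + detail) / SQRT2
        rebuilt[1::2] = (approx - detail) / SQRT2
        approx = rebuilt
    return approx

def denoise_hard(x, sigma):
    """Hard-threshold the detail coefficients at sigma * sqrt(2 ln K)."""
    threshold = sigma * np.sqrt(2.0 * np.log(len(x)))
    approx, details = haar_forward(x)
    kept = [d * (np.abs(d) > threshold) for d in details]
    return haar_inverse(approx, kept)

rng = np.random.default_rng(1)
K, sigma = 1024, 1.0
clean = np.where(np.arange(K) < K // 2, 4.0, -2.0)   # piecewise-constant signal
noisy = clean + sigma * rng.standard_normal(K)
denoised = denoise_hard(noisy, sigma)

mse_noisy = np.mean((noisy - clean) ** 2)
mse_denoised = np.mean((denoised - clean) ** 2)
print(mse_noisy, mse_denoised)
```

The test signal is deliberately favorable to the Haar basis (its energy sits in two coefficients), so the improvement seen here should not be read as typical; in practice the noise level would also have to be estimated from the data.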

Multiresolution Modeling

Multiresolution modeling first appeared in the context of data compression [253]. Its recent increased prominence is largely due to the computational efficiency of filter-bank implementations. This led to a multitude of applications, and one of particular interest herein is the clever recasting of time-recursive filtering into scale-recursive filtering [19]. This led to modeling algorithms with high efficiency as a result of the tree-like structure of the analysis. The generality of this modeling technique comes from the fact that the nodes may be interpreted differently depending on the application [249] and from the appeal of easily implementable algorithms. Other extensions included the introduction of time dynamics [174] with a variety of applications. One interesting application is to groundwater hydrology, where measurements are in fact multiresolution over space, resulting in very efficient data-fusion algorithms and subsequent estimation procedures [83]. Another application is to ocean height estimation with sparse satellite measurement data, where multiresolution methods could be implemented with unprecedented efficiency and accuracy, creating a ground-breaking milestone in computational oceanography [112]. Other applications include synthetic aperture

Beyond Normality

While the Gaussian noise assumption may be valid in a number of applications, it certainly does not hold in certain environments; for instance, those encountered in an industrial setting. The Gaussian assumption is thus generally limiting, particularly when we have no knowledge about the prevailing noise. An approach inspired by Huber's early work [358] is due to Schick and Krim [358], who, by obtaining the minimum maximal description length (DL) of a data sequence in the wavelet domain, derived a robust nonlinear filter to cope with non-Gaussian scenarios. The minimax DL filter resulted from a mixture noise as shown in Fig. 4 and led to a thresholding rule shown in Fig. 3.


switch to a decision-feedback mode where previously equalized and quantized (according to the alphabet) symbols are used together with the received data to update channel estimates [24, 321, 323]. Training sequences consume bandwidth, and consumption increases if frequent re-training is required to avoid erroneous convergence of decision-feedback equalizers (DFEs), which occurs when the propagation channel is time-varying; see, e.g., [326] and [90]. To obviate training and thus utilize bandwidth efficiently, self-recovering (a.k.a. blind) algorithms have received attention over the last dozen years for identifying the channel or estimating the equalizer directly using output data only, a feature also important when information transmission cannot be interrupted for training, as, for example, in broadcasting and multicasting scenarios [26, 117, 143]. The success of blind methods in a communications context depends on maximum exploitation of input features such as whiteness, non-Gaussianity, finite alphabet, constant modulus, and cyclostationarity, properties that equalizers are
radar (SAR) image classification, compression [207], as well as the analysis of self-similar processes [112]. A WWW link to the author of the above section: http://www.ece.ncsu.edu/people/faculty/bios/ahk.html

Channel Estimation and Equalization

Georgios Giannakis, University of Virginia

Communications and, in particular, channel estimation and equalization are areas that offer a fertile ground for statistical signal-processing tools and algorithms. Information sources are mapped to (generally complex) symbols that take values from a finite alphabet and are thus non-Gaussian signals. They undergo distortions that introduce intersymbol or interchannel interference (ISI or ICI) as they propagate through channels before being received by single or multiple sensors in noise. Receiver noise is narrowband and hence is modeled well as additive Gaussian, although impulsive models appear also with ambient and atmospheric noise sources in underwater acoustic and radio communications. ISI and ICI arise due to bandlimited transmit- and receive-filters, amplifiers, delay- and multipath-propagation, relative transmitter-receiver motion, coupling effects, and multiple access interference (MAI) [323]. Depending on the transmission rate, the propagation conditions, and the number of transmitters and receivers, the complex discrete-time equivalent baseband channels can be: (1) deterministic or random constants over one or more information symbols (modeling flat fading effects); (2) single-input single-output (SISO) linear time-invariant (LTI) FIR filters (accounting for frequency-selectivity); (3) multiplicative sequences (modeling time-selectivity); (4) linear time-varying (LTV) filters with random or deterministic coefficients (capturing fast fading effects); (5) nonlinear FIR filters of the Volterra type (modeling saturation nonlinearities of power amplifiers); (6) multi-input multi-output (MIMO) filters (for multiuser scenarios); or possible combinations of (1)-(6).
Equalizers, on the other hand, undo channel effects to recover the transmitted sequence and, depending on complexity versus performance tradeoffs, they can be: (1) linear or nonlinear; (2) FIR or IIR; and (3) batch for block-by-block equalization, or adaptive for efficient online processing and tracking of slowly varying channels. Channel estimation, equalization, and symbol recovery algorithms are founded on detection-estimation and system identification principles, and their advances parallel and cross-fertilize ideas in diverse signal-processing applications including seismic deconvolution, sonar de-reverberation, image restoration, signal reconstruction, time-series modeling, and various inverse problems involving dispersive media [162]. Estimation of channels using the received (output) samples can be viewed as an input-output system identification problem if a known training sequence (input) is transmitted during the acquisition mode. In the operational stage, receivers usually

equalizing channels with unit-circle zeros, and their asymptotic performance has been shown recently to approach that of maximum likelihood sequence estimation (MLSE) [66, 323]. Given a channel estimate, the latter is implemented using Viterbi's algorithm [24] and is the optimum means of recovering the finite-alphabet input, although its practical use is limited due to its complexity, which, especially with multiuser communications, increases exponentially in the channel order and the number of users [441]. Simple non-frequency-selective FIR channels introduce constant amplitude and phase distortions that can be compensated with automatic gain control (AGC) and differential encoding, respectively (see, e.g., [323]). More recently, it has been shown that even polynomial phase distortions arising from relative transmitter/receiver motion can be mitigated with generalized differential encoding [142]. Carrier and timing synchronization are, in principle, frequency and time-delay estimation problems, respectively, and can be solved using ML [323] or cyclic methods with fractional sampling [141]. Correct timing acquisition affects performance considerably, especially for code-division multiple access (CDMA) systems entailing asynchronous users, and a variety of methods (including subspace approaches) have been proposed recently (see, e.g., [383] and [25]). Blind equalizers of FIR frequency-selective channels do not need to acquire timing because they absorb it into the channel itself. In addition, methods that rely on the constant-modulus (restoral) algorithm (CMA) do not require frequency offset estimation [143, 416]. CMA is basically a HOS-based technique equivalent to the Shalvi-Weinstein algorithm (SWA) [90, 370].
By constraining the linear equalizer's output, CMA reduces the variance of HOS-based techniques and estimates the equalizer directly, as opposed to most parametric and nonparametric HOS-based approaches that estimate the channel first (via linear equation or nonlinear matching methods [233, 430]) and next equalize using either: (1) the computationally complex Viterbi algorithm; (2) the so-called zero-forcing equalizer (ZFE), which is nothing but a truncated version of the inverse channel [248]; or (3) the MMSE (a.k.a. Wiener) equalizer that assumes knowledge of the SNR to obtain a regularized inverse [317, 323]. Direct adaptive linear equalizers of the CMA/SWA type are particularly attractive computationally, but convergence and speed may be problematic, especially with channel roots on (or close to) the unit circle, a concern that is alleviated with fractionally sampled versions of the originally developed symbol-spaced CMA/SWA [90, 232]. MMSE equalizers or ZFEs can be used to initialize the CMA or other nonlinear schemes whose convergence depends crucially on initialization. CMA can also be invoked in a DFE mode to improve performance, especially with channel nulls. Although interesting preliminary results have appeared,

response information (conveyed by SOS) with complete phase response information, which allows equalization of nonminimum phase channels [131, 157, 370, 430]. However, HOS exhibit high variance, and channel variations may violate the stationarity assumption as the receiver collects the long records required for reliable HOS estimation. With sufficient excess bandwidth at the transmitter, most FIR channels can be estimated and equalized blindly using second-order cyclic statistics (SOCS) that become available when one oversamples (or fractionally samples) the continuous-time received signal at a rate higher than the symbol rate [122, 129, 130, 413, 414]. The resulting time series is cyclostationary, and the redundancy introduced renders the LTI SISO model equivalent to either a linear periodically time-varying (LPTV) SISO model or an LTI single-input multi-output (SIMO) model, which can be characterized by multivariate stationary SOS. SIMO models are also applicable, even when symbol-rate samples are collected by multiple receive-antennas [276], and the excess-bandwidth condition adopted to introduce diversity now translates to sufficient sensor separation in order to guarantee channel disparity (co-primeness of the SIMO transfer functions). An important consequence of the LPTV-SISO and LTI-SIMO structures under such time- or space-diversity conditions is that LTI FIR channels can be equalized exactly by FIR equalizers of the same or greater order; see, e.g., [376]. Recall that LTI-SISO channels with zeros on the unit circle cannot be inverted, and even with MMSE (Wiener) equalization, performance drops due to noise amplification at the frequencies of the channel nulls. The LTI-SIMO equalization property is analogous to the perfect reconstruction encountered with multirate analysis-synthesis FIR
filterbanks [437], and it has practical implications in communications even when zeros are close to the unit circle: (1) truncated FIR equalizers of FIR channels need not be excessively long to approximate ideal IIR behavior; (2) with appropriate initialization, adaptive equalization algorithms can be globally convergent (at least in high-SNR environments) [232]; and (3) so long as the input sequence is persistently exciting (a minimal condition for identifiability), it can be colored or even deterministic, which is important for coded sequences that are nonwhite; see, e.g., [276] and [462]. Recent work focuses on inducing cyclostationarity or diversity at the transmitter by means of periodic modulating sequences or redundant filterbank precoders [60, 128, 353, 368, 424]. Equalization exploiting transmit-diversity is very promising because, unlike fractionally sampled equalization (FSE), it imposes no channel disparity conditions, is applicable to nonwhite and deterministic inputs, and shows robustness to channel order overestimation and additive stationary colored noise [368, 424]. At the expense of added complexity and possibility of divergence, DFEs offer a practical alternative to


the convergence of FIR CMA-FSEs and their DFE versions in the presence of noise is not fully understood. Fading channels appear with mobile cellular telephony, temperature and salinity variations in underwater environments, and ionospheric fluctuations in microwave links, where variations of short coherence time cause runaway effects in adaptive tracking algorithms. They are modeled as LTV FIR filters with the average extent (delay-spread) of the multipath defining channel memory and degree of frequency-selectivity, and with the so-called Doppler-spread accounting for the average channel variation and measuring time selectivity [323]. The latter can also arise due to oscillator drifts and relative motion that manifest themselves as multiplicative noise when frequency selectivity is negligible. In general, LTV channel taps are modeled as uncorrelated stationary random processes that are assumed to be low-pass, Gaussian, with zero mean (Rayleigh fading) or nonzero mean (Rician fading) depending on whether line-of-sight propagation is absent or present [323]. Correlations of the unknown taps capture average propagation characteristics and are used to track the channel's time evolution using Kalman filtering estimators. The challenging task of estimating random channel parameters using training data or decision-feedback has been addressed in [425]. Blind approaches for random coefficient fading channels are yet to be developed. However, blind methods adopting deterministic finitely parameterized LTV models have been proposed recently using a basis expansion [133, 421]. They turn LTV-SISO models to LTI-MIMO structures with inputs formed by modulating the transmitted sequence with the bases. Fourier bases are well motivated for modeling rapidly fading mobile communication channels when multipath propagation caused by a few dominant reflectors gives rise to (Doppler induced) linearly varying path delays.
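The way a Fourier basis expansion converts a time-varying channel into time-invariant branches driven by modulated inputs can be verified with a short numerical sketch. Everything specific below is an assumption for illustration rather than a detail from [133] or [421]: the channel order, the three basis frequencies, the random expansion coefficients, and the QPSK-like input are hypothetical, and a single receive channel is used for brevity (so the equivalent structure here is multi-input, single-output). The time-varying tap is h(n; l) = sum_q c[q, l] exp(j 2 pi f_q n).

```python
import numpy as np

rng = np.random.default_rng(2)
n_symbols, order = 200, 3                   # taps l = 0, ..., order
doppler = np.array([-0.01, 0.0, 0.02])      # hypothetical basis frequencies (cycles/sample)
coeffs = (rng.standard_normal((doppler.size, order + 1))
          + 1j * rng.standard_normal((doppler.size, order + 1)))
symbols = rng.choice([1, -1, 1j, -1j], size=n_symbols)
n = np.arange(n_symbols)

# Direct LTV-SISO output: y[n] = sum_l h(n; l) s[n - l], zero initial state.
bases = np.exp(2j * np.pi * np.outer(doppler, n))   # shape (Q, N)
taps = coeffs.T @ bases                             # shape (order + 1, N)
y_ltv = np.zeros(n_symbols, dtype=complex)
for lag in range(order + 1):
    y_ltv[lag:] += taps[lag, lag:] * symbols[:n_symbols - lag]

# Equivalent LTI form: branch q filters the input modulated by basis q
# with the fixed taps c[q, l] * exp(j 2 pi f_q l).
y_lti = np.zeros(n_symbols, dtype=complex)
for q, freq in enumerate(doppler):
    modulated_input = symbols * bases[q]
    lti_taps = coeffs[q] * np.exp(2j * np.pi * freq * np.arange(order + 1))
    y_lti += np.convolve(modulated_input, lti_taps)[:n_symbols]

print(np.allclose(y_ltv, y_lti))
```

Because each branch is time-invariant, tools developed for LTI multichannel identification and equalization become applicable once the basis frequencies are known or estimated.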
Doppler frequencies can be estimated blindly using cyclic statistics, and channel orders can be determined 6"On1 rank properties of a received data matrix [4221. When channel (or Doppler) diversity is complemented by temporal or spatial diversity (available with ovcrsarnpling or multiple antennas) blind estimators of L1"'\/ channels along with direct equalizers become available even with minimal (persistence-of-excitation) assumptions .ibou t the input and the bases [133 J. Multivariate LTl ZFEs lend themselves to adaptive algorithnls that provide fine tuning for possible 1110del mismatch of the bases that capture only the nominal part of the rapidly EH.iing channel (see [133 J for a tutorial treatment and [422 J for blind HOS LTV channel estimators .uid DFEs). l\IIIl\IIO channel equalization is a major challenge in 111 ulti ple-acccss wire less CO 111111 un ications bcca use rnulripath introduces multiple-access interference (1V1A1)., \vhich linlits systenl capacity and bit-error-rate (13E1\_) perfOrll1i:U1ce. LO\V-COll1plexity Cl)M_A syste111S able to cope \vith (perhaps un kno\vn and tinle-varying) nlultipath i:1nd equipped with self-recovering capabilities

Unfortunately, there is no single measurement procedure appropriate for all TDE scenarios.

are most desirable because they are versatile in variable data rates and fading (e.g., mobile) environments. They also relieve the need for power-control and bandwidth-consuming training sequences. Linear suboptimum equalizers with training that either suppress MAI completely (a.k.a. zero-forcing (ZF) or decorrelating receivers) [250], or their MMSE [177] and minimum-output energy [423] counterparts, offer a compromise between the high-complexity ML solution [441] (that assumes knowledge of all system parameters) and the matched-filter (MF) multiuser demodulators that are not only known to suffer from near-far effects but also exhibit an error floor in their BER due to MAI. Additional approaches include multistage adaptive demodulators, DFEs [98], and spatial combiners (RAKE receivers that in fact are nothing but inverses of multivariate channels that are assumed to have been estimated) [323]. Frequency-selective multipath induces interchip interference. Simultaneous incorporation and mitigation of asynchronism and multipath effects were reported recently in [420] and [423] using a multirate equivalent discrete-time model (see also [458] and [459]). Blind approaches are well motivated when high-rate communication protocols entail small data packets (e.g., in distributed networks and wireless PCS prototypes) or when the propagation medium is rapidly varying (e.g., in large cells with considerable delays and high data rate time-varying wireless environments). Such self-recovering CDMA receivers were proposed recently in [420], [473], [383], [241], [25], and [419]. They capitalize on code diversity offered by the user code(s) and the received data but have relatively high complexity, especially because they adopt subspace decomposition (via the singular-value decomposition (SVD)) of large matrices for signature waveform estimation.
Inverse filtering criteria and recursive least-squares (RLS) and least mean-square (LMS) algorithms that include multipath were reported recently in [419] and [426] by viewing self-recovering CDMA demodulation as a blind beamforming problem. Interesting directions for mitigating MAI, asynchronism, and multipath by judicious code design at the transmitter include the blind Lagrange-Vandermonde CDMA transceivers of [352] and the emerging multicarrier CDMA systems, both of which turn frequency-selective multipath into flat fading [105, 471] (see also [349] for wavelet-based codes that produce graceful degradation, especially with oversaturated CDMA systems).
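The contrast drawn above between matched-filter and decorrelating (ZF) multiuser receivers can be illustrated with a small numerical sketch. The example below is not taken from the cited references; it assumes a synchronous two-user system without multipath, and the signature codes, amplitudes, and noise level are hypothetical values chosen so that the second user is 20 dB stronger (a near-far situation).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical unit-energy signature codes (columns); cross-correlation is 0.5.
S = np.array([[1, 1], [1, 1], [1, 1], [1, -1]], dtype=float) / 2.0
A = np.diag([1.0, 10.0])        # user 2 is 20 dB stronger than user 1
sigma = 0.1                     # noise standard deviation per chip (assumed)
n_sym = 4000

b = rng.choice([-1.0, 1.0], size=(2, n_sym))        # BPSK symbols of both users
r = S @ A @ b + sigma * rng.standard_normal((4, n_sym))

# Matched-filter bank: correlate the received chips with each signature.
y_mf = S.T @ r

# Decorrelating (zero-forcing) receiver: invert the code correlation matrix.
R = S.T @ S
y_dec = np.linalg.solve(R, y_mf)

# Bit-error rates for the weak user (user 1).
ber_mf = np.mean(np.sign(y_mf[0]) != b[0])
ber_dec = np.mean(np.sign(y_dec[0]) != b[0])
print(f"weak-user BER: matched filter {ber_mf:.3f}, decorrelator {ber_dec:.3f}")
```

In this setup the matched-filter output for the weak user is dominated by the strong user's symbol, whereas the decorrelator removes the interference completely at the cost of some noise enhancement.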


justified by appealing to the Central Limit Theorem. The pdf is tractable, and algorithms derived under the Gaussian assumption are usually simple (linear/closed-form). In contrast, non-Gaussian signal processing typically involves nonlinear processing. In this section, we will look at recent trends in the modeling of non-Gaussian processes: finite-variance and infinite-variance processes, fractal point processes, and multiplicative noise models.

The plethora of propagation conditions and requirements for transmitter-receiver constraints in terms of complexity, transmit-energy, bandwidth, SNR, and performance specifications (MMSE or BER), offer numerous possibilities for statistical signal-processing algorithms. Digital communication systems generally entail man-made components and are a "paradise" for signal-processing research and development because they provide considerable flexibility to the SP designer. At the same time, SP algorithms must be tuned to the often

Finite Variance Models

Non-Gaussian data are encountered in various fields, such as agriculture (the Fisher Iris data) and economics (unemployment data) [6]; astronomy (sunspot data) and biology (Canadian lynx data) [322], music ("average

strict specifications of communication systems and standards. A number of interesting directions open up for future research: (1) maximum exploitation of available information and communication constraints (semi-blind approaches along the lines of [S6] and [147] offer promising directions for linear equalization); (2) if there is a choice for inducing diversity in the input, the channel, or the received output, it appears that input (i.e., transmit-) diversity in the form of short training sequences, modulation, codes, or filterbanks is to be preferred and optimal transceivers should be designed along the lines of [463] (see also [351]); (3) with increasing interest toward low-power communications and nonconstant modulus transmissions (e.g., OFDM or downlink CDMA in general) pre- or postmitigation of power-amplifier nonlinear distortions is necessary using training or self-recovering receivers (see [132] and references therein for steps in this direction); (4) convergence studies of adaptive CMA and DFE algorithms in realistic noise environments [90]; (5) performance analysis of channel estimators, especially when model perturbations due to synchronization effects and Doppler frequency drifts are present; (6) BER evaluation of ZF equalizers and experimental comparisons with the MSE equalizers in SISO, SIMO, and MIMO structures; (7) diversity techniques for blind identification of random coefficient models and performance comparisons with the basis expansion models using real data; (8) development and performance analysis of low-complexity blind multiuser equalizers; (9) joint design of equalizers with channel encoders and interleavers; (10) exploitation of network protocol structures from higher layers (e.g., ATM) for designing equalizers at the physical layer (see [11] along these lines); (11) balanced combination of the various possible diversity-inducing factors (e.g., codes, channels, fractional sampling, antennas) for blind channel estimation and equalization of general MIMO channels under minimal but realistic identifiability assumptions (see [373] for preliminary steps in this direction).

kurtosis of music has been steadily increasing") [149], exploration seismology, radar, sonar, speech, and image and communication signals. Experimental measurements show that ambient noise is often significantly non-Gaussian, particularly in urban and radio channels [270] and underwater acoustic channels [45, 269]. In communication channels, multiple user interference is highly structured and non-Gaussian. Of course, nonlinearities usually lead to non-Gaussian outputs. In general, the multivariate pdfs of non-Gaussian processes are intractable, and with few exceptions there are no general models. In the non-Gaussian context, linear techniques and second-order statistics are not merely suboptimal but are, in some cases, incapable of providing acceptable performance (e.g., multichannel processes and source separation). Since for many problems the Gaussian environment is the least favorable, exploitation of non-Gaussian features can lead to significantly improved performance, although this usually involves nonlinear signal processing [199]. Optimal estimation/detection may be feasible if the multivariate pdfs are analytically known (and tractable). If we have access only to training data (or if the signal is weak), the noise pdf can be estimated using various approaches (kernel density approaches, stabilized histogram estimators, type-based estimators, etc.). Alternatively, one may use a parametric model, such as the Gaussian mixture model, whose parameters can be efficiently estimated via the EM algorithm [260, 273]. Depending upon the application, the (multivariate) pdf estimators may not be satisfactory unless sufficient data are available, and the resulting signal estimators may be highly nonlinear.
Since non-Gaussian processes are not completely characterized by their first and second-order moments, higher-order statistical descriptors, such as the higher-order moments and cumulants, and their Fourier transforms, the moment and cumulant spectra, are required. Consequently, in the last two decades, a lot of attention has centered around HOS [264, 291, 292, 315, 320, 387]. HOS provide a parsimonious (but generally incomplete) characterization of the non-Gaussian process and are multidimensional statistics; for example, the fourth-order moment of a stationary random process is a


A WWW link to the author of the above section: http://watt.seas.virginia.edu/~gg4w/

Non-Gaussian Processes

Ananthram Swami, ARL

The Gaussian distribution enjoys a central place in statistical signal processing; the Gaussian assumption is often


function of three indices: $m_{4x}(i, j, k) := E\{x(n)\,x(n+i)\,x(n+j)\,x(n+k)\}$. HOS have been used to provide tractable solutions to various "nonproblems" in signals and systems: non-Gaussianity, nonminimum phase, noncausality, nonlinearity, nonreversibility, nonadditivity, and nonstationarity. Sampling a continuous-time minimum-phase linear process usually renders the discrete-time process nonminimum phase (NMP), and NMP signals are encountered in frequency selective communication channels. Nonlinearities are encountered, for example, in high-power amplifiers in communication satellites operating near the saturation point, interactions in ocean waves, magnetic recording channels, and scattering phenomena in radar and sonar. The history of HOS can be traced back to Fisher and the seminal work of statisticians in North America and Eastern Europe: Brillinger, Kolmogorov, Leonov, Rosenblatt, Shiryaev, Sinai, and Tukey. The interest of the signal processing and systems communities perked up in the early 1980s, due partly to a US-ONR funded initiative in non-Gaussian processes [449]. Subsequently, there have been five biannual international workshops and several special issues of the Transactions on Signal Processing devoted to the subject; a comprehensive bibliography may be found in [387]. The success of HOS-based methods depends clearly on the amount of non-Gaussianity and nonlinearity of the underlying processes and models; hence, tests of Gaussianity and linearity are important. HOS-based methods typically entail an increase in dimensionality, computational load, and statistical variance of the sample estimators. The potential need for longer data records also cautions one to test for stationarity. The cumulants of a stationary random process (or of a random vector) can be represented either as tensors or as m-D matrices.
Several results in the scalar case can be generalized to the vector case by replacing scalar multiplications with Kronecker products [386]. Problems related to rank, eigenvector decomposition, diagonalization, etc., are largely unsolved, but some interesting results may be found in [75, 76], where relationships with the theory of homogeneous multivariable polynomials are established. Even when slices or projections are used, the resulting 2D matrices are generally not symmetric, so that the general nonsymmetric eigenvector problem is involved. Other relatively virgin areas of HOS research include: extensions of HOS to nonstationary processes, truly adaptive algorithms, efficient algorithms for nonlinear system analysis, robust estimation algorithms, applications in point processes, synchronization and multiuser problems in communications, and performance analyses of algorithms.
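As a concrete illustration of the fourth-order moment defined above, the following sketch computes a sample estimate for nonnegative lags, together with the excess kurtosis (a normalized zero-lag fourth-order cumulant) as a simple indicator of non-Gaussianity. The function names, the sample size, and the choice of Laplace-distributed test data are illustrative assumptions, not part of the original text.

```python
import numpy as np

def fourth_order_moment(x, i, j, k):
    """Sample estimate of E{x(n) x(n+i) x(n+j) x(n+k)} for lags i, j, k >= 0."""
    x = np.asarray(x, dtype=float)
    n = len(x) - max(i, j, k)          # number of usable products
    return float(np.mean(x[:n] * x[i:i + n] * x[j:j + n] * x[k:k + n]))

def excess_kurtosis(x):
    """Normalized zero-lag fourth-order cumulant; zero for Gaussian data."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    m2 = np.mean(x ** 2)
    m4 = np.mean(x ** 4)
    return float(m4 / m2 ** 2 - 3.0)

rng = np.random.default_rng(0)
gauss = rng.standard_normal(200_000)
laplace = rng.laplace(size=200_000)    # heavier tails than Gaussian

m4_gauss = fourth_order_moment(gauss, 0, 0, 0)   # close to 3 for unit variance
k_gauss = excess_kurtosis(gauss)                 # close to 0
k_laplace = excess_kurtosis(laplace)             # close to 3
print(m4_gauss, k_gauss, k_laplace)
```

The estimates for the heavy-tailed data fluctuate more than those for the Gaussian data, which is consistent with the remark above that HOS-based sample estimators have increased statistical variance.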

Cauchy r.v. has infinite variance and is a special case of an alpha-stable r.v. It is easy to create a Cauchy r.v. as the ratio of two (possibly correlated) Gaussian random variables, and Feller [108] shows how the Cauchy r.v. arises in an example with rotating mirrors. Stable r.v.'s result from generalized central limit theorems [469], and are characterized by four parameters: scale, location, the characteristic exponent ($\alpha$), and a skewness index ($\beta$) [293, 348, 469]. The exponent satisfies $0 < \alpha \le 2$, where $\alpha = 2$ corresponds to the Gaussian and $\alpha = 1$ corresponds to the Cauchy. The skewness index satisfies $-1 \le \beta \le 1$, with $\beta = 0$ indicating symmetry. The non-Gaussian stable processes are characterized by their infinite variance; indeed, $E|x|^r = \infty$ if $r \ge \alpha$. Hence, one must use lower-order ($r < \alpha$) rather than higher-order moments to study these processes [293]. The non-Gaussian alpha-stable process may be considered impulsive, and at least in the symmetric case, there are connections with the Middleton B model [270] that remain to be fully explored. A survey of parameter estimation techniques may be found in [293]. Although the lack of a closed-form pdf, except in special cases, has made analysis difficult, numerically stable ML estimators are discussed in [294]. Despite the "infinite variance" of these processes, they have found applications in areas such as astronomy (gravitational fields) [469], econometrics (income distributions) [255], modeling of radar data [294, 418], and Ethernet traffic data [447]; several applications in the physical sciences are discussed in [181].
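The construction of a Cauchy r.v. as a ratio of Gaussians, and the use of lower-order moments, can be checked numerically. The sketch below assumes two independent zero-mean, unit-variance Gaussians (which gives the standard Cauchy); the sample size and the moment order r = 0.25 are arbitrary illustrative choices. For the standard Cauchy the fractional moment has the closed form $E|x|^r = 1/\cos(\pi r/2)$ for $|r| < 1$.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200_000

# Ratio of two independent N(0, 1) variables: a standard Cauchy r.v. (alpha = 1).
x = rng.standard_normal(n) / rng.standard_normal(n)

# The median of |x| equals 1 for the standard Cauchy.
median_abs = np.median(np.abs(x))

# Fractional lower-order moment with r < alpha, compared with its closed form.
r = 0.25
flom = np.mean(np.abs(x) ** r)
flom_theory = 1.0 / np.cos(np.pi * r / 2.0)

# The sample variance, in contrast, does not settle as more data are used.
var_small = np.var(x[: n // 100])
var_all = np.var(x)

print(f"median |x| = {median_abs:.3f}")
print(f"E|x|^{r}: sample {flom:.4f}, theory {flom_theory:.4f}")
print(f"sample variance: first 1% of data {var_small:.1f}, all data {var_all:.1f}")
```

The order r is kept well below alpha/2 here so that the sample fractional moment itself has finite variance.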

Good data windows give a much less biased estimate of the spectrum than the periodogram.

In the context of linear symmetric stable processes, identifiability of mixed-phase ARMA models was established in [252]; in [389], it was shown that normalized HOS could be used to estimate these parameters, extending the earlier work of Davis-Resnick, and Mikosch et al. Alpha-stable processes may be used to model very impulsive noise, and they can be suppressed (to some extent) using the usual tools of ranks, weighted medians, and order statistics. Locally optimum detectors may also be used [293]; another alternative is to use data-adaptive nonlinearities [388]. Impulsive noise suppression is particularly important for compression and dynamic range reduction. Model validation is important, and several ideas are discussed in [294] and [295]. Differentiating between nonlinearity and nonstationarity may be more difficult than in the finite variance case [87].
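A minimal sketch of why order statistics help against impulsive noise: estimating a constant level observed in Cauchy noise with the sample mean and with the sample median. The level, the sample size, and the noise scale are hypothetical; this is an illustration of the general idea, not one of the methods of the cited references.

```python
import numpy as np

rng = np.random.default_rng(5)

theta = 2.0                      # unknown constant level (assumed for the demo)
n = 10_001
y = theta + rng.standard_cauchy(n)

mean_est = np.mean(y)            # not consistent: the mean of Cauchy data is again Cauchy
median_est = np.median(y)        # order statistic: robust to the impulses

print(f"true level {theta}, sample mean {mean_est:.3f}, sample median {median_est:.3f}")
```

The sample mean of Cauchy data has the same distribution as a single observation, so averaging does not help, while the median concentrates around the true level as the record grows.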

Infinite Variance Models

Heavy-tailed non-Gaussian processes, particularly the Gaussian mixture model and the Cauchy r.v., have long been used to develop and test signal-processing techniques that are robust to impulsive noise [194]. The

Fractal-Point Processes

Recent papers by Willinger et al. [447] show that Ethernet traffic data are self-similar and alpha-stable, so that the


standard Poisson model and conventional analysis [44] are no longer adequate. The self-similarity naturally leads to wavelet-based analysis; see, e.g., [2]. Semi-parametric techniques, based on log-periodogram regression, are proposed for estimating the fractal dimension in [277], and fractal-point process models have been proposed in [14, 219, 344], but several interesting analysis and synthesis problems remain open.
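The idea behind log-periodogram regression can be sketched as follows. The example is a generic illustration rather than the specific estimator of [277]: it assumes a process whose spectrum behaves like $f^{-\beta}$, synthesizes such a record by shaping white Gaussian noise in the frequency domain, and recovers the exponent from the slope of the log periodogram against log frequency at low frequencies. The record length, the exponent, and the number of ordinates used are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(3)

n = 2 ** 14
beta_true = 1.0                       # assumed spectral exponent, S(f) ~ f^(-beta)

# Synthesize a record with a power-law spectrum by shaping white Gaussian noise.
w = rng.standard_normal(n)
W = np.fft.rfft(w)
f = np.fft.rfftfreq(n)                # frequencies in cycles/sample, f[0] = 0
shape = np.zeros_like(f)
shape[1:] = f[1:] ** (-beta_true / 2.0)   # amplitude shaping; the DC term is removed
x = np.fft.irfft(W * shape, n)

# Periodogram of the record.
pgram = np.abs(np.fft.rfft(x)) ** 2 / n

# Regress log-periodogram on log-frequency over the m lowest nonzero frequencies.
m = n // 8
slope, intercept = np.polyfit(np.log(f[1:m + 1]), np.log(pgram[1:m + 1]), 1)
beta_hat = -slope

print(f"true exponent {beta_true}, estimated exponent {beta_hat:.3f}")
```

The periodogram ordinates are individually very noisy, but their logarithms scatter around the true log spectrum with a constant offset, so the slope of the fitted line is not affected.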

Multiplicative Noise

Just as nonlinearities give rise to non-Gaussianity, they also lead to nonadditive noise. For example, if the received signal, $x(n) = s(n) + w(n)$, passes through a zero-memory nonlinearity (ZMNL) (the receiver amplifier), and if the signal, $s(n)$, is weak relative to the noise, $w(n)$, the observed signal can be written as $y(n) = s(n)[\theta + g(n)] + v(n)$, where $\theta$ is a parameter of the ZMNL. The noises $v(n)$ and $g(n)$ are now, in general, correlated, non-Gaussian, and have nonzero means. Multiplicative noise is encountered in speckle imagery [40], fading channels [141, 324], underwater acoustics [99], and lidar and radar [29]. An interesting approach is taken in [318], where the signal is a harmonic (i.e., line components), the noises are modeled as being nonrandom, and error bounds on the frequency estimates are developed. In the case of a harmonic signal, and stationary noises, conventional spectral analysis of the data ("cyclic mean") or of the squared data ("cyclic variance") have been proposed and analyzed in [467]; fourth-order cumulants were used in [99]. Cramer-Rao bounds were established in [118], [466], and [385]. Some parameter estimation problems are discussed in [384]. Note that all these papers assume that the additive and multiplicative noises are independent. The detection of the weak signal, $s(n)$, from the observed data, $y(n) = \theta s(n)[1 + g(n)] + v(n)$, has been considered both for the random and nonrandom cases in [378], [31], [9], and [126], but much more work remains to be done (e.g., temporally correlated noise, and "nonweak" signals).

A WWW link to more information on non-Gaussian processes and on higher order statistics: http://www.comm.uni-bremen.de/HOSHOME

Time-frequency distributions have recently found applications to radar and sensor-array processing.

System Identification and Tests for Non-Gaussianity and Linearity

Jitendra K. Tugnait, Auburn University

System Identification

System identification is the field of mathematical modeling of systems and signals from experimental data [377]. In signal-processing applications, system-identification methods are used for linear prediction, adaptive filtering, noise and interference cancellation, parametric spectral estimation, inverse and forward modeling, and numerous other objectives. In systems and control applications, models obtained by system-identification approaches are used for controller design, system simulation, and prediction. The system-identification methods are applicable to both cases, when input-output data of the system under investigation are available as well as when one only has the system output measurements (time-series analysis). The area of system identification has been an active research area for the past 30 to 40 years. Most of the focus has been on SISO linear models and on scalar stationary time series, mainly because of its wide applicability and partly because of its analytical tractability. The available methods have been well analyzed in excellent texts such as [377], [244], and [245]. The text by Widrow and Stearns contains a wide range of applications [450]. A typical approach is to choose a model structure with fixed (known) order, e.g., state-space models with known state dimension, autoregressive moving average (ARMA) models with known AR and MA model orders, MA models, etc., and then turn the system-identification problem into one of parameter estimation. The choice of the model structure is dictated by the intended application. For instance, [450] favors using MA models for adaptive signal processing in applications where the stability of the fitted model is paramount. A large number of methods exist for parameter estimation including least-squares, prediction-error minimization, ML, instrumental variable methods, output error minimization, and others [244, 245, 377].
A complete system-identification methodology should, however, include an iterative process of model structure determination, parameter estimation, and model validation [244, 377]. The problem of MIMO system identification has proved to be more complex. Unlike the scalar (SISO) case, for MIMO ARMA models with known orders the representation of system output measurements in terms of AR and MA matrix coefficients is not necessarily unique. This lack of uniqueness can lead to ill-conditioning in parameter estimation. A possible solution is to use a canonical parameterization requiring knowledge of certain structure indices that are difficult to determine in practice [152]. An interesting solution to MIMO model identification (including ARMA parameter estimation) via a subspace-based realization approach using state-space formulation has recently been proposed [438]. It is applicable to both input-output ("deterministic") models with noisy output measurements as well as to multivariate time series ("stochastic" models). Systems and techniques not captured by the above formulations (stationary linear time series, linear systems with noisy output measurements but noise-free input measurements, time-domain approaches) have also received considerable attention in recent years. Some representative examples include:
▲ Errors-in-variables models: These are models where both input and output measurements are noisy. Higher-order statistics have been used in [135], [431], and [434]; second-order statistics and subspace instrumental variable methods in [381]; and cyclic and/or higher-order spectral analysis in [135] and [415].
▲ Nonlinear systems: Volterra systems have attracted considerable attention [281]. More general Hammerstein systems have been considered in [327]. Time-series models have been treated in detail in [412].
▲ Frequency-domain approaches have been investigated in [432], [433], and [361]. Approaches of [432] and [433] do not require explicit noise modeling.
▲ For time-series models, higher-order statistics-based approaches have attracted considerable attention due to their ability to (blindly) identify nonminimum phase and/or noncausal models [292, 435].

Gaussianity and Linearity Tests

Linear parametric models of stationary random processes, whether signal or noise, have been found to be useful in a wide variety of signal-processing tasks such as signal detection, estimation, filtering, and classification, and in a wide variety of applications such as digital communications, automatic control, radar and sonar, and other engineering disciplines and sciences. Parsimonious parametric models such as AR, MA, ARMA, or state-space, as opposed to impulse-response modeling, have been popular together with the assumption of Gaussianity of the data. Linear Gaussian models have long been dominant both for signals as well as for noise processes. Assumption of Gaussianity allows implementation of statistically efficient parameter estimators such as ML estimators. A stationary Gaussian process is completely characterized by its second-order statistics (autocorrelation function or, equivalently, its power spectral density (PSD)) and it can always be represented by a linear process. Since the PSD depends only on the magnitude of the underlying transfer function, it does not yield information about the phase of the transfer function. Determination of the true phase characteristic is crucial in several applications such as seismic deconvolution and blind equalization of digital communications channels. Use of higher-order statistics allows one to uniquely identify nonminimum-phase parametric models. Higher-order cumulants of Gaussian processes vanish; hence, if the data are stationary Gaussian, a minimum-phase (or maximum-phase) model is the "best" that one can estimate. Given these facts, it has been of some interest to investigate the nature of the given signal: whether it is a Gaussian process and, if it is non-Gaussian, whether it is a linear process.

Gaussianity Tests

Several tests have been devised on the basis of the fact that the higher-order cumulant spectra [43] of Gaussian processes vanish. One of the earliest tests based upon testing of the signal bispectrum is given in [329]. Hinich [168] has simplified the test of [329] by using the known expression for the asymptotic covariance of the bispectrum estimators. Notice that a vanishing bispectrum does not necessarily imply that the underlying signal is Gaussian; it may result from the fact that the signal is non-Gaussian with zero bispectrum. Therefore, a next logical step would be to test for vanishing trispectrum of the record. This has been done in [272] using the approach of [168]; extensions of [329] are too complicated. Computationally simpler tests using "integrated polyspectrum" of the data have been proposed in [429]. The integrated polyspectrum (bispectrum or trispectrum) is computed as cross-power spectrum and it is zero for Gaussian processes. Alternatively, one may test higher-order cumulant functions of the data in time-domain. This has been done in [136]. Other tests that do not rely on higher-order cumulant spectra of the data may be found in [412].

Linearity Tests

For a stationary time series one can define a normalized bispectrum, called bicoherence or skewness function [292], which turns out to be a (nonzero) constant for all bifrequencies if the signal is linear non-Gaussian with nonvanishing bispectrum. This property has been used by Subba Rao and Gabr [329] to design a statistical test for linearity. Hinich [168] has "simplified" the test of [329]. Notice that this test is useless if the signal is non-Gaussian with zero bispectrum. Therefore, a next logical step would be to test a normalized trispectrum (tricoherence function). This has been done in [272] using the approach of [168]; extensions of [329] are too complicated. The approaches of [329] and [168] will fail if the data are noisy. A modification to [329] is presented in [428] when additive Gaussian noise is present. Finally, other tests that do not rely on higher-order cumulant spectra of the record may be found in [412].

Advanced-Sensor Signal Processing

Arye Nehorai, The University of Illinois at Chicago

Over the last several years, advanced sensors have been introduced and combined with statistical sensor-array-processing methods. These sensors exploit more physical information and are orders of magnitude more sensitive than traditional sensors. Their use has improved the performance of current systems, increased

the scope of signal-processing methods, and created entirely new applications. Four examples are presented in the following.

Electromagnetic Vector Sensors

Each element in an electromagnetic vector sensor, introduced in [284] and [287] for estimating the direction and polarization of electromagnetic sources, measures all six electromagnetic field components at a single point. In contrast, conventional antennas measure a single component of the electric field. Vector sensors are commercially available and actively researched. EMC Baden Ltd. in Baden, Switzerland, manufactures them for a 75 Hz-30 MHz frequency range, and Flam and Russell, Inc., in Horsham, Pennsylvania, for 2 MHz-30 MHz. Lincoln Labs at MIT has performed preliminary localization tests with the vector sensors of Flam and Russell, Inc. [156]. Other research on sensor development is reported in [189] and [190].

Electromagnetic vector sensors are sensitive to both the direction and polarization information in the incoming waves. The polarization provides a crucial criterion for distinguishing and isolating signals that may otherwise overlap in conventional scalar-sensor arrays. When a single vector sensor is used for direction finding, it has the following advantages and capabilities:
▲ Direction estimation in 3D is possible with sensors occupying very little space
▲ Estimation of the directions and polarization ellipses of up to three sources [172, 175] is possible
▲ Resolution of very closely spaced (even co-incident) sources based on polarization differences is possible
▲ One is able to process wideband signals in the same way as narrowband signals
▲ One can handle sources with either single or dual-message signals
▲ Vector sensors have isotropic response
▲ There is no need for location calibration and time synchronization as there is among different sensor elements of a conventional spatial array
Some of these advantages result from the fact that no time delays are used. In contrast, conventional scalar-sensor methods require a 2D array for direction finding in a 3D space, need accurate location calibration and time synchronization, and require much higher computational cost to process wideband rather than narrowband signals. The optimum accuracy of source parameter estimation for vector-sensor arrays is analyzed in [284] and [287] in terms of Cramer-Rao bounds (absolute limits on the accuracy of the class of unbiased parameter estimators). Quality measures are defined for estimating direction and orientation in 3D space, including mean-square angular error and covariance of vector-angular error. Lower bounds on these measures give concrete results on expected performance.

A fast algorithm for direction finding using an electromagnetic vector sensor has been proposed in [284] and [287]. Inspired by the Poynting theorem, it forms the cross-product of the electric field vector with the complex conjugate of the magnetic vector and averages over time. The asymptotic performance under general conditions is shown to be close to optimum. A minimum-noise-variance beamformer for interference cancellation employing a single electromagnetic vector sensor has been proposed in [283]. It assumes that the direction and polarization of the source are known. This enables suppression of uncorrelated interference, even if it comes from the same direction as the source, based on polarization as well as location differences. Analysis of the signal to interference-plus-noise ratio showed the beamformer to be very effective, particularly when the signal and interference are differently polarized. An array of spatially distributed vector sensors can additionally exploit time delays among the sensors. In general, arrays of vector sensors can achieve better performance than scalar-sensor arrays, while occupying less space. Alternatively, they can be spaced further apart to increase aperture and hence performance without introducing ambiguities [390, 453]. Some high-resolution direction-finding algorithms have been developed for electromagnetic vector-sensor arrays [171, 173, 230]. Preliminary results on the application of vector sensors to communication problems can be found in [470] and [65].
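The cross-product direction finder for a single electromagnetic vector sensor (the electric field crossed with the conjugate of the magnetic field, averaged over time) can be sketched numerically. The simulation below is an illustrative sketch under stated assumptions, not the implementation of [284] or [287]: it assumes a single plane wave with wave impedance normalized to one, a hypothetical elliptical polarization, white complex Gaussian sensor noise, and the convention that the bearing vector u points from the sensor toward the source, so the wave travels along -u.

```python
import numpy as np

rng = np.random.default_rng(1)

def unit(v):
    return v / np.linalg.norm(v)

def cnoise(shape, sigma):
    """Circular complex Gaussian noise with standard deviation sigma."""
    return sigma * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

# Assumed source bearing: u points from the sensor toward the source.
az, el = np.deg2rad(40.0), np.deg2rad(25.0)
u = np.array([np.cos(el) * np.cos(az), np.cos(el) * np.sin(az), np.sin(el)])
k = -u                                   # propagation direction of the wave

# Orthonormal pair spanning the plane transverse to k.
v1 = unit(np.cross(k, np.array([0.0, 0.0, 1.0])))
v2 = np.cross(k, v1)

T = 2000                                 # number of snapshots
s = cnoise(T, 1.0)                       # complex envelope of the source signal
p = v1 + 0.5j * v2                       # elliptical polarization (illustrative)
E = s[:, None] * p[None, :]              # electric field snapshots, shape (T, 3)
H = np.cross(k, E)                       # magnetic field of a plane wave, impedance 1

E_meas = E + cnoise(E.shape, 0.2)
H_meas = H + cnoise(H.shape, 0.2)

# Time-averaged real part of E x conj(H) points along the propagation direction.
poynting = np.mean(np.real(np.cross(E_meas, np.conj(H_meas))), axis=0)
u_hat = -unit(poynting)                  # estimated bearing toward the source

err_deg = np.degrees(np.arccos(np.clip(u_hat @ u, -1.0, 1.0)))
print(f"angular error: {err_deg:.2f} degrees")
```

No time delays or array geometry enter the estimate; the direction comes entirely from the field structure measured at one point.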


Acoustic Vector Sensors

Acoustic vector sensors measure the acoustic pressure and all three components of the acoustic particle velocity vector at a given point; standard methods measure only the pressure. The use of acoustic vector sensors for array processing has been proposed in [285] and [286]. These references introduced measurement models, derived fast direction-estimation algorithms, and presented general Cramer-Rao bounds on direction parameters. These developments have coincided with a surge of interest in particle-velocity sensors [27] and improvements in fabrication techniques [97] to make vector-sensor arrays a practical reality [290]. Since a vector sensor extracts directional information directly from the structure of the velocity field, it can, for example, locate up to two sources in 3D space [175]. By making use of this extra information, arrays of vector sensors improve source localization accuracy while using smaller array apertures.

Beamforming and Capon direction-estimation procedures with acoustic vector-sensor arrays have been developed in [160]. It was shown that the extra velocity-field information removes all ambiguities such as grating lobes. This allows the use of simple structures for which fast direction-estimation algorithms exist, e.g., a uniform linear array, to determine both the azimuth and elevation of a source. It also means that spatially undersampled arrays of vector sensors can be used to increase aperture and hence performance. A fast estimation algorithm that makes use of this property was developed in [455], and another algorithm for arbitrary array shapes appears in [454].

In [158] the effect of sensor placement on the direction-estimation performance of an array of acoustic vector sensors has been considered. Using the Cramer-Rao bound on the parameters of a single source, conditions were derived that minimize the lower bound on the asymptotic mean-square angular error and ensure that it is isotropic. The increase in estimation accuracy obtained by vector sensors is greatest for linear or planar arrays (as opposed to 3D), a small number of sensors, and low SNRs. By exploiting velocity and pressure information, any vector-sensor array, including the popular linear array, can be used to unambiguously determine both the azimuth and elevation of a single source.

Vector sensors have been successfully applied in other areas such as hull-mounted applications, where they overcome serious problems in detecting low-frequency emitting targets [159]. (At low frequency the vessel's hull is acoustically flexible, leading to a very low pressure signal but a strong velocity signal.)
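How a single vector sensor can yield a direction from the velocity field can be sketched with an intensity-style average. This is only an illustration, not the algorithm of the cited references: it assumes one far-field plane wave, velocity samples pre-scaled by the acoustic impedance so that v(t) = -u p(t) plus noise, and the function names `simulate` and `estimate_direction` are hypothetical.

```python
import math
import random

def estimate_direction(pressure, velocity):
    """Estimate the unit vector pointing from the sensor toward the source.

    Assumes a single far-field plane wave and velocity samples already
    scaled by rho*c, so that v(t) = -u * p(t) + noise.
    """
    acc = [0.0, 0.0, 0.0]
    for p, v in zip(pressure, velocity):
        for axis in range(3):
            acc[axis] -= p * v[axis]   # -sum p(t) v(t) points toward the source
    norm = math.sqrt(sum(c * c for c in acc))
    return [c / norm for c in acc]

def simulate(azimuth_deg, elevation_deg, n_samples=2000, noise_std=0.1, seed=0):
    """Generate noisy pressure and 3-axis velocity samples (illustrative model)."""
    rng = random.Random(seed)
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    u = [math.cos(el) * math.cos(az), math.cos(el) * math.sin(az), math.sin(el)]
    pressure, velocity = [], []
    for _ in range(n_samples):
        s = rng.gauss(0.0, 1.0)                      # source waveform sample
        pressure.append(s + rng.gauss(0.0, noise_std))
        velocity.append([-ui * s + rng.gauss(0.0, noise_std) for ui in u])
    return pressure, velocity

p, v = simulate(azimuth_deg=30.0, elevation_deg=10.0)
u_hat = estimate_direction(p, v)
az_hat = math.degrees(math.atan2(u_hat[1], u_hat[0]))
el_hat = math.degrees(math.asin(u_hat[2]))
print(round(az_hat, 1), round(el_hat, 1))
```

Both azimuth and elevation come out of one sensor position, with no inter-sensor time delays involved, which is the property the text attributes to vector sensors.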

Chemical Sensors

Chemical sensors are useful for detecting explosives, drugs, and leakage of hazardous chemicals, and for monitoring the environment. They are manufactured by companies such as Cyrano Sciences, Inc.; Science Applications International Corporation (SAIC); and Jaycor in San Diego, California. Array-processing techniques using chemical sensors have been proposed in [184], [288], and [319]. Compared with animal chemoreception, these techniques have the advantage that they share information and can be optimized.

Methods for detecting and localizing vapor-emitting sources were developed in [288]. Based on the diffusion equation, distributions of vapor concentration in time and space were derived for various environments. The results were used to develop statistical models of the array measurements. Maximum-likelihood estimates and generalized likelihood ratio tests were derived to estimate the unknown source parameters and detect the existence of a source. The performance was analyzed using Cramer-Rao bounds and probabilities of detection and false alarm.

Employment of moving sensors was proposed in [319]. A single moving sensor can achieve the task of multiple stationary sensors by taking measurements at different locations and times. Its path can be planned in real time to optimize a performance criterion. The criterion used in [319] was to reduce the expected location error, which was accomplished by moving the sensor in directions opposite to the gradients of the Cramer-Rao bound.

Monitoring of disposal sites on the ocean floor using chemical sensor arrays was considered in [184]. Such sites have been suggested as suitable for the relocation of dredge materials from harbors and shipping channels, where their buildup has a detrimental impact upon economy and military security. Algorithms for detecting possible release of pollutants near these sites were developed, and their performance was analyzed. The results were used to optimally design arrays with respect to the numbers of sensors and time samples as well as sensor locations.

Superconducting Quantum Interference Devices (SQUIDs)

SQUIDs are the most sensitive detectors of magnetic flux currently available. They find broad application, from measurements of magnetic fields induced by brain activity to nondestructive evaluation of materials and the location of underground objects and structures. Their most important commercial use is in magnetoencephalography (MEG). MEG is concerned with mapping electrical activity in the brain by measuring the induced external magnetic field [151]. MEG sensor arrays measure extracranial magnetic fields of only a few hundred femtotesla, a billion times smaller than the Earth's steady magnetic field. Together with electroencephalography (EEG), which measures electric potentials on the scalp, MEG has emerged as a powerful noninvasive tool for the localization and tracking of electrical sources in the brain. The solution to this problem is of great importance in the diagnosis and evaluation of various brain disorders such as epilepsy, and for furthering understanding of brain function.

In [274] the MUSIC algorithm was applied for localizing brain sources modeled as current dipoles. Maximum-likelihood techniques have been developed in [93] to account for unknown spatially correlated noise, predominantly due to sporadic background activity in neurons. The optimization of MEG sensor arrays to minimize the mean-square error of dipole location estimates has been proposed in [176]. SQUIDs have opened other new applications of signal processing, such as detecting the wake of a ship using an airborne system [282].

In conclusion, it is expected that the use of novel sensors will continue to be a source of new applications and further developments in signal processing.

A WWW link to the author of the above section: http://www.eecs.uic.edu/~nehorai/

Sensor-Array Signal Processing

A. Lee Swindlehurst, Brigham Young University

The processing and manipulation of data received by a spatially distributed array of sensors has been an active area of research in the signal-processing community for well over

30 years. The long-lasting attention devoted to this area can be traced to the large number of applications where data is collected in both space and time. Figure 5 depicts a generic scenario in which energy (possibly acoustic, electromagnetic, seismic, etc.) from two sources is received by a sensor array. There are a number of possible objectives of such a system, with the most important being:
▲ Source Localization: determine the azimuth and elevation angles to the sources, and possibly the range to the sources as well (if they are located in the near-field of the array); information on source velocity can be obtained by measuring frequency shifts, or angle and range rates of change.
▲ Source Separation: determine the signal waveforms transmitted by each source; the fact that the energy from each source arrives from different directions allows these waveforms to be separated even if they overlap in time and frequency.
▲ Channel Estimation: determine the space-time propagation effects between the sources and the array; estimate where reflections occur or how much the transmitted signal is spread in time and angle.

▲ 5. A generic array signal-processing scenario.

Which of the above three objectives is most important depends on the application. In active radar and sonar, the received waveforms are approximately scaled and delayed versions of a known signal, so it is the location (and motion) of the sources that is most important. In a communications system, it is the information-bearing waveform and not the location of the sources that is critical. For seismic applications, the source signals arise from explosive charges. The received energy is used to characterize the propagation channel, which in this case provides information about the structure of the ground.

To be more precise, and to aid the discussion that follows, we introduce some simple mathematical notation. Referring to Fig. 5, in the simplest case the sources and array lie in the same plane, and the sources are far enough from the array so that the arriving signals have planar wavefronts. For this case, if the signals are assumed to be "narrowband" and there are a total of $d$ signals, then the output of array element $p$ is given by

$$x_p(t) = \sum_{k=1}^{d} a_p(\theta_k)\, s_k(t) \qquad (9)$$

where $\theta_k$ represents the DOA of the wavefront from source $k$, and $a_p(\theta_k)$ is the (complex) response of element $p$ to a signal arriving from that direction. In the general case, we would treat the reflection of source 2's signal in Fig. 5 as a separate term in the above sum; i.e., we would let $s_3(t) = s_2(t - \tau)$. The outputs of an array of $m$ elements can be stacked in a vector, as follows:

$$\mathbf{x}(t) = \begin{bmatrix} x_1(t) \\ \vdots \\ x_m(t) \end{bmatrix} = \sum_{k=1}^{d} \mathbf{a}(\theta_k)\, s_k(t) + \mathbf{n}(t) \qquad (10)$$

where $\mathbf{a}(\theta_k) = [a_1(\theta_k) \cdots a_m(\theta_k)]^T$ denotes the vector array response, and we have added a term, $\mathbf{n}(t)$, to account for unmodeled measurement noise and interference. Using this notation, we can give concrete examples of the three objectives listed above. For example, in source localization, we would use samples of the array output $\mathbf{x}(t_1), \ldots, \mathbf{x}(t_N)$ to estimate the DOAs $\theta_1, \ldots, \theta_d$ of the sources. Source separation involves extracting samples of one (or more) of the signals, $s_k(t_1), \ldots, s_k(t_N)$, from the array data. In channel estimation, we may have $s_k(t) = \alpha_k s(t - \tau_k)$, in which case we are interested in the amplitudes, $\alpha_k$, and delays, $\tau_k$, of the various arrivals (perhaps as well as the DOAs).

Early research in sensor-array signal processing, conducted mainly in the 1960s and 70s, was based on the observation that, if the array is composed of identical uniformly spaced elements (i.e., a uniform linear array, or ULA for short), then a direct analogy exists with temporal sampling. The array elements perform a uniform (one-dimensional) spatial sampling of the wavefield, and the spacing between elements determines what spatial frequencies can be uniquely represented. Signals whose wavefronts are nearly parallel to the ULA ($\theta \approx 0°$) have low spatial frequencies; as $|\theta|$ increases toward 90°, the spatial frequency of the signal increases and reaches a maximum at $\theta = \pm 90°$. A spatial version of the Nyquist criterion states that the array response vector $\mathbf{a}(\theta)$ of a ULA is unique provided that the elements of the array are separated by no more than one-half the wavelength of the signal.

Using this analogy with temporal sampling, it is possible to design spatial filters that pass signals with certain spatial frequencies (i.e., that arrive from certain directions) and attenuate others. However, unlike temporal filtering, it is usually not known a priori what spatial frequencies are occupied by signals of interest. Thus, the emphasis was on data-adaptive methods that estimated the directions to the desired user(s) prior to computing the appropriate spatial filter coefficients. Such algorithms are usually referred to as adaptive beamformers, since the desired spatial frequency response is a narrow beam "steered" towards the source of interest.

The problem of estimating the source directions can also be addressed using this analogy with temporal sampling. A point source located in the far field of a ULA at DOA $\theta$ produces the following output at sensor $k$:
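For orientation, the per-sensor output can be written in the usual narrowband ULA form. This is a sketch under assumptions not stated in the text: sensors are indexed $k = 1, \ldots, m$, the phase is referenced to the first sensor, the sign of the exponent is a convention, and $n_k(t)$ denotes the noise at sensor $k$.

$$x_k(t) = e^{\,j 2\pi (k-1) \frac{\delta}{\lambda} \sin\theta}\, s(t) + n_k(t)$$

Under this convention the phase advances linearly with the sensor index, so the snapshot across the array behaves like a sampled complex exponential whose frequency is set by $\sin\theta$.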


(11)

where $\delta$ is the distance between adjacent sensors and $\lambda$ is the wavelength of the signal. Viewed as a function of $k$, the vector of samples from the array is seen to be a complex exponential in noise. With multiple sources, the problem of DOA estimation is seen to be equivalent to the classical spectral analysis problem of determining the frequencies of a collection of sinusoids in noise. This connection has led to numerous "cross-over" techniques, the most popular of which are those based on simple Fourier analysis with windows (Bartlett, Hamming, Chebyshev, etc.) used to control resolution and sidelobe levels. Unlike their counterparts in temporal frequency estimation, the resolution of these methods cannot be increased by collecting more data from the array; the ability to resolve sources spaced closely in $\theta$ is limited by the aperture of the array, which is typically fixed. In addition, the fixed aperture can produce large sidelobes in the spatial frequency spectrum that lead to inconsistent DOA estimates when more than one source is present. To overcome some of these deficiencies, techniques based on maximum entropy, autoregressive modeling, and linear prediction were proposed, with mixed success.

A dramatic shift in emphasis in sensor-array signal processing occurred during the 1980s with the introduction of the so-called subspace-based techniques. These methods are based on the observation that, if the number of sensors is strictly greater than the number of sources, the signal component of the array data lies in a low-rank subspace. Under certain conditions, this subspace uniquely identifies the DOAs of the signals and can be determined quite accurately using, for example, a numerically stable singular value decomposition. A large number of parametric estimators have been developed based on the subspace idea. These techniques enjoy a number of advantages over earlier methods, including statistical consistency and very high resolution.

Numerous extensions to the simple model outlined above have been considered, including generalizations to wideband or diversely polarized signals, techniques for handling correlated or perfectly coherent signals, or arrays whose response, $\mathbf{a}(\theta)$, is not precisely known or calibrated. The main driving force behind research in array signal processing in the 1980s was provided by military applications, primarily in radar and sonar. In these applications, the sources are often noncooperative and little may be known about the signals they generate. In such situations, discrimination based on only the spatial properties of the received signals is necessary, which in turn requires that the response of the array be accurately calibrated with respect to $\theta$. The sensitivity of subspace-based methods to array calibration errors limited their usefulness to some degree, particularly in underwater environments where the propagation medium is severely nonuniform.

As military funding has waned, and as the field of personal wireless communications has emerged, interest in applications of array signal processing to communications systems has blossomed in the past few years. The use of multiple antennas at the base station of a wireless network offers a processing gain that can increase base station range and improve coverage. By exploiting the spatial selectivity of an antenna array, co-channel interference may be reduced, which in turn can be traded for increased system capacity. In addition, communication channels can be multiplexed in the spatial dimension just as in the frequency and time dimensions. This is often referred to as spatial-division multiple access (SDMA).

A distinguishing aspect of using antenna arrays in communications applications is that, due to the cooperative nature of such systems, significant information about the source signals is available and can be exploited for spatial processing. For example, it may be known that training sequences are present in the data, or that the signal is digitally modulated with a known symbol constellation and pulse-shaping filter, or that the signal has a constant amplitude envelope, etc. Each of these properties can be used by a system employing multiple antennas to achieve source separation without the need for explicit array calibration data.

Algorithms that use this approach are referred to as "blind" source separation methods (see the sections by Cardoso and Tong). Such techniques can also be extended to perform blind equalization of propagation channels with significant delay spread. A breakthrough in this area came in the early 1990s, when it was shown that if a pulse-amplitude modulated signal is received by an array of antennas, then the channel can be identified using only second-order cyclostationary statistics. Although blind methods can eliminate the need for calibration information, significant performance improvement can be achieved if reasonably accurate calibration is available. Techniques that exploit both the spatial

and temporal properties of the received signals are thus of high interest at the present time. Joint space-time processing in radar and sonar applications is also receiving added attention, as the speed and throughput of multichannel DSP processors continues to increase. Further advances in computing power will bring other difficult array signal-processing problems to the forefront, such as source localization and separation for wideband signals, parameter estimation for sources with distributed spatial spectra, and matched-field processing for sonar applications.

There is a large body of published literature in the area of sensor-array signal processing available to the interested reader. The references listed below are good starting points because of their tutorial nature and their extensive bibliographies:
▲ General books: [161], [311], [185]
▲ Connections with spectral analysis: [197], [382]
▲ Adaptive beamforming: [77], [439]
▲ Applications to radar systems: [104], [163]
▲ Subspace methods: [299], [34], [214]
▲ Applications in communications: [298], [34], [306]

A WWW link to the author of the above section: http://www.ee.byu.edu/~swindle

▲ 6. Output of one of eight EKG electrodes (left); reconstruction of the mother's EKG contribution (middle), and of the fetus' EKG contribution (right) using all eight electrodes.
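The analogy between a ULA and temporal sampling described in this section can be sketched as a conventional (Fourier-type) beamformer scan, together with a check of what happens when the half-wavelength spacing rule is violated. The sketch assumes a noise-free single snapshot, angles measured from broadside, and a particular sign convention for the phase; the function names `steering_vector` and `bartlett_spectrum` are hypothetical.

```python
import cmath
import math

def steering_vector(theta_deg, m, spacing_wl):
    """Response of an m-element ULA (spacing in wavelengths) to a far-field
    plane wave arriving from theta_deg, measured from broadside."""
    phase_step = 2.0 * math.pi * spacing_wl * math.sin(math.radians(theta_deg))
    return [cmath.exp(1j * phase_step * k) for k in range(m)]

def bartlett_spectrum(snapshot, m, spacing_wl, angles_deg):
    """Conventional beamformer power |a(theta)^H x|^2 on a grid of angles."""
    powers = []
    for theta in angles_deg:
        a = steering_vector(theta, m, spacing_wl)
        y = sum(ak.conjugate() * xk for ak, xk in zip(a, snapshot))
        powers.append(abs(y) ** 2)
    return powers

m = 8
angles = list(range(-90, 91))
x = steering_vector(20.0, m, 0.5)          # noise-free snapshot, source at 20 degrees
spectrum = bartlett_spectrum(x, m, 0.5, angles)
peak_angle = angles[spectrum.index(max(spectrum))]

# With one-wavelength spacing, a second direction produces the same response,
# so the array response vector is no longer unique.
alias_deg = math.degrees(math.asin(math.sin(math.radians(20.0)) - 1.0))
a_true = steering_vector(20.0, m, 1.0)
a_alias = steering_vector(alias_deg, m, 1.0)
alias_gap = max(abs(p - q) for p, q in zip(a_true, a_alias))
print(peak_angle, round(alias_deg, 1), alias_gap < 1e-9)
```

The width of the peak in `spectrum` is set by the number of elements and their spacing, not by the number of snapshots, which is the aperture limitation noted above.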

Blind Separation of Sources

Jean-François Cardoso, Ecole Nationale Supérieure des Télécommunications (ENST)

Objectives

Source Separation

Source separation consists of recovering a set of "source signals" from the observation of several mixtures of these signals. This problem typically arises when the available signals are obtained at the output of an array of sensors that temporally and spatially samples signals emitted at different locations in space. In general, each sensor receives a mixture of all the source signals: if there are fewer sources than sensors, the received mixture of signals is (in general) linearly invertible: this is the case of spatial diversity discussed in the following section by L. Tong.

Figure 6 shows an example of separation of electrocardiography (EKG) signals. The left panel shows the output of one EKG electrode located on the abdomen of a pregnant woman: the fetus' heartbeat cannot be easily distinguished. The data set [88] contains the outputs of seven other sensors placed on the mother's chest and abdomen. A source-separation technique allows the contributions from the mother (middle panel) and from the fetus (right panel) to be separated.

Blind Source Separation

Exploiting an array of sensors to focus on a particular source signal while rejecting other "interferers" is a standard task in array processing (see the sections by Swindlehurst and Krolik in this article). The blind source separation (BSS) problem consists in recovering all the sources without using prior information about the channels; i.e., about the transfer function between the sources and the sensors. BSS is an "output-only" technique in the sense that neither source signals nor training sequences are available: all of the available information is contained in the observed data themselves. Two seminal papers on this topic are those by Jutten et al. [188] and Comon [74].

The major strength of the blind approach to source separation stems precisely from the fact that a precise model of the underlying physical phenomena, e.g., wave generation, propagation, and transduction, is not required. Thus, for example, BSS can be applied to uncalibrated arrays in situations where calibration is difficult or impossible or when physical modeling is overly complicated or unreliable. The basic idea of BSS is that one makes up for the lack of information about the channels by assuming that the source signals are (statistically) independent. Statistical independence is a relatively strong assumption, but it is plausible in many contexts because it arises from a lack of physical relationship between the various sources.

The simplest source separation model assumes an $n$-sensor array receiving signals, $x_1(t), \ldots, x_n(t)$, from as many sources, $s_1(t), \ldots, s_n(t)$, and an instantaneous and noise-free mixture:

$$x_i(t) = \sum_{j=1}^{n} a_{ij}\, s_j(t), \qquad i = 1, \ldots, n.$$

The mixture coefficients, $a_{ij}$, can be collected in an $n \times n$ "mixing matrix," $A$. Collecting the $n$ source signals and the $n$ array outputs into $n \times 1$ column vectors, the BSS model reads more concisely as $\mathbf{x}(t) = A\,\mathbf{s}(t)$.

If the mixing matrix, $A$, were known, source signals could easily be recovered by direct inversion: $\mathbf{s}(t) = A^{-1}\mathbf{x}(t)$. Conversely, if the source signals were available, the mixing matrix could be easily estimated by a simple input-output identification procedure. The challenge of BSS is that of "double blindness": neither $\mathbf{s}(t)$ nor $A$ is available in the model $\mathbf{x}(t) = A\,\mathbf{s}(t)$. Algorithms for the blind separation of sources try to determine a separating matrix, $B$, to un-mix the observation vector into $\mathbf{y} = B\mathbf{x}$ (see Fig. 7). Ideally, the separating matrix should approximate the inverse of the mixing matrix. The next section outlines some statistical ideas for accomplishing BSS.

▲ 7. Mixing and separating. Source signals: s; array output: x; estimated sources: y.

Principles

Before listing a few of the ideas behind BSS algorithms, it is instructive to explain why the simplest idea, finding a separating matrix that makes the outputs uncorrelated, does not work. The reason is that decorrelation is a symmetric property: $\mathrm{Corr}(y_i, y_j) = 0$ also implies that $\mathrm{Corr}(y_j, y_i) = 0$. Therefore, there are only as many decorrelation conditions as pairs of sources, namely $n(n-1)/2$, which is about half of the constraints needed to determine the $n^2$ entries of a separating matrix. Thus, pairwise correlations (second-order information) are not sufficient to solve the BSS problem: it is necessary to express statistical independence in a stronger sense.

That decorrelation is not sufficient has an important consequence. Recall that for Gaussian variables, decorrelation implies independence; it follows that Gaussian sources cannot be blindly separated because their independence boils down to pairwise decorrelation. Therefore, some non-Gaussianity is needed to achieve BSS. One may see BSS as the art of exploiting the non-Gaussianity of the signals and measurements.

Some ideas for deriving source separation algorithms are to adjust $B$ in such a way that:
▲ The probability distribution of $\mathbf{y}$ is as close as possible to some prespecified distribution of independent components, or
▲ The outputs, $y_1, \ldots, y_n$, are as independent as possible, or
▲ The outputs are decorrelated and as non-Gaussian as possible or as low-entropic as possible.

All these objective functions can be derived from the application of the ML principle [50] under various assumptions. The first objective function results from fitting $\mathbf{x}$ to the linear model $\mathbf{x} = A\mathbf{s}$ where $A$ is unknown and $\mathbf{s}$ has a fixed (hypothetical) distribution of non-Gaussian independent components; the second objective function results from fitting to the model $\mathbf{x} = A\mathbf{s}$ with respect to both the unknown system, $A$, and the distribution of the independent sources, $\mathbf{s}$; the third objective function arises when one imposes the additional constraint that the recovered signals should be uncorrelated. This can be understood as follows: summing independent random variables "tends" to produce a more Gaussian result (think of the central limit theorem), so that driving $\mathbf{y}$ away from Gaussianity may be thought of as a way of recovering the original source signals.

The ML principle gives rise to measures of independence, non-Gaussianity, and entropy (as listed above), which are based on information-theoretic quantities. Because they may be difficult to manipulate, one often resorts in practice to more tractable approximations, e.g., high-order correlations (triple, quadruple correlations), high-order cumulants, or pair-wise correlations between nonlinear functions of $y_1, \ldots, y_n$.
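The argument that decorrelation alone cannot identify the separating matrix, while a non-Gaussianity measure can, may be illustrated with a two-source toy case. This is a hedged sketch, not an algorithm from the cited literature: it assumes unit-variance uniform sources, a mixing matrix that is a pure rotation (so the observations are already uncorrelated), excess kurtosis as the non-Gaussianity measure, and a brute-force angle search; all names are illustrative.

```python
import math
import random

def rotate(samples, angle):
    """Apply the 2x2 rotation by `angle` (radians) to each 2-D sample."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * a - s * b, s * a + c * b) for a, b in samples]

def correlation(samples):
    """Sample correlation coefficient between the two components."""
    n = len(samples)
    m1 = sum(a for a, _ in samples) / n
    m2 = sum(b for _, b in samples) / n
    cov = sum((a - m1) * (b - m2) for a, b in samples) / n
    v1 = sum((a - m1) ** 2 for a, _ in samples) / n
    v2 = sum((b - m2) ** 2 for _, b in samples) / n
    return cov / math.sqrt(v1 * v2)

def excess_kurtosis(values):
    """Fourth-order measure that is zero for Gaussian data."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    fourth = sum((v - mean) ** 4 for v in values) / n
    return fourth / var ** 2 - 3.0

rng = random.Random(1)
half_width = math.sqrt(3.0)        # uniform on [-sqrt(3), sqrt(3)] has unit variance
sources = [(rng.uniform(-half_width, half_width),
            rng.uniform(-half_width, half_width)) for _ in range(3000)]
true_angle = 0.5
mixed = rotate(sources, true_angle)          # x = A s, with A a rotation

candidates = [0.02 * i for i in range(79)]   # angles in [0, pi/2)
worst_corr, best_angle, best_score = 0.0, None, -1.0
for phi in candidates:
    y = rotate(mixed, -phi)                  # y = B x, with B the inverse rotation
    worst_corr = max(worst_corr, abs(correlation(y)))
    score = (abs(excess_kurtosis([a for a, _ in y]))
             + abs(excess_kurtosis([b for _, b in y])))
    if score > best_score:
        best_angle, best_score = phi, score
print(round(worst_corr, 3), round(best_angle, 2))
```

Every candidate rotation leaves the outputs essentially uncorrelated, so second-order statistics give no preference among them, whereas the kurtosis score peaks near the true mixing angle. The angle is only identifiable up to the usual permutation and sign ambiguities, which is why the search is restricted to a quarter turn.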

Perspectives


Beyond the simple cases described here, blind source separation has been applied to much more general models such as noisy observations, complex signals, nonsquare mixtures, and convolutive mixtures [50]. The latter extension often brings the BSS problem close to the problem of channel equalization and of system identification (see the sections by Giannakis and Tugnait in this article). Another interesting extension is to consider the case where the BSS model does not hold. In this case BSS may be implemented as a data exploration technique for which one is interested in finding the linear transformation of a random vector, $\mathbf{x}$, into $\mathbf{y} = B\mathbf{x}$ such that the components of $\mathbf{y}$ are as independent as possible. BSS is then seen as a device for independent component analysis (ICA), which can complement principal component analysis (PCA).

WWW links relevant to the above section:
▲ The author's web page: http://sig.enst.fr/~cardoso/stuff.html
▲ WWW links to independent component analysis (ICA):
- ICA research group at the Helsinki University of Technology: http://www.cis.hut.fi/projects/ica/
- Laboratory for Open Information Systems at the Riken Institute: http://www.bip.riken.go.jp/open/Welcome.html
- Computational Neurobiology Lab at the Salk Institute: http://www.cnl.salk.edu/~tewon/ica_cnl.html
▲ Miscellaneous WWW links of relevance: http://sound.media.mit.edu/~paris/ica.html http://www.bmc.riken.go.jp/sensor/Allan/ICA/


Source Separation and Diversity for Communications

Lang Tong, Cornell University

Source Separation and Diversity

Separating multiple sources is a fundamental signal processing problem that arises in many applications. An easily understood scenario is the so-called "cocktail party" problem where the objective is to separate one voice from others in the room. To a large degree, humans perform this task remarkably well. We are able to distinguish different voices, and may even follow the conversation. The cocktail-party problem is germane to many similar applications in biomedical signal processing, geophysical signal processing and, most notably, in communication system designs.

In wireless communications, for example, when the signal is transmitted over a nonideal channel, sophisticated signal processing is often required at the receiver to counter noise and interference from various sources. One of the major channel impairments is multipath fading. The problem here is analogous to calling someone a hundred yards away in the Grand Canyon. Your friend will hear your voice and its echoes. He can separate your voice and its echoes unless, of course, you speak too fast. It will be much harder for him if someone else tries to talk to him at the same time. But this is exactly the problem in today's digital cellular system, where the transmitted signal is interfered with not only by its echoes (multipath interference) but also by other users in the neighborhood (cochannel interference). We face a source separation problem similar to that in a cocktail party.

What makes it possible for us to separate and track different voices in a cocktail party? What features do we use? Can this process be made automatic? These questions touch upon some of the fundamental issues in source separation. Perhaps it is easier to see what makes it more difficult to separate sources. Can you still separate different voices with only one ear? Perhaps, but it is more difficult. What if you close your eyes? The loss of visual signals will definitely make it much harder.

As it turns out, the key to source separation is diversity. To separate different sources, it is important to have different receptions of the signal. This can be done in different ways by exploiting the characteristics of the signal and its propagation medium.
▲ Spatial Diversity. Using sensor arrays is an effective way to gain spatial diversity. You can see the antenna array on the cell-phone tower along major highways. (Imagine how much more you could hear at the cocktail party if you had an array of ears!) Recently, the so-called smart antenna technology has attracted considerable interest from both academia and industry (see a recent survey by Paulraj and Papadias [306]).
▲ Spectral Diversity. Spectral diversity can be obtained in many ways. To counter frequency-selective fading caused by echoes, the transmission band can be divided into a number of smaller frequency bands through which modulated signals of different power are transmitted. This kind of multicarrier transmission is the basis for the OFDM (orthogonal frequency division multiplexing) system for digital audio broadcasting (DAB). Frequency-hopping (FH) spread spectrum [374] is another way of achieving spectral diversity. Invented in the 1940s, this technique avoids hostile interference by changing the transmission band in a way known to the receiver but unpredictable to the jammer.
▲ Temporal Diversity. Diversity can also be achieved in the time domain by transmitting multiple copies of the signal at different times or, in a more intelligent way, by transmitting the signal in some special waveform known to the receiver. A direct-sequence spread-spectrum signal gains diversity by modulating the source with a special code sequence. When the received signal is sampled at a rate higher than the bit rate, we effectively obtain the transmitted symbol at a different time, which is analogous to spatial diversity where the transmitted signal is obtained at different locations. Different users can be separated if they use different codes. The exploitation of temporal correlation of the signal waveform provides a crucial property used in many signal-processing algorithms. See again [306] for discussions about space-time processing.
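The remark that different users can be separated if they use different codes can be sketched with two synchronous direct-sequence users sharing one channel. The example is deliberately idealized and all of its specifics are assumptions: length-4 orthogonal codes, perfectly aligned chips, bits coded as +1/-1, and no noise or multipath.

```python
def spread(bits, code):
    """Replace each +/-1 bit by the bit multiplied by the chip sequence."""
    return [b * c for b in bits for c in code]

def despread(received, code):
    """Correlate each chip block with the code and decide on the sign."""
    n = len(code)
    decisions = []
    for start in range(0, len(received), n):
        block = received[start:start + n]
        corr = sum(r * c for r, c in zip(block, code))
        decisions.append(1 if corr > 0 else -1)
    return decisions

code_a = [1, 1, -1, -1]      # the two codes are orthogonal: their inner product is 0
code_b = [1, -1, 1, -1]
bits_a = [1, -1, -1, 1]
bits_b = [-1, -1, 1, 1]

# Both users transmit at the same time in the same band.
channel = [a + b for a, b in zip(spread(bits_a, code_a), spread(bits_b, code_b))]
recovered_a = despread(channel, code_a)
recovered_b = despread(channel, code_b)
print(recovered_a == bits_a, recovered_b == bits_b)
```

Sampling at the chip rate gives several observations per bit, which plays the role that multiple sensors play in spatial diversity.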

.'-

. Sl(t)

-

. . .EB+ n1(t) ·

Channel

Xl(t) ~

,---'-- -'--'-,

H

<,.,'

~(t)

Statistical Signal Processing for Source Separation

Estimating signals under various diversity receptions is currently an active research area. A general model for source separation is shown in Fig. 8, where sources {s_i(t)} are propagated through a channel H, contaminated by noise {S_j(t)}, and received with diversity as {x_j(t)}. The goal of source separation is to estimate one or all of the source signals from {x_j(t)}. Classical approaches to source separation are based on either knowledge of the channel or access to the channel input, so that the channel (or its inverse) can be estimated by sending "training" signals. Depending on the application, different criteria (such as minimizing the detection error probability or minimizing the mean-square error) can be used to design the signal estimator. In recent years, there has been considerable interest in the so-called blind-source-separation problem. Here, neither is the channel known a priori, nor is it possible to access the channel input so that training can be performed (recall again the cocktail-party problem). The merit of blind signal separation is twofold. First, there are cases when channel estimation by training is very difficult. Second, the transmission of training inevitably reduces the transmission rate of information. Such reduction can be significant when training has to be performed repeatedly. The key to blind signal separation is to exploit qualitative information about the structure of the channel and characteristics of the input sources. For example, source signals in communications can often take values only from a finite alphabet, which enables the separation of multiple sources. The statistical independence among sources is another condition that leads to a number of effective source separation algorithms [50], among them the widely applied constant modulus algorithm (CMA) [143, 416]. In a special issue of IEEE Proceedings [242] to appear later this year, tutorials of blind-source-separation techniques and their applications are presented. For related topics on this growing field of research, readers are also referred to a special issue (edited by G. Giannakis and G. Xu) on signal processing for advanced communication [127] and the recent book by Poor and Wornell [313].

Fig. 8. Source separation with receiver diversity.

SSAP with Computational Acoustic and Electromagnetic Propagation Models
Jeffrey Krolik, Duke University

Statistical signal and array processing of signals carried by propagating waves has traditionally been developed assuming simple plane-wave acoustic or electromagnetic propagation models. This is despite the fact that in many problems associated with sonar, radar, wireless communications, and geophysics, complex coherent multipath propagation between the source and receiver is a dominant feature. The historical focus on signal processing using plane-wave models has been a result of: 1) their analytic simplicity, 2) elegant analogies between familiar time-domain filtering/spectral analysis and plane-wave beamforming/field-directionality mapping, and 3) the fact that accurate numerical models for complex multipath propagation were too computationally intensive for signal-processing applications. Although difficulties with plane-wave approximations in coherent multipath environments are often dealt with by a variety of mitigation techniques, the performance of such methods is inevitably upper bounded by the case where multipath is absent. The notion that, instead of trying to undo the effects of coherent multipath, one could actually exploit them to achieve dramatically improved performance with the assistance of a numerical propagation model is the essence of what is now known as matched-field processing (MFP). The availability of inexpensive, high-power computing to rapidly calculate numerical solutions of the wave equation is what has really driven the development of MFP methods over the last decade.

Fig. 9. A typical scenario for passive sonar matched-field processing. Labels in the figure: water-column sound-speed profile; source at (rs, zs); sediment with density 1.75 g/cm^3 and attenuation 0.13 dB/λ; sound speeds of 1580 m/s and 1600 m/s; sub-bottom with density 1.75 g/cm^3 and attenuation 0.15 dB/λ; depth axis z.

Matched-field processing was first developed as a simple generalization of narrowband plane-wave beamforming wherein conventional array weights based on plane-wave "steering vectors" were replaced by "replica vectors" derived from the full-field solution of the wave equation in a ducted channel. Historically, these techniques have been applied to the problem of underwater passive source localization [10]. A common set-up for passive sonar MFP is illustrated in Fig. 9, where acoustic signals in a shallow-water waveguide are received at a vertical array of hydrophones from a distant source in the presence of interference from surface shipping plus diffuse ambient noise. The objective is to detect and localize the source in range and depth. The medium is described by a sound-speed profile within the water column together with the bathymetry and geoacoustic properties of the bottom. Because distant signals have numerous interactions with the ocean boundaries, single-path plane-wave propagation is clearly not an appropriate model here. However, given sufficiently accurate environmental information, the coherent sum of multipaths between source and receivers for different hypothesized source locations can be predicted by numerical solution of the wave equation and its boundary conditions. The most basic MFP approach, known as Bartlett matched-field beamforming, is then to simply correlate each of these replicas with the field measured at the array.
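As a rough illustration of Bartlett matched-field beamforming, the sketch below correlates a set of replica vectors with a sample covariance matrix and picks the grid point with the largest output power. It is a minimal sketch under stated assumptions, not the processing used for the results discussed here: the function name `bartlett_surface` is hypothetical, and the replicas are random stand-ins for the wave-equation solutions that a real propagation model would supply.

```python
import numpy as np

def bartlett_surface(R, replicas):
    """Bartlett matched-field power for each hypothesized source location.

    R        : (N, N) Hermitian sample covariance of the array data.
    replicas : (G, N) complex replica vectors, one row per hypothesized
               range-depth point (assumed precomputed by a propagation model).
    """
    # Normalize each replica so the surface compares field shapes, not magnitudes.
    w = replicas / np.linalg.norm(replicas, axis=1, keepdims=True)
    # Output power w^H R w for every grid point at once.
    return np.real(np.einsum("gi,ij,gj->g", w.conj(), R, w))

# Toy data: made-up replicas standing in for numerical wave-equation solutions.
rng = np.random.default_rng(0)
G, N = 40, 16                                  # grid points, sensors (arbitrary)
replicas = rng.standard_normal((G, N)) + 1j * rng.standard_normal((G, N))
true_idx = 7                                   # hypothetical true grid point
a = replicas[true_idx]
R = np.outer(a, a.conj()) + 0.1 * np.eye(N)    # one source plus white noise
surface = bartlett_surface(R, replicas)
estimate = int(np.argmax(surface))             # grid index with maximum power
```

In practice the rows of `replicas` would be indexed by hypothesized range and depth, so that `surface` reshaped onto that grid gives the output power as a function of source position.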

The estimated source location is then the hypothesized range and depth which maximizes the power at the output of the beamformer. The so-called ambiguity surface of the Bartlett beamformer is its output power versus hypothesized range and depth. A typical Bartlett ambiguity surface obtained using actual Mediterranean data from a 48-sensor vertical array is shown in Fig. 10. Note that there is a local maximum of this ambiguity surface near the true source location at a range of 5900 m and depth of 70 m. This data set was collected by the NATO SACLANT Centre [139] and is currently available to the public on the IEEE SP database web-site at Rice University (http://spib.rice.edu). Two major difficulties facing matched-field processors are: 1) high sidelobe levels, as seen in Fig. 10, which result in ambiguous source location estimates, and 2) the sensitivity of these methods to errors in the assumed environmental conditions.

In order to suppress ambiguous sidelobes, several variations of minimum variance (MV) adaptive beamforming have been proposed [78, 215, 360]. The basic MV matched-field beamformer selects weights for each hypothesized source location that minimize output power subject to the constraint of unity gain for signals emanating from the desired range-depth point under the assumed environmental model. MV matched-field beamformers generally do provide lower sidelobe levels than the Bartlett processor, but often this comes at the price of even greater sensitivity to mismatch between the assumed and actual environmental conditions. To achieve matched-field localization performance that is robust to environmental mismatch, a number of approaches have been proposed, not only in the form of beamforming methods, but also by jointly estimating the source location and environmental parameters [73, 332]. One robust beamforming approach that has been demonstrated to provide lower sidelobes and higher probability of correct source localization in the presence of environmental uncertainty is the minimum variance beamformer with environmental perturbation constraints (MV-EPC) [215, 216]. Essentially, the MV-EPC method minimizes output power subject to a set of linear constraints designed to maintain array gain for signals emanating from a hypothesized range-depth point but over an ensemble of perturbed ch

Fig. 10. A Bartlett matched-field ambiguity surface for SACLANT Mediterranean data.

Fig. 11. An MV-EPC matched-field ambiguity surface for SACLANT Mediterranean data.
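The basic MV matched-field beamformer (minimum output power subject to unity gain at the hypothesized range-depth point) has the familiar closed-form solution w = R^-1 a / (a^H R^-1 a), with output power 1 / (a^H R^-1 a). The sketch below evaluates it over a grid of replicas. It is only an illustration: the function name `mv_surface`, the diagonal loading term, and the random replicas are assumptions introduced here, and it does not implement the MV-EPC constraints of [215, 216].

```python
import numpy as np

def mv_surface(R, replicas, loading=1e-3):
    """Minimum variance matched-field power and weights for each grid point.

    R        : (N, N) Hermitian sample covariance of the array data.
    replicas : (G, N) replica vectors from an assumed propagation model.
    loading  : diagonal loading relative to the mean sensor power
               (an added assumption to keep the covariance well conditioned).
    """
    N = R.shape[0]
    a = replicas / np.linalg.norm(replicas, axis=1, keepdims=True)
    R_loaded = R + loading * np.real(np.trace(R)) / N * np.eye(N)
    Rinv_a = np.linalg.solve(R_loaded, a.T)               # columns R^-1 a_g
    quad = np.real(np.sum(a.conj().T * Rinv_a, axis=0))   # a_g^H R^-1 a_g
    weights = (Rinv_a / quad).T                           # rows w_g, w_g^H a_g = 1
    return 1.0 / quad, weights                            # power under loaded R

# Toy data: random replicas standing in for wave-equation solutions.
rng = np.random.default_rng(0)
G, N = 40, 16
replicas = rng.standard_normal((G, N)) + 1j * rng.standard_normal((G, N))
true_idx = 7                                              # hypothetical source point
a_true = replicas[true_idx] / np.linalg.norm(replicas[true_idx])
R = 4.0 * np.outer(a_true, a_true.conj()) + 0.1 * np.eye(N)
power, weights = mv_surface(R, replicas)
```

Because the weights depend on the inverse covariance, small errors in the replicas can cause the desired signal itself to be suppressed, which is one way to see the mismatch sensitivity noted above.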

able integration times. Those interested in pursuing these problems can find much of the necessary software and data at NJIT's Ocean Acoustics Library web-site (http://oalib.njit.edu).

Beyond underwater acoustic applications, matched-field techniques are also currently being explored for problems involving multipath electromagnetic propagation. Although this work is still in its early stages, electromagnetic matched-field processing (EM-MFP) has already been proposed in several applications including: 1) aircraft height-finding using a low-angle microwave radar that can exploit multipath due to the target's direct-path and specular reflection off the ground [182]; 2) inversion of tropospheric refractivity parameters that characterize ducted radio-wave propagation conditions over the sea surface using point-to-point microwave transmissions [138]; and 3) target localization in high-frequency skywave over-the-horizon (OTH) radar [217, 301].

One application where EM-MFP provides an existing radar with an entirely new capability is that of aircraft altitude estimation for OTH radar. Over-the-horizon radars use the refractive properties of the ionosphere for wide-area surveillance of targets at megameter ranges. Although target localization in latitude and longitude is typically achieved by tracing the paths of rays refracted through the ionosphere, determination of target altitude has never been reliably achieved. In recent work, however, a form of EM-MFP has been developed that uses an ionospheric propagation model to predict the signal in complex delay-Doppler space due to unresolved multipaths from ground reflections local to the aircraft [301]. These predictions are essentially correlated with complex delay-Doppler data in an ML altitude estimation method that uses multiple radar dwells on the aircraft target, as illustrated in Fig. 12. The sub-plots in this figure represent consecutive delay-Doppler surfaces from an OTH radar. Note that the dominant vertical band in each sub-plot is due to ground clutter and the peak encircled in each plot is a small twin-engine aircraft approximately 2,500 km away from the radar. Matched-field altitude estimation (MFAE) exploits the fact that the complex fading characteristic of the target peak is highly dependent on aircraft altitude. In MFAE, a time-evolving log-likelihood function of aircraft altitude is updated with each radar revisit of the target. An example of this function is shown in Fig. 13 for this real-data example. Observe that the matched-field estimate of altitude is remarkably close to its true height of 5,000 ft.

Fig. 12. Log-amplitude delay (a.k.a. slant range) versus Doppler frequency surfaces for multiple radar dwells.

Fig. 13. Time-evolving log-likelihood of aircraft altitude for a small twin-engine plane at an altitude of 5,000 feet and range of 2,500 kilometers.

In summary, statistical signal and array processing with computational propagation models permits the exploitation of complex multipath conditions to achieve significantly enhanced performance over traditional methods. Such approaches involve a tight coupling between the physics of wave propagation and signal processing. They also rely on the availability of sufficiently accurate estimates of the environmental parameters. Numerous results obtained with real data in very different settings, however, suggest that "sufficiently accurate" should by no means be interpreted as "perfect knowledge" of the environment. Indeed, in some situations, robust signal processing methods have been developed that facilitate matched-field processing with almost "common knowledge" of the environment. Clearly, the further integration of computational wave-equation solutions and signal-processing techniques will pose many challenges and rewards in the future.

A WWW link to the author of the above section: http://www.ee.duke.edu/people/ik.html
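The time-evolving log-likelihood used in matched-field altitude estimation can be pictured as a running sum over radar dwells. The sketch below is an illustrative stand-in, not the method of [301]: it assumes each dwell equals a model prediction scaled by an unknown complex amplitude plus white Gaussian noise, in which case the altitude-dependent part of the log-likelihood is the normalized matched-filter output. The function name, the noise model, and the random "predictions" are all assumptions made here for illustration.

```python
import numpy as np

def update_log_likelihood(loglik, dwell, predictions, noise_var=1.0):
    """Add one radar dwell to the running log-likelihood over altitude.

    loglik      : (H,) accumulated log-likelihood, one value per candidate altitude.
    dwell       : (M,) complex delay-Doppler samples around the target peak.
    predictions : (H, M) model-predicted samples for each candidate altitude
                  (assumed to come from a propagation model).
    noise_var   : assumed white-noise variance.
    """
    norms = np.sum(np.abs(predictions) ** 2, axis=1)
    # With the unknown complex amplitude maximized out, the altitude-dependent
    # term is |p^H x|^2 / (||p||^2 * noise_var).
    matched = np.abs(predictions.conj() @ dwell) ** 2 / norms
    return loglik + matched / noise_var

# Toy data: random predictions standing in for ionospheric model output.
rng = np.random.default_rng(1)
H, M, n_dwells = 25, 12, 6          # altitudes, samples per dwell, revisits
true_alt = 9                        # hypothetical index of the true altitude
loglik = np.zeros(H)
for _ in range(n_dwells):
    predictions = rng.standard_normal((H, M)) + 1j * rng.standard_normal((H, M))
    amplitude = rng.standard_normal() + 1j * rng.standard_normal()
    dwell = amplitude * predictions[true_alt]        # noise-free for clarity
    loglik = update_log_likelihood(loglik, dwell, predictions)
estimate = int(np.argmax(loglik))   # altitude index with the largest likelihood
```

Accumulating across dwells is what lets the dwell-to-dwell fading pattern discriminate between altitudes that any single dwell could not separate.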

References

1. A. Arneodo, Y. D. Carafa, R. Audit, E. Bacry, J. Muzy, and C. Thermes, "What can we learn with wavelets about DNA sequences," Physica A, vol. 249, pp. 439-448, 1998.
2. P. Abry and P. Flandrin, "Point processes, LRD and wavelets," in Wavelets in Biology and Medicine, A. Aldroubi and M. Unser, editors, CRC Press, 1996.
3. B. G. Agee, S. V. Schell, and W. A. Gardner, "Spectral self-coherence restoral: a new approach to blind adaptive signal extraction using antenna arrays," Proceedings of the IEEE, vol. 78, pp. 753-767, 1990.
4. M. Amin, "Time-frequency spectrum analysis and estimation for nonstationary random processes," in Time-Frequency Signal Analysis: Methods and Applications, B. Boashash, editor, pp. 208-232, Wiley Halsted Press, Australia, 1992.
5. M. Amin, "Minimum variance time-frequency distribution kernels for signal in additive noise," IEEE Transactions on Signal Processing, vol. 44, pp. 2352-2356, 1996.
6. D. Andrews and A. Herzberg, Data: A collection of problems from many fields for the student and research worker, Springer-Verlag, 1985.
7. C. Andrieu, A. Doucet, and P. Duvaut, "Joint Bayesian detection and estimation of sinusoids embedded in noise," in ICASSP, pp. 2245-2248, 1998.
8. C. Andrieu, A. Doucet, W. J. Fitzgerald, and S. J. Godsill, "An introduction to the theory and applications of simulation based computational methods in signal processing," Technical Report CUED/F-INFENG/TR. 324, University of Cambridge, UK, 1998.
9. J. Bae and I. Song, "Rank-based detection of weak random signals in a multiplicative noise model," Signal Processing, vol. 63, pp. 121-131, 1997.
10. A. Baggeroer, W. Kuperman, and P. N. Mikhalevsky, "An overview of matched-field methods in ocean acoustics," IEEE Journal of Oceanic Engineering, vol. 18, no. 4, pp. 401-424, 1993.
11. J. Q. Bao and L. Tong, "Applications of blind equalization in wireless ATM networks," in Proc. ICASSP'98, 1998.
12. R. Baraniuk and D. Jones, "A signal dependent time-frequency representation: Optimal kernel design," IEEE Transactions on Signal Processing, vol. 41, pp. 1589-1602, 1993.
13. S. Barbarossa and A. Farina, "Space-time-frequency processing of synthetic aperture radar signals," IEEE Transactions on Aerospace Electr. Syst., vol. 30, pp. 341-358, 1994.
14. S. Barbarossa, A. Scaglione, A. Baiocchi, and G. Colletti, "Modeling network traffic data by doubly stochastic point processes with self-similar intensity process and fractal renewal point process," in Proc. Thirty-First Asilomar Conf. on Signals, Systems and Computers, pp. 1112-16, Monterey, CA, Nov 1997.
15. G. Barnett, R. Kohn, and S. Sheather, "Bayesian estimation of an autoregressive model using Markov chain Monte Carlo," Journal of Econometrics, vol. 74, pp. 237-254, 1996.
16. P. Barone and R. Ragona, "Bayesian estimation of parameters of a damped sinusoidal model by a Markov chain Monte Carlo method," IEEE Transactions on Signal Processing, vol. 45, pp. 1806-1814, 1997.
17. D. L. Bartholomew and J. A. Tague, "Quadratic power spectrum estimation with orthogonal frequency division multiple windows," IEEE Transactions on Signal Processing, vol. 43, pp. 1279-1282, 1995.
18. A. Bartov and H. Messer, "Lower bound on the achievable DSP performance for localizing step-like continuous signals in noise," IEEE Transactions on Signal Processing, vol. SP-46, 1998.
19. M. Basseville, M. Benveniste, A. Chou, K. Golden, R. Nikoukhah, and A. S. Willsky, "Modeling and estimation of multiresolution stochastic processes," IEEE Trans. on IT, vol. IT-38, pp. 529-532, Mar. 1992.
20. B. Baygun and A. O. Hero, "Optimal simultaneous detection and estimation under a false alarm constraint," IEEE Trans. on Inform. Theory, vol. 41, no. 3, pp. 688-703, 1995.
21. B. Baygun and A. O. Hero, "An iterative solution to the min-max simultaneous detection and estimation problem," in Proc. of the IEEE Workshop on Statistical Signal and Array Processing, pp. 8-11, Corfu, Greece, June 1996.
22. E. Beadle and P. M. Djuric, "Parameter estimation for non-Gaussian autoregressive processes," in ICASSP, pp. 3557-3560, 1997.
23. A. Belouchrani and M. Amin, "Blind source separation based on time-frequency signal representations," IEEE Transactions on Signal Processing. Submitted.
24. S. Benedetto, E. Biglieri, and V. Castellani, Digital Transmission Theory, Prentice-Hall Inc., New Jersey, 1987.
25. S. E. Bensley and B. Aazhang, "Subspace based channel estimation for CDMA communication systems," IEEE Trans. on Communications, pp. 1009-1020, August 1996.
26. A. Benveniste, M. Goursat, and G. Ruget, "Blind equalizers," IEEE Trans. on Communications, pp. 871-882, August 1982.
27. M. J. Berliner and J. F. L. (Eds.), Acoustic Particle Velocity Sensors: Design, Performance and Applications, AIP Press, Woodbury, NY, 1996.
28. J. M. Bernardo and A. F. M. Smith, Bayesian Theory, John Wiley, New York, 1994.
29. O. Besson and P. Stoica, "Sinusoidal signals with random amplitudes: least-squares estimators and their statistical analysis," IEEE Transactions on Signal Processing, vol. 43, no. 11, pp. 2733-44, Nov 1995.
30. T. G. Birdsall and J. O. Gobien, "Sufficient statistics and reproducing densities in simultaneous sequential detection and estimation," IEEE Trans. Inform. Theory, vol. 19, pp. 760-768, 1973.
31. R. Blum, "Asymptotically robust detection for known signals in contaminated multiplicative noise," IEEE Transactions on Signal Processing, vol. 38, pp. 197-210, 1994.
32. B. Boashash, "Time-frequency signal analysis," in Advances in Spectrum Analysis and Array Processing, S. Haykin, editor, pp. 418-517, Prentice-Hall, 1990.
33. B. Boashash and P. O'Shea, "Polynomial Wigner-Ville distributions and their relationship to time-varying higher order spectra," IEEE Transactions on Signal Processing, vol. 42, 1995.
34. N. Bose and C. Rao, editors, Handbook of Statistics, Volume 10, Signal Processing and its Applications, Elsevier Science Publishers, Amsterdam, 1993.
35. S. Bose and A. O. Steinhardt, "Adaptive array detection of uncertain rank one waveforms," IEEE Transactions on Signal Processing, vol. 44, no. 11, pp. 2801-2809, Nov. 1996.
36. S. Bose and A. O. Steinhardt, "A maximal invariant framework for adaptive detection with structured and unstructured covariance matrices," IEEE Transactions on Signal Processing, vol. 43, no. 9, pp. 2164-2175, Sept. 1995.
37. G. F. Boudreaux and T. Parks, "Time-varying filtering and signal estimation using Wigner distribution synthesis techniques," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 34, pp. 442-451, 1986.
38. C. Bouman and K. Sauer, "Fast numerical methods for emission and transmission tomographic reconstruction," in Proc. Conf. on Inform. Sciences and Systems, Johns Hopkins, 1993.
39. M. Bouvet and S. C. Schwartz, "Comparison of adaptive and robust receivers for signal detection in ambient underwater noise," IEEE Trans. Acoustics, Speech, and Signal Processing, vol. ASSP-37, no. 5, pp. 621-626, 1989.
40. A. Bovik, "On detecting edges in speckle imagery," IEEE Trans. ASSP, vol. 36, no. 10, pp. 1618-27, Oct 1988.
41. G. E. P. Box and G. C. Tiao, Bayesian Inference, John Wiley, New York, 1973.
42. D. R. Brillinger, Time Series: Data Analysis and Theory, Springer-Verlag, New York, 1981.
43. D. Brillinger, "An introduction to polyspectra," Annals Math. Statistics, vol. 36, pp. 1351-1374, 1965.
44. D. Brillinger, "Comparative aspects of the study of ordinary time series and point processes," Developments in Statistics, vol. 1, pp. 33-133, 1978.
45. P. L. Brockett, M. Hinich, and G. R. Wilson, "Nonlinear and non-Gaussian ocean noise," J. Acoustical Society of America, vol. 82, pp. 1386-1394, 1987.
46. T. P. Bronez, "Spectral estimation of irregularly sampled multidimensional processes by generalized prolate spheroidal sequences," IEEE Transactions on Signal Processing, vol. 36, pp. 862-873, 1988.
47. T. P. Bronez, "On the performance advantage of multitaper spectral analysis," IEEE Transactions on Signal Processing, vol. 40, pp. 2941-2946, 1992.
48. K. Burgess and B. D. Van Veen, "Subspace-based adaptive generalized likelihood ratio detection," IEEE Transactions on Signal Processing, vol. 44, no. 4, pp. 912-927, 1996.
49. S. Cambanis and C. Houdre, "On the continuous wavelet transform of second order random processes," IEEE Trans. on IT, vol. 41, no. 3, pp. 628-633, May 1995.
50. J.-F. Cardoso, "Blind signal separation: statistical principles," Proc. of the IEEE, Special issue on blind identification and estimation, 1998. To appear.
51. J.-F. Cardoso, M. Lavielle, and E. Moulines, "Un algorithme d'identification par maximum de vraisemblance pour des données incomplètes," Comptes Rendus de l'Académie des Sciences, Séries I, vol. 320, no. 3, pp. 363-368, 1995.
52. B. P. Carlin, N. G. Polson, and D. S. Stoffer, "A Monte Carlo approach to nonnormal and nonlinear state-space modeling," Journal of the American Statistical Association, vol. 87, pp. 493-500, 1992.
53. C. K. Carter and R. Kohn, "Semiparametric Bayesian inference for time series with mixed spectra," Journal of the Royal Statistical Society B, vol. 59, pp. 255-268, 1997.
54. G. Carter, editor, Special Issue on Time Delay Estimation, volume ASSP-29, IEEE Transactions on Acoustics, Speech, and Signal Processing, 1981.
55. G. Carter, editor, Coherence and Time Delay Estimation - An Applied Tutorial for Research, Development, Test, and Evaluation Engineers, IEEE Press, Piscataway, NJ, 1993.
56. E. D. Carvalho and D. T. M. Slock, "Cramer-Rao bounds for semi-blind and training sequence based channel estimation," in Proc. 1st IEEE Signal Proc. Workshop on Wireless Comm., pp. 129-132, 1997.
57. J. Y. Chen and I. S. Reed, "A detection algorithm for optical targets in clutter," IEEE Trans. on Aerosp. Electron. and Systems, vol. AES-23, no. 1, pp. 46-59, 1987.
58. R. Chen and T.-H. Li, "Blind restoration of linearly degraded discrete signals by Gibbs sampling," IEEE Transactions on Signal Processing, vol. 43, pp. 2410-2413, 1995.
59. J. Cheung and L. Kurz, "A generalized M-interval partition detector with applications to signal detection in impulsive noise," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 41, no. 1, pp. 213-221, 1993.
60. A. Chevreuil and P. Loubaton, "Blind second-order identification of FIR channels: Forced cyclo-stationarity and structured subspace method," IEEE Signal Processing Letters, vol. 4, no. 7, pp. 204-206, July 1997.
61. C.-M. Cho and P. Djuric, "Bayesian detection and estimation of cisoids in colored noise," IEEE Transactions on Signal Processing, vol. 43, no. 12, pp. 2943-2952, 1995.
62. H. Choi and W. Williams, "Improved time-frequency representation of multicomponent signals using exponential kernels," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 37, pp. 862-871, 1989.
63. S. Chretien and A. Hero, "Acceleration of the EM algorithm via proximal point iterations," in Proc. of IEEE Symposium on Information Theory, MIT, Cambridge, August 1998.
64. S. Chretien and A. Hero, "Generalized proximal point algorithms and bundle implementations," Technical Report CSPL-314, Comm. and Sig. Proc. Lab. (CSPL), Dept. EECS, University of Michigan, Ann Arbor, Mar. 1998.
65. P.-H. Chua, C.-M. S. See, and A. Nehorai, "Vector-sensor array processing for estimating angles and times of arrival of multipath communication signals," in Proc. IEEE Intl Conf. on Acoust., Speech, and Sig. Proc. (ICASSP98), pp. 3325-3328, Seattle, WA, May 1998.
66. J. M. Cioffi, G. P. Dudevoir, M. V. Eyuboglu, and G. D. Forney, "MMSE decision feedback equalization and coding - parts i and ii," IEEE Trans. on Communications, vol. 43, no. 10, pp. 2582-2604, Oct. 1995.
67. T. Claasen and W. Mecklenbräuker, "The Wigner distribution - a tool for time-frequency signal analysis - part I: Continuous-time signals, - part II: Discrete time signals, - part III: Relations with other time-frequency signal transformations," Philips J. Research, vol. 35, 1980.
68. D. Cochran, H. Gish, and D. Sinno, "A geometric approach to multiple-channel signal detection," IEEE Transactions on Signal Processing, vol. 43, no. 9, pp. 2049-2057, 1995.
69. L. Cohen, "Generalized phase space distribution functions," Journal of Math. Physics, vol. 7, pp. 781-786, 1966.
70. L. Cohen, "Time-frequency distributions - a review," Proceedings of the IEEE, vol. 77, pp. 941-981, 1989.
72. R. R. Coifman and M. V. Wickerhauser, "Entropy-based algorithms for best basis selection," IEEE Trans. on IT, vol. IT-38, pp. 713-718, Mar. 1992.
73. M. Collins and W. Kuperman, "Focalization: Environmental focusing and source localization," Journal of the Acoustical Society of America, vol. 90, no. 3, pp. 1410-1422, 1991.
74. P. Comon, "Independent component analysis, a new concept?," Signal Processing, Elsevier, vol. 36, no. 3, pp. 287-314, 1994.
75. P. Comon and J. F. Cardoso, "Eigenvalue decomposition of a tensor with applications," in Proc. SPIE Conf. on Advanced Signal Processing Algorithms, Architectures and Implementations, pp. 361-372, San Diego, CA, July 1990.
76. P. Comon and B. Mourrain, "Decomposition of quantics in powers of linear forms," Signal Processing, vol. 53, no. 1-3, pp. 93-108, Sep 1996.
77. R. T. Compton, Adaptive Antennas - Concepts and Performance, Prentice Hall, Inc., Englewood Cliffs, NJ, 1988.
78. H. Cox, R. Zeskind, and M. Myers, "A subarray approach to matched-field processing," Journal of the Acoustical Society of America, vol. 87, no. 3, pp. 1158-1166, 1990.
79. M. S. Crouse, R. D. Nowak, and R. G. Baraniuk, "Signal estimation using wavelet-Markov models," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing-ICASSP '97, Munich, Germany, 1997.
80. Z. Cvetkovic, "Short-time Fourier analysis - a novel window design procedure," in Proc. ICASSP, volume 3, pp. 1773-1776, 1998.
81. D. Dahlhaus, A. Jarosch, B. Fleury, and R. Heddergott, "Joint demodulation in DS/CDMA systems exploiting the space and time diversity of the mobile radio channel," in Proc. IEEE Int. Symp. on Personal Indoor and Mobile Radio Communications, pp. 47-52, Helsinki, Finland, 1997.
82. A. Dandawate and G. Giannakis, "Statistical tests for presence of cyclostationarity," IEEE Transactions on Signal Processing, vol. 42, no. 9, pp. 2355-2369, 1994.
83. M. Daniel and A. Willsky, "A multiresolution methodology for signal-level fusion and data assimilation with applications to remote sensing," IEEE Proceedings, vol. 85, no. 1, 1997.
84. K. Daoudi, "Multifractal representation of turbulence signals: A wavelet based approach," in INRIA International Wavelets Conference, INRIA, editor, Tanger, Morocco, April 1998.
85. I. Daubechies, "The wavelet transform, time-frequency localization, and signal analysis," IEEE Trans. on IT, vol. 36, pp. 961-1005, 1990.
86. I. Daubechies, Ten Lectures on Wavelets, CBMS-NSF, SIAM, Philadelphia, 1992.
87. R. Davis and S. Resnick, "Limit theory for bilinear processes in heavy-tailed noise," Annals of Probability, vol. 6, no. 4, pp. 1191-1210, 1996.
88. De Moor B.L.R. (ed.), DAISY: Database for the Identification of Systems, http://www.esat.kuleuven.ac.be/sista/daisy, October 1997.
89. A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. Royal Statistical Society, Ser. B, vol. 39, no. 1, pp. 1-38, 1977.
90. Z. Ding, "Adaptive filters for blind equalization," in Digital Signal Processing Handbook, V. K. Madisetti and D. Williams, editors, chapter 24, CRC Press, 1998.
91. P. M. Djuric, S. J. Godsill, W. J. Fitzgerald, and P. J. W. Rayner, "Detection and estimation of signals by reversible jump Markov chain Monte Carlo computations," in ICASSP, pp. 2269-2272, 1998.
92. P. M. Djuric and H.-T. Li, "Bayesian spectrum estimation," Signal Processing Letters, vol. 2, pp. 213-215, 1995.
93. A. Dogandzic and A. Nehorai, "Estimating evoked dipole responses by MEG/EEG for unknown noise covariance," in Proc. 19th Ann. Intl Conf. IEEE Eng. Med. Biol. Soc., pp. 1224-1227, Chicago, IL, October 1997.
94. D. L. Donoho and I. M. Johnstone, "Ideal spatial adaptation by wavelet shrinkage," preprint Dept. Stat., Stanford Univ., Jun. 1992.
95. D. Donoho and I. Johnstone, "Adapting to unknown smoothness via wavelet shrinkage," JASA, vol. 90, pp. 1200-1223, Dec. 1995.
96. A. Doucet and P. Duvaut, "Bayesian estimation of state space models applied to deconvolution of Bernoulli-Gaussian processes," Signal Processing, vol. 57, pp. 147-161, 1997.
97. G. D'Spain and W. S. Hodgkiss, "The simultaneous measurement of infrasonic acoustic particle velocity and acoustic pressure in the ocean by freely drifting swallow floats," IEEE J. Oceanic Eng., vol. 16, pp. 195-207, April 1991.
98. A. Duel-Hallen, "Decorrelating decision-feedback multiuser detector for synchronous CDMA channel," IEEE Trans. on Communications, pp. 285-290, February 1993.
99. R. F. Dwyer, "Fourth-order spectra of Gaussian amplitude-modulated sinusoids," J. Acoustical Society of America, vol. 90, no. 2, pp. 918-926, August 1991.
100. M. H. El Ayadi, "Generalized likelihood adaptive detection of signals deformed by unknown linear filtering in noise with slowly fluctuating power," IEEE Trans. Acoustics, Speech, and Signal Processing, vol. ASSP-33, no. 2, pp. 401-405, 1985.
101. M. H. El Ayadi, "NAR estimators of spatial covariance matrices for adaptive array detection," IEEE Transactions on Signal Processing, vol. 39, no. 7, pp. 1682-1690, 1991.
102. M. H. El Ayadi and B. Picinbono, "NAR AGC adaptive detection of nonoverlapping signals in noise with fluctuating power," IEEE Trans. Acoustics, Speech, and Signal Processing, vol. ASSP-29, no. 5, pp. 952-962, 1981.
103. European Telecommunications Standards Institute, editor, European Telecommunications Standard, Radio broadcast systems: Digital audio broadcasting (DAB) to mobile, portable, and fixed receivers, Standard ETS 300 401, 1994.
104. A. Farina, Antenna Based Signal Processing Techniques for Radar Systems, Artech House, Norwood, MA, 1992.
105. K. Fazel and G. Fettweis, Multi-Carrier Spread Spectrum, Kluwer Academic Publ., 1997.
106. M. Feder and E. Weinstein, "Optimal multiple source location estimation via the EM algorithm," in Proc. IEEE Int. Conf. Acoust., Speech, and Sig. Proc., pp. 1762-1765, 1985.
107. M. Feder and E. Weinstein, "Parameter estimation of superimposed signals using the EM algorithm," IEEE Trans. Acoust., Speech, and Sig. Proc., vol. 36, no. 4, pp. 477-489, April 1988.
108. W. Feller, An Introduction to Probability Theory and Its Applications - Volume II, John Wiley and Sons, second edition, 1971.
109. T. S. Ferguson, Mathematical Statistics - A Decision Theoretic Approach, Academic Press, Orlando FL, 1967.
110. J. A. Fessler and A. O. Hero, "Space-alternating generalized EM algorithm," IEEE Transactions on Signal Processing, vol. SP-42, no. 10, pp. 2664-2677, Oct. 1994.
111. J. A. Fessler and A. O. Hero, "Penalized maximum-likelihood image reconstruction using space alternating generalized EM algorithm," IEEE Trans. on Image Processing, vol. IP-4, no. 10, pp. 1417-1429, Oct. 1995.
112. P. Fieguth, W. Karl, A. Willsky, and C. Wunsch, "Multiresolution optimal interpolation and statistical analysis of TOPEX/POSEIDON satellite altimetry," IEEE Trans. on Geoscience and Remote Sensing, vol. 33, no. 2, pp. 280-292, March 1995.
113. P. Flandrin, "A time-frequency formulation of optimum detectors," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 36, pp. 1377-1384, 1988.
114. P. Flandrin, "On the spectrum of fractional Brownian motion," IEEE Trans. on IT, vol. IT-35, pp. 197-199, Jan. 1989.
115. B. Fleury, D. Dahlhaus, R. Heddergott, and M. Tschudin, "Wideband angle of arrival estimation using the SAGE algorithm," in Proc. of IEEE Int. Symp. on Spread Spectrum Tech. and Apps, pp. 79-85, Mainz, DL, 1996.
116. I. K. Fodor and P. B. Stark, "Multitaper spectrum estimates," in Proc. SOHO 6/GONG 98 Workshop on Helioseismology, 1998. In press.
117. G. J. Foschini, "Equalization without altering or detect data," AT&T Tech. Journal, pp. 1885-1911, October 1985.
118. J. Francos and B. Friedlander, "Bounds for estimation of multicomponent signals with random amplitude and deterministic phase," IEEE Transactions on Signal Processing, vol. 43, no. 5, pp. 1161-72, May 1995.
119. A. Fredricksen, D. Middleton, and D. Vandelinde, "Simultaneous signal detection and estimation under multiple hypotheses," IEEE Trans. on Inform. Theory, vol. 18, pp. 607-614, 1972.
120. D. R. Fuhrmann, "Application of Toeplitz covariance estimation to adaptive beamforming and detection," IEEE Transactions on Signal Processing, vol. 39, no. 10, pp. 2194-2198, 1991.
121. M. V. E. G. D. Forney, Jr., "Combined equalization and coding using precoding," IEEE Communications Magazine, pp. 25-34, Dec 1991.
122. W. A. Gardner, editor, Cyclostationarity in Communications and Signal Processing, IEEE Press, 1994.
123. A. Gelman, J. B. Carlin, H. S. Stern, and D. B. Rubin, Bayesian Data Analysis, Chapman and Hall, New York, 1995.
124. S. Geman and D. Geman, "Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 6, pp. 721-741, 1984.

125. A. A. Gerlach, "Optimum detection performance of passive coherence estimators," IEEE Trans. Acoustics, Speech, and Signal Processing, vol. ASSP-33, no. 1, pp. 240-247, 1985.

126. M. Ghogho and A. Nandi, "Locally optimum detectors for deterministic signals in multiplicative noise," in Proc. ICASSP'98, volume IV, pp. 2137-40, Seattle, May 1998.

127. G. Giannakis and G. Xu, editors, Special Issue on Signal Processing for Advanced Communications, 1997.

128. G. B. Giannakis, "Filterbanks for blind channel identification and equalization," IEEE Signal Processing Letters, vol. 4, pp. 184-187, June 1997.

129. G. B. Giannakis, "Cyclostationary signal analysis," in Digital Signal Processing Handbook, V. K. Madisetti and D. Williams, editors, chapter the Statistical Signal Processing Section, CRC Press, 1998.

130. G. B. Giannakis and S. Halford, "Blind fractionally-spaced equalization of noisy FIR channels: direct and adaptive solutions," IEEE Transactions on Signal Processing, vol. 45, pp. 2277-2292, September 1997.

131. G. B. Giannakis and J. M. Mendel, "Identification of non-minimum phase systems using higher-order statistics," IEEE Trans. Acoustics, Speech, and Signal Processing, vol. 37, pp. 360-377, March 1989.

132. G. B. Giannakis and E. Serpedin, "Linear multichannel blind equalizers of nonlinear FIR Volterra channels," IEEE Transactions on Signal Processing, vol. 45, pp. 67-81, January 1997.

133. G. B. Giannakis and C. Tepedelenlioglu, "Basis expansion models and diversity techniques for blind equalization of time-varying channels," Proceedings of the IEEE, September 1998 (to appear).

134. G. B. Giannakis and M. K. Tsatsanis, "Signal detection and classification using matched filtering and higher order statistics," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 38, no. 7, pp. 1284-1296, 1990.

135. G. Giannakis, "Polyspectral and cyclostationary approaches for identification of closed-loop systems," IEEE Trans. Automatic Control, vol. 40, pp. 882-885, 1995.

136. G. Giannakis and M. Tsatsanis, "Time-domain tests for Gaussianity and time-reversibility," IEEE Transactions on Signal Processing, vol. 42, pp. 3460-3472, 1994.

137. W. R. Gilks, S. Richardson, and D. J. Spiegelhalter, Markov Chain Monte Carlo in Practice, Chapman and Hall, New York, 1996.

138. D. Gingras, P. Gerstoft, and N. Gerr, "Electromagnetic matched-field processing: Basic concepts and tropospheric simulations," IEEE Transactions on Antennas and Propagation, vol. 45, pp. 1536-1545, 1997.

139. D. Gingras and P. Gerstoft, "Inversion for geometric and geoacoustic parameters in shallow water: Experimental results," Journal of the Acoustical Society of America, vol. 97, no. 6, pp. 3589-3598, 1995.

140. F. Gini, "A cumulant-based adaptive technique for coherent radar detection in a mixture of K-distributed clutter and a Gaussian disturbance," IEEE Transactions on Signal Processing, vol. 45, no. 6, pp. 1507-1519, 1997.

141. F. Gini and G. B. Giannakis, "Frequency offset and symbol timing recovery in flat fading channels: A cyclostationary approach," IEEE Transactions on Communications, vol. 46, pp. 400-411, March 1998.

142. F. Gini and G. B. Giannakis, "Generalized differential encoding: A nonlin- [...]

143. D. N. Godard, "Self-recovering equalization and carrier tracking in two-dimensional data communication systems," IEEE Trans. Communications, pp. 1867-1875, 1980.

144. S. Godsill, "Bayesian enhancement of speech and audio signals which can be modelled as ARMA processes," International Statistical Review, vol. 65, pp. 1-21, 1997.

145. S. J. Godsill, "Robust modelling of noisy ARMA signals," in ICASSP, pp. 3797-3800, 1997.

146. N. J. Gordon, D. J. Salmond, and A. F. M. Smith, "Novel approach to nonlinear/non-Gaussian Bayesian state estimation," IEE Proceedings, vol. 140, pp. 107-113, 1993.

147. A. Gorokhov and P. Loubaton, "Semi-blind second-order identification of convolutive channels," in Proc. ICASSP'97, vol. V, 1997.

148. P. J. Green, "Reversible jump Markov chain Monte Carlo computation and Bayesian model determination," Biometrika, vol. 82, pp. 711-732, 1995.

149. M. Grigoriu, Applied Non-Gaussian Processes, Prentice-Hall, 1995.

150. D. M. Gruenbacher and D. R. Hummels, "A simple algorithm for generating discrete prolate spheroidal sequences," IEEE Transactions on Signal Processing, vol. 42, pp. 3276-3278, 1994.

151. M. Hamalainen, R. Hari, R. Ilmoniemi, J. Knuutila, and O. Lounasmaa, "Magnetoencephalography - Theory, Instrumentation, and Applications to Noninvasive Studies of Signal Processing of the Human Brain," Rev. Mod. Phys., vol. 65, pp. 413-497, April 1993.

152. E. Hannan and M. Deistler, The Statistical Theory of Linear Systems, Wiley, New York, 1988.

153. M. Hansson and G. Salomonsson, "A multiple window method for estimation of peaked spectra," IEEE Transactions on Signal Processing, vol. 45, pp. 778-781, 1997.

154. H. Harashima and H. Miyakawa, "Matched-transmission technique for channels with intersymbol interference," IEEE Trans. on Commun., pp. 774-780, Aug 1972.

155. W. K. Hastings, "Monte Carlo sampling methods using Markov chains and their applications," Biometrika, vol. 57, pp. 97-109, 1970.

156. G. Hatke, "Performance analysis of the SuperCART antenna array," Technical Report #AST-22, MIT Lincoln Laboratory, Lexington, MA, March 1992.

157. D. Hatzinakos and C. L. Nikias, "Blind equalization using a tricepstrum based algorithm," IEEE Trans. on Communications, vol. 39, pp. 669-682, May 1991.

158. M. Hawkes and A. Nehorai, "Effects of sensor placement on acoustic vector-sensor array performance," Submitted to IEEE J. Oceanic Eng.

159. M. Hawkes and A. Nehorai, "Surface-mounted acoustic vector-sensor array processing," in Proc. IEEE Intl. Conf. on Acoust., Speech, and Sig. Proc. (ICASSP'96), pp. 3710-3713, Atlanta, GA, May 1996.

160. M. Hawkes and A. Nehorai, "Acoustic vector-sensor beamforming and Capon direction estimation," IEEE Transactions on Signal Processing, vol. 46, September 1998.

161. S. Haykin, Array Signal Processing, Prentice-Hall, Englewood Cliffs, NJ, 1984.

162. S. Haykin, editor, Blind Deconvolution, Prentice-Hall, 1994.

163. S. Haykin, J. Litva, and T. Shepherd, editors, Radar Array Processing, Springer-Verlag, Berlin, 1993.

164. S. Haykin and D. J. Thomson, "Signal detection in a nonstationary environment reformulated as an adaptive pattern classification problem," Proc. IEEE, vol. 86, pp. in press, 1998.

165. A. O. Hero and J. A. Fessler, "Convergence in norm for [...] pp. 41-54, 1995.

166. M. Hinich and C. R. Wilson, "Time delay estimation using the cross bispectrum," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-40, 1992.

167. M. J. Hinich, "Detection of non-Gaussian signals in non-Gaussian noise using the bispectrum," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 38, no. 7, pp. 1126-1131, 1990.

168. M. Hinich, "Testing for Gaussianity and linearity of a stationary time series," J. Time Series Analysis, vol. 3, pp. 169-176, 1982.

169. L. Hinnov and J. Park, "Multi-windowed spectrum estimates of the ILS polar motion," in The Earth's Rotation and Reference Frames for Geodesy and Geodynamics, A. Babcock and G. Wilkins, editors, pp. 221-226, Kluwer Academic, Dordrecht, 1988.

170. F. Hlawatsch and G. Boudreaux-Bartels, "Linear and quadratic time-frequency signal representations," IEEE Signal Processing Magazine, vol. 9, pp. 21-68, 1992.

171. K.-C. Ho, K.-C. Tan, and A. Nehorai, "Estimation of Directions-of-arrival of Completely Polarized and Incompletely Polarized Signals with Electromagnetic Vector Sensors," in 11th IFAC Symp. on Syst. Ident., volume 2, pp. 523-528, Kitakyushu City, Japan, July 1997.

172. K.-C. Ho, K.-C. Tan, and W. Ser, "An investigation on number of signals whose directions-of-arrival are uniquely determinable with an electromagnetic vector sensor," Signal Processing, vol. 47, pp. 41-54, November 1995.

173. K.-C. Ho, K.-C. Tan, and B. Tan, "Efficient method for estimating directions-of-arrival of partially polarized signals with electromagnetic vector sensors," IEEE Transactions on Signal Processing, vol. 45, pp. 2485-2498, October 1997.

174. T. Ho, P. Fieguth, and A. Willsky, "Multiresolution stochastic models for efficient solution of large-scale space-time estimation problems," in Proc. ICASSP'96, volume VI, pp. 3098-4001, Atlanta, GA, 1996.

175. B. Hochwald and A. Nehorai, "Identifiability in array processing models with vector-sensor applications," IEEE Transactions on Signal Processing, vol. 44, pp. 83-95, January 1996.

176. B. Hochwald and A. Nehorai, "Magnetoencephalography with diversely-oriented and multi-component sensors," IEEE Trans. Biomed. Eng., vol. 44, pp. 40-50, January 1997.

177. M. L. Honig, U. Madhow, and S. Verdu, "Blind adaptive multiuser detection," IEEE Trans. Inform. Theory, pp. 944-960, July 1995.

178. H. Hudson and R. Larkin, "Accelerated image reconstruction using ordered subsets of projection data," IEEE Transactions on Medical Imaging, vol. 13, no. 12, pp. 601-609, 1994.

179. S. Jaffard, "Exposants de Hölder en des Points Donnés et Coefficients d'Ondelettes," C.R.A.S. Paris, vol. 1, no. 308, pp. 79-81, 1989.

180. M. Jamshidian and R. I. Jennrich, "Conjugate gradient acceleration of the EM algorithm," J. Am. Statist. Assoc., vol. 88, no. 421, pp. 221-228, 1993.

181. A. Janicki and A. Weron, "Can one see α-stable variables and processes?," Statistical Science, vol. 9, no. 1, pp. 109-126, 1994.

182. J. Jao, "A matched array beamforming technique for low angle radar tracking in multipath," in IEEE National Radar Conference, pp. 171-176, 1994.

183. J. Jeong and W. Williams, "Kernel design for reduced interference distributions," IEEE Transactions on Signal Processing, vol. 40, pp. 402-412, 1992.

184. A. Jeremic and A. Nehorai, "Design of chemical sensor arrays for monitoring disposal sites on the ocean floor," To appear in IEEE J. Oceanic Eng.

185. D. Johnson and D. Dudgeon, Array Signal Processing: Concepts and Techniques, Prentice Hall, Inc., Englewood Cliffs, NJ, 1993.

186. J. Johnson, D. J. Thomson, E. X. Wu, and S. C. R. Williams, "Multiple-window spectrum estimation applied to in vivo NMR spectroscopy," J. Magn. Resonance, vol. B 110, pp. 138-149, 1996.

187. J. O. Jonsson and A. O. Steinhardt, "The total Pfa of the multi-window harmonic detector and its application to real data," IEEE Transactions on Signal Processing, vol. 41, no. 4, pp. 1702-1705, 1993.

188. C. Jutten and J. Herault, "Blind separation of sources I. An adaptive algorithm based on neuromimetic architecture," Signal Processing, vol. 24, no. 1, pp. 1-10, July 1991.

189. M. Kanda, "An electromagnetic near-field sensor for simultaneous electric and magnetic-field measurements," IEEE Trans. Electromagn. Compat., vol. 26, pp. 102-110, August 1984.

190. M. Kanda and D. Hill, "A three-loop method for determining the radiation characteristics of an electrically small source," IEEE Trans. Electromagn. Compat., vol. 34, pp. 1-3, February 1992.

191. M. E. Kappus and F. L. V. III, "Acoustic signature of thunder from seismic records," J. Geophys. Res., vol. 96, pp. 10,989-11,006, 1991.

192. T. Kariya and B. K. Sinha, Robustness of Statistical Tests, Academic Press, San Diego, 1989.

193. T. R. Karl, P. D. Jones, R. W. Knight, O. R. White, W. Mende, J. Beer, and D. J. Thomson, "Testing for bias in the climate record," Science, 271, pp. 1879-1883, 1996.

194. S. Kassam and H. Poor, "Robust techniques for signal processing: A survey," Proc. IEEE, vol. 73, pp. 433-481, 1985.

195. S. Kasturia, J. T. Aslanis, and J. M. Cioffi, "Vector coding for partial response channels," IEEE Trans. on Inform. Theory, vol. 36, pp. 741-761, July 1990.

196. L. Kaufman, "Implementing and accelerating the EM algorithm for positron emission tomography," IEEE Trans. on Medical Imaging, vol. MI-6, no. 1, pp. 37-51, 1987.

197. S. Kay, Spectral Estimation: Theory and Application, Prentice Hall, Inc., Englewood Cliffs, NJ, 1987.

198. S. Kay, Fundamentals of Statistical Signal Processing: Estimation Theory, PTR Prentice-Hall, Englewood Cliffs, NJ, 1993.

199. S. Kay and D. Sengupta, "Recent advances in non-Gaussian autoregressive processes," in Advances in Spectrum Analysis and Array Processing, S. Haykin, editor, volume 1, pp. 141-210, Prentice-Hall, 1991.

200. S. M. Kay, "Asymptotically optimal detection in unknown colored noise via autoregressive modeling," IEEE Trans. Acoustics, Speech, and Signal Processing, vol. ASSP-31, no. 5, pp. 927-933, 1983.

201. S. M. Kay, "Asymptotically optimal detection in incompletely characterized non-Gaussian noise," IEEE Trans. Acoustics, Speech, and Signal Processing, vol. ASSP-37, no. 4, pp. 627, 1989.

202. S. M. Kay and D. Sengupta, "Detection in incompletely characterized colored non-Gaussian noise via parametric modeling," IEEE Transactions on Signal Processing, vol. 41, no. 10, pp. 3066-3069, 1993.

203. S. Kayhan, A. El-Jaroudi, and L. Chaparro, "Evolutionary periodogram for nonstationary signals," IEEE Transactions on Signal Processing, vol. 42, pp. 1527-1536, 1994.

204. E. J. Kelly, "An adaptive detection algorithm," IEEE Trans. on Aerosp. Electron. and Systems, vol. AES-22, no. 3, pp. 115-123, 1986.

205. E. J. Kelly and K. M. Forsythe, "Adaptive detection and parameter estimation for multidimensional signal models," Technical Report 848, M.I.T. Lincoln Laboratory, April, 1989.

206. C. G. Khatri and C. R. Rao, "Effects of estimated noise covariance matrix in optimal signal detection," IEEE Trans. Acoustics, Speech, and Signal Processing, vol. ASSP-35, no. 5, pp. 671-679, 1987.

207. A. Kim and H. Krim, "Hierarchical stochastic modeling of SAR imagery for segmentation/compression," To appear in IEEE Transactions on Signal Processing.

208. I. P. Kirsteins, S. K. Mehta, and J. Fay, "Adaptive separation of unknown narrowband and broadband time series," in Proc. ICASSP, vol. 4, pp. 2525-2529, 1998.

209. D. Kletter and H. Messer, "Suboptimal detection of non-Gaussian signals by third-order analysis," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 38, no. 6, pp. 901-909, 1990.

210. A. Kokaram, Motion Picture Restoration, Springer Verlag, New York, 1998.

211. H. Krim, S. Mallat, D. Donoho, and A. Willsky, "Best basis algorithm for signal enhancement," in ICASSP'95, Detroit, MI, May 1995, IEEE.

212. H. Krim and J.-C. Pesquet, "Multiresolution analysis of a class of nonstationary processes," IEEE Trans. Inform. Theory, vol. 41, no. 4, pp. 1010-1020, July 1995.

213. H. Krim, D. Tucker, S. Mallat, and D. Donoho, "Near-optimal risk for best basis search," submitted to IEEE Trans. on IT, 1997.

214. H. Krim and M. Viberg, "Two Decades of Array Signal Processing Research," IEEE SP Magazine, vol. 13, no. 4, pp. 67-94, July 1996.

215. J. Krolik, "Matched-field minimum variance beamforming in a random ocean channel," Journal of the Acoustical Society of America, vol. 92, no. 3, pp. 1408-1419, 1992.

216. J. Krolik, "The performance of matched-field beamformers with Mediterranean vertical array data," IEEE Transactions on Signal Processing, vol. 44, no. 10, pp. 2605-2611, 1996.

217. J. Krolik and R. Anderson, "Maximum likelihood coordinate registration for over-the-horizon radar," IEEE Transactions on Signal Processing, vol. 45, no. 4, pp. 945-959, 1997.

218. C. Kuo, C. Lindberg, and D. J. Thomson, "Coherence established between atmospheric carbon dioxide and global temperature," Nature, vol. 343, pp. 709-714, 1990. (Reprinted in pp. 395-400 of Coherence and Time Delay Estimation, G. C. Carter, Ed., IEEE Press, 1993.)

219. W. Lam and G. Wornell, "Multiscale representation and estimation of fractal point processes," IEEE Transactions on Signal Processing, vol. 43, no. 11, pp. 2606-17, Nov 1995.

220. K. Lange, "A quasi-Newtonian acceleration of the EM algorithm," Statistica Sinica, vol. 5, no. 1, pp. 1-18, 1995.

221. K. Lange and R. Carson, "EM reconstruction algorithms for emission and transmission tomography," J. Comp. Assisted Tomography, vol. 8, no. 2, pp. 306-316, April 1984.

222. D. Lansky and G. Casella, "Improving the EM algorithm," in Computing and Statistics: Proc. Symp. on the Interface, C. Page and R. LePage, editors, pp. 420-424, Springer-Verlag, 1990.

223. L. J. Lanzerotti, R. E. Gold, D. J. Thomson, R. E. Decker, C. G. Maclennan, and S. M. Krimigis, "Statistical properties of shock-accelerated ions in the outer heliosphere," Ap. J., vol. 380, pp. L93-L96, 1991.

224. L. J. Lanzerotti, T. P. Armstrong, R. E. Gold, C. G. Maclennan, E. C. Roelof, G. M. Simnett, D. J. Thomson, K. A. Anderson, S. E. H. III, S. M. Krimigis, R. P. Lin, M. Pick, E. T. Sarris, and S. J. Tappin, "Over the southern solar pole: low-energy interplanetary charged particles," Science, vol. 268, pp. 1010-1013, 1995.

225. M. Lavielle, "Stochastic algorithm for parametric and non-parametric estimation in the case of incomplete data," IEEE Transactions on Signal Processing, vol. 42, no. 1, pp. 3-17, 1995.

226. M. Lavielle and E. Moulines, "A simulated annealing version of the EM algorithm for non-Gaussian deconvolution," Statist. Comput., to appear, 1998.

227. E. L. Lehmann, Testing Statistical Hypotheses, Wiley, New York, 1959.

228. H. Leung and S. Haykin, "Detection and estimation using an adaptive rational function filter," IEEE Transactions on Signal Processing, vol. 42, no. 12, pp. 3366-3376, 1994.

229. R. Lewitt and G. Muehllehner, "Accelerated iterative reconstruction for positron emission tomography," IEEE Trans. on Medical Imaging, vol. MI-5, no. 1, pp. 16-22, 1986.

230. J. Li, "Direction and polarization estimation using arrays with small loops and short dipoles," IEEE Trans. Antennas and Prop., vol. 41, pp. 379-387, March 1993.

231. X. Li and N. Bilgutay, "Wiener filter realization for target detection using group delay statistics," IEEE Transactions on Signal Processing, vol. 41, no. 6, pp. 2067-2074, 1993.

232. Y. Li and Z. Ding, "Global convergence of fractionally spaced Godard adaptive equalizers," IEEE Transactions on Signal Processing, pp. 818-826, April 1996.

233. K. S. Lii and M. Rosenblatt, "Deconvolution and estimation of transfer function phase and coefficients for non-Gaussian linear processes," The Annals of Statistics, vol. 10, pp. 1195-1208, 1982.

234. J. Lilly and J. Park, "Multiwavelet spectral and polarization analysis of seismic records," Geophys. J. Intl., vol. 122, pp. 1001-1021, 1995.

235. C. R. Lindberg, Multiple taper spectral analysis of terrestrial free oscillations, PhD thesis, Univ. Calif., San Diego, 1986.

236. C. R. Lindberg and J. Park, "Multiple-taper spectral analysis of terrestrial free oscillations: part II," Geophys. J. Royal Astr. Soc., vol. 91, pp. 795-836, 1987.

237. C. R. Lindberg and D. J. Thomson, "Comment on "a new method of spectral analysis and its application to the earth's free oscillations: the 'sompi' method," by S. Hori et al.," J. Geophys. Res., vol. 95, pp. 12,785-12,788, 1990.

238. C. R. Lindberg and D. J. Thomson, Method and Apparatus for Detecting Control Signals, 1995. U.S. Patent 5,442,696.

239. C. Liu and D. B. Rubin, "The ECME algorithm: A simple extension of EM and ECM with fast monotone convergence," Biometrika, vol. 81, pp. 633-648, 1994.

240. C. H. Liu, D. B. Rubin, and Y. N. Wu, "Parameter expansion for EM acceleration - the PX-EM algorithm," Biometrika, to appear, 1998.

241. H. Liu and G. Xu, "A subspace method for signature waveform estimation in synchronous CDMA systems," IEEE Trans. Communications, pp. 1346-1354, Nov. 1996.

242. R. Liu and L. Tong, editors, Special Issue on Blind Channel Identification and Signal Estimation, 1998.

243. T.-C. Liu and B. D. Van Veen, "Multiple window based minimum variance spectrum estimation for multidimensional random fields," IEEE Trans. SP, vol. 40, pp. 578-589, 1992.

244. L. Ljung, System Identification: Theory for the User, Prentice-Hall, Englewood Cliffs, NJ, 1987.

245. L. Ljung and T. Soderstrom, Theory and Practice of Recursive Identification, MIT Press, Cambridge, MA, 1987.

246. Loughlin, J. Pitton, and L. Atlas, "Construction of positive time-frequency distributions," IEEE Transactions on Signal Processing, vol. 42, pp. 2697-2705, 1994.

247. T. A. Louis, "Finding the observed information matrix when using the EM algorithm," J. Royal Statistical Society, Ser. B, vol. 44, no. 2, pp. 226-233, 1982.

248. R. W. Lucky, "Techniques for adaptive equalization of digital communication systems," Bell Syst. Tech. Journal, pp. 255-286, February 1966.

249. M. Luettgen and A. Willsky, "Multiscale smoothing error models," IEEE Trans. on Automatic Control, 1994.

250. R. Lupas and S. Verdu, "Linear multiuser detectors for synchronous code-division multiple-access channels," IEEE Trans. Inform. Theory, pp. 123-136, 1989.

251. R. Lupas and S. Verdu, "Near-far resistance of multiuser detectors in asynchronous channels," IEEE Trans. on Comm., pp. 496-508, April 1990.

252. X. Ma and C. Nikias, "Parameter estimation and blind channel identification in impulsive signal environments," IEEE Transactions on Signal Processing, vol. 43, no. 12, pp. 2884-2897, Dec 1995.

253. S. Mallat, A Wavelet Tour of Signal Processing, Academic Press, Boston, 1998.

254. B. Mandelbrot, The Fractal Geometry of Nature, W.H. Freeman and Company, 1977.

255. B. Mandelbrot, The Fractal Geometry of Nature (2nd Edition), W.H. Freeman and Co., 1983.

256. W. Martin and P. Flandrin, "Wigner-Ville spectral analysis for nonstationary processes," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 33, pp. 1461-1471, 1985.

257. E. Masry, "The wavelet transform of stochastic processes with stationary increments and its application to fractional Brownian motion," IEEE Trans. on IT, vol. IT-39, pp. 260-264, Jan. 1993.

258. G. Matz, F. Hlawatsch, and W. Kozek, "Generalized evolutionary spectral analysis and the Weyl spectrum of nonstationary random processes," IEEE Transactions on Signal Processing, vol. 45, pp. 1520-1534, 1997.

259. E. J. McCoy, A. T. Walden, and D. B. Percival, "Multitaper spectral estimation of power law processes," IEEE Transactions on Signal Processing, vol. 46, pp. 655-668, 1998.

260. G. McLachlan and T. Krishnan, The EM algorithm and extensions, Wiley, 1997.

261. L. T. McWhorter and L. L. Scharf, "Multiwindow estimators of correlation," IEEE Transactions on Signal Processing, vol. 46, pp. 440-448, 1998.

262. I. Meilijson, "A fast improvement to the EM algorithm on its own terms," J. Royal Statistical Society, Ser. B, vol. 51, no. 1, pp. 117-138, 1989.

263. R. Mellors, F. L. Vernon, and D. J. Thomson, "Detection of dispersive signals using multitaper double frequency coherence," Geophys. J. Intl., to appear, 1998.

264. J. M. Mendel, "Tutorial on higher-order statistics (spectra) in signal processing and system theory: theoretical results and some applications," Proceedings of the IEEE, vol. 79, pp. 278-305, March 1991.

265. X. L. Meng and D. Van Dyk, "The EM algorithm - an old folk-song sung to a fast new tune," J. Royal Statistical Society, Ser. B, vol. 59, no. 3, pp. 511-567, 1997.

266. H. Messer, "Potential performance gain in using spectral information in passive detection/localization of wide band sources," IEEE Transactions on Signal Processing, vol. 43, no. 12, pp. 2964-2974, 1995.

267. H. Messer and S. Tsruya, "Performance analysis of time delay estimation of a signal with unknown spectral parameters," in Proceedings of the Fifth ASSP Workshop on Spectrum Estimation and Modeling, IEEE, 1990.

268. N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, "Equations of state calculations by fast computing machines," Journal of Chemical Physics, vol. 21, pp. 1087-1092, 1953.

269. D. Middleton, "Channel modeling and threshold signal processing in underwater acoustics: an analytic overview," IEEE J. Oceanic Eng., vol. 12, pp. 4-28, 1987.

270. D. Middleton and A. Spaulding, "Elements of weak-signal detection in non-Gaussian noise," in Advances in Statistical Signal Processing, Vol. 2: Signal Detection, H. Poor and J. Thomas, editors, pp. 137-215, JAI Press, Greenwich, CT, 1993.

271. M. I. Miller and D. R. Fuhrmann, "Maximum-likelihood narrow-band direction finding and the EM algorithm," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 38, no. 9, pp. 1560, 1990.

272. J. D. Molle and M. Hinich, "Trispectral analysis of stationary time series," J. Acoustical Soc. America, vol. 97, 1995.

273. T. Moon, "The expectation maximization algorithm," IEEE Signal Processing Magazine, vol. 13, no. 6, pp. 47-60, 1996.

274. J. Mosher, P. Lewis, and R. Leahy, "Multiple dipole modeling and localization from spatio-temporal MEG data," IEEE Trans. Biomed. Eng., vol. 39, pp. 541-557, June 1992.

275. P. Moulin, "A wavelet regularization method for diffuse radar target imaging and speckle noise reduction," Journ. Math. Imaging and Vision, vol. 3, pp. 123-134, 1993.

276. E. Moulines, P. Duhamel, J.-F. Cardoso, and S. Mayrargue, "Subspace methods for the blind identification of multichannel FIR filters," IEEE Transactions on Signal Processing, vol. 43, pp. 516-525, 1995.

277. E. Moulines and P. Soulier, "Fractional exponential model for fractal point processes," in Proc. Thirty-First Asilomar Conf. on Signals, Systems and Computers, pp. 1107-11, Monterey, CA, Nov 1997.

278. R. J. Muirhead, Aspects of Multivariate Statistical Theory, Wiley, New York, 1982.

279. C. F. Mullins, "Multiple window cumulant estimation," in Proc. IEEE SP Workshop on Higher-Order Statistics, p. M 1.4, South Lake Tahoe, CA, 1993.

280. T. C. Mullis and L. L. Scharf, "Quadratic estimators of the power spectrum," in Advances in Spectrum Analysis and Array Processing, S. Haykin, editor, volume 1, pp. 1-57, Prentice-Hall, 1991.

281. S. Nam and E. Powers, "Applications of higher order spectral analysis to cubically nonlinear system identification," IEEE Transactions on Signal Processing, vol. 42, pp. 2124-2135, 1994.

282. Z. Nan and A. Nehorai, "Detection of ship wake using an airborne magnetic transducer," in Proc. 32nd Asilomar Conf. Signals, Syst. and Comput., Pacific Grove, CA, November 1998.

283. A. Nehorai, K.-C. Ho, and B. Tan, "Minimum-noise-variance beamformer with an electromagnetic vector sensor," in Proc. IEEE Intl. Conf. on Acoust., Speech, and Sig. Proc. (ICASSP98), pp. 2021-2024, Seattle, WA, May 1998.

284. A. Nehorai and E. Paldi, "Vector-sensor processing for electromagnetic source localization," in Proc. 25th Asilomar Conf. on Signals, Syst. and Comput., pp. 566-572, Pacific Grove, CA, November 1991.

285. A. Nehorai and E. Paldi, "Acoustic vector-sensor array processing," in Proc. 26th Asilomar Conf. on Signals, Syst. and Comput., pp. 192-198, Pacific Grove, CA, October 1992.

286. A. Nehorai and E. Paldi, "Acoustic vector-sensor array processing," IEEE Transactions on Signal Processing, vol. 42, pp. 2481-2491, September 1994.

287. A. Nehorai and E. Paldi, "Vector-sensor array processing for electromagnetic source localization," IEEE Transactions on Signal Processing, vol. 42, pp. 376-398, February 1994.

288. A. Nehorai, B. Porat, and E. Paldi, "Detection and localization of vapor-emitting sources," IEEE Transactions on Signal Processing, vol. 43, pp. 243-253, January 1995.

289. L. B. Nelson and H. V. Poor, "Iterative multiuser receivers for CDMA channels: an EM-based approach," IEEE Trans. on Communications, vol. 44, no. 12, pp. 1700-1710, 1996.

290. J. Nickles, G. Edmonds, R. Harriss, F. Fisher, W. Hodgkiss, J. Giles, and G. D'Spain, "A vertical array of directional acoustic sensors," Proc. Mast. Oceans thru Tech. (Oceans 92), pp. 340-345, Newport, RI, October 1992.

291. C. L. Nikias and J. M. Mendel, "Signal processing with higher-order spectra," IEEE Signal Processing Magazine, vol. 10, pp. 10-37, July 1993.

292. C. L. Nikias and A. P. Petropulu, Higher-Order Spectra Analysis, Prentice-Hall, Englewood Cliffs, NJ, 1993.

293. C. Nikias and M. Shao, Signal processing with alpha-stable distributions and applications, John Wiley & Sons, 1995.

294. J. Nolan, "Parameter estimation and data analysis for stable distributions," in Proc. Thirty-First Asilomar Conf. on Signals, Systems and Computers, pp. 443-47, Monterey, CA, Nov 1997.

295. J. Nolan, "Multivariate stable distributions: approximation, estimation, simulation and identification," in A practical guide to heavy tails: statistical techniques for analyzing heavy tailed distributions, R. F. R. J. Adler and M. Taqqu, editors, Birkhauser, 1998.

296. R. Onn and A. Steinhardt, "A multiwindow method for spectrum estimation and sinusoid detection in an array environment," IEEE Transactions on Signal Processing, vol. 42, no. 11, pp. 3006-3015, 1994.

297. J. J. K. O'Ruanaidh and W. J. Fitzgerald, Numerical Bayesian Methods Applied to Signal Processing, Springer Verlag, New York, 1996.

298. B. Ottersten, "Array Processing for Wireless Communications," in Proc. 8th Workshop on Stat. Sig. and Array Proc., pp. 466-473, Corfu, Greece, June 1996.

299. B. Ottersten, M. Viberg, P. Stoica, and A. Nehorai, "Exact and Large Sample ML Techniques for Parameter Estimation and Detection in Array Processing," in Radar Array Processing, Haykin, Litva, and Shepherd, editors, pp. 99-151, Springer-Verlag, Berlin, 1993.

300. L. Pakula and S. M. Kay, "Detection performance of the circular correlation coefficient receiver," IEEE Trans. Acoustics, Speech, and Signal Processing, vol. ASSP-34, no. 3, pp. 399-404, 1986.

301. M. Papazoglou and J. Krolik, "Multi-dwell matched-field altitude estimation for over-the-horizon radar," in Proc. IEEE Int. Conf. Acoust., Speech, and Sig. Proc., 1998.


302. J. Park, "Envelope estimation for quasi-periodic geophysical signals in noise: A multitaper approach," in Statistics in the Environmental and Earth Sciences, A. Walden and P. Guttorp, editors, pp. 189-219, Edwin Arnold, London, 1992.

326. S. U. H. Qureshi, "Adaptive equalization," Proceedings of the IEEE, pp. 1349-1387, 1985.

J. Park, f.

303.

327. J. Ralston, A. Zoubir, ..ind B. Boashash, "Identification of a class of non!iI1C~1l" systems

V. III, .1I1d C. Lindberg, .... Frequency dependent polarization

.inalvsis of high-frequency c;eisll1ograms,"].

Gf(}p/~l'J. f{CJ., \'01.

01/

LJ2, pp.

J. Park, C. Lindberg, and f. V. IlL "Mulrirapcr spccrr.il analysis of high-frequency seismograms." j. Gcopbvs. s«, vol, 92, pp. 12,675-12,684,

304.

Tillie SeriesA1U1~1'JZJ,

arr.ivs," IEEE Tram.

H. Krirn, H. Lcporini, .1I1d E. Hamman, "Bavcsi.u: approach

optical Speech,

.111

01/ ALTmp.

Electron. and

1~I'Jtt..·J1H, vol. AES-I0,

pp.

(~fthc

ArOl/s-

334. K. S. R.iedd and A. Sidorcnko, "iYlilllJl1L1I11 bias multiple t.lper spectral csri-

310. B. Picinbono and P. Duvaut, "Optimal linear quadratic systems tor dcrcc-

marion," IEEE Transnctious ou S~Jlln/ Proccssiun, vol. 43, pp. 188-195, 19LJ5.

Theory, vo] 1T-34, no. 2, pp.

335. K. S. Riedel and A. Sidorcnko, "Adaptive smoothing of the log-spectrum

304-311, 1988.

with multiple rapcnug," IEEE Tmnsactious 011 Siqnal Proccssinn, vol. 44, rp.

311. S. U. Pillai,.A17'ayS(rf1ll!/ PmCCJSil1JJ, Springer Verlag, 1LJ8 J. l

1794-1800, 19'J6.

312. ]. W. Pitton, "Nonstarionarv spcetrull1 cstirnanon and time-frequency

336. K. S. Riede], A. Sidorcnko, N. Bretz, and D. J. Thomson, "Spectral cstim.ition of plasma fluctuations. II. noust.uionary analysis of edge localized

conccnrrarion," in Prof. ICASSP, volume 4, pp. 2425-2428, 1l)9~.

313. V, Poor .1I1d G. Worncll, IVi1·c!r.H Comuunncatum: S~l1ll1/ Proccssinu Pcrspcc-

mode spectra,"

tivcs, Prentice Hall, New [crscv, 1995.

P/~I'J.

P!nJ1Iln, vol. 1, pp. 501-514,1004.

337. K. S. Riedel, A. Sidorcnko, and D.

P. Costa, P. Larzabal, and H. Clcrgcor, . . ExtLlCtiOI1 of parameters from .1 wavelet packet analvsis tor rcrr.iin matching," in Prot. ofthf 3-rd IEEE Int. IVorh!Jop Oil Time-Scnlc, Ti1Jlc-Frcql/f1/~1" pp. 423-4-26, P.ui~, 18-21 Juin, 1996. POpCSCL1,

J. Thomson, ....Spectral estimation

plasma fluctuations. 1. comparison of methods." 485-500, 1094.

IJ/~),J. Plnsina, \'01.

of

L pro

J. Rissanen, . . jvludcling by shortest d.1t.1 descriptiun," Auto1llaticn, vol. 14, pp. 465--+71, 1978.

338.

315. B. Porar, DlJJita! Pro,,·cSJill....rJ ofRmufolJ1 S~lllnls, Thcm)' -::> AlrtlJods,Prcnrice

339. C. P. Roben, The Bnynim/ Choicc, Springer Verbg, New York, 1994.

Hall, Englewood CliH\ N J, 1t)L)-+.

340. D. A. Roberts., K. "vV. Ogilvic, N1. L. Goldstein, D. J. Thom~ol1, C. G. NLlcknn.m, and L. J. LlIlzerotti, "The nature of the ,>olar wind," Nl1turL', vol. 381, pp. 31-32, 191)6.

316. B. Pur:.1t ~lI1d B. Friedlander, "Adaptive detl:etioll of tral1."il:nr signals," .~/Ji:fc/J,

AWl/Jr,

333. C. D. Richmond, "Derived PDF of maximum likelihood sign.ll estimator which employs all estimated noise covariance," IEEE Transactions 011 S£1111n! ProCCJSi11Jf, vol. 44, IlU. 2, pp. 305-315, 1()LJ6.

the best basis selection," in ICJ-1SS]J'l)t>, Atlanta. liA, ;\by 1996.

IEEE TrailS. AcolJstics,

f.

19~().

in an uncertain sound speed deep ocean environment," [ournal tical Society otAmcrica, vol. 89, no. 5, pp. 2280-2284, 1991.

wavelet reprcscntarious," IEEE Transttctions01/ S~Jllfl! Proccssiua, Aug. 19tJ6.

314. B.

1, pp. 14S- 158,

332. A. Richardson and L. Nolte, "A posteriori prob.ibilirv source localization

30H. J.-C. Pesquct, H. Krun, and H. Carf.intan, "Time invariant orthouorrual

011 lntonu.

Rare Distor-

853-S63, 1974.

iHultitapt-Tfwd Conventional Univariate Tcclmiqucs, C~lmbridgc Uruv. Press, 19l)3.

tion .uid estimation," JEEETrans.

.1

331. I. Reed, I. Mallet, and L. Brennan, "Rapid convergence rare in adaptive

for wireless communi-

307. D. R. Percival and A. T. Walden, Spectra! J-1Jl1l~VJlS fiJI' Pbvsical Applicauons:

J. Pcsquct,

\'01.

panern with unknown spectral distnbution," IEEE Trl11/J. and S~l. r-«, vol. 38, no. 10, pp. 1760-1771, 1990.

cations," IEEE Sinnal Proccssiu.; lV[t1jlnzi1Je, vol. 14, pp. -+LJ-83, 1997.

to

IEEE Trausnctions

330. I. S. Reed and X. Yu, "Adaptive multi-band CrAR detection of

J. Park, C. Lindberg, and D. Thomson, "Multiple-raper spectral analvsis of terrcstiul free oscillations: part L" Gcoph»). f. RI~1'nl Ast/". Soc., vol. LJ 1, pp. 7SS-794, 1987.

305.

309,

11()I1-G.ll\SSI~111 excitation,'

45, pp. 719-735, 1997 .

329. T. S. R.l0 and M. Gabr, '·A rest t()r lincanrv of sr.iriorurv rime series,"

1987.

pr()ces~ing

\'01.

328. K. Ramchandran and M. Vcrterli, "Best Wavelet Packets ill tion Sense," IEEE [m.n.\". 011 Ima.!Jf Proc., pp. 160-175, 19tJ3.

12,664--12,674,1987.

306. A. I'aulraj and C. Papadias, "Space-rime

under stariorurv

Sinnal Proccssinjt;

llnd S~71tnl Procc.IJill!T, vol. /\SSP-34, no. 6, pp.

1410-14J8,1986.

341. f. Robey,

317. B. Pnrat .1nd B. Frit:dl.lI1der, . . Blind cqualization of digir.ll COllllllUllication c1unnels using high-order moments," IEEE TrnllJ/1etioJJJ Oll S~11/111 Pm-

n. Fuhrmann, E. Kcll~', and

R. Nitzberg, ".\ CFAR ,ld.lptive

matchcd tilter dctecror," IEEE Trnmnctim/s Oil An"o.,pna fl1ln Elcctrollic .~l'J'

CCJJi1l.!.7, vol. 39, pp. 522-526, 1991.

to/tJ,

vol. 43, no. 12, pp. 2964-2974, 1992.

342. n. B. Rubin, ... U . . lI1g rhc SIR .llgorithm tu silllldate posrcrior di~rriDlI rions," in BaYfJinll Statistics 3, 1. M. Bernardo, ~/l. H. DeGroot, D. V. Lindll:Y,and A. f. i\1. Smith, editors, pp. 395-402, Univcrsit), Prcss, Oxtc)rd, UK, 1988.

31 S. B. Porar .1nel B. FriedL1I1dcr, "E..,rin1.1tioll of frequel1c) in the prc~enCe ()f nOll-random intl:rfCrcllce," IF:Et· TmllSnctlO1H ON S~{Jllf!1 ProCCJJiJ'--.", vo!. 44, no. 3, pp. 640-651, lvL1r 1996. 319. B. Por~H .1Ild A. Nehor.li, "LuLllizing vapor-cIllmiting sources by Illoving sl:nsors," IEEE Trnwncti01lJO}/ SLfTluz! ProCCSj'ill~l1, vol. -+4, pp. 1018-1011, April 1996.

343. A. RLlIZ, J. LVI. Cioffi, .md S. K.asrLlri~l, "Discrete multiple tonc modllbtion wirh coset coding for the spectrally shaped channel," IEEE Tm11J. Oll COJI/I/J1tllimti01/J, pp. 1012-1029,11)92.

310. E. Powers, B. Bo,lslush, ~\I1d A. Nl. Zoubir, editors, Hi!l/lcr OrdtT Stntist/-

wI S~TllL7' P7·oCl'sJill....rJ nlld AppliCf1tioIlJ, Longnl.ln Chcshire, N1clbourne, AlIstr.llia,1995.

344. H. R~'lI and S. LO\vcn, "Point-proo.:ss appro.lChes ru tllc modeling .1l1d .11lal~'sis

of self-simil.lr tr~lffic," ill Pl'U( IEEE C(Jllf!J~tiJWlll, pp. 1468-7;';,

1996.

321. R. Pricc, "Nol1linell" tct:dback equalized PAt"l versus up~Kity tix noisy tiltcr clunnels," in Prot:. of/ntl. COllI" n11 CmlllJlJl1limtiolls, pp. 22.12-22.17, 1972.

345. B. 1\11. S.ldlcr, "Detection ill currelated impulsi\'e noisl: using tC>Urth-urder

322. M. B. Priestley, lVvlI-Ll1lCflr m/ft NOJ/-Statummy TilJ/c Saic.\" J·IJ1n.~l'JiJ, Audcmic Pres~, San Diego CA, 198X.

346. B. M. .s~Kiler, G. Gi.lIl11akis, and K. Li, "E~t1n1
323. I. G. Proakis, Di.!Titnl 2nd nlition, 1989. 324.

CUIII1J1l1l/iwti01H,

Nkl;r~l\\'-Hill Book Comp.lny,

J. G. Pro.lkis, Dtl1itnl COJ1lJJJll1limtio}/J, 71Jirn cnitioJl, McGraw

cumulants," S(111](l1 l'rocc.iJi1'--.11, vol. -±4, no. 1 L pp. 2793-2800, IlJ96.

lInl ProcfJsillg,

NY,

krsc~',

42, no. 4, pp. 2729-27-+1, lLJ94.

347. N. SJito, Loml fmtlJrc extmctiml nmi its applimtio1ls PhD rhesi:-;, Yale University, Dec. 1094.

Hill, 1995.

348. G.

325. S. Qian and D. Chen, Joint Tilllc-F'J'cqllcm:" AlJazvJiJ - Ala/J(}nJ miff -I-1ppliw-

tiow, Prentice-H.lll, :--Jew

\'01.

1996.

Sal11()rodnitsk~'

Ch~lpman

91

llJIIl....17

n hbrmy oflmJcs,

.1nd Nl. Taqqll, Stnble 1I0l1-GnllJ,I"inllra1ldom

& Hall, 1994.

/J7"O(CJJCJ,

34Y. S. D. Sandberg and lVI. A. Tzannes, "Overlapped discrete multironc modulation for high speed copper wire communications," IEEE journal 011 SfleeredAt"ensin Communications, pp. 1571-1585, 1<:>95.

370. O. Shalvi and E. Weinstein, "New criteria fl)r blind deconvolution of

350. A. Savecd and D. Jones, "Oprirnal detection using bilinear time-frequency and rime-scale representations," IEEE Transactions Oil S(flllnlPl'oussi1'lJT, vul. 43,pp. 2872-2883,1995.

371. I. Shorter and H. Messer, "The bispcctrum of sampled data: part II -

nonrninimum phase systems (channels)," IEEE T7·lll1J.

Of:",

1994. 372. L. A. Shepp and Y. Vardi, "Maximum likelihood reconstruction tor emission tomography," IEEE Trans. on Medica! blUll1ing, vol. MI-l, No.2, pp.

ncb," IEEE Transactions 01lIlIf017J1l1tio1'l Theory, 1<:>99.

113-122, Oct. 1982.

352. A. Scaglione and G. B. Giannakis, "Code-only dependent asynchronous

t()r rnui elimination and mitigation of unknown Of 3 Ist Astlonmr Con]: 011 StfT1uzIJ, s.YJte11lJ, anti Computers, pp. 950-954. 1997.

373. N. D. Sidiropoulos, G. B. Gianrukis, and R. Bro~ "Deterministic wave-

CD~1A receivers

form-preserving blind separation of DS-CD~1A signals using an antenna array," in Proc. SSAP'98, Portland, Orcgoll, 1998.

multi path," in Proc.

374. M. Simon,

353. A. Scaglione, G. B. Giannakis, and S. Barbarossa, "Redundant filterbank

certainty V: the discrete

1991.

lCASSP-'94. vol. IV, pp. 585-588, 1994.

356. L. L. Scharf and D. W. Lytle. "Signal detection in Gaussian noise of un-

377. T. Soderstrom and P. Sroica, S~'YstC11l Identijicatio», Prentice Hall Inrcrn.,

known level: all invariance application," IEEE Trans. 011 Infonu. Theory; vol.

London, 1989.

IT-17, no. 3, pp. 404-411,1971.

378. 1. Song and S. Kassam, "LOD of signals in a generalized observation

357. L. Scharf anti B. Friedlander, "Marched subspace detectors," IEEE Trans-

model," IEEE Trm/J Inju Tbcol'Y, vol. 36, no. 3, pp. 502-15,516-30, M.1Y 1990.

actiouson Stfl1lnl Praassin...'T, vol. 42, no. 8, pp. 2146-2157, 1994. l\1U-

379. H. '1\'. Sorenson, ParameterEstimation, Marcel Dekker, New York, 19HO.

nich, Gcnnany, Nlay 1997, IEEE.

359. R. E. Schild and

n. J. Tholl1son, "The Q0957+561

380. L. Stankovic .1I1d S. Stankovic, LLAn analysis of instantaneous frequency

time delay, quasar

representation using time-frequency distributions-generalized Wigner distri-

structure, and microlensing." in Astronomicaltime series, D. M. cr 411., editor,

bution," IEEE Transactions em Signnl Proctssiu....fT, vol, 43, 1995.

pp. 73-84. Kluwer Academic Publishers, Dordrccht, 1997.

38 J. P. Sroica, M, Cedcrvall, and A. Eriksson, "Combined instrumental vari-

360. H. Schmidt, A. Baggcrocr, W. Kuperman, and E. Sheer, "Environ-

able and subspace fitting approach to parameter estimation of noisy in-

mentally-tolerant bcamforrning for high-resolution marched-field process-

put-output systems." IEEE Transactions 011 S~l1Jnl Proccssin...fT. vol, 43, pp.

ing: Deterministic mismatch:' journal oftbc AcousticalSociety ofAmerica,

2386-2397, 1995.

vol. 88, pp. 1802-1810. 19YO. 361.

J. Schoukcns and

382. P. Sroica and R. Moses, Introduction to SpcctmiAnalysis, Prentice Hall, Inc., Upper SJddlc River, NJ, 1997.

R. Pintelon, Ide11t~ficntlO1l ofLi1unr Systems, Pergamon

Press, Oxford, UK, 1991.

383. E. G. Strom, S. Parkvall, S. L. Miller, and R. E. Otrcrsren, "Propagation delay estimation in asynchronous direct-sequence code-division multiple access systems." IEEE Tmlls. 011 Com1l1., pp. 84-93, Jan. 1996.

362. P. Schultheiss, H. Messer, and G. Shor, "Maximum likelihood time delay estimation IJ1 non-Gaussian noise," IEEE Transactions (JJl Signal Processin...q, vul. 45, 1997. 363. T.

679-6~2,

364. T.

384. A. Swami, "Multiplicativc noise models: paral11eter estin1atton using

J. Schultz, "Fast algorithm t{)r maximum-likelihood inl.lging with co-

hcrt:nt speckle

measurcments~"

cumuJants," StfT1lni ProccJsiu...l1, vol. 36, no. 3, pp. 355-373. April 1994.

in Proc. IEEE CUlIfolllmagc Proc., pp.

S.mtJ BJrbara, CA, 1997.

385. A. Swami. "Cramer-Rao bounds f()I" deterministic signals in additivc and multipliGltlvt nOise," StlTlla/ p,"oCL'JJilll1, vul. 53, nu. 2-3, pp. 231-244, Sep

1. Schultz, "Penalized ma.ximum-likclihood cstimation of cov.lriance

1996.

rllJtriccs with lincar structure," IEEE TrnllJnctiollJ 071 Si....fT1Jn/ P1'OCCJsillg, vol.

4-5,

flO.

12, pp. 3027-3038~ 1997.

386. A. 5W<11111, G. B. Giann,lkis, and S. Shamsunder, t~MlIltichannel AR1\1A processes," IEEE

365. G. Schwartz, '~Estlmating the dimension of a model," Anll. StiltS., vol. 6,

Stf.TlItll ProCL'SSill...fT, vol. 42, pp. 898-913.4

387. A. Swami, G. Gialll1
366. R. E. SchwJ.rrz. "Minim
Statistics," SLqllnl ProfCssiug, vol. 60,

July 1969.

110.

in colored non-Gaussian .lutoregress1ve processes," IEEE Trnllsnctir)1JS OIl Acoustics, Spud}, alld S[/J1tnl PnJcCJJiJl.!T, vnl. 38, no. 10, pp. 1661-1676,

389. A. SW:ll11i and B. M. S.ldler, "Parameter cstimation tor lineJr alpha-stable processes," IEEE StfTllnl Proccssilllf Lettcn,

1990. ization using modulation induced cyciostationarity," IEEE TrnllJaetiollJ 011 '~ISAR using

5, no. 2. pp. 48-50, Feb

390. K.-C, TJn, K.-C. Ho, .11ld A. Nchorai, "Uniqueness study of measurements obt.1inable with arr.lys of dectrom'agllctic vcctor scnsors," IEEE TrallmctionJ Oil S(fTunll'l'Occ.uiu.!.T, vol. 44, pr. 1036-1039, April 1996.

46, pp. 1930-1944, 1998.

36Y. ;\,1. S. Seymour and S. Haykin,

\'01.

1998.

368. E. Serpedin and G. B. Giann.lkis, "Blind channel identitication and equal\'01.

1. pp. 65-126, July 1997.

388. A. Swami and B. M. Sadler, "TDE, DOA, and related p.uameter estimation problenls in impulsive nuise," in Pmc. IEEE SP IVorllSlJop 071 HOS, pp. 273-77. Banft: Canada, 1997.

367. D. Sengupta and S. .1\1. Kay, "Parametcr cstimation clnd GLRT detection

Si...fT1/fl/ Processi1l...fT,

T7'n11SflCti01IJ 011

1994.

pp. 4611-4640, 1978.

722-725,

Bell L\VStC1Il Tn·h. t., vol, 57, pp. 1371-1429,

376. D. T. M. SICKle, uBlind fracrioually-spaccd equalization"! pcrfcct-reconsrruction filter hanks and multichannel linear prediction," in Proc.

355. L. L. Scharf, Statistical S(qnnl Proccssiuq: Detection, Estimation, 111ld Time

H. Krirn, "Robust wavelet dcnoising," in ICASSP'97,

C.15C,"

1978.

Tbco1'Y, vol. IT-19, no. 3. pp. 422-427, 1973.

r. Schick and

Omura, R. Scholtz, and R. Levitt, SpreadSpectrum Conunu-

375. D. Slcpian, "Prolate spheroidal wave functions, Fourier analysis, and un-

354. L. L. Scharf, "Invariant Gauss-Gauss detection," IEEE Tram'. 01IIlIjb,.1II.

358.

1.

uicntion Handbook, NlcGrJw-Hill. New York, 1994.

precoders and equalizers, parts [ & II," to .\ppear in IEEE Tmusactionson Sifnnl Proccssinq, 1»».

~1A.

Infonuntion The-

rcr," IEEE Transactions {lJ1 SttJ1lni Processing; vol. 42, no. 10, pp. 2706-2714,

optimizing information rate in block transmissions over dispersive chan-

Analysis, Addison-Wesley, Reading,

01'1

36, pp. 312-321. 1990.

Monte Carlo simulations of detection and estimation of the sampling jir-

35] . A. Scaglione, S. Barbarossa, and G. B. Giannakis, "Filterbank transceivers

SC11CJ

\'0J.

Thomson's multi\vmuow

391. H. Tanizaki, Nonlillcnl'.fi/tcn: EJtzmntion nllrlApplicfltirJ1ls, Springer Verlag,

.ldaptive ~pectrllm cstimation method," IEEE T,.nlls. Atl'O. nlld Elee. ,~vs., vol.

New York, 1996.

29. pp. 1065-1070.1993.

92

3LJ2. M. A. Tanner, Toolsior Statistical lnjcrcncc. Springer Verlag, Nl:W York, 19lJ6. 393. L. Tauxc, "Scdimcntarv records of relative paleoinrcnsuv of the geol11.lgnetic field: rhcorv and practice." Rev. GCOphYSICS, vol. 31, pp. 31 ~-354, 1LJ93.

415. C. Tontiruttananon and J. Tuguair, "Identification of dosed-loop linear

systems via cyclic spectral analysis: An equation-error formulation," in Proc. IEEE Int. C01~f Acoust., Speech, nnd Stff. Proc., 1998.

Gt:ophys Rcs., vol. 9:;, pp. 12,337-12,350, 19YO.

418. r. Tsak~lides and C. Nikias, "The robust covariation-bascd MUSIC (ROC-NIUSIC) algorithm for bearing estimation in impulsive noise environments," IEEE Transnctions an S£f11lCl.1 P1Vcc~Jill!T, pp. 1613-1622, JuJ 1996.

3<)5. A... Tcwfik and A. Kim, "Correlation structure of the discrete wavelet cocttictcnrs of fractional brownian motion," IEEE TmJH. 011 IT, vol. IT-38, pp. LJ04-LJ09, 1<)92.

419. M, K. Tsatsanis. "Inverse filtering criteria for CDtvlA systems," IEEE

for

Transactions on SLIT1lfli Prol"f.rsillgg, pp. 102-112, Jan. 1997.

CDlvlA SY'stCIl1S," IEEE Prrstnutl COJ/J11l. A1n.!JClzi1'lc, vol. 3, no. 5, pp. 16-25,

420. M. K. Tsatsanis and G. B. Giannakis, "Multiratc filter banks for

Ocr. 19lJ6.

code-division multiple access systems," in Prof. ICASSP'95, pol. II, pp.

397. D. J, Thomson, "Spectrum estimation techniques tor characterization .1I1d

1484-1487,1995.

development of\VT4 waveguide," Bdl ,~Vj·tC1Jl Tech. t., vol. 56, pp. Pnrt I,

421. ;\,'1. K. Tsarsanis and G. B. Gianrukis, "Equalization of rapidly fading ch.uincls: Self recovering methods," IEEE Transactions on Caunnnnicatums; vol. 44, pro 619-630, 1996.

1769-1815, Pnit II, 1983-2005, 1977. 39R. D. J. Thomson, "Spectrum estimation and harmonic .malvsis." Proc. IEEE, vol. 70, pp. l055-10lJ6, 19H2. 399. D. 400. D.

422. M. K. Tsursanis and G. B. Giannakis, "Modeling and equalization of rapidlv fading channels,' Intcrnationnl [ournal ofAdaptipc Control and Siqnal

J. Thomson, "Mulri-window bispcctrum estimates," in P1'O(' IEEE

JVor/(Jbup

011 Hiqlur-ordcr

spectralanalvsis, pp. 19-23, Vad, Colorado, lSlS9.

J. Thomson, "Quadratic-inverse spectrum estimates:

Procf::iSi1llJ, vol. 10, pp. 159-176, 1996.

applications to

423. M. K. Tsatsanis and G. B. Giannakis, "Optimal linear receivers for

palcoclimatologv," Phil. Trans. R. Soc. Loua. , vol. A 332, pp. 539-5l.J7, 1<.)90.

DS-CDtvlA sysrcms: a signal processing approach," IEEE Transactionson StfJ1'1I11 Proccssinn, vol. 44, pp. 3044-3055, 1996.

J. Thomson, "Time series analysis of holocene climate data," PIJil. Tnm.l'. R. Soc. Loud., vol. A 330, pp. 601-616,1990.

424. LVI. K. Tsatsanis ..uid G. B. Giannakis, "Transmitter induced cvclosrauonarirv for blind channel equalization,') IEEE Transactionsau Sig-

-to 1. D.

UCl!

402. D. J. Thomson, "Non-stariouarv fluctuations in "stationary" time series,"

Prot: SPfE, vol. 2017, pp. 236-244, -t03. O.

19~3.

tion of fading channels with random coefficients," Si.Hnal Proccssinq, vol. 53, pp. 211-229,1996.

J. Thomson, "An overview of multiple-window and quadratic-inverse spec-

J. Thomson, "Projection filters tor

SP HT(W/:.fbop

OJ/

Stat.

S~J.

data arulvsis." in Proc.

St:J'C1ttJJ

426. J\lI. K.

IEEE

T~.ltS
and Z. Xu, "On minimum output energy cdrna receivers in

the presence of rnultipath," in Proc. 31st ConI ou Inf». Sciences and SJ'fte11/J, pp. 724-729, Johns Hopkins Univ., Baltimore tvlD, March 19-21,1997.

and rfrrny Proc., pp. 39-41, Quebec, 1994.

J. Thomson, "The seasons. glob·.lltemperature, and precession," Scicucc, vol. 268, pp. 59-68, 1995.

405. D.

427.

J.

TugnJit, "On time delay estimation with unknown spatially correlated

Gaussi.ui noise using tourth-order cumulants and cross cumulants," IEEE

Transactions Oil Acoustics, Speech, and Sill1tnl Pl'OCCSJi1l11, vol. ASSP-39, 1991.

406. D J. Thomson, "Dependence of glob.ll temper.lture~ 011 atmospheric CO 2 and solar irradiancc," Proc. Natl. Awn. Sci. U.C,A, vol. ~4, pp. 8370-S377, 1997. -t07. D.

Proccssinq, vol. 45, pp. L785-1794, 1997.

425.1\1. K. Tsarsanis, G. B. Giannakis, and G. Zhou, "Estimation and equalize-

trum estimation methods," in Proc. ICASSP, volume \'1, pp. 185-94,1994. 404. D.

new approach to rnultipath correction

417. P. Troughton and S. J. Godsill, "Bayesian 1110ddselection for time series using markov chain monte carlo," in IeA.SSP, pp. 3733-3736, 1997.

equatorial Pacific: relative palcointcnsitv of the gl:onugnetic field>," j.

AIT.1YS

'~A

of constant modulus signals,' IEEE T7·a1lS. ACOllSt., Speech, nnd Sig. Proc., vol. ASSP-31, pp. 459-472, April 1983.

39-+. L. Tauxe and G. yVU, "Normalized rcrn.incncc in sediments of the western

396. J. Thompson, P. Grant, .ind B Nluigrew, "Smarr Antenna

J. R. Treichler and B. G. Agee,

416.

428.

J. Tugruur, "Detection of non-Gaussian signals

using integrated

. vol. 42, no. 11, pp. polvspccrrum," IEEE Transnctions on Si.inuil Proccssiu..t1,

J. Thomson and A. 0 Ch.we, "JackknifcJ error estimates for spectra,

3137-3149, 1994.

coherencLs, ~lIld tLlllsrcr functions," in Arfl'nllccJ il1 SpectruJJJ AllCl~VJiJ fmd

429. J. Tugnait, "Corrections to "detection of non-Gaussian signals using integr.1ted polyspectruITI," IEEE TmuSl1eti{)}/J 011 S£mzn.l ProceSSi1llJ, vol. 43, no. 11, pp. 2792-27Sl3, 1995.

A1'ray Pl'ocnJilllT, S. Haykin, editor, volume 1, pp. S8-113, PITntice-Hall, 1<,)91. D. J. Thumson, C. G. Nbdcnn.ll1, and L. J. Lanzerottl, "PropagJ.tioll of <,obr oscillations thruugh the intl:rplanetary medium," Nl1ture, vol. 376, pp. 139-144,1995.

430.

J. Thomson and R. Schild, "Processes \vith level-dependent dday," in Prot:. IEEE SP ll'(}rl~J!Jop 01/ HtlTba-Ordc1' statiJtics, pro 374-378, Suuth L.lke Tahoe, CA., 1<)<,)3.

431. J. K. TugnJit,

J. Thomson <m<.l R. Schild, "'Time delay estimates for Q0957+S61 A, It'' in Applimt/lJ1H {~rtimt' serifSI11ln~"JiJ ill nstrmlOlll_l' aun JJ1ctco1"OIo!:(,', T. S. Rau, Nl. B. Prrestlcv, .llld O. Lessi, editurs, pp. 1R7-204, Chapman and H.lll, London, 1997.

432. J. K.. Tugnait, "Identification of ITIultivariable stochastic linear systems via spl:ctral analysis gIven time-domain data," IEEE Transactions on Sigllcr.l Pmassill17, vol. 46, pp. 1458-1463, 1998.

40~L

-!-09. D.

York, 1990.

43, pp. to appear, 1998.

-!-13. L. Tong, G. Xu, H. Hassibi, and T. K.lil.lth, "~Blind Ch.ll1l1cl i<.kntiticatio\l

434. J. K.. Tugnait and Y. Ye, "Stochastic systl:m identification with noisy input-output measurements using polyspectra," IEEE TrmlJ. AUTOJIUlticCOll-

based on second-order staristics: :\ tl-equency-domain .1pproach," IEEE T''nJlJ. olllnfinmntilJ1/ Theory, \'01. 4 L pp. 329-334, 19<,)5.

trol, vol. 40, pp. 670-683, 1995.

435. J. Tugn;lit, "IdentiticJtion and deconvolution uf l11ultichJ..nnellinear non-Gaussian processes using higher-order statistics and inverse tilter criteria," IEEE TrmlJflctlOllS 011 S£mwl Processillg, vol. 45, pp. 658-672, 1997.

4l4. L. Tong, G. Xu, .1Ild T. Kaibth, "Blind idcntificuion and equalization 11lJin"llll1tioll TbcOJ'}',

~l

time donuin appro.lCh," JEEE Tl'n11J.

multivariJbk stochastic linear systems via

433. J. K. Tugnait and C. Tontiruttananon, "Identification of linear systems via spectral .111alysis given time-domain Jata: consistency, reduced-order apprnxim ..ulon .111d perform.lI1ce .1I1alysis," IEEE T'"n.1lS. AUt01JlCltic COllT1'01, vol.

Electrol/. Lett., vol. 7, pp. 138-139,1971.

basl:d on second-order statistics:

~'Identitlc1tion of

al1.1lysis givcn noisy input-output time-domain data," IEEE Tra1ls.. A./ttomatic C01ltrol, vol. 43, pp. to appe.lr, 1998.

411. Nl. Tomlinsun, 'LNew .llltomatic equ.lliser cmplovlng modulo arithmetic,"

~e\\'

and

pol~'spectral

-!-10. D.

4]2. H. Tong, NOIl-/iIll'I11' Time SO'in, Oxt(.nd Ul1iversit~, Press,

r. K. Tugnait, "Idcntiticltion of linear stochastic systems via second-

fourth-order CluTIulant matching," IEEE TrmlS. 011 IIl/on-uatio1l Them')', vol. 33, pp. 393-407, 1987.

011

pp. 340-34-<,), 19<,)4.

93

436.

J. W. Tukey, "An

introduction to the calculations of numerical spcctrum

analysis," in SpectralAnalysis of Time Series, B. Harris, editor, pp. 25-46,

456. K. Wong and M. Zoltowski, "Uni-vcctor-scnsor ESPRIT for mulrisource

J.

azimuth, elevation, .1I1U polarization esrim..ition," IEEE Trans.

Wiley and Sons, 1967.

457. K. Wong and M. Zoltowski, "Closed-form direction-finding with arbi-

437. P. P. Vaidvanarhan, Multirate systemsaut!filter banks, Prentice Hall, New

tranlv spaced electromagnetic vector-sensors at unknown locarions," in

Jersey, ] 992.

Proc. IEEE Jut! COllI 011 Acaust.. Speech. nud StlT. Proc. (ICASSP98) , Pl': 1941J-1952, Seattle, \"l A, Nlay 1998.

438. P. van Overschcc and B. de Moor, Subspace Identificationfor Linear Systems:

Theory- Implcnumtatiau-Mcthods, Kluwcr Academic, Boston, MA. 1996.

458. G. Worncll, "Sprcad-sigrururc CD/VIA: Efficient multiuser commuruca-

439. B. D. Van Vcen and K. Buckley, "Bearntorrning:A Versatile Approach to

rions in the presencc of bding, " IEEE Trans. on Iufo. Theory, pp.

Spatial Filtering," IEEEASSP iV1.aptlzi1lt" vol. 5, no. 2, pp. 4-24, April 1988.

]418-1438, Sept. 1995.

440. B. D. V~U1 Veen and L. L. Scharf, "Esrimatiou of structured covariance matrices and multiple window spectrum analysis," IEEE Transactions

o.l-59. G. Worncll, "Emerging applications of multirare signal processing and W.l\'C-

011 Si..f.T-

let') in digital conununications." Proacdi1{qSl{tbc IEEE, pp. 586-603, 1996.

nal Proccssillp, vol. 38, pp. 1467-1471, 1990.

460. G. Worndl, Si.tflln/ Processint; witl) Fractals: A Wavelet-BasedApproach, Prentice-Hall, Upper Saddle River, NT, 1995.

441. S. Verdu, "Multiuser detection," in..Advances ill Stntistical S~T11fll P1'OCCSJill...q, V. P., editor, pp. 369-409, JAT, NY, 1997.

461. C. F. 1. Wu, "On the convergence propertIcs of the £1"1 .llgurithm:' ..1.111l111JofStntistics, \'01. 11, pp. 95-103, 1983.

442. F. L. Vernon III, Analysis of data recorded Oil the ANZil seismic lletwor!:, PhD thesis, Univ, Calif., San Diego, 19~Y.

462. G. Xu, H. Liu, L. Tong, and T. K.lil.1th, "A least-squares approach to

blind channel identification," IEEE Transactions011 Stqual Proccssin...a, vol. 43,

443. M. Vctterli .1I1d 1. Kovacevic, Wavelets milt Subbant! Codill[!, Prentice Hall, Englewood Cliffs, NJ, 1995.

pp. 2982-2993, 1995.

J. Yang and S. Roy, "On joint transmitter .111d receiver optimization for multiple-inpur-rnultiple-output (NUMO) transrmssion svsrcms," IEEE Trn/ls. 011 Connnuuicatious, pp. 3221-32~1, 1994.

463.

444. M. Viborg, B. Otrersren, and T. Kailath, "Detection and estimation in sensor arrays using weighted subspace fitting," IEEE Transactions011 Stf1unl Proccssiu...tT, vol. 39, no. 11, pp. 2436-2449, 1991.

464. X. Yu, T. S. Reed, and A. D. Stocker, "Cornparanve performance analysis

445. B. Vidakovic, "Nonlinear wavelet shrinkage with Bayes rules and Bayes,"

of adaptive multi-spectral detectors," IEEE Trnnsactums on Si...lJual P1'Occ.f.'.ill,tr, 41, no. 8, pp. 2639-2655, 1993.

[oumal oftbe American Statistical Association, vol. 93, pp. 173-179, 1998. 446.

J.

\'01.

Ville, "Theone ct applications de la notion de signal .uialytiquc," Cables

465. Y. Zh~lO, L. Atlas, and R. Marks, l
ct Transmissions, vol. 2A, pp. 61-74,1948.

eralized time-frcqucncv representations of nonstarionarv signals," IEEE

447. W. L. W. Willinger, M.S. Taqqu and D. Wilson, ~lOn the self-similar nature of ethernet traffic,' IEEE/.ACiVI T1'fl11S01l NftJl'odd1ll1, vol. 2, pp. 1-15,1994.

Transactiuns 011 Acoustics, 5fJL'/xlJ, and S[fJllnlPvoccssinp ; vol. 38, pp. 1084-1091,1990. 466. G. Zhou and G. Gi..innakiv, "Harmonics in Gaussian multiplicative and

448. A. T. Walden, "Multirapcr estimation of the innovation variance OLl stationary

additive noise: Cll.Bs," IEEE Transactiauson S[fTllfll Processiun, vol. 43, no.

rime series,' IEEETmusactians on S~11lfll Proccssiu...f1, vol. 43, pp. 181-187, 1995. 449. E. Wegman, S. Schwartz, and

5, pp. 1217-31, N1J~' 1995.

J. Thomas, editors, Topicsill Non-Gaussutn

467. G. Zhou ..ind G. Giannakis, "Harmonics in Gaussian multiplicative and

Silfufl1 Proccssiu...f.T, Springer- Vcrlag, New York, I 989.

additive noise: performance analvsix of cyclic statistics," IEli}" Trausactunis StIJ1l111 Proccssinq, vol. 43, no. 6, pp. 1445-60, [un IlJ96.

450. B. Widrow and S. Stearns, Adaptive S£f/1tnl Proccssiuq, Prentice-Hall,

(}11

Englewood Cliff.c;, N.r., 1985.

468. T. Zhu, K.-P. Chun, and G.

451. E. P. Wigner, "On the quantum correction tor thermodynamic equilib-

determination

452. K. M. \"Iong and S. Chen, ~'Detcction of IlJITow-b,lnd sonar signals using

u~ing

469. V. ZolorJITv, 01lf-dilllL'1lJiollal stab!l'distnbutlO1lJ, American Ivlathcmatical Societ\', }986.

order statistical filters," IEEE T1"n1l.l·,.Acoustics, Speed), nlld S[tT1lfll P1·ocesJill•..fT, vol. ASSP-35, no. 5, pp. 597-613, 1987.

470. M. Zoltowski and K. vVOl1g, '~Pubriz~ltjon diversity .md extcnded-.lpcrture sp~ltial diversity to mitigate t:lding-channd effccts with ~1 ~pclrse ~l1TJY of elcctric dipolc.:s or magnctic loops," in J)/'(}f. IEEE Iut. Vt:bic. Tce/;. COllf, pp. 1163-1167,1997.

453. K. Wong and M. Zoltowski, UHigh accuracy 2D .lngle esrinl.ltion with extended ilperulrc vector sensor .lrrays," in Proe. IEEE Iut! COllf 011 AUJIlst.,

Speccb, tmd Sig. Proc. (lCASSP96), pp. 2789-1792, Atbnt.l, GA, 1996. ~lClosed-form undcrwater

r. West, "High-frequency p-wv«: .ureuuarion

multiple-winduw ~pectl'al analysis nlcrhf Jd," Bull. Sfis1I1. Soc. .l"!m., vol. 79, pp. 1054-1069, 1lJ89.

rium," Physics R£'piCJP, vol. 40, pp. 749-759, 1932,

454. K. Wong and M. Zolto\\'ski,

A-!lItC1I1U1Stntd

Prop" vol. 45, pp. 1467-1474, October 19<)7.

471. \tV. Y. Znu .1I1d Y. VVu, "'OFD~/l: An overVlcw," IEEE Tralls. ill...f.T, vol. 41, pp. 1-8, 19,}5.

acoustic direc-

tiun-tinding with arbitrarily spaced vector-hydrophones at unknown locations," IEEE J. Ocemlic Ell...f.T., vol. 22, pp. 649-658, October 1997.

Ull

Broadcast-

472. A. !vI. Zoubir, LlThe bootstrap .1Ild ir~ appli<.:.\riolls," in I'roc. IC'ASSl', volume VI, pp. 65-100,1994.

455. K. Wong and M. Zoltowski, ~~Exrended-apertureunderwatcr Jcoustic l11ultisource ilzimuth/clcv.ltion direction-finding using uniformly but

473. Z. Zvon.lr ~lI1d D. Rr,ld~', «Linc.w lllultipJch-decorrel.ulIlg n:celvers for

sp.lrscly spaced vcctor hydrophones," IEEE f. OCClmic EU.!I., vol. 22, Pl'. 659-672, October 1997.

CDrvlA fi'equcncy-sdecri\'c t:lding ch.lnncl\:' IEEE Tm1/s. tirms, pp. 650-663, Jan. 1996.

94

011 C01JJ1Jllwim-

Application of Antenna Arrays to Mobile Communications, Part II: Beam-Forming and Direction-of-Arrival Considerations

LAL C. GODARA, SENIOR MEMBER, IEEE

Array processing involves manipulation of signals induced on various antenna elements. Its capabilities of steering nulls to reduce cochannel interferences and pointing independent beams toward various mobiles, as well as its ability to provide estimates of directions of radiating sources, make it attractive to a mobile communications system designer. Array processing is expected to play an important role in fulfilling the increased demands of various mobile communications services. Part I of this paper showed how an array could be utilized in different configurations to improve the performance of mobile communications systems, with references to various studies where feasibility of an array system for mobile communications is considered. This paper provides a comprehensive and detailed treatment of different beam-forming schemes, adaptive algorithms to adjust the required weighting on antennas, direction-of-arrival estimation methods (including their performance comparison), and effects of errors on the performance of an array system, as well as schemes to alleviate them. This paper brings together almost all aspects of array signal processing. It is presented at a level appropriate to nonexperts in the field and contains a large reference list to probe further.

Keywords: Beam forming, conjugate gradient method, eigenstructure methods, ESPRIT, least square algorithm, linear prediction method, maximum entropy, maximum likelihood method, minimum norm, mobile communications, multipath arrivals, MUSIC, MVDR estimator, neural networks, recursive least square algorithm, weighted subspace fitting.

Manuscript received March 30, 1997; revised May 10, 1997. The author is with University College, University of New South Wales, Australian Defence Force Academy, School of Electrical Engineering, Canberra ACT 2600 Australia (e-mail: [email protected]). Publisher Item Identifier S 0018-9219(97)05718-6.

NOMENCLATURE

$A$: L by M matrix, with its columns being the steering vectors.
$A_i$: Amplitude of the ith source using frequency modulation.
$\mathbf{a}_k$: Denotes L − 1 weights after the kth tap in TDL structure in a beam-space processor.
AIC: Akaike's information criterion.
$B$: Blocking matrix or the matrix prefilter for a narrow-band beam-space processor.
BER: Bit error rate.
BPSK: Binary phase shift keying.
$C$: LJ × J constraint matrix.
$c$: Speed of propagation of a plane wave front.
CANAL: Concurrent nulling and location.
CDMA: Code division multiple access.
CMA: Constant modulus algorithm.
CRLB: Cramer-Rao lower bound.
$d$: Interelement spacing of a linear equispaced array.
$d_i(n)$: Message symbol in TDMA system and message sequence in CDMA system (associated with the ith source).
DOA: Direction of arrival.
$E[\cdot]$: Expectation operator.
$\mathbf{e}_1$: Vector of all zeros except the first element, which is equal to unity.
ESPRIT: Estimation of signal parameters via rotational invariance technique.
$\mathbf{f}$: J-dimensional vector specifying the frequency response in the look direction.
$f_0$: Center frequency.
$f_N$: Nyquist frequency.
FBW: Fractional bandwidth.
FDMA: Frequency division multiple access.
FFT: Fast Fourier transform.
FINE: First principal vector.
FIR: Finite impulse response.
$\hat{G}$: Array gain of the optimal processor.
$G_{xy}(f)$: Cross-power spectrum of two broad-band signals $x(t)$ and $y(t)$.
$g(t)$: Pseudo-random noise binary sequence having the values +1 or −1.
$\mathbf{g}(\mathbf{w}(n))$: Unbiased estimate of the gradient of the mean squared error or the mean output power.
GMSK: Gaussian minimum shift keying.
GSC: Generalized side-lobe canceller.
GSM: Global system for mobile communications.
$H(f)$: Transfer function.

Reprinted from Proceedings of the IEEE, Vol. 85, No. 8, pp. 1195-1245, August 1997.


HEOS I

J

Jo

J(n)

L

In[·]

LMS LS M

m.s(t) MAP MOL MEM

min-norm

ML MLM MMSE MSE MVDR

MUSIC N

Ni

'nt(t)

P~-1N(O) P~'1U (())

Highly elliptical orbit satellite. Identity matrix. Number of taps in a tapped delay line structure. Reflection matrix with all its elements along the secondary diagonal being equal to unity and zero elsewhere. Cumulative mean square error at the nth iteration, cost function. Number of elements in a subarray. Covariance matrix of the weights at the nth iteration. Number of elements in an array. Natural logarithm of [·]. Least mean square. Least square. Number of directional sources, number of beams in a beam-space processor. Misadjustment. Complex modulating function of the ith source. Modulating function of the signal at time instant n. Modulating function of the signal source at time instant t. Maximum a posteriori. Minimum description length. Maximum entropy method. Minimum norm. Maximum likelihood. Maximum likelihood method. Minimum mean squared error. Mean squared error. Minimum variance distortionless response. Multiple signal classification. Number of samples. Number of possible combinations of elements with lag i. Random noise component on the $\ell$th element. Noise-alone matrix inverse. Projection operator. Power estimated by Bartlett method as a function of $\theta$. Power estimated by linear prediction method as a function of $\theta$. Power estimated by MEM as a function of $\theta$. Power estimated by minimum norm method as a function of $\theta$. Power estimated by MUSIC method as a function of $\theta$. Power estimated by MVDR method as a function of $\theta$. Output noise power. Mean output power of the processor for a given $\mathbf{w}$.

p,

PI Ps

p(t)

PIC QPSK

9.(t) R

R(n)

'£i T -d

RLS RMS

S

S(f)

SMI SNR SPNMI STD

T

TAM TDL TDMA TLS Tr(R) [1.",

Us


Power of the ith source as measured at the reference element. Power of a directional interference. Power of the source in the look direction, referred to as the signal source. Mean output power of the conventional processor. Sampling pulse. Postbeam-former interference canceller. Quadrature phase shift keying. M − 1 dimensional vector denoting outputs of M − 1 auxiliary beams of a beam-space processor. Array correlation matrix. Array correlation matrix estimate at time instant n. mth subarray matrix of the forward method. mth subarray matrix of the backward method. Noise-only array correlation matrix. Reference signal. ith correlation lag. Position vector of the $\ell$th element. LJ-dimensional vector denoting correlation between the desired signal and the array signal vector. Recursive least square. Root mean square. Steering vector in the look direction. Steering vector in direction $(\phi_I, \theta_I)$. Steering vector associated with the direction $(\phi_i, \theta_i)$ or the ith source. Steering vector associated with the direction $\theta$. M by M matrix denoting the source correlation. Power density of broad-band signal $s(t)$. Sample matrix inversion. Signal-to-noise ratio. Signal-plus-noise matrix inverse. Standard deviation. Delay between successive taps of TDL filter. Bulk delay. Steering delay in front of $\ell$th element to steer an array in $(\phi_0, \theta_0)$ direction. Steering delay in front of $\ell$th element to steer an array in look direction. Toeplitz approximation method. Tap delay line. Time division multiple access. Total least square. Trace of R. Matrix with its L − M columns being the eigenvectors corresponding to the L − M smallest eigenvalues of R. Matrix with its M columns being the eigenvectors corresponding to the M largest eigenvalues. Unit-norm eigenvector corresponding to

Sampling interval. Magnitude of the displacement vector. Forgetting factor. Complex scalar denoting the correlation between the signal and an interference. Correlation between two broad-band signals $x(t)$ and $y(t)$. Error signal. $\varepsilon(\cdot)$ Error signal between the reference signal $r(n)$ and modified output. 6.i(n) Change in error signal when array output is perturbed by a small amount $\Delta y$. Error between the array output and the reference signal for a given $\mathbf{w}(n)$. Look direction, $(\phi_0, \theta_0)$. Direction of an interference, $(\phi_I, \theta_I)$. Direction of the ith source, $(\phi_i, \theta_i)$. Message part of the ith source using frequency modulation, $\xi_i(t)$. L by L diagonal matrix with $\lambda_\ell$, $\ell = 1, \ldots, L$ being its diagonal entries. $\ell$th eigenvalue of the array correlation matrix. Maximum eigenvalue of R, $\lambda_{\max}$. $\ell$th eigenvalue of PRP, $\lambda_\ell(PRP)$. Maximum eigenvalue of PRP, $\lambda_{\max}(PRP)$. Constant, $\mu_0$. Gradient step size, $\mu$. Step size at the nth iteration, $\mu(n)$. Scalar quantity, which depends on the direction of the interference relative to the signal source and the array geometry, $\rho$. Correlation function of a broad-band signal. Cross-correlation function. L by L matrix with $\mathbf{U}_\ell$, $\ell = 1, \ldots, L$ being its columns. Variance of random noise. Variance of quantization noise. Direction of the ith source. Time constant of the $\ell$th trajectory. Time taken by a plane wave arriving from the ith source in direction $\theta_i$ and measured from the $\ell$th element to the reference point. Time taken by a plane wave arriving from the ith source in direction $(\phi_i, \theta_i)$ and measured from the $\ell$th element. Differential delay between elements i and j due to a source in direction $\theta$. Complex conjugate, *. Transpose of a vector or matrix. Complex conjugate transpose of a vector or matrix.

At.

'U -1

ULA

vo» V~C~(n))

f!(¢'i' (jd W's

1dl(n

+ 1)

si.; 'ill

:fu \ 1S E j

WSF

£(n)

;r(t) x'(t) ~s(t) £1\'

(71,)

y(t)

y(n)

'y(n)

Yo

Column vector of all zeros except one element, which is equal to one. Uniformly spaced linear array. Difference between estimated weights and the optimal weights at the nth iteration. Variance of the gradient. Unit vector in direction $(\phi_i, \theta_i)$. Matrix prefilter to block the look direction in a broad-band beam-space processor. Weighting on $\ell$th element for the narrow-band beam former. Array weight vector. Mean of the estimated weights at the nth iteration. Array weight vector at time instant n + 1, new weights computed at the (n + 1)th iteration. Array weights of the conventional beam former. L weights after the (m − 1)th tap in TDL structure. Weights of the optimal beam former. Weights with minimum mean squared error. Weighted subspace fitting. Total signal induced on the $\ell$th element due to all M directional sources and background noise. Signal induced on the $\ell$th element due to the signal sources only. Array signal vector at time instant n. Array signal vector at time instant t. L − 1 dimensional signal vector following matrix prefilter. Array signal vector at time instant t due to the signal sources only. Array receiver vector not containing the signal at time instant n. Output of a beam former at time t. Output of a beam former when it is operating with weights $\mathbf{w}(n)$. Modified output of a beam former when it is operating with weights $\mathbf{w}(n)$. Output of the main beam of a beam-space processor. Weighted output of the auxiliary beams of a beam-space processor. Desired amplitude in the absence of interference. Correlation between the reference signal and the array signals vector. Output SNR of the optimal processor.

I. INTRODUCTION

The demand for wireless mobile communications services is growing at an explosive rate, with the anticipation that communication to a mobile device anywhere on the globe at all times will be available in the near future. An array of antennas mounted on vehicles, ships, aircraft, satellites, and base stations is expected to play an important role in fulfilling these services' increased demand for channels and in realizing the dream that a portable communication device the size of a wristwatch will be available at an affordable cost for such services. Part I of this paper showed how an antenna array could be used in various configurations to improve the performance of mobile communications systems, with references to theoretical analyses, computer simulations, and experimental system developments.

Array signal processing involves the manipulation of signals induced on the elements of an array. The widespread interest in the subject area has been maintained over decades due to its applicability to many walks of life. The first issue of IEEE TRANSACTIONS ON ANTENNAS AND PROPAGATION, published in 1964 [1], has been followed by many special issues of various journals [2]-[6], a number of books [7]-[12], a selected bibliography [13], and a vast amount of specialized research papers. Some of the general papers that discuss various issues include [14]-[31].

This paper provides a comprehensive review of various beam-forming schemes, adaptive algorithms to adjust the required weighting on antennas, DOA estimation methods, and array-system sensitivity to parameter perturbations. As array signal processing has applications in many other disciplines, this paper aims to provide a complete treatment of the subject area by extending coverage to topics that might not be directly relevant to mobile communications. This paper, however, provides references where beam-forming and DOA estimation methods have been suggested for mobile communications systems.
In Section II, a signal model useful for array processing is presented along with various beam-forming schemes, including descriptions of conventional delay-and-sum beam formers, null steering, constrained beam forming and optimization using a reference signal, beam-space processing, broad-band array processing in time and frequency domains, digital beam forming, and eigenstructure methods. Section III describes adaptive algorithms to adjust the weights of an array. These algorithms include SMI, unconstrained as well as constrained LMS, normalized LMS, structured gradient, RLS, CMA, conjugate gradient method, and neural-network approach to beam forming. Some discussion on implementation issues, convergence characteristics of adaptive algorithms, and signal sensitivity of the LMS algorithm is also provided in this section. Section IV describes various DOA estimation methods, compares their performance, and analyzes their sensitivity. These methods include spectral estimation, MVDR estimator, linear prediction, maximum entropy, ML, various eigenstructure methods (including many versions of MUSIC algorithms), min-norm, CLOSEST, ESPRIT, and WSF. This section also contains a discussion on various preprocessing and number-of-source estimation methods.

Section V discusses the effect of errors and perturbations on the performance of the array processing schemes. A signal model applicable to multipath situations is discussed, and it is pointed out how multipath degrades the performance of an array processing system. Various cures for multipath degradation are highlighted in this section, which also presents a discussion on look direction and steering vector errors, element failure and element position errors, and weight errors. References to many robust beam-forming schemes are also included in this section. Section VI concludes this paper.

II. BEAM FORMING

In this section, various beam-forming methods are discussed in detail. First, notation, terminology, and a signal model useful for this purpose are introduced.

A. Terminology and Signal Model

Consider an array of L omnidirectional elements immersed in a homogeneous medium in the far field of M uncorrelated sinusoidal point sources of frequency $f_0$. Let the origin of the coordinate system be taken as the time reference, as shown in Fig. 1. Thus, the time taken by a plane wave arriving from the ith source in direction $(\phi_i, \theta_i)$ and measured from the $\ell$th element to the origin is given by

$$\tau_\ell(\phi_i,\theta_i) = \frac{\mathbf{r}_\ell \cdot \hat{\mathbf{v}}(\phi_i,\theta_i)}{c} \tag{1}$$

where $\mathbf{r}_\ell$ is the position vector of the $\ell$th element, $\hat{\mathbf{v}}(\phi_i,\theta_i)$ is the unit vector in direction $(\phi_i, \theta_i)$, $c$ is the speed of propagation of the plane wave front, and $\cdot$ represents the inner product. For a linear array of equispaced elements with element spacing $d$ aligned with the x-axis such that the first element is situated at the origin, it becomes

$$\tau_\ell(\theta_i) = \frac{d}{c}(\ell - 1)\cos\theta_i. \tag{2}$$

The signal induced on the reference element due to the ith source is normally expressed in complex notation as

$$m_i(t)\,e^{j2\pi f_0 t} \tag{3}$$

with $m_i(t)$ denoting the complex modulating function. The structure of the modulating function reflects the particular modulation used in a communications system. For example, for an FDMA system, it is a frequency-modulated signal given by $m_i(t) = A_i e^{j\xi_i(t)}$, with $A_i$ denoting the amplitude and $\xi_i(t)$ denoting the message. For a TDMA system, it is given by

$$m_i(t) = \sum_{n} d_i(n)\,p(t - n\Delta) \tag{4}$$

where $p(t)$ is the sampling pulse, the amplitude $d_i(n)$ denotes the message symbol, and $\Delta$ is the sampling interval. For a CDMA system, $m_i(t)$ is given by

$$m_i(t) = d_i(n)\,g(t) \tag{5}$$

where $d_i(n)$ denotes the message sequence and $g(t)$ is a pseudo-random noise binary sequence having the values +1 or −1 [32]. In general, the modulating function is normally modeled as a complex low-pass process with zero mean and variance equal to the source power $p_i$, as measured at the reference element.

Fig. 1. Definition of coordinate system.

Assuming that the wavefront on the $\ell$th element arrives $\tau_\ell(\phi_i,\theta_i)$ seconds before it arrives at the reference element, the signal induced on the $\ell$th element due to the ith source can be expressed as

$$m_i(t)\,e^{j2\pi f_0\left(t + \tau_\ell(\phi_i,\theta_i)\right)}. \tag{6}$$

The expression is based upon the narrow-band assumption for array signal processing, which assumes that the bandwidth of the signal is narrow enough and that the array dimensions are small enough for the modulating function to stay almost constant during $\tau_\ell(\phi_i,\theta_i)$ seconds, that is, $m_i(t) \cong m_i(t + \tau_\ell(\phi_i,\theta_i))$. Let $x_\ell$ denote the total signal induced on the $\ell$th element due to all M directional sources and background noise. It is given by

$$x_\ell = \sum_{i=1}^{M} m_i(t)\,e^{j2\pi f_0\left(t + \tau_\ell(\phi_i,\theta_i)\right)} + n_\ell(t) \tag{7}$$

where $n_\ell(t)$ is a random noise component on the $\ell$th element, which includes background noise and electronic noise generated in the $\ell$th channel. It is assumed to be temporally white with zero mean and variance equal to $\sigma_n^2$. It should be noted that if the elements were not omnidirectional, then the signal induced on each element due to a source is scaled by an amount equal to the response of the element under consideration in the direction of the source.

Consider a narrow-band beam former, shown in Fig. 2, where signals from each element are multiplied by a complex weight and summed to form the array output. The figure does not show components such as preamplifiers, bandpass filters, and so on. It follows from the figure that an expression for the array output is given by

$$y(t) = \sum_{\ell=1}^{L} w_\ell^{*}\,x_\ell(t) \tag{8}$$

where * denotes the complex conjugate.

Fig. 2. Narrow-band beam-former structure.

Denoting the weights of the beam former as

$$\mathbf{w} = [w_1, w_2, \ldots, w_L]^T \tag{9}$$

and signals induced on all elements as

$$\mathbf{x}(t) = [x_1(t), x_2(t), \ldots, x_L(t)]^T \tag{10}$$

the output of the beam former becomes

$$y(t) = \mathbf{w}^H \mathbf{x}(t) \tag{11}$$

where superscripts T and H, respectively, denote the transpose and complex conjugate transpose of a vector or matrix. Throughout this paper, $\mathbf{w}$ and $\mathbf{x}(t)$ are referred to as the array weight vector and the array signal vector, respectively. If the components of $\mathbf{x}(t)$ can be modeled as zero mean stationary processes, then for a given $\mathbf{w}$, the mean output power of the processor is given by

$$P(\mathbf{w}) = E[y(t)\,y^{*}(t)] = \mathbf{w}^H R\,\mathbf{w} \tag{12}$$

where $E[\cdot]$ denotes the expectation operator and $R$ is the array correlation matrix defined by

$$R = E[\mathbf{x}(t)\,\mathbf{x}^H(t)]. \tag{13}$$

Elements of this matrix denote the correlation between various elements. For example, $R_{ij}$ denotes the correlation between the ith and the jth element of the array. Denote the steering vector associated with the direction $(\phi_i, \theta_i)$ or the ith source by an L-dimensional complex vector $\mathbf{S}_i$ as

$$\mathbf{S}_i = \left[\exp\!\left(j2\pi f_0 \tau_1(\phi_i,\theta_i)\right), \ldots, \exp\!\left(j2\pi f_0 \tau_L(\phi_i,\theta_i)\right)\right]^T. \tag{14}$$

Algebraic manipulation using (7), (10), and (13) leads to the following expression for $R$:

$$R = \sum_{i=1}^{M} p_i\,\mathbf{S}_i \mathbf{S}_i^H + \sigma_n^2 I \tag{15}$$

where $I$ is an identity matrix and $p_i$ denotes the power of the ith source measured at one of the elements of the array. It should be noted that $p_i$ is the variance of the complex modulating function $m_i(t)$ when it is modeled as a zero mean low-pass random process, as mentioned previously. Using matrix notation, the correlation matrix $R$ may be expressed in the following compact form:

$$R = A S A^H + \sigma_n^2 I \tag{16}$$

where columns of the L by M matrix $A$ are made up of steering vectors, i.e.,

$$A = [\mathbf{S}_1, \mathbf{S}_2, \ldots, \mathbf{S}_M] \tag{17}$$

and M by M matrix $S$ denotes the source correlation. For uncorrelated sources, it is a diagonal matrix with

$$S_{ij} = \begin{cases} p_i, & i = j \\ 0, & i \neq j. \end{cases} \tag{18}$$

Sometimes, it is useful to express $R$ in terms of its eigenvalues and their associated eigenvectors. The eigenvalues of $R$ can be divided into two sets when the environment consists of uncorrelated directional sources and uncorrelated white noise. The eigenvalues contained in one set are of equal values. Their value does not depend upon the directional sources and is equal to the variance of the white noise. The eigenvalues contained in the second set are a function of the parameters of the directional sources, and their number is equal to the number of these sources. Each eigenvalue of this set is associated with a directional source, and its value changes with the change in the source power of this source. The eigenvalues of this set are bigger than those associated with white noise. Sometimes, these eigenvalues are referred to as the signal eigenvalues, and the others belonging to the first set are referred to as the noise eigenvalues. Thus, the $R$ of an array of L elements immersed in M directional sources and the white noise has M signal eigenvalues and L − M noise eigenvalues. Denoting the L eigenvalues of $R$ in descending order by $\lambda_\ell$, $\ell = 1, \ldots, L$ and their corresponding unit-norm eigenvectors by $\mathbf{U}_\ell$, $\ell = 1, \ldots, L$, the matrix takes the following form:

$$R = \boldsymbol{\Sigma} \Lambda \boldsymbol{\Sigma}^H \tag{19}$$

with a diagonal matrix

$$\Lambda = \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_L \end{bmatrix} \tag{20}$$

and

$$\boldsymbol{\Sigma} = [\mathbf{U}_1, \ldots, \mathbf{U}_L]. \tag{21}$$

This representation sometimes is referred to as the spectral decomposition of $R$. Using the fact that the eigenvectors form an orthonormal set, this leads to the following expression for $R$:

$$R = \sum_{\ell=1}^{M} \lambda_\ell\,\mathbf{U}_\ell \mathbf{U}_\ell^H + \sigma_n^2 I. \tag{22}$$

There are many schemes to select the weights of the beam former depicted in Fig. 2, each with its own characteristics and limitations. Some of these are now discussed.

B. Conventional Beam Former

A conventional beam former is a simple beam former, sometimes known as the delay-and-sum beam former, with all its weights of equal magnitudes. The phases are selected to steer the array in a particular direction $(\phi_0, \theta_0)$, known as the look direction. With $\mathbf{S}_0$ denoting the steering vector in the look direction, the array weights are given by

$$\mathbf{w}_c = \frac{1}{L}\,\mathbf{S}_0. \tag{23}$$

The array with these weights has unity response in the look direction, that is, the mean output power of the processor due to a source in the look direction is the same as the source power. This may be understood as follows. Assume that there is a source of power $p_s$ in the look direction, hereafter referred to as the signal source, with $m_s(t)$ denoting its modulating function. The signal induced on the $\ell$th element due to this source only is given by

$$x_{\ell s}(t) = m_s(t)\,e^{j2\pi f_0\left(t + \tau_\ell(\phi_0,\theta_0)\right)}. \tag{24}$$

Thus, in vector notation, using a steering vector to denote relevant phases, the array signal vector due to the look direction signal becomes

$$\mathbf{x}_s(t) = m_s(t)\,e^{j2\pi f_0 t}\,\mathbf{S}_0 \tag{25}$$

and the output of the array with weight vector $\mathbf{w}_c$ becomes

$$y(t) = \mathbf{w}_c^H \mathbf{x}_s(t) = m_s(t)\,e^{j2\pi f_0 t} \tag{26}$$

yielding the mean output power of the processor

$$P(\mathbf{w}_c) = E[y(t)\,y^{*}(t)] = p_s. \tag{27}$$

Thus, the mean output power of the conventional beam former steered in the look direction is equal to the power of the source in the look direction. The process is similar to steering the array mechanically in the look direction except that it is done electronically by adjusting the phases. This is also referred to as electronic steering, and phase shifters are used to adjust the required phases. It should be noted that the aperture of an electronically steered array is different from that of a mechanically steered array.

The concept of a delay-and-sum beam former can be further understood with Fig. 3, which shows an array with two elements separated by distance $d$. Assume that a plane wave arriving from direction $\theta$ induces voltage $s(t)$ on the first element. As the wave arrives at the second element $\tilde{T}$ seconds later, with

$$\tilde{T} = \frac{d}{c}\cos\theta \tag{28}$$

the induced voltage on the second element equals $s(t - \tilde{T})$. If the voltage induced at the first element is delayed by an amount equal to $\tilde{T}$, producing voltage $s(t - \tilde{T})$, and no delay is provided at the second element, then both voltage waveforms appear in phase and the output of the beam former is produced by summing these waveforms. A scaling of each waveform by 0.5 provides the gain in direction $\theta$ equal to unity.

Fig. 3. Two-element delay-and-sum beam-former structure.

In an environment consisting of only uncorrelated noise and no directional interferences, this beam former provides maximum SNR. For uncorrelated noise, the $R_N$ is given by

$$R_N = \sigma_n^2 I \tag{29}$$

and the output noise power of the beam former

$$P_N = \mathbf{w}_c^H R_N \mathbf{w}_c = \frac{\sigma_n^2}{L}. \tag{30}$$

It shows that the noise power at the array output is L times less than that present on each element. Thus, the processor with unity gain in the signal direction has reduced the uncorrelated noise by L, yielding the output SNR $= p_s L/\sigma_n^2$. As the input SNR is $p_s/\sigma_n^2$, this provides an array gain, which is defined as the ratio of the output SNR to the input SNR, equal to L, the number of elements in the array.

Though this beam former provides maximum output SNR when there is no directional jammer operating at the same frequency, it is not effective in the presence of directional jammers, intentional or unintentional. The response of the processor toward a source in direction $(\phi_I, \theta_I)$ is given by

$$\mathbf{w}_c^H \mathbf{S}_I = \frac{1}{L}\,\mathbf{S}_0^H \mathbf{S}_I \tag{31}$$

where $\mathbf{S}_I$ denotes the steering vector in direction $(\phi_I, \theta_I)$. In the next section, a beam former that puts nulls in the directions of interferences is described.

C. Null-Steering Beam Former

A null-steering beam former is used to cancel a plane wave arriving from a known direction and thus produces a null in the response pattern in the DOA of the plane wave. One of the earliest schemes, referred to as DICANNE [33], [34], achieves this by estimating the signal arriving from a known direction by steering a conventional beam in the direction of the source and then subtracting the output of this from each element. An estimate of the signal is made by delay-and-sum beam forming using shift registers to provide the required delay at each element such that the signal arriving from the beam-steering direction appears in phase after the delay. Then these waveforms are summed with equal weighting. This signal is then subtracted from each element after the delay. The process is very effective for canceling strong interference and could be repeated for multiple interference cancellation.

Though the process of subtracting the estimated interference using a delay-and-sum beam former used by the DICANNE scheme is easy to implement for single interference, it becomes cumbersome as the number of interferences grows. A beam with unity response in the desired direction and nulls in interference directions may be formed by estimating the weights of a beam former, shown in Fig. 2, using suitable constraints [22], [34]. Assume that $\mathbf{S}_0$ is the steering vector in the direction where unity response is required and that $\mathbf{S}_1, \ldots, \mathbf{S}_k$ are k steering vectors associated with k directions where nulls are required. The desired weight vector is the solution of the following simultaneous equations:

$$\mathbf{w}^H \mathbf{S}_0 = 1 \tag{32}$$

$$\mathbf{w}^H \mathbf{S}_i = 0, \quad i = 1, \ldots, k. \tag{33}$$

Using matrix notation, this becomes

(31 )

-w

101

H

A == -1 cT

(34 )

where R N is the array correlation matrix of the noise alone, that is, it does not contain any signal arriving from the look direction (1)0, ( 0 ) , and /-to is a constant. For an array constrained to have a unit response in the look direction, this constant becomes

where A is a matrix with its columns being the steering vectors associated with all directional sources, including the look direction, that is

A £ [so S1 ... ,Sk] - ,-,

(35)

and ~l is a vector of all zeros except the first element, which is one, that is ~1 ==

u,o, ... ,O]T.

J.Lo:=

1

(40)

HR- 1

~o

f\T ~o

leading to the following expression for the weight vector:

(36)

For k == L - 1, A is a square matrix. Assuming that the inverse of A exists, which requires that all steering vectors are linearly independent [35], the solution for the weight vector is given by

"

1Jl.

==

R -f\! 1 ~o

H

~o

(4] )

tt;-1 ~o .

As the weights are computed using NAME, the processor with these weights is referred to as the NAME processor [42]. It is also known as the ML filter [43], as it finds the ML estimate of the power of the signal source, assuming all sources as interferences. It should be noted Rf\r may not be invertible when the background noise is very small. In that case, it becomes a rank deficient matrix. In practice, when the estimate of the noise-alone matrix is not available, the total R (signal plus noise) is used to estimate the weights and the processor is referred to as the SPNMI processor. An expression for the weights for this case is given by

(37) In case the steering vectors are not linearly independent,

A is not invertible, and its pseudo inverse can be used in

its place. It follows from this equation that due to the structure of the vector ~l' the first row of the inverse of matrix A forms the weight vector. Thus, the weights selected as the first row of the inverse of matrix A have the desired properties of unity response in the look direction and nulls in the directions of interferences. When the number of required nulls is less than L - 1, A is not a square matrix. A suitable estimate of weights may be produced using

"

w.:=

R-

1 §.0 ' sH R-1 S0 -0

(42)

These weights are the solution of the following optimization problem:

(38)

minimize .YL

Though the beam pattern produced by this beam former has nulls in the directions of interferences, it is not designed to minimize the uncorrelated noise at the array output. It is possible to achieve this by selecting weights that minimize the mean output power subject to the above constraints [36]. An application of a null-steering scheme for detecting an amplitude modulated signal by placing nulls in the known directions of interferences is described in [37], which is able to cancel a strong jammer in a mobile communications system. The use of a null-steering scheme for a transmitting array employed at a base station, discussed in [38], minimizes the interferences toward other cochannel mobiles. A performance analysis of a null-steering algorithm is presented in [39].

subject to

(43)

Thus, the processor weights are selected by minimizing the mean output power of the processor while maintaining unity response in the look direction. The constraint ensures that the signal passes through the processor undistorted. Therefore, the output signal power is the same as the look-direction source power. The minimization process then minimizes the total noise, including interferences and uncorrelated noise. Minimizing the total output noise while keeping the output signal constant is the same as maximizing the output SNR. It should be noted that the weights of the NAMI processor and the SPNAMI processor are identical, and in the absence of errors, the processor performs identically in both cases. This fact can be proved as follows. The Matrix Inversion Lemma for an invertible matrix .4 and a vector {f states that

D. Optimal Beam Forming The null-steering scheme described in the previous section requires knowledge of the directions of interference sources, and the beam former using the weights estimated by this scheme does not maximize the output SNR.. The optimal beam-forming method described in this section overcomes these limitations. Let an L-dimensional complex vector iQ represent the weights of the beam fanner shown in Fig. 2, which maximizes the output SNR. For an array that is not constrained, an expression for ill. is given by [17], [24], [40], [41]

H)-l _ A- 1 --

(A +xx

_

A- l J'. J2H A- l H' 1 + {f ..4 - 1 £

(44)

Since

it follows from the Matrix Inversion Lemma that R-

(39)

102

1 -

-

s:: _P N

S

R -f\! l §.o§.oHR-1 l\T H

1 + ~O

-1

Rf\,T

~oPs

.

(46)

A substitution for R- 1 in (42) and algebraic manipulation leads to the expression for weights given by (41), showing that the two expressions are identical. The processor with these weights is referred to as the optimal processor. The output SNR {~ of the optimal processor is given by [29]

HR- 1 a == Pstio l\r tio' A

An interesting special case is one where the interference is much stronger compared to the background noise, PI» For this case, these expressions may be approximated as

0";.

a A

A

tio

==-.

L

(56)

When interference is away from the main lobe of the conventional processor p ~ 1, it follows that the output SNR of the optimal processor in the presence of a strong interference is the same as that of the conventional processor in the absence of interference, implying that the processor has almost completely canceled the interference, yielding a very large array gain. The performance of the processor in terms of its output SNR and the array gain is not affected by the look-direction constraint, as it only scales the weights. Therefore, the treatment presented above is valid for the unconstrained processor. For the optimal beam former to operate as described above and to maximize the SNR by canceling interferences. the number of interferences must be less than or equal to L - 2, as an array with L elements has L - 1 degrees of freedom and one has been utilized by the constraint in the look direction. This may not be true in a mobile communications environment due to existence of multipath arrivals, and the array beam former may not be able to achieve the maximization of the output SNR by suppressing every interference. As argued in [46], however. the beam former does not have to suppress interferences to a great extent and cause a vast increase in the output SNR to improve the performance of a mobile radio system. An increase of a few decibels in the output SNR can make a large increase in the channel capacity of the system possible. In mobile communications literature, the optimal beam former is often referred to as the optimal combiner. Discussion on the use of the optimal combiner to cancel interferences and to improve the performance of mobile communications systems can be found in [46]-[49]. 
It should be noted that the optimal beam former descri bed in this section, also known as the MVDR beam former, does not require the knowledge of the directions and power levels of the interferences as well as the level of the background noise power to maximize the output SNR. It requires only the direction of the desired signal. In the next section, a processor is described that requires a reference signal instead of the desired signal direction.

(48)

(49)

and (50)

For the case of one directional interference of power $p_I$, the expression for the output SNR becomes

(51 )

and the array gain is given by

(52)

where

$$\rho = 1 - \frac{S_0^H S_I S_I^H S_0}{L^2} \tag{53}$$

is a scalar quantity that depends upon the direction of the interference relative to the signal source and upon the array geometry [29]. It follows from (23) and (53), after rearrangement, that

$$\rho = 1 - w_c^H S_I S_I^H w_c$$

(55)

and

(47)

Thus, the weights of the optimal processor in the absence of errors are the same as those of the conventional processor, implying that the conventional processor is the optimal processor for this case. The output SNR and the array gain G of the optimal processor for this case are, respectively, given by

P-

psLp

::= - - ' ) CT~

For the special case of a noise environment in which no directional interference is present, a simple calculation yields

-W

~

(54)

Thus, this parameter is characterized by the weights of the conventional processor. As this parameter characterizes the performance of the optimal processor, it implies that the interference cancellation capability of the optimal processor depends to a certain extent upon the response of the conventional processor to the interference. This fact has been further highlighted in [44] and [45].

E. Optimization Using Reference Signal

A narrow-band beam-forming structure that employs a reference signal [24], [27], [28], [50]-[52] to estimate the weights of the beam former is shown in Fig. 4. The array output is subtracted from an available reference signal $r(t)$ to generate an error signal $\varepsilon(t) = r(t) - w^H x(t)$, which is used to control the weights. Weights are adjusted such that the MSE between the array output and the reference signal is minimized. The MSE is given by

$$\mathrm{MSE} = E[|\varepsilon(t)|^2] = E[|r(t)|^2] + w^H R w - 2 w^H z \tag{57}$$

where

$$z = E[x(t)\, r^*(t)] \tag{58}$$

is the correlation between the reference signal and the array signals vector $x(t)$. The MSE surface is a quadratic function of $w$ and is minimized by setting its gradient with respect to $w$ equal to zero, yielding the well-known Wiener-Hopf equation for the optimal weight vector

$$\hat{w}_{\mathrm{MSE}} = R^{-1} z. \tag{59}$$

The MMSE of the processor, also known as the Wiener filter, using these weights is given by

$$\mathrm{MMSE} = E[|r(t)|^2] - z^H R^{-1} z. \tag{60}$$

Fig. 4. Structure of narrow-band beam former using a reference signal. (Blocks: control for weight estimation; error signal $\varepsilon(t)$; reference signal $r(t)$.)

The scheme may be employed to acquire a weak signal in the presence of a strong jammer, as discussed in [50], by setting the reference signal to zero and initializing the weights to provide an omnidirectional pattern. The process starts to cancel strong interferences first and the weak signal later. Thus, intuitively, there is expected to be a time when the output would consist of the signal that has not been canceled while the strong interferences have been reduced. When an adaptive scheme (discussed in Section III-B) is used to estimate $\hat{w}_{\mathrm{MSE}}$, the strong jammer gets canceled first as the weights are adjusted to put a null in that direction, leaving a signal-to-jammer ratio sufficient for acquisition. Arrays using reference signals equal to zero to adjust weights are referred to as power-inversion adaptive arrays [53].

The MSE minimization scheme (the Wiener filter) is a closed-loop method, compared to the open-loop scheme of MVDR (the ML filter) described in the previous section. In general, the Wiener filter provides higher output SNR than the ML filter in the presence of a weak signal source. As the input signal power becomes large compared to the background noise, the two processors give almost the same results [54]. This result is supported by a simulation study for a two-vehicle mobile communications situation in [55]. The increased SNR of the Wiener filter is achieved at the cost of signal distortion caused by the filter. It should be noted that the optimal beam former does not distort the signal.

The required reference signal for the Wiener filter may be generated in a number of ways, depending upon the application. In digital mobile communications, a synchronization signal may be used for initial weight estimation, followed by the use of the detected signal as a reference signal. In systems using a TDMA scheme, a user-specific sequence may be a part of every frame for this purpose [56]. The use of a known symbol in every frame has also been suggested [57]. In other situations, the use of an antenna for this purpose has been examined to show its suitability to provide a reference signal [57]. Studies of mobile communications systems using reference signals to estimate array weights have also been reported in [58]-[60].
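As a minimal numerical sketch of (57)-(60), assuming real-valued signals and a two-element array so that the 2 x 2 system can be solved in closed form, the following Python fragment computes the Wiener-Hopf weights and checks that they attain the MMSE. The values of `R`, `z`, and `ref_power`, and the helper names `solve_2x2` and `mse`, are hypothetical and chosen only for illustration.

```python
def solve_2x2(R, z):
    """Solve R w = z for a 2 x 2 real matrix R using Cramer's rule."""
    det = R[0][0] * R[1][1] - R[0][1] * R[1][0]
    return [(z[0] * R[1][1] - R[0][1] * z[1]) / det,
            (R[0][0] * z[1] - R[1][0] * z[0]) / det]


def mse(w, R, z, ref_power):
    """Real-valued form of (57): E[r^2] + w^T R w - 2 w^T z."""
    quad = sum(w[i] * R[i][j] * w[j] for i in range(2) for j in range(2))
    cross = sum(w[i] * z[i] for i in range(2))
    return ref_power + quad - 2.0 * cross


# Hypothetical second-order statistics (assumed, not taken from the text).
R = [[2.0, 0.5], [0.5, 1.0]]   # array correlation matrix
z = [1.0, 0.3]                 # correlation between array signals and reference
ref_power = 1.0                # E[r(t)^2]

w_opt = solve_2x2(R, z)                                        # (59)
mmse = ref_power - sum(z[i] * w_opt[i] for i in range(2))      # (60)

# Any perturbation of the weights should increase the MSE.
w_perturbed = [w_opt[0] + 0.1, w_opt[1] - 0.1]
mse_at_opt = mse(w_opt, R, z, ref_power)
mse_perturbed = mse(w_perturbed, R, z, ref_power)
```

Because the MSE surface is quadratic with a positive definite `R`, `mse_at_opt` coincides with `mmse` and every perturbed weight vector gives a larger value.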


F. Beam-Space Processing


In contrast to element-space processing, where signals derived from each element are weighted and summed to produce the array output, beam-space processing is a two-stage scheme where the first stage takes the array signals as input and produces a set of multiple outputs, which are then weighted and combined to produce the array output. These multiple outputs may be thought of as the outputs of multiple beams. The processing done at the first stage is by fixed weighting of the array signals and amounts to producing multiple beams steered in different directions. These weights are normally not adaptive, that is, they are not adjusted during the adaption cycle. The weights applied to the different beam outputs to produce the array output are optimized to meet a specific optimization criterion and are adjusted during the adaption cycle. In general, for an L-element array, a beam-space processor consists of a main beam steered in the signal direction and a set of not more than L - 1 secondary beams. The weighted output of the secondary beams is subtracted from the main beam. The weights are adjusted to produce an estimate of the interference present in the main beam. The subtraction process then removes this interference. The secondary beams, also known as auxiliary beams, are designed such that they do not contain the desired signal from the look direction, to avoid signal cancellation in the subtraction process. A general structure of such a processor is shown in Fig. 5. Beam-space processors have been studied under many different names, including Howells-Applebaum array [24], [51], [61], GSC [62], [63], partitioned processor [64], [65], partially adaptive arrays [66]-[72], PIC [73]-[77], adaptive-adaptive arrays [78], and multiple-beam antennas [79]-[81]. The pattern of the main beam is normally referred to as the quiescent pattern and is chosen such that it has a desired shape.
For a linear array of equispaced elements with equal weighting, the quiescent pattern has the shape of $\sin Lx / \sin x$, with $L$ being the number of elements in the array, whereas for Chebyshev weighting (the weighting dependent upon the coefficients of the Chebyshev polynomial), the pattern has equal side-lobe levels [82]. The pattern of the main beam may be adjusted by various forms of constraints [51] and pattern synthesis techniques, which are discussed in [83]-[87] and the references therein.

Fig. 5. Structure of a general beam-space processor. (Blocks: main beam with fixed weights; auxiliary beams formed by the matrix prefilter B; adjustable weights; output y(t).)

There are many schemes to generate the outputs of auxiliary beams such that no signal from the look direction is contained in them, that is, the beams have nulls in the look direction. In its simplest form, this can be accomplished by subtracting the array signals from presteered adjacent pairs [26], [88]. This relies on the fact that the component of the array signals induced from a source in the look direction is identical after the presteering, and this gets canceled in the subtraction process from the adjacent pairs. The process can be generalized to produce M - 1 beams from the L-element array signals $x(t)$ using a matrix $B$ such that

$$q(t) = B^H x(t) \tag{61}$$

where the (M - 1)-dimensional vector $q(t)$ denotes the outputs of the M - 1 beams and the matrix $B$, referred to as the blocking matrix or the matrix prefilter, has the property that its M - 1 columns are linearly independent and that the sum of the elements of each column equals zero, implying that the M - 1 beams are independent and have nulls in the look direction. For an array that is not presteered, the matrix needs to satisfy

$$B^H S_0 = 0 \tag{62}$$

where $S_0$ is the steering vector associated with the look direction.

It is assumed in the above discussion that $M \le L$, implying that the number of beams is less than or equal to the number of elements in the array. When the number of beams is equal to the number of elements in the array, the processing in the beam space has not reduced the degrees of freedom of the array, that is, its null-forming capability has not been reduced. In this sense, these arrays are fully adaptive and have the same capabilities as those of the array using element-space processing. In fact, in the absence of errors, both processing schemes produce identical results. On the other hand, when the number of beams is less than the number of elements, the arrays are referred to as partially adaptive. The null-steering capability of these arrays is reduced to the number of auxiliary beams. When adaptive schemes are used to estimate the weights, the convergence is generally faster for these arrays.

The MSE for these arrays, however, is also high compared to that of the fully adaptive arrays [89]. These arrays are useful in situations where the number of interferences is much less than the number of elements. They offer a computational advantage over element-space processing, as one needs to adjust only M - 1 weights compared to L weights for the element-space case, with M < L. Moreover, beam-space processing in general requires less computation than the element-space case to calculate the weights, as it solves an unconstrained optimization problem compared to the constrained optimization problem solved in the latter case. It should be noted that for the element-space processing case, the constraints on the weights are imposed to prevent the signal arriving from the look direction from being distorted and to make the array more robust against errors. For the beam-space case, these are transferred to the main beam, leaving the adjustable weights free from constraints. A performance comparison of an element-space processor and a beam-space processor for the case of a single interference is presented in [90]. The beam-space processor considered is a single auxiliary beam processor, referred to as the PIC processor, which is useful for canceling a single interference only. The study shows that in the absence of errors, both processors produce identical results, whereas in the presence of look-direction errors, the beam-space processor produces superior performance. This situation arises when the known direction of the signal is different from the actual direction. The weights of the processor are constrained with the knowledge of the look direction. When the actual signal direction is different from the one that is used to constrain the weights, the element-space processor cancels this signal as if it were an interference close to the look direction.
The beam-space processor, on the other hand, is designed to have the main beam steered in the known look direction, and the auxiliary beams are formed to have a null in this direction. The response of the main beam does not alter much away from the look direction, and thus the signal level in the main beam is not affected. Similarly, when a null of the auxiliary beams is placed in the known look direction, only a very small amount of the signal leaks into the auxiliary beams due to a source very close to the null. Thus, the subtraction process does not affect the signal level in the main beam, yielding a very small signal cancellation in beam-space processing compared to element-space processing. For details of the effect of other errors on beam-space processors, particularly GSC, see, for example, [91]. The auxiliary beam-forming techniques other than the use of a blocking matrix (described above) include formation of M - 1 orthogonal beams and formation of beams in the directions of interferences, if known. The beams are referred to as orthogonal beams to imply that the weight vectors used to form the beams are orthogonal, that is, their dot product is equal to zero. The eigenvectors of R taken as weights to generate auxiliary beams fall into this category. In situations where the DOA's of interferences are known, the formation of beams pointed in these directions may lead to more efficient interference cancellation [78], [92].

The auxiliary beam outputs are weighted and summed, and the result is subtracted from the main beam output to cancel the unwanted interference present in the main beam. The weights are adjusted to cancel the maximum possible interference. This is normally done by minimizing the total mean output power after subtraction by solving the unconstrained optimization problem, and leads to maximization of the output SNR in the absence of the desired signal in the auxiliary channels. The presence of the signal in these channels causes signal cancellation from the main beam, along with interference cancellation. A detailed discussion on the principles of signal cancellation in general and some possible cures is given in [28], [52], and [93]. Use of multiple-beam array-processing techniques for mobile communications has been reported in various studies [94]-[98], including development of a 16-element array system using digital hardware to study its feasibility [99].
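The adjacent-pair subtraction scheme for a presteered array described earlier in this section can be sketched as follows. This is a minimal illustration, assuming a four-element array and hypothetical snapshot values; the function names and the electrical phase step of 0.9 rad used for the interference are assumptions made for the example, not quantities from the text.

```python
import cmath


def adjacent_pair_blocking_matrix(num_elements):
    """Return B (L rows, L - 1 columns); column k forms x[k] - x[k + 1]."""
    L = num_elements
    B = [[0.0] * (L - 1) for _ in range(L)]
    for k in range(L - 1):
        B[k][k] = 1.0
        B[k + 1][k] = -1.0
    return B


def auxiliary_outputs(B, x):
    """Compute q = B^H x for one snapshot x of the presteered array signals."""
    L = len(B)
    num_beams = len(B[0])
    return [sum(complex(B[i][k]).conjugate() * x[i] for i in range(L))
            for k in range(num_beams)]


L = 4
B = adjacent_pair_blocking_matrix(L)

# Each column sums to zero, so every auxiliary beam has a null in the look direction.
column_sums = [sum(B[i][k] for i in range(L)) for k in range(L - 1)]

# After presteering, a look-direction source induces identical signals on all elements.
look_snapshot = [0.7 + 0.2j] * L
# An off-look source leaves a residual phase progression across the array (assumed value).
interference_snapshot = [cmath.exp(1j * 0.9 * i) for i in range(L)]

q_look = auxiliary_outputs(B, look_snapshot)
q_interference = auxiliary_outputs(B, interference_snapshot)
```

In this sketch `q_look` is zero in every auxiliary beam, while `q_interference` is not, which is the property that lets the auxiliary beams estimate the interference without cancelling the desired signal.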

G. Broad-Band Beam Forming

The beam-former structure of Fig. 2 discussed earlier is for narrow-band signals. As the signal bandwidth increases, the performance of the beam former using this structure starts to deteriorate [100]. For processing broad-band signals, a TDL structure, shown in Fig. 6, is normally used [100]-[108]. A lattice structure consisting of a cascade of J simple lattice filters is sometimes also used [109]-[113], offering some processing advantages. The steering delays in front of each element in Fig. 6 are pure time delays and are used to steer the array in a given look direction $(\phi_0, \theta_0)$. If $\tau_\ell(\phi_0, \theta_0)$ denotes the time taken by the plane wave arriving from direction $(\phi_0, \theta_0)$, measured from the reference point to the $\ell$th element, then the steering delay $T_\ell(\phi_0, \theta_0)$ may be selected using

$$T_\ell(\phi_0, \theta_0) = T_0 + \tau_\ell(\phi_0, \theta_0), \qquad \ell = 1, 2, \ldots, L \tag{63}$$

where $T_0$ is a bulk delay such that $T_\ell(\phi_0, \theta_0) > 0$ for all $\ell$. If $s(t)$ denotes the signal induced on an element present at the center of the coordinate system due to a broad-band source of power density $S(f)$, then the output of the $\ell$th sensor, pre-steered in $(\phi_0, \theta_0)$, is given by

$$x_\ell(t) = s(t + \tau_\ell(\phi, \theta) - T_\ell(\phi_0, \theta_0)). \tag{64}$$

For a source in $(\phi_0, \theta_0)$, it becomes

$$x_\ell(t) = s(t - T_0), \qquad \ell = 1, 2, \ldots, L \tag{65}$$

yielding identical waveforms after the pre-steering delays. The TDL structure shown in the figure following the steering delay on each channel is an FIR filter. The coefficients of these filters are constrained to specify the frequency response in the look direction. It should be noted that these coefficients are real, in contrast to the complex weights of the narrow-band processor. Let $w$, defined by

$$w^T = [w_1^T, w_2^T, \ldots, w_J^T] \tag{66}$$

denote the LJ coefficients of the filter structure, with $w_m$ denoting the L coefficients after the (m - 1)th tap.

Fig. 6. Broad-band beam-former structure using TDL filter. (Blocks: steering delay; tapped delay line structure.)

The mean output power of the beam former for a given $w$ is given by

$$P(w) = w^T R w \tag{67}$$

where the LJ x LJ-dimensional real matrix $R$ denotes the array correlation matrix, with its elements representing the correlation between various tap outputs. The correlation between the outputs of the (m - 1)th tap on the $\ell$th channel and the (n - 1)th tap on the kth channel is given by

$$(R_{m,n})_{\ell,k} = \rho[(m - n)T + T_\ell(\phi_0, \theta_0) - T_k(\phi_0, \theta_0) + \tau_k(\phi, \theta) - \tau_\ell(\phi, \theta)] \tag{68}$$

with $\rho(\tau)$ denoting the correlation function

$$\rho(\tau) = E[s(t)s(t + \tau)]. \tag{69}$$

It is related to the spectrum of the signal by the Fourier transform, that is

$$\rho(\tau) = \int_{-\infty}^{\infty} S(f)\, e^{j 2 \pi f \tau}\, df. \tag{70}$$

Thus, from the knowledge of the spectra of sources and their DOA's, the correlation matrix may be calculated. In practice, this can also be estimated by measuring signals at the output of various taps. In situations where one is interested in finding coefficients such that the beam former cancels the directional interferences and has the specified response in the look direction, the following beam-forming problem is normally considered:

$$\text{minimize} \quad w^T R w \tag{71}$$

$$\text{subject to} \quad C^T w = F \tag{72}$$

where $F$ is a J-dimensional vector that specifies the frequency response in the look direction and $C$ is an LJ x J constraint matrix. For a point constraint in the look direction

$$C = \begin{bmatrix} \mathbf{1} & & & 0 \\ & \mathbf{1} & & \\ & & \ddots & \\ 0 & & & \mathbf{1} \end{bmatrix} \tag{73}$$

with $\mathbf{1}$ denoting the L-dimensional vector of 1s. Let $\hat{w}$ denote the solution of the above problem. It is given by [25]

$$\hat{w} = R^{-1} C (C^T R^{-1} C)^{-1} F. \tag{74}$$

The point-constraint minimization problem specifies J constraints on the weights such that the sum of the L weights on all the channels before the jth delay is equal to $F_j$. For an all-pass frequency response in the look direction, all but one $F_j$, $j = 1, \ldots, J$, are selected to be equal to zero. For $j$ close to $(J + 1)/2$, $F_j$ is taken to be unity. Thus, the constraints specify that the sum of the weights across the array is zero except for one near the middle of the filter, which is equal to unity. The implication of these constraints is that the array pattern has a unity response in the look direction. This pattern can be broadened by specifying additional constraints, such as derivative constraints [114]-[116], along with the constraints discussed above. The derivative constraints set the derivatives of the power pattern with respect to $\theta$ and $\phi$ equal to zero in the look direction. The higher the order of derivatives, that is, the first order, second order, etc., the broader the beam in the look direction normally becomes. A broader beam is useful when the actual signal direction and the known direction of the signal are not precisely the same. In such situations the processor with the point constraint in the known direction of the signal would cancel the desired signal as if it were an interference. The other directional constraints to improve the performance of the beam former in the presence of the look-directional constraints include multiple linear constraints [117], [118] and inequality constraints [119]-[121]. A set of nondirectional constraints to improve the performance of the beam former under look-direction errors is discussed in [122]. These are referred to as correlation constraints, which use the known characteristics of the desired signal to estimate an LJ-dimensional correlation vector $r_d$ between the desired signal and the array signal vector. The beam-forming problem using these constraints becomes

$$\text{minimize} \quad w^T R w \tag{75}$$

$$\text{subject to} \quad r_d^T w = \rho_0 \tag{76}$$

where $\rho_0$ is a scalar constant that specifies the correlation between the desired signal and the array output. Application of broad-band beam-forming structures using TDL filters to mobile communications has been considered in [56] and [123]-[125] to overcome multipath fading and large delay spread in a TDMA as well as a CDMA system.

Fig. 7. Structure of partitioned realization of the broad-band beam former.
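The point constraint of this section can be illustrated with a short sketch that builds the constraint matrix of (73) and evaluates the per-tap sums of the coefficients. It assumes that the LJ coefficients are stacked tap by tap, with the L coefficients of the first tap first; the sizes `L = 3`, `J = 5` and the function names are hypothetical, and the weight vector used is only one simple choice that meets an all-pass constraint, not the optimal solution of (74).

```python
def point_constraint_matrix(L, J):
    """LJ x J matrix whose jth column has ones in the L rows belonging to tap j."""
    return [[1.0 if row // L == col else 0.0 for col in range(J)]
            for row in range(L * J)]


def constraint_response(C, w):
    """Return C^T w, i.e. the sum of the L coefficients at each of the J taps."""
    J = len(C[0])
    return [sum(C[row][col] * w[row] for row in range(len(w)))
            for col in range(J)]


def all_pass_response(J):
    """F with a single unit entry near the middle tap (exactly the middle for odd J)."""
    F = [0.0] * J
    F[(J - 1) // 2] = 1.0
    return F


L, J = 3, 5
C = point_constraint_matrix(L, J)
F = all_pass_response(J)

# One feasible (not optimal) weight vector: uniform weights 1/L on the middle tap only.
w = [0.0] * (L * J)
middle = (J - 1) // 2
for i in range(L):
    w[middle * L + i] = 1.0 / L

response = constraint_response(C, w)
```

Here `response` equals `F` to within rounding, so this `w` has unity response in the look direction; the optimal weights additionally minimize the mean output power among all vectors meeting the same constraint.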

H. Partitioned Realization

The broad-band beam-former structure shown in Fig. 6 is sometimes referred to as an element-space processor or direct form of realization, compared to a beam-space processor or partitioned form of realization, as shown in Fig. 7. The structure shown in Fig. 7 is discussed below for a point constraint, that is, the response is constrained to be unity in the look direction. A discussion of partitioned realization for derivative constraints may be found in [126]. The steering delays are used to align the waveform arriving from the look direction, as discussed. The array signals after the steering delays are passed through two sections. The top section consists of a broad-band conventional beam with the required frequency response obtained by selecting the coefficients of the FIR filter. Signals from all of the channels are equally weighted and summed. For this realization to be equivalent to the direct form of realization, all the weights need to be equal to 1/L, and the filter coefficients $F_j$, $j = 1, 2, \ldots, J$ need to be specified as before. Furthermore, the output of the upper section is given by

$$y_c(t) = \sum_{k=0}^{J-1} F_{k+1}\, y(t - kT) \tag{77}$$

with

$$y(t) = \frac{x^T(t)\,\mathbf{1}}{L}. \tag{78}$$

The matrix prefilter shown in the lower section is designed to block the signal arriving from the look direction. Since these signal waveforms after the steering delays are alike, this can be achieved by selecting the matrix $W_s$ such that the sum of each of its rows is equal to zero. For the partitioned processor to have the same degrees of freedom as the direct form, the L - 1 rows of the matrix $W_s$ need to be linearly independent. The output $x'(t)$ after the matrix prefilter is an (L - 1)-dimensional vector given by

$$x'(t) = W_s x(t) \tag{79}$$

and can be thought of as the outputs of L - 1 beams, which are then shaped by the coefficients of the FIR filter of each TDL section. Let an (L - 1)-dimensional vector $v_k$ denote these coefficients before the kth delay. The output of the lower filter is then given by

$$y_a(t) = \sum_{k=0}^{J-1} v_k^T x'(t - kT). \tag{80}$$

These coefficients are selected by minimizing the mean output power of the processor, that is

$$\min_{v_k} \; E\big[(y_c(t) - y_a(t))^2\big]. \tag{81}$$

The performance of the broad-band arrays as a function of various parameters, such as the number of taps, tap spacing, array geometry, array aperture, and signal bandwidth, has been considered in the literature [101]-[108] to understand their influence on the behavior of the arrays. An analysis [101] of broad-band arrays using eigenvalues of R indicates that the product of the array aperture and the FBW of the signal is an important parameter of the broad-band array in determining its performance. The FBW is defined as the ratio of the bandwidth to the center frequency of the signal. It is shown that the number of taps required on each element depends upon this parameter as well as on the shape of the array, with more taps needed for a complex shape. A study [102], [103] of the SNR as a function of intertap spacing indicates that there is a range of intertap spacing that yields close to the maximum attainable SNR and depends upon the FBW of the signal. This range includes a quarter-wavelength spacing at the center frequency $f_0$. The quarter-wavelength spacing produces a 90° phase shift at $f_0$ and is equal to $1/(4 f_0)$. By measuring the tap spacing as a multiple of this delay, it is indicated that an intertap spacing with a multiple around 1/FBW yields close to the highest attainable SNR. With the multiple between 1/FBW and 4/FBW, one needs a larger number of taps for an equivalent performance. A study of the jamming rejection capability [104] and the tracking performance of the array in a nonstationary environment [105] also indicates that when tap spacing is measured in terms of the center frequency of the signal, the best performance is achieved when the spacing is $1/(4 f_0)$. For this tap spacing, R has less eigenvalue spread, which is the reason for this performance. The eigenvalue spread of a matrix indicates the range of values its eigenvalues take. A larger ratio of the largest eigenvalue to the smallest eigenvalue indicates a larger spread. The TDL filter tends to increase the degrees of freedom of the array, which may be traded against the number of elements such that an array with L elements is able to suppress more than L - 1 directional interferences, provided their center frequencies are not the same and fall within the FBW of the signal [107].

Though the TDL structure with constrained optimization is the commonly used structure for broad-band array signal processing, alternative methods have been proposed. These include:

1) adaptive nonlinear schemes, which maximize SNR subject to additional constraints [127];

2) a variation of a Davis beam former [88], which adapts one filter at a time to speed up convergence [128];

3) a composite system, which also utilizes a derivative of the beam pattern in the feedback loop to control the weights [129] to reject wideband interference;

4) optimum filters, which specify rejection response [87];

5) a master and slave processor with broad-beam capabilities without derivative constraints [130];

6) a hybrid method that uses an orthogonal transformation on data available from the TDL structure before applying weights [131] to improve its performance in a multipath environment;

7) weighted Chebyshev method [134];

8) two-sided correlation transformation method [135].
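The quarter-wavelength tap spacing discussed in the performance paragraph above reduces to simple arithmetic, sketched below. The center frequency of 900 MHz and bandwidth of 45 MHz are hypothetical values chosen only for illustration, as are the function names.

```python
def quarter_wavelength_delay(center_frequency_hz):
    """Tap delay equal to a quarter period at the center frequency: 1 / (4 f0)."""
    return 1.0 / (4.0 * center_frequency_hz)


def phase_shift_deg(delay_s, frequency_hz):
    """Phase shift, in degrees, produced by a pure delay at a given frequency."""
    return 360.0 * frequency_hz * delay_s


def fractional_bandwidth(bandwidth_hz, center_frequency_hz):
    """FBW: ratio of the signal bandwidth to its center frequency."""
    return bandwidth_hz / center_frequency_hz


f0 = 900e6   # assumed center frequency (Hz)
bw = 45e6    # assumed signal bandwidth (Hz)

tap_delay = quarter_wavelength_delay(f0)
phase_at_center = phase_shift_deg(tap_delay, f0)
phase_at_upper_edge = phase_shift_deg(tap_delay, f0 + bw / 2.0)
fbw = fractional_bandwidth(bw, f0)
```

For these assumed values the delay gives a 90° shift at the center frequency and a slightly larger shift at the upper band edge, which shows how the phase contributed by a fixed tap delay varies across the signal band.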

I. Frequency-Domain Beam Forming

A general structure of the element-space frequency-domain processor is shown in Fig. 8, where broad-band signals from each element are transformed into the frequency domain using the FFT and each frequency bin is processed by a narrow-band processor structure. The weighted signals from all elements are summed to produce an output at each bin. The weights are selected by independently minimizing the mean output power at each frequency bin subject to steering-direction constraints. Thus, the weights required for each frequency bin are selected independently, and this selection may be performed in parallel, leading to a faster weight update. When adaptive algorithms such as the LMS algorithm (discussed in Section III-B) are used for weight update, a different step size may be used for each bin, leading to faster convergence. Various aspects of frequency-domain beam forming are reported in the literature [136]-[150]. The performance of the time- and frequency-domain processors is the same only when the signals in different frequency bins are independent. This independence assumption is mostly made in the study of frequency-domain beam forming. When this assumption does not hold, the frequency-domain beam former may be suboptimal. Some of the tradeoffs and comparisons of the two processors may be found in [136] and [149]. A study of the frequency-domain algorithm [140] for coherent signals indicates that the frequency-domain method is insensitive to the sampling rate and may be able to reduce the effects of element malfunctioning on the beam pattern. A study in [141] shows that due to its modular parallel structure, beam forming in the frequency domain is well suited for VLSI implementation and is less sensitive to coefficient quantization. The computational advantage of the frequency-domain method for bearing estimation is discussed in [144], [146], and [150], and the advantage for correlated data is considered in [145] and [148]. A general treatment of time- and frequency-domain realization with a view to comparing the structure of various algorithms of weight estimation in a unified manner is provided in [139].

Fig. 8. Element-space frequency-domain processor structure. (Stages: broad-band time-domain signals; FFT; narrow-band processing on each frequency bin; conversion to time domain.)

J. Digital Beam Forming

Consider the analog beam-former structure shown in Fig. 9, where the signals from each element are weighted, delayed, and summed to form the beam output

$$y(t) = \sum_{i=1}^{L} w_i\, x_i(t - T_i(\theta)). \tag{82}$$

Fig. 9. Delay-and-sum beam former.

The delays are adjusted such that the signals induced from a given direction, where the beam needs to be pointed, are aligned after the delays. This aspect of beam steering was discussed in detail earlier. The weights are adjusted to shape the beam.
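A discrete-time sketch of the delay-and-sum operation in (82) is given below, assuming that each delay is a whole number of samples and that samples before the start of a record are zero. The three-element records, the weights, and the function name `delay_and_sum` are hypothetical and serve only to show the alignment effect.

```python
def delay_and_sum(element_samples, weights, delays):
    """y[n] = sum_i w_i * x_i[n - d_i], with each delay d_i given in whole samples."""
    num_samples = len(element_samples[0])
    output = [0.0] * num_samples
    for x_i, w_i, d_i in zip(element_samples, weights, delays):
        for n in range(d_i, num_samples):
            output[n] += w_i * x_i[n - d_i]
    return output


# Hypothetical wavefront: a unit pulse reaches elements 0, 1, 2 at samples 2, 3, 4.
records = [
    [0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0],
]
weights = [1.0 / 3.0] * 3

# Delays that align the pulse on all channels at sample 4.
steered = delay_and_sum(records, weights, [2, 1, 0])
# No delays: the pulse contributions stay spread over three samples.
unsteered = delay_and_sum(records, weights, [0, 0, 0])
```

With the aligning delays the three contributions add at a single sample, whereas without them each output sample receives only one weighted contribution.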

Fig. 10. Digital beam-forming process.

Fig. 11. Effect of sampling on digital beam forming.

In digital beam forming [151]-[ 164], the weighted signals from each element are sampled and stored, and beams are formed by summing the appropriate samples such that the required delay is incorporated by this process. It requires each delay as an integer multiple of the sampling interval ~. The process is shown in Fig. 10 for a linear array of equispaced elements, where it is desired that a beam is formed in direction O2 . Let the direction be such that

The practical requirement of an adequate set of directions where simultaneous beams need to be pointed implies that the array signals be sampled at much higher rates than required by Nyquist criterion to reconstruct the waveform back from the samples [165]. The high sampling rate means a large number of storage requirements along with highspeed input-output devices, analog-to-digital converters, and large bandwidth cables [152]. The requirement of high sampling rates may be overcome by digital interpolation [152], [157], [163]. This process basically simulates the samples generated by high sampling rates and thus increases the effecti ve sampling rate. It works by sampling the array signal at the Nyquist rate or higher and by padding between each sample with zeros to form a new sequence. The number of zeros padded decides the effective sampling rate. For a sampling rate to increase L- fold, L - 1 zeros are padded to create a sequence as large as if it were created by sampling at high speed. The padded sequences then are used for digital beam forming by selecting appropriate samples as required, and the beam output is passed through an FIR filter to remove unwanted spectra. This filter is normally referred to as an interpolation filter. The beams formed by interpolation beam formers have slightly higher side-lobe levels. A tutorial introduction to digital-interpolation beam formers is given in [152], whereas some additional fundamentals of digital-array processing may be found in [155]. A comparison of many approaches to digital beam-forming implementations is discussed in [156] and [159], showing how a real-time implementation is a tradeoff between various conflicting requirements of hardware complexities, memory, and system performance. The shape of a beam, particularly its beamwidth, is controlled by the size of the array. Generally, a narrow beam results from a larger array. In practice, the array size is fixed

(83)

Thus, the signal from the ith element needs to be delayed by ('l:-l)~ seconds. This may be accomplished by selecting the samples for summing (as shown in Fig. 10 by the line marked with symbol A). Similarly, a beam may be steered in direction (}3 by summing the samples connected by the line marked with symbol B in Fig. l O, where the signals from the zrh element need to be delayed by (L - 'i)il seconds. The beam formed in direction ()l, by summing the samples connected by the line marked with symbol C, does not require any delay. It follows from the above discussion that using this process, one can only form beams in those directions that require delays equal to some integer multiple of the sampling interval, that is

(84)

where $k_i$, $i = 1, 2, \ldots, L$, are integers. The number of discrete directions where a beam can be pointed exactly increases with increased sampling, as shown in Fig. 11, where the sampling interval is $\Delta/2$. The figure shows that additional beams in directions $\theta_4$ and $\theta_5$ may be formed. These exact beams are normally referred to as synchronous or natural beams [152], and it is possible to form a number of these beams simultaneously using a separate summing network for each beam.


and its extent is limited. A process known as extrapolation may be used [158] during digital beam forming to simulate a larger array extent, resulting in an improved beam pattern. Just as interpolation increases the effective sampling rate, extrapolation extends the effective array length. More information on signal extrapolation schemes may be found in [165]-[170]. Digital beam-forming techniques for mobile satellite communications are examined in [95] by studying a configuration of a digital beam-forming system capable of working in transmit and receive mode. Digital beam forming for mobile satellite communications has also been reported in [59], [95], [171], and [172]. An introduction to digital beam forming for mobile communications may be found in [173].
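The zero-padding interpolation described in this section can be sketched numerically. The following is a minimal illustration under stated assumptions, not the implementation of [152]: the function name `interpolate_by_zero_padding`, the windowed-sinc filter design, and the tap count are all hypothetical choices made for brevity.

```python
import numpy as np

def interpolate_by_zero_padding(x, L, num_taps=121):
    """Raise the effective sampling rate L-fold (hypothetical helper).

    L - 1 zeros are inserted between successive samples, and the padded
    sequence is passed through a low-pass FIR interpolation filter that
    removes the unwanted spectral images.
    """
    # Zero-padded sequence at L times the original rate.
    padded = np.zeros(len(x) * L)
    padded[::L] = x
    # Windowed-sinc low-pass filter with cutoff at the original Nyquist
    # frequency (an assumed design; any suitable FIR filter could be used).
    k = np.arange(num_taps) - (num_taps - 1) / 2
    h = np.sinc(k / L) * np.hamming(num_taps)
    # A DC gain of L restores the amplitude lost to the inserted zeros.
    h *= L / h.sum()
    # mode="same" with a symmetric odd-length filter adds no net delay.
    return np.convolve(padded, h, mode="same")

# Example: a slowly varying tone sampled at the low rate, interpolated 4-fold.
f = 0.05                                   # cycles per low-rate sample
low_rate = np.cos(2 * np.pi * f * np.arange(200))
high_rate = interpolate_by_zero_padding(low_rate, L=4)
```

Away from the edges of the record, `high_rate` should closely track the tone sampled directly at four times the rate. The residual error depends on the assumed filter length and window.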

III. ADAPTIVE BEAM FORMING

In practice, neither $R$ nor $R_N$ is available to calculate the optimal weights of the array, and the weights are adjusted by some means using the available information derived from the array output, array signals, and so on to make an estimate of the optimal weights. There are many such schemes, which are normally referred to as adaptive algorithms. Some of these algorithms are described here, and their characteristics, such as the speed of adaption and the mean and variance of the estimated weights, and the parameters affecting these characteristics are briefly discussed.

A. SMI Algorithm

This algorithm estimates the array weights by replacing $R$ with its estimate. An unbiased estimate of $R$ using $N$ samples $\mathbf{x}(n)$, $n = 0, 1, 2, \ldots, N-1$, of the array signals

K. Eigenstructure Method

As discussed previously, the eigenvalues of $R$ can be divided into two sets when the environment consists of uncorrelated directional sources and uncorrelated white noise. The largest $M$ eigenvalues correspond to $M$ directional sources, and the eigenvectors associated with these eigenvalues are normally referred to as signal eigenvectors. The $L - M$ smallest eigenvalues are equal to the background noise power, and the eigenvectors associated with these eigenvalues are known as noise eigenvectors. The eigenvectors of $R$ are orthogonal to each other and thus may be thought of as spanning an $L$-dimensional space. This space may be divided into two orthogonal subspaces. The subspace spanned by the signal eigenvectors is referred to as the signal subspace, whereas the subspace spanned by the noise eigenvectors is referred to as the noise subspace. The signal subspace is also spanned by the $M$ steering vectors associated with the $M$ directional sources. This fact is exploited by eigenstructure methods of beam forming in a number of ways [174]-[178]. An array using a weight vector contained in the signal subspace such that it is orthogonal to the interference-direction steering vector is able to cancel the interference. In situations where the directions of interferences are not known, the weight is estimated by minimizing a suitably selected cost function. A weight estimation method that minimizes a cost function applicable to a digital communications system using a BPSK modulating scheme, discussed in [176], demonstrates the utility of this beam-forming concept. An application of the eigenstructure method for estimating weights of beam-space processors using eigenvectors of $R_N$, that is, the matrix with the signal component removed, as is done for secondary beams, suggests the effectiveness of this method for interference canceling [178], [179] in beam space and for achieving the desired performance in a short observation time.
An application of the eigenstructure method for correcting errors in steering vectors is reported in [174]. Forming beams using eigenvectors associated with the largest eigenvalues of R for mobile communications applications has been reported in [180].
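The division of the eigenvalues and eigenvectors of $R$ into signal and noise sets can be checked numerically. The sketch below assumes a uniform linear array with half-wavelength spacing and two uncorrelated sources; the helper `steering_vector`, the source angles, and the powers are illustrative assumptions, not values from the text.

```python
import numpy as np

def steering_vector(L, theta, d=0.5):
    """Steering vector of an L-element uniform linear array (assumed
    geometry): spacing d in wavelengths, theta in radians from broadside."""
    return np.exp(2j * np.pi * d * np.arange(L) * np.sin(theta))

L, noise_power = 8, 0.1
angles = np.deg2rad([-20.0, 35.0])        # M = 2 directional sources
powers = [1.0, 4.0]
M = len(angles)

# Array correlation matrix for uncorrelated sources plus white noise.
S = np.column_stack([steering_vector(L, a) for a in angles])
R = S @ np.diag(powers) @ S.conj().T + noise_power * np.eye(L)

# eigh returns the eigenvalues of a Hermitian matrix in ascending order.
eigvals, eigvecs = np.linalg.eigh(R)
noise_vals = eigvals[: L - M]             # L - M smallest eigenvalues
noise_vecs = eigvecs[:, : L - M]          # noise eigenvectors
signal_vecs = eigvecs[:, L - M:]          # signal eigenvectors

# Largest inner product between any noise eigenvector and any steering
# vector; it vanishes because the steering vectors span the signal subspace.
leakage = np.abs(noise_vecs.conj().T @ S).max()
```

Here `noise_vals` should all equal the noise power, and `leakage` should be zero to within rounding error.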

may be obtained using a simple averaging scheme

$$R(n) = \frac{1}{N}\sum_{n=0}^{N-1} \mathbf{x}(n)\mathbf{x}^H(n) \tag{85}$$

where $R(n)$ denotes the estimate at the $n$th instant of time and $\mathbf{x}(n)$ denotes the array signal sample, also known as the array snapshot, at the $n$th instant of time, with $t$ replaced by $nT$ and the sampling time $T$ omitted for ease of notation. The estimate of $R$ may be updated when new samples arrive using

and a new estimate of the weights $\mathbf{w}(n+1)$ at time instant $n + 1$ may be made. The expression of the optimal weights requires the inverse of $R$, and this process of estimating $R$ and then its inverse may be combined to update the inverse of $R$ from array signal samples using the Matrix Inversion Lemma as follows:

$$R^{-1}(n) = R^{-1}(n-1) - \frac{R^{-1}(n-1)\,\mathbf{x}(n)\mathbf{x}^H(n)\,R^{-1}(n-1)}{1 + \mathbf{x}^H(n)R^{-1}(n-1)\mathbf{x}(n)} \tag{87}$$

with

$$\varepsilon_0 > 0. \tag{88}$$

This scheme of estimating weights using the inverse update is referred to as the RLS algorithm, which is further described in Section III-C. It should be noted that as the number of samples grows, the matrix update approaches its true value, and thus the estimated weights approach the optimal weights, that is, as $n \to \infty$, $R(n) \to R$ and $\mathbf{w}(n) \to \hat{\mathbf{w}}$ or $\hat{\mathbf{w}}_{\mathrm{MSE}}$, as the case may be. More discussion on the SMI algorithm may be found in [40] and [181]. Procedures for estimating array weights with efficient computation using SMI are considered in [182], and an analysis to show how it


performs as a function of the number of snapshots is provided in [89]. Application of SMI to estimate the weights of an array to operate in mobile communications systems has been considered in many studies [56], [59], [60], [183]-[186]. The study in [183] considers beam forming for GSM signals using a variable reference signal as available during the symbol interval of the TDMA system. An application discussed in [184] is for vehicular mobile communications, whereas that presented in [186] is for inducing delay spread in indoor radio channels. A presentation in [59] is for mobile satellite communications systems.
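The sample-averaging estimate (85) and the inverse update (87) can be sketched as follows. This is an illustration only: the snapshots are synthetic complex noise, the function name `update_inverse` is hypothetical, and starting the recursion from a scaled identity $I/\varepsilon_0$ is an assumption made here so that the recursion has a well-defined initial inverse.

```python
import numpy as np

rng = np.random.default_rng(0)
L, N = 4, 200

# Synthetic snapshots x(n), one per column (stand-ins for array signals).
X = (rng.standard_normal((L, N)) + 1j * rng.standard_normal((L, N))) / np.sqrt(2)

# Simple averaging scheme of (85).
R_hat = X @ X.conj().T / N

def update_inverse(R_inv, x):
    """One step of (87): the inverse of R + x x^H obtained from the
    inverse of R through the Matrix Inversion Lemma."""
    Rx = R_inv @ x                         # R^{-1}(n-1) x(n)
    xR = x.conj() @ R_inv                  # x^H(n) R^{-1}(n-1)
    return R_inv - np.outer(Rx, xR) / (1.0 + x.conj() @ Rx)

# Assumed initial condition: R^{-1}(0) = I / eps0 with eps0 > 0.
eps0 = 1e-2
R_inv = np.eye(L, dtype=complex) / eps0
for n in range(N):
    R_inv = update_inverse(R_inv, X[:, n])

# Direct inversion of the accumulated (unnormalized) matrix for comparison.
direct = np.linalg.inv(eps0 * np.eye(L) + X @ X.conj().T)
```

The recursion avoids a full matrix inversion at every snapshot. After all $N$ updates, `R_inv` should agree with `direct` to within rounding error.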

B. LMS Algorithm

iteration. The array signal vector, however, is $\mathbf{x}(n+1)$, the reference signal sample is $r(n+1)$, and the array output (92)

In its standard form, the LMS algorithm uses an estimate of the gradient by replacing $R$ and $\mathbf{z}$ with their noisy estimates available at the $(n+1)$th iteration, leading to

$$\mathbf{g}(\mathbf{w}(n)) = 2\mathbf{x}(n+1)\mathbf{x}^H(n+1)\mathbf{w}(n) - 2\mathbf{x}(n+1)r^*(n+1) = 2\mathbf{x}(n+1)\varepsilon^*(\mathbf{w}(n))$$

where $\varepsilon(\mathbf{w}(n))$ is the error between the array output and the reference signal, that is

The application of the LMS algorithm to estimate the optimal weights of an array is widespread, and its study has been of considerable interest for some time now. The algorithm is referred to as the constrained LMS algorithm when the weights are subjected to constraints at each iteration. It is referred to as an unconstrained LMS algorithm when the weights are not constrained at each iteration. The latter is mostly applicable when weights are updated using a reference signal and no knowledge of the direction of the signal is utilized, unlike in the constrained case. The algorithm updates the weights at each iteration by estimating the gradient of the quadratic surface and then moving the weights in the negative direction of the gradient by a small amount. The constant that determines this amount is normally referred to as the step size. When this step size is small enough, the process leads these estimated weights to the optimal weights. The convergence and the transient behavior of these weights, along with their covariance, characterize the LMS algorithm, and the way that the step size and the process of gradient estimation affect these parameters is of great practical importance. These and other issues are now discussed in detail.

1) Unconstrained LMS Algorithm: A real-time unconstrained LMS algorithm for determining the optimal weight $\hat{\mathbf{w}}_{\mathrm{MSE}}$ of the system using the reference signal is [27], [187]-[199]

$$\mathbf{w}(n+1) = \mathbf{w}(n) - \mu\,\mathbf{g}(\mathbf{w}(n))$$

Thus, the estimated gradient is a product of the error between the array output and the reference signal and the array signals after the $n$th iteration. For $\mu < 1/\lambda_{\max}$, with $\lambda_{\max}$ denoting the maximum eigenvalue of $R$, the algorithm is stable and the mean value of the estimated weights converges to the optimal weights. As the sum of all eigenvalues of $R$ equals its trace (the sum of its diagonal elements), one may select the gradient step size $\mu$ in terms of measurable quantities using $\mu < 1/\mathrm{Tr}(R)$, with $\mathrm{Tr}(R)$ denoting the trace of $R$. It should be noted that each diagonal element of $R$ is equal to the average power measured on the corresponding element of the array. Thus, for an array of identical elements, the trace of $R$ equals the power measured on any one element times the number of elements in the array. The convergence speed of the algorithm refers to the speed by which the mean of the estimated weights (ensemble average of many trials) approaches the optimal weights. It normally is characterized by $L$ trajectories along $L$ eigenvectors of $R$, with the time constant of the $l$th trajectory given by $\tau_l$

(90)

at the $n$th iteration with respect to $\mathbf{w}(n)$, given by

$$\nabla_{\mathbf{w}}\mathrm{MSE}(\mathbf{w})\big|_{\mathbf{w}=\mathbf{w}(n)} = 2R\mathbf{w}(n) - 2\mathbf{z}.$$

$$= \frac{1}{2\mu\lambda_l} \tag{95}$$

with $\lambda_l$ denoting the $l$th eigenvalue of $R$. Thus, these time constants are functions of the eigenvalues of $R$, the smallest one dependent upon $\lambda_{\max}$, which normally corresponds to the strongest source, and the largest one controlled by the smallest eigenvalue, which corresponds to the weakest source or the background noise. Therefore, the larger the eigenvalue spread, the longer it takes for the algorithm to converge. In terms of interference rejection capability, this means canceling the strongest source first and the weakest source last. The convergence speed of an algorithm is an important property, and its importance for mobile communications is highlighted in [200] by discussing how the LMS algorithm does not perform as well as some other algorithms due to its slow convergence speed in situations of fast-changing signal characteristics. The availability of time for an algorithm to converge in mobile communications systems depends not only on the system design, which dictates the duration of

(89)

where $\mathbf{w}(n+1)$ denotes the new weights computed at the $(n+1)$th iteration; $\mu$ is a positive scalar (gradient step size) that controls the convergence characteristic of the algorithm, that is, how fast and how close the estimated weights approach the optimal weights; and $\mathbf{g}(\mathbf{w}(n))$ is an unbiased estimate of the gradient of the MSE

$$\mathrm{MSE}(\mathbf{w}(n)) = E\left[|r(n+1)|^2\right] + \mathbf{w}^H(n)R\mathbf{w}(n) - 2\mathbf{w}^H(n)\mathbf{z}$$

(93)

(91)

It should be noted that at the $(n+1)$th iteration, the array is operating with weights $\mathbf{w}(n)$ computed at the previous

and

the user signal present (such as the user slot duration in a TDMA system) but also on the speed of mobiles, which changes the rate at which a signal fades. For example, a mobile on foot would cause the signal to fade at a rate of about 5 Hz, whereas the rate would be on the order of 50 Hz for a vehicle mobile, implying that an algorithm needs to converge faster in a system being used by vehicle mobiles compared to one used by a hand-held portable device [47]. Some of these issues for an IS-54 system are discussed in [56], where the convergence of the LMS and SMI algorithms in mobile communications situations is compared. Even when the mean of the estimated weights converges to the optimal weights, they have finite covariance, that is, their covariance matrix is not identical to a matrix with all its elements equal to zero. The covariance matrix of the weights is defined as

(100)

then the misadjustment $M$ is given by

(101)

For a sufficiently small $\mu$, this results in $M \approx 2\mu\,\mathrm{Tr}(R)$. It follows from this expression that increasing $\mu$ increases the misadjustment noise. On the other hand, an increase in $\mu$ causes the algorithm to converge faster, as discussed earlier. Thus, the selection of the gradient step size requires satisfying conflicting demands of 1) reaching the vicinity of the solution point more quickly but wandering around over a larger region and causing a bigger misadjustment and 2) arriving near the solution point slowly with smaller movement in the weights at the end. The latter causes an additional problem, particularly in a nonstationary environment, say, when the interference and optimal solution move slowly, causing the adapting estimated weights to lag behind the optimal weights. This phenomenon is referred to as the weight vector lag. Many schemes, including variable step size, have been suggested to overcome this problem [201]-[208]. Some of these schemes are now discussed. The adaptive algorithm estimates the weights by minimizing the MSE. Thus, in schemes where a variable step size is used, it reflects the value of the MSE at that iteration (going up and down as the MSE goes up and down) such that it stays between the maximum permissible value for convergence and the minimum value based upon the allowed misadjustment. It may be truly variable or it may be allowed to switch between a few preselected values for ease of implementation, as well as to shift by one bit left or right where digital implementation is used. The step size may also be adjusted to reflect the change in the direction of the gradient of the error surface at each iteration [207]. The optimal value of the step size at each step is suggested in [203] such that it minimizes the MSE at each iteration. This is a function of the value of the true gradient at each iteration and $R$.
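The conflict between convergence speed and misadjustment can be illustrated with a small simulation of the unconstrained LMS update $\mathbf{w}(n+1) = \mathbf{w}(n) - \mu\,\mathbf{g}(\mathbf{w}(n))$ with $\mathbf{g}(\mathbf{w}(n)) = 2\mathbf{x}(n+1)\varepsilon^*(\mathbf{w}(n))$. The scenario below (a four-element uniform linear array, one desired signal used as the reference, one interferer, white noise) and all function names are assumptions made for illustration. The excess MSE is evaluated as $\mathbf{V}^H(n)R\mathbf{V}(n)$ using the known $R$, which is possible only in simulation.

```python
import numpy as np

def steering_vector(L, theta, d=0.5):
    """Assumed uniform linear array, spacing d in wavelengths."""
    return np.exp(2j * np.pi * d * np.arange(L) * np.sin(theta))

def cgauss(rng, power, size=None):
    """Zero-mean complex Gaussian samples with the given power."""
    s = np.sqrt(power / 2.0)
    return s * (rng.standard_normal(size) + 1j * rng.standard_normal(size))

def lms_misadjustment(mu_scale, n_iter=20000, seed=0):
    """Run unconstrained LMS with mu = mu_scale / Tr(R) and return the
    ratio of average steady-state excess MSE to the MMSE."""
    rng = np.random.default_rng(seed)
    L, p_s, p_i, p_n = 4, 1.0, 1.0, 0.1
    s0 = steering_vector(L, 0.0)
    si = steering_vector(L, np.deg2rad(40.0))
    R = (p_s * np.outer(s0, s0.conj()) + p_i * np.outer(si, si.conj())
         + p_n * np.eye(L))
    z = p_s * s0                          # correlation E[x r*]
    w_opt = np.linalg.solve(R, z)         # optimal weights
    mmse = p_s - np.real(np.vdot(z, w_opt))
    mu = mu_scale / np.real(np.trace(R))  # step size tied to Tr(R)

    w = np.zeros(L, dtype=complex)
    excess = []
    for n in range(n_iter):
        r = cgauss(rng, p_s)                              # reference sample
        x = r * s0 + cgauss(rng, p_i) * si + cgauss(rng, p_n, L)
        err = np.vdot(w, x) - r                           # w^H x - r
        w = w - mu * 2.0 * x * np.conj(err)               # w - mu * g
        if n >= n_iter // 2:                              # steady state only
            v = w - w_opt
            excess.append(np.real(np.vdot(v, R @ v)))
    return np.mean(excess) / mmse

m_small = lms_misadjustment(0.02)   # slow convergence, small misadjustment
m_large = lms_misadjustment(0.2)    # fast convergence, larger misadjustment
```

Under these assumptions, the larger step size should yield a clearly larger measured misadjustment, in line with the trade-off described above. The exact values depend on the assumed scenario and random seed.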
In practice, these may be replaced by their instantaneous values, leading to a suboptimal value. Instead of having a single step size for an entire weight vector, one may select a variable step size for each weight separately, leading to an increased convergence of the algorithm [204]. The convergence speed of an algorithm may also be increased by adjusting the weights such that interferences are canceled one at a time [209], [210] and by using a scheme known as block processing [211]. For broad-band signals, an implementation in the frequency domain may help increase the speed of convergence. The application of frequency-domain beam forming to estimate the weights using the LMS algorithm for the case when a reference signal is available [138], [139], [142], [143] shows how the frequency-domain approach yields improved convergence and reduced computational

where $\bar{\mathbf{w}} = E[\mathbf{w}(n)]$ denotes the mean of the estimated weights at the $n$th iteration. This causes the average of the MSE not to converge to the MMSE and leads to the excess MSE. From the expressions of the MSE and MMSE, it follows that for a given $\mathbf{w}(n)$, the MSE is given by

$$\mathrm{MSE}(\mathbf{w}(n)) = \mathrm{MMSE} + \mathbf{V}^H(n)R\mathbf{V}(n) \tag{97}$$

where

$$\mathbf{V}(n) = \mathbf{w}(n) - \hat{\mathbf{w}} \tag{98}$$

is the difference between the estimated weights and the optimal weights at the $n$th iteration. Note that $E[\mathbf{V}(n)] \to 0$ as $n \to \infty$. As all elements of $k_{\mathbf{w}\mathbf{w}}(n)$ do not approach zero as $n \to \infty$, it follows that the average value of the excess MSE does not approach zero as $n \to \infty$, that is, $\lim_{n\to\infty} E[\mathbf{V}^H(n)R\mathbf{V}(n)] \neq 0$. The transient and steady-state behavior of the weight covariance matrix and the average excess MSE are important parameters of the LMS algorithm and are discussed in detail in [188] and [198]. A study of the convergence of the LMS algorithm applicable to the PIC processor and a discussion on the gradient step size selection can be found in [75]. The difference between the weights estimated by the adaptive algorithm and the optimal weights is further characterized by the ratio of the average excess steady-state MSE and the MMSE. It is referred to as the misadjustment. It is a dimensionless parameter that measures the performance of the algorithm. The misadjustment is a kind of noise and is caused by the use of the noisy estimate of the gradient. This noise is referred to as the misadjustment noise. For the present case when the gradient is estimated by multiplying the array signals with the error between the array output and the reference signal and the gradient step size is selected such that (99)


Fig. 12. Constrained LMS algorithm: pictorial view of the projection process.
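The projection process pictured in Fig. 12 can also be followed numerically. The sketch below iterates the constrained update of (103), $\mathbf{w}(n+1) = P\{\mathbf{w}(n) - \mu\,\mathbf{g}(\mathbf{w}(n))\} + \mathbf{s}_0/(\mathbf{s}_0^H\mathbf{s}_0)$, with the standard gradient estimate $\mathbf{g}(\mathbf{w}(n)) = 2\mathbf{x}(n+1)y^*(\mathbf{w}(n))$. The array geometry, the source powers, the step size, and the helper names are illustrative assumptions.

```python
import numpy as np

def steering_vector(L, theta, d=0.5):
    """Assumed uniform linear array, spacing d in wavelengths."""
    return np.exp(2j * np.pi * d * np.arange(L) * np.sin(theta))

def cgauss(rng, power, size=None):
    """Zero-mean complex Gaussian samples with the given power."""
    s = np.sqrt(power / 2.0)
    return s * (rng.standard_normal(size) + 1j * rng.standard_normal(size))

rng = np.random.default_rng(1)
L = 4
s0 = steering_vector(L, 0.0)                  # look direction
si = steering_vector(L, np.deg2rad(20.0))     # interference direction

# Projection operator of (104); s0^H s0 = L for unit-modulus elements.
P = np.eye(L) - np.outer(s0, s0.conj()) / L

w = s0 / L                                    # start on the constraint surface
initial_leak = abs(np.vdot(w, si))            # response toward the interferer
mu = 0.002
for n in range(5000):
    # Weak look-direction signal, strong interferer, white noise.
    x = cgauss(rng, 0.1) * s0 + cgauss(rng, 1.0) * si + cgauss(rng, 0.1, L)
    y = np.vdot(w, x)                         # array output w^H x
    g = 2.0 * x * np.conj(y)                  # noisy gradient estimate
    # Perturb (A), project orthogonal to s0 (B), restore the constraint (C).
    w = P @ (w - mu * g) + s0 / L

constraint = np.vdot(w, s0)                   # w^H s0, equal to 1 throughout
final_leak = abs(np.vdot(w, si))
```

Because the constraint is restored after every projection, `constraint` stays at unity while the response toward the interferer (`final_leak`) should fall well below its initial value.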

signal sensitivity compared to the normal LMS algorithm. A discussion of its application to mobile communications can be found in [225].

3) Constrained LMS Algorithm: A real-time constrained algorithm [7], [25], [226]-[233] for determining the optimal weight vector $\hat{\mathbf{w}}$ is

complexities over the time-domain approach. Improved convergence normally arises from the use of different gradient step sizes in different bins. For the constrained LMS case, this is likely to cause deterioration in the steady-state performance of the algorithm. This deterioration, however, does not affect the performance of the unconstrained algorithm [212]. An algorithm known as a sign algorithm [208], [213], where the error between the array output and the reference signal is replaced by its sign, is computationally less complex than the LMS algorithm, as discussed. The algorithm is usually analyzed assuming that successive samples are uncorrelated. This assumption helps in simplifying the mathematics by allowing expectations of data products to be replaced by the products of their expectations. A discussion of situations of correlated samples and a nonstationary environment may be found in [214]-[216]. Applications of an unconstrained LMS algorithm to mobile communications systems using an array include base-mobile communications systems [46], indoor-radio systems [47], and satellite-to-satellite communications systems [97].

2) Normalized LMS Algorithm: This algorithm is a variation of the constant-step-size LMS algorithm and uses a data-dependent step size at each iteration. At the $n$th iteration, the step size is given by

$$\mu(n) = \frac{\mu_0}{\mathbf{x}^H(n)\mathbf{x}(n)}$$

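$$\mathbf{w}(n+1) = P\{\mathbf{w}(n) - \mu\,\mathbf{g}(\mathbf{w}(n))\} + \frac{\mathbf{s}_0}{\mathbf{s}_0^H\mathbf{s}_0} \tag{103}$$

where

$$P \triangleq I - \frac{\mathbf{s}_0\mathbf{s}_0^H}{L} \tag{104}$$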

is a projection operator, $\mathbf{g}(\mathbf{w}(n))$ is an unbiased estimate of the gradient of the power surface $\mathbf{w}^H(n)R\mathbf{w}(n)$ with respect to $\mathbf{w}(n)$ after the $n$th iteration, $\mu$ is the gradient step size, and $\mathbf{s}_0$ is the steering vector in the look direction. The algorithm is "constrained" because the weight vector satisfies the constraint at every iteration, that is, $\mathbf{w}^H(n)\mathbf{s}_0 = 1\ \forall n$. The process of imposing constraints may be understood from Fig. 12, which shows how the weights are updated and projected, using a vector diagram for a two-weight system [25]. The figure shows constant power contours, the constraint surface (a line $\mathbf{w}^H\mathbf{s}_0 = 1$ for a two-dimensional system), a surface parallel to the constraint surface passing through the origin ($\mathbf{w}^H\mathbf{s}_0 = 0$), weight vectors $\mathbf{w}(n)$, $\mathbf{w}(n+1)$, and $\hat{\mathbf{w}}$, and the gradient at the $n$th iteration. The point A on the diagram indicates the position of the weight after completion of the $n$th iteration. It is the intersection of the constraint equation $\mathbf{w}^H\mathbf{s}_0 = 1$ and the power surface $\mathbf{w}^H(n)R\mathbf{w}(n)$ (not shown in the figure). The weights are perturbed by adding a small amount $-\mu\,\mathbf{g}(\mathbf{w}(n))$ and then are projected on $\mathbf{w}^H\mathbf{s}_0 = 0$ using

( 102)

where $\mu_0$ is a constant. The algorithm and its convergence using various types of data have been studied widely [217]-[224]. It avoids the need for estimating the eigenvalues of the correlation matrix or its trace for selection of the maximum permissible step size. The algorithm normally has better convergence performance and less

projection operator $P$. This point is indicated by B on the diagram. Note that $P\mathbf{s}_0 = 0$. Thus, the projection operator projects the weights orthogonal to $\mathbf{s}_0$. The constraint now is restored by adding $\mathbf{s}_0/\mathbf{s}_0^H\mathbf{s}_0$, and the updated weights $\mathbf{w}(n+1)$ move to point C. The process continues by moving the estimated weights toward point D, the optimal solution. The effect of the gradient step size $\mu$ on the convergence speed and the misadjustment noise may also be understood using this figure. A larger step size means that the weight vector moves faster toward point D, the solution point, but wanders around it over a larger region, not reaching close to it and causing more misadjustment. The gradient of $\mathbf{w}^H(n)R\mathbf{w}(n)$ with respect to $\mathbf{w}(n)$ is given by

speed of the algorithm depends upon the eigenvalue spread of $PR_NP$. The discussion so far has concentrated on the convergence of the mean value of the weights to the optimal weights. The variance of these weights is an important parameter, and the transient and steady-state behavior of the weight covariance matrix $k_{\mathbf{w}\mathbf{w}}(n)$ are indicators of the performance of the algorithm, as discussed previously for the unconstrained LMS algorithm. An expression for $k_{\mathbf{w}\mathbf{w}}(n)$ indicates [228] that it is a function of the variance of the gradient estimate. For the standard algorithm, an expression for the variance of the gradient is given by

The steady-state value of the weight covariance matrix governs the misadjustment. For the standard algorithm, it is given by

and its computation using this expression requires knowledge of $R$, which normally is not available in practice. For a standard LMS algorithm, an estimate of the gradient at each iteration is made by replacing $R$ by its noisy sample $\mathbf{x}(n+1)\mathbf{x}^H(n+1)$ available at time instant $(n+1)$, leading to $\mathbf{g}(\mathbf{w}(n)) = 2\mathbf{x}(n+1)y^*(\mathbf{w}(n))$. Thus, the gradient estimate is the product of the array signals and the array output available after the $n$th iteration. The mean value of the weights estimated by the algorithm using this gradient converges to the optimal weights, provided that the gradient step size is small enough to satisfy

$$0 < \mu < \frac{1}{2\lambda_{\max}(PRP)} \tag{106}$$

(107)

where $\ln[\cdot]$ denotes the natural logarithm of $[\cdot]$ and $\lambda_l(PRP)$ and $\lambda_{\max}(PRP)$, respectively, denote the $l$th eigenvalue and the maximum eigenvalue of $PRP$. It follows from

$$\mathbf{w}(n+1) = P\mathbf{w}(n) + \frac{\mathbf{s}_0}{\mathbf{s}_0^H\mathbf{s}_0} - \mu P\mathbf{g}(\mathbf{w}(n))$$

(108)

(111 )

(112)

and examine the projected gradient vector term $P\mathbf{g}(\mathbf{w}(n))$. Expressing

and

$P\mathbf{s}_0 = 0$

1 - J.L

i= l

a) Signal sensitivity: The convergence of the mean weights to the optimal weights is a function of the eigenvalues of $PR_NP$ and thus is independent of the look-direction signal. This is not the case, however, for the weight covariance matrix, which depends on the projected covariance of the gradient used for the weight update algorithm, that is, $PV_{\mathbf{g}}(\mathbf{w}(n))P$. For the standard algorithm, this variance is a product of $R$ and the mean output power $\mathbf{w}^H(n)R\mathbf{w}(n)$ at the $n$th instant of time. Thus, $PV_{\mathbf{g}}(\mathbf{w}(n))P$, which is proportional to $\mathbf{w}^H(n)R\mathbf{w}(n)PRP$, contains a signal from the look direction, indicating that the performance of the standard LMS algorithm is not independent of the signal and that the transient behavior of the weight covariance depends on it. The following rather heuristic argument explains how the signal level causes the weights to fluctuate, using an explicit expression for the weights rather than their covariance matrix. Rewrite the constrained LMS algorithm as follows:

The convergence of the mean weights to $\hat{\mathbf{w}}$ along the $l$th eigenvector of $PRP$ has the time constant

$$\tau_l = \frac{-1}{\ln[1 - 2\mu\lambda_l(PRP)]} \approx \frac{1}{2\mu\lambda_l(PRP)}$$

1 I-p,A 1 ( P R P ) ~L-l 1 Ul=l I-p,A 1 ( P R P )

~L-l

J.L u

M -

(109)

( 113)

that $PRP = PR_NP$, and hence the convergence speed of the mean value of the weights characterized by the time constants and the upper limit on the gradient step size depends only on the eigenvalues of $PR_NP$, indicating that the signal arriving from the look direction does not affect these quantities. The eigenvalues of $PR_NP$ are a function of the directions and powers of the directional sources as well as the array geometry, with the maximum eigenvalue controlled by the strongest source governing the initial convergence speed. The latter part of the convergence is controlled by the smaller eigenvalues associated with the weak sources or the background noise, and thus the overall

and noting that (114)

with $m_s(n)$ denoting the sample of the complex modulating function of the signal and $\mathbf{x}_N(n)$ being the array receiver vector not containing the signal, one obtains

$$P\mathbf{g}(\mathbf{w}(n)) = 2P\mathbf{x}_N(n+1)\mathbf{x}_N^H(n+1)\mathbf{w}(n) + 2m_s^*(n+1)P\mathbf{x}_N(n+1)\mathbf{s}_0^H\mathbf{w}(n). \tag{115}$$

$m_s^*(n+1)$ is a random quantity with variance equal to the look-direction signal power. This makes $P\mathbf{g}(\mathbf{w}(n))$ a noisy

where $N_i$ denotes the number of possible combinations of elements with lag $i$ and summation is over all these combinations. For a linear array of equispaced elements, $N_i = L - i$. It should be noted that for a nonuniform linear array, the amount of improvement realized by the structured method would depend upon the number of elements in $R$ with the same correlation lag. An algorithm that uses the structured method to estimate the matrix using all available samples is discussed in [233]. It has a better convergence performance than that of the RLS algorithm in the presence of a strong look-direction signal. The algorithm is referred to as the improved LMS algorithm. The discussion of the LMS algorithm implies that one has access to all array signals. In situations where this access is not available or not economical, one could estimate the required gradient using perturbation schemes [226]-[228], [235]. Algorithms using these schemes perturb the array weights using some orthogonal sequences and use the measured array output power over the perturbation cycle to estimate the gradient. For a perturbation cycle of length $J$, for example, the algorithm requires $J$ samples at each iteration to estimate the gradient. Thus, the iteration number and the sample numbers are different and the algorithm is slower by a factor of $J$ when measured in time rather than iteration number. The gradient estimation also adds additional noise to the system, known as the perturbation

quantity that fluctuates with the signal power and causes $\mathbf{w}(n+1)$ to fluctuate. The fluctuations in $\mathbf{w}(n+1)$ increase as the signal power increases. Thus, the weights estimated by the standard algorithm are sensitive to the signal power, requiring a lower step size in the presence of a strong signal for the algorithm to converge, which in turn reduces its convergence speed. This fact has been demonstrated in [234] for a high-speed GMSK mobile communications system. The system has been implemented by mounting an array on a vehicle to measure its BER performance. The signal sensitivity of the standard LMS algorithm is caused by the use of a sample correlation matrix in estimating the gradient and could be reduced by using an estimate of the correlation matrix from all available samples. A recursive LMS algorithm uses all previous samples and updates the correlation matrix as a new sample arrives, using

$$R(n+1) = \frac{nR(n) + \mathbf{x}(n+1)\mathbf{x}^H(n+1)}{n+1}. \tag{116}$$

The algorithm then uses this matrix to estimate the required gradient

$$\mathbf{g}(\mathbf{w}(n)) = 2R(n+1)\mathbf{w}(n). \tag{117}$$

The estimated gradient is unbiased and has variance

$$V_{\mathbf{g}}(\mathbf{w}(n)) = \frac{4}{(n+1)^2}\,\mathbf{w}^H(n)R\mathbf{w}(n)\,R.$$

noise.

(118)

A method similar to that used in [236] for adjusting equalizer taps can also be used for adjusting array weights. The method uses a running average of the past gradients to estimate the required gradient at the $n$th iteration rather than using the past correlation matrices to estimate $R(n)$, as is done in the recursive LMS case to reduce the weight noise. It should be noted that all of these gradient estimating schemes, which reduce the variance of the gradient and thus the fluctuations in the array weights, inherently increase the convergence speed of the algorithm, as one is able to increase the step size without compromising the stability of the algorithm.

3) Implementation Issues: The convergence speed, fluctuations in array weights during adaption, and misadjustment noise are the measures of the transient and steady-state behavior of the LMS algorithm. The theoretical performance of the algorithm and the effect of the look-direction signal and gradient step size discussed in the previous section assume the existence of infinite precision, that is, the variable is allowed to take any value. Now, the implications of finite-precision implementations are briefly discussed.

a) Finite-precision arithmetic: In real life, when the algorithm is implemented using digital hardware, where a variable can take only discrete values, there are other parameters that affect its performance and other issues that need consideration, including quantization noise as well as roundoff and truncation noise caused by finite-precision arithmetic [204], [237]-[244]. First, when a $b$-bit quantizer is used to convert an analog signal of range $-r_{\max}$ to $r_{\max}$ into a digital signal, it adds

Comparing this with the variance of the standard LMS algorithm, it follows that the variance of the gradient is reduced by a factor of $(n+1)^2$ using the recursive method, thus making the recursive algorithm less signal sensitive. As $n \to \infty$, the signal sensitivity of the recursive LMS algorithm approaches zero. The signal sensitivity of the LMS algorithm also can be reduced by spatial averaging instead of sample averaging, as is done when the weights are estimated using a structured gradient algorithm.

b) Structured gradient algorithm: For a linear array of equispaced elements, the array correlation matrix has the Toeplitz structure, that is

$$R = \begin{bmatrix} r_0 & r_1 & \cdots & r_{L-1} \\ r_1^* & r_0 & \cdots & r_{L-2} \\ \vdots & \vdots & \ddots & \vdots \\ r_{L-1}^* & r_{L-2}^* & \cdots & r_0 \end{bmatrix} \tag{119}$$

with $r_i$, $i = 0, 1, \ldots, L-1$, being the $L$ correlation lags. The noisy sample of $R$ used in estimating the gradient for the standard LMS algorithm does not have this structure. The structured gradient algorithm [231], [232] exploits this structure of $R$ such that the estimated matrix has this structure. The $i$th lag $r_i(n)$ is estimated as

$$r_i(n) = \frac{1}{N_i}\sum_{l} x_l(n)\,x_{l+i}^*(n), \qquad i = 0, 1, \ldots, L-1 \tag{120}$$

a quantization noise of zero mean and variance [245]

$$\sigma_q^2 = \frac{2^{-2b}\,r_{\max}^2}{3}$$

component using the Hilbert transformer or quadrature filter [247], which has the transfer functions

(121)

$H(f) =$

to the system. Second, the effect of the finite word length of the devices where the numbers are stored causes the roundoff or truncation noise to be added to the system. This arises from the fact that when arithmetic operations are performed using these numbers, the answers are normally longer than the available word length and thus need to be rounded off or truncated to fit into finite word memory. Last, all the variables, such as the estimated gradient, gradient step size, and estimated weights, are allowed to take only finite values and can be increased or decreased by a factor of two. The combined effect of all these on the algorithm is a larger fluctuation in the weights and a larger misadjustment than otherwise. The misadjustment appears to be the most sensitive to the finite word length effect on weights, suggesting that the weights should be implemented using a longer word length [237]. A reduction in the step size below certain levels may even cause the misadjustment to increase [242], which is contrary to the infinite-precision case, where a decrease in the step causes the misadjustment to decrease. It appears [244] that the finite word length effects are amplified in an environment that yields smaller eigenvalues for the correlation matrix. An important effect of the finite word length on the weight update is that when a small input does not cause the weights to move more than the least significant bit (the smallest possible increment, which depends upon the number of bits used to store weights), then the algorithm stalls and the weights do not change anymore [242], requiring a bigger step size, which in turn increases the weight fluctuations. A postalgorithm smoothing scheme suggested in [238] appears to reduce the weight fluctuations, leading to a better convergence performance. It suggests a running average of past weights. Thus, the weights are recursively updated using past weights with or without finite memory. 
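The quantization noise variance of (121) can be checked with a short simulation. The sketch below assumes a uniform quantizer with $2^b$ levels spanning $-r_{\max}$ to $r_{\max}$ and an input that is uniformly distributed over that range; the function name `quantize` and the test signal are illustrative assumptions.

```python
import numpy as np

def quantize(signal, b, r_max):
    """Uniform b-bit quantizer over [-r_max, r_max] (assumed mid-rise
    characteristic: each sample maps to the center of its cell)."""
    step = 2.0 * r_max / 2**b              # width of one quantization cell
    idx = np.floor(signal / step)
    idx = np.clip(idx, -2**(b - 1), 2**(b - 1) - 1)
    return (idx + 0.5) * step

rng = np.random.default_rng(0)
b, r_max = 8, 1.0
x = rng.uniform(-r_max, r_max, 200_000)    # stand-in for the analog signal
noise = quantize(x, b, r_max) - x          # quantization error

predicted = 2.0**(-2 * b) * r_max**2 / 3.0 # variance from (121)
measured = noise.var()
```

The step size is $2r_{\max}/2^b$, so the familiar step-squared-over-twelve variance of a uniform quantizer reduces to the expression in (121). Under the assumed input distribution, `measured` should match `predicted` to within a few percent.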
A discussion of system design applicable to mobile satellite communications that takes into account quantization noise and other issues discussed above may be found in [59]. b) Real versus complex implementation: There are situations where the input data to the weight adaption scheme are real, and situations where these are complex (with real and imaginary parts denoting in-phase and quadrature components). In both of these cases, the weights could be updated using the real LlvlS algorithm or the complex LMS algorithm. The former utilizes real arithmetic and uses real variables and updates real weights (the in-phase and quadrature components are updated separately when complex data are available), whereas the complex algorithm [246] utilizes complex arithmetic, uses complex variables, and updates as well as implements weights as complex variables similar to the treatment presented in this paper. For real data using a complex algorithm, one needs to generate the quadrature

{~j J

f>O

(122)

f < O.
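As an illustration of the quadrature filter in (122), the following minimal sketch applies the transfer function in the discrete Fourier domain to generate the quadrature component of a real sequence. The naive DFT helpers, the function name `quadrature`, and the test tone are assumptions made for this example only; a practical system would use an FIR Hilbert transformer or an FFT.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) discrete Fourier transform (illustrative helper)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Naive inverse DFT matching dft() above."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def quadrature(x):
    """Apply H(f) = -j for f > 0 and +j for f < 0 to a real sequence."""
    n = len(x)
    X = dft(x)
    Y = []
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            Y.append(0)             # DC and Nyquist bins carry no sign
        elif k < n / 2:
            Y.append(-1j * X[k])    # positive-frequency bins
        else:
            Y.append(1j * X[k])     # negative-frequency bins
    return [v.real for v in idft(Y)]

N = 16
in_phase = [math.cos(2 * math.pi * 2 * t / N) for t in range(N)]  # assumed test tone
quad = quadrature(in_phase)
```

For a cosine input the generated quadrature component is the corresponding sine, so the pair `in_phase[t] + 1j * quad[t]` forms the complex sample that a complex LMS update would consume.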

For a similar misadjustment, the complex algorithm converges faster than the real algorithm. For more details on this aspect, see, for example, [198] and [228].

C. RLS Algorithm

The convergence of the LMS algorithm depends upon the eigenvalues of R. In an environment yielding R with a large eigenvalue spread, the algorithm converges slowly. This problem is solved in the RLS algorithm [64], [248]-[258] by replacing the gradient step size $\mu$ with a gain matrix $R^{-1}(n)$ at the nth iteration, producing the weight update equation

\[ \mathbf{w}(n) = \mathbf{w}(n-1) - R^{-1}(n)\,\mathbf{x}(n)\,\varepsilon^{*}(\mathbf{w}(n-1)) \tag{123} \]

where R(n) is given by

\[ R(n) = \delta_0 R(n-1) + \mathbf{x}(n)\mathbf{x}^{H}(n) = \sum_{k=0}^{n} \delta_0^{\,n-k}\,\mathbf{x}(k)\mathbf{x}^{H}(k) \tag{124} \]

where $\delta_0$, a real scalar smaller than but close to one, is used for exponential weighting of the past data and is referred to as the forgetting factor, as the update equation tends to deemphasize the old samples. The quantity $1/(1-\delta_0)$ is normally referred to as the memory of the algorithm. Thus, for $\delta_0 = 0.99$, the memory of the algorithm is close to 100 samples. The RLS algorithm updates the required inverse of R(n) using the previous inverse and the present sample as

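\[ R^{-1}(n) = \frac{1}{\delta_0}\left[ R^{-1}(n-1) - \frac{R^{-1}(n-1)\,\mathbf{x}(n)\mathbf{x}^{H}(n)\,R^{-1}(n-1)}{\delta_0 + \mathbf{x}^{H}(n)R^{-1}(n-1)\mathbf{x}(n)} \right]. \tag{125} \]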

The matrix is initialized as

\[ R^{-1}(0) = \frac{1}{\varepsilon_0} I, \qquad \varepsilon_0 > 0. \tag{126} \]
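The recursion in (123), (125), and (126) can be sketched as follows. This is a minimal illustration, not a reference implementation: the error is taken here as array output minus reference (an assumption chosen so that the minus sign in (123) acts as a corrective step; the error is not defined in this section), and the names `rls_step`, `w_true`, and the random test data are hypothetical.

```python
import random

def matvec(M, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def rls_step(w, P, x, ref, delta0):
    """One RLS iteration; P is R^{-1}(n-1) on entry and R^{-1}(n) on return."""
    L = len(x)
    Px = matvec(P, x)                                             # R^{-1}(n-1) x(n)
    xHP = [sum(x[i].conjugate() * P[i][j] for i in range(L)) for j in range(L)]
    denom = delta0 + sum(x[i].conjugate() * Px[i] for i in range(L))
    P_new = [[(P[i][j] - Px[i] * xHP[j] / denom) / delta0 for j in range(L)]
             for i in range(L)]                                   # inverse update (125)
    y = sum(w[i].conjugate() * x[i] for i in range(L))            # array output w^H x
    eps = y - ref                                                 # assumed error convention
    gain = matvec(P_new, x)                                       # R^{-1}(n) x(n)
    w_new = [w[i] - gain[i] * eps.conjugate() for i in range(L)]  # weight update (123)
    return w_new, P_new

random.seed(0)
delta0, eps0 = 0.99, 1e-3                  # forgetting factor and initialization constant
memory = 1.0 / (1.0 - delta0)              # about 100 samples
w_true = [1 + 0.5j, -0.3 + 0.2j]           # hypothetical weights generating the reference
w = [0j, 0j]
P = [[1 / eps0, 0], [0, 1 / eps0]]         # initialization (126)
for _ in range(50):
    x = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(2)]
    ref = sum(wt.conjugate() * xi for wt, xi in zip(w_true, x))   # noise-free reference
    w, P = rls_step(w, P, x, ref, delta0)
```

With a noise-free reference the weights settle near `w_true` within a few tens of samples, and the remaining offset is governed by the initialization constant `eps0`.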

A discussion on the selection of $\varepsilon_0$ and its effects on the performance of the algorithm can be found in [253]. The RLS algorithm minimizes the cumulative square error [251], [252]

\[ J(n) = \sum_{k=0}^{n} \delta_0^{\,n-k}\,|\varepsilon(k)|^{2} \tag{127} \]

and its convergence is independent of the eigenvalue distribution of the correlation matrix. The algorithm presented here is the exact RLS algorithm. For other forms of the RLS algorithm with improved computation efficiency, see, for example, [249] and [253]. A comparison of the convergence speed of the LMS, the RLS, and some other gradient-based algorithms using quantized or clipped data indicates that RLS is the most efficient and LMS is the slowest [259]. A computer-simulation study of the RLS, LMS, and SMI algorithms in a mobile communications situation suggests that the former outperforms the latter two in flat-fading channels [260]. An application of the RLS algorithm for the reverse link of a cellular communication using the CDMA system is considered in [261] to show an increase in channel capacity by an adaptive array.

D. CMA

CMA is a gradient-based algorithm that works on the premise that the existence of an interference causes fluctuation in the amplitude of the array output, which otherwise has a constant modulus. It updates the weights by minimizing the cost function [96], [262]-[264]

\[ J(n) = \frac{1}{2} E\left[\left(|y(n)|^{2} - y_0^{2}\right)^{2}\right] \tag{128} \]

using the following equation:

\[ \mathbf{w}(n+1) = \mathbf{w}(n) - \mu\,\mathbf{g}(\mathbf{w}(n)) \tag{129} \]

where $y(n) = \mathbf{w}^{H}(n)\mathbf{x}(n+1)$ is the array output after the nth iteration, $y_0$ is the desired amplitude in the absence of interference, and $\mathbf{g}(\mathbf{w}(n))$ denotes an estimate of the gradient of the cost function. Similar to the LMS algorithm discussed previously, it uses an estimate of the gradient by replacing the true gradient with an instant value given by

\[ \mathbf{g}(\mathbf{w}(n)) = 2\,\varepsilon(n)\,\mathbf{x}(n+1) \tag{130} \]

where

\[ \varepsilon(n) = \left(|y(n)|^{2} - y_0^{2}\right) y(n). \tag{131} \]

The weight update equation for this case becomes

\[ \mathbf{w}(n+1) = \mathbf{w}(n) - 2\mu\,\varepsilon(n)\,\mathbf{x}(n+1). \tag{132} \]

In appearance, this is similar to the LMS algorithm with a reference signal where

\[ \varepsilon(n) = d(n) - y(n). \tag{133} \]

Its application to a digital land-mobile radio communications system using TDMA is studied in [265] to compensate for selective fading. Discussions of hardware implementation of a CMA adaptive array and its BER performance for high-speed transmission in mobile communications may be found in [234] and [266]. Development of CMA for beam-space array signal processing, including its hardware realization, has been reported in [99]. The results presented in [96] indicate that the beam-space CMA is able to cancel interferences arriving from directions other than the look direction. CMA is useful for eliminating correlated arrivals and is effective for constant modulated envelope signals such as GMSK and QPSK, which are used in digital communications. The algorithm, however, is not appropriate for the CDMA system because of the required power control [261]. Use of CMA to separate cochannel FM signals blindly in mobile communications has been investigated in [267]. A variation of CMA referred to as differential CMA and reported in [180] has inferior convergence characteristics compared to CMA but may be improved using DOA information to make it operative in beam space.

E. Conjugate Gradient Method

An application of the conjugate gradient method [268]-[270] to adjust the weights of an antenna array is discussed in [57] and [271]. The method in general is useful for solving a set of equations of the form $A\mathbf{w} = \mathbf{b}$ to obtain $\mathbf{w}$. For an array processing problem [57], [271], $\mathbf{w}$ denotes the array weights, A is a matrix with each of its columns denoting consecutive samples obtained from array elements, and $\mathbf{b}$ is a vector containing consecutive samples of the desired signal. Thus, a residual vector

\[ \mathbf{r} = \mathbf{b} - A\mathbf{w} \tag{134} \]

denotes an error between the desired signal and the array output at each sample, with the sum of the squared error given by $\mathbf{r}^{H}\mathbf{r}$. The method starts with an initial guess $\mathbf{w}(0)$ of the weights, obtains a residual

\[ \mathbf{r}(0) = \mathbf{b} - A\mathbf{w}(0) \tag{135} \]

and an initial direction vector $\mathbf{g}(0)$

(136)

and moves the weights in this direction to yield a weight update equation

\[ \mathbf{w}(n+1) = \mathbf{w}(n) - \mu(n)\,\mathbf{g}(n) \tag{137} \]

where the step size $\mu(n)$ is given by

(138)

The residual $\mathbf{r}(n)$ and the direction vector $\mathbf{g}(n)$ are updated using

\[ \mathbf{r}(n+1) = \mathbf{r}(n) + \mu(n)\,A\,\mathbf{g}(n) \tag{139} \]

and

\[ \mathbf{g}(n+1) = A^{H}\mathbf{r}(n+1) - \alpha(n)\,\mathbf{g}(n) \tag{140} \]

with

\[ \alpha(n) = \frac{|A^{H}\mathbf{r}(n+1)|^{2}}{|A^{H}\mathbf{r}(n)|^{2}}. \tag{141} \]

The algorithm is stopped when the residual falls below a certain predetermined level. It should be noted that the direction vector points in the direction of the gradient of the error surface $\mathbf{r}^{H}(n)\mathbf{r}(n)$ at the nth iteration, which the algorithm is trying to minimize. The method converges to the minimum of the error surface within at most L iterations for an L-rank matrix equation and thus provides the fastest convergence of all the iterative methods [57], [270], [272].
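The finite-termination property described above can be checked numerically. The sketch below uses the textbook conjugate gradient recursion on the normal equations $A^{H}A\mathbf{w} = A^{H}\mathbf{b}$ (often called CGLS); its sign conventions (search direction `p`, update `w + mu * p`) are those of the standard formulation and are an assumption here, so they may differ from the convention of (134)-(141). The matrix `A` and the weights `w_true` are made-up test data.

```python
def cgls(A, b, iters):
    """Conjugate gradient on the normal equations A^H A w = A^H b (standard CGLS form)."""
    rows, cols = len(A), len(A[0])

    def Av(v):
        return [sum(A[i][j] * v[j] for j in range(cols)) for i in range(rows)]

    def AHv(v):
        return [sum(A[i][j].conjugate() * v[i] for i in range(rows)) for j in range(cols)]

    def norm2(v):
        return sum(abs(c) ** 2 for c in v)

    w = [0j] * cols
    r = list(b)                  # residual b - A w for the initial guess w = 0
    s = AHv(r)                   # A^H r
    p = list(s)                  # initial search direction
    for _ in range(iters):
        if norm2(s) < 1e-24:     # stop once the gradient has vanished
            break
        Ap = Av(p)
        mu = norm2(s) / norm2(Ap)                       # step size along p
        w = [w[j] + mu * p[j] for j in range(cols)]
        r = [r[i] - mu * Ap[i] for i in range(rows)]
        s_new = AHv(r)
        alpha = norm2(s_new) / norm2(s)                 # same ratio as in (141)
        p = [s_new[j] + alpha * p[j] for j in range(cols)]
        s = s_new
    return w, r

# Four samples from a two-element array (rows are samples, columns are elements)
A = [[1 + 1j, 2 + 0j], [0.5 + 0j, -1j], [2 - 1j, 1 + 0j], [1j, 3 + 0j]]
w_true = [0.5 - 0.25j, -1 + 2j]
b = [sum(A[i][j] * w_true[j] for j in range(2)) for i in range(4)]
w_est, residual = cgls(A, b, iters=2)
```

Because the system has two unknowns, two iterations suffice to recover `w_true` up to rounding, which is the "at most L iterations" behavior noted in the text.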

Use of the conjugate gradient method to eliminate multipath fading in mobile communications situations has been studied in [57] and [271] to show that the BER performance of the system using the conjugate gradient method is better than that using the RLS algorithm.

F. Neural Network Approach

In this section, an algorithm referred to as Madaline Rule III (MRIII) is described. A discussion of various aspects of this algorithm as well as other related issues can be found in [273]. For a general theory of neural networks and their applications, see, for example, [274] and [275]. The MRIII algorithm described here is applicable when the reference signal is available and minimizes the MSE between the reference signal and the modified array output rather than the MSE between the reference signal and the array output, as is the case for other algorithms discussed previously. The array output is modified using a nonlinear mapping such as the hyperbolic tangent

\[ \tanh(x) = \frac{1 - e^{-2x}}{1 + e^{-2x}} \tag{142} \]

and the weights are updated using

\[ \mathbf{w}(n+1) = \mathbf{w}(n) - \mu\,\mathbf{g}(\mathbf{w}(n)) \tag{143} \]

where $\mu$ is the gradient step size and $\mathbf{g}(\mathbf{w}(n))$ is the instant gradient of the MSE surface with respect to the array weights $\mathbf{w}(n)$. When the array is operating with weights $\mathbf{w}(n)$, producing the array output

\[ y(n) = \mathbf{w}^{H}(n)\,\mathbf{x}(n+1) \tag{144} \]

the modified output $\tilde{y}(n)$ becomes

\[ \tilde{y}(n) = \tanh(y(n)) \tag{145} \]

and the resulting error signal is given by

\[ \tilde{\varepsilon}(n) = \tilde{y}(n) - r(n+1). \tag{146} \]

The instant gradient of the MSE surface with respect to the array weights $\mathbf{w}(n)$ thus becomes

\[ \mathbf{g}(\mathbf{w}(n)) = \frac{\partial\left(\tilde{\varepsilon}^{*}(n)\tilde{\varepsilon}(n)\right)}{\partial \mathbf{w}(n)} = 2\tilde{\varepsilon}(n)\frac{\partial \tilde{\varepsilon}(n)}{\partial \mathbf{w}(n)} = 2\tilde{\varepsilon}(n)\frac{\partial \tilde{\varepsilon}(n)}{\partial y(n)}\frac{\partial y(n)}{\partial \mathbf{w}(n)} = 2\tilde{\varepsilon}(n)\frac{\partial \tilde{\varepsilon}(n)}{\partial y(n)}\,\mathbf{x}(n+1). \tag{147} \]

Replacing $\partial\tilde{\varepsilon}(n)/\partial y(n)$ with $\Delta\tilde{\varepsilon}(n)/\Delta y$ for small $\Delta y$ in (147) results in

\[ \mathbf{g}(\mathbf{w}(n)) = 2\tilde{\varepsilon}(n)\frac{\Delta\tilde{\varepsilon}(n)}{\Delta y}\,\mathbf{x}(n+1) \tag{148} \]

where $\Delta\tilde{\varepsilon}(n)$ denotes the change in the error output when the array output is perturbed by a small amount $\Delta y$ and could be measured to estimate the instant gradient. The weight update equation then becomes

\[ \mathbf{w}(n+1) = \mathbf{w}(n) - 2\mu\,\tilde{\varepsilon}(n)\frac{\Delta\tilde{\varepsilon}(n)}{\Delta y}\,\mathbf{x}(n+1). \tag{149} \]

The MSE surface of the error signal $\tilde{\varepsilon}(n)$ may have local minima, and thus the global convergence of the MRIII algorithm is not guaranteed, which is not the case when the MSE between the reference signal and the array output is minimized [273]. The algorithm, however, is very robust, is suitable for analog implementation, and results in fast weight updates. The MRIII algorithm described here is suitable when the reference signal is available. A scheme to solve a constrained beam-forming problem using neural networks is analyzed in [276], and its implementation using switched capacitor circuits is described in [277]. Computer simulations and experimental results indicate the suitability of the scheme.

IV. DOA ESTIMATION METHODS

In this section, a review of DOA estimation methods, including their performance, sensitivity, and limitations [278], is presented. The direction of a source is parameterized by the variable $\theta$.

A. Spectral Estimation Methods

These methods estimate the DOA by computing the spatial spectrum and then determining the local maxima [43], [279]-[284]. Most of these techniques have their roots in time-series analysis. A brief overview and comparison of some of these methods can be found in [279] and [281]. One of the earliest methods of spectral analysis is the Bartlett method [279], [284], where a rectangular window of uniform weighting is applied to the time-series data to be analyzed. For bearing estimation problems using an array, this is equivalent to equal weighting on each element. Thus, by steering the array in the $\theta$ direction, this method estimates the mean power, an expression for which is given by

\[ P_B(\theta) = \frac{\mathbf{s}_\theta^{H} R\,\mathbf{s}_\theta}{L^{2}} \tag{150} \]

where $\mathbf{s}_\theta$ denotes the steering vector associated with the direction $\theta$. A set of steering vectors $\{\mathbf{s}_\theta\}$ associated with different $\theta$ is often referred to as the array manifold in DOA estimation literature. In practice, it may be measured at the time of array calibration. The process is similar to that of mechanically steering the array in this direction and measuring the output power. Due to the resulting side lobes, the output power is contributed not only from the direction in which the array is steered but also from the directions where the side lobes are pointing. The processor is also known as the conventional beam former, and the resolving power of the processor depends upon the aperture of the array or the beamwidth of the main lobe. Its use for mobile communications has been studied in [285].
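The Bartlett spectrum in (150) can be evaluated with a few lines of code. The sketch below assumes a uniform linear array with half-wavelength spacing, a single unit-power source at 60 degrees, and white noise of power 0.1; none of these values come from the text, and the names `steering` and `bartlett_power` are illustrative.

```python
import cmath
import math

def steering(theta_deg, L, d=0.5):
    """Steering vector of an L-element uniform linear array (spacing d in wavelengths, assumed)."""
    c = math.cos(math.radians(theta_deg))
    return [cmath.exp(2j * math.pi * d * l * c) for l in range(L)]

def bartlett_power(R, s):
    """Mean output power s^H R s / L^2 of the conventional beam former, as in (150)."""
    L = len(s)
    Rs = [sum(R[i][j] * s[j] for j in range(L)) for i in range(L)]
    return (sum(s[i].conjugate() * Rs[i] for i in range(L)) / L ** 2).real

L = 8
s0 = steering(60.0, L)
# Ideal correlation matrix: unit-power source at 60 degrees plus white noise of power 0.1
R = [[s0[i] * s0[j].conjugate() + (0.1 if i == j else 0.0) for j in range(L)]
     for i in range(L)]
spectrum = {th: bartlett_power(R, steering(th, L)) for th in range(0, 181)}
peak = max(spectrum, key=spectrum.get)
```

Scanning `spectrum` over the 1-degree grid places the main lobe at the source direction, with side lobes contributing power at other angles as described above.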

B. MVDR Estimator

This is the ML method of spectrum estimation [43], which finds the ML estimate of the power arriving from a point source in direction $\theta$ assuming all other sources as interferences. In beam-forming literature, it is known as the MVDR beam former as well as the optimal beam former since, in the absence of errors, it maximizes the output SNR and passes the look-direction signal undistorted. For a DOA estimation problem, the term "maximum likelihood" is used for the method that finds the ML estimate of the direction rather than of the power, as is done by this method [286]. Following this convention, the current estimator in this paper is referred to as the MVDR estimator. This method uses the array weights, which are obtained by minimizing the mean output power subject to a unity constraint in the look direction. An expression for the power spectrum is given by

\[ P_{MV}(\theta) = \frac{1}{\mathbf{s}_\theta^{H} R^{-1}\,\mathbf{s}_\theta}. \tag{151} \]

This method has better resolution properties than the Bartlett method [42] but does not have the best resolution properties of any method [281].

C. Linear Prediction Method

This method estimates the output of one sensor using linear combinations of the remaining sensor outputs and minimizes the mean square prediction error, that is, the error between the estimate and the actual output [281], [287]. Thus, it obtains the array weights by minimizing the mean output power of the array subject to the constraint that the weight on the selected sensor is unity. An expression for the array weights and the power spectrum is given, respectively, by [281]

(152)

and

(153)

where $\mathbf{u}_1$ is a column vector of all zeros except one element, which is equal to one. The position of the one in the column corresponds to the position of the selected element in the array for predicting its output. There is no criterion for proper choice of this element. The choice of this element, however, affects the resolution capability and the bias in the estimate, and these effects are dependent upon the SNR and separation of the directional sources [281]. The linear prediction methods perform well in a moderately low SNR environment and are a good compromise in situations where sources are of approximately equal strength and are nearly coherent [288].

D. MEM

This method finds a power spectrum such that its Fourier transform equals the measured correlation subject to the constraint that its entropy is maximized [289]. The entropy of a Gaussian band-limited time series with power spectrum $S(f)$ is defined as

\[ H(S) = \int_{-f_N}^{f_N} \ln S(f)\, df \tag{154} \]

where $f_N$ is the Nyquist frequency. For estimating DOA from the measurements using an array of sensors, the method finds a continuous function $P_{ME}(\theta) > 0$ such that it maximizes the entropy function

\[ H(P) = \int_{0}^{2\pi} \ln P_{ME}(\theta)\, d\theta \tag{155} \]

subject to the constraint that the measured correlation between the ith and the jth element $R_{ij}$ satisfies

(156)

where $\tau_{ij}(\theta)$ denotes the differential delay between elements i and j due to a source in the $\theta$ direction. The solution to this problem requires an infinite dimensional search, which may be transformed to a finite dimensional search using the duality principle [290], leading to

(157)

where $\hat{\mathbf{w}}$ is obtained by minimizing

(158)

subject to

\[ \mathbf{w}^{T}\mathbf{r} = 2\pi \tag{159} \]

and

(160)

with $\mathbf{q}(\theta)$ and $\mathbf{r}$ defined as

\[ \mathbf{q}(\theta) = \left[1, \sqrt{2}\cos\left(2\pi f \tau_{12}(\theta)\right), \ldots\right]^{T} \tag{161} \]

\[ \mathbf{r} = \left[R_{11}, \sqrt{2}R_{12}, \ldots\right]^{T}. \tag{162} \]

It should be noted that the dimension of these vectors depends upon the array geometry and is equal to the number of known correlations $R_{ij}$ for every possible i and j. The minimization problem defined above may be solved iteratively using a standard gradient descent algorithm. More information on various issues of the MEM may be found in [200] and [291]-[295]. The suitability of MEM

for mobile communications in fast-fading signal conditions has been studied in [200].

E. MLM

This method estimates the DOA's from a given set of array samples by maximizing the log-likelihood function [286], [296]-[303]. The likelihood function is the joint probability density function of the sampled data given the DOA's and viewed as a function of the desired variables (the DOA's, for this case). The method searches for those directions that maximize the log of this function, the log-likelihood function. The ML criterion signifies that plane waves from these directions are most likely to cause the given samples to occur [304]. The maximization of the log-likelihood function is a nonlinear optimization problem. In the absence of a closed-form solution, it requires iterative schemes for solutions. There are many such schemes available in the literature. The well-known gradient descent algorithm using the estimated gradient of the function at each iteration as well as the standard Newton-Raphson method are well suited for the job [305]. Other schemes, such as the alternating projection method [298], [300] and the expectation maximization algorithm [286], [306], [307], have been proposed for solving this problem in general as well as for specialized cases, such as unknown polarization [301], unknown noise environments [302], and contaminated Gaussian noise [296]. A fast algorithm [308] based upon Newton's method developed for estimating frequencies of sinusoids may be modified to suit the DOA estimation based upon the ML criterion. The ML method gives a superior performance compared to other methods, particularly when the SNR is small, the number of samples is small, or the sources are correlated [298], and thus is of practical interest. For a single source, the estimates obtained by this method are asymptotically unbiased [301], that is, the expected values of the estimates approach their true values as the number of samples increases. In that sense, it may be used as a standard to compare the performance of other methods.

The method normally assumes that the number of sources M is known [298]. When a large number of samples is available, other, computationally more efficient schemes may be used with performance almost equal to this method [299]. Analysis of the method to estimate the direction of sources when the array and the source are in motion relative to each other indicates its potential for mobile communications [309], [310].

F. Eigenstructure Methods

These methods rely on the following properties of R: 1) The space spanned by its eigenvectors may be partitioned into two subspaces, namely, the signal subspace and the noise subspace, and 2) the steering vectors corresponding to the directional sources are orthogonal to the noise subspace. As the noise subspace is orthogonal to the signal subspace, these steering vectors are contained in the signal subspace.

It should be noted that the noise subspace is spanned by the eigenvectors associated with the smaller eigenvalues of the correlation matrix, and the signal subspace is spanned by the eigenvectors associated with its larger eigenvalues. In principle, the eigenstructure-based methods search for directions such that the steering vectors associated with these directions are orthogonal to the noise subspace and are contained in the signal subspace. In practice, the search may be divided into two parts. First, find a weight vector $\mathbf{w}$ that is contained in the noise subspace or is orthogonal to the signal subspace. Then search for directions such that the steering vectors associated with these directions are orthogonal to this vector. The source directions correspond to the local minima of the function $|\mathbf{w}^{H}\mathbf{s}_\theta|$, where $\mathbf{s}_\theta$ denotes a steering vector. When these steering vectors are not guaranteed to be in the signal subspace, there may be more minima than the number of sources, and the distinction between an actual source direction and a spurious minimum in $|\mathbf{w}^{H}\mathbf{s}_\theta|$ is made by measuring the power in these directions.

Many methods have been proposed that utilize the eigenstructure of the array correlation matrix. These methods differ in the way the available array signals have been utilized, required array geometry, applicable signal model, and so on. Some of these methods do not require explicit computation of the eigenvalues and eigenvectors of the array correlation matrix, whereas in others, it is essential. An effective computation of these quantities may be made by methods similar to those described in [311]. When this matrix is not available, a suitable estimate of the matrix is made from the available samples. One of the earliest methods of DOA estimation based on the eigenstructure of a covariance matrix is due to Pisarenko [312] and has a better resolution property than those of the minimum variance, maximum entropy, and linear prediction methods [313].

A critical comparison of this method with two other schemes [314], [315] applicable for a correlated noise field that exists in situations of multipaths has been presented in [316] to show that Pisarenko's method is an economized version of these schemes restricted to equispaced linear arrays. The scheme presented in [314] is useful for off-line implementation similar to those presented in [16], [317], and [318], whereas the method described in [315] is useful for real-time implementations and uses a normalized gradient algorithm to estimate a vector in the noise subspace from available array signals. Some other schemes suitable for real-time implementation are discussed in [319]-[321]. A scheme known as the matrix pencil method, shown [322] to be similar to Pisarenko's method, has been described in [323]. Eigenstructure methods may also be used for finding DOA when the background noise is not white but has either a known covariance [324] or an unknown covariance [325], or when the sources are in the near field and/or the sensors have unknown gain patterns [326]. For the latter case, the signals induced on all elements of the array are not of equal intensity, as is the case when the array is in the far field of the directional sources. The effect of spatial coherence

on the resolution capability of these methods is discussed in [327] and [328], whereas the issue of the optimality of these methods is considered in [329]. Now, some of the popular schemes are described in detail.

G. MUSIC Algorithm

1) Spectral MUSIC: The MUSIC method [330] is a relatively simple and efficient eigenstructure method of DOA estimation. It has many variations and is perhaps the most studied method in its class. In its standard form, also known as spectral MUSIC, the method estimates the noise subspace from the available samples. This can be done by either eigenvalue decomposition of the estimated array correlation matrix or singular value decomposition of the data matrix, with its N columns being the N snapshots or the array signal vectors. The latter is preferred for numerical reasons [331]. Once the noise subspace has been estimated, a search for M directions is made by looking for steering vectors that are as orthogonal to the noise subspace as possible. This is normally accomplished by searching for peaks in the MUSIC spectrum given by

\[ P_{MU}(\theta) = \frac{1}{|\mathbf{s}_\theta^{H} U_N|^{2}} \tag{163} \]

where $U_N$ denotes an L by L-M dimensional matrix with its L-M columns being the eigenvectors corresponding to the L-M smallest eigenvalues of the array correlation matrix, and $\mathbf{s}_\theta$ denotes the steering vector corresponding to direction $\theta$. It should be noted that instead of using the noise subspace and searching for directions with steering vectors orthogonal to this subspace, one may use the signal subspace and search for directions with steering vectors contained in this space [332]. This amounts to searching for peaks in $|U_S^{H}\mathbf{s}_\theta|^{2}$, where $U_S$ denotes an L by M dimensional matrix, with its M columns being the eigenvectors corresponding to the M largest eigenvalues of the array correlation matrix. It is advantageous to use the one with the smaller dimensions.

For the case of a single source, the DOA estimate made by the MUSIC method asymptotically approaches the CRLB, that is, when the number of snapshots increases infinitely, the best possible estimate is made. For multiple sources, the same holds for the large SNR cases, that is, when the SNR approaches infinity [333], [334]. The CRLB gives the theoretically lowest value of the covariance of an unbiased estimator. An application of the MUSIC algorithm to cellular mobile communications is investigated to locate land mobiles and shows that when multipath arrivals are grouped in clusters, the algorithm is able to locate the mean of each cluster arriving at a mobile [335]. This information then may be used to locate the line of sight. Its use for mobile satellite communications has been suggested in [59].

2) Root-MUSIC: For a ULA, the search for DOA can be made by finding the roots of a polynomial. In this case, the method is known as root-MUSIC [332]. Thus, root-MUSIC is applicable when a ULA is used. It solves a polynomial rooting problem in contrast to the identification and localization of spectral peaks using spectral MUSIC. Root-MUSIC has a better performance than spectral MUSIC [336].

3) Constrained MUSIC: This incorporates the knowledge of the known sources to improve the estimates of the unknown source directions [331]. The situation arises when some of the source directions are already known. This method removes the components of the signal induced by these known sources from the data matrix and then uses the modified data matrix for DOA estimation. It is achieved by projecting the data matrix onto the orthogonal complement of the space spanned by the steering vectors associated with the known source directions. It is a matrix operation. The process reduces the dimension of the signal subspace by a number equal to the number of known sources and improves the quality of the estimate, particularly when the known sources are strong or correlated with the unknown sources.

4) Beam-Space MUSIC: The MUSIC algorithms described above process the snapshots received from sensor elements without any preprocessing, such as forming beams, and thus may be thought of as element-space algorithms. This is in contrast to a beam-space MUSIC algorithm, where the array data are passed through a beam-forming processor before applying MUSIC or any other DOA estimation algorithm. The output of the beam-forming processor may be thought of as a set of beams, and thus the processing using these data is normally referred to as beam-space processing. A number of DOA estimation schemes are discussed in [337] and [338], where data are obtained by forming multiple beams using an array. DOA estimation in beam space has a number of advantages, such as reduced computation, improved resolution, reduced sensitivity to system errors, reduced resolution threshold, reduced bias in the estimate, and so on [333], [339]-[342].

These advantages arise from the fact that a beam former is used to form a number of beams that is less than the number of elements in the array, and thus one needs to process less data for DOA estimation. One may think of this process in terms of the degrees of freedom of the array. The element-space methods have degrees of freedom equal to the number of elements in the array, whereas the degrees of freedom of beam-space methods equal the number of beams formed by the beam-forming filter. Thus, the process reduces the degrees of freedom of the array. Normally, one needs only M + 1 degrees of freedom to resolve M sources. The root-MUSIC algorithm discussed for the element-space case may also be applied to this case, giving rise to beam-space root-MUSIC [341], [342]. It enjoys the computational savings offered by beam-space methods compared to element-space methods in general.
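A minimal sketch of the spectral MUSIC search for a single source (M = 1) follows. It assumes a four-element uniform linear array with half-wavelength spacing and an ideal correlation matrix, and it replaces a full eigendecomposition with power iteration for the one signal eigenvector, using the identity $U_N U_N^{H} = I - U_S U_S^{H}$ to evaluate the noise-subspace projection. These simplifications and all names are assumptions for illustration.

```python
import cmath
import math

def steering(theta_deg, L, d=0.5):
    """Steering vector of an L-element uniform linear array (spacing d in wavelengths, assumed)."""
    c = math.cos(math.radians(theta_deg))
    return [cmath.exp(2j * math.pi * d * l * c) for l in range(L)]

def dominant_eigenvector(R, iters=50):
    """Power iteration for the eigenvector of the largest eigenvalue of R."""
    L = len(R)
    u = [1 + 0j] + [0j] * (L - 1)   # first steering-vector entry is 1, so this start is never orthogonal
    for _ in range(iters):
        v = [sum(R[i][j] * u[j] for j in range(L)) for i in range(L)]
        nrm = math.sqrt(sum(abs(c) ** 2 for c in v))
        u = [c / nrm for c in v]
    return u

def music_spectrum(u_s, s):
    """1 / |s^H U_N|^2, computed as 1 / (s^H s - |u_s^H s|^2) for a one-dimensional signal subspace."""
    proj = sum(a.conjugate() * b for a, b in zip(u_s, s))
    denom = sum(abs(c) ** 2 for c in s) - abs(proj) ** 2
    return 1.0 / max(denom, 1e-12)   # guard against division by zero at the exact peak

L = 4
s0 = steering(60.0, L)
# Ideal correlation matrix: unit-power source at 60 degrees plus white noise of power 0.1
R = [[s0[i] * s0[j].conjugate() + (0.1 if i == j else 0.0) for j in range(L)]
     for i in range(L)]
u_s = dominant_eigenvector(R)
spectrum = {th: music_spectrum(u_s, steering(th, L)) for th in range(0, 181)}
peak = max(spectrum, key=spectrum.get)
```

With an estimated rather than ideal correlation matrix, and with more than one source, the same search applies once $U_N$ is obtained from an eigenvalue or singular value decomposition.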

H. Min-Norm Method

The min-norm method [314], [343] is applicable for ULA and finds the DOA estimate by searching for the location of peaks in the spectrum [344]

\[ P_{MN}(\theta) = \frac{1}{|\mathbf{w}^{H}\mathbf{s}_\theta|^{2}} \tag{164} \]

by calculating an array weight $\mathbf{w}$, which is of minimum norm, has its first element equal to unity, and is contained in the noise subspace. The solution of the above problem leads to the following expression for the spectrum [344]-[346]

\[ P_{MN}(\theta) = \frac{1}{|\mathbf{s}_\theta^{H} U_N U_N^{H}\,\mathbf{u}_1|^{2}} \tag{165} \]

with the vector $\mathbf{u}_1$ denoting all zeros except the first element, which is equal to unity. As the method is applicable for ULA, the optimization problem to solve for the array weight may be transformed to a polynomial rooting problem, leading to a root-min-norm method similar to root-MUSIC. A comparison of the performance of the two [347] indicates that the variance in the estimate obtained by root-MUSIC is smaller than or equal to that of the root-min-norm method. Schemes to speed up the DOA estimation algorithm of min-norm and to reduce computations are discussed in [344] and [348].

I. CLOSEST Method

This method is useful for locating sources in a selected sector. Contrary to beam-space methods, which work by first forming beams in selected directions, it operates in the element space and in that sense is an alternative to beam-space MUSIC. In a way, it is a generalization of the min-norm method. It searches for array weights in the noise subspace that are close to the steering vectors corresponding to the DOA's in the sector under consideration; thus the name "CLOSEST" method. Depending upon the definition of the closeness, it leads to various schemes. A method referred to as FINE selects an array weight vector by minimizing the angle between the selected vector and the subspace spanned by the steering vectors corresponding to the DOA's in the selected sector. In short, the method replaces the vector $\mathbf{u}_1$ used in the min-norm method with a suitable vector depending upon the definition of the closeness used. More details about the selection of these vectors and the relative merits of the CLOSEST method are provided in [349].

A number of eigenstructure methods reported in the literature exploit specialized array structures or noise scenarios. Two methods using uniform circular arrays are presented in [350] that extend beam-space MUSIC and ESPRIT algorithms (to be discussed in Section IV-J) for two-dimensional angle estimation, including an analysis of MUSIC to resolve two sources in the presence of gain, phase, and location errors. Properties of the array have also been exploited in [351] to find the azimuth and the elevation of a directional source. Two DOA estimation schemes in an unknown noise field using two separate arrays proposed in [352] appear to offer a superior performance compared to their conventional counterparts. Advantages of minimum redundancy linear arrays are discussed in [341]. It has been shown that by using such arrays, one may be able to resolve more than L(L - 1)/2 sources using L elements.

The other direction-finding methods applicable to an unknown noise field are described in [325] and [353]-[356]. The MAP method presented in [354] and [355] is based on Bayesian analysis, and estimated results are not asymptotically consistent, that is, the results may be biased [352]. The method in [356], referred to as CANAL, may be implemented using analog hardware, thus eliminating the need for sampling, data storage, and so on. A DOA estimation method in the presence of correlated arrivals using an array of unrestricted geometry is discussed in [357].

J. ESPRIT

ESPRIT [358] is a computationally efficient and robust method of DOA estimation. It uses two identical arrays in the sense that array elements need to form matched pairs with an identical displacement vector, that is, the second element of each pair ought to be displaced by the same distance and in the same direction relative to the first element. This, however, does not mean that one has to have two separate arrays. The array geometry should be such that the elements could be selected to have this property. For example, a ULA of four identical elements with an interelement spacing d may be thought of as two arrays of three matched pairs, one with the first three elements and one with the last three elements such that the first and second elements form one pair, the second and third elements form another pair, and so on. The two arrays are displaced by the distance d.

The way that ESPRIT [358] exploits this subarray structure for DOA estimation is now briefly described. Let the signals induced on the lth pair due to a narrowband source in direction $\theta$ be denoted by $x_l(t)$ and $y_l(t)$. The phase difference between these two signals depends upon the time taken by the plane wave arriving from the source under consideration to travel from one element to the other. As the two elements are separated by the displacement $\Delta_0$, it follows that

\[ y_l(t) = x_l(t)\,e^{j2\pi\Delta_0\cos\theta} \tag{166} \]

where $\Delta_0$ is measured in wavelengths. Note that $\Delta_0$ is the magnitude of the displacement vector. This vector sets the reference direction, and all angles are measured with reference to this vector. Let the array signals received by the two arrays be denoted by $\mathbf{x}(t)$ and $\mathbf{y}(t)$. These are given by

\[ \mathbf{x}(t) = A\,\mathbf{s}(t) + \mathbf{n}_x(t) \tag{167} \]

and

\[ \mathbf{y}(t) = A\Phi\,\mathbf{s}(t) + \mathbf{n}_y(t) \tag{168} \]

where A is a K by M matrix, with its columns denoting the M steering vectors corresponding to M directional sources associated with the first subarray; $\Phi$ is an M by M diagonal matrix, with its mth diagonal element given by

\[ \Phi_{m,m} = e^{j2\pi\Delta_0\cos\theta_m} \tag{169} \]

$\mathbf{s}(t)$ denotes M source signals induced on a reference element; and $\mathbf{n}_x(t)$ and $\mathbf{n}_y(t)$, respectively, denote the noise induced on the elements of the two subarrays. Comparing the equations for $\mathbf{x}(t)$ and $\mathbf{y}(t)$, it follows that the steering vectors corresponding to M directional sources associated with the second subarray are given by $A\Phi$.

Let $U_x$ and $U_y$ denote two K by M matrixes with their columns denoting the M eigenvectors corresponding to the largest eigenvalues of the two array correlation matrixes $R_{xx}$ and $R_{yy}$, respectively. As these two sets of eigenvectors span the same M-dimensional signal space, it follows that these two matrixes $U_x$ and $U_y$ are related by a unique nonsingular transformation matrix $\psi$, that is

\[ U_x\,\psi = U_y. \tag{170} \]

Similarly, these matrixes are related to steering vector matrixes $A$ and $A\Phi$ by another unique nonsingular transformation matrix T, as the same signal subspace is spanned by these steering vectors. Thus

matrix of the full array of L elements. Then select the first K < L rows of U to form $U_x$ and the last of its K rows to form $U_y$.

4) Form a 2M by 2M matrix

\[ \begin{bmatrix} U_x^{H} \\ U_y^{H} \end{bmatrix} \begin{bmatrix} U_x & U_y \end{bmatrix} \]

and find its eigenvectors. Let these eigenvectors be the columns of a matrix V.

5) Partition V into four M by M matrixes as

\[ V = \begin{bmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{bmatrix}. \tag{176} \]

6) Calculate the eigenvalues $\lambda_m$, m = 1, ..., M of the matrix

\[ \psi = -V_{12}V_{22}^{-1} \]

and

\[ \theta_m = \cos^{-1}\left\{\frac{\mathrm{Arg}(\lambda_m)}{2\pi\Delta_0}\right\}, \qquad m = 1, \ldots, M. \tag{177} \]

Use of ESPRIT for DOA estimation using an array at a base station in the reverse link of a mobile communications system has been studied in [368].

which states that the eigenvalues of 1/) are equal to the diagonal elements of and that the columns of Tare eigenvectors of 1/). This is the main relationship in the development of ESPRIT [358]. It requires an estimate of 1/) from the measurement K(t) and y(t). An eigendecomposition of 'V) gives its eigenvalues, and by equating them to leads to the DOA estimates

21f Llo

_,-1

co~

[359], beam-space ESPRIT for uniform rectangular array [364], resolution-enhanced ESPRIT [360], virtual interpolated array ESPRIT [362], multiple invariance ESPRIT [365], higher order ESPRIT [366], and procrustes rotationbased ESPRIT [367].

(173)

== cos- 1 {Arg(Ant) }

-

Other ESPRIT vanations include beam-space ESPRIT

( 172)

Substituting for U'; and Uy and the fact that A is of full rank, one obtains

em

== 1, ... ,M of the

7) Estimate the angle of arrival Orn using

(171 )

Uy == ..4T.

( 175)

y ]

K. WSF Method The WSF method [369], [370] is a unified approach to schemes like MLM, MUSIC, and ESPRIT. It requires knowledge of the number of directional sources. The method finds the DOA such that the weighted version of a matrix whose columns are the steering vectors associated with these directions is close to a data-dependent matrix. The data-dependent matrix could be a Hermitian square root of the array correlation matrix or a matrix whose columns are the eigenvectors associated with the largest eigenvalues of the array correlation matrix. The framework proposed in the method can be used for deriving common numerical algorithms for various eigenstructure methods as well as for their performance studies. Its application for mobile communications employing an array at the base station has been investigated in [58] and [371].

(174)

How one obtains an estimate of 4) from the array signal measurements efficiently has led to many versions of ESPRIT [358]-[363]. One version, refered to as TLS ESPRIT [358], [359], is summarized below. 1) Make measurements from two identical subarrays,

which are displaced by ~o. Estimate the two array correlation matrixes from the measurements and find their eigenvalues and eigenvectors. 2) Find the number of directional sources Musing available methods (some are described later in this section).

L. Other Methods

3) Form the two matrixes with their columns being the M eigenvectors associated with the largest eigenvalues of each correlation matrix. Let these be denoted by [Ix and Uy . For a ULA, this could be done by first forming an L by M matrix U by selecting its columns as the M eigenvectors associated with the largest eigenvalues of the estimated array correlation

A number of methods that do not require eigenvalue decomposition are discussed in [372]-[379]. The method proposed in [372] is applicable for a linear array of $L$ elements. It forms a $K$ by $K$ correlation matrix from one snapshot with $K \ge M$, and is based on the QR orthonormal decomposition [380] of this correlation matrix, with $\mathbf{Q}$ being a $K$ by $K$ unitary matrix and $\mathbf{R}$ being an upper triangular matrix. The last $K - M$ columns of $\mathbf{Q}$ define a set of orthonormal basis vectors for the noise space. Denoting these columns by $\mathbf{U}_N$, the directions of sources are obtained from the peaks of the spectrum of (178). This method is computationally efficient, and its performance is comparable to that of MUSIC [372].

A multiple-source location method based on a matrix decomposition approach is presented in [373]. The method requires knowledge of the noise power estimate and is applicable for coherent as well as noncoherent arrivals. It does not require knowledge of the number of sources. The method discussed in [374] exploits the cyclostationarity [381] of data that may exist in certain situations. This method has significant implementation advantages, and its performance is comparable with the other methods. Another method [375] that combines accuracy with a low computation requirement using polynomial rooting exploits polarization diversity of the arrays. These arrays have the capability of separating signals based on the polarization characteristics and thus have an advantage over uniformly polarized arrays [382], [383]. An adaptive scheme based on Kalman filtering to estimate the noise subspace is presented in [377], which then is combined with root-MUSIC to estimate DOA. The method has good convergence characteristics. The method presented in [376] uses a deconvolution approach to the output of a conventional processor to a localized source, whereas those discussed in [378] and [379] use a neural-network approach to direction finding.

The discussion on DOA estimation thus far has concentrated on estimating the directions of stationary narrow-band sources. Though an extension of a narrow-band direction-finding scheme to the broad-band case is not trivial, some of the methods discussed here have been extended to estimate the directions of broad-band sources. A discussion of these and other schemes is contained in [313] and [384]-[394]. The methods described in [384]-[386], [389], and [393] are based upon a signal subspace approach, whereas those discussed in [388], [394] and [390], [395] are related to the ESPRIT method and the ML method, respectively.

The application of high-resolution direction-finding methods to estimate the directions of moving sources and to track these sources may be found in [396]-[400]. The problem of estimating the mean DOA of spatially distributed sources such as exist in base-mobile communications systems has been examined in [401] and [402].

M. Preprocessing Techniques

A number of techniques are used to process data before using direction-finding methods for DOA estimation, particularly in situations where directional sources are correlated or coherent. Correlation of directional sources may exist due to multipath propagation. It tends to reduce the rank of the array correlation matrix. A correlation matrix may be tested for source coherency by applying the rank profile test described in [403]. Most preprocessing techniques try to either restore this rank deficiency in the correlation matrix or modify it to be useful for the DOA estimation methods.

One scheme, referred to as the spatial smoothing method, has been widely studied in the literature [404]-[416] and is applicable for a linear array. In its basic form, it decorrelates the correlated arrivals by subdividing the array into a number of smaller overlapping subarrays and then averaging the array correlation matrixes obtained from each subarray. The number of subarrays obtained from an array depends upon the number of elements used in each subarray. For example, using $K$ elements in each subarray, one may form $L - K + 1$ subarrays from an array of $L$ elements by forming the first subarray using elements 1 to $K$, the second subarray using elements 2 to $K + 1$, and so on. The number and size of the subarrays are determined from the number of directional sources under consideration. For $M$ sources, one needs a subarray size of $M + 1$ and a number of subarrays greater than or equal to $M$ [404]. Thus, to estimate the directions of $M$ sources, one requires an array size of $L = 2M$, which could be reduced to $3M/2$ by using improved spatial smoothing methods [405], [407], also known as forward-backward spatial smoothing. This process uses the average of the correlation matrix obtained from the forward subarray scheme described above, which subdivides the array starting from one side of the array, and the complex conjugate of the matrix obtained from the backward subarray method, which subdivides the array starting from the other side. The $m$th subarray matrix $\bar{\mathbf{R}}_m$ of the backward method is related to the forward method matrix $\mathbf{R}_m$ by

$$\bar{\mathbf{R}}_m = \mathbf{J}_0 \mathbf{R}_m^{*} \mathbf{J}_0 \tag{179}$$

where $\mathbf{J}_0$ is a reflection matrix, with all its elements along the secondary diagonal equal to unity and elsewhere equal to zero. The process is similar to that used by forward-backward prediction for bearing estimation [408].

An improved spatial smoothing method [410] uses correlation between all elements of the array rather than correlation between elements of subarrays, as is normally done, to improve the performance of the spatial smoothing method. The method described in [409] and [411] removes the effects of sensor noise to make spatial smoothing more effective in low-SNR situations. This spatial filtering method is further refined in [417] to offer DOA estimates of coherent sources with reduced RMS errors. A decorrelation analysis of spatial smoothing [412] shows that there exists an upper bound on the number of subarrays and that the maximum distance between the subarrays depends upon the fractional bandwidth of the signals. A comprehensive analysis [413] of the use of spatial smoothing as a preprocessing technique to weighted ESPRIT and MUSIC methods of DOA estimation shows how their performance could be improved by the proper choice of the number of subarrays and weighting matrixes. An application of ESPRIT to estimate source directions and polarization shows the improvement in its performance in the presence of coherent arrivals when it is combined with the spatial smoothing method [418].

The spatial smoothing methods using subarray arrangements reduce the effective aperture of the array as well as the degrees of freedom, and thus one needs a higher number of elements to process correlated arrivals than otherwise required. The schemes that do not reduce the effective size of the array include those that restore the structure of the array correlation matrix for a linear array to that when there is no correlation. These are referred to as structured methods [419], [420]. For a linear equispaced array, the correlation matrix in the absence of correlated arrivals has a Toeplitz structure, that is, the elements of the matrix along its diagonals are equal. The correlation between sources destroys this structure. In [419], this is restored by simple averaging along the diagonals of the matrix obtained in the presence of correlated arrivals, while in [420], a weighted average is used. A method of DOA estimation using the array correlation matrix structured by averaging along its diagonals, discussed in [421], appears to offer computational advantages over similar methods.

Some other preprocessing schemes to decorrelate the correlated sources include random permutation [414], mechanical movement using a circular disk [422], construction of a preprocessing matrix using approximate knowledge of a DOA estimate [423], signal subspace transformation in the spatial domain [424], unitary transformation method [425], and methods based on aperture interpolations [415], [426], [427].
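The rank-restoring effect of forward-backward spatial smoothing can be illustrated with a small noise-free sketch. It is only an illustration under stated assumptions, not the procedure of any cited reference: a four-element ULA with half-wavelength spacing, two fully coherent equal-power sources at 60 and 100 degrees, and subarrays of three elements. The helper names (`steering`, `fb_spatial_smoothing`, `matrix_rank`) are hypothetical, and the backward term uses the reflection and conjugation of each forward subarray matrix.

```python
import cmath
import math

def steering(n_elem, theta_deg, spacing=0.5):
    """ULA steering vector; spacing in wavelengths, angle measured from the array axis."""
    phase = 2 * math.pi * spacing * math.cos(math.radians(theta_deg))
    return [cmath.exp(1j * phase * i) for i in range(n_elem)]

def outer(u):
    """Rank-one matrix u u^H."""
    return [[a * b.conjugate() for b in u] for a in u]

def fb_spatial_smoothing(R, K):
    """Average forward subarray matrices R_m and backward ones J conj(R_m) J."""
    L = len(R)
    n_sub = L - K + 1
    Rs = [[0j] * K for _ in range(K)]
    for m in range(n_sub):
        for i in range(K):
            for j in range(K):
                fwd = R[m + i][m + j]
                # (J conj(R_m) J)[i][j] = conj(R_m[K-1-i][K-1-j])
                bwd = R[m + K - 1 - i][m + K - 1 - j].conjugate()
                Rs[i][j] += (fwd + bwd) / (2 * n_sub)
    return Rs

def matrix_rank(M, tol=1e-9):
    """Numerical rank by Gaussian elimination with partial pivoting."""
    A = [row[:] for row in M]
    n = len(A)
    r = 0
    for c in range(len(A[0])):
        if r == n:
            break
        p = max(range(r, n), key=lambda i: abs(A[i][c]))
        if abs(A[p][c]) < tol:
            continue
        A[r], A[p] = A[p], A[r]
        for i in range(r + 1, n):
            f = A[i][c] / A[r][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
    return r

# Two coherent sources: the array sees the single vector a1 + a2 (assumed scenario).
a1 = steering(4, 60.0)
a2 = steering(4, 100.0)
R_coherent = outer([u + v for u, v in zip(a1, a2)])
R_smoothed = fb_spatial_smoothing(R_coherent, K=3)

print("rank before smoothing:", matrix_rank(R_coherent))
print("rank after smoothing: ", matrix_rank(R_smoothed))
```

In this sketch the unsmoothed matrix has rank one, so a subspace method would see a single source, while the smoothed three-by-three matrix recovers rank two. The price is the reduced aperture noted in the text: three effective elements instead of four.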

N. Estimating the Number of Sources

Many of the high-resolution direction-finding methods require the number of directional sources, and their performance is dependent on perfect knowledge of this number. Some methods for estimating the number of these sources are discussed here. The method most commonly referred to for detecting the number of sources was first introduced in [428] based on AIC [429] and Rissanen's MDL [430] principle. The method was further analyzed in [431] and [432] and modified in [433] and [434]. A variation of the method that is applicable to coherent sources is discussed in [325], [435], and [436]. Briefly, the method works as follows [428], [432].

1) Estimate the array correlation matrix from $N$ independent and identically distributed samples.

2) Find the $L$ eigenvalues $\lambda_i$, $i = 1, \ldots, L$ of the correlation matrix such that $\lambda_1 > \lambda_2 > \cdots > \lambda_L$.

3) Estimate the number of sources $M$ by solving

$$\underset{M}{\text{minimize}} \;\; N(L - M)\log\left\{ \frac{f_1(M)}{f_2(M)} \right\} + f_3(M, N) \tag{180}$$

where

$$f_1(M) = \frac{1}{L - M} \sum_{i=M+1}^{L} \lambda_i \tag{181}$$

$$f_2(M) = \left( \prod_{i=M+1}^{L} \lambda_i \right)^{1/(L-M)} \tag{182}$$

and the penalty function

$$f_3(M, N) = \begin{cases} M(2L - M), & \text{for AIC} \\ \tfrac{1}{2} M(2L - M)\log N, & \text{for MDL} \end{cases} \tag{183}$$

with $L$ denoting the number of elements in the array.

A modification of the method based on the MDL principle applicable to coherent sources is discussed in [435] and is further refined in [325] and [436] to improve the performance. A parametric method that does not require knowledge of the eigenvalues of the array correlation matrix is discussed in [437]. It has a better performance than some of the other methods discussed but is computationally more complex.

All methods that partition the eigenvalues of the array correlation matrix rely on the fact that the $M$ eigenvalues corresponding to $M$ directional sources are larger than the remaining $L - M$ eigenvalues corresponding to the background noise; they differ in how the threshold is selected. One of the earliest methods [438] used a hypothesis-testing procedure based upon the confidence interval of noise eigenvalues, and the assignment of the threshold was subjective. A method referred to as an eigenthreshold method [439] uses a one-step prediction of the threshold for differentiating the smallest eigenvalues from the others. The method has a better performance than AIC and MDL. It has a threshold at a lower value of SNR than that of MDL and has a lower error rate than that of AIC at high SNR [439].

An alternate scheme for estimating the number of sources present uses the eigenvectors of the array correlation matrix, unlike other methods, which use the eigenvalues, and is discussed in [440]. The method is referred to as the eigenvector detection technique. It is applicable to a cluster of sources whose approximate directions are known and is able to estimate the number of sources at a lower SNR than that of AIC and MDL.

In practice, the number of sources an array may be able to resolve depends not only on the number of elements in the array but also on the array geometry, available number of snapshots, and spatial distribution of sources. Discussion on these and other issues related to the capability of an array to uniquely resolve a number of sources may be found in [441]-[443] and the references therein.

O. Performance Comparison

Performance analysis of direction-finding schemes has been carried out by many researchers [317], [336], [339], [340], [444]-[462]. The performance measures considered for analysis include bias, variance, resolution, CRLB, and probability of resolution. Among the most studied [339],

[340], [444]-[454] direction-finding schemes is MUSIC. Most of these studies concentrate on performance and performance comparison with other methods when a finite number of samples are used for direction finding rather than their ensemble average. An asymptotic analysis of MUSIC with forward/backward spatial smoothing in the presence of correlated arrivals shows [444] that to estimate two angles of arrival of equal power under identical conditions requires more snapshots for correlated sources than for uncorrelated sources [454]. A rigorous bias analysis of MUSIC shows [447] that estimates are biased. For a linear array in the presence of a single source, the bias increases as the source moves away from broadside. Interestingly, the bias also increases as the number of elements in the array is increased, keeping the aperture fixed. Bias and STD are complicated functions of the array geometry, SNR, and number and directions of sources, and vary in a way inversely proportional to the number of snapshots. A poorer estimate generally results from using fewer snapshots and from sources with lower SNR. It is shown in [340] and [447] that the performance of conventional MUSIC is poor in the presence of correlated arrivals and fails to resolve coherent sources. Even though bias and STD both play important roles in direction estimation, the effect of bias near the threshold region is critical. A comparison of the performance of MUSIC with those of min-norm and FINE for a finite sample case [448] shows that in the low-SNR range, the min-norm estimates have the largest STD, and the MUSIC estimates have the largest bias. As these results are dependent on the SNR of the source, the performance of all three approaches the same limit as the SNR is increased. The overall performance of FINE is better than the other two in the absence of correlated arrivals. The estimates obtained by the MUSIC and ML methods are compared with CRLB in [445] and [446] for a large sample case. 
The CRLB gives the theoretically lowest value of the covariance of an unbiased estimator. It decreases with the number of samples, number of sensors in the array, and SNR's of the sources [445]. The study concludes that the MUSIC estimates are the large sample realization of the ML estimates in the presence of uncorrelated arrivals. Furthermore, it shows that the variance of the MUSIC estimate is more than that of the ML estimate, and the two approach each other as the number of elements and the number of snapshots increase. Thus, using an array with a large number of elements and a large number of samples, one is able to make excellent estimates of directions of uncorrelated sources with large SNR using the MUSIC method [445]. It should be noted that the estimates of the ML method are unbiased [460]. An unbiased estimate is referred to as a consistent estimate. An improvement in the MUSIC DOA estimation is possible by using beam-space MUSIC [339], [340]. By properly selecting a beam-forming matrix and then using the MUSIC scheme to estimate DOA, one is able to reduce the threshold level of the required SNR to resolve the closely spaced

sources [339]. Though the variance of this estimate is not much different from the element-space case, it has less bias [340]. The resolution threshold of beam-space MUSIC is lower than that of the conventional min-norm method. For two closely spaced sources, however, beam-space MUSIC and beam-space min-norm provide identical performances when suitable beam-forming matrixes are selected [339]. It is shown in [453] that when beam-forming weights have conjugate symmetry (useful only for arrays with particular symmetry), beam-space MUSIC has a decorrelation property similar to backward/forward smoothing. Thus, it is useful for correlated arrival-source estimation and offers performance advantages in terms of lower variance for the estimated angle. The resolution property of MUSIC is further analyzed in [449]-[452] and [454], which show how it depends upon the SNR, number of snapshots, array geometry, and separation angle of the two sources. Analytical expressions of probability of resolution and its variation as a function of various parameters [452] could enable one to predict the behavior of the MUSIC estimate for a given scenario. The two closely spaced sources are said to be resolved when two peaks appear in the spectrum in the vicinity of the source directions. A comparison of the performances of MUSIC and other eigenvector methods, which use the noise eigenvectors divided by the corresponding eigenvalues for DOA estimation, indicates [317] that the performance of the former is more sensitive to the choice of an assumed number of sources compared to the actual number of sources. A performance analysis of many versions of ESPRIT is considered in [336] and [456]-[458] and compared with other methods. Estimates obtained by subspace rotation methods, which include TAM and ESPRIT, have larger variance than those obtained by MUSIC using a large number of samples [456]. Estimates by ESPRIT using a uniform circular array are asymptotically unbiased [458]. 
LS-ESPRIT and TAM estimates are statistically equivalent. LS-ESPRIT and TLS-ESPRIT have the same MSE [336]. Their performance depends upon how the subarrays are selected [457]. The min-norm method is equivalent to TLS-ESPRIT [463], and root-MUSIC outperforms ESPRIT [464]. TAM is based on the state-space model and finds a DOA estimate from signal subspace. In spirit, its approach is similar to ESPRIT [336]. For Gaussian signals, the WSF method and ML method are efficient, as both attain CRLB asymptotically [455], [459]. A method is said to be efficient when it achieves CRLB. The correlation between the sources affects the capabilities of various DOA estimation algorithms differently [465]. A study [461] of the effect of the correlation between two sources on the accuracy of DOA-finding schemes shows that the phase of the correlation is more significant than the correlation magnitude. Most of the performance analysis discussed so far assumes that the background noise is white. When this is not the case, the DOA schemes perform differently. In the presence of colored background noise, the performance of MUSIC is better than that of ESPRIT

and the min-norm method over a wide range of SNR. The performance of the min-norm method is worse than those of the other two [466].

P. Sensitivity Analysis

A sensitivity analysis of MUSIC to various perturbations is presented in [467]-[472]. A compact expression for the error covariance of the MUSIC estimates given in [467] may be used to evaluate the effect of various perturbation parameters, including gain and phase errors, the effect of mutual coupling, channel errors, and random perturbations in sensor locations. It should be noted that the MUSIC estimate of DOA requires knowledge of the number of sources, similar to some other methods, and underestimation of the source number may lead to an inaccurate estimate of DOA's [468]. A variance expression for a DOA estimate for this case has been provided in [468]. An analysis of the effect of model errors on the MUSIC resolution threshold [333], [470] and on the waveforms estimated using MUSIC [469] indicates that the probability of resolution decreases [470] with the error variance and that the sensitivity to phase errors depends more upon array aperture than on the number of elements [469] in a linear array. The effect of gain and phase error on the MSE of the MUSIC estimate of a general array is analyzed in [473]. The problem of estimating gain and phase errors of sensors whose locations are known is considered in [471]. An analysis [472] of ESPRIT under random sensor uncertainties suggests that the MUSIC estimates generally give lower MSE than ESPRIT. The former is more sensitive to both sensor gain and phase errors, whereas the latter depends only on phase errors. The study further suggests that for a linear array with a large number of elements, the MSE of the ESPRIT estimate with maximum overlapping subarrays is lower than that with nonoverlapping subarrays. The effect of gain and phase errors on weighted eigenspace methods, including MUSIC, min-norm, FINE, and CLOSEST, is studied in [474] by deriving bias and variance expressions. It indicates that the effect is gradual up to a point, and then the increase in error magnitude causes an abrupt deterioration in the bias and variance of the estimate. The weighted methods differ from the standard ones in that a weighting matrix is used in the estimate, and that matrix could be optimized to improve the quality of the estimate under particular perturbation conditions.

The effect of nonlinearity in the system, such as the hard clipping common in digital beam formers, on spectral estimation methods in general is analyzed in [475], which shows that such distortion may be eliminated by additional preprocessing. The effect of various perturbation methods on spectral estimation methods emphasizes the importance of a precise knowledge of various array parameters. There are various techniques to calibrate arrays, some of which are discussed in [476] and [477] and the references therein. There are schemes such as that discussed in [478] to estimate the steering vector and, in turn, the DOA from uncalibrated arrays and in [479] to estimate DOA. Discussions on robustness issues of direction-finding algorithms may be found in [480] and [481]. A summary of the performance and sensitivity comparison of various DOA estimation schemes is provided in Tables 1-12.

Table 1
Performance Summary of Bartlett Method

| Name of method | Property | Comments, Comparison and References |
|---|---|---|
| Bartlett Method | Bias | Biased [465]; Bartlett > LP > MLM |
| | Resolution | Depends upon array aperture |
| | Sensitivity | Robust to element position errors [280] |
| | Array | General array |

Table 2
Performance Summary of MVDR Method

| Name of method | Property | Comments, Comparison and References |
|---|---|---|
| MVDR Method | Bias | Unbiased |
| | Variance | Minimum |
| | Resolution | MVDR > Bartlett [42, 279]; does not have best resolution of any method [280] |
| | Array | General array |

Table 3
Performance Summary of MEM

| Name of method | Property | Comments, Comparison and References |
|---|---|---|
| Maximum Entropy Method | Bias | Biased [465] |
| | Resolution | ME > MVDR > Bartlett [279]; can resolve at lower SNR than Bartlett [42] |

V. EFFECT OF ERRORS

The communications system using an array of antenna elements considered so far is assumed to be free from errors and perturbations, and the results on various beamforming schemes, adaptive algorithms, and DOA methods are based upon ideal error-free conditions. In real systems, these idealistic situations are hardly met, and the system performance is affected by the amount that the various system parameters deviate from the assumed conditions. Some of these deviations are discussed in this section.

Table 4
Performance Summary of Linear Prediction Method

| Name of method | Property | Comments, Comparison and References |
|---|---|---|
| Linear Prediction Method | Bias | Biased [465] |
| | Resolution | LP > MVDR [280] > Bartlett > ME [465] |
| | Performance | Good in low SNR conditions; applicable for correlated arrivals [288] |

Table 5
Performance Summary of MLM

| Name of method | Property | Comments, Comparison and References |
|---|---|---|
| MLM Method | Bias | Unbiased [460]; less than LP, Bartlett [465], MUSIC [299] |
| | Variance | Less than MUSIC for small samples [298]; asymptotically efficient for random signals [455, 459]; not efficient for finite samples [299]; less efficient for deterministic signals than random signals [455]; asymptotically efficient for deterministic signal using very large array [460] |
| | Computation | Intensive with large samples [299] |
| | Performance | Same for deterministic and random signals for large arrays [460]; applicable for correlated arrivals [298]; works with one sample [298] |

Table 6
Performance Summary of Element-Space MUSIC Method

| Name of method | Property | Comments, Comparison and References |
|---|---|---|
| Element Space MUSIC | Bias | Biased [447, 358] |
| | Variance | Less than ESPRIT and TAM for large samples [456, 457], min norm [349, 448]; close to MLM [446], CLOSEST [349], FINE [448]; variance of weighted MUSIC is more than unweighted MUSIC [446]; asymptotically efficient for large array [445, 446] |
| | Resolution | Limited by bias [446] |
| | Array | Applicable for general array; increasing aperture makes it robust [333] |
| | Performance | Fails to resolve correlated sources |
| | Computation | Intensive |
| | Sensitivity | Array calibration is critical [467]; sensitivity to phase error depends more on array aperture than number of elements [469]; preprocessing can improve resolution [470]; correct estimate of source number is important [468]; MSE depends upon gain and phase errors and is lower than that for ESPRIT [472]; increase in gain and phase errors beyond certain value causes an abrupt deterioration in bias and variance [474] |

Table 7
Performance Summary of Beam-Space MUSIC Method

| Name of method | Property | Comments, Comparison and References |
|---|---|---|
| Beam Space MUSIC | Bias | Less than ES MUSIC [340] |
| | Variance | Larger than ES MUSIC [482] |
| | RMS Error | Less than ESPRIT, min norm [466] |
| | Resolution | Similar to BS min norm, CLOSEST [349]; better than ES MUSIC, ES min norm [339, 340]; threshold SNR decreases as the separation between the sources increases [454] |
| | Computation | Less than ES MUSIC |
| | Sensitivity | Robust compared to ES MUSIC |

A. Correlated Arrivals

The interference-canceling capabilities of the optimal beam formers discussed earlier assume that the signal and interference are uncorrelated. The correlation between the desired signal and an unwanted interference exists in situations of multipath arrivals and deliberate jamming. It affects the performance of the beam former, as discussed in [52], [419], [420], and [483]-[495], and limits the applicability of various weight estimation schemes. For example, when the weights are estimated by minimizing the mean output power subject to a look-direction constraint, the beam former cancels the desired signal while maintaining the constraint. The reason this happens is that while minimizing the mean output power, the beam former adjusts the phase of the correlated interference induced on each antenna such that the power of the sum of the signal and the interference, which is correlated with the signal, is minimized, causing the signal cancellation. This is consistent with the design that the beam former minimizes the output power. The design of the optimal weights is based upon the assumption that the signal is not correlated with the interferences. The correlation bxy(f) between two broad-band signals x(t) and y(t) is defined in terms of their power spectrum [496] (184)

130

Table 8

Table 11

Performance Summary of Root-MUSIC Method

Name of method

Property

Comments, Comparison and References

•

•

Variance

Performance Summary of ESPRIT Method

Name of method

Less than Root min norm [347], ESPRIT

Property

Comments, Comparison and References

•

•

TLS ESPRrf unbiased [35~,458J

•

LS ESPRIT biased [358]

Bias

[457] •

Resolution

BS Root MUSIC has better probability of

•

•

resolution than BS MUSIC (341, 342] Root MUSIC

•

•

RMS Error

•

Less than LS ESPRIT (464)

•

RMS Error

•

ESPRIT

•

Less than min norm [466]

•

Tl.S similar to LS [336]

•

Variance

Less than MUSIC for large samples and difference increases with number of clements

Method

•

Array

•

Equispaced linear array

•

Performance

•

Better than spectral MUSIC [341, 464]

•

Similar to TLS ESPRIT at SNR lower than

in

•

Computation

•

Less than MUSIC [362]

•

BS ESPRIT needs less computation than BS Root MUSIC and ES ESPRIT [359]

MUSIC threshold [457] BS Root MUSIC is similar to ES Root

•

array [456]

•

Method

•

LS ESPRIT is similar to TAM [336]

•

Array

•

Needs doublets, No calibration needed

•

Performance

•

MUSIC [341, 342]

Table 9

Performance Summary of Min-Norm Method

Name of method

•

Minimum Norm

Property

Comments, Comparison and References

•

Bias

•

Less than MUSIC [448, 454]

•

Resolution

•

Better than CLOSEST [349], ES MUSIC

•

Method

CLOSEST

Sensitivity

•

Table 12

Variance

•

Similar to ES MUSIC [349]

•

Resolution

•

Similar to BS MUSIC,

•

Better than min norm [349]

•

Sensitivity

better than LS ESPRIT [457]

MSE is lowest for maximum overlapping subarrays under sensor perturbation [472]

•

Performance

IS

MSE robust to sensor gain errors [472]

Equivalent to TLS [463]

Comments, Comparison and References

•

better than

Robust than MUSIC and can not handle

•

Property

Method

IS

correlated sources [457]

Performance Summary of CLOSEST Method

Name of method

•

•

TLS ESPRIT

• •

Table 10

•

[349,454]

Method

Optimal weighted ESPRIT

uniformed weighted ESPRIT [456]

Performance Summary of FINE Method

Name of method

•

FINE Method

•

Good in clustered situation [349]

•

An increase in sensor gain and phase errors

Property

Comments, Comparison and References

•

Bias

•

Less than MUSIC [448]

•

Resolution

•

Better than MUSIC and min norm [448]

•

Variance

•

Less than min norm [448]

•

Performance

•

Good at low SNR

beyond certain value causes an abrupt deterioration in bias and variance [474]

with $G_{xy}(f)$ denoting the cross-power spectrum. It is related to the cross-correlation function

$$\rho_{xy}(\tau) = E[x(t)\,y(t+\tau)] \tag{185}$$

by the Fourier transform

$$G_{xy}(f) = \int_{-\infty}^{\infty} \rho_{xy}(\tau)\, e^{j2\pi f\tau}\, d\tau. \tag{186}$$

Rewriting the correlation matrix for the case of two correlated directional sources as

$$R = A S A^{H} + \sigma_n^{2} I \tag{187}$$

with the source correlation matrix given by

$$S = \begin{bmatrix} p_s & \sqrt{p_s p_i}\,\rho^{*} \\ \sqrt{p_s p_i}\,\rho & p_i \end{bmatrix} \tag{188}$$

shows how the correlation between the two sources affects R. It follows from these expressions that when two sources are uncorrelated (that is, $|\rho| = 0$), S is a diagonal matrix, guaranteeing R to be positive definite (assuming A is of full rank, which requires that steering vectors corresponding to all directional sources are linearly independent [35]). The presence of correlation affects the rank of S and thus of R. In the presence of correlation, the matrix R becomes ill conditioned and may not be invertible, making

Thus, the correlation between the signal and an interference, hereafter denoted as $\rho$, is a complex scalar with magnitude $0 \le |\rho| \le 1$ and lies within the unit circle. When the magnitude is equal to one, the two sources are said to be coherent. The correlation between two sources affects the rank of the correlation matrix, causing it to become singular.
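The effect of the correlation on S, and the signal cancellation it produces in a look-direction-constrained beam former, can be sketched numerically. The following is a minimal illustration, not the paper's own procedure: it assumes a two-element array with half-wavelength spacing, a signal at 0 degrees, an interference at 30 degrees, unit source powers, a noise power of 0.01, and a particular placement of $\rho$ and $\rho^{*}$ in S. All function names are hypothetical.

```python
import cmath
import math

def steering_vector(theta_deg, n=2):
    """Steering vector of an n-element, half-wavelength-spaced linear array (assumed geometry)."""
    phase = math.pi * math.sin(math.radians(theta_deg))
    return [cmath.exp(1j * phase * m) for m in range(n)]

def source_matrix(p_s, p_i, rho):
    """2x2 Hermitian source correlation matrix S; placement of rho vs. rho* is an assumption."""
    cross = math.sqrt(p_s * p_i) * complex(rho)
    return [[complex(p_s), cross.conjugate()],
            [cross, complex(p_i)]]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def array_correlation(s0, s1, S, noise_power):
    """R = A S A^H + sigma_n^2 I for two elements, with A = [s0 s1]."""
    A = [[s0[0], s1[0]], [s0[1], s1[1]]]
    R = [[0j, 0j], [0j, 0j]]
    for i in range(2):
        for j in range(2):
            R[i][j] = sum(A[i][k] * S[k][l] * A[j][l].conjugate()
                          for k in range(2) for l in range(2))
        R[i][i] += noise_power
    return R

def constrained_output_power(R, s0):
    """Mean output power 1 / (s0^H R^-1 s0) when output power is minimized
    subject to a unity response in the look direction s0."""
    d = det2(R)
    R_inv = [[R[1][1] / d, -R[0][1] / d],
             [-R[1][0] / d, R[0][0] / d]]
    q = sum(s0[i].conjugate() * R_inv[i][j] * s0[j]
            for i in range(2) for j in range(2))
    return 1.0 / q.real

s_look = steering_vector(0.0)    # desired signal direction (assumed)
s_intf = steering_vector(30.0)   # interference direction (assumed)

# det(S) = p_s * p_i * (1 - |rho|^2): it vanishes for coherent sources.
det_uncorrelated = det2(source_matrix(1.0, 1.0, 0.0)).real
det_coherent = det2(source_matrix(1.0, 1.0, 1.0)).real

power_uncorrelated = constrained_output_power(
    array_correlation(s_look, s_intf, source_matrix(1.0, 1.0, 0.0), 0.01), s_look)
power_coherent = constrained_output_power(
    array_correlation(s_look, s_intf, source_matrix(1.0, 1.0, 1.0), 0.01), s_look)
```

In the uncorrelated case the output power stays at or above the unit signal power, because the constraint passes the signal and only the interference and noise are reduced. In the coherent case the output power falls well below the signal power, which is the signal cancellation described in this section.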

a


A scheme that does not reduce the degree of freedom of the array is described in [419]. It decorrelates the sources by structuring the correlation matrix to be Toeplitz by averaging along each diagonal and uses the resulting matrix to estimate the weights of the full array. An adaptive algorithm to estimate the weights of an array based upon this principle is presented in [498], and the concept is extended to broad-band beam forming in [499].

A beam-forming scheme [52] based upon master and slave concepts cancels the correlated arrival by the use of two channels. In one channel, the look-direction signal is blocked, and then weights are estimated by solving the constrained beam-forming problem. These weights are then used on the second channel. As the signal is not present at the time of weight estimation, the beam former does not cancel the signal. However, the process only works for one correlated interference. It is extended for a multiple correlated interference case in [486] where an array of 2M - 1 elements is required to cancel M - 1 interferences. The other schemes that require some knowledge of the interference, such as direction or the correlation matrix due to interference only, can be found in [485], [489], [491], [494], and [495].

Many of these schemes improve the array performance in the presence of correlated arrivals by treating the correlated components as interferences and canceling them by forming nulls in their directions using beam-forming techniques. These methods do not utilize the correlated components, as is done in diversity combining (discussed previously), where different components are added in some way to improve the signal level. A receiver known as the RAKE receiver [184], [500]-[502] achieves this increase in signal level for a CDMA satellite system by using a number of demodulators operating in parallel to track each component using the user code for that signal.
The delay in the signal is identified by sliding the code sequence as required to obtain the maximum correlation with the received component. The signals are added at the baseband after appropriate delay and amplitude scaling. The receiver, however, does not cancel unwanted interferences by shaping the beam pattern.
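The delay search described for the RAKE receiver can be illustrated with a toy example. This is only a sketch under simplifying assumptions that are not in the text: a real-valued length-7 spreading code, circular chip-spaced delays, no noise, and two paths with amplitudes 1.0 and 0.5. The function names and the amplitude-scaling rule in `rake_combine` are hypothetical choices for illustration.

```python
def delayed(code, d):
    """Circularly delay a chip sequence by d chips."""
    n = len(code)
    return [code[(k - d) % n] for k in range(n)]

def sliding_correlation(received, code):
    """Correlate the received chips with every circular shift of the user code."""
    return [sum(r * c for r, c in zip(received, delayed(code, d)))
            for d in range(len(code))]

def rake_fingers(received, code, n_fingers):
    """Return (delay, correlation) for the n_fingers strongest code alignments,
    ordered by delay."""
    corr = sliding_correlation(received, code)
    strongest = sorted(range(len(corr)), key=lambda d: corr[d], reverse=True)[:n_fingers]
    return [(d, corr[d]) for d in sorted(strongest)]

def rake_combine(fingers):
    """Add the finger outputs after scaling each by its amplitude relative to the
    strongest finger (one possible scaling, assumed here)."""
    peak = max(abs(c) for _, c in fingers)
    return sum((c / peak) * c for _, c in fingers)

# Length-7 binary code with a sharp circular autocorrelation peak (assumed user code).
CODE = [1, 1, 1, -1, -1, 1, -1]

# Two multipath components: amplitude 1.0 at delay 2 and amplitude 0.5 at delay 5.
received = [1.0 * a + 0.5 * b for a, b in zip(delayed(CODE, 2), delayed(CODE, 5))]

fingers = rake_fingers(received, CODE, 2)
combined = rake_combine(fingers)
```

Each finger locks to the delay at which the sliding correlation peaks, and the combined output exceeds the output of the strongest finger alone. As the text notes, this combining raises the signal level but does nothing to shape the beam pattern.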

it difficult to estimate the weights of the optimal beam former, which relies on the existence of the inverse of R. Thus, a beam-forming scheme that is optimal in the absence of a correlated arrival is not able to cancel a correlated interference. Many beam-forming schemes have been devised to cancel an interference source, which is correlated with the signal. In principle, these work by restoring the rank of R. In some earlier work [52], [497], a mechanical movement of the array perpendicular to the look direction was suggested to reduce the signal-cancellation effect by the correlated interference. The scheme generally known as a spatial dither algorithm works on the principle that as the movement is perpendicular to the look direction, the signal induced in the array is not affected, whereas the interference that arrives from a direction different than that of the signal becomes modulated with this motion. This causes a reduction in the interference as noted in [492], where the dither algorithm is further developed such that a mechanical movement is not required. The spatial smoothing scheme [416], as discussed earlier, uses the same idea of spatial averaging by subdividing the array into smaller subarrays and estimates the array correlation by averaging the correlation matrixes estimated from each such subarray. The use of spatial smoothing for beam forming is discussed in [484] and [487] and shows that the use of this method reduces effective correlation between the interference and the desired signal, resulting in a reduction in signal cancellation caused by the optimal beam forming. The spatial smoothing method uses uniform averaging of all the matrixes obtained from different subarrays, that is, each matrix is weighted equally. This results in an estimate of the matrix that is not as good as one could have obtained from given subarray matrixes. 
Ideally, in the absence of correlation, the array correlation matrix for a uniformly spaced linear array has a Toeplitz structure, that is, elements of the matrix along each diagonal are equal, and the matrix estimated by the spatial smoothing scheme is not the closest to a Toeplitz matrix. A closer estimate is obtained by a spatial averaging technique [420], [422], which weights each subarray matrix differently and then optimizes the weights to minimize the MSE between the weighted matrix and a Toeplitz matrix. The system that results from using this matrix to estimate the weights of the beam former rejects more interference than one based on a uniformly weighted matrix estimate. It should be noted that the number of rows and columns in the estimated matrix is equal to the number of elements in the subarray and not equal to the number of elements in the full array. Thus, the weights estimated by this matrix could only be applied to one of the subarrays. Consequently, not all elements of the array are used for beam forming. This reduces the array aperture and its degree of freedom. For an environment consisting of M directional interferences and the desired signal, the size of the subarray should be at least M + 2 and the number of subarrays should be at least M(M + 1) + 1 [420].
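The uniform averaging used by spatial smoothing can be sketched as follows. This is a minimal illustration under assumed conditions: a noise-free three-element array with half-wavelength spacing, two equal-power coherent sources at 0 and 30 degrees, and two overlapping two-element subarrays. These sizes are chosen only to keep the example small and do not follow the subarray guidelines quoted from [420]. The function names are hypothetical.

```python
import cmath
import math

def steering_vector(theta_deg, n):
    """Steering vector of an n-element, half-wavelength-spaced linear array (assumed geometry)."""
    phase = math.pi * math.sin(math.radians(theta_deg))
    return [cmath.exp(1j * phase * m) for m in range(n)]

def outer(x):
    """Rank-one matrix x x^H."""
    return [[xi * xj.conjugate() for xj in x] for xi in x]

def subarray_matrix(R, start, size):
    """Correlation matrix of the subarray formed by elements start..start+size-1."""
    return [[R[start + i][start + j] for j in range(size)] for i in range(size)]

def spatially_smoothed(R, size):
    """Uniform (equal-weight) average of the matrices of all overlapping subarrays."""
    n_sub = len(R) - size + 1
    blocks = [subarray_matrix(R, k, size) for k in range(n_sub)]
    return [[sum(b[i][j] for b in blocks) / n_sub for j in range(size)]
            for i in range(size)]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Two coherent sources: the array sees a single combined wavefront, so R has rank one.
x = [a + b for a, b in zip(steering_vector(0.0, 3), steering_vector(30.0, 3))]
R_full = outer(x)

first_block = subarray_matrix(R_full, 0, 2)
smoothed = spatially_smoothed(R_full, 2)

det_first_block = abs(det2(first_block))   # singular: sources still coherent
det_smoothed = det2(smoothed).real         # nonzero: rank restored by averaging
```

A single subarray matrix is singular because the two arrivals are coherent, whereas the averaged matrix has a nonzero determinant. The price, as noted above, is that the resulting matrix has the dimension of the subarray rather than of the full array.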

B. Look Direction and Steering Vector Error

Knowledge of the look direction is used to constrain the array response in the direction of the signal such that the signal arriving from the look direction is passed through the array processor undistorted. The array weights of the optimal beam former are estimated by minimizing the mean output power subject to the look-direction constraint. The processor maximizes the output SNR by canceling all the interferences. A directional source is treated as an interference if it is not in the look direction. This shows the importance of the accuracy of the knowledge of the look direction. An error occurs when the look direction is not the same as the desired signal direction. For this case, the processor treats the desired signal source as an interference and attenuates it. The amount of attenuation depends upon the power of the signal and the amount of error [16], [42],

general discussion of the effect of various errors on the array pattern is provided in [516]. The position of the antenna elements of an array is normally determined by a calibration process requiring auxiliary sources in known locations [517], [518]. A procedure that does not require the location of these sources is described in [508] and [519]. The element failure tends to cause an increase in sidelobe level, and the weights estimated for the full array do not remain optimal [513]. This requires recalculation of the optimal weight with the knowledge of the failed elements taken into account [513], [514].

[90], [503]. A stronger signal is canceled more and a larger error causes more cancellation of the signal. The solution to beam-pointing error is to make the beam broader so that when the signal is not precisely in the direction where it should be (the look direction), its cancellation does not take place. The normal methods of broadening the beams include multiple linear constraints [16], [504] and norm constraints [121]. The latter constraints prohibit the main beam from blowing out, as is the case in the presence of pointing error. In the process of canceling a source close to the point constraint in the look direction, the array response is increased in the direction opposite to the pointing error. A scheme that does not require broadening of the main beam to reduce the effect of pointing error has been reported in [505]. It makes use of direction-finding techniques combined with a reduced dimensional ML formulation to estimate the direction of the desired signal accurately. Effectiveness of the scheme in mobile communications situations has been demonstrated by computer simulations. The study presented in [90] indicates that the beam-space processors in general are more robust to pointing errors than element-space processors. Some other schemes to remedy pointing errors may be found in [506]-[508]. The knowledge of the look direction appears in the weight calculation through the steering vector. The optimal weight calculation for constrained beam forming requires knowledge of the array correlation matrix and the steering vector in the look direction. Thus, the pointing error causes an error to occur in the steering vector, which is used for weight calculation. The steering vector may also be erroneous due to other factors such as imperfection in the knowledge of the position of array elements, errors caused by finite word length arithmetic, and so on. The study of the effect of steering vector errors has been reported in [29], [507], and [509]. 
An analytical study performed by modeling the error as an additive random error [29] indicates that the effect of error is severe in the SPNMI processor, that is, when the array correlation matrix, which is used to estimate the weights, contains the signal. As the signal power increases, the performance of the processor deteriorates further due to errors. The sensitivity of the processor to the steering vector may be decreased by using a combination of the reference signal and steering vector to estimate the weights [510].

D. Weight Errors

Array weights are calculated using ideal conditions, stored in memory, and implemented using amplifiers and phase shifters. A theoretical study of the performance of the system assumes the ideal error-free weights, whereas the actual performance of the system is dependent upon the implemented weights. The amplitude as well as the phase of these weights are different from the ideal ones, and these differences arise from many types of errors caused at various points in the system, including:

• deviation from the assumption that a plane wave arrives at the array;
• uncertainty in the positions and characteristics of the elements of the array;
• error in the knowledge of the array correlation matrix caused by its estimation from a finite sample and arithmetic;
• error in the steering vector or the reference signal used to calculate weights;
• computational error caused by finite-precision arithmetic;
• quantization error in converting the analog weights into digital form for storage;
• implementation error caused by component variation.

Studies of weight errors have been conducted by modeling these errors as random fluctuations in weights [29], [520]-[524] or by modeling them as errors in the amplitude and phase [514], [525]-[529]. Performance indexes to measure the effect of errors include the array gain [29], [525], reduction in null depth [520], reduction in interference rejection capability [523], change in side-lobe level [514], [526], [527], and bias in the angle of arrival estimation [528]. The array gain is the ratio of the output SNR to the input SNR. The effect of random weight fluctuation is to cause a reduction in the array gain. The effect is sensitive to the number of elements in the array and the array gain of the error-free system [29]. For an array with a large number of elements and with a large error-free gain, a large weight fluctuation could reduce its array gain to unity, implying that output SNR becomes equal to input SNR and no array gain is obtainable.

C. Element Failure and Element Position Error

Uncertainty in the position of an element of an array causes degradation in the array performance in general [511]-[515], and more so when the array beam pattern is determined by constrained beam forming. As discussed previously, the element position uncertainty causes steering vector error, leading to a lower array gain. The effect of the position uncertainty on the beam pattern is to create a background beam pattern similar to that of a single element in addition to the normal pattern of the array [515]. A


The phase of the array weight is an important parameter, and an error in the phase may cause an estimate of the source to appear in a wrong direction when an array is used for finding DOA (see, for example, [528]). The phase control of signals is used to steer the main beam of the array in desired positions, as in electronic steering. A device normally used for this purpose is a phase shifter. Those commonly available are ferrite phase shifters and diode phase shifters [20], [530]. One of the specifications with which an array designer is concerned is the RMS phase error. Analysis of the RMS phase error shows that it causes the output SNR of the constrained optimal process to suppress the desired signal, and the suppression is proportional to the product of the signal power and the variance of the random error [531]. Furthermore, the suppression is maximum in the absence of directional interferences. An error that occurs in digital phase shifters is quantization error. In a p-bit digital phase shifter, the minimum value of the phase that can be changed equals $2\pi/2^{p}$. Assuming that the error is distributed uniformly between $-\pi/2^{p}$ and $\pi/2^{p}$, the variance of this error equals $\pi^{2}/(3 \times 2^{2p})$ [531]. The effect of perturbation in the media, which causes the wavefront to deviate from the plane wave propagation assumption, and other related issues may be found in [532]-[534]. The effect of a finite number of samples used in weight estimation is considered in [535]-[537], and how bandwidth affects the performance of a narrowband beam former is discussed in [74] and [538]. The effect of amplitude and phase errors on a mobile satellite communications system using a spherical array employing digital beam forming is studied in [171].
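The quantization figures quoted above for a p-bit phase shifter can be checked with a short calculation. The sketch below assumes rounding to the nearest phase level and a uniform distribution of the requested phase, as in the text. The function names and the grid-based empirical check are illustrative choices, not taken from [531].

```python
import math

def phase_step(p_bits):
    """Smallest phase change of a p-bit digital phase shifter: 2*pi / 2^p."""
    return 2.0 * math.pi / 2 ** p_bits

def quantization_error_variance(p_bits):
    """Variance of an error uniform on [-pi/2^p, pi/2^p]: pi^2 / (3 * 2^(2p))."""
    return math.pi ** 2 / (3.0 * 2 ** (2 * p_bits))

def quantize_phase(phi, p_bits):
    """Round a phase to the nearest available level (assumed rounding rule)."""
    step = phase_step(p_bits)
    return round(phi / step) * step

def empirical_error_variance(p_bits, n_points=10000):
    """Mean-square quantization error over phases spread evenly across one step."""
    step = phase_step(p_bits)
    total = 0.0
    for k in range(n_points):
        phi = (k + 0.5) * step / n_points
        err = quantize_phase(phi, p_bits) - phi
        total += err * err
    return total / n_points

step_3bit = phase_step(3)                        # pi/4 radians, i.e. 45 degrees
variance_3bit = quantization_error_variance(3)   # pi^2 / 192
empirical_3bit = empirical_error_variance(3)
```

Each additional bit halves the phase step and reduces the error variance by a factor of four.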

with their convergence characteristics and computational requirements. A detailed treatment of various methods of estimating the DOA's has been provided by including the description, limitation, and capability of each method and their performance comparison as well as their sensitivity to parameter perturbations. This paper provides references to studies where array beam-forming and DOA schemes are considered for mobile communications systems. This aspect of array signal processing was dealt with in Part I of this paper in much more detail by describing how an array could be used for mobile communications and how its use could improve the performance of such systems as well as by discussing the feasibility of an array system in a mobile communications environment.

REFERENCES

[1] "Special issue on active and adaptive antennas," IEEE Trans. Antennas Propagat., vol. AP-12, Mar. 1964.
[2] "Special issue on adaptive antennas," IEEE Trans. Antennas Propagat., vol. AP-24, Sept. 1976.
[3] "Special issue on adaptive processing antenna systems," IEEE Trans. Antennas Propagat., vol. AP-34, Mar. 1986.
[4] "Special issue on adaptive systems and applications," IEEE Trans. Circuits Syst., vol. CAS-34, July 1987.
[5] "Special issue on beamforming," IEEE J. Oceanic Eng., vol. OE-10, July 1985.
[6] "Special issue on underwater acoustic signal processing," IEEE J. Oceanic Eng., vol. OE-12, Jan. 1987.
[7] J. E. Hudson, Adaptive Array Principles. London: Peregrinus, 1981.
[8] R. A. Monzingo and T. W. Miller, Introduction to Adaptive Arrays. New York: Wiley, 1980.
[9] S. Haykin, Ed., Array Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1985.
[10] B. Widrow and S. D. Stearns, Adaptive Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1985.
[11] R. T. Compton, Jr., Adaptive Antennas: Concepts and Performances. Englewood Cliffs, NJ: Prentice-Hall, 1988.
[12] M. T. Ma, Theory and Application of Antenna Arrays. New York: Wiley, 1974.
[13] J. D. Marr, "A selected bibliography on adaptive antenna arrays," IEEE Trans. Aerosp. Electron. Syst., vol. AES-22, pp. 781-788, 1986.
[14] K. Takao, M. Fujita, and T. Nishi, "An adaptive antenna array under directional constraint," IEEE Trans. Antennas Propagat., vol. AP-24, pp. 662-669, 1976.
[15] L. Stark, "Microwave theory of phased-array antennas-A review," Proc. IEEE, vol. 62, pp. 1661-1701, 1974.
[16] N. L. Owsley, "A recent trend in adaptive spatial processing for sensor arrays: Constrained adaptation," in Signal Processing, J. W. R. Griffiths et al., Eds. New York: Academic, 1973.
[17] A. M. Vural, "An overview of adaptive array processing for sonar application," EASCON'75 Rec., pp. 34A-34M.
[18] P. M. Schultheiss, "Some lessons from array processing theory," in Aspects of Signal Processing, part 1, G. Tacconi, Ed. Dordrecht-Holland: Reidel, 1977, pp. 309-331.
[19] H. A. d'Assumpcao, "Some new signal processors for array of sensors," IEEE Trans. Inform. Theory, vol. IT-26, pp. 441-453, 1980.
[20] R. J. Mailloux, "Phased array theory and technology," Proc. IEEE, vol. 70, pp. 246-291, 1982.
[21] S. Haykin, J. P. Reilly, V. Kezys, and E. Vertatschitsch, "Some aspects of array signal processing," Proc. Inst. Elect. Eng., vol. 139, pt. F, pp. 1-19, 1992.
[22] H. A. d'Assumpcao and G. E. Mountford, "An overview of signal processing for arrays of receivers," J. Inst. Eng. Aust. and IREE Aust., vol. 4, pp. 6-19, 1984.
[23] B. D. Van Veen and K. M. Buckley, "Beamforming: A versatile approach to spatial filtering," IEEE Aerosp. Electron. Syst. Mag., vol. 5, pp. 4-24, 1988.

E. Robust Beam Forming

The perturbation of many array parameters from their ideal conditions under which the theoretical performance of the system is predicted causes degradation in the system performance by reducing the array gain and altering the beam pattern. Various schemes have been proposed to overcome these problems and to enhance the array system performance under nonideal conditions [90], [121], [539]-[546]. Many of these schemes impose various kinds of constraints on the beam pattern to alleviate the problem caused by parameter perturbation. A survey of robust signal processing techniques in general is conducted in [547]. It contains an excellent reference list and discusses various issues concerning robustness.

VI. CONCLUSION

This paper has dealt with many facets of array signal processing and beam forming. The emphasis has been on presenting the results in a manner suitable for nonspecialists. This paper has introduced the concepts of beam forming and has provided details of various beam-forming schemes. Many of the available iterative schemes applicable to adaptive beam forming have been described, along

[24] S. P. Applebaum, "Adaptive arrays," IEEE Trans. Antennas Propagat., vol. AP-24, pp. 585-598, 1976.
[25] O. L. Frost III, "An algorithm for linearly constrained adaptive array processing," Proc. IEEE, vol. 60, pp. 926-935, 1972.
[26] W. F. Gabriel, "Adaptive arrays-An introduction," Proc. IEEE, vol. 64, pp. 239-272, 1976.
[27] B. Widrow, P. E. Mantey, L. J. Griffiths, and B. B. Goode, "Adaptive antenna systems," Proc. IEEE, vol. 55, pp. 2143-2158, 1967.
[28] B. Widrow, J. R. Glover, J. M. McCool, J. Kaunitz, C. S. Williams, R. H. Hearn, J. R. Zeidler, E. Dong, Jr., and R. C. Goodlin, "Adaptive noise canceling: Principles and applications," Proc. IEEE, vol. 63, pp. 1692-1716, 1975.
[29] L. C. Godara, "Error analysis of the optimal antenna array processors," IEEE Trans. Aerosp. Electron. Syst., vol. AES-22, pp. 395-409, 1986.
[30] H. Krim and M. Viberg, "Two decades of array signal processing: The parametric approach," IEEE Signal Processing Mag., pp. 67-94, July 1996.
[31] J. Munier and G. Y. Delisle, "Spatial analysis in passive listening using adaptive techniques," Proc. IEEE, vol. 75, pp. 1458-1471, 1987.
[32] S. Sivanand, "On adaptive arrays in mobile communication," in Proc. IEEE National Telesystems Conf., Atlanta, GA, 1993, pp. 55-58.
[33] V. C. Anderson and P. Rudnick, "Rejection of a coherent arrival at an array," J. Acoust. Soc. Amer., vol. 45, pp. 406-410, 1969.
[34] V. C. Anderson, "DICANNE, a realizable adaptive process," J. Acoust. Soc. Amer., vol. 45, pp. 398-405, 1969.
[35] L. C. Godara and A. Cantoni, "Uniqueness and linear independence of steering vectors in array space," J. Acoust. Soc. Amer., vol. 70, pp. 467-475, 1981.
[36] Y. Bresler, V. U. Reddy, and T. Kailath, "Optimum beamforming for coherent signal and interferences," IEEE Trans. Acoust., Speech, Signal Processing, vol. 36, pp. 833-843, 1988.
[37] S. Choi, T. K. Sarkar, and S. S. Lee, "Design of two-dimensional Tseng window and its application to antenna array for the detection of AM signal in the presence of strong jammers in mobile communications," Signal Process., vol. 34, pp. 297-310, 1993.
[38] I. Chiba, T. Takahashi, and Y. Karasawa, "Transmitting null beam forming with beam space adaptive array antennas," in Proc. IEEE 44th Vehicular Technology Conf., Stockholm, Sweden, 1994, pp. 1498-1502.
[39] B. Friedlander and B. Porat, "Performance analysis of a null-steering algorithm based on direction-of-arrival estimation," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 461-466, 1989.
[40] I. S. Reed, J. D. Mallett, and L. E. Brennan, "Rapid convergence rate in adaptive arrays," IEEE Trans. Aerosp. Electron. Syst., vol. AES-10, pp. 853-863, 1974.
[41] L. E. Brennan and I. S. Reed, "Theory of adaptive radar," IEEE Trans. Aerosp. Electron. Syst., vol. AES-9, pp. 237-252, 1973.
[42] H. Cox, "Resolving power and sensitivity to mismatch of optimum array processors," J. Acoust. Soc. Amer., vol. 54, pp. 771-785, 1973.
[43] J. Capon, "High-resolution frequency-wave number spectrum analysis," Proc. IEEE, vol. 57, pp. 1408-1418, 1969.
[44] I. J. Gupta and A. A. Ksienski, "Dependence of adaptive array performance on conventional array design," IEEE Trans. Antennas Propagat., vol. AP-30, pp. 549-553, 1982.
[45] I. J. Gupta, "Effect of jammer power on the performance of adaptive arrays," IEEE Trans. Antennas Propagat., vol. AP-32, pp. 933-934, 1984.
[46] J. H. Winters, "Optimum combining in digital mobile radio with cochannel interference," IEEE J. Select. Areas Commun., vol. SAC-2, pp. 528-539, 1984.
[47] J. H. Winters, "Optimum combining for indoor radio systems with multiple users," IEEE Trans. Commun., vol. COM-35, pp. 1222-1230, 1987.
[48] B. Suard, A. F. Naguib, G. Xu, and A. Paulraj, "Performance of CDMA mobile communication systems using antenna arrays," IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP), Minneapolis, MN, 1993, pp. 153-156.
[49] A. F. Naguib and A. Paulraj, "Performance of CDMA cellular networks with base-station antenna arrays," in Proc. IEEE Int. Zurich Seminar on Communications, 1994, pp. 87-100.
[50] C. L. Zahm, "Application of adaptive arrays to suppress strong jammers in the presence of weak signals," IEEE Trans. Aerosp. Electron. Syst., vol. AES-9, pp. 260-271, 1973.
[51] S. P. Applebaum and D. J. Chapman, "Adaptive arrays with main beam constraints," IEEE Trans. Antennas Propagat., vol. AP-24, pp. 650-662, 1976.
[52] B. Widrow, K. M. Duvall, R. P. Gooch, and W. C. Newman, "Signal cancellation phenomena in adaptive antennas: Causes and cures," IEEE Trans. Antennas Propagat., vol. AP-30, pp. 469-478, 1982.
[53] R. T. Compton, Jr., "The power-inversion adaptive array: Concepts and performance," IEEE Trans. Aerosp. Electron. Syst., vol. AES-15, pp. 803-814, 1979.
[54] L. J. Griffiths, "A comparison of multidimensional Wiener and maximum-likelihood filters for antenna arrays," Proc. IEEE, vol. 55, pp. 2045-2047, 1967.
[55] A. Flieller, P. Larzabal, and H. Clergeot, "Applications of high resolution array processing techniques for mobile communication system," in Proc. IEEE Intelligent Vehicles Symp., Paris, France, 1994, pp. 606-611.
[56] J. H. Winters, J. Salz, and R. D. Gitlin, "The impact of antenna diversity on the capacity of wireless communication systems," IEEE Trans. Commun., vol. 42, pp. 1740-1751, 1994.
[57] S. Choi and T. K. Sarkar, "Adaptive antenna array utilizing the conjugate gradient method for multipath mobile communication," Signal Process., vol. 29, pp. 319-333, 1992.
[58] S. Anderson, M. Millnert, M. Viberg, and B. Wahlberg, "An adaptive array for mobile communication systems," IEEE Trans. Veh. Technol., vol. 40, pp. 230-236, 1991.
[59] T. Gebauer and H. G. Gockler, "Channel-individual adaptive beamforming for mobile satellite communications," IEEE J. Select. Areas Commun., vol. 13, pp. 439-448, 1995.
[60] J. F. Diouris, B. Feuvrie, and J. Saillard, "Adaptive multisensor receiver for mobile communications," Ann. Telecommun., vol. 48, pp. 35-46, 1993.
[61] P. W. Howells, "Explorations in fixed and adaptive resolution at GE and SURC," IEEE Trans. Antennas Propagat., vol. AP-24, pp. 575-584, 1976.
[62] L. J. Griffiths and C. W. Jim, "An alternative approach to linearly constrained adaptive beamforming," IEEE Trans. Antennas Propagat., vol. AP-30, pp. 27-34, 1982.
[63] L. J. Griffiths, "An adaptive beamformer which implements constraints using an auxiliary array processor," in Aspects of Signal Processing, part 2, G. Tacconi, Ed. Dordrecht-Holland: Reidel, 1977, pp. 517-522.
[64] C. W. Jim, "A comparison of two LMS constrained optimal array structures," Proc. IEEE, vol. 65, pp. 1730-1731, 1977.
[65] A. Cantoni and L. C. Godara, "Fast algorithms for time domain broadband adaptive array processing," IEEE Trans. Aerosp. Electron. Syst., vol. AES-18, pp. 682-699, 1982.
[66] B. D. Van Veen and R. A. Roberts, "Partially adaptive beamformer design via output power minimization," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-35, pp. 1524-1532, 1987.
[67] B. D. Van Veen, "An analysis of several partially adaptive beamformer designs," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 192-203, 1989.
[68] ___, "Optimization of quiescent response in partially adaptive beamformers," IEEE Trans. Acoust., Speech, Signal Processing, vol. 38, pp. 471-477, 1990.
[69] F. Qian and B. D. Van Veen, "Partially adaptive beamformer design subject to worst case performance constraints," IEEE Trans. Signal Processing, vol. 42, pp. 1218-1221, 1994.
[70] ___, "Partially adaptive beamforming for correlated interference rejection," IEEE Trans. Signal Processing, vol. 43, pp. 506-515, 1995.
[71] D. J. Chapman, "Partially adaptivity for the large array," IEEE Trans. Antennas Propagat., vol. AP-24, pp. 685-696, 1976.
[72] D. R. Morgan, "Partially adaptive array techniques," IEEE Trans. Antennas Propagat., vol. AP-26, pp. 823-833, 1978.
[73] A. Cantoni and L. C. Godara, "Performance of a postbeamformer interference canceller in the presence of broadband directional signals," J. Acoust. Soc. Amer., vol. 76, pp. 128-138, 1984.
[74] L. C. Godara and A. Cantoni, "The effect of bandwidth on the performance of post beamformer interference canceller," J. Acoust. Soc. Amer., vol. 80, pp. 794-803, 1986.
[75] L. C. Godara, "Analysis of transient and steady state weight covariance in adaptive postbeamformer interference canceller," J. Acoust. Soc. Amer., vol. 85, pp. 194-201, 1989.
[76] ___, "Postbeamformer interference canceller with improved performance," J. Acoust. Soc. Amer., vol. 85, pp. 202-213, 1989.
[77] ___, "Adaptive postbeamformer interference canceller with improved performance in the presence of broadband directional sources," J. Acoust. Soc. Amer., vol. 89, pp. 266-273, 1991.
[78] E. Brookner and J. M. Howell, "Adaptive-adaptive array processing," Proc. IEEE, vol. 74, pp. 602-604, 1986.
[79] J. T. Mayhan, "Adaptive nulling with multiple beam antennas," IEEE Trans. Antennas Propagat., vol. AP-26, pp. 267-273, 1978.
[80] R. Klemm, "Suppression of jammers by multiple beam signal processing," in Proc. IEEE Int. Radar Conf., Sendai, Japan, 1975, pp. 176-180.
[81] J. Gobert, "Adaptive beam weighting," IEEE Trans. Antennas Propagat., vol. AP-24, pp. 744-749, 1976.
[82] C. L. Dolph, "A current distribution for broadside arrays which optimizes the relationship between beamwidth and sidelobe levels," Proc. IRE, vol. 34, pp. 335-348, 1946.
[83] L. J. Griffiths and K. M. Buckley, "Quiescent pattern control in linearly constrained adaptive arrays," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-35, pp. 917-926, 1987.
[84] C. Y. Tseng and L. J. Griffiths, "A simple algorithm to achieve desired patterns for arbitrary arrays," IEEE Trans. Signal Processing, vol. 40, pp. 2737-2746, 1992.
[85] R. J. Webster and T. N. Lang, "Prescribed sidelobes for the constant beam width array," IEEE Trans. Acoust., Speech, Signal Processing, vol. 38, pp. 727-730, 1990.
[86] M. H. Er, S. L. Sim, and S. N. Koh, "Application of constrained optimization techniques to array pattern synthesis," Signal Process., vol. 34, pp. 327-334, 1993.
[87] M. Simaan, "Optimum array filters for array data signal processing," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-31, pp. 1006-10015, 1983.
[88] D. E. N. Davies, "Independent angular steering of each zero of the directional pattern for a linear array," IEEE Trans. Antennas Propagat., vol. AP-15, pp. 296-298, 1967.
[89] B. D. Van Veen, "Adaptive convergence of linearly constrained beamformers based on the sample covariance matrix," IEEE Trans. Signal Processing, vol. 39, pp. 1470-1473, 1991.
[90] L. C. Godara, "A robust adaptive array processor," IEEE Trans. Circuits Syst., vol. CAS-34, pp. 721-730, 1987.
[91] N. K. Jablon, "Adaptive beamforming with the generalized sidelobe canceller in the presence of array imperfections," IEEE Trans. Antennas Propagat., vol. AP-34, pp. 996-1012, 1986.
[92] W. F. Gabriel, "Using spectral estimation techniques in adaptive processing antenna systems," IEEE Trans. Antennas Propagat., vol. AP-34, pp. 291-300, 1986.
[93] Y. L. Su, T. J. Shan, and B. Widrow, "Parallel spatial processing: A cure for signal cancellation in adaptive arrays," IEEE Trans. Antennas Propagat., vol. AP-34, pp. 347-355, 1986.
[94] P. Kawala and U. H. Sheikh, "Adaptive multiple-beam array for wireless communications," in Proc. Inst. Elect. Eng. 8th Int. Conf. Antennas and Propagation, Edinburgh, Scotland, 1993, pp. 970-974.
[95] W. Chujo and K. Yasukawa, "Design study of digital beam forming antenna applicable to mobile satellite communications," IEEE Antennas and Propagation Symp. Dig., Dallas, TX, pp. 400-403, 1990.
[96] I. Chiba, W. Chujo, and M. Fujise, "Beamspace constant modulus algorithm adaptive array antennas," in Proc. Inst. Elect. Eng. 8th Int. Conf. Antennas and Propagation, Edinburgh, Scotland, 1993, pp. 975-978.
[97] M. A. Jones and M. A. Wickert, "Direct sequence spread spectrum using directionally constrained adaptive beamforming to null interference," IEEE J. Select. Areas Commun., vol. 13, pp. 71-79, 1995.
[98] S. Sakagami, S. Aoyama, K. Kuboi, S. Shirota, and A. Akeyama, "Vehicle position estimates by multibeam antennas in multipath environments," IEEE Trans. Veh. Technol., vol. 41, pp. 63-68, 1992.
[99] T. Tanaka, R. Miura, I. Chiba, and Y. Karasawa, "An ASIC implementation scheme to realize a beam space CMA adaptive array antenna," IEICE Trans. Commun., vol. E78-B, pp. 1467-1473, Nov. 1995.
[100] W. E. Rodgers and R. T. Compton, Jr., "Adaptive array bandwidth with tapped delay line processing," IEEE Trans. Aerosp. Electron. Syst., vol. AES-15, pp. 21-28, 1979.
[101] J. T. Mayhan, A. J. Simmons, and W. C. Cummings, "Wideband adaptive nulling using tapped delay lines," IEEE Trans. Antennas Propagat., vol. AP-29, pp. 923-936, 1981.
[102] E. W. Vook and R. T. Compton, Jr., "Bandwidth performance of linear arrays with tapped delay line processing," IEEE Trans. Aerosp. Electron. Syst., vol. 28, pp. 901-908, 1992.
[103] R. T. Compton, Jr., "The bandwidth performance of a two element adaptive array with tapped delay line processing," IEEE Trans. Antennas Propagat., vol. 36, pp. 5-14, 1988.
[104] C. C. Ko, "Jamming rejection capability of broadband Frost power inversion array," Proc. Inst. Elect. Eng., vol. 128, pt. F, pp. 140-151, 1981.
[105] ___, "Tracking performance of a broadband tapped delay line adaptive array using the LMS algorithm," Proc. Inst. Elect. Eng., vol. 134, pt. F, pp. 295-302, 1987.
[106] D. Nunn, "Performance assessments of a time domain adaptive processor in a broadband environment," Proc. Inst. Elect. Eng., Pts. F and H, vol. 130, pp. 139-145, 1983.
[107] C. C. Yeh, Y. J. Hong, and D. R. Ucci, "Use of tapped delay line adaptive array to increase the number of degrees of freedom for interference suppression," IEEE Trans. Aerosp. Electron. Syst., vol. AES-23, pp. 809-813, 1987.
[108] K. K. Scott, "Transversal filter techniques for adaptive array applications," Proc. Inst. Elect. Eng., Pts. F and H, vol. 130, pp. 29-35, 1983.
[109] T. S. Durrani, N. L. M. Murukutla, and K. C. Sharman, "Constrained algorithm for multi-input adaptive lattices in array processing," in Proc. ICASSP, Atlanta, GA, 1981, pp. 297-301.
[110] D. Alexandrou, "Boundary reverberation rejection via constrained adaptive beamforming," J. Acoust. Soc. Amer., vol. 82, pp. 1274-1290, 1987.
[111] F. Ling, D. Manolakis, and J. G. Proakis, "Numerically robust least-squares lattice ladder algorithms with direct updating of the reflection coefficients," IEEE Trans. Acoust., Speech,
Signal Processing, vol. ASSP-34, pp. 837-845, 1986. [112] Y. Iiguni, H. Sakai, and H. Tokumaru, "Convergence properties of simplified gradient adapuve lattice algorithms." IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, pp. 1427-1434, 1985. [113] G. R. L. Sohie and L. H. Sibul, "Stochastic convergence properties of the adaptive gradient lattice," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-32, pp. 102-107, 1984. [114] M. H. Er and A. Cantoni, "Derivative constraints for broadband element space antenna array processors," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-31, pp. 1378-1393, 1983. [115] M. H. Er and B. P. Ng, "On derivative constrained broadband beamforming," IEEE Trans. Acoust., Speech, Signal Processing, vol. 38, pp. 551-552, 1990. [116] I. Thng, A. Cantoni, and Y. H. Leung, "Derivative constrained optimum broadband antenna arrays," IEEE Trans. Signal Processing, vol. 41, pp. 2376-2388, 1993. [117] K. Takao and T. Ishizaki, "Constraints of the output power minimization adaptive array for broadband desired signal." Trans. Inst. Electr. Commun. Eng. lpn. B, vol. J68-B, pp. 411-418, 1985. [118] K. M. Buckley, "Spatial/spectral filtering with linearly constrained minimum variance beamformers," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-35, pp. 249-266, 1987. [119] K. M. Ahmed and R. 1. Evans, "Broadband adaptive array processing," Proc. Inst. Elect. Eng., vol. 130, pt. F, pp. 433-440, 1983. [120] M. H. Er, "On the limiting solution of quadratically constrained broadband beamformers," IEEE Trans. Signal Processing; vol. 41, pp. 418-419, 1993. [121] K. M. Ahmed and R. 1. Evans, "An adaptive array processor with robustness and broadband capabilities," IEEE Trans. Antennas Propagat., vol. AP-32, pp. 944-950, 1984. [122] N. Kikuma and K. Takao, "Broadband and robust adaptive antenna under correlation constraints," Proc. InH. Elect. Eng., vol. 136, pt. H, pp. 85-89, 1989. [123] C. L. B. Despins, D. D. 
Falconer, and S. A. Mahmoud, "Compound strategies of coding, equalization and space diversity for wideband TDMA indoor wireless channels," IEEE Trans. Veh. Technol., vol. 41, pp. 369-379, 1992. [124] N. Ishii and R. Kohno, "Spatial and temporal equalization based on an adaptive tapped-delay-line array antenna," lEICE Trans. Commun., vol. E78-B, pp. 1162-1169, Aug. 1995.

136

[1251 R. Kohno, H. Wang, and H. Imai, "Adaptive array antenna combined with tapped delay line using processing gain for spread spectrum CDMA systems," presented at the IEEE Int. Syrnp. Personal Indoor and Mobile Radio Communications, Boston, MA, 1992. l126J M. H. Er and A. Cantoni, "An unconstrained partitioned realization for derivative constrained broadband antenna array processors," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, pp. 1376-1379, 1986. [127] L. P. Winkler and M. Schwartz, "Adaptive nonlinear optimization of the signal-to-noise ratio of an array subject to a constraint," 1. Acoust. Soc. Amer., vol. 52, pp. 39-51, 1972. [128] C. C. Ko, "Fast null steering algorithm for broadband power inversion array," Proc. lnst. Elect. Eng.. vol. 137, pt. F, pp. 377-383, 1990 [129] K. Takao and K. Komiyama, "An adaptive antenna for rejection of wideband Interference," IEEE Trans. Aerosp. Electron. Svst.. vol. AES-16, pp. 452-459, 1980. [130] K. C. Huang, S. H. Chang, and Y. H. Chen, "An alternative structure for adaptive broadband beamforming with imperfect arrays," 1. Acoust. Soc. Amer., vol. 87, pp. 1218-1226, 1990. [131] S. J. Chern and C. Y. Sung, "The hybrid Frost's beamforming algorithm for multiple jammers suppression," Signal Process., vol. 43, pp. 113-132, 1995. [134] S. Nordebo, 1. Claesson, and S. Nordhom, "Weighted Tschebysheff approximation for the design of broadband beamformers using quadratic programming," IEEE Signal Processing Leu., vol. 1, pp. 103-105, 1994. [135] S. Valaee and P. Kaba1, "Wideband array processing using a two-sided correlation transformation," IEEE Trans. Signal Processing, vol. 43, pp. 160-172, 1995. [136) W·. S. Hodgkiss, "Adaptive array processing: Time vs. frequency domain," ICASSP: Int. Con! Acoustic, Speech, Signal Processing, Washington, D.C., 1979, pp. 282-284. [137] L. Armijo, W. Daniel, and W. M. Labuda, "Applications of the FFT to antenna array beamforming," in EASCON, Rec. 
IEEE Electronics and Aerospace Systems Conv.. Washington, D.C., 1974, pp. 381-383. r 138J M. Dentino, 1. McCool, and B. Widrow, "Adaptive filtering In frequency domain," Proc. IEEE, vol. 66, pp. 1658-1659, 1978. [139] S. S. Narayan and A. M. Peterson, "Frequency domain leastmean-square algorithm," Proc. IEEE, vol. 69, pp. 124--126. 1981. [140] M. E. Weber and R. Heisler, "A frequency-domain beamforrning algorithm for wideband, coherent signal processing," 1. Acoust. Soc. Amer., vol. 76, pp. 1132-1144, 1984. [141] J. J. Shynk and R. P. Gooch, "Frequency-domain adaptive pole-zero filtering," Proc. IEEE, vol. 73, pp. 1526-1528, 1985. [142] S. Florian and N. 1. Bershad, "A weighted normalized frequency domain LMS adaptive algorithm," IEEE Trans. Acoust., Speech, Signal Processing, vol. 36, pp. 1002-1007, 1988. [143] N. J Bershad and P. L. Feintuch, "A normalized frequency domain LMS adaptive algorithm," IEEE Trans. Acoust., Speech. Signal Processing, vol. ASSP-34, pp. 452-461, 1986. [144J F. A. Reed, P. L. Feintuch, and N. J. Bershad, "The application of the frequency domain LMS adaptive filter to split array bearing estimation with a sinusoidal signal," IEEE Trans. Acoust.. Speech, Signal Processing, vol. ASSP-33, pp. 61-69, 1985. [145] D. Mansour and A. H. Gray Jr., "Unconstrained frequencydomain adaptive filter," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-30, pp. 168-170, 1982. [146] R. Kurnaresan, "On a frequency domain analog of Pronys method," IEEE Trans. Acoust., Speech, Signal Processing, vol. 38, pp. 168-170, 1990. l147] G. A. Clark, S. R. Parker, and S. K. Mitra, "A unified approach to time-and frequency-domain realization of FIR adaptive digital filters," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-31, pp. 1073-1083, 1983. [148] 1. X. Zhu and H. Wang, "Adaptive beamforming for correlated signal and interference: A frequency domain smoothing approach," IEEE Trans. Acoust., Speech, Signal Processing, vol. 38, pp. 193-195, 1990. 
[149] L. C. Godara, "Application of the fast Fourier transform to broadband beamforrning," 1. Acoust. Soc. Amer., vol. 98, pp. 230-240, 1995. [150] M. J. Hinich, "Frequency-wave number array processing," 1. Acoust. Soc. Amer., vol. 69, pp. 732-737, 1981. [151 J V. C. Anderson, "Digital array phasing," 1. Acoust. Soc. Amer.,

vol. 32, pp. 867-870, 1960. [152] R. G. Pridham and R. A. Mucci, "A novel approach to digital beamforming,' 1. Acoust. Soc. Amer., vol. 63, pp. 425-434, 1978. [153] P. Barton, "Digital beamfonning for radar," Proc. In.H. Elect. Eng., vol. 127, pt. F, pp. 266-277, 1980. [154] N. J. Mohamed, "Two-dimensional beamforming WIth nonsinusoidal signals," IEEE Trans. Electromag. Compat., vol. EMC-29, pp. 303-313, 1987. [155] D. E. Dudgeon, "Fundamentals of digital array processing," Proc. IEEE, vol. 65, pp. 898-904, 1977. [156] R. A. Mucci, "A comparison of efficient beamforming algorithms," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-32, pp. 548-558, 1984. [157] R. G. Pridham and R. A. Mucci, "Digital interpolation beamforming for low-pass and bandpass signals," Proc. IEEE, vol. 67, pp. 904-919, 1979. [158] H. Fan, E. 1. EI-Masry, and W. K. Jenkins. "Resolution enhancement of digital beamforming," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-32, pp. 1041-1052, 1984. [159] B. Maranda, "Efficient digital beamforming in the frequency domain," 1. Acoust. Soc. Amer., vol. 86, pp. 1813-1819, 1989. [160] P. Rudnick, "Digital beamforming in the frequency domain," 1. Acoust. Soc. Amer., vol. 46, pp. 1089-1090, 1969. [161] R. A. Gabel and R. R. Kurth, "Hybrid time-delay/phase-shift digital beamforming for uniform collinear array," 1. Acoust. Soc. Amer., vol. 75, pp. 1837-1847, 1984. [162] 1. J. Brady, "A serial phase shift beamformer using charge transfer devices," 1. Acoust. Soc. Amer., vol. 68, pp. 504-506, 1980. [163] P. D. Sylva, P. Menard, and D. Roy, "A reconfigurable realtime interpolation beamformer," IEEE 1. Oceanic Eng, vol. DE-II, pp. 123-126, 1986. [164] G. 1. DeMuth, "Frequency domain beamforming techniques," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (lCASSP), 1977, pp. 713-715. [165] A. Papoulis, "A new algorithm in spectral analysis and bandlimited extrapolation," IEEE Trans. Circuits Syst., vol. CAS-22, pp. 
735-742, 1975. [166] B. J. Sullivan, "Effect of sampling rate on the conjugate gradient method applied to signal extrapolation," IEEE Trans. Signal Processing, vol. 39, pp. 1235--1238, 1991. [167] 1. A. Cadzow, "An extrapolation procedure for band-limited signals," IEEE Trans. Acoust.. Speech, 'Signal Processing, vol. ASSP-27, pp. 4-12, 1979. [168] A. Sonnenschein and B. W. Dickinson, "On a recent extrapolation procedure for band-limited signals," IEEE Trans. Circuits Syst., vol. CAS-29, pp. 116-117, 1982. [169] A. K. Jain and S. Ranganath, "Extrapolation algorithms for discrete signals with application in spectral estimation," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-29, pp. 830-845, 1981. [170] 1. L. C. Snazand T. H. Huang, "Discrete and continuous bandlimited signal extrapolation," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-31, pp. 1276-1285, 1983. [1711 W. Chujo and K. Kashiki, "Spherical array antenna using digital beamforming techniques for mobile satellite communications," Electron. Commun. Japan, (English trans. of Denshi Tsushin Gakkai Ronbunshi), vol. 75, pp. 76-86, 1992. [172] R. Suzuki, Y. Matsumoto, R. Miura, and N. Hamamoto. "Mobile TDMITDMA system with active array antenna," IEEE Global Telecommunications Con! (GLOBECOM), Phoenix, AZ, 1991, pp. 1569-1573. [173] H. Steyskal, "Digital beamforming antenna, an introduction," Microwave 1., pp. 107-124, Jan. 1987. [174] W. S. Youn and C. K. Un, "Eigenstructure method for robust array processing," Electron. Lett., vol. 26, pp. 678-680, 1990. [175] A. M. Haimovich and Y. Bar-Ness, "An eigenanalysis Interference canceller," IEEE Trans. Signal Processing, vol. 39, pp. 76-84, 1991. [176] B. Friedlander, "A signal subspace method for adaptive interference cancellation," IEEE Trans. Acoust., Speech, Signal Processing, vol. 36, pp. 1835-1845, 1988. [177] 1. F. Yang and M. Kaveh, "Coherent signal-subspace transformation beam former," Proc. Inst. Elect. Eng., vol. 137, pt. F, pp. 
267-275, 1990. [178] B. D. Van Veen, "Eigenstructure based partially adaptive array design," IEEE Trans. Antennas Propagat., vol. 36, pp. 357-362, 1988.

137

[179] N. L. Owsley, "Sonar array processing," in Array Signal Processing, S. Haykin, Ed.. Englewood Cliffs, N.J.: Prentice-Hall, 1985, pp. 115-193. [180] K. Nishimori, N. Kikuma, and N. Inagaki, "The differencial CMA adaptive array antenna using an eigen-beamspace system," IEICE Trans. Commun., vol. E78-B, pp. 1480-1488, Nov. 1995. [181 J L. 1. Horowitz, H. Blatt, W. G. Brodsky, and K. D. Senne, "Controlling adaptive antenna arrays with the sample matrix inversion algorithm," IEEE Trans. Aerosp. Electron. Syst., vol. AES-15, pp. 840-847, 1979. [182] R. Schreiber, "Implementation of adaptive array algorithms," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, pp. 1038-1045, 1986. [183] E. Lindskog, "Making SMI-beamforming insensitive to the sampling timing for GSM signals," in Proc. IEEE Int. Symp. Personal, Indoor and Mobile Radio Communications,Toronto, Canada, 1995, pp. 664-668. [184] R. G. Vaughan, "On optimum combining at the mobile," IEEE Trans. Veh. Technol., vol. 37, pp. 181-188, 1988. [185] H. Hashemi, "The indoor radio propagation channels," Proc. IEEE, vol. 81, pp. 943-968, 1993. [186] C. Passerini, M. Missiroli, G. Riva, and M. Frullone, "Adaptive antenna arrays for reducing the delay spread in indoor radio channels," Electron. Lett., vol. 32, pp. 280-281, 1996. [187] L. J. Griffiths, "A simple adaptive algorithm for real-time processing in antenna arrays," Proc. IEEE, vol. 57, pp. 1696-1704, 1969. [188] B. Widrow and 1. M. McCool, "A comparison of adaptive algorithms based on the methods of steepest descent and random search," IEEE Trans. Antennas Propagat., vol. AP-24, pp. 615-637, 1976. [189] R. A. Iltis and L. B. Milstein, "An approximate statistical analysis of the Widrow LMS algorithm with application to narrow-band interference rejection," IEEE Trans. Commun., vol. COM-33, pp. 121-130, 1985. [190] P. M. Clarkson and P. R. White, "Simplified analysis of the LMS adaptive filter using a transfer function approximation," IEEE Trans. 
Acoust., Speech, Signal Processing, vol. i\SSP-35, pp. 987-993, 1987. [191] W. A. Gardner, "Comments on convergence analysis of LMS filters with uncorrelated data," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, pp. 378-379, 1986. [192J 1. B. Foley and F. M. Boland, "A note on the convergence analysis of LMS adaptive filters with Gaussian data," IEEE Trans. Acoust., Speech, Signal Processing, vol. 36, pp. 1087-1089, 1988. [193] V. Solo, "The limiting behavior of LMS," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 1909-1922, 1989. [194] A. Feuer and E. Weinstein, "Convergence analysis of LMS filters with uncorrelated Gaussian data," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, pp. 222-229, 1985. [195] S. Jaggi and A. B. Martinez, "Upper and lower bounds of the misadjustment in the LMS algorithm," IEEE Trans. Acoust., Speech, Signal Processing, vol. 38, pp. 164-166, 1990. [196] F B. Boland and J. B. Foley, "Stochastic convergence of the LMS algorithm in adaptive systems," Signal Process., vol. 13, pp. 339-352, 1987. [197] B. Widrow, 1. McCool, M. G. Larimore, and C. R. Johnston Jr., "Stationary and nonstationary learning characteristics of the LMS adaptive filter," in Aspects of Signal Processing, part 1, G. Tacconi, Ed. Boston, MA: Reidel, 1976, pp. 355-393; also in Proc. IEEE, vol. 64, pp. 1151-1162, 1976. [198] L. H. Horowitz and K. D. Senne, "Performance advantage of complex LMS for controlling narrow-band adaptive arrays," IEEE Trans. Circuits Syst., vol. CAS-28, pp. 562-576, 1981. [199] V. Solo, "The error variance of LMS with time-varying weights," IEEE Trans. Signal Processing, vol. 40, pp. 803-813, 1992 [200] M. Nagatsuka, N. Ishii, R. Kohno, and H. Imai, "Adaptive array antenna based on spatial spectral estimation using maximum entropy method," IEICE Trans. Commun., vol. E77-B, pp. 624-633, 1994. [201] J. S. Soo and K. K. Pang, "A multiple size frequency domain adaptive filter," IEEE Trans. 
Signal Processing, vol. 39, pp. 115-121, 1991. [202] Z. Pritzker and A. Feuer, "Variable length stochastic gradient algorithm," IEEE Trans. Signal Processing, vol. 39, pp.

997-1001, 1991. [203] F. F. Yassa, "Optimality in the choice of the convergence factor for gradient based adaptive algorithms," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-35, pp. 48-59, 1987. [204] 1. B. Evans, P. Xue, and B. Liu, "Analysis and implementation of variable step size adaptive algorithms," IEEE Trans. Signal Processing, vol. 41, pp. 2517-2535, 1993. [205] R. H. Kwong and E. W. Johnston, "A variable step size LMS algorithm," IEEE Trans. Signal Processing, vol. 40, pp. 1633-1642, 1992. [206] C. P. Kwong, "Robust design of the LMS algorithm," IEEE Trans. Signal Processing, vol. 40, pp. 2613-2616, 1992. [207] R. W. Harris, D. M. Chabriew, and F. A. Bishop, "A variable step size (VS) adaptive filter algorithm," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, pp. 309-316, 1986. [208] R. Y. Chen and C. L. Wang, "On the optimum step size for the adaptive sign and LMS algorithms," IEEE Trans. Circuits Syst., vol. 37, pp. 836-840, 1990. [209] C. C. Ko, "A fast adaptive null-steering algorithm based on output power measurements," IEEE Trans. Aerosp. Electron. Syst., vol. 29, pp. 717-725, 1993. [210] C. C. Ko, G. Balabshaskar, and R. Bachl, "Unbiased source estimation with an adaptive null steering algorithm," Signal Process., vol. 31, pp. 283-300, 1993. [211] J. Benesty and P. Duhamel, "A fast exact least mean square adaptive algorithm," IEEE Trans. Signal Processing, vol. 40, pp. 2904-2920, 1992. [212] A. Feuer and R. Cristi, "On the steady state performance of frequency domain LMS algorithms," IEEE Trans. Signal Processing, vol. 41, pp. 419-423, 1993. [213] V. 1. Mathews and S. H. Cho, "Improved convergence analysis of stochastic gradient adaptive filters using the sign algorithm," IEEE Trans. Acoust~, Speech, Signal Processing, vol. ASSP-35. pp. 450-454, 1987. [214] N. 1. Bershad and L. Z. Qu, "LMS adaptation WIth correlated data-A scalar example," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-32, pp. 695-700, 1984. 
[215] N. 1. Bershad and Y. H. Chang, "Time correlation statistics of the LMS adaptive algorithm weights," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, pp. 309-312, 1985. [216] E. Eweda, "Analysis and design of a signed regressor LMS algorithm for stationary and non stationary adaptive filtering with correlated Gaussian data," IEEE Trans. Circuits Svst., vol. 37, pp. 1367-1374, 1990. . [217] S. Kaczmarz, "Angenaherte Auftosung von Systemen linearen gleichungen," Bull. Int. Acad. Pol. Sci. Lett., 1937. [218] Y. Z. Tsypkin, Foundation of the Theory of Searching Systems New York: Academic, 1973. [219] 1. I. Nagumo and A. Noda, "A learning method for system identification," IEEE Trans. Automat. Contr., vol. AC-12, pp. 282-287, 1967. [220] R. Nitzberg, "Application of the normalized LMS algorithm to MSLC," IEEE Trans. Aerosp. Electron. Syst., vol. AES-21, pp. 79-91, 1985. [221] _ _ , "Normalized LMS algorithm degradation due to estimation noise," IEEE Trans. Aerosp. Electron. Syst., vol. AES·-22, pp. 740-750, 1986. [222] N. 1. Bershad, "Analysis of the normalized LMS algonthm with Gaussian inputs," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, pp. 793-806, 1986. [223] D. T. M. Slock, "On the convergence behavior of the LMS and the normalized LMS algorithms," IEEE Trans. Signal Processing, vol. 41, pp. 2811-2825, 1993. [224] M. Rupp, "The behavior of LMS and NLMS algorithms in the presence of spherically invariant processes," IEEE Trans. Signal Processing, vol. 41, pp. 1149-1160, 1993. [225] M. Barrett and R. Arnott, "Adaptive antennas for mobile communications," Electron. Commun. Eng. 1., vol. 6, pp. 203-214, 1994. [226] A. Cantoni, "Application of orthogonal perturbation sequences to adaptive beamforming," IEEE Trans. Antennas Propagat.. vol. AP-28, pp. 191-202, 1980. [227] L. C. Godara and A. Cantoni, "Analysis of the performance of adaptive beamforming using perturbation sequences," IEEE Trans. Antennas Propagat., vol. AP-31, pp. 
268-279, 1983. [228] _ _ , "Analysis of constrained LMS algorithm with application to adaptive beamforming using perturbation sequences," IEEE Trans. Antennas Propagat., vol. AP-34, pp. 368-379, 1986.

138

[2291 J. L. Moschner, "Adaptive filters with clipped input data," Information Systems Laboratory, Stanford University, CA, Tech. Rep. 6796-1, 1970. [230] L. C. Godara, "Constrained beamfonning and adaptive algorithms," in Handbook of Statistics, vol. 10, N. K. Bose and C. R. Rao, Eds. Amsterdam, The Netherlands: Elsevier, 1993. [231] _ _ , "Performance analysis of structured gradient algorithm," IEEE Trans. Antennas Propagat., vol. 38, 1078-1083,1990. [232"] L. C. Godara and D. A. Gray, "A structure gradient algorithm for adaptive bearnforrning." 1. Acoust. Soc. Amer., vol. 86, pp. 1040-1046, 1989. [233 J L. C. Godara, "Improved LMS algorithm for adaptive bearnforrrung." IEEE Trans. Antennas Propagat.. vol. 38, pp. 1631-1635, 1990. 12341 T. Ohgane, N . Matsuzawa, T. Shimura, M. MIzuno, and H. Sasaoka, "BER performance of CMA adaptive array for high-speed GMSK mobile communication-s-A description of measurements in central Tokyo." IEEE Trans. Veh. Technol. vol. 42, pp. 484-490, 1993. [235] R. M. Davis, D. C. Farden, and P. J. S. Sher. "A coherent perturbation algorithm," IEEE Trans. Antennas Propagat., vol AP-34, pp. 380-387, 1986. l236] B. Farhang-Boroujeny and L. F. Turner. "Fast converging stochastic gradient algorithm." Proc. lnst. Elect. Eng., vol. 128, pt. F. pp. 271-274. 1981 [237\ S. T. Alexander, "Transient weight misadjustment properties for the finite precision LMS algorithm.' IEEE Trans. Acoust.. Speech, Signal Processing, vol. ASSP-35, pp. 1250-1258, 1987. [238] Y. H. Chang, C. K. Tzou, and N. 1. Bershad, "Postsmoothing for the LMS algorithm and a fixed point round-off error analysis," IEEE Trans. Signal Processing, vol. 39, pp. 959-962, 1991 [239] H. Leung and S. Haykin, "Error bound method and its application to the LMS algorithm," IEEE Trans. Signal Processing, vol. 39, pp. 354-358, 1991 [240] P. W. Wong, "Quantization and roundoff noises in fixed-point FIR digital filters," IEEE Trans. Signal Processing, vol. 39, pp. 1552--1563. 1991. [2411 R. D. 
Gitlin. 1. E. Mago, and M. G. Taylor. "On the design of gradient algorithms for digitally implemented adaptive filters." IEEE Trans. Circuit Theory, vol. CT-20, pp. 125-136, 1973. [242J C. Caraiscos and B. Liu, "A round-off error analysis of the LMS adaptive algorithm," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-32, pp. 34-41, 1984. [243] R. D. Girlin, H. C. Meadors, and S. B. Weinstein, "The tap leakage algorithm: An algorithm for stable operations of a digitally implemented, fractionally spaced equalizer," Bell Svst Tech. 1., vol. 61, pp. 1817-1839, 1982. [244] J. M. Cioffi and J. J. Werner, "Effect of biases on digitally Implemented data driven echo canceller," AT&T Tech. 1., vol. 64. pp. 115-138, 1985. l245] A. V. Oppenheim and R. W. Schafer, Digital Signal Processing. Englewood Cliffs, NJ: Prentice- Hall, 1975. [246] B. Widrow, J. McCooL and M. Ball, "The complex LMS algorithm," Proc. IEEE, vol. 63, pp. 719-720, 1975. [247] A. Papoulis, Probability, Random Variables and Stochastic Processes. New York: McGraw-Hill, 1965. (248) E. Eweda and O. Macchi, "Convergence of the RLS and LMS adaptive filters," IEEE Trans. Circuits Syst., vol. CAS-34, pp. 799-803, 1987. [249] P. Fabre and C. Gueguen, "Improvement of the fast recursive least-squares algorithms via normalization: A comparative study," IEEE Trans. Acoust., Speech, Signal Processing. vol. ASSP-34, pp. 296-308, 1986. [250] P. E. Mantey and L. J. Griffiths, "Iterative least-squares algonthm for signal extraction," in 2nd Int. Hawaii Coni. System Science, Honolulu, HI, 1969, pp. 767-770. [251] E. Eleftheriou and D. D. Falconer, "Tracking properties and steady state performance of RLS adaptive filter algorithms," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34. pp. 1097-1110,1986. 1252] M. S. Mueller, "Least squares algorithms for adaptive equalizers." Bell Syst. Tech. 1., pp. ] 905-1925, 1981 [253] J. M. Cioffi and T. 
Kailath, "Fast recursive-least-square, transversal filters for adaptive filtering," IEEE Trans. Acoust.. Speech, Signal Processing, vol. ASSP-32, pp. 998-1005, 1984. [254] R. A. Wiggins and E. A. Robinson, "Recursive solution to the multichannel filtering problem," 1. Geophvs. Res., vol. 70, pp. 1885-1891, 1965.

[255] T. Murali and B. V. Rao, "A class of recursive maximumlikelihood algorithms," Proc. IEEE, vol. 73, pp. 1336-1339, 1985. [256] G. V. Moustakids and S. Theodoridis, "Fast Newton transversal filters-A new class of adaptive estimation algorithms," IEEE Trans. Signal Processing, vol. 39, pp. 2184-2193, 1991. [257] S. Qiao, "Fast adaptive RLS algorithms: A generalized inverse approach and analysis," IEEE Trans. Signal Processing, vol. 39, pp. 1455-1459, 1991. [258] D. T. M. Slock and T. Kailath, "Numerically stable fast transversal filters for recursive least squares adaptive filtering," IEEE Trans. Signal Processing, vol. 39, pp. 92-114, 1991. [259] W. A. Gardner and W. A. Brown III, "A new algorithm for adaptive arrays," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-35, pp. 1314-1319, 1987. [260] J. Fernandez, I. R. Corden, and M. Barrett, "Adapnve array algorithms for optimal combining in digital mobile communication systems," In Proc. In.H. Elect. Eng. 8th Int. Con! Antennas and Propagation, Edinburgh, Scotland, 1993, pp. 983-986. l261] Y. Wang and J. R. Cruz, "Adaptive antenna arrays for the reverse link of CDMA cellular communication systems," Electron. Lett., vol. 30, pp, 1017-1018, 1994. [262] D. N. Godard, "Self-recovering equalization and carrier tracking in two-dimensional data communication systems," IEEE Trans. Commun., vol. COM-28, pp. 1867-1875, 1980. [263J 1. R. Treichler and B. G. Agee, "A new approach to multipath correction of constant modulus SIgnals." IEEE Trans. Acoust.. Speech, Signal Processing, vol. ASSP-31, pp. 459-472, 1983. [264] 1. J. Shynk and C. K. Chan, "Performance surfaces of the constant modulus algorithm based on a conditional Gaussian model," IEEE Trans. Signal Processing, vol. 41, pp. 1965-1969, 1993. [265] T. Ohgane, "Characteristics of CMA adaptive array for selective fading compensation in digital land mobile radio communications," Electron. Commun. Ipn., vol. 74, pp. 43-53. 1991. [266] T. Ohgane, T. Shimura, N. 
Matsuzawa, and H. Sasaoka, "An implementation of a CMA adaptive array for high speed GMSK transmission in mobile commumcauons." IEEE Trans Veh Technol.. vol. 42, pp. 282-288, 1993. [2671 1. Parra, G. Xu, and H. Liu, "Least squares projective constant modulus approach," in Proc. IEEE Inr. Symp. Personal, Indoor and Mobile Radio Communications. Toronto, Canada, 1995, pp. 673-676. [268J M. Hestenes and E. Stiefel, "Method of conjugate gradients for solving linear systems," 1. Res. Natl. Bur. Stand.. vol 49, pp 409-436, ] 952. [269] 1. W. Daniel, "The conjugate gradient method for linear and nonlinear operator equations," SIAM 1. Numer. Anal, vol 4. pp. 10-26, 1967. [270) T. Sarkar, K. R. Siarkiewrcz, and R. F. Stratton, "Survey of numerical methods for solutions of large systems of linear equations for electromagnetic field problems," IEEE Trans Antennas Propagat., vol. AP-29, pp. 847-856, 1981. [271] S. Choi and D. H. Kim, "Adaptive antenna array utilizing the conjugate gradient method for compensation of multipath fading in a land mobile communication," in Proc. IEEE 42nd Vehicular Technology Conf., Denver, CO, 1992, pp. 33-36. [272] S. Choi, Application of the Conjugate Gradient Method for Optimum Array Processing, vol. V. Amsterdam. The Netherlands: Elsevier, 1991, ch. 16. [273] B. Widrow and M. A. Lehr, "30 years of adaptive neural networks: Perception, Madalme, and back propagation," Proc. IEEE, vol. 78, pp. 1415-1442, 1990. [274] C. Lau and B. Widrow, Eds., "Special issue on neural networks I," Proc. IEEE, vol. 78. Sept. ] 990. [275] E. Gelenbe and 1. Barhen, Eds., "Special Issue on artificial neural network applications," Proc. IEEE, vol. 84, Oct 1996. [2761 P. R. Chang, W. H. Yang, and K. K. Chan, "A neural network approach to MVDR beamforming problem." IEEE Trans. Antennas Propagat., vol. 40, pp. 313-322, 1992. [277] W. H. Yang and K. K. Chan, "Programmable switchedcapacitor neural network for MVDR beamforming," IEEE 1. Oceanic Eng., vol. 21, pp. 
77-84, 1996. [278] L. C. Godara, "Limitations and capabilities of directions-ofarrival estimation techniques using an array of antennas: A mobile communications perspective," presented at the IEEE Int. Symp. Phased Array Systems and Technology, Boston, MA. 1996.

139

[279] R. T. Lacoss, "Data adaptive spectral analysis method," Geophysics, vol. 36, pp. 661-675, 1971.
[280] A. H. Nuttall, G. C. Carter, and E. M. Montaron, "Estimation of two-dimensional spectrum of the space-time noise field for a sparse line array," J. Acoust. Soc. Amer., vol. 55, pp. 1034-1041, 1974.
[281] D. H. Johnson, "The application of spectral estimation methods to bearing estimation problems," Proc. IEEE, vol. 70, pp. 1018-1028, 1982.
[282] R. A. Wagstaff and J. L. Berrou, "A fast and simple nonlinear technique for high-resolution beamforming and spectral analysis," J. Acoust. Soc. Amer., vol. 75, pp. 1133-1141, 1984.
[283] Q. T. Zhang, "A statistical resolution theory of the beamformer-based spatial spectrum for determining the directions of signals in white noise," IEEE Trans. Signal Processing, vol. 43, pp. 1867-1873, 1995.
[284] M. S. Bartlett, An Introduction to Stochastic Process. New York: Cambridge Univ. Press, 1956.
[285] V. A. N. Barroso, M. J. Rendas, and J. P. Gomes, "Impact of array processing techniques on the design of mobile communication systems," in Proc. IEEE 7th Mediterranean Electrotechnical, Antalya, Turkey, 1994, pp. 1291-1294.
[286] M. I. Miller and D. R. Fuhrmann, "Maximum likelihood narrow-band direction finding and the EM algorithm," IEEE Trans. Acoust., Speech, Signal Processing, vol. 38, pp. 1560-1577, 1990.
[287] J. Makhoul, "Linear prediction: A tutorial review," Proc. IEEE, vol. 63, pp. 561-580, 1975.
[288] S. B. Kesler, S. Boodaghians, and J. Kesler, "Resolving uncorrelated and correlated sources by linear prediction," IEEE Trans. Antennas Propagat., vol. AP-33, pp. 1221-1227, 1985.
[289] J. P. Burg, "Maximum entropy spectral analysis," presented at the 37th Annu. Meeting, Society Exploration Geophysics, Oklahoma City, OK, 1967.
[290] J. H. McClellan and S. W. Lang, "Duality for multidimensional MEM spectral analysis," Proc. Inst. Elect. Eng., vol. 130, pt. F, pp. 230-235, 1983.
[291] D. P. Skinner, S. M. Hedlicka, and A. D. Mathews, "Maximum entropy array processing," J. Acoust. Soc. Amer., vol. 66, pp. 488-493, 1979.
[292] T. Thorvaldsen, "Maximum entropy spectral analysis in antenna spatial filtering," IEEE Trans. Antennas Propagat., vol. AP-28, no. 99, pp. 556-562, 1980.
[293] J. H. McClellan, "Multidimensional spectral estimation," Proc. IEEE, vol. 70, pp. 1029-1039, 1982.
[294] S. W. Lang and J. H. McClellan, "Spectral estimation for sensor arrays," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-31, pp. 349-358, 1983.
[295] D. R. Farrier, "Maximum entropy processing of band-limited spectra, Part 1: Noise free case," Proc. Inst. Elect. Eng., vol. 132, pt. F, pp. 491-504, 1985.
[296] W. S. Ligget, "Passive sonar: Fitting models to multiple time series," in NATO ASI Signal Processing, J. W. R. Griffiths et al., Eds. New York: Academic, 1973, pp. 327-345.
[297] F. C. Schweppe, "Sensor array data processing for multiple signal sources," IEEE Trans. Inform. Theory, vol. IT-14, pp. 294-305, 1968.
[298] I. Ziskind and M. Wax, "Maximum likelihood localization of multiple sources by alternating projection," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-36, pp. 1553-1560, 1988.
[299] P. Stoica and K. C. Sharman, "Maximum likelihood methods for direction of arrival estimation," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-38, pp. 1132-1143, 1990.
[300] S. K. Oh and C. K. Un, "Simple computational methods of the AP algorithm for maximum likelihood localization of multiple radiating sources," IEEE Trans. Signal Processing, vol. 40, pp. 2848-2854, 1992.
[301] H. Lee and R. Stovall, "Maximum likelihood methods for determining the direction of arrival for a single electromagnetic source with unknown polarization," IEEE Trans. Signal Processing, vol. 42, pp. 474-479, 1994.
[302] Q. Wu, K. M. Wong, and J. P. Reilly, "Maximum likelihood direction finding in unknown noise environments," IEEE Trans. Signal Processing, vol. 42, pp. 980-983, 1994.
[303] J. Sheinvald, M. Wax, and A. J. Weiss, "On maximum-likelihood localization of coherent signals," IEEE Trans. Signal Processing, vol. 44, pp. 2475-2482, 1996.


[304] S. Haykin, "Radar array processing for angle of arrival estimation," in Array Signal Processing, S. Haykin, Ed. Englewood Cliffs, NJ: Prentice-Hall, 1985.
[305] M. Wax and T. Kailath, "Optimum localization of multiple sources by passive arrays," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-31, pp. 1210-1221, 1983.
[306] A. D. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. Roy Statist. Soc., vol. B-39, pp. 1-37, 1977.
[307] M. J. Hinich, "Frequency-wave number array processing," J. Acoust. Soc. Amer., vol. 69, pp. 732-737, 1981.
[308] T. J. Abatzoglou, "A fast maximum likelihood algorithm for frequency estimation of a sinusoid based on Newton's method," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, pp. 77-89, 1985.
[309] T. Wigren and A. Eriksson, "Accuracy aspects of DOA and angular velocity estimation in sensor array processing," IEEE Signal Processing Lett., vol. 2, pp. 60-62, 1995.
[310] A. Zeira and B. Friedlander, "On the performance of direction finding with time varying arrays," Signal Process., vol. 43, pp. 133-147, 1995.
[311] D. W. Tufts and C. D. Melissinos, "Simple, effective computation of principal eigenvectors and their eigenvalues and application to high-resolution estimation of frequencies," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, pp. 1046-1053, 1986.
[312] V. F. Pisarenko, "The retrieval of harmonics from a covariance function," Geophys. J. R. Astron. Soc., vol. 33, pp. 347-366, 1973.
[313] M. Wax, T. J. Shan, and T. Kailath, "Spatio-temporal spectral analysis by eigenstructure methods," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-32, pp. 817-827, 1984.
[314] S. S. Reddi, "Multiple source location-A digital approach," IEEE Trans. Aerosp. Electron. Syst., vol. AES-15, pp. 95-105, 1979.
[315] A. Cantoni and L. C. Godara, "Resolving the directions of sources in a correlated field incident on an array," J. Acoust. Soc. Amer., vol. 67, pp. 1247-1255, 1980.
[316] D. J. Bordelon, "Complementarity of the Reddi method of source direction estimation with those of Pisarenko and Cantoni and Godara, I," J. Acoust. Soc. Amer., vol. 69, pp. 1355-1359, 1981.
[317] D. H. Johnson and S. R. DeGraff, "Improving the resolution of bearing in passive sonar arrays by eigenvalue analysis," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-29, pp. 401-413, 1982.
[318] T. P. Bronez and J. A. Cadzow, "An algebraic approach to super resolution array processing," IEEE Trans. Aerosp. Electron. Syst., vol. AES-19, pp. 123-133, 1983.
[319] V. U. Reddy, B. Egardt, and T. Kailath, "Least-squares type algorithm for adaptive implementation of Pisarenko's harmonic retrieval method," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-30, pp. 399-405, 1982.
[320] J. R. Yang and M. Kaveh, "Adaptive eigensubspace algorithms for direction or frequency estimation and tracking," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-36, pp. 241-251, 1988.
[321] M. G. Larimore, "Adaptive convergence of spectral estimation based on Pisarenko harmonic retrieval," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-31, pp. 955-962, 1983.
[322] H. Ouibrahim, "Prony, Pisarenko and the matrix pencil: A unified presentation," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-37, pp. 133-134, 1989.
[323] H. Ouibrahim, D. D. Weiner, and T. K. Sarkar, "A generalized approach to direction finding," IEEE Trans. Acoust., Speech, Signal Processing, vol. 36, pp. 610-613, 1988.
[324] A. Paulraj and T. Kailath, "Eigenstructure methods for direction of arrival estimation in the presence of unknown noise field," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, pp. 13-20, 1986.
[325] M. Wax, "Detection and localization of multiple sources in noise with unknown covariance," IEEE Trans. Signal Processing, vol. 40, pp. 245-249, 1992.
[326] A. J. Weiss, A. S. Willsky, and B. C. Levy, "Eigenstructure approach for array processing with unknown intensity coefficients," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-36, pp. 1613-1617, 1988.
[327] G. Bienvenu, "Influence of the spatial coherence of the background noise on high resolution passive methods," in Proc. ICASSP, Washington, D.C., 1979, pp. 306-309.
[328] G. Bienvenu and L. Kopp, "Adaptive high resolution spatial discrimination of passive sources," in NATO ASI Copenhagen, L. Bjorno, Ed. Boston, MA: Reidel, 1980, pp. 509-515.
[329] _ _, "Optimality of high resolution array processing using eigensystem approach," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-31, pp. 1235-1248, 1983.
[330] R. O. Schmidt, "Multiple emitter location and signal parameter estimation," IEEE Trans. Antennas Propagat., vol. AP-34, pp. 276-280, 1986.
[331] R. D. DeGroat, E. M. Dowling, and D. A. Linebarger, "The constrained MUSIC problem," IEEE Trans. Signal Processing, vol. 41, pp. 1445-1449, 1993.
[332] A. Barabell, "Improving the resolution of eigenstructured based direction finding algorithms," in Proc. ICASSP, Boston, MA, 1983, pp. 336-339.
[333] B. Friedlander, "A sensitivity analysis of MUSIC algorithm," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-38, pp. 1740-1751, 1990.
[334] B. Porat and B. Friedlander, "Analysis of the asymptotic relative efficiency of MUSIC algorithm," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-36, pp. 532-544, 1988.
[335] R. W. Klukas and M. Fattouche, "Radio signal direction finding in the urban radio environment," in Proc. Nat. Technical Meeting Institute of Navigation, San Francisco, CA, 1993, pp. 151-160; also IEEE Trans. Veh. Tech., submitted for publication.
[336] B. D. Rao and K. V. S. Hari, "Performance analysis of ESPRIT and TAM in determining the direction of arrival of plane waves in noise," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-37, pp. 1990-1995, 1989.
[337] J. T. Mayhan and L. Niro, "Spatial spectral estimation using multiple beam antennas," IEEE Trans. Antennas Propagat., vol. AP-35, pp. 897-906, 1987.
[338] I. Karasalo, "A high-high-resolution postbeamforming method based on semidefinite linear optimization," IEEE Trans. Acoust., Speech, Signal Processing, vol. 38, pp. 16-22, 1990.
[339] H. B. Lee and M. S. Wengrovitz, "Resolution threshold of beamspace MUSIC for two closely spaced emitters," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-38, pp. 1545-1559, 1990.
[340] X. L. Xu and K. Buckley, "An analysis of beam-space source localization," IEEE Trans. Signal Processing, vol. 41, pp. 501-504, 1993.
[341] M. D. Zoltowski, S. D. Silverstein, and C. P. Mathews, "Beamspace root-MUSIC for minimum redundancy linear arrays," IEEE Trans. Signal Processing, vol. 41, pp. 2502-2507, 1993.
[342] M. D. Zoltowski, G. M. Kautz, and S. D. Silverstein, "Beamspace root-MUSIC," IEEE Trans. Signal Processing, vol. 41, pp. 344-364, 1993.
[343] R. Kumaresan and D. W. Tufts, "Estimating the angles of arrival of multiple plane waves," IEEE Trans. Aerosp. Elect. Systems, vol. AES-19, pp. 134-139, 1983.
[344] V. T. Ermolaev and A. B. Gershman, "Fast algorithm for minimum-norm direction-of-arrival estimation," IEEE Trans. Signal Processing, vol. 42, pp. 2389-2394, 1994.
[345] U. Nickel, "Algebraic formulation of Kumaresan-Tufts superresolution method, showing relation to ME and MUSIC methods," Proc. Inst. Elect. Eng., 1988, pt. F, vol. 135, pp. 7-10.
[346] H. Clergeot, S. Tresseus, and A. Ouamri, "Performance of high resolution frequencies estimation methods compared to the Cramer-Rao bounds," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 1703-1720, 1989.
[347] H. Krim, P. Forster, and J. G. Proakis, "Operator approach to performance analysis of Root-MUSIC and Root-Min-Norm," IEEE Trans. Signal Processing, vol. 40, pp. 1687-1696, 1992.
[348] B. P. Ng, "Constraints for linear predictive and minimum-norm methods in bearing estimation," Proc. Inst. Elect. Eng., vol. 137, pt. F, pp. 187-191, 1990.
[349] K. M. Buckley and X. L. Xu, "Spatial spectrum estimation in a location sector," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-38, pp. 1842-1852, 1990.
[350] C. P. Mathews and M. D. Zoltowski, "Eigenstructure techniques for 2-D angle estimation with uniform circular arrays," IEEE Trans. Signal Processing, vol. 42, pp. 2395-2407, 1994.
[351] A. L. Swindlehurst and T. Kailath, "Azimuth/Elevation direction finding using regular array geometries," IEEE Trans. Aerosp. Electron. Syst., vol. 29, pp. 145-156, 1993.
[352] Q. Wu and K. M. Wong, "UN-MUSIC and UN-CLE: An application of generalized correlation to the estimation of the direction of arrival of signals in unknown correlated noise," IEEE Trans. Signal Processing, vol. 42, pp. 2331-2343, 1994.
[353] J. Le Cadre, "Parametric methods for spatial signal processing in the presence of unknown colored noise fields," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-37, pp. 965-983, 1989.
[354] K. M. Wong, J. P. Reilly, Q. Wu, and S. Qiao, "Estimation of the direction of arrival of signals in unknown correlated noise, Part I: The MAP approach and its implementation," IEEE Trans. Signal Processing, vol. 40, pp. 2007-2017, 1992.
[355] J. P. Reilly and K. M. Wong, "Estimation of the direction of arrival of signals in unknown correlated noise, Part II: Asymptotic behavior and performance of the MAP," IEEE Trans. Signal Processing, vol. 40, pp. 2018-2028, 1992.
[356] M. G. Amin, "Concurrent nulling and locations of multiple interferences in adaptive antenna arrays," IEEE Trans. Signal Processing, vol. 40, pp. 2658-2668, 1992.
[357] J. A. Cadzow, "A high resolution direction-of-arrival algorithm for narrow-band coherent and incoherent sources," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-36, pp. 965-979, 1988.
[358] R. Roy and T. Kailath, "ESPRIT-Estimation of signal parameters via rotational invariance techniques," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-37, pp. 984-995, 1989.
[359] G. Xu, S. D. Silverstein, R. H. Roy, and T. Kailath, "Beamspace ESPRIT," IEEE Trans. Signal Processing, vol. 42, pp. 349-356, 1994.
[360] R. Hamza and K. Buckley, "Resolution enhanced ESPRIT," IEEE Trans. Signal Processing, vol. 42, pp. 688-691, 1994.
[361] R. Roy, A. Paulraj, and T. Kailath, "ESPRIT-A subspace rotation approach to estimation of parameters of cisoids in noise," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, pp. 1340-1342, 1986.
[362] A. Paulraj, R. Roy, and T. Kailath, "A subspace rotation approach to signal parameter estimation," Proc. IEEE, vol. 74, pp. 1044-1045, 1986.
[363] A. J. Weiss and M. Gavish, "Direction finding using ESPRIT with interpolated arrays," IEEE Trans. Signal Processing, vol. 39, pp. 1473-1478, 1991.
[364] J. A. Gansman, M. D. Zoltowski, and J. V. Krogmeier, "Multidimensional multirate DOA estimation in beamspace," IEEE Trans. Signal Processing, vol. 44, pp. 2780-2792, 1996.
[365] A. L. Swindlehurst, B. Ottersten, R. Roy, and T. Kailath, "Multiple invariance ESPRIT," IEEE Trans. Signal Processing, vol. 40, pp. 867-881, 1992.
[366] N. Yuen and B. Friedlander, "Asymptotic performance analysis of ESPRIT, higher order ESPRIT, and virtual ESPRIT algorithms," IEEE Trans. Signal Processing, vol. 44, pp. 2537-2550, 1996.
[367] M. D. Zoltowski and D. Stavrinides, "Sensor array signal processing via a procrustes rotations based eigenanalysis of the ESPRIT data pencil," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 832-861, 1989.
[368] Y. Wang and J. R. Cruz, "Adaptive antenna arrays for cellular CDMA cellular communication systems," IEEE ICASSP, Detroit, MI, pp. 1725-1728, 1995.
[369] M. Viberg and B. Ottersten, "Sensor array processing based on subspace fitting," IEEE Trans. Signal Processing, vol. 39, pp. 1110-1121, 1991.
[370] M. Viberg, B. Ottersten, and T. Kailath, "Detection and estimation in sensor arrays using weighted subspace fitting," IEEE Trans. Signal Processing, vol. 39, pp. 2436-2449, 1991.
[371] A. Klouche-Djedid and M. Fujita, "Adaptive array sensor processing applications for mobile telephone communications," IEEE Trans. Veh. Technol., vol. 45, pp. 405-416, 1996.
[372] J. P. Reilly, "A real-time high-resolution technique for angle of arrival estimation," Proc. IEEE, vol. 75, pp. 1692-1694, 1987.
[373] A. Di, "Multiple source location-A matrix decomposition approach," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, pp. 1086-1091, 1985.
[374] G. Xu and T. Kailath, "Direction-of-arrival estimation via exploitation of cyclostationarity-A combination of temporal and spatial processing," IEEE Trans. Signal Processing, vol. 40, pp. 1775-1786, 1992.

[375] A. Weiss and B. Friedlander, "Direction finding for diversely polarized signals using polynomial rooting," IEEE Trans. Signal Processing, vol. 41, pp. 1893-1905, 1993.
[376] J. J. Fuchs and H. Chuberre, "A deconvolution approach to source localization," IEEE Trans. Signal Processing, vol. 42, pp. 1462-1470, 1994.
[377] Y. H. Chen and C. T. Chiang, "Kalman-based estimators for DOA estimation," IEEE Trans. Signal Processing, vol. 42, pp. 3543-3547, 1994.
[378] W. H. Yang, K. K. Chan, and P. R. Chang, "Complexed-valued neural network for direction of arrival estimation," Electron. Lett., vol. 30, pp. 574-575, Mar. 1994.
[379] H. L. Southall, J. A. Simmers, and T. H. O'Donnell, "Direction finding in phased arrays with neural network beamformer," IEEE Trans. Antennas Propagat., vol. 43, pp. 1369-1374, 1995.
[380] G. H. Golub and C. F. Van Loan, Matrix Computation. Baltimore, MD: Johns Hopkins Univ. Press, 1983.
[381] W. A. Gardner, "Exploitation of spectral redundancy in cyclostationary signals," IEEE Signal Processing Mag., vol. 8, pp. 14-37, 1991.
[382] E. R. Ferrara and T. M. Parks, "Direction finding with an array of antennas having diverse polarization," IEEE Trans. Antennas Propagat., vol. AP-31, pp. 231-236, 1983.
[383] I. Ziskind and M. Wax, "Maximum likelihood localization of diversely polarized source by simulated annealing," IEEE Trans. Antennas Propagat., vol. 38, pp. 1111-1114, 1990.
[384] G. Su and M. Morf, "The signal subspace approach for multiple wide-band emitter location," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-31, pp. 1502-1522, 1983.
[385] H. Wang and M. Kaveh, "Coherent signal-subspace processing for the detection and estimation of angle of arrival of multiple wide-band sources," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, pp. 823-831, 1985.
[386] D. N. Swingler and J. Krolik, "Source location bias in the coherently focussed high-resolution broad-band beamformer," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 143-145, 1989.
[387] J. Krolik and D. Swingler, "Multiple broad-band source location using steered covariance matrices," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 1481-1494, 1989.
[388] B. Ottersten and T. Kailath, "Direction-of-arrival estimation for wide-band signals using the ESPRIT algorithm," IEEE Trans. Acoust., Speech, Signal Processing, vol. 38, pp. 317-327, 1990.
[389] J. A. Cadzow, "Multiple source location-The signal subspace approach," IEEE Trans. Acoust., Speech, Signal Processing, vol. 38, pp. 1110-1125, 1990.
[390] M. A. Doron, A. J. Weiss, and H. Messer, "Maximum-likelihood direction finding of wide-band sources," IEEE Trans. Signal Processing, vol. 41, pp. 411-414, 1993.
[391] Y. Grenier, "Wideband source location through frequency-dependent modeling," IEEE Trans. Signal Processing, vol. 42, pp. 1087-1096, 1994.
[392] D. N. Swingler, "An approximate expression for the Cramer-Rao bound on DOA estimates of closely spaced sources in broadband line-array beamforming," IEEE Trans. Signal Processing, vol. 42, pp. 1540-1543, 1994.
[393] K. M. Buckley and L. J. Griffiths, "Broad-band signal-subspace spatial-spectrum (BASS-ALE) estimation," IEEE Trans. Acoust., Speech, Signal Processing, vol. 36, pp. 953-964, 1988.
[394] H. Hung and M. Kaveh, "Coherent wide-band ESPRIT method for direction-of-arrival estimation of multiple wideband sources," IEEE Trans. Acoust., Speech, Signal Processing, vol. 38, pp. 354-356, 1990.
[395] P. M. Schultheiss and H. Messer, "Optimal and suboptimal broad-band source location estimation," IEEE Trans. Signal Processing, vol. 41, pp. 2752-2763, 1993.
[396] C. R. Rao, C. R. Sastry, and B. Zhou, "Tracking the direction of arrival of multiple moving targets," IEEE Trans. Signal Processing, vol. 42, pp. 1133-1144, 1994.
[397] J. F. Yang and H. J. Lin, "Adaptive high resolution algorithms for tracking nonstationary sources without the estimation of source number," IEEE Trans. Signal Processing, vol. 42, pp. 563-571, 1994.
[398] K. J. R. Liu, D. P. O'Leary, G. W. Stewart, and Y. J. J. Wu, "URV ESPRIT for tracking time-varying signals," IEEE Trans. Signal Processing, vol. 42, pp. 3441-3448, 1994.
[399] A. Eriksson, P. Stoica, and T. Soderstrom, "On-line subspace algorithms for tracking moving sources," IEEE Trans. Signal Processing, vol. 42, pp. 2319-2330, 1994.
[400] C. R. Sastry, E. W. Kamen, and M. Simaan, "An efficient algorithm for tracking the angles of arrival of moving targets," IEEE Trans. Signal Processing, vol. 39, pp. 242-246, 1991.
[401] Y. Meng, P. Stoica, and K. M. Wong, "Estimation of the directions of arrival of spatially dispersed signals in array processing," Proc. Inst. Elect. Eng. Radar, Sonar Navig., vol. 143, pp. 1-9, Feb. 1996.
[402] T. Trump and B. Ottersten, "Estimation of nominal direction of arrival and angular spread using an array of sensors," Signal Process., vol. 50, pp. 57-69, 1996.
[403] T. J. Shan, A. Paulraj, and T. Kailath, "On smoothed rank profile tests in eigenstructure methods for directions-of-arrival estimation," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-35, pp. 1377-1385, 1987.
[404] T. J. Shan, M. Wax, and T. Kailath, "On spatial smoothing for directional of arrival estimation of coherent signals," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, pp. 806-811, 1985.
[405] R. T. Williams, S. Prasad, A. K. Mahalanabis, and L. H. Sibul, "An improved spatial smoothing technique for bearing estimation in multipath environment," IEEE Trans. Acoust., Speech, Signal Processing, vol. 36, pp. 425-432, 1988.
[406] C. C. Yeh, J. H. Lee, and Y. M. Chen, "Estimating two-dimensional angles of arrival in coherent source environment," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 153-155, 1989.
[407] S. U. Pillai and B. H. Kwon, "Forward/backward spatial smoothing techniques for coherent signal identification," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 8-15, 1989.
[408] W. C. Lee, S. T. Park, I. W. Cha, and D. H. Youn, "Adaptive spatial domain Forward-Backward predictors for bearing estimation," IEEE Trans. Acoust., Speech, Signal Processing, vol. 38, pp. 1105-1109, 1990.
[409] A. Moghaddamjoo and T. C. Chang, "Signal enhancement of the spatial smoothing algorithm," IEEE Trans. Signal Processing, vol. 39, pp. 1907-1911, 1991.
[410] W. Du and R. L. Kirlin, "Improved spatial smoothing techniques for DOA estimation of coherent signals," IEEE Trans. Signal Processing, vol. 39, pp. 1208-1210, 1991.
[411] A. Moghaddamjoo and T. C. Chang, "Analysis of spatial filtering approach to the decorrelation of coherent sources," IEEE Trans. Signal Processing, vol. 40, pp. 692-694, 1992.
[412] J. F. Yang and C. J. Tsai, "A further analysis of decorrelation performance of spatial smoothing techniques for real multipath sources," IEEE Trans. Signal Processing, vol. 40, pp. 2109-2112, 1992.
[413] B. D. Rao and K. V. S. Hari, "Weighted subspace methods and spatial smoothing: Analysis and comparison," IEEE Trans. Signal Processing, vol. 41, pp. 788-803, 1993.
[414] C. Y. Liou and R. M. Liou, "Spatial pseudorandom array processing," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 1445-1449, 1989.
[415] A. J. Weiss and B. Friedlander, "Performance analysis of spatial smoothing with interpolated data," IEEE Trans. Signal Processing, vol. 41, pp. 1881-1892, 1993.
[416] J. E. Evans, J. R. Johnson, and D. F. Sun, "High resolution angular spectrum estimation techniques for terrain scattering analysis and angle of arrival estimation," in Proc. 1st ASSP Workshop Spectral Estimation, Hamilton, Ont., Canada, 1981, pp. 134-139.
[417] A. Delis and G. Papadopoulos, "Enhanced forward/backward spatial filtering method for DOA estimation of narrow-band coherent sources," Proc. Inst. Elect. Eng. Radar, Sonar Navig., vol. 143, pp. 10-16, Feb. 1996.
[418] J. Li and R. T. Compton Jr., "Angle and polarization estimation in a coherent signal environment," IEEE Trans. Aerosp. Electron. Syst., vol. AES-29, pp. 706-716, 1993.
[419] L. C. Godara, "Beamforming in the presence of correlated arrivals using structured correlation matrix," IEEE Trans. Acoust., Speech, Signal Processing, vol. 38, pp. 1-15, 1990.
[420] K. Takao and N. Kikuma, "An adaptive array utilizing an adaptive spatial averaging technique for multipath environments," IEEE Trans. Antennas Propagat., vol. AP-35, pp. 1389-1396, 1987.
[421] J. J. Fuchs, "Rectangular Pisarenko method applied to source localization," IEEE Trans. Signal Process., vol. 44, pp. 2377-2383, 1996.
[422] B. L. Lim, S. K. Hui, and Y. C. Lim, "Bearing estimation of coherent sources by circular spatial modulation averaging (CSMA) technique," Electron. Lett., vol. 26, pp. 343-345, 1990.
[423] A. J. Weiss and B. Friedlander, "Preprocessing for direction finding with minimal variance degradation," IEEE Trans. Signal Processing, vol. 42, pp. 1478-1485, 1994.
[424] H. R. Park and Y. S. Kim, "A solution to the narrowband coherency problem in multiple source location," IEEE Trans. Signal Processing, vol. 41, pp. 473-476, 1993.
[425] K. C. Huarng and C. C. Yeh, "A unitary transformation method for angle-of-arrival estimation," IEEE Trans. Signal Processing, vol. 39, pp. 975-977, 1991.
[426] D. N. Swingler and R. S. Walker, "Line-array beamforming using linear prediction for aperture interpolation and extrapolation," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 16-30, 1989.
[427] A. J. Weiss, B. Friedlander, and P. Stoica, "Direction-of-arrival estimation using MODE with interpolated arrays," IEEE Trans. Signal Processing, vol. 43, pp. 296-300, 1995.
[428] M. Wax and T. Kailath, "Detection of signals by information theoretic criteria," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, pp. 387-392, 1985.
[429] H. Akaike, "A new look at the statistical model identification," IEEE Trans. Automat. Contr., vol. AC-19, pp. 716-723, 1974.
[430] J. Rissanen, "Modeling by the shortest data description," Automatica, vol. 14, pp. 465-471, 1978.
[431] Q. T. Zhang, K. M. Wong, P. Yip, and J. P. Reilly, "Statistical analysis of the performance of information theoretical criteria in the detection of the number of signals in an array," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 1557-1567, 1989.
[432] H. Wang and M. Kaveh, "On the performance of signal-subspace processing-Part I: Narrowband systems," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, pp. 1201-1209, 1986.
[433] Y. Yin and P. Krishnaiah, "On some nonparametric methods for detection of number of signals," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-35, pp. 1533-1538, 1987.
[434] K. M. Wong, Q. T. Zhang, J. P. Reilly, and P. C. Yip, "On information theoretic criteria for determining the number of signals in high resolution array processing," IEEE Trans. Acoust., Speech, Signal Processing, vol. 38, pp. 1959-1971, 1990.
[435] M. Wax and I. Ziskind, "Detection of the number of coherent signals by the MDL principle," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 1190-1196, 1989.
[436] M. Wax, "Detection and localization of multiple sources via the stochastic signal model," IEEE Trans. Signal Processing, vol. 39, pp. 2450-2456, 1991.
[437] Q. Wu and D. R. Fuhrmann, "A parametric method for determining the number of signals in narrow-band direction finding," IEEE Trans. Signal Processing, vol. 39, pp. 1848-1857, 1991.
[438] T. W. Anderson, "Asymptotic theory for principal component analysis," Ann. J. Math. Statist., vol. 34, pp. 122-148, 1963.
[439] W. Chen, K. M. Wong, and J. P. Reilly, "Detection of the number of signals: A predicted eigenthreshold approach," IEEE Trans. Signal Processing, vol. 39, pp. 1088-1098, 1991.
[440] H. Lee and F. Li, "An eigenvector technique for detecting the number of emitters in a cluster," IEEE Trans. Signal Processing, vol. 42, pp. 2380-2388, 1994.
[441] B. Friedlander and A. J. Weiss, "On the number of signals whose directions can be estimated by an array," IEEE Trans. Signal Processing, vol. 39, pp. 1686-1689, 1991.
[442] M. Wax and I. Ziskind, "On unique localization of multiple sources by passive sensor arrays," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 996-1000, 1989.
[443] Y. Bresler and A. Macovski, "On the number of signals resolvable by a uniform linear array," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, pp. 1361-1375, 1986.
[444] S. U. Pillai and B. H. Kwon, "Performance analysis of MUSIC-type high resolution estimators for direction finding in correlated and coherent scenes," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 1176-1189, 1989.
[445] P. Stoica and A. Nehorai, "MUSIC, maximum likelihood, and Cramer-Rao bound," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 720-741, 1989.

[446] P. Stoica and A. Nehorai, "MUSIC, maximum likelihood, and Cramer-Rao bound: Further results and comparisons," IEEE Trans. Acoust., Speech, Signal Processing, vol. 38, pp. 2140-2150, 1990.
[447] X. L. Xu and K. M. Buckley, "Bias analysis of the MUSIC location estimator," IEEE Trans. Signal Processing, vol. 40, pp. 2559-2569, 1992.
[448] _ _, "Bias and variance of direction-of-arrival estimate from MUSIC, MIN-NORM, and FINE," IEEE Trans. Signal Processing, vol. 42, pp. 1812-1816, 1994.
[449] H. B. Lee and M. S. Wengrovitz, "Statistical characterization of MUSIC null spectrum," IEEE Trans. Signal Processing, vol. 39, pp. 1333-1347, 1991.
[450] C. Zhou, F. Haber, and D. L. Jaggard, "A resolution measure for the MUSIC algorithm and its application to plane wave arrivals contaminated by coherent interference," IEEE Trans. Signal Processing, vol. 39, pp. 454-463, 1991.
[451] _ _, "The resolution threshold of MUSIC with unknown spatially colored noise," IEEE Trans. Signal Processing, vol. 41, pp. 511-516, 1993.
[452] Q. T. Zhang, "Probability of resolution of MUSIC algorithm," IEEE Trans. Signal Processing, vol. 43, pp. 978-987, 1995.
[453] G. M. Kautz and M. D. Zoltowski, "Performance analysis of MUSIC employing conjugate symmetric beamformers," IEEE Trans. Signal Processing, vol. 43, pp. 737-748, 1995.
[454] M. Kaveh and A. J. Barabell, "The statistical performance of MUSIC and mini-norm algorithms in resolving plane wave in noise," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, pp. 331-341, 1986.
[455] P. Stoica and A. Nehorai, "Performance study of conditional and unconditional direction-of-arrival estimate," IEEE Trans. Signal Processing, vol. 38, pp. 1783-1795, 1990.
[456] _ _, "Performance comparison of subspace rotation and MUSIC methods of direction estimation," IEEE Trans. Signal Processing, vol. 39, pp. 446-453, 1991.
[457] B. Ottersten, M. Viberg, and T. Kailath, "Performance analysis of the total least squares ESPRIT algorithm," IEEE Trans. Signal Processing, vol. 39, pp. 1122-1135, 1991.
[458] C. P. Mathews and M. D. Zoltowski, "Performance analysis of the UCA-ESPRIT algorithm for circular ring arrays," IEEE Trans. Signal Processing, vol. 42, pp. 2535-2539, 1994.
[459] B. Ottersten, M. Viberg, and T. Kailath, "Analysis of subspace fitting and ML techniques for parameter estimation from sensor array data," IEEE Trans. Signal Processing, vol. 40, pp. 590-600, 1992.
[460] M. Viberg, B. Ottersten, and A. Nehorai, "Performance analysis of direction finding with large arrays and finite data," IEEE Trans. Signal Processing, vol. 43, pp. 469-477, 1995.
[461] A. J. Weiss and B. Friedlander, "On the Cramer-Rao bound for direction finding of correlated signals," IEEE Trans. Signal Processing, vol. 41, pp. 495-499, 1993.
[462] J. Capon, "Probability distribution for estimators of the frequency-wave number spectrum," Proc. IEEE, vol. 58, pp. 1785-1786, 1970.
[463] E. M. Dowling and R. D. DeGroat, "The equivalence of the total least squares and minimum norm methods," IEEE Trans. Signal Processing, vol. 39, pp. 1891-1892, 1991.
[464] B. D. Rao and K. V. S. Hari, "Performance analysis of Root-MUSIC," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 1939-1949, 1989.
[465] S. R. De Graaf and D. H. Johnson, "Capability of array processing algorithms to estimate source bearings," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, pp. 1368-1379, 1985.
[466] F. Li and R. J. Vaccaro, "Performance degradation of DOA estimators due to unknown noise fields," IEEE Trans. Signal Processing, vol. 40, pp. 686-690, 1992.
[467] A. L. Swindlehurst and T. Kailath, "A performance analysis of subspace-based methods in the presence of model errors, Part 1: The MUSIC algorithm," IEEE Trans. Signal Processing, vol. 40, pp. 1758-1774, 1992.
[468] B. M. Radich and K. M. Buckley, "The effect of source number under estimation on MUSIC location estimates," IEEE Trans. Signal Processing, vol. 42, pp. 233-236, 1994.
[469] B. Friedlander and A. J. Weiss, "Effects of model errors on waveform estimation using the MUSIC algorithm," IEEE Trans. Signal Processing, vol. 42, pp. 147-155, 1994.
[470] A. J. Weiss and B. Friedlander, "Effect of modeling errors on


[471] [472] [473] [474] [475] [476] [477] [478] [479] [480] [481] [482] [483] [484] [485] [486] [487] [488] [489] [490] [491] [492] [493]

Japan, 1986, pp. 1873-1876.
[494] D. Williamson, K. L. Teo, and P. C. Musumeci, "Optimum FIR array filters," IEEE Trans. Acoust., Speech, Signal Processing, vol. 36, pp. 1211-1222, 1988.
[495] M. T. Hanna, A. Kia, and J. P. Robinson, "Digital filters for attenuating interference arriving from a wide range of angles," IEEE Trans. Signal Processing, vol. 40, pp. 1499-1507, 1992.
[496] G. C. Carter, "Coherence and time delay estimation," Proc. IEEE, vol. 75, pp. 236-255, 1987.
[497] W. F. Gabriel, "Spectral analysis and adaptive array super resolution techniques," Proc. IEEE, vol. 68, pp. 654-666, 1980.
[498] L. C. Godara, "Adaptive beamforming in the presence of correlated arrivals," J. Acoust. Soc. Amer., vol. 89, pp. 1730-1736, 1991.
[499] _ _, "Beamforming in the presence of broadband correlated arrivals," J. Acoust. Soc. Amer., vol. 92, pp. 2702-2708, 1992.
[500] G. L. Turin, "Introduction to spread-spectrum antimultipath techniques and their application to urban digital radio," Proc. IEEE, vol. 68, pp. 328-353, 1980.
[501] R. Price and P. E. Green, "A communication technique for multipath channels," Proc. IRE, vol. 46, pp. 555-570, 1958.
[502] U. Fawer, "A coherent spread spectrum diversity receiver with AFC for multipath fading channels," IEEE Trans. Commun., vol. 42, pp. 1300-1311, 1994.
[503] C. L. Zham, "Effects of errors in the direction of incidence on the performance of an antenna array," Proc. IEEE, pp. 1008-1009, 1972.
[504] A. K. Steel, "Comparison of directional and derivative constraints for beamformers subject to multiple linear constraints," Proc. Inst. Elect. Eng., vol. 130, pt. H, pp. 41-45, 1983.
[505] S. Ponnekanti and S. Sali, "Effective adaptive antenna scheme for mobile communications," Electron. Lett., vol. 32, pp. 417-418, 1996.
[506] K. W. Lo, "Reducing the effect of pointing errors on the performance of an adaptive array," Electron. Lett., vol. 26, pp. 1646-1647, 1990.
[507] R. A. Mucci and R. G. Pridham, "Impact of beam steering errors on shifted sideband and phase shift beamforming techniques," J. Acoust. Soc. Amer., vol. 69, pp. 1360-1368, 1981.
[508] Y. Rockah and P. M. Schultheiss, "Array shape calibration using sources in unknown locations-Part 1: Far field sources," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-35, pp. 286-299, 1987.
[509] R. T. Compton, Jr., "The effect of random steering vector errors in the Applebaum adaptive array," IEEE Trans. Aerosp. Electron. Syst., vol. AES-18, pp. 392-400, 1982.
[510] Y. J. Hong, D. R. Ucci, and C. C. Yeh, "The performance of an adaptive array combining reference signal and steering vector," IEEE Trans. Antennas Propagat., vol. AP-35, pp. 763-770, 1987.
[511] A. Gaffer and G. Langholz, "The performance of adaptive array in a dynamic environment," IEEE Trans. Aerosp. Electron. Syst., vol. AES-23, pp. 485-492, 1987.
[512] P. N. Keating, "The effect of errors on frequency domain adaptive interference rejection," J. Acoust. Soc. Amer., vol. 68, pp. 1690-1695, 1980.
[513] M. S. Sherrill and R. L. Streit, "In situ optimal reshading of arrays with failed elements," IEEE J. Oceanic Eng., vol. OE-12, pp. 155-162, 1987.
[514] D. J. Ramsdale and R. A. Howerton, "Effect of element failure and random errors in amplitude and phase on the sidelobe level attainable with a linear array," J. Acoust. Soc. Amer., vol. 68, pp. 901-906, 1980.
[515] E. N. Gilbert and S. P. Morgan, "Optimum design of directive antenna arrays subject to random variations," Bell Syst. Tech. J., pp. 431-663, 1955.
[516] B. D. Steinberg, Principles of Aperture and Array System Design. New York: Wiley, 1976.
[517] E. Ashok and P. M. Schultheiss, "The effect of auxiliary source on the performance of the randomly perturbed array," in Proc. ICASSP, vol. 40.1, San Diego, CA, 1984.
[518] C. N. Dorny and B. S. Meaghr, "Cohering of an experimental nonrigid array by self-survey," IEEE Trans. Antennas Propagat., vol. AP-28, pp. 902-904, 1980.
[519] Y.
Rockah and P. M. Schultheiss, "Array shape calibration using sources in unknown locations-Part 2: Near field sources and estimator implementation," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-35, pp. 724-735, 1987.

the resolution threshold of the MUSIC algorithm," IEEE Trans. Signal Processing, vol. 42, pp. 1519-1526, 1994. A. P. C. Ng, "Direction-of-arrival estimates in the presence of wavelength, gain, and phase errors," IEEE Trans. Signal Processing, vol. 43, pp. 225-232, 1995. V. C. Soon and Y. F. Huans, "An analysis of ESPRIT under random sensor uncertainties," IEEE Trans. Signal Processing, vol. 40, pp. 2353-2358, 1992. H. Snnath and V. U. Reddy, "Analysis of MUSIC algorithm with sensor gain and phase perturbations," Signal Processing, vol. 23, pp. 245-256, 1991. R. Hamza and K. Buckley, "An analysis of weighted eigenspace methods in the presence of sensor errors," IEEE Trans. Signal Processing, vol. 43, pp. 1140-1150, 1995. F. B. Tuteur and 1. A. Presley Jr., "Spectral estimation of spacetime signals with a DIMUS array," J. Acoust. Soc. A,ner., vol. 70, pp. 80-89, 1981. A. 1. Weiss and B. Friedlander, "Array shape calibration using sources in unknown locations-A maximum likelihood approach," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 1958-1966, 1989. M. P. Wylie, S. Roy, and H. Messer, "Joint DOA estimation and phase calibration of linear equispaced (LES) arrays," IEEE Trans. Signal Processing, vol. 42, pp. 3449-3459, 1994. C. Y. Tseng, D. D. Feldman, and L. 1. Griffiths, "Steering vector estimation in uncalibrated arrays," IEEE Trans. Signal Processing, vol. 43, pp. 1397-1412, 1995. Y. M. Chen, J. H. Lee, C. C. Yeh, and 1. Mar, "Bearing estimation without calibration for randomly perturbed arrays," IEEE Trans. Signal Processing, vol. 39, pp. 194-197, 1991. A. Flieller, P. Larzabal, and H. Clergeot, "Robust self calibrauon of the maximum likelihood method in array processing," Signal Processing VII: Theories and Applications, M. Holt, C. Cowan, P. Grant, and W. Sandham, Eds. Edinburgh, Scotland: European Association for Signal Processing, pp. 1293-1296, 1994. A. 1. Weiss and B. 
Friedlander, "Eigenstructure methods for direction finding with sensor gain and phase uncertainties," Cire. Syst. Signal Process., vol. 9, pp. 271-300, 1990. P. Stoica and A. Nehorai, "Comparative performance study of element-space and beam-space MUSIC estimators," Circ. Syst. Signal Process., vol. 10, pp. 285-292, 1991. N. L. Owsley, "Noise cancellation in the presence of correlated signals and noise," New London Laboratory, New London, CT, NUSC Tech. Rep. 4639, 1974. T. J. Shan and T. Kailath, "Adaptive beamfonning for coherent signals and interference," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, pp. 527-536, 1985. M. T. Hanna and M. Simaan, "Array filters for attenuating coherent interference in the presence of random noise," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, pp. 661-668, 1986. A. K. Luthra, "A solution to the adaptive nulling problem with a look direction constraint in the presence of coherent jammers," IEEE Trans. Antennas Propagat., vol. AP-34, pp. 702-710, 1986. V. U. Reddy, A. Paulraj, and T. Kailath, "Performance analysis of the optimum beamformer in the presence of correlated smoothing," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-35, pp. 927-936, 1987. M. D. Zoltowski, "On the performance analysis of the MVDR beamformer in the presence of correlated interference," IEEE Trans. Acoust., Speech, Signal Processing, vol. 36, pp. 945-947, 1988. M. T. Hanna, "Array filters for attenuating multiple coherent interference," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-36, pp. 844-853, 1988. M. E. Ali and F. Schreib, "Adaptive single snapshot beamforming: A new concept for the rejection of nonstationary and coherent interferers," IEEE Trans. Signal Processing, vol. 40, pp. 3055-3058, 1992. F. Qian and B. D. Van Veen, "Quadratically constrained adaptive beamforrning for coherent signals and interference," IEEE Trans. Signal Processing, vol. 43, pp. 1890-1900, 1995. K. Cho and N. 
Ahmed, "On a constrained LMS Dither algorithm," Proc. IEEE, vol. 75, pp. 1338-1340, 1987. K. Takao, N. Kikuma, and Y. Yano, "Toeplitization of correlation matrix in multipath environment," in Proc. ICASSP, Tokyo,

144

1520J R. H. Lang, M. A. B. Din, and R. L. Pickhotz, "Stochastic effects in adaptive null-steenng antenna array performance," IEEE 1. Select. Areas Commun., vol. SAC-3, pp. 767-778,1985. [521] A. J. Berni, "Weight jitter phenomena in adaptive array control loop," IEEE Trans. Aerosp., Electron. Syst., vol. AES-13, pp. 355-361, 1977. [522] 1. E. Hudson, "The effect of signal and weight coefficient quantization in adaptive array processors," in Aspects of Signal Processing, part 2, G. Tacconi, Ed. Dordrecht-Holland: Reidel, pp. 423-428, 1977. [5231 R. Nitzberg, "Effect of errors in adaptive weights," IEEE Trans. Aerosp. Electron. Syst., vol. AES-12, pp. 369-373, 1976. 15241 S Ardalan, "On the sensitivity of transversal RLS algorithms to random perturbations in the filter coefficients," IEEE Trans. Acoust., Speech, Signal Processing, vol. 36, pp. 1781-1783, 1988. [525] D. R. Farrier, "Gain of an array of sensors subjected to processor perturbation," Proc. Inst. Elect. Eng., 1983, vol. 130, PT. H, pp. 251-254. [526] A. H. Quazi, "Array beam response in the presence of amplitude and phase fluctuations," 1. Acoust. Soc. Amer., vol. 72, pp. 171-180, 1982. [527] L. I. Kleinberg, "Array gain for signals and noise having amplitude and phase fluctuations," 1. Acoust. Soc. Amer., vol. 67, pp. 572-577, 1980. [528J H. Cox, R. M. Zeskind, and M. M. Owen, "Effects of amplitude and phase errors on linear predictive array processors," IEEE Trans. Acoust., Speech, Signal Processing, vol. 36, pp. 10-19, 1988. [529] D. M. DiCarlo and R. T. Compton Jr., "Reference loop phase shift in adaptive arrays," IEEE Trans. Aerosp. Electron. Syst., vol. AES-14, pp. 599-607, 1978. [530] L. Stark, R. W. Burns, and W. P. Clark, "Phase shifters for arrays," in Radar Hand Book, M. I. Skolnik, Ed. New York: McGraw-Hill, 1970. [531] L. C. Godara, "The effect of phase-shifter errors on the performance of an antenna-array beamformer," IEEE 1. Oceanic Eng., vol. OE-IO, pp. 278-284, July 1985. 1532] A. M. 
Vural, "Effects of perturbations on the performance of optimum/adaptive arrays," IEEE Trans. Aerosp. Electron. Svst.. vol. AES-15, pp. 76-87, 1979. [533J M. 1. Hinich, "Beamfonning when the sound velocity is not precisely known," 1. Acoust. Soc. Amer., vol. 68, pp. 512-515. 1980. [534] B. D. Steinberg and A. J. Luthra. "Simple theory of the effects of medium turbulence upon scanning with an adaptive phased array," 1. Acoust. Soc. Amer., vol. 71, pp. 630-634, 1982. [535] D. M. Boroson, "Sample size considerations for adaptive arrays." IEEE Trans. Aerosp. Electron. Syst., vol. AES-16, pp. 446-451, 1980. [5361 N. J. Bershad, "On the optimum data nonlinearity in LMS adaptation," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, pp. 69-76, 1986. 1537] K. 1. Raghunath and U. V. Reddy, "'Finite data performance analysis of MVDR beamfonner with and without spatial smoothing," IEEE Trans. Acoust., Speech, Signal Processing, vol. 40, pp. 2726-2736, 1992.

[538] J. T. Mayhan, "Some techniques for evaluating the bandwidth characteristics of adaptive nulling systems," IEEE Trans. Antennas Propagat., vol. AP-27, pp. 363-378, 1979. [539] H. Cox, R. M. Zeskind, and M. M. Owen, "Robust adaptive beamforming," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-35, pp. 1365-1376, 1987. [540] R. 1. Evans and K. M. Ahmed, "Robust adaptive array antennas," 1. Acoust. Soc. Amer., vol. 71, pp. 384-394, 1982. [541] 1. W. Kim and C. K. Un, "An adaptive array robust to beam pointing error," IEEE Trans. Signal Processing, vol. 40, pp. 1582-1584, 1992. [542] W. S. Youn and C. K. Un, "A linearly constrained beamforming robust to array imperfections," IEEE Trans. Signal Processing, vol. 41, pp. 1425-1428, 1993. [543] M. H. Er and A. Cantoni, "An alternative formulation for an optimum beamformer with robustness capability," Proc. Inst. Elect. Eng., vol. 132, pt. F, pp. 447-460, 1985. [544] M. H. Er and B. C. Ng, "A robust method for broadband beamforming in the presence of pointing error," Signal Process.. vol. 30, pp. 115-121, 1993. [545] _ _ , "A new approach to robust beamforming in the presence of steering vector errors," IEEE Trans. Signal Processing, vol. 42, pp. 1826-1829, 1994. [546] K. Takao and N. Kikuma, "Tammed adaptive antenna array," IEEE Trans. Antennas Propagat., vol. AP-34, pp. 388-394. 1986. [547] S. A. Kassam and H. V. Poor, "Robust techniques for signal processing: A survey," Proc. IEEE, vol. 73, pp. 433-481, 1985.

145

High-Resolution Frequency-Wavenumber Spectrum Analysis

J. CAPON, MEMBER, IEEE

Abstract-The output of an array of sensors is considered to be a homogeneous random field. In this case there is a spectral representation for this field, similar to that for stationary random processes, which consists of a superposition of traveling waves. The frequency-wavenumber power spectral density provides the mean-square value for the amplitudes of these waves and is of considerable importance in the analysis of propagating waves by means of an array of sensors. The conventional method of frequency-wavenumber power spectral density estimation uses a fixed wavenumber window and its resolution is determined essentially by the beam pattern of the array of sensors. A high-resolution method of estimation is introduced which employs a wavenumber window whose shape changes and is a function of the wavenumber at which an estimate is obtained. It is shown that the wavenumber resolution of this method is considerably better than that of the conventional method. Application of these results is given to seismic data obtained from the large aperture seismic array located in eastern Montana. In addition, the application of the high-resolution method to other areas, such as radar, sonar, and radio astronomy, is indicated.

INTRODUCTION

THE USE of an array of sensors for determining the properties of propagating waves is of considerable importance in many areas. As an example, such phased arrays find application in radar, where an array of receiving antennas is used to determine the spatial coordinates of radar targets. In seismic applications, the large aperture seismic array (LASA) [1], located in eastern Montana, is used to determine the vector velocity of propagating seismic waves. In addition, LASA provides seismic data for facilitating the discrimination between earthquakes and underground nuclear explosions.

The present work will be concerned with the use of an array of sensors to determine the vector velocity of propagating waves. In particular, the heavy emphasis will be on the seismic application based on seismic data obtained from LASA.

It is well known that a stationary random process can be characterized by means of a spectral density function [2]. Roughly speaking, this function provides the information concerning the power as a function of frequency for the stationary random process. In a similar manner, propagating waves, or a homogeneous random field, can be characterized by a frequency-wavenumber spectral density function. Loosely speaking, this function provides the information concerning the power as a function of frequency and the vector velocities of the propagating waves. The definition and properties of the frequency-wavenumber spectrum will be given subsequently. However, the main purpose of the present work is to discuss the measurement, or estimation, of the frequency-wavenumber spectrum.

Previous methods of estimation were based on the use, at a given frequency, of a fixed wavenumber window. These conventional methods were limited to a wavenumber resolution which was determined primarily by the natural beam pattern of the array, as will be shown subsequently. A high-resolution estimation method will be introduced which is based on the use of a wavenumber window which is not fixed, but is variable at each wavenumber considered. As a consequence, it will be shown that the wavenumber resolution achievable by this method is considerably greater than that of the conventional method and is limited primarily by signal-to-noise ratio considerations. The high-resolution method will be illustrated by examples obtained using LASA data consisting of long-period noise, long-period Rayleigh surface wave events, and short-period noise. Applications of the high-resolution method to other areas, such as radio astronomy, will also be indicated.

Manuscript received December 13, 1968; revised June 17, 1969. The author is with M.I.T. Lincoln Laboratory, Lexington, Mass. (Operated with support from the U. S. Advanced Research Projects Agency.)

DEFINITION AND PROPERTIES OF THE FREQUENCY-WAVENUMBER SPECTRUM

We assume that the output of a sensor located at the vector position $x_j$ is a wide-sense stationary discrete-time parameter random process with zero mean, $\{N_{jm}\}$, $m = 0, \pm 1, \pm 2, \ldots$. The covariance matrix of the noise is given by

$$\rho_{jl}(m - n) = E\{N_{jm} N_{ln}^{*}\} \tag{1}$$

where $E$ denotes expectation. The cross-power spectral density is

$$f_{jl}(\lambda) = \sum_{m=-\infty}^{\infty} \rho_{jl}(m)\, e^{im\lambda} \tag{2}$$

and

$$\rho_{jl}(m) = \int_{-\pi}^{\pi} f_{jl}(\lambda)\, e^{-im\lambda}\, \frac{d\lambda}{2\pi} \tag{3}$$

where $\lambda = 2\pi f T$ is the normalized frequency, $f$ is the frequency in hertz and $T$ is the sampling period of the data in seconds. If the sensor output field is space stationary, then for fixed $\lambda$, $f_{jl}(\lambda)$ depends only on the vector difference $x_j - x_l$. In this case the sensor outputs are said to comprise a homogeneous random field, cf. Yaglom [2, pp. 81-84], and it is convenient to introduce a cross-power spectral density $f(\lambda, r)$ and cross-covariance $\rho(m, r)$ as

$$f(\lambda, r) = f_{jl}(\lambda) \tag{4}$$

$$\rho(m, r) = \rho_{jl}(m) \tag{5}$$

whenever $x_j - x_l = r$.

Reprinted from Proceedings of the IEEE, vol. 57, no. 8, pp. 1408-1418, August 1969.

Following Yaglom [2, pp. 81-84], any homogeneous random field has a spectral representation

$$N_{jm} = \int_{-\pi}^{\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-i(m\lambda + k \cdot x_j)}\, Z(d\lambda, dk) \tag{6}$$

where $k$ is the vector wavenumber. We have that $Z(\Delta\lambda, \Delta k)$ is a random function of the frequency interval $\Delta\lambda$ and the elemental wavenumber area, or interval, $\Delta k$, with the following properties:

1) $E\{Z(\Delta\lambda, \Delta k)\} = 0$, for all $\Delta\lambda$, $\Delta k$;
2) $Z(\Delta\lambda, \Delta_1 k + \Delta_2 k) = Z(\Delta\lambda, \Delta_1 k) + Z(\Delta\lambda, \Delta_2 k)$, if $\Delta_1 k$ and $\Delta_2 k$ are disjoint intervals, and $Z(\Delta_1\lambda + \Delta_2\lambda, \Delta k) = Z(\Delta_1\lambda, \Delta k) + Z(\Delta_2\lambda, \Delta k)$ if $\Delta_1\lambda$ and $\Delta_2\lambda$ are disjoint intervals;
3) $E\{Z(\Delta_1\lambda, \Delta_1 k)\, Z^{*}(\Delta_2\lambda, \Delta_2 k)\} = 0$, if $\Delta_1 k$ and $\Delta_2 k$ are disjoint intervals, or if $\Delta_1\lambda$ and $\Delta_2\lambda$ are disjoint intervals;
4) $E\{|Z(\Delta\lambda, \Delta k)|^2\} = P(\lambda, k)\,(\Delta\lambda/2\pi)\, \Delta k_x\, \Delta k_y$,

where $P(\lambda, k)$ is the frequency-wavenumber power spectral density function and $2\pi k_x$, $2\pi k_y$ are the $x$, $y$ components, respectively, of the vector $k$ in radians per kilometer. It should be noted that $Z(\Delta\lambda, \Delta k)$ is a random function with uncorrelated increments, where the increments can be taken either in frequency $\lambda$ or in vector wavenumber $k$. The cross-covariance and cross-power spectral density can be written as

$$\rho(m, r) = \int_{-\pi}^{\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} P(\lambda, k)\, e^{-i(m\lambda + k \cdot r)}\, \frac{d\lambda}{2\pi}\, dk_x\, dk_y \tag{7}$$

$$f(\lambda, r) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} P(\lambda, k)\, e^{-ik \cdot r}\, dk_x\, dk_y. \tag{8}$$

It is possible to write the frequency-wavenumber spectrum as

$$P(\lambda, k) = \sum_{m=-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \rho(m, r)\, e^{i(m\lambda + k \cdot r)}\, dr_x\, dr_y = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(\lambda, r)\, e^{ik \cdot r}\, dr_x\, dr_y \tag{9}$$

where $r_x$, $r_y$ are the $x$, $y$ components, respectively, of the vector $r$, in kilometers. If the signal consists of a unity amplitude monochromatic plane wave propagating with a velocity $v_0$ km/s, of the form $\exp[-i(2\pi f_0 m T + k_0 \cdot x_j)]$, $m = 0, \pm 1, \pm 2, \ldots$, where $f_0$ is the frequency, $T$ is the sampling period, $k_0 = 2\pi f_0 \alpha_0$, $-\alpha_0$ is a slowness vector which points in the direction of propagation of the wave, and $|\alpha_0| = 1/v_0$, then

$$\rho(m, r) = \exp[-i(m\lambda_0 + k_0 \cdot r)], \qquad f(\lambda, r) = 2\pi\,\delta(\lambda - \lambda_0)\, \exp[-i k_0 \cdot r]$$

and

$$(2\pi)^{-1} P(\lambda, k) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \delta(\lambda - \lambda_0)\, \exp[i(k - k_0) \cdot r]\, dr_x\, dr_y = \delta(\lambda - \lambda_0,\, k - k_0)$$

which is a delta function located at the frequency $\lambda_0 = 2\pi f_0 T$ and wavenumber $k_0$. It should now be apparent how $P(\lambda, k)$ provides the information concerning the speed and azimuth, or vector velocity, of propagating waves.

CONVENTIONAL METHOD FOR ESTIMATING FREQUENCY-WAVENUMBER SPECTRUM

We now assume that $K$ sensors are to be used to estimate the frequency-wavenumber spectrum $P(\lambda, k)$. Such an estimate is usually based on an estimate for the cross-power spectral density $f_{jl}(\lambda)$. For simplicity, only the direct segment, or block averaging, method of estimation will be considered for estimating $f_{jl}(\lambda)$. It has been shown [3] that this method is very desirable from the point of view of computational efficiency. There is also no essential loss of generality in considering a specific estimation method for $f_{jl}(\lambda)$. In the direct segment method the number of data points $L$ in each channel is divided into $M$ nonoverlapping blocks of $N$ data points, $L = MN$. The Fourier transform of the data in the $n$th segment, $j$th channel, and normalized frequency $\lambda$ is

$$S_{jn}(\lambda) = N^{-1/2} \sum_{m=1}^{N} a_m\, N_{j,\,m+(n-1)N}\, e^{im\lambda}, \qquad j = 1, \ldots, K, \quad n = 1, \ldots, M. \tag{10}$$

The $a_m$ are weights which are used to control the shape of the frequency window used in estimating $f_{jl}(\lambda)$. Again, for simplicity, we assume $a_m = 1$, $m = 1, \ldots, N$. As an estimate for $f_{jl}(\lambda)$ we take

$$\hat{f}_{jl}(\lambda) = \frac{1}{M} \sum_{n=1}^{M} S_{jn}(\lambda)\, S_{ln}^{*}(\lambda), \qquad j, l = 1, \ldots, K. \tag{11}$$

We will assume hereafter that a normalization is performed by dividing $\hat{f}_{jl}(\lambda)$ by $[\hat{f}_{jj}(\lambda)\, \hat{f}_{ll}(\lambda)]^{1/2}$, in order to remove the effects of improper sensor equalization. We can, without any loss of generality, ignore this step in the ensuing analysis. As an estimate for $P(\lambda, k)$ we take

$$\hat{P}(\lambda, k) = \frac{1}{K^2} \sum_{j,l=1}^{K} w_j^{*}\, w_l\, \hat{f}_{jl}(\lambda)\, \exp[ik \cdot (x_j - x_l)] \tag{12}$$

where the $w_j$ are weights which are used to control the shape of the wavenumber window used in estimating $P(\lambda, k)$. We will assume, for simplicity, that $w_j = 1$, $j = 1, \ldots, K$. It has been shown that $\{\hat{f}_{jl}(\lambda)\}$ is a nonnegative-definite matrix so that $\hat{P}(\lambda, k)$ will be real and nonnegative [3]. It will now be shown that $\hat{P}$ is an asymptotically unbiased and consistent estimate for $cP$, where $c$ is some positive constant. Using the results of [3] for $E\{\hat{f}_{jl}(\lambda)\}$ we get

$$E\{\hat{P}(\lambda, k_0)\} = \int_{-\pi}^{\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} P(x, k)\, |B(k - k_0)|^2\, |W_N(x - \lambda)|^2\, \frac{dx}{2\pi}\, dk_x\, dk_y \tag{13}$$

where $|W_N(x)|^2$ is the Bartlett window

$$|W_N(x)|^2 = \frac{1}{N} \left| \frac{\sin (N/2)x}{\sin (1/2)x} \right|^2 \tag{14}$$

and $|B(k)|^2$ is the beamforming array response pattern

$$B(k) = \frac{1}{K} \sum_{j=1}^{K} e^{ik \cdot x_j}. \tag{15}$$

Thus, $E\{\hat{P}(\lambda, k_0)\}$ is obtained by means of a frequency-wavenumber window $|W_N(x - \lambda)|^2 \cdot |B(k - k_0)|^2$. Hence, $\hat{P}$ will be an asymptotically unbiased estimate for $cP$ if $|W_N(x - \lambda) \cdot B(k - k_0)|^2$ approaches a delta function in such a way that

$$\int_{-\pi}^{\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} |W_N(x - \lambda) \cdot B(k - k_0)|^2\, \frac{dx}{2\pi}\, dk_x\, dk_y = c.$$

Using the results of [3] we can compute the variance of $\hat{P}$ as, assuming $\{N_{jm}\}$ is a multidimensional Gaussian process,

$$\mathrm{VAR}\{\hat{P}(\lambda, k_0)\} = \frac{1}{M}\left(E\{\hat{P}(\lambda, k_0)\}\right)^2 + \frac{1}{M}\left| \int_{-\pi}^{\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} P(x, k)\, B^{*}(k - k_0)\, B(k + k_0)\, |W_N(x - \lambda)|^2\, \frac{dx}{2\pi}\, dk_x\, dk_y \right|^2.$$

Thus,

$$\mathrm{VAR}\{\hat{P}(\lambda, k_0)\} = \begin{cases} \dfrac{1}{M}\left(E\{\hat{P}(\lambda, k_0)\}\right)^2, & |k_0| \neq 0 \\[2ex] \dfrac{2}{M}\left(E\{\hat{P}(\lambda, k_0)\}\right)^2, & |k_0| = 0. \end{cases} \tag{16}$$

Since the variance of $\hat{P}$ approaches zero as $M$ approaches infinity, it follows that $\hat{P}$ is a consistent estimate for $cP$. We follow Blackman and Tukey [4] and assume that $\hat{P}(\lambda, k_0)$ is a multiple of a chi-square variable so that to establish confidence intervals the chi-square distribution can be used with the number of degrees of freedom $k$ given by

$$k = \frac{2\left(E\{\hat{P}(\lambda, k_0)\}\right)^2}{\mathrm{VAR}\{\hat{P}(\lambda, k_0)\}} = \begin{cases} 2M, & |k_0| \neq 0 \\ M, & |k_0| = 0. \end{cases} \tag{17}$$

Thus, if $M = 36$ and $|k_0| \neq 0$, then $k = 72$ and the 90 percent confidence limits are approximately $\pm 1.2$ dB. When $|k_0| = 0$, these limits are approximately $\pm 1.6$ dB.

HIGH-RESOLUTION METHOD FOR ESTIMATING FREQUENCY-WAVENUMBER SPECTRUM

The high-resolution estimate for $P(\lambda, k)$ is defined as

$$P'(\lambda, k) = \left[ \sum_{j,l=1}^{K} q_{jl}(\lambda)\, \exp[ik \cdot (x_j - x_l)] \right]^{-1} \tag{18}$$

where $\{q_{jl}(\lambda)\}$ is the inverse of the spectral matrix $\{\hat{f}_{jl}(\lambda)\}$.

The motivation for this procedure can be given by writing (18) as

$$P'(\lambda, k) = \sum_{j,l=1}^{K} A_j^{*}(\lambda, k)\, A_l(\lambda, k)\, \hat{f}_{jl}(\lambda)\, \exp[ik \cdot (x_j - x_l)] \tag{19}$$

where

$$A_j(\lambda, k) = \frac{\sum_{l=1}^{K} q_{jl}(\lambda, k)}{\sum_{j,l=1}^{K} q_{jl}(\lambda, k)} \tag{20}$$

and $\{q_{jl}(\lambda, k)\}$ is the inverse of the matrix $\{\exp[ik \cdot (x_j - x_l)]\, \hat{f}_{jl}(\lambda)\}$. Thus, $P'(\lambda, k_0)$ is the power output of an array processor, known as a maximum-likelihood filter, whose design is determined by the sensor data and is different for each wavenumber $k_0$, which passes undistorted any monochromatic plane wave traveling at a velocity corresponding to the wavenumber $k_0$ and suppresses in an optimum least-squares sense the power of those waves traveling at velocities corresponding to wavenumbers other than $k_0$, cf. [3, (122) and (123)]. It should be noted that the amount of computation required to obtain $P'$ is almost the same as that to get $\hat{P}$, since only an additional Hermitian matrix inversion is required.

We now wish to compute the mean and variance of $P'$. In order to do this we assume that $M$, $N$ are large enough so that as an approximation we may replace $\hat{f}_{jl}(\lambda)$ by $f_{jl}(\lambda)$ in the definition for $A_j(\lambda, k)$ in (20). This then implies that the weights $A_j(\lambda, k)$ are not random and can be replaced by their expected values. This is a simplifying assumption, which is not actually valid, since these weights are designed from the data. However, it does appear to be a reasonable approximation. Using this assumption we have

$$E\{P'(\lambda, k_0)\} = \int_{-\pi}^{\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} P(x, k)\, |W_N(x - \lambda)|^2\, |B'(\lambda, k, k_0)|^2\, \frac{dx}{2\pi}\, dk_x\, dk_y \tag{21}$$

where

$$B'(\lambda, k, k_0) = \sum_{j=1}^{K} A_j(\lambda, k_0)\, \exp[i(k - k_0) \cdot x_j]. \tag{22}$$

It should be noted that the functional form, or shape, of $B'$ changes as a function of the wavenumber $k_0$. Thus, $E\{P'(\lambda, k_0)\}$ is obtained by means of a frequency-wavenumber window $|W_N(x - \lambda) \cdot B'(\lambda, k, k_0)|^2$. Hence, $P'$ will be an asymptotically unbiased estimate for $cP$ if $|W_N(x - \lambda) \cdot B'(\lambda, k, k_0)|^2$ approaches a 3-dimensional delta function in such a way that

$$\int_{-\pi}^{\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} |W_N(x - \lambda) \cdot B'(\lambda, k, k_0)|^2\, \frac{dx}{2\pi}\, dk_x\, dk_y = c \tag{23}$$

where $c$ is some positive number.
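The estimates defined so far translate fairly directly into array code. The following is a minimal NumPy sketch, not the author's program: it forms the block-averaged spectral matrix of (10) and (11) with $a_m = 1$, the conventional estimate (12) with $w_j = 1$, and the high-resolution estimate in both the inverse form (18) and the weight form (19)-(20). The function names, the `(K, L)` data layout, and the convention that sensor positions are given in kilometers and `k` in radians per kilometer are assumptions made for illustration.

```python
import numpy as np


def spectral_matrix(data, n_block, lam):
    """Block-averaged spectral matrix estimate, cf. (10) and (11), a_m = 1.

    data    : real or complex array of shape (K, L), one row per sensor
    n_block : N, number of samples per block
    lam     : normalized frequency, radians per sample
    """
    K, L = data.shape
    M = L // n_block                                   # number of whole blocks
    m = np.arange(1, n_block + 1)
    kernel = np.exp(1j * m * lam) / np.sqrt(n_block)   # N^(-1/2) exp(i m lam)
    blocks = data[:, :M * n_block].reshape(K, M, n_block)
    S = blocks @ kernel                                # S[j, n], shape (K, M)
    return (S @ S.conj().T) / M                        # f_hat[j, l]


def steering(k, x):
    """exp(i k . x_j) for positions x of shape (K, 2) and wavenumber k (2,)."""
    return np.exp(1j * (x @ k))


def conventional(f_hat, k, x):
    """Conventional estimate (12) with w_j = 1."""
    e = steering(k, x)
    return np.real(e @ f_hat @ e.conj()) / len(e) ** 2


def high_resolution(f_hat, k, x):
    """High-resolution estimate (18): 1 / sum_jl q_jl exp[i k.(x_j - x_l)]."""
    e = steering(k, x)
    q = np.linalg.inv(f_hat)
    return 1.0 / np.real(e @ q @ e.conj())


def high_resolution_weights(f_hat, k, x):
    """Weight form (19)-(20); returns the weights A_j and the estimate."""
    e = steering(k, x)
    g = e[:, None] * f_hat * e.conj()[None, :]         # exp[i k.(x_j - x_l)] f_hat[j, l]
    q = np.linalg.inv(g)
    A = q.sum(axis=1) / q.sum()                        # (20)
    return A, np.real(A.conj() @ g @ A)                # (19)
```

The weights returned by `high_resolution_weights` sum to one, so that $B'(\lambda, k_0, k_0) = 1$ in (22); this is the sense in which a plane wave at the steered wavenumber is passed undistorted. Both forms require the estimated spectral matrix to be nonsingular, which in practice means at least as many blocks as sensors.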

The variance of $P'$ is, assuming $\{N_{jm}\}$ is a multidimensional Gaussian process,

[Equation (24) is not legible in the source.]

Thus,

$$\mathrm{VAR}\{P'(\lambda, k_0)\} = \begin{cases} \dfrac{1}{M}\left(E\{P'(\lambda, k_0)\}\right)^2, & |k_0| \neq 0 \\[2ex] \dfrac{2}{M}\left(E\{P'(\lambda, k_0)\}\right)^2, & |k_0| = 0. \end{cases} \tag{25}$$

The confidence limits for $P'$ can be obtained in a manner similar to that for $\hat{P}$ described previously.

It will now be shown that the wavenumber resolution using $P'$ is higher than that obtained by using $\hat{P}$. We assume that a single plane wave is propagating across the array of sensors and that a noise component is present in each sensor which is incoherent between any pair of sensors. If $M$, $N$ are large, then the spectral matrix is given by

$$f_{jl}(\lambda_0) = \delta_{jl}(R)\, \exp[-ik_0 \cdot (x_j - x_l)], \qquad j, l = 1, \ldots, K \tag{26}$$

where

$$\delta_{jl}(R) = \begin{cases} 1, & j = l \\ 1 - R, & j \neq l, \end{cases} \tag{27}$$

$R$ is the ratio of the incoherent noise power to the total power of the sensor output, $k_0 = 2\pi f_0 \alpha$, $f_0$ is the frequency, $\lambda_0 = 2\pi f_0 T$, $T$ is the sampling period, $-\alpha$ is the slowness vector which points in the direction of propagation and has magnitude $|\alpha| = 1/v$, and $v$ is the phase velocity of the propagating wave. Hence, using (12) we have

$$\hat{P}(\lambda_0, k) = (1 - R)\,|B(\Delta k_0)|^2 + \frac{R}{K} \tag{28}$$

where

$$\Delta k_0 = k - k_0. \tag{29}$$

If we denote the matrix given in (26) by $F$, then

$$F = (1 - R)\, q'q + RI \tag{30}$$

where $I$ is the $K \times K$ identity matrix, $q$ is a $1 \times K$ row matrix whose $1j$th element is $\exp[ik_0 \cdot x_j]$ and $q'$ is a $K \times 1$ column matrix whose $j1$th element is $\exp[-ik_0 \cdot x_j]$. Now we have the following matrix inversion formula

$$\left[(1 - R)\, q'q + RI\right]^{-1} = \frac{1}{R}\left[ I - \frac{q'q}{K + \dfrac{R}{1 - R}} \right] \tag{31}$$

so that using (18) we have

$$P'(\lambda_0, k) = \frac{R}{K} \cdot \frac{1 - R + (R/K)}{1 - R + 2(R/K) - \hat{P}(\lambda_0, k)}. \tag{32}$$

If $k = k_0$, then

$$P'(\lambda_0, k_0) = \hat{P}(\lambda_0, k_0) = 1 - R + \frac{R}{K}. \tag{33}$$

In the vicinity of $k \cong k_0$ we consider that power contour, or those values of $k$, for which $\hat{P}(\lambda_0, k) = 1 - R$, which is still very close to the peak value of $1 - R + (R/K)$, since $R$ is small, between zero and unity, and $K$ is large, usually about 20. For these values of $k$ the denominator of (32) equals $2(R/K)$, so we get $P'(\lambda_0, k) = \tfrac{1}{2}(1 - R + R/K)$, and $P'$ is already 3 dB down from its peak value of $1 - R + R/K$. Hence, the wavenumber resolution using $P'$ will be much higher than that obtained with $\hat{P}$.

We now assume that there are two independent and random plane waves propagating across the array of sensors, plus incoherent noise, so that the spectral matrix is

$$F = q_1' q_1 + q_2' q_2 + RI \tag{34}$$

where $q_1$, $q_2$ are $1 \times K$ row matrices whose $1j$th elements are, respectively, $b_1 \exp[ik_1 \cdot x_j]$ and $b_2 \exp[ik_2 \cdot x_j]$, and $b_1^2 + b_2^2 + R = 1$. We now have

$$\hat{P}(\lambda_0, k) = b_1^2\, |B(\Delta k_1)|^2 + b_2^2\, |B(\Delta k_2)|^2 + \frac{R}{K} \tag{35}$$

where $\Delta k_j = k - k_j$. In order to find an expression for $P'(\lambda_0, k)$ we note the following matrix inversion formula

$$\left[RI + q_1' q_1 + q_2' q_2\right]^{-1} = \left(RI + q_1' q_1\right)^{-1} - \frac{\left(RI + q_1' q_1\right)^{-1} q_2' q_2 \left(RI + q_1' q_1\right)^{-1}}{1 + q_2 \left(RI + q_1' q_1\right)^{-1} q_2'}. \tag{36}$$

Using this formula, as well as that given in (31), we obtain

[Equation (37) is not legible in the source.]

where $\Delta k_{jl} = k_j - k_l$ and

$$P'_j(\lambda_0, k) = \frac{R}{K} \cdot \frac{b_j^2 + (R/K)}{b_j^2 + 2(R/K) - \hat{P}_j(\lambda_0, k)}, \qquad j = 1, 2, \tag{38}$$

$$\hat{P}_j(\lambda_0, k) = b_j^2\, |B(\Delta k_j)|^2 + \frac{R}{K}, \qquad j = 1, 2. \tag{39}$$

Thus, $P'_j$ is the high-resolution frequency-wavenumber spectrum obtained when only the $j$th propagating wave, plus incoherent noise, is present. In the vicinity of $k = k_1$ we would like $P' \cong P'_1$. Hence, the second term in (37) represents an undesired error term in this region which we would like to be as small as possible. It can easily be shown that this error term will be small compared to $1/P'_1$ if

$$b_1^2 \left[ R + K b_2^2 \left( 1 - |B(\Delta k_{12})|^2 \right) \right] \gg R\, |B(\Delta k_{12})|^2\, b_2^2 - \frac{R}{K}\left( R + K b_2^2 \right). \tag{40}$$

In most cases $R/K$ will be very small so that we may write (40) as

$$b_1^2 \gg \frac{R}{K} \cdot \frac{|B(\Delta k_{12})|^2}{1 - |B(\Delta k_{12})|^2}. \tag{41}$$

This inequality will be satisfied if either $R/K$ is small, $|B(\Delta k_{12})|^2$ is small, or both of these quantities are small. It should be noted that $|B(\Delta k_{12})|^2$ will be small if the wavenumber $k_1$ corresponding to one of the propagating waves is sufficiently different from the wavenumber $k_2$ of the other propagating wave. In this case the two propagating waves can be resolved in wavenumber by the natural beam pattern of the array of sensors, $|B(k)|^2$. However, if $|B(\Delta k_{12})|^2$ is not too small, so that the natural beam pattern cannot resolve the two waves, it is still possible for the high-resolution method to resolve the two waves if $R/K$ is small and $|B(\Delta k_{12})|^2 < 1$. Thus, we see the advantage of the high-resolution method over the conventional method of frequency-wavenumber spectrum analysis. In a similar manner we may show that in the vicinity of $k = k_2$, $P' \cong P'_2$ if

$$b_2^2 \gg \frac{R}{K} \cdot \frac{|B(\Delta k_{21})|^2}{1 - |B(\Delta k_{21})|^2}. \tag{42}$$

Thus, we have shown that a certain type of linearity holds, i.e.,

$$P'(\lambda_0, k) \cong \sum_{j=1}^{2} P'_j(\lambda_0, k). \tag{43}$$

In other words, the high-resolution spectrum of the sum of two propagating waves, plus a small amount of incoherent noise, is the sum of the high-resolution spectra for the individual waves, as indicated in (43). This result may be extended to the case of $M$ propagating waves, but the precise conditions for the linearity to hold become cumbersome to derive. It has been found experimentally that linearity will hold if there is sufficient wavenumber separation between the propagating waves and if the beam pattern $|B(k)|^2$ is reasonably good.

The high-resolution estimate is based on the inverse of the estimated spectral matrix, cf. (18). Therefore, the problem of whether this inverse exists is extremely important. As mentioned previously, it has been shown that $\{\hat{f}_{jl}(\lambda)\}$ is a nonnegative-definite matrix when the block averaging method of spectral estimation is used [3]. However, this is not good enough to insure that the inverse of the spectral matrix exists. That is, it must be shown that the spectral matrix is positive-definite in order that its inverse exist. In fact, if the number of blocks $M$ is less than the number of sensors $K$, then the spectral matrix is of order $K$, but only of rank $M$ at most and is thus singular. This can be seen by writing the spectral matrix $F$ as

$$F = \frac{1}{M} \sum_{n=1}^{M} q_n' q_n \tag{44}$$

where $q_n$ is a $1 \times K$ row matrix whose $1j$th element is

$$N^{-1/2} \sum_{m=1}^{N} N^{*}_{j,\,m+(n-1)N}\, e^{-im\lambda}.$$

Thus, $F$ is the sum of $M$ matrices each having rank unity, and the rank of $F$ cannot exceed the sum of the ranks, namely $M$. Hence, $F$ has rank $M$ at most and, if $M < K$, $F$ must be singular. Therefore, a necessary, but not sufficient, condition for $F$ to be nonsingular is $M \geq K$. As a practical matter, it is found that whenever $M \geq K$, $F$ will be nonsingular, providing there is reasonable data in each block. However, in some cases it is not possible to obtain $K$ or more data blocks. This situation arises, for example, when analysis of transient signals is desired whose time duration, unlike that of the noise, is very short. In order to make the spectral matrix nonsingular a small amount of incoherent noise is added. This is accomplished by modifying the matrix $F$ given in (44) into the matrix $F'$ given by

$$F' = (1 - R)F + RI. \tag{45}$$

We now show that $F'$ is positive definite, and thus nonsingular. Consider the quadratic form $Q$ associated with the matrix $F'$

$$Q = \sum_{j=1}^{K} \sum_{l=1}^{K} a_j\, a_l^{*}\, \hat{f}'_{jl}(\lambda) = \frac{1 - R}{M} \sum_{n=1}^{M} \left| \sum_{j=1}^{K} a_j\, N^{-1/2} \sum_{m=1}^{N} N_{j,\,m+(n-1)N}\, e^{im\lambda} \right|^2 + R \sum_{j=1}^{K} |a_j|^2. \tag{46}$$

Now, if $Q = 0$, we must have

$$\sum_{j=1}^{K} |a_j|^2 = 0 \tag{47}$$

since the first term in (46) is always nonnegative. However, (47) implies that $a_j = 0$, $j = 1, \ldots, K$, which proves that $F'$ is positive definite.

APPLICATIONS TO SEISMIC DATA

We now wish to describe the application of the conventional and high-resolution frequency-wavenumber spectrum estimates to seismic data obtained from LASA.
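Before turning to the measurements, the single-wave resolution argument and the incoherent-noise modification can be checked on a synthetic array. The sketch below uses NumPy; the array geometry, function names, and parameter values are illustrative assumptions, not LASA values. It builds the model matrix (30), compares the directly computed estimates with the closed forms (28) and (32), and applies the modification (45) to a rank-deficient matrix.

```python
import numpy as np


def beam_pattern(dk, x):
    """B(k) of (15): (1/K) sum_j exp(i k . x_j)."""
    return np.mean(np.exp(1j * (x @ dk)))


def single_wave_matrix(k0, x, R):
    """Model spectral matrix (30): F = (1 - R) q'q + R I, q_j = exp(i k0 . x_j)."""
    q = np.exp(1j * (x @ k0))
    return (1 - R) * np.outer(q.conj(), q) + R * np.eye(len(x))


def conventional(F, k, x):
    """Conventional estimate (12) with w_j = 1."""
    e = np.exp(1j * (x @ k))
    return np.real(e @ F @ e.conj()) / len(x) ** 2


def high_resolution(F, k, x):
    """High-resolution estimate (18), via a linear solve rather than an inverse."""
    e = np.exp(1j * (x @ k))
    return 1.0 / np.real(e @ np.linalg.solve(F, e.conj()))


def closed_forms(k, k0, x, R):
    """Closed forms (28) and (32) for one plane wave plus incoherent noise."""
    K = len(x)
    p = (1 - R) * abs(beam_pattern(k - k0, x)) ** 2 + R / K
    p_hr = (R / K) * (1 - R + R / K) / (1 - R + 2 * R / K - p)
    return p, p_hr


def add_incoherent_noise(F, R):
    """Modification (45): F' = (1 - R) F + R I."""
    return (1 - R) * F + R * np.eye(F.shape[0])
```

Since `F` in (44) is nonnegative definite, every eigenvalue of the matrix returned by `add_incoherent_noise` is at least `R`, which is the numerical counterpart of the positive-definiteness argument in (46) and (47). Using `np.linalg.solve` in place of an explicit inverse is a design choice of this sketch; it gives the same value as (18) while avoiding the formation of the inverse matrix.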

150

-- - --N

/'

/

/

/

"....

/@

@

§

I I

@

~

\

\

\

F1

@

@)

81 AO 83 82

@

8

@ @ \

.............

'-@ -,

<,

<,

--

@

@ ~------

--

I t 4 - - - - - - 200 km - - - - - - - . .

.\

TABLE I PARAMETERS USED IN MEASUREMENT OF FREQUENCy-WAVENUMBER SPECTRUM

DATA

Sampling Rate (Hz)

Array

Aperture (krn)

Number of Nominal Number of Samples per Sensors = K Btock=N

Frequency Resolution (Hz)

Confidence Limits

± 1.2

----_._------------------~-----_.-

LPZ noise LPZ Rayleigh surface-wave event (entire wave) LPZ Rayleigh surface-wave event (200 seconds at a time) SP noise

1 10

200

21

100

0.01

36

:00

21

100

0.01

36

:00

21 25

100 100

0.01 0.10

2 36

36

[Table I, remaining column headings and values as extracted: Number of Blocks = M; 90% Confidence Limits (dB): ±1.2; Added Amount of Incoherent Noise = R: 0, 0, 0.05, 0.]

LASA consists of 21 subarrays of 25 short-period (SP) vertical seismometers as indicated in Fig. 1. At the center of each subarray there is a three-component set of long-period (LP) seismometers oriented in the vertical (Z), north-south (NS), and east-west (EW) directions. As mentioned previously, a direct segment, or block averaging, method of spectral estimation was employed. The weights $a_j = 1$, $j = 1, \ldots, N$, were used, cf. (10), so that a Bartlett frequency window was used in the spectral estimation [4]. The seismic data considered were LPZ noise, LPZ Rayleigh surface-wave events, and SP noise. The parameters used in the measurement are given in Table I. The results of the conventional frequency-wavenumber spectrum measurement program are displayed, at a fixed frequency, as contours of $-10 \log [P(\lambda, \mathbf{k})/P_{\max}]$ vs. $k_x$, $k_y$, where $P_{\max}$ is the maximum value of $P$. The wavenumber coordinates are in cycles per kilometer. The wavenumber grid on which $P$ is computed consists of 61 x 61 points. The level of the contours varies from 0 to 12 dB in steps of 1 dB. The display of the high-resolution results is similar to that of the conventional results with the only exception that the contour levels are incremented by 2 dB. It should be noted that if a wave is propagating from the north with a velocity corresponding to the wavenumber $k_0$, then the wavenumber spectrum results will show a peak at the point $k_x = 0$, $k_y = |k_0|/2\pi$, i.e., the peak will appear above the origin of the wavenumber axes. The transfer function of the LP system is shown in Fig. 2. The results of both the conventional and high-resolution frequency-wavenumber spectrum measurements for LPZ

[Plot not reproduced; horizontal axis: FREQUENCY (Hz).]

Fig. 2. Long-period system transfer function.

noise are shown in Figs. 3 and 4 for two different noise samples taken on 7 April 1967 and 26 January 1967, respectively. These figures show that the conventional and high-resolution results are in agreement as both methods tend to show strong peaks occurring at the same wavenumber in each program. However, the high-resolution method delineates the frequency-wavenumber spectrum much more clearly than the conventional method, especially in the suppression of the sidelobe level. This is demonstrated quite well in Fig. 4 which shows a 360° azimuthal spread for the wavenumber structure with a variable power density along this circle. This is, of course, exactly what would be expected since the dispersion curve of the LPZ propagating seismic noise has been measured and found to correspond to that of a fundamental mode Rayleigh wave [5]. This implies that at a given period the phase velocity of the propagating noise at LASA must be constant, independent of the location of the sources of the noise, and thus its frequency-wavenumber spectrum must consist of an arc, or arcs, whose extent corresponds to the range of the azimuths of the noise sources. The results of Fig. 3 show the noise consists essentially of a single wave propagating from the north. In this case the conventional result should appear essentially the same as the beam pattern of LASA, with the peak of the beam pattern occurring at the wavenumber corresponding to the vector velocity of the propagating wave. That this is indeed the case can be seen by comparing Fig. 3 with Fig. 5 which

shows the beam pattern of LASA. It should be noted that frequency-wavenumber spectra were computed for a theoretical model of the LPZ noise and showed excellent agreement with the measurements obtained using the actual LPZ seismic noise data. We also mention that the computer running time to produce a pair of plots, such as is shown in Fig. 3, is approximately 10 minutes using the IBM 360/67. Another application of interest is to LPZ Rayleigh surface-wave events. In this case the propagating waves are transients in time, and the field of sensor outputs cannot be considered as a homogeneous random field, as is the case with propagating seismic noise waves. Therefore, the frequency-wavenumber spectrum must be redefined in this case. Toward this end consider the time correlation function

The spectral densities $f_{jl}(\lambda)$, $f(\lambda, \mathbf{r})$ are defined in the same manner as previously, cf. (2), (4), respectively, and the frequency-wavenumber spectral density $P(\lambda, \mathbf{k})$ is also defined as previously in (9). The measurement of $P(\lambda, \mathbf{k})$ is still done by the direct segment method as indicated in (12) and (18). This represents an approximation which produces reasonable results. The frequency-wavenumber spectrum was measured for the 21 November 1966 Kurile Islands event whose param-

[Contour plots not reproduced. Panels: CONVENTIONAL and HIGH RESOLUTION; frequency = 0.03 Hz; axes: wavenumber (cycles/km); 7 Apr 67 noise sample, 23:30:00 to 00:30:00.]

Fig. 3. Conventional and high-resolution frequency-wavenumber spectra for 7 April 1967 long-period noise sample.

[Contour plots not reproduced. Panels: CONVENTIONAL and HIGH RESOLUTION; frequency = 0.05 Hz; axes: wavenumber (cycles/km); 26 Jan 67 noise sample.]

Fig. 4. Conventional and high-resolution frequency-wavenumber spectra for 26 January 1967 long-period noise sample.

[Contour plot not reproduced; axes: wavenumber (cycles/km).]

Fig. 5. The beam pattern for the large aperture seismic array.

TABLE II
PARAMETERS FOR 21 NOVEMBER 1966 KURILE ISLANDS EVENT

Date:                 21 November 1966
Region:               Kurile Islands
Origin time:          12:19:27
Latitude:             46.7 N
Longitude:            152.5 E
Distance:             64.3°
Azimuth:              312°
Depth:                40 km
Body-wave magnitude:  6.0

[Fig. 7(a) contour plots not reproduced. Panels: CONVENTIONAL and HIGH RESOLUTION; frequency = 0.03 Hz; axes: wavenumber (cycles/km); annotation: 3.5 km/s.]

[Waveform traces not reproduced; trace labels include F4, B3, and C4. Interleaved Fig. 7 panel labels: CONVENTIONAL and HIGH RESOLUTION; 21 Nov 66 Kurile Islands event, 12:40:00 to 13:40:00; axes: wavenumber (cycles/km).]

Fig. 6. The long-period waveforms for 21 November 1966 Kurile Islands event.

[Fig. 7(b) panel labels: frequency = 0.04 Hz; 21 Nov 66 Kurile Islands event, 12:40:00 to 13:40:00; axes: wavenumber (cycles/km).]

eters are given in Table II. The LPZ Rayleigh surface waves of this event are shown in Fig. 6. The results obtained by measuring the frequency-wavenumber spectrum over the entire LPZ Rayleigh surface-wave train, as indicated in Table I, are given in Fig. 7 for frequencies of 0.03, 0.04, and 0.05 Hz. It is known that the beating, or modulation, of the envelope of these surface waves, as shown in Fig. 6, is caused by multiple path propagation, especially at shorter periods, cf. [6]-[8]. This multipath propagation effect is shown quite clearly at 0.04 Hz where two peaks are resolvable. One peak is at an azimuth corresponding to the initial wave arriving along the great circle path between LASA and the Kurile Islands, while the other peak shows the later multipath arrival propagating from the northwest. In order to determine the time delay between the multipath arrivals at LASA, for the 25-second period group, the frequency-wavenumber spectrum was measured over successive 200-second-long blocks of time, as indicated in Table I. The results are given in Fig. 8, which, for simplicity, shows only the high-resolution results. Fig. 8(a) shows that the initial 25-second period group arrives from approximately the azimuth of the event, while Figs. 8(b)-(d) show the later arrivals coming from a more northerly direction. The time delay between the multipath arrivals appears to be about 200 seconds, since the emergence of a secondary peak to the north is visible in Fig. 8(b). The group velocity for these waves at the 25-second period is about 3.3 km/s so that a path length difference of about 660 km or 6 degrees

[Contour plots not reproduced. Panels: CONVENTIONAL and HIGH RESOLUTION; annotation: 3.5 km/s.]

Fig. 7. Conventional and high-resolution frequency-wavenumber spectra for 21 November 1966 Kurile Islands event: 12:40:00 to 13:40:00. (a) Frequency = 0.03 Hz. (b) Frequency = 0.04 Hz. (c) Frequency = 0.05 Hz.

exists between the two multipath arrivals. Similar results have been obtained by Evernden by measuring phase velocities with a tripartite array [7], [8]. In addition, Evernden gives a theory to explain the causes of the multipath propagation of Rayleigh surface waves. We now discuss the application of our results to SP noise. The transfer function of the SP system is shown in Fig. 9.
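The path-length figure quoted for the multipath arrivals can be checked with a short calculation. The sketch below multiplies the stated group velocity (about 3.3 km/s) by the stated delay (about 200 s) and converts the result to degrees of arc. The mean Earth radius of 6371 km and the function name `path_difference` are assumptions made for this illustration; they do not come from the paper.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, not a value from the paper


def path_difference(group_velocity_km_s, delay_s, radius_km=EARTH_RADIUS_KM):
    """Return (extra path length in km, the same length in degrees of arc)."""
    extra_km = group_velocity_km_s * delay_s
    extra_deg = math.degrees(extra_km / radius_km)
    return extra_km, extra_deg


# Values quoted in the text: 3.3 km/s group velocity, 200 s delay.
extra_km, extra_deg = path_difference(3.3, 200.0)
print(f"{extra_km:.0f} km, about {extra_deg:.1f} degrees of arc")
```

Under these assumptions the result agrees with the rounded figure of 660 km, or roughly 6 degrees, given in the text.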

[Contour plots not reproduced. The extracted labels mix panels of Fig. 8 (frequency = 0.04 Hz, 21 Nov 66 Kurile Islands event) and Fig. 10 (frequencies 0.2 Hz and 0.6 Hz, 1 Feb 67 noise sample, 08:03:00 to 08:09:00); panels: CONVENTIONAL and HIGH RESOLUTION; axes: wavenumber (cycles/km).]

Fig. 8. Conventional and high-resolution frequency-wavenumber spectra for successive 200-second intervals of 21 November 1966 Kurile Islands event. Frequency = 0.04 Hz. Wavenumber (cycles/km). (a) 12:51:00 to 12:54:20. (b) 12:54:20 to 12:57:40. (c) 12:57:40 to 13:01:00. (d) 13:01:00 to 13:04:20.

[Plots not reproduced. Fig. 10(c) panel labels: CONVENTIONAL and HIGH RESOLUTION; frequency = 1.0 Hz; 1 Feb 67 noise sample, 08:03:00 to 08:09:00; axes: wavenumber (cycles/km). Fig. 9 annotations: E0 = response at 1 Hz; E0 = 0.019 volts/mμ; horizontal axis: FREQUENCY (Hz).]

Fig. 9. Short-period system transfer function.

Fig. 10. Conventional and high-resolution frequency-wavenumber spectra for 1 February 1967 short-period noise sample: 08:03:00 to 08:09:00. (a) Frequency = 0.2 Hz. (b) Frequency = 0.6 Hz. (c) Frequency = 1.0 Hz.

The results of both the conventional and high-resolution frequency-wavenumber spectrum measurements for SP noise are shown in Fig. 10 for a noise sample taken on 1 February 1967. The array of SP seismometers used in this measurement is shown in Fig. 11 and the beam pattern for this array is shown in Fig. 12. The results of Fig. 10 show that at 0.2 Hz the SP noise consists of two components, a high-velocity body wave whose horizontal phase velocity is about 13.5 km/s and a low-velocity surface wave whose phase velocity is about 3.5 km/s. At frequencies of 0.6 Hz and 1.0 Hz the SP noise consists primarily of body waves.

CONCLUSIONS

The estimation of the frequency-wavenumber power spectral density is of considerable importance in the analysis of propagating waves by an array of sensors. The conventional method of estimation employs a fixed wavenumber window, and, as a consequence, the wavenumber resolution is determined essentially by the natural beam pattern of the array of sensors. The high-resolution method of estimation employs a wavenumber window whose shape, and thus sidelobe structure, changes and is a function of the wavenumber at which an estimate is obtained. In addition, this change in wavenumber window shape is performed in an optimum manner, as pointed out previously. As a consequence, it has been shown that the wavenumber resolution of this method is determined primarily by the amount of incoherent noise which is present in the array of sensors, and, to a lesser extent, by the natural beam pattern of the array. The experimental results show a considerable improvement of wavenumber resolution of the high-resolution method relative to the conventional method. In the case of LPZ seismic noise there was an improvement of about a factor of four, cf. Fig. 3. Thus, the high-resolution method is extremely useful for the estimation of the frequency-wavenumber spectrum when the incoherent noise power is relatively small compared to the power of the propagating waves.

The high-resolution method would, of course, be useful in applications other than seismic arrays. We now mention briefly the application of the method to radio astronomy. It is now possible to synchronize the outputs recorded at several radio astronomy telescopes [9]. Thus, these telescopes can be considered as sensors in an array (cf. [9, Fig. 1]). If the incoherent noise power in each telescope is sufficiently small, i.e., the radio signals from distant stars recorded by the telescopes should be coherent and there should be relatively little incoherent background noise power, then the high-resolution method is directly applicable for the purpose of using this array of telescopes to map the sources of radio energy.

[Diagram not reproduced: positions of the labeled short-period sensors, with compass directions N, E, and W.]

Fig. 11. A subarray of short-period sensors.

[Contour plot not reproduced; axes: wavenumber (cycles/km).]

Fig. 12. The beam pattern for the subarray of short-period sensors.

REFERENCES

[1] P. E. Green, Jr., R. A. Frosch, and C. F. Romney, "Principles of an experimental large aperture seismic array (LASA)," Proc. IEEE, vol. 53, pp. 1821-1833, December 1965.
[2] A. M. Yaglom, An Introduction to the Theory of Stationary Random Functions. Englewood Cliffs, N.J.: Prentice-Hall, 1962.
[3] J. Capon, R. J. Greenfield, and R. J. Kolker, "Multidimensional maximum-likelihood processing of a large aperture seismic array," Proc. IEEE, vol. 55, pp. 192-211, February 1967.
[4] R. B. Blackman and J. W. Tukey, The Measurement of Power Spectra

An Algorithm for Linearly Constrained Adaptive Array Processing

OTIS LAMONT FROST, III, MEMBER, IEEE

Abstract-A constrained least mean-squares algorithm has been derived which is capable of adjusting an array of sensors in real time to respond to a signal coming from a desired direction while discriminating against noises coming from other directions. Analysis and computer simulations confirm that the algorithm is able to iteratively adapt variable weights on the taps of the sensor array to minimize noise power in the array output. A set of linear equality constraints on the weights maintains a chosen frequency characteristic for the array in the direction of interest. The array problem would be a classical constrained least-mean-squares problem except that the signal and noise statistics are assumed unknown a priori. A geometrical presentation shows that the algorithm is able to maintain the constraints and prevent the accumulation of quantization errors in a digital implementation.

[Fig. 1 diagram not reproduced. Extracted labels: time delays, signal, noises, equivalent look-direction tapped delay line.]

I. INTRODUCTION

THIS PAPER describes a simple algorithm for adjusting an array of sensors in real time to respond to a desired signal while discriminating against noises. A "signal" is here defined as a waveform of interest which arrives in plane waves from a chosen direction (called the "look direction"). The algorithm iteratively adapts the weights of a broad-band sensor array (Fig. 1) to minimize noise power at the array output while maintaining a chosen frequency response in the look direction. The algorithm, called the "Constrained Least Mean-Squares" or "Constrained LMS" algorithm, is a simple stochastic gradient-descent algorithm which requires only that the direction of arrival and a frequency band of interest be specified a priori. In the adaptive process, the algorithm progressively learns statistics of noise arriving from directions other than the look direction. Noise arriving from the look direction may be filtered out by a suitable choice of the frequency response characteristic in that direction, or by external means. Subsequent processing of the array output may be done for detection or classification. A major advantage of the constrained LMS algorithm is that it has a self-correcting feature permitting it to operate for arbitrarily long periods of time in a digital computer implementation without deviating from its constraints because of cumulative roundoff or truncation errors. The algorithm is applicable to array processing problems in geoscience, sonar, and electromagnetic antenna arrays in which a simple method is required for adjusting an array in real time to discriminate against noises impinging on the array sidelobes.

[Diagram labels: ADJUSTABLE WEIGHTS; EQUIVALENT SIGNAL OUTPUT.]

Fig. 1. Broad-band antenna array and equivalent processor for signals coming from the look direction.

Previous Work

Previous work on iterative least squares array processing was done by Griffiths [1]; his method uses an unconstrained minimum-mean-square-error optimization criterion which requires a priori knowledge of second-order signal statistics. Widrow, Mantey, Griffiths, and Goode [2] proposed a variable-criterion [3] optimization procedure involving the use of a known training signal; this was an application and extension of the original work on adaptive filters done by Widrow and Hoff [4]. Griffiths also proposed a constrained least mean-squares processor not requiring a priori knowledge of the signal statistics [5]; a new derivation of this processor, given in [6], shows that it may be considered as putting "soft" constraints on the processor via the quadratic penalty-function method.

"Hard" (i.e., exactly)-constrained iterative optimization was studied by Rosen [7] for the deterministic case. Lacoss [8], Booker et al. [9], and Kobayashi [10] studied "hard"-constrained optimization in the array processing context for filtering short lengths of data. All four authors used gradient-projection techniques [11]; Rosen and Booker correctly indicated that gradient-projection methods are susceptible to cumulative roundoff errors and are not suitable for long runs without an additional error-correction procedure. The constrained LMS algorithm developed in the present work is designed to avoid error accumulation while maintaining a "hard" constraint; as a result, it is able to provide continual filtering for arbitrarily large numbers of iterations.

Manuscript received December 23, 1971; revised May 4, 1972. This research is based on a Ph.D. dissertation in the Department of Electrical Engineering, Stanford University, Stanford, Calif. The author is with ARGOSystems, Inc., Palo Alto, Calif. 94303.

Reprinted from Proceedings of the IEEE, Vol. 60, No. 8, pp. 926-935, August 1972.

Basic Principle of the Constraints

The algorithm is able to maintain a chosen frequency response in the look direction while minimizing output noise power because of a simple relation between the look direction frequency response and the weights in the array of Fig. 1. Assume that the look direction is chosen as the direction perpendicular to the line of sensors. Then identical signal components arriving on a plane wavefront parallel to the line of sensors appear at the first taps simultaneously and parade in parallel down the tapped delay lines following each sensor; however, noise waveforms arriving from other than the look direction will not, in general, produce equal voltage components on any given vertical column of taps. The voltages (signal plus noise) at each tap are multiplied by the tap weights and added to form the array output. Thus as far as the signal is concerned, the array processor is equivalent to a single tapped delay line in which each weight is equal to the sum of the weights in the corresponding vertical column of the processor, as indicated in Fig. 1. These summation weights in the equivalent tapped delay line must be selected so as to give the desired frequency response characteristic in the look direction. If the look direction is chosen to be other than that perpendicular to the line of sensors, then the array can be steered either mechanically or electrically by the addition of steering time delays (not shown) placed immediately after each sensor.

A processor having K sensors and J taps per sensor has KJ weights and requires J constraints to determine its look-direction frequency response. The remaining KJ - J degrees of freedom in choosing the weights may be used to minimize the total power in the array output. Since the look-direction frequency response is fixed by the J constraints, minimization of the total output power is equivalent to minimizing the non-look-direction noise power, so long as the set of signal voltages at the taps is uncorrelated with the set of noise voltages at these taps. The latter assumption has commonly been made in previous work on iterative array processing [1], [5], [8]-[10]. The effect of signal-correlated noise in the array may be to cancel out all or part of the desired signal component in the array output. Sources of signal-correlated noise may be multiple signal-propagation paths, and coherent radar or sonar "clutter."

It is permissible, and in fact desirable for proper noise cancellation, that the voltages produced by the noises on the taps of the array be correlated among themselves, although uncorrelated with the signal voltages. Examples of such noises include waveforms from point sources in other than the look direction (e.g., lightning, "jammers," noise from nearby vehicles), spatially localized incoherent clutter, and self-noise from the structure carrying the array. Noise voltages which are uncorrelated between taps (e.g., amplifier thermal noise) may be partially rejected by the adaptive array in two ways. As in a conventional nonadaptive array, such noises are eliminated to the extent that signal voltages on the taps are added coherently at the array output, while uncorrelated noise voltages are added incoherently. Second, an adaptive array can reduce the weighting on any tap that may have a disproportionately large uncorrelated noise power.
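The equivalence between the full array processor and a single tapped delay line, as described above for look-direction signals, can be illustrated numerically. The following is a minimal sketch, assuming the KJ weights are stored as a flat list ordered column by column (K taps per column); the names `array_output` and `column_sums` and the values K = 4, J = 3 are hypothetical choices for this example.

```python
import random


def array_output(weights, taps):
    """Weighted sum of all tap voltages (the array output)."""
    return sum(w * x for w, x in zip(weights, taps))


def column_sums(weights, K, J):
    """Weights of the equivalent tapped delay line: the sum of each group of K weights."""
    return [sum(weights[j * K:(j + 1) * K]) for j in range(J)]


K, J = 4, 3                      # hypothetical array: 4 sensors, 3 taps per sensor
rng = random.Random(0)
weights = [rng.uniform(-1.0, 1.0) for _ in range(K * J)]

# A look-direction waveform puts the same voltage on every tap of a column:
# column j carries the signal delayed by j tap delays.
delayed_signal = [rng.gauss(0.0, 1.0) for _ in range(J)]
look_direction_taps = [delayed_signal[j] for j in range(J) for _ in range(K)]

full = array_output(weights, look_direction_taps)
equivalent = sum(f * s for f, s in zip(column_sums(weights, K, J), delayed_signal))
print(abs(full - equivalent) < 1e-12)
```

For a look-direction input the two outputs agree to rounding error, which is why J constraints on the column sums are enough to fix the look-direction response. Noise from other directions does not repeat across a column, so no such equivalence holds for it.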

II. OPTIMUM-CONSTRAINED LMS WEIGHT VECTOR

The first step in developing the constrained LMS algorithm is to find the optimum weight vector.

Notation

Notation will be as follows (see Fig. 2): Every $\Delta$ seconds, where $\Delta$ may be a multiple of the delay $\tau$ between taps, the voltages at the array taps are sampled. The vector of tap voltages at the kth sample is written $X(k)$, where

$$ X^T(k) \triangleq [x_1(k\Delta), x_2(k\Delta), \ldots, x_{KJ}(k\Delta)]. $$

The superscript T denotes transpose. The tap voltages are the sums of voltages due to look-direction waveforms $l$ and non-look-direction noises $n$, so that

$$ X(k) = L(k) + N(k) \tag{1} $$

where the KJ-dimensional vector of look-direction waveforms at the kth sample is

$$ L(k) \triangleq [\underbrace{l(k\Delta), \ldots, l(k\Delta)}_{K\ \text{taps}},\ \underbrace{l(k\Delta - \tau), \ldots, l(k\Delta - \tau)}_{K\ \text{taps}},\ \ldots,\ \underbrace{l(k\Delta - (J-1)\tau), \ldots, l(k\Delta - (J-1)\tau)}_{K\ \text{taps}}]^T $$

and the vector of non-look-direction noises is

$$ N^T(k) \triangleq [n_1(k\Delta), n_2(k\Delta), \ldots, n_{KJ}(k\Delta)]. $$

The vector of weights at each tap is $W$, where

$$ W^T \triangleq [w_1, w_2, \ldots, w_{KJ}]. $$

It is assumed for this derivation that the signals and noises are adequately modeled as zero-mean random processes with (unknown) second-order statistics:

$$ E[X(k)X^T(k)] \triangleq R_{XX} \tag{2a} $$

$$ E[N(k)N^T(k)] \triangleq R_{NN} \tag{2b} $$

$$ E[L(k)L^T(k)] \triangleq R_{LL}. \tag{2c} $$

As previously stated, it is assumed that the vector of look-direction waveforms is uncorrelated with the vector of non-look-direction noises, i.e.,

$$ E[N(k)L^T(k)] = 0. \tag{3} $$

It is assumed that the noise environment is distributed so that $R_{XX}$ and $R_{NN}$ are positive definite [12]. The output of the array (signal estimate) at the time of the kth sample is

$$ y(k) = W^T X(k) = X^T(k) W. \tag{4} $$

Using (4) the expected output power of the array is

$$ E[y^2(k)] = E[W^T X(k) X^T(k) W] = W^T R_{XX} W. \tag{5} $$
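Relations (4) and (5) have an exact sample counterpart: the average of $y^2(k)$ over a block of data equals $W^T \hat{R} W$, where $\hat{R}$ is the average of the outer products $X(k)X^T(k)$ over the same block. The sketch below checks this on synthetic data. The helper names, the six-tap vector length, and the Gaussian test data are all assumptions for illustration, not part of the paper.

```python
import random


def outer_average(samples):
    """Sample correlation matrix: the average of X X^T over the tap-voltage vectors."""
    n = len(samples[0])
    R = [[0.0] * n for _ in range(n)]
    for x in samples:
        for a in range(n):
            for b in range(n):
                R[a][b] += x[a] * x[b] / len(samples)
    return R


def quadratic_form(w, R):
    """Evaluate W^T R W for a weight list w and a square matrix R."""
    n = len(w)
    return sum(w[a] * R[a][b] * w[b] for a in range(n) for b in range(n))


rng = random.Random(1)
n_taps = 6                                   # hypothetical K*J
samples = [[rng.gauss(0.0, 1.0) for _ in range(n_taps)] for _ in range(500)]
w = [rng.uniform(-1.0, 1.0) for _ in range(n_taps)]

# Average output power computed directly from y(k) = W^T X(k).
mean_power = sum(sum(wi * xi for wi, xi in zip(w, x)) ** 2 for x in samples) / len(samples)
# The same quantity computed from the sample correlation matrix.
predicted = quadratic_form(w, outer_average(samples))
print(abs(mean_power - predicted) < 1e-9)
```

The two numbers agree to rounding error because the identity is purely algebraic; the expectation in (5) replaces the block average with its statistical limit.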

The constraint that the weights on the jth vertical column of taps sum to a chosen number $f_j$ (see Fig. 1) is expressed by the requirement

$$ c_j^T W = f_j, \qquad j = 1, 2, \ldots, J \tag{6} $$

where the KJ-dimensional vector $c_j$ has the form

$$ c_j = [\underbrace{0, \ldots, 0}_{K},\ \ldots,\ \underbrace{1, \ldots, 1}_{j\text{th group of } K \text{ elements}},\ \ldots,\ \underbrace{0, \ldots, 0}_{K}]^T. \tag{7} $$

[Fig. 2 diagram not reproduced. Extracted labels: adjustable weights, look direction, signals and noises, noise A, noise B, non-look direction noises, J taps per sensor.]

Fig. 2. Signals and noises on the array. Because the array is steered toward the look direction, all beam signal components on any given column of filter taps are identical.

Constraining the weight vector to satisfy the J equations of (6) restricts $W$ to a $(KJ - J)$-dimensional plane. Define the constraint matrix $C$, which has J columns, as

$$ C \triangleq [c_1 \ \cdots \ c_j \ \cdots \ c_J] \tag{8} $$

and define $\mathcal{F}$ as the J-dimensional vector of weights of the look-direction-equivalent tapped delay line shown in Fig. 1:

$$ \mathcal{F} \triangleq [f_1, f_2, \ldots, f_J]^T. \tag{9} $$

By inspection the constraint vectors $c_j$ are linearly independent, hence, $C$ has full rank equal to J. The constraints (6) are now written

$$ C^T W = \mathcal{F}. \tag{10} $$

Optimum Weight Vector

Since the look-direction frequency response is fixed by the J constraints, minimization of the non-look-direction noise power is the same as minimization of the total output power. The cost criterion used in this paper will be minimization of total array output power $W^T R_{XX} W$. The problem of finding the optimum set of filter weights $W_{opt}$ is summarized by (5) and (10) as

$$ \min_{W} \ W^T R_{XX} W \tag{11a} $$

$$ \text{subject to } C^T W = \mathcal{F}. \tag{11b} $$

This is the constrained LMS problem. $W_{opt}$ is found by the method of Lagrange multipliers, which is discussed in [13]. Including a factor of 1/2 to simplify later arithmetic, the constraint function is adjoined to the cost function by a J-dimensional vector of undetermined Lagrange multipliers $\lambda$:

$$ H(W) = \tfrac{1}{2} W^T R_{XX} W + \lambda^T (C^T W - \mathcal{F}). \tag{12} $$

Taking the gradient of (12) with respect to $W$

$$ \nabla_W H(W) = R_{XX} W + C\lambda. \tag{13} $$

The first term in (13) is a vector proportional to the gradient of the cost function (11a), and the second term is a vector

159

normal to the (KJ - J)-dimensional constraint plane defined by C1'W -ff = 0 [14]. For optimality these vectors must be antiparallel [13], which is achieved by setting the sum of the vectors (13) eq ual to zero

VwH(W) = RxxW

+ CX

=

implement and, for a given computational cost, is applicable to arrays in which the number of weights is on the order of the square of the number that could be handled by the iterative matrix inversion method and the cube of the number that could be handled by the direct su bstitu tion method.

o.

Derivation

In terms of the Lagrange multipliers, the optimal weight vector is then

lVo p t = - RXX-lCX

For motivation of the algorithm derivation temporarily suppose that the correlation matrix R x x is known. In constrained gradient-descent optimization, the weight vector is initialized at a vector satisfying the constraint (Ll b), say W (0) = C( C)-lit, and at each iteration the weigh t vector is moved in the negative direction of the constrained gradient (13). The length of the step is proportional to the magnitude of the constrained gradient and is scaled by a constant u. After the kth iteration the next weight vector is

(14)

where RXX-l exists because Rxx was assumed positive definite. Since W op t must satisfy the constraint (lib)

cr

CTlVo p t = ;;: = - CTRxx-lCX and the Lagrange multipliers are found to be

X

= -

[CTRxx-lC]-lg:

(15)

W(k

where the existence of [CT Rxx- 1 C]-l follows from the facts that Rxx is positive definite and C has full rank [6]. From (14) and (15) the optimum-constrained Ll\IS weight vector solving (11) is

lVo p t = RXX-lC[CTRxx-lC]-lg:.

+ 1) = W(k)

= W(k) -

JL[RxxW(k)

+ CA(k)]

(18)

where the second step is from (13). The Lagrange multipliers are chosen by requiring W(k+ 1) to satisfy the constraint (llb):

(16)

n:

Using the set of weights l-Vo p t in the array processor of Fig. 2 forms the optimum constrained L:\IS processor, which is a filter in space and frequency. Sulstituting IV.: p t in (-1), the constrained least squares estimate of the look-direction waveform is

=

CTlV(k

+ 1) =

CTW(k) - ,uCTRxxW(k) - J.LCTCX(k).

Solving for the Lagrange multipliers X(k) and substituting into the weight-iteration equation (18) we have

lV(k

+ 1) =

IV(k) - ,urI - C(CTC)-lCT]RxxW(k)

+ C(CTC)-l[g:

(1 i) Discussion

The constrained LMS filter is sometimes known by other names. If the f req uency characteristic in the look-direction is chosen to be all-pass and linear phase (distortionless), the output of the constrained Ll\IS filter is the maximum likelihood estimate of a stationary process in Gaussian noise if the angle of arrival is known [15]. The distortionless form of the constrained L:\tIS filter is called by some authors the "Minimum Variance Distortionless Look" estimator, "Maximum Likelihood Distortionless Estimator," and "Least Squares Unbiased Esti mater." By suitable choice of g: a variety of other optimal processors can be obtained [16].

III.

THE ADAPTIVE ALGORITHM

In this paper it is assumed that the input correlation matrix $R_{XX}$ is unknown a priori and must be learned by an adaptive technique. In stationary environments during learning, and in time-varying environments, an estimate of the optimum filter weights must be recomputed periodically. Direct substitution of a correlation matrix estimate into the optimal-weight equation (16) requires a number of multiplications at each iteration proportional to the cube of the number of weights. The complexity is primarily caused by the required inversion of the input correlation matrix. Recently Saradis et al. [17] and Mantey and Griffiths [18] have shown how to iteratively update matrix inversions, requiring only a number of multiplications and storage locations proportional to the square of the number of weights. The gradient-descent constrained LMS algorithm presented here requires only a number of multiplications and storage locations directly proportional to the number of weights. It is therefore simple to implement.

Moving the weight vector in the direction of the negative gradient of the constrained cost function, $-\mu \nabla_W H[W(k)]$, and requiring $W(k+1)$ to satisfy the constraint gives the deterministic algorithm

$$W(k+1) = W(k) - \mu\left[I - C(C^TC)^{-1}C^T\right]R_{XX}W(k) + C(C^TC)^{-1}\left[\mathcal{F} - C^TW(k)\right]. \tag{19}$$

The deterministic algorithm (19) is shown in this form to emphasize that the last factor $\mathcal{F} - C^TW(k)$ is not assumed to be zero, as it would be if the weight vector precisely satisfied the constraint at the kth iteration. It will be shown in Section VI that this term permits the algorithm to correct any small deviations from the constraint due to arithmetic inaccuracy and prevents their eventual accumulation and growth. Defining the KJ-dimensional vector

$$F \triangleq C(C^TC)^{-1}\mathcal{F} \tag{20a}$$

and the $KJ \times KJ$ matrix

$$P \triangleq I - C(C^TC)^{-1}C^T \tag{20b}$$

the algorithm may be rewritten as

$$W(k+1) = P\left[W(k) - \mu R_{XX}W(k)\right] + F. \tag{21}$$

Equation (21) is a deterministic constrained gradient-descent algorithm requiring knowledge of the input correlation matrix $R_{XX}$, which, however, is unavailable a priori in the array problem. An available and simple approximation for $R_{XX}$ at the kth iteration is the outer product of the tap voltage vector with itself: $X(k)X^T(k)$. Substitution of this estimate into (21) gives the stochastic constrained LMS algorithm

$$
\begin{aligned}
W(0) &= F \\
W(k+1) &= P\left[W(k) - \mu\, y(k)X(k)\right] + F
\end{aligned}
\tag{22}
$$

where y(k) is the array output (signal estimate) defined by (4).
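The recursion (22) can be sketched in a few lines of NumPy. This is an illustrative sketch, not the author's implementation: the function names, the white-noise test data, and the block layout assumed for $C$ (column j holding ones on the K weights of the jth tap column, which is how the constraint matrix of (7) is understood here) are all assumptions.

```python
import numpy as np

def constraint_matrix(K, J):
    """Assumed layout of the KJ x J constraint matrix C: column j has ones
    on weights j*K ... (j+1)*K - 1 and zeros elsewhere."""
    C = np.zeros((K * J, J))
    for j in range(J):
        C[j * K:(j + 1) * K, j] = 1.0
    return C

def constrained_lms(X, C, f, mu):
    """Run W(k+1) = P[W(k) - mu*y(k)*X(k)] + F over the rows of X.

    X  : (n_iterations, KJ) array of tap-voltage vectors X(k)
    C  : (KJ, J) constraint matrix
    f  : (J,) constraint vector (script F in the text)
    mu : convergence constant
    """
    CtC_inv = np.linalg.inv(C.T @ C)
    F = C @ CtC_inv @ f                           # (20a)
    P = np.eye(C.shape[0]) - C @ CtC_inv @ C.T    # (20b)
    W = F.copy()                                  # W(0) = F
    y_hist = np.empty(len(X))
    for k, x in enumerate(X):
        y = W @ x                                 # array output y(k)
        W = P @ (W - mu * y * x) + F              # update (22)
        y_hist[k] = y
    return W, y_hist

# Illustrative data only: white Gaussian tap voltages, not the paper's signals.
rng = np.random.default_rng(0)
K, J = 4, 4
C = constraint_matrix(K, J)
f = np.array([1.0, -2.0, 1.5, 2.0])
X = rng.standard_normal((500, K * J))
W, y = constrained_lms(X, C, f, mu=0.01)
print(np.abs(C.T @ W - f).max())   # constraint residual after 500 iterations
```

The dense product with `P` is used only for clarity; with the block structure assumed for `C`, the projection reduces to subtracting a per-column mean.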

Discussion

The constrained LMS algorithm (22) satisfies the constraint $C^TW(k+1) = \mathcal{F}$ at each iteration, as can be verified by premultiplying (22) by $C^T$ and using (20). At each iteration the algorithm requires only the tap voltages $X(k)$ and the array output $y(k)$; no a priori knowledge of the input correlation matrix is needed. $F$ is a constant vector that can be precomputed. One of the two most complex operations required by (22) is the multiplication of each of the KJ components of the vector $X(k)$ by the scalar $\mu y(k)$; the other significant operation is indicated by the matrix $P = I - C(C^TC)^{-1}C^T$. Because of the simple form of $C$ [refer to (7)], multiplication of a vector by $P$ as indicated in (22) amounts to little more than a few additions. Expressed in summation notation, the iterative equations for the weight vector components are

$$
\begin{aligned}
w_1(k+1) &= w_1(k) - \mu y(k)x_1(k) - \frac{1}{K}\sum_{j=1}^{K}\left[w_j(k) - \mu y(k)x_j(k)\right] + \frac{f_1}{K} \\
&\;\;\vdots \\
w_K(k+1) &= w_K(k) - \mu y(k)x_K(k) - \frac{1}{K}\sum_{j=1}^{K}\left[w_j(k) - \mu y(k)x_j(k)\right] + \frac{f_1}{K} \\
w_{K+1}(k+1) &= w_{K+1}(k) - \mu y(k)x_{K+1}(k) - \frac{1}{K}\sum_{j=K+1}^{2K}\left[w_j(k) - \mu y(k)x_j(k)\right] + \frac{f_2}{K} \\
&\;\;\vdots \\
w_{2K}(k+1) &= w_{2K}(k) - \mu y(k)x_{2K}(k) - \frac{1}{K}\sum_{j=K+1}^{2K}\left[w_j(k) - \mu y(k)x_j(k)\right] + \frac{f_2}{K} \\
&\;\;\vdots \\
w_{JK}(k+1) &= w_{JK}(k) - \mu y(k)x_{JK}(k) - \frac{1}{K}\sum_{j=(J-1)K+1}^{JK}\left[w_j(k) - \mu y(k)x_j(k)\right] + \frac{f_J}{K}
\end{aligned}
$$

where $f_j$ is the jth component of $\mathcal{F}$. These equations can readily be implemented on a digital computer.

IV.

PERFORMANCE

Convergence to the Optimum

The weight vector $W(k)$ obtained by the use of the stochastic algorithm (22) is a random vector. Convergence of the mean weight vector to the optimum is demonstrated by showing that the length of the difference vector between the mean weight vector and the optimum (16) asymptotically approaches zero. Proof of convergence of the mean is greatly simplified by the assumption (used in [2]) that successive samples of the input vector taken $\Delta$ seconds apart are statistically independent. This condition can usually be approximated in practice by sampling the input vector at intervals large compared to the correlation time of the input process plus the length of time it takes an input waveform to propagate down the array. The assumption is more restrictive than necessary, since Daniell [19] has shown that the much weaker assumption of asymptotic independence of the input vectors is sufficient to demonstrate convergence in the related unconstrained least squares problem.

Taking the expected value of both sides of the algorithm (22), using (4), (2a), and the independence assumption yields an iterative equation in the mean value of the constrained LMS weight vector

$$E[W(k+1)] = P\left\{E[W(k)] - \mu R_{XX}E[W(k)]\right\} + F. \tag{23}$$

Define the vector $V(k+1)$ to be the difference between the mean adaptive weight vector at iteration $k+1$ and the optimal weight vector (16)

$$V(k+1) \triangleq E[W(k+1)] - W_{opt}.$$

Using (23) and the relations $F = (I - P)W_{opt}$ and $PR_{XX}W_{opt} = 0$, which may be verified by direct substitution of (16) and (20b), an equation for the difference process may be constructed

$$V(k+1) = PV(k) - \mu PR_{XX}V(k). \tag{24}$$

The idempotence of $P$ (i.e., $P^2 = P$), which can be verified by carrying out the multiplication using (20b), and premultiplication of equation (24) by $P$ show that $PV(k) = V(k)$ for all k, so (24) can be written

$$V(k+1) = \left[I - \mu PR_{XX}P\right]V(k) = \left[I - \mu PR_{XX}P\right]^{k+1}V(0).$$

The matrix $PR_{XX}P$ determines both the rate of convergence of the mean weight vector to the optimum and the steady-state variance of the weight vector about the optimum. It is shown in [6] that $PR_{XX}P$ has precisely J zero eigenvalues, corresponding to the column vectors of the constraint matrix $C$; this is a result of the fact that during adaptation no movement is permitted away from the (KJ - J)-dimensional constraint plane. It is also shown in [6, appendix C] that $PR_{XX}P$ has KJ - J nonzero eigenvalues $\sigma_i$, $i = 1, 2, \ldots, KJ - J$, with values bounded between the smallest and largest eigenvalues of $R_{XX}$

$$\lambda_{\min} \le \sigma_{\min} \le \sigma_i \le \sigma_{\max} \le \lambda_{\max}, \qquad i = 1, 2, \ldots, KJ - J$$

where $\lambda_{\min}$ and $\lambda_{\max}$ are the smallest and largest eigenvalues of $R_{XX}$ and $\sigma_{\min}$ and $\sigma_{\max}$ are the smallest and largest nonzero eigenvalues of $PR_{XX}P$.
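The eigenvalue statements above are easy to check numerically. The sketch below is only a sanity check under stated assumptions: the correlation matrix is a random positive-definite stand-in (not the paper's $R_{XX}$), $C$ is given the block form assumed for the constraint matrix of (7), and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
K, J = 4, 3
n = K * J

# Assumed block form of C: column j has ones on the K weights of tap column j.
C = np.kron(np.eye(J), np.ones((K, 1)))
P = np.eye(n) - C @ np.linalg.inv(C.T @ C) @ C.T

# Random symmetric positive-definite stand-in for R_XX.
A = rng.standard_normal((n, n))
R = A @ A.T + 0.1 * np.eye(n)

lam = np.linalg.eigvalsh(R)             # eigenvalues of R_XX, ascending
sig = np.linalg.eigvalsh(P @ R @ P)     # eigenvalues of P R_XX P, ascending

tol = 1e-9
n_zero = int(np.sum(np.abs(sig) < tol))   # expected to equal J
nonzero = sig[np.abs(sig) >= tol]         # the KJ - J nonzero eigenvalues

print(n_zero, nonzero.min() >= lam.min(), nonzero.max() <= lam.max())
```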

Examination of $V(0) = F - W_{opt}$ shows that it can be expressed as a linear combination of the eigenvectors of $PR_{XX}P$ corresponding to nonzero eigenvalues. If $V(0)$ is equal to an eigenvector of $PR_{XX}P$, say $e_i$ with eigenvalue $\sigma_i \ne 0$, then

$$V(k+1) = \left[I - \mu PR_{XX}P\right]^{k+1}e_i = \left[1 - \mu\sigma_i\right]^{k+1}e_i.$$

The convergence of the mean weight vector to the optimum weight vector along any eigenvector of $PR_{XX}P$ is therefore geometric with geometric ratio $(1 - \mu\sigma_i)$. The time required for the euclidean length of the difference vector to decrease to $e^{-1}$ of its initial value (time constant) is

$$\tau_i = \frac{-1}{\ln(1 - \mu\sigma_i)} \approx \frac{1}{\mu\sigma_i} \tag{25}$$

where the approximation is valid for $\mu\sigma_i \ll 1$. If $\mu$ is chosen so that

$$0 < \mu < \frac{1}{\sigma_{\max}}$$

then the length (norm) of any difference vector is bounded between two ever-decreasing geometric progressions

$$(1 - \mu\sigma_{\max})^{k+1}\|V(0)\| \le \|V(k+1)\| \le (1 - \mu\sigma_{\min})^{k+1}\|V(0)\|$$

and so if the initial difference is finite the mean weight vector converges to the optimum, i.e.,

$$\lim_{k\to\infty}\|E[W(k)] - W_{opt}\| = 0$$

with time constants given by (25).

Steady-State Performance-Stationary Environment

The algorithm is designed to continually adapt for coping with nonstationary noise environments. In stationary environments this adaptation causes the weight vector to have a variance about the optimum and produces an additional component of noise (above the optimum) to appear at the output of the adaptive processor. The output power of the optimum processor with a fixed weight vector (17) is

$$E[y_{opt}^2(k)] = W_{opt}^TR_{XX}W_{opt} = \mathcal{F}^T(C^TR_{XX}^{-1}C)^{-1}\mathcal{F}.$$

A measure of the fraction of additional noise caused by the adaptive algorithm operating in steady state in a stationary environment is termed "misadjustment" $M(\mu)$ by Widrow.

By assuming that successive observation vectors [vectors $X(k)$ of tap voltages] are independent and have components $x_1(k), \ldots, x_{KJ}(k)$ that are jointly Gaussian distributed, Moschner [20] calculated very tight bounds on the misadjustment, using a method due to Senne [21], [22]. For a convergence constant $\mu$ satisfying

$$0 < \mu < \frac{1}{\sigma_{\max} + \tfrac{1}{2}\operatorname{tr}(PR_{XX}P)} \tag{26}$$

the steady-state misadjustment may be bounded by

$$\frac{\dfrac{\mu}{2}\operatorname{tr}(PR_{XX}P)}{1 - \dfrac{\mu}{2}\left[\operatorname{tr}(PR_{XX}P) + 2\sigma_{\min}\right]} \le M(\mu) \le \frac{\dfrac{\mu}{2}\operatorname{tr}(PR_{XX}P)}{1 - \dfrac{\mu}{2}\left[\operatorname{tr}(PR_{XX}P) + 2\sigma_{\max}\right]} \tag{27}$$

where tr denotes trace. $M(\mu)$ can be made arbitrarily close to zero by suitably small choice of $\mu$; this means that the steady-state performance of the constrained LMS algorithm can be made arbitrarily close to the optimum. From (25) it is seen that such performance is obtained at the expense of increased convergence time. If $\mu$ is chosen to satisfy

$$0 < \mu < \frac{2}{3\operatorname{tr}(R_{XX})} \tag{28}$$

then it is guaranteed to satisfy (26). Griffiths [1] shows that the upper bound in (28) can be calculated directly and easily from observations since $\operatorname{tr}(R_{XX}) = E[X^T(k)X(k)]$, the sum of the powers of the tap voltages.

Steady-State Performance-Nonstationary Environment

A model of the effect of a nonstationary noise environment proposed by Brown [23] is that the steady-state rms change of the optimal weight vector $W_{opt}(k)$ between iterations has magnitude $\delta$, i.e.,

$$\limsup_{k\to\infty} E\|W_{opt}(k+1) - W_{opt}(k)\|^2 = \delta^2.$$

Brown's general results may be applied to the constrained LMS algorithm by restricting the optimal weight vector to have magnitude less than some number $\|W_{\max}\|$ and again assuming the successive input vectors are independent with Gaussian-distributed components. For $\mu$ small it can be shown [23, p. 47] that the steady-state rms distance of the weight vector from the optimum is bounded by

$$\left(\limsup_{k\to\infty} E\|W(k) - W_{opt}(k)\|^2\right)^{1/2} \le$$

tr (PRxxP)

J-L

~

+ JL2

1

/

2[X* tr* (PRxxP)

- 1 - {1 - 2J-Lu*

+ Jot

2

[3u*2

+ u*2]IIW l\ + 2X* tr* (PRxxP)]} 1/2 m ax

(29)

where any starred quantities $q^*$ or $q_*$ are taken to bound the corresponding time-varying quantity $q(k)$, i.e., $q_* \le q(k) \le q^*$ for all k. In general, the optimum convergence constant $\mu$ that minimizes the upper bound (29) for a nonstationary environment is nonzero. This contrasts with the stationary case, in which the best steady-state performance is obtained by making $\mu$ as small as possible.
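For the stationary case, the quantities that guide the choice of $\mu$ can be computed directly once a correlation matrix is available. The sketch below is illustrative only and rests on assumptions: it takes the time constants to be $\tau_i = -1/\ln(1 - \mu\sigma_i)$, the step-size limits to be $1/(\sigma_{\max} + \tfrac{1}{2}\operatorname{tr}(PR_{XX}P))$ and $2/(3\operatorname{tr}(R_{XX}))$, and the misadjustment bounds to have the numerator $\tfrac{\mu}{2}\operatorname{tr}(PR_{XX}P)$ over $1 - \tfrac{\mu}{2}[\operatorname{tr}(PR_{XX}P) + 2\sigma]$. The function name, the dictionary keys, and the identity correlation matrix are hypothetical.

```python
import numpy as np

def clms_design_numbers(R, C, mu):
    """Time constants, step-size limits, and misadjustment bounds for a
    stationary input with correlation matrix R (forms assumed as stated above)."""
    n = R.shape[0]
    P = np.eye(n) - C @ np.linalg.inv(C.T @ C) @ C.T
    PRP = P @ R @ P
    sig = np.linalg.eigvalsh(PRP)
    sig = sig[sig > 1e-9 * sig.max()]      # keep the KJ - J nonzero eigenvalues
    s_min, s_max = sig.min(), sig.max()
    tr = np.trace(PRP)
    half = 0.5 * mu
    return {
        "mu_limit_trace_PRP": 1.0 / (s_max + 0.5 * tr),
        "mu_limit_trace_R": 2.0 / (3.0 * np.trace(R)),
        "time_constants": -1.0 / np.log(1.0 - mu * sig),
        "M_lower": half * tr / (1.0 - half * (tr + 2.0 * s_min)),
        "M_upper": half * tr / (1.0 - half * (tr + 2.0 * s_max)),
    }

K, J = 4, 4
C = np.kron(np.eye(J), np.ones((K, 1)))   # assumed block form of C
R = np.eye(K * J)                          # white unit-power taps, illustration only
out = clms_design_numbers(R, C, mu=0.01)
print(out["mu_limit_trace_R"], out["mu_limit_trace_PRP"], out["M_lower"], out["M_upper"])
```

With a measured input, `np.trace(R)` can be replaced by the average of $X^T(k)X(k)$, which is why the trace-of-$R_{XX}$ limit is the convenient one in practice.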

V.

GEOMETRICAL INTERPRETATION

The constrained LMS algorithm (22) has a simple geometrical interpretation that is useful for visualizing the error-correcting property which keeps the weight vector from deviating from its constraints. In an error-free implementation of the algorithm, the KJ-dimensional weight vectors satisfy the constraint equation (11b) and therefore terminate on a constraint plane $\Lambda$ defined by

$$\Lambda = \{W : C^TW = \mathcal{F}\}.$$

This (KJ - J)-dimensional constraint plane is indicated diagrammatically in Fig. 3. It is well known [14] that vectors pointing in a direction normal to the constraint plane (but not necessarily normal to the vectors that terminate on that plane) are linear combinations of the constraint matrix column vectors. These vectors have the form $Ca$, where $a$ is a J-dimensional vector determining the linear combination. Thus the vector $F = C(C^TC)^{-1}\mathcal{F}$, appearing in the algorithm (22) and used as the initial weight vector, points in a direction normal to the constraint plane. $F$ also terminates on the constraint plane since $C^TF = \mathcal{F}$. Thus $F$ is the shortest vector terminating on the constraint plane (see Fig. 3). The homogeneous form of the constraint equation

$$C^TW = 0 \tag{30}$$

defines a second (KJ - J)-dimensional plane, which includes the zero vector and thus passes through the origin. Such a plane is called a subspace [11] (see Fig. 3). The matrix $P$ in the algorithm (22) is a projection operator [24]. Premultiplication of any vector by $P$ will annihilate any components perpendicular to $\Sigma$, projecting the vector into the constraint subspace (see Fig. 4).

The vector $y(k)X(k)$ in the algorithm is an estimate of the unconstrained gradient. Referring to (12), the unconstrained cost function is $\tfrac{1}{2}W^TR_{XX}W$. The unconstrained gradient [refer to (13)] is $R_{XX}W$. The estimate of $R_{XX}W$ at the kth iteration, used in deriving (22), is $y(k)X(k)$. Contours of constant output power (cost) and the optimum constrained weight vector $W_{opt}$ that minimizes the output power are shown in Fig. 5.

The operation of the constrained LMS algorithm is shown in Fig. 6. In this example, the unconstrained negative gradient estimate $-y(k)X(k)$ is scaled by $\mu$ and added to the current weight vector $W(k)$. This is an attempt to change the weight vector in a direction that minimizes output power. In general, this change moves the resulting vector off the constraint plane. The resulting vector is projected onto the constraint subspace and then returned to the constraint plane by adding $F$. The new weight vector $W(k+1)$ satisfies the constraint to within the accuracy of the arithmetic used in implementing the algorithm.

Fig. 3. The (KJ - J)-plane $\Lambda$ and subspace $\Sigma$ defined by the constraint. [Labels: constraint plane $\Lambda = \{W : C^TW = \mathcal{F}\}$; constraint subspace $\Sigma = \{W : C^TW = 0\}$.]

Fig. 4. $P$ projects vectors onto the constraint subspace.

Fig. 5. Example showing contours of constant output power $W^TR_{XX}W$ and the constrained weight vector that minimizes output power, $W_{opt} = R_{XX}^{-1}C(C^TR_{XX}^{-1}C)^{-1}\mathcal{F}$.

Fig. 6. Operation of the constrained LMS algorithm: $W(k+1) = P[W(k) - \mu y(k)X(k)] + F$.

VI.

ERROR-CORRECTING FEATURE

In a digital-computer implementation of any algorithm, it is likely that small computational errors will occur at each iteration because of truncation, roundoff, or quantization errors. A difficulty in applying the well-known gradient-projection algorithm to the real-time array-processing problem is that computational errors causing deviations of the weight vector from the constraint are not corrected [7], [9]. Without additional error-correcting procedures, application of the gradient-projection algorithm is limited to problems requiring few enough iterations that significant deviations from the constraint do not occur. The constrained LMS algorithm, on the other hand, was specifically designed to continuously correct for such errors and prevent them from accumulating. The reason for this characteristic is shown by a geometrical comparison of the two algorithms. The gradient-projection algorithm may be derived by following the derivation of the constrained LMS algorithm to (19) and dropping the last factor, $\mathcal{F} - C^TW(k)$. This factor would be equal to zero in a perfect implementation in which the weight vector satisfied the constraint $C^TW(k) = \mathcal{F}$ at each iteration. The algorithm that results when the term is dropped is

$$
\begin{aligned}
W(0) &= C(C^TC)^{-1}\mathcal{F} \\
W(k+1) &= W(k) - \mu P\,y(k)X(k).
\end{aligned}
\tag{31}
$$

This is a gradient-projection algorithm [11]. It is so named because the unconstrained gradient estimate $y(k)X(k)$ is projected onto the constraint subspace and then added to the current weight vector. Its operation is shown in Fig. 7 (compare with Fig. 6). A comparison between the effect of computational errors on the gradient-projection algorithm and on the constrained LMS algorithm is shown in Fig. 8. The weight vector is assumed to be off the constraint at the kth iteration because of a quantization error occurring in the previous iteration. It is shown in Fig. 8(a) that the constrained LMS algorithm makes the unconstrained step, projects onto the subspace, and then adds $F$, producing a new weight vector $W(k+1)$ that satisfies the constraint. The gradient-projection algorithm [Fig. 8(b)], however, projects the gradient estimate onto the subspace and adds the projected vector to the past weight vector, moving parallel to the constraint plane but continuing the error. Note the implicit (incorrect) assumption that $W(k)$ satisfied the constraint, corresponding to the same assumption made in the derivation of the gradient-projection algorithm.

Accumulating errors in the gradient-projection algorithm can be expected to cause the weight vector to do a random walk away from the constraint plane with variance (expected squared distance from the plane) increasing linearly with the number of iterations. By contrast, the expected deviation of the constrained LMS algorithm from the constraint does not grow, remaining at its original value.

Fig. 7. Operation of the gradient-projection algorithm (31).

Fig. 8. Error propagation. The constrained LMS algorithm (a) corrects deviations from the constraints while the gradient-projection algorithm (b) allows them to accumulate. [Panel (a): constrained LMS algorithm, $W(k+1) = P[W(k) - \mu y(k)X(k)] + F$. Panel (b): gradient-projection algorithm, $W(k+1) = W(k) - \mu P y(k)X(k)$.]

VII.

SIMULATION

Fig. 9. Frequency response of the processor in the look direction.

Fig. 10. Power spectral density of incoming signals (look-direction signal, noise A, noise B) over frequencies 0 to 0.5. See Fig. 2 and Table I for spatial position of noises.

TABLE I
SIGNALS AND NOISES IN THE SIMULATION (SEE FIG. 2)

| Source | Power | Direction (0° is normal to array) | Center Frequency (1.0 is 1/τ) | Bandwidth |
| Look-direction signal | 0.1 | 0° | 0.3 | 0.1 |
| Noise A | 1.0 | 45° | 0.2 | 0.05 |
| Noise B | 1.0 | 60° | 0.4 | 0.07 |
| White noise (per tap) | 0.1 | | | |

A computer simulation of the processor was made using 6-digit floating point arithmetic on a small computer (the HP-2116). The processor had four sensors on a line spaced at τ-second intervals and had four taps per sensor (thus KJ = 16). The environment had three point-noise sources, and white noise added to each sensor. Power of the look-direction signal was quite small in comparison to the power of interfering noises (see Table I). The tap spacing defined a frequency of 1.0 (i.e., f = 1.0 is a frequency of 1/τ Hz). In the look direction, foldover frequency for the processor response was 1/(2τ), or 0.5. All signals were generated by a pseudo-Gaussian generator and filtered to give them the proper spatial and temporal correlations. All temporal correlations were arranged to be identically zero for time differences greater than 25τ. The time between adaptations $\Delta$ was assumed greater than 58τ, so successive samples of $X(k)$ were uncorrelated. The look-direction filter was specified by the vector $\mathcal{F}^T = [1, -2, 1.5, 2]$, which resulted in the frequency characteristic shown in Fig. 9. The signal and noise spectra are shown in Fig. 10 and their spatial position in Fig. 2.

In this problem, the eigenvalues of $R_{XX}$ ranged from 0.111 to 8.355. The upper permissible bound on the convergence constant $\mu$ calculated by (26) was 0.074; a value of $\mu = 0.01$ was selected, which, by (27), would lead to a misadjustment of between 15.2 and 17.0 percent. The processor was initialized with $W(0) = F = C(C^TC)^{-1}\mathcal{F}$, and Fig. 11 shows performance as a function of time. The upper graph has three horizontal lines. The lower line is the output power of the optimum weight vector. The closely spaced upper two lines are upper and lower bounds for the adaptive processor output power, which is the optimum output power plus misadjustment. The mean steady-state value of the processor's output power falls somewhere between the upper and lower bounds (but may, at any instant, fall above or below these bounds). The difference between the initial and steady-state power levels is the amount of undesirable noise power the processor has been able to remove from the output.

Fig. 11. The output power of the constrained LMS filter (upper graph) decreases as it adapts to discriminate against unwanted noise. Lower curve shows small deviations from the constraint due to quantization. [Horizontal axis: number of adaptations. Marked levels: optimum output power; optimum plus misadjustment (upper and lower bounds).]

A simulation of the gradient-projection algorithm (31) on the array problem was made using exactly the same data as used by the constrained LMS algorithm. The results are shown in Fig. 12. The lower part of Fig. 12 shows how the gradient-projection algorithm walks away from the constraint. Note the change in scale. If the errors of the constrained LMS algorithm (Fig. 11) were plotted on the same scale they would not be discernible. The errors of the gradient-projection method are expected to continue to grow.

Fig. 12. Output power of the gradient-projection algorithm (upper graph) operated on the same data as the constrained LMS algorithm (cf. Fig. 11). Lower curve shows that deviations from the constraint tend to increase with time. Note scale. [Horizontal axis: number of adaptations.]

The fact that the output power of the gradient-projection processor (upper curve, Fig. 12) is virtually identical to the output power of the constrained LMS processor is a result of the fact that the errors have not yet accumulated to the point of moving the constraint a significant radial distance from the origin.
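The contrast between the two algorithms under finite-precision arithmetic can be reproduced qualitatively with a small experiment. The sketch below is not the original HP-2116 simulation: it uses white Gaussian tap voltages instead of the signals of Table I, emulates 6-digit arithmetic only by rounding the stored weight vector to six significant digits after each update, and assumes the block form of $C$ used for the constraint matrix of (7). The helper names `round_sig` and `constraint_deviation` are hypothetical.

```python
import numpy as np

def round_sig(v, digits=6):
    """Round each element of v to the given number of significant digits."""
    v = np.asarray(v, dtype=float)
    out = np.zeros_like(v)
    nz = v != 0
    exponent = np.floor(np.log10(np.abs(v[nz])))
    scale = 10.0 ** (digits - 1 - exponent)
    out[nz] = np.round(v[nz] * scale) / scale
    return out

def constraint_deviation(X, C, f, mu, gradient_projection, digits=6):
    """Return ||C^T W(k) - f|| per iteration with quantized weight storage."""
    CtC_inv = np.linalg.inv(C.T @ C)
    F = C @ CtC_inv @ f
    P = np.eye(C.shape[0]) - C @ CtC_inv @ C.T
    W = round_sig(F, digits)
    dev = np.empty(len(X))
    for k, x in enumerate(X):
        y = W @ x
        if gradient_projection:
            W_new = W - mu * (P @ (y * x))        # gradient projection (31)
        else:
            W_new = P @ (W - mu * y * x) + F      # constrained LMS (22)
        W = round_sig(W_new, digits)              # quantization error enters here
        dev[k] = np.linalg.norm(C.T @ W - f)
    return dev

rng = np.random.default_rng(1)
K, J = 4, 4
C = np.kron(np.eye(J), np.ones((K, 1)))
f = np.array([1.0, -2.0, 1.5, 2.0])
X = rng.standard_normal((2000, K * J))            # illustrative data only

dev_clms = constraint_deviation(X, C, f, 0.01, gradient_projection=False)
dev_gp = constraint_deviation(X, C, f, 0.01, gradient_projection=True)
print(dev_clms[-1], dev_gp[-1])
```

In runs of this kind the constrained LMS deviation is expected to stay at the level of a single rounding error, while the gradient-projection deviation tends to drift; the exact numbers depend on the data and the seed.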

VIII.

LIMITATIONS AND EXTENSION

Application of the constrained LMS algorithm in some array processing problems is limited by the requirement that the non-look-direction noise voltages on the taps be uncorrelated with the look-direction signal voltages. This restriction is a result of the fact that if the noise voltages are correlated with the signal then the processor may cancel out portions of the signal with them in spite of the constraints. If the source of correlated noise is known, its effect may be reduced by placing additional constraints to minimize the array response in its direction. Implementation errors, i.e., deviations from the assumed electrical and spatial properties of the array (such as incorrect amplifier gains, incorrect sensor placements, or unpredicted mutual coupling between sensors) may also limit the effectiveness of the processor by permitting it to discriminate against look-direction signals while still satisfying the letter of the constraints. Injection of known test signals into the array may provide information about the signal paths that can be used to compensate, in part, for the errors. The algorithm may be extended to a more general stochastic constrained least squares problem

$$\min_W\; E\left\{\left[d(k) - W^TX(k)\right]^2\right\} \quad \text{subject to } C^TW = \mathcal{F} \tag{32}$$

where $d(k)$ is a scalar variable related to the observation vector $X(k)$ and $C$ is a general constraint matrix. The scalar $d(k)$ may be a random variable correlated with $X(k)$ or it may be a known test signal used to compensate for array errors. This would be a classical least squares problem except that the statistics of $X(k)$ and $d(k)$ are assumed unknown a priori. The general constrained LMS algorithm solving (32) may be derived similarly to (22) and is

$$
\begin{aligned}
W(0) &= C(C^TC)^{-1}\mathcal{F} = F \\
W(k+1) &= P\left\{W(k) - \mu\left[y(k) - d(k)\right]X(k)\right\} + F.
\end{aligned}
$$

The general algorithm is applicable to constrained modeling, prediction, estimation, and control. It is discussed in [6].
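As an illustration of the general algorithm, the sketch below applies it to a small constrained modeling problem. Everything specific here is an assumption made for the example: the function name, the four-weight model, the single sum constraint, and the noise-free desired signal generated from a weight vector that satisfies the constraint.

```python
import numpy as np

def general_constrained_lms(X, d, C, f, mu):
    """Iterate W(k+1) = P{W(k) - mu*[y(k) - d(k)]*X(k)} + F over the data."""
    CtC_inv = np.linalg.inv(C.T @ C)
    F = C @ CtC_inv @ f
    P = np.eye(C.shape[0]) - C @ CtC_inv @ C.T
    W = F.copy()                                   # W(0) = F
    for x, dk in zip(X, d):
        y = W @ x                                  # filter output y(k)
        W = P @ (W - mu * (y - dk) * x) + F
    return W

# Hypothetical test problem: identify w_true subject to sum(W) = 10.
rng = np.random.default_rng(2)
w_true = np.array([1.0, 2.0, 3.0, 4.0])
C = np.ones((4, 1))                # general (here single-column) constraint matrix
f = np.array([w_true.sum()])       # constraint chosen to be consistent with w_true
X = rng.standard_normal((3000, 4))
d = X @ w_true                     # noise-free desired signal d(k)

W = general_constrained_lms(X, d, C, f, mu=0.05)
print(W, C.T @ W)
```

Setting `d` to zero recovers the array-processing recursion (22), which is one way to sanity-check an implementation of the general form.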

IX.

CONCLUSION

Analysis and computer simulations have confirmed the ability of the constrained LMS algorithm to adjust an array of sensors in real time to respond to a desired signal while discriminating against noise. Because of a system of constraints on weights in the array, the algorithm is shown to require no prior knowledge of the signal or noise statistics. A geometrical presentation has shown why the constrained LMS algorithm has an ability to maintain the constraints and prevent the accumulation of quantization errors in a digital implementation. The simulation tests have confirmed the effectiveness of this error-correcting feature, in contrast with the usual uncorrected gradient-projection algorithm. The error-correcting feature and the simplicity of the algorithm make it appropriate for continuous real-time signal estimation and discriminating against noises in a possibly time-varying environment where little a priori information is available about the signals or noises. Time constants, steady-state performance, and a proof of convergence are derived for operation of the algorithm in a stationary environment; convergence and steady-state performance in a nonstationary environment are also shown. A simple extension of the algorithm may be used to solve a general constrained LMS problem, which is to minimize the expected squared difference between a multidimensional filter output and a known desired signal under a set of linear equality constraints.

REFERENCES

[1] L. J. Griffiths, "A simple adaptive algorithm for real-time processing in antenna arrays," Proc. IEEE, vol. 57, pp. 1696-1704, Oct. 1969.
[2] B. Widrow, P. E. Mantey, L. J. Griffiths, and B. B. Goode, "Adaptive antenna systems," Proc. IEEE, vol. 55, pp. 2143-2158, Dec. 1967.
[3] L. J. Griffiths, "Comments on 'A simple adaptive algorithm for real-time processing in antenna arrays' (Author's reply)," Proc. IEEE (Lett.), vol. 58, p. 798, May 1970.
[4] B. Widrow and M. E. Hoff, Jr., "Adaptive switching circuits," IRE WESCON Conv. Rec., pt. 4, pp. 96-104, 1960.
[5] L. J. Griffiths, "Signal extraction using real-time adaptation of a linear multichannel filter," Stanford Electron. Lab., Stanford, Calif., Doc. SEL-60-017, Tech. Rep. TR 67881-1, Feb. 1968.
[6] O. L. Frost, III, "Adaptive least squares optimization subject to linear equality constraints," Stanford Electron. Lab., Stanford, Calif., Doc. SEL-70-055, Tech. Rep. TR 6796-2, Aug. 1970.
[7] J. B. Rosen, "The gradient projection method for nonlinear programming, pt. 1: Linear constraints," J. Soc. Indust. Appl. Math., vol. 8, p. 181, Mar. 1960.
[8] R. T. Lacoss, "Adaptive combining of wideband array data for optimal reception," IEEE Trans. Geosci. Electron., vol. GE-6, pp. 78-86, May 1968.
[9] A. H. Booker, C. Y. Ong, J. P. Burg, and G. D. Hair, "Multiple-constraint adaptive filtering," Texas Instruments, Sci. Services Div., Dallas, Tex., Apr. 1969.
[10] H. Kobayashi, "Iterative synthesis methods for a seismic array processor," IEEE Trans. Geosci. Electron., vol. GE-8, pp. 169-178, July 1970.
[11] D. G. Luenberger, Optimization by Vector Space Methods. New York: Wiley, 1969.
[12] I. J. Good and K. Koog, "A paradox concerning rate of information," Informat. Contr., vol. 1, pp. 113-116, May 1958.
[13] A. E. Bryson, Jr., and Y. C. Ho, Applied Optimal Control. Waltham, Mass.: Blaisdell, 1969.
[14] W. H. Fleming, Functions of Several Variables. Reading, Mass.: Addison-Wesley, 1965.
[15] E. J. Kelly, Jr., and M. J. Levin, "Signal parameter estimation for seismometer arrays," Mass. Inst. Technol. Lincoln Lab. Tech. Rept. 339, Jan. 1964.
[16] A. H. Nuttall and D. W. Hyde, "A unified approach to optimum and suboptimum processing for arrays," U. S. Navy Underwater Sound Lab., New London, Conn., USL Rep. 992, Apr. 1969.
[17] G. N. Saradis, Z. J. Nikolic, and K. S. Fu, "Stochastic approximation algorithms for system identification, estimation, and decomposition of mixtures," IEEE Trans. Syst. Sci. Cybern., vol. SSC-5, pp. 8-15, Jan. 1969.
[18] P. E. Mantey and L. J. Griffiths, "Iterative least-squares algorithms for signal extraction," in Proc. 2nd Hawaii Int. Conf. on Syst. Sci., pp. 767-770.
[19] T. P. Daniell, "Adaptive estimation with mutually correlated training samples," Stanford Electron. Labs., Stanford, Calif., Doc. SEL-68-083, Tech. Rep. TR 6778-4, Aug. 1968.
[20] J. L. Moschner, "Adaptive filtering with clipped input data," Stanford Electron. Labs., Stanford, Calif., Doc. SEL-70-053, Tech. Rep. TR 6796-1, June 1970.
[21] K. D. Senne, "Adaptive linear discrete-time estimation," Stanford Electron. Labs., Stanford, Calif., Doc. SEL-68-090, Tech. Rep. TR 6778-5, June 1968.
[22] K. D. Senne, "New results in adaptive estimation theory," Frank J. Seiler Res. Lab., USAF Academy, Colo., Tech. Rep. SRL-TR-70-0013, Apr. 1970.
[23] J. E. Brown, III, "Adaptive estimation in nonstationary environments," Stanford Electron. Labs., Stanford, Calif., Doc. SEL-70-056, Tech. Rep. TR 6795-1, Aug. 1970.
[24] D. T. Finkbeiner, II, Introduction to Matrices and Linear Transformations. San Francisco, Calif.: Freeman, 1966.


The Application of Spectral Estimation Methods to Bearing Estimation Problems

DON H. JOHNSON, MEMBER, IEEE

Invited Paper

Abstract: The equivalence between the problem of determining the bearing of a radiating source with an array of sensors and the problem of estimating the spectrum of a signal is demonstrated. Modern spectral estimation algorithms are derived within the context of array processing using an algebraic approach. Emphasis is placed on the problem of determining the bearing of a sound source with an array. Special issues encountered in applying these estimates are discussed.

I.

INTRODUCTION

THE CLASSICAL PROBLEM in array signal processing is to determine the location of a source which is radiating energy. A single array is used to estimate the direction of the source relative to the location of the array. The outputs of several such arrays, which are separated from each other, are then used to determine location. The direction estimation problem is of interest here; it is shown later that this problem is mathematically equivalent to estimating the spatial Fourier transform of the radiation field. The recurrent example used in this paper will be the determination of the bearing of an acoustic source with an array; this problem is referred to as the passive sonar problem. In this problem, the signal-to-noise ratio encountered in practice of a single sensor's output is usually small (on the order of 0 dB). Consequently, spectral estimation algorithms capable of determining the spectrum of a signal measured in the presence of significant noise are of special interest. It is shown here that many of the so-called "modern" algorithms can be derived in a unified way as the solution of a constrained optimization problem. The intent of this paper is to present these algorithms in this framework and to interpret the results in the context of array processing problems. While equivalent to estimating a spectrum, special issues are introduced when these procedures are applied to the bearing estimation problem. Most of these algorithms have been derived elsewhere; however, some new results are presented.

We assume that a plane-wave signal $s(t, x)$ is propagating in a medium at speed $c$ in the direction $-k_0$ (Fig. 1). An array of M sensors is present in the medium. Each sensor is assumed to record the acoustic field at its spatial position with perfect fidelity. The waveform measured at the spatial position $z_m$ of the mth sensor is denoted by $x_m(t)$ and is given by

$$x_m(t) = s\!\left(t + \frac{z_m \cdot k_0}{c}\right) + N_m(t) \tag{1}$$

Fig. 1. Definition of variables involved in array processing. An array of sensors is located in a medium. The origin for the coordinate system of the array can be chosen arbitrarily, but is usually chosen to be the centroid of the array. The vector $z_m$ defines the location of the mth sensor relative to this origin. A plane wave is shown propagating toward the array in the direction $-k_0$. Consequently, the bearing of the source relative to the array is denoted by the unit vector $k_0$.

where $N_m(t)$ is additive noise and $z_m \cdot k_0$ denotes the dot product of the vectors $z_m$ and $k_0$. This noise may be due to disturbances propagating in the medium or to noise generated internally in the sensor or in the associated electronics. One of the oldest ideas in array processing for determining the bearing of an acoustic source (i.e., $k_0$) is beamforming. Here, the outputs of the sensors are summed with weights and delays to form a beam $y(t)$

$$y(t) = \sum_m a_m x_m(t - \tau_m). \tag{2}$$

Manuscript received May 5, 1982. This work was supported by the Office of Naval Research under Contract N00014-81-K-0565. The author is with the Department of Electrical Engineering, Rice University, Houston, TX 77001.

The idea behind beamforming is to align the propagation delays of a signal presumed to be propagating in a direction $-k_0$ so as to reinforce it. Signals propagating from other directions and the noise are not reinforced. For example, if the sensor delay $\tau_m$ is ideally adjusted to compensate for the signal delay $(z_m \cdot k_0)/c$, the signal would be completely reinforced. The delayed output of each sensor would be

$$x_m(t - \tau_m) = s(t) + N_m(t - \tau_m)$$

and the resulting beam output would be, assuming unity weights ($a_m = 1$),

$$y(t) = M s(t) + \sum_m N_m(t - \tau_m).$$

Thus the signal power in $y(t)$ is $M^2$ times that measured at each sensor while the noise power is increased by only a factor

Reprinted from Proceedings of the IEEE, Vol. 70, No. 9, pp. 1018-1028, September 1982.


beam is proportional to the spatial Fourier transform of the quantity $a_m \exp\{+j(2\pi f/c)\, z_m \cdot k_0\}$, a sequence consisting of signal and beamforming parameters evaluated at the spatial positions $z_m$. The spatial frequency variable is therefore $k(f/c)$. As $c = f\lambda$, this variable can be written as $k/\lambda$. For example, consider a linear array of equally spaced sensors. Then $z_m = m d\, i_x$, where $i_x$ is the unit vector in the x-direction, and $d$ is the spacing between sensors. Then $z_m \cdot k_0 = md \sin\theta_0$, where $\theta_0$ is the angle of the direction of propagation from the array broadside. Equation (4) becomes

90

0'~..L.....L..I"'--I.-L.....-L-......&-..~~..&.-~~~""""""L-..L--J.,...-1..-JL......L-~....L..L..&....U.4-

-10

(a)

~

1

-203 :1

en "0

Yi], k) = S(f)

uJ

0

::>

I-

:J

(b)

-10""'1--

Lam exp {-j2rr(md/A) (sin (J - sin (Jo)}.

(5)

m

_

<{

In a circular array with M sensors equally spaced on a circle of radius r, the sensor locations are zm = r cos (21rmIM)i x + r sin (21Tm/M)i y . Then

ouJ

Y(f, k)

a..

~

<{

~

a:: ~

0-

C/)

~ i

-30~ 1111' I , I , I , I , I , I , I , I ' I -9~ --50 -40 -30 -20 -18 e !0 20

i

t

I I I I I ii' I'M

30

40 50

90

BEARING-ceg

Fig. 2. Results of the application of various spectral estimation algorithms. A linear array of ten equally spaced sensors is receiving a single signal coming from an angle of zero degrees. The sensor spacing is 'A/2. The sensor signal-to-noise ratio (a} la~) is 0 dB. The spatial correlation matrix is given by (11). (a) Bartlett estimate. (b) Maximum-likelihood estimate. (c) Linear-predictive estimate (mo = 0). The reference amplitude in each part is maximum value of the spectrum in that part.

of M (assuming the noise signals measured by the sensors can be described as mutually uncorrelated processes). More generally, the energy in the beam y(t) is computed for many directions-of-look k by manipulating the delays T m- Maxima of this energy as a function of k are assumed to correspond to acoustic sources, and the source bearings correspond to the locations of these maxima (Fig. 2). Further insight into beamforming is gained by considering these expressions in the frequency domain [l], [24). Defining X m (f) to be the Fourier transform of X m (r), the Fourier transform of a beam is given by Y(f,k)= Lam exp {-j21r(flc)zm -k}Xm(f)

(3)

m

where the beamforming delay T m is given by (zm . k)/c. If, for example, each sensor output consists of a single propagating signal Xm(f) = 8([) exp {+j2rr(flc)zm . k o } the quantity Y(f, k) is given by Yt], k) = S(f) Lam exp {-j2rr(flc)zm · (k - ko)}.

(4)

m

Consequently, the temporal Fourier transform Y(f) of the

=S(f) Lam m

Thus the computation of this quantity depends greatly on array geometry. This frequency-domain analysis also demonstrates several issues important in the evaluation of the performance of a beamforming algorithm. As (4) describes the Fourier transform of a sequence, the spatial spectrum Yi], k) is periodic with period one. For the linear, equally spaced array, this period is d I(A/2). If the wavelength of the acoustic signal is such that this quantity is greater than one, aliasing occurs: the actual direction-of-propagation can be confused with other values of k: On the other hand, if the wavelength is such that this quantity is less than one, computations which usually assume a period of one (such as the fast Fourier transform) will evaluate spatial spectra at frequencies that do not correspond to any physical arrival. The energy in the beam as a function of k is evaluated by computing II Y(f) 12 df and

determining the location of

maxima. For simplicity, assume that the signal is narrow-band with all of its energy concentrated at the frequency fo. Thus beam energy P(k) is given by P(k) = I Y(fo.

k)1 = (J~ I~am exp {-i ~ zm .(k - ko)}r 2

(7)

where Ao = clIo. These results can be formulated in matrix form [5), [9), [29). Define the column vector X to contain the temporal Fourier transforms of the array outputs and the elements of column vector A to be Am =

a~exP{+j2:zm.k}.

Then Yi], k) = A'X where A' denotes the conjugate transpose of A. The vector X is given by X = 0sS + onN, where S represents a plane-wave signal (8 m = exp {+j(21T/"A)Zm · k o}) and N represents noise. The noise is assumed to be statistically independen t of the signal. These vectors are normalized so that and o~ denote the power levels of the signal and noise, respectively, at each sensor. The energy in the beam when
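The beam response in (5) can be evaluated directly. The following is a minimal sketch, assuming unit weights and $S(f) = 1$; the function names (`beam_power`, `beam_pattern_db`) and the default array (ten sensors at half-wavelength spacing, as in Fig. 2) are illustrative choices, not part of the original paper.

```python
import cmath
import math


def beam_power(look_deg, source_deg, num_sensors=10, spacing_wavelengths=0.5):
    """Squared magnitude of the unit-weight sum in (5), taking S(f) = 1.

    Angles are measured from array broadside; spacing_wavelengths is d / lambda.
    """
    delta = math.sin(math.radians(look_deg)) - math.sin(math.radians(source_deg))
    total = sum(
        cmath.exp(-2j * math.pi * m * spacing_wavelengths * delta)
        for m in range(num_sensors)
    )
    return abs(total) ** 2


def beam_pattern_db(source_deg, look_angles_deg, **array):
    """Beam power at each look angle, in dB relative to the largest value found."""
    powers = [beam_power(angle, source_deg, **array) for angle in look_angles_deg]
    peak = max(powers)
    # Clamp exact nulls so that log10 is always defined.
    return [10.0 * math.log10(max(p / peak, 1e-12)) for p in powers]


if __name__ == "__main__":
    angles = list(range(-90, 91, 10))
    for angle, level in zip(angles, beam_pattern_db(0.0, angles)):
        print(f"{angle:4d} deg  {level:8.2f} dB")
```

With the source at the look direction the power is $M^2$, as stated for the time-domain beam. Setting `spacing_wavelengths=1.0` makes a look angle of 90 degrees indistinguishable from broadside, which is the aliasing described above.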

steered in direction $k$ is given by

$$P(k) = E[|Y(f, k)|^2] = E[|A'X|^2] = E[A'XX'A] = A'RA \qquad (8)$$

where $R = E[XX']$ is the spatial correlation matrix of the sensor outputs. If no noise is present ($\sigma_n = 0$), each element of $R$ is given by

$$R_{m_1 m_2} = \sigma_s^2 \exp\{+j(2\pi/\lambda_0)(z_{m_1} - z_{m_2}) \cdot k_0\}. \qquad (9)$$

More generally, when noise is present, the spatial correlation matrix is given by

$$R = \sigma_n^2 Q + \sigma_s^2 SS' \qquad (10)$$

where $Q$ is the spatial correlation matrix of the noise. Furthermore, if the noise is spatially white (uncorrelated from sensor to sensor), $Q = I$ so that

$$R = \sigma_n^2 I + \sigma_s^2 SS'. \qquad (11)$$

When the weights $a_m$ are each set to unity, the so-called Bartlett spectral estimate results (Figs. 2(a) and 3(a)). The steering vector $A$ is set equal to $E$, where $E$ represents an ideal plane wave that is propagating in the direction-of-look $k$:

$$E_m = \exp\{+j(2\pi/\lambda_0)\, z_m \cdot k\}. \qquad (12)$$

Fig. 3. Results of the application of various spectral estimation algorithms. The array configuration is identical to that described in the caption to Fig. 2. Two signals are present in the field, one coming from an angle of 5°, the other from an angle of -5°. The spatial correlation matrix is given by (30). The sensor signal-to-noise ratio is 0 dB for each signal. (a) Bartlett estimate. (b) Maximum-likelihood estimate. (c) Linear-predictive estimate ($m_0 = 0$).

Assuming that the noise is spatially white, the power in the beam when the array is steered toward the source ($k = k_0$) is found to be

$$P(k_0) = S'RS = M^2\sigma_s^2 + M\sigma_n^2.$$

The Bartlett estimate is, therefore, given by

$$P_{BART}(k) = E'RE. \qquad (13)$$
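The quadratic form (13) can be checked numerically against the ideal correlation matrix (11). The sketch below assumes a linear array with half-wavelength spacing and a single source; the helper names (`plane_wave`, `correlation_matrix`, `quadratic_form`, `bartlett`) are hypothetical and chosen only for this illustration.

```python
import cmath
import math


def plane_wave(angle_deg, num_sensors=10, spacing_wavelengths=0.5):
    """Vector with elements exp{+j 2 pi (m d / lambda) sin(angle)} for a linear array."""
    phase = 2.0 * math.pi * spacing_wavelengths * math.sin(math.radians(angle_deg))
    return [cmath.exp(1j * m * phase) for m in range(num_sensors)]


def correlation_matrix(source_deg, signal_power, noise_power, num_sensors=10):
    """R = noise_power * I + signal_power * S S', the white-noise form (11)."""
    s = plane_wave(source_deg, num_sensors)
    return [
        [signal_power * s[r] * s[c].conjugate() + (noise_power if r == c else 0.0)
         for c in range(num_sensors)]
        for r in range(num_sensors)
    ]


def quadratic_form(vector, matrix):
    """vector' * matrix * vector, which is real when the matrix is Hermitian."""
    total = 0j
    for r, row in enumerate(matrix):
        for c, entry in enumerate(row):
            total += vector[r].conjugate() * entry * vector[c]
    return total.real


def bartlett(look_deg, matrix):
    """Bartlett estimate E' R E of (13) for one direction-of-look."""
    return quadratic_form(plane_wave(look_deg, len(matrix)), matrix)
```

For ten sensors at 0 dB signal-to-noise ratio, `bartlett(0.0, correlation_matrix(0.0, 1.0, 1.0))` evaluates to $M^2\sigma_s^2 + M\sigma_n^2 = 110$, and in a null of the beam pattern only the noise term $M\sigma_n^2 = 10$ remains.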

Thus the quantity $P_{BART}(k)$ yields an estimate of the signal power propagating in the direction $-k$.

The previous section has demonstrated that the measurement of the bearing of a radiating source by an array is mathematically equivalent to computing a spatial spectrum and determining the location of local maxima. A typical spectrum resulting from conventional beamforming is shown in Fig. 2(a). This result illustrates the classic problem of array processing. Because of the finite aperture of the array (aperture is defined roughly to be the spatial extent of the array measured in wavelengths), the detail that can be obtained from the spatial frequency response is limited. A source having a well-defined bearing appears to be coming from a dominant but diffused direction as well as from false directions corresponding to sidelobes. The sidelobes are due to the equal weighting assumed for each sensor output ($a_m = 1$). The comparable result in time-series analysis occurs when a rectangular window is applied to a sinusoidal signal and the spectrum computed. If the weights are tapered, the sidelobes can be reduced but at the expense of a wider mainlobe [4].

Using this classical approach, increased bearing estimation accuracy can only be obtained by increasing the aperture of the array. This solution is of limited utility as it means increasing the physical size of the array. For this and other technical reasons, modern high-resolution spectral analysis algorithms are being considered instead [23], [29], [39]. These methods are usually referred to as adaptive methods as the choices of weights $a_m$ and delays $\tau_m$ vary with direction-of-look and the characteristics of the sound field measured by the sensors.

In evaluating spectral estimation methods, three criteria are usually used. The first is resolution: the ability of an estimate to reveal the presence of two equal-energy sources which have nearly equal bearings [17]. When two sources are resolved, two distinct peaks are present in the spectrum; if not resolved, only one peak is found. This definition of resolution may not seem to correspond to the intuitive notion of resolution. Rather, a better resolved bearing would seemingly correspond to a narrower spectral peak. A spectral estimate yielding the sharpest peak usually implies that the bearing has been best resolved. However, the sharpness of a peak can always be increased by raising the spectrum to a power greater than one. Such a computation does not increase the accuracy to which source bearings can be distinguished. The more operational definition of resolution is thus preferred: how well can a spectral estimate allow the presence of two sources to be determined. Spectral estimates are usually normalized in judging resolution so that the value at $k = k_0$ equals signal power. For example, the estimates considered here, such as the Bartlett estimate, would be normalized to be proportional to $\sigma_s^2$.

Note that resolved spectral peaks do not necessarily imply that the peaks are located at the proper bearings. The second criterion is therefore the bias of the estimate. When one source is present, the bias (the error in the location of the spectral peak) is usually zero (the estimate is unbiased). However, when two sources are present, the bias is usually nonzero. These two criteria of the "goodness" of a spectrum may conflict: good resolution is often obtained at the expense of a biased estimate.

The third criterion is variability: the range of bearings over which the location of a spectral peak can be expected to vary. Analytic evaluation of the variability for a given spectral estimate is usually difficult. The Cramér-Rao lower bound on the variance is usually used as a benchmark to evaluate the measured performance of a given method [2].

II. MAXIMUM-LIKELIHOOD METHOD

Perhaps the most well-known high-resolution array processing algorithm is the so-called maximum-likelihood method first reported by Capon [12], [13], [19]. The derivation of this method does not correspond to the standard approach used in maximum-likelihood estimates. Rather, this estimate is derived by finding the steering vector $A$ which yields the minimum beam energy $A'RA$ subject to the constraint that $A'E = 1$, where $E$ represents an ideal plane wave corresponding to the direction-of-look. The purpose of the constraint is to fix the processing gain for each direction-of-look to be unity. Minimizing the resulting beam energy reduces the contributions to this energy from sources and/or noise not propagating in the direction-of-look. The solution of this constrained optimization problem occurs often in the derivation of adaptive array processing algorithms. The solution technique is to use a Lagrange multiplier. We minimize

$$F = A'RA + \alpha(A'E - 1). \qquad (14)$$

The usual approach of setting the gradient of $F$ with respect to $A$ to zero must be used with caution. The vector $A$ is complex, and the gradient of the conjugate of $A$ cannot be defined. One could evaluate the gradients with respect to the real and imaginary parts, set each result to zero, and solve for $A$. A simpler approach is to consider $A$ and its conjugate as independent variables rather than the real and imaginary parts [16], [38]. When gradients with respect to $A$ and $A^*$ of $F$ are evaluated, they are conjugates of each other. Setting one of these to zero results in the solution

$$A = -\alpha R^{-1}E.$$

The quantity $\alpha$ is determined by the constraint equation $A'E = 1$; the final solution is then

$$A_{ML} = \frac{R^{-1}E}{E'R^{-1}E}. \qquad (15)$$

The power in the beam when steered in the direction-of-look determined by $E$ is the quadratic form $A'RA$, which becomes in this instance (see Figs. 2(b) and 3(b))

$$P_{ML}(k) = (E'R^{-1}E)^{-1}. \qquad (16)$$

Theoretical studies of the properties of this estimate rely on a closed-form expression for $R^{-1}$. If a matrix is of the form

$$R = B + DD'$$

then its inverse is given by

$$R^{-1} = B^{-1} - B^{-1}D\,[I + D'B^{-1}D]^{-1}D'B^{-1}. \qquad (17)$$

For example, when $R$ is given by (11), $D = \sigma_s S$ and $B = \sigma_n^2 I$, which results in

$$R^{-1} = \frac{1}{\sigma_n^2}\left[I - \frac{\sigma_s^2}{M\sigma_s^2 + \sigma_n^2}\, SS'\right]. \qquad (18)$$

Thus when the noise is spatially white and the maximum-likelihood beamformer is steered toward the source ($E = S$), the maximum-likelihood estimate equals

$$P_{ML}(k_0) = \sigma_s^2 + \frac{\sigma_n^2}{M} \qquad (19)$$

and thus has units of power, as does the Bartlett estimate (see (13)).

The true maximum-likelihood solution for $A$ has a form similar to (15). In this approach, a plane wave is presumed to be propagating in the direction $-k$ in the presence of statistically independent Gaussian noise. The steering vector corresponding to the maximum-likelihood estimate of the signal power is easily shown to be

$$A = \frac{Q^{-1}E}{E'Q^{-1}E}. \qquad (20)$$

Comparing with (15), this vector differs only in the presence of the noise correlation matrix instead of the signal-plus-noise correlation matrix. The spatial correlation matrix $Q$ of the noise involved in this expression is not usually known in practice. Furthermore, the presence of the signal in the acoustic field can prevent measurement of it. Capon has termed the estimate $P_{ML}(k)$ a "high-resolution" estimate [12]. Cox [17] has shown that this method has better resolution properties than the Bartlett estimate (Fig. 3, for example). However, it is not true that this method has the best resolution properties of any method.

III. LINEAR-PREDICTION METHOD

The linear-predictive spectral estimate commonly used in time series problems can also be used in array processing problems [30], [34], [38]. As before, let $X_{m_0}$ be the Fourier transform of the output of the $m_0$th sensor evaluated at the frequency $f_0$. We assume that this value is estimated by a weighted linear combination of the outputs of the other sensors

$$\hat{X}_{m_0} = -\sum_{m \neq m_0} a_m X_m. \qquad (21)$$

The linear-predictive method is based on finding the weights which minimize the mean-squared prediction error

$$E\big[\,|X_{m_0} - \hat{X}_{m_0}|^2\,\big].$$

Define the column vector $A$ to be

$$A = [a_0, a_1, \ldots, a_{m_0-1}, a_{m_0}, a_{m_0+1}, \ldots, a_{M-1}]^T.$$

We seek to minimize the quantity $E[|A'X|^2] = A'RA$ subject to the constraint that $a_{m_0} = 1$. This constraint can be written as $A'u_{m_0} = 1$, where $u_{m_0}$ is a column vector having the $m_0$th element equal to one and the other elements equal to zero. The linear-predictive coefficients are thus seen to be the solution to a constrained optimization problem having the same form as that encountered in the maximum-likelihood method. The solution is, therefore

$$A = \frac{R^{-1}u_{m_0}}{u'_{m_0}R^{-1}u_{m_0}}. \qquad (22)$$
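Because (15) and (22) solve the same constrained minimization with different constraint vectors, one routine can serve both. The sketch below is an illustration under stated assumptions: a linear array with half-wavelength spacing, a single source in spatially white noise so that the closed-form inverse of (11) applies, and hypothetical helper names (`steering_vector`, `inverse_correlation`, `constrained_minimum`).

```python
import cmath
import math


def steering_vector(angle_deg, num_sensors=10, spacing_wavelengths=0.5):
    """Plane-wave vector exp{+j 2 pi (m d / lambda) sin(angle)} for a linear array."""
    phase = 2.0 * math.pi * spacing_wavelengths * math.sin(math.radians(angle_deg))
    return [cmath.exp(1j * m * phase) for m in range(num_sensors)]


def inverse_correlation(source_deg, signal_power, noise_power, num_sensors=10):
    """Closed-form inverse of noise_power * I + signal_power * S S' (one source only)."""
    s = steering_vector(source_deg, num_sensors)
    gain = signal_power / (num_sensors * signal_power + noise_power)
    return [
        [((1.0 if r == c else 0.0) - gain * s[r] * s[c].conjugate()) / noise_power
         for c in range(num_sensors)]
        for r in range(num_sensors)
    ]


def mat_vec(matrix, vector):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def inner(a, b):
    """a' b, where ' denotes the conjugate transpose."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def constrained_minimum(r_inv, constraint):
    """Weights minimizing A' R A subject to A' c = 1, and the minimum value.

    The weights are R^{-1} c / (c' R^{-1} c); the minimum is 1 / (c' R^{-1} c).
    """
    direction = mat_vec(r_inv, constraint)
    denom = inner(constraint, direction).real  # real because R^{-1} is Hermitian
    return [w / denom for w in direction], 1.0 / denom
```

Passing a plane-wave vector as `constraint` gives the maximum-likelihood weights and $P_{ML}(k)$; passing the unit vector $u_{m_0}$ gives the linear-predictive coefficients and the mean-squared prediction error. For ten sensors at 0 dB, steering at the source returns $\sigma_s^2 + \sigma_n^2/M = 1.1$.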

Using ideas found in time series [34], the linear-predictive method corresponds to an all-pole (i.e., autoregressive) model for the signal. In this case, the power spectrum is given by the mean-squared prediction error divided by the magnitude squared of the spectrum of the predictor coefficients. Consequently, the linear-predictive spectral estimate in the array case is of the form

$$P_{LP}(k) = \frac{A'RA}{|A'E|^2} \qquad (23)$$

which is given by

$$P_{LP}(k) = \frac{u'_{m_0}R^{-1}u_{m_0}}{|u'_{m_0}R^{-1}E|^2}. \qquad (24)$$

Figs. 2(c) and 3(c) illustrate examples of this estimate. Using (17), the value of this spectral estimate for $k = k_0$ and $M\sigma_s^2/\sigma_n^2 \gg 1$ is easily shown to be

$$P_{LP}(k_0) \approx M(M - 1)\,\frac{\sigma_s^4}{\sigma_n^2}.$$

This spectral estimate differs from the conventional and the maximum-likelihood estimates in an important way: the value of a peak is proportional to the square of the signal power. Burg and others [7], [29] argue that the power of a signal is determined from $P_{LP}(k)$ by the area under the peak rather than its amplitude. However, if the spectral estimate is modified to be

(25)

the power estimate when $k = k_0$ becomes

$$\tilde{P}_{LP}(k_0) = M\sigma_s^2 + \sigma_n^2.$$

With this modification, the resolution properties of the linear-predictive spectral estimate can be compared with other methods more easily. Linear-predictive spectral estimates for $m_0 = 0$ or $M - 1$ in a linear, equally spaced array have better resolution properties than maximum-likelihood estimates when the noise is spatially white [18].

This algorithm has the additional flexibility of the choice of $m_0$, the sensor whose output is being predicted. In time series, this flexibility is also present although it has not been studied extensively. There the selection of $m_0$ corresponds to the choice of sample to be predicted by the other signal samples. In this case, the usual choice is $m_0 = M - 1$ so that the past samples are used to predict the present one. Thus a causal signal model is obtained and the remaining choices for $m_0$ correspond to noncausal models. Causal models are usually preferred in time series problems. In contrast, causality is not a particularly important issue for a spatially distributed array, and the additional flexibility of choosing which sensor output is to be predicted could yield better spectral estimates.

With this additional flexibility, what is the best choice for $m_0$? One choice is to pursue the fundamental idea behind the linear-predictive method: choose $m_0$ to minimize the prediction error $A'RA$. Using (22), the mean-square prediction error is given by $(u'_{m_0}R^{-1}u_{m_0})^{-1}$, which equals $[(R^{-1})_{m_0 m_0}]^{-1}$. The minimum error choice for $m_0$ can be found analytically from the coefficients of the causal model when $R$ is Hermitian and Toeplitz. In this case, the diagonal elements of $R^{-1}$ are given by (see the Appendix)

$$(R^{-1})_{mm} = (R^{-1})_{00} \sum_{i=0}^{m} \left(|a_i^0|^2 - |a_{M-i}^0|^2\right) \qquad (26)$$

where the quantities $a_i^0$ are elements of the solution vector $R^{-1}u_0$ normalized so that $a_0^0 = 1$. The quantity $a_M^0$ is defined to be zero for the purposes of this expression.

To minimize the prediction error, one would search for the largest element on the diagonal of $R^{-1}$. The prediction error can be smaller for values of $m_0 \neq M - 1$ in many cases. However, the spectrum corresponding to this minimum prediction-error choice is not necessarily "better" than that obtained when $m_0 = M - 1$ or 0. Fig. 4 is an example of this phenomenon. It can be shown that for a linear array of equally spaced sensors, predicting the output of a center element ($m_0 = M/2$) has better resolution properties than those obtained when predicting the output of an end element ($m_0 = M - 1$ or 0) when the source bearings are closely spaced (separated by less than one-half a beamwidth) [18]. However, the opposite result holds when the source bearings are not closely spaced (Fig. 4). Fig. 4 also illustrates the nonlinear nature of this spectral estimate. The resolution capability of an algorithm can be affected by the signal-to-noise ratio; some algorithms are affected more ($m_0 = 4$ in Fig. 4, for example) than others. Furthermore, the increased resolution indicated in Fig. 4(a) is obtained at the expense of increased bias. A criterion for the "proper" choice of the predictive element $m_0$ has not been found to date.

Fig. 4. Effect of the choice of prediction element $m_0$ on the linear-predictive spectral estimate. The array configuration is described in the caption to Fig. 2 and the signal and noise characteristics in Fig. 3. (a) Sensor signal-to-noise ratio is 0 dB. (b) Sensor signal-to-noise ratio is -10 dB.

This is but one linear-prediction algorithm. While all of them start with the model given in (21), the error criterion to be minimized differs. For example, in time series, the sum of the so-called forward and backward squared errors is sometimes used [7], [30], [36]. For a linear array, this choice of criterion corresponds to the sum of the squared errors for $m_0 = 0$ and $M - 1$. The approach presented here and these other algorithms can also be modified to allow the order of the prediction model to be different from $M - 1$. The tradeoffs involved in reducing the model order have not been studied extensively. These various algorithms do have different resolution and bias properties [30], but these properties are not fully understood at this time.
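The peak value of the linear-predictive estimate quoted above as "easily shown" can be worked out in a few steps. The following is a sketch of that calculation, assuming the ideal correlation matrix (11) and using only $S'S = M$ and $|S_{m_0}| = 1$; the shorthand $\gamma$ is introduced here for convenience and is not the paper's notation.

```latex
% Inverse of R = \sigma_n^2 I + \sigma_s^2 S S', with \gamma a shorthand (assumption):
R^{-1} = \frac{1}{\sigma_n^2}\left[ I - \gamma\, S S' \right],
\qquad \gamma = \frac{\sigma_s^2}{M\sigma_s^2 + \sigma_n^2}

% Numerator of the estimate: the m_0-th diagonal element of R^{-1}
u'_{m_0} R^{-1} u_{m_0} = \frac{1 - \gamma}{\sigma_n^2}
  = \frac{(M-1)\sigma_s^2 + \sigma_n^2}{\sigma_n^2\,(M\sigma_s^2 + \sigma_n^2)}

% Denominator at E = S: since S'S = M, R^{-1}S = S / (M\sigma_s^2 + \sigma_n^2)
u'_{m_0} R^{-1} S = \frac{S_{m_0}}{M\sigma_s^2 + \sigma_n^2}
\quad\Longrightarrow\quad
|u'_{m_0} R^{-1} S|^2 = \frac{1}{(M\sigma_s^2 + \sigma_n^2)^2}

% Ratio of the two
P_{LP}(k_0) = \frac{\left[(M-1)\sigma_s^2 + \sigma_n^2\right]\left(M\sigma_s^2 + \sigma_n^2\right)}{\sigma_n^2}
\;\approx\; M(M-1)\,\frac{\sigma_s^4}{\sigma_n^2}
\quad \text{for } M\sigma_s^2/\sigma_n^2 \gg 1
```

The exact expression does not depend on $m_0$, and its leading term grows as the square of the signal power, which is the property noted in the text.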

IV. COMPARISON OF THE MAXIMUM-LIKELIHOOD AND LINEAR-PREDICTIVE METHODS

These two spectral estimation methods provide spectra having better resolution properties than conventional beamforming. Comparisons between these two estimates are often drawn. The maximum-likelihood method is an adaptive beamforming algorithm while linear prediction does not yield weights for beamforming. The linear-predictive method has better resolution properties. However, this increased resolution is accompanied by a ripple in the power estimate $P_{LP}(k)$ when the direction-of-look is not equal to the actual signal bearing (Fig. 3). These spectra have been related to each other analytically by Burg [8] in the case of an equally spaced linear array by

$$\frac{1}{P_{ML}(k)} = \sum_{m=0}^{M-1} \frac{1}{P_{LP}^{(m)}(k)}$$

where $P_{LP}^{(m)}(k)$ is the spectrum obtained with an $m$th-order model. This result suggests that $P_{ML}(k)$ is a smoothed version of $P_{LP}^{(M-1)}(k)$ as the former "averages-in" lower order linear-predictive models.

Another result can be obtained by noting the expression for the generalized cosine of two vectors $\alpha$ and $\beta$,

$$\cos(\alpha, \beta; R) = \frac{\alpha' R \beta}{\|\alpha\|_R\, \|\beta\|_R} \qquad (27)$$

where $\|\alpha\|_R$ denotes the norm of $\alpha$ as generated by $R$: $\|\alpha\|_R^2 = \alpha' R \alpha$. Note that when $\alpha$ and $\beta$ are complex, this expression is complex-valued. Despite this property, the magnitude of this cosine is bounded between 0 and 1 because of the Schwarz inequality. Consider the ratio of the high-resolution spectral estimates

$$\frac{P_{ML}(k)}{P_{LP}(k)} = \frac{|u'_{m_0}R^{-1}E|^2}{(u'_{m_0}R^{-1}u_{m_0})(E'R^{-1}E)}. \qquad (28)$$

Comparing (27) and (28), we have

$$\frac{P_{ML}(k)}{P_{LP}(k)} = |\cos(u_{m_0}, E; R^{-1})|^2. \qquad (29)$$

The ratio of these spectral estimates is thus equal to the squared magnitude of the cosine of the angle between $u_{m_0}$ and $E$ with respect to the vector space generated by $R^{-1}$. One consequence of this expression is that $P_{ML}(k) \le P_{LP}(k)$. The linear-predictive spectrum will be much greater than the maximum-likelihood estimate when this cosine is small.

The natural orthogonal basis for the vector space induced by $R^{-1}$ is comprised of the eigenvectors of this matrix.$^1$ When the correlation matrix is given by (11), one eigenvector equals $S$ while the remaining eigenvectors are the $(M - 1)$ orthogonal vectors spanning the subspace orthogonal to $S$. The eigenvalue of $R^{-1}$ corresponding to $S$ is equal to $1/(M\sigma_s^2 + \sigma_n^2)$ while the remaining eigenvalues equal $1/\sigma_n^2$. If the cosine were computed with respect to the identity matrix, $|\cos(u_{m_0}, E)|^2 = 1/M$ for all direction-of-look vectors $E$. When computing the cosine with respect to $R^{-1}$, the result depends on the relationship between $E$ and $S$. The space induced by $R^{-1}$ reduces the lengths of vectors parallel to $S$ by a factor of $1/\sqrt{M\sigma_s^2 + \sigma_n^2}$ while vectors orthogonal to $S$ are reduced in length by $1/\sigma_n$. Therefore, when $E = S$,

$$|\cos(u_{m_0}, E; R^{-1})|^2 \approx \frac{\sigma_n^2}{M(M\sigma_s^2 + \sigma_n^2)}$$

and when the direction-of-look is significantly different from $k_0$ so that $E$ is orthogonal to $S$, this cosine squared is approximately equal to $1/M$. The precise value of the cosine will oscillate about this quantity depending on the projection of $E$ onto $S$. Considering Fig. 5, the amount of this projection diminishes as the direction-of-look departs more from the signal direction. The characteristic of the linear-predictive estimate that results in both better resolution and increased ripple when compared to the maximum-likelihood estimate is this cosine-squared term.

Fig. 5. The ratio of maximum-likelihood and linear-predictive spectral estimates. The ratio given by (29) is computed for the spectra given in Fig. 2(b) and (c).

$^1$ It is easily shown that these eigenvectors are mutually orthogonal because $R^{-1}$ is a Hermitian matrix. Therefore, the eigenvectors constitute an orthonormal basis.
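The relation (29) between the two high-resolution estimates can be checked numerically. The sketch below assumes a single source in spatially white noise (so that the closed-form inverse of (11) can be used) and a linear array with half-wavelength spacing; all function names are hypothetical and serve only this illustration.

```python
import cmath
import math


def plane_wave(angle_deg, num_sensors=10, spacing_wavelengths=0.5):
    """Vector with elements exp{+j 2 pi (m d / lambda) sin(angle)} for a linear array."""
    phase = 2.0 * math.pi * spacing_wavelengths * math.sin(math.radians(angle_deg))
    return [cmath.exp(1j * m * phase) for m in range(num_sensors)]


def inverse_correlation(source_deg, signal_power, noise_power, num_sensors=10):
    """Closed-form inverse of noise_power * I + signal_power * S S' (one source only)."""
    s = plane_wave(source_deg, num_sensors)
    gain = signal_power / (num_sensors * signal_power + noise_power)
    return [
        [((1.0 if r == c else 0.0) - gain * s[r] * s[c].conjugate()) / noise_power
         for c in range(num_sensors)]
        for r in range(num_sensors)
    ]


def ml_estimate(look_deg, r_inv):
    """Maximum-likelihood estimate 1 / (E' R^{-1} E) of (16)."""
    e = plane_wave(look_deg, len(r_inv))
    quad = sum(e[r].conjugate() * r_inv[r][c] * e[c]
               for r in range(len(e)) for c in range(len(e)))
    return 1.0 / quad.real


def lp_estimate(look_deg, r_inv, m0=0):
    """Linear-predictive estimate (u' R^{-1} u) / |u' R^{-1} E|^2 of (24)."""
    e = plane_wave(look_deg, len(r_inv))
    cross = sum(r_inv[m0][c] * e[c] for c in range(len(e)))  # u' R^{-1} E
    return r_inv[m0][m0].real / abs(cross) ** 2


def cosine_squared(look_deg, r_inv, m0=0):
    """Right-hand side of (28): |u'R^{-1}E|^2 / ((u'R^{-1}u)(E'R^{-1}E))."""
    e = plane_wave(look_deg, len(r_inv))
    cross = sum(r_inv[m0][c] * e[c] for c in range(len(e)))
    return abs(cross) ** 2 * ml_estimate(look_deg, r_inv) / r_inv[m0][m0].real
```

For ten sensors at 0 dB with the source at broadside, `ml_estimate(0.0, r_inv)` is 1.1 and `lp_estimate(0.0, r_inv)` is 110, so the ratio at the source is 0.01; at every other look angle the ratio stays at or below one, as the Schwarz inequality requires.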

V. EIGENVECTOR METHODS

A class of spectral estimation procedures based on an eigenvector-eigenvalue decomposition of the spatial correlation matrix has been developed recently [3], [26], [43]. These procedures are intimately related to the maximum-likelihood and linear-prediction methods just described. The motivation for this approach is to emphasize those choices for $E$ which correspond to signal directions. As the expressions for the maximum-likelihood (16) and linear-prediction (24) estimates have $E$ appearing only in the denominator, the rationale is to reduce the lengths of those $E$'s corresponding to signals and increase those not corresponding to plane-wave signals. The problem is that one does not know, in general, which direction to emphasize; it is these directions that we are trying to determine from the spatial spectra. On the other hand, these directions determine the structure of the spatial correlation matrix, in particular the eigenstructure of the matrix. By examining this structure, one can obtain algorithms which enhance the spatial spectra in an objective way so that peaks corresponding to propagating signals are made more prominent.

The eigenvalues $\lambda_i$ and eigenvectors $v_i$ of $R$ are defined by the relationship

$$R v_i = \lambda_i v_i, \qquad i = 1, \ldots, M$$

where $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_M$. As mentioned earlier, when $R$ is given by (11) the eigenvector corresponding to the largest eigenvalue (termed the "largest eigenvector") equals $S$ and the remaining vectors span the $(M - 1)$-dimensional subspace orthogonal to $S$. More generally, if the acoustic field contains $K$ distinct incoherent propagating signals in a spatially white noise background, the spatial correlation matrix is given by

$$R = \sigma_n^2 I + \sum_{i=1}^{K} \sigma_i^2 S_i S_i'. \qquad (30)$$

The $K$ largest eigenvectors of this matrix are the $K$ orthogonal vectors which span the subspace containing the signal vectors $S_i$, $i = 1, \ldots, K$. As before, the remaining $(M - K)$ eigenvectors span the subspace orthogonal to this signal subspace.

The characteristics of the direction-of-look vector $E$ can be changed in an objective way by using the eigenstructure of $R$ just described [26]. Define a matrix $C$ to be

$$C = \sum_{i=1}^{M-K} c_i v_i v_i' \qquad (31)$$

where the choice of the constants $c_i$ depends on the particular algorithm. For example, let $c_i = 1$. Then the matrix $C$ is a projection matrix, implying that $CE$ contains only those components in $E$ orthogonal to the signal direction vector and that the lengths of those components are not changed. The resulting modified maximum-likelihood and linear-predictive spectral estimates, obtained by replacing $E$ with $CE$ in (16) and (24), are

$$(E'C'R^{-1}CE)^{-1} \qquad (32)$$

$$\frac{u'_{m_0}R^{-1}u_{m_0}}{|u'_{m_0}R^{-1}CE|^2}. \qquad (33)$$

Both of these estimates provide an exact estimate of the signal bearing when the ideal spatial correlation matrix (11) is used. These estimates are infinite when $k = k_0$ and are finite elsewhere.

The matrix $C$ need not be computed in evaluating these estimates. Because $R$ is a correlation matrix, it can be expressed in terms of its eigenvectors and eigenvalues as

$$R = \sum_{i=1}^{M} \lambda_i v_i v_i'. \qquad (34)$$

$R^{-1}$ has the same eigenvectors as $R$, but its eigenvalues are the reciprocals of those of $R$. Therefore,

$$R^{-1} = \sum_{i=1}^{M} \frac{1}{\lambda_i} v_i v_i'. \qquad (35)$$

The products $R^{-1}C$ and $C'R^{-1}C$ appearing in (32) and (33) are both given by

$$R^{-1}C = C'R^{-1}C = \sum_{i=1}^{M-K} \frac{1}{\lambda_i} v_i v_i'. \qquad (36)$$

Consequently, the eigenvector expansion of $R^{-1}$ is truncated to include only those terms not corresponding to signal propagation directions. These variations of the maximum-likelihood and linear-prediction methods have better resolution properties than the standard approaches [26]. However, they can be more sensitive to assumptions made in the course of analysis (e.g., how many signals are present).

A related eigenvector method termed MUSIC (MUltiple SIgnal Classification) has been developed by Schmidt [43]. This method differs from that just presented in the choice of $C$.$^2$ Instead of (36), the expression for both $R^{-1}C$ and $C'R^{-1}C$ is

$$\sum_{i=1}^{M-K} v_i v_i'. \qquad (37)$$

The effect of setting the small eigenvalues to unity is to "whiten" those portions of the spectrum that do not correspond to propagating signals. Its resolution capabilities are similar to those of the eigenvector methods.

$^2$ Note that in the MUSIC method the matrix $C$ is not the same for $C'R^{-1}C$ and $R^{-1}C$.

A method different in style from the eigenvector methods but related to them in its analytic details is also of current interest. This method, due to Pisarenko [41], assumes that the noise is spatially white. The power in the white noise is estimated by finding the largest quantity that can be subtracted from the main diagonal of the spatial correlation matrix while retaining a nonnegative definite matrix. If $R$ were given by (11), this quantity would indeed be $\sigma_n^2$. The Fourier transform of the smallest eigenvector of a Toeplitz, Hermitian matrix has $M - 1$ zeros [10], [35]; all of the spatial frequencies corresponding to signal direction vectors are represented by these zeros as well as false, non-signal-related frequencies. The power of each spectral line is evaluated by a separate procedure involving the solution of a set of linear equations [41]. Note that the smallest eigenvector of this modified matrix is identical to that of the original correlation matrix; consequently, the noise power need not be estimated explicitly in order to compute the smallest eigenvector. If the multiplicity of the smallest eigenvalue is greater than one, the spatial frequencies of the signals are those direction-of-look vectors lying in the subspace orthogonal to the subspace spanned by the smallest eigenvectors.

When the spatial correlation matrix $R$ is given by (11), all of these eigenvector methods provide a perfect indication of the bearing(s) of the sound source(s). Consequently, there is no reason at this point to prefer one of these eigenvector methods over another. The characteristics of these estimates tend to differ when actual data are used: these differences are discussed in a succeeding section.

VI. MAXIMUM-ENTROPY METHOD

Burg used the principle of maximum entropy to define a class of spectral estimates [7]. In this approach, the $2M + 1$ correlation values $R(r_i)$, where $r_i$ is the $i$th inter-sensor separation vector $z_m - z_n$, $m, n = 1, \ldots, M$ and $i = m - n$, are assumed to be infinitely accurate. The power spectral estimate is constrained to have a Fourier transform equaling the measured correlation values. Consequently, a set of linear constraints on the spectral estimate $P(k)$ is obtained

$$\int P(k) \exp(+j2\pi k \cdot r_i)\, dk = R(r_i), \qquad i = -M, \ldots, M. \qquad (38)$$

The entropy of the power spectrum is defined by

$$H = \int_K \ln P(k)\, dk \qquad (39)$$

where $K$ denotes the range of spatial frequencies where the entropy is defined. This region could be restricted to those values which correspond to physically possible arrivals. It has been shown that maximizing the entropy is roughly equivalent to choosing the "most likely" spectrum that satisfies the correlation constraints [25], [40]. The solution to this problem has been shown to be of the form [20]

$$P_{ME}(k) = \frac{1}{p(k)} \qquad (40)$$

where $p(k)$ is a so-called positive polynomial

$$p(k) = \sum_{i=-M}^{M} b_i \exp(-j2\pi k \cdot r_i) \qquad (41)$$

where $b_i$ is the $i$th coefficient of the polynomial. Positive polynomials are defined to have a value greater than or equal to zero for all values of their arguments:

$$p(k) \ge 0, \qquad \text{for all } k \in K. \qquad (42)$$

In the time series case or the case of an equally spaced linear array, the maximum entropy solution corresponds to the linear prediction solution. This correspondence is due to the fundamental theorem of algebra: every polynomial of a single scalar variable has exactly $n$ roots where $n$ is the order of the polynomial. The polynomial $p(k)$ is such a polynomial in these special cases and can thus be factored to be of the form

$$p(k) = |A'E|^2. \qquad (43)$$

When the array geometry is multidimensional, positive polynomials cannot be written in the form of (43). The fundamental theorem of algebra does not hold in general for multivariate polynomials; consequently, they may not be factorable. The linear prediction estimate of the spatial spectrum can be found as has been previously described. However, these spectra constitute a subclass of potential maximum-entropy spectra. One might, therefore, expect the maximum-entropy method to yield better estimates than the linear-prediction method.

The evaluation of the polynomial coefficients in the maximum-entropy spectral estimate is a subject of current research. One algorithm for finding the maximum-entropy spectrum has been proposed recently by Lang and McClellan [31], [37]. By considering the coefficients $b_i$ as elements of the vector $B$ and $R(r_i)$ as elements of the vector $R$, the inner product $B'R$ is given by

$$B'R = \sum_i b_i R(r_i). \qquad (44)$$

Using Parseval's theorem, this sum is written as

$$B'R = \int_K \frac{p(k)}{p_0(k)}\, dk \qquad (45)$$

where $p_0(k)$ is the denominator polynomial of the optimal maximum-entropy solution. Thus when the optimum (maximum entropy) coefficients $B_0$ are used in this expression

$$B_0'R = \int_K \frac{p_0(k)}{p_0(k)}\, dk = 1.$$

polynomials is also a positive polynomial). Considering $H$ to be a function of these coordinates, we seek the location $B_0$ in this space where the hyperplane is tangent to the surface defined by $H(B)$. The equivalent nonlinear constrained optimization problem is to minimize $H(B)$ (see (40) and (41)) subject to the constraint that $B'R = 1$. This problem has no known closed-form solution. However, since the entropy is a convex function defined over a convex set, a unique minimum does exist [33], and it can be found using well-known numerical optimization techniques. The lack of a closed-form solution has hindered analytical inquiry into the characteristics of the resulting spectral estimates. In particular, the intuitive notion that the maximum-entropy spectral estimate should be as good if not better than the linear-predictive estimate has not been confirmed.

EMPIRICAL ISSUES

Many aspects of the application of modem spectral estimation procedures to bearing estimation problems have not been fully explored. Two issues dominate. The first is array geometry. The location of each sensor Z m is well-hidden in the formulation of the spectral estimate. Consequently, the effect of geometry on the "quality" of the various spectral estimates is unknown. It could well be that each procedure excels for specific class of array geometries. On the other hand, these procedures can also be sensitive to differences between the actual sensor location and those presumed by the spectral estimates. The adaptive methods tend to be much more sensitive to such errors than the Bartlett estimate. In fact, a sensor can produce no output and not greatly affect the spectrum computed by the Bartlett method. Another dominating issue is the computation of the correlation matrix. In practice, the spatial correlation matrix is never of the form of (10). Consequently, theoretical results based on this assumption may be misleading. In array processing, the spatial correlation matrix is computed by a variation of the Bartlett procedure.' used in time series. Let the vector of sensor outputs by x(t). The time series recorded from each sensor is sectioned into time segments of duration T and the Fourier transform of each section is evaluated. The vector of Fourier transforms for the ith section is denoted by Xi(f). The spatial correlation matrix evaluated at the temporal frequency f is estimated by averaging the outer product of this vector with itself

(46)

The quantity B'R = 1 defines a constraint on the maximumentropy solution. As the correlation values are presumed to be known, this quantity can be viewed as a hyperplane in the 2M + I-dimensional vector space having the b i as its coordinates. Note that these coefficients are restricted to be in the subspace corresponding to coefficients of positive polynomials. TIlls subspace is convex (the weighted sum of two positive

K

,

L x.x; i=l

(47)

The number of terms in the average is often referred to as the time-bandwidth product of the computation. The origin of this term is easily understood. If the length of each section is T, the resulting temporal spectral resolution is proportioned to 1IT. The total time duration of the record is KT. Consequently, (KT· 1IT) =K can be thought as a measure of how accurately the temporal frequency resolution and the amount of averaging required for a good estimate trade against each other. Assuming Xi =asS + anN! (the signal does not change from section to section), this estimate of the spatial correlation matrix is

$$\hat{\mathcal{R}} = a_s^2 SS' + a_s a_n \left( S\bar{N}' + \bar{N}S' \right) + a_n^2 \hat{\mathcal{R}}_n$$

where

$$\bar{N} = \frac{1}{K} \sum_{i=1}^{K} N_i$$

and $\hat{\mathcal{R}}_n$ is an estimate of the noise spatial correlation matrix $\mathcal{R}_n$. The matrix $\hat{\mathcal{R}}_n$ is Hermitian but is usually not Toeplitz. As the time-bandwidth product increases, $\hat{\mathcal{R}}_n \to \mathcal{R}_n$ and $\bar{N} \to 0$. If the noise is spatially white, then $\mathcal{R}_n = I$ and (11) results. However, in most applications of interest, $K$ is not large enough to justify such a simple formula. The cross terms between signal and noise and the presence of $\hat{\mathcal{R}}_n$ instead of $\mathcal{R}_n$ imply that spectral errors can occur (Fig. 6). It has been shown that the maximum-likelihood and linear-predictive estimates are sensitive to the cross terms [18]. Furthermore, the increased resolution capability of linear prediction is mitigated to some extent by its sensitivity to $K$. Roughly speaking, the time-bandwidth product for linear prediction must be $M$ times that for maximum likelihood to result in the same statistical variability of the spectral estimate. When $\mathcal{R}_n = I$, the eigenvector methods are more sensitive to the cross terms than to the statistical variation present in $\hat{\mathcal{R}}_n$. A finite time-bandwidth product limits the resolution of the eigenvector methods [26]. Fig. 7 illustrates the spectra obtained when these eigenvector methods are used. The Pisarenko method is more sensitive to a finite value of $K$. As the matrix $\hat{\mathcal{R}}$ is no longer Toeplitz, no signal vector may be orthogonal to the smallest eigenvector. One is then faced with determining the set of "best" signal vectors [6], [11].

³This procedure should not be confused with the Bartlett spectral estimate defined earlier.

Fig. 6. Effect of finite averaging on various spectral estimates. The array configuration and signal characteristics are as described in the caption to Fig. 3. The matrix $\hat{\mathcal{R}}$ is given by (47); the time-bandwidth product of the computation is 50. (a) Bartlett estimate. (b) Maximum-likelihood estimate. (c) Linear-predictive estimate. The same correlation matrix was used in each spectral estimate. (Horizontal axis: bearing in degrees, -90 to 90.)

Fig. 7. Result of applying eigenvector methods. The spatial correlation matrix used in this computation is the same as that used in Fig. 6. The eigenvector expansion (36) was truncated at 8 terms. (a) Maximum-likelihood spectral estimate. (b) Linear-predictive estimate. Note that for this figure, (25) was used in the computation of the linear-predictive estimate rather than that described by (24). (Horizontal axis: bearing in degrees, -90 to 90.)
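The averaging procedure of (47) and the origin of the signal-noise cross terms can be illustrated with a small numerical sketch. It assumes real amplitudes `a_s` and `a_n`, a two-sensor array, and K = 3 sections with arbitrary fixed "noise" vectors; the helper names `outer` and `sample_correlation` are hypothetical and the numbers are made up for illustration only.

```python
def outer(x, y):
    """Outer product x y', where ' denotes the conjugate transpose."""
    return [[xi * yj.conjugate() for yj in y] for xi in x]


def sample_correlation(sections):
    """Average of X_i X_i' over the K sections, as in (47)."""
    K = len(sections)
    n = len(sections[0])
    R = [[0j] * n for _ in range(n)]
    for x in sections:
        o = outer(x, x)
        for r in range(n):
            for c in range(n):
                R[r][c] += o[r][c] / K
    return R


# Illustrative data (assumptions, not taken from the paper).
a_s, a_n = 2.0, 0.5                      # real signal and noise amplitudes
S = [1 + 0j, 1j]                         # signal vector, fixed across sections
N = [[0.3 - 0.1j, -0.2 + 0.4j],          # noise vectors N_i, one per section
     [-0.5 + 0.2j, 0.1 + 0.1j],
     [0.4 + 0.3j, -0.6 - 0.2j]]
X = [[a_s * s + a_n * n for s, n in zip(S, Ni)] for Ni in N]

R_hat = sample_correlation(X)
K = len(N)
N_bar = [sum(Ni[m] for Ni in N) / K for m in range(len(S))]
R_n_hat = sample_correlation(N)

# Signal term + cross terms + noise term should reproduce R_hat exactly.
SS, SN, NS = outer(S, S), outer(S, N_bar), outer(N_bar, S)
expansion = [[a_s ** 2 * SS[r][c]
              + a_s * a_n * (SN[r][c] + NS[r][c])
              + a_n ** 2 * R_n_hat[r][c]
              for c in range(2)] for r in range(2)]
max_diff = max(abs(R_hat[r][c] - expansion[r][c])
               for r in range(2) for c in range(2))
```

With only three sections the averaged noise vector `N_bar` is far from zero, so the cross terms contribute visibly to `R_hat`; they shrink only as the number of sections grows.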

CONCLUSIONS

A cohesive methodology for deriving high-resolution spectral estimates for array processing problems has been presented. Each has been shown to be the solution of a constrained optimization problem. This approach is quite general and can be used to derive procedures applicable to time series (i.e., one-dimensional data) and to multidimensional data. While arrays are usually multidimensional, the spectral estimation problem equivalent to the bearing estimation problem has very specific properties. Procedures designed to compute the spectrum of a signal sampled on a regular grid (such as rectangular and hexagonal ones) do not usually apply to array processing problems. The impact of array geometry on spectral estimation procedures is largely unknown. Many designers of arrays use unequally spaced sensors for a variety of reasons. Hence the generality of the theory presented here. Some work is emerging on the geometry question [14], [32].

The underlying model used for the signal in these derivations can also be questioned. The wavefront of sound propagating from a point source is curved. Significant curvature of the wavefront across the array aperture can significantly affect the quality of estimates which assume a plane wave. If the curvature were known, it could easily be taken into account; unfortunately, it rarely is known. A more serious problem is coherence between signals impinging on the array from different directions. In this case, a nodal pattern of peaks and valleys of signal power is established across the array. This effect results in a location-dependent amplitude and phase variation beyond that assumed in the usual plane-wave model. Current research is directed towards methods which can cope with coherent signals [21], [22].

The linear-predictive estimate is more sensitive to its signal model than most of the other procedures described here. In addition to the usual plane-wave assumption, the signal recorded at each sensor is also assumed to be modeled by a linear difference equation. While the plane-wave signal may obey this relationship, the noise usually does not. In practical problems, the signal-to-noise ratio at each sensor is small, usually being around 0 dB. Consequently, this method is sensitive to noise and to finite time-bandwidth products. This problem has been recognized in the time-series literature; in that context, so-called ARMA models emerge [15], [27]. These are pole-zero models where the poles describe the signal, and the zeros are due to the presence of noise. Procedures are being developed to measure parameters of such models [28], [42], [45], but their applicability to array processing problems is limited because of problems similar to those encountered in the multidimensional maximum-entropy spectral estimate. While further work is needed to find spectral estimation procedures for time-series problems and to quantify their behavior, the bearing estimation problem offers a different set of issues, which are apparently more challenging.

Let

be the M X M matrix

eM = [1~~1 ~] and ~M to be the matrix in the equivalent quadratic form in the second term of (AS)

1'i

XM~MXM =

1

1=0

aiXM_i!2

The elements of this matrix are j<:i<:M- 1

i =M j =M.

Using (AS), the inverse of ~M can be written as

~A}= eM + ~M·

A closed-form expression for the inverse of a Toeplitz, Hermitian matrix $\mathcal{R}$ can be found in terms of the coefficients of the linear-prediction problem. This derivation is adapted from Siddiqui [44], although this result does not appear directly in his report. Let $w_m$ be a white, Gaussian stochastic sequence having zero mean and unity variance. Define $x_m$ to be the output of a linear, shift-invariant system governed by the difference equation

(3{A} )ij = (~A} )M-j,M-i.

eM

where $w_m$ is the input. Note that the coefficients $a_0, \ldots$ can be complex. Define $X_M$ to be the column vector $[x_1, \ldots, x_M]^T$. The matrix $\mathcal{R}_M$ is defined to be the correlation matrix of the random vector $X_M$. We seek an expression for $\mathcal{R}_M^{-1}$. The joint distribution of the random variables

where

(g{:~ )ij j-l

E

k=O

T

is given by

[akQk+i-j - ak+M-iQk+M-j] +aM-iQM-jJ

i =M

Qo Q'M-i,

j =

"2 (X W'- t J\ M ,

1

0-1

1X

2

M - l +WM)

}

.

(A2)

Substituting the difference equation (AI) for WM, one obtains the joint distribution for Xl, • • • , XM to be

l~[.

Note that the first column of iA} is the vector [lao 12 , Q6 Q t , · · · ,a6 QM - l ] T . The inverse of 3lM can thus be found from the solution i-l"o of the causal linear prediction problem.

(A3)

This expression can also be written straightforwardly as (A4)

Comparing these expressions, we see that XM.AMXM

1

(A7)

M 1

, a)-I

i~M-

a~QM-i'

p(Z) = (21r)M{2 I :R _ 11{2

·exp {

3tiJ

As is zero in the last column, the last column of equals that of !M. This column is equal to the rust row of ~ A} because of this matrix's centrosymmetric property. The first row of lA}-l can then be deduced because the first row of jM is also known. Knowledge of its first row implies knowledge of its last column, which is in the next-to-last column of ~:~. This method proceeds iteratively to yield the following closed-form expression for CiA} )ijJ i ;;, j

QM-t

I

(A6)

Because $\mathcal{R}_M$ is Hermitian, so is $\mathcal{R}_M^{-1}$; however, $\mathcal{R}_M$ being Toeplitz does not imply that the inverse is necessarily Toeplitz. The inverse is, however, centrosymmetric (symmetric about the cross-diagonal) [10]
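The statement that the inverse of a Toeplitz, Hermitian matrix is symmetric about the cross-diagonal but not itself Toeplitz can be checked on a small case. The sketch below is an illustration only: it uses a real symmetric 3 x 3 Toeplitz matrix (a special case of Hermitian) chosen arbitrarily, exact rational arithmetic, and hypothetical helper names `toeplitz` and `inverse`.

```python
from fractions import Fraction


def toeplitz(first_col):
    """Real symmetric Toeplitz matrix built from its first column."""
    n = len(first_col)
    return [[Fraction(first_col[abs(i - j)]) for j in range(n)]
            for i in range(n)]


def inverse(a):
    """Gauss-Jordan inverse in exact rational arithmetic."""
    n = len(a)
    aug = [row[:] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


R = toeplitz([4, 2, 1])          # arbitrary example values
R_inv = inverse(R)
n = len(R)

# Symmetry about the cross-diagonal (0-based indices).
is_centrosymmetric = all(R_inv[i][j] == R_inv[n - 1 - j][n - 1 - i]
                         for i in range(n) for j in range(n))
# Toeplitz would require constant values along every diagonal.
is_toeplitz = all(R_inv[i][j] == R_inv[i - 1][j - 1]
                  for i in range(1, n) for j in range(1, n))
```

For this matrix the main diagonal of the inverse is (1/3, 5/12, 1/3), so the inverse is not Toeplitz even though it mirrors across the cross-diagonal.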

APPENDIX

Z= [Xt,X2,··· ,xM-l' wM1

e. M

X

'M-l

= XM-l;J\ M-l M-l + I ' C D -1

!~ ~ aixM-i

2 1

(AS)

1=0

REFERENCES

[1] A. B. Baggeroer, "Sonar signal processing," in Applications of Digital Signal Processing, A. V. Oppenheim, Ed. Englewood Cliffs, NJ: Prentice-Hall, 1978, pp. 331-437.
[2] W. J. Bangs and P. M. Schultheiss, "Space-time processing for optimal parameter estimation," in Signal Processing. New York: Academic Press, 1973, pp. 577-590.
[3] G. Bienvenu and L. Kopp, "Source power estimation method associated with high resolution bearing estimation," in Proc. IEEE ICASSP, pp. 153-156 (Atlanta, GA, 1981).
[4] R. B. Blackman and J. W. Tukey, The Measurement of Power Spectra. New York: Dover, 1958.
[5] T. P. Bronez and J. A. Cadzow, "An algebraic approach to data-adaptive array processing," in Proc. ASSP Workshop on Spectral Estimation, pp. 5.2.1-5.2.8 (McMaster Univ., Hamilton, Ont., Canada, 1981).
[6] H. P. Bucker, "High-resolution cross-sensor beamforming for a uniform line array," J. Acoust. Soc. Amer., vol. 63, pp. 420-424, 1978.
[7] J. P. Burg, "Maximum entropy spectral analysis," Ph.D. dissertation, Dep. Geophys., Stanford Univ., Stanford, CA, 1967.
[8] -, "The relationship between maximum entropy spectra and maximum likelihood spectra," Geophysics, vol. 37, pp. 375-376, 1972.
[9] J. A. Cadzow and T. P. Bronez, "An algebraic approach to superresolution adaptive array processing," in Proc. IEEE ICASSP, pp. 302-305 (Atlanta, GA, 1981).
[10] A. Cantoni and P. Butler, "Properties of the eigenvectors of persymmetric matrices with applications to communication theory," IEEE Trans. Commun., vol. COM-24, no. 8, pp. 804-809, 1976.
[11] A. Cantoni and L. C. Godara, "Resolving the directions of sources in a correlated field incident on an array," J. Acoust. Soc. Amer., vol. 67, pp. 1247-1255, 1980.
[12] J. Capon, "High-resolution frequency-wavenumber spectrum analysis," Proc. IEEE, vol. 57, no. 8, pp. 1408-1418, 1969.
[13] -, "Maximum-likelihood spectral estimation," in Nonlinear Methods of Spectral Analysis, S. Haykin, Ed. New York: Springer, 1979, pp. 155-179.
[14] G. C. Carter, "Variance bounds for passively locating an acoustic source with a symmetric line array," J. Acoust. Soc. Amer., vol. 62, pp. 922-926, 1977.
[15] Y. T. Chan, J. M. M. Lavoie, and J. B. Plant, "A parameter estimation approach to estimation of frequencies of sinusoids," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-29, pp. 214-219, 1981.
[16] J. F. Claerbout, Fundamentals of Geophysical Data Processing. New York: McGraw-Hill, 1976.
[17] H. Cox, "Resolving power and sensitivity to mismatch of optimum array processors," J. Acoust. Soc. Amer., vol. 54, no. 3, pp. 771-785, 1973.
[18] S. DeGraaf, "The effect of coherent signals on the capability of array processing algorithms to resolve source bearings," M.S. thesis, Dep. Elec. Eng., Rice University, Houston, TX, 1982.
[19] D. J. Edelblute, J. M. Fisk, and G. L. Kinnison, "Criteria for optimum-signal-detection theory for arrays," J. Acoust. Soc. Amer., vol. 41, no. 1, pp. 199-205, 1967.
[20] J. A. Edward and M. M. Fitelson, "Notes on maximum-entropy processing," IEEE Trans. Inform. Theory, vol. IT-19, pp. 232-234, 1973.
[21] J. E. Evans, J. R. Johnson, and D. F. Sun, "High resolution angular spectrum estimation techniques for terrain scattering analysis and angle of arrival estimation," in Proc. ASSP Workshop on Spectral Estimation, pp. 5.3.1-5.3.10 (McMaster Univ., Hamilton, Ont., Canada, 1981).
[22] W. F. Gabriel, "Adaptive superresolution of coherent RF spatial sources," in Proc. ASSP Workshop on Spectral Estimation, pp. 5.1.1-5.1.7 (McMaster Univ., Hamilton, Ont., Canada, 1981).
[23] S. Haykin, Nonlinear Methods of Spectral Analysis. New York: Springer, 1979.
[24] M. J. Hinich, "Frequency-wavenumber array processing," J. Acoust. Soc. Amer., vol. 69, no. 3, pp. 732-737, 1981.
[25] E. T. Jaynes, "What is the problem?," in Proc. ASSP Workshop on Spectral Estimation, pp. 1.1.1-1.1.10 (McMaster Univ., Hamilton, Ont., Canada, 1981).
[26] D. H. Johnson and S. DeGraaf, "Improving the resolution of bearing in passive sonar arrays by eigenvalue analysis," submitted to IEEE Trans. Acoust., Speech, Signal Processing.
[27] M. Kaveh, "High resolution spectral estimation for noisy signals," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-27, pp. 286-287, 1979.
[28] M. Kaveh and S. P. Bruzzone, "A comparative overview of ARMA spectral estimation," in Proc. ASSP Workshop on Spectral Estimation, pp. 2.4.1-2.4.8 (McMaster Univ., Hamilton, Ont., Canada, 1981).
[29] R. T. Lacoss, "Data adaptive spectral analysis methods," Geophysics, vol. 36, no. 4, pp. 661-675, 1971.
[30] S. W. Lang and J. H. McClellan, "Frequency estimation with maximum entropy spectral estimators," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-28, pp. 716-724, 1980.
[31] -, "Spectral estimation for sensor arrays," in Proc. ASSP Workshop on Spectral Estimation, pp. 3.2.1-3.2.7 (McMaster Univ., Hamilton, Ont., Canada, 1981).
[32] S. W. Lang, G. L. Duckworth, and J. H. McClellan, "Array design for MEM and MLM array processing," presented at the Int. Conf. on Acoustics, Speech, and Signal Processing, Atlanta, GA, Mar. 1981.
[33] D. G. Luenberger, Optimization by Vector Space Methods. New York: Wiley, 1969.
[34] J. Makhoul, "Linear prediction: A tutorial review," Proc. IEEE, vol. 63, pp. 561-580, 1975.
[35] -, "On the eigenvectors of symmetric Toeplitz matrices," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-29, pp. 868-872, 1981.
[36] L. Marple, "A new autoregressive spectrum analysis algorithm," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-28, pp. 441-454, 1980.
[37] J. H. McClellan, "Multi-dimensional spectral estimation," in Proc. ASSP Workshop on Spectral Estimation, pp. 3.1.1-3.1.10 (McMaster Univ., Hamilton, Ont., Canada, 1981).
[38] R. N. McDonough, "Application of the maximum-likelihood method and the maximum entropy method in array processing," in Nonlinear Methods of Spectral Analysis, S. Haykin, Ed. New York: Springer, 1979, pp. 181-243.
[39] R. A. Monzingo and T. W. Miller, Introduction to Adaptive Arrays. New York: Wiley-Interscience, 1980.
[40] A. Papoulis, "Entropy: From first principles to spectral estimation," in Proc. ASSP Workshop on Spectral Estimation, pp. 1.2.1-1.2.7 (McMaster Univ., Hamilton, Ont., Canada, 1981).
[41] V. F. Pisarenko, "The retrieval of harmonics from a covariance function," Geophys. J. R. Astr. Soc., vol. 33, pp. 347-366, 1973.
[42] E. A. Robinson, "Iterative least-squares procedure for ARMA spectral estimation," in Nonlinear Methods of Spectral Analysis, S. Haykin, Ed. New York: Springer, 1979, pp. 127-153.
[43] R. O. Schmidt, "A signal subspace approach to multiple emitter location and spectral estimation," Ph.D. dissertation, Stanford Univ., Stanford, CA, 1981.
[44] M. M. Siddiqui, "On the inversion of the sample covariance matrix in a stationary autoregressive process," Ann. Math. Stat., vol. 58, pp. 585-588, 1958.
[45] T. J. Ulrych and M. Ooe, "Autoregressive and mixed autoregressive-moving average models and spectra," in Nonlinear Methods of Spectral Analysis, S. Haykin, Ed. New York: Springer, 1979, pp. 73-125.

On Spatial Smoothing for Direction-of-Arrival Estimation of Coherent Signals

TIE-JUN SHAN, MATI WAX, AND THOMAS KAILATH, FELLOW, IEEE

Abstract-We present an analysis of a "spatial smoothing" preprocessing scheme, recently suggested by Evans et al., to circumvent problems encountered in direction-of-arrival estimation of fully correlated signals. Simulation results that illustrate the performance of this scheme in conjunction with the eigenstructure technique are described.

I. INTRODUCTION

In recent years, there has been a growing interest in high-resolution eigenstructure techniques for direction-of-arrival estimation. These methods, developed by Pisarenko [12], Ligget [9], Owsley [11], Schmidt [14], Reddi [13], Bienvenu and Kopp [1], Johnson and DeGraaf [8], and Wax et al. [18], are known to yield high resolution and asymptotically unbiased estimates, even in the case that the sources are partially correlated. Theoretically, these methods encounter difficulties only when the signals are perfectly correlated. In practice, however, significant difficulties arise even when the signals are highly correlated, as happens, for example, in multipath propagation or in military scenarios involving smart jammers. The perfect correlation case, referred to as the coherent case, serves as a good model for the highly correlated case.

In spite of its practical importance, the coherent case did not receive considerable attention until recently. Although a rather general solution was proposed by Schmidt [14], the high computational complexity involved makes it unattractive. Widrow et al. [19] and Gabriel [6], [7] described two similar approaches, both aimed at "decorrelating" the coherent signals. The scheme in Widrow et al., called "spatial dither," is based on mechanical "dithering" of the array, while Gabriel's scheme is based on "Doppler smoothing." Recently, Evans et al. [4], [5], in an extensive study of direction-of-arrival estimation techniques, presented an attractive solution to the problem for the case of a uniform linear array. Their solution is based on a preprocessing scheme referred to as spatial smoothing that essentially "decorrelates" the signals and thus eliminates the special difficulties encountered with coherent signals.

In this paper, we present a more complete analysis of the spatial smoothing preprocessing scheme. We also present simulation results that illustrate its performance in conjunction with the eigenstructure technique.

Manuscript received June 29, 1983; revised February 27, 1985. This work was supported in part by the Air Force Office of Scientific Research, Air Force Systems Command under Contract AF49-620-79-C-0058, and by the Joint Services Program at Stanford University under Contract DAAG29-81K-0057. The authors are with the Information Systems Laboratory, Stanford University, Stanford, CA 94305.

II. PROBLEM STATEMENT

Consider a uniform linear array composed of $p$ identical sensors. Let $q$ ($q < p$) narrow-band plane waves, centered at frequency $\omega_0$, impinge on the array from directions $\{\theta_1, \ldots, \theta_q\}$. Using complex (analytic) signal representation, the received signal at the $i$th sensor can be expressed as

$$r_i(t) = \sum_{k=1}^{q} a_k s_k(t)\, e^{-j\omega_0 (i-1) \sin\theta_k \, d/c} + n_i(t) \tag{1}$$

where, in fairly common notation, $s_k(\cdot)$ is the signal of the $k$th wavefront, $a_k$ is the complex response of the sensor to the $k$th wavefront, $d$ is the spacing between the sensors, $c$ is the propagation speed of the wavefronts, and $n_i(\cdot)$ is the additive noise at the $i$th sensor. We assume that the signals and noises are stationary and ergodic complex-valued random processes with zero mean. In addition, the noises are assumed to be uncorrelated with the signals and uncorrelated between themselves, and to have identical variance $\sigma^2$. Rewriting (1) in vector notation, assuming for simplicity that the sensors are omnidirectional, i.e., $a_k = 1$, we obtain

$$r(t) = \sum_{i=1}^{q} a(\theta_i)\, s_i(t) + n(t) \tag{2a}$$

where $r(t)$ is the $p \times 1$ vector

$$r(t) = [r_1(t), \ldots, r_p(t)]^T \tag{2b}$$

and $a(\theta_i)$ is the "steering vector" of the array in the direction $\theta_i$:

$$a(\theta_i) = [1, e^{-j\omega_0\tau_i}, \ldots, e^{-j\omega_0 (p-1)\tau_i}]^T, \qquad \tau_i = \frac{d}{c} \sin\theta_i. \tag{2c}$$

To further simplify the notation, we rewrite (2) as

$$r(t) = A s(t) + n(t) \tag{3a}$$

where $s(t)$ is the $q \times 1$ vector

$$s(t) = [s_1(t), \ldots, s_q(t)]^T \tag{3b}$$

and $A$ is the $p \times q$ matrix

$$A = [a(\theta_1), \ldots, a(\theta_q)]. \tag{3c}$$

Reprinted from IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. ASSP-33, No. 4, pp. 806-811, August 1985.
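The signal model of (2c) and (3a) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names `steering_vector` and `snapshot` and the parameter `d_over_lambda` are hypothetical, and the phase step uses $\omega_0 \tau = 2\pi (d/\lambda_0) \sin\theta$, which follows from $\tau = (d/c)\sin\theta$ with wavelength $\lambda_0 = 2\pi c/\omega_0$.

```python
import cmath
import math


def steering_vector(theta_deg, p, d_over_lambda):
    """Steering vector a(theta) of (2c) for a p-sensor uniform linear array."""
    phase = 2 * math.pi * d_over_lambda * math.sin(math.radians(theta_deg))
    return [cmath.exp(-1j * phase * i) for i in range(p)]


def snapshot(thetas_deg, signals, p, d_over_lambda, noise=None):
    """One sample r = A s + n of (3a); noise is optional."""
    r = [0j] * p
    for theta, s in zip(thetas_deg, signals):
        a = steering_vector(theta, p, d_over_lambda)
        r = [ri + ai * s for ri, ai in zip(r, a)]
    if noise is not None:
        r = [ri + ni for ri, ni in zip(r, noise)]
    return r


# Half-wavelength spacing and a 30-degree arrival give a phase step of pi/2.
a = steering_vector(30.0, 4, 0.5)
# Two unit-response sources at +30 and -30 degrees, amplitudes 1 and 2.
r = snapshot([30.0, -30.0], [1.0, 2.0], 4, 0.5)
```

Each column of $A$ is one call to `steering_vector`, so `snapshot` simply accumulates the columns weighted by the source amplitudes.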

It follows from our assumptions that

$$R := E[r(t)\, r^\dagger(t)] = A S A^\dagger + \sigma^2 I, \qquad S := E[s(t)\, s^\dagger(t)] \tag{4}$$

where $\dagger$ denotes the conjugate transpose. Notice that $S$ is diagonal when the signals are uncorrelated, nondiagonal and nonsingular when the signals are partially correlated, and nondiagonal but singular when some signals are fully correlated (or coherent).

Assuming that the spacing between the sensors is less than half a wavelength of the impinging wavefronts ($d < \lambda_0/2$, where $\lambda_0 = 2\pi c/\omega_0$), it follows that the columns of the matrix $A$ are all different, and hence, because of their Vandermonde structure, linearly independent. Thus, if $S$ is nonsingular, then the rank of $A S A^\dagger$ is $q$. If $\{\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p\}$ and $\{v_1, v_2, \ldots, v_p\}$ are the eigenvalues and the corresponding eigenvectors of $R$, then the above rank properties imply that

1) the minimal eigenvalue of $R$ is equal to $\sigma^2$ with multiplicity $p - q$:

$$\lambda_{q+1} = \lambda_{q+2} = \cdots = \lambda_p = \sigma^2$$

2) the eigenvectors corresponding to the minimal eigenvalue are orthogonal to the columns of the matrix $A$, namely, to the "direction vectors" of the signals

$$\{v_{q+1}, \ldots, v_p\} \perp \{a(\theta_1), \ldots, a(\theta_q)\}.$$

We shall refer to the subspace spanned by the eigenvectors corresponding to the smallest eigenvalue as the "noise" subspace, and to its orthogonal complement, spanned by the "direction vectors" of the signals, as the "signal" subspace.

The high resolution eigenstructure techniques are based on the exploitation of properties 1) and 2) above. Unfortunately, these properties hold only when the covariance matrix of the sources $S$ is nonsingular. Different relations hold when $S$ is singular. Assume, for simplicity, that the rank of $S$ is $q - 1$. This implies that two signals, say the first two, are coherent, i.e., $s_2(t) = \alpha s_1(t)$, with $\alpha$ denoting a complex scalar describing the gain and phase relationship between the two coherent signals. In this case, we can rewrite (2) as

$$r(t) = \bar{A}\, \bar{s}(t) + n(t) \tag{5a}$$

where $\bar{s}(t)$ is the $(q - 1) \times 1$ vector

$$\bar{s}(t) = [s_1(t), s_3(t), \ldots, s_q(t)]^T \tag{5b}$$

and $\bar{A}$ is the $p \times (q - 1)$ matrix

$$\bar{A} = [a(\theta_1) + \alpha a(\theta_2),\ a(\theta_3), \ldots, a(\theta_q)]. \tag{5c}$$

From (5), it follows that the covariance matrix of $r(t)$ also can be written as

$$R = \bar{A} \bar{S} \bar{A}^\dagger + \sigma^2 I. \tag{6}$$

Now $\bar{S} = E[\bar{s}(t)\, \bar{s}(t)^\dagger]$, the covariance matrix of the modified signals, is a $(q - 1) \times (q - 1)$ nonsingular matrix and $\bar{A}$ is of full column rank. Therefore, in complete analogy to properties 1) and 2) above, we have 1) the multiplicity of the smallest eigenvalue is $p - (q - 1)$; 2) the eigenvectors corresponding to the minimal eigenvalue are orthogonal to the columns of the matrix $\bar{A}$. Note that, because of the Vandermonde structure of the direction vectors, the first column of $\bar{A}$ in (5c) is no longer a legitimate steering vector, since no linear combination of two "direction vectors" can yield another steering vector.

The results of a straightforward application of the eigenstructure technique to $R$ can now be easily understood. First, because the multiplicity of the smallest eigenvalue of $R$ is now $p - (q - 1)$, the detection step will give $q - 1$ as the number of signals. Second, since only the "direction vectors" corresponding to $\{\theta_3, \ldots, \theta_q\}$ are included in the "signal" subspace, only these directions-of-arrival will be resolved in the estimation step.

In general, if $m$ out of the $q$ wavefronts are coherent, the application of the conventional eigenstructure technique will result in an inconsistency: while the number of signals detected will be $q - m + 1$, only $q - m$ directions-of-arrival, corresponding to the incoherent wavefronts, will be resolved. Thus, if only one group of coherent signals exists, the difference between the number of signals detected and the number of signals resolved will be indicative of the size of the coherent group. Realizing this, Schmidt [15] proposed the following procedure: if a coherent group of size $m$ is detected, search for the linear combination of $m$ "direction vectors" that is included in the "signal" subspace or, equivalently, that is orthogonal to the "noise" subspace. Unfortunately, because of the high dimensionality of the search involved, this solution is computationally unattractive; in the next section, we present a different solution.

III. THE SPATIAL SMOOTHING PREPROCESSING SCHEME

As we have pointed out in the previous section, the nonsingularity of the covariance matrix of the signals is the key to a successful application of the eigenstructure technique. In this section, we present a preprocessing scheme, introduced by Evans et al. [5], that guarantees this property even when the signals are coherent.

Let a uniform linear array with $L$ identical sensors $\{1, \ldots, L\}$ be divided into overlapping subarrays of size $p$, with sensors $\{1, \ldots, p\}$ forming the first subarray, sensors $\{2, \ldots, p + 1\}$ forming the second subarray, etc. (see Fig. 1). Let $r_k(\cdot)$ denote the vector of received signals at the $k$th subarray. Following the notation of (3), we can write

$$r_k(t) = A D^{(k-1)} s(t) + n_k(t) \tag{7a}$$

where $D^{(k)}$ denotes the $k$th power of the $q \times q$ diagonal matrix

$$D = \mathrm{diag}\{e^{-j\omega_0\tau_1}, \ldots, e^{-j\omega_0\tau_q}\}. \tag{7b}$$

The covariance matrix of the $k$th subarray is therefore given by

$$R_k = A D^{(k-1)} S D^{\dagger(k-1)} A^\dagger + \sigma^2 I \tag{8}$$

where $S$ is the covariance matrix of the sources.
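Two facts used above lend themselves to a quick numerical check: a pair of coherent signals yields a singular source covariance, and the steering vector seen by the $k$th subarray is the first subarray's steering vector multiplied by the corresponding diagonal entry of $D^{(k-1)}$, as in (7a). The sketch below is an illustration under assumptions: the helper names, the coherence factor `alpha`, and the array parameters (six sensors, subarrays of four, third-wavelength spacing, 85-degree arrival) are chosen only as an example.

```python
import cmath
import math


def coherent_source_covariance(power, alpha):
    """S = E[s s^dagger] for s = [s1, alpha*s1]^T with E|s1|^2 = power."""
    return [[power, power * alpha.conjugate()],
            [power * alpha, power * abs(alpha) ** 2]]


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def steering(phase, n):
    """n-element steering vector with per-sensor phase step omega_0 * tau."""
    return [cmath.exp(-1j * phase * i) for i in range(n)]


# Coherent pair: the covariance has rank one, so its determinant vanishes.
S = coherent_source_covariance(2.0, 0.5 + 0.5j)
det_S = det2(S)

# Subarray shift: sensors k..k+p-1 of the full array versus D^(k-1) a(theta).
phase = 2 * math.pi * (1.0 / 3.0) * math.sin(math.radians(85.0))
L, p, k = 6, 4, 3
full = steering(phase, L)
subarray = full[k - 1:k - 1 + p]
shifted = [cmath.exp(-1j * phase * (k - 1)) * a for a in steering(phase, p)]
shift_error = max(abs(u - v) for u, v in zip(subarray, shifted))
```

The phase factor `exp(-1j * phase * (k - 1))` is the single diagonal entry of $D^{(k-1)}$ for this one source; with several sources each column of $A$ picks up its own entry.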

Fig. 1. Subarray spatial smoothing.

The spatially smoothed covariance matrix is defined as the sample mean of the subarray covariances:

$$\bar{R} = \frac{1}{M} \sum_{k=1}^{M} R_k \tag{9}$$

where $M = L - p + 1$ is the number of subarrays. Using (8), we can rewrite (9) as

$$\bar{R} = A \left( \frac{1}{M} \sum_{k=1}^{M} D^{(k-1)} S D^{\dagger(k-1)} \right) A^\dagger + \sigma^2 I \tag{10}$$

or more compactly as

$$\bar{R} = A \bar{S} A^\dagger + \sigma^2 I \tag{11a}$$

where $\bar{S}$ is the modified covariance matrix of the signals, given by

$$\bar{S} = \frac{1}{M} \sum_{k=1}^{M} D^{(k-1)} S D^{\dagger(k-1)}. \tag{11b}$$

We shall now prove that when $M \ge q$, where $q$ is the number of signal sources, $\bar{S}$ will be nonsingular regardless of the coherence of the signals.

Theorem: If the number of subarrays is greater than or equal to the number of signals, i.e., if $M \ge q$, then the modified covariance matrix of the signals $\bar{S}$ is nonsingular.

Proof: First, note that we can rewrite $\bar{S}$ as

$$\bar{S} = \begin{bmatrix} I & D & \cdots & D^{(M-1)} \end{bmatrix} \begin{bmatrix} \frac{1}{M}S & & \\ & \ddots & \\ & & \frac{1}{M}S \end{bmatrix} \begin{bmatrix} I \\ D^{\dagger} \\ \vdots \\ D^{\dagger(M-1)} \end{bmatrix} \tag{12}$$

which can be further simplified to

$$\bar{S} = G G^\dagger \tag{13a}$$

where $G$ is the $q \times Mq$ block matrix

$$G = \begin{bmatrix} C & DC & \cdots & D^{(M-1)}C \end{bmatrix} \tag{13b}$$

with $C$ denoting the Hermitian square root of $(1/M)S$:

$$C C^\dagger = \frac{1}{M} S. \tag{13c}$$

Clearly, the rank of $\bar{S}$ is equal to the rank of $G$. Thus, our task is to prove that $G$ has rank $q$ or, equivalently, using the rank operator $\rho$, to prove that $\rho\{G\} = q$. Recalling that the rank of a matrix is unchanged by a permutation of its columns, it can be easily verified that

$$\rho\{G\} = \rho \begin{bmatrix} c_{11} b_1 & c_{12} b_1 & \cdots & c_{1q} b_1 \\ \vdots & \vdots & & \vdots \\ c_{q1} b_q & c_{q2} b_q & \cdots & c_{qq} b_q \end{bmatrix} \tag{14a}$$

where $c_{ij}$ is the $ij$th element of the matrix $C$ and $b_i$ ($i = 1, \ldots, q$) is the $1 \times M$ row vector

$$b_i = [1, e^{-j\omega_0\tau_i}, \ldots, e^{-j\omega_0 (M-1)\tau_i}], \qquad i = 1, \ldots, q. \tag{14b}$$

To show that the matrix $G$ is of rank $q$, namely, is of full row rank, it suffices to show that each row of the matrix $C$ has at least one nonzero element and that the vectors $\{b_1, \ldots, b_q\}$ are linearly independent. The first fact follows by contradiction. Assume that a row of $C$, say the $k$th, is composed of all zeros. This implies, by (13c), that the $k$th signal has zero energy, in contradiction to the definition of $S$ as the covariance matrix of the nonvanishing signals. The linear independence of the vectors $b_1, \ldots, b_q$ follows by observing that for $M \ge q$, these vectors can be embedded in a Vandermonde matrix, which is known to be nonsingular.

The above result is stated in Evans et al. [5, pp. 2-24]. Their proof, however, is incomplete; they show, correctly, that the matrix

$$\hat{S}(t) = \frac{1}{M} \sum_{i=1}^{M} [D^{(i-1)} s(t)]\, [D^{(i-1)} s(t)]^\dagger$$

is nonsingular. Notice that $E\hat{S} = \frac{1}{M} E \sum_{i=1}^{M} [D^{(i-1)} s(t)]\, [D^{(i-1)} s(t)]^\dagger = \bar{S}$, that is, the expected value of $\hat{S}$ is equal to $\bar{S}$, the modified covariance matrix. Unfortunately, the nonsingularity of a random matrix does not imply the nonsingularity of its expected value. Thus, the nonsingularity of $\bar{S}$, the crucial element upon which the eigenstructure method hinges, does not follow from the nonsingularity of $\hat{S}$.

It can be shown that in the special case that the covariance matrix of the sources is block diagonal, i.e., when there are several groups of coherent signals that are uncorrelated with each other, the number of subarrays can be reduced to the size of the largest group of coherent signals.

Since the smoothed covariance matrix $\bar{R}$ has exactly the same form as the covariance matrix for the noncoherent case, one can successfully apply the eigenstructure methods to this smoothed covariance matrix regardless of the coherence of the signals. However, this robustness comes at the expense of a reduced effective aperture. To see this more quantitatively, consider the number of sensors needed to cope with q coherent wavefronts.
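The effect of the theorem can be seen in the smallest nontrivial case, two fully coherent unit-power sources. The sketch below forms the modified source covariance of (11b) directly, using the fact that the $(i, j)$ entry of $D^{(k)} S D^{\dagger(k)}$ is $e^{-jk(\phi_i - \phi_j)} S_{ij}$ with $\phi_i = \omega_0 \tau_i$. The function name `smoothed_source_covariance` and the example geometry (third-wavelength spacing, arrivals at 85 and 130 degrees) are assumptions made for illustration.

```python
import cmath
import math


def smoothed_source_covariance(S, phases, M):
    """(1/M) * sum over k = 0..M-1 of D^k S D^(dagger k), D = diag(exp(-j*phase_i))."""
    q = len(S)
    out = [[0j] * q for _ in range(q)]
    for k in range(M):
        for i in range(q):
            for j in range(q):
                rot = cmath.exp(-1j * k * (phases[i] - phases[j]))
                out[i][j] += rot * S[i][j] / M
    return out


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Two coherent sources (alpha = 1, unit power): S has rank one.
S = [[1 + 0j, 1 + 0j],
     [1 + 0j, 1 + 0j]]
phases = [2 * math.pi / 3 * math.sin(math.radians(a)) for a in (85.0, 130.0)]

det_one_subarray = det2(smoothed_source_covariance(S, phases, 1))   # M < q
det_two_subarrays = det2(smoothed_source_covariance(S, phases, 2))  # M = q
delta = phases[0] - phases[1]
```

With a single subarray the determinant stays at zero, while with $M = q = 2$ it equals $\sin^2(\Delta/2)$ for phase difference $\Delta$, which is nonzero whenever the two directions produce different phase steps.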
Recalling that the number of subarrays, given by M = p - m + 1, must be greater than or equal to q, and that the size of each subarray m must be at least q + 1, it follows that p = M + m - 1 is at least q + (q + 1) - 1, so the minimum number of sensors needed is p = 2q. Comparing this to p = q + 1 for the conventional case, it is clear that we trade off half the effective aperture.

IV. SIMULATION RESULTS

In this section, we present simulation results that illustrate the performance of the spatial smoothing scheme in conjunction with the eigenstructure technique.

The example we considered had three (q = 3) planar wavefronts at directions-of-arrival 85°, 130°, and 70°. The first two signals were coherent, while the third signal was not correlated with the others. The array was uniform and linear, with six elements a third wavelength apart. The signal-to-noise ratio was 3 dB, and the number of samples ("snapshots") taken from the array was 500. Applying the conventional beamforming method and the eigenstructure method of Schmidt [14], we obtained the results shown in Figs. 2 and 3, respectively. Only one dominant peak corresponding to the direction-of-arrival of the third signal is seen in both cases; the directions-of-arrival of the two coherent signals were not resolved. However, first applying the spatial smoothing preprocessing scheme with three (M = 3) subarrays of four (m = 4) sensors each, and then applying the eigenstructure method of Schmidt [14] to the spatially smoothed covariance matrix yielded the results shown in Fig. 4. In this case, the three peaks corresponding to the directions-of-arrival of all the three signals are clearly seen.

Fig. 2. Conventional beamforming method (with Hamming window) (six sensors; SNR = 3 dB; 500 "snapshots"; two coherent narrow-band sources from 85°, 130°; one incoherent source from 70°). Axes: dB versus direction-of-arrival.

Fig. 3. Conventional MUSIC method (six sensors; SNR = 3 dB; 500 "snapshots"; two coherent narrow-band sources from 85°, 130°; one incoherent source from 70°). Axes: dB versus direction-of-arrival.

Fig. 4. New method (six sensors; subarray size = 4; SNR = 3 dB; 500 "snapshots"; two coherent narrow-band sources from 85°, 130°; one incoherent source from 70°). Axes: dB versus direction-of-arrival.
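The rank-restoring effect that lets the coherent pair be resolved in Fig. 4 can be illustrated numerically. The following is a minimal sketch, not the authors' simulation. It assumes two fully coherent unit-power sources and a sensor-to-sensor phase shift of 2πd cos θ, with θ measured from the array axis (a convention chosen here only for illustration); the function names are hypothetical. It forms the modified source covariance as the average of $D^{(k-1)} S (D^{(k-1)})^{\dagger}$ over the subarrays and evaluates its determinant with and without smoothing.

```python
import cmath
import math


def smoothed_source_covariance(angles_deg, spacing_wavelengths, num_subarrays):
    """Modified covariance of fully coherent unit-power sources.

    Assumption (not from the text): the phase shift between adjacent
    sensors is 2*pi*d*cos(theta), with theta measured from the array axis.
    """
    # Diagonal entries of D: one phase factor per source.
    d = [cmath.exp(-2j * math.pi * spacing_wavelengths * math.cos(math.radians(a)))
         for a in angles_deg]
    q = len(d)
    # Coherent sources: S = c c^H with c = (1, ..., 1), so every entry of S is 1.
    s_bar = [[0j] * q for _ in range(q)]
    for k in range(num_subarrays):
        for i in range(q):
            for j in range(q):
                # Entry (i, j) of D^k S (D^k)^H is d_i^k * conj(d_j)^k * S_ij.
                s_bar[i][j] += (d[i] ** k) * (d[j].conjugate() ** k) / num_subarrays
    return s_bar


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Two coherent sources at 85 and 130 degrees, sensors a third wavelength apart.
unsmoothed = smoothed_source_covariance([85.0, 130.0], 1.0 / 3.0, num_subarrays=1)
smoothed = smoothed_source_covariance([85.0, 130.0], 1.0 / 3.0, num_subarrays=3)

print(abs(det2(unsmoothed)))  # zero: the coherent pair gives a rank-one matrix
print(abs(det2(smoothed)))    # clearly nonzero: averaging restores full rank
```

With a single subarray the matrix is singular, which is why the eigenstructure method alone cannot separate the coherent pair; averaging over three subarrays makes it nonsingular under the stated assumptions.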

V. CONCLUDING REMARKS

A spatial smoothing scheme, introduced by Evans et al. [5] to circumvent problems encountered in the estimation of the directions-of-arrival of coherent signals, was more completely analyzed. Our emphasis was on the use of the spatial smoothing scheme in conjunction with the eigenstructure technique. However, as pointed out by Evans et al., this scheme can also be applied in conjunction with other processing techniques such as the minimum variance technique of Capon [3]. It is also interesting to note, as again pointed out by Evans et al., that the linear prediction technique of Clayton and Nuttall [10], when used with a low-order predictor in the spatial domain, essentially performs the spatial smoothing implicitly. In fact, it is the improved performance observed for this method that apparently motivated Evans et al. to investigate the spatial smoothing scheme.

The extension of the spatial smoothing scheme to more difficult scenarios arising in array processing, e.g., to narrow-band signals with unknown center frequency and to wide-band signals, follows straightforwardly from Wax et al. [18]. A modification of the idea for adaptive beamforming in communication applications is described by Shan and Kailath [16].

ACKNOWLEDGMENT

The authors wish to thank the referees for their helpful comments.

REFERENCES

[1] G. Bienvenu and L. Kopp, "Adaptivity to background noise spatial coherence for high resolution passive methods," in Proc. IEEE ICASSP '80, Denver, CO, pp. 307-310.
[2] J. P. Burg, "Maximum entropy spectral analysis," Ph.D. dissertation, Stanford Univ., Stanford, CA, 1975.
[3] J. Capon, "High resolution frequency-wavenumber spectral analysis," Proc. IEEE, vol. 57, pp. 1408-1418, Aug. 1969.
[4] J. E. Evans, J. R. Johnson, and D. F. Sun, "High resolution angular spectrum estimation techniques for terrain scattering analysis and angle of arrival estimation," in Proc. 1st ASSP Workshop Spectral Estimation, Hamilton, Ont., Canada, 1981, pp. 134-139.
[5] -, "Application of advanced signal processing techniques to angle of arrival estimation in ATC navigation and surveillance system," M.I.T. Lincoln Lab., Lexington, MA, Rep. 582, 1982.
[6] W. F. Gabriel, "Spectral analysis and adaptive array superresolution techniques," Proc. IEEE, vol. 68, pp. 654-666, 1980.
[7] -, "Adaptive superresolution of coherent RF spatial sources," in Proc. 1st ASSP Workshop Spectral Estimation, Hamilton, Ont., Canada, 1981, pp. 134-139.
[8] D. H. Johnson and S. R. Degraff, "Improving the resolution of bearing in passive sonar arrays by eigenvalue analysis," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-30, pp. 638-647, 1982.
[9] W. S. Ligget, "Passive sonar: Fitting models to multiple time series," in Signal Processing, J. W. Griffith et al., Eds. New York: Academic, 1973, pp. 327-345.
[10] A. H. Nuttall, "Spectral analysis of a univariate process with bad data points, via maximum entropy and linear prediction techniques," Naval Underwater Syst. Cen., New London, CT, Tech. Rep. 5303, 1976.
[11] N. L. Owsley, "Spectral signal set extraction," in Aspects of Signal Processing, Part II, G. Tacconi, Ed. Dordrecht, The Netherlands: Reidel, 1977, pp. 469-475.
[12] V. F. Pisarenko, "The retrieval of harmonics from a covariance function," Geophys. J. Roy. Astron. Soc., vol. 33, pp. 247-266, 1973.
[13] S. S. Reddi, "Multiple source location - A digital approach," IEEE Trans. Aerosp. Electron. Syst., vol. AES-15, pp. 95-105, 1979.
[14] R. O. Schmidt, "Multiple emitter location and signal parameter estimation," in Proc. RADC Spectral Est. Workshop, 1979, pp. 243-258.
[15] -, "A signal subspace approach to multiple source location and spectral estimation," Ph.D. dissertation, Stanford Univ., Stanford, CA, 1981.
[16] T. J. Shan and T. Kailath, "Adaptive beamforming for coherent signals and interference," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, June 1985.
[17] T. J. Ulrych and R. W. Clayton, "Time series and maximum entropy," Phys. Earth Planet. Inter., vol. 12, pp. 188-200, 1976.
[18] M. Wax, T-J. Shan, and T. Kailath, "Spatio-temporal spectral analysis by eigenstructure methods," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-32, pp. 817-827, 1984.
[19] B. Widrow, K. M. Duvall, R. P. Gooch, and W. C. Newman, "Signal cancellation phenomena in adaptive antennas: Causes and cures," IEEE Trans. Antennas Propagat., vol. AP-30, pp. 469-478, 1982.


Detection of Signals by Information Theoretic Criteria

MATI WAX AND THOMAS KAILATH, FELLOW, IEEE

Abstract-A new approach is presented to the problem of detecting the number of signals in a multichannel time-series, based on the application of the information theoretic criteria for model selection introduced by Akaike (AIC) and by Schwartz and Rissanen (MDL). Unlike the conventional hypothesis testing based approach, the new approach does not require any subjective threshold settings; the number of signals is obtained merely by minimizing the AIC or the MDL criteria. Simulation results that illustrate the performance of the new method for the detection of the number of signals received by a sensor array are presented.

Manuscript received May 19, 1983; revised June 1, 1984. This work was supported in part by the Air Force Office of Scientific Research, Air Force Systems Command, under Contract AF 49-620-79-C-0058, the U.S. Army Research Office, under Contract DAAG29-79-C-0215, and by the Joint Services Program at Stanford University under Contract DAAG29-81-K-0057. The authors are with the Information Systems Laboratory, Stanford University, Stanford, CA 94305.

Reprinted from IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. ASSP-33, No. 2, pp. 387-392, April 1985.

I. INTRODUCTION

In many problems in signal processing, the vector of observations can be modeled as a superposition of a finite number of signals embedded in an additive noise. This is the case, for example, in sensor array processing, in harmonic retrieval, in retrieving the poles of a system from the natural response, and in retrieving overlapping echoes from radar backscatter. A key issue in these problems is the detection of the number of signals.

One approach to this problem is based on the observation that the number of signals can be determined from the eigen...

... are discussed in Sections IV and V, respectively. Simulation results that illustrate the performance of the new method for sensor array processing are described in Section VI. Frequency domain extensions and some concluding remarks are presented in Sections VII and VIII, respectively.

II. FORMULATION OF THE PROBLEM

The observation vector in certain important problems in signal processing such as sensor array processing [15], [20], [5], [6], [10], [12], [22], [25], harmonic retrieval [15], [12], pole retrieval from the natural response [11], [24], and retrieval of overlapping echoes from radar backscatter [7], denoted by the p × 1 vector x(t), is successfully described by the following model

$$x(t) = \sum_{i=1}^{q} A(\Phi_i)\, s_i(t) + n(t) \tag{1}$$

where

$s_i(\cdot)$ = scalar complex waveform referred to as the ith signal
$A(\Phi_i)$ = a p × 1 complex vector, parameterized by an unknown parameter vector $\Phi_i$ associated with the ith signal
$n(\cdot)$ = a p × 1 complex vector referred to as the additive noise.

In matrix form, (1) can be written as

$$x(t) = A\, s(t) + n(t) \tag{2a}$$

where A is the p × q matrix

$$A = [A(\Phi_1) \cdots A(\Phi_q)] \tag{2b}$$

and s(t) is the q × 1 vector

$$s^T(t) = [s_1(t) \cdots s_q(t)]. \tag{2c}$$

Because the noise is zero mean and independent of the signals, it follows that the covariance matrix of $x(\cdot)$ is given by

$$R = \Psi + \sigma^2 I \tag{3a}$$

where

$$\Psi = A S A^{\dagger} \tag{3b}$$

with $\dagger$ denoting the conjugate transpose, $\sigma^2 I$ denoting the covariance matrix of the noise, and S denoting the covariance matrix of the signals, i.e., $S = E[s(\cdot)\, s(\cdot)^{\dagger}]$.

Assuming that the matrix A is of full column rank, i.e., the vectors $A(\Phi_i)$ (i = 1, ..., q) are linearly independent, and that the covariance matrix of the signals S is nonsingular, it follows that the rank of $\Psi$ is q, or equivalently, the p - q smallest eigenvalues of $\Psi$ are equal to zero. Denoting the eigenvalues of R by $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p$, it follows, therefore, that the smallest p - q eigenvalues of R are all equal to $\sigma^2$, i.e.,

$$\lambda_{q+1} = \lambda_{q+2} = \cdots = \lambda_p = \sigma^2. \tag{4}$$

The number of signals q can hence be determined from the multiplicity of the smallest eigenvalue of R. The problem is that the covariance matrix R is unknown in practice. When estimated from a finite sample size, the resulting eigenvalues are all different with probability one, thus making it difficult to determine the number of signals merely by "observing" the eigenvalues. A more sophisticated approach to the problem, developed by Bartlett [4] and Lawley [14], is based on a sequence of hypothesis tests. The problem associated with this approach is the subjective judgment needed in the selection of the threshold levels for the different tests.

In this paper we take a different approach. We pose the detection problem as a model selection problem and then apply the information theoretic criteria for model selection introduced by Akaike (AIC) and by Schwartz and Rissanen (MDL).

III. INFORMATION THEORETIC CRITERIA

The information theoretic criteria for model selection, introduced by Akaike [1], [2], Schwartz [21], and Rissanen [17], address the following general problem. Given a set of N observations X = {x(1), ..., x(N)} and a family of models, that is, a parameterized family of probability densities $f(X \mid \Theta)$, select the model that best fits the data.

Akaike's proposal was to select the model which gives the minimum AIC, defined by

$$\mathrm{AIC} = -2 \log f(X \mid \hat{\Theta}) + 2k \tag{6}$$

where $\hat{\Theta}$ is the maximum likelihood estimate of the parameter vector $\Theta$, and k is the number of free adjusted parameters in $\Theta$. The first term is the well-known log-likelihood of the maximum likelihood estimator of the parameters of the model. The second term is a bias correction term, inserted so as to make the AIC an unbiased estimate of the mean Kullback-Leibler distance between the modeled density $f(X \mid \Theta)$ and the estimated density $f(X \mid \hat{\Theta})$.

Inspired by Akaike's pioneering work, Schwartz and Rissanen approached the problem from quite different points of view. Schwartz's approach is based on Bayesian arguments. He assumed that each competing model can be assigned a prior probability, and proposed to select the model that yields the maximum posterior probability. Rissanen's approach is based on information theoretic arguments. Since each model can be used to encode the observed data, Rissanen proposed to select the model that yields the minimum code length. It turns out that in the large-sample limit, both Schwartz's and Rissanen's approaches yield the same criterion, given by

$$\mathrm{MDL} = -\log f(X \mid \hat{\Theta}) + \tfrac{1}{2} k \log N. \tag{7}$$

Note that apart from a factor of 2, the first term is identical to the corresponding one in the AIC, while the second term has an extra factor of $\tfrac{1}{2}\log N$.

IV. ESTIMATING THE NUMBER OF SIGNALS

To apply the information theoretic criteria to detect the number of signals, or equivalently, to determine the rank of the matrix $\Psi$, we must first describe the family of competing models, or density functions, that we are considering. Regarding the observations $x(t_1), \ldots, x(t_N)$ as identical and statistically independent complex Gaussian random vectors of zero mean, the family of models is necessarily described by the covariance matrix of $x(\cdot)$. Since our model's covariance matrix is given by (3), it seems natural to consider the following family of covariance matrices

$$R^{(k)} = \Psi^{(k)} + \sigma^2 I \tag{8}$$

where $\Psi^{(k)}$ denotes a semipositive matrix of rank k, and $\sigma$ denotes an unknown scalar. Note that k ∈ {0, 1, ..., p - 1} ranges over the set of all possible numbers of signals.

Using the well-known spectral representation theorem from linear algebra, we can express $R^{(k)}$ as

$$R^{(k)} = \sum_{i=1}^{k} (\lambda_i - \sigma^2)\, V_i V_i^{\dagger} + \sigma^2 I \tag{9}$$

where $\lambda_1, \ldots, \lambda_k$ and $V_1, \ldots, V_k$ are the eigenvalues and eigenvectors, respectively, of $R^{(k)}$. Denoting by $\Theta^{(k)}$ the parameter vector of the model, it follows that

$$\Theta^{(k)T} = (\lambda_1, \ldots, \lambda_k, \sigma^2, V_1^T, \ldots, V_k^T). \tag{10}$$

With this parameterization we now proceed to the derivation of the information theoretic criteria for the detection problem. Since the observations are regarded as statistically independent complex Gaussian random vectors with zero mean, their joint probability density is given by

$$f(x(t_1), \ldots, x(t_N) \mid \Theta^{(k)}) = \prod_{i=1}^{N} \frac{1}{\pi^p \det R^{(k)}} \exp\left\{-x(t_i)^{\dagger} [R^{(k)}]^{-1} x(t_i)\right\}. \tag{11}$$

Taking the logarithm and omitting terms that do not depend on the parameter vector $\Theta^{(k)}$, we find that the log-likelihood function $L(\Theta^{(k)})$ is given by

$$L(\Theta^{(k)}) = -N \log \det R^{(k)} - N\, \mathrm{tr}\left\{[R^{(k)}]^{-1} \hat{R}\right\} \tag{12a}$$

where $\hat{R}$ is the sample-covariance matrix defined by

$$\hat{R} = \frac{1}{N} \sum_{i=1}^{N} x(t_i)\, x(t_i)^{\dagger}. \tag{12b}$$

The maximum-likelihood estimate is the value of $\Theta^{(k)}$ that maximizes (12). Following Anderson [3], these estimates are given by

$$\hat{\lambda}_i = l_i, \qquad i = 1, \ldots, k \tag{13a}$$

$$\hat{\sigma}^2 = \frac{1}{p-k} \sum_{i=k+1}^{p} l_i \tag{13b}$$

$$\hat{V}_i = C_i, \qquad i = 1, \ldots, k \tag{13c}$$

where $l_1 > l_2 > \cdots > l_p$ and $C_1, \ldots, C_p$ are the eigenvalues and eigenvectors, respectively, of the sample covariance matrix $\hat{R}$. Substituting the maximum likelihood estimates (13) in the log-likelihood (12), with some straightforward manipulations, we obtain

$$L(\hat{\Theta}^{(k)}) = \log \left( \frac{\prod_{i=k+1}^{p} l_i^{1/(p-k)}}{\frac{1}{p-k} \sum_{i=k+1}^{p} l_i} \right)^{(p-k)N}. \tag{14}$$

Note that the term in the bracket is the ratio of the geometric mean to the arithmetic mean of the smallest p - k eigenvalues.

The number of free parameters in $\Theta^{(k)}$ is obtained by counting the number of degrees of freedom of the space spanned by $\Theta^{(k)}$. Recalling that the eigenvalues of a complex covariance matrix are real but that the eigenvectors are complex, it follows that $\Theta^{(k)}$ has k + 1 + 2pk parameters. However, not all of the parameters are independently adjusted: the eigenvectors are constrained to have unit norm and to be mutually orthogonal. This amounts to a reduction of 2k degrees of freedom due to the normalization and $2 \cdot \tfrac{1}{2} k(k-1)$ degrees of freedom due to the mutual orthogonalization. Thus, we obtain

$$\text{(number of free adjusted parameters)} = k + 1 + 2pk - \left[2k + 2 \cdot \tfrac{1}{2} k(k-1)\right] = k(2p-k) + 1. \tag{15}$$

The form of AIC for this problem is therefore given by

$$\mathrm{AIC}(k) = -2 \log \left( \frac{\prod_{i=k+1}^{p} l_i^{1/(p-k)}}{\frac{1}{p-k} \sum_{i=k+1}^{p} l_i} \right)^{(p-k)N} + 2k(2p-k) \tag{16}$$

while the MDL criterion is given by

$$\mathrm{MDL}(k) = -\log \left( \frac{\prod_{i=k+1}^{p} l_i^{1/(p-k)}}{\frac{1}{p-k} \sum_{i=k+1}^{p} l_i} \right)^{(p-k)N} + \tfrac{1}{2} k(2p-k) \log N. \tag{17}$$

The number of signals $\hat{q}$ is determined as the value of k ∈ {0, 1, ..., p - 1} for which either the AIC or the MDL is minimized.

V. CONSISTENCY OF THE CRITERIA

We have described two different criteria for estimating the number of signals. The natural question is which one should be preferred. What can be said about the goodness of the estimates obtained by these criteria? One possible benchmark test is the behavior as the sample size increases. One would prefer an estimator that yields the true number of signals with probability one as the sample size increases to infinity. An estimator with this property is said to be consistent. By generalizing a method of proof given in Rissanen [18] and Hannan and Quinn [9], we shall show that the MDL yields a consistent estimate, while the AIC yields an inconsistent estimate that tends, asymptotically, to overestimate the number of signals.

The consistency of the MDL is proved by showing that in the large-sample limit, MDL(k) is minimized for k = q. Taking first k < q, it follows from (17) that

$$\frac{1}{N}\left[\mathrm{MDL}(q) - \mathrm{MDL}(k)\right] = \log \left( \frac{\prod_{i=k+1}^{q} l_i^{1/(q-k)}}{\frac{1}{q-k} \sum_{i=k+1}^{q} l_i} \right)^{q-k} + \log \frac{\left( \frac{1}{p-q} \sum_{i=q+1}^{p} l_i \right)^{p-q} \left( \frac{1}{q-k} \sum_{i=k+1}^{q} l_i \right)^{q-k}}{\left( \frac{1}{p-k} \sum_{i=k+1}^{p} l_i \right)^{p-k}} + (q-k)(2p-q-k) \frac{\log N}{2N}. \tag{18}$$

Since the eigenvalues of the sample-covariance matrix $l_i$ (i = 1, ..., q) are consistent estimates of the eigenvalues of the true covariance matrix $\lambda_i$, it follows that in the large-sample limit the eigenvalues $l_i$ (i = k + 1, ..., q) are not all equal with probability one. Therefore, by the arithmetic-mean geometric-mean inequality it follows that in the large-sample limit

$$\frac{1}{q-k} \sum_{i=k+1}^{q} l_i > \prod_{i=k+1}^{q} l_i^{1/(q-k)}. \tag{19}$$

This implies that the first term in (18) is negative with probability one in the large-sample limit. Similarly, by the generalized arithmetic-mean geometric-mean inequality, which for positive $a_1 \ne a_2$ and positive weights $w_1 + w_2 = 1$ reads

$$w_1 a_1 + w_2 a_2 > a_1^{w_1} a_2^{w_2}, \tag{20}$$

it follows that in the large-sample limit

$$\frac{1}{p-k} \sum_{i=k+1}^{p} l_i > \left( \frac{1}{p-q} \sum_{i=q+1}^{p} l_i \right)^{(p-q)/(p-k)} \left( \frac{1}{q-k} \sum_{i=k+1}^{q} l_i \right)^{(q-k)/(p-k)}. \tag{21}$$

This implies that the second term in (18) is also negative with probability one in the large-sample limit. Now, since the last term in (18) goes to zero as the sample size increases, it follows that the difference [MDL(q) - MDL(k)] is negative with probability one in the large-sample limit for k < q.

Taking now k > q, it follows from (17) that

$$2\left[\mathrm{MDL}(k) - \mathrm{MDL}(q)\right] = (k-q)(2p-k-q) \log N - \left\{ -2 \log \left( \frac{\prod_{i=q+1}^{p} l_i^{1/(p-q)}}{\frac{1}{p-q} \sum_{i=q+1}^{p} l_i} \right)^{(p-q)N} + 2 \log \left( \frac{\prod_{i=k+1}^{p} l_i^{1/(p-k)}}{\frac{1}{p-k} \sum_{i=k+1}^{p} l_i} \right)^{(p-k)N} \right\}. \tag{22}$$

Note that the terms in the curly bracket are twice the log-likelihoods of the maximum likelihood estimator under the hypotheses that the rank of $\Psi$ is q and k, respectively. Thus, their difference is the likelihood-ratio for deciding between these two hypotheses. From the general theory of likelihood ratios (see, e.g., Cox and Hinkley [8]) it follows that the asymptotic distribution of this statistic is $\chi^2$ with number of degrees of freedom equal to the difference of the dimensions of the parameter spaces under the two hypotheses, i.e.,

$$[k(2p-k) + 1] - [q(2p-q) + 1] = (k-q)(2p-k-q).$$

Thus, as the sample size increases, the probability that the term in the curly bracket in (22) exceeds the first term in (22) is given by the area in the tail from $(k-q)(2p-k-q) \log N$ of the mentioned $\chi^2$ distribution with $(k-q)(2p-k-q)$ degrees of freedom. Since the area in this tail approaches zero as the sample size increases, it follows that in the large-sample limit the difference [MDL(k) - MDL(q)] is positive with probability one for k > q. Combining this with the previous result for k < q, it follows that MDL(k) has a minimum at k = q.

Repeating the above arguments for the AIC, it follows that in the large-sample limit and for k < q, the difference [AIC(q) - AIC(k)] is negative with probability one. However, for k > q, the difference [AIC(k) - AIC(q)] has nonzero probability to be negative even in the large-sample limit, since the tail from $(k-q)(2p-k-q+1)$ of the $\chi^2$ distribution with $(k-q)(2p-k-q+1)$ degrees of freedom is definitely not zero. Hence, the AIC tends to overestimate the number of signals q in the large-sample limit.

We should note that from the analysis above it follows that any criterion of the form

$$-\log f(X \mid \hat{\Theta}) + \alpha(N)\, k \tag{23}$$

where $\alpha(N) \to \infty$ and $\alpha(N)/N \to 0$ yields a consistent estimate of the number of signals.

VI. SIMULATION RESULTS

In this section we present simulation results that illustrate the performance of our method when applied to sensor array processing. By the well-known duality between spatial frequency and temporal frequency, these examples can also be interpreted in the context of harmonic retrieval.

The examples refer to a uniform linear array of p sensors with q incoherent sinusoidal plane waves impinging from directions $\{\phi_1, \ldots, \phi_q\}$. Assuming that the spacing between the sensors is equal to half the wavelength of the impinging wavefronts, the vector of the received signal at the array is then given by

$$x(t) = \sum_{k=1}^{q} A(\phi_k)\, e^{-j\eta(t)} + n(t) \tag{24a}$$

where $A(\phi_k)$ is the p × 1 "direction vector" of the kth wavefront

$$A(\phi_k)^T = [1 \;\; e^{-j\tau_k} \;\; \cdots \;\; e^{-j(p-1)\tau_k}] \tag{24b}$$

with

$$\tau_k = \pi \sin \phi_k \tag{24c}$$

and

$\eta(\cdot)$ = random phase uniformly distributed on $(0, 2\pi)$
$n(\cdot)$ = vector of white noise with mean zero and covariance $\sigma^2 I$.

Note that this model is a special case of the general model presented in (1).

In the first example, we considered an array with seven sensors (p = 7) and two sources (q = 2) with directions-of-arrival 20° and 25°. The signal-to-noise ratio, defined as $10 \log (1/2\sigma^2)$, was 10 dB. Using N = 100 samples, the resulting eigenvalues of the sample-covariance matrix were 21.2359, 2.1717, 1.4279, 1.0979, 1.0544, 0.9432, and 0.7324. Observing the gradual decrease of the eigenvalues, it is clear that the separation of the 5 "smallest" eigenvalues from the 2 "large" ones is a difficult task in which a naive approach is likely to fail. However, applying the new approach we have presented above yielded the following values for the AIC and MDL.

k        0       1      2      3      4      5      6
AIC   1180.8   100.5   71.4   75.5   86.8   93.2   96.0
MDL    590.4    67.2   66.9   80.7   95.5  105.2  110.5

The minimum of both the AIC and the MDL is obtained, as expected, for q = 2.

In the second example, we added another source at 10° to the scenario described in the first example. The eigenvalues of the sample-covariance in this case were 15.5891, 8.1892, 1.4715, 1.1602, 1.0172, 0.9210, and 0.6528. The resulting values of the AIC and the MDL were as follows.

k        0       1      2      3      4      5      6
AIC   1011.4   562.1   82.9   83.2   90.4   95.9   96.0
MDL    505.7   298.0   72.7   84.6   97.3  106.5  110.5

In this case, the minimum value of both the AIC and the MDL is obtained, incorrectly, for q = 2. The failure of the AIC and MDL in this case is because of the inherent limitations of the problem. Indeed, repeating the example with signal-to-noise ratio of 10 dB yielded the following eigenvalues 14.5595, 6.7786, 0.3786, 0.1372, 0.1109, 0.0946, and 0.0687, and consequently the following values of the AIC and the MDL.

k        0       1       2      3      4      5      6
AIC   2731.6  1960.6   241.4   90.7   91.6   95.1   96.0
MDL   1365.8   997.2   152.0   88.4   97.9  106.2  110.5

Now the minimum of both the AIC and the MDL is obtained, correctly, for q = 3.

VII. EXTENSION TO THE FREQUENCY DOMAIN

The starting point of our approach was the time domain relation (1). However, in certain cases, especially in array processing, the frequency domain is more natural. As we shall show, our approach is easily extended to handle frequency-domain observations.

Consider the frequency domain analog of the time domain model (1); the observed vector is a Fourier coefficients vector expressed by

$$x(\omega_n) = \sum_{i=1}^{q} A(\omega_n, \Phi_i)\, s_i(\omega_n) + n(\omega_n) \tag{25}$$

where

$s_i(\omega_n)$ = the Fourier coefficient of the ith signal at frequency $\omega_n$
$A(\omega_n, \Phi_i)$ = a p × 1 complex vector, parameterized by an unknown parameter vector $\Phi_i$ associated with the ith signal
$n(\omega_n)$ = a p × 1 complex vector of the Fourier coefficients of the additive noise.

The problem, as in the time domain, is to estimate the number of signals q from L samples of the Fourier-coefficients vector $x_1(\omega_n), \ldots, x_L(\omega_n)$.

By the well-known analogy between multivariate analysis and time-series analysis (see, e.g., Wahba [23]), the time-domain approach carries over to the frequency domain with the role of the sample-covariance matrix played by the periodogram estimate of the spectral density matrix, given by

$$\hat{K}(\omega_n) = \frac{1}{L} \sum_{i=1}^{L} x_i(\omega_n)\, x_i^{\dagger}(\omega_n). \tag{26}$$

Thus, the frequency-domain version of the AIC is given by

$$\mathrm{AIC}(\omega_n, k) = -2 \log \left( \frac{\prod_{i=k+1}^{p} l_i(\omega_n)^{1/(p-k)}}{\frac{1}{p-k} \sum_{i=k+1}^{p} l_i(\omega_n)} \right)^{(p-k)L} + 2[k(2p-k)] \tag{27}$$

where $l_1(\omega_n) \ge \cdots \ge l_p(\omega_n)$ are the eigenvalues of the periodogram estimate of the spectral-density matrix $\hat{K}(\omega_n)$.

When the signals are wide band, namely, occupy several frequency bins, say, $\omega_l, \ldots, \omega_{l+M}$, the information on the number of signals is contained in all the M frequency bins. Assuming that the observation time is much larger than the correlation times of the signals, it follows that the Fourier coefficients corresponding to different frequencies are statistically independent (see, e.g., Whalen [27, p. 81]). The AIC for detecting the number of wide-band signals that occupy the frequency band $\omega_l, \ldots, \omega_{l+M}$ is given by the sum of (27) over the frequency range of interest

$$\mathrm{AIC}(k) = -2 \sum_{n=l}^{l+M} \log \left( \frac{\prod_{i=k+1}^{p} l_i(\omega_n)^{1/(p-k)}}{\frac{1}{p-k} \sum_{i=k+1}^{p} l_i(\omega_n)} \right)^{(p-k)L} + 2M[k(2p-k)]. \tag{28}$$

The corresponding expression for the MDL criterion can be similarly derived.

VIII. CONCLUDING REMARKS

A new approach to the detection of the number of signals in a multichannel time series has been presented. The approach is based on the application of the AIC and MDL information theoretic criteria for model selection. Unlike the conventional hypothesis test based approach, the new approach does not require any subjective threshold settings; the number of signals is determined merely by minimizing either the AIC or the MDL criterion.

It has been shown that the MDL criterion yields a consistent estimate of the number of signals, while the AIC yields an inconsistent estimate that tends, asymptotically, to overestimate the number of signals.

The detection problem addressed in this paper is usually a part of a more complex combined detection-estimation problem where one wants to estimate the number as well as the parameters $\Phi_1, \ldots, \Phi_q$ of the signals in (1). Because of the computational complexity involved, the problem is usually solved in two steps: first the number of signals is detected, and then with an estimate of the number of signals $\hat{q}$ at hand, the parameters of the signals are estimated. It should be pointed out, however, that in principle the AIC and MDL criteria can be applied to the combined detection-estimation problem [26]. The evaluation of the AIC and MDL in this case involves the computation of the maximum likelihood estimates $\hat{\Phi}_1, \ldots, \hat{\Phi}_k$ for every possible number of signals k ∈ {0, 1, ..., p - 1}. Solving this highly nonlinear problem for all values of k is computationally very expensive. Nevertheless, if one is ready to pay the price, the gain in terms of performance, especially in difficult situations such as low signal-to-noise ratio, small sample size, or closely "spaced" signals, may be significant.
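The time-domain criteria (16) and (17) are inexpensive to evaluate once the sample eigenvalues are available. The following is a minimal sketch in Python, assuming the eigenvalues have already been computed by some eigen-decomposition routine; the function name `aic_mdl` is hypothetical and natural logarithms are used. It is applied to the eigenvalues quoted for the first example of Section VI (p = 7, N = 100).

```python
import math


def aic_mdl(eigenvalues, num_samples):
    """Return (aic, mdl), two lists indexed by the candidate number of signals k.

    eigenvalues: sample-covariance eigenvalues l_1, ..., l_p (any order).
    num_samples: number of observations N.
    """
    l = sorted(eigenvalues, reverse=True)
    p = len(l)
    aic, mdl = [], []
    for k in range(p):  # k = 0, 1, ..., p - 1
        tail = l[k:]  # the p - k smallest eigenvalues
        mean_log = sum(math.log(v) for v in tail) / (p - k)  # log of geometric mean
        log_mean = math.log(sum(tail) / (p - k))             # log of arithmetic mean
        # Log-likelihood (14): (p - k) N log(geometric mean / arithmetic mean).
        log_lik = (p - k) * num_samples * (mean_log - log_mean)
        aic.append(-2.0 * log_lik + 2.0 * k * (2 * p - k))
        mdl.append(-log_lik + 0.5 * k * (2 * p - k) * math.log(num_samples))
    return aic, mdl


# Eigenvalues quoted for the first simulation example (p = 7, N = 100).
eigs = [21.2359, 2.1717, 1.4279, 1.0979, 1.0544, 0.9432, 0.7324]
aic, mdl = aic_mdl(eigs, 100)
k_aic = min(range(len(aic)), key=aic.__getitem__)
k_mdl = min(range(len(mdl)), key=mdl.__getitem__)
print(k_aic, k_mdl)  # prints "2 2"
```

Both criteria select k = 2 for these eigenvalues, in agreement with the first example. The wide-band form (28) would follow by summing the log-likelihood term over the frequency bins before adding the penalty.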

REFERENCES

[1] H. Akaike, "Information theory and an extension of the maximum likelihood principle," in Proc. 2nd Int. Symp. Inform. Theory, suppl. Problems of Control and Inform. Theory, 1973, pp. 267-281.
[2] -, "A new look at the statistical model identification," IEEE Trans. Automat. Contr., vol. AC-19, pp. 716-723, 1974.
[3] T. W. Anderson, "Asymptotic theory for principal component analysis," Ann. J. Math. Stat., vol. 34, pp. 122-148, 1963.
[4] M. S. Bartlett, "A note on the multiplying factors for various χ² approximations," J. Roy. Stat. Soc., ser. B, vol. 16, pp. 296-298, 1954.
[5] G. Bienvenu and L. Kopp, "Adaptivity to background noise spatial coherence for high resolution passive methods," in Proc. ICASSP'80, Denver, CO, 1980, pp. 307-310.
[6] T. P. Bronez and J. A. Cadzow, "An algebraic approach to super resolution array processing," IEEE Trans. Aerosp. Electron. Syst., vol. AES-19, pp. 123-133, 1983.
[7] A. Bruckstein, T. J. Shan, and T. Kailath, "The resolution of overlapping echoes," IEEE Trans. Acoust., Speech, Signal Processing, submitted for publication.
[8] D. R. Cox and D. V. Hinkley, Theoretical Statistics. London, England: Chapman and Hall, 1974.
[9] E. J. Hannan and B. G. Quinn, "The determination of the order of an autoregression," J. Roy. Stat. Soc., ser. B, vol. 41, pp. 190-195, 1979.
[10] D. H. Johnson and S. R. Degraff, "Improving the resolution of bearing in passive sonar arrays by eigenvalue analysis," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-30, pp. 638-647, 1982.
[11] R. Kumaresan and D. W. Tufts, "Estimating the parameters of exponentially damped sinusoids and pole-zero modeling in noise," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-30, pp. 833-840, 1983.
[12] -, "Estimation of arrival of multiple plane waves," IEEE Trans. Aerosp. Electron. Syst., vol. AES-19, pp. 123-133, 1983.
[13] S. Y. Kung and Y. H. Hu, "Improved Pisarenko's sinusoidal spectrum estimate via the SVD subspace approximation method," in Proc. 21st CDC, Orlando, FL, 1982, pp. 1312-1314.
[14] D. N. Lawley, "Tests of significance of the latent roots of the covariance and correlation matrices," Biometrika, vol. 43, pp. 128-136, 1956.
[15] N. L. Owsley, "Spectral signal set extraction," Aspects of Signal Processing: Part II, G. Tacconi, Ed., Dordrecht, Holland: Reidel, pp. 469-475, 1977.
[16] V. F. Pisarenko, "The retrieval of harmonics from a covariance function," Geophys. J. Roy. Astron. Soc., vol. 33, pp. 247-266, 1973.
[17] J. Rissanen, "Modeling by shortest data description," Automatica, vol. 14, pp. 465-471, 1978.
[18] -, "Consistent order estimation of autoregressive processes by shortest description of data," Analysis and Optimization of Stochastic Systems, Jacobs et al., Eds. New York: Academic, 1980.
[19] -, "A universal prior for the integers and estimation by minimum description length," Ann. Stat., vol. 11, pp. 417-431, 1983.
[20] R. O. Schmidt, "Multiple emitter location and signal parameter estimation," in Proc. RADC Spectral Estimation Workshop, Rome, NY, 1979, pp. 243-258.
[21] G. Schwartz, "Estimating the dimension of a model," Ann. Stat., vol. 6, pp. 461-464, 1978.
[22] G. Su and M. Morf, "The signal subspace approach for multiple emitter location," in Proc. 16th Asilomar Conf. Circuits Syst. Comput., Pacific Grove, CA, 1982, pp. 336-340.
[23] G. Wahba, "On the distribution of some statistics useful in the analysis of jointly stationary time-series," Ann. Math. Stat., vol. 39, pp. 1849-1862, 1968.
[24] M. Wax, R. O. Schmidt, and T. Kailath, "Eigenstructure method for retrieving the poles from the natural response," in Proc. 22nd CDC, San Antonio, TX, 1983, pp. 1343-1344.
[25] M. Wax, T. J. Shan, and T. Kailath, "Spatio-temporal spectral analysis by eigenstructure methods," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-32, pp. 817-827, 1984.
[26] M. Wax, "Detection and estimation of superimposed signals," Ph.D. dissertation, Stanford Univ., Stanford, CA, Mar. 1985.
[27] A. D. Whalen, Detection of Signals in Noise. New York: Academic, 1971.


Multiple Emitter Location and Signal Parameter Estimation

RALPH O. SCHMIDT, MEMBER, IEEE

Reprinted from IEEE Transactions on Antennas and Propagation, Vol. AP-34, No. 3, pp. 276-280, March 1986.

Abstract-Processing the signals received on an array of sensors for the location of the emitter is of great enough interest to have been treated under many special case assumptions. The general problem considers sensors with arbitrary locations and arbitrary directional characteristics (gain/phase/polarization) in a noise/interference environment of arbitrary covariance matrix. This report is concerned first with the multiple emitter aspect of this problem and second with the generality of solution. A description is given of the multiple signal classification (MUSIC) algorithm, which provides asymptotically unbiased estimates of 1) number of incident wavefronts present; 2) directions of arrival (DOA) (or emitter locations); 3) strengths and cross correlations among the incident waveforms; 4) noise/interference strength. Examples and comparisons with methods based on maximum likelihood (ML) and maximum entropy (ME), as well as conventional beamforming are included. An example of its use as a multiple frequency estimator operating on time series is included.

INTRODUCTION

THE TERM MULTIPLE signal classification (MUSIC) is used to describe experimental and theoretical techniques involved in determining the parameters of multiple wavefronts arriving at an antenna array from measurements made on the signals received at the array elements. The general problem considers antennas with arbitrary locations and arbitrary directional characteristics (gain/phase/polarization) in a noise/interference environment of arbitrary covariance matrix. The multiple signal classification approach is described; it can be implemented as an algorithm to provide asymptotically unbiased estimates of

1) number of signals;
2) directions of arrival (DOA);
3) strengths and cross correlations among the directional waveforms;
4) polarizations;
5) strength of noise/interference.

These techniques are very general and of wide application. Special cases of MUSIC are

1) conventional interferometry;
2) monopulse direction finding (DF), i.e., using multiple colocated antennas;
3) multiple frequency estimation.

Manuscript received August 9, 1985. The author is with Saxpy Computer Corporation (formerly GuilTech Research Co.), 255 San Geronimo Way, Sunnyvale, CA 94086. Editor's Note-This paper was first published in the Proceedings of the RADC Spectrum Estimation Workshop, held in October 1979 at Griffiss Air Force Base, NY. The document was limited in circulation, but the author's "MUSIC" algorithm is a principal candidate in the field of spectral estimation and rather widely referenced in the literature. Therefore, it seemed appropriate to reprint it in this special issue. IEEE Log Number 8407014.

THE DATA MODEL

The waveforms received at the M array elements are linear combinations of the D incident wavefronts and noise. Thus, the multiple signal classification approach begins with the following model for characterizing the received M vector X as in

$$ \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_M \end{bmatrix} = \begin{bmatrix} a(\theta_1) & a(\theta_2) & \cdots & a(\theta_D) \end{bmatrix} \begin{bmatrix} F_1 \\ F_2 \\ \vdots \\ F_D \end{bmatrix} + \begin{bmatrix} W_1 \\ W_2 \\ \vdots \\ W_M \end{bmatrix} $$

or

$$ X = AF + W. \tag{1} $$

The incident signals are represented in amplitude and phase at some arbitrary reference point (for instance the origin of the coordinate system) by the complex quantities $F_1, F_2, \ldots, F_D$. The noise, whether "sensed" along with the signals or generated internal to the instrumentation, appears as the complex vector W. The elements of X and A are also complex in general. The $a_{ij}$ are known functions of the signal arrival angles and the array element locations. That is, $a_{ij}$ depends on the ith array element, its position relative to the origin of the coordinate system, and its response to a signal incident from the direction of the jth signal. The jth column of A is a "mode" vector $a(\theta_j)$ of responses to the direction of arrival $\theta_j$ of the jth signal. Knowing the mode vector $a(\theta_1)$ is tantamount to knowing $\theta_1$ (unless $a(\theta_1) = a(\theta_2)$ with $\theta_1 \neq \theta_2$, an unresolvable situation, a type I ambiguity).

In geometrical language, the measured X vector can be visualized as a vector in M dimensional space. The directional mode vectors $a(\theta_j) = a_{ij}$ for $i = 1, 2, \ldots, M$, i.e., the columns of A, can also be so visualized. Equation (1) states that X is a particular linear combination of the mode vectors; the elements of F are the coefficients of the combination. Note that the X vector is confined to the range space of A. That is, if A has two columns, the range space is no more than a two-dimensional subspace within the M space and X necessarily lies in the subspace. Also note that $a(\theta)$, the continuum of all possible mode vectors, lies within the M space but is quite nonlinear. For help in visualizing this, see Fig. 1. For example, in an azimuth-only direction finding system, $\theta$ will consist of a single parameter. In an azimuth/elevation/
range system, $\theta$ will be replaced by a vector of several parameters.

Fig. 1. Geometric portrayal for three-antenna case. [Labels: the signal subspace (determined by the data), spanned by $e_1$ and $e_2$; the noise space eigenvector $e_{\min}$; the $a(\theta)$ continuum of direction-of-arrival vectors (determined by array geometry and characteristics). $e_1$, $e_2$, $e_{\min}$ are the eigenvectors of S corresponding to eigenvalues $\lambda_1 > \lambda_2 > \lambda_{\min} > 0$; $a(\theta_1)$, $a(\theta_2)$ are the incident signal mode vectors.]

THE S MATRIX

The M x M covariance matrix of the X vector is

$$ S \triangleq \overline{XX^*} = A\,\overline{FF^*}\,A^* + \overline{WW^*} $$

or

$$ S = APA^* + \lambda S_0 \tag{2} $$

under the basic assumption that the incident signals and the noise are uncorrelated. Note that the incident waveforms represented by the elements of F may be uncorrelated (the D x D matrix $P \triangleq \overline{FF^*}$ is diagonal) or may contain completely correlated pairs (P is singular). In general, P will be "merely" positive definite, which reflects the arbitrary degrees of pair-wise correlations occurring between the incident waveforms.

When the number of incident wavefronts D is less than the number of array elements M, then $APA^*$ is singular; it has a rank less than M. Therefore

$$ |APA^*| = |S - \lambda S_0| = 0. \tag{3} $$

This equation is only satisfied with $\lambda$ equal to one of the eigenvalues of S in the metric of $S_0$. But, for A full rank and P positive definite, $APA^*$ must be nonnegative definite. Therefore $\lambda$ can only be the minimum eigenvalue $\lambda_{\min}$. Therefore, any measured $S = \overline{XX^*}$ matrix can be written

$$ S = APA^* + \lambda_{\min} S_0 \tag{4} $$

where $\lambda_{\min}$ is the smallest solution to $|S - \lambda S_0| = 0$. Note the special case wherein the elements of the noise vector W are mean zero, variance $\sigma^2$, in which case $\lambda_{\min} S_0 = \sigma^2 I$.

CALCULATING A SOLUTION

The rank of $APA^*$ is D and can be determined directly from the eigenvalues of S in the metric of $S_0$. That is, in the complete set of eigenvalues of S in the metric of $S_0$, $\lambda_{\min}$ will not always be simple. In fact, it occurs repeated N = M - D times. This is true because the eigenvalues of S and those of $S - \lambda_{\min} S_0 = APA^*$ differ by $\lambda_{\min}$ in all cases. Since the minimum eigenvalue of $APA^*$ is zero (being singular), $\lambda_{\min}$ must occur repeated N times. Therefore, the number of incident signals estimator is

$$ D = M - N \tag{5} $$

where N is the multiplicity of $\lambda_{\min}(S, S_0)$ and $\lambda_{\min}(S, S_0)$ is read "$\lambda_{\min}$ of S in the metric of $S_0$." (In practice, one can expect that the multiple $\lambda_{\min}$ will occur in a cluster rather than all precisely equal. The "spread" on this cluster decreases as more data is processed.)

THE SIGNAL AND NOISE SUBSPACES

The M eigenvectors of S in the metric of $S_0$ must satisfy $S e_i = \lambda_i S_0 e_i$, $i = 1, 2, \ldots, M$. Since $S = APA^* + \lambda_{\min} S_0$, we have $APA^* e_i = (\lambda_i - \lambda_{\min}) S_0 e_i$. Clearly, for each of the $\lambda_i$ that is equal to $\lambda_{\min}$ (there are N) we must have $APA^* e_i = 0$ or $A^* e_i = 0$. That is, the eigenvectors associated with $\lambda_{\min}(S, S_0)$ are orthogonal to the space spanned by the columns of A; the incident signal mode vectors! Thus we may justifiably refer to the N dimensional subspace spanned by the N noise eigenvectors as the noise subspace and the D dimensional subspace spanned by the incident signal mode vectors as the signal subspace; they are disjoint.

THE ALGORITHM

We now have the means to solve for the incident signal mode vectors. If $E_N$ is defined to be the M x N matrix whose columns are the N noise eigenvectors, and the ordinary Euclidean distance (squared) from a vector Y to the signal subspace is $d^2 = Y^* E_N E_N^* Y$, we can plot $1/d^2$ for points along the $a(\theta)$ continuum as a function of $\theta$. That is,

$$ P_{MU}(\theta) = \frac{1}{a^*(\theta) E_N E_N^* a(\theta)}. \tag{6} $$
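The eigenvalue count in (5) and the pseudo-spectrum in (6) can be tried numerically. The following is a minimal sketch, not the author's implementation: it assumes a uniform linear array with half-wavelength spacing (the paper allows arbitrary geometry), $S_0 = I$, and an asymptotically exact S. The helper names `steering`, `music`, and `pick_peaks`, and the threshold `noise_factor` used to find the cluster of minimum eigenvalues, are illustrative choices.

```python
import numpy as np

def steering(theta_deg, m, spacing=0.5):
    """Mode vector a(theta) for an ASSUMED uniform linear array (spacing in wavelengths)."""
    k = np.arange(m)
    return np.exp(2j * np.pi * spacing * k * np.sin(np.deg2rad(theta_deg)))

def music(S, grid_deg, noise_factor=10.0):
    """Return (D, P_MU on the grid) from covariance S, assuming S0 = I."""
    m = S.shape[0]
    lam, vecs = np.linalg.eigh(S)                 # eigenvalues in ascending order
    # N = size of the cluster of eigenvalues near lambda_min (crude threshold, an assumption)
    n_noise = int(np.sum(lam < noise_factor * lam[0]))
    d = m - n_noise                               # eq. (5): D = M - N
    e_n = vecs[:, :n_noise]                       # noise eigenvectors E_N
    a_grid = np.stack([steering(t, m) for t in grid_deg], axis=1)
    d2 = np.sum(np.abs(e_n.conj().T @ a_grid) ** 2, axis=0)   # a* E_N E_N* a
    return d, 1.0 / np.maximum(d2, np.finfo(float).tiny)      # eq. (6)

def pick_peaks(p, grid, d):
    """Pick the d largest local maxima of p and return their grid positions, sorted."""
    idx = [i for i in range(1, len(p) - 1) if p[i] > p[i - 1] and p[i] >= p[i + 1]]
    idx.sort(key=lambda i: p[i], reverse=True)
    return sorted(grid[i] for i in idx[:d])

# Two partially correlated emitters, exact covariance S = A P A* + sigma^2 I
m = 8
true_doas = [-20.0, 30.0]
A = np.stack([steering(t, m) for t in true_doas], axis=1)
P = np.array([[1.0, 0.3], [0.3, 2.0]])
S = A @ P @ A.conj().T + 0.1 * np.eye(m)

grid = np.arange(-90.0, 90.5, 0.5)
d_hat, p_mu = music(S, grid)
doa_hat = pick_peaks(p_mu, grid, d_hat)
print(d_hat, doa_hat)
```

With measured (finite-sample) data the minimum eigenvalues spread into a cluster, so the fixed threshold used here would need to be replaced by a more careful decision rule.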

(However, the $a(\theta)$ continuum may intersect the D dimensional signal subspace more than D times; another unresolvable situation occurring only for the case of multiple incident signals, a type II ambiguity.) It is clear from the expression that MUSIC is asymptotically unbiased even for multiple incident wavefronts because S is asymptotically perfectly measured so that $E_N$ is also. $a(\theta)$ does not depend on the data.

Once the directions of arrival of the D incident signals have been found, the A matrix becomes available and may be used to compute the parameters of the incident signals. The solution for the P matrix is direct¹ and can be expressed in terms of $(S - \lambda_{\min} S_0)$ and A. That is, since $APA^* = S - \lambda_{\min} S_0$,

$$ P = (A^*A)^{-1} A^* (S - \lambda_{\min} S_0) A (A^*A)^{-1}. \tag{7} $$

¹ (Added in reprint) Equation (7) is true if $S_0$, the noise covariance matrix, is the identity matrix. In general, although there are many estimators of P, the least squares estimate based on $X = AF + W$ with $\overline{WW^*} = \lambda_{\min} S_0$ requires whitening which leads to

$$ P = (A^* S_0^{-1} A)^{-1} A^* S_0^{-1} (S - \lambda_{\min} S_0) S_0^{-1} A (A^* S_0^{-1} A)^{-1}. $$

INCLUDING POLARIZATION

Consider a signal arriving from a specific direction $\theta_0$. Assume that the array is not diverse in polarization; i.e., all elements are identically polarized, say, vertically. Certainly the DF system will be most sensitive to vertically polarized energy, completely insensitive to horizontal, and partially sensitive to arbitrarily polarized energy. The array is only sensitive to the vertically polarized component of the arriving energy.

For a general or polarizationally diverse array, the mode vector corresponding to the direction $\theta_0$ depends on the signal polarization. A vertically polarized signal will induce one mode vector, horizontal another, and right-hand circular (RHC) still another. Recall that signal polarization can be completely characterized by a single complex number q. We can "observe" how the mode vector changes as the polarization parameter q for the emitter changes at the specific direction $\theta_0$. It can be proven that as q changes through all possible polarizations, the mode vector sweeps out a two-dimensional "polarization subspace." Thus, only two independent mode vectors spanning the polarization subspace for the direction $\theta_0$ are needed to represent any emitter polarization q at direction $\theta_0$. The practical embodiment of this is that only the mode vectors of two emitter polarizations need be calculated or kept in store for direction $\theta_0$ in order to solve for emitter polarizations, where only one was needed to solve for DOA in a system with an array that was not polarizationally diverse.

These arguments lead to an equation similar to (6) for $P(\theta)$ but including the effects of polarization diversity among the array elements:

$$ P(\theta) = \frac{1}{\lambda_{\min}\left\{ \begin{bmatrix} a_x^*(\theta) \\ a_y^*(\theta) \end{bmatrix} E_N E_N^* \begin{bmatrix} a_x(\theta) & a_y(\theta) \end{bmatrix} \right\}} \tag{8} $$

where $a_x(\theta)$ and $a_y(\theta)$ are the two continua corresponding to, for example, separately taken x and y linear incident wavefront polarizations. The eigenvector corresponding to $\lambda_{\min}$ in (8) provides the polarization parameter q since it is of the form $[1 \;\; q]^T$.

Fig. 2. Block diagram for multiple signal classification. [Blocks: array with receivers, mixers, etc.; compute the S matrix, whose element $s_{ij}$ is the cross correlation between the ith and jth antenna signals; MUSIC. Outputs for D signals: a D x 1 vector of directions of arrival (azimuth-only case), a D x D matrix of cross and auto powers, and a D x 1 vector of polarization parameters.]

THE ALGORITHM

In summary, the steps of the algorithm (see Fig. 2) are:

Step 0: collect data, form S;
Step 1: calculate eigenstructure of S in metric of $S_0$;
Step 2: decide number of signals D; (5)
Step 3: evaluate $P_{MU}(\theta)$ versus $\theta$; (6) or (8)
Step 4: pick D peaks of $P_{MU}(\theta)$;
Step 5: calculate remaining parameters; (7).

The above steps have been implemented in several forms to verify and evaluate the principles and basic performance. Field tests have been conducted using actual receivers, arrays, and multiple transmitters. The results of these tests have demonstrated the potential of this approach for handling multiple signals in practical situations. Performance results are being prepared for presentation in another paper.
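Step 5 can be illustrated with a short sketch. It assumes that (7) takes the least squares form $P = (A^*A)^{-1} A^* (S - \lambda_{\min} S_0) A (A^*A)^{-1}$ with $S_0 = I$, that the directions of arrival are already known, and that the array is a uniform linear array with half-wavelength spacing. The function names `steering` and `solve_p` are hypothetical, and the covariance is the exact (asymptotic) one rather than measured data.

```python
import numpy as np

def steering(theta_deg, m, spacing=0.5):
    """Mode vector for an ASSUMED uniform linear array (spacing in wavelengths)."""
    k = np.arange(m)
    return np.exp(2j * np.pi * spacing * k * np.sin(np.deg2rad(theta_deg)))

def solve_p(S, A, lam_min):
    """Signal covariance P from S and A, assuming S0 = I (least squares form of eq. (7))."""
    left = np.linalg.solve(A.conj().T @ A, A.conj().T)    # (A*A)^-1 A*
    core = S - lam_min * np.eye(S.shape[0])               # S - lambda_min S0 = A P A*
    return left @ core @ left.conj().T                    # right factor is A (A*A)^-1

# Exact covariance for two partially correlated emitters at known directions
m = 6
A = np.stack([steering(t, m) for t in (10.0, 40.0)], axis=1)
P_true = np.array([[2.0, 0.5 + 0.5j], [0.5 - 0.5j, 1.0]])
S = A @ P_true @ A.conj().T + 0.2 * np.eye(m)

lam_min = np.linalg.eigvalsh(S)[0]     # smallest eigenvalue, here the noise power
P_hat = solve_p(S, A, lam_min)
print(np.round(P_hat, 6))
```

The diagonal of `P_hat` holds the emitter powers and the off-diagonal entries hold their cross correlations, which is the information Step 5 is meant to recover.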

COMPARISON WITH OTHER METHODS

In comparing MUSIC with ordinary beamforming (BF), maximum likelihood (ML), and maximum entropy (ME), the following expressions were used. See Figs. 3 and 4.

$$ P_{BF}(\theta) = a^*(\theta) S a(\theta) $$

$$ P_{ML}(\theta) = \frac{1}{a^*(\theta) S^{-1} a(\theta)} $$

$$ P_{ME}(\theta) = \frac{1}{a^*(\theta)\, c\, c^*\, a(\theta)} $$

where c is a column of $S^{-1}$. The beamformer expression calculates for plotting the power one would measure at the output of a beamformer (summing the array element signals after inserting delays appropriate to steer or look in a specific direction) as a function of the direction. $P_{ML}(\theta)$ calculates the log likelihood function under the assumptions that X is a mean zero, multivariate Gaussian and that there is only a single incident wavefront present. For multiple incident wavefronts, $P_{ML}(\theta)$ becomes

$$ P_{ML}(\theta) = \frac{1}{\lambda_{\min}(A^* S^{-1} A)} $$

which implies a D dimensional search (and plot!). $P_{ME}(\theta)$ is based on selecting one of the M array elements as a "reference" and attempting to find weights to be applied to the remaining M - 1 received signals to permit their sum with a MMSE fit to the reference. Since there are M possible references, there are M generally different $P_{ME}(\theta)$ obtained from the M possible column selections from $S^{-1}$. In the comparison plots, a particular reference was consistently selected.

An example of the completely general MUSIC algorithm applied to a problem of steering a multiple feed parabolic dish antenna is shown in Fig. 5. $\sin x / x$ pencil beamshapes skewed slightly off boresight are assumed for the element patterns. Since the six antennas are essentially colocated, the DF capacity arises out of the antenna beam pattern diversity. The computer was used to simulate the "noisy" S matrix that would arise in practice for the conditions desired and then to subject it to the MUSIC algorithm. Fig. 5 shows how three directional signals are distinguished and their polarizations estimated even though two of the arriving signals are highly similar (90 percent correlated).

The application of MUSIC to the estimation of the frequencies of multiple sinusoids (arbitrary amplitudes and phases) for a very limited duration data sample is shown in Fig. 6. The figure suggests that, even though there was no actual noise included, the rounding of the data samples to six decimal digits has already destroyed a significant portion of the information present in the data needed to resolve the several frequencies.

Fig. 3. Example of azimuth-only DF performance. [Panels: conventional beamforming, maximum likelihood, maximum entropy, and MUSIC, each plotted in dB versus AOA for a triangle array at SNR = 24 dB and SNR = 10 dB.]

Fig. 4. Example of azimuth-only DF performance (scale expanded about weaker signal at 30°). [Panels, each in dB versus AOA: max likelihood; max entropy (large bias error; peak is relatively sharp); MUSIC (no bias error; peak is very sharp).]

SUMMARY AND CONCLUSION

As this paper was being prepared, the works of Gething [1] and Davies [2] were discovered, offering a part of the solution discussed here but in terms of simultaneous equations and special linear relationships without recourse to eigenstructure. However, the geometric significance of a vector space setting and the interpretation of the S matrix eigenstructure was missed. More recent work by Reddi [3] is also along the lines of the work presented here though limited to uniform, collinear arrays of omnidirectional elements and also without clear utilization of the entire noise subspace. Ziegenbein [4] applied the same basic concept to time series spectral analysis, referring to it as a Karhunen-Loeve transform though treating aspects of it as "ad hoc." El-Behery and MacPhie [5] and Capon [6] treat the uniform collinear array of omnidirectional elements using the maximum likelihood method. Pisarenko [7] also treats time series and addresses only the case of a full complement of sinusoids; i.e., a one-dimensional noise subspace.

The approach presented here for multiple signal classification is very general and of wide application. The method is interpretable in terms of the geometry of complex M spaces wherein the eigenstructure of the measured S matrix plays the central role. MUSIC provides asymptotically unbiased estimates of a general set of signal parameters approaching the Cramer-Rao accuracy bound. MUSIC models the data as the sum of point source emissions and noise rather than the convolution of an all pole transfer function driven by a white noise (i.e., autoregressive modeling, maximum entropy) or maximizing a probability under the assumption that the X vector is zero mean, Gaussian (maximum likelihood for Gaussian data). In geometric terms MUSIC minimizes the distance from the $a(\theta)$ continuum to the signal subspace whereas maximum likelihood minimizes a weighted combination of all component distances.

No assumptions have been made about array geometry. The array elements may be arranged in a regular or irregular pattern and may differ or be identical in directional characteristics (amplitude/phase) provided their polarization characteristics are all identical. The extension to include general polarizationally diverse antenna arrays will be more completely described in a separate paper.
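The single-wavefront comparison expressions for beamforming, ML, and ME can be evaluated next to the MUSIC pseudo-spectrum on a common grid. This is only a sketch under stated assumptions: a uniform linear array with half-wavelength spacing, $S_0 = I$, an exact covariance matrix, and a known noise-subspace dimension. The function names `steering` and `spectra` and the choice of reference column `ref` are illustrative, not taken from the paper.

```python
import numpy as np

def steering(theta_deg, m, spacing=0.5):
    """Mode vector for an ASSUMED uniform linear array (spacing in wavelengths)."""
    k = np.arange(m)
    return np.exp(2j * np.pi * spacing * k * np.sin(np.deg2rad(theta_deg)))

def spectra(S, a_grid, n_noise, ref=0):
    """Return (P_BF, P_ML, P_ME, P_MU) evaluated at every column of a_grid."""
    tiny = np.finfo(float).tiny
    s_inv = np.linalg.inv(S)
    # a* S a and a* S^-1 a for each grid point
    p_bf = np.real(np.einsum('ig,ij,jg->g', a_grid.conj(), S, a_grid))
    p_ml = 1.0 / np.real(np.einsum('ig,ij,jg->g', a_grid.conj(), s_inv, a_grid))
    c = s_inv[:, ref]                                   # one column of S^-1
    p_me = 1.0 / np.maximum(np.abs(c.conj() @ a_grid) ** 2, tiny)   # a* c c* a = |c* a|^2
    _, vecs = np.linalg.eigh(S)
    e_n = vecs[:, :n_noise]                             # noise eigenvectors
    d2 = np.sum(np.abs(e_n.conj().T @ a_grid) ** 2, axis=0)
    p_mu = 1.0 / np.maximum(d2, tiny)
    return p_bf, p_ml, p_me, p_mu

# Two equal-power uncorrelated emitters 10 degrees apart, exact covariance
m = 8
A = np.stack([steering(t, m) for t in (20.0, 30.0)], axis=1)
S = A @ A.conj().T + 0.1 * np.eye(m)

grid = np.arange(-90.0, 90.5, 0.5)
a_grid = np.stack([steering(t, m) for t in grid], axis=1)
p_bf, p_ml, p_me, p_mu = spectra(S, a_grid, n_noise=m - 2)
print(grid[int(np.argmax(p_mu))])
```

Plotting the four arrays in dB against `grid` gives curves of the kind described for Figs. 3 and 4; how well each method separates the two emitters depends on the array, the SNR, and the reference element chosen for ME.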

Fig. 5. MUSIC applied to a multiple feed, parabolic dish antenna system. [Plot residue: contours of the 3 dB beamwidths of the six antenna feed elements, linearly polarized as per arrows, plotted against azimuth in beamwidths; emitter path number 1, emitter path number 2, and an interferer are marked with their signal polarizations and relative powers. Correlation coefficients among the three signals:]

    1.00    .91    .00
     .91   1.00    .00
     .00    .00   1.00

[Fig. 6 plot data: 16 complex time samples of 6 cisoids.]

    FREQ (Hz)    REL AMP (dB)
     78.1         -62.33
    134.1         -11.10
    138.6           0.0
    142.9         -11.10
    152.9         -40.0
    166.3         -26.02

REFERENCES

[1] P. J. D. Gething, "Analysis of multicomponent wavefields," Proc. Inst. Elec. Eng., vol. 118, no. 10, Oct. 1971.
[2] D. E. N. Davies, "Independent angular steering of each zero of the directional pattern for a linear array," IEEE Trans. Antennas Propagat., vol. AP-15, Mar. 1967.
[3] S. S. Reddi, "Multiple source location-A digital approach," IEEE Trans. Aerosp. Electron. Syst., vol. AES-15, no. 1, Jan. 1979.
[4] J. Ziegenbein, "Spectral analysis using the Karhunen-Loeve transform," presented at 1979 IEEE Int. Conf. Acoust. Speech Signal Processing, Washington, DC, Apr. 2-4, 1979, pp. 182-185.
[5] I. N. El-Behery and R. H. MacPhie, "Maximum likelihood estimation of source parameters from time-sampled outputs of a linear array," J. Acoust. Soc. Am., vol. 62, no. 1, July 1977.
[6] J. Capon, "High resolution frequency-wavenumber spectrum analysis," Proc. IEEE, vol. 57, no. 8, Aug. 1969.
[7] V. F. Pisarenko, "The retrieval of harmonics from a covariance function," Geophys. J. Royal Astron. Soc., no. 33, pp. 347-366, 1973.

Using Spectral Estimation Techniques in Adaptive Processing Antenna Systems

WILLIAM F. GABRIEL, FELLOW, IEEE

Abstract-Improved spectral estimation techniques hold promise for becoming a valuable asset in adaptive processing array antenna systems. Their value lies in the considerable amount of additional useful information which they can provide about the interference environment, utilizing a relatively small number of degrees of freedom (DOF). The "superresolution" capabilities, estimation of coherence, and relative power level determination serve to complement and refine the data from faster conventional estimation techniques. Two conceptual application area examples for using such techniques are discussed: partially adaptive low-sidelobe arrays, and fully adaptive tracking arrays. For the partially adaptive area the information is utilized for efficient assignment of a limited number of DOF in a beamspace constrained adaptive system in order to obtain a stable main beam, retention of low sidelobes, considerably faster response, and reduction in overall cost. These benefits are demonstrated via simulation examples computed for a 16-element linear array. For the fully adaptive tracking array area the information is utilized in an all-digital processing system concept to permit stable nulling of coherent interference sources in the main beam region, efficient assignment/control of the available DOF, and greater flexibility in time-domain adaptive filtering strategy.

I. INTRODUCTION

IMPROVED SPECTRAL estimation techniques are an emerging technology which derives largely from modern spectral estimation theory of the past decade and adaptive array processing techniques [2], [3], [4]. Coupled with the phenomenal advances in digital processing [5], these techniques are becoming a valuable asset for adaptive array antenna systems. Their value lies in the considerable amount of additional useful information which they can provide about the environment, utilizing only a relatively small number of degrees of freedom (DOF). For example, current spectral estimation algorithms can provide asymptotically unbiased estimates of the number of interference sources, source directions, source strengths, and any cross correlations (coherence) between sources [6], [7]. Such information can then be used to track and "catalogue" interference sources, hence assign adaptive DOF.

These newer techniques are not viewed as a "superresolution" replacement for more conventional estimation methods such as main beam search, analog beamformers, or spatial discrete Fourier transforms (DFT). Rather, the new technology is considered complementary to the other methods and best used in tandem. For example, "superresolution" techniques cannot compete with the speed of a DFT. Some comparisons of the various methods may be found described in the literature [4], [7], [8].

The purpose of this paper is to present two conceptual application areas for using spectral estimation techniques: partially adaptive low-sidelobe antennas, and fully adaptive tracking arrays. A partially adaptive array is one in which only a part of the DOF (array elements or beams) are individually controlled adaptively [9], [10], [11]. Obviously, the fully adaptive configuration is preferred since it offers the most control over the response of the antenna system. But when the number of elements or beams becomes moderately large (hundreds), the fully adaptive processor implementation can become prohibitive in cost, size, and weight.

The paper is divided into three principal parts. Section II discusses partially adaptive, low-sidelobe antennas with the focus upon a constrained beamspace system; Section III considers source estimation and beam assignment from "superresolution" techniques; and Section IV discusses an all-digital, fully adaptive tracking array concept.

Manuscript received May 3, 1985; revised September 16, 1985. This work was supported by NAVAIRSYSCOM and the Office of Naval Research. This paper is a condensation of NRL Rep. 8920 [1]. The author is with the Radar Division, Naval Research Laboratory, Washington, DC 20375. IEEE Log Number 8407018.

II. PARTIALLY ADAPTIVE LOW-SIDELOBE ANTENNAS

The antenna system addressed in this section is assumed to be a moderately large aperture array of low-sidelobe design wherein the investment is already considerable and one simply could not afford to make it fully adaptive. The assumption of low sidelobes (30 dB or better) is intended to give us good initial protection against modest interference sources and to reduce the problems from strong sources, i.e., in regard to the number of adaptive DOF required and the adaptive dynamic range of the processor. Thus, retention of the low sidelobes is considered a major goal in our adaptive system. In the discussion to follow, it is shown that using improved spectral estimation techniques in such a system can result in the following benefits over a fully adaptive array system.

1) Considerably faster adaptive response, reduction in computation burden, and reduction in overall cost because relatively few adaptive DOF are implemented.
2) Minimal degradation of both the main beam and sidelobe levels because simple adaptive weight constraints are made possible.
3) Compatible with a larger number of adaptive algorithms, including even analog versions.
4) Greater flexibility in achieving a "tailored" response due to greater information available.

On the negative side, a partially adaptive system can never be guaranteed a cancellation performance equal to that of a fully adaptive array and, in addition, will deteriorate abruptly in performance when the interference situation exceeds its adaptive DOF. These risks are an inherent part of the package and must be carefully weighed for any specific system application.

Reprinted from IEEE Transactions on Antennas and Propagation, Vol. AP-34, No. 3, pp. 291-300, March 1986.

A. A Low-Sidelobe Eigenvector Constraint

We begin this section by reviewing that unconstrained adaptive arrays can experience very "noisy" sidelobe fluctuations and main beam perturbations when the data observation/integration time is not long enough, even though the quiescent main beam weights are chosen for low sidelobes. Consider a linear array of K elements, with each element adaptively weighted, and let us compute the complex adaptive element weights $W_k$ from the well-known sample matrix inverse (SMI) algorithm [11], [12]. Expressed in convenient matrix notation,

$$ W = \mu R^{-1} S^* \tag{1} $$

where W is the adaptive weights vector, R is the sample covariance matrix, $S^*$ is the quiescent main beam weights vector, the asterisk denotes the conjugate of a complex vector or matrix, and $\mu$ is a constant. Furthermore, compute the sample covariance matrix via the simple "block" average taken over N snapshots,

$$ R = \frac{1}{N} \sum_{n=1}^{N} \left[ E(n) E(n)^{*t} \right] \tag{2} $$

where E(n) is the element signal data vector received at the nth time sampling. (See Appendix I for description of snapshot signal model.) The data observation/integration time in (2) is the parameter N. If R is estimated over a lengthy observation time, like thousands of snapshots, then the sidelobe fluctuations from W updates will be relatively small. However, practical system usage often demands short observation times on the order of hundreds of snapshots or even less. Fig. 1 illustrates typical adapted pattern behavior for independent estimates of R using N = 256 snapshots per update for the case of three 30 dB noncoherent sources located at 14°, 18°, and 22°. The antenna aperture chosen for this example is a 16-element linear array with half-wavelength element spacing and a 30 dB Taylor illumination incorporated in $S^*$. Note that the adaptive algorithm maintains the main beam region and successfully nulls out the interference sources, but that it also raises the sidelobe levels elsewhere. The adaptive patterns are in continual fluctuation in the sidelobe regions and may exceed the quiescent sidelobe level by a considerable margin. Also, the main beam suffers a significant modulation which would degrade tracking performance. These effects worsen as the value of N decreases.

Fig. 1. Fully adaptive 16-element linear array, SMI algorithm with R estimated from 256 snapshots per update, three 30 dB noncoherent sources located at 14°, 18°, and 22°. (a) Quiescent main beam pattern, 30 dB Taylor weighting. (b) Typical adapted patterns, nine update trials plotted. [Axes: power in dB versus spatial angle in degrees.]

To understand the reason for this undulating pattern behavior, it is helpful to analyze the optimum weights in terms of eigenvalue/eigenvector decomposition. A derivation of such a decomposition for (1) can be found in [1], and we reproduce here (B18):

$$ W = \mu' \left[ S^* - \sum_{i=1}^{K} \left( \frac{\beta_i^2 - \beta_0^2}{\beta_i^2} \right) \alpha_i e_i \right] \tag{3} $$

where $\alpha_i = e_i^{*t} S^*$ and $\mu' = \mu / \beta_0^2$. Here t denotes the transpose of a vector or matrix. The $\beta_i^2$ and $e_i$ are the eigenvalues and eigenvectors, respectively, of the sample covariance matrix, and $\beta_0^2$ is equal to receiver channel noise power level. Equation (3) shows that W consists of two parts: the first part is the quiescent main beam weight $S^*$; the second part, which is subtracted from $S^*$, is a summation of weighted, orthogonal eigenvectors. This is a clear expression of the fundamental principle of pattern subtraction which applies in adaptive array analysis. The reader is referred to [13] for a more extensive discussion.

We introduce the term principal eigenvectors (PE) to mean those eigenvectors which correspond to unique eigenvalues generated by the spatial source distribution; and the term "noise eigenvectors" to mean those eigenvectors which correspond to the small noise eigenvalues generated by the receiver channel noise contained in the finite R estimates. The PE are generally rather robust and tend to remain relatively stable from one data trial to the next, whereas the noise eigenvectors tend to fluctuate considerably because of the inherent random behavior of noise. This difference in behavior is illustrated in Fig. 2 for the three source case described above, wherein there are three PE and 13 noise eigenvectors associated with each R estimate. Fig. 2(a) shows the stability of the three PE for nine trials, and Fig. 2(b) shows the random behavior of typical noise eigenvectors for the exact same trials. Thus, we would expect that the sidelobe undulations in Fig. 1(b) are associated primarily with the noise eigenvectors. This thesis is verified in Fig. 3, which illustrates the adapted patterns resulting from (3) when only the PE are subtracted.

The above adaptive array pattern behavior leads to the following observations for source distributions which do not encroach upon the main beam and involve a small number of the available degrees of freedom.

1) It is possible to retain low sidelobes in the adapted patterns, even with short observation times, by constraining our algorithm (3) to utilize only the PE. The weight solution is unique and therefore stable.
2) Utilizing only the PE is tantamount to operating our adaptive system in beamspace (as opposed to element space) with a set of weighted orthogonal canceller beams.
3) The fully adaptive array automatically forms and "assigns" its PE canceller beams to cover the interference source distribution, with one beam per each DOF needed.

Therefore, we have set forth a low-sidelobe eigenvector constraint algorithm for this type of restricted interference situation.

B. Low-Sidelobe Constraints for a General Beamformer

Consider next a more interesting configuration shown by the schematic diagram of Fig. 4, where we represent an adaptive array system operating in beamspace so as to have available some preadaption spatial filtering. Applebaum and Chapman [10], [14] first described beamspace systems of this type, utilizing a Butler matrix [15] beamformer wherein the vector of beamformer outputs E may be expressed,

(4)

where B is a K x K matrix containing the beamformer element weights. Other descriptions of beamspace systems are also available in the literature [11], [16], [17], [18], of which Adams et al. [17] is particularly germane to our discussion. Chapman [10] pointed out that when utilized in a partially adaptive configuration, such beamspace systems are susceptible to aperture element errors and cannot arbitrarily compensate the random error component of their sidelobe structure. This makes it necessary to control element errors in accordance with the quiescent main beam sidelobe level desired, and fits into our initial assumption of low-sidelobe design mentioned earlier. A separate weighted main beam summing is indicated which may be obtained either by coupling into the beamformer outputs as shown, or by coupling off from the elements and providing suitable phase shifters for steering plus a corporate feed network.

Our purpose here is to examine the sidelobe performance of such a partially adaptive beamspace system in which element errors are kept low and beamformer beams are subjected to simple constraints. Spatial estimation data on the interference source distribution shall determine which beamformer beams are to be adaptively controlled. Such beams are defined herein as "assigned" beams, and the idea is to assign only enough beams to accommodate the DOF required by the source distribution. Whenever the two are equal, the adaptive weight solution is unique and we avoid adding any extra "noisy" weight perturbations. The reader will recognize that we are attempting to replace the PE beams of Section II-A with assigned beams from our general beamformer. Thus, we are defining a partially adaptive array which will utilize only a relatively small number of its available DOF. In addition to

Fig. 2. Plots of principal eigenvectors and noise eigenvectors computed from the R estimates associated with the three-source case of Fig. 1, nine update trials. (a) Principal eigenvectors 1, 2, and 3. (b) Typical noise eigenvectors (three shown, including nos. 10 and 16). [Axes: power in dB versus spatial angle in degrees.]

Fig. 3. Typical adapted patterns resulting from the constraint of utilizing only the PE, three-source case of Fig. 1. [Axes: power in dB versus spatial angle in degrees.]

this assigned beam constraint, we seek to limit the adaptive weights of assigned beams to a maximum level $\gamma$ chosen to exceed the main beam sidelobe level by only a few decibels. This prevents an excessive rise in adaptive sidelobe level, including the condition where the number of assigned beams exceeds the DOF required. $\gamma$ actually represents the product of assigned beam gain and adaptive weight magnitude, such that we have the option of working with beamformer beams which are considerably decoupled/attenuated. An equation formulation may be expressed in terms of the same pattern subtraction principle as utilized in (3) for K beams,

$$W_o = S^{*} - \sum_{k=1}^{K} W_k b_k^{*} \qquad (5)$$

where $|W_k| \le \gamma$ for J assigned beams and $W_k = 0$ for all other beams. $b_k$ is the kth Butler matrix beam element-weight vector. When $W_k = 0$, that beam port is essentially disconnected from the output summation and it is much to our advantage to reduce the DOF of the adaptive weight processor accordingly, i.e., this processor reduction relates directly to the computational burden, response time, sidelobe degradation, and overall cost mentioned earlier. For example, utilizing the SMI technique described in (1) and (2), we would now have the advantage that our sample covariance matrix of signal inputs R involves only the J assigned beams and its dimensions reduce from K × K down to J × J, thereby greatly easing the computation burden involved in obtaining its inverse [11]. The equivalent "steering vector" A per Applebaum [9] is also reduced to dimension J and consists of the cross correlation between the main beam signal V and the J assigned beam outputs Y,

$$A = \frac{1}{N} \sum_{n=1}^{N} V(n)\, Y^{*}(n). \qquad (6)$$

The jth assigned beam output for the nth snapshot signal sample is simply

$$Y_j(n) = E^{t}(n)\, b_k^{*}, \quad k \text{ set by } j \qquad (7)$$

where the particular value of k must be selected for the jth assigned beam. Our J dimension adaptive weight solution thus becomes

$$W = R^{-1} A. \qquad (8)$$

Equation (8) gives us the J assigned beam weights required in (5). The proposed constraint $|W_k| \le \gamma$ can be applied directly to the solution from (8), but recognize that this is a "hard" constraint and the results will not be optimal when the limit is exceeded. A softer, more flexible constraint for our purposes is one suggested by Brennan¹ based upon Owsley [19], where one selects weights which simultaneously minimize both the output and the sum of the weight amplitudes squared, i.e.,

$$\text{minimize} \left\{ \overline{|V - W^{t} Y|^{2}} + \alpha W^{t} W^{*} \right\}$$

where the overbar denotes averaging over N snaps. The solution is a simple modification to (8) wherein

$$W = [R + \alpha I]^{-1} A \qquad (9)$$

where $\alpha = (\gamma^{2}/J)\, \mathrm{trace}[R]$. Note that (9) adds a small percentage of the average assigned beam power to the diagonal terms of R, i.e., it is a "pseudonoise" addition technique. Recall that $\gamma$ was selected to be close to the main beam sidelobe level. Although $\alpha$ is a small percentage of the trace [R], it is generally much larger than the receiver noise level, and this domination over receiver noise by a constant will tend to severely dampen weight fluctuations due to noise. Of course, (9) deviates from the optimum Wiener weights and will result in a slightly larger output residue, but the cost is negligible compared to the remarkably stable results achieved from this rather simple constraint. It essentially permits the number of assigned beams to exceed the DOF required, and yet retain low sidelobe levels.

Equations (5)-(9) were utilized in computing the adaptive pattern examples which follow. The reader should recognize that the J dimension adaptive weight solution may be arrived at via any of the current adaptive processing algorithms such as Howells-Applebaum [9], Gram-Schmidt [11], sample matrix inverse update [20], etc.

Applying these constraints to our three-source case of Fig. 1, we would assign beamformer beams 10, 11, and 12 to cover the sources, as illustrated in Fig. 5. These assigned beams are then given a maximum gain level about 5 dB above the -30 dB main beam sidelobes, or $|W_k| \le 0.055$. All other $W_k$ are set to zero. Typical resultant adapted patterns are almost identical to Fig. 3. The pattern stability is near-perfect for a unique solution like this, and note that the three sources have been nulled with very little perturbation of the main beam sidelobes except in the immediate vicinity of the sources. Since we are inverting a matrix of only 3 × 3 dimension in (8) for this case, it follows that the number of snapshots processed per trial could be reduced by an order of magnitude [12] and still have excellent results.

The adaptive weights will tend to become "noisy" if we include even one extra DOF beyond the unique solution. However, if we use the "soft" constraint of (9) in solving for the weights, stable performance is again restored despite the extra DOF.

Although not shown here, another example of interest was the case of using a two-beam cluster (11 and 12 in Fig. 5) to cancel a single 40 dB broad-band source located at 22°. It was found that the source could be adequately cancelled at bandwidths up to 15 percent. Many other combinations of source distributions and assigned beams were tested to further verify the technique, and the partially adaptive performance was satisfactory provided that the assigned beams were sufficient to cover the DOF demanded by the source distribution.

¹ Private communication, L. E. Brennan, Adaptive Sensors, Inc.

Fig. 4. Beamspace partially adaptive array with a separately weighted main beam and canceller beams assigned by a source estimation processor.

Fig. 5. Beamformer beams 10, 11, and 12 assigned to cover the three-source case of Fig. 1. (Axes: power in decibels versus spatial angle in degrees.)
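The difference between the unconstrained solution (8) and the "soft" constraint (9) can be illustrated with a short sketch. It is only an illustration under stated assumptions: the helper names `solve` and `soft_constrained_weights` are hypothetical, the 2 × 2 covariance and steering values are made-up numbers rather than data from the paper, and a small Gauss-Jordan solver stands in for whatever matrix inversion an actual processor would use.

```python
# Sketch only: helper names and the toy numbers below are assumptions.

def solve(M, b):
    """Solve M x = b for a small complex system by Gauss-Jordan elimination."""
    n = len(b)
    aug = [list(map(complex, row)) + [complex(rhs)] for row, rhs in zip(M, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def soft_constrained_weights(R, A, gamma):
    """Return W = [R + alpha*I]^-1 A with alpha = (gamma**2 / J) * trace(R), as in (9)."""
    J = len(A)
    trace = sum(R[i][i] for i in range(J)).real
    alpha = (gamma ** 2 / J) * trace
    loaded = [[R[i][k] + (alpha if i == k else 0.0) for k in range(J)]
              for i in range(J)]
    return solve(loaded, A)


# Toy J = 2 example (Hermitian, positive definite R).
R = [[2.0, 0.5 + 0.5j], [0.5 - 0.5j, 1.0]]
A = [0.1 + 0.0j, 0.05j]

W_unconstrained = soft_constrained_weights(R, A, gamma=0.0)  # reduces to (8)
W_soft = soft_constrained_weights(R, A, gamma=0.5)           # diagonally loaded

print([abs(w) for w in W_unconstrained])
print([abs(w) for w in W_soft])
```

Setting `gamma=0.0` recovers (8), and a nonzero `gamma` shrinks the weight vector, which is the damping effect the text attributes to the pseudonoise term. The value `gamma=0.5` is exaggerated so the effect is visible in a two-beam toy case.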

C. Interference Sources in the Main Beam Region

Extension of the foregoing partially adaptive array technique for main beam interference is straightforward, provided we relax the constraint upon the value of $\gamma$ in (5). Obviously, the low-sidelobe stratagem becomes secondary to the greater menace of an interference source coming in through our high-gain main beam. Low sidelobes could still be retained, if necessary, by implementing a beamformer which is capable of producing a family of low-sidelobe assigned beams [17].

III. SOURCE ESTIMATION AND BEAM ASSIGNMENT

Modern spectral estimation techniques are considered complementary to the conventional methods for tracking and cataloging interference sources. They do not interfere with any functions of the main beam, and they are capable of providing superior source resolution from fewer elements. The latter advantage is gained in part because we assumed low sidelobes for the main beam, i.e., the only sources that require estimation are those few which are of sufficiently high SNR to get through the main beam sidelobes. Resolution performance is always directly related to signal-to-noise ratio (SNR), of course [3], [7], [8].

The principle of achieving source estimation from a small fraction of the aperture DOF has been demonstrated via many techniques, both conventional and optimal [2], [4], [21]. It is not within the scope of this paper to attempt a comprehensive comparison of such techniques, but the point is important to our concept so that an example of a half-aperture linear array estimator is given in this section.

The type of application envisioned is illustrated in Fig. 6, where we represent a K × K element aperture system in which the adaptive beam DOF are to be assigned on the basis of estimates derived from two orthogonal linear arrays of K/2 elements each. An extension of the two-dimensional beamspace adaptive array system of Fig. 4 to the three-dimensional system suggested by Fig. 6 permits several beamformer options, including 1) two orthogonal two-dimensional beamformers of which one is coupled into a row and the other coupled into a column of elements; and 2) a complete three-dimensional beamformer [22] coupled into the aperture elements, perhaps on a thinned basis. The separate main beam must be summed from all K² elements in order to attain the desired low sidelobes.

Although they involve relatively few elements from the aperture, the linear array estimators represent a significant increase in system expense because they are all-digital processing subsystems. Processing of the digital signals to estimate the sources may be carried out in accordance with a number of spectral estimation algorithms available in the literature [1]-[8]. Several algorithms that were utilized in the simulations conducted for this paper are discussed in [1].

Once the source estimation information is available, then we can assign beamformer beams via a computer logic program. A Fortran IV computer code named "BEAMASSIGN" was developed which accepts source information updates, compares the new data against a source directory kept in memory, computes track updates for sources already in memory, determines priority ranking, and assigns beams to cover the sources of highest priority. An important point to note is that beam assignment does not require great accuracy, i.e., a half-beamwidth is usually close enough. Also, clusters of two or three adjacent beams may be assigned for doubtful cases.

A demonstration of beam assignment was conducted with a moving source simulation involving the 16-element linear array of Fig. 1. Four sources of unequal strength were set up in the far field, traveling in crisscrossing patterns. Two of the sources are of 30 dB strength with start angles of 3.0° and 39.0°, and two are of 43 dB strength with start angles of 5.0° and 70.0°. The estimation of the scanned main beam for this example is shown in Fig. 7(a). Each time-unit plot cut is computed from R averaged over 160 snapshots,

(10)

where S* is the main beam steering vector used to generate the display plot. As expected, this simple Fourier output is dominated by the two stronger sources. In contrast, Fig. 7(b) shows the source estimation derived from eigenanalysis processing using only half of the aperture (eight elements). Note that the "superresolution" characteristics of this type of optimal estimation produce excellent source tracking, even in the vicinity of crossover of three of the sources.

The results of using the source information data contained in Fig. 7(b) to continuously update beam assignments are illustrated in the adapted pattern cuts shown in Fig. 8(a). Note that the main beam remains steady and the sidelobes seldom exceed their quiescent 30 dB peak level, despite the drastic shifting of the nulls as the moving sources crisscross in the sidelobe region. In contrast, Fig. 8(b) illustrates the adapted pattern cuts obtained when we utilize the SMI algorithm weights with the array fully adaptive. Although the source cancellation is excellent, the main beam suffers significant modulation and the peak sidelobe levels rise considerably.

Fig. 6. (K × K) element aperture within which row/column linear arrays couple into source estimation processors.

Fig. 8. Adaptive patterns for 16-element linear array, SMI algorithm, four moving sources case of Fig. 7. (a) Partially adaptive, constrained, assigned beams, 32 snapshots processed per plot cut. (b) Fully adaptive array, no constraints, 300 snapshots processed per plot cut.
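The beam assignment logic described above (priority ranking, nearest-beam coverage, and two-beam clusters for doubtful cases) can be sketched as follows. This is not the BEAMASSIGN code, which is not reproduced in the paper. The function name `assign_beams`, the `doubt_deg` threshold, the beam pointing angles, and the source list are all hypothetical values chosen for illustration, and at least two beams are assumed to exist.

```python
# Sketch only: not the BEAMASSIGN program; names and numbers are assumptions.

def assign_beams(sources, beam_angles, max_beams, doubt_deg=2.0):
    """Pick beamformer beam indices to cover the highest-priority sources.

    sources     -- list of (angle_deg, power_db) source estimates
    beam_angles -- pointing angle in degrees of each beamformer beam
    max_beams   -- limit on the number of assigned beams (the DOF budget)
    doubt_deg   -- if the two nearest beams are almost equally close
                   (difference below this value), assign both as a cluster
    """
    assigned = []
    # Strongest sources are ranked first.
    for angle, _power in sorted(sources, key=lambda s: s[1], reverse=True):
        ranked = sorted(range(len(beam_angles)),
                        key=lambda k: abs(beam_angles[k] - angle))
        chosen = [ranked[0]]
        gap = (abs(beam_angles[ranked[1]] - angle)
               - abs(beam_angles[ranked[0]] - angle))
        if gap < doubt_deg:
            chosen.append(ranked[1])  # doubtful case: two-beam cluster
        for k in chosen:
            if k not in assigned and len(assigned) < max_beams:
                assigned.append(k)
    return sorted(assigned)


# Hypothetical beam grid and source estimates.
beam_angles = [-60.0, -40.0, -20.0, 0.0, 20.0, 40.0, 60.0]
sources = [(18.0, 30.0), (41.0, 43.0), (-29.5, 20.0)]

print(assign_beams(sources, beam_angles, max_beams=3))
print(assign_beams(sources, beam_angles, max_beams=4))
```

Ranking by power alone is the simplest possible priority rule. The text indicates that the actual program also maintains a source directory and track updates, which this sketch omits.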

Fig. 7. Estimation of four moving sources via main beam scan and half-aperture eigenanalysis algorithm, 160 snapshots averaged per plot cut. (a) Conventional main beam scanning, 16 elements, 30 dB Taylor illumination. (b) Half-aperture eigenanalysis source estimation.

IV. AN ADAPTIVE ARRAY TRACKING APPLICATION

A second area where spectral estimation techniques can provide valuable assistance is that of adaptive array tracking systems. Here we are dealing with the problem of attempting to track targets under the condition of having interference sources present in the main beam region. Some early proposed solutions in this area evolved from the growing adaptive array technology of the 1970s. For example, a paper by White [23] discusses the radar problem of tracking targets in the low-angle regime where conventional tracking radars encounter much difficulty because of the presence of a strong surface-reflected ray. The first extension of fully adaptive arrays to angle estimation in external noise fields is the contribution of Davis et al. [24], who developed an algorithm based on the outputs of adaptively distorted sum and difference beams. The adaptive beams filter (null) the external noise sources, and distortion correction is then applied in the resultant monopulse output angle estimate. Their work is particularly appropriate as a starting point for this section, where we discuss the advantages of using spectral estimation techniques in an all-digital, fully adaptive array tracking system; [17] is also pertinent.

Fig. 9. Typical adaptive patterns for coherent interference in the main beam region; 16-element linear array, SMI algorithm, two 13 dB coherent sources located at -7.6° and -4.0°. (Curves are shown for 0°, 90°, and 180° phase; vertical axis is power in decibels.)

A. Coherent Spatial Interference Sources

The existence of significant coherence between spatial sources as, for example, in multipath situations involving a specular reflection, continues to represent a serious problem area even for a fully adaptive tracking array. Reasons include:

1) coherent signals are not stationary in space [3], [7], [25];
2) adaptive systems may perform cancellation via weight phasing rather than null steering [7], [25]-[28];
3) adaptive tracking beam distortion is highly sensitive to coherent signal phasing;
4) signal fading under antiphase conditions.

To demonstrate these reasons, adaptive characteristics were computed for a 16-element linear array for an interference case in which there are two 13 dB coherent sources in the main beam region. There is also a third source, noncoherent, in the nearby sidelobe region to act as a stable null comparison point. In Fig. 9, we illustrate the severe changes in our main beam caused by variation of the phase shift between the two coherent sources located at -7.6° and -4.0°. The quiescent main beam has the same Taylor weighting as in Fig. 1(a). Note that for phasing of 0° and 180°, the adaptive weights are not achieving cancellation by steering nulls onto the coherent sources but, rather, by the weight phasing itself. The array output was driven down to receiver noise level for all three phases. The plots for 90° phase are very similar to what one would obtain if all three sources were noncoherent, i.e., cancellation is achieved by adaptive null steering in this instance.

Such severe sensitivity to coherent source phasing in the main beam region produces different distortions in tracking estimates from adaptive Σ (sum) and Δ (difference) patterns, as shown in Fig. 10. The equation development for this type of plot is in [1], but the main point here is to show the considerable changes in track angle estimates just due to phase variation. Once again, if all three sources were noncoherent, the distortion plot would be stable and very similar to the one shown for 90° phase.
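The first and fourth reasons listed above (coherent signals are not stationary in space, and fade under antiphase conditions) can be seen in a small numerical sketch of the power received at each element. The sketch assumes unit-amplitude sources rather than the 13 dB sources of Fig. 9, half-wavelength element spacing, isotropic elements, and no receiver noise. The function name `element_powers` is hypothetical.

```python
import cmath
import math

# Sketch only: unit amplitudes, half-wavelength spacing, and isotropic
# elements are assumptions made for illustration.

def element_powers(K, angles_deg, phase_deg, spacing_wavelengths=0.5):
    """Power at each of K elements for two fully coherent plane waves.

    phase_deg is the phase shift of the second source relative to the
    first, referenced to the array midpoint.
    """
    powers = []
    for k in range(K):
        # Element location relative to the array midpoint, in wavelengths.
        x_k = spacing_wavelengths * (k - (K - 1) / 2)
        field = 0j
        for i, theta in enumerate(angles_deg):
            omega = 2 * math.pi * math.sin(math.radians(theta))
            shift = math.radians(phase_deg) if i == 1 else 0.0
            field += cmath.exp(1j * (omega * x_k + shift))
        powers.append(abs(field) ** 2)
    return powers


angles = [-7.6, -4.0]
for phase in (0.0, 90.0, 180.0):
    p = element_powers(16, angles, phase)
    print(phase, round(min(p), 3), round(max(p), 3))
```

Two noncoherent unit-power sources would give a power of 2 at every element. For the coherent pair the power varies across the aperture and depends strongly on the relative phase, and at 180° the elements near the array midpoint are close to a null.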

Fig. 10. Track estimate distortion resulting from adaptive Σ and Δ tracking beams for coherent interference in the main beam region, same case as Fig. 9. (Axes: track angle estimate versus true target angle in beamwidths; curves are shown for 0°, 90°, and 180° phase.)

B. All-Digital Tracking System Concept

The separate estimation of interference source data (total number, power levels, location angles, coherence) and its utilization to improve the output SNR of desired signal detections is a mode of system operation that has been addressed in the literature a number of times for various applications [7], [8], [17], [18]. In this section, we briefly review such a system wherein the estimated data is used to drive a fully adaptive tracking processor [1]. The concept is illustrated in Fig. 11. Starting on the left side, the system continuously computes/updates a sample covariance matrix R. Of particular significance is the fact that R may be dimensioned either equal to or less than the total number of array elements, i.e., the model order of the estimate is selectable per subaperture averaging option choice. Off-line processing on R is then conducted at periodic intervals to estimate the locations and relative power levels of interference sources via the most appropriate spectral estimation algorithms. The central processor unit (CPU) then applies these data to the computation of optimized adaptive spatial filter weights for the right side of Fig. 11. Separation of source estimation from adaptive filter weight computation can be done accurately only in an all-digital processing system, but it permits the following benefits:

1) estimation of coherent interference source locations for deliberate adaptive null filter placement;
2) remembering slowly changing or time-gated sources;
3) anticipating sources from a priori data inputs;
4) flexibility in time-domain control of the filtering to counter interference time strategies;
5) tracking/cataloging/ranking sources;
6) efficient assignment of available DOF;
7) compatibility with fast-response adaptive algorithms, i.e., parallel algorithm processing.

Fig. 11. All-digital adaptive array tracking system concept. (Block labels: updated sample covariance matrix, spectral estimation algorithms, CPU, control data, fast-memory storage, beamformer, search-track algorithms.)

The right side of Fig. 11 indicates a fast-memory storage capability which is intended to permit selected time delays of the snapshots for feeding into the filter weights. The idea is to synchronize selected snapshots with their filter weight updates if possible. Finally, the filtered signal output residue is fed into a beamformer which is weighted to produce the desired search and monopulse track beams for target detection and tracking. The algorithms of Davis et al. [24] may be applied for estimating the target signal angle of arrival, based upon the outputs of adaptively distorted sum and difference beams. Reference [1] discusses the equivalence of such beams to the Fig. 11 concept.

As an example, let us apply this concept to the coherent sources case utilized for Figs. 9 and 10 wherein we would utilize a 16-element linear array feeding into our all-digital processor. An appropriate estimation algorithm is that of forward-backward subaperture spatial smoothing [7], [28], [29], [30] combined with eigenanalysis. The rudiments of this algorithm are described in [1], and the results are plotted in Fig. 12 in comparison with a scanned main beam output.

Fig. 12. Comparison of main beam scan versus spatial smoothing processing for coherent source case of Fig. 9. PEGS eigenanalysis, 256 snapshots per trial. (The spatial smoothing plus eigenanalysis curve covers all three phases; axes: power in decibels versus spatial angle in degrees.)

From this source estimation data, we can construct an equivalent covariance matrix dimensioned for the full aperture per the procedure given in Appendix I, and compute its inverse for obtaining the adaptive filtering. If we define the constructed covariance matrix as M, then its inverse may be viewed as a matrix set of adaptive "beamformer" filter weights to give us the filtered output nth snapshot vector $E_f(n)$,

$$E_f^{t}(n) = E^{t}(n)\, M^{-1}. \qquad (11)$$

Conventional beam weighting S* can then be applied to the filtered output residue to obtain the final output for the nth snapshot,

$$Y_o(n) = E_f^{t}(n)\, S^{*} = E^{t}(n)\, M^{-1} S^{*} \qquad (12)$$

or

$$Y_o(n) = E^{t}(n)\, W_o, \qquad W_o = M^{-1} S^{*}$$

where $W_o$ is the familiar optimum Wiener filter weight. Note that the constructed covariance matrix M permits options such as adding synthetic sources or changing power levels. Furthermore, since it is always Toeplitz, solutions may be simplified somewhat. For the current example, the computed adaptive characteristics would be very similar to those plotted in Figs. 9 and 10 for the 90° phase angle.
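The idea of filtering with a constructed covariance matrix can be sketched numerically. Note the assumptions: the sketch uses the common Hermitian convention, with weights w = M⁻¹s and output wᴴE, which may differ in conjugation from the transpose notation of (12). It builds M for a single hypothetical source at 20° with an 8-element, half-wavelength-spaced array, and the names `solve`, `steering`, and `response` are illustrative only.

```python
import cmath
import math

# Sketch only: Hermitian convention (w = M^-1 s, output w^H E) and all
# numerical values are assumptions, not taken from the paper.

def solve(M, b):
    """Solve M x = b for a small complex system by Gauss-Jordan elimination."""
    n = len(b)
    aug = [list(map(complex, row)) + [complex(rhs)] for row, rhs in zip(M, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def steering(K, angle_deg, d_over_lambda=0.5):
    """Direction vector of a plane wave for a uniformly spaced linear array."""
    omega = 2 * math.pi * math.sin(math.radians(angle_deg))
    return [cmath.exp(1j * omega * d_over_lambda * (k - (K - 1) / 2))
            for k in range(K)]


def response(weights, direction):
    """Magnitude of the weighted array output w^H v for a unit plane wave."""
    return abs(sum(w.conjugate() * v for w, v in zip(weights, direction)))


K = 8
s = steering(K, 0.0)    # main beam steered to broadside
v = steering(K, 20.0)   # estimated interference direction
source_power, noise_power = 1000.0, 1.0

# Constructed covariance: one source plus receiver noise on the diagonal.
M = [[source_power * v[i] * v[k].conjugate() + (noise_power if i == k else 0.0)
      for k in range(K)] for i in range(K)]
w = solve(M, s)

print("quiescent response at source:", response(s, v))
print("filtered response at source: ", response(w, v))
print("response at main beam:       ", response(w, s))
```

Because M is assembled from estimated source parameters rather than from raw snapshots, entries such as `source_power` can be edited freely, which is the flexibility (synthetic sources, changed power levels) that the text describes.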

V. CONCLUSION

Two conceptual application areas have been presented for using spectral estimation techniques: partially adaptive low-sidelobe arrays, and fully adaptive tracking arrays. In both cases, improved spectral estimation techniques are used separately to acquire information about the interference environment which is beyond that ordinarily available in a conventional adaptive array. Examples discussed included "superresolution" effects, relative power level determination, estimation of coherent sources, and the tracking/cataloging/ranking of sources.

For the partially adaptive area, the information was utilized for efficient assignment of a limited number of DOF in a beamspace constrained adaptive system in order to obtain the following benefits (as compared to a fully adaptive array): retention of low sidelobes plus a stable main beam; considerably faster adaptive response; reduction in overall cost; and greater flexibility. On the negative side of the coin, we incur the risk of possible inferior cancellation performance if the interference source situation is not adequately covered by the assigned DOF.

For the fully adaptive tracking array area, the information is utilized in an all-digital processing system to obtain the benefits of stable nulling of coherent interference sources in the main beam region, efficient assignment of the available DOF, and a far greater flexibility in the time-domain control of adaptive filtering strategy.

APPENDIX

SNAPSHOT SIGNAL MODEL

Consider a simple linear array of K elements. The received signal samples are correlated in both space and time, giving rise to a two-dimensional data problem, but we convert this to spatial domain only by assuming that narrow-band filtering precedes our spatial domain processing. Bandwidth can be handled when necessary via a spectral line approach [13] or tapped delay lines at each element [20], but we did not consider such extra complication essential to the basic purposes of this analysis. Thus, the postulated signal environment on any given observation consists of I narrow-band plane waves arriving from distinct directions $\theta_i$. The RF phase at the kth antenna element as a result of the ith source would be the product $\omega_i x_k$, where $x_k$ is the location of the element phase center with respect to the midpoint of the array in wavelengths, and $\omega_i$ is defined as

$$\omega_i = 2\pi \sin\theta_i. \qquad (13)$$

This notation is deliberately chosen to have the spatial domain dual of sampling in the time domain, so that the reader may readily relate to the more familiar spectral analysis variables. $\sin\theta_i$ is the dual of a sinusoid frequency $f_i$, and the $x_k$ locations are the dual of time sampling instants $t_k$. Note that if our elements are equally spaced by a distance d, then $x_k$ may be written

$$x_k = \frac{d}{\lambda}\left(k - \frac{K+1}{2}\right) \qquad (14)$$

where $\lambda$ is the common RF wavelength. The ratio $d/\lambda$ becomes the dual of the sampling time T with the cut-off frequency equal to the reciprocal. The complex amplitude of the ith source at the array midpoint phase center is $p_i$, such that we can now express the nth time-sampled signal at the kth element as

$$E_k(n) = \eta_k(n) + \sum_{i=1}^{I} p_i(n)\, g_k(\theta_i) \exp(j\omega_i x_k) \qquad (15)$$

where $g_k(\theta_i)$ is the element pattern response in the direction $\theta_i$, and $\eta_k(n)$ is the nth sample from the kth element independent Gaussian receiver noise. (The receiver noise component is assumed to be a random process with respect to both the time index n and the element index k.) Equation (15) permits us to construct a convenient column vector of observed data in the form

$$E(n) = V p(n) + \eta(n) \qquad (16)$$

where V is a K × I matrix containing a column vector for each of the I source directions; i.e.,

$$V = [\, v_1 \;\; v_2 \;\; \cdots \;\; v_I \,], \qquad [v_i]_k = g_k(\theta_i) \exp(j\omega_i x_k) \quad \text{for } k = 1, \ldots, K. \qquad (17)$$

Note that (16) separates out the basic variables of source direction in the direction matrix V, source baseband signal in the column vector p(n), and element receiver channel noise in the column vector $\eta(n)$. The vector E(n) is defined as the nth snapshot, i.e., a simultaneous signal sampling across all K array elements at the nth time instant. These snapshots would nominally occur at the Nyquist sampling rate corresponding to our receiver bandwidth, so that a radar-oriented person may view them as range bin time samplings. However, for source estimation purposes, they need not necessarily be chosen from contiguous range bins and, in fact, for most applications it would be highly desirable to selectively time gate the snapshots used for source estimation. For this simple analysis, let us postulate that the snapshots are gated at more or less arbitrary instants of time.

Over typical processing intervals, the directions of arrival will not change significantly, so that V is a slowly changing matrix. In contrast, the signals $p_i(n)$ will generally vary rapidly with time, often unpredictably, such that we must work with their statistical descriptions. It is assumed that the signals are uncorrelated with receiver noise. Proceeding then from (16), we can obtain the covariance matrix R via application of the expected value operator $\mathcal{E}$, or ensemble average,

$$R = \mathcal{E}\left[ E(n)\, E^{*t}(n) \right] \qquad (18)$$

$$R = V P V^{*t} + N \qquad (19)$$

where $N = \mathcal{E}[\eta(n)\,\eta(n)^{*t}]$, $P = \mathcal{E}[p(n)\,p(n)^{*t}]$, the asterisk is the complex conjugate, and t is the transpose. N is a simple diagonal matrix consisting of the receiver channel noise power levels. The diagonal elements of P represent the ensemble average power levels of the various signal sources, and off-diagonal elements can be nonzero if any correlation exists between the sources. Note that correlated far-field signals can easily arise if significant specular reflection or diffraction multipath is present.
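The covariance model of this appendix can be sketched in a few lines. The sketch assumes isotropic elements (element pattern response equal to 1), half-wavelength spacing, and equal noise power in every channel. The function names `covariance` and `is_toeplitz` and the example source parameters are hypothetical. It also checks a property used in the main text: with uncorrelated sources on a uniformly spaced array the matrix is Toeplitz, whereas fully coherent sources generally break that structure.

```python
import cmath
import math

# Sketch only: isotropic elements, half-wavelength spacing, and equal
# channel noise are assumptions made for illustration.

def covariance(K, angles_deg, P, noise_power, d_over_lambda=0.5):
    """Build R = V P V^H + N for a uniformly spaced K-element linear array.

    angles_deg -- source directions in degrees
    P          -- I x I source covariance (nested lists); nonzero
                  off-diagonal terms model correlated sources
    """
    I = len(angles_deg)
    x = [d_over_lambda * (k - (K - 1) / 2) for k in range(K)]
    omega = [2 * math.pi * math.sin(math.radians(a)) for a in angles_deg]
    V = [[cmath.exp(1j * omega[i] * x[k]) for i in range(I)] for k in range(K)]
    R = [[0j] * K for _ in range(K)]
    for a in range(K):
        for b in range(K):
            acc = sum(V[a][i] * P[i][m] * V[b][m].conjugate()
                      for i in range(I) for m in range(I))
            R[a][b] = acc + (noise_power if a == b else 0.0)
    return R


def is_toeplitz(R, tol=1e-9):
    """True if every entry equals its upper-left diagonal neighbor."""
    n = len(R)
    return all(abs(R[a][b] - R[a - 1][b - 1]) < tol
               for a in range(1, n) for b in range(1, n))


angles = [-7.6, -4.0]
R_uncorrelated = covariance(8, angles, [[1.0, 0.0], [0.0, 1.0]], 0.1)
R_coherent = covariance(8, angles, [[1.0, 1.0], [1.0, 1.0]], 0.1)

print(is_toeplitz(R_uncorrelated), is_toeplitz(R_coherent))
```

The diagonal of `R_coherent` is the per-element power, so its variation along the aperture is another view of the statement that coherent signals are not stationary in space.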

REFERENCES

[1] W. F. Gabriel, "Using spectral estimation techniques in adaptive processing antenna systems," Naval Res. Lab. Rep. 8920, Oct. 1985.
[2] D. G. Childers, Ed., Modern Spectrum Analysis. New York: IEEE Press, 1978.
[3] W. F. Gabriel, "Spectral analysis and adaptive array superresolution techniques," Proc. IEEE, vol. 68, pp. 654-666, June 1980.
[4] Special Issue on Spectral Estimation, Proc. IEEE, vol. 70, Sept. 1982.
[5] S. Y. Kung, H. J. Whitehouse, and T. Kailath, Eds., VLSI and Modern Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1985.
[6] R. Schmidt, "Multiple emitter location and signal parameter estimation," in Proc. RADC Spectrum Estimation Workshop, RADC-TR-79-63, Rome Air Development Center, Rome, NY, Oct. 1979, p. 243.
[7] J. E. Evans, J. R. Johnson, and D. F. Sun, "Application of advanced signal processing techniques to angle of arrival estimation in ATC navigation and surveillance systems," MIT Lincoln Lab. Tech. Rep. 582, (FAA-RD-82-42), June 1982.
[8] A. J. Barabell et al., "Performance comparison of superresolution array processing algorithms," MIT Lincoln Lab. Rep. TST-72, May 1984.
[9] S. P. Applebaum, "Adaptive arrays," IEEE Trans. Antennas Propagat., vol. AP-24, pp. 585-598, Sept. 1976.
[10] D. J. Chapman, "Partial adaptivity for the large array," IEEE Trans. Antennas Propagat., vol. AP-24, pp. 685-696, Sept. 1976.
[11] R. A. Monzingo and T. W. Miller, Introduction to Adaptive Arrays. New York: Wiley, 1980.
[12] I. S. Reed, J. D. Mallett, and L. E. Brennan, "Rapid convergence rate in adaptive arrays," IEEE Trans. Aerosp. Electron. Syst., vol. AES-10, pp. 853-863, Nov. 1974.
[13] W. F. Gabriel, "Adaptive arrays-An introduction," Proc. IEEE, vol. 64, pp. 239-272, Feb. 1976.
[14] S. P. Applebaum and D. J. Chapman, "Adaptive arrays with main beam constraints," IEEE Trans. Antennas Propagat., vol. AP-24, pp. 650-662, Sept. 1976.
[15] J. Butler, "Multiple beam antennas," Sanders Assoc. Internal Memo RF 3849, Jan. 1960.
[16] J. T. Mayhan, "Adaptive nulling with multiple-beam antennas," IEEE Trans. Antennas Propagat., vol. AP-26, pp. 267-273, Mar. 1978.
[17] R. N. Adams, L. L. Horowitz, and K. D. Senne, "Adaptive main-beam nulling for narrow-beam antenna arrays," IEEE Trans. Aerosp. Electron. Syst., vol. AES-16, pp. 509-516, Jul. 1980.
[18] E. C. DuFort, "An adaptive low-angle tracking system," IEEE Trans. Antennas Propagat., vol. AP-29, pp. 766-772, Sept. 1981.
[19] N. L. Owsley, "Constrained adaption," in Array Processing Applications to Radar. New York: Academic, 1980.
[20] L. E. Brennan, J. D. Mallett, and I. S. Reed, "Adaptive arrays in airborne MTI radar," IEEE Trans. Antennas Propagat., vol. AP-24, pp. 607-615, Sept. 1976.
[21] B. M. Leiner, "An analysis and comparison of energy direction finding systems," IEEE Trans. Aerosp. Electron. Syst., vol. AES-15, pp. 861-873, Nov. 1979.
[22] J. P. Shelton, "Focusing characteristics of symmetrically configured bootlace lenses," IEEE Trans. Antennas Propagat., vol. AP-26, pp. 513-518, July 1978.
[23] W. D. White, "Low-angle radar tracking in the presence of multipath," IEEE Trans. Aerosp. Electron. Syst., vol. AES-10, pp. 835-853, Nov. 1974.
[24] R. C. Davis, L. E. Brennan, and I. S. Reed, "Angle estimation with adaptive arrays in external noise fields," IEEE Trans. Aerosp. Electron. Syst., vol. AES-12, pp. 179-186, Mar. 1976.
[25] W. D. White, "Angular spectra in radar applications," IEEE Trans. Aerosp. Electron. Syst., vol. AES-15, pp. 895-899, Nov. 1979.
[26] A. Cantoni and L. Godara, "Resolving the directions of sources in a correlated field incident on an array," J. Acoust. Soc. Am., vol. 64, pp. 1247-1255, 1980.
[27] B. Widrow et al., "Signal cancellation phenomena in adaptive antennas: Causes and cures," IEEE Trans. Antennas Propagat., vol. AP-30, pp. 469-478, May 1982.
[28] T. J. Shan and T. Kailath, "Adaptive beamforming for coherent signals and interference," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, pp. 527-536, Jun. 1985.
[29] A. H. Nuttall, "Spectral analysis of a univariate process with bad data points, via maximum entropy and linear predictive techniques," Naval Underwater Syst. Center, New London, CT, NUSC-TR-5303, Mar. 1976.
[30] L. Marple, "A new autoregressive spectrum analysis algorithm," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-28, pp. 441-454, Aug. 1980.

Implementation of Adaptive Array Algorithms ROBERT SCHREIBER

Abstract-Some new, efficient, and numerically stable algorithms for the recursive solution of matrix problems arising in optimal beamforming and direction finding are described and analyzed. The matrix problems considered are systems of linear equations and spectral decomposition. While recursive solution procedures based on the matrix inversion lemma may be unstable, ours are stable. Furthermore, these algorithms are extremely fast.

I. INTRODUCTION

In this paper we consider the computational procedures to be used in implementing some standard and some more recently proposed adaptive methods for direction finding and beamforming by sensor arrays. We discuss the computation of a minimum variance distortionless response (MVDR) beamformer and of several high-resolution methods (recently advocated by Bienvenue and Mermoz [1], Owsley [9], and Schmidt [10]) that are based on the spectral decomposition of the signal covariance matrix.

We are especially concerned with recursive implementation of these procedures. Whenever the signal is sampled, an estimate for the covariance matrix is updated and the computed solution (a weight vector) changes in response to this new information. We shall propose and analyze some new, efficient, numerically stable algorithms. The computational procedures we advocate take advantage of this on-line character. We find methods for updating the solutions that are much less expensive than procedures that do not make use of the previously computed solution.

For the MVDR method, some previous work has been done [8]. An update method based on the Sherman-Morrison-Woodbury formula (which is also known as the matrix inversion lemma) has been advocated. We show that this procedure can, in one common circumstance, be numerically unstable. We propose three new, stable methods here.

For the high-resolution methods, we illustrate the use of some efficient procedures for updating eigenvalue and singular value decompositions. We show how to take advantage of the existence of multiple eigenvalues of the signal covariance matrix to further reduce the work. We also show that complex arithmetic can largely be avoided.

Manuscript received June 26, 1984; revised December 18, 1985. This work was supported by the U.S. Office of Naval Research under Contract N00014-82-K-0703 and by the U.S. Army Research Office under Contract DAAG 29-83-K-0124.
The author is with the Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY 12180. He is also a Consultant to Saxpy Computer Corporation, Sunnyvale, CA 94086. IEEE Log Number 8609623.

A. Notation

Let $\mathbb{C}^n$ and $\mathbb{C}^{m \times n}$ denote the spaces of complex $n$-vectors and $m \times n$ matrices. We use upper case italic letters for matrices, lower case italic letters for vectors. The $m \times n$ matrix $A$ has, by convention, the columns $[a_1, a_2, \ldots, a_n]$ and the elements $[a_{ij}]$; the vector $x$ has elements $(\xi_1, \ldots, \xi_n)^T$; for $A \in \mathbb{C}^{m \times n}$, $A^T$ denotes the transpose and $A^H$ the conjugate transpose of $A$. If $A \in \mathbb{C}^{n \times n}$ is diagonal ($a_{ij} = 0$ for $i \ne j$), we denote $A$ by $\mathrm{diag}(a_{11}, \ldots, a_{nn})$. We denote the $r \times r$ identity matrix by $I_r$. For $A \in \mathbb{C}^{m \times n}$ the Frobenius norm of $A$ is given by

$$\|A\|_F = \Big( \sum_{i,j} |a_{ij}|^2 \Big)^{1/2}.$$

In giving operation counts for algorithms, we use the term operation to mean one complex multiplication and one complex addition. One operation costs about as much as four real multiplications and four real additions. Note that computing $x + ay$ with real $a$ costs one-half an operation.

II. AN ON-LINE ALGORITHM FOR ADAPTIVE BEAMFORMING

Let $x \in \mathbb{C}^n$ be a narrow-band signal received by an array of $n$ elements. Let its covariance matrix be denoted $R$:

$$R = E\{x x^H\} \qquad (1)$$

where $E\{\ \}$ denotes expected value. $R$ is Hermitian and, if any noise is present, positive definite. Thus, $R$ has a Cholesky factorization

$$R = L L^H \qquad (2)$$

where $L$ is lower triangular and has positive, real diagonal elements. The factorization can be computed in $n^3/6$ operations [12]. With the help of the Cholesky factorization, we can compute $R^{-1}d$ with $n^2$ operations by solving two triangular systems: $Lv = d$ and $L^H w = v$. Thus, $w = L^{-H}v = (LL^H)^{-1}d = R^{-1}d$.

Reprinted from IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. ASSP-34, No. 5, pp. 1038-1045, October 1986.

Consider the adaptive control of an $n$-element array. The minimum variance distortionless response beamformer determines the output of the array by

$$g(d) = w^H x \qquad (3)$$

where $g$ is an estimate of the signal arriving from some given bearing, $d$ is a steering vector for the given array and bearing, $x$ is the signal vector, and

$$w = R^{-1} d\, p(d), \qquad (4)$$

where $p(d)$ is an estimate of the average power arriving from the given bearing,

$$p(d) = (d^H R^{-1} d)^{-1}. \qquad (5)$$

In practice, we have several bearing angles and corresponding steering vectors $d_i$, $i = 1, 2, \ldots, m$. Let $D$ be the $n \times m$ matrix of these vectors, $D = [d_1, \ldots, d_m]$. The principal computational problem is, then, to find the $n \times m$ solution matrix

$$W = R^{-1} D. \qquad (6)$$

In the on-line beamforming problem, $D$ remains fixed, but $R$ is often changed to incorporate a new sample $x$ of the signal:

$$*R = \mu R + (1 - \mu)\, x x^H \qquad (7)$$

where $\mu \in (0, 1)$. We refer to such a change as a rank-one update to $R$ (since the rank of $xx^H$ is one) although, if $\mu \ne 1$, the change to $R$ is, in general, of full rank. One must find the corresponding updated solution

$$*W = {*R}^{-1} D. \qquad (8)$$
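The nonrecursive computation in (2)-(6) can be sketched as follows. This is a minimal illustration and not the author's code: the function name `mvdr_weights`, the sample covariance, and the steering vectors are all hypothetical, and NumPy's general solver is used for the two triangular systems for brevity (it does not exploit the triangular structure, so it does not attain the $n^2$ operation count).

```python
import numpy as np

def mvdr_weights(R, D):
    """Return MVDR weights (4) and power estimates (5) for each column d of D."""
    L = np.linalg.cholesky(R)                 # R = L L^H, eq. (2)
    V = np.linalg.solve(L, D)                 # L V = D
    RinvD = np.linalg.solve(L.conj().T, V)    # L^H W = V, so W = R^{-1} D, eq. (6)
    # p(d) = (d^H R^{-1} d)^{-1}, real because R is Hermitian positive definite
    p = 1.0 / np.real(np.sum(D.conj() * RinvD, axis=0))
    return RinvD * p, p                       # w = R^{-1} d p(d), eq. (4)

# Hypothetical test data: 50 snapshots from a 6-element array, 3 bearings.
rng = np.random.default_rng(0)
n = 6
X = rng.standard_normal((50, n)) + 1j * rng.standard_normal((50, n))
R = X.conj().T @ X / 50 + 0.1 * np.eye(n)     # Hermitian positive definite
D = np.exp(1j * np.outer(np.arange(n), [0.0, 0.5, 1.0]))
W, p = mvdr_weights(R, D)
```

With this scaling each weight vector satisfies the distortionless constraint $w^H d = 1$, which is a convenient sanity check.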

The obvious method [8] is based on the Sherman-Morrison-Woodbury formula for ${*R}^{-1}$:

$$*R^{-1} = \mu^{-1} R^{-1} + \beta z z^H \qquad (9)$$

where

$$\beta = \frac{(\mu - 1)\mu^{-1}}{\mu + (1 - \mu)\, x^H z} \qquad (10)$$

and

$$z = R^{-1} x. \qquad (11)$$

Thus, if $d$ is a typical column of $D$, and $w$ is the corresponding column of $W$, then by (6), (8), and (9),

$$*w = \mu^{-1} w + \beta z z^H d. \qquad (12)$$

To make use of the method (12) one requires that $R^{-1}x$ be computed. This can be done with the aid of the Cholesky factorization of $R$. Moreover, Gill, Golub, Murray, and Saunders have suggested a method for updating the Cholesky factorization after the rank-one change (7) that uses about $(3/2)n^2 + O(n)$ operations [5]. Fast computation of this algorithm by parallel processor arrays was considered by Schreiber and Tang [11]. This suggests the following algorithm.

1) (Initialize.) Let $R = I$, $L = I$, and $W = D$. Thus, $R = LL^H$, and $W = R^{-1}D$.
2) Every time an update (7) is made to $R$,
   a) solve for $z = R^{-1}x$ by solving the two triangular systems $Ly = x$ and $L^H z = y$, and compute $\beta$ from (10);
   b) update the Cholesky factor $L$ of $R$; and
   c) for every column $w$ of $W$ and corresponding column $d$ of $D$,
      i) compute $\sigma := \beta z^H d$;
      ii) compute $w := \mu^{-1} w + \sigma z$.

The cost of step 2a) is $n^2$ operations; of step 2b) is $(3/2)n^2$ operations; of step 2c-i) is $nm$ operations; of step 2c-ii) is about $(3/2)nm$ operations. The alternative algorithm, in which the updated Cholesky factorization of $*R$ is used to solve for ${*R}^{-1}D$, costs $n^2 m$ operations for solving triangular systems and $(3/2)n^2$ for updating the factorization.

Unfortunately, the method is unstable. If $0 < \mu < 1$, then $\mu^{-1} > 1$. Any error in $W$ is amplified by the factor $\mu^{-1}$ every time the update (12) is performed. These errors eventually render the computed solutions $W$ uselessly inaccurate. Thus, correct solutions must occasionally be calculated directly from $D$ and the Cholesky factorization of $R$ according to (8).

Fortunately, in some applications one may take $\mu \approx 1$ (so that the estimate of $R$ computed using (7) is a better approximation to the true covariance matrix). Thus, $\mu^{-1}$ is only slightly larger than 1, and the "unstable" update (12) can be used for quite some time. In fact, the choice $\mu \approx 1$ is appropriate for relatively stationary signal environments. And in this case, it may also be allowable to avoid updating the weights with every new sample. But, in a rapidly changing environment, one would take $\mu$ substantially smaller than 1, to allow $R$ to change fast enough. In that case, the update (12) would be useless, and a stable method would be essential.

Are there equally efficient, stable methods? By (11) and (6),

$$z^H d = x^H w.$$

Thus, (12) is equivalent to the formula

$$*w = \mu^{-1} w + \beta z x^H w. \qquad (13)$$
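One step of the update (12) can be sketched as below. This is an illustrative sketch under stated assumptions rather than the algorithm as published: `smw_update` is a hypothetical name, and $z$ is obtained with a general dense solve on $R$ instead of the two triangular solves and the $(3/2)n^2$ Cholesky update described above, so the operation counts in the text do not apply to it.

```python
import numpy as np

def smw_update(R, W, D, x, mu):
    """Apply the rank-one change (7) to R and the update (12) to every column of W."""
    z = np.linalg.solve(R, x)                            # z = R^{-1} x, eq. (11)
    xz = np.real(x.conj() @ z)                           # x^H z is real for Hermitian R
    beta = (mu - 1.0) / (mu * (mu + (1.0 - mu) * xz))    # eq. (10)
    sigma = beta * (z.conj() @ D)                        # sigma = beta z^H d, per column
    W_new = W / mu + np.outer(z, sigma)                  # eq. (12)
    R_new = mu * R + (1.0 - mu) * np.outer(x, x.conj())  # eq. (7)
    return R_new, W_new

# Start from the initialization of step 1): R = I and W = D.
rng = np.random.default_rng(0)
n, mu = 6, 0.9
R0 = np.eye(n, dtype=complex)
D = np.ones((n, 1), dtype=complex)
W0 = D.copy()
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
R1, W1 = smw_update(R0, W0, D, x, mu)
```

Starting from an exact solution, one step reproduces ${*R}\,{*W} = D$ to rounding error; the instability discussed in the text only shows up when the division by $\mu$ is repeated over many steps.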

Notice that $w$ now appears twice. Perhaps the second use of $w$ has stabilized the method? It has.

Theorem: The residual does not change when formula (13) is used: even when $w$ only approximately satisfies $Rw = d$, the identity $d - Rw = d - {*R}\,{*w}$ holds.

Proof: It suffices to show that ${*R}\,{*w} = Rw$. By direct computation, using $Rz = x$,

$${*R}\,{*w} = (\mu R + (1 - \mu)\, x x^H)(\mu^{-1} w + \beta z x^H w)$$
$$= Rw + x x^H w \left[ \beta(\mu + (1 - \mu)\, x^H z) + (1 - \mu)\mu^{-1} \right]$$
$$= Rw$$

since $\beta(\mu + (1 - \mu)\, x^H z) = (\mu - 1)\mu^{-1}$. ∎

This shows that the error is not amplified by (13). A similar analysis can be done for the method (12). It shows that the residual can increase by a scalar multiple of $x$, whose length is proportional to $(x^H w - z^H d)$.

An even more stable procedure can be devised. From (12) it follows that

$$*w - w \in \mathrm{span}\{w, z\}. \qquad (14)$$
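The theorem can be checked numerically. The sketch below is only an illustration: the test matrix, the seed, and the size of the perturbation added to $w$ are arbitrary assumptions. It applies (12) and (13) to the same inexact $w$ and compares residuals.

```python
import numpy as np

rng = np.random.default_rng(1)
n, mu = 6, 0.8
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
R = A @ A.conj().T + np.eye(n)                 # Hermitian positive definite
d = np.ones(n, dtype=complex)
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
w = np.linalg.solve(R, d) + 1e-3 * rng.standard_normal(n)   # deliberately inexact

z = np.linalg.solve(R, x)                      # eq. (11)
beta = (mu - 1.0) / (mu * (mu + (1.0 - mu) * np.real(x.conj() @ z)))  # eq. (10)
R_new = mu * R + (1.0 - mu) * np.outer(x, x.conj())                   # eq. (7)

w12 = w / mu + beta * z * (z.conj() @ d)       # update (12)
w13 = w / mu + beta * z * (x.conj() @ w)       # update (13)

res_old = d - R @ w
res12 = d - R_new @ w12
res13 = d - R_new @ w13
change12 = res12 - res_old                     # expected to be a multiple of x
coef = (x.conj() @ change12) / (x.conj() @ x)
```

Here `res13` agrees with `res_old` to rounding error, while `change12` is a nonzero multiple of `x`, as the text states for method (12).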

Fig. 1. Computation of $\alpha_1$.

All update procedures seek in some way to find the linear combination of $w$ and $z$ that, when added to $w$, gives $*w$. Among the many possible methods are a group of conjugate direction procedures that are especially desirable in that they make almost no assumptions other than (14) and use the available data to choose the coefficients of $w$ and $z$ in the linear combination. In these methods, one chooses an orthogonal basis for the span of $w$ and $z$, then takes a step from $w$ in the direction of one of the basis vectors, going to the precise point in that direction closest to $*w$. Then another step, in the direction of the other basis vector, produces $*w$. The various possible algorithms differ in the choice of basis and the inner product used.

The following procedure is computationally convenient. Define $z = {*R}^{-1}x$. Note that $z$ is a scalar multiple of $R^{-1}x$. Choose $\gamma$ so that $z$ and $z_1 = w + \gamma z$ are $*R$-orthogonal. (Two vectors $u$ and $v$ are $*R$-orthogonal if $u^H\,{*R}\,v = 0$.) Starting from $w$, take a step $\alpha_1 z_1$ so that $w_1 = w + \alpha_1 z_1$ is as close as possible to $*w$ with respect to the $*R$-norm. (The square of the $*R$-norm of a vector $u$ is given by $u^H\,{*R}\,u$.) Then take a step $\alpha_2 z$ so that $w_2 = w_1 + \alpha_2 z$ is as close as possible to $*w$. Now $*w = w_2$ is the updated solution.

It is well known [6] that if $Rw = d$ exactly then, except for rounding errors, $w_2$ exactly equals $*w$. On the other hand, suppose that $w = R^{-1}d + e$, where $e$ is the current error in $w$. We can write $e = e_1 + e_2$, where $e_1 \in \mathrm{span}\{w, z\}$ and $e_2$ is $*R$-orthogonal to both $w$ and $z$. Then after application of the two conjugate direction steps as described above, we will obtain $*w = {*R}^{-1}d + e_2$. In other words, the component of the error that lies in the span of $w$ and $z$ will have been annihilated.

Computation of the step lengths in a conjugate direction method usually involves computing dot products. In this case, however, computation of $\alpha_1$ can be greatly simplified by some geometric insight (see Fig. 1). By (12), $*w - \mu^{-1}w$ is a scalar multiple of $z$. Moreover, $w_1 = (1 + \alpha_1)w + \alpha_1\gamma z$. Thus, $*w - w_1$ is a scalar multiple of $z$ if $*w - (1 + \alpha_1)w$ is. Thus, we should take $\alpha_1 = \mu^{-1} - 1$.

Of course, the computed solution $w$ does not exactly satisfy (12). So we should really compute $\alpha_1$ by requiring that $*w - w_1$ be $*R$-orthogonal to $z_1$. But we would need to compute a matrix-vector product to be able to do this without error. That would raise the cost of the method from $O(n)$ to $O(n^2)$. In fact, our procedure is equivalent to replacing the product $Rw$ by the vector $d$ in the inner product $z_1^H R w$. Let $r = Rw - d$. The error we make is therefore $z_1^H r$. Now $z_1$ is orthogonal to $x$. So the error will be rather small if $r$ is close to the span of $x$. In view of the fact that $r$ is a residual and $R$ is given by (1), it is likely that this is so. The method is as follows.

Algorithm (Conjugate Direction): Given the Cholesky factor $L$ of $R$, a new signal sample $x$, the current computed solution $W = R^{-1}D$, and $\mu$,
1) update the Cholesky factor; now $*R = {*L}\,{*L}^H$;
2) solve $({*L}\,{*L}^H) z = x$;
3) compute $z^H x$ (which is real); and
4) for every column $w$ of $W$ and corresponding column $d$ of $D$,
   a) compute $x^H w$;
   b) compute $z^H d$;
   c) compute $\gamma := -x^H w / z^H x$;
   d) compute $z_1 := w + \gamma z$;
   e) compute $\alpha_1 := \mu^{-1} - 1$;
   f) compute $w_1 := w + \alpha_1 z_1$;
   g) compute $\alpha_2 := (z^H d - x^H w) / z^H x$;
   h) compute $w_2 := w_1 + \alpha_2 z$;
   i) stop: $w_2$ is the computed solution to ${*R}\,{*w} = d$.
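Steps 2) to 4) of the conjugate direction algorithm can be sketched for all columns at once. The function name `conjugate_direction_update` and the test data are hypothetical, and the sketch assumes the updated matrix $*R$ is available explicitly and solves with it densely, in place of the updated Cholesky factor of step 1).

```python
import numpy as np

def conjugate_direction_update(R_new, W, D, x, mu):
    """Return the updated solution matrix for *R W = D, given the previous W."""
    z = np.linalg.solve(R_new, x)      # step 2): *R z = x
    zx = np.real(z.conj() @ x)         # step 3): z^H x is real
    xw = x.conj() @ W                  # a) x^H w for every column w
    zd = z.conj() @ D                  # b) z^H d for every column d
    gamma = -xw / zx                   # c)
    Z1 = W + np.outer(z, gamma)        # d) z_1 = w + gamma z
    alpha1 = 1.0 / mu - 1.0            # e)
    W1 = W + alpha1 * Z1               # f) w_1 = w + alpha_1 z_1
    alpha2 = (zd - xw) / zx            # g)
    return W1 + np.outer(z, alpha2)    # h) w_2 = w_1 + alpha_2 z

rng = np.random.default_rng(2)
n, m, mu = 6, 2, 0.9
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
R = A @ A.conj().T + np.eye(n)
D = rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))
W = np.linalg.solve(R, D)              # exact previous solution
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
R_new = mu * R + (1.0 - mu) * np.outer(x, x.conj())
W2 = conjugate_direction_update(R_new, W, D, x, mu)
```

When the previous `W` is exact, `W2` solves the updated system to rounding error, which matches the statement that $w_2$ equals $*w$ in exact arithmetic.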

Recall that computing $x + ay$ with complex $x$, $y$, and real $a$ costs one-half an operation. Thus, this conjugate direction procedure costs $(5/2)n^2 + (9/2)nm$ operations. For $m \gg n$ it is slightly less than twice as costly as the methods (12) and (13). In view of the experimental results given below, it does appear to be more accurate than (13) in some cases. But its superiority is not uniform: it depends on $\sigma^2$ and $\mu$.

A. Experimental Tests

We have verified our claims by an experiment. Vectors $x = (1, 2, 3, 4, 5, 6)^T + s$ were generated, where $s$ had random, independent, normally distributed components of mean zero and variance $\sigma^2$. We took $d = (1, 1, 1, 1, 1, 1)^T$ and initially $R = I$ and $w = d$. We then used the three methods discussed above for 100 updates. Let $w^{(1)}$ denote the solution vector given by the method (12), $w^{(2)}$ the solution given by the stable formula (13), and $w^{(3)}$ the solution given by the conjugate direction formulas. At each step we updated the Cholesky factorization of $R$ by computing

$$Q \begin{bmatrix} (1 - \mu)^{1/2} x^T \\ \mu^{1/2} L^T \end{bmatrix} = \begin{bmatrix} {*L}^T \\ 0 \end{bmatrix}$$

where $Q$ is the product of $n$ plane rotations. We give four error statistics below. The first is a measure of the accuracy of the updated Cholesky factor $L$,

$$E_R = \|R - LL^T\|_F / \|R\|_F.$$

A theoretical analysis of this procedure is given in Section II-B below. Also, for $j = 1, 2, 3$, we give a measure of the error in the updated solution $w^{(j)}$,

$$E_j = \|w^{(j)} - (LL^T)^{-1}d\| / \|(LL^T)^{-1}d\|.$$

We took $\mu = 0.8$, $0.9$, and $0.99$ and $\sigma^2 = 10^2$, $10^{-2}$, and $10^{-6}$. The results were essentially unchanged for $\sigma^2$ greater than $10^2$. In Table I we show the errors $E_R$, $E_1$, $E_2$, $E_3$ for each pair $(\mu, \sigma^2)$. The notation 0.123(-4) means $0.123 \times 10^{-4}$. All computations were done in single precision on a VAX.

TABLE I
RELATIVE ERRORS AFTER 100 UPDATES

sigma^2    Error   mu = 0.8     mu = 0.9     mu = 0.99
10^2       E_R     0.197(-6)    0.501(-6)    0.219(-5)
           E_1     0.408(+3)    0.172(-1)    0.144(-4)
           E_2     0.448(-5)    0.184(-5)    0.761(-5)
           E_3     0.840(-6)    0.252(-6)    0.412(-6)
10^-2      E_R     0.224(-6)    0.307(-6)    0.818(-6)
           E_1     0.284(+4)    0.365(-1)    0.240(-5)
           E_2     0.467(-5)    0.955(-5)    0.720(-5)
           E_3     0.850(-5)    0.129(-4)    0.702(-6)
10^-6      E_R     0.206(-6)    0.677(-6)    0.210(-5)
           E_1     0.990(+5)    0.709(+0)    0.113(-4)
           E_2     0.159(-2)    0.344(-3)    0.256(-4)
           E_3     0.402(-2)    0.872(-4)    0.244(-4)

Note that the conjugate direction method (method 3) is distinctly more accurate in those cases where high accuracy is useful: low signal-to-noise ratio, which tends to make $R$ well conditioned, and $\mu \approx 1$, so that $R$ is accurately estimated. Three such cases occur in the upper-right-hand corner of Table I. In these cases, the conjugate direction method is ten times more accurate than the stable update method that uses (13).

B. Another Stable Method

We now discuss a third stable updating method that differs in two ways from those already considered. It avoids explicitly forming $w$; and it can be viewed as an extension of the process for updating the Cholesky factor $L$, a process that we shall describe more fully in this section. From (2) and (5) we have that

$$(p(d))^{-1} = d^H R^{-1} d = v^H v \qquad (15)$$

where $v$ is the solution to the triangular linear system

$$Lv = d. \qquad (16)$$

And from (2), (4), and (3) we have that

$$g(d) = w^H x = d^H R^{-1} x\, p(d) = (v^H y)\, p(d) \qquad (17)$$

where $y$ is the solution to the triangular linear system

$$Ly = x. \qquad (18)$$

If we are willing to solve the system (18) at a cost of $n^2$ operations for every new signal $x$ (and if there are many bearings $d$, this is reasonable), then we may use (17) to compute $g(d)$. Thus, we no longer need the weight vector $w$, but rather the vector $v$ and the power estimate $p$.

We now give a stable method for updating $v$ and $p$ after the change (7). This new algorithm is especially convenient in that it can be incorporated into the process of updating the Cholesky factor $L$ of $R$. The systolic array devised by Schreiber and Tang [11] can be used to perform the necessary additional computations.

By (7), we seek the Cholesky factor $*L$ of

$$*R = \mu L L^H + y y^H$$

where $y \equiv (1 - \mu)^{1/2} x$. Let $Q$ be an orthogonal matrix such that

$$TQ \equiv [\,y,\ \mu^{1/2} L\,]\,Q = [\,0,\ {*L}\,] \qquad (19)$$

where $*L$ is lower triangular with positive real diagonal. It is easy to see that $Q$ can be obtained as the product of $n$ plane rotations. Now, clearly

$$*R = TT^H = TQQ^HT^H = {*L}\,{*L}^H$$

so $*L$ is the Cholesky factor of $*R$.

This method is very stable. If there is some error in $L$, for example, if $LL^H = R + E$ where $E$ is an error matrix, then by (7) and (19),

$${*L}\,{*L}^H = TQQ^HT^H = TT^H = yy^H + \mu LL^H = {*R} + \mu E.$$

Since $\mu < 1$, the error is reduced. In this sense, this method of updating the Cholesky factor is self-correcting. It was used in the experiments of the previous section, which show that it is very accurate. It can, therefore, be strongly recommended.

Now let $d$ be a given steering vector, let $p = p(d)$ be the corresponding power estimate, and let $v = v(d) = L^{-1}d$ be the corresponding solution to (16). Apply the rotations used in finding $*L$ to obtain

$$[\,0,\ \mu^{-1/2} v^H\,]\,Q = [\,\delta,\ {*v}^H\,]. \qquad (20)$$

Now it follows, by (19) and (20), that

$$d = Lv = [\,y,\ \mu^{1/2}L\,] \begin{bmatrix} 0 \\ \mu^{-1/2} v \end{bmatrix} = TQQ^H \begin{bmatrix} 0 \\ \mu^{-1/2} v \end{bmatrix} = [\,0,\ {*L}\,] \begin{bmatrix} \bar{\delta} \\ {*v} \end{bmatrix} = {*L}\,{*v},$$

so that $*v$ is the updated solution to (16). To show that (20) is both correct and stable, we assume that $Lv$ is not exactly equal to $d$, but that

$$Lv = d + r$$

where $r$ is a residual vector. But by (19) and (20), the same chain of equalities gives

$$d + r = Lv = {*L}\,{*v},$$

so that $*v$ satisfies (16) as well as did $v$. This is therefore a stable update method.

We now consider an efficient update formula for the power estimate $p$. From (20) it follows that

$$\mu^{-1} p^{-1} = [\,0,\ \mu^{-1/2} v^H\,] \begin{bmatrix} 0 \\ \mu^{-1/2} v \end{bmatrix} = |\delta|^2 + {*v}^H\,{*v},$$

so that ${*p}^{-1} = \mu^{-1}p^{-1} - |\delta|^2$. If, however, the stored value of $p^{-1}$ is in error by $\epsilon$, this formula yields ${*v}^H\,{*v} + \mu^{-1}\epsilon$. Since $\mu^{-1} > 1$, this approach is unstable, and ${*p}^{-1}$ should be computed as ${*v}^H\,{*v}$. This procedure requires $(3/2)n^2$ operations for the Cholesky update, and $O(nm)$ operations each for updating $*v$ using (20) and for recomputing $*p$.

III. METHODS FOR UPDATING THE SPECTRAL DECOMPOSITION

A number of modern, high-resolution methods make use of the spectral decomposition of $R$ [1], [9], [10]:

$$R = M \Lambda M^H \qquad (21)$$

where $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ is the matrix of eigenvalues of $R$, ordered so that

$$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n.$$

Here $M = [m_1, \ldots, m_n]$ where $m_i$ is a normalized eigenvector corresponding to $\lambda_i$. Note that $M$ is unitary ($M^H = M^{-1}$). We shall be concerned, therefore, with updating the decomposition (21) after a rank-one change (7) to $R$.

In practice, one uses an estimate of $R$ that is the product $X^H X$, where $X$ is a matrix whose rows are (weighted) samples of $x$. Therefore, the eigenvalues and eigenvectors of the estimate are the squared singular values and right singular vectors of $X$. Periodically, a new observation of $x$ is made and is appended to $X$ as a new row. Thus, the problem of updating the spectral decomposition (21) is mathematically equivalent to that of updating the singular values and right singular vectors of $X$ when a row is appended.

Bunch and Nielsen recommend that to update the singular value decomposition of $X$ when a row is added, one should update the corresponding spectral decomposition of $R$ after a rank-one change [2]. This can obscure the small singular values of $X$: a singular value of size

a stable algorithm for finding the eigenvalues and eigenvectors of such a matrix in $O(s^2)$ operations.

All the observations and methods proposed here have analogs for the SVD. In particular, Businger [4] has given a method that takes quadratic time for updating the SVD when a row is appended to the matrix. But the methods advocated in this section, including the eigenvalue updating algorithm, are all amenable to computation by systolic arrays, and the time for an update can be reduced in this way to $O(n)$. Part of Businger's method (the QR iteration for a bidiagonal matrix) is not.

Observe that, by (7) and (21),

$$*R = M[\mu\Lambda + (1 - \mu)\, z z^H]M^H,$$

where $Mz = x$. Let $S \equiv \mu\Lambda + (1 - \mu)\, z z^H$ have the spectral decomposition

$$S = T\,{*\Lambda}\,T^H \qquad (22)$$

and let $*M \equiv MT$; then

$$*R = {*M}\,{*\Lambda}\,{*M}^H$$

is the desired spectral decomposition. Thus, the spectral decomposition (22) of the sum of a diagonal matrix and a matrix of rank one is needed.

We now show that $S$ can be made real. Let $w \in \mathbb{R}^n$ be given by $w = (\omega_1, \ldots, \omega_n)^T$, where $\omega_j = |\zeta_j|$ and $\zeta_j$ is the $j$th element of $z$. Then

$$z = Dw,$$

where $D = \mathrm{diag}(\zeta_1/\omega_1, \ldots, \zeta_n/\omega_n)$. Note that $DD^H = I_n$, i.e., $D$ is unitary. Now

$$w = D^H D w = D^H z = D^H M^H x = (MD)^H x.$$

The columns of $MD$ are normalized eigenvectors of $R$. If we use them in place of the columns of $M$, we have that

$$S = \mu\Lambda + (1 - \mu)\, w w^T$$

which is real and symmetric. So we will assume in the following that $M$ has been replaced by $MD$.

When $R$ has repeated eigenvalues, there is more that can be done [3]. And if $x$ is formed from $s$ directional signals and spatially homogeneous, additive noise of power $\sigma^2$, then the eigenvalues of $R$ satisfy

$$\lambda_s > \lambda_{s+1} = \lambda_{s+2} = \cdots = \lambda_n = \sigma^2. \qquad (23)$$

(The space $N \equiv \mathrm{span}\{m_{s+1}, \ldots, m_n\}$ is called the noise subspace. While $R$ determines $N$, any orthonormal basis for $N$ can serve as the last $n - s$ columns of $M$.) To take advantage of the repeated eigenvalue, let $\Lambda_1 = \mathrm{diag}(\lambda_1, \ldots, \lambda_{s+1})$ and $\Lambda_2 = \mathrm{diag}(\lambda_{s+2}, \ldots, \lambda_n)$. It is possible to choose the last $n - s$ columns of $M$ so as to make $\zeta_{s+2} = \cdots = \zeta_n = 0$. If this is done, then

$$S = \begin{bmatrix} S_1 & 0 \\ 0 & \mu\Lambda_2 \end{bmatrix}$$

where $S_1 = \mu\Lambda_1 + (1 - \mu)\, z_1 z_1^T$ and $z_1 = (\zeta_1, \ldots, \zeta_{s+1})^T$. Now, given the spectral decomposition

$$S_1 = T_1\,{*\Lambda_1}\,T_1^T,$$

we have that

$$S = \begin{bmatrix} T_1 & 0 \\ 0 & I \end{bmatrix} \begin{bmatrix} {*\Lambda_1} & 0 \\ 0 & \mu\Lambda_2 \end{bmatrix} \begin{bmatrix} T_1 & 0 \\ 0 & I \end{bmatrix}^T \qquad (24)$$

is the spectral decomposition of $S$. This has been observed in previous work [7].

Let us be specific about the choice of $m_{s+1}, \ldots, m_n$. Since $z = M^H x$ we have

$$\zeta_i = m_i^H x, \qquad 1 \le i \le s.$$

We must first compute these values. Now we let

$$m_{s+1} \equiv \Big( x - \sum_{i=1}^{s} \zeta_i m_i \Big) \Big/ \Big\| x - \sum_{i=1}^{s} \zeta_i m_i \Big\|.$$

Thus, $m_{s+1}$ is the normalized orthogonal projection of $x$ on $N$. Now add additional vectors $m_j$, $j = s + 2, \ldots, n$, until an orthonormal basis for $N$ is obtained. This makes $\zeta_j = 0$ for $j = s + 2, \ldots, n$. In fact, it is not necessary to construct $m_{s+2}, \ldots, m_n$. (If it were, they could be taken as the columns of a certain Householder matrix [3].)

Note that the number of eigenvalues (of the estimated covariance matrix) greater than $\sigma^2$ can increase by 1 every time we update. On the other hand, the number of eigenvalues of $R$ greater than $\sigma^2$ is determined by the number of linearly independent signals hitting the array. The updating (7) moves one of the eigenvalues $\sigma^2$ to the right, but the remainder move to the left, reduced by the factor $\mu$. In the equilibrium state of this process there are $s$ eigenvalues greater than $\sigma^2$, and a cluster of $n - s$ near $\sigma^2$. This is not completely satisfactory. Karasalo, Goetherstrom, and Westerlin suggest an attractive method that can be used if we have an a priori upper bound $s$ on the number of signals [7]. They replace the new matrix $*R$ by its closest approximation by a matrix of the form $A + \sigma^2 I$, where $\mathrm{rank}(A) = s$. They show that

and that the new noise level is

$${*\sigma}^2 = [(n - s - 1)\sigma^2 + {*\lambda_{s+1}}]/(n - s).$$
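The basic update through (22), without the repeated-eigenvalue savings, can be sketched as follows. This is an assumption-laden illustration: `spectral_update` is a hypothetical name, and NumPy's dense Hermitian eigensolver stands in for the fast rank-one eigenvalue algorithm of [3], so the cost here is cubic rather than the counts quoted in the text. Note also that `numpy.linalg.eigh` returns eigenvalues in ascending order, the reverse of the ordering used in (21).

```python
import numpy as np

def spectral_update(M, lam, x, mu):
    """Update R = M diag(lam) M^H after the rank-one change (7)."""
    z = M.conj().T @ x                                          # M z = x, M unitary
    S = mu * np.diag(lam) + (1.0 - mu) * np.outer(z, z.conj())  # diagonal plus rank one
    lam_new, T = np.linalg.eigh(S)                              # S = T *Lambda T^H, eq. (22)
    return M @ T, lam_new                                       # *M = M T

rng = np.random.default_rng(3)
n, mu = 5, 0.9
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
R = A @ A.conj().T
lam, M = np.linalg.eigh(R)
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
M_new, lam_new = spectral_update(M, lam, x, mu)
R_new = mu * R + (1.0 - mu) * np.outer(x, x.conj())
```

The updated factors reproduce $*R$ and `M_new` stays unitary, which is the property the faster schemes in this section are designed to preserve at lower cost.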

To compute the eigenvectors of $*R$, note that

$${*M_1} = M_1 T_1 \qquad (25)$$

where $M_1 = [m_1, \ldots, m_{s+1}]$ and ${*M_1}$ holds the corresponding $s + 1$ updated eigenvectors. Because $T_1$ is real, this multiplication costs $n(s + 1)^2/2$ operations.

How much more costly is it to recompute (21) every step? Direct computation of a Hermitian spectral decomposition requires approximately $5n^3 + O(n^2)$ operations. The work required by the method developed here is
1) form $z_1$ at cost $n(s + 1)$ operations;
2) compute the spectral decomposition of $S_1$ at cost $O(s^2)$ operations; and
3) compute ${*M_1} = M_1 T_1$ at cost $(1/2)(s + 1)^2 n$ operations.
The net cost is therefore $(1/2)n(s + 1)^2 + O(ns + s^2)$ operations. The relative expense of the new scheme drops very rapidly as $n/s$ increases from 1, and is never more than 23 percent of the cost of a full spectral decomposition.

Additional computational savings are possible. Let us consider the methods proposed by Owsley [9], which are typical. These methods all seek to estimate the power of the signal hitting the array at a given angle by $g(d) = w^H d$

TABLE II OPERATION COUNTS AND STABILITY OF THE BEAMFORMING ALGORITHMS

Section II Nonrecursive solution of (4) Matrix inverse lemma:

J=1

mj/J'( A;)m JHd

c

==

MHd.

*c

==

*MHd.

311m

+

m nm

n.a.

(27)

and 5+1

*g

==2: ./=1

(3(*Aj) \ *~'r'j

2 1

To compute *g, we need only update But by (25) and (28) *c

n.a.

o

.\ + I

*MHd

THMHd

.

C1.

(29)

then use (29).

n.a.

+

Key

Denote the elements of these vectors by $c = (\gamma_1, \ldots, \gamma_n)^T$ and $*c = ({*\gamma_1}, \ldots, {*\gamma_n})^T$. Let $c_1 = (\gamma_1, \ldots, \gamma_{s+1})^T$ and ${*c_1} = ({*\gamma_1}, \ldots, {*\gamma_{s+1}})^T$. Then

(3/2)n

(3/2)n 2

p:

(28)

+

+

o

Recomputing p

and

nm

(5/2)nrn + m] (l/2)n{5n + 9m]

Section II-B Updating the Cholesky factor L using orthogonal transformations: (19) Updating the solution v using the same orthogonal transformations (20): Using the orthogonal transformations to update

where $\beta(\lambda)$ is real valued. Various choices for $\beta$ can be made that succeed in suppressing the effect of noise and enhancing the resolution of the method. In computing the output power $g(d)$, we can do better than to use the definition (26) directly. We can instead use the relations (25), (26), and (27) to reduce the cost from $n^2$ operations to $(s + 1)^2/2$ operations for each vector $d$. To be specific, let

+

(10)-(11)-(13)

+

== .2:

2m

(5/2)nfn

Conjugate direction:

(26)

n

Stability

(10)-(11)-(12)

Stabilized matrix inversion:

where d is a steering vector for the given array and angle and w

Cost

Method

= o" v:

m]

+

stability is not an issue for nonrecursive methods: the method is unstable: errors are amplified by every step; the method is stable: errors are neither amplified nor damped; the method is stable: errors are damped by every step:

so that (30)

Thus, from (29) and (30), we can compute $g$ with only $(s + 1)^2/2$ operations for each vector $d$.

IV. CONCLUSIONS

We have given three numerically stable and computationally efficient procedures for adaptive beamforming that improve, either in speed or accuracy, on known procedures. These procedures make methods based on the inverse of the signal covariance matrix much more practical for real-time use. This is especially true for large sensor arrays, since the dominant cost of these procedures grows only linearly with the number of array elements (in this respect they are like the LMS method). Straightforward use of the matrix inverse or a triangular factorization incurs quadratic cost.

For methods based on a spectral decomposition of the signal covariance matrix, we have obtained a similar economy. The resulting rather dramatic reduction in cost makes these methods, too, more practical for real-time use.

To summarize the algorithms recommended, we give their operation counts and stability properties. In Table II, we give the results for the beamforming algorithms discussed in Section II. In Table III, we give the results for updating the spectral decomposition discussed in Section III.

We have not made an issue of stability of the spectral decomposition methods. Because no factor of $\mu^{-1}$ occurs in the methods, there is no reason to suppose that instability of the type encountered in the first beamforming method of Section II will occur. Moreover, Bunch, Nielsen, and Sorensen [3] and Karasalo, Goetherstrom, and Westerlin [7], [13] give substantial experimental evidence for the stability of some fast update methods for the spectral decomposition.

TABLE III
OPERATION COUNTS OF ALGORITHMS FOR THE SPECTRAL DECOMPOSITION

Method (Section III)                                                          Cost
Full recomputation of the decomposition (21)
Use of the Bunch-Nielsen-Sorensen method (21)-(22)
Exploiting repeated eigenvalues (24) when there are s < n signals
Recomputing g(d) using the definition (26), when there are m
  different direction vectors d                                               mns
Use of the recursive method (29)-(30) to compute g(d)                         (1/2)ms^2

ACKNOWLEDGMENT

I thank P. Kuekes, N. Owsley, and G. Golub for many interesting discussions on this subject. I would also like to thank the referees, who made a number of very useful suggestions.

REFERENCES

[1] G. Bienvenue and H. F. Mermoz, "New principle of array processing in underwater passive listening," in VLSI and Modern Signal Processing, S. Y. Kung, H. J. Whitehouse, and T. Kailath, Eds. Englewood Cliffs, NJ: Prentice-Hall, 1984.
[2] J. R. Bunch and C. P. Nielsen, "Updating the singular value decomposition," Numerische Mathematik, vol. 31, pp. 111-130, 1978.
[3] J. R. Bunch, C. P. Nielsen, and D. C. Sorensen, "Rank-one modification of the symmetric eigenproblem," Numerische Mathematik, vol. 31, pp. 31-48, 1978.
[4] P. A. Businger, "Updating a singular value decomposition," BIT, vol. 10, pp. 376-385, 1970.
[5] P. E. Gill, G. H. Golub, W. Murray, and M. A. Saunders, "Methods for modifying matrix factorizations," Numerische Mathematik, vol. 28, pp. 505-535, 1974.
[6] G. H. Golub and C. F. Van Loan, Matrix Computations. Baltimore, MD: Johns Hopkins University Press, 1983.
[7] I. Karasalo, L. Goetherstrom, and V. Westerlin, "Nagra signalbehandlingsformer basered pa kovariansestimat av signal och brus" (English title: "Some signal processing methods based on an estimate of the covariance of signal and noise"), Swedish Defense Res. Inst. (FOA), Huvudavdelning 3, S581 11 Linkoeping, Sweden, Tech. Rep. FOA Rep. C30343-E1, ISSN 0347-3708, Oct. 1983.
[8] R. A. Monzingo and T. W. Miller, Introduction to Adaptive Arrays. New York: Wiley, 1980.
[9] N. L. Owsley, "High resolution spectrum analysis by dominant mode enhancement," in VLSI and Modern Signal Processing, S. Y. Kung, H. J. Whitehouse, and T. Kailath, Eds. Englewood Cliffs, NJ: Prentice-Hall, 1984.
[10] R. O. Schmidt, "A signal subspace approach to multiple emitter location and spectral estimation," Ph.D. dissertation, Stanford Univ., Stanford, CA, 1981.
[11] R. Schreiber and W. P. Tang, "On systolic arrays for updating the Cholesky factorization," Swedish Roy. Inst. Technol., Dep. Numer. Anal. Comput. Sci., Stockholm, Sweden, Tech. Rep. TRITA-NA-8313, 1983, also submitted to BIT.
[12] G. W. Stewart, Introduction to Matrix Computations. New York: Academic, 1973.
[13] I. Karasalo, "Estimating the covariance matrix by signal subspace averaging," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, pp. 8-12, Feb. 1986.

Steady State Analysis of the Generalized Sidelobe Canceller by Adaptive Noise Cancelling Techniques NEIL K. JABLON,

MEMBER, IEEE

Abstract-Narrow-band adaptive noise cancelling techniques are used to study the generalized sidelobe canceller (GSC), a general form of linearly constrained adaptive beamforming structure. In an environment which consists of a look-direction signal, one jammer, and additive receiver noise, exact expressions are derived for the Wiener solution, the steady state output signal-to-interference-plus-noise ratio (SINRo), and performance improvement due to adaptation (PIA), defined as the ratio of SINRo after adaptation to SINRo before adaptation. These expressions are in terms of the signal directions and power levels, an arbitrary array geometry, and a general signal blocking matrix. Next, easily evaluated scalar equations for PIA are given for two classes of constraints. The first is constant gain in the look direction, and the second is constant gain plus a main beam zero first derivative in the look direction. Under the further assumption of an equally spaced line array, even simpler equations for PIA result, and are used to show that for jammers arriving outside the beamwidth between first nulls (BWFN) region of the unadapted beampattern, the introduction of the additional main beam zero first derivative constraint leads to negligible degradation in PIA.

Manuscript received March 5, 1985; revised July 8, 1985. This work was supported by the Naval Air Systems Command under Contracts N00019-83-C-0287 and N00019-85-C-0018, and by the Fannie and John Hertz Foundation Graduate Fellowship Program. The author is with the Information Systems Laboratory, Electrical Engineering Department, Stanford University, Stanford, CA 94305. IEEE Log Number 8406814.

I. INTRODUCTION

THE GENERALIZED sidelobe canceller (GSC) is an important adaptive antenna structure, for both theoretical and practical reasons. As Griffiths and Jim explained in [1], the GSC can be viewed as an alternate implementation and extension of Frost's [2], [3] algorithm, using a basic model due to Applebaum and Chapman [4]. Like the Frost beamformer, the GSC adapts to minimize mean square error (MSE) while implementing a look direction constant gain constraint, but in addition is easily generalizable to deal with main beam zero derivative constraints of any order [5]. "New methods of adaptive beamforming are suggested by the GSC structure" [1], for example, combined temporal/spatial constraints. Since the GSC uses an unconstrained rather than a constrained algorithm to adapt the weights, it may be possible to adapt much faster. The GSC will also be less sensitive to coefficient quantization effects, since the dynamic range of the signals in the adaptive portion of the beamformer is compressed. In general, the Frost beamformer and GSC have different autocovariance matrices, so that algorithm performance measures formulated in terms of autocovariance matrix eigenvalues, such as transient response time and misadjustment (due to weight jitter) [6], will be different. However, as long as the GSC signal blocking matrix has dimension one less than the number of antenna elements and its columns are linearly independent, the Frost and GSC implementations lead to the same steady state output signal-to-interference-plus-noise ratio (SINRo) in a stationary environment, based on a comparison of Wiener solutions (i.e., infinitely slow adaptation) [5].

In the first part of this paper, using adaptive noise cancelling techniques [6], exact expressions are derived for the GSC Wiener solution, SINRo, and performance improvement due to adaptation (PIA), defined as the ratio of SINRo after adaptation compared to SINRo before adaptation. The results are derived assuming

• all antenna elements are omnidirectional with identical amplitude and phase;
• the beamformer is narrow band, so that instead of using tapped delay lines to form the adaptive weights, there is only a single column of complex weights;
• the wanted signal is in the far field, and its angle of arrival is assumed to be known. Each antenna element contains a phase shifter so that the array can be steered to this look-direction. The wanted signal will hereafter be referred to as just the signal;
• additive receiver noise of equal power is present at each antenna element and can be modeled as a Gaussian process, uncorrelated from element to element and from time sample to time sample;
• the signal can be modeled as narrow band, and is assumed to have a planar wavefront. The narrow-band assumption means that the reciprocal of the signal bandwidth is large compared to the transit time of the wavefront across the array;
• a single far-field jammer is present, which can also be modeled as narrow band and planar;
• the signal, jammer, and receiver noise are zero-mean, wide-sense stationary, and statistically independent of one another;
• the propagation medium is linear, homogeneous, and isotropic.

The derived expressions thus will not take into account misadjustment [6], non-Wiener signal cancellation [6], multipath [6], [7], element amplitude/phase errors [1], [3], [6]-[10], mutual coupling [11], and sky noise, often modeled as

Reprinted from IEEE Transactions on Antennas and Propagation, Vol. AP-34, No.3, pp. 330-337, March 1986.


spherically isotropic noise [12]. These effects all alter the Wiener solution, and thus degrade performance. The derived expressions are also exact, given the above assumptions. No further assumptions were made concerning relative power levels of the signal, jammer, and receiver noise. For a given array geometry at a fixed frequency, it will be demonstrated that one only has to evaluate a single equation involving matrix quantities, which is just a function of look direction, jammer angle, and the signal blocking matrix used. Since all signal blocking matrices meeting certain criteria lead to the same Wiener solution [5], [13], one can simply pick the most convenient one when evaluating these expressions. Although only a single jammer is considered, the results presented here could be extended to the case of multiple jammers, either by analytically inverting the more complicated expression for a multiple jammer autocovariance matrix, or by simply solving for the autocovariance matrix inverse numerically. The inverse of the autocovariance matrix is then multiplied by the cross-covariance vector to obtain the Wiener solution, after which it is straightforward to calculate SINRo.

The second part of this paper involves evaluating PIA for two classes of constraints. The first is that of constant gain in the look direction (zero-order constraint). One possible realization for the signal blocking matrix is then adjacent element differencing, an attractive choice for practical implementation. Due to the results in [13], PIA calculated by this method will be the same as for a Frost beamformer, assuming stationary signal statistics [5]. This expression for PIA is exact, and involves no matrix operations (i.e., matrix multiplication and inversion). As will become clear later in the paper, knowing PIA for any signal blocking matrix provides all the information needed to compute SINRo, as long as the signal power level is known.

The second class is the double constraint of constant gain and a main beam zero first derivative in the look direction (first-order constraint), which is useful for reducing the beamformer sensitivity to perturbations such as amplitude/phase errors [1]. A simple realization for the signal blocking matrix is then two columns of differencing in series. From [13] it also follows that PIA calculated for this particular signal blocking matrix applies to all signal blocking matrices which implement the first-order constraint. An equation for PIA with this second type of signal blocking matrix is presented. As with the equation for PIA using the zero-order constraint, no matrix operations are involved in this equation, either. Taking the example of a broadside equally spaced line array, both equations for PIA are plotted as a function of jammer angle for various numbers of antenna elements and input interference-to-noise ratios (INRi). It is easily seen from the graphs that the use of the additional main beam zero first derivative constraint in the look direction only degrades performance within the beamwidth between first nulls (BWFN) region of the unadapted beampattern, and outside that region the loss in PIA due to the use of the additional constraint is insignificant. Although Vural [9] and more recently Er and Cantoni [14] also investigated the degradation in SINRo, or equivalently PIA, due to the introduction of an additional main beam zero

first derivative constraint in the look direction, their results were based on computer simulation and did not include explicit scalar equations for SINRo loss due to the additional constraint that a system designer could quickly evaluate. Er and Cantoni's work was focused on the wide-band case. Takao and Komiyama [15] investigated the use of an additional constraint to the Frost beamformer which consisted of a beam pattern zero first derivative constraint in the jammer's direction. As with Er and Cantoni, Takao and Komiyama focused on the wide-band case, and their results with respect to SINRo also consisted of computer simulations. Hudson [12] presented approximate results for beamformers implementing a first-order constraint, based on a polynomial expansion to characterize the antenna main lobe response. Therefore, this is the first work which exactly quantifies the loss in SINRo due to the introduction of a main beam zero first derivative constraint in the look direction.

The outline of this paper is as follows. Section II derives the Wiener solution of the GSC, Section III the steady state SINRo, and Section IV PIA. Section V presents equations for PIA in the two cases of zero-order and first-order constraints. In Section VI an equally spaced line array is assumed, and the two equations for PIA are plotted and compared. Several approximations are presented for PIA in certain situations. Section VII contains the conclusions.

II. WIENER SOLUTION

In this section, the Wiener weight vector will be derived for the GSC. It is well known [6] that this weight vector minimizes MSE under the assumption of stationary signal statistics. The GSC is shown schematically in Fig. 1, consisting of K omnidirectional elements having identical amplitude and phase. The array is electronically presteered to a known look direction, which in the narrow-band case can be done with a phase shifter at the output of each antenna element. The presteering delays $-\tau_{i,s}$ are given by

$$-\tau_{i,s} = -\frac{d_i - d_0}{c}\,\sin\theta_s, \qquad i = 1, \ldots, K \tag{1}$$

where $d_i$ represents the location of element i, $d_0$ represents the location of the array phase center, c represents wave speed in the propagation medium, and $\theta_s$ the look direction, measured as indicated in Fig. 1. A jammer coming from an angle $\theta_j$ would, in the absence of presteering, undergo a time delay at each element of

$$\tau_{i,j} = \frac{d_i - d_0}{c}\,\sin\theta_j, \qquad i = 1, \ldots, K. \tag{2}$$

Thus, in the presence of presteering, the signal can be treated as coming from the array broadside, and the jammer undergoes a total time delay at element i of

$$\tau_i \triangleq \tau_{i,j} - \tau_{i,s}, \qquad i = 1, \ldots, K. \tag{3}$$

Equations (1) and (2) assumed a straight line array. This assumption was made only to illustrate the functional dependence of the presteering delays, and does not limit any of the results in this paper to that type of array.
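As a small numerical illustration of the delay definitions (1)-(3), the sketch below evaluates the presteering delay, the unsteered jammer delay, and the net jammer delay for a straight line array. It is only a sketch under stated assumptions: the function name `line_array_delays`, the four-element half-wavelength geometry, the 1 GHz carrier, and the free-space wave speed are illustrative choices that do not appear in the paper.

```python
import math

def line_array_delays(positions, phase_center, wave_speed, theta_s, theta_j):
    """Delays of (1)-(3) for a straight line array; angles in radians.

    Returns (tau_s, tau_j, tau): presteering delays, unsteered jammer
    delays, and net jammer delays after presteering, one per element.
    """
    tau_s = [(d - phase_center) / wave_speed * math.sin(theta_s) for d in positions]
    tau_j = [(d - phase_center) / wave_speed * math.sin(theta_j) for d in positions]
    tau = [tj - ts for tj, ts in zip(tau_j, tau_s)]
    return tau_s, tau_j, tau

# Hypothetical example: 4 elements, half-wavelength spacing at 1 GHz.
C = 3.0e8                      # assumed wave speed (m/s)
FREQ = 1.0e9                   # assumed carrier frequency (Hz)
WAVELENGTH = C / FREQ
SPACING = WAVELENGTH / 2
POSITIONS = [i * SPACING for i in range(4)]
PHASE_CENTER = sum(POSITIONS) / len(POSITIONS)

_, _, TAU = line_array_delays(POSITIONS, PHASE_CENTER, C,
                              math.radians(10.0), math.radians(40.0))

# Phase step between adjacent elements: omega * (tau_{i+1} - tau_i).
PHASE_STEP = 2 * math.pi * FREQ * (TAU[1] - TAU[0])
```

For equally spaced elements the phase step works out to $2\pi(d/\lambda)(\sin\theta_j - \sin\theta_s)$, and a jammer arriving from the look direction sees zero net delay at every element.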

Fig. 1. Block diagram of narrow-band GSC. The additive receiver noises following the steering delays are not shown. (Blocks shown: the conventional delay-and-sum beamformer, which forms the desired response $d_k$ as the primary input; the preprocessor $\mathbf{B}$, which supplies the reference inputs $u_{1,k}, \ldots, u_{\tilde{K},k}$; and the adaptive noise canceller, whose complex adaptive algorithm forms the sidelobe cancelling signal that is subtracted from the primary input to give the output.)

Following passage through the presteering delays, the signal received at each antenna element is corrupted by additive zero-mean white Gaussian noise (WGN), as in Fig. 2. Other aspects of narrow-band adaptive antennas are discussed in [16]. The main part of the beamformer consists of two branches. The top one is termed the desired response branch, and its purpose is to form the signal $d_k$, which is the primary input to the adaptive noise canceller. The manner in which the desired response branch is configured here constrains the look direction gain to be unity. In general, the desired response branch of the GSC will be the same as a conventional delay-and-sum beamformer, which has no adaptive weights. The bottom branch of the beamformer is termed the sidelobe cancelling branch, and its purpose is to form the sidelobe cancelling signal $y_k$ by providing $\tilde{K}$ reference inputs to the adaptive noise canceller. Note the use of complex conjugate weights $\bar{w}_{i,k}$ ($i = 1, \ldots, \tilde{K}$) in computing $y_k$. These weights can be updated by several different methods, for example the complex least mean square (LMS) algorithm of Widrow et al. [6]. The sidelobe cancelling branch is preceded by the signal blocking matrix $\mathbf{B}$, a preprocessor designed to block the signal. The preprocessor has K inputs and $\tilde{K}$ outputs. Although not absolutely necessary, in this paper it will be assumed that $\tilde{K} < K$. If only a zero-order constraint is present, then $\tilde{K} = K - 1$, and there will be K - 1 degrees of freedom available in the sidelobe cancelling branch to form nulls in jammer directions. If a first-order constraint is used instead, then $\tilde{K} = K - 2$, and one less degree of freedom will be available. The restrictions on $\mathbf{B}$ are [1]

$$\mathbf{B}\,\mathbf{1}_K = \mathbf{0}_{\tilde{K}} \tag{4}$$

$$\operatorname{rank}(\mathbf{B}) = \tilde{K}. \tag{5}$$

Fig. 2. Model of ith receiver channel (i = 1, ..., K).

$\mathbf{1}_K$ is the all 1's vector of length K, and $\mathbf{0}_{\tilde{K}}$ the zero vector of length $\tilde{K}$. For the rest of this paper the all 1's vector will simply be indicated by $\mathbf{1}$. In the subsequent derivation, matrices and vectors will be denoted by bold uppercase and lowercase letters, respectively. Complex conjugates will be represented by overbars, transposes by a superscript T, Hermitian (complex conjugate) transposes by a superscript H, and steady state quantities based on using the Wiener weight vector by a superscript asterisk. $E[\cdot]$ represents time expectation. At time sample k, define the state vector $\mathbf{u}_k$ as $[u_{1,k}, \ldots, u_{\tilde{K},k}]^T$ and the weight vector $\mathbf{w}_k$ as $[w_{1,k}, \ldots, w_{\tilde{K},k}]^T$. The autocovariance matrix $\mathbf{R}_{uu}$ and cross-covariance vector $\mathbf{r}_{ud}$ are

$$\mathbf{R}_{uu} \triangleq E[\mathbf{u}_k \mathbf{u}_k^H] \tag{6}$$

$$\mathbf{r}_{ud} \triangleq E[\mathbf{u}_k \bar{d}_k]. \tag{7}$$

There is no k subscript on $\mathbf{R}_{uu}$ and $\mathbf{r}_{ud}$ because the signal, jammer, and receiver noise were all assumed wide-sense stationary. The Wiener solution for the GSC, denoted by the vector $\mathbf{w}^*$, is then [6]

$$\mathbf{w}^* = \mathbf{R}_{uu}^{-1}\,\mathbf{r}_{ud}. \tag{8}$$

The complex snapshot vector at the kth time sample, $\mathbf{x}_k \triangleq [x_{1,k}, \ldots, x_{K,k}]^T$, comprises signal and jammer after presteering, plus receiver noise. $\mathbf{B}$ then transforms $\mathbf{x}_k$ into $\mathbf{u}_k$:

$$\mathbf{u}_k = \mathbf{B}\,\mathbf{x}_k. \tag{9}$$

$\mathbf{x}_k$ is the sum of three components: a component $\mathbf{s}_k$ due to the signal, a component $\mathbf{j}_k$ due to the jammer, and a component $\mathbf{n}_k$ due to the receiver noise:

$$\mathbf{x}_k = \mathbf{s}_k + \mathbf{j}_k + \mathbf{n}_k. \tag{10}$$

Utilizing the plane-wave assumption for the signal, it is possible to represent $\mathbf{s}_k$ purely in terms of the signal $s_k$ at the kth time sample and the vector $\mathbf{1}$. The same plane-wave assumption can be used to represent $\mathbf{j}_k$ in terms of the jammer $j_k$ at the kth time sample, a diagonal matrix $\mathbf{A}_j$ which accounts for the phase shift of components of $\mathbf{j}_k$ due to presteering, and the vector $\mathbf{1}$. Notationally

$$\mathbf{s}_k = s_k\,\mathbf{1} \tag{11}$$

$$\mathbf{j}_k = j_k\,\mathbf{A}_j\,\mathbf{1} \tag{12}$$

where $\mathbf{A}_j \triangleq \operatorname{diag}\{e^{-j\omega\tau_1}, \ldots, e^{-j\omega\tau_K}\}$, with $\omega$ being the center frequency of $s_k$, $j_k$, and the receiver noise in rad/s, and $\mathbf{n}_k \triangleq [n_{1,k}, \ldots, n_{K,k}]^T$, where $n_{i,k}$ is the noise added to antenna element i at time sample k. $s_k$, $j_k$, and $n_{i,k}$ can all be represented in complex envelope notation. A sampled signal $x_k$ which is represented in complex envelope notation is written $x_k = a_k e^{j(\omega T_s k + \psi_k)}$. $a_k$ is the random amplitude-modulated portion of $x_k$, $e^{j\omega T_s k}$ the noninformation bearing carrier frequency portion, and $e^{j\psi_k}$ the random phase-modulated portion. $T_s$ is the sampling period, in seconds. $\sigma^2$, the power as measured at any element, is defined as $E[|a_k|^2]$. $\psi_k$ must be ~U(0, 2π) for $x_k$ to be stationary [17], where ~U(a, b) represents a random variable uniformly distributed over (a, b) on the real line. If $a_k$ and $\psi_k$ are slowly varying, then $x_k$ is considered narrow band. Thus to represent $s_k$, $j_k$, and $n_{i,k}$ in complex envelope notation, $a_{s,k}$, $a_{j,k}$, and $a_{n,i,k}$ are defined as their random amplitudes. Secondly, $\sigma_s^2$, $\sigma_j^2$, and $\sigma_n^2$ are defined as the signal, jammer, and receiver noise power as measured at any element. Thirdly, $\psi_{s,k}$, $\psi_{j,k}$, and $\psi_{n,i,k}$ are defined as the random phase of $s_k$, $j_k$, and $n_{i,k}$, respectively. Fourthly, $a_{s,k}$, $a_{j,k}$, and $a_{n,i,k}$ are all statistically independent, as are $\psi_{s,k}$, $\psi_{j,k}$, and $\psi_{n,i,k}$, and the latter six quantities are further assumed to be varying slowly enough to satisfy the narrow-band assumptions. Using (9)-(12), and (4) to eliminate $\mathbf{s}_k$,

$$\mathbf{u}_k = \mathbf{B}\,(j_k\,\mathbf{A}_j\,\mathbf{1} + \mathbf{n}_k). \tag{13}$$

Similarly, the desired response is

$$d_k = s_k + \alpha_j\,j_k + \frac{1}{K}\,\mathbf{1}^T\mathbf{n}_k \tag{14}$$

where $\alpha_j$ is the normalized array (spatial) factor [18] in the jammer's direction

$$\alpha_j \triangleq \frac{1}{K}\,\mathbf{1}^T\mathbf{A}_j\,\mathbf{1}. \tag{15}$$

Equation (15) without the j subscripts defines the normalized array factor in any general direction $\theta$, with $\mathbf{A}$ formed by substituting $\theta$ for $\theta_j$ in (2). $\alpha$ is the normalized voltage beam pattern of a conventional beamformer. Based on (6) and (13) and the independence of jammer and receiver noise,

$$\mathbf{R}_{uu} = \sigma_n^2\,\mathbf{B}\mathbf{B}^T + \sigma_j^2\,(\mathbf{B}\mathbf{A}_j\mathbf{1})(\mathbf{B}\mathbf{A}_j\mathbf{1})^H. \tag{16}$$

Inversion of $\mathbf{R}_{uu}$ is accomplished with the matrix inversion lemma [19].¹ Substituting (13), (14) into (7) and using (4) to eliminate $\mathbf{B}\,E[\mathbf{n}_k\mathbf{n}_k^H]\,\mathbf{1}$ results in $\mathbf{r}_{ud} = \sigma_j^2\,\bar{\alpha}_j\,\mathbf{B}\mathbf{A}_j\mathbf{1}$, which is multiplied in front by $\mathbf{R}_{uu}^{-1}$ to obtain

$$\mathbf{w}^* = w_0\,(\mathbf{B}\mathbf{B}^T)^{-1}\mathbf{B}\mathbf{A}_j\mathbf{1}. \tag{17}$$

The complex constant $w_0$ is given by

$$w_0 \triangleq \frac{\mathrm{INR}_i\,\bar{\alpha}_j}{1 + \mathrm{INR}_i\,\delta} \tag{18}$$

where $\mathrm{INR}_i$ is $\sigma_j^2/\sigma_n^2$, and the real quantity $\delta$ will be called the signal blocking matrix factor

$$\delta = (\mathbf{A}_j\mathbf{1})^H\,\mathbf{B}^T(\mathbf{B}\mathbf{B}^T)^{-1}\mathbf{B}\,(\mathbf{A}_j\mathbf{1}). \tag{19}$$

¹ $(\mathbf{Q} + \mathbf{j}\mathbf{j}^H)^{-1} = \mathbf{Q}^{-1} - \dfrac{\mathbf{Q}^{-1}\mathbf{j}\mathbf{j}^H\mathbf{Q}^{-1}}{1 + \mathbf{j}^H\mathbf{Q}^{-1}\mathbf{j}}$, where $\mathbf{Q}$ is a nonsingular matrix, $\mathbf{j}$ is a vector, and $\mathbf{Q} + \mathbf{j}\mathbf{j}^H$ is also assumed to be a nonsingular matrix.

The transformation (9) is an underdetermined system of equations, since rank($\mathbf{B}$) < K. Thus, if one were to try to estimate $\mathbf{x}_k$ directly from $\mathbf{u}_k$, there would be an infinite number of $\mathbf{x}_k$ that would solve the estimation problem. The solution having minimum power is the one with minimum norm. This minimum norm $\mathbf{x}_k$ (denoted by $\hat{\mathbf{x}}_k$) is given by either the left or right inverse solution to (9) [20]. Here the right inverse solution is appropriate, so that $\hat{\mathbf{x}}_k = \mathbf{B}^T(\mathbf{B}\mathbf{B}^T)^{-1}\mathbf{u}_k$. Using (17) and the symmetry of $(\mathbf{B}\mathbf{B}^T)^{-1}$, $y_k^*$ (i.e., the steady state $y_k$) is

$$\begin{aligned} y_k^* &\triangleq (\mathbf{w}^*)^H\,\mathbf{u}_k \\ &= \bar{w}_0\,(\mathbf{A}_j\mathbf{1})^H\,\hat{\mathbf{x}}_k. \end{aligned} \tag{20}$$

Since $\mathbf{A}_j\mathbf{1}$ represents the received jammer, including presteering, (20) demonstrates that the GSC sidelobe cancelling branch forms the minimum norm estimate of the jammer for use by the adaptive noise canceller. $y_k^*$ then represents the output of a cancellation beam steered in the jammer's direction. This can be seen by applying (13) and (17) to the first line of (20):

$$y_k^* = \bar{w}_0\,[\,j_k\,\delta + (\mathbf{A}_j\mathbf{1})^H\,\mathbf{B}^T(\mathbf{B}\mathbf{B}^T)^{-1}\mathbf{B}\,\mathbf{n}_k\,]. \tag{21}$$

The first term in (21), namely $(\bar{w}_0\delta)\,j_k$, is the amount of

jammer in $y_k^*$. From (14), the amount of jammer in $d_k$ is $(\alpha_j)\,j_k$. Therefore, the closer that the quantity $\bar{w}_0\delta$ is to $\alpha_j$, the better the jammer cancellation is. Since $\delta$ is a function of both the array geometry through $\mathbf{A}_j$ and the signal blocking matrix used through $\mathbf{B}$, one must be careful about making generalizations as to when the jammer cancellation is best. However, one thing can be said for sure:

$$\lim_{\sigma_n^2 \to 0} \bar{w}_0\,\delta = \alpha_j. \tag{22}$$

Therefore, regardless of the array geometry or signal blocking matrix used, as long as a degree of freedom is available, the beamformer approaches an infinite null in the direction of the jammer as the receiver noise approaches zero. Due to $\mathbf{B}$, there are no terms involving the signal in the Wiener solution. Furthermore, due to the uncorrelatedness of the WGN in both time and space, there are no distinct terms in the Wiener solution that attempt to cancel receiver noise. However, the presence of the receiver noise power in the denominator of the term $\mathrm{INR}_i$ does allow the receiver noise power to impose an inherent limitation on the ability of the GSC to cancel jammers.

III. STEADY STATE SINRo

In this section the Wiener weights calculated in Section II will be used to derive the steady state SINRo of the GSC. The output power $P_O$ of the GSC is

$$P_O \triangleq \tfrac{1}{2}\,E[|z_k|^2] = \tfrac{1}{2}\,\{E[|d_k|^2] - E[d_k\bar{y}_k] - E[\bar{d}_k y_k] + E[|y_k|^2]\}. \tag{23}$$

When each term in (23) is evaluated, it turns out that $P_O$ can be expressed as the sum of three terms, one each due to the signal ($P_{Os}$), jammer ($P_{Oj}$), and receiver noise ($P_{On}$). When $\mathbf{w}^*$ is used in the first line of (20) to calculate $y_k^*$, and then $y_k^*$ is used in (23) to calculate the steady state values of $P_{Os}$, $P_{Oj}$, and $P_{On}$ (denoted by $P_{Os}^*$, $P_{Oj}^*$, and $P_{On}^*$), the following expressions result:

$$P_{Os}^* = \tfrac{1}{2}\,\sigma_s^2 \tag{24}$$

$$P_{Oj}^* = \tfrac{1}{2}\,\sigma_j^2\,|\alpha_j - \bar{w}_0\delta|^2 \tag{25}$$

$$P_{On}^* = \tfrac{1}{2}\,\sigma_n^2\left(\frac{1}{K} + |w_0|^2\,\delta\right). \tag{26}$$

$\mathrm{SINR}_o^*$ (i.e., the steady state SINRo) is then

$$\mathrm{SINR}_o^* \triangleq \frac{P_{Os}^*}{P_{Oj}^* + P_{On}^*} = \frac{\mathrm{SNR}_i}{\mathrm{INR}_i\,|\alpha_j - \bar{w}_0\delta|^2 + \dfrac{1}{K} + |w_0|^2\,\delta} \tag{27}$$

where $\mathrm{SNR}_i$ is the input signal-to-noise ratio $\sigma_s^2/\sigma_n^2$.

IV. IMPROVEMENT IN SINRo DUE TO ADAPTATION

If all the adaptive weights are zero, as when the GSC is initialized, the only active part of the beamformer will be the desired response branch. In this case the output signal $z_k$ will just be $d_k$, since $y_k$ will be zero. SINRo for this situation will be symbolized by $\mathrm{SINR}_{oc}$, SINRo for a conventional beamformer. $\mathrm{SINR}_{oc}$ can be derived simply by setting $w_0$ to zero in (27):

$$\mathrm{SINR}_{oc} = \frac{K\,\mathrm{SNR}_i}{1 + \mathrm{INR}_i\,K\,|\alpha_j|^2}. \tag{28}$$

Recall that (27) involved no approximations with respect to relative power levels of signal, jammer, and receiver noise. Calling the ratio $\mathrm{SINR}_o^*/\mathrm{SINR}_{oc}$ performance improvement due to adaptation (PIA), from (18) and (27), (28)

$$\mathrm{PIA} = 1 + \frac{\mathrm{INR}_i^2\,K\,|\alpha_j|^2\,\delta}{1 + \mathrm{INR}_i\,(K|\alpha_j|^2 + \delta)}. \tag{29}$$

This expression is exact, and is seen to be controlled by just four quantities. The only complications involved in computing PIA from (29) are in performing the matrix operations given by (19), the equation for $\delta$. PIA is the real quantity of interest to a system designer who has to make a decision whether to use an adaptive antenna or not, for it explicitly indicates how many dB in $\mathrm{SINR}_o^*$ can be gained by using a particular adaptive antenna structure, as opposed to a conventional beamformer. It should also be noted that PIA is different from both cancellation, defined as the improvement in output interference-to-noise ratio due to adaptation [10], [21], and array gain, defined as the ratio of $\mathrm{SINR}_o^*$ to input signal-to-interference-plus-noise ratio [14].

V. EXPRESSIONS FOR $\delta$ WHEN ZERO-ORDER AND FIRST-ORDER CONSTRAINTS ARE USED

The zero-order constraint is easily implemented by the (K - 1) x K matrix

$$\mathbf{B}_K^{(0)} = \begin{bmatrix} 1 & -1 & & & 0 \\ & 1 & -1 & & \\ & & \ddots & \ddots & \\ 0 & & & 1 & -1 \end{bmatrix} \tag{30}$$

where the subscript indicates the number of antenna elements, and the superscript (r - 1) the use of a main beam zero (r - 1)st derivative constraint in the look direction. Fig. 3 shows a simple circuit for $\mathbf{B}_K^{(r-1)}$. Thus the zero superscript in (30) represents r = 1, which is just adjacent element differencing. The first-order constraint can then be implemented by the (K - 2) x K matrix $\mathbf{B}_K^{(1)} = \mathbf{B}_{K-1}^{(0)}\,\mathbf{B}_K^{(0)}$. $\mathbf{B}_K^{(1)}$ represents two columns of differencing in series, and as such entails one less degree of freedom than $\mathbf{B}_K^{(0)}$. Griffiths and Jim [1] first suggested $\mathbf{B}_K^{(1)}$ as a remedy for reducing system sensitivity to pointing errors caused by element amplitude/phase or other perturbations. In order to evaluate PIA, one must first be able to evaluate

$\delta$. The most difficult part in the analytical evaluation of $\delta$ is the calculation of $(\mathbf{B}\mathbf{B}^T)^{-1}$. $\delta$ corresponding to the zero-order constraint is written as $\delta_0$, and is [22]²

$$\delta_0 = K\,(1 - |\alpha_j|^2). \tag{31}$$

For the first-order constraint, $\delta$ is written as $\delta_1$, and is [22]

$$\delta_1 = K\left[1 - 2\left(\frac{2K+1}{K-1}\right)|\alpha_j|^2 - 3\left(\frac{K+1}{K-1}\right)\left(|\beta_j|^2 - \bar{\alpha}_j\beta_j - \alpha_j\bar{\beta}_j\right)\right] \tag{32}$$

with the normalized (for the case of $\tau_i = 0$; i = 1, ..., K) quantity

$$\beta_j \triangleq \frac{2}{K(K+1)}\sum_{i=1}^{K} i\,e^{-j\omega\tau_i}. \tag{33}$$

At this point, it is worthwhile to pause and reflect on the fact that PIA can now be computed for any arbitrary array geometry without having to perform a single matrix operation, as long as either a zero- or first-order constraint is chosen. The most complicated work involved is just the evaluation of $\alpha_j$ according to (15) and its "cousin" $\beta_j$ according to (33). Sometimes $\beta_j$ can be written in terms of $\alpha_j$, which makes the computation even easier. An example is an equally spaced line array, and in [22] it is further shown that for this type of array

$$\delta_1 = K\left[1 + \frac{2(K^2+2)}{K^2-1}\,\alpha_j^2 - \frac{3\left(\alpha_j^2 - 2\alpha_j\cos\left(\dfrac{Ku_j}{2}\right)\cos\left(\dfrac{u_j}{2}\right) + 1\right)}{(K^2-1)\,\sin^2\left(\dfrac{u_j}{2}\right)}\right] \tag{34}$$

where the constant interelement phase shift $u_j$ is $2\pi(d/\lambda)(\sin\theta_j - \sin\theta_s)$, with d denoting the interelement spacing and $\lambda$ the wavelength. For the equally spaced line array, $\alpha_j$ has the special form [16]

$$\alpha_j = \frac{\sin\left(\dfrac{Ku_j}{2}\right)}{K\,\sin\left(\dfrac{u_j}{2}\right)}. \tag{35}$$

Fig. 3. Cascaded columns of differencing (K inputs, K - r outputs).

² Substitution of (18) and (31) into (27) yields an equation for SINRo which agrees with the one for a converged narrow-band Frost beamformer obtained by Takao et al. in [3, eq. (48)]. This result is consistent with Griffiths and Jim's [1] assertion that the GSC implemented as in this paper and the Frost [2] beamformer should provide the same steady state performance in a stationary signal environment.

VI. INTERPRETATION OF THE BEHAVIOR OF PERFORMANCE IMPROVEMENT DUE TO ADAPTATION
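The scalar expressions for the signal blocking matrix factor can be cross-checked numerically against its matrix definition (19). The following is a minimal sketch, assuming an equally spaced line array with its phase center at mid-array and cascaded-differencing blocking matrices as in (30). All function names are hypothetical, and the small Gaussian-elimination helper merely stands in for whatever linear algebra library one would normally use; none of this code comes from the paper.

```python
import cmath
import math

def differencing_matrix(k):
    """(k-1) x k adjacent-element differencing matrix, as in (30)."""
    return [[1.0 if c == r else -1.0 if c == r + 1 else 0.0 for c in range(k)]
            for r in range(k - 1)]

def matmul(a, b):
    """Plain-Python product of two real matrices stored as lists of rows."""
    return [[sum(a[i][m] * b[m][j] for m in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def solve(m, rhs):
    """Solve m x = rhs by Gaussian elimination with partial pivoting."""
    n = len(m)
    aug = [[complex(v) for v in row] + [complex(rhs[i])] for i, row in enumerate(m)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0j] * n
    for r in reversed(range(n)):
        x[r] = (aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))) / aug[r][r]
    return x

def delta_matrix(b, v):
    """Signal blocking matrix factor (19) for a real B and v = A_j 1."""
    bv = [sum(row[i] * v[i] for i in range(len(v))) for row in b]
    bbt = [[sum(p * q for p, q in zip(r1, r2)) for r2 in b] for r1 in b]
    return sum(p.conjugate() * q for p, q in zip(bv, solve(bbt, bv))).real

def steering(k, u):
    """A_j 1 for an equally spaced line array, phase center at mid-array."""
    return [cmath.exp(-1j * (i - (k + 1) / 2) * u) for i in range(1, k + 1)]

def delta_closed_forms(k, v):
    """alpha_j from (15), delta_0 from (31), delta_1 from (32)-(33)."""
    alpha = sum(v) / k
    beta = 2.0 / (k * (k + 1)) * sum(i * vi for i, vi in enumerate(v, start=1))
    d0 = k * (1 - abs(alpha) ** 2)
    cross = 2 * (alpha.conjugate() * beta).real
    d1 = k * (1 - 2 * (2 * k + 1) / (k - 1) * abs(alpha) ** 2
              - 3 * (k + 1) / (k - 1) * (abs(beta) ** 2 - cross))
    return alpha, d0, d1

def delta1_line_array(k, u):
    """delta_1 from (34), with alpha_j from (35); needs sin(u/2) != 0."""
    alpha = math.sin(k * u / 2) / (k * math.sin(u / 2))
    m = alpha ** 2 - 2 * alpha * math.cos(k * u / 2) * math.cos(u / 2) + 1
    return k * (1 + 2 * (k * k + 2) / (k * k - 1) * alpha ** 2
                - 3 * m / ((k * k - 1) * math.sin(u / 2) ** 2))

def pia(inr, k, alpha, delta):
    """Performance improvement due to adaptation (29); inr is a power ratio."""
    a2 = abs(alpha) ** 2
    return 1 + inr ** 2 * k * a2 * delta / (1 + inr * (k * a2 + delta))

# Illustrative case: K = 10, d/lambda = 0.5, broadside look, jammer at 25 deg.
K = 10
U = 2 * math.pi * 0.5 * math.sin(math.radians(25.0))
V = steering(K, U)
B0 = differencing_matrix(K)                       # zero-order constraint
B1 = matmul(differencing_matrix(K - 1), B0)       # first-order constraint
ALPHA, D0, D1 = delta_closed_forms(K, V)
PIA0 = pia(100.0, K, ALPHA, D0)                   # INR_i = 20 dB
PIA1 = pia(100.0, K, ALPHA, D1)
```

Under these assumptions `delta_matrix(B0, V)` and `delta_matrix(B1, V)` should agree with the scalar values `D0` and `D1` to within rounding, which is the practical content of the statement that no matrix operations are needed once a zero- or first-order constraint is chosen.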

In this section, PIA for an equally spaced line array using the zero-order ($\mathrm{PIA}_0$) and first-order ($\mathrm{PIA}_1$) constraints will be plotted. $\mathrm{PIA}_0$ and $\mathrm{PIA}_1$ will be computed using (35) for $\alpha_j$, (31) for $\delta_0$, (34) for $\delta_1$, and then (29) for $\mathrm{PIA}_0$ and $\mathrm{PIA}_1$. Plots of PIA and $\delta$ as a function of jammer angle are presented in Figs. 4 and 5. Fig. 4 plots PIA for a three-element equally spaced line array with a broadside look-direction, and $\mathrm{INR}_i$ = 0, 20, and 40 dB. Fig. 5(a) plots $\delta$ for a ten-element equally spaced line array with a broadside look direction. Fig. 5(b) plots PIA for the latter case when $\mathrm{INR}_i$ = 20 dB. It is assumed throughout that $d/\lambda = 0.5$. Studying these plots and keeping (29) in mind, several points are apparent.

1) Signal power has no effect on PIA. This is due to at least two simplifications in the analysis. The first is the lack of element imperfections, which would cause the signal to "leak" into the sidelobe cancelling branch [6]. Jablon [22], [23] did a detailed study of this phenomenon, and found that although for "high" $\mathrm{SNR}_i$ the GSC is hypersensitive to small element imperfections, the problem could be fixed by artificially injecting receiver noise, à la Zahm [24]. Hudson [12] also treated this subject. The second simplification is the use of a steady state analysis based on the Wiener solution, which does not take into account non-Wiener signal cancellation [6].

2) PIA goes to 0 dB for several angles. At these angles, either the array factor approaches 0 or the jammer is actually coming from the look direction. The explanation is that when the jammer falls in a null of the array factor, the unadapted array naturally performs extremely well. The adapted array can match this performance without altering the weights from their initial all-zero values. On the other hand, when the jammer comes from the look-direction, the GSC is helpless, as forming a spatial null in the look direction is prevented by the constant gain constraint, so the adapted array is forced to perform as poorly as the unadapted array.

3) $\mathrm{PIA}_0 \ge \mathrm{PIA}_1$. This is no surprise, as one has to give up something to get the robustness to perturbations that comes from using the extra constraint. Fortunately, however, the degradation in PIA due to the extra constraint only appears to be evident in the inner half of the unadapted pattern BWFN region. Stutzman and Thiele [18] show that for $K \gg 1$ and $d/\lambda = 0.5$, BWFN of a conventional beamformer near broadside is about equal to 4/K rad. Using this approximation for BWFN, the degradation in PIA is only experienced over an angle of roughly 2/K rad. For a ten-element array BWFN is approximately 20°, so the degradation in PIA due to the additional constraint will only be significant when the jammer falls within ±5° of the array broadside, which will be tolerable for many applications.


4) As $\mathrm{INR}_i$ increases, the degradation in PIA near the look-direction due to the extra constraint becomes more serious. The willingness of any adaptive beamformer to null a jammer falling in any part of the unadapted pattern other than a null or a look direction is related to $\mathrm{INR}_i$. From Fig. 4, for a jammer falling in the BWFN region, when $\mathrm{INR}_i$ is small, the first-order constrained GSC does not even bother to null the jammer, which accounts for the flatness of $\mathrm{PIA}_1$ near broadside. However, when $\mathrm{INR}_i$ reaches a certain critical level, the first-order constrained GSC decides to null the jammer. Due to the extra constraint, it has to work harder than the zero-order constrained GSC, and the degradation in PIA thus becomes especially pronounced.

Fig. 4. Narrow-band GSC performance improvement due to adaptation for three-element broadside line array having $d/\lambda = 0.5$ and $\mathrm{INR}_i$ = 0, 20, and 40 dB. (Horizontal axis: jammer angle of arrival $\theta_j$, in degrees.)

5) The key to understanding the difference in PIA resulting from the use of different signal blocking matrices is to understand the behavior of $\delta$. In Fig. 5(a), $\delta_1$ is flat near the look direction, so $\mathrm{PIA}_1$ must also be flat there. Since $\delta$ is the only quantity in (29) that changes when $\mathbf{B}$ changes, if two $\mathbf{B}$ have similar $\delta$, they will also have similar PIA's. This is clearly demonstrated by the fact that the two PIA's in Fig. 5(b) are similar for the same jammer angles where the two $\delta$ in Fig. 5(a) are similar. It is also interesting that $\delta_0 \ge \delta_1$.

6) It is useful to approximate PIA, although one must be cautious. First consider $\mathrm{PIA}_0$. Using (31) and the fact that almost surely $\mathrm{INR}_i K \gg 1$,

$$\mathrm{PIA}_0 \approx 1 + \mathrm{INR}_i\,K\,\alpha_j^2\,(1 - \alpha_j^2). \tag{36}$$

This expression can be differentiated with respect to $\alpha_j$ to find the angle(s) $\theta_{\max}$ where $\mathrm{PIA}_0$ is maximum. It is straightforward to show that at $\theta_{\max}$, $\alpha_j = 1/\sqrt{2}$, and from a visual inspection of $\alpha_j$ versus $\theta_j$ plots, $\theta_{\max} \approx \pm(\mathrm{BWFN}/4)$ for broadside arrays as considered here. Substituting $\alpha_j = 1/\sqrt{2}$ into (36) with $\mathrm{INR}_i K \gg 1$:

$$\mathrm{PIA}_0|_{\max} \approx \tfrac{1}{4}\,\mathrm{INR}_i\,K.$$

It is more complicated to analyze $\mathrm{PIA}_1$ in the same way, as a comparison of (31) and (34) makes quite clear. However, easy results can be obtained outside the BWFN region, where it seems reasonable to approximate $\mathrm{PIA}_1$ by $\mathrm{PIA}_0$, so that (36) applies with $\mathrm{PIA}_1$ replacing $\mathrm{PIA}_0$. Unfortunately, inside the BWFN region, $\mathrm{PIA}_1$ does not behave as "nicely" as $\mathrm{PIA}_0$ did, which makes the analysis beyond the scope of this paper, except to say that $\mathrm{PIA}_1|_{\max} \le \mathrm{PIA}_0|_{\max}$.
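The quality of approximation (36) and of the peak estimate can be checked numerically. The sketch below is an illustration only: the function names, the grid resolution, and the choice K = 10 with a 20 dB interference-to-noise ratio (matching the Fig. 5(b) case) are assumptions made for the example, not part of the original analysis. It scans $\alpha_j^2$ over [0, 1] instead of scanning jammer angle, which is enough to locate the peak.

```python
def pia0_exact(inr, k, alpha_sq):
    """PIA from (29) with delta_0 = K(1 - alpha^2) from (31)."""
    delta0 = k * (1.0 - alpha_sq)
    return 1.0 + inr ** 2 * k * alpha_sq * delta0 / (1.0 + inr * (k * alpha_sq + delta0))

def pia0_approx(inr, k, alpha_sq):
    """Approximation (36), intended for INR_i * K >> 1."""
    return 1.0 + inr * k * alpha_sq * (1.0 - alpha_sq)

INR = 100.0                                    # 20 dB as a power ratio
K = 10
GRID = [n / 1000.0 for n in range(1001)]       # candidate values of alpha_j^2
BEST = max(GRID, key=lambda a2: pia0_exact(INR, K, a2))
PEAK_EXACT = pia0_exact(INR, K, BEST)
PEAK_RULE = INR * K / 4.0                      # (1/4) INR_i K
```

With these numbers the exact expression peaks at $\alpha_j^2 = 1/2$, in line with $\alpha_j = 1/\sqrt{2}$, and the quarter-product rule stays within about one percent of the exact peak.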

Fig. 5. Narrow-band GSC implemented with ten-element broadside line array having $d/\lambda = 0.5$. (a) Signal blocking matrix factor ($\delta$). (b) Performance improvement due to adaptation (PIA) when $\mathrm{INR}_i$ = 20 dB. (Horizontal axes: jammer angle of arrival $\theta_j$, in degrees.)

VII. CONCLUSION

The narrow-band generalized sidelobe canceller was studied by applying adaptive noise cancelling techniques. The signal environment was assumed to consist of a look-direction signal, one jammer, and additive Gaussian receiver noise at the antenna elements. Exact expressions were derived for the Wiener weight vector, steady state output signal-to-interference-plus-noise ratio, and performance improvement due to adaptation, defined as the ratio of SINRo after adaptation compared to SINRo before adaptation. These expressions were shown to all be critically dependent on a single quantity $\delta$, called the signal blocking matrix factor, which is related to the amount of the jammer appearing in the beamformer steady state output signal. For a general array geometry and GSC signal blocking matrix, the evaluation of $\delta$ involves matrix inversion and matrix multiplication, but nothing more complicated than that. It was also shown that if one were willing to make certain assumptions about the nature of the linear constraints involved, namely the use of a constant gain constraint in the look direction, or a constant gain plus a main beam zero first derivative constraint in the look direction, the matrix operations involved in the evaluation of the signal blocking matrix factor could be eliminated completely. In order to demonstrate the usefulness of these completely scalar equations for the signal blocking matrix factor, an equally spaced line array was assumed, and PIA was plotted in several situations. It was seen from the graphs that for jammers arriving outside the beamwidth between first nulls region of the unadapted beampattern, the use of the additional main beam zero first derivative constraint resulted in negligible degradation in PIA.

ACKNOWLEDGMENT

The author is grateful to his research supervisor, Dr. Bernard Widrow, and also to Dr. A. Paulraj, Dr. Richard Gooch, and Dr. William C. Newman for several enjoyable discussions on adaptive beamforming. In addition, Dr. Widrow, Dr. Paulraj, and the anonymous reviewers made several suggestions which significantly improved this paper. Thanks go to Mieko Parker for her careful typing.

REFERENCES

[1] L. J. Griffiths and C. W. Jim, "An alternative approach to linearly constrained adaptive beamforming," IEEE Trans. Antennas Propagat., vol. AP-30, no. 1, pp. 27-34, Jan. 1982.
[2] O. L. Frost, III, "An algorithm for linearly constrained adaptive array processing," Proc. IEEE, vol. 60, no. 8, pp. 926-935, Aug. 1972.
[3] K. Takao, M. Fujita, and T. Nishi, "An adaptive antenna array under directional constraint," IEEE Trans. Antennas Propagat., vol. AP-24, no. 5, pp. 662-669, Sept. 1976.
[4] S. P. Applebaum and D. J. Chapman, "Adaptive arrays with main beam constraints," IEEE Trans. Antennas Propagat., pp. 650-662, Sept. 1976.
[5] L. J. Griffiths, "An adaptive beamformer which implements constraints using an auxiliary array preprocessor," in Aspects of Signal Processing, pt. 2, G. Tacconi, Ed. Dordrecht, Holland: D. Reidel Publishing Co., 1977, pp. 517-522.
[6] B. Widrow and S. D. Stearns, Adaptive Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1985.
[7] R. A. Monzingo and T. W. Miller, Introduction to Adaptive Arrays. New York: Wiley, 1980, ch. 11, pp. 451-475.
[8] J. M. McCool, "A constrained adaptive beamformer tolerant of array gain and phase errors," in Aspects of Signal Processing, pt. 2, G. Tacconi, Ed. Dordrecht, Holland: D. Reidel Publishing Co., 1977, pp. 477-483.
[9] A. M. Vural, "Effects of perturbations on the performance of optimum/adaptive arrays," IEEE Trans. Aerosp. Electron. Syst., vol. AES-15, no. 1, pp. 76-87, Jan. 1979.
[10] J. T. Mayhan, "Some techniques for evaluating the bandwidth characteristics of adaptive nulling systems," IEEE Trans. Antennas Propagat., vol. AP-27, no. 3, pp. 363-373, May 1979.
[11] I. J. Gupta and A. A. Ksienski, "Effect of mutual coupling on the performance of adaptive arrays," IEEE Trans. Antennas Propagat., vol. AP-31, no. 5, pp. 785-791, Sep. 1983.
[12] J. E. Hudson, Adaptive Array Principles. New York: Peter Peregrinus and The Institute of Electrical Engineers, 1981, ch. 2 and 6, pp. 56 and 160-191.
[13] C. W. Jim, "A comparison of two LMS constrained optimal array structures," Proc. IEEE, vol. 65, no. 12, pp. 1730-1731, Dec. 1977.
[14] M. H. Er and A. Cantoni, "Derivative constraints for broad-band element space antenna array processors," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-31, no. 6, pp. 1378-1393, Dec. 1983.
[15] K. Takao and K. Komiyama, "An adaptive antenna for rejection of wideband interference," IEEE Trans. Aerosp. Electron. Syst., vol. AES-16, no. 4, pp. 452-459, Jul. 1980.
[16] L. L. Horowitz and K. D. Senne, "Performance advantage of complex LMS for controlling narrow-band adaptive arrays," IEEE Trans. Circuits Syst., vol. CAS-28, no. 6, pp. 562-576, Jun. 1981.
[17] A. Papoulis, Probability, Random Variables, and Stochastic Processes. New York: McGraw-Hill, 1965, ch. 9, p. 303.
[18] W. L. Stutzman and G. A. Thiele, Antenna Theory and Design. New York: Wiley, 1981, ch. 3, pp. 124-129.
[19] T. Kailath, Linear Systems. Englewood Cliffs, NJ: Prentice-Hall, 1980, appendix, p. 655, A. 20.
[20] G. Strang, Linear Algebra and Its Applications, 2nd ed. New York: Academic, 1980, ch. 2, p. 79.
[21] R. C. Johnson and H. Jasik, Antenna Engineering Handbook, 2nd ed. New York: McGraw-Hill, 1984, ch. 22, p. 22-6.
[22] N. K. Jablon, "Adaptive beamforming with imperfect arrays," Ph.D. dissertation, Elec. Eng. Dept., Stanford Univ., Stanford, CA, Aug. 1985.
[23] N. K. Jablon, "Adaptive beamforming with the generalized sidelobe canceller in the presence of array imperfections," IEEE Trans. Antennas Propagat., to be published.
[24] C. L. Zahm, "Application of adaptive arrays to suppress strong jammers in the presence of weak signals," IEEE Trans. Aerosp. Electron. Syst., vol. AES-9, no. 2, pp. 260-271, Mar. 1973.


Proceedings Letters

This section is intended primarily for rapid dissemination of brief reports on new research results within the scope of the IEEE members. Contributions are reviewed immediately, and acceptance is determined by timeliness and importance of the subject, and brevity and clarity of the presentation. Research letters must contain a clear concise statement of the problem studied, identify new results, and make evident their utility, importance, or relevance to electrical engineering and science. Key references to related literature must be given. Contributions should be submitted in triplicate to the Editor, PROCEEDINGS OF THE IEEE, 345 East 47th Street, New York, NY 10017-2394. The length should be limited to five double-spaced typewritten pages, counting each illustration (whether labeled as a figure or part of a figure) as a half page. An abstract of 50 words or less and the original figures should be included. Instructions covering abbreviations, the form for references, general style, and preparation of figures are found in "Information for IEEE Authors," available on request from the IEEE Publishing Services Department. Authors are invited to suggest the categories in the table of contents under which their letters best fit. After a letter has been accepted, the author's company or institution will be requested to pay a voluntary charge of $110 per printed page, calculated to the nearest whole page and with a $110 minimum to cover part of the cost of publication.

Adaptive-Adaptive Array Processing

ELI BROOKNER AND JAMES M. HOWELL

A technique is described which provides the jammer cancellation advantages of a fully adaptive array without its many disadvantages such as an excessively large number of computations, poor sidelobes in the directions other than the jammer locations, and poor transient response. This is done at the expense of the hardware complexity. The technique involves transforming a large array of $N$ elements into an equivalent small array of $J + 1$ elements, where $J$ is the number of jammers present. The technique involves estimating the number and locations of the jammers by a discrete Fourier transform of the array element outputs or by the use of standard maximum entropy methods (MEM) or by other super-resolution techniques. Once the number and locations of the jammers have been determined, beams are formed in the direction of the jammers using the whole array. The outputs of these jammer beams together with the output of the main signal beam from the transformed array now consist of $J + 1$ ports instead of $N$ ports. The standard sample matrix inversion (SMI) or the Applebaum algorithm can be applied to the $J + 1$ ports of the equivalent adaptive-adaptive array. Whereas $N$ may be very large, like 10 000 for large arrays, $J + 1$ for the equivalent array could be very small. For example, if there are only 10 jammers then $J + 1$ becomes 11. The total of multiplies needed to do the adaptive array processing reduces from something of the order of $2N^3 = 2(10\,000)^3 = 2 \times 10^{12}$ to about $10^5$, a reduction in the computation complexity by seven orders of magnitude. In addition, the settling time for the adaptive-adaptive array is much faster. For the above example, the settling time for the full array is about 20 000 samples, whereas for the adaptive-adaptive array it is only 22 time samples, for an improvement of three orders of magnitude.

Manuscript received January 3, 1985; revised April 15, 1985. The authors are with the Raytheon Company, Wayland, MA 01778.

Reprinted from Proceedings of the IEEE, Vol. 74, No. 4, pp. 602-604, April 1986.

Fig. 1. Sixteen-element array having 40-dB antenna sidelobes (Chebyshev weighting). Jammer at 20° (peak of second sidelobe). Each panel plots the antenna pattern against angle from broadside (degrees), from -90° to 90°. (a) Unadapted antenna pattern. (b) Antenna pattern for fully adaptive array (SMI algorithm). $M = 2N = 32$ ($M$ equals the number of time samples used to estimate the adaptive antenna weights). For the fully adaptive array, not only is there a degradation of the antenna sidelobes, there is also a degradation in the antenna main lobe peak gain. The peak gain degradation was found to be as much as 5 dB in the simulation carried out. (c) Antenna pattern for adaptive-adaptive array processing. $M = 2(J + 1) = 4$.

SUMMARY

A technique is described for adaptive array processing which eliminates the complex computation problem (see Table 1) of a large fully adaptive array while at the same time provides the same optimum performance as obtained for the fully adaptive array in [1]. The technique also has the advantage of not significantly degrading the antenna sidelobe levels at angles where the jammers are not present; see Fig. 1. This feature is important in the presence of intermittent short pulse interference coming through the radar sidelobes and for ground radars which have clutter in the sidelobes

and the mainlobe. The adaptive-adaptive array also has the advantage of a much faster settling time; see Table 1.

The technique uses a two-step process. First the number of interfering jammers and their locations are estimated by such techniques as a spatial discrete Fourier transform of the array outputs (digitally or by use of a Butler matrix or Rotman lens), by maximum entropy method spectral estimation techniques [2], [3], or just by a search in angle with an auxiliary beam. Once the number of jammers and their locations have been determined, auxiliary beams are formed pointing at these jammers, with one beam being pointed at each jammer; see Fig. 2. These beams are formed using the whole array. They are formed using beam-forming networks parallel to the main signal beam network. The number of beams formed is equal to the number of jammers. These beams could be formed using amplitude weighting to achieve low sidelobe levels if desirable. The outputs of the auxiliary jammer beam ports together with the main signal beam port form the adaptive-adaptive transformed array. The number of degrees of freedom in this transformed array is reduced from $N$, the number of elements in the original array, to one plus the number of jammers $J$. Thus for the adaptive-adaptive array a $(J + 1) \times (J + 1)$ matrix has to be inverted instead of an $N \times N$ matrix. Furthermore, the convergence time for the adaptive-adaptive array is much faster than for the full array. For the SMI algorithm, the number of time samples needed to form the weights is equal to two times the number of degrees of freedom in order to obtain cancellation within 3 dB of the optimum [4]. Thus for the adaptive-adaptive array $2(J + 1)$ time samples are needed instead of the $2N$ required for the full array; see Table 1.

Table 1. Comparison of Computations Required (Assumption: $J$ = number of jammers $\approx 10$.)

Each row lists, in order: number of complex multiplies to calculate weights; complex multiplies to form array output per signal time sample#; transient time (units of signal time samples).

One-dimensional array ($N \approx 100$ elements)
  Fully adaptive:     $2N^3 \approx 2 \times 10^6$;  $N \approx 10^2$;  $2N \approx 200$
  Adaptive-adaptive:  $2(J + 1)^3 + 7N^2 \approx 10^5$ *;  $J + 1 \approx 11$;  $2(J + 1) \approx 22$
  (Improvement):      $\approx 20$;  $\approx 10$;  $\approx 10$

Square two-dimensional array ($N \approx 10^4$ elements)
  Fully adaptive:     $2N^3 \approx 2 \times 10^{12}$;  $N = 10^4$;  $2N = 2 \times 10^4$
  Adaptive-adaptive:  $2(J + 1)^3 + N \log_2 N \approx 10^5$ **;  $J + 1 \approx 11$;  $2(J + 1) = 22$
  (Improvement):      $\approx 2 \times 10^7$;  $\approx 10^3$;  $\approx 10^3$

# Does not include computations of column three.
* Second term assumes MEM algorithm used to locate jammer. This term drops out if jammers located using search beam. For this case number of multiplies $\approx 2(J + 1)^3 \approx 2 \times 10^3$ and improvement becomes $\approx 10^3$.
** Second term assumes fast Fourier transform (FFT) algorithm used to locate jammer. Term drops out if jammer located using search beam (or beams). In this case number of multiplies $\approx 2(J + 1)^3 \approx 2 \times 10^3$ and improvement becomes $\approx 10^9$.

Fig. 2. Adaptive-adaptive array configuration. N-element array is reduced to $J + 1$ element array where $J$ is equal to the number of jammers. (Diagram labels: target; jammers no. 1, no. 2, ...; main array; main beam; auxiliary beam pointing at jammer number 2.)

Fig. 3. Main unadapted array pattern, the auxiliary jammer beam pointed at the jammer that is subtracted from the main beam at the jammer location, and the resultant adaptive-adaptive pattern. (Legend: unadapted main beam pattern; auxiliary pattern; adaptive-adaptive pattern. Horizontal axis: angle from broadside in degrees, -90° to 90°.)

It is useful to physically understand why the adaptive-adaptive array does not degrade the antenna sidelobes. The adaptive-adaptive array subtracts one auxiliary beam pointed at the jammer and containing the jammer signal from the main signal channel beam as illustrated in Fig. 3. The gain of the auxiliary beam in the direction


of the jammer is made to equal the gain of the main channel beam sidelobe in the direction of the jammer. As a result, the subtraction produces a null at the angle of the jammer in the main channel sidelobe. It is apparent from Fig. 3 that the auxiliary antenna pattern subtraction does not significantly degrade the main antenna beam sidelobe levels.

For the fully adaptive array, $N$ retrodirective beams are formed based on the eigenvalues and eigenvectors of the fully adaptive array covariance matrix [5]. Because of the presence of thermal noise in the array elements, the estimates of the covariance matrix of the fully adaptive array and, in turn, the retrodirective beams are poor for $M = 2N$. Instead of forming only one retrodirective beam as desired when one jammer is present, $N$ retrodirective beams are formed for the fully adaptive array. The $N - 1$ retrodirective beams for which there are no jammers are the ones which degrade the antenna sidelobe levels at the angles where no jammers exist. It is found that even if 3000 time samples are used, the sidelobe levels are still severely degraded for the fully adaptive array system, although considerably improved. The adaptive-adaptive array technique first determines what jammers are present which will degrade the system performance. Once the locations of these jammers are determined, the array adapts to the situation by only placing retrodirective beams at these angles. Consequently, the beams at other angles where there are no jammers are not formed and do not, as a result, degrade the antenna sidelobes at these angles.

A number of variations are possible on the above adaptive-adaptive array system. First, the MOSAR method of [6] can be used to locate the jammer positions based on a single time sample. Second, it is not necessary to use the whole array to locate the jammers. Third, if the jammers can be located so as to come through the backlobes, then an auxiliary array (or arrays) is needed which covers the backlobes or whatever angles are not covered by the main array. Finally, it is possible to use only one parallel beam-forming network instead of $J$, with this beam-forming network being time multiplexed so as to produce the $J$ beams pointed at the $J$ jammers, and in this way reduce the hardware complexity of the adaptive-adaptive array processor.

The physical explanation given above together with Fig. 3 helps in understanding the performance of the adaptive-adaptive algorithm for nonperfect conditions and leads to the following insights. Even if the jammer location is in error by plus or minus a half beamwidth, jammer cancellation results similar to those in Fig. 1(c) will still be obtained. There will only be a degradation of the sidelobe to the right or left of the null by about 3 dB. Furthermore, if the canceler weights calculated using the SMI (or some other adaptive algorithm) are inexact, the null depth will be degraded, but it is apparent from Fig. 3 that the sidelobe level will be unaffected except for a small amount for the sidelobes just to the right and left of the null. Increasing $M$ for the SMI computation will increase the null depth. If a jammer is not detected then it will not be canceled out. This, however, will tend to occur only if the jammer is weak, a case not of as much concern because the jammer will then only cause a small degradation in signal-to-interference ratio.
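The subtraction described above for Fig. 3 can be illustrated with a short numerical sketch. It is only a sketch under stated assumptions: a uniform linear array with half-wavelength spacing and uniform (not Chebyshev) weights is assumed, and the names `steering`, `beam_response`, and `adapted_weights` are hypothetical helpers rather than anything taken from the letter. The auxiliary beam is steered at the jammer with unit gain, scaled to the main-beam sidelobe gain in that direction, and subtracted.

```python
import cmath
import math


def steering(n_elements, angle_deg, spacing=0.5):
    """Response of a uniform linear array (spacing in wavelengths; an assumption)."""
    phase = 2.0 * math.pi * spacing * math.sin(math.radians(angle_deg))
    return [cmath.exp(1j * k * phase) for k in range(n_elements)]


def beam_response(weights, angle_deg):
    """Complex beam response w^H a(theta) for the given weight vector."""
    a = steering(len(weights), angle_deg)
    return sum(w.conjugate() * x for w, x in zip(weights, a))


def adapted_weights(main, jammer_deg):
    """Subtract a scaled auxiliary beam so the combined pattern nulls the jammer."""
    n = len(main)
    # Auxiliary beam formed with the whole array, unit gain toward the jammer.
    aux = [x / n for x in steering(n, jammer_deg)]
    # Main-beam sidelobe gain in the jammer direction.
    g = beam_response(main, jammer_deg)
    # Response of (main - c * aux) at the jammer is g - conj(c), so c = conj(g).
    c = g.conjugate()
    return [m - c * x for m, x in zip(main, aux)]


N = 16
main = [1.0 / N + 0j] * N            # uniform weights, beam at broadside
w = adapted_weights(main, 20.0)      # single jammer assumed at 20 degrees

for angle in (0.0, 20.0):
    before = abs(beam_response(main, angle))
    after = abs(beam_response(w, angle))
    print(f"{angle:5.1f} deg  before={before:.4f}  after={after:.2e}")
```

Under these assumptions the response toward the jammer drops to numerical zero, while the broadside gain changes only by the small amount contributed by the auxiliary beam's own sidelobe.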
If a jammer is estimated to be present when in fact it is not, the system will incur very little degradation in signal-to-interference ratio and in antenna sidelobe level, because the SMI weights for the channel pointing in the direction where no jammer actually exists will be very low. That weight is established by the correlation between the noise in the main channel and the noise in the auxiliary channel pointing at no jammer; these noises are independent, so the correlation on the average is zero. If there are a large number of jammers then there can be antenna sidelobe level degradation if the auxiliary jammer beams have sidelobe levels that are not low enough. If $J$ jammers are present then, in order to avoid sidelobe level degradation in the main channel, the auxiliary channel antenna sidelobe levels should be greater than $10 \log_{10} J$ decibels down, a condition that can generally be met.

ACKNOWLEDGMENT

The idea of pointing high-gain auxiliary antenna beams in the direction of the jammers appears to have first been suggested by P. W. Howells, the inventor of the IF sidelobe canceler [7]. He did not, however, form multiple auxiliary high-gain beams in an array to achieve jammer nulling performance essentially that of a fully adaptive array while avoiding the associated sidelobe degradation problem as done in this letter. W. F. Gabriel of NRL has independently done this. Fig. 1 was obtained using a simulation written by C. D. Brommer (Raytheon).

REFERENCES

[1] S. P. Applebaum, "Adaptive arrays," IEEE Trans. Antennas Propagat., vol. AP-24, no. 5, pp. 585-598, Sept. 1976.
[2] S. M. Kay and S. L. Marple, Jr., "Spectrum analysis-A modern perspective," Proc. IEEE, vol. 69, no. 11, pp. 1380-1418, Nov. 1981.
[3] J. P. Burg, "Maximum entropy spectral analysis," Ph.D. dissertation, Dept. Geophysics, Stanford U., Stanford, CA, May 1975.
[4] I. S. Reed, J. D. Mallett, and L. E. Brennan, "Rapid convergence rate in adaptive arrays," IEEE Trans. Aerosp. Electron. Syst., vol. AES-10, no. 6, pp. 853-863, Nov. 1974.
[5] W. F. Gabriel, "Adaptive arrays-An introduction," Proc. IEEE, vol. 64, no. 2, pp. 239-272, Feb. 1976.
[6] M. A. Johnson, "Phased-array beam steering by multiplex sampling," Proc. IEEE, vol. 56, pp. 1801-1811, Nov. 1968.
[7] P. W. Howells, "Explorations in fixed and adaptive resolution at GE and SURC," IEEE Trans. Antennas Propagat., vol. AP-24, no. 5, pp. 575-584, Sept. 1976.


ESPRIT-Estimation of Signal Parameters Via Rotational Invariance Techniques

RICHARD ROY AND THOMAS KAILATH, FELLOW, IEEE

Abstract-High-resolution signal parameter estimation is a problem of significance in many signal processing applications. Such applications include direction-of-arrival (DOA) estimation, system identification, and time series analysis. A novel approach to the general problem of signal parameter estimation is described. Although discussed in the context of direction-of-arrival estimation, ESPRIT can be applied to a wide variety of problems including accurate detection and estimation of sinusoids in noise. It exploits an underlying rotational invariance among signal subspaces induced by an array of sensors with a translational invariance structure. The technique, when applicable, manifests significant performance and computational advantages over previous algorithms such as MEM, Capon's MLM, and MUSIC.

I. INTRODUCTION

In many practical signal processing problems, the objective is to estimate from measurements a set of constant parameters upon which the received signals depend. For example, high-resolution direction-of-arrival (DOA) estimation is important in many sensor systems such as radar, sonar, electronic surveillance, and seismic exploration. High-resolution frequency estimation is important in numerous applications, recent examples of which include the design and control of robots and large flexible space structures. In such problems, the functional form of the underlying signals can often be assumed to be known (e.g., narrow-band plane waves, cisoids). The quantities to be estimated are parameters (e.g., frequencies and DOA's of plane waves, cisoid frequencies) upon which the sensor outputs depend, and these parameters are assumed to be constant.¹

Manuscript received January 12, 1988; revised October 5, 1988. This work was supported in part by the Joint Services Program at Stanford University (U.S. Army, U.S. Navy, U.S. Air Force) under Contract DAAG29-85-K-0048, and the SDI/IST Program managed by the Office of Naval Research, under Contract N00014-85-K-0550. The authors are with the Information Systems Laboratory, Stanford University, Stanford, CA 94305. IEEE Log Number 892R125.

¹ Extensions to situations in which the parameters may be time varying can be made; however, they rely on an inherent time-scale or eigenvalue separation between the parameter dynamics and the dynamics of the signal process. Fundamentally, the assumption is made that over time intervals long enough to collect sufficient information from which to obtain accurate parameter estimates, the parameters have not changed significantly.

There have been several approaches to such problems including the so-called maximum likelihood (ML) method of Capon (1969) and Burg's (1967) maximum entropy (ME) method. Although often successful and widely used, these methods have certain fundamental limitations (especially bias and sensitivity in parameter estimates), largely because they use an incorrect model (e.g., AR rather than special ARMA) of the measurements. Pisarenko (1973) was one of the first to exploit the structure of the data model, doing so in the context of estimation of parameters of cisoids in additive noise using a covariance approach. Schmidt (1977) and independently Bienvenu (1979) were the first to correctly exploit the measurement model in the case of sensor arrays of arbitrary form. Schmidt, in particular, accomplished this by first deriving a complete geometric solution in the absence of noise, then cleverly extending the geometric concepts to obtain a reasonable approximate solution in the presence of noise. The resulting algorithm was called MUSIC (MUltiple SIgnal Classification) and has been widely studied. In a detailed evaluation based on thousands of simulations, M.I.T.'s Lincoln Laboratory concluded that, among currently accepted high-resolution algorithms, MUSIC was the most promising and a leading candidate for further study and actual hardware implementation. However, although the performance advantages of MUSIC are substantial, they are achieved at a considerable cost in computation (searching over parameter space) and storage (of array calibration data).

In this paper, a new algorithm (ESPRIT) that dramatically reduces these computation and storage costs is presented. In the context of DOA estimation, the reductions are achieved by requiring that the sensor array possess a displacement invariance, i.e., sensors occur in matched pairs with identical displacement vectors. Fortunately, there are many practical problems in which these conditions are or can be satisfied. In addition to obtaining signal parameter estimates efficiently, optimal signal copy vectors for reconstructing the signals are elements of the ESPRIT solution as well.
ESPRIT is also manifestly more robust (i.e., less sensitive) with respect to array imperfections than previous techniques including MUSIC [1].

To make the presentation as clear as possible, an attempt is made to adhere to a somewhat standard notational convention. Lowercase boldface italic characters will generally refer to vectors. Uppercase boldface italic characters will generally refer to matrices. For either real- or complex-valued matrices, $(\cdot)^*$ will be used to denote the Hermitian conjugate (or complex-conjugate transpose) operation. Eigenvalues of square Hermitian matrices are assumed to be ordered in decreasing magnitude, as are the singular values of nonsquare matrices. Knowledge of the fundamental theorems of matrix algebra dealing with eigendecompositions and singular value decompositions (SVD) is assumed (cf. [2]).

Reprinted from IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. 37, No. 7, pp. 984-995, July 1989.

II. THE DATA MODEL

Although ESPRIT is generally applicable to a wide variety of problems, for illustrative purposes the discussions herein focus on DOA estimation. In many practical signal processing applications, data from an array of sensors are collected, and the objective is to locate point sources assumed to be radiating energy that is detectable by the sensors (cf. Fig. 1). Mathematically, such problems are quite simply, although abstractly, modeled using Green's functions for the particular differential operator that describes the physics of radiation propagation from the sources to the sensors. For the intended applications, however, a few reasonable assumptions can be invoked to make the problem analytically tractable. The transmission medium is assumed to be isotropic and nondispersive so that the radiation propagates in straight lines, and the sources are assumed to be in the far-field of the array. Consequently, the radiation impinging on the array is in the form of a sum of plane waves. For simplicity, it will initially be assumed that the problem is planar, thus reducing the location parameter space to a single-dimensional subset of
³ The definition of slowly varying is taken to mean that the approximation $s_i(t - \tau_k(\theta_i)) \approx u_i(t) \cos(\omega_0 (t - \tau_k(\theta_i)) + v_i(t))$ is valid, i.e., the amplitude and phase variations as functions of spatial position for fixed $t$ are negligible over the extent of the array.

Fig. 1. Passive sensor array geometry.

time $t$, can be written as

\[
x_k(t) = \sum_{i=1}^{d} a_k(\theta_i)\, s_i(t - \tau_k(\theta_i))
       = \sum_{i=1}^{d} a_k(\theta_i)\, s_i(t)\, e^{-j\omega_0 \tau_k(\theta_i)}, \tag{1}
\]

where $\tau_k(\theta_i)$ is the propagation delay between a reference point and the $k$th sensor for the $i$th wavefront impinging on the array from direction $\theta_i$, $a_k(\theta_i)$ is the corresponding sensor element complex response (gain and phase) at frequency $\omega_0$, and there are assumed to be $d$ point sources present. Employing vector notation for the outputs of the $m$ sensors, the data model becomes

\[
x(t) = \sum_{i=1}^{d} a(\theta_i)\, s_i(t), \tag{2}
\]

where

\[
a(\theta_i) = \left[ a_1(\theta_i)\, e^{-j\omega_0 \tau_1(\theta_i)}, \ldots, a_m(\theta_i)\, e^{-j\omega_0 \tau_m(\theta_i)} \right]^T \tag{3}
\]

is often termed the array response or array steering vector for direction $\theta_i$. Setting $A(\theta) = [a(\theta_1), \ldots, a(\theta_d)]$, $s(t) = [s_1(t), \ldots, s_d(t)]^T$, and adding measurement noise $n(t)$, the measurement model for the passive sensor array narrow-band signal processing problem is

\[
x(t) = A(\theta)\, s(t) + n(t). \tag{4}
\]

Note that $x(t), n(t) \in \mathbb{C}^m$, $s(t) \in \mathbb{C}^d$, and $A(\theta) \in \mathbb{C}^{m \times d}$, and it will be assumed that $m > d$.
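The measurement model (4) can be made concrete with a small simulation. The sketch below is illustrative only and rests on assumptions that are not in the paper: a uniform linear array with unit-gain sensors ($a_k = 1$) and half-wavelength spacing, so that $\omega_0 \tau_k(\theta) = 2\pi k \cdot 0.5 \sin\theta$ with the first sensor as reference point. The function names `steering_vector` and `snapshot` are hypothetical.

```python
import cmath
import math
import random


def steering_vector(m, theta_deg, spacing=0.5):
    """a(theta) of (3) for an assumed m-sensor uniform linear array, a_k = 1."""
    psi = 2.0 * math.pi * spacing * math.sin(math.radians(theta_deg))
    return [cmath.exp(-1j * k * psi) for k in range(m)]


def snapshot(thetas_deg, s, m, noise_std=0.0, rng=None):
    """One measurement x(t) = A(theta) s(t) + n(t) as in (4)."""
    rng = rng or random.Random(0)
    columns = [steering_vector(m, th) for th in thetas_deg]   # columns of A(theta)
    x = []
    for k in range(m):
        signal = sum(columns[i][k] * s[i] for i in range(len(s)))
        noise = complex(rng.gauss(0.0, noise_std), rng.gauss(0.0, noise_std))
        x.append(signal + noise)
    return x


# d = 2 sources observed by m = 5 sensors (m > d, as the text requires).
x = snapshot([10.0, -30.0], [1.0 + 0j, 0.5j], m=5, noise_std=0.1)
print([round(abs(v), 3) for v in x])
```

Each call produces one column of the data matrix; stacking $N$ such snapshots gives the measurements from which the covariance matrix is later estimated.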

III. THE GEOMETRIC APPROACH

In 1977, Schmidt [4] developed the MUSIC (MUltiple SIgnal Classification) algorithm by taking a geometric view of the signal parameter estimation problem. One of the major breakthroughs afforded by the MUSIC algorithm was the ability to handle arbitrary arrays of sensors. Until the mid-1970's, direction finding techniques required knowledge of the array directional sensitivity pattern in analytical form, and the task of the antenna designer was to build an array of antennas with a prespecified sensitivity pattern. The work of Schmidt essentially relieved the designer from such constraints by exploiting the reduction in analytical complexity that could be achieved by calibrating the array. Thus, the highly nonlinear problem of calculating the array response to a signal from a given direction was reduced to that of measuring and storing the response. Although MUSIC did not mitigate the computational complexity of solution to the DOA estimation problem, it did extend the applicability of high-resolution DOA estimation to arbitrary arrays of sensors.

A. Array Manifolds and Signal Subspaces

To introduce the concepts of the array manifold and the signal subspace, recall the noise-free data model $x(t) = A(\theta) s(t)$. The vectors $a(\theta_i) \in \mathbb{C}^m$, the columns of $A(\theta)$, are elements of a set (not a subspace), termed the array manifold ($\mathcal{A}$), composed of all array response (steering) vectors obtained as $\theta$ ranges over the entire parameter space. $\mathcal{A}$ is completely determined by the sensor directivity patterns and the array geometry, and can sometimes be computed analytically. However, for complex arrays that defy analytical description, $\mathcal{A}$ can be obtained by calibration (i.e., physical measurements). For azimuth-only DOA estimation, the array manifold is a one-parameter manifold that can be viewed as a rope weaving through $\mathbb{C}^m$. For azimuth and elevation DOA estimation, the manifold is a sheet in $\mathbb{C}^m$. To avoid ambiguities, it is necessary to assume that the map from $\theta = \{\theta_1, \ldots, \theta_d\}$ to ...

Footnote: Technically, a $k$-dimensional manifold in $\mathbb{C}^m$ is a subset of points in $\mathbb{C}^m$ satisfying certain local continuity and differentiability conditions. The physics of sensor arrays guarantee the continuity and differentiability properties will be satisfied. Associated with each point on the manifold is a vector to that point from the origin in $\mathbb{C}^m$, using the standard basis.

Footnote: It is also convenient to define the noise subspace ($S_X^{\perp}$) as the orthogonal complement of the signal subspace in $\mathbb{C}^m$.

B. Intersections as Solutions

The concepts of an observed signal subspace and a calibrated array manifold permit an immediate visualization of the solution. In the absence of noise, the outputs of the sensor array lie in a $d$-dimensional subspace of $\mathbb{C}^m$, the signal subspace ($S_X$) spanned by the columns of $A(\theta)$. Once $d$ independent vectors have been observed, $S_X$ is known, and intersections between the observed subspace and the array manifold yield the set of vectors from the array manifold that span the observed signal subspace. A three-sensor two-source example is graphically depicted in Fig. 2. Assuming that the sensor array has been designed such that the map from parameters to array manifold vectors is unique, the parameters are immediately determined.

Problems arise when only noisy measurements $x(t) = A(\theta) s(t) + n(t)$ of the array output are available, since $S_X$ must be estimated. Imposing the constraint that the estimate $\hat{S}_X$ be spanned by elements from $\mathcal{A}$ and assuming unknown deterministic signals and Gaussian noise, a maximum-likelihood (ML) estimator can be formulated as described in Appendix A. However, the ML solution (of obtaining a set of vectors from the array manifold that best fits all the measurements) is computationally prohibitive in most practical applications. Schmidt's idea was to employ a suboptimal two-step procedure instead. First, an unconstrained set of $d$ vectors that best fits all the measurements is found. Then points of closest approach of the space spanned by those vectors to the array manifold are sought. This procedure, although clearly suboptimal, retains some of the key properties of the ML solution, including the fact that the exact answer is obtained asymptotically as the number of measurements goes to infinity.

Footnote: The problem of degenerate signal spaces, i.e., fully correlated signals, is discussed in [1]. For the purposes of this discussion, it is assumed that the signals are not fully correlated.

C. Estimating the Signal Subspace

To obtain an unconstrained estimate of the signal subspace, the least-squares (LS) criterion is most often employed. The idea is to find a set of $d$ vectors that span a subspace of $\mathbb{C}^m$ that best fits, in an LS sense, the observed data. Assuming the signals and noise are zero-mean, a method can be derived by first examining the covariance matrix of the measurements. If the signals are modeled as stationary (zero-mean) stochastic processes, they are assumed to be uncorrelated with the noise and possess a positive definite covariance matrix $R_{SS} > 0$. If, on the other hand, a deterministic (zero-mean) signal model is chosen, a persistent excitation condition is imposed, i.e.,

\[
R_{SS} \overset{\mathrm{def}}{=} \lim_{N \to \infty} \frac{1}{N} \sum_{t=1}^{N} s(t)\, s^*(t)
\]

is assumed to exist and be positive definite. Under the conditions given above, the covariance matrix of the measurements is given by

\[
R_{XX} \overset{\mathrm{def}}{=} E\{x x^*\} = A R_{SS} A^* + \sigma^2 \Sigma_n. \tag{5}
\]

The objective is to find a set of $d$ linearly independent vectors that is contained in $S_X = \mathcal{R}\{A\}$, the subspace

ployed . Note that this estimate can be scaled by N I(N m) to obtain an unbiased estimate, but the scaling only affects the magnitudes of the eigenvalues, not the eigenvectors, so the subspace estimates remain unaffected . When the covariances are estimated from finite data matrices, the m - d smallest GE's are only clustered around, and not all equal to, (12. In this case, special statistical techniques based on likelihood ratio (LR) tests [I], [5] can be used to obtain an estimate J of the signal subspace dimension . In any case , in the presence of a finite amount of noisy measurements,
s.

D. Estimating the Signal Parameters

In the absence of noise, parameter estimates can be obtained by finding intersections of with Sx or, equivalently, finding elements of a that are orthogonal to S;. At this point in the suboptimal signal subspace algorithms, the real computational effort" begins. Even with perfect knowledge of the signal subspace, searching the array manifold for d intersections with Sx can be quite costly, especially for multidimensional parameter spaces (e.g., azimuth, elevation, and range). The problem is further complicated in the presence of noise since, with probability one, Sx a = 0; there are no intersections. Consequently, there are no elements of that are orthogonal to Si . Referring to Fig. 2, it would seem that intersections could almost always be found . In this respect, the figure is somewhat misleading. For three sensors and two sources. the signal subspace is a two-dimensional complex subspace of three-dimensional complex space . In the real field, the estimated signal subspace is actually a four-dimensional subspace of six-dimensional space, and it need not intersect the one-dimensional manifold at all . Obviously, elements of a that are closest to Sx should be considered as potential solutions, but the issue of an appropriate measure of closeness remains . Schmidt [4] proposed the following function as one possible measure" of the closeness of an element of to

Fig. 2. The geometry of MUSIC for a three-sensor two-source example (no noise). [Figure not reproduced; labeled element: Array Manifold.]

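spanned by the columns of $A$. One such set of vectors is given by $E_S = \Sigma_n [e_1 \,|\, \cdots \,|\, e_d]$, where $\{e_i,\ i = 1, \ldots, d\}$ is the set of generalized eigenvectors (GEV's) of $R_{XX}$ corresponding to the $d$ largest generalized eigenvalues (GE's). This fact follows directly from the defining property of GEV's, i.e., $R_{XX} E = \Sigma_n E \Lambda$, and their $\Sigma_n$-orthogonality ($E^* \Sigma_n E = I$) as follows:

$$
\begin{aligned}
R_{XX} E = \Sigma_n E \Lambda
&\;\Rightarrow\; A R_{SS} A^* E + \sigma^2 \Sigma_n E = \Sigma_n E \Lambda, \\
&\;\Rightarrow\; E^* A R_{SS} A^* E = E^* \Sigma_n E \Lambda - \sigma^2 E^* \Sigma_n E, \\
&\;\Rightarrow\; E^* A R_{SS} A^* E = \Lambda - \sigma^2 I, \\
&\;\Rightarrow\; A R_{SS} A^* = E^{-*} [\Lambda - \sigma^2 I] E^{-1}, \\
&\;\Rightarrow\; A R_{SS} A^* = \Sigma_n E [\Lambda - \sigma^2 I] E^* \Sigma_n.
\end{aligned}
\tag{6}
$$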

Since $A R_{SS} A^*$ is rank $d$ and positive semidefinite by construction, the $d$ largest GE's of $(R_{XX}, \Sigma_n)$ are simply the $d$ nonzero GE's of $(A R_{SS} A^*, \Sigma_n)$ augmented by $\sigma^2$, and

n

Sx:

def

$$
P_M(\theta) = \frac{a^*(\theta)\, a(\theta)}{a^*(\theta)\, E_N E_N^*\, a(\theta)}
\tag{7}
$$

'The sample covariance is known to be a sufficient statistic for the estimation problem of finding the best fit of a rank d subspace to the observed data in a least-squares sense .

i.

over (1 . In (4), the numerator was not explicitly included in the measure since it was assumed that the array manifold vectors were all normalized in some suitable fashion.


An alternative to first forming the measurement covariance matrix and then performing an eigendecomposition is to operate directly on the measurements using the singular value decomposition (SVD). In addition to avoiding squaring the data, this approach has a nice geometric interpretation. Letting $X \in \mathbb{C}^{m \times N}$ denote the data matrix, the objective is to obtain a set of vectors spanning the column space of the rank $d$ matrix, $\hat{X} \in \mathbb{C}^{m \times N}$, that best approximates $X$ in a least-squares sense. The solution is given by the $d$ left singular vectors of $X$ corresponding to the $d$ largest singular values.¹² It is easily demonstrated that the eigendecomposition and the SVD yield the same subspace estimate. If the SVD of $X/\sqrt{N}$ is given by $U \Sigma V^*$,

of the eigenvectors have been proposed by several investigators, e.g., Johnson [6]. Although conceptually simple, the one-dimensional¹⁰ MUSIC measure has several drawbacks. Primarily, problems in the finite measurement case arise from the fact that since $d$ signals are known to be present, $d$ parameter estimates, $\{\theta_1, \ldots, \theta_d\}$, should be sought simultaneously by maximizing an appropriate functional rather than obtaining estimates one at a time as is done in the search over $P_M(\theta)$. However, multidimensional searches are exponentially more expensive than one-dimensional searches. The price paid for the computational reduction achieved by employing a one-dimensional search for $d$ parameters is that the method is finite-sample-biased in the multiple source environment (cf. [1]). Furthermore, in low SNR scenarios and in situations where even small sensor array errors are present, the ability of the conventional MUSIC spectrum to resolve closely spaced sources (i.e., observe multiple peaks in the measure) is severely degraded (cf. [1]). Nevertheless, it should be emphasized that in spite of these drawbacks, MUSIC has been shown to outperform previous techniques (cf. [7]). Finally, as indicated above, MUSIC asymptotically yields unbiased parameter estimates since, as the amount of data becomes infinite, errors in the estimate $\hat{S}_X$ vanish [8]. If the noise is spatially Gaussian and temporally independent, the distribution of the eigenvectors of the sample covariance is asymptotically Gaussian, with mean equal to the true eigenvectors (assuming distinct eigenvalues¹¹) and covariances that go to zero [9], [10]. Thus, the estimated signal subspace converges in mean square to the true signal subspace, and the parameter estimates converge to the true values as well.


E. Summary of the MUSIC Algorithm

The following is a summary of the MUSIC algorithm based on the covariance approach described above.

1) Collect the data and estimate $R_{XX} = E\{xx^*\} = A R_{SS} A^* + \sigma^2 \Sigma_n$, denoting the estimate $\hat{R}_{XX}$.
2) Solve for the eigensystem: $\hat{R}_{XX} E = \Sigma_n E \Lambda$, where $\Lambda = \operatorname{diag}\{\lambda_1, \ldots, \lambda_m\}$, $\lambda_1 \ge \cdots \ge \lambda_m$, and $E = [e_1 \,|\, \cdots \,|\, e_m]$.
3) Estimate the number of sources $\hat{d}$.
4) Evaluate $P_M(\theta) = \dfrac{a^*(\theta)\, a(\theta)}{a^*(\theta)\, E_N E_N^*\, a(\theta)}$, where $E_N = \Sigma_n [e_{\hat{d}+1} \,|\, \cdots \,|\, e_m]$.
5) Find the $\hat{d}$ (largest) peaks of $P_M(\theta)$ to obtain estimates of the parameters.

¹⁰ The conventional MUSIC measure is herein referred to as a one-dimensional measure, although the search over $\mathcal{A}$ is potentially a multidimensional one, the dimension being that of the parameter vector (e.g., three for a parameter vector consisting of range, azimuth, and elevation).
¹¹ If there are repeated eigenvalues, subspace convergence is still guaranteed as long as the subspace contains all the eigenvectors associated with the repeated eigenvalues.
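The five steps of the MUSIC summary can be sketched numerically as follows. This is a minimal illustration under assumptions that are not part of the paper: a uniform linear array with half-wavelength spacing, spatially white noise ($\Sigma_n = I$, so the generalized eigenproblem reduces to an ordinary one), a known number of sources, and hypothetical helper names (`ula_steering`, `music_spectrum`, `pick_peaks`).

```python
import numpy as np

def ula_steering(theta_deg, m, spacing=0.5):
    """Steering vector of an m-element uniform linear array (assumed geometry).
    spacing is the element spacing in wavelengths."""
    k = np.arange(m)
    return np.exp(2j * np.pi * spacing * k * np.sin(np.deg2rad(theta_deg)))

def music_spectrum(X, d, grid_deg, spacing=0.5):
    """Evaluate the MUSIC measure P_M(theta) on a grid, assuming Sigma_n = I."""
    m, N = X.shape
    R_hat = X @ X.conj().T / N            # step 1: sample covariance
    _, E = np.linalg.eigh(R_hat)          # step 2: eigenvalues in ascending order
    E_N = E[:, : m - d]                   # eigenvectors of the m - d smallest eigenvalues
    P = np.empty(len(grid_deg))
    for i, th in enumerate(grid_deg):     # step 4: search over the array manifold
        a = ula_steering(th, m, spacing)
        P[i] = np.vdot(a, a).real / np.linalg.norm(E_N.conj().T @ a) ** 2
    return P

def pick_peaks(P, grid_deg, d):
    """Step 5: return the locations of the d largest local maxima of P."""
    idx = [i for i in range(1, len(P) - 1) if P[i] > P[i - 1] and P[i] >= P[i + 1]]
    idx = sorted(idx, key=lambda i: P[i], reverse=True)[:d]
    return np.sort(grid_deg[idx])

# Synthetic data: two uncorrelated sources, number of sources assumed known (step 3)
rng = np.random.default_rng(0)
m, d, N = 8, 2, 200
true_doas = np.array([-10.0, 20.0])
A = np.column_stack([ula_steering(t, m) for t in true_doas])
S = (rng.standard_normal((d, N)) + 1j * rng.standard_normal((d, N))) / np.sqrt(2)
noise = 0.1 * (rng.standard_normal((m, N)) + 1j * rng.standard_normal((m, N))) / np.sqrt(2)
X = A @ S + noise

grid = np.arange(-90.0, 90.0, 0.1)
doa_est = pick_peaks(music_spectrum(X, d, grid), grid, d)
```

The grid search in `music_spectrum` is the step whose cost grows with the resolution and dimension of the parameter space; everything before it is a single eigendecomposition.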

$$
\frac{1}{N} X X^* = U \Sigma^2 U^* = \hat{R}_{XX},
$$

since $\Sigma$ is diagonal and real, and $U$ and $V$ are unitary. Thus, the left singular vectors, $U$, of $X$ are the right eigenvectors of $\hat{R}_{XX}$, the sample covariance matrix. Thus, in the absence of finite precision effects in computing the decompositions, the subspace estimates from both techniques are identical. There are, however, significant computational differences. The computation of the full SVD of $X$ is of order $mN^2$, which can be significantly larger than the $O(m^3)$ operations required for an eigendecomposition of $\hat{R}_{XX}$. The increase in computation is due to the fact that the full SVD is also obtaining a set of $d$ vectors, the first $d$ columns of $V \in \mathbb{C}^{N \times N}$, that span the $d$-dimensional subspace of $N$-dimensional space spanned by the $d$ signal vectors, vectors in $\mathbb{C}^N$ whose components are the samples of the underlying signals.¹³ If the information of the full SVD is not required, partial SVD algorithms that compute only the left singular vectors and singular values can be employed, resulting in substantial computational savings [11]. If $N$ is the number of measurements to be processed, and $m$ is the number of elements in each sample vector, forming the sample covariance matrix requires on the order of $Nm^2$ operations. Eigendecompositions of matrices in
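The equivalence between the covariance eigendecomposition and the SVD of the scaled data matrix can be checked numerically. The following is a small sketch on random data (the dimensions are arbitrary choices for illustration); subspaces are compared through their orthogonal projectors because individual eigenvectors are only defined up to a phase factor.

```python
import numpy as np

rng = np.random.default_rng(1)
m, N, d = 6, 50, 2
X = rng.standard_normal((m, N)) + 1j * rng.standard_normal((m, N))

# Covariance route: eigendecomposition of the sample covariance
R_hat = X @ X.conj().T / N
lam, E = np.linalg.eigh(R_hat)          # ascending eigenvalues
E_sig = E[:, ::-1][:, :d]               # d principal eigenvectors

# Direct-data route: economy-size SVD of X / sqrt(N)
U, s, Vh = np.linalg.svd(X / np.sqrt(N), full_matrices=False)
U_sig = U[:, :d]                        # d principal left singular vectors

# Compare the spanned subspaces and the eigenvalues / squared singular values
P_eig = E_sig @ E_sig.conj().T
P_svd = U_sig @ U_sig.conj().T
subspace_gap = np.linalg.norm(P_eig - P_svd)
eigenvalue_gap = np.max(np.abs(s**2 - lam[::-1]))
```

Using `full_matrices=False` avoids forming the $N \times N$ matrix $V$, in the spirit of the partial SVD remark above.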

IV. ESPRIT

Although MUSIC was the first of the high-resolution algorithms to correctly exploit the underlying data model of narrow-band signals in additive noise, the algorithm has several limitations including the fact that complete knowledge of the array manifold is required, and that the search over parameter space is computationally very expensive. In this section, an approach (ESPRIT) to the signal parameter estimation problem that exploits sensor array invariances is described.¹⁴ ESPRIT is similar to MUSIC in that it correctly exploits the underlying data model, while manifesting significant advantages over MUSIC as described in Section I. To simplify the description of the basic ideas behind ESPRIT, much of the ensuing discussion is couched in terms of the problem of multiple source direction-of-arrival (DOA) estimation from data collected by an array of sensors. For simplicity, discussions deal only with single-dimensional parameter spaces, e.g., azimuth-only direction finding (DF) of far-field point sources, since the basic concepts are most easily visualized in such spaces. Narrow-band signals of known center frequency will be assumed. Recall that a DOA/DF problem is classified as narrow-band if the signal bandwidth is small compared to the inverse of the transit time of a wavefront across the array, and the array response is not a function of frequency over the signal bandwidth. The generality of the fundamental concepts on which ESPRIT is based makes the extension to higher spatial dimensions and to signals containing multiple frequencies possible.

A. Array Geometry

ESPRIT retains most of the essential features of the arbitrary array of sensors, but achieves a significant reduction in computational complexity by imposing a constraint on the structure of the sensor array, a constraint most easily described by an example. Consider a planar array of arbitrary geometry composed of $m$ sensor doublets as shown in Fig. 3. The elements in each doublet have identical sensitivity patterns and are translationally separated by a known constant displacement vector $\boldsymbol{\Delta}$. Other than the obvious requirement that each sensor have nonzero sensitivity in all directions of interest, the gain, phase, and polarization sensitivity of the elements in the doublet are arbitrary. Furthermore, there is no requirement that any of the doublets possess the same sensitivity patterns although, as discussed in [1] and [14], there are advantages to employing arrays with such characteristics.

¹⁴ A patent has been issued on the sensor array design and concepts embodied in ESPRIT [12], [13].
¹⁵ MUSIC imposes the requirement $d < 2m$, and can therefore handle roughly twice as many sources as ESPRIT in general. For uniform linear arrays, however, ESPRIT can handle as many sources as MUSIC [1] by employing overlapping subarrays.

B. The Data Model

Assume that there are $d \le m$ narrow-band sources¹⁵ centered at frequency $\omega_0$, and that the sources are located

Fig. 3. Sensor array geometry for multiple source DOA estimation using ESPRIT. [Figure not reproduced; labeled elements: Signal 1 ($s_1$), Signal 2 ($s_2$).]

sufficiently far from the array such that in homogeneous isotropic transmission media, the wavefronts impinging on the array are planar. As before, the sources may be assumed to be stationary zero-mean random processes or deterministic signals. Additive noise is present at all $2m$ sensors and is assumed to be a stationary zero-mean random process with a spatial covariance $\sigma^2 \Sigma_n$. To describe mathematically the effect of the translational invariance of the sensor array, it is convenient to describe the array as being comprised of two subarrays, $Z_X$ and $Z_Y$, identical in every respect although physically displaced (not rotated) from each other by a known displacement vector $\boldsymbol{\Delta}$ of magnitude $\Delta$. The signals received at the $i$th doublet can then be expressed as

$$
\begin{aligned}
x_i(t) &= \sum_{k=1}^{d} s_k(t)\, a_i(\theta_k) + n_{x_i}(t), \\
y_i(t) &= \sum_{k=1}^{d} s_k(t)\, e^{j \omega_0 \Delta \sin\theta_k / c}\, a_i(\theta_k) + n_{y_i}(t),
\end{aligned}
\tag{8}
$$

where $\theta_k$ is now the direction-of-arrival of the $k$th source relative to the direction of the translational displacement vector $\boldsymbol{\Delta}$. Since the sensor gain and phase patterns are arbitrary and since ESPRIT does not require any knowledge of the sensitivities, the subarray displacement vector $\boldsymbol{\Delta}$ sets not only the scale for the problem, but the reference direction as well. The DOA estimates obtained are angles-of-arrival with respect to the direction of the vector $\boldsymbol{\Delta}$. A natural consequence of this fact is the necessity for a corresponding displacement vector for each dimension in which parameter estimates are desired. Combining the outputs of each of the sensors in the two subarrays, the received data vectors can be written as follows:

$$
x(t) = A s(t) + n_x(t), \tag{9}
$$

$$
y(t) = A \Phi s(t) + n_y(t), \tag{10}
$$

where the vector $s(t)$ is the $d \times 1$ vector of impinging signals (wavefronts) as observed at the reference sensor of subarray $Z_X$. The signals can be correlated in the sense that $E\{s_i(t)\, s_j^*(t)\} \neq 0$ for $i \neq j$, although the case of coherent (or fully correlated) sources is not considered herein (cf. [1, Section 7.11] for further discussion). The matrix $\Phi$ is a diagonal $d \times d$ matrix of the phase delays between the doublet sensors for the $d$ wavefronts, and is given by

$$
\Phi = \operatorname{diag}\{e^{j\gamma_1}, \ldots, e^{j\gamma_d}\}, \tag{11}
$$

where $\gamma_k = \omega_0 \Delta \sin\theta_k / c$. $\Phi$ is a unitary matrix (operator) that relates the measurements from subarray $Z_X$ to those from subarray $Z_Y$. In the complex field, $\Phi$ is a simple scaling operator. However, it is isomorphic to the real two-dimensional rotation operator and is herein referred to as a rotation¹⁶ operator. The unitary nature of $\Phi$ is a consequence of the narrow-band planewave assumption, an assumption that leads to unit-modulus cisoidal signals in the spatial domain. In time series analysis, the diagonal elements of $\Phi$ are potentially arbitrary complex numbers, in which case $\Phi$ could be an expansive or contractive operator. Defining the total array output vector as $z(t)$, the subarray outputs can be combined to yield

$$
z(t) = \begin{bmatrix} x(t) \\ y(t) \end{bmatrix} = \bar{A} s(t) + n_z(t), \tag{12}
$$

$$
\bar{A} = \begin{bmatrix} A \\ A\Phi \end{bmatrix}, \qquad
n_z(t) = \begin{bmatrix} n_x(t) \\ n_y(t) \end{bmatrix}. \tag{13}
$$

It is the structure of $\bar{A}$ that is exploited to obtain estimates of the diagonal elements of $\Phi$ without having to know $A$. From (12), it is easily seen that the estimation problem posed is scale-invariant in the sense that absolute signal powers are not observable. For any nonsingular diagonal matrix $D$, the data model is invariant with respect to the transformations $s(t) \to D^{-1} s(t)$ and $A \to AD$. Thus, estimates of the signals and the associated array manifold vectors derived herein are to be interpreted modulo an arbitrary scale factor unless knowledge of the gain pattern of one of the sensors is available.
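The data model (9)-(13) and its scale invariance can be illustrated with a short simulation. This is a sketch under assumed values: the doublet displacement is expressed in wavelengths (so that $\omega_0 \Delta / c = 2\pi \cdot$ `delta`), the element responses are drawn at random to stand in for the arbitrary, unknown sensitivities, and the noise is taken to be white.

```python
import numpy as np

rng = np.random.default_rng(2)
m, d, N = 5, 2, 100                      # doublets, sources, snapshots (assumed values)
delta = 0.25                             # doublet displacement in wavelengths (assumed)
theta = np.deg2rad([24.0, 28.0])         # DOAs relative to the displacement vector

# Arbitrary complex element responses a_i(theta_k); ESPRIT never needs to know these
A = rng.standard_normal((m, d)) + 1j * rng.standard_normal((m, d))

# Phase delays between doublet elements, cf. (11)
gamma = 2 * np.pi * delta * np.sin(theta)
Phi = np.diag(np.exp(1j * gamma))

S = rng.standard_normal((d, N)) + 1j * rng.standard_normal((d, N))
sigma = 0.1
n_x = sigma * (rng.standard_normal((m, N)) + 1j * rng.standard_normal((m, N)))
n_y = sigma * (rng.standard_normal((m, N)) + 1j * rng.standard_normal((m, N)))

x = A @ S + n_x                          # subarray Z_X, cf. (9)
y = A @ Phi @ S + n_y                    # subarray Z_Y, cf. (10)
A_bar = np.vstack([A, A @ Phi])          # stacked manifold matrix, cf. (13)
z = np.vstack([x, y])                    # total array output, cf. (12)

# Scale invariance: s -> D^{-1} s and A_bar -> A_bar D leave the noise-free data unchanged
D = np.diag([2.0, 0.5j])
z_clean = A_bar @ S
z_rescaled = (A_bar @ D) @ (np.linalg.inv(D) @ S)
```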

C. ESPRIT-The Invariance Approach

The basic idea behind ESPRIT is to exploit the rotational invariance of the underlying signal subspace induced by the translational invariance of the sensor array. The relevant signal subspace is the one that contains the outputs from the two subarrays described above, $Z_X$ and $Z_Y$. Simultaneous sampling¹⁷ of the output of the arrays leads to two sets of vectors, $E_X$ and $E_Y$, that span the same signal subspace (ideally, that spanned by the columns of $A$).

¹⁶ This is the origin of rotational in the acronym ESPRIT.

The ESPRIT algorithm is based on the following results for the case in which the underlying $2m$-dimensional signal subspace containing the entire array output is known. In the absence of noise, the signal subspace can be obtained as before by collecting a sufficient number of measurements and finding any set of $d$ linearly independent measurement vectors. These vectors span the $d$-dimensional subspace of

$$
E_S = \bar{A} T. \tag{14}
$$

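Furthermore, the invariance structure of the array implies $E_S$ can be decomposed into $E_X \in \mathbb{C}^{m \times d}$ and $E_Y \in \mathbb{C}^{m \times d}$ (cf. $Z_X$ and $Z_Y$ subarrays) such that

$$
E_S = \begin{bmatrix} E_X \\ E_Y \end{bmatrix} = \begin{bmatrix} A T \\ A \Phi T \end{bmatrix} \tag{15}
$$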

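from which it is easily seen that $\mathcal{R}\{E_X\} = \mathcal{R}\{E_Y\} = \mathcal{R}\{A\}$. Since $E_X$ and $E_Y$ share a common column space, the rank of

$$
E_{XY} \overset{\text{def}}{=} [E_X \,|\, E_Y] \tag{16}
$$

is $d$, which implies there exists a unique (recall $d \le m$) rank $d$ matrix¹⁸ $F \in \mathbb{C}^{2d \times d}$ such that

$$
0 = [E_X \,|\, E_Y]\, F = E_X F_X + E_Y F_Y \tag{17}
$$

$$
\phantom{0} = A T F_X + A \Phi T F_Y. \tag{18}
$$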

$F$ spans the null-space of $[E_X \,|\, E_Y]$. Defining

$$
\Psi \overset{\text{def}}{=} -F_X [F_Y]^{-1}, \tag{19}
$$

¹⁷ In many practical situations, simultaneous sampling is a nontrivial hardware design issue. Although in certain of these situations it is possible to relax this condition to, say, one of uniform sampling in time instead, it is assumed herein that simultaneous sampling is performed. The extent to which this is not exactly achieved represents errors in the underlying model, errors to which ESPRIT is manifestly less sensitive than all other signal subspace based algorithms.
¹⁸ This derivation, although somewhat more lengthy than at first glance seems necessary, will prove useful when noisy estimates of $E_X$ and $E_Y$ are available.

equation (18) can be rearranged to yield¹⁹

$$
A T \Psi = A \Phi T \;\Rightarrow\; A T \Psi T^{-1} = A \Phi. \tag{20}
$$

Assuming $A$ to be full rank implies

$$
T \Psi T^{-1} = \Phi. \tag{21}
$$

Therefore, the eigenvalues of $\Psi$ must be equal to the diagonal elements of $\Phi$, and the columns of $T$ are the eigenvectors of $\Psi$. This is the key relationship in the development of ESPRIT and its properties. The signal parameters are obtained as nonlinear functions of the eigenvalues of the operator $\Psi$ that maps (rotates) one set of vectors ($E_X$) that span an $m$-dimensional signal subspace into another ($E_Y$).

D. Estimating the Subspace Rotation Operator

In practical situations where only a finite number of noisy measurements are available, $E_S$ is estimated from the covariance matrices of the measurements $\hat{R}_{ZZ}$ or, equivalently, from the data matrix $Z$. The result is that $\mathcal{R}\{\hat{E}_S\}$ is only an estimate of $S_Z$, and with probability one, $\mathcal{R}\{\hat{E}_S\} \neq$

$$
\hat{X} = [A^* A]^{-1} A^* B.
$$

It is easily verified that the estimate is unbiased and minimum variance. The extension to arbitrary, but known, covariance of the rows of $B$ is straightforward and leads to the weighted least-squares (WLS) solution. If both $A$ and $B$ are noisy, however, the LS solution is known to be biased. Since it is not difficult to argue that the estimates $\hat{E}_X$ and $\hat{E}_Y$ are equally noisy, the LS criterion is clearly inappropriate. A criterion that takes into account noise on both $A$ and $B$ is the total least-squares (TLS) criterion. The TLS criterion can be stated [2] as finding residual matrices $R_A$ and $R_B$ of minimum Frobenius norm, and $\hat{X}$ such that

$$
[A + R_A]\, \hat{X} = B + R_B. \tag{22}
$$

This criterion is easily shown to be equivalent to replacing the zero matrix in (17) by a matrix of errors the Frobenius norm of which is to be minimized (i.e., total least-squared error). If the covariance of the errors, specifically the rows of $[R_A \,|\, R_B]$, is known to within a scale factor, the TLS estimate of $X$ is strongly consistent [8]. Appending a nontriviality constraint $F^* F = I$ to eliminate the zero solution and applying standard Lagrange techniques leads to a solution for $F$ given by the eigenvectors corresponding to the $d$ smallest eigenvalues of $\hat{E}_{XY}^* \hat{E}_{XY}$. The eigenvalues of $\hat{\Psi}$ as defined above and calculated from the estimates $\hat{F}_X$ and $\hat{F}_Y$ are taken as estimates of the diagonal elements of $\Phi$.

E. Summary of the TLS ESPRIT Covariance Algorithm

The TLS ESPRIT algorithm based on a covariance formulation can be summarized as follows.

1) Obtain an estimate of $R_{ZZ}$, denoted $\hat{R}_{ZZ}$, from the measurements $Z$.
2) Compute the generalized eigendecomposition of $\{\hat{R}_{ZZ}, \Sigma_n\}$, $\hat{R}_{ZZ} \bar{E} = \Sigma_n \bar{E} \Lambda$, where $\Lambda = \operatorname{diag}\{\lambda_1, \ldots, \lambda_{2m}\}$, $\lambda_1 \ge \cdots \ge \lambda_{2m}$, and $\bar{E} = [e_1 \,|\, \cdots \,|\, e_{2m}]$.
3) Estimate the number of sources²¹ $\hat{d}$.
4) Obtain the signal subspace estimate $\hat{S}_Z = \mathcal{R}\{\hat{E}_S\}$ and decompose it to obtain $\hat{E}_X$ and $\hat{E}_Y$, where

$$
\hat{E}_S \overset{\text{def}}{=} \Sigma_n [e_1 \,|\, \cdots \,|\, e_{\hat{d}}] = \begin{bmatrix} \hat{E}_X \\ \hat{E}_Y \end{bmatrix}.
$$

5) Compute the eigendecomposition ($\lambda_1 > \cdots > \lambda_{2\hat{d}}$)

$$
\hat{E}_{XY}^* \hat{E}_{XY} \overset{\text{def}}{=} \begin{bmatrix} \hat{E}_X^* \\ \hat{E}_Y^* \end{bmatrix} [\hat{E}_X \,|\, \hat{E}_Y] = E \Lambda E^*,
$$

and partition $E$ into $\hat{d} \times \hat{d}$ submatrices,

$$
E \overset{\text{def}}{=} \begin{bmatrix} E_{11} & E_{12} \\ E_{21} & E_{22} \end{bmatrix}.
$$

6) Calculate the eigenvalues of $\Psi = -E_{12} E_{22}^{-1}$,

$$
\hat{\phi}_k = \lambda_k(-E_{12} E_{22}^{-1}), \qquad k = 1, \ldots, \hat{d}.
$$

7) Estimate $\hat{\theta}_k = f^{-1}(\hat{\phi}_k)$; e.g., for DOA estimation, $\hat{\theta}_k = \sin^{-1}\{c \arg(\hat{\phi}_k) / (\omega_0 \Delta)\}$.

For arrays with multiple invariances, such as uniform linear arrays, the decomposition of $E_S$ into $E_X$ and $E_Y$ is not unique. See [14] and [17] for more details concerning multiple invariance ESPRIT. In many instances, it is preferable to avoid forming covariance matrices, and instead to operate directly on the data as discussed in Section III. This approach leads to (generalized) singular value decompositions (GSVD's) of data matrices, and a GSVD variant of ESPRIT discussed in detail in [1]. From the key relation (21), several other quite striking results can be derived. For example, not only is knowledge of the array manifold not required, but the elements thereof associated with the estimated signal parameters (DOA's) can be estimated if desired. The same is true of the source correlation matrix, knowledge of which is not needed in ESPRIT.

¹⁹ The same argument used in deriving (14) can be used to derive (20) directly. However, the derivation only ensures the existence and uniqueness of such a full-rank $\Psi$. The advantage of the preceding derivation is that implicitly a prescription for obtaining $\Psi$ is given. Note that the existence and uniqueness of a full-rank $\Psi$ guarantees the invertibility of $F_Y$.
²⁰ Previous implementations of ESPRIT [15], [16] exploited the invariance by finding $m$-dimensional operators that best mapped one of the spanning sets, either $E_X$ or $E_Y$, into the other using a least-squares (LS) criterion. The objective was to find a $\Psi \in \mathbb{C}^{m \times m}$ such that $\Psi E_X \approx E_Y$. Since the problem is underdetermined by construction (typically $d < m$), there is no unique solution, although the parameter estimates, the $d$ eigenvalues of $\Psi$ on the unit circle that are associated with the $d$-dimensional subspace being rotated, are unique. Imposing a minimum norm constraint on $\Psi$ leads to a unique LS solution in which $m - d$ eigenvalues are equal to zero. See [1] for further details.
²¹ See [1] and [5] for details on various techniques for estimating the number of sources.

F. Array Calibration

Using the TLS formulation of ESPRIT, the array manifold vectors associated with each signal (parameter) can be estimated (to within an arbitrary scale factor). From (21), the right eigenvectors of $\Psi$ are given by $E_\Psi = T^{-1}$. This result can be used to obtain estimates of the array manifold vectors as

$$
\hat{E}_S \hat{E}_\Psi = \bar{A} T T^{-1} = \bar{A}. \tag{23}
$$

No assumption concerning the source covariance is required. Although simple to compute, this estimate will not in general conform to the invariance structure of the array in the presence of noise. In low SNR scenarios, the deviation from the assumed structure $\bar{A} = [A^T \,|\, (A\Phi)^T]^T$ may be significant. In such situations, improved estimates of the array manifold vectors can be obtained by employing the formulation discussed in [18].

G. Signal Copy

In many practical applications, not only the signal parameters, but the signals themselves, are of interest. Estimation of the signals as a function of time from an estimated DOA is termed signal copy. The basic objective is to obtain estimates $\hat{s}(t)$ of the signals $s(t)$ from the array output, $z(t) = \bar{A} s(t) + n(t)$. Employing a linear estimator, a squared-error cost criterion in the metric of the noise (which is ML if the noise is Gaussian), and conditioning on knowledge of $\bar{A}$, leads to the estimate $\hat{s}(t)$, the vector of coefficients resulting from the oblique projection of $z(t)$ onto the space spanned by the columns of $\bar{A}$ (cf. the Appendix). The resulting weight matrix $W$ (i.e., the linear estimator) whose $i$th column is a weight vector that can be used to obtain an estimate of the signal from the $i$th estimated DOA and reject those from the other DOA's is given by

$$
W = \Sigma_n^{-1} \bar{A}\, [\bar{A}^* \Sigma_n^{-1} \bar{A}]^{-1}. \tag{24}
$$

In terms of quantities already available, (24) can be written as

$$
W = \Sigma_n^{-1} \hat{E}_S\, [\hat{E}_S^* \Sigma_n^{-1} \hat{E}_S]^{-1} \hat{E}_\Psi^{-*}, \tag{25}
$$

using (23) to estimate $\bar{A}$. This equivalence is easily established since from (21) it follows that the right eigenvectors of $\Psi$ equal $T^{-1}$. Combining this fact with $E_S = \bar{A} T$ and substituting in (25) yields

$$
W^* = E_\Psi^{-1} [E_S^* \Sigma_n^{-1} E_S]^{-1} E_S^* \Sigma_n^{-1} = [\bar{A}^* \Sigma_n^{-1} \bar{A}]^{-1} \bar{A}^* \Sigma_n^{-1}. \tag{26}
$$
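The copy weights in (24) can be checked on synthetic data. The sketch below uses an arbitrary random manifold matrix and an arbitrary Hermitian positive definite noise covariance (both assumptions made purely for illustration; `copy_weights` is a hypothetical helper name) and confirms that each weight vector passes one source with unit gain while nulling the others.

```python
import numpy as np

def copy_weights(A_bar, Sigma_n):
    """Signal copy weight matrix W = Sigma_n^{-1} A [A^* Sigma_n^{-1} A]^{-1}, cf. (24)."""
    SiA = np.linalg.solve(Sigma_n, A_bar)           # Sigma_n^{-1} A without an explicit inverse
    return SiA @ np.linalg.inv(A_bar.conj().T @ SiA)

rng = np.random.default_rng(3)
n_sens, d, N = 6, 2, 50
A_bar = rng.standard_normal((n_sens, d)) + 1j * rng.standard_normal((n_sens, d))

# Arbitrary Hermitian positive definite noise covariance (assumed for the example)
B = rng.standard_normal((n_sens, n_sens)) + 1j * rng.standard_normal((n_sens, n_sens))
Sigma_n = B @ B.conj().T + n_sens * np.eye(n_sens)

W = copy_weights(A_bar, Sigma_n)

# Noise-free check: the linear estimator s_hat(t) = W^* z(t) recovers the signals exactly
S = rng.standard_normal((d, N)) + 1j * rng.standard_normal((d, N))
Z = A_bar @ S
S_hat = W.conj().T @ Z
```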

Note that the optimal copy vector is a vector that is $\Sigma_n^{-1}$-orthogonal to all but one of the vectors in the columns of $\bar{A}$ since $W^* \bar{A} = I$. There is, of course, a total least-squares alternative to conditioning on knowledge of $\bar{A}$. Since only estimates of $\bar{A}$ are available, in low SNR scenarios where accurate signal estimates are desired, the TLS approach yields improved estimates at the cost of increased computation. Although not derived herein, $\hat{s}(t)$ can be obtained by performing a (generalized) singular value decomposition of $[\bar{A} \,|\, z(t)]$. The right singular vector corresponding to the smallest singular value yields $\hat{s}(t)$ as the first $d$ elements after normalizing the last element to unity.²²

H. Source Correlation Estimation

There are several approaches that can be used to estimate the source correlations. The most straightforward is to simply note that the optimal signal copy matrix $W$ obtained above removes the spatial correlation in the observed measurements [cf. (25)]. Thus, $W^* C_{ZZ} W = D S D^*$, where $S$ is the source correlation (not covariance) matrix, $C_{ZZ} = R_{ZZ} - \sigma^2 \Sigma_n$, and the diagonal factor $D$ accounts for arbitrary normalization of the columns of $W$. Note that when $R_{ZZ}$ must be estimated, a manifestly rank $d$ estimate $\hat{C}_{ZZ} = \hat{E}_S [\Lambda_S^{(d)} - \hat{\sigma}^2 I_d] \hat{E}_S^*$ can be used [cf. (6)], where $\Lambda_S^{(d)} = \operatorname{diag}\{\lambda_1, \ldots, \lambda_d\}$ and $\lambda_i$ is a GE of $(\hat{R}_{ZZ}, \Sigma_n)$. Combining this with $E_S = \bar{A} T$ gives

$$
D S D^* = T [\Lambda_S^{(d)} - \hat{\sigma}^2 I_d] T^*. \tag{27}
$$

If a gain pattern for one of the elements is known, specifically if the gain $|a_i(\theta_k)|$ is known for all $\theta_k$ associated with sources whose power is to be estimated, then source power estimation is possible since the array manifold vectors can now be obtained with proper scaling.
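The TLS ESPRIT covariance algorithm of Section IV-E can be sketched as below. The sketch makes assumptions beyond the text: the noise is spatially white ($\Sigma_n = I$, so step 2 becomes an ordinary eigendecomposition), the number of sources is known, the doublet displacement is given in wavelengths (so $\omega_0 \Delta / c = 2\pi \cdot$ `delta`), and the function name `tls_esprit` and all simulation values are hypothetical.

```python
import numpy as np

def tls_esprit(Z, d, delta):
    """TLS ESPRIT sketch. Z is (2m, N), stacked as [x; y]; returns DOAs in degrees."""
    two_m, N = Z.shape
    m = two_m // 2
    R_hat = Z @ Z.conj().T / N                       # step 1: sample covariance
    _, E_bar = np.linalg.eigh(R_hat)                 # step 2 with Sigma_n = I (ascending)
    E_s = E_bar[:, ::-1][:, :d]                      # step 4: d principal eigenvectors
    E_x, E_y = E_s[:m], E_s[m:]
    E_xy = np.hstack([E_x, E_y])                     # m x 2d
    _, E = np.linalg.eigh(E_xy.conj().T @ E_xy)      # step 5: 2d x 2d eigendecomposition
    E = E[:, ::-1]                                   # descending order, as in the summary
    E12, E22 = E[:d, d:], E[d:, d:]                  # blocks from the d smallest eigenvalues
    Psi = -E12 @ np.linalg.inv(E22)                  # step 6: rotation operator estimate
    phi = np.linalg.eigvals(Psi)
    sin_theta = np.angle(phi) / (2 * np.pi * delta)  # step 7: invert the phase map
    return np.sort(np.rad2deg(np.arcsin(np.clip(sin_theta, -1.0, 1.0))))

# Synthetic doublet array with arbitrary element responses unknown to the estimator
rng = np.random.default_rng(4)
m, d, N = 5, 2, 500
delta = 0.5
true_doas = np.array([-20.0, 35.0])
A = rng.standard_normal((m, d)) + 1j * rng.standard_normal((m, d))
Phi = np.diag(np.exp(2j * np.pi * delta * np.sin(np.deg2rad(true_doas))))
A_bar = np.vstack([A, A @ Phi])
S = rng.standard_normal((d, N)) + 1j * rng.standard_normal((d, N))
noise = 0.05 * (rng.standard_normal((2 * m, N)) + 1j * rng.standard_normal((2 * m, N)))
Z = A_bar @ S + noise

doa_est = tls_esprit(Z, d, delta)
```

Note that no array manifold appears anywhere in `tls_esprit`, and no search is performed: the estimates come directly from the eigenvalues of a $d \times d$ matrix.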

²² This approach is clearly suboptimal if the sampled signals are temporally correlated in the sense that $E\{s_i(t)\, s_j(t + \tau)\} \neq 0$ for $\tau \neq 0$. If, for example, the signals are known to be sinusoidally modulated RF and uniform temporal sampling is employed, then estimating the underlying signals requires only the estimation of the modulation frequency, another problem well suited to ESPRIT. Note that in general the modulation frequency must be a small fraction of the carrier to satisfy the narrow-band assumption.

V. SIMULATION RESULTS

Many simulations have been conducted exploring different aspects of ESPRIT and making comparisons to other techniques (cf. [1]). Herein, only one of the scenarios, but one that addresses several issues that arise in a practical implementation of ESPRIT, is presented. Thus, sensor gain and phase errors, as well as sensor spacing errors, are included. Furthermore, unequal source powers and a high degree of source correlation are assumed. More specifically, the array chosen was a ten-element array with doublet spacing $\lambda/4$ and the five doublets randomly spaced on a line resulting in an aperture of approximately $4\lambda$. Two sources were located at 24° and 28° (approximately 0.3 Rayleigh or 3 dB beamwidth separation), and were of unequal powers, 20 dB and 15 dB, respectively. Sensor errors were introduced by zero-mean normal random additive errors with sigmas of 0.1 dB in amplitude and 2° in phase²³ (independent of angle). Sensor location errors (along the axis of the array) with sigma $0.005(\lambda/2)$ were included as well. The sources were 90 percent temporally correlated and 5000 trials were run. A histogram of the results is given in Fig. 4. The number of sources was assumed to be known in the implementation of both MUSIC and ESPRIT. The indicated failure rate for MUSIC of 37 percent is the percentage of trials in which the conventional MUSIC spectrum did not exhibit two peaks in the interval [20°, 32°]. This, of course, is not an issue in ESPRIT, where two parameter estimates are obtained every time. The sample means and sigmas of the ESPRIT estimates were 23.93° ± 1.07° and 28.06° ± 1.37°, while those of the 3175 successful MUSIC trials were 24.35° ± 0.28° and 27.48° ± 0.38°. Note that with reference to Fig. 4, there is an overlap in the distributions of the ESPRIT estimates.
This has an effect on the statistics calculated, since a simple angle-ordering scheme was used wherein the larger of the two angle estimates in each trial was associated with the 28° source.²⁴ The effect is presumed to be small in this case. The results indicate the presence of a bias even in the successful conventional MUSIC estimates, the source of which is described in detail in [1]. On the other hand, the ESPRIT estimates are unbiased, although of larger variance since less information concerning the array geometry is being utilized. Note also that in comparing the estimate variances, there is no attempt to account for the 1825 trials in which MUSIC failed²⁵ to provide two DOA estimates! However, as the subarray separation increases, the ESPRIT parameter estimate variances approach those of MUSIC. The same experiment was run for a subarray separation of $4\lambda$, and the resulting ESPRIT estimates were 24.003° ± 0.062° and 28.002° ± 0.089°, and the corresponding MUSIC estimates were 24.011° ± 0.056° and 27.986° ± 0.078°. Due to the increased subarray spacing (array aperture), there were no MUSIC failures. Again, ESPRIT is unbiased, but now the sample parameter estimate sigmas are nearly equal to those obtained with MUSIC. More details concerning these and many other simulations can be found in [1] (see also [18]).

Fig. 4. Histogram of MUSIC and ESPRIT results: random ten-element linear array, source correlation 90 percent, small array aperture ($\Delta = \lambda/4$). [Plot not reproduced. Panel title: ESPRIT/MUSIC Monte Carlo Results, Source Correlation 90%. Horizontal axis: DOA (deg). Annotations: SNR = [20, 15] dB, 100 points/trial, 5000 trials, m = 10, d = 2, MUSIC failure rate 37%.]

²³ In DOA estimation applications, errors of these magnitudes are considered small; however, it is apparently possible to construct arrays meeting these specifications.
²⁴ This does not imply that the statistics are compiled by splitting the histogram down the middle and computing the center of mass and second moments of the truncated distributions.
²⁵ Unfortunately, this is often referred to as a failure of MUSIC to resolve the two sources. In fact, the detection of the number of sources was made in a prior stage of the algorithm. The failure of the conventional MUSIC spectrum (measure) to provide the appropriate number of peaks (estimates) indicates only the inappropriateness of the measure! Having detected $d$ sources, an algorithm for finding $d$ estimates simultaneously is appropriate.

VI. DISCUSSION

In this paper, a new technique (ESPRIT) for signal parameter estimation has been introduced. The algorithm differs from its predecessors in that a total least-squares rather than a standard least-squares (LS) criterion is employed. The earlier versions of ESPRIT described in [15] and [16] can be seen [1] to be least-squares estimators of an $m \times m$ operator whose action is restricted to a $d$-dimensional subspace. The fact that this LS operator is a restricted $m \times m$ operator leads to some concern over potential numerical difficulties in solving the generalized eigenproblem. Imposing the subspace restriction as a constraint prior to solving the generalized eigenproblem leads to a well-conditioned $d \times d$ generalized eigenproblem, thereby mitigating these numerical concerns; however, the least-squares property of the estimate is retained. In cases where the SNR is sufficiently large, the difference between the LS and TLS parameter estimates is small. The difference is notable at low SNR's, however; the LS estimates are biased as predicted, while the TLS estimates are relatively unbiased and have been shown [8] to be strongly consistent (converge with probability one to the true values).

A. ESPRIT and Null-Steering

Many of the previous high-resolution parameter estimation techniques are based on steering beams toward signal directions and, in some cases, simultaneously attempting to otherwise minimize the power in some weighted combination of sensor outputs. Parameter estimates are associated with DOA's at which peaks in the power output occur. Although intuitively appealing at first glance, there is a much more powerful alternative philosophy, that of null-steering. It is well known that deep, sharp notches in directivity patterns and filter gain functions are much easier to achieve than sharp peaks. Interferometers exploit this fact to obtain accurate estimates of source parameters by finding the relative phase required to cancel signal components in two channels. In this context, ESPRIT can be interpreted as a multidimensional null-steering parameter estimation algorithm. Calculation of the eigenvalues of the (rotation) operator $\Psi$, which are the roots of its characteristic polynomial, can be interpreted as multidimensional null-steering. Instead of steering broad beams, ESPRIT steers sharp nulls at all sources simultaneously and does so without relying on knowledge of the array manifold!

B. Computational Advantages of ESPRIT

The primary computational advantage of ESPRIT is that it eliminates the search procedure inherent in all previous methods (ML, ME, MUSIC). ESPRIT produces signal parameter estimates directly in terms of (generalized) eigenvalues. As noted previously, this involves computations of the order $d^3$. On the other hand, MUSIC and the other high-resolution techniques require a search over $\mathcal{A}$, and it is this search that is computationally expensive. The significant computational advantage of ESPRIT becomes even more pronounced in multidimensional parameter estimation where the computational load grows linearly with dimension in ESPRIT, while that of MUSIC grows exponentially. If $r_l$ is the resolution (i.e., number of vectors) required in the calibration of $\mathcal{A}$ for the $l$th dimension in $\theta$, the computation required to search over $L$ dimensions for $d$ parameter vectors is proportional to $\prod_{l=1}^{L} r_l$. For $r_l = r$, the computational load is $r^L$.
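As a rough illustration of this scaling (the calibration resolution is an assumed value chosen for the example, not a figure from the paper), suppose each dimension of the manifold is calibrated at $r = 10^3$ points:

$$
L = 1:\; r^L = 10^3, \qquad L = 2:\; r^L = 10^6, \qquad L = 3:\; r^L = 10^9 .
$$

Each added parameter dimension therefore multiplies the number of manifold vectors to be searched by $r$.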

the optimization can be carried out in two steps. A solution for the optimal $\hat{s}(t)$ is sought as a function of $\theta$; then the maximization over $\theta$ is performed. Employing this procedure gives $\hat{s}(t) = W^*(\theta) z(t)$, where $W(\theta) = \Sigma_n^{-1} A(\theta) [A^*(\theta) \Sigma_n^{-1} A(\theta)]^{-1}$. Substituting the expression for $\hat{s}(t)$ back into (29) and using standard properties of the trace operator,

$$\mathcal{L}(\theta) \propto -\mathrm{Tr}\{P_A^{\perp}(\theta) R_{zz} \Sigma_n^{-1}\} \qquad (30)$$

where $P_A^{\perp}(\theta)$ is the oblique projection operator onto the complement of the space spanned by $A(\theta)$ (in the metric $\Sigma_n$). Maximization of this criterion is equivalent to finding

$$\max_{\theta} \mathrm{Tr}\{P_A(\theta) R_{zz} \Sigma_n^{-1}\} \qquad (31)$$

as can be easily verified. Although easy to describe analytically, the computational burden of actually carrying out the multidimensional projection and maximization over $\theta$ is generally prohibitive, resulting in the need for reasonable approximate solutions such as MUSIC and ESPRIT.

ACKNOWLEDGMENT

The authors would like to express their gratitude to the referees for their time and effort in carefully reviewing the preliminary manuscript. The authors also thank J. Speiser of NOSC and Prof. G. Golub of Stanford, who suggested the potential application of TLS concepts to this problem. The incorporation of their suggestions and comments significantly improved the clarity of the final manuscript.


APPENDIX

THE MAXIMUM LIKELIHOOD ESTIMATOR

For the class of problems considered herein, the maximum likelihood estimator is simple to derive analytically, although, in most practical real-time applications, computationally prohibitive. For nonrandom signals in Gaussian noise with covariance $\Sigma_n$, $z(t) = A(\theta) s(t) + n(t)$, the likelihood function is easily written [5], [1]:

$$\mathcal{L}[z(t)] = -\ln\left[p\{z(t) \mid z(t) = A(\theta) s(t) + n\}\right] \qquad (28)$$

$$\propto -[z(t) - A(\theta) s(t)]^* \, \Sigma_n^{-1} \, [z(t) - A(\theta) s(t)]. \qquad (29)$$

The maximization of $\mathcal{L}$ is over $\{s(t);\ t \in [0, N]\} \times \{\theta \in \Theta\}$, and is therefore a nonlinear optimization problem. It belongs, however, to the class of separable nonlinear optimization problems. Golub and Pereyra [19] prove that


REFERENCES

[1] R. H. Roy, "ESPRIT-Estimation of signal parameters via rotational invariance techniques," Ph.D. dissertation, Stanford Univ., Stanford, CA, 1987.
[2] G. H. Golub and C. F. Van Loan, Matrix Computations. Baltimore, MD: Johns Hopkins University Press, 1984.
[3] H. L. Van Trees, Detection, Estimation, and Modulation Theory. New York: Wiley, 1971.
[4] R. O. Schmidt, "A signal subspace approach to multiple emitter location and spectral estimation," Ph.D. dissertation, Stanford Univ., Stanford, CA, 1981.
[5] M. Wax, "Detection and estimation of superimposed signals," Ph.D. dissertation, Stanford Univ., Stanford, CA, 1985.
[6] D. H. Johnson and S. R. DeGraaf, "Improving the resolution of bearing in passive sonar arrays by eigenvalue analysis," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-30, pp. 638-647, Aug. 1982.
[7] A. J. Barabell, J. Capon, D. F. Delong, J. R. Johnson, and K. Senne, "Performance comparison of superresolution array processing algorithms," Tech. Rep. TST-72, Lincoln Lab., M.I.T., 1984.
[8] L. J. Gleser, "Estimation in a multivariate 'errors in variables' regression model: Large sample results," The Annals of Statistics, vol. 9, pp. 22-44, 1981.
[9] T. W. Anderson, "Asymptotic theory for principal component analysis," Ann. Math. Statist., vol. 34, pp. 122-148, 1963.
[10] -, An Introduction to Multivariate Statistical Analysis, 2nd ed. New York: Wiley, 1985.
[11] J. J. Dongarra, J. R. Bunch, C. B. Moler, and G. W. Stewart, LINPACK Users' Guide. Philadelphia, PA: SIAM, 1979.
[12] A. Paulraj, R. Roy, and T. Kailath, "Methods and means for signal reception and parameter estimation," Stanford Univ., Stanford, CA, 1985. Patent Application.
[13] R. Roy, A. Paulraj, and T. Kailath, "Method for estimating signal source locations and signal parameters using an array of signal sensor pairs," U.S. Patent 4 750 147, June 7, 1988.
[14] R. Roy, B. Ottersten, L. Swindlehurst, and T. Kailath, "Multiple invariance ESPRIT," in Proc. 22nd Asilomar Conf. Signals, Syst., Comput., Asilomar, CA, Nov. 1988.
[15] R. Roy, A. Paulraj, and T. Kailath, "ESPRIT-A subspace rotation approach to estimation of parameters of cisoids in noise," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, pp. 1340-1342, Oct. 1986.
[16] A. Paulraj, R. Roy, and T. Kailath, "A subspace rotation approach to signal parameter estimation," Proc. IEEE, pp. 1044-1045, July 1986.
[17] R. Roy, B. Ottersten, L. Swindlehurst, and T. Kailath, "Multiple invariance ESPRIT," IEEE Trans. Acoust., Speech, Signal Processing, in preparation.
[18] R. Roy and T. Kailath, "ESPRIT-Estimation of signal parameters via rotational invariance techniques," Opt. Eng., 1988, in review.
[19] G. H. Golub and V. Pereyra, "The differentiation of pseudo-inverses and nonlinear least squares problems whose variables separate," SIAM J. Numer. Anal., vol. 10, pp. 413-432, 1973.

Footnote to the Appendix: This development assumes that the number of signals is known. When the number of signals is not known a priori, the maximum likelihood estimator must be redefined [5], [1].


Spectral Self-Coherence Restoral: A New Approach to Blind Adaptive Signal Extraction Using Antenna Arrays

BRIAN G. AGEE, MEMBER, IEEE, STEPHAN V. SCHELL, STUDENT MEMBER, IEEE, AND WILLIAM A. GARDNER, SENIOR MEMBER, IEEE

A new approach to blind adaptive signal extraction using narrowband antenna arrays is presented. This approach has the capability to extract communication signals from co-channel interference environments using only known spectral correlation properties of those signals; in other words, without using knowledge of the content or direction of arrival of the transmitted signal, or the array manifold or background noise covariance of the receiver, to train the antenna array. The class of spectral self-coherence restoral (SCORE) objective functions is introduced, and algorithms for adapting antenna arrays to optimize these objective functions are developed. Using the theory of spectral correlation, it is shown via analysis and simulation that these algorithms maximize the signal-to-interference-and-noise ratio at the output of a narrowband antenna array, when a single communication signal with spectral self-coherence at a known value of frequency separation and an arbitrary number of interferers without spectral self-coherence at that frequency separation are impinging on the array. It is also shown that the SCORE processors can nearly optimally extract communication signals from environments containing multiple signals with spectral self-coherence at the same value of frequency separation.

Manuscript received January 16, 1989; revised August 12, 1989. This work was supported in part by a grant from ESL, Inc., with partially matching funds from the California State MICRO Program. B. G. Agee is with AGI Engineering Consulting, Woodland, CA 95695, U.S.A. S. V. Schell and W. A. Gardner are with the Dept. of Electrical Engineering and Computer Science, University of California, Davis, CA 95616, U.S.A. IEEE Log Number 9034609.

Reprinted from Proceedings of the IEEE, Vol. 78, No. 4, pp. 753-767, April 1990.

I. INTRODUCTION

The need for blind adaptive signal extraction is growing in a number of signal processing applications. The ability to adapt a receiver processor to remove unknown or time-varying distortion and interference from a signal of interest (SOI), without using knowledge of the transmission channel or waveform to train the processor, can significantly reduce cost and outage time in telephony and microwave communication systems. Blind adaptive processing can also allow signal extraction to be performed in many other applications where it is impractical or impossible to provide such knowledge to the adaptive processor, for example, in mobile radio and in regenerative satellite communication systems, where it can be too costly to provide an adaptive processor with a separate training sequence for each signal received by the transponder or receiver, and in broadcast FM receivers and military reconnaissance and communication systems, where the SOI and interference waveform and channel parameters are typically unknown and time-varying during the reception time. Antenna arrays provide a particularly useful means for performing this signal extraction in microwave receiver systems, where the most significant signal corruption is caused by co-channel interference. Blind array adaptation algorithms developed to date can be divided into two broad categories: property restoral techniques, which adapt arrays to restore a known set of SOI properties to the array output signal, and spatial

geometry is changing with time), while ESPRIT imposes a structural constraint on the sensor array that can be difficult to satisfy in some practical applications, and that can reduce the degrees of freedom (null-steering capability) of the overall array by as much as 50%. In addition, MUSIC and ESPRIT both require knowledge or estimation of the covariance of the background noise and of the interfering signals that are not to be extracted by the array. This requirement can limit the application of these techniques in environments where the background noise and interference statistics are unknown or varying during the reception time. All of these techniques suffer from the additional problem that they are nondiscriminatory, and must therefore extract all of the unknown signals received by the array and rely on additional downstream processing to separate the SOIs from the interferers. This drawback can be of critical importance in systems where the number of signals impinging on the array is high, especially if only a few of those signals are of interest to the receiver processor.

This paper presents the new class of spectral self-coherence restoral (SCORE) algorithms, which have the potential to overcome these limitations. A property held by most communication signals is that they are correlated with frequency-shifted and possibly conjugated versions of themselves for certain discrete values of frequency shift. This property, referred to here as spectral self-coherence or spectral conjugate self-coherence, is commonly induced by periodic gating, mixing, or multiplexing operations at the transmitter. For instance, spectral self-coherence is induced at multiples of the symbol rate in PCM signals and multiples of the pilot-tone frequency in FDM-FM signals, and spectral conjugate self-coherence is commonly induced at twice the carrier frequency in BPSK and AM signals.
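As a rough numerical illustration of the last statement, the sketch below estimates the lag-zero conjugate correlation $\langle s(t)s(t)e^{-j2\pi\alpha t}\rangle$ of a synthetic BPSK signal and normalizes it by the signal power. The signal parameters (carrier of 0.1 cycles/sample, 8 samples per symbol, rectangular pulses) and the helper names are assumptions chosen for the example, not values from the paper.

```python
import numpy as np

def cyclic_corr(s, alpha, conjugate=False):
    """Finite-time, lag-zero cyclic correlation at frequency shift alpha (cycles/sample).
    conjugate=True correlates s with itself (conjugate self-coherence);
    conjugate=False correlates s with its complex conjugate (ordinary self-coherence)."""
    n = np.arange(len(s))
    partner = s if conjugate else np.conj(s)
    return np.mean(s * partner * np.exp(-2j * np.pi * alpha * n))

def coherence_strength(s, alpha, conjugate=False):
    """Magnitude of the cyclic correlation normalized by the average power."""
    return abs(cyclic_corr(s, alpha, conjugate)) / np.mean(np.abs(s) ** 2)

rng = np.random.default_rng(0)
fc, sps, n_sym = 0.1, 8, 500                 # assumed carrier, samples/symbol, symbols
symbols = rng.choice([-1.0, 1.0], size=n_sym)
baseband = np.repeat(symbols, sps)           # rectangular-pulse BPSK baseband
n = np.arange(baseband.size)
bpsk = baseband * np.exp(2j * np.pi * fc * n)

on_target = coherence_strength(bpsk, 2 * fc, conjugate=True)
off_target = coherence_strength(bpsk, 2 * fc + 0.05, conjugate=True)
print(on_target, off_target)
```

The strength is close to one at twice the carrier and close to zero at a nearby frequency shift, which is the discreteness in frequency shift that the text describes.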
The spectral self-coherence of a received signal is degraded if it is corrupted by additive interference that is not spectrally self-coherent at the same value of frequency shift, for instance, if a PCM SOI is corrupted at the receiver by a PCM interferer with a different symbol rate. The SCORE algorithms adapt a receiver array to restore this SOI self-coherence, and thereby reduce the power of the interference in the receiver output signal.

Section II introduces the fundamental concepts of spectral self-coherence and conjugate self-coherence, and motivates the development of the SCORE algorithms. Section III introduces the basic SCORE algorithms presented here: the least-squares SCORE, cross-SCORE, and auto-SCORE algorithms. Section IV analyzes the asymptotic (infinite time-average) performance of these algorithms in the rank-$L_\alpha$ spectral self-coherence environment, where $L_\alpha$ signals with spectral self-coherence or conjugate self-coherence at a target frequency shift $\alpha$ are received by an antenna array. Section V evaluates the performance of the SCORE algorithms in the rank-1 and rank-$L_\alpha$ spectral self-coherence environments via computer simulation.

II. THE SPECTRAL SELF-COHERENCE CONCEPT

A scalar waveform $s(t)$ is said to be spectrally self-coherent at frequency separation $\alpha$ [6] if the correlation between $s(t)$ and $s(t)$ frequency-shifted by $\alpha$ is nonzero for some lag $\tau$, that is, if

$$\rho^{\alpha}_{ss}(\tau) \triangleq \frac{\langle s(t+\tau/2)\,[s(t-\tau/2)e^{j2\pi\alpha t}]^* \rangle_\infty}{\sqrt{\langle |s(t+\tau/2)|^2 \rangle_\infty \, \langle |s(t-\tau/2)e^{j2\pi\alpha t}|^2 \rangle_\infty}} = \frac{R^{\alpha}_{ss}(\tau)}{R_{ss}(0)} \neq 0 \qquad (1)$$

at some value of $\tau$, where $\langle \cdot \rangle_\infty$ denotes infinite time-averaging. Similarly, a signal waveform $s(t)$ is said to be spectrally conjugate self-coherent at frequency separation $\alpha$ if the correlation between $s(t)$ and the conjugate of $s(t)$ frequency-shifted by $\alpha$ is nonzero for some lag $\tau$, that is, if

$$\rho^{\alpha}_{ss^*}(\tau) \triangleq \frac{\langle s(t+\tau/2)\,[s^*(t-\tau/2)e^{j2\pi\alpha t}]^* \rangle_\infty}{\sqrt{\langle |s(t+\tau/2)|^2 \rangle_\infty \, \langle |s^*(t-\tau/2)e^{j2\pi\alpha t}|^2 \rangle_\infty}} = \frac{R^{\alpha}_{ss^*}(\tau)}{R_{ss}(0)} \neq 0 \qquad (2)$$

at some value of $\tau$. The functions $\rho^{\alpha}_{ss}(\tau)$ and $\rho^{\alpha}_{ss^*}(\tau)$ are referred to here as the spectral self-coherence function and the spectral conjugate self-coherence function of $s(t)$, respectively; the functions $R^{\alpha}_{ss}(\tau)$ and $R^{\alpha}_{ss^*}(\tau)$ are referred to here as the cyclic autocorrelation function and the cyclic conjugate correlation function of $s(t)$, respectively, and are defined by

$$R^{\alpha}_{ss}(\tau) \triangleq \langle s(t+\tau/2)\,s^*(t-\tau/2)\,e^{-j2\pi\alpha t} \rangle_\infty \qquad (3)$$

$$R^{\alpha}_{ss^*}(\tau) \triangleq \langle s(t+\tau/2)\,s(t-\tau/2)\,e^{-j2\pi\alpha t} \rangle_\infty. \qquad (4)$$

An M-element vector waveform $x(t)$ is said to be rank-$L_\alpha$ spectrally self-coherent at frequency separation $\alpha$ or rank-$L_\alpha$ spectrally conjugate self-coherent at frequency separation $\alpha$ if the respective cyclic autocorrelation matrix $R^{\alpha}_{xx}(\tau)$ or cyclic conjugate correlation matrix $R^{\alpha}_{xx^*}(\tau)$

$$R^{\alpha}_{xx}(\tau) \triangleq \langle x(t+\tau/2)\,x^H(t-\tau/2)\,e^{-j2\pi\alpha t} \rangle_\infty \qquad (5)$$

$$R^{\alpha}_{xx^*}(\tau) \triangleq \langle x(t+\tau/2)\,x^T(t-\tau/2)\,e^{-j2\pi\alpha t} \rangle_\infty \qquad (6)$$

has rank $L_\alpha$ ($L_\alpha \le M$) at frequency-shift $\alpha$ for some lag $\tau$, where $T$ and $H$ denote transpose and conjugate-transpose (Hermitian transpose) operations, respectively.¹

The spectral self-coherence and conjugate self-coherence properties for a DSB-AM waveform are illustrated in Fig. 1 and Fig. 2, respectively.² If the real (bandpass representation) signal is under investigation (Fig. 1), then this modulation type has conjugate-symmetric frequency content both about its carrier ($f = f_0$) and about DC ($f = 0$). The

Fig. 1. Spectral self-coherence for a real DSB-AM signal.

¹The shorthand notation $R_{ss}(\tau) \triangleq R^{\alpha}_{ss}(\tau)|_{\alpha=0}$, $R^{\alpha}_{ss} \triangleq R^{\alpha}_{ss}(\tau)|_{\tau=0}$, and $R_{ss} \triangleq R^{\alpha}_{ss}(\tau)|_{\alpha=0,\tau=0}$ is used throughout this paper to reduce the level of notation. A similar convention is employed for the spectral self-coherence function $\rho^{\alpha}_{ss}(\tau)$.

²Note that the Fourier transforms shown in these figures are only used as heuristic aids to illustrate the concepts of spectral self-coherence and conjugate self-coherence. Signals that are Fourier transformable cannot exhibit spectral self-coherence, because they cannot have finite average power [7].


if their baseband is spectrally self-coherent, for instance, if their baseband is a TDM waveform. The function $|\rho^{\alpha}_{ss^{(*)}}(\tau)|^2$ can be interpreted as a measure of the relative strength of $s(t)$ contained within $s^{(*)}(t-\tau)e^{j2\pi\alpha t}$, where the optional conjugation $(*)$ is only applied if conjugate self-coherence is being measured. Using the Orthogonal Projection Theorem, $s^{(*)}(t-\tau)e^{j2\pi\alpha t}$ can be represented by


(7)

where $s(t)$ and $\epsilon(t)$ are equal-power orthogonal waveforms ($R_{s\epsilon} = 0$). Therefore, $s^{(*)}(t-\tau)e^{j2\pi\alpha t}$ can be thought of as a scaled and corrupted replica of $s(t)$, with a signal-to-corruption ratio of

Fig. 2. Spectral conjugate self-coherence for an analytic DSB-AM signal.


combined effect of these symmetries renders the negative-frequency component of the modulated signal equal to the positive-frequency component of the modulated signal, except for a complex phase-shift. The overall signal is therefore correlated with a frequency-shifted version of itself when the frequency shift is exactly equal to twice the signal carrier (Fig. 1), that is, the signal is spectrally self-coherent at $\alpha = 2f_0$. This correlation can be removed by converting the signal to its analytic representation, that is, by removing the negative-frequency signal component using a complex filter. However, the original negative-frequency component can be recreated by conjugating the analytic signal, which reflects the signal through the DC axis (Fig. 2). The conjugated signal is then correlated with the original signal after a frequency shift of exactly twice the carrier, that is, the original signal is spectrally conjugate self-coherent at $\alpha = 2f_0$. The spectral self-coherence functions and cyclic correlation functions are developed in detail in the theory of spectral correlation [6], [7], where it is shown that complex wide-sense cyclostationary and wide-sense almost-cyclostationary waveforms exhibit spectral self-coherence or conjugate self-coherence at discrete multiples of the time periodicities of the waveform statistics. Table 1 lists the self-

$$\gamma_{SCR}(\alpha, \tau) = \frac{|\rho^{\alpha}_{ss^{(*)}}(\tau)|^2}{1 - |\rho^{\alpha}_{ss^{(*)}}(\tau)|^2}. \qquad (8)$$

This ratio varies between zero and infinity as $|\rho^{\alpha}_{ss^{(*)}}(\tau)|^2$ varies between zero and unity. The utility of the self-coherence concept can best be seen in interference environments. Consider the environment where a scalar waveform $x(t)$ is equal to a scaled SOI $s(t)$ plus an independent interference signal $i(t)$, $x(t) = as(t) + i(t)$. If $s(t)$ is spectrally self-coherent at frequency separation $\alpha$, but $i(t)$ is not spectrally self-coherent at $\alpha$, then the cyclic autocorrelation of $x(t)$ is given by

that is, the infinite time-averaged cyclic autocorrelation of $x(t)$ is unchanged by the addition of arbitrary interference, provided that the interference is not spectrally self-coherent at frequency separation $\alpha$. A useful interpretation of (9) is that the frequency-shift and optional conjugation operations completely decorrelate the interference component of $x(t)$, but only partially decorrelate the SOI component of $x(t)$. In terms of the decomposition given in (7), $x^{(*)}(t-\tau)e^{j2\pi\alpha t}$ can be expressed in terms of the signal and interference components of $x(t)$ by

(10)

Table 1  Examples of Spectrally Self-Coherent Signals

Complex Modulation Format   Self-Coherence Frequencies   Conj. Self-Coherence Freq. (@ 2 x Carrier)
ASK, BPSK                   Baud-rate mult.              Baud-rate mult.
QPSK                        Symbol-rate mult.            None
MSK, SQPSK                  Baud-rate mult.              ±1/2 baud rate
CPFSK                       Symbol-rate mult.            2 x symbol frequencies*
FDM-FM                      Pilot-tone mult.             None
DSB-AM, VSB-AM              None                         2 x carrier
SSB-AM                      None                         None

*Frequency deviation: multiple of 1n only.

coherence and conjugate self-coherence frequencies for a number of common modulation formats. As this table shows, this class of waveforms includes most communication signals; for instance, all PCM signals exhibit spectral self-coherence at multiples of their baud-rate, and ASK and BPSK signals are in addition spectrally conjugate self-coherent at twice their carrier frequency. Furthermore, many nominally stationary signals can be spectrally self-coherent

$$\tilde{i}(t) = a\sqrt{1 - |\rho^{\alpha}_{ss^{(*)}}(\tau)|^2}\;\epsilon(t) + i^{(*)}(t-\tau)e^{j2\pi\alpha t} \qquad (12)$$

and where $\epsilon(t)$ is uncorrelated with both $s(t)$ and $i(t)$. Equation (10) motivates the development of interference cancellation techniques that use $x^{(*)}(t-\tau)e^{j2\pi\alpha t}$ as the reference signal in a conventional least-squares algorithm. A different interpretation can be obtained by noting that the spectral self-coherence of $x(t)$ in the above example is reduced when interference that is not spectrally self-coherent at shift $\alpha$ is added to the received environment. In this case the self-coherence strength of $x(t)$ is degraded to

for a signal-to-interference-and-noise ratio of $|a|^2 R_{ss}/R_{ii}$. Equation (13) motivates the development of interference


jugate self-coherence is to be restored by the processor. If

cancellation techniques that extract SOIs by optimizing some direct or indirect measure of their self-coherence.
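The two interpretations in Section II (the cyclic correlation is unchanged by interference that lacks self-coherence at the target shift, while the normalized self-coherence strength is degraded) can be checked numerically. The following is a minimal sketch under assumed conditions: a BPSK SOI with a carrier of 0.1 cycles/sample, unit-power circular Gaussian interference, lag zero, and the conjugate form of the correlation. None of these values come from the paper.

```python
import numpy as np

def cyclic_conj_corr(x, alpha):
    """Finite-time, lag-zero cyclic conjugate correlation <x(t) x(t) exp(-j 2 pi alpha t)>."""
    n = np.arange(len(x))
    return np.mean(x * x * np.exp(-2j * np.pi * alpha * n))

rng = np.random.default_rng(0)
N, fc = 20000, 0.1                                   # assumed record length and carrier
n = np.arange(N)
s = np.repeat(rng.choice([-1.0, 1.0], size=N // 8), 8) * np.exp(2j * np.pi * fc * n)
# Interference: unit-power circular complex Gaussian, no conjugate self-coherence.
i = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
a = 1.0                                              # assumed SOI scaling
x = a * s + i

alpha = 2 * fc
R_s = cyclic_conj_corr(s, alpha)
R_x = cyclic_conj_corr(x, alpha)
strength_s = abs(R_s) / np.mean(np.abs(s) ** 2)      # close to 1 for the clean SOI
strength_x = abs(R_x) / np.mean(np.abs(x) ** 2)      # reduced by the added interference power
print(abs(R_x - a**2 * R_s), strength_s, strength_x)
```

With equal SOI and interference power (0 dB SINR) the cyclic correlation of x stays near that of the SOI, while the normalized strength drops to roughly one half.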

x(t) is modeled by (14) and s(t) is the sole received signal component with spectral self-coherence or conjugate self-coherence at frequency separation $\alpha$, then (10)-(12) can be used to show that r(t) decomposes into a replica of the SOI plus a corruption term that is uncorrelated with both s(t) and x(t) (and therefore y(t))

III. THE SCORE ALGORITHMS

A. Problem Statement

The SCORE algorithms are motivated by extending the example given in Section II to narrowband vector (multisensor) data signals. Consider an environment where an antenna array is excited by a SOI s(t) and by background noise and co-channel interference. If the inverse bandwidth of the receiver is large with respect to the electrical distance between the array elements, then the received signal vector x(t) can be modelled by

$$x(t) = a s(t) + i(t), \qquad (14)$$

$$r(t) = \tilde{a} s(t) + \tilde{i}(t), \qquad (18)$$

where $\tilde{a}$ and $\tilde{i}(t)$ are given by (11) and (12), respectively, with $a = c^H a^{(*)}$ and $i(t) = [c^{(*)}]^H i(t)$. Equation (18) motivates the least-squares SCORE algorithm. We define the least-squares SCORE cost function by

$$F_{sc}(w; c) \triangleq \langle |y(t) - r(t)|^2 \rangle_T \qquad (19)$$

where $y(t) = w^H x(t)$ and r(t) is given by (17), and where $\langle \cdot \rangle_T$ denotes time-averaging over the interval [0, T]. Substituting (18) into (19) and letting the averaging time grow to infinity yields

where the SOI aperture vector a models the polarization- and direction-of-arrival-dependent antenna gains, cross-sensor phase mismatches, and near-field multipath (scattering) and mutual coupling effects of the array, and where the interference vector i(t) models the remaining signals and background noise received by the array. Assume that s(t) is spectrally self-coherent at $\alpha$, and that i(t) is not spectrally self-coherent at $\alpha$ and is temporally uncorrelated with s(t) ($R_{is}(\tau) = 0$ for every $\tau$). Given this model, s(t) can be extracted from x(t) using the linear estimator $y(t) = w^H x(t)$, where the processor weight vector w suppresses i(t) in some manner (for example, by forming an effective antenna pattern with a beam in the direction of s(t) and nulls in the directions of the spatially coherent co-channel interferers). For the environment described in (14), this is optimally accomplished by setting w equal to a maximum-SINR linear combiner

Fsc =

(20)

Because $\tilde{i}(t)$ does not depend on w, it follows that (19) becomes equivalent to the true least-squares cost function (16) and the value of w that minimizes (19) converges to the maximum-SINR processor as $T \to \infty$. This result can be proved more directly by solving for the processor vector $w_{sc}$ that optimizes (19) for infinite averaging time. Minimizing (19) with respect to w yields the least-squares SCORE algorithm

$$w_{sc} = \hat{R}_{xx}^{-1}\hat{R}_{xr} \qquad (21)$$

where $\hat{R}_{xx}$ and $\hat{R}_{xr}$ are the sample autocorrelation matrix and cross-correlation vector computed over [0, T]. If i(t) is not spectrally self-coherent at $\alpha$, then as $T \to \infty$ (21) converges to

(15)

where $R_{ii}$ and $R_{xx}$ are the limit (infinite time-average) autocorrelation matrices of the interference and received signal vectors. These weights can also be interpreted as the optimal solution to the least-squares cost function

(16)

$$w_{sc} \to R_{xx}^{-1} R_{xr} \qquad (22)$$

$$= R_{xx}^{-1} R^{\alpha}_{xx^{(*)}}(\tau)\, c\, e^{-j\pi\alpha\tau}. \qquad (23)$$

If only s(t) is spectrally self-coherent at $\alpha$, then as $T \to \infty$, $R^{\alpha}_{xx^{(*)}}(\tau)$ reduces to a rank-1 matrix with form

where g is some arbitrary scalar gain constant. Conventional (nonblind) methods for computing $w_{max}$ require knowledge of the interference autocorrelation $R_{ii}$ or the SOI aperture vector a to implement (15), or knowledge of the SOI waveform s(t) to minimize (16). The goal in this paper is to adapt w to approximate (15) without using this knowledge, that is, using only knowledge of the spectral self-coherence properties of the SOI.

(24)

and (23) reduces to

That is, $w_{sc}$ reduces to the maximum-SINR (or scaled least-squares) weight vector given by (15), where $g_{sc}$ is the gain constant appearing in (16). Note that w converges to the maximum-SINR solution for any value of c, as long as c is not orthogonal to $a^{(*)}$. The least-squares SCORE processor block diagram is shown in Fig. 3. The reference signal r(t) is generated by linearly combining, delaying, conjugating (if conjugate self-coherence is being exploited), and frequency-shifting the data received by the array. The reference signal is then used as a training signal to adapt the processor vector w using a least-squares algorithm. The only control parameters used in the processor are the control vector c, the delay $\tau$, the

B. The Least-Squares SCORE Algorithm

The simplest SCORE algorithm, referred to here as the least-squares SCORE algorithm,³ is developed using the interpretation of spectral self-coherence given in (10)-(12). We define a reference signal r(t) by

$$r(t) \triangleq c^H x^{(*)}(t-\tau)e^{j2\pi\alpha t} \qquad (17)$$

where the vector c is referred to as the control vector and the optional conjugation $(*)$ is applied if and only if con-

³The least-squares SCORE algorithm was first presented in [8].


vector as well as the processor vector to some appropriate value. This generalization leads to the cross-SCORE algorithm,⁵ discussed in the next section.

C. The Cross-SCORE Algorithm

Fig. 3. Least-squares SCORE processor.

An algorithm for adapting c can be developed by motivating the least-squares SCORE algorithm from a property-restoral viewpoint. The same value of $w_{sc}$ given in (21) results from maximizing the strength of the cross-correlation coefficient between y(t) and r(t)

conjugation control, and the frequency-shift $\alpha$; however, only $\alpha$ and the conjugation control are critical to the operation of the processor. For most communication waveforms much latitude can be allowed in the choice of c and $\tau$, because in theory these parameters need only be chosen to yield a nonzero value of $g_{sc}$ in (26).⁴ In addition, the frequency-shift parameter $\alpha$ need not be related in any way to the bandwidth or sampling rate of the receiver system; however, care must be taken to avoid aliasing effects if the processor is implemented in digital form and $\alpha$ is large. From Fig. 3 it is clear that the least-squares SCORE processor can be generalized in several ways. For instance, the delay operation can be replaced by a more general filtering operation, by generating r(t) using $\tilde{x}(t) = h(t) \otimes x(t)$,

$$F_{sc}(w; c) \triangleq \frac{|\hat{R}_{yr}|^2}{\hat{R}_{yy}\hat{R}_{rr}} = \frac{|w^H \hat{R}_{xr}|^2}{[w^H \hat{R}_{xx} w]\,\hat{R}_{rr}} \qquad (29)$$

$$= \frac{|w^H \hat{R}_{xu} c|^2}{[w^H \hat{R}_{xx} w][c^H \hat{R}_{uu} c]}, \qquad (30)$$

where u(t) is defined to be the control signal,

(31)

The cost function $F_{sc}$ is an indirect measurement of the spectral self-coherence in y(t) at frequency separation $\alpha$; it is lowered if x(t) contains interference that is not spectrally self-coherent at this frequency separation. In this sense, the least-squares SCORE algorithm can be interpreted as a method for restoring this spectral self-coherence to the processor output signal. The cross-correlation coefficient is also degraded if interference is present in r(t). Consequently, maximizing $F_{sc}$ with respect to w and c should restore this spectral self-coherence to both y(t) and r(t). For this reason, (30) is referred to here as the cross-SCORE objective function, and methods for optimizing (30) are referred to here as cross-SCORE algorithms. From the Cauchy-Schwarz Inequality, it is clear that w is optimized for fixed c by

(27)

where h(t) is the control filter impulse response and $\otimes$ denotes convolution. The optimum weight vector then converges to

(28)

where $\tilde{s}(t)$ is the filtered SOI. As Section V shows, a key parameter affecting the convergence rate of the SCORE processor is the strength of the spectral self-coherence being restored by the processor; appropriate design of the control filter can improve the performance of the SCORE processor by increasing this strength. The critical dependence of the SCORE processor on the choice of target $\alpha$ can also be eased somewhat by the particular choice of averaging window used to calculate the finite-time correlation matrices $\hat{R}_{xx}$ and $\hat{R}_{xr}$. If a growing rectangular window is used to calculate $\hat{R}_{xr}$, for instance, then the processor will eventually reject a received SOI if there is any error between the self-coherence frequency of the SOI and the target self-coherence frequency of the processor. In many environments, however, the self-coherence frequency of the SOI cannot be known exactly, for instance, if the SOI is subject to Doppler shift (which shifts the conjugate self-coherence frequency of the SOI). The SCORE processor can be made more tolerant to this error if a different choice of averaging window, such as an exponentially decaying window, is used to compute $\hat{R}_{xr}$. The greatest improvement in SCORE processor performance can be obtained by adaptively adjusting the control

$$w_{opt} \propto \hat{R}_{xx}^{-1}\hat{R}_{xr} = \hat{R}_{xx}^{-1}\hat{R}_{xu}c, \qquad (32)$$

which is the least-squares SCORE solution (if the delay operation in Fig. 3 is generalized to a filtering operation). Similarly, c is optimized for fixed w by

$$c_{opt} \propto \hat{R}_{uu}^{-1}\hat{R}_{ux}w. \qquad (33)$$


Substituting (33) into (30) yields a generalized Rayleigh quotient in w (34)

which is maximized by setting w equal to the dominant mode (eigenvector corresponding to the maximum eigenvalue) of

$$\lambda \hat{R}_{xx} w = [\hat{R}_{xu}\hat{R}_{uu}^{-1}\hat{R}_{ux}]\, w. \qquad (35)$$

Similarly, the control vector is globally optimized by setting c equal to the dominant mode of

$$\lambda \hat{R}_{uu} c = [\hat{R}_{ux}\hat{R}_{xx}^{-1}\hat{R}_{xu}]\, c. \qquad (36)$$

Equations (35) and (36) are referred to here as cross-SCORE eigenequations; both of these eigenequations have the same eigenvalues, with the maximum eigenvalue equal to the maximized objective function value. Equations (35) and

⁴In practice, it is important to choose values of c and $\tau$ that yield a large value of $g_{sc}$, as this parameter does have a strong effect on the convergence time of the SCORE algorithm. This can impose a serious constraint on c (for example, if the array is subject to strong co-channel interference), but does not impose a strong constraint on $\tau$ in most communication applications.
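The least-squares SCORE processor described in this section can be sketched numerically as follows. This is an illustrative reading of the text, not the authors' code: the reference is formed as $r(t) = c^H x^{*}(t-\tau)e^{j2\pi\alpha t}$ and the weights as the sample solution $\hat{R}_{xx}^{-1}\hat{R}_{xr}$. The scenario (4-element half-wavelength linear array, BPSK SOI at 10 degrees, a 10 dB stronger circular Gaussian interferer at -40 degrees) and all function names are assumptions for the example.

```python
import numpy as np

def steering(theta_deg, m, spacing=0.5):
    """Hypothetical uniform-linear-array aperture vector (spacing in wavelengths)."""
    return np.exp(2j * np.pi * spacing * np.arange(m) * np.sin(np.radians(theta_deg)))

def simulate(m=4, n_samp=8000, fc=0.1, p_int=10.0, sigma2=0.1, seed=1):
    rng = np.random.default_rng(seed)
    n = np.arange(n_samp)
    # Unit-power BPSK SOI, conjugate self-coherent at alpha = 2 * fc.
    s = np.repeat(rng.choice([-1.0, 1.0], size=n_samp // 8), 8) * np.exp(2j * np.pi * fc * n)
    # Circular Gaussian interferer: no conjugate self-coherence.
    g = np.sqrt(p_int / 2) * (rng.standard_normal(n_samp) + 1j * rng.standard_normal(n_samp))
    a, a_int = steering(10.0, m), steering(-40.0, m)
    noise = np.sqrt(sigma2 / 2) * (rng.standard_normal((m, n_samp))
                                   + 1j * rng.standard_normal((m, n_samp)))
    X = np.outer(a, s) + np.outer(a_int, g) + noise
    R_ii = p_int * np.outer(a_int, a_int.conj()) + sigma2 * np.eye(m)
    return X, a, R_ii

def ls_score_weights(X, alpha, c, tau=0, conjugate=True):
    """w = Rxx^-1 Rxr with reference r(t) = c^H x^(*)(t - tau) exp(j 2 pi alpha t)."""
    m, n_samp = X.shape
    n = np.arange(n_samp)
    Xd = np.roll(X, tau, axis=1)                 # circular delay, adequate for tau << n_samp
    U = (np.conj(Xd) if conjugate else Xd) * np.exp(2j * np.pi * alpha * n)
    r = c.conj() @ U                             # reference signal
    Rxx = X @ X.conj().T / n_samp
    Rxr = X @ r.conj() / n_samp
    return np.linalg.solve(Rxx, Rxr)

def output_sinr(w, a, R_ii):
    """Output SINR for a unit-power SOI with aperture vector a."""
    return abs(w.conj() @ a) ** 2 / np.real(w.conj() @ R_ii @ w)

X, a, R_ii = simulate()
c = np.zeros(4, dtype=complex)
c[0] = 1.0                                       # control vector: use the first sensor only
w = ls_score_weights(X, alpha=0.2, c=c)
sinr_score = output_sinr(w, a, R_ii)
sinr_opt = np.real(a.conj() @ np.linalg.solve(R_ii, a))   # maximum attainable SINR
sinr_single = output_sinr(c, a, R_ii)                     # one sensor, no combining
print(sinr_single, sinr_score, sinr_opt)
```

The weights are computed without using the SOI direction, its waveform, or the interference covariance; those quantities appear only in the evaluation of the output SINR.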

⁵The cross-SCORE algorithm was first presented in [8].


where $g_w$ and $g_c$ are used to normalize the power of y(t) and r(t) at each step in the algorithm. Equations (41) and (42) converge very rapidly to the dominant mode of (36) in the rank-1 spectral self-coherence environment, because of the wide spread between the maximum and lesser eigenvalues of (37) that prevails in this case.
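The alternating update just described can be sketched as below, reading it as repeated application of (32) and (33) with power normalization at each step. This is a hedged illustration rather than the authors' implementation: the conjugate control signal $u(t) = x^*(t-\tau)e^{j2\pi\alpha t}$, the simulated scenario, the iteration count, and all function names are assumptions.

```python
import numpy as np

def steering(theta_deg, m, spacing=0.5):
    """Hypothetical uniform-linear-array aperture vector (spacing in wavelengths)."""
    return np.exp(2j * np.pi * spacing * np.arange(m) * np.sin(np.radians(theta_deg)))

def simulate(m=4, n_samp=8000, fc=0.1, p_int=10.0, sigma2=0.1, seed=1):
    rng = np.random.default_rng(seed)
    n = np.arange(n_samp)
    s = np.repeat(rng.choice([-1.0, 1.0], size=n_samp // 8), 8) * np.exp(2j * np.pi * fc * n)
    g = np.sqrt(p_int / 2) * (rng.standard_normal(n_samp) + 1j * rng.standard_normal(n_samp))
    a, a_int = steering(10.0, m), steering(-40.0, m)
    noise = np.sqrt(sigma2 / 2) * (rng.standard_normal((m, n_samp))
                                   + 1j * rng.standard_normal((m, n_samp)))
    X = np.outer(a, s) + np.outer(a_int, g) + noise
    R_ii = p_int * np.outer(a_int, a_int.conj()) + sigma2 * np.eye(m)
    return X, a, R_ii

def cross_score(X, alpha, tau=0, iters=20):
    """Alternate w <- g_w Rxx^-1 Rxu c and c <- g_c Ruu^-1 Rux w (power-method style)."""
    m, n_samp = X.shape
    n = np.arange(n_samp)
    U = np.conj(np.roll(X, tau, axis=1)) * np.exp(2j * np.pi * alpha * n)  # control signal
    Rxx = X @ X.conj().T / n_samp
    Ruu = U @ U.conj().T / n_samp
    Rxu = X @ U.conj().T / n_samp
    c = np.zeros(m, dtype=complex)
    c[0] = 1.0                                         # arbitrary starting control vector
    for _ in range(iters):
        w = np.linalg.solve(Rxx, Rxu @ c)
        w /= np.sqrt(np.real(w.conj() @ Rxx @ w))      # g_w: unit output power for y(t)
        c = np.linalg.solve(Ruu, Rxu.conj().T @ w)
        c /= np.sqrt(np.real(c.conj() @ Ruu @ c))      # g_c: unit power for r(t)
    return w, c

def output_sinr(w, a, R_ii):
    return abs(w.conj() @ a) ** 2 / np.real(w.conj() @ R_ii @ w)

X, a, R_ii = simulate()
w, c = cross_score(X, alpha=0.2)
sinr_cross = output_sinr(w, a, R_ii)
sinr_opt = np.real(a.conj() @ np.linalg.solve(R_ii, a))
print(sinr_cross, sinr_opt)
```

In this single-SOI scenario the iteration settles within a few steps, consistent with the wide eigenvalue spread noted in the text; the fixed count of 20 iterations is simply a conservative choice for the sketch.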

(36) can also be used to obtain an equivalent joint cross-SCORE eigenequation

where the eigenvalue in (37) is the square root of $\lambda$, and every solution $(\lambda_k, w_k, c_k)$ to (35), (36) has a pair of solutions $(\sqrt{\lambda_k}, w_k, c_k)$ and $(-\sqrt{\lambda_k}, w_k, -c_k)$ to (37). It is easily shown that the dominant modes of (35) and (36) both converge to the maximum-SINR solution given in (15) if s(t) is the only received signal with spectral self-coherence or conjugate self-coherence at $\alpha$. In this environment, the Hermitian matrix on the right-hand side of (35) reduces to a rank-1 matrix as $T \to \infty$,

D. The Auto-SCORE Algorithms

Although the cross-SCORE algorithm can be interpreted as a property-restoral algorithm, it is essentially an extension of the least-squares SCORE algorithm, and as such is motivated more naturally from the interference-decorrelation viewpoint discussed in Section II. A more natural framework for developing a true property-restoral algorithm based on spectral self-coherence is to consider the problem of maximizing the spectral or conjugate self-coherence strength at the output of a single linear combiner

(38)

For an M-element antenna array, the eigenvectors of (35) therefore converge to M - 1 signal-rejection solutions where w is orthogonal to a and $\lambda$ is equal to zero, and one signal-selection solution where w is equal to $w_{max}$ and $\lambda$ is approximately equal to $|\rho^{\alpha}_{ss^{(*)}}|^2$,

-+

H

(aHR~1 aHa (1

+

Ri-. a) 1

I p~s\·d2

)'x 2)(1

l~s(.d2

+ 'Yi-2) == Ip~~(.)

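(39)

$$F_{sc}(w) \triangleq |\hat{\rho}^{\alpha}_{yy^{(*)}}(\tau)| = \frac{|w^H \hat{R}^{\alpha}_{xx^{(*)}}(\tau)\, w^{(*)}|}{w^H \hat{R}_{xx} w} \qquad (43)$$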

Algorithms for optimizing $|\hat{\rho}^{\alpha}_{yy}(\tau)|$ are referred to here as auto-SCORE algorithms. Algorithms for optimizing $|\hat{\rho}^{\alpha}_{yy^*}(\tau)|$ are referred to as conjugate auto-SCORE algorithms. A detailed development and discussion of the auto-SCORE and conjugate auto-SCORE algorithms is given in [10] (where they are referred to as the direct SCORE algorithms); however, some specific results for these algorithms bear mentioning here. Both classes of algorithms asymptotically (as the averaging time grows to infinity) converge to the maximum-SINR solution in the rank-1 spectral self-coherence environment. In addition, the general (rank-$L_\alpha$, finite time-average) maximal solution to the conjugate auto-SCORE objective function is nearly always identical to the maximal solution to the analogous conjugate cross-SCORE objective function. In particular, $|\hat{\rho}^{\alpha}_{yy^*}(\tau)|$ is maximized if w is equal to the dominant mode of

, (40)

where $\gamma_s = \gamma_{\max}$ is the maximum-attainable SINR of s(t) for the input data x(t) and $\tilde\gamma_s$ is the maximum-attainable SINR of the filtered SOI $\tilde s(t)$ for the filtered input data u(t). Similarly, $R_{ux} R_{xx}^{-1} R_{xu}$ reduces to a rank-1 matrix in this environment as $T \to \infty$, and the eigenvectors of (36) converge to M - 1 signal-rejection solutions where $c^H a^{(*)} = 0$ and one signal-selection solution where $c \propto R_{uu}^{-1} a^{(*)}$. A block diagram of the cross-SCORE processor is shown in Fig. 4. Structurally, the processor is identical to the least-

$\lambda \hat R_{xx} w = \bar R_{xu} \hat R_{uu}^{-1} \bar R_{ux} w, \quad u(t) = x^{*}(t - \tau)e^{j2\pi\alpha t},$ (44) where $\bar R_{xu}$ is the symmetrized cross-correlation between x(t) and u(t),

(45)

Equation (44) is identical to the cross-SCORE eigenequation (35) if $\hat R_{xu}$ is symmetric ($\hat R_{xu} = \hat R_{xu}^T$). In fact, in most applications where conjugate self-coherence can be exploited by the processor, $|\rho^{\alpha}_{ss^*}(\tau)|$ is maximized for $\tau = 0$, and the symmetry condition holds. If the symmetry condition does not hold, (45) also shows that a simple modification to eigenequation (37) can transform the cross-SCORE processor to a conjugate auto-SCORE processor. In contrast, the maximal solutions to the nonconjugated auto-SCORE objective function can differ greatly from the maximal solution to the analogous cross-SCORE objective function under general conditions. In particular, $|\hat\rho^{\alpha}_{yy}(\tau)|$ has a local maximum at every dominant mode of

Fig. 4. Cross-SCORE processor.

squares SCORE processor, except that the control combining is moved to the end of the control path (to allow measurement of $R_{uu}$ and $R_{xu}$) and the delay operation is changed to a more general filtering operation. However, in the cross-SCORE processor, both the processor and the control weights are adapted to find the maximum-eigenvalue solution of (37). If only one signal with spectral self-coherence at the target $\alpha$ is incident (or expected) at the array, then the cross-SCORE processor can be very simply implemented using the power method [9]

$c \leftarrow g_c \hat R_{uu}^{-1} \hat R_{ux} w$ (41)

$w \leftarrow g_w \hat R_{xx}^{-1} \hat R_{xu} c,$ (42)
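The alternating updates in (41) and (42) can be sketched numerically. The following is a minimal illustration, not the authors' implementation: it assumes exact (infinite-collect) correlation matrices for a rank-1 environment with one BPSK SOI, one circular interferer, and white noise, with the control signal formed as u = x* (conjugation enabled, zero frequency shift). The `steering` helper, the uniform-linear-array geometry, the power levels, and the unit-norm choice for the gains are all hypothetical choices made for this example.

```python
import numpy as np

def steering(theta_deg, num_elements):
    """Hypothetical half-wavelength uniform linear array response (illustration only)."""
    k = np.arange(num_elements)
    return np.exp(1j * np.pi * k * np.sin(np.deg2rad(theta_deg)))

def cross_score_power(R_xx, R_uu, R_xu, iters=10):
    """Alternate the control and processor updates of (41) and (42)."""
    M = R_xx.shape[0]
    w = np.zeros(M, dtype=complex)
    w[0] = 1.0                                 # arbitrary starting processor vector
    R_ux = R_xu.conj().T
    for _ in range(iters):
        c = np.linalg.solve(R_uu, R_ux @ w)    # c <- g_c R_uu^{-1} R_ux w
        c /= np.linalg.norm(c)                 # gain g_c chosen as unit-norm scaling
        w = np.linalg.solve(R_xx, R_xu @ c)    # w <- g_w R_xx^{-1} R_xu c
        w /= np.linalg.norm(w)                 # gain g_w chosen as unit-norm scaling
    return w, c

# Assumed environment: SOI at 20 degrees, interferer at -40 degrees (made-up values).
M = 4
a = steering(20.0, M)
a_i = steering(-40.0, M)
P_s, P_i, sigma2 = 10.0, 100.0, 1.0

R_ii = P_i * np.outer(a_i, a_i.conj()) + sigma2 * np.eye(M)
R_xx = P_s * np.outer(a, a.conj()) + R_ii
R_uu = R_xx.conj()                  # u = conj(x), so R_uu = conj(R_xx)
R_xu = P_s * np.outer(a, a)         # E[x x^T]: only the real-symbol BPSK SOI contributes

w, c = cross_score_power(R_xx, R_uu, R_xu)

# Compare with the maximum-SINR direction R_xx^{-1} a.
w_opt = np.linalg.solve(R_xx, a)
alignment = abs(np.vdot(w_opt, w)) / np.linalg.norm(w_opt)
print(f"alignment with max-SINR weights: {alignment:.6f}")
```

Because the cross-correlation matrix is rank-1 in this idealized setting, the iteration settles on the direction of $R_{xx}^{-1} a$ almost immediately; with finite-time estimates the same loop would be run on sample correlation matrices instead.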

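$\lambda(\varphi) \hat R_{xx} w = \bar R^{\alpha}_{xx}(\tau, \varphi)\, w$ (46)

such that $\lambda_{\max}(\varphi)$ also has a local maximum with respect to $\varphi$, where $\bar R^{\alpha}_{xx}(\tau, \varphi)$ is the Hermitianized cyclic autocorrelation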


1) Assumed Environment: Consider the environment shown in Fig. 5, where $N_\alpha$ uncorrelated signals $\{s_n(t)\}_{n=1}^{N_\alpha}$ with spectral self-coherence or conjugate self-coherence at $\alpha$

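matrix of x(t),

$\bar R^{\alpha}_{xx}(\tau, \varphi) \triangleq \tfrac{1}{2}\left[\hat R^{\alpha}_{xx}(\tau)e^{-j\varphi} + \hat R^{\alpha}_{xx}(\tau)^H e^{j\varphi}\right].$ (47)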

A method for adapting w to perform this optimization is also developed in [10]. A related algorithm has recently been introduced [11] that can be interpreted as a suboptimal solution to the auto-SCORE objective function. This algorithm, referred to here as the phase-SCORE algorithm, solves the simpler eigenequation


(48)

for the complex eigenvalue $\lambda$ with maximum strength $|\lambda|$. As the next sections show, the phase-SCORE algorithm has very attractive analytical, implementation, and performance properties, which give it some inherent advantages over the cross-SCORE algorithm if nonconjugate self-coherence is being restored by the adaptive processor. The auto-SCORE and conjugate auto-SCORE objective functions can also be generalized by replacing the delay operation with a more flexible filtering operation, yielding

path

arrive at an M-element antenna array, together with interference i(t) that is not self-coherent at $\alpha$. Each signal $s_n(t)$ is assumed to arrive at the receiver along $K_n$ paths, such that $L_\alpha = \sum_{n=1}^{N_\alpha} K_n$ signals are received from the $N_\alpha$ separate sources. We denote the kth signal received from the nth source by $s_{kn}(t)$, and assume that this signal arrives at direction $\theta_{kn}$ with delay $\tau_{kn}$ and scaling coefficient $g_{kn}$. Then under the narrowband assumption stated in Section III, the received data vector x(t) is modeled by

Q

Q

(49)

where u(t) is given by (31), and (with a change from the notation used in the previous sections) $r(t) = [w^{(*)}]^H u(t)$. However, this filtering approach is less useful here, because w must achieve a compromise between the optimal extractor of a given SOI from the received environment and the filtered SOI from the filtered received environment. If the filtering operation affects the SOI and interference differently (for example, if the control filter is selectively nulling the interference), then the resultant processor can diverge significantly from the maximum-SINR processor. The exception to this observation is the phase-SCORE algorithm given by

(50)

which converges to a maximum-SINR solution in the rank-1 spectral self-coherence environment for any control filter (as long as $\rho_{s\tilde s} \neq 0$).

$$x(t) = \sum_{n=1}^{N_\alpha} \sum_{k=1}^{K_n} g_{kn}\, a(\theta_{kn})\, s_n(t - \tau_{kn}) + i(t) \qquad (51)$$

$$= \sum_{n=1}^{N_\alpha} A_n s_n(t) + i(t) \qquad (52)$$

$$= A s(t) + i(t) \qquad (53)$$

where

$$s_n(t) \triangleq [s_n(t - \tau_{1n}) \cdots s_n(t - \tau_{K_n n})]^T, \qquad A_n \triangleq [g_{1n} a(\theta_{1n}) \cdots g_{K_n n} a(\theta_{K_n n})]$$

are the received signal vector and the received signal aperture matrix due to the nth source, and where

SCORE

ANALYSIS OF THE

Fig. 5. Rank-$L_\alpha$ collect geometry.

$F_{SC}(w) \triangleq |\hat\rho_{yr}|$

IV.

$$s(t) \triangleq [s_1^T(t) \cdots s_{N_\alpha}^T(t)]^T, \qquad A \triangleq [A_1 \cdots A_{N_\alpha}]$$

are the received signal vector and received signal aperture matrix due to all of the sources. Similarly, the control signal u(t) is modeled by

ALGORITHMS

A. Infinite-Collect Analysis

The stationary solutions of the SCORE objective functions are easiest to discern under infinite-collect conditions where the averaging time T grows to infinity. Under these conditions, the sample correlation matrices can be replaced with their limit values, and all cross-correlations between statistically uncorrelated signals and signal components can be set equal to zero. In Section III, it is shown that all of the SCORE algorithms converge to the maximum-SINR solution in the rank-1 spectral self-coherence environment where only one signal with spectral self-coherence at the target $\alpha$ is received by the array. The behavior of the SCORE algorithms in the rank-$L_\alpha$ spectral self-coherence environment remains to be determined, however. This analysis is performed below.

$$u(t) = A^{(*)}[s^{(*)}(t)e^{j2\pi\alpha t}] + [i^{(*)}(t)e^{j2\pi\alpha t}] = \tilde A \tilde s(t) + \tilde i(t) \qquad (54)$$

where $\tilde A = A^{(*)}$ and $\tilde s(t)$ and $\tilde i(t)$ are the filtered, optionally conjugated, and frequency-shifted received signal and interference signal vectors. For the analysis presented here, it is also assumed that $s(t)$, $\tilde s(t)$, $A$, and $\tilde A$ have maximal rank, that is, the fully coherent multipath environment is not considered. However, the primary results of this analysis should extend easily to the fully coherent multipath environment, because signal extraction (rather than direction finding) is of interest in this paper.

2) Analysis of the Least-Squares SCORE and Cross-SCORE Algorithms: Analysis of this environment shows that the


least-squares SCORE and cross-SCORE algorithms screen the data to optimally suppress the noise and interference i(t) that is not spectrally self-coherent at the target $\alpha$, if the total number of received signals that are spectrally self-coherent at $\alpha$ is less than or equal to the number of elements in the array ($L_\alpha \le M$). Moreover, this analysis also shows that the solutions to the cross-SCORE eigenequation separate the remaining self-coherent signals into $N_\alpha$ blocks, corresponding to selection of each block of correlated signals $\{s_n(t - \tau_{kn})\}_{k=1}^{K_n}$, if i(t) is low and/or removable by the array and the eigenvalues are distinct between blocks. A corollary of this result is that the cross-SCORE processor can sort environments containing multiple uncorrelated signals with spectral self-coherence at the same value of frequency separation, by separating those signals on the basis of their self-coherence strength. The screening property can be deduced by noting that any processor vector w can be expressed as the sum of a component that is orthogonal to A and a component lying in the space spanned by the linear combiners that provide the least-squares estimates of the elements of s(t) given x(t). That is, w can be represented by $w = w_s + w_\perp$, where $w_\perp$ is in the left null space of A ($A^H w_\perp = 0$) and (55)

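and $\nu = 0$, and $2L_\alpha$ solutions ($L_\alpha$ significant solutions) where $w_\perp = 0 = c_\perp$ and $(\nu, p_w, p_c)$ satisfies the eigenequation

$$\nu \begin{bmatrix} R_{ss} & 0 \\ 0 & R_{\tilde s \tilde s} \end{bmatrix} \begin{bmatrix} p_w \\ p_c \end{bmatrix} = \begin{bmatrix} 0 & R_{s\tilde s} N_c \\ R_{\tilde s s} N_s & 0 \end{bmatrix} \begin{bmatrix} p_w \\ p_c \end{bmatrix} \qquad (62)$$

$$= \begin{bmatrix} 0 & R_{s\tilde s} \\ R_{\tilde s s} & 0 \end{bmatrix} \begin{bmatrix} N_s & 0 \\ 0 & N_c \end{bmatrix} \begin{bmatrix} p_w \\ p_c \end{bmatrix}. \qquad (63)$$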

That is, the joint cross-SCORE eigenequation has $M - L_\alpha$ pairs of signal-rejection solutions where the processor and control weight vectors completely reject the signals with self-coherence at $\alpha$ (set $p_w = 0 = p_c$) and pass the background interference, and $L_\alpha$ pairs of interference-rejection solutions where the weight vectors optimally reject the background interference (set $w_\perp = 0 = c_\perp$) and pass a linear combination of the least-squares estimates of the self-coherent signals. The interference-rejection solutions in effect screen the data and minimize the contribution from any signals that are not spectrally self-coherent or conjugate self-coherent at the target $\alpha$. The cross-SCORE sorting property can be deduced from examination of (63) when the background interference is low and/or removable. Using the Matrix Inverse lemma [9],

$(C + A S A^H)^{-1} = C^{-1} - C^{-1} A S (I + A^H C^{-1} A S)^{-1} A^H C^{-1}$

(56)

and where $p_w = e_l$ (where $e_l \triangleq [\delta_{m-l}]_{m=1}^{L_\alpha}$) sets $w_s$ equal to the least-squares extractor of the lth element of s(t) from x(t). Similarly, any control vector c can be expressed as the sum of a component $c_\perp$ that is orthogonal to $\tilde A$ ($\tilde A^H c_\perp = 0$) and a component $c_s$ lying in the space spanned by the least-squares estimates of the elements of $\tilde s(t)$ given u(t)

(64)
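The lemma can be checked numerically. The sketch below is an illustration only: the matrices are random stand-ins (C plays the role of the interference correlation matrix and S the role of the signal correlation matrix), and the second check, that $A^H (C + A S A^H)^{-1} A S = I - (I + \Gamma)^{-1}$ with $\Gamma = A^H C^{-1} A S$, is a consequence derived here from the lemma rather than a formula quoted from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
M, L = 4, 2   # array size and number of self-coherent signals (arbitrary values)

def random_complex(rows, cols):
    return rng.standard_normal((rows, cols)) + 1j * rng.standard_normal((rows, cols))

A = random_complex(M, L)                      # stand-in aperture matrix
B = random_complex(M, M)
C = B @ B.conj().T + np.eye(M)                # Hermitian positive definite, like R_ii
G = random_complex(L, L)
S = G @ G.conj().T + np.eye(L)                # Hermitian positive definite, like R_ss

C_inv = np.linalg.inv(C)
I_L = np.eye(L)

# Left- and right-hand sides of the matrix inversion lemma.
lhs = np.linalg.inv(C + A @ S @ A.conj().T)
rhs = C_inv - C_inv @ A @ S @ np.linalg.inv(I_L + A.conj().T @ C_inv @ A @ S) @ A.conj().T @ C_inv
lemma_error = np.max(np.abs(lhs - rhs))

# Consequence: A^H R_xx^{-1} A R_ss = I - (I + Gamma)^{-1}, Gamma = A^H R_ii^{-1} A R_ss.
Gamma = A.conj().T @ C_inv @ A @ S
N_s = A.conj().T @ lhs @ A @ S
identity_error = np.max(np.abs(N_s - (I_L - np.linalg.inv(I_L + Gamma))))

print(lemma_error, identity_error)
```

Both errors should sit at floating-point round-off level, which is why a large $\Gamma$ (high attainable SINR) pushes the product toward the identity matrix.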

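$N_s$ and $N_c$ can be rewritten as

$$N_s \triangleq A^H R_{xx}^{-1} A R_{ss} = I_{L_\alpha} - (I_{L_\alpha} + \Gamma_s)^{-1}, \qquad \Gamma_s \triangleq A^H R_{ii}^{-1} A R_{ss} \qquad (65)$$

$$N_c \triangleq \tilde A^H R_{uu}^{-1} \tilde A R_{\tilde s\tilde s} = I_{L_\alpha} - (I_{L_\alpha} + \tilde\Gamma_s)^{-1}, \qquad \tilde\Gamma_s \triangleq \tilde A^H R_{\tilde i\tilde i}^{-1} \tilde A R_{\tilde s\tilde s} \qquad (66)$$

where $\Gamma_s$ and $\tilde\Gamma_s$ can be interpreted as matrix measures of the maximum signal-to-noise-and-interference ratios attainable using an adaptive array on x(t) and u(t), respectively. The matrix SINR $\Gamma_s$ is related to the minimum-attainable mean-square error between the lth element of s(t) and $w^H x(t)$ according to the formula [10] (67)

(57) (58)

Substituting (55) and (57) into the least-squares SCORE equation (22) shows that $y(t) = w_{SC}^H x(t)$ lies entirely within the space spanned by the least-squares estimates of s(t), because

(59)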

for any control vector c. Thus, both the least-squares SCORE algorithm and the cross-SCORE algorithm optimally suppress the background interference i(t), in the sense that they force $w_\perp \to 0$ as $T \to \infty$. This property extends to each of the cross-SCORE eigenequation solutions with nonzero eigenvalue. Substituting (55) and (57) into the joint cross-SCORE eigenequation (37) yields the coupled equations

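$$\nu[A R_{ss} p_w + R_{ii} w_\perp] = A R_{s\tilde s} N_c p_c \qquad (60)$$

$$\nu[\tilde A R_{\tilde s\tilde s} p_c + R_{\tilde i\tilde i} c_\perp] = \tilde A R_{\tilde s s} N_s p_w. \qquad (61)$$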

with an analogous result holding for $\tilde\Gamma_s$. It is reasonable to assume that $\Gamma_s$ and $\tilde\Gamma_s$ are usually large ($\|\Gamma_s^{-1}\|, \|\tilde\Gamma_s^{-1}\| \ll 1$) when the true maximum-attainable SINRs of the self-coherent signals are high, allowing (63) to be approximated by

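$$\nu \begin{bmatrix} R_{ss} & 0 \\ 0 & R_{\tilde s\tilde s} \end{bmatrix} \begin{bmatrix} p_w \\ p_c \end{bmatrix} = \begin{bmatrix} 0 & R_{s\tilde s} \\ R_{\tilde s s} & 0 \end{bmatrix} (I_{2L_\alpha} - \Delta) \begin{bmatrix} p_w \\ p_c \end{bmatrix} \qquad (68)$$

$$\approx \begin{bmatrix} 0 & R_{s\tilde s} \\ R_{\tilde s s} & 0 \end{bmatrix} \begin{bmatrix} p_w \\ p_c \end{bmatrix} \qquad (69)$$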

where

Equations (60) and (61) have $2(M - L_\alpha)$ solutions ($M - L_\alpha$ significant solutions) where $p_w = 0 = p_c$, $(w, c) = (w_\perp, c_\perp)$,

(70)


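Eigenequation (69) can be transformed into separate eigenequations in terms of the individual vectors $p_w$ and $p_c$,

$$\lambda R_{ss} p_w = [R_{s\tilde s} R_{\tilde s\tilde s}^{-1} R_{\tilde s s}]\, p_w \qquad (71)$$

$$\lambda R_{\tilde s\tilde s} p_c = [R_{\tilde s s} R_{ss}^{-1} R_{s\tilde s}]\, p_c \qquad (72)$$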

ciated with the nth signal source, and optimally rejects the background interference and the other uncorrelated SOIs that are self-coherent at $\alpha$. Identification of the multipath blocks can be performed by analyzing the structure of the output-signal covariance matrix $R_{yy} = W_\alpha^H R_{xx} W_\alpha$, where $W_\alpha$ is the matrix of processor eigenvectors with nonzero eigenvalues at the target $\alpha$. The output-signal covariance matrix should be block-diagonal with $N_\alpha$ blocks, allowing determination of both the number of sources $N_\alpha$ with spectral self-coherence or conjugate self-coherence at $\alpha$ and the number of received signals per source $\{K_n\}_{n=1}^{N_\alpha}$. Thus, although the cross-SCORE processor is not generally able to correct multipath, it is able to sort the received environment into a set of multipath blocks for each transmitted SOI. The sorted multipath blocks can then be passed to a second blind processing stage, such as a constant modulus algorithm [1], to reconstruct the original source.

3) Analysis of the Auto-SCORE Algorithms: The infinite-collect analysis of the auto-SCORE objective function is straightforward and parallels the analysis of the cross-SCORE objective function. In particular, if the general environment given in (53) is assumed, and u(t) is formed without a conjugation operation, then $\tilde A = A$ and the phase-SCORE eigenequation given in (50) transforms to

where $\lambda = \nu^2$. Equations (69)-(72) have the same form as the actual cross-SCORE eigenequations, but are expressed directly in terms of the correlation matrices of the underlying received signal vectors $s(t)$ and $\tilde s(t)$. If multipath is not present in the received environment, that is, if the received signals $\{s_l(t - \tau_l)\}_{l=1}^{L_\alpha}$ are uncorrelated ($N_\alpha = L_\alpha$), then the received signal correlation matrices are diagonal, and both (71) and (72) reduce to

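$$\lambda p_{(\cdot)} = \mathrm{diag}\{|\rho_{s_l \tilde s_l}|^2\}\, p_{(\cdot)} \qquad (73)$$

$$= \mathrm{diag}\{|\rho^{\alpha}_{s_l s_l^{(*)}}|^2\}\, p_{(\cdot)}, \qquad (74)$$

where $\rho^{\alpha}_{s_l s_l^{(*)}}$ is the cyclic cross-coherence between $s_l$

$$\lambda(l) = |\rho^{\alpha}_{s_l s_l^{(*)}}|^2 \qquad (75)$$

$$p_w(l) = e_l \qquad (76)$$

$$p_c(l) = e_l. \qquad (77)$$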

$$\lambda[A R_{ss} p_w + R_{ii} w_\perp] = A R_{s\tilde s} N_s p_w \qquad (83)$$

under the representation (55) for w. Equation (83) has $M - L_\alpha$ solutions where $p_w = 0$, $w = w_\perp$ and $\lambda = 0$, and $L_\alpha$ solutions where $w_\perp = 0$ and $(\lambda, p_w)$ satisfies

$$\lambda R_{ss} p_w = R_{s\tilde s} N_s p_w \qquad (84)$$

$$= R_{s\tilde s}[I_{L_\alpha} - (I_{L_\alpha} + \Gamma_s)^{-1}]\, p_w. \qquad (85)$$

The phase-SCORE eigenequation shown in (85) is discussed extensively in [10], [11], where it is used to show that the phase-SCORE eigenvectors share the screening and sorting properties exhibited by the cross-SCORE eigenvectors. However, the phase-SCORE eigenequation also possesses several important advantages over the cross-SCORE eigenequation, due to its having a complex-valued eigenvalue, that are worth noting here. In particular, if $\|\Gamma_s^{-1}\| \ll 1$ and $\{s_l(t)\}_{l=1}^{L_\alpha}$ are independent and identically distributed signals (for example, if the signals are part of a communications net) and $L_\alpha \le M$, then $R_{s\tilde s}$ is diagonal and (85) has $L_\alpha$ nonzero solutions associated with selection of each of the self-coherent signals with $\lambda(l) = \rho_{s_l \tilde s_l}$, as long as those signals have distinct self-coherence strengths or phases (for example, caused by different timing phases). This is in contrast to the cross-SCORE eigenequation, which can only separate signals if they have distinct self-coherence strengths. The phase-SCORE eigenequation can also be shown to have $N_\alpha$ blocks of nonzero solutions corresponding to selection of each block of correlated self-coherent signals in the multipath environment discussed above. In [10], it is also shown that many of these results also extend to the auto-SCORE objective function. In particular, the auto-SCORE objective function possesses the same screening and sorting properties exhibited by the phase-SCORE algorithms.

That is, the cross-SCORE eigenequation has $L_\alpha$ unambiguous solutions, each corresponding to least-squares extraction of each of the received signals $s_l(t - \tau_l)$ from the received environment if those signals have distinct self-coherence strengths and sufficiently high maximum-attainable SINR. If multipath is present in the received environment, such that the self-coherent signals can be grouped into $N_\alpha < L_\alpha$ blocks of correlated signals as shown in Fig. 5, then $R_{ss}$, $R_{\tilde s\tilde s}$, and $R_{s\tilde s}$ are block-diagonal and (71)-(72) reduce to

$$\lambda\, \mathrm{diag}\{R_{s_n s_n}\}\, p_w = \mathrm{diag}\{R_{s_n \tilde s_n} R_{\tilde s_n \tilde s_n}^{-1} R_{\tilde s_n s_n}\}\, p_w \qquad (78)$$

$$\lambda\, \mathrm{diag}\{R_{\tilde s_n \tilde s_n}\}\, p_c = \mathrm{diag}\{R_{\tilde s_n s_n} R_{s_n s_n}^{-1} R_{s_n \tilde s_n}\}\, p_c. \qquad (79)$$

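If the eigenvalues of (78) and (79) are distinct, then these eigenequations have $N_\alpha$ blocks of solutions, where the nth block has form $\{\lambda(k, n), p_w(k, n), p_c(k, n)\}_{k=1}^{K_n}$,

$$p_w(k, n) \triangleq [0^T \cdots \beta_w^T(k, n) \cdots 0^T]^T, \qquad p_c(k, n) \triangleq [0^T \cdots \beta_c^T(k, n) \cdots 0^T]^T \qquad (80)$$

and where $\{\beta_w\}$ and $\{\beta_c\}$ are $K_n$-dimensional eigenvectors and $\{\lambda(k, n), \beta_w(k, n), \beta_c(k, n)\}_{k=1}^{K_n}$ are the solutions to the $N_\alpha$ eigenequations

$$\lambda R_{s_n s_n} \beta_w = [R_{s_n \tilde s_n} R_{\tilde s_n \tilde s_n}^{-1} R_{\tilde s_n s_n}]\, \beta_w \qquad (81)$$

$$\lambda R_{\tilde s_n \tilde s_n} \beta_c = [R_{\tilde s_n s_n} R_{s_n s_n}^{-1} R_{s_n \tilde s_n}]\, \beta_c. \qquad (82)$$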

That is, the cross-SCORE eigenequation divides into $N_\alpha$ blocks of solutions, where the nth block of solutions extracts a linear combination of the received signals $\{s_{kn}(t)\}_{k=1}^{K_n}$ asso-

The behavior of the conjugate auto-SCORE objective function is not of concern here, as it is equivalent to a slightly modified (and usually identical) conjugate cross-SCORE eigenequation.


The dependence of the phase-SCORE eigenequation and the auto-SCORE stationary solutions on the phase as well as the strength of the spectral self-coherence of the received signals significantly broadens the applications of these algorithms, as the received signals rarely have identical timing phase even if they have identical structure. This property can also improve the convergence characteristics of the approach, as discussed in the next section.

B. Finite-Collect Analysis

A rigorous analysis of the performance of the SCORE algorithms under finite-collect conditions is beyond the scope of this paper. However, some qualitative statements can be made here. In the rank-1 spectral self-coherence environment, both the least-squares SCORE algorithm and the power-method cross-SCORE algorithm described in Section III can be treated as noisy least-squares algorithms, where the training (reference) signal r(t) is equal to a desired signal proportional to s(t) corrupted by uncorrelated additive noise. Until this noise is averaged out by the correlation process, this corruption component can have a strong effect on the adaptation of the processor weights. This noise component is dominated by the background interference in the least-squares SCORE algorithm if c is chosen arbitrarily. Thus, the least-squares SCORE algorithm should converge slowly (with respect to a nonblind algorithm that uses r(t) = s(t) to train the processor) if the background interference is strong and is not removed by the control combiner, regardless of the self-coherence strength of the SOI being selected by the processor. In contrast, the performance of the cross-SCORE processor should be strongly dependent on the strength of the SOI self-coherence. Adaptation of the control vector during the optimization process removes the bulk of the background interference from the control path, leaving only the self-interference component given in (12), which cannot be removed because it has the same direction of arrival as the SOI. As (8) shows, the reciprocal strength $\gamma_{SCR}(\alpha, \tau)$ of the self-corruption component is directly dependent on $|\rho^{\alpha}_{ss^{(*)}}|$; if $|\rho^{\alpha}_{ss^{(*)}}|$ is close to unity then $\gamma_{SCR}(\alpha, \tau)$ is very large and the algorithm should converge nearly as fast as a nonblind least-squares algorithm. However, if $|\rho^{\alpha}_{ss^{(*)}}| \ll 1$, then $\gamma_{SCR}(\alpha, \tau)$ is small and the cross-SCORE algorithm converges much more slowly than the nonblind least-squares algorithm.
The performance of these algorithms can also be affected by the spectral self-coherence properties of the other signals in the environment, even if the other signals are not truly (for infinite collects) spectrally self-coherent at $\alpha$. In particular, if the environment contains $L_\alpha$ signals that are spectrally self-coherent in the vicinity of $\alpha$, then it is appropriate to treat the environment as rank-$L_\alpha$ self-coherent until the cycle resolution (reciprocal of the collect time) [6] of the spectral correlation measurement becomes narrow enough to discriminate against the other signals. A positive consequence of this observation is that the SCORE algorithms should be able to select signals even if they are not spectrally self-coherent at the target $\alpha$, given a short enough averaging time or a wide enough cycle resolution. A negative consequence of this result is that the other signals may interfere with the signals of interest at the target $\alpha$ and slow the convergence time of the SCORE processor.

This phenomenon affects the cross-SCORE algorithms more strongly than the auto-SCORE and phase-SCORE algorithms, due to the dependence of the cross-SCORE algorithms on the self-coherence strength of the received signals. The measured spectral self-coherence of the received signals is erratic over short averaging times, and the self-coherence strengths of the signals can coincide and cross at random intervals over the early portion of any collect. The cross-SCORE algorithm is not able to separate the received signals when their measured self-coherence strengths coincide, resulting in random intervals of performance loss, or "signal drop-outs," over the beginning and intermediate portions of the collect. However, the complex value (strength and phase) of the self-coherence measurements of the received signals should rarely coincide. Consequently, the auto-SCORE and phase-SCORE algorithms should not be subject to the drop-out phenomenon.

V. PERFORMANCE OF THE SCORE ALGORITHMS

A. Simulator Setup

Fig. 6. Receiver front end.

Fig. 7. Received signal environment, isotropic antenna pattern. (Received-signal spectrum in dB versus frequency in MHz.)

The basic collect geometry and received environment for the simulations conducted here are shown in Figs. 6 and 7. A four-element circular array with a 10.24 MHz complex (bandpass) reception bandwidth, isotropic array elements, and a half-wavelength array diameter is excited by white Gaussian noise, two PCM SOIs, and FDM-FM and TV interference signals. The Gaussian noise is white in both spatial and frequency domains. The PCM signals are transmitted using Nyquist-shaped modulation pulses with 100% rolloff ($\cos^2((\pi/2)(f/f_b))$ pulse Fourier transforms, where $f_b$ is the symbol rate of the SOI), and with BPSK and 16-QAM symbol constellations. The FDM-FM signal consists of a carrier frequency-modulated by a 120-channel 60-552 kHz noise-loaded baseband with a 200-kHz rms frequency deviation.


The TV signal simulates a horizontal-synchronization pulse train with a 15.625 kHz (CCIR standard) line rate. The received data vector is converted to complex-baseband representation and sampled at a 10.24 Ms/sec complex sampling rate prior to adaptive processing. The received signal parameters are given in Table 2, where DOA denotes direction of arrival and SWNR denotes signal-to-white-noise ratio.

where a is the true direction vector of the SOI, $R_{ss}$ is the true power of the SOI, and $R_{ii}$ is the true autocorrelation matrix of the interference (noise and other signals) in the environment.
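The quantities named here suggest how an output-SINR measure can be evaluated in a simulation. The sketch below assumes the standard definition $R_{ss}|w^H a|^2 / (w^H R_{ii} w)$, which is an assumption made for illustration since the displayed formula is not reproduced in this text; the array geometry, arrival angles, and power levels are also made up for the example.

```python
import numpy as np

def output_sinr(w, a, R_ss, R_ii):
    """Assumed SINR definition: SOI output power over interference-plus-noise output power."""
    signal_power = R_ss * abs(np.vdot(w, a)) ** 2        # np.vdot(w, a) = w^H a
    interference_power = np.real(np.vdot(w, R_ii @ w))   # w^H R_ii w
    return signal_power / interference_power

def max_sinr(a, R_ss, R_ii):
    """Maximum attainable SINR, R_ss * a^H R_ii^{-1} a, reached by w proportional to R_ii^{-1} a."""
    return R_ss * np.real(np.vdot(a, np.linalg.solve(R_ii, a)))

# Hypothetical 4-element half-wavelength linear array (not the circular array of the paper).
M = 4
k = np.arange(M)
a = np.exp(1j * np.pi * k * np.sin(np.deg2rad(60.0)))     # SOI direction vector
a_i = np.exp(1j * np.pi * k * np.sin(np.deg2rad(-45.0)))  # interferer direction vector
R_ss = 10.0
R_ii = 100.0 * np.outer(a_i, a_i.conj()) + np.eye(M)

w_opt = np.linalg.solve(R_ii, a)                 # maximum-SINR weights
w_iso = np.array([1, 0, 0, 0], dtype=complex)    # single-element (isotropic) pattern

sinr_opt_db = 10 * np.log10(output_sinr(w_opt, a, R_ss, R_ii))
sinr_iso_db = 10 * np.log10(output_sinr(w_iso, a, R_ss, R_ii))
sinr_max_db = 10 * np.log10(max_sinr(a, R_ss, R_ii))
print(sinr_iso_db, sinr_opt_db, sinr_max_db)
```

The SINR is invariant to any scaling of w, so the normalization applied by a particular adaptive algorithm does not affect the comparison against the maximum-attainable value.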

B. Performance in the Rank-1 Spectral Self-Coherence Environment

The performance of the SCORE processors in the rank-1 spectral self-coherence environment containing a single signal with self-coherence at the target $\alpha$ is investigated here. Figure 8 verifies the theoretical (infinite time-average)

Table 2 Received Environment Parameters

Signal    Rate          Carrier     DOA      SWNR
16-QAM    3 Mb/s        0           -45°     15 dB
BPSK      4 Mb/s        0           60°      20 dB
FDM-FM                  -500 kHz    30°      30 dB
TV        15.625 kHz    2 MHz       -110°    40 dB

The two PCM SOIs are spectrally self-coherent at plus-and-minus their symbol rate, with maximum self-coherence magnitude of 1/6 (-16 dB with respect to a magnitude of 1) at $\tau = 0$. In addition, the BPSK SOI is conjugate self-coherent at 0 kHz, with a maximum conjugate self-coherence strength of 1 (0 dB) at $\tau = 0$. The TV signal is also spectrally self-coherent at multiples of 15.625 kHz, out to the bandwidth of the synch pulse (approximately 2 MHz). The least-squares SCORE algorithm is implemented using the formula

$r(n) = c^H u(n)$
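The conjugate self-coherence strength of 1 quoted for the BPSK SOI can be illustrated with a small measurement. This is a simplified sketch under stated assumptions: one sample per symbol with no pulse shaping (so only the conjugate self-coherence at zero frequency shift is meaningful here), an arbitrary carrier phase, and a QPSK sequence added purely as a contrasting signal that is not part of the simulated environment. The function name `self_coherence_strength` is a hypothetical helper.

```python
import numpy as np

def self_coherence_strength(s, alpha=0.0, conjugate=False):
    """Estimate |<s(n) s^(*)(n) exp(-j 2 pi alpha n)>| / <|s(n)|^2> at zero lag.

    alpha is in cycles per sample. conjugate=True measures conjugate
    self-coherence (s times s); conjugate=False measures s times conj(s).
    """
    n = np.arange(len(s))
    partner = s if conjugate else s.conj()
    cyclic_corr = np.mean(s * partner * np.exp(-2j * np.pi * alpha * n))
    return abs(cyclic_corr) / np.mean(np.abs(s) ** 2)

rng = np.random.default_rng(1)
N = 4096
carrier_phase = np.exp(1j * 0.7)   # arbitrary phase offset

# BPSK: real symbols, so s(n)^2 is constant and the conjugate strength is 1.
bpsk = carrier_phase * rng.choice([-1.0, 1.0], size=N)

# QPSK: s(n)^2 flips sign at random, so the conjugate strength averages toward 0.
qpsk = carrier_phase * (rng.choice([-1.0, 1.0], size=N)
                        + 1j * rng.choice([-1.0, 1.0], size=N)) / np.sqrt(2)

bpsk_strength = self_coherence_strength(bpsk, alpha=0.0, conjugate=True)
qpsk_strength = self_coherence_strength(qpsk, alpha=0.0, conjugate=True)
print(bpsk_strength, qpsk_strength)
```

Measuring the baud-rate self-coherence (the 1/6 figure) would additionally require oversampled, pulse-shaped waveforms, which this sketch leaves out.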

Fig. 8. SCORE performance for a BPSK SOI. (Output SINR in dB versus BPSK SOI bauds, 0.25 μsec/baud.)

results obtained in Section III and illustrates the differing convergence rates of the least-squares SCORE and cross-SCORE processors discussed in Section III. The cross-SCORE processor converges to within 3 dB of the maximum-attainable SINR in under 100 SOI bauds in baud-rate restoral mode ($\alpha$ = 4 MHz, $\tau$ = 0, conjugation disabled) and in under 20 SOI bauds in carrier restoral mode ($\alpha$ = 0, $\tau$ = 0, conjugation enabled). The least-squares SCORE processor converges much more slowly: the processor SINR is still 5 dB less than the maximum SINR after 350 SOI bauds in baud-rate restoral mode, and the processor fails to significantly extract the SOI after 800 SOI bauds in carrier restoral mode. The relatively slow convergence of the least-squares SCORE processor is due to the large uncorrelated interference component present in the reference signal r(t) for the choice of control vector used in this experiment. This effect is greatly reduced when the control vector is also adapted to restore self-coherence: the dominant corruption component remaining in r(t) after c is optimized is the irreducible self-interference component (12), which is small if the self-coherence strength $|\rho^{\alpha}_{ss^{(*)}}(\tau)|$ is close to unity. This also explains why the cross-SCORE processor converges much faster in carrier restoral mode than in baud-rate restoral mode: the SOI conjugate self-coherence at $\alpha$ = 0 is six times stronger than the SOI spectral self-coherence at $\alpha$ = 4 MHz. The convergence of the cross-SCORE processor is not appreciably slowed by using the stochastic power method to adapt the processor and control vectors. In fact, the difference between the processor SINR obtained using the algorithm given in (88) and (89) and the SINR obtained using the actual dominant mode of the cross-SCORE eigenequation drops to below 5 dB within 8 SOI bauds (2 μsec) in the environment used in Fig. 8 [10]. This result is consistent with the expected performance of the power-method algorithm in the rank-1 spectral self-coherence environment, because the cross-SCORE eigenequation has a very large spread between its maximum eigenvalue and its lesser eigenvalues

(86)

$w(n) = g(n)\hat R_{xx}^{-1}(n)\hat R_{xr}(n),$ (87)

where g(n) is a power-normalizing gain variable, $u(n) = x^{(*)}(n)e^{j2\pi\alpha n}$ (delay = 0) and c is set to $[1, 0, 0, 0]^T$ (isotropic antenna pattern), and where $\hat R_{(\cdot)}(n)$ denotes correlation with time-averaging over discrete-time collect interval [1, n]. The dominant mode of the cross-SCORE eigenequation is calculated using a stochastic implementation of the power method algorithm described in (41) and (42),

$c(n) = g_c(n)\hat R_{uu}^{-1}(n)\hat R_{ux}(n)w(n - 1)$ (88)

$w(n) = g_w(n)\hat R_{xx}^{-1}(n)\hat R_{xu}(n)c(n),$ (89)

where $g_{(\cdot)}(n)$ are power-normalizing gain constants and n refers to the averaging time (and collect time) of the processor. In both cases the output signal is formed using the most recent processor weight vector, $y(n) = w^H(n)x(n)$. The remaining modes of (35), (36) are calculated when required using a generalized eigenequation algorithm. The correlation statistics used in the weight update equations are calculated using a gain-normalized exponential or growing rectangular window, with a recursive update formula given by

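$\hat R_{zv}(n) = [1 - \mu(n)]\hat R_{zv}(n - 1) + \mu(n)z(n)v^H(n)$ (90)

for arbitrary signals z(n) and v(n), where $\mu(n)$ is given by

$$\mu(n) = \begin{cases} 1/n, & \text{rectangular windowing} \\ \dfrac{\mu_\infty}{1 - (1 - \mu_\infty)^n}, & \text{exponential windowing.} \end{cases} \qquad (91)$$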
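The recursive update in (90) can be sketched as follows. The rectangular step size 1/n reproduces the running sample average exactly; the exponential step size used below, $\mu_\infty / (1 - (1 - \mu_\infty)^n)$, is an assumed gain-normalized form chosen for this illustration and should be checked against the original publication. The function names and the test data are hypothetical.

```python
import numpy as np

def step_size(n, mu_inf=None):
    """Window gain mu(n): 1/n for a growing rectangular window, or an
    assumed gain-normalized exponential form when mu_inf is given."""
    if mu_inf is None:
        return 1.0 / n
    return mu_inf / (1.0 - (1.0 - mu_inf) ** n)

def update_correlation(R_prev, z, v, mu):
    """One step of R(n) = [1 - mu(n)] R(n-1) + mu(n) z(n) v^H(n)."""
    return (1.0 - mu) * R_prev + mu * np.outer(z, v.conj())

rng = np.random.default_rng(2)
M, N = 4, 200
Z = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)

R_rect = np.zeros((M, M), dtype=complex)
R_exp = np.zeros((M, M), dtype=complex)
for n in range(1, N + 1):
    z = Z[:, n - 1]
    R_rect = update_correlation(R_rect, z, z, step_size(n))
    R_exp = update_correlation(R_exp, z, z, step_size(n, mu_inf=0.05))

# The rectangular recursion should match the batch sample autocorrelation.
R_batch = Z @ Z.conj().T / N
rect_error = np.max(np.abs(R_rect - R_batch))
print(rect_error)
```

With the exponential window the estimate keeps tracking recent samples, which is the trade-off between tolerance to parameter error and misadjustment discussed for this window later in the section.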

The performance measure used to judge the quality of the processor output signal is the output SINR (92)


Fig. 9. Signal sorting by self-coherence frequency. (Panels: output BPSK constellation for $\alpha$ = 4 MHz and output QAM constellation for $\alpha$ = 3 MHz, in-phase versus quadrature; BPSK SINR for $\alpha$ = 4 MHz and 16-QAM SINR for $\alpha$ = 3 MHz, with the maximum 16-QAM SINR marked, versus BPSK SOI bauds at 0.25 μsec/baud.)

(which are all asymptotically equal to zero) in this environment. Figure 9 demonstrates the ability of the SCORE algorithm to sort through interference environments and extract SOIs on the basis of their differing self-coherence frequencies, a key feature of the cross-SCORE processor. Removing the conjugation operation and setting the target self-coherence frequency $\alpha$ to 4 MHz causes the SCORE processor to select the BPSK SOI; changing $\alpha$ to 3 MHz causes the SCORE processor to reject the BPSK SOI and select the 16-QAM SOI. In both cases, the processor converges to within 3 dB of the optimal performance within 100 (BPSK) SOI bauds. Figure 10 illustrates the ability of the rectangularly windowed SCORE processor to tolerate error in the assumed

over adaptation of the array. Alternately, the rejection time may be long enough to allow the error in $\alpha$ to be estimated and removed over the SOI transmission time. The simulation performed in Fig. 10 is repeated in [10] using the exponential windowing algorithm given in (91), for $0.001 \le \mu_\infty \le 0.1$, in order to evaluate the tolerance of this window to error in the self-coherence frequency of the SOI. It is found that an exponential window can improve the tolerance of the SCORE algorithm, but at significant cost in misadjustment error. For the largest exponential decay factor chosen in [10] ($\mu_\infty$ = 0.1), for instance, the SCORE algorithm experienced an average SINR loss of 6 dB below the maximum-attainable SINR for this example, for virtually all of the self-coherence errors considered in Fig. 10. However, this SINR loss fluctuated widely over the collection interval, varying by as much as 8 dB from that average value over the run-time of the simulation. It is hoped that more sophisticated averaging windows can reduce this misadjustment and increase the tolerance of the SCORE algorithm to self-coherence error.


C. Performance in the Rank-$L_\alpha$ Spectral Self-Coherence Environment


The performance of the SCORE processors in the rank-$L_\alpha$ spectral self-coherence environment containing $L_\alpha$ signals with spectral self-coherence at the target $\alpha$ is investigated here. Two environments are of particular interest: the multipath environment, where $L_\alpha$ correlated signals (reflections) with self-coherence at $\alpha$ are impinging on the array, and the multiple-SOI environment, where $L_\alpha$ uncorrelated signals with equal self-coherence strength at the same $\alpha$ are impinging on the array. The cross-SCORE processor is shown to reject interference and background noise, leaving at worst an arbitrary linear combination of the $L_\alpha$ self-coherent signals in both environments. In addition, the cross-SCORE and phase-SCORE processors are shown to sort through the multiple-SOI environment and separate the self-coherent signals with near-optimal SINR, if those signals have differing self-coherence strength at the same $\alpha$ (if the cross-SCORE algorithm is used), or if the signals have differing complex value (strength or phase) at the same $\alpha$ (if the phase-SCORE algorithm is used).

Fig. 10. Tolerance of rectangularly-windowed cross-SCORE to target self-coherence error.

SOI self-coherence frequency, a particularly important problem when conjugate self-coherence (which is affected by Doppler shift) is being restored by the processor. The processor is implemented here in carrier-restoral mode (conjugation enabled, $\tau$ = 0) for varying amounts of error in the target $\alpha$. In all cases where $\alpha$ is in error, the processor eventually rejects the BPSK SOI; as Fig. 10 shows, however, the time required for this to happen is long when the error is small. Furthermore, even when the error is large and the rejection time is short, the SINR of the selected SOI can reach a high value before the SCORE processor begins to reject the SOI. In many applications, this SINR may be high enough to allow a more robust (but less discriminatory) algorithm, such as a constant modulus algorithm, to take


Nyquist rolloff, while the eigenvector corresponding to the next-largest eigenvalue of (35) selects the BPSK signal with 50% Nyquist rolloff. In each case, the SINR of the selected signal converges to within 3 dB of its maximum attainable value. Figures 13 and 14 illustrate the performance of the SCORE processors when the self-coherent signals have equal self-

>0 dB OdB

·40 dB

Gain, dB 80

180

270

DlrKlion of Anlval, degreee

0.110

Fig. 11. Antenna pattern 01 most-dominant cross-SCORE solutions, multipath environment.

0.165

Figure 11 demonstrates the screening property of the cross-SCORE processor in a multi path environment where the 16-QAM signal in Table 2 has been replaced by a onesample (SO nsec) delayed replica of the BPSK signal listed in that table . The delayed-path signal sIt - Td) is almost fully correlated with the direct-path signal s(t), (I p,,(Tdl! = 0.93) for this value of delay, presenting a difficult extraction problem for any processor operating without prior knowledge of the signal DOAs. However, Fig. 11 shows that the two most-dominant modes of the cross-SCORE eigenequation are able to reject the interference signals that are not selfcoherent at the sal symbol rate, and to pass some linear combination of the two sal components. In (10) it is also shown that sal components are extracted with significant distortion, but without the cancellation effects observed in multipath environments with other blind techniques such as power-minimization techniques. Figure 12 demonstrates the ability of the cross-SCORE processor to sort through interference environments and extract signals on the basis of their differing self-coherence strengths. The cross-SCORE processor is configured here in baud-rate restoral mode, and the 16-QAM signal listed in Table 2 has been replaced with a 4 Mbls BPSK signal with 50% Nyquist rolloff. This environment therefore contains two BPSK signals with spectral self-coherence at 4 MHz but with differing self-coherence strengths (1/6 for the signal with 100% rolloff, and 1/14 for the signal with 50% rolloff). Figure 12 shows that the eigenvector corresponding to the largest eigenvalue of (351 selects the BPSK signal with 100%

1200 1600 Numb« 0' SCM .ymboll (0.25 j.lMel~boI)

Fig. 13. Measured SOl self-coherence magnitudes for two i.i.d, BPSK received signals.

20 dB

15dB 10dB 5dB

1200 1 Numba< of SOl symbob (0.25 uMCiOr mbo l1

Fig. 14. Output cross-SCORE, phase-SCORE SINRs for two i.i.d. BPSK received signals. coherence strengths. The cross-SCORE processor is implemented herewith the conjugation disabled and Q = 4 MHz, and with the 16-QAM signal listed in Table 2 replaced by a BPSK signal with the same structure as the BPSK signal given in that table, but with a diHerent timing phase and a statistically independent bit sequence. This environment then contains two independent and identically distributed (i.i.d.) signals with identical self-coherence strengths of 1/6 at Q = 4 MHz and T = O. However, the two BPSK signals

1.5

20 dB

10 dB

200

400

BP5K 501 Bauds (0.2S

600

800

~

-0.:

~-1.0 -1.S

Fig. 12. Signal sorting by eigenequation solution.

248

-1

-0.6

' 0.2 0.2 Baud 'nlerval

0.6

have differing self-coherence phases, due to their differing timing phases. Figure 13 shows that the estimated self-eoherence strengths of the transmitted BPSK SOls do not begin to settle about their expected values until roughly 800 data bauds have been collected. Comparison of Fig. 13 with the SINR of the two highest-eigenvalue solutions shown in Fig. 14 reveals that the cross-SCORE processor separates the two SOls until their estimated self-coherence strengths become too close. After this point, the dominant modes no longer separate the SOls, but continue to reject the interference and pass some linear combination of the two 5015 (plus a low level background of noise and residual interference). In contrast, Fig. 14 shows that an auto-SCORE algorithm such as the phase-SCORE algorithm can select and separate the BPSK signals even after the estimated self-coherence strengths of the SOls have converged to the same value. The phase-SCORE algorithm is able to select each of the SOls with near-optimal output SINR in this environment, by exploiting the differing self-coherence phase of these signals. The phase-SCORE SINR curves are also much smoother in Fig. 14, demonstrating none of the drop-outs exhibited by the cross-SCORE algorithm in this experiment. VI.

CONCLUSIONS

A new class of algorithms for blind adaptation of antenna arrays, the spectral self-coherence restoral (SCORE) algorithms, is presented. Three new adaptive processors, the least-squares SCORE processor, the cross-SCORE processor, and the auto-SCORE processors, are developed, analyzed, and simulated in the rank-1 and rank-L_α spectral self-coherence environments, where 1 and L_α signals with spectral self-coherence or conjugate self-coherence at a known value of frequency shift, and arbitrary interference without self-coherence at that value of frequency shift, are received by an antenna array. It is shown analytically and by computer simulation that the SCORE processors can select an SOI with maximum SINR in the rank-1 spectral self-coherence environment, given only knowledge of a self-coherence frequency of the SOI, for example, knowledge of the SOI symbol rate or carrier frequency. It is also shown that the cross-SCORE processor can select SOIs even if their self-coherence frequencies are only approximately known, and that the cross-SCORE processor can select SOIs with near-optimum SINR in the rank-L_α spectral self-coherence environment. These properties are used to demonstrate the ability of the SCORE processor to sort through environments to extract and separate multiple PCM SOIs on the basis of their differing symbol rates, or (if their symbol rates are equal) on the basis of their differing self-coherence strengths or phases.

These results show that the SCORE approach provides a promising alternative to existing blind adaptation techniques. The SCORE processors have unambiguous and analytically tractable convergence and selection properties, giving them an advantage over other property restoral techniques in automatic processing applications. The SCORE algorithms also operate without knowledge of the background noise or interference covariance matrix, and without knowledge of (or constraints on) the sensor array geometry or individual sensor characteristics, giving them cost, complexity, and performance advantages over techniques that exploit only the spatial coherence. The highly discriminatory signal-selection properties of the SCORE approach make it ideal for directed-search applications where a few SOIs with well-known modulation properties must be extracted from dense interference environments.

REFERENCES

[1] J. R. Treichler and B. G. Agee, "A new approach to multipath correction of constant modulus signals," IEEE Trans. ASSP, vol. ASSP-31, pp. 459-472, Apr. 1983.
[2] R. P. Gooch and J. Lundell, "The CM array: An adaptive beamformer for constant modulus signals," in Proc. 1986 Int. Conf. on ASSP, pp. 2523-2526, 1986.
[3] L. J. Griffiths and M. J. Rude, "The P-vector algorithm: A linearly constrained point of view," in Proc. 20th Asilomar Conf. on Signals, Systems and Computers, pp. 457-461, 1986.
[4] R. O. Schmidt, "Multiple emitter location and signal parameter estimation," IEEE Trans. Antennas Propagat., vol. AP-34, pp. 276-280, Mar. 1986.
[5] A. Paulraj, R. Roy, and T. Kailath, "Estimation of signal parameters via rotational invariance techniques-ESPRIT," in Proc. 19th Asilomar Conf. on Circuits, Systems and Computers, pp. 83-89, 1985.
[6] W. A. Gardner, Statistical Spectral Analysis: A Nonprobabilistic Theory. Englewood Cliffs, NJ: Prentice-Hall, 1987.
[7] W. A. Gardner, Introduction to Random Processes with Applications to Signals and Systems, 2nd ed. New York: McGraw-Hill, 1990.
[8] B. G. Agee, S. V. Schell, and W. A. Gardner, "Self-coherence restoral: A new approach to blind adaptation of antenna arrays," in Proc. 21st Asilomar Conf. on Signals, Systems and Computers, pp. 589-593, 1987.
[9] G. W. Stewart, Introduction to Matrix Computations. New York, NY: Academic Press, 1973.
[10] B. G. Agee, "The property restoral approach to blind adaptive signal extraction," Ph.D. dissertation, Dept. of Electrical Engineering and Computer Science, Univ. of California, Davis, CA, 1989.
[11] S. V. Schell and B. G. Agee, "Application of the SCORE algorithm and SCORE extensions to sorting in the rank-L environment," in Proc. 22nd Asilomar Conference on Signals, Systems and Computers, pp. 274-278, 1988.


Sensor Array Processing Based on Subspace Fitting

Mats Viberg, Member, IEEE, and Björn Ottersten, Member, IEEE

Abstract-A large number of signal processing problems are concerned with estimating unknown signal parameters from sensor array measurements. This area has drawn much interest and many methods for parameter estimation based on array data have appeared in the literature. This paper presents some of these algorithms as variations of the same subspace fitting problem. The methods considered herein are the deterministic maximum likelihood method (ML), ESPRIT, and a recently proposed multidimensional signal subspace method. These methods are formulated in a subspace fitting based framework, which provides insight into their algebraic and asymptotic relations. It is shown that by introducing a specific weighting matrix, the multidimensional signal subspace method can achieve the same asymptotic properties as ML. The asymptotic distribution of the estimation error is derived for a general subspace weighting and the weighting that provides minimum variance estimates is identified. The resulting optimal technique is termed the weighted subspace fitting (WSF) method. Numerical examples indicate that the asymptotic variance of the WSF estimates coincides with the Cramér-Rao bound. The performance improvement compared to the other techniques is found to be most prominent for highly correlated signals. A simulation study is presented, indicating that the asymptotic variance expressions are valid for a wide range of scenarios.

I. INTRODUCTION

THE area of sensor array processing has drawn considerable interest for several years. A vast number of algorithms for estimating unknown signal parameters from the measured output of a sensor array have appeared in the literature. The interest in sensor array processing stems from a large number of applications where waveforms are measured at several points in space and/or time. The problem of estimating signal parameters from measurements of sensor array outputs occurs in areas such as radar, radio and microwave communication, underwater acoustics, and geophysics. Much of the recent work in array processing has focused on methods for high-resolution direction-of-arrival (DOA) estimation. When the emitter signals are generated by spatially close sources, conventional beamforming methods fail to separate the DOA's. A number of different methods have been proposed for estimating closely spaced DOA's. Herein, it is shown that many of these techniques can be formulated in a subspace fitting framework. From this viewpoint the connections between different algorithms become clear. This also allows a systematic procedure for deriving asymptotic properties of the estimation methods. Furthermore, the proposed framework leads to some natural extensions of existing methods and to interesting areas for future research.

The maximum likelihood (ML) approach in sensor array processing has appeared in two versions, referred to as the stochastic ML method [1]-[3] and the deterministic¹ ML method [4], [5]. The approach of interest herein is the deterministic ML technique, which coincides with a least squares fit of the model to the observed data. In general, this results in a highly nonlinear multidimensional optimization problem. Several numerical methods for dealing with this optimization have been suggested [6]-[9]. Asymptotic properties (when the amount of data is large) of the resulting estimates have been reported for the special case of uncorrelated signals in [10], and extended to the general case in [11] and [12].

Subspace or eigenvector methods are known to have high resolution capabilities and yield accurate estimates. The introduction of the multiple signal classification (MUSIC) algorithm [1], [13], which requires a one-dimensional search, was an attempt to more fully exploit the underlying data model. MUSIC provides a nice geometric interpretation of the DOA problem and has received much attention. The asymptotic properties of MUSIC have been well documented in the literature [10], [14]-[18]. A multidimensional version of MUSIC (MD-MUSIC) has been discussed in the literature [1], [19], [20] when dealing with problems of correlated signals and finite sample bias. This, however, leads back to a computationally expensive search as in the ML procedure. The asymptotic properties of this MD-MUSIC method have not been reported and thus, the method has previously not been well motivated from a performance/computational complexity point of view.

Manuscript received February 1, 1989; revised May 16, 1990. This work was supported in part by the SDI/IST Program managed by the Office of Naval Research, under Contract N00014-85-K-0550, by the Joint Services Program at Stanford University (U.S. Army, U.S. Navy, U.S. Air Force) under Contract DAAL03-88-C-0011, and by the Swedish Board for Technical Development (STU). M. Viberg is with the Department of Electrical Engineering, Linköping University, S-581 83 Linköping, Sweden. B. Ottersten was with the Information Systems Laboratory, Stanford University, Stanford, CA. He is now with the Department of Electrical Engineering, Linköping University, S-581 83 Linköping, Sweden. IEEE Log Number 9042723.

Estimation of signal parameters via rotational invariance techniques (ESPRIT) is a recently proposed method [19], [21] for DOA estimation exploiting a specific array structure. It has vast computational advantages over search techniques and requires no knowledge of the individual antenna gain and phase patterns. The asymptotic properties for a least squares variation of ESPRIT have recently

¹Sometimes called conditional ML.

Reprinted from IEEE Transactions on Signal Processing, Vol. 39, No.5, pp. 1110-1121, May 1991.


appeared [22]. The method considered here is total least squares (TLS) ESPRIT [19], [23].

The methods discussed above are all closely related, although the actual computations involved can be quite different. This paper formulates the different methods in a common subspace fitting based framework, which clarifies the algebraic relations between the algorithms. The framework can be used for designing common numerical algorithms and for obtaining new methods. For example, extensions of the TLS-ESPRIT algorithm arise in a natural way from the subspace fitting formulation. The proposed framework also invites a unified derivation of the asymptotic properties of the different methods. As a starting point, it is shown here that by introducing a specific weighting matrix in MD-MUSIC, this method can be given the same asymptotic properties as ML. The asymptotic distribution of the estimation error is derived for a general weighting matrix, hence including the results for ML, MD-MUSIC, and ESPRIT as special cases. The covariance matrix of the estimation error for the general subspace fitting method is then minimized with respect to the weighting matrix. It is found that there exists a weighting matrix such that the optimally weighted MD-MUSIC method always outperforms ML. This optimal subspace fitting approach is referred to as the weighted subspace fitting (WSF) method. Some numerical examples and simulations are presented, verifying the theoretical results for finite data and comparing the properties of the methods. The simulations indicate that the asymptotic results provide useful performance predictions for a wide range of information-to-noise ratios (INR). The numerical examples also indicate that the WSF estimates are asymptotically efficient, i.e., the asymptotic parameter variance coincides with the Cramér-Rao bound (CRB) for the problem at hand.

II. PROBLEM FORMULATION

Consider a sensor array consisting of $m$ elements. Impinging on the array are wavefronts from $d$ emitters at different locations. The signal sources are assumed to be in the far field, and thus the wavefronts are approximated as planar. The case of principal interest herein is the one-dimensional parameter problem, DOA estimation. However, most results extend readily to the multiple parameter per source case.

A. Data Model

The output of the $m$ element sensor array is assumed to be a weighted superposition of the $d$ wavefronts, corrupted by additive sensor noise, uncorrelated with the emitter signals. The emitter signals are assumed to be narrow-band, i.e., the propagation time across the array is small compared to the time variations of the amplitude and phase modulation of the carrier frequency. Thus, the propagation of a wavefront between sensor elements is modeled as a simple phase delay. The complex weighting of the wavefronts is due to the individual sensor gain and phase patterns and the propagation phase delay. The output of the $i$th sensor is represented by

$$x_i(t) = \sum_{j=1}^{d} a_i(\theta_j)\, s_j(t) + n_i(t) \qquad (1)$$

where $a_i(\theta_j)$ is a complex scalar representing the propagation delay of the $j$th emitter signal and the gain and phase adjustments by the $i$th sensor. The $j$th emitter signal is represented by $s_j(t)$ and the additive noise sequence is denoted $n_i(t)$. In matrix notation this familiar equation is obtained

$$x(t) = [a(\theta_1) \cdots a(\theta_d)]\, s(t) + n(t) = A(\theta_0)\, s(t) + n(t) \qquad (2)$$

where $\theta_0$ is a $d$-dimensional parameter vector corresponding to the true DOA's. The vector $a(\theta_j) = [a_1(\theta_j) \cdots a_m(\theta_j)]^T$ contains the sensor responses to a unit wavefront from the direction $\theta_j$. The collection of these vectors over the parameter space of interest

$$\mathcal{A} = \{ a(\theta_j) \mid \theta_j \in \Theta \} \qquad (3)$$

is often called the array manifold. It is assumed that the parametrization of $\mathcal{A}$ is known. Let us also introduce the set $\mathcal{A}^d$, defined as the collection of all matrices with $d$ columns corresponding to distinct array manifold vectors

$$\mathcal{A}^d = \{ A \mid A = [a(\theta_1) \cdots a(\theta_d)],\ \theta_1 < \theta_2 < \cdots < \theta_d \}. \qquad (4)$$

Thus, $\mathcal{A}^d$ is parametrized by the parameter vector $\theta = [\theta_1, \ldots, \theta_d]^T$. The array is assumed to be unambiguous, i.e., any $A \in \mathcal{A}^d$ has full rank. The array output is sampled at $N$ time instances and these snapshots are collected in the columns of an $m \times N$ data matrix $X_N$.

B. Notation and Assumptions

Throughout the paper we will use the following notation:

$(\cdot)^*$ : Hermitian transpose.
$\mathrm{tr}\{A\}$ : trace of $A$.
$\mathcal{R}\{A\}$ : range space of $A$.
$A \odot B$ : Schur (Hadamard) product, $(A \odot B)_{ij} = A_{ij} B_{ij}$.
$\|A\|_F$ : Frobenius norm, $\|A\|_F^2 = \mathrm{tr}(A^* A)$.
$A^\dagger$ : pseudoinverse; for $A$ of full rank, $A^\dagger = (A^* A)^{-1} A^*$.
$P_A$ : projection matrix, $P_A = A (A^* A)^{-1} A^* = A A^\dagger$.
$P_A^\perp$ : $P_A^\perp = I - P_A$.
$X \in \mathrm{AsN}(\mathbf{m}, C)$ : $X$ is asymptotically (for large $N$) Gaussian distributed with mean $\mathbf{m}$ and covariance matrix $C$.
$\hat{\theta} = \arg\min_\theta V(\theta)$ : $\hat{\theta}$ is the minimizing argument of $V(\theta)$.
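As a concrete illustration of the data model (2) and of the notation listed above, the following sketch simulates snapshots for one particular array and checks the stated identities for the pseudoinverse, the projection matrices, and the Frobenius norm. The uniform linear array with half-wavelength spacing, the function names, and all numerical values are assumptions introduced for this example; the paper itself only requires a known, unambiguous array manifold.

```python
import numpy as np

def ula_steering(theta_deg, m, spacing=0.5):
    """Array response matrix A(theta) for an ASSUMED uniform linear array.

    spacing is the element separation in wavelengths (hypothetical value).
    """
    theta = np.deg2rad(np.atleast_1d(np.asarray(theta_deg, dtype=float)))
    k = np.arange(m)[:, None]
    return np.exp(2j * np.pi * spacing * k * np.sin(theta)[None, :])

def simulate_snapshots(theta_deg, m, N, sigma2, rng):
    """Draw N snapshots x(t) = A(theta0) s(t) + n(t) as in the data model (2)."""
    d = len(theta_deg)
    A = ula_steering(theta_deg, m)
    # Circular complex Gaussian emitter signals with unit power.
    S = (rng.standard_normal((d, N)) + 1j * rng.standard_normal((d, N))) / np.sqrt(2)
    # Spatially white circular complex Gaussian noise with variance sigma2.
    noise = np.sqrt(sigma2 / 2) * (
        rng.standard_normal((m, N)) + 1j * rng.standard_normal((m, N))
    )
    return A, A @ S + noise

rng = np.random.default_rng(1)
A, X = simulate_snapshots([0.0, 10.0], m=6, N=200, sigma2=0.1, rng=rng)

# Pseudoinverse for a full-rank A: (A* A)^{-1} A*.
A_pinv = np.linalg.solve(A.conj().T @ A, A.conj().T)
# Projection onto the range space of A and onto its orthogonal complement.
P_A = A @ A_pinv
P_A_perp = np.eye(A.shape[0]) - P_A
# Squared Frobenius norm, which should equal tr(A* A).
fro_sq = np.linalg.norm(A, "fro") ** 2
```

The projection matrix is idempotent and Hermitian, and its complement annihilates the columns of A, which is the property exploited by the subspace fitting criteria.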

The additive noise, $n(t)$, is modeled as a stationary, temporally white,² zero-mean Gaussian random process. For simplicity we will also require $n(t)$ to be spatially white, i.e., $E[n(t)\, n^*(t)] = \sigma^2 I$. The assumption of spatially white noise is no restriction if the noise covariance is known (up to an unknown scalar $\sigma^2$) [1]. The emitter waveforms are also modeled as stationary, temporally white, zero-mean Gaussian random processes. The emitter covariance is denoted

$$S = E[s(t)\, s^*(t)] = \lim_{N \to \infty} \frac{1}{N} \sum_{t=1}^{N} s(t)\, s^*(t). \qquad (6)$$

²In this context, a random process is temporally white if it has a flat spectrum over the (narrow) frequency band of interest.

The noise and signal waveforms are further assumed to be circularly symmetric, i.e.,

$$E[n(t)\, n^T(t)] = 0, \qquad E[s(t)\, s^T(t)] = 0. \qquad (7)$$

Under these assumptions, the array output is complex Gaussian with zero mean and covariance matrix

$$R = E[x(t)\, x^*(t)] = \lim_{N \to \infty} \frac{1}{N} \sum_{t=1}^{N} x(t)\, x^*(t) = A(\theta_0)\, S\, A^*(\theta_0) + \sigma^2 I. \qquad (8)$$

The rank of $S$ is denoted $d'$. Clearly $d' \le d$, and strict inequality implies linear dependence among the signal waveforms, emanating from, e.g., specular multipath or "smart jamming" in communication applications. For the asymptotic analysis, it is assumed that the array manifold vectors are twice continuously differentiable with bounded second derivatives in a neighborhood of the true DOA's. To enable unique identification of the signal parameters, the following condition is imposed on the array response:

$$A(\theta)\, T = A(\eta)\, U \ \Rightarrow\ \theta = \eta \qquad (9)$$

where $T$, $U$ are arbitrary $d \times d'$ matrices of full rank and $\theta$, $\eta$ are parameter vectors. For $A \in \mathcal{A}^d$, this condition is always met if $d < (m + d')/2$. If instead $(m + d')/2 \le d < 2 d' m/(2d' + 1)$, the condition is satisfied for almost all matrices $T$ and $U$, see [24]. For later reference, the eigendecomposition of $R$ is introduced:

$$R = \sum_{i=1}^{m} \lambda_i e_i e_i^* = E_s \Lambda_s E_s^* + E_n \Lambda_n E_n^* \qquad (10)$$

where $\lambda_1 > \cdots > \lambda_{d'} > \lambda_{d'+1} = \cdots = \lambda_m = \sigma^2$. The matrix $E_s = [e_1, \ldots, e_{d'}]$ contains the $d'$ eigenvectors of $R$ corresponding to the largest eigenvalues, and these so-called signal eigenvalues are assumed to be distinct. The range space of $E_s$ is called the signal subspace. Its orthogonal complement is the noise subspace and is spanned by the columns of $E_n = [e_{d'+1}, \ldots, e_m]$. The fact that the smallest eigenvalue of $R$ has multiplicity $m - d'$ and is equal to the noise variance is well known, see [1]. Thus, $\Lambda_n = \sigma^2 I$. Moreover, it is clear that the signal subspace is a subset of the range space of $A(\theta_0)$, i.e.,

$$\mathcal{R}\{E_s\} \subseteq \mathcal{R}\{A(\theta_0)\} \qquad (11)$$

with equality if and only if $d' = d$. The eigendecomposition of the sample covariance matrix $\hat{R}$ is defined in a similar fashion as (10):

$$\hat{R} = \frac{1}{N} X_N X_N^* = \hat{E}_s \hat{\Lambda}_s \hat{E}_s^* + \hat{E}_n \hat{\Lambda}_n \hat{E}_n^*. \qquad (12)$$

Determination of the number of emitters $d$ and the dimension of the signal subspace $d'$ is crucial for all methods described herein. Methods that accomplish this detection are discussed in, e.g., [25], [26]. It is assumed here that $d$ and $d'$ are known.

III. SUBSPACE FITTING FRAMEWORK FOR SENSOR ARRAY PROCESSING

This section attempts to collect various methods in a common framework. These algorithms are interpreted as variations of the same subspace fitting method. The subspace fitting formulation of the deterministic ML method is exploited and the other algorithms are related to this formulation.

A. The Basic Subspace Fitting Problem

It is readily seen in [5], [8] that the deterministic ML method fits the subspace spanned by $A(\theta)$ to the measurements $X_N$ in a least squares sense. This observation leads us to define the basic subspace fitting problem by the following equation:

$$\hat{A}, \hat{T} = \arg\min_{A,\, T} \| M - A T \|_F^2. \qquad (13)$$

In the above equation, the $m \times q$ matrix $M$ represents the data and $T$ is any $p \times q$ matrix. For a fixed $A$, the minimum with respect to $T$ is a measure of how well the range spaces of $A$ and $M$ match. The subspace fitting estimate selects $A$ so that these subspaces are as close as possible. The estimate of $\theta$ is obtained from the parameters of $\hat{A}$. The matrix $M$ and its dimension, as well as the parametrization of $A$, can be chosen in different ways, leading to different estimates. Indeed, this is the key observation for describing the previously mentioned methods as solutions to variations of the same basic problem. Before going into different choices, notice that the subspace fitting problem is separable in $A$ and $T$ [27]. By substituting the pseudoinverse solution, $\hat{T} = A^\dagger M$, back into (13) we obtain the following equivalent problem:

$$\hat{A} = \arg\max_{A} \mathrm{tr}\{ P_A M M^* \} \qquad (14)$$

where $P_A = A A^\dagger$ is a projection matrix that projects onto the column space of $A$.

B. Subspace Fitting Methods
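Before the individual methods are discussed, the separability that takes (13) into (14) can be checked numerically. The sketch below uses arbitrary random matrices A and M (hypothetical test data, not array data) and verifies that the least squares residual at the pseudoinverse solution equals tr{MM*} minus the concentrated criterion tr{P_A MM*}, so that minimizing (13) over both arguments is the same as maximizing (14) over A alone.

```python
import numpy as np

rng = np.random.default_rng(2)
m, p, q = 6, 2, 4   # hypothetical dimensions: A is m x p, M is m x q

def crandn(rows, cols):
    """Complex Gaussian test matrix (illustrative data only)."""
    return rng.standard_normal((rows, cols)) + 1j * rng.standard_normal((rows, cols))

A = crandn(m, p)
M = crandn(m, q)

# Inner minimization of (13) over T for fixed A: T_hat = pinv(A) M.
T_hat = np.linalg.pinv(A) @ M
residual = np.linalg.norm(M - A @ T_hat, "fro") ** 2

# Concentrated form: ||M - A T_hat||_F^2 = tr{M M*} - tr{P_A M M*}.
P_A = A @ np.linalg.pinv(A)
MMh = M @ M.conj().T
concentrated = (np.trace(MMh) - np.trace(P_A @ MMh)).real

# Any other T gives a residual that is at least as large.
T_other = T_hat + 0.1 * crandn(p, q)
residual_other = np.linalg.norm(M - A @ T_other, "fro") ** 2
```

Since tr{MM*} does not depend on A, the term tr{P_A MM*} is the only part of the residual that the choice of A can influence.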

Table I lists different choices of $M$ and the set containing $A$ for the basic subspace fitting problem. The goal of this section is to explain how these choices lead to the different methods.

TABLE I
SUBSPACE FITTING METHODS OBTAINED FROM (13), (14)

Constraint on A | Choice of M: $M M^* = \hat{R}$ | Choice of M: $M = \hat{E}_s$
$A \in \mathcal{A}^d$ | ML | MD-MUSIC
$A \in \mathcal{E}$ | ML-ESPRIT | TLS-ESPRIT

1) Deterministic ML: The deterministic ML method maximizes the conditional likelihood (given the signal waveforms) of the data matrix $X_N$. This leads to the following minimization [4], [5]:

$$\min_{\theta,\, S_N} \mathrm{tr}\{ (X_N - A(\theta) S_N)^* (X_N - A(\theta) S_N) \} = \min_{\theta,\, S_N} \| X_N - A(\theta) S_N \|_F^2. \qquad (15)$$

Hence, the choice $M = N^{-1/2} X_N$ and $A \in \mathcal{A}^d$ in (13) gives deterministic ML. The ML estimate of the emitter waveforms $S_N$ is obtained from (15) by

$$\hat{S}_N = A^\dagger(\theta)\, X_N. \qquad (16)$$

By substituting (16) in (15), the following familiar optimization problem is obtained:

$$\hat{\theta} = \arg\max_{\theta} \mathrm{tr}\{ P_A(\theta)\, \hat{R} \} \qquad (17)$$

where $\hat{R}$ is the sample covariance matrix. Notice from (14) that the same method (17) could just as well be obtained by taking $M$ as any $m \times m$ Hermitian square root of $\hat{R}$, i.e., $\hat{R} = M M^* = M^2$. Thus, the ML method fits a $d$-dimensional subspace spanned by the columns of $A(\theta)$ to the data matrix, or, equivalently, to a compact representation thereof.

2) MD-MUSIC: An alternative multidimensional array processing technique is an extension of the one-dimensional³ MUSIC algorithm [1] to a multidimensional search method. This is described in [1], [19] and recently also formulated in [20]. Adopting the subspace fitting formulation, the method can be posed as

$$\hat{\theta} = \arg\min_{A \in \mathcal{A}^d,\, T} \| \hat{E}_s - A T \|_F^2 = \arg\max_{A \in \mathcal{A}^d} \mathrm{tr}\{ P_A \hat{E}_s \hat{E}_s^* \}. \qquad (18)$$

The above is easily recognized as (14) with $M = \hat{E}_s$ and $A \in \mathcal{A}^d$. The method suffers, just like ML, from a costly multidimensional optimization. It is thus not clear why one would choose this method in favor of ML. This question is discussed in the next section.

³MUSIC involves a one-dimensional parameter search in the case presented here. In general, MUSIC searches for one signal "at a time" and the search is of dimension equal to the number of parameters per signal.

3) ESPRIT: The ESPRIT algorithm [19], [21], [23] assumes a specific array geometry and is thus not as general as the algorithms discussed up to now. By exploiting this array structure, a very interesting solution to the DOA problem, which requires no knowledge of the array manifold, is obtained. No search is involved as in the previous methods, and therefore the computational requirements are less. Assume that the array is composed of two identical subarrays, each of $m/2$ elements. The subarrays are displaced from each other by a known displacement vector $\Delta$. The output of the array is modeled as

$$x(t) = \begin{bmatrix} \Gamma \\ \Gamma \Phi \end{bmatrix} s(t) + \begin{bmatrix} n_1(t) \\ n_2(t) \end{bmatrix} \qquad (19)$$

where the $m/2 \times d$ matrix $\Gamma$ contains the common array manifold vectors of the two subarrays. The ESPRIT algorithm does not exploit the entire array manifold. The only knowledge that is used (and consequently the only knowledge that is required) is the displacement structure of the array. This limits the number of resolvable [...] $\Phi = \mathrm{diag}[e^{j\omega\tau_1}, \ldots, e^{j\omega\tau_d}]$, where $\tau_k$ is the time delay in the propagation of the $k$th emitter signal between the two subarrays and $\omega$ is the center frequency of the emitters. The time delay is related to the angle of arrival by $\tau_k = |\Delta| \sin\theta_k / c$, where $c$ is the speed of propagation. Thus, estimates of the DOA's can readily be obtained from estimates of the time delays $\tau_k$. In ESPRIT it is assumed that the signal covariance matrix $S$ has full rank, i.e., that $d' = d$. The array manifold vectors corresponding to the true DOA's $\theta_1, \ldots, \theta_d$ then span the signal subspace, $\mathcal{R}\{E_s\} = \mathcal{R}\{[\Gamma^T\ (\Gamma\Phi)^T]^T\}$. Since the matrices $[E_1^T\ E_2^T]^T$ and $[\Gamma^T\ (\Gamma\Phi)^T]^T$ have the same range space, there exists a full rank $d \times d$ matrix $T$ such that

$$\begin{bmatrix} E_1 \\ E_2 \end{bmatrix} = \begin{bmatrix} \Gamma \\ \Gamma \Phi \end{bmatrix} T. \qquad (20)$$

By eliminating $\Gamma$ in (20), the following expression is obtained:

$$E_2 = E_1 T^{-1} \Phi T = E_1 \Psi. \qquad (21)$$

When the eigenvectors are corrupted by noise, the estimates $\hat{E}_1$ and $\hat{E}_2$ will, in general, not span the same column space, and no matrix $\Psi$ will satisfy the relation (21). Following [19], [23], a total least squares (TLS) estimate [28] of $\Psi$, given $\hat{E}_1$ and $\hat{E}_2$, is provided by

$$\hat{\Psi} = -V_{12} V_{22}^{-1} \qquad (22)$$

where $V_{12}$ and $V_{22}$ are implicitly defined by the eigendecomposition

$$\begin{bmatrix} \hat{E}_1^* \\ \hat{E}_2^* \end{bmatrix} [\hat{E}_1\ \ \hat{E}_2] = \begin{bmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{bmatrix} L \begin{bmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{bmatrix}^* \qquad (23)$$

where $L = \mathrm{diag}[l_1, \ldots, l_{2d}]$, $l_1 \ge l_2 \ge \cdots \ge l_{2d}$. Since $\Psi = T^{-1} \Phi T$, the elements of [...]

$$\hat{A}, \hat{T} = \arg\min_{A \in \mathcal{E},\, T} \| \hat{E}_s - A T \|_F^2 \qquad (24)$$

where the set $\mathcal{E}$ is defined by

$$\mathcal{E} = \{ A \mid A = [\Gamma^T\ (\Gamma\Phi)^T]^T,\ \Gamma \in \mathbb{C}^{m/2 \times d},\ \Phi = \mathrm{diag}[\gamma_1, \ldots, \gamma_d],\ \gamma_i \in \mathbb{C} \}. \qquad (25)$$

Lemma 1: Let the TLS-ESPRIT estimate of $\Psi$ be defined by (22), (23). Then the minimizing [...]

In this problem, only the displacement structure of the two arrays is exploited. No knowledge of the array manifold of the individual subarrays is required. The price for this is the fact that coherent sources can no longer be resolved.

4) To make Table I complete, the ML-ESPRIT method [19] is included. This approach maximizes the conditional likelihood of the data, assuming the ESPRIT parametrization of $A$. Clearly, this is accomplished by replacing $\hat{E}_s$ in (24) by any Hermitian square root of $\hat{R}$. Unfortunately, the ML-ESPRIT method has no known simple solution since the connection to a TLS problem cannot be made with this choice of $M$.

The algebraic connection between the methods of Table I is clearly seen in the subspace fitting formulation. Using this framework as a base, asymptotic results will be derived in the next section.

IV. ASYMPTOTIC RESULTS

In general, it is very difficult to answer questions about the performance of different methods. However, valuable insight can be gained by investigating what happens when the amount of data ($N$) is large. Several papers have been devoted to the asymptotic properties of the signal subspace methods, as mentioned in the introduction. The discussion in the previous section suggests that one can obtain asymptotic results of all methods within the subspace fitting framework simultaneously. It is useful to first examine the asymptotic form of the criterion function.

A. The Asymptotic Criterion Function

In the previous section, the data ($M$ in (13)) are represented in two ways. Either $M$ is chosen as a Hermitian square root of the sample covariance matrix $\hat{R}$, or as the signal subspace matrix $\hat{E}_s$. The following observation shows that for large $N$, it is sufficient to consider the latter case if the signal subspace matrix is postmultiplied by a specific weighting matrix.

Lemma 2: The deterministic ML method has the same asymptotic distribution as the following estimator:

$$\hat{A} = \arg\min_{A \in \mathcal{A}^d,\, T} \| \hat{E}_s \tilde{\Lambda}^{1/2} - A T \|_F^2 = \arg\max_{A \in \mathcal{A}^d} \mathrm{tr}\{ P_A \hat{E}_s \tilde{\Lambda} \hat{E}_s^* \} \qquad (26)$$

where $\tilde{\Lambda} = \Lambda_s - \sigma^2 I = \tilde{\Lambda}^{1/2} \tilde{\Lambda}^{1/2}$.

Proof: The ML criterion function is

$$\mathrm{tr}\{ P_A \hat{R} \} = \mathrm{tr}\{ P_A (\hat{E}_s \hat{\Lambda}_s \hat{E}_s^* + \hat{E}_n \hat{\Lambda}_n \hat{E}_n^*) \}. \qquad (27)$$

In Appendix I of [29], it is shown that $\hat{\Lambda}_n$ in (27) can be replaced by $\hat{\sigma}^2 I$ without affecting the asymptotic properties. The identity $\hat{E}_n \hat{E}_n^* = I - \hat{E}_s \hat{E}_s^*$ implies that the ML estimator is asymptotically given by

$$\hat{A} = \arg\max_{A \in \mathcal{A}^d} \mathrm{tr}\{ P_A \hat{E}_s (\hat{\Lambda}_s - \hat{\sigma}^2 I) \hat{E}_s^* \}. \qquad (28)$$

By Lemma 5, replacement of the weighting matrix $\hat{\Lambda}_s - \hat{\sigma}^2 I$ by $\Lambda_s - \sigma^2 I$ does not change the asymptotic distribution of the estimate. □

An important consequence of Lemma 2 is that all methods within the subspace fitting framework asymptotically maximize the following criterion function:

$$\mathrm{tr}\{ P_A \hat{E}_s W \hat{E}_s^* \} \qquad (29)$$

where $W$ is a $d' \times d'$ weighting matrix. The choice of $W$ affects the asymptotic properties of the estimation error, as is demonstrated later in this section.

B. Consistency
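As a rough numerical illustration of the criterion (29), and of the consistency property treated in this subsection, the sketch below maximizes the criterion over a coarse grid for two uncorrelated emitters. Everything specific in it is an assumption made for the example: the uniform linear array manifold, the grid search, the helper names, and the numerical values. Two weightings are tried: the identity, which corresponds to the unweighted MD-MUSIC choice $M = \hat{E}_s$, and an estimate of the matrix $\Lambda_s - \sigma^2 I$ appearing in Lemma 2, formed here from sample eigenvalues.

```python
import numpy as np

def ula_steering(theta_deg, m, spacing=0.5):
    """ASSUMED manifold: uniform linear array, spacing in wavelengths."""
    theta = np.deg2rad(np.atleast_1d(np.asarray(theta_deg, dtype=float)))
    k = np.arange(m)[:, None]
    return np.exp(2j * np.pi * spacing * k * np.sin(theta)[None, :])

def sample_eig(X):
    """Eigenvalues (descending) and eigenvectors of the sample covariance."""
    R_hat = X @ X.conj().T / X.shape[1]
    lam, E = np.linalg.eigh(R_hat)       # eigh returns ascending order
    return lam[::-1], E[:, ::-1]

def criterion(theta_deg, Es, W):
    """Subspace fitting criterion tr{P_A(theta) Es W Es*}."""
    A = ula_steering(theta_deg, Es.shape[0])
    P_A = A @ np.linalg.pinv(A)
    return np.trace(P_A @ Es @ W @ Es.conj().T).real

def grid_search(Es, W, grid):
    """Exhaustive search over ordered pairs theta1 < theta2 (two emitters)."""
    best_val, best = -np.inf, None
    for i, t1 in enumerate(grid):
        for t2 in grid[i + 1:]:
            val = criterion([t1, t2], Es, W)
            if val > best_val:
                best_val, best = val, (t1, t2)
    return np.array(best)

rng = np.random.default_rng(3)
m, N, sigma2 = 8, 2000, 0.1              # hypothetical scenario
theta_true = np.array([-5.0, 8.0])       # true DOA's in degrees
A0 = ula_steering(theta_true, m)
S = (rng.standard_normal((2, N)) + 1j * rng.standard_normal((2, N))) / np.sqrt(2)
noise = np.sqrt(sigma2 / 2) * (
    rng.standard_normal((m, N)) + 1j * rng.standard_normal((m, N))
)
X = A0 @ S + noise

lam, E = sample_eig(X)
Es = E[:, :2]                            # estimated signal subspace, d' = 2
sigma2_hat = lam[2:].mean()              # noise variance estimate
grid = np.arange(-20.0, 21.0, 1.0)       # 1-degree grid

theta_unweighted = grid_search(Es, np.eye(2), grid)
theta_weighted = grid_search(Es, np.diag(lam[:2] - sigma2_hat), grid)
```

A grid search is used only to keep the example short; a practical implementation would refine the maximum with a numerical optimizer, and the weighting that minimizes the estimation error variance is derived later in the paper rather than in this sketch.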

The general subspace fitting estimate is obtained by maximizing (29). 'For ML and MD-MUSIC~ A is parametrized by the DOA's only (A E acl)~ whereas for ML-ESPRIT and TLS-ESPRIT, a more complicated parametrization involving the elements of rand
9=

arg max V(9) 6

= arg

max tr {P,\(9)E.,WE"~~}. 8

(30)

We will restrict the weighting matrix W of (30) to be Hermitian and positive definite. Naturally, the limiting behavior of 9 depends on the limiting properties of the sample covariance eigenvectors. We have the following result. Lemma 3: Let the eigendecomposition of the array covariance be defined as in Section II and llSSU111e that the signal eigenvalues (the d' largest) are distinct. Then the following limits hold witn probability one (w.p, J) as N -4 00

$$\hat{\lambda}_k \to \lambda_k, \qquad k = 1, \ldots, m \tag{31}$$

$$\hat{e}_k \to e_k, \qquad k = 1, \ldots, d' \tag{32}$$

where d' is the rank of the signal covariance.

Proof: Under the conditions stated (see footnote 5), the elements of $\hat{R}$ converge w.p.1 to those of R; see, e.g., [30]. Standard theory on the perturbation of eigenvalues, see [31], then proves (31). Some more care must be exercised concerning the eigenvectors. Assume that an arbitrary uniqueness condition is used when calculating the eigenvectors, for example, that the kth component of $e_k$ is positive and real. Then the perturbation theory for invariant subspaces, see [32] or [33, p. 413], applied to each signal eigenvector can be used to show (32). □

Footnote 5: And, in fact, under much more general conditions.

Using the above result, the strong consistency of the parameter estimates is readily established.

Theorem 1: The estimate $\hat{\theta}$ obtained from (30) converges w.p.1 to $\theta_0$ as N → ∞.

Proof: Notice first that the criterion function V(θ) converges w.p.1, uniformly in the parameters, to the limit function

$$\bar{V}(\theta) = \operatorname{tr}\{P_A(\theta) E_s W E_s^*\} \tag{33}$$

as N tends to infinity. To see this, consider the difference

$$\sup_{\theta}\,\bigl|\operatorname{tr}\{P_A(\theta)\hat{E}_s W \hat{E}_s^*\} - \operatorname{tr}\{P_A(\theta) E_s W E_s^*\}\bigr|. \tag{34}$$

Since the norm of the projection matrix is bounded and the signal eigenvectors converge w.p.1 to their true values, the right-hand side of (34) tends to zero w.p.1. Consequently, $\hat{\theta}$ converges w.p.1 to the maximizing argument of $\bar{V}(\theta)$. We have

$$\bar{V}(\theta) = \operatorname{tr}\{P_A(\theta) E_s W E_s^*\} \le \operatorname{tr}\{E_s W E_s^*\} = \bar{V}(\theta_0). \tag{35}$$

In the inequality, we have used that the trace of a matrix is the sum of its eigenvalues. The inequality is strict unless $P_A^\perp(\theta)E_s = 0$, in which case $E_s = A(\theta)T$ for some nonsingular d × d' matrix T. However, in view of (9) and (11), this is possible if and only if $\theta = \theta_0$. □

The above result states that the parameter estimate is strongly consistent, and hence of course asymptotically unbiased. Notice, though, that it is in general biased for finite N.

C. Asymptotic Distribution

Having established consistency of the general subspace fitting method, we now proceed to derive the asymptotic distribution of the estimate. The distributional results to be presented are based on the signal eigenvector statistics. The result was given in [34] for the real case and has been extended to complex eigenvectors.

Lemma 4 ([15], [16]): The d' largest eigenvectors of $\hat{R}$ are asymptotically Gaussian distributed with means and covariances given by

$$E[\hat{e}_k] = e_k + O(N^{-1}) \tag{36}$$

$$E[(\hat{e}_k - E[\hat{e}_k])(\hat{e}_l - E[\hat{e}_l])^*] = \delta_{kl}\,\frac{\lambda_k}{N}\sum_{\substack{i=1 \\ i \ne k}}^{m}\frac{\lambda_i}{(\lambda_i - \lambda_k)^2}\,e_i e_i^* + o(N^{-1}) \tag{37}$$

$$E[(\hat{e}_k - E[\hat{e}_k])(\hat{e}_l - E[\hat{e}_l])^T] = -(1 - \delta_{kl})\,\frac{\lambda_k \lambda_l}{N(\lambda_k - \lambda_l)^2}\,e_l e_k^T + o(N^{-1}). \tag{38}$$

Notice that only the N-independent term is shown in (36). This approximation is sufficient for our purposes. Since $\hat{\theta}$ maximizes V(θ) given by (30), we have $V'(\hat{\theta}) = 0$, where V' is the gradient of V. Following [35, p. 240], a first-order Taylor series expansion of V' around the true value $\theta_0$ leads to

$$0 = V'(\theta_0) + V''(\theta_N)(\hat{\theta} - \theta_0) \tag{39}$$

where $\theta_N$ is a point on the line segment joining $\theta_0$ and $\hat{\theta}$. Denote the limiting second derivative of V

$$\bar{V}''(\theta) = \lim_{N \to \infty} V''(\theta). \tag{40}$$

To establish the convergence of $V''(\theta_N)$, consider

$$\|V''(\theta_N) - \bar{V}''(\theta_0)\|_F \le \|\bar{V}''(\theta_N) - \bar{V}''(\theta_0)\|_F + \|V''(\theta_N) - \bar{V}''(\theta_N)\|_F. \tag{41}$$

By Theorem 1, $\theta_N$ converges to $\theta_0$ w.p.1, and since $\bar{V}''(\theta)$ is continuous by assumption, the first term on the right-hand side converges to zero. The second term tends to zero by a similar argument as in (34). Consequently, $V''(\theta_N) \to \bar{V}''(\theta_0)$ w.p.1. Next, assume that $\bar{V}''(\theta_0)$ is invertible (see footnote 6). For large N we then have

$$(\hat{\theta} - \theta_0) = -\{\bar{V}''(\theta_0)\}^{-1} V'(\theta_0) + o(V'(\theta_0)). \tag{42}$$

Footnote 6: This is guaranteed by the uniqueness condition (9) except in degenerate cases.

Let $V'_\eta$ denote the ηth component of the gradient $V'(\theta_0)$. Using (30) and the expression for the derivative of the projection matrix in Appendix B (B.3), we obtain

$$V'_\eta = 2\operatorname{Re}\Bigl[\sum_{k=1}^{d'} w_k^* \hat{E}_s^* A^{\dagger *} A_\eta^* P_A^\perp \hat{e}_k\Bigr] \tag{43}$$

where $w_k$ is the kth column of W. It follows from Lemma 4 that for large N, $\hat{E}_s = E_s + O_p(N^{-1/2})$, where the symbol $O_p(\cdot)$ denotes the "in probability" version of the corresponding deterministic notation; see [36, sec. 2.9].

Since $P_A^\perp(\theta_0)e_k = 0$, k = 1, ..., d', we have

$$V'_\eta = 2\operatorname{Re}\Bigl[\sum_{k=1}^{d'} w_k^* E_s^* A^{\dagger *} A_\eta^* P_A^\perp \hat{e}_k\Bigr] + o_p(N^{-1/2}). \tag{44}$$

Introduce $\tilde{e}_k = \hat{e}_k - E[\hat{e}_k]$, and note from Lemma 4 that $\hat{e}_k$ can be replaced by $\tilde{e}_k$:

$$V'_\eta = 2\operatorname{Re}\Bigl[\sum_{k=1}^{d'} w_k^* E_s^* A^{\dagger *} A_\eta^* P_A^\perp \tilde{e}_k\Bigr] + o_p(N^{-1/2}). \tag{45}$$

The asymptotic normality of $\hat{e}_k$, k = 1, ..., d', now implies that the gradient $V'(\theta_0)$ is also asymptotically normal. It follows that

$$\sqrt{N}\,V'(\theta_0) \in \operatorname{AsN}(0, Q) \tag{46}$$

where the ηξth element of Q is

$$Q_{\eta\xi} = \lim_{N\to\infty} N\,E\Bigl[2\operatorname{Re}\Bigl[\sum_{k=1}^{d'} w_k^* E_s^* A^{\dagger *} A_\eta^* P_A^\perp \tilde{e}_k\Bigr] \cdot 2\operatorname{Re}\Bigl[\sum_{l=1}^{d'} w_l^* E_s^* A^{\dagger *} A_\xi^* P_A^\perp \tilde{e}_l\Bigr]\Bigr], \qquad \eta, \xi = 1, \ldots, d. \tag{47}$$

Equations (42) and (46) imply that the asymptotic distribution of the estimation error is given by

$$\sqrt{N}(\hat{\theta} - \theta_0) \in \operatorname{AsN}(0, C) \tag{48}$$

where

$$C = (\bar{V}'')^{-1} Q\, (\bar{V}'')^{-1}. \tag{49}$$

The matrix $\bar{V}''$ is defined in (40) and Q is defined by (47). We are now ready to state the following result.

Theorem 2: Consider the subspace fitting method (30). The asymptotic distribution of the estimation error is given by (48), (49), where the ηξth elements of the matrices are

$$\bar{V}''_{\eta\xi} = -2\operatorname{Re}\bigl[\operatorname{tr}\{A_\xi^* P_A^\perp A_\eta A^\dagger E_s W E_s^* A^{\dagger *}\}\bigr] \tag{50}$$

$$Q_{\eta\xi} = 2\sigma^2 \operatorname{Re}\bigl[\operatorname{tr}\{A_\xi^* P_A^\perp A_\eta A^\dagger E_s W \Lambda_s \tilde{\Lambda}^{-2} W E_s^* A^{\dagger *}\}\bigr]. \tag{51}$$

Here, $A_\eta$ is the partial derivative of A w.r.t. the ηth parameter and all expressions are evaluated at the true value $\theta_0$.

Proof: The expressions (50) and (51) are derived in Appendix C. □

No explicit expression for the finite sample bias is given in Theorem 2, but notice from (42) and (45) that the asymptotic bias is of order $O(N^{-1})$, while from (48) the standard deviation is of order $O(N^{-1/2})$. Thus, the bias is negligible compared to the standard deviation for large N. The asymptotic distribution holds for arbitrary signal correlation, including full coherence (see footnote 7). Let us also emphasize that the covariance formula (50), (51) can be applied to any uniquely identifiable parametrization of A, including, for example, simultaneous estimation of azimuth and elevation.

Footnote 7: For ESPRIT, the result is of course valid only if S > 0.

For the parametrization $A \in \mathcal{A}^d$, we have

$$A_\eta = [0, \ldots, 0, d(\theta_\eta), 0, \ldots, 0] \tag{52}$$

$$d(\theta_\eta) = \left.\frac{\partial a(\theta)}{\partial \theta}\right|_{\theta = \theta_\eta}. \tag{53}$$

This leads to the following matrix expressions:

$$\bar{V}'' = -2\operatorname{Re}\{(D^* P_A^\perp D) \odot (A^\dagger E_s W E_s^* A^{\dagger *})^T\} \tag{54}$$

$$Q = 2\sigma^2 \operatorname{Re}\{(D^* P_A^\perp D) \odot (A^\dagger E_s W \Lambda_s \tilde{\Lambda}^{-2} W E_s^* A^{\dagger *})^T\} \tag{55}$$

where ⊙ denotes the elementwise (Hadamard) product and $D = [d(\theta_1), \ldots, d(\theta_d)]$.

The parametrization of the ESPRIT set, $A \in \mathcal{E}$, leads to more complicated matrix expressions; see [37] for details. The weighting matrix W has until now been regarded as known. However, the dML choice, $W = \tilde{\Lambda}$, and also the optimal choice (Theorem 3) depend on unknown quantities. Let us therefore show that W can be replaced by any (weakly) consistent estimate without affecting the large sample distribution of the DOA estimates.

Lemma 5: Let $\hat{W}$ be a consistent estimate of the "true" weighting matrix W, i.e., $\hat{W} = W + o_p(1)$. Let $\hat{\theta}$ be obtained from (30) using $\hat{W}$ and assume $\bar{\theta}$ is obtained when W is used. The difference $(\hat{\theta} - \bar{\theta})$ is then of order $o_p(1/\sqrt{N})$, i.e.,

$$\lim_{N\to\infty} \sqrt{N}\,(\hat{\theta} - \bar{\theta}) = 0 \quad \text{in probability.} \tag{57}$$

Proof: Write V(W) to stress the dependence of the criterion function on the weighting matrix. In view of (42) and (41), the result follows if $V'_\eta(\hat{W}) = V'_\eta(W) + o_p(1/\sqrt{N})$ and $V''_{\eta\xi}(\hat{W}) = V''_{\eta\xi}(W) + o_p(1)$. But the first equality follows immediately from (45) and the second equality is trivial. □

Some more convenient covariance formulae are obtained for the special choices of weighting matrix W that correspond to the methods of the previous section.

Corollary 1: Assume that S is invertible. Then,

i) The asymptotic covariance of the unweighted MD-MUSIC method (W = I) is given by (49) with

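$$\bar{V}'' = -2\operatorname{Re}\{(D^* P_A^\perp D) \odot (A^*A)^{-T}\}$$

$$Q = 2\sigma^2 \operatorname{Re}\{(D^* P_A^\perp D) \odot [(A^*ASA^*A)^{-1} + \sigma^2 (A^*ASA^*ASA^*A)^{-1}]^T\}. \tag{58}$$

ii) The asymptotic covariance of the deterministic ML method ($W = \tilde{\Lambda}$) is

$$C = \mathrm{CRB}_{DET} + (\bar{V}'')^{-1} Q\, (\bar{V}'')^{-1} \tag{59}$$

where

$$\bar{V}'' = -2\operatorname{Re}\{(D^* P_A^\perp D) \odot S^T\} \tag{60}$$

$$Q = 2\sigma^4 \operatorname{Re}\{(D^* P_A^\perp D) \odot (A^*A)^{-T}\} \tag{61}$$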

and where $\mathrm{CRB}_{DET}$ is the asymptotic deterministic Cramér-Rao lower bound as derived in [18], [29]. Proof: From the structure of the covariance R, straightforward manipulations give the following relations:

(63)

$$\tilde{\Lambda} = E_s^* A S A^* E_s. \tag{64}$$

Proof: The matrix inequality follows immediately from [11, lemma A.2]. For proving (70), notice first from (54), (55) that $\bar{V}'' = -\sigma^{-2} Q$. Hence, only the expression $A^\dagger E_s W_{opt} E_s^* A^{\dagger *}$ needs to be evaluated. The weighting matrix is $W_{opt}$

(68)

This in conjunction with Lemma 2 proves ii). □

The expression for the asymptotic deterministic ML covariance ii) is consistent with the expressions derived independently in [11] and [12]. Note that the deterministic ML variance is greater than the deterministic CRB. This was also observed in [29]. Let us emphasize that this bound assumes deterministic signal waveforms and is different from the CRB for Gaussian signals. Notice also that when S is diagonal (uncorrelated signals), the asymptotic covariance of ML coincides with the one of MUSIC [17], [18], [29].
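The covariance formulae (49), (54), and (55) are straightforward to evaluate numerically. The sketch below is an illustration only: it assumes a half-wavelength ULA with DOA's given in radians, builds the exact covariance $R = ASA^* + \sigma^2 I$, and supports three diagonal weightings (identity, the deterministic ML choice $\tilde{\Lambda}$, and $\tilde{\Lambda}^2\Lambda_s^{-1}$). The function names and the `weight` argument are hypothetical.

```python
import numpy as np

def ula(theta, m):
    """Steering matrix A and derivative matrix D of a half-wavelength ULA.
    theta is an array of DOA's in radians (assumed geometry)."""
    theta = np.asarray(theta, dtype=float)
    k = np.arange(m)[:, None]
    A = np.exp(1j * np.pi * k * np.sin(theta)[None, :])
    D = 1j * np.pi * k * np.cos(theta)[None, :] * A
    return A, D

def asymptotic_cov(theta, S, sigma2, m, weight="opt"):
    """Per-snapshot asymptotic covariance C of (49), using (54) and (55).
    Cov(theta_hat) is then approximately C / N, in radians squared."""
    A, D = ula(theta, m)
    R = A @ S @ A.conj().T + sigma2 * np.eye(m)
    lam, E = np.linalg.eigh(R)
    order = np.argsort(lam)[::-1]
    d_prime = np.linalg.matrix_rank(S)
    lam_s = lam[order[:d_prime]]              # signal eigenvalues
    Es = E[:, order[:d_prime]]                # signal eigenvectors
    lam_t = lam_s - sigma2                    # diagonal of Lambda tilde
    if weight == "opt":
        w = lam_t ** 2 / lam_s
    elif weight == "ml":
        w = lam_t
    elif weight == "identity":
        w = np.ones(d_prime)
    else:
        raise ValueError("unknown weight")
    W = np.diag(w)
    A_pinv = np.linalg.pinv(A)
    P_perp = np.eye(m) - A @ A_pinv
    H = D.conj().T @ P_perp @ D
    B = A_pinv @ Es
    G1 = B @ W @ B.conj().T
    G2 = B @ W @ np.diag(lam_s / lam_t ** 2) @ W @ B.conj().T
    V2 = -2.0 * np.real(H * G1.T)             # (54); * is the Hadamard product
    Q = 2.0 * sigma2 * np.real(H * G2.T)      # (55)
    V2_inv = np.linalg.inv(V2)
    return V2_inv @ Q @ V2_inv                # (49)
```

For example, `np.sqrt(asymptotic_cov(theta, S, 1.0, 4, "ml")[0, 0] / N)` gives a theoretical standard deviation (in radians) for the first DOA estimate after N snapshots under these assumptions.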

D. Performance Optimization

Given the result of Theorem 2, it is natural to ask if there exists an optimal weighting for the subspace fitting method. In other words, is there a weighting $W_{opt}$ which minimizes the estimation error variance?

Theorem 3: Let the matrix-valued function C(W) be defined by (49), (54), (55). Then for all Hermitian matrices W

$$C(W) \ge C(\tilde{\Lambda}^2 \Lambda_s^{-1}) \tag{69}$$

where the matrix inequality means that the difference $C(W) - C(\tilde{\Lambda}^2 \Lambda_s^{-1})$ is positive semidefinite. The asymptotic covariance for the optimal subspace fitting method is

.,

- o ' I.

(71)

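Using this and the fact that $A^\dagger E_s \tilde{\Lambda} E_s^* A^{\dagger *} = S$ from (63), in (73) gives

$$\begin{aligned}
A^\dagger E_s W_{opt} E_s^* A^{\dagger *} &= A^\dagger E_s(\tilde{\Lambda} + \sigma^4 \Lambda_s^{-1} - \sigma^2 I)E_s^* A^{\dagger *} \\
&= S - (\sigma^{-2} S A^* A + I)^{-1} S \\
&= S - (I - S A^*(A S A^* + \sigma^2 I)^{-1} A)\,S \\
&= S A^* R^{-1} A S.
\end{aligned} \tag{74}$$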

i ng that P, = E." E =: shows i). The result ii) is obtained by noting that

S + a 2C4 *.4 ) - I.

-I

(73)

Applying the above expressions to (54) and (55) and not-

* ==

4

-A(a- 2SA*A + /)-'SA*.

(67)

7

-

== A + a As

Applying the matrix inversion lemma, the expression above can be written as

Notice that

7

-(

(72)

(66)

A EsA.\E7ft

-:2

A As

It follows immediately from (64) that

To prove i), assume that S is full rank, (64) then gives after some manipulation (65)

=

(75)

The implication of Theorem 3 is of considerable theoretical interest. It follows that the optimal subspace fitting method, referred to as the weighted subspace fitting (WSF) method, never performs worse than ML and in general it outperforms ML. As we will see, the result is also of practical interest since the difference can be large when the sources are highly correlated. It should be noted here that Stoica and Nehorai [11] have reported a special case when the one-dimensional MUSIC method gives smaller asymptotic variance than ML.

V. NUMERICAL EXAMPLES AND SIMULATIONS

In this section we present some numerical examples to compare the performance of the discussed methods. Simulations are also included to investigate the applicability of the theoretical results obtained in the previous section. For the theoretical curves, Theorem 2 is used in the pragmatic form $\operatorname{Cov}(\hat{\theta}) \approx N^{-1} C$. The theoretical results assert the quality of the local maximum of the criterion function closest to $\theta_0$. The estimates are therefore calculated by initializing a Gauss-Newton type descent method at the true DOA's. In practice, other methods of initialization would, of course, have to be considered and the question of global convergence arises. This issue is not addressed here. The WSF method is implemented using the weights $\hat{W}_{opt} = (\hat{\Lambda}_s - \hat{\sigma}^2 I)^2 \hat{\Lambda}_s^{-1}$, where the noise variance is estimated as the average of the m - d' smallest eigenvalues of $\hat{R}$. In both examples, a uniform linear array (ULA) of half-wavelength element spacing is assumed, the array manifold vector having length $\sqrt{m}$ and first element 1.
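The weight estimate described above can be sketched in a few lines. This is an illustration rather than the authors' implementation; the function name `wsf_weights` is hypothetical, and the sample covariance is assumed to be Hermitian with d' known.

```python
import numpy as np

def wsf_weights(R_hat, d_prime):
    """Estimate the signal subspace, the WSF weighting matrix
    (Lambda_s - sigma^2 I)^2 Lambda_s^{-1}, and the noise variance
    from a Hermitian sample covariance R_hat."""
    lam, E = np.linalg.eigh(R_hat)        # ascending eigenvalues
    lam, E = lam[::-1], E[:, ::-1]        # reorder to descending
    sigma2_hat = lam[d_prime:].mean()     # average of the m - d' smallest
    lam_s = lam[:d_prime]
    W_hat = np.diag((lam_s - sigma2_hat) ** 2 / lam_s)
    return E[:, :d_prime], W_hat, sigma2_hat
```

By Lemma 5, replacing the true weighting by a consistent estimate of this kind does not change the large-sample distribution of the DOA estimates.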

Two emitters of equal power are symmetrically located with respect to the array broadside. For each case, 500 independent trials are run and the standard deviation of the first DOA estimate is calculated. The bias observed in the simulations is less than 20% of the standard deviation for all cases and methods. To obtain correct results in the region where the standard deviation is of the same order as the DOA separation, a DOA/signal association must be made. This was done by comparing the estimated signal waveforms, obtained from (16), with the true ones. The CRB under the Gaussian signal assumption [1], [2] is also displayed in the figures.

[Fig. 1 plot: legend MD-MUSIC, Det. ML, WSF, CRB; DOAs: 2°, -2°; number of sensors: 6; correlation: 0; SNR: 13 dB; horizontal axis: N (number of snapshots).]

A. Example 5.1: Uncorrelated Signals

In the first example, we examine how many snapshots are needed for the asymptotic results to be valid for a "typical case." A six-element ULA is used, and the signal waveforms are uncorrelated and both of power level 13 dB above the noise. The DOA's are 2° and -2°, respectively. In Fig. 1, the standard deviation of the first DOA estimate is plotted versus the number of snapshots. The theoretical values are displayed with lines, while the simulated values are represented by +, o, and x. The figure illustrates that the theoretical results agree very well from about 5 snapshots for this case. The theoretical variance of the WSF estimates is identical to the CRB up to numerical precision. Also, the ML variance is indistinguishable from the CRB in Fig. 1, but they are not identical. Clearly, the unweighted MD-MUSIC method performs notably worse. This is somewhat unexpected since the one-dimensional MUSIC is known to have the same asymptotic performance as ML for uncorrelated sources [10]. □

B. Example 5.2: Correlated Signals

This example is chosen to demonstrate the extreme importance of the weighting matrix W in WSF when the signals are highly correlated. A four-element ULA is assumed and the DOA's are 5° and -5°. The waveforms have a 99% correlation with correlation phase 0° (at the first sensor), i.e., the signal covariance matrix is

$$S = 10^{(0.1\,\mathrm{SNR})} \times \begin{bmatrix} 1 & 0.99 \\ 0.99 & 1 \end{bmatrix}. \tag{76}$$
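The scenario of this example can be set up as follows. The sketch assumes unit noise variance (so that SNR in dB scales S directly, as in (76)), circular complex Gaussian waveforms and noise, and a half-wavelength ULA; the function names and the use of a Cholesky factor to color the waveforms are illustrative choices, not taken from the paper.

```python
import numpy as np

def signal_covariance(snr_db, rho):
    """Signal covariance of (76): two equal-power sources, correlation rho."""
    return 10.0 ** (0.1 * snr_db) * np.array([[1.0, rho], [rho, 1.0]])

def simulate_snapshots(doas_deg, S, m, N, rng):
    """Draw N array snapshots x = A s + n and return them together with
    the sample covariance (1/N) X X^*. Noise variance is assumed to be 1."""
    theta = np.deg2rad(np.asarray(doas_deg, dtype=float))
    k = np.arange(m)[:, None]
    A = np.exp(1j * np.pi * k * np.sin(theta)[None, :])
    d = S.shape[0]
    L = np.linalg.cholesky(S)                  # S = L L^T, S positive definite
    s = L @ (rng.standard_normal((d, N)) + 1j * rng.standard_normal((d, N))) / np.sqrt(2)
    n = (rng.standard_normal((m, N)) + 1j * rng.standard_normal((m, N))) / np.sqrt(2)
    X = A @ s + n
    return X, X @ X.conj().T / N
```

A call such as `simulate_snapshots([5.0, -5.0], signal_covariance(10.0, 0.99), 4, 200, np.random.default_rng(0))` reproduces the array size, DOA's, correlation, and snapshot count of this example for one SNR value.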

The number of snapshots is 200. In Fig. 2, the standard deviation of the first DOA estimate is plotted versus the SNR of the signals. Again, the theoretical standard deviation of WSF is equal to the CRB up to numerical precision. Deterministic ML performs notably worse for SNR's below 8 dB. Notice the severe degradation of the MD-MUSIC method for this scenario. □

Fig. 1. Standard deviation of DOA estimate versus the number of snapshots.

[Fig. 2 plot: legend MD-MUSIC, Det. ML, WSF, CRB; DOAs: 5°, -5°; number of sensors: 4; correlation: 0.99; number of snapshots: 200; vertical axis: standard deviation in degrees; horizontal axis: SNR in dB.]

Fig. 2. Standard deviation of DOA estimate versus SNR.

VI. CONCLUSIONS

This paper attempts to collect several algorithms and versions of algorithms under a unified framework. Algebraic relations are presented and, based on this, asymptotic expressions for the estimation error are derived. The proposed framework allows a unified derivation of the asymptotic properties of several subspace fitting based methods. A weighted subspace fitting method is introduced and its asymptotic properties are derived for a general weighting and arbitrary signal correlation (including full coherence). This result includes the asymptotic distribution of the estimation error for deterministic ML and MD-MUSIC as special cases. The optimal weighting matrix is derived, resulting in the WSF method which always outperforms the deterministic ML method. Numerical examples presented herein and in [38] indicate that the asymptotic variance of the WSF estimates coincides with the CRB for the Gaussian signal and noise model. There are other advantages in viewing sensor array processing as a subspace fitting problem. Extensions of the ESPRIT algorithm become clear in this framework, e.g., multiple array invariances [39]. Furthermore, the methods within the framework can be implemented by means of the same optimization algorithm. The examples also demonstrate that the difference in performance between the methods can be significant in many cases, i.e., the choice of subspace weighting can be crucial. The simulation study indicates that the theoretical

results can also be found in [27]. For ease of notation we write P instead of $P_A(\theta)$. Consider the first derivative

variances are valid for a large range of information-to-noise ratios.

APPENDIX A
TOTAL LEAST SQUARES AND SUBSPACE FITTING

Pry

Consider a system of linear equations

and let A and iJ represent noisy observations of the linearly related matrices A o and Bi; Let A and jj be the solution of ..\.8

II i1- B- IIi r

A - .4

min H. 'I'

I

11

I £~ £1

1-

i

f'

B I B'V i

1)

111F

min

ill

t:

1I'. (I>. t ;: II E' . . !I

,r I

I'

r

(

(B.3)

(B.4)

Using (B.2) and that P;- ==
~

P'I~

==

~

~

-P-LA~A'''AlJA''

- A-*~4rpl.£~lJA'·

+

P~'-\TJC--\* . 4t)-·1~4;P~

T

Pl.A1J~A·"

-

P-L.'-\)1.--\".4~A"

$+\ (\cdots)^*$. (B.5)

APPENDIX C
PROOF OF THEOREM 2

Evaluate the expressions (40) and (47). The derivatives of the criterion function (30) with respect to the parameters $\theta_\eta$ and $\theta_\xi$ are

,..

(.;\.5)

which is (2-\.). Remark I: There are some technical difficulties associated with the TLS ESPRIT algorithm and the reforrnulation as J subspace fitting problem. From the proof of [33. theorem 12.3.11. 'P T LS exists uniquely iff V22 is invertible and if It! > It!~ I' where V2~ and I{ are defined in (23). Since the set of matrices lEI E~l having any of these' 'degeneracies" is of Lebesgue measure zero. 'V T LS exists uniquely w. p. 1. The second technicality regards the variable change \11 == T - I . eve n if the fo rmer is de fect ive. [J ApPENDIX

(B.2)

.

_A7+pl.AA~+(···)*. P ry~ -PJ...AA7+p.lA· ~ rl }7~ TJ ~

1Til ~

I> I

7AlJA 7

where the notation $(\cdots)^*$ means that the same expression appears again with complex conjugate and transpose. From (B.3) it is easily verified that $\operatorname{tr}(P_\eta) = 0$, as expected, since the trace of a projection matrix depends only on the dimension of the subspace onto which it projects. The second derivative is given by

==

This straightforward subspace fitting formulation of the TLS problem has the advantage of precisely specifying what criterion function (without constraints) is minimized. For ESPRIT, the underlying linear relation is $E_2 = E_1 \Psi$. The variable change $\Psi = T^{-1}\Phi T$ and $\Gamma = B T^{-1}$, where T is an arbitrary d × d matrix of full rank and Φ is diagonal, leads to

(B. 1)

$$P_\eta = P_A^\perp A_\eta A^\dagger + (\cdots)^*$$
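The first-derivative formula for the projection matrix, $P_\eta = P_A^\perp A_\eta A^\dagger + (\cdots)^*$, can be checked against a finite-difference approximation. The sketch below assumes a half-wavelength ULA with DOA's in radians; the function names are hypothetical and serve only this check.

```python
import numpy as np

def steering(theta, m):
    """Half-wavelength ULA steering matrix (assumed geometry), theta in radians."""
    k = np.arange(m)[:, None]
    return np.exp(1j * np.pi * k * np.sin(theta)[None, :])

def projector(theta, m):
    """P_A(theta) = A (A^* A)^{-1} A^* = A A^+."""
    A = steering(theta, m)
    return A @ np.linalg.pinv(A)

def projector_derivative(theta, eta, m):
    """Analytic derivative of P_A with respect to theta[eta]."""
    A = steering(theta, m)
    A_pinv = np.linalg.pinv(A)
    P_perp = np.eye(m) - A @ A_pinv
    k = np.arange(m)
    A_eta = np.zeros_like(A)                   # only column eta depends on theta[eta]
    A_eta[:, eta] = 1j * np.pi * k * np.cos(theta[eta]) * A[:, eta]
    T = P_perp @ A_eta @ A_pinv
    return T + T.conj().T                      # the (...)^* term
```

Comparing `projector_derivative(theta, 0, m)` with a central difference of `projector` in the first DOA gives agreement to within the discretization error, and the trace of the analytic derivative is numerically zero.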

(A.2)

(A.4)

~

t

Combining (B. 1) and (B.2) gives

Then any estimate of X o, as formulated in (33). Introduce A == A - A and B == B - B. Hence. the TLS solution can be recast as follows:

I (I

t

A~ == (A*A)-'A:P.L - A

(IJ - B)X. (A.3) XT LS satisfying A - .4: == (B - B).tYT LS is a TLS subject to

$$P_\eta = \frac{\partial P}{\partial \theta_\eta} = A_\eta A^\dagger + A A^\dagger_\eta.$$

After some algebraic manipulations the following is obtained for the pseudoinverse:

(A.I)

. rnm

=

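$$V'_\eta = \operatorname{tr}\{P_\eta \hat{E}_s W \hat{E}_s^*\} \tag{C.1}$$

$$V''_{\eta\xi} = \operatorname{tr}\{P_{\eta\xi} \hat{E}_s W \hat{E}_s^*\} \tag{C.2}$$

where $P_\eta$ and $P_{\eta\xi}$ are obtained from (B.3) and (B.5), respectively. The ηξth element of $\bar{V}''$, evaluated at $\theta_0$, is given by

$$\bar{V}''_{\eta\xi} = \lim_{N\to\infty} \operatorname{tr}\{P_{\eta\xi}(\theta_0)\hat{E}_s W \hat{E}_s^*\}. \tag{C.3}$$

Applying (B.5) and noting that $P_A^\perp E_s = 0$ leads to

$$\bar{V}''_{\eta\xi} = \operatorname{tr}\{P_{\eta\xi}(\theta_0) E_s W E_s^*\} = -2\operatorname{Re}\bigl[\operatorname{tr}\{A_\xi^* P_A^\perp A_\eta A^\dagger E_s W E_s^* A^{\dagger *}\}\bigr].$$

Consider now the matrix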

(47) as

l

Q'I~ = ,~i~oo NE 2

B

Re

Q. defined in

(C.4)

(47). Rewrite

[d' d' k~1 {~I

. (Wi*E*A-;"*A*PJ.. *E*A 7*A*Pl.>, ,t e-k WI.\ ~ A eI

DIFFERENTIATION OF THE PROJECTION MATRIX

1"J

In this appendix, the expressions for the first and second derivatives of the projection matrix $P_A(\theta) = A(A^*A)^{-1}A^* = AA^\dagger$ with respect to the elements in θ are derived. These

-)]l

AtEs W k WI*E*At*A*p.l + e-*P.lA k .~ lJ s ~.o1 eI

_.

(C.5)

Using Lemma 4, this reduces to

· Re {w:E;A

tE

7*A!PiE"E:.PiA

rtA

s lVk }

(C.6)

where we have used that A. k

=

a:!

for k

= d' +

1, .

m, and that E Z'= £i' + I e, et EllE: . Noting that Pi EnE: = EsE; = Pi gives

rso -

(C.?)

where

REFERENCES

[1] R. O. Schmidt, "A signal subspace approach to multiple emitter location and spectral estimation," Ph.D. dissertation, Stanford Univ., Stanford, CA, Nov. 1981.
[2] W. J. Bangs, "Array processing with generalized beamformers," Ph.D. dissertation, Yale Univ., New Haven, CT, 1971.
[3] J. F. Böhme, "Estimation of spectral parameters of correlated signals in wavefields," Signal Processing, vol. 10, pp. 329-337, 1986.
[4] J. F. Böhme, "Estimation of source parameters by maximum likelihood and nonlinear regression," in Proc. ICASSP 84, 1984, pp. 7.3.1-7.3.4.
[5] M. Wax, "Detection and estimation of superimposed signals," Ph.D. dissertation, Stanford Univ., Stanford, CA, Mar. 1985.
[6] Y. Bresler and A. Macovski, "Exact maximum likelihood parameter estimation of superimposed exponential signals in noise," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, pp. 1081-1089, Oct. 1986.
[7] M. Feder and E. Weinstein, "Parameter estimation of superimposed signals using the EM algorithm," IEEE Trans. Acoust., Speech, Signal Processing, vol. 36, pp. 477-489, Apr. 1988.
[8] I. Ziskind and M. Wax, "Maximum likelihood localization of multiple sources by alternating projection," IEEE Trans. Acoust., Speech, Signal Processing, vol. 36, pp. 1553-1560, Oct. 1988.
[9] D. Starer and A. Nehorai, "Maximum likelihood estimation of exponential signals in noise using a Newton algorithm," in Proc. 4th ASSP Workshop Spectrum Estimation Modeling (Minneapolis, MN), Aug. 1988, pp. 240-245.
[10] P. Stoica and A. Nehorai, "MUSIC, maximum likelihood and Cramér-Rao bound," in Proc. ICASSP 88 Conf. (New York, NY), Apr. 1988, pp. 2296-2299.
[11] P. Stoica and A. Nehorai, "MUSIC, maximum likelihood, and Cramér-Rao bound: Further results and comparisons," Tech. Rep. 8819, Yale Univ., New Haven, CT, 1988; also IEEE Trans. Acoust., Speech, Signal Processing, vol. 38, no. 12, pp. 2140-2150, Dec. 1990.
[12] B. Ottersten and M. Viberg, "Asymptotic results for multidimensional sensor array processing," in Proc. 22nd Asilomar Conf. Signals, Syst., Comput. (Monterey, CA), Nov. 1988, pp. 833-837.
[13] G. Bienvenu and L. Kopp, "Adaptivity to background noise spatial coherence for high resolution passive methods," in Proc. IEEE ICASSP (Denver, CO), 1980, pp. 307-310.
[14] K. Sharman, T. S. Durrani, M. Wax, and T. Kailath, "Asymptotic performance of eigenstructure spectral analysis methods," in Proc. ICASSP 84 Conf. (San Diego, CA), Mar. 1984, pp. 45.5.1-45.5.4.
[15] D. J. Jeffries and D. R. Farrier, "Asymptotic results for eigenvector methods," Proc. Inst. Elec. Eng., F, vol. 132, no. 7, pp. 589-594, June 1985.
[16] M. Kaveh and A. J. Barabell, "The statistical performance of the MUSIC and the minimum-norm algorithms in resolving plane waves in noise," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, pp. 331-341, Apr. 1986.
[17] B. Porat and B. Friedlander, "Analysis of the asymptotic relative efficiency of the MUSIC algorithm," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-36, pp. 532-544, Apr. 1988.
[18] H. Clergeot, S. Tressens, and A. Ouamri, "Performance of high resolution frequencies estimation methods compared to the Cramér-Rao bounds," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-37, no. 11, pp. 1703-1720, Nov. 1989.
[19] R. H. Roy, "ESPRIT, estimation of signal parameters via rotational invariance techniques," Ph.D. dissertation, Stanford Univ., Stanford, CA, Aug. 1987.
[20] J. A. Cadzow, "A high resolution direction-of-arrival algorithm for narrow-band coherent and incoherent sources," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-30, pp. 965-979, July 1988.
[21] R. Roy, A. Paulraj, and T. Kailath, "ESPRIT-A subspace rotation approach to estimation of parameters of cisoids in noise," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, no. 4, pp. 1340-1342, Oct. 1986.
[22] B. D. Rao and K. V. S. Hari, "Performance analysis of ESPRIT and TAM in determining the direction of arrival of plane waves in noise," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-37, pp. 1990-1995, Dec. 1989.
[23] R. Roy and T. Kailath, "ESPRIT-estimation of signal parameters via rotational invariance techniques," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-37, no. 7, pp. 984-995, July 1989.
[24] M. Wax and I. Ziskind, "On unique localization of multiple sources by passive sensor arrays," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-37, no. 7, pp. 996-1000, July 1989.
[25] M. Wax and T. Kailath, "Detection of signals by information theoretic criteria," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, no. 2, pp. 387-392, Apr. 1985.
[26] M. Wax and I. Ziskind, "Detection of the number of coherent signals by the MDL principle," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-37, no. 8, pp. 1190-1196, Aug. 1989.
[27] G. Golub and V. Pereyra, "The differentiation of pseudoinverses and nonlinear least squares problems whose variables separate," SIAM J. Numer. Anal., vol. 10, pp. 413-432, 1973.
[28] G. H. Golub and C. F. Van Loan, "An analysis of the total least squares problem," SIAM J. Numer. Anal., vol. 17, pp. 883-893, 1980.
[29] P. Stoica and A. Nehorai, "MUSIC, maximum likelihood and Cramér-Rao bound," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 720-741, May 1989.
[30] W. F. Stout, Almost Sure Convergence. New York: Academic, 1974.
[31] J. H. Wilkinson, The Algebraic Eigenvalue Problem. Oxford, U.K.: Clarendon, 1965.
[32] G. W. Stewart, "Error and perturbation bounds for subspaces associated with certain eigenvalue problems," SIAM Rev., vol. 15, pp. 727-764, 1973.
[33] G. H. Golub and C. F. Van Loan, Matrix Computations, 2nd ed. Baltimore, MD: Johns Hopkins University Press, 1989.
[34] T. W. Anderson, "Asymptotic theory for principal component analysis," Ann. Math. Statist., vol. 34, pp. 122-148, 1963.
[35] L. Ljung, System Identification: Theory for the User. Englewood Cliffs, NJ: Prentice-Hall, 1987.
[36] K. V. Mardia, J. T. Kent, and J. M. Bibby, Multivariate Analysis. London: Academic, 1979.
[37] B. Ottersten, M. Viberg, and T. Kailath, "Performance analysis of the total least squares ESPRIT algorithm," IEEE Trans. Signal Processing, this issue, pp. 1122-1135.
[38] B. Ottersten and L. Ljung, "Asymptotic results for sensor array processing," in Proc. ICASSP 89

Direction-of-Arrival Estimation via Exploitation of Cyclostationarity-A Combination of Temporal and Spatial Processing

Guanghan Xu, Student Member, IEEE, and Thomas Kailath, Fellow, IEEE

Abstract-Many modulated communication signals exhibit a cyclostationarity (or periodic correlation) property, corresponding to the underlying periodicity arising from carrier frequencies or baud rates. By exploiting cyclostationarity, i.e., evaluating the cyclic correlations of the received data at certain cycle frequencies, we can extract the cyclic correlations of only signals with the same cycle frequency and null out the cyclic correlations of stationary additive noise and all other co-channel interferences with different cycle frequencies. Thus, the signal detection capability can be significantly improved. Cyclic correlation was first introduced into array signal processing by Gardner et al. However, their approach was based on a narrow-band data model, which is only exact for perfect narrow-band sinusoids; hence, their algorithms are approximate for general cyclostationary sources. In this paper, we propose a new approach for exploiting cyclostationarity that is asymptotically exact for either narrow-band or broad-band sources. Moreover, the new method also has significant implementational advantages over the earlier techniques. The simulation results indicate a significantly better performance for the new method in some environments.

Manuscript received March 3, 1990; revised June 5, 1991. This work was supported in part by the Joint Services Program at Stanford University (U.S. Army, U.S. Navy, and U.S. Air Force) under Contract DAAL0388-C-0011 and by NASA under Contract NAGW41951, S5. The authors are with the Information Systems Laboratory, Stanford University, Stanford, CA 94305. IEEE Log Number 9200249.

I. INTRODUCTION

CONVENTIONAL array processing methods basically rely on the spatial properties (e.g., spatial delay) of the signals impinging on an array of sensors. The scenario that most conventional algorithms assume is that the sources under consideration are narrow band having the same center frequency, and that their temporal samples are uncorrelated. With these assumptions, in the now familiar subspace formulation (see, e.g., [1], [2]), the data vector from each source spans a one-dimensional subspace and the direction-of-arrival (DOA) estimation problem then becomes one of finding the signal subspace. Many algorithms, e.g., MUSIC and ESPRIT [3], [4], have been proposed to estimate the DOA's by searching for the signal subspace. In practice, however, the narrow-band data assumption is only approximate, especially if a source signal has a significant bandwidth; then the signal-subspace based algorithms mentioned above can only be approximate solutions. Another shortcoming of the above approaches is that they ignore the temporal properties of the signals of interest. In applications such as radar and sonar tracking, the signals of interest may have rich temporal properties that can be exploited to eliminate interferences and background noise. Nevertheless, it is very difficult, in general, to combine efficiently the temporal and spatial information of the signal in determining the source DOA's. In this paper, we attempt to solve a special but important case of this general problem, viz., the problem of estimating the DOA's of signals exhibiting cyclostationarity or periodic correlation (see Section II for the definition). Generally speaking, by exploiting this special temporal property of the signals, we can null out or greatly reduce the effect of other co-band jammers and background noise [5], [6]. The cyclostationarity concept was first introduced into array signal processing by Gardner [7] and Schell et al. [8]. In their approaches, the correlation matrix estimate used in the general subspace algorithms is replaced by a cyclic cross-correlation matrix estimate (see Section II). The computation of the cyclic correlation, a kind of temporal processing, helps to raise the SNR and eventually to improve the detection capability [8]. Another advantage of their approaches is the ability to separate signals having different cycle frequencies and thus to increase the number of detectable signals. However, the cyclic correlation methods are still based on the narrow-band assumption and thus yield larger root-mean-square error (RMSE) as bandwidth increases. The purpose of this paper is to introduce a new type of algorithm [9] that not only exploits the underlying signal spectral correlation properties but is also asymptotically exact for both broad-band and narrow-band signals. In the following, we first give a short description of the concepts of cyclostationarity and spectral correlation.
Then, a more general data model is introduced, followed by presentation of a new algorithm for estimating DOA's by exploiting the cyclostationarity property. To understand the difference between the existing cyclic algorithms and the proposed method, a brief description of the existing approaches is presented and these are compared with the new technique, with respect to performance and implementation. The simulation results and discussions are given in Section V.

Reprinted from IEEE Transactions on Signal Processing, Vol. 40, No.7, pp. 1175-1786, July 1992.


II. CYCLOSTATIONARITY AND SPECTRAL CORRELATION AND OUR DATA MODEL

A random process $x(t)$ is called wide-sense cyclostationary if its mean $m_x(t) = E\{x(t)\}$ and autocorrelation $R_x(t, u) = E\{x(t)x^*(u)\}$ are periodic with some period, say $1/\alpha$ (see, e.g., [5]):

$$m_x(t + k/\alpha) = m_x(t) \tag{1}$$

$$R_x(t + k/\alpha,\; u + k/\alpha) = R_x(t, u) \tag{2}$$

where $k$ is an integer and the asterisk denotes complex conjugation. Such signals are common in communication applications, where the period arises from carrier frequencies and baud rates. The textbook [5] is an excellent reference for results on cyclostationary or periodically correlated processes. More generally, a nonstationary process $x(t)$ is said to exhibit cyclostationarity if there exists a cycle frequency $\alpha$ for which the cyclic autocorrelation function

$$R_x^\alpha(\tau) = \lim_{T\to\infty} \frac{1}{T}\int_{-T/2}^{T/2} R_x\!\left(t + \frac{\tau}{2},\; t - \frac{\tau}{2}\right) e^{-j2\pi\alpha t}\, dt \tag{3}$$

is not identically zero. Note that for a stationary process, $R_x^\alpha(\tau)$ will be zero for all nonzero $\alpha$. The cyclic spectrum is defined as

$$S_x^\alpha(f) = \lim_{\Delta f\to 0}\;\lim_{\Delta t\to\infty} \frac{1}{\Delta t}\int_{-\Delta t/2}^{\Delta t/2} \Delta f\, X_{1/\Delta f}\!\left(t, f + \frac{\alpha}{2}\right) X^*_{1/\Delta f}\!\left(t, f - \frac{\alpha}{2}\right) dt \tag{4}$$

where

$$X_{1/\Delta f}(t, v) = \int_{t - 1/(2\Delta f)}^{t + 1/(2\Delta f)} x(u)\, e^{-j2\pi v u}\, du.$$

Moreover, as shown in [6],

$$S_x^\alpha(f) = \int_{-\infty}^{\infty} R_x^\alpha(\tau)\, e^{-j2\pi f\tau}\, d\tau. \tag{5}$$

Therefore, if a process $x(\cdot)$ exhibits cyclostationarity with cycle frequency $\alpha$ in the time domain, then in the frequency domain it exhibits spectral correlation at a shift $\alpha$. For convenience in what follows, we shall only use the cyclic correlation function and denote it by

$$R_x^\alpha(\tau) = \left\langle x(t + \tau/2)\, x^*(t - \tau/2)\, e^{-j2\pi\alpha t} \right\rangle \tag{6}$$

where $\langle\cdot\rangle$ denotes the time average $\lim_{T\to\infty} \frac{1}{T}\int_{-T/2}^{T/2} (\cdot)\, dt$. When the complex conjugate in (6) is omitted, the resulting quantity is the conjugate cyclic correlation. There are some signals that have both nonzero cyclic correlation and nonzero conjugate cyclic correlation, e.g., binary phase shift keying (BPSK). For more details, see [5], [6], [10], [11].

A. General Data Model

To simplify the following presentation, we assume that all the sensors of an array have the same linear time-invariant response and that the sensors are uniformly spaced on a line. (The analysis can be extended to a more general array with arbitrary sensor locations if each sensor impulse response $f_i(t, \theta)$ is factorizable in the temporal and spatial domains, i.e., $f_i(t, \theta) = h_i(t)\, g_i(\theta)$, $1 \le i \le m$.) Then the data received by the array at the time instant $t$ can be expressed as

$$\mathbf{x}(t) = [x_1(t), \ldots, x_m(t)]^T \tag{7}$$

where

$$x_i(t) = \sum_{k=1}^{d} s_k\!\left(t + (i - 1)\frac{D\sin\theta_k}{c}\right) + n_i(t), \qquad i = 1, 2, \ldots, m \tag{8}$$

where $s_k(\cdot)$ and $\theta_k$ are the received signal and the DOA arising from the $k$th source; $c$ is the wave speed and $D$ represents the separation of the neighboring sensors. The noises $n_i(\cdot)$ at the different sensors are assumed to be jointly stationary (so that they have identically zero cyclic self and cross correlation). In fact, as will become clear, this assumption can be weakened, and it is only required that the noises not be cyclically correlated with themselves and with other source signals at the cycle frequency $\alpha$ of interest. We assume that the $d$ sources are mutually cyclically uncorrelated,¹ i.e., that the cyclic cross correlation of any pair of sources is zero at each cycle frequency $\alpha$ of interest. It is worth noting that this condition is weaker than the one requiring that the sources be mutually uncorrelated, because uncorrelated signals are always cyclically uncorrelated with each other, while cyclically uncorrelated processes can be mutually correlated. It is also assumed that there are $d_\alpha \le d$ sources sharing the same cycle frequency $\alpha$, where $\alpha$ is either known or estimated from the data.
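As a rough illustration of the time-average definition (6), the sketch below estimates a cyclic autocorrelation from samples. It is a minimal example under stated assumptions, not the paper's estimator: the function name `cyclic_autocorr`, the sampling rate, and the test signal (a real sinusoid, whose lag product has a spectral line at twice its frequency) are all hypothetical, and the discrete lag is applied asymmetrically, which differs from the symmetric form in (6) only by a phase factor.

```python
import numpy as np

def cyclic_autocorr(x, alpha, lag, fs):
    """Finite time average of x[n+lag] * conj(x[n]) * exp(-j 2 pi alpha n / fs).

    Assumption: asymmetric lag (x[n+lag], x[n]) instead of the symmetric
    (t + tau/2, t - tau/2) form of (6).
    """
    n_avg = len(x) - lag
    n = np.arange(n_avg)
    product = x[lag:] * np.conj(x[:n_avg])
    return np.mean(product * np.exp(-2j * np.pi * alpha * n / fs))

fs = 64.0                      # hypothetical sampling rate
f0 = 8.0                       # hypothetical sinusoid frequency
t = np.arange(1024) / fs
x = np.cos(2 * np.pi * f0 * t)

# The lag product of the sinusoid is periodic with frequency 2*f0,
# so 2*f0 is a cycle frequency while an unrelated frequency is not.
r_cyclic = cyclic_autocorr(x, alpha=2 * f0, lag=0, fs=fs)
r_other = cyclic_autocorr(x, alpha=5.0, lag=0, fs=fs)
print(abs(r_cyclic), abs(r_other))
```

For this test signal the estimate at the cycle frequency has magnitude one quarter of the squared amplitude, while the estimate at the unrelated frequency averages out to numerical zero.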

B. Narrow-Band Data Model

If all the $s_k(\cdot)$ are narrow band with the same center frequency $f_0$, then we can write, approximately,

$$x_i(t) \approx \sum_{k=1}^{d} s_k(t)\, \exp[j2\pi f_0 (i - 1) D \sin\theta_k / c] + n_i(t) \tag{9}$$

and

$$\mathbf{x}(t) \approx \mathbf{A}(f_0)\,\mathbf{s}(t) + \mathbf{n}(t) \tag{10}$$

where $\mathbf{A}(f_0) = [\mathbf{a}_1(f_0), \ldots, \mathbf{a}_d(f_0)]$, $\mathbf{a}_k(f_0) = [1, \ldots, \exp[j2\pi f_0 (m - 1) D \sin\theta_k / c]]^T$, $\mathbf{s}(t) = [s_1(t), \ldots, s_d(t)]^T$, and $\mathbf{n}(t) = [n_1(t), \ldots, n_m(t)]^T$. This narrow-band model is exact only when the sources are perfectly coherent sinusoidal waveforms.

¹ As will be clear in Section III, the asymptotic exactness of the proposed algorithms cannot be established without this requirement.
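The array manifold in the narrow-band model (10) can be written out directly. The following sketch builds $\mathbf{A}(f)$ for a uniform linear array; the helper name `steering_matrix` and the numerical values (wave speed, carrier, DOAs) are assumptions chosen only for illustration.

```python
import numpy as np

def steering_matrix(freq, doas_deg, m, spacing, c):
    """Return the m x d matrix whose k-th column is
    a_k(freq) = [1, ..., exp(j 2 pi freq (m-1) D sin(theta_k) / c)]^T."""
    sensor_index = np.arange(m)[:, None]                     # (i - 1) = 0 .. m-1
    sines = np.sin(np.deg2rad(np.asarray(doas_deg, dtype=float)))[None, :]
    delays = spacing * sines / c                             # D sin(theta_k) / c
    return np.exp(2j * np.pi * freq * sensor_index * delays)

c = 3.0e8                 # assumed wave speed (m/s)
f0 = 600.0e6              # assumed center frequency (Hz)
D = c / (2 * f0)          # half-wavelength spacing at f0
A = steering_matrix(f0, [-30.0, 40.0, 55.0], m=7, spacing=D, c=c)
print(A.shape)
```

Passing a cycle frequency instead of the carrier as `freq` gives the manifold used later for the cyclic correlations, since only the frequency argument changes.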

III. NEW SIGNAL SUBSPACE FITTING ALGORITHMS

Let us first note a simple but important property of cyclic correlation, which is given in [5, ch. 12] and in [12, ch. 11] for time series (e.g., sample paths of stochastic processes).

Lemma 1: If $x(\cdot)$ is a cyclostationary process with (self) cyclic correlation function $R_x^\alpha(\tau)$, and $y(t) = x(t + T)$, then

$$R_y^\alpha(\tau) = R_x^\alpha(\tau)\, e^{j2\pi\alpha T}.$$

Proof:

$$\begin{aligned}
R_y^\alpha(\tau) &= \left\langle y(t + \tau/2)\, y^*(t - \tau/2)\, e^{-j2\pi\alpha t} \right\rangle \\
&= \left\langle x(t + T + \tau/2)\, x^*(t + T - \tau/2)\, e^{-j2\pi\alpha t} \right\rangle \\
&= \left\langle x(t' + \tau/2)\, x^*(t' - \tau/2)\, e^{-j2\pi\alpha t'} \right\rangle e^{j2\pi\alpha T}, \qquad t' = t + T \\
&= R_x^\alpha(\tau)\, e^{j2\pi\alpha T}.
\end{aligned}$$

Remark: The point is that $R_y^\alpha(\tau)$ is not a function of $t$ and that the time delay is transformed into a phase shift.

Now, let us start from the general data model (7), (8). The cyclic correlation of the signal received by each sensor is

$$R_{x_i}^\alpha(\tau) = \left\langle \sum_{k=1}^{d}\sum_{l=1}^{d} s_k\!\left(t + \frac{\tau}{2} + (i - 1)\frac{D\sin\theta_k}{c}\right) s_l^*\!\left(t - \frac{\tau}{2} + (i - 1)\frac{D\sin\theta_l}{c}\right) e^{-j2\pi\alpha t} \right\rangle.$$

Since the $s_k(\cdot)$ are mutually not cyclically correlated, and since only $d_\alpha$ sources are self cyclically correlated with cycle frequency $\alpha$, the double summation reduces to a single sum:

$$\begin{aligned}
R_{x_i}^\alpha(\tau) &= \sum_{k=1}^{d_\alpha} \left\langle s_k\!\left(t + \frac{\tau}{2} + (i - 1)\frac{D\sin\theta_k}{c}\right) s_k^*\!\left(t - \frac{\tau}{2} + (i - 1)\frac{D\sin\theta_k}{c}\right) e^{-j2\pi\alpha t} \right\rangle \\
&= \sum_{k=1}^{d_\alpha} R_{s_k}^\alpha(\tau)\, \exp[j2\pi\alpha (i - 1) D\sin\theta_k / c].
\end{aligned} \tag{11}$$

Note that by assumption, the additive noises $n_i(\cdot)$ do not have cycle frequency $\alpha$ and therefore make no contribution to $R_{x_i}^\alpha(\tau)$. If we collect these functions in a vector

$$\underline{R}_x^\alpha(\tau) = [R_{x_1}^\alpha(\tau), R_{x_2}^\alpha(\tau), \ldots, R_{x_m}^\alpha(\tau)]^T$$

we see that

$$\underline{R}_x^\alpha(\tau) = \mathbf{A}(\alpha)\, \underline{R}_s^\alpha(\tau) \tag{12}$$

where $\mathbf{A}(\alpha) = [\mathbf{a}_1(\alpha), \ldots, \mathbf{a}_{d_\alpha}(\alpha)]$, $\mathbf{a}_k(\alpha) = [1, \exp(j2\pi\alpha D\sin\theta_k / c), \ldots, \exp[j2\pi\alpha (m - 1) D\sin\theta_k / c]]^T$, and $\underline{R}_s^\alpha(\tau) = [R_{s_1}^\alpha(\tau), R_{s_2}^\alpha(\tau), \ldots, R_{s_{d_\alpha}}^\alpha(\tau)]^T$. Thus, we have the nice result that whether the original data are broad band or narrow band, when evaluated individually, the cyclic correlations of the data exactly obey a narrow-band data model (10) with cycle frequency $\alpha$ as a "center frequency." We can then estimate the source DOA's by applying conventional narrow-band algorithms [3], [4]. For example, we can proceed as follows.

The Algorithm

Step 1: For signal $x_i(\cdot)$ received at the $i$th sensor, and for each cycle frequency $\alpha$ (estimated or known), estimate the cyclic correlation $R_{x_i}^\alpha(\tau)$, where $\tau = 0, T_s, \ldots, NT_s$; $i = 1, 2, \ldots, m$; and $T_s$ is the sampling period.

Step 2: For each $\alpha$, form a pseudo-data matrix

$$\mathbf{X}(\alpha) = \begin{bmatrix}
R_{x_1}^\alpha(0) & R_{x_1}^\alpha(T_s) & \cdots & R_{x_1}^\alpha(NT_s) \\
R_{x_2}^\alpha(0) & R_{x_2}^\alpha(T_s) & \cdots & R_{x_2}^\alpha(NT_s) \\
\vdots & \vdots & & \vdots \\
R_{x_m}^\alpha(0) & R_{x_m}^\alpha(T_s) & \cdots & R_{x_m}^\alpha(NT_s)
\end{bmatrix} \tag{13}$$

where $N$ is determined such that, for $\tau \le NT_s$, the $R_{x_i}^\alpha(\tau)$ are nonzero and significantly varying.

Step 3: For each $\alpha$, pick your favorite signal subspace fitting (SSF) algorithm, e.g., MUSIC or ESPRIT, to estimate the DOA's of signals of interest based on this pseudo-data matrix $\mathbf{X}(\alpha)$ or its sample covariance matrix $\mathbf{X}(\alpha)\,\mathbf{X}(\alpha)^H$.

Remark: We shall call such algorithms SC-SSF methods, e.g., SC-MUSIC or SC-ESPRIT, where SC stands for spectral correlation. The fundamental idea behind this algorithm is different from that of conventional narrow-band algorithms, i.e., MUSIC and ESPRIT, which rely on spatial processing only. In the narrow-band case, we see from (9) that the source spatial information can be estimated directly by comparing outputs of spatially distributed sensors. In more general scenarios (7), (8), it is difficult to do so. However, Lemma 1 shows that a temporal delay in the signal reveals itself as a phase shift on the cyclic correlation function; the source spatial information is then revealed, and it can be estimated using conventional spatial processing algorithms, e.g., MUSIC or ESPRIT. In other words, if the signals exhibit cyclostationarity, the difficulties of DOA estimation arising from (nonzero) signal bandwidth can be circumvented by incorporating proper temporal processing.
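Steps 2 and 3 can be illustrated with a small sketch that applies a MUSIC search to a pseudo-data matrix. This is only a sanity check under the exact model (12), not a reproduction of the paper's simulations: the pseudo-data matrix is synthesized directly as $\mathbf{A}(\alpha)$ times made-up source cyclic correlation sequences (so Step 1 is skipped), and the function names, grid, lag sequences, and numerical values are all assumptions.

```python
import numpy as np

def steering(alpha, theta_deg, m, spacing, c):
    """Manifold vector a(alpha) of (12) for a single DOA."""
    i = np.arange(m)
    return np.exp(2j * np.pi * alpha * i * spacing * np.sin(np.deg2rad(theta_deg)) / c)

def sc_music(X, alpha, n_src, m, spacing, c, grid_deg):
    """MUSIC search on the pseudo-data matrix X (sensors x lags)."""
    U, _, _ = np.linalg.svd(X, full_matrices=True)
    En = U[:, n_src:]                          # noise-subspace basis
    spec = np.empty(len(grid_deg))
    for g, th in enumerate(grid_deg):
        a = steering(alpha, th, m, spacing, c)
        # small constant avoids division by zero for noise-free data
        spec[g] = 1.0 / (np.linalg.norm(En.conj().T @ a) ** 2 + 1e-12)
    # keep the n_src largest local maxima of the pseudo-spectrum
    peaks = [g for g in range(1, len(spec) - 1)
             if spec[g] > spec[g - 1] and spec[g] >= spec[g + 1]]
    peaks.sort(key=lambda g: spec[g], reverse=True)
    return sorted(grid_deg[g] for g in peaks[:n_src])

m, c, alpha = 7, 3.0e8, 3.0e6                 # assumed array size, wave speed, cycle frequency
D = c / (2 * alpha)                           # unambiguous spacing for the induced manifold
true_doas = [30.0, 50.0]
lags = np.arange(8)
# Hypothetical source cyclic correlation sequences R_s1(tau), R_s2(tau)
Rs = np.vstack([0.9 ** lags * np.exp(1j * 0.3 * lags),
                0.7 ** lags * np.exp(-1j * 0.5 * lags)])
A = np.column_stack([steering(alpha, th, m, D, c) for th in true_doas])
X = A @ Rs                                    # pseudo-data matrix as in (13)
grid = np.linspace(-90.0, 90.0, 1801)
est = sc_music(X, alpha, 2, m, D, c, grid)
print(est)
```

With real data, `X` would instead be filled row by row with the per-sensor cyclic correlation estimates of Step 1, and the peaks would be broadened by estimation error.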

IV. COMPARISON WITH THE CYCLIC MUSIC AND ESPRIT ALGORITHMS

In this section, we compare our new SC-SSF algorithms with the cyclic MUSIC and ESPRIT algorithms introduced in [7], [8].

A. Existing Cyclic Algorithms

Suppose that the cycle frequency $\alpha$ for the $d$ narrow-band sources is available (either known or estimated), and the cross cyclic correlation is then estimated:

$$\mathbf{R}_x^\alpha(\tau) = \left\langle \mathbf{x}(t + \tau/2)\, \mathbf{x}^H(t - \tau/2)\, e^{-j2\pi\alpha t} \right\rangle = \mathbf{A}(f_0)\, \mathbf{R}_s^\alpha(\tau)\, \mathbf{A}^H(f_0) \tag{14}$$

where $\mathbf{R}_s^\alpha(\tau) = \langle \mathbf{s}(t + \tau/2)\, \mathbf{s}^H(t - \tau/2)\, e^{-j2\pi\alpha t} \rangle$, and $H$ denotes Hermitian conjugate. Now assuming there are $d_\alpha$ sources sharing the same cycle frequency $\alpha$ and that they are mutually (cyclically) uncorrelated, then $\mathbf{R}_s^\alpha(\tau)$ will also be diagonal and of rank $d_\alpha$. Consequently, $\mathbf{R}_x^\alpha(\tau)$ will be of rank $d_\alpha$, and its columns will span the same subspace as spanned by these $d_\alpha$ relevant signals. Hence, the MUSIC and ESPRIT algorithms can be readily applied to estimate the signal subspace (e.g., via an SVD), and then to determine the DOA's of these sources. The basic idea of the cyclic MUSIC and ESPRIT algorithms is the replacement of conventional correlation by cyclic correlation in an attempt to null out or reduce the noise and interference. Since the number of sources present in the evaluated cyclic correlation matrix is reduced, the detection capability and estimation accuracy can be significantly improved. For correlated noise, unlike conventional MUSIC and ESPRIT, these algorithms do not require the knowledge of $\mathbf{R}_n$ and the computation of a general eigendecomposition. However, since they were developed based on the narrow-band data model, these methods are approximate approaches for the DOA estimation of reasonably wide-band sources. (In fact, if no narrow-band assumption is imposed, the relation (14) will not hold in general.)

B. Asymptotic Exactness

To simplify the explanation of the asymptotic exactness of the SC-SSF methods, we only consider a one-source scenario. Using the above general data model (7), (8), we obtain

$$\begin{aligned}
x_1(t) &= s(t) + n_1(t) \\
x_2(t) &= s(t + \tau_d) + n_2(t) \\
&\;\;\vdots \\
x_m(t) &= s(t + (m - 1)\tau_d) + n_m(t)
\end{aligned} \tag{15}$$

where $s(\cdot)$ is a cyclostationary process with cycle frequency $\alpha$, and $\tau_d = D\sin\theta / c$. The cyclic correlation evaluated according to the existing cyclic algorithms is

$$\mathbf{R}_x^\alpha(\tau)_{kl} = \left\langle x_k(t + \tau/2)\, x_l^*(t - \tau/2)\, e^{-j2\pi\alpha t} \right\rangle.$$

Suppose that the noises $n_i(\cdot)$ are not cyclically correlated with $s(\cdot)$ at the cycle frequency $\alpha$, and that the cyclic autocorrelation of the noise at the cycle frequency $\alpha$ is also zero. Thus, we obtain

$$\begin{aligned}
\mathbf{R}_x^\alpha(\tau)_{kl} &= \left\langle s(t + \tau/2 + (k - 1)\tau_d)\, s^*(t - \tau/2 + (l - 1)\tau_d)\, e^{-j2\pi\alpha t} \right\rangle \\
&= R_s^\alpha(\tau + (k - l)\tau_d)\, \exp[j\pi\alpha (l + k - 2)\tau_d]
\end{aligned}$$

where $\mathbf{R}_x^\alpha(\tau)_{kl}$ is the $(k, l)$th element of the matrix $\mathbf{R}_x^\alpha(\tau)$; $k, l = 1, 2, \ldots, m$. Clearly, since $R_s^\alpha(\tau + (k - l)\tau_d)$ is a function of $k$ and $l$, the matrix $\mathbf{R}_x^\alpha(\tau)$ is generally of full rank instead of rank one. Hence, the existing cyclic algorithms would yield biased DOA estimates. Nevertheless, if the source of interest is narrow band enough (at center frequency $f_0$) such that $s(t + \tau) \approx s(t)\, e^{j2\pi f_0\tau}$ for arbitrary real $t$ and $\tau$, and if $\alpha$ is much smaller than $f_0$, then

$$\begin{aligned}
\mathbf{R}_x^\alpha(\tau)_{kl} &\approx R_s^\alpha(\tau)\, \exp[j2\pi f_0 (k - l)\tau_d] \cdot \exp[j\pi\alpha (l + k - 2)\tau_d] \\
&\approx R_s^\alpha(\tau)\, \exp[j2\pi f_0 (k - l)\tau_d]
\end{aligned} \tag{18}$$

from which we can also conclude $\mathbf{R}_x^\alpha(\tau) = \langle \mathbf{x}(t + \tau/2)\, \mathbf{x}^H(t - \tau/2)\, e^{-j2\pi\alpha t} \rangle \approx \mathbf{A}(f_0)\, R_s^\alpha(\tau)\, \mathbf{A}^H(f_0)$. Although such a narrow-band model is accurate enough in many applications, for reasonably broad-band sources the prior algorithms can only yield approximate solutions, and their performance becomes worse as the source bandwidth increases.

Besides, the cyclic MUSIC and ESPRIT algorithms still have some other shortcomings. The most serious arises from the fact that in these algorithms, the cyclic covariance corresponding to one time lag $\tau$ is evaluated and then used to estimate the DOA's. Of course, we hope that we know the optimal lag $\tau$ at which $R_s^\alpha(\tau)$ achieves its maximum. In reality, however, this information is rarely available. If we pick a wrong time lag, we may not be able to determine all the DOA's. For example, if there is more than one source sharing the same cycle frequency $\alpha$ and the optimal lags of these sources are different, then there may not be any optimal $\tau$ at all. For example, for binary phase shift keying (BPSK) with half-cosine pulse shape,² the optimal $\tau$ is 0, while for rectangular pulse BPSK, $\tau = 0$ is the worst case, i.e., $R_s^\alpha(0) = 0$. Of course, such extreme cases rarely happen in practice. One possible approach to get around the above difficulty is proposed by Gardner [7]: use $\mathbf{S}_x^\alpha(f) = \int_{-\infty}^{\infty} \mathbf{R}_x^\alpha(\tau)\, e^{-j2\pi f\tau}\, d\tau$ in place of $\mathbf{R}_x^\alpha(\tau)$, where $f$, unlike $\tau$, can easily be chosen appropriately. Another alternative is to estimate the DOA's by combining all the cyclic correlations of different time lags. Assuming the sources are well approximated as narrow-band and cyclostationary processes, we obtain

$$\mathbf{R}_x^\alpha(\tau) = \left\langle \mathbf{x}(t + \tau/2)\, \mathbf{x}^H(t - \tau/2)\, e^{-j2\pi\alpha t} \right\rangle = \mathbf{A}(f_0)\, \mathbf{R}_s^\alpha(\tau)\, \mathbf{A}^H(f_0). \tag{19}$$

² $p(t) = \sin(2\pi t / T_p)$ for $-T_p/2 \le t \le T_p/2$ and $p(t) = 0$ for $t$ elsewhere.

Since the column span of $\mathbf{R}_x^\alpha(\tau)$ is the column span of $\mathbf{A}(f_0)$, which is independent of $\tau$, the following matrix

$$\mathbf{X}(\alpha) = [\mathbf{R}_x^\alpha(0), \mathbf{R}_x^\alpha(T_s), \ldots, \mathbf{R}_x^\alpha(NT_s)] \tag{20}$$

$$= \mathbf{A}(f_0)\,[\mathbf{R}_s^\alpha(0)\mathbf{A}^H(f_0),\; \mathbf{R}_s^\alpha(T_s)\mathbf{A}^H(f_0),\; \ldots,\; \mathbf{R}_s^\alpha(NT_s)\mathbf{A}^H(f_0)] \tag{21}$$

has the same column span as $\mathbf{A}(f_0)$. If each specific $\mathbf{R}_s^\alpha(\tau)$ is rank deficient, the combination of all these, $[\mathbf{R}_s^\alpha(0)\mathbf{A}^H(f_0), \mathbf{R}_s^\alpha(T_s)\mathbf{A}^H(f_0), \ldots, \mathbf{R}_s^\alpha(NT_s)\mathbf{A}^H(f_0)]$, is less likely to be rank deficient. Hence, the problems mentioned above can be resolved. Nevertheless, although the suggested alternative approaches use all the cyclic correlation functions with $\tau = 0, T_s, \ldots, NT_s$, they are still not asymptotically exact, especially for wide-band sources. Also, due to a much larger size of the matrix $\mathbf{X}(\alpha)$ in the first approach and the additional computational cost for $\mathbf{S}_x^\alpha(f)$, involving matrices $\mathbf{R}_x^\alpha(\tau)$ at $\tau = 0, T_s, \ldots$, in the second approach, neither method is as computationally efficient as the SC-SSF approach described in Section III.

C. Resolution

The resolution properties of direction finding algorithms depend on many practical factors, e.g., the sensor separation distance $D$, the effective frequency $f_e$ of the induced manifold, as well as the range of the wave spatial coherency and the coupling between sensors. In this section, we attempt only a brief discussion of the resolution properties of existing cyclic methods and of our new SC-SSF techniques. To achieve high resolution and to minimize the coupling effect, one needs to have a large sensor separation distance $D$. On the other hand, if $D$ is too large, ambiguity problems may arise. For conventional MUSIC or cyclic MUSIC, the maximum $D$ that does not cause ambiguity is $c/2f_0$, where $c$ is the wave speed and $f_0$ denotes the carrier frequency. Nevertheless, for the SC-SSF approach, the maximum $D$ is $c/2\alpha$. In many cases, the cycle frequency $\alpha$ that one may exploit is close to twice the carrier frequency $f_0$. For example, the cycle frequency of AM or FM is $2f_0$ [10] and the $\alpha$ of BPSK or FSK is $2f_0 + f_b$ [11], where $f_b$ is the baud rate. Therefore, the same resolution as conventional MUSIC or cyclic MUSIC can be achieved if $D$ is reduced by half. In this case, there is no fundamental difference between the prior (cyclic SSF) methods and the new (SC-SSF) algorithms. Nevertheless, for other cyclostationary signals such as quaternary phase-shift keying (QPSK), the cycle frequency is on the order of $f_b$, which is approximately proportional to the bandwidth. In wide-band cases, viz., when the bandwidth is comparable with the carrier, the resolution of the SC-SSF methods will not be affected much. If the sources are very narrow band ($f_b \ll f_0$), and we still use the same $D$ as the prior methods, of course, the resolution of SC-SSF will degrade. However, since the maximum $D$ (without ambiguity) of the SC-SSF algorithms is $c/2\alpha \gg c/2f_0$, the same resolution can be achieved by significantly increasing the sensor separation distance. Of course, if the range of the wave spatial coherency is small, too large a sensor separation will impair the performance. On the other hand, if the carrier $f_0$ is very high, the maximum sensor separation $D = c/2f_0$ may still be too small to avoid the coupling or crosstalk effect. By using the SC-SSF approach and exploiting the smaller $\alpha$, this problem can be resolved because the upper bound for unambiguous $D$ is much larger.

Many cyclostationary signals, e.g., BPSK, also exhibit a conjugate cyclic property, i.e., the conjugate cyclic correlation is nonzero for some cycle frequency $\alpha$. If the baseband signal exhibits the conjugate cyclic property, then even though the sources are narrow-band, the induced manifold still has a large effective frequency, i.e., $f_e = 2f_0 + \alpha$. We will illustrate this fact for the single source case, although it may be easily extended to multiple sources. Let $\mathbf{x}(\cdot)$ be the baseband data vector after quadrature detection and down sampling:

$$\begin{aligned}
x_1(t) &= s(t) + n_1(t) \\
x_2(t) &= s(t + \tau_d)\, \exp(j2\pi f_0\tau_d) + n_2(t) \\
&\;\;\vdots \\
x_m(t) &= s(t + (m - 1)\tau_d)\, \exp[j2\pi f_0 (m - 1)\tau_d] + n_m(t)
\end{aligned} \tag{22}$$

where $\tau_d = D\sin\theta / c$. Let $R_{x_i}^\alpha(\tau)$, $R_s^\alpha(\tau)$ be the conjugate cyclic correlations of $x_i(\cdot)$ and $s(\cdot)$, respectively. It is easy to see that

$$\begin{aligned}
R_{x_i}^\alpha(\tau) &= \big\langle s(t + \tau/2 + (i - 1)\tau_d)\, \exp[j2\pi f_0 (i - 1)\tau_d] \\
&\qquad \cdot\, s(t - \tau/2 + (i - 1)\tau_d)\, \exp[j2\pi f_0 (i - 1)\tau_d]\; e^{-j2\pi\alpha t} \big\rangle \\
&= R_s^\alpha(\tau)\, \exp[j2\pi (2f_0 + \alpha)(i - 1)\tau_d].
\end{aligned} \tag{23}$$

Hence, (12) becomes

$$\underline{R}_x^\alpha(\tau) = \mathbf{A}(2f_0 + \alpha)\, \underline{R}_s^\alpha(\tau). \tag{24}$$

In the following examples of computer simulation, we use BPSK signals, and the conjugate cyclic property mentioned above is exploited.
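The spacing limits discussed in Section IV-C all follow from $D_{\max} = c/2f_e$ with different effective frequencies $f_e$. The short sketch below evaluates the three cases; the function name and the numerical values for $c$, $f_0$, and $f_b$ are illustrative assumptions, not parameters taken from the paper.

```python
def max_unambiguous_spacing(c, effective_freq):
    """Largest sensor separation D = c / (2 f_e) that avoids DOA ambiguity."""
    return c / (2.0 * effective_freq)

c = 3.0e8        # assumed wave speed (m/s)
f0 = 600.0e6     # assumed carrier frequency (Hz)
fb = 3.0e6       # assumed baud rate, used as the cycle frequency (Hz)

d_conventional = max_unambiguous_spacing(c, f0)            # conventional or cyclic MUSIC
d_sc_ssf = max_unambiguous_spacing(c, fb)                  # SC-SSF with alpha = fb
d_sc_conjugate = max_unambiguous_spacing(c, 2 * f0 + fb)   # conjugate cyclic case, as in (24)
print(d_conventional, d_sc_ssf, d_sc_conjugate)
```

Under these assumed values the SC-SSF bound with $\alpha = f_b$ is two hundred times the conventional bound, while the conjugate cyclic case roughly halves it, in line with the discussion above.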

D. Algorithm Implementation

Conventional signal subspace algorithms such as MUSIC and ESPRIT start from the covariance matrix $\mathbf{R}_x = \frac{1}{N}\sum_{n=1}^{N} \mathbf{x}(n)\, \mathbf{x}^H(n)$. For cyclic MUSIC and ESPRIT, we need to evaluate a cyclic covariance matrix, i.e., $\mathbf{R}_x^\alpha(\tau) = \langle \mathbf{x}(t + \tau/2)\, \mathbf{x}^H(t - \tau/2)\, e^{-j2\pi\alpha t} \rangle$, which is estimated by $\mathbf{R}_x^\alpha(m) = \frac{1}{N}\sum_{n=1}^{N} \mathbf{x}(nT_s + mT_s)\, \mathbf{x}^H(nT_s)\, e^{-j2\pi\alpha nT_s}$, where $T_s$ is the sampling time and $N$ is the number of samples for averaging. If the calculation of the cyclic correlation is done by an analog circuit, then just replace the sum by an integral in the above equation. If noise is stationary but spatially correlated, then cyclic MUSIC and ESPRIT do not require the knowledge of $\mathbf{R}_n(0)$ and the computation of the generalized eigendecomposition $\{\mathbf{R}_x(0), \mathbf{R}_n(0)\}$. Since the original data are sampled at a very high rate to avoid aliasing, and the evaluation of the spatial cyclic correlation is a very simple (though time consuming) operation, the computation of the spatial cyclic correlation is usually implemented by digital or analog parallel processing hardware. The processed data, i.e., the cyclic correlation, will be sent to the central processing unit for further, more complicated processing, e.g., an eigendecomposition or a MUSIC search.

From the formulas noted above, it is not difficult to see that the evaluation of cross correlation required by the cyclic MUSIC and ESPRIT algorithms involves multiplication of data from different sensors, e.g., $x_i(n)\, x_j^*(n)$. Therefore, for each sensor, we have to build interconnections to the other $(m - 1)$ sensors, which may make the hardware design much more difficult. Problems such as cross talk or coupling may also arise. In fact, the computation of the covariance matrix required by the conventional MUSIC and ESPRIT algorithms also has the same problems. On the other hand, the SC-SSF algorithms only require the temporal cyclic correlation, i.e., the cyclic autocorrelation among the data at each sensor. Therefore, the computation of cyclic correlation can be achieved locally at each sensor and no interconnections are required.

Figures 1-3 show a simplified diagram of the computation schemes for the three types of correlation matrices required by conventional SSF, cyclic SSF, and SC-SSF algorithms, respectively.
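The difference between the two computations can be made concrete with a small sketch. The first function forms the full cyclic covariance estimate, which multiplies data across sensors; the second forms the per-sensor cyclic autocorrelation that SC-SSF needs, using one sensor's samples only. The function names, the random test data, and the parameter values are assumptions for illustration; the averaging window is shortened by the lag so that all indices stay inside the record.

```python
import numpy as np

def cyclic_covariance(X, alpha, lag, Ts):
    """(1/N) sum_n x(nTs + lag*Ts) x^H(nTs) exp(-j 2 pi alpha n Ts); X is sensors x samples."""
    N = X.shape[1] - lag
    phase = np.exp(-2j * np.pi * alpha * np.arange(N) * Ts)
    return (X[:, lag:] * phase) @ X[:, :N].conj().T / N

def local_cyclic_autocorr(x, alpha, lag, Ts):
    """Same time average for a single sensor; no data from other sensors is used."""
    N = len(x) - lag
    phase = np.exp(-2j * np.pi * alpha * np.arange(N) * Ts)
    return np.mean(x[lag:] * np.conj(x[:N]) * phase)

rng = np.random.default_rng(0)
# Placeholder data: 7 sensors, 500 complex samples (not a cyclostationary signal)
X = rng.standard_normal((7, 500)) + 1j * rng.standard_normal((7, 500))
Ts, alpha, lag = 1.0 / 8e6, 3e6, 2

R = cyclic_covariance(X, alpha, lag, Ts)
r_local = np.array([local_cyclic_autocorr(x, alpha, lag, Ts) for x in X])
print(R.shape, np.allclose(np.diag(R), r_local))
```

The locally computed values coincide with the diagonal of the full matrix, which is why the SC-SSF computation needs no sensor interconnections.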

Fig. 1. Simplified covariance matrix computational flow graph for conventional signal subspace algorithms.

Fig. 2. Simplified covariance matrix computational flow graph for cyclic SSF algorithms.

Fig. 3. Simplified covariance matrix computational flow graph for the SC-SSF algorithms.

V. COMPUTER SIMULATIONS AND DISCUSSIONS

In order to show the effectiveness of the proposed approach, computer simulations for testing signal selectivity, high resolution, and sensitivity to spatially correlated noise were conducted. In these simulations, the conventional MUSIC, cyclic MUSIC, and SC-MUSIC algorithms were used to estimate the source DOA's. In the following examples, a seven-element uniform linear array was used, with the smallest sensor separation $c/2f_e$ being used to avoid any ambiguity in DOA estimation, where $c$ is the wave speed and $f_e$ is the center frequency of the effective array manifold. In order to show the relative performance of the above mentioned approaches in both narrow-band and wide-band scenarios, seven cases ranging from temporal narrow-band (BW/carrier = 1%) to temporal wide-band (BW/carrier = 40%) were studied. In this paper, the measure of temporal narrow band or wide band is defined as the ratio of the bandwidths of the complex baseband signals and the carrier. The statistics of the DOA estimates were calculated based on 500 independent trials. For cyclic MUSIC and SC-MUSIC, we evaluated the conjugate cyclic correlation to extract the SOI (BPSK). As explained in Section IV-C, although the cycle frequency $\alpha = f_b$, the effective frequency $f_e$ in the induced array manifold is $2f_0 + f_b$. The signal-to-noise ratio (SNR) for each source was defined as the ratio of the power of this source to that of the background noise. In all the following examples, we started with the baseband (complex) signals. The background noise in cases A and B is spatially uncorrelated Gaussian noise, while it is a correlated Gaussian process in case C. For cyclic MUSIC, we always pick the optimal lag $\tau$ at which the cyclic correlation $R_s^\alpha(\tau)$ achieves its maximum, although we may not know the optimal lag in reality. In order to make the comparison even fairer, SC-MUSIC only used 7 cyclic correlation values so that $\mathbf{X}(\alpha)$ is of the same size as that of the $\mathbf{R}_x^\alpha(\tau)$ used in the cyclic MUSIC algorithm. In fact, SC-MUSIC can use as long a cyclic correlation sequence as possible, although the maximum useful $\tau$ value is limited by the width of $R_s^\alpha(\tau)$ that exhibits significant variation. The simulated data in this paper are all generated in the time domain instead of in the frequency domain.

Fig. 4. FFT magnitude of the received data (case A).

A. Signal Selectivity Test

The first example was designed to test the signal selectivity of the three approaches mentioned above. The signal of interest was a BPSK signal with half-cosine pulse shape, 3-MHz baud rate, and 10-dB SNR. It arrived at an angle of 55°. Two interferences, AM and FM, arriving from -30° and 40°, respectively, were also present. Their SNR's were 5 dB (FM) and 0 dB (AM). The bandwidth of the AM signal was 0.2 MHz, while that of the FM signal was 1 MHz. Five hundred samples were collected at the rate of 8 MHz for 62.5 μs. For simplicity of comparison, we assumed that the cycle frequencies of the signals of interest (SOI's) and the number of sources were known. The carrier difference was defined to be the difference between the carriers of the signal of interest (BPSK) and other sources. In this example, the carrier differences of the AM and FM signals were 3 MHz and 0.5 MHz. For the conventional MUSIC algorithm, the number of sources was assumed to be three, whereas for both the cyclic algorithms, only one source was considered. The FFT magnitude of the received data is shown in Fig. 4.

The results of the conventional MUSIC algorithm corresponding to the smallest BW/carrier and the largest BW/carrier are shown in Fig. 5(a), (b), respectively. It is easy to observe that the conventional MUSIC algorithm can only detect two sources and that it fails to distinguish the SOI (BPSK) and the FM interference. Since the conventional MUSIC algorithm is based upon a narrow-band data model, it can detect the sources (AM) whose energy is concentrated in a small bandwidth much better than sources with widely distributed energy. This restriction of the conventional MUSIC algorithm can be observed from the simulation results as well. As shown in Fig. 5(a), the DOA of the AM signal, which is a narrow-band source, was estimated accurately when BW/carrier = 1%. In this case, the BW/carrier for this AM signal is about 0.2/6 · 1% = 0.033%. In the wide-band case (BW/carrier = 40%), the BW/carrier for the AM is 0.2/6 · 40% = 1.33%. Although it is still detectable by the conventional MUSIC algorithm, the bias of the estimates becomes much larger. Although the signal bandwidth effect may be used to explain the difference in estimating the DOA of the AM signal in these two extreme cases, we find it difficult to use the same argument to explain the poor DOA estimates for the FM and BPSK signals when BW/carrier = 1%. The possible reason is the limited detection capability of the seven-element array. In other words, the FM and BPSK sources were too closely spaced, and three DOA estimates were searched for in a noise subspace of dimension 4. As will be shown below, by exploiting cyclostationarity, we need to search for only one source in the six-dimensional noise subspace.

Fig. 5. Results for the conventional MUSIC algorithm (a) BW/carrier = 1% (b) BW/carrier = 40% (case A). [Horizontal axis: direction of arrival (deg).]

Fig. 6. Results for the cyclic MUSIC and SC-MUSIC (case A). Seven element ULA, SNR = 10 dB, 500 trials, 500 snapshots. SOI: BPSK (3 MHz, 55°). Sampling frequency = 8 MHz. [Plot legend: Existing Cyclic MUSIC, New Cyclic MUSIC; horizontal axis: BW/carrier.]

The simulation results for the cyclic MUSIC algorithm of Gardner et al. [7], [8], however, look much better, as shown in Fig. 6, since the interferences (AM, FM) are discriminated against in the cyclic correlation measurement. The RMSE in Fig. 6 stands for root-mean-square error. Ideally speaking, since cyclic MUSIC is designed for narrow-band sources, the DOA estimates should degrade as the bandwidth increases. In this example, however, the RMSE did not grow so significantly with the increased bandwidth. In other words, the bandwidth effect does not dominate in this case. The relatively large RMSE of the DOA estimates (7°-8° even for the 1% bandwidth/carrier) may be due to the fact that only one value of $\tau$ is exploited in the cyclic correlation.

The results for SC-MUSIC show significant improvement in DOA estimation. The main reason for the improved performance is the more efficient exploitation by our algorithm of multiple values of $\tau$ in the cyclic correlation functions. Since the SC-MUSIC algorithm is asymptotically exact for both narrow-band and wide-band sources, in principle, the performance should not differ significantly as the signal bandwidth varies. In practice, however, one can easily observe a sharp drop in the estimation error as the bandwidth increases. This is due to the fact that we have only a finite number of snapshots. In this case, the estimate of the conjugate cyclic correlation between two sources, e.g., FM (40°) and BPSK (55°), is not zero, although these two sources are cyclically uncorrelated. Therefore, the conjugate cyclic correlation of the data received by the $i$th sensor is

$$\begin{aligned}
R_{x_i}^\alpha(\tau) = {} & R_{s_1}^\alpha(\tau)\, \exp[j2\pi f_e (i - 1)\tau_1] \\
& + \left\langle s_1\!\left(t + \frac{\tau}{2} + (i - 1)\tau_1\right) s_2\!\left(t - \frac{\tau}{2} + (i - 1)\tau_2\right) e^{-j2\pi f_b t} \right\rangle \exp[j2\pi f_0 (i - 1)(\tau_1 + \tau_2)] \\
& + \left\langle s_1\!\left(t - \frac{\tau}{2} + (i - 1)\tau_1\right) s_2\!\left(t + \frac{\tau}{2} + (i - 1)\tau_2\right) e^{-j2\pi f_b t} \right\rangle \exp[j2\pi f_0 (i - 1)(\tau_1 + \tau_2)]
\end{aligned}$$

where $s_1(\cdot)$ denotes the baseband signal of the SOI (BPSK) and $s_2(\cdot)$ represents the baseband signal of another interference (FM or AM), and $\tau_k = D\sin\theta_k / c$, $k = 1, 2$. Since $D/c = 1/2f_e = 1/2(2f_0 + f_b)$, $\tau_k = \sin\theta_k / 2(2f_0 + f_b)$. In our simulations, the BW/carrier is changed by fixing $f_b$ and varying $f_0$. In the narrow-band case (BW/carrier = 1%), $f_0 \gg f_b$ or $\tau_k \ll \tau = nT_s$; here $T_s$ is the sampling period, which is inversely proportional to $f_b$. Therefore, the cross term residuals are

$$\left\langle s_1\!\left(t \pm \frac{\tau}{2} + (i - 1)\tau_1\right) s_2\!\left(t \mp \frac{\tau}{2} + (i - 1)\tau_2\right) e^{-j2\pi f_b t} \right\rangle \approx \left\langle s_1\!\left(t \pm \frac{\tau}{2}\right) s_2\!\left(t \mp \frac{\tau}{2}\right) e^{-j2\pi f_b t} \right\rangle$$

which are independent of the sensor location. Therefore, the cross terms act consistently like another small induced narrow-band interference with DOA $\sin^{-1}\{(\sin\theta_1 + \sin\theta_2)/(2 + f_b/f_0)\}$, as long as $\langle s_1(t - \tau/2)\, s_2(t + \tau/2)\, e^{-j2\pi f_b t} \rangle$ is not small enough. Since only one source was assumed, the DOA estimation of the principal source was a little biased. Nevertheless, as the bandwidth increases or $f_0$ decreases, $1/(2f_0 + f_b)$ becomes larger and $\tau_k$ becomes comparable with $\tau$. Therefore, the cross term will be further and further from the induced narrow-band interference. In other words, its spatial energy tends to spread out and it looks more and more like a spatially uncorrelated noise. Consequently, the DOA estimation improves as the bandwidth increases. The above is only a qualitative explanation of the dependence of the performance on the source bandwidth. To understand the finite-data effect better, we need to conduct an analytical performance analysis, which will be part of our future research. In fact, the finite-data effect diminishes as the number of snapshots increases. Therefore, in Section V-C, the doubled number of snapshots (i.e., 1000) may be the reason for the fact that the RMSE curves of SC-MUSIC are rather flat as the bandwidth increases (see Fig. 12).
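The apparent direction of the cross-term interference in the narrow-band case can be evaluated numerically from the expression $\sin^{-1}\{(\sin\theta_1 + \sin\theta_2)/(2 + f_b/f_0)\}$. The sketch below does so for the BPSK (55°) and FM (40°) pair; the function name is hypothetical, and the ratio $f_b/f_0 = 3/600$ is an assumed value for the 1% case, inferred from the 3-MHz baud rate and the 6-MHz reference bandwidth implied by the 0.2/6 arithmetic above.

```python
import math

def cross_term_doa_deg(theta1_deg, theta2_deg, fb_over_f0):
    """Apparent DOA asin((sin(theta1) + sin(theta2)) / (2 + fb/f0)), in degrees."""
    s = (math.sin(math.radians(theta1_deg)) + math.sin(math.radians(theta2_deg)))
    return math.degrees(math.asin(s / (2.0 + fb_over_f0)))

# BPSK at 55 deg and FM at 40 deg; fb/f0 = 3 MHz / 600 MHz is an assumed setting
apparent = cross_term_doa_deg(55.0, 40.0, 3.0 / 600.0)
# Sanity check: two coincident sources with fb/f0 -> 0 map back to their own DOA
same = cross_term_doa_deg(30.0, 30.0, 0.0)
print(apparent, same)
```

Under these assumptions the induced interference appears between the two true directions, a few degrees away from the SOI, which is consistent with the small bias attributed to it in the text.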

B. Separation of Closely Spaced Sources

In the second simulation test, we wanted to compare the performance of the algorithms in terms of their separation capability for closely spaced sources. Two sources with DOA's 50° and 60° impinged on the same array as mentioned above. The signal waveforms were the same BPSK signals as defined above but with baud rates of 4 MHz and 3 MHz, respectively. They had the same 10-dB SNR. The number of sources was assumed to be two for conventional MUSIC, and one for the cyclic algorithms. The data were sampled at 10 MHz. The remaining conditions were exactly the same as those described in Section V-A. Fig. 7 shows the FFT magnitude of the received data. According to the simulation results in Fig. 8(a), (b), the conventional MUSIC algorithm reveals only one peak around 55°. This is not surprising since the two sources are very close and the algorithm did not exploit the cyclostationary property of the sources to separate them. No significant difference between the results corresponding to the two extreme bandwidth cases was noticed because the results cannot be worse than they are. In contrast, because cyclic MUSIC and SC-MUSIC exploit the spectral correlation property of the sources, for a given cycle frequency the source with a slightly different cycle frequency (baud rate) is rejected. Only one source with the same cycle frequency remains. Hence, it is much easier to estimate the DOA accurately. Therefore, as shown in Fig. 9, these two closely spaced sources were separated and detected by exploiting cyclostationarity or spectral correlation. Similarly, due primarily to the fact that only one value of $\tau$ is exploited in the cyclic autocorrelation and secondarily because of the mismatch between the narrow-band model and the actual data, cyclic MUSIC yielded a relatively large error in DOA estimation, while SC-MUSIC performed significantly better when the bandwidth was large, for reasons similar to those presented in Section V-A.

Fig. 7. FFT magnitude of the received data (case B). [Horizontal axis: frequency (MHz).]

Fig. 8. Results for the conventional MUSIC algorithm (case B). (a) BW/carrier = 1%. (b) BW/carrier = 40%.

Fig. 9. Results for the cyclic MUSIC and SC-MUSIC algorithms (case B). Seven-element ULA, 500 trials, 500 snapshots, SNR = 10 dB, 10 dB. SOI: BPSK (3 MHz, 60°), BPSK (4 MHz, 50°).

C. Sensitivity to Correlated Noise

Most conventional signal subspace algorithms require exact knowledge of the noise covariance function. In the real world, however, the noise covariance function is rarely known and can be difficult to estimate. This requirement greatly restricts the application of those conventional algorithms. The cyclic algorithms, in contrast, never require this knowledge, provided that the noise is not jointly cyclically correlated with the sources and is not cyclically self-correlated at the cycle frequencies of the SOI's. The cyclic correlation of a stationary process is always zero theoretically and is very small in practice if there are enough data samples.

In order to verify this advantage of the cyclic algorithms, we conducted a simulation test with strongly correlated background noise. In this test, the correlated noise is generated with the correlation sequence r(n) = 1, 0.8, 0.6, 0.3, 0.1, 0, 0, where n = 0, 1, ..., 6. The noise covariance matrix is then a Toeplitz matrix composed from this correlation sequence. The sources in this test were two BPSK signals with DOA's of 30° and 50°. They had the same carrier frequency and the same cycle frequencies or baud rates (3 MHz), which were assumed to be known. However, the source at 30° is modulated on a half-sine pulse,$^3$ while the source at 50° is modulated on a half-cosine pulse. The SNR's were set to -3 dB for both sources. The number of sources was assumed to be two for all the algorithms. The FFT magnitude of the received data is shown in Fig. 10.

Fig. 10. FFT magnitude of the received data (case C).

Due to the low SNR and the lack of knowledge of the correlated noise, the estimates of the conventional MUSIC algorithm were very poor, as shown in Fig. 11(a), (b). The results for the conventional MUSIC algorithm show one source located about halfway between the two correct DOA's (30°, 50°) and another one located near 0°, which might correspond to the correlated noise since the power of the noise exceeded that of the SOI's. The evaluation of cyclic correlation, a kind of temporal processing, can help remove the noise and extract the SOI's. Therefore, cyclic MUSIC managed to detect two sources in most cases. For the same reasons as above, cyclic MUSIC has a larger error in DOA estimation than the SC-MUSIC approach. Due to the extremely low SNR in this case, both cyclic MUSIC and SC-MUSIC failed to detect two sources several times. According to the results in Fig. 13, however, SC-MUSIC once again demonstrated its superiority with around 1/10 smaller probability of detection failure.

Fig. 11. Results for the conventional MUSIC algorithm (case C). (a) BW/carrier = 1%. (b) BW/carrier = 40%.

If we look at Fig. 12 more closely, it is not difficult to see that the estimation error of the source at 30° is much smaller than that of the source at 50° for the cyclic MUSIC algorithm, while the difference between the two estimation errors is not noticeable for the approach developed in this paper. Generally, source bandwidth in array processing is comprised of temporal bandwidth and spatial bandwidth. The spatial bandwidth of a source increases as its DOA increases. Therefore, although these two sources have the same temporal bandwidth, the source with the smaller DOA (30°) has the smaller spatial bandwidth and is consequently closer to the narrow-band data model. Since cyclic MUSIC relies on the narrow-band model, its estimate of the source with DOA 30°, which is closer to narrow band, was much better than its estimate of the other source. Similarly, because SC-SSF does not depend upon source bandwidth, the estimates with smaller and larger bandwidth were not significantly different. Also, since we have twice as many snapshots as in the previous two cases, the cross term $\langle s_1(t - \tau/2 + (i-1)T_d)\, s_2(t + \tau/2 + (i-1)T_d)\, e^{-j2\pi\alpha t} \rangle$ is small enough. Even though $T_d \ll 1$ in the narrow-band case, the overall effect due to the finite number of samples is negligible. This may explain why the errors with SC-MUSIC do not change significantly as the bandwidth varies.

Fig. 12. Results for the cyclic MUSIC and SC-MUSIC algorithms (case C). Seven-element ULA, SNR = -3 dB, 500 trials, 1000 snapshots, correlated noise.

$^3$ A half-sine pulse $p(t)$ is defined as $p(t) = \sin(\pi t / T_b)$ for $-T_b/2 \le t \le T_b/2$ and $p(t) = 0$ for $t$ elsewhere; a half-cosine pulse $q(t)$ is defined as $q(t) = \cos(\pi t / T_b)$ for $-T_b/2 \le t \le T_b/2$ and $q(t) = 0$ for $t$ elsewhere.
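As a small illustration of the correlated-noise model in Section V-C, the Toeplitz noise covariance matrix can be assembled directly from the correlation sequence r(n). This is only a sketch: it assumes that the lag n indexes the separation between sensors of the seven-element array, so that the matrix is 7 x 7, and the helper name `toeplitz_covariance` is hypothetical.

```python
def toeplitz_covariance(r):
    """Symmetric Toeplitz matrix whose (i, j) entry is r[|i - j|]."""
    m = len(r)
    return [[r[abs(i - j)] for j in range(m)] for i in range(m)]


# Correlation sequence quoted in the text, n = 0, 1, ..., 6.
r = [1.0, 0.8, 0.6, 0.3, 0.1, 0.0, 0.0]
R_noise = toeplitz_covariance(r)

for row in R_noise:
    print(" ".join(f"{value:4.1f}" for value in row))
```

Each diagonal of the resulting matrix is constant, and the entries fall to zero once the lag reaches five, which reflects the finite extent of the assumed noise correlation.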

Fig. 13. Detection failure of the cyclic MUSIC and SC-MUSIC algorithms (case C).

VI. CONCLUSIONS

Most existing approaches to the direction-of-arrival problem (see, e.g., those in [1] and [2]) ignore the temporal characteristics of the signals. Following Gardner [7] and Schell et al. [8], we have shown that cyclostationarity of the signals, a situation common in many communications problems, can be exploited to considerable advantage. The new SC-SSF method in this paper effectively combines temporal and spatial properties, and it compares very favorably with earlier methods with respect to performance, ease of implementation, and applicability to both narrow-band and broad-band signals, if the sources are mutually cyclically uncorrelated.

ACKNOWLEDGMENT

The authors are indebted to Prof. W. Gardner and Dr. S. Schell of the University of California, Davis, for several useful discussions, which motivated their interest in the problem of direction finding for cyclostationary signals. They would also like to express their gratitude to the referees for their constructive comments, which greatly improved this paper.

REFERENCES

[1] S. Haykin, Array Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1984.
[2] S. J. Orfanidis, Optimal Signal Processing: An Introduction. New York: Macmillan, 1985.
[3] R. O. Schmidt, "A signal subspace approach to multiple emitter location and spectral estimation," Ph.D. dissertation, Stanford Univ., Stanford, CA, Nov. 1981.
[4] R. H. Roy, "ESPRIT, estimation of signal parameters via rotational invariance techniques," Ph.D. dissertation, Stanford Univ., Stanford, CA, Aug. 1987.
[5] W. A. Gardner, Introduction to Random Processes with Application to Signals and Systems. New York: Macmillan, 1985.
[6] W. A. Gardner, "Measurement of spectral correlation," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, no. 5, pp. 1111-1123, 1986.
[7] W. A. Gardner, "Simplification of MUSIC and ESPRIT by exploitation of cyclostationarity," Proc. IEEE, vol. 76, no. 7, pp. 845-847, July 1988.
[8] S. V. Schell, R. A. Calabretta, W. A. Gardner, and B. G. Agee, "Cyclic MUSIC algorithms for signal-selective direction finding," in Proc. ICASSP89 Conf. (Glasgow, Scotland), vol. 4, May 1989, pp. 2278-2281.
[9] G. Xu and T. Kailath, "Array signal processing via exploitation of spectral correlation-a combination of temporal and spatial processing," in Proc. 23rd Asilomar Conf. Signals, Syst., Comput. (Pacific Grove, CA), vol. 2, Oct. 1989, pp. 945-949.
[10] W. A. Gardner, "Spectral correlation of modulated signals: Part I-Analog modulation," IEEE Trans. Commun., vol. COM-35, no. 6, pp. 584-594, June 1987.
[11] W. A. Gardner, W. A. Brown, and C.-K. Chen, "Spectral correlation of modulated signals: Part II-Digital modulation," IEEE Trans. Commun., vol. COM-35, no. 6, pp. 595-601, June 1987.
[12] W. A. Gardner, Statistical Spectral Analysis: A Nonprobabilistic Theory. Englewood Cliffs, NJ: Prentice-Hall, 1987.

Space-Alternating Generalized Expectation-Maximization Algorithm Jeffrey A. Fessler, Member, IEEE, and Alfred O. Hero, Member, IEEE

Abstract- The expectation-maximization (EM) method can facilitate maximizing likelihood functions that arise in statistical estimation problems. In the classical EM paradigm, one iteratively maximizes the conditional log-likelihood of a single unobservable complete data space, rather than maximizing the intractable likelihood function for the measured or incomplete data. EM algorithms update all parameters simultaneously, which has two drawbacks: 1) slow convergence, and 2) difficult maximization steps due to coupling when smoothness penalties are used. This paper describes the space-alternating generalized EM (SAGE) method, which updates the parameters sequentially by alternating between several small hidden-data spaces defined by the algorithm designer. We prove that the sequence of estimates monotonically increases the penalized-likelihood objective, we derive asymptotic convergence rates, and we provide sufficient conditions for monotone convergence in norm. Two signal processing applications illustrate the method: estimation of superimposed signals in Gaussian noise, and image reconstruction from Poisson measurements. In both applications, our SAGE algorithms easily accommodate smoothness penalties and converge faster than the EM algorithms.

I. INTRODUCTION

In a variety of signal processing applications, direct calculations of maximum-likelihood (ML), maximum a posteriori (MAP), or maximum penalized-likelihood parameter estimates are intractable due to the complexity of the likelihood functions or to the coupling introduced by smoothness penalties or priors. EM algorithms and generalized EM (GEM) algorithms [1] have proven to be useful for iterative parameter estimation in many such contexts, e.g., [2] and [3]. In the classical formulation of an EM algorithm, one supplements the observed measurements, or incomplete data, with a single complete-data space whose relationship to the parameter space facilitates estimation. An EM algorithm iteratively alternates between an E-step, calculating the conditional expectation of the complete-data log-likelihood, and an M-step, simultaneously maximizing that expectation with respect to all of the unknown parameters. EM algorithms are most useful when the M-step is easier than maximizing the original likelihood.

The simultaneous update used by a classical EM algorithm necessitates overly informative complete-data spaces, which in turn lead to slow convergence. In this paper we show improved convergence rates by updating the parameters sequentially in small groups.

The convergence rate of an EM algorithm is inversely related to the Fisher information of its complete-data space [1], and we have previously shown that less informative complete-data spaces lead to improved asymptotic convergence rates [4]-[6]. Less informative complete-data spaces can also lead to larger step sizes and greater likelihood increases in the early iterations [5]-[7]. Since the relationship between complete-data space information and convergence is therefore more than just an asymptotic phenomenon, we believe that one should strive to minimize the information of the complete-data space. However, in the classical EM formulation a less informative complete-data space can lead to an intractable maximization step [1], [5], due to the simultaneous update employed by EM algorithms. (As an example, the least-informative admissible "complete" data space would be the measurement space itself!)

To circumvent this tradeoff between convergence rate and complexity, in this paper we extend the concepts in [4] and [6] by proposing a new space-alternating generalized EM (SAGE) method. This method is suited to problems where one can sequentially update small groups of the elements of the parameter vector. Rather than using one large complete-data space, we associate with each group of parameters a hidden-data space (Definition 2 in Section II), which would be a complete-data space in the sense of [1] if the other parameters were known. We define a flexible admissibility criterion that ensures that the algorithm monotonically increases the penalized-likelihood objective. In the examples we describe here and in [8], one can design the hidden-data space for each parameter subset to be considerably less informative than the natural single complete-data space. This reduction leads to faster convergence.

Convergence rate is one of two motivations for the SAGE method. In applications such as tomographic imaging and image restoration, where the parameter space is very large, it is often necessary or desirable to regularize using smoothness penalties. Such penalties usually introduce couplings that render intractable the maximization steps of classical EM methods [9]. Several approaches to this problem have been proposed, many motivated by emission tomography, including GEM algorithms [10]-[12], linearizations of the penalty function [9], line searches [13], applying ad hoc smoothing in lieu of a smoothness penalty [14], red-black orderings [15], and majorization of the penalty functional [16], [17]. These methods are all rooted in the classical EM method, and often they share its slow convergence. In contrast, by using a separate hidden-data space for each parameter, a SAGE algorithm intrinsically decouples the parameter updates. Surprisingly, not only is the maximization simplified, but the convergence rate is improved as well. Two related approaches that also decouple the update are the hybrid ICM-EM algorithm of Abdalla and Kay [18] and the coordinate-wise Newton-Raphson method of Bouman and Sauer [19], [20].

A variety of methods have been proposed for accelerating EM algorithms, most of which are based on standard numerical tools such as Aitken's acceleration [21], overrelaxation [22], line searches [23], Newton methods [24], [19], and conjugate gradients [23], [25]. These methods, although often effective, do not guarantee monotone increases in the objective unless one explicitly computes the objective function. The SAGE method is based fundamentally on statistical considerations, and monotonicity is guaranteed. The relative importance of monotonicity and convergence rate will of course be application-dependent.

When the EM algorithm was first introduced, discussants questioned the term "algorithm" since the general method does not prescribe specific computational steps for particular applications [1]. The SAGE method is similarly general, if not more so! Therefore, we devote much of this paper to a detailed comparison of SAGE and classical EM for two signal processing applications: estimation of superimposed signals in Gaussian noise, and image reconstruction from Poisson measurements. We have simplified the examples for the purposes of illustration, while hopefully retaining sufficient complexity that the reader will gain insight into how to apply SAGE to other problems.

The organization of this paper is as follows. Section II defines the generalized concept of "hidden-data space," describes the general form of the SAGE algorithm, and establishes monotonicity in objective. Sections III and IV describe the applications. Appendix A discusses convergence of the algorithm and a region of monotone convergence in norm. Appendix B establishes that the region of monotone convergence is nonempty for suitably regular problems. Appendix C examines the relationship between convergence rate and Fisher information of the hidden-data spaces.

Manuscript received May 28, 1993; revised February 4, 1994. This work was supported by a DOE Alexander Hollaender Postdoctoral Fellowship, DOE Grant DE-FG02-87ER60561; NSF Grant BCS-9024370; and NCI Grants CA54362-02 and CA-60711-01. The associate editor coordinating the review of this paper and approving it for publication was Prof. Stanley J. Reeves. J. Fessler is with the Division of Nuclear Medicine, University of Michigan, Ann Arbor, MI 48109-0552 USA. A. O. Hero is with the Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109 USA. IEEE Log Number 9403729.

Reprinted from IEEE Transactions on Signal Processing, Vol. 42, No. 10, pp. 2664-2675, October 1994.

II. THE SAGE ALGORITHM

A. Problem

Let the observation $Y$ have the probability density function$^1$ $f(y; \theta_{\mathrm{true}})$, where $\theta_{\mathrm{true}}$ is a parameter vector residing in a subset $\Theta$ of the $p$-dimensional space $\mathbb{R}^p$. Given a measurement realization $Y = y$, our goal is to compute the maximum penalized-likelihood estimate $\hat{\theta}$ of $\theta_{\mathrm{true}}$, defined by

$$\hat{\theta} \triangleq \arg\max_{\theta \in \Theta} \Phi(\theta), \quad \text{where} \quad \Phi(\theta) \triangleq \log f(y; \theta) - P(\theta). \tag{1}$$

$^1$ For simplicity, we restrict our description to continuous random variables. The method is easily extended to general distributions [4].

Unfortunately, direct maximization of $\Phi$ [...] $\Phi(\theta_S, \theta_{\tilde{S}})$ over $\theta_S$, even if the index set $S$ contains only one element. (Throughout, $\theta_S$ denotes the elements of $\theta$ indexed by the index set $S$, and $\tilde{S}$ denotes the complement of $S$.) One could apply numerical line-search methods, but these can be computationally demanding if evaluating [...] $\Phi(\cdot, \theta^i_{\tilde{S}})$ [...]. Just as for an EM algorithm, the functionals $\phi^S$ are constructed to ensure that increases in $\phi^S$ yield increases in $\Phi$. Furthermore, we have found empirically for tomography that by using a new hidden-data space whose Fisher information is small, the analytical maximum of $\phi^S(\cdot; \theta^i)$ increases $\Phi(\cdot, \theta^i_{\tilde{S}})$ [...] itself. This is formalized in Appendix C, where we prove that less informative hidden-data spaces lead to faster asymptotic convergence rates. In summary, the SAGE method uses the underlying statistical structure of the problem to replace cumbersome or expensive numerical maximizations with analytical or simpler maximizations.

B. Hidden-Data Space

To generate the functions $\phi^S$ for each index set $S$ of interest, we must identify an admissible hidden-data space defined in the following sense.

Definition 2: A random vector $X^S$ with probability density function $f(x; \theta)$ is an admissible hidden-data space with respect to $\theta_S$ for $f(y; \theta)$ if the joint density of $X^S$ and $Y$ satisfies

$$f(y, x; \theta) = f(y \mid x; \theta_{\tilde{S}})\, f(x; \theta), \tag{2}$$

i.e., the conditional distribution $f(y \mid x; \theta_{\tilde{S}})$ must be independent of $\theta_S$. In other words, $X^S$ must be a complete-data space (in the sense of [1]) for $\theta_S$ given that $\theta_{\tilde{S}}$ is known.

A few remarks may clarify this definition's relationship to related methods.

• The complete-data space for the classical EM algorithm of Dempster et al. [1] is contained as a special case of Definition 2 by choosing $S = \{1, \ldots, p\}$ and requiring $Y$ to be a deterministic function of $X^S$ [4].
• Under the decomposition (2), one can think of $Y$ as the output of a noisy channel that may depend on $\theta_{\tilde{S}}$ but not on $\theta_S$, as illustrated in Fig. 1.
• We use the term "hidden" rather than "complete" to describe $X^S$, since in general $X^S$ will not be complete for $\theta$ in the original sense of Dempster et al. [1]. Even the aggregate of $X^S$ over all of $S$ will not in general be an admissible complete-data space for $\theta$.
• The most significant generalization over the EM complete-data that is embodied by (2) is that the conditional distribution of $Y$ on $X^S$ is allowed to depend on all of the other parameters $\theta_{\tilde{S}}$ (Fig. 1). In the superimposed signal application described in Section IV, it is precisely this dependency that leads to improved convergence rates. It also allows significantly more flexibility in the design of the distribution of $X^S$.
• The cascade EM algorithm [28] is an alternative generalization based on a hierarchy of nested complete-data spaces. In principle, one could further generalize the SAGE method by allowing hierarchies for each $X^S$.

Fig. 1. Representing the observed data $Y$ as the output of a possibly noisy channel $C$ whose input is the hidden-data $X^S$.

C. Algorithm

An essential ingredient of any SAGE algorithm is the following conditional expectation of the log-likelihood of $X^S$:

$$Q^S(\theta_S; \bar{\theta}) = Q^S(\theta_S; \bar{\theta}_S, \bar{\theta}_{\tilde{S}}) \triangleq E\{\log f(X^S; \theta_S, \bar{\theta}_{\tilde{S}}) \mid Y = y; \bar{\theta}\} = \int f(x \mid Y = y; \bar{\theta}) \log f(x; \theta_S, \bar{\theta}_{\tilde{S}})\, dx. \tag{3}$$

We combine this expectation with the penalty function:

$$\phi^S(\theta_S; \bar{\theta}) \triangleq Q^S(\theta_S; \bar{\theta}) - P(\theta_S, \bar{\theta}_{\tilde{S}}). \tag{4}$$

Let $\theta^0 \in \Theta$ be an initial parameter estimate. A generic SAGE algorithm produces a sequence of estimates $\{\theta^i\}_{i=0}^{\infty}$ via the following recursion:

SAGE Algorithm
For $i = 0, 1, \ldots$ {
  1) Choose an index set $S = S^i$.
  2) Choose an admissible hidden-data space $X^{S^i}$ for $\theta_{S^i}$.
  3) E-step: Compute $\phi^{S^i}(\theta_{S^i}; \theta^i)$ using (4).
  4) M-step:
     $$\theta^{i+1}_{S^i} = \arg\max_{\theta_{S^i}} \phi^{S^i}(\theta_{S^i}; \theta^i), \tag{5}$$
     $$\theta^{i+1}_{\tilde{S}^i} = \theta^i_{\tilde{S}^i}. \tag{6}$$
  5) Optional:$^2$ Repeat steps 3 and 4.
},
where the maximization in (5) is over the set

$$\{\theta_{S^i} : (\theta_{S^i}, \theta^i_{\tilde{S}^i}) \in \Theta\}. \tag{7}$$

If one chooses the index sets and hidden-data spaces appropriately, then typically one can combine the E-step and M-step via an analytical maximization into a recursion of the form $\theta^{i+1}_{S^i} = g^{S^i}(\theta^i)$. The examples in later sections illustrate this important aspect of the SAGE method.

Note that if for some index set $S$ one chooses $X^S = Y$, then for that $S$ one sees from (3) and (4) that $\phi^S(\theta_S; \theta^i) = \Phi(\theta_S, \theta^i_{\tilde{S}})$. Thus, grouped coordinate-ascent [27] is a special case of the SAGE method, which one can use with index sets $S$ for which $\Phi(\theta_S, \theta^i_{\tilde{S}})$ is easily maximized.

Rather than requiring a strict maximization in (5), one could settle simply for local maxima [4], or for mere increases in $\phi^S$, in analogy with GEM algorithms [1]. These generalizations provide the opportunity to further refine the tradeoff between convergence rate and computation per iteration.

$^2$ Including the optional subiterations of the E- and M-steps yields a "greedier" algorithm. In the few examples we have tried in image reconstruction, the additional greediness was not beneficial. (This is consistent with the benefits of under-relaxation for coordinate-ascent analyzed in [29].) In other applications, however, such subiterations may improve the convergence rate, and may be computationally advantageous over line-search methods that require analogous subiterations applied directly to $\Phi$.

D. Choosing Index Sets

To implement a SAGE algorithm, one must choose a sequence of index sets $S^i$, $i = 0, 1, \ldots$. This choice is as much art as science, and will depend on the structure and relative complexities of the E- and M-steps for the problem. To illustrate the tradeoffs, we focus on imaging problems, for which there are at least four natural choices for the index sets: 1) the entire image, 2) individual pixels, i.e.,

$$S^i = \{1 + (i \ \mathrm{modulo}\ p)\} \tag{8}$$

(this was used in the ICM-EM algorithm of [18]), 3) grouping by rows or by columns, and 4) "red-black" type orderings. These four choices lead to different tradeoffs between convergence rate and ability to parallelize. A "red-black" grouping was used in a modified EM algorithm in [15] to address the M-step coupling introduced by the smoothness penalties. However, those authors recently concluded [16] that a new simultaneous-update algorithm by De Pierro [17] is preferable. Those methods use the same complete-data space as in the conventional EM algorithm for image reconstruction [3], so the convergence rate is still slow.

Since the E-step for image reconstruction naturally decomposes into $p$ separate calculations (one for each element of $\theta$), it is natural to update individual pixels (8). By using the less informative hidden-data spaces described in Section III, we show in [8] and [30] that the SAGE algorithm converges faster than the GEM algorithm of Hebert and Leahy [10], which in turn is faster than the new method of De Pierro [17]. Thus, for image reconstruction, it appears that (8) is best for serial computers.

As noted by the reviewers, for image restoration problems with spatially-invariant systems, one can compute the E-step of the conventional EM algorithm using fast Fourier transforms (FFT's). A SAGE algorithm with single-element index sets (8) would require direct convolutions. Depending on the width and spectrum of the point-spread function, the improved convergence rate of SAGE using (8) may be offset by the use of direct convolution. A compromise would be to group the pixels alternately by rows and by columns. This would allow the use of 1-D FFT's for the E-step, yet could still retain some of the improved convergence rate. Nevertheless, the SAGE method may be most beneficial in applications with spatially-variant responses.

Regardless of how one chooses the index sets, we have constructed $\phi^S$ to ensure that increases in $\phi^S$ lead to monotone increases in $\Phi$.
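The single-element index sets (8), combined with the special case $X^S = Y$ noted in Section II-C, reduce SAGE to cyclic coordinate ascent on $\Phi$. The following is a minimal sketch of that special case only, not of a statistical SAGE algorithm: the concave quadratic objective, the matrix `A`, the vector `b`, and the function names are hypothetical choices made so that each one-dimensional maximization has a closed form.

```python
def objective(A, b, theta):
    """Hypothetical concave objective Phi(theta) = b'theta - 0.5 * theta'A theta."""
    p = len(theta)
    quad = sum(theta[j] * A[j][k] * theta[k] for j in range(p) for k in range(p))
    lin = sum(b[k] * theta[k] for k in range(p))
    return lin - 0.5 * quad


def cyclic_coordinate_ascent(A, b, theta0, num_iters):
    """Update one coordinate per iteration, cycling through the index sets (8)."""
    theta = list(theta0)
    p = len(theta)
    history = [objective(A, b, theta)]
    for i in range(num_iters):
        k = i % p  # zero-based counterpart of S^i = {1 + (i modulo p)}
        coupling = sum(A[k][j] * theta[j] for j in range(p) if j != k)
        # Exact maximizer of Phi over theta_k with the other coordinates held fixed.
        theta[k] = (b[k] - coupling) / A[k][k]
        history.append(objective(A, b, theta))
    return theta, history


# Symmetric, diagonally dominant (hence positive definite) example system.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]

theta_hat, phi_history = cyclic_coordinate_ascent(A, b, [0.0, 0.0, 0.0], 60)
print("estimate:", [round(t, 6) for t in theta_hat])
print("objective, first and last:", phi_history[0], phi_history[-1])
```

Because every update is an exact maximization over one coordinate, the recorded objective values never decrease, which mirrors the monotonicity that the SAGE construction is designed to preserve when $\phi^S$ is maximized in place of $\Phi$.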
Regardless of how one chooses the index sets, we have constructed cps to ensure that increases in s lead to monotone increases in
inequality [1], one can easily show that

HS(Bs;B) ~ H S(Bs;8),

(lr+ 1 )

==

J

j(x I Y

(Os , 7} 5)

WS(B)

(9)

where

- ~(B)

(cPS

(os; 8)

- H S (71 5 ; 0)).

F. Convergence For a well-behaved objective <1>, the monotonicity property ensures that the sequence {f)i} will not diverge, but it does not guarantee convergence even to a local maximum of <1>. (Some EM algorithms have fixed points that are not local maxima [ 1J, [31].) Therefore, in the appendices we provide additional theorems that give sufficient conditions for convergence in norm, and that characterize the asymptotic convergence rate. To summarize briefly, these theorems show under suitable regularity conditions that:

• If a SAGE algorithm is initialized in a region suitably close to a local maximum in the interior of 8, then

the sequence of estimates will converge monotonically in norm to it. (This may not apply when the local maximum lies on the boundary of e, as often happens in the example in Section III.) • For suitably regular objectives, the region of monotone convergence in norm is guaranteed to be nonempty [43~. • The asymptotic convergence rate of a SAGE algorithm will be improved if one chooses a less informative hiddendata space.

This last point is subtle, but is perhaps one of the most important conclusions of our analyses since it emphasizes the need for careful design of the hidden-data spaces. Less informative hidden-data spaces yield faster convergence, but more informative hidden-data spaces may yield easier M -steps

[5], [8], [30].

(10)

W S (7J) £

((P) ~ ¢S((}~+l;(P) _ ¢S(()~;fP).

Thus, if ¢s (f) S; 0) ~ ¢s (lJ 5; 0), then ( ()5, (js) ~ <1>(0) using (11). The results then follow from the definition of the SAGE algorithm. 0 Standard numerical methods require evaluation of (B i + 1 ) - ((}i) to ensure monotonicity. That requirement is obviated for SAGE methods by the monotonicity theorem above.

H 5 ( Os; 0) ~ E {log f (-,y slY == y: f)5 , (j 5) I }T == y; (j} and due to (2)

-

== cPs (() 5; (1) - H S (f) 5; lJ) -

= y;7J) logj(x; 6s ,7J s )dx

ue., 8s ) + HS({}s; 8) -

(11)

Proof' From (4) and (9) it follows that

Let Sand X S respectively denote an index set and hidden data space used in a SAGE algorithm. Under mild regularity conditions [1]. [4], one can apply Bayes' theorem to (3) to see that

=

V8

from which the following theorem follows directly. Theorem 1: Let ()t denote the sequence of estimates generated by a SAGE algorithm (5). Then 1) <1>( ()i) is monotonically nondecreasing, 2) if {) maximizes , then {) is a fixed point of the SAGE algorithm, and 3)

E. Monotonicity

QS(Bs;B)

'riBs,

J

III. EXAMPLE 1 LINEAR POISSON MEASUREMENTS

j(x I Y = y;B)logj(y I x; Bs)dx.

Note that W S is independent of Os, so it does not affect the maximization (5). Using these definitions and Jensen's

275

The EM method has been used for over a decade to compute ML estimates of radionuclide distributions from tomographic data, such as that measured by positron emission tomography (PET) [3], [32]. In this section we present a brief review of the classical EM algorithm for this problem, and then introduce two SAGE algorithms. The second SAGE algorithm is based on a new hidden-data space, and converges faster than even an accelerated EM algorithm. For simplicity we focus in this paper on ML estimation; the penalized version is described in [8] and [30].

Assume that a radionuclide distribution can be discretized into $p$ pixels with emission rates $\lambda = [\lambda_1, \ldots, \lambda_p]'$. Assume that the emission source is viewed by $N$ detectors, and let $N_{nk}$ denote the number of emissions from the $k$th pixel that are detected by the $n$th detector. Assume the variates $N_{nk}$ have independent Poisson distributions:

$$N_{nk} \sim \mathrm{Poisson}\{a_{nk}\lambda_k\}, \tag{12}$$

where the $a_{nk}$ are nonnegative constants that characterize the system [3]. The detectors record emissions from several source locations, so at best one would observe only the sums $\sum_{k=1}^{p} N_{nk}$, rather than each $N_{nk}$. Background emissions, random coincidences, and scatter contaminate the measurements, so we observe

$$Y_n = \sum_{k=1}^{p} N_{nk} + R_n, \quad n = 1, \ldots, N, \tag{13}$$

where $\{R_n\}$ are independent Poisson variates,

$$R_n \sim \mathrm{Poisson}\{r_n\},$$

with means $\{r_n\}$ assumed known for simplicity. Thus

$$Y_n \sim \mathrm{Poisson}\Big\{\sum_{k=1}^{p} a_{nk}\lambda_k + r_n\Big\}. \tag{14}$$

Given realizations $\{y_n\}$ of $\{Y_n\}$, the log-likelihood for this problem (neglecting constants independent of $\lambda$) is given by [3]:

$$\log f(y; \lambda) = \sum_{n=1}^{N} \left(-\bar{y}_n(\lambda) + y_n \log \bar{y}_n(\lambda)\right),$$

where

$$\bar{y}_n(\lambda) = \sum_{k=1}^{p} a_{nk}\lambda_k + r_n.$$

We would like to compute the ML estimate $\hat{\lambda}$ from $y$. To apply coordinate ascent directly to this likelihood, one might try to update $\lambda_k$ by equating the derivative of the likelihood to zero:

$$0 = -a_{\cdot k} + \sum_{n=1}^{N} a_{nk} \frac{y_n}{a_{nk}(\lambda_k - \lambda_k^i) + \bar{y}_n(\lambda^i)}, \tag{15}$$

where $a_{\cdot k} = \sum_{n=1}^{N} a_{nk}$. Unfortunately, this equation has no analytical solution. A line-search method would require multiple evaluations of (15), which would be expensive; hence the popularity of EM-type algorithms [3] that require no line searches.

The complete-data space for the classical EM algorithm [3] for this problem is the set of unobservable random variates

$$X^1 = \left\{\{N_{nk}\}_{k=1}^{p}, \{R_n\}\right\}_{n=1}^{N}. \tag{16}$$

For this complete-data space, the $Q$ function (3) becomes (see (4) of [3])

$$Q^1(\lambda; \lambda^i) = \sum_{n=1}^{N} \sum_{k=1}^{p} \left(-a_{nk}\lambda_k + \bar{N}_{nk} \log(a_{nk}\lambda_k)\right),$$

where [3]

$$\bar{N}_{nk} = E\{N_{nk} \mid Y = y; \lambda^i\} = \lambda_k^i a_{nk} y_n / \bar{y}_n(\lambda^i).$$

Maximizing $Q^1(\cdot; \lambda^i)$ analytically leads to the algorithm which follows.

ML-EM Algorithm for Poisson Data
for $i = 0, 1, \ldots$ {
    $\bar{y}_n := \sum_{k=1}^{p} a_{nk}\lambda_k^i + r_n$, $n = 1, \ldots, N$
    for $k = 1, \ldots, p$ {
        $e_k = \sum_{n=1}^{N} a_{nk} y_n / \bar{y}_n$
        $\lambda_k^{i+1} = \lambda_k^i e_k / a_{\cdot k}$    (17)
    }
}.

In words, the previous parameter estimate is used to compute predicted measurements, those predictions are divided into the measurements and backprojected to form multiplicative correction factors, and the estimates are simultaneously updated using those correction factors. This EM algorithm converges globally [3], [5], but slowly. The root-convergence factor is very close to 1 (even if $p = 1$ [5]), since the complete-data space is considerably more informative than the measurements [5], [8], [30].

We now derive two SAGE algorithms for this problem, both of which use individual pixels for the index sets: $S^i = \{k\}$, where $k = 1 + (i \ \mathrm{modulo}\ p)$. The most obvious hidden-data for $\lambda_k$ is just

$$X^{S_k} = \{N_{nk}\}_{n=1}^{N},$$

which is a subset of the classical complete-data space (16). The $Q^{S_k}$ function for the $k$th parameter is:

$$Q^{S_k}(\lambda_k; \lambda^i) = \sum_{n=1}^{N} \left(-a_{nk}\lambda_k + \bar{N}_{nk} \log(a_{nk}\lambda_k)\right).$$

Maximizing $Q^{S_k}(\cdot; \lambda^i)$ analytically yields the following algorithm:

ML-SAGE-1 Algorithm for Poisson Data
Initialize: $\bar{y}_n = \sum_{k=1}^{p} a_{nk}\lambda_k^0 + r_n$,  n = 1, ..., N.
for i = 0, 1, ... {
    for k = 1, ..., p {
        $e_k = \sum_{n=1}^{N} a_{nk}\, y_n / \bar{y}_n$
        $\lambda_k^{i+1} = \lambda_k^i\, e_k / a_{\cdot k}$    (18)
        $\bar{y}_n := \bar{y}_n + (\lambda_k^{i+1} - \lambda_k^i)\, a_{nk}$,  $\forall n : a_{nk} \neq 0$
    }
}
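The two Poisson algorithms can be compared on a toy problem. The following is a minimal NumPy sketch, not the authors' implementation: the 3-detector, 2-pixel system `A`, the background means `r`, the counts `y`, and all function names are hypothetical choices made for illustration. It only shows the structural difference between the simultaneous ML-EM update and the sequential ML-SAGE-1 update with in-loop refresh of the predicted measurements.

```python
import numpy as np

def log_likelihood(lam, A, r, y):
    """Poisson log-likelihood, up to a constant independent of lam."""
    ybar = A @ lam + r
    return float(np.sum(-ybar + y * np.log(ybar)))

def ml_em_step(lam, A, r, y):
    """Simultaneous ML-EM update of every pixel."""
    ybar = A @ lam + r                  # predicted measurements
    e = A.T @ (y / ybar)                # backprojected ratios e_k
    return lam * e / A.sum(axis=0)      # multiplicative correction

def ml_sage1_cycle(lam, A, r, y):
    """One pass over k = 1..p, refreshing ybar inside the loop."""
    lam = lam.copy()
    ybar = A @ lam + r
    a_dot = A.sum(axis=0)               # a_{.k} = sum_n a_nk
    for k in range(A.shape[1]):
        e_k = A[:, k] @ (y / ybar)
        new = lam[k] * e_k / a_dot[k]   # update (18)
        ybar += (new - lam[k]) * A[:, k]
        lam[k] = new
    return lam

# Hypothetical toy system (assumption, not data from the paper).
A = np.array([[1.0, 0.5], [0.5, 1.0], [1.0, 1.0]])
r = np.array([0.5, 0.5, 0.5])
y = np.array([3.0, 2.0, 4.0])

lam_em, lam_sage = np.ones(2), np.ones(2)
ll_em = [log_likelihood(lam_em, A, r, y)]
ll_sage = [log_likelihood(lam_sage, A, r, y)]
for _ in range(50):
    lam_em = ml_em_step(lam_em, A, r, y)
    lam_sage = ml_sage1_cycle(lam_sage, A, r, y)
    ll_em.append(log_likelihood(lam_em, A, r, y))
    ll_sage.append(log_likelihood(lam_sage, A, r, y))
```

Both likelihood sequences should be nondecreasing, which is the monotonicity property the text attributes to EM and SAGE; how much faster one is than the other depends on the conditioning of the toy system.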

This SAGE algorithm updates the parameters sequentially, and immediately updates the predicted measurements $\bar{y}_n$ within the inner loop, whereas the ML-EM algorithm waits until all parameters have been updated. ML-SAGE-1 is the unregularized special case of the ICM-EM algorithm of [18]; a local convergence result for ICM-EM was mentioned in [18]. We found that ML-SAGE-1 converges somewhat faster than ML-EM for well-conditioned problems, but the difference is minimal for poorly-conditioned problems. The reason is that $X^{S_k}$ is still overly informative since the background events are isolated from the parameter being updated (cf. (12) and (13)) [8], [30]. Therefore, we now introduce a new, less informative hidden-data space that associates some of the uncertainty of the background events $R_n$ with the particular parameter $\lambda_k$ as it is updated [8], [30]. Whereas the ordinary complete-data space has some intuitive relationship with the underlying image formation physics, this new hidden-data space was developed from a statistical perspective on the problem and its Fisher information. First, define

$$ z_k = \min_{n \,:\, a_{nk} \neq 0} \{ r_n / a_{nk} \} $$

and define unobservable independent Poisson variates

$$ Z_{nk} \sim \text{Poisson}\{ a_{nk}(\lambda_k + z_k) \}, \qquad B_{nk} \sim \text{Poisson}\Big\{ r_n - a_{nk} z_k + \sum_{j \neq k} a_{nj}\lambda_j \Big\} \tag{19} $$

Fig. 2. Comparison of log-likelihood increase $\log f(y; \theta^i) - \log f(y; \theta^0)$ versus iteration $i$ for ML-EM, ML-LINU, and ML-SAGE-2 algorithms, for image reconstruction from PET measurements with 9% random coincidences. ML-SAGE-2 clearly reaches the asymptote sooner.

and let the hidden-data space for $\lambda_k$ only be

$$ X^{S_k} = \{ Z_{nk}, B_{nk} \}_{n=1}^{N}. $$

Then, clearly

$$ Y_n = Z_{nk} + B_{nk} $$

has the appropriate distribution (14) for any particular k. We have absorbed all of the background events into the terms $Z_{nk}$ and $B_{nk}$, which are associated with $\lambda_k$. Thus, the aggregate of all p of the hidden-data spaces is not an admissible hidden-data space for the entire parameter vector $\lambda$. Using a similar derivation as in [3] (see [8] and [30] for details), one can show

$$ Q^{S_k}(\lambda_k; \lambda^i) = \sum_{n=1}^{N} \big( -a_{nk}(\lambda_k + z_k) + \bar{Z}_{nk} \log(a_{nk}(\lambda_k + z_k)) \big) $$

where

$$ \bar{Z}_{nk} = E\{ Z_{nk} \mid Y = y; \lambda^i \} = (\lambda_k^i + z_k)\, a_{nk}\, y_n / \bar{y}_n(\lambda^i). $$

Maximizing $Q^{S_k}(\cdot; \lambda^i)$ analytically (subject to the nonnegativity constraint) yields the ML-SAGE-2 algorithm, which has the same sequential structure as ML-SAGE-1, except that (18) is replaced by:

$$ \lambda_k^{i+1} := \max\{ (\lambda_k^i + z_k)\, e_k / a_{\cdot k} - z_k,\ 0 \}. $$

Provided $z_k \neq 0$, which is always the case in PET since random coincidences are pervasive, this remarkably small modification yields significant improvements in convergence rate. The Fisher information for the classical complete-data space with respect to $\lambda$ is diagonal with entries

$$ a_{\cdot k} / \hat{\lambda}_k $$

provided the ML estimate $\hat{\lambda}$ is positive. In contrast, the Fisher information for the new hidden-data space is diagonal with entries

$$ a_{\cdot k} / (\hat{\lambda}_k + z_k) $$

which is clearly smaller since $z_k > 0$. The improved convergence rate of ML-SAGE-2 is closely related to this difference. To illustrate, Fig. 2 displays the likelihood $\Phi(\theta^i)$ versus iteration for the ML-EM algorithm and for ML-SAGE-2 applied to a simulation of PET data. The image was an 80 x 110 discretization of a central slice of the digital 3-D Hoffman brain phantom (2 mm pixel size). The sinogram size was 70 radial bins (3 mm wide) by 100 angles. A 900000-count noisy projection was generated using (6-mm-wide) strip-integrals for $\{a_{nk}\}$ [29], including the effects of nonuniform head attenuation and nonuniform detector efficiency. A uniform field of random coincidences was added, reflecting a scan with 9% of the total counts due to randoms (i.e., $\sum_{n=1}^{N} r_n \approx 0.1 \sum_{n=1}^{N} \bar{y}_n(\lambda)$), a typical fraction for a PET study. Further details can be found in [8] and [30], including comparisons over a large range of $r_n$'s. Also shown in Fig. 2 is the LINU unbounded line-search acceleration algorithm described by Kaufman [23]. The ML-SAGE-2 likelihood clearly increases faster and reaches its asymptote sooner than both the ML-EM and ML-LINU algorithms.³ (ML-SAGE-2 was also considerably easier to implement than the bent-line LINU method.)

The convergence in norm given by Theorem 3 of Appendix A is inapplicable to this Poisson example when the ML estimate has components that are zero, i.e., when the ML estimate lies on the boundary of the nonnegative orthant [33]. See [30] for a global convergence proof for ML-SAGE-1 and ML-SAGE-2 similar to the proofs in [3] and [17].

The reader may wonder whether one can also find a better complete-data space for the classical EM algorithm. Because the EM update is simultaneous, one must distribute the background events among all pixels; therefore, the terms $z_k$ are reduced by a factor of roughly $\sqrt{p}$ [8], [30]. Since $\sqrt{p}$ is in the hundreds, the change in convergence rate is insignificant, which is consistent with the small reduction in Fisher information [8], [30]. Other simultaneous updates [17] similarly do not improve much [30]. Apparently one benefits most from this less informative hidden-data space by using a SAGE method with the parameters grouped into many small index sets.

An alternative to SAGE is the coordinate-wise sequential Newton-Raphson updates recently proposed by Bouman and Sauer [19]. That method is not guaranteed to be monotonic, but when it converges it might do so somewhat faster than SAGE since it is even greedier. One can obtain similar (but monotonic) greediness by using multiple subiterations of the E- and M-steps in the SAGE algorithm, as indicated by Step 5 of the generic SAGE algorithm. However, for the few cases we have tested, we have not observed any improvement in convergence rates using multiple subiterations. Although further investigation of the tradeoffs available is needed, including comparisons with possibly superlinear methods such as preconditioned conjugate gradient [23], [34], it appears that the statistical perspective inherent to the SAGE method is a useful addition to conventional numerical tools.
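The ML-SAGE-2 update described above differs from ML-SAGE-1 only in the shift by $z_k$ and the clipping at zero. The sketch below is a hedged illustration under stated assumptions: $z_k$ is taken to be the largest shift that keeps every Poisson mean in (19) nonnegative, namely the minimum of $r_n / a_{nk}$ over detectors with $a_{nk} \neq 0$, and the toy system and all names are hypothetical.

```python
import numpy as np

def log_likelihood(lam, A, r, y):
    """Poisson log-likelihood, up to a constant independent of lam."""
    ybar = A @ lam + r
    return float(np.sum(-ybar + y * np.log(ybar)))

def background_shifts(A, r):
    """z_k = min over n with a_nk != 0 of r_n / a_nk (assumed form)."""
    z = np.empty(A.shape[1])
    for k in range(A.shape[1]):
        nz = A[:, k] > 0
        z[k] = np.min(r[nz] / A[nz, k])
    return z

def ml_sage2_cycle(lam, A, r, y, z):
    """One pass over k = 1..p of the shifted, clipped update."""
    lam = lam.copy()
    ybar = A @ lam + r
    a_dot = A.sum(axis=0)
    for k in range(A.shape[1]):
        e_k = A[:, k] @ (y / ybar)
        new = max((lam[k] + z[k]) * e_k / a_dot[k] - z[k], 0.0)
        ybar += (new - lam[k]) * A[:, k]    # refresh predictions at once
        lam[k] = new
    return lam

# Hypothetical toy system (assumption, not data from the paper).
A = np.array([[1.0, 0.5], [0.5, 1.0], [1.0, 1.0]])
r = np.array([0.5, 0.4, 0.6])
y = np.array([3.0, 2.0, 4.0])

z = background_shifts(A, r)
lam = np.ones(2)
ll = [log_likelihood(lam, A, r, y)]
for _ in range(50):
    lam = ml_sage2_cycle(lam, A, r, y, z)
    ll.append(log_likelihood(lam, A, r, y))
```

Setting `z` to zeros reduces this loop to ML-SAGE-1, which makes the "remarkably small modification" noted in the text easy to see in code.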

³ Fast convergence is clearly desirable for regularized objective functions, but we advise caution when using "stopping rules" in conjunction with coordinate-based algorithms for the unregularized case, since for such algorithms the high spatial frequencies converge faster than the low frequencies [26].

IV. EXAMPLE 2: LINEAR GAUSSIAN MEASUREMENTS

The Poisson problem has important practical applications, but the nonlinearity of the algorithms complicates a formal analysis of the convergence rates. In this section, we analyze the problem of estimating superimposed linear signals in Gaussian noise [2], [9]

$$ Y = A\theta + \varepsilon \tag{20} $$

where $A = [a_1 \ldots a_p]$, and $\varepsilon$ is additive zero-mean Gaussian noise with covariance $\Pi$, i.e., $\varepsilon \sim N(0, \Pi)$. For simplicity we consider a quadratic penalty $P(\theta) = \frac{1}{2}\theta' P \theta$, so the penalized-likelihood objective function is

$$ -\Phi(\theta) = \tfrac{1}{2}(y - A\theta)'\Pi^{-1}(y - A\theta) + \tfrac{1}{2}\theta' P \theta. $$

Such objective functions arise in many inverse problems [9]. We assume $A$ has full column rank, $P$ is symmetric nonnegative definite, and the null spaces of $P$ and $A$ intersect only at zero, in which case the (unique) penalized-likelihood estimate is

$$ \hat{\theta} = (A'\Pi^{-1}A + P)^{-1} A'\Pi^{-1} y. \tag{21} $$

If $A$ is large, or if positivity constraints on $\theta$ are appropriate, then (21) is impractical and iterative methods may be useful. (One can also think of (20) as a linearization of the more interesting nonlinear problem [2].) We present the linear version here since we can derive exact expressions for the convergence rates. We first present admissible hidden-data spaces for this problem, derive EM and SAGE algorithms, and then prove that the SAGE algorithm converges faster.

Since the mean of $Y$ is linear in $\theta$, the conventional complete-data [2], [9] for the EM algorithm for this problem is also linear in $\theta$. Here, we restrict our attention to hidden-data spaces $X^S$ whose means are also linear in $\theta$, and for which the conditional mean of $Y$ given $X^S$ is linear in $X^S$ and $\theta_{\tilde{S}}$. Considering a general index set $S$, the natural hidden-data space for $\theta_S$ is

$$ X^S \sim N(B\theta_S + \tilde{B}\theta_{\tilde{S}},\ C), \qquad Y \mid X^S = x \sim N(Gx + \tilde{G}\theta_{\tilde{S}},\ W) $$

which is admissible provided the two normal distributions are independent and consistent with (20), i.e., $A_S = GB$, $A_{\tilde{S}} = G\tilde{B} + \tilde{G}$, and $\Pi = W + GCG'$. The log-likelihood for $X^S$ is given by

$$ \log f(x; \theta) = -\tfrac{1}{2}(x - B\theta_S - \tilde{B}\theta_{\tilde{S}})' C^{-1} (x - B\theta_S - \tilde{B}\theta_{\tilde{S}}) + c_1 = (B\theta_S + \tilde{B}\theta_{\tilde{S}})' C^{-1} \big( x - \tfrac{1}{2}(B\theta_S + \tilde{B}\theta_{\tilde{S}}) \big) + c_2 $$

where $c_1$ and $c_2$ are independent of $\theta_S$. By standard properties of joint normal distributions

$$ \bar{X}^S = E\{ X^S \mid Y = y; \theta^i \} = B\theta^i_S + \tilde{B}\theta^i_{\tilde{S}} + CG'\Pi^{-1}(y - A\theta^i). $$

The $\phi^S$ function of (4) is thus

$$ \phi^S(\theta_S; \theta^i) = (B\theta_S + \tilde{B}\theta^i_{\tilde{S}})' C^{-1} \big( \bar{X}^S - \tfrac{1}{2}(B\theta_S + \tilde{B}\theta^i_{\tilde{S}}) \big) - \tfrac{1}{2} \begin{bmatrix} \theta_S \\ \theta^i_{\tilde{S}} \end{bmatrix}' \begin{bmatrix} P_1 & P_2 \\ P_2' & P_3 \end{bmatrix} \begin{bmatrix} \theta_S \\ \theta^i_{\tilde{S}} \end{bmatrix} $$

which, maximized over $\theta_S$, yields the generic combined E- and M-step

$$ \theta_S^{i+1} = (F_{X^S} + P_1)^{-1} \big[ B'C^{-1}(\bar{X}^S - \tilde{B}\theta^i_{\tilde{S}}) - P_2\theta^i_{\tilde{S}} \big] = \theta^i_S + (F_{X^S} + P_1)^{-1} A_S'\Pi^{-1}[y - A\theta^i] - (F_{X^S} + P_1)^{-1} [P_1 \ P_2]\,\theta^i \tag{22} $$

where $F_{X^S} = B'C^{-1}B$ is the Fisher information of $X^S$ for $\theta_S$.

A. EM Algorithm

The ordinary EM algorithm [2], [9] is based on the following choices for the complete-data space: $S = \{1, \ldots, p\}$, $B = \text{diag}\{a_k\}$, $C = \frac{1}{p} I_p \otimes \Pi$, $G = 1_p' \otimes I$, and $\tilde{B} = \tilde{G} = W = 0$, where $\text{diag}\{\cdot\}$ denotes a diagonal or block-diagonal matrix appropriately formed, $1_p$ denotes the $p$ vector of ones, and $\otimes$ is the Kronecker matrix product. Note that these choices distribute a fraction $1/p$ of the noise covariance $\Pi$ to each signal vector $a_k$. Thus, $F_X = p\, \text{diag}\{a_k'\Pi^{-1}a_k\}$, which being a diagonal matrix is easily inverted. However, since $S = \{1, \ldots, p\}$, the penalized EM algorithm (22) requires inversion of $F_X + P$, which could be just as difficult as inverting $A'\Pi^{-1}A + P$ for a general $P$. Therefore, we consider the case where $P = \text{diag}\{P_{kk}\}$, for which the EM algorithm simultaneously updates all parameters via

$$ \theta_k^{i+1} = \theta_k^i + \tfrac{1}{p}\,(a_k'\Pi^{-1}a_k + P_{kk}/p)^{-1} a_k'\Pi^{-1}(y - A\theta^i) - \tfrac{1}{p}\,(a_k'\Pi^{-1}a_k + P_{kk}/p)^{-1} P_{kk}\,\theta_k^i \tag{23} $$

for $k = 1, \ldots, p$.

B. SAGE Algorithm

Because of the additive form of (20), it is natural for the SAGE algorithm to update each parameter $\theta_k$ individually, i.e., $S^i = \{k\}$ where $k = 1 + (i \text{ modulo } p)$. In light of the discussion in Appendix C, we would like the Fisher information of the hidden-data space for $\theta_k$ to be small, so we associate all of the noise covariance with the signal vector $a_k$:

$$ X^{S_k} \sim N(a_k\theta_k,\ \Pi), \qquad Y = X^{S_k} + \sum_{j \neq k} a_j\theta_j. $$

Thus, $F_{X^{S_k}} = a_k'\Pi^{-1}a_k$, which is $p$ times less informative than the EM case, which associates only a fraction $1/p$ of the noise covariance with each signal. (This provides a statistical interpretation of the modified EM algorithms in [35] and [36].) The above choice for the hidden-data space corresponds to $B = a_k$, $C = \Pi$, $\tilde{B} = W = 0$, $G = I$, and $\tilde{G} = [a_1 \ldots a_{k-1}\ a_{k+1} \ldots a_p]$, which, substituted into (22), yields the following algorithm.

SAGE Algorithm for Superimposed Signals
Initialize: $\hat{\varepsilon} = y - A\theta^0$
for i = 0, 1, ... {
    $k = 1 + (i \text{ modulo } p)$,  $S = \{k\}$,
    $\theta_k^{i+1} := \theta_k^i + (a_k'\Pi^{-1}a_k + P_{kk})^{-1} a_k'\Pi^{-1}\hat{\varepsilon} - (a_k'\Pi^{-1}a_k + P_{kk})^{-1} P_k\,\theta^i$,
    $\theta_j^{i+1} = \theta_j^i$,  $j = 1, \ldots, k-1, k+1, \ldots, p$,
    $\hat{\varepsilon} := \hat{\varepsilon} + (\theta_k^i - \theta_k^{i+1})\, a_k$
}

where $P_{kk}$ is the kth diagonal entry of $P$, and $P_k$ is the kth row of $P$. Note that unlike the EM algorithm, the SAGE algorithm circumvents the need to invert $P$ by performing a sequential update, so a nondiagonal smoothness penalty $P$ is entirely feasible.

C. Convergence

To establish convergence of the EM and SAGE algorithms, we use Definition 3 and Theorem 3 of Appendix A. A few definitions are needed. Let $H = A'\Pi^{-1}A + P$ be the Hessian for this problem, and decompose it by

$$ H = L_H + D_H + L_H' \tag{24} $$

where $D_H$ is a diagonal matrix with the diagonal entries of $H$, and $L_H$ is a strictly lower triangular matrix. Similarly, let

$$ F = A'\Pi^{-1}A = L_F + D_F + L_F', \qquad D_H = D_F + D_P $$

where $D_P = \text{diag}\{P_{kk}\}$ and $F$ is the Fisher information for $Y$ with respect to $\theta$. Let $\|x\|$ denote the standard Euclidean norm of a vector $x$, and for a nonsingular matrix $T$ define $\|x\|_T = \|Tx\|$, which induces the matrix norm

$$ \|A\|_T = \max_x \frac{\|Ax\|_T}{\|x\|_T} = \|TAT^{-1}\|. $$

In addition, let $\rho(A)$ denote the matrix spectral radius, the maximum magnitude eigenvalue of $A$.

SAGE Algorithm: From the SAGE algorithm given above, one can show (cf. proof of Theorem 3) that

$$ \theta^{(i+1)p} - \hat{\theta} = M_p \cdots M_1 \,(\theta^{ip} - \hat{\theta}) \tag{25} $$

where

$$ M_k = I - e_k H_{kk}^{-1} e_k' H = H^{-1/2}\big( I - H^{1/2} e_k (H_{kk})^{-1} e_k' H^{1/2} \big) H^{1/2} = T^{-1}\big( I - t_k (t_k' t_k)^{-1} t_k' \big) T = T^{-1} P_k^{\perp} T $$

and where $T = H^{1/2}$, the kth column of $T$ is $t_k$, $P_k^{\perp}$ is the orthogonal projection onto the orthogonal complement of $t_k$, and $e_k$ is the kth unit vector of length $p$. Since an orthogonal projection is nonexpansive, $\|M_k\|_T \leq 1$, which confirms condition 2 of Definition 3. To confirm condition 3, rewrite the SAGE algorithm using (24) as

$$ \theta^{(i+1)p} - \hat{\theta} = [I - (D_H + L_H)^{-1} H]\,(\theta^{ip} - \hat{\theta}) $$

which is the Gauss-Seidel iteration (see p. 72 of [37]). Condition 3 follows from p. 109 of [37] since

$$ \|I - (D_H + L_H)^{-1} H\|_T = \|M_p \cdots M_1\|_T < 1. $$
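The EM update (23) and the SAGE coordinate update can be exercised on a small linear problem. This is a minimal NumPy sketch under stated assumptions: the 5 x 3 matrix `A`, the identity noise covariance, the diagonal penalty, the data `y`, and all function names are hypothetical. It checks that both iterations approach the closed-form estimate (21) and evaluates the spectral radii of the two iteration matrices.

```python
import numpy as np

def closed_form(A, Pi_inv, P, y):
    """Penalized-likelihood estimate (21)."""
    H = A.T @ Pi_inv @ A + P
    return np.linalg.solve(H, A.T @ Pi_inv @ y)

def sage_cycle(theta, A, Pi_inv, P, y):
    """One full cycle (p coordinate updates) with a running residual."""
    theta = theta.copy()
    resid = y - A @ theta
    for k in range(A.shape[1]):
        a_k = A[:, k]
        denom = a_k @ Pi_inv @ a_k + P[k, k]
        new = theta[k] + (a_k @ Pi_inv @ resid - P[k] @ theta) / denom
        resid += (theta[k] - new) * a_k      # keep residual in sync
        theta[k] = new
    return theta

def em_step(theta, A, Pi_inv, P, y):
    """One simultaneous EM update (23); P is assumed diagonal."""
    p = A.shape[1]
    d_F = np.einsum('nk,nm,mk->k', A, Pi_inv, A)   # a_k' Pi^{-1} a_k
    grad = A.T @ Pi_inv @ (y - A @ theta) - np.diag(P) * theta
    return theta + grad / (p * d_F + np.diag(P))

def spectral_radius(M):
    return float(np.max(np.abs(np.linalg.eigvals(M))))

# Hypothetical test problem (assumption, not from the paper).
A = np.array([[1.0, 0, 0], [0, 1.0, 0], [0, 0, 1.0],
              [1.0, 1.0, 0], [0, 1.0, 1.0]])
Pi_inv = np.eye(5)
P = 0.1 * np.eye(3)
y = np.array([1.0, 2.0, 3.0, 2.5, 4.0])

theta_hat = closed_form(A, Pi_inv, P, y)
theta_sage = np.zeros(3)
theta_em = np.zeros(3)
for _ in range(100):
    theta_sage = sage_cycle(theta_sage, A, Pi_inv, P, y)
for _ in range(300):
    theta_em = em_step(theta_em, A, Pi_inv, P, y)

H = A.T @ Pi_inv @ A + P
D_H, L_H = np.diag(np.diag(H)), np.tril(H, k=-1)
D_F = np.diag(np.diag(A.T @ Pi_inv @ A))
I3 = np.eye(3)
rho_sage = spectral_radius(I3 - np.linalg.solve(D_H + L_H, H))
rho_em = spectral_radius(I3 - np.linalg.solve(3 * D_F + P, H))
```

Because the coordinate update only touches one column of `A` and the running residual, a full SAGE cycle costs about the same as one EM step here; the difference is in how many cycles are needed.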

EM Algorithm: One can use (21) and (23) to show that

$$ \theta^{i+1} - \hat{\theta} = M\,(\theta^i - \hat{\theta}) $$

for the EM algorithm, where (cf. (37))

$$ M = I - (pD_F + P)^{-1} H. $$

Thus, the EM algorithm is closely related to the simultaneous overrelaxation (JOR) iteration (p. 72 of [37]). To establish that $\|M\|_T < 1$ for $T = H^{1/2}$ using Theorem 4, we must show that $S + S' > H$, where in this case $S = pD_F + P$. Since $H = L_F + D_F + L_F' + P$ and $P \geq 0$, it suffices to show that $pD_F > L_F + D_F + L_F'$, or equivalently that $pI > \bar{L} + I + \bar{L}'$ where $\bar{L} = D_F^{-1/2} L_F D_F^{-1/2}$. Since $A'\Pi^{-1}A$ is positive definite by assumption, $x'(\bar{L} + I + \bar{L}')x > 0$ for any nonzero $x$; therefore, using $x = e_j \pm e_k$, we see that the elements of $\bar{L}$ lie in $(-1, 1)$. Thus, for any nonzero $x$, $x'(\bar{L} + I + \bar{L}')x < (\sum_k |x_k|)^2 \leq p\|x\|^2$, where the second inequality is a special case of Hölder's inequality. The result then follows. We have thus established that both the EM and SAGE algorithm converge globally. The convergence is globally monotonic in norm with respect to the norm induced by $T = H^{1/2}$, i.e., $R_+$ is all of $\mathbb{R}^p$.

D. Convergence Rates

To compare the root-convergence factors of EM and SAGE, we focus on the case where $P$ is diagonal, since otherwise the EM algorithm is in general impractical. Therefore, from the results above

$$ \rho_{EM} = \rho(I - (pD_F + P)^{-1}H) = \rho(I - ((p-1)D_F + D_H)^{-1}H) \tag{26} $$

$$ \rho_{SAGE} = \rho(I - (D_H + L_H)^{-1}H) \tag{27} $$

since $P = D_P$ for diagonal $P$.

Theorem 2: For linear superimposed signals in Gaussian noise with a diagonal penalty matrix, the SAGE algorithm asymptotically converges faster than the EM algorithm, i.e.

$$ \rho_{SAGE} < \rho_{EM} < 1. $$

Proof: The right inequality follows from $\rho_{EM} \leq \|M\|_T < 1$. From (24), $I = \bar{L}_H + \bar{D}_H + \bar{L}_H'$ where $\bar{L}_H = H^{-1/2} L_H H^{-1/2}$ and $\bar{D}_H = H^{-1/2} D_H H^{-1/2}$. Thus, for any vector $x$

$$ x'\bar{L}_H x = x'\bar{L}_H' x = (\|x\|^2 - x'\bar{D}_H x)/2. \tag{28} $$

Since $I - G^{-1}H$ is similar to the real symmetric matrix $I - G^{-1/2} H G^{-1/2}$, the eigenvalues of $I - (D_H + L_H)^{-1}H$ are real. For $u = \rho_{SAGE} \in [0, 1)$ there exists $v \neq 0$ such that

$$ [I - (D_H + L_H)^{-1}H]\,v = vu $$

thus

$$ (1 - u)(\bar{L}_H + \bar{D}_H)\,x = x $$

where $x = H^{1/2}v$. Rearranging and multiplying both sides by $x'$

$$ (1 - u)\, x'(\bar{L}_H + \bar{D}_H)\,x = \|x\|^2. $$

Combining with (28)

$$ x'\bar{D}_H x = \frac{1 + u}{1 - u}\,\|x\|^2. $$

By the invariance of eigenvalues under similarity transforms

$$ \rho_{EM} = \rho(I - ((p-1)D_F + D_H)^{-1}H) = \rho\big(I - ((p-1)D_F + D_H)^{-1/2} H ((p-1)D_F + D_H)^{-1/2}\big) \geq 1 - \frac{\|H^{1/2}((p-1)D_F + D_H)^{-1/2} z\|^2}{\|z\|^2} $$

for any $z$ (by definition of spectral radius). In particular, for $z = ((p-1)D_F + D_H)^{1/2} H^{-1/2} x$:

$$ \rho_{EM} \geq 1 - \frac{\|x\|^2}{\|((p-1)D_F + D_H)^{1/2} H^{-1/2} x\|^2} = 1 - \frac{\|x\|^2}{x'[(p-1)H^{-1/2} D_F H^{-1/2} + \bar{D}_H]\,x} > 1 - \frac{\|x\|^2}{x'\bar{D}_H x} = 1 - \frac{1 - u}{1 + u} = \Big(\frac{2}{1 + u}\Big) u \geq u = \rho_{SAGE} $$

where the last inequality follows from $u \in [0, 1)$. □

The inequalities in this proof are rather loose, and often the difference in convergence rate between EM and SAGE is more dramatic than the proof might suggest. To illustrate, consider the case where $P = 0$. Then returning to (25), for the EM algorithm we have

$$ M = \frac{1}{p} \sum_{k=1}^{p} M_k = T^{-1} \Big( \frac{1}{p} \sum_{k=1}^{p} P_k^{\perp} \Big) T. $$

Since eigenvalues are invariant to similarity transforms, it follows that root-convergence factors for the two algorithms are given by the spectral radii

$$ \rho_{EM} = \rho\Big( \frac{1}{p} \sum_{k=1}^{p} P_k^{\perp} \Big), \qquad \rho_{SAGE} = \rho\Big( \prod_{k=1}^{p} P_k^{\perp} \Big). $$
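For $P = 0$ and $p = 2$, the average and the product of the two orthogonal projections can be evaluated directly. The sketch below is an illustration, not part of the original derivation: the value `alpha = 0.6` and the helper names are arbitrary assumptions. With unit vectors $t_1 = [1\ 0]'$ and $t_2 = [\alpha\ \sqrt{1 - \alpha^2}]'$, a short hand calculation gives eigenvalues $(1 \pm \alpha)/2$ for the average and $\{0, \alpha^2\}$ for the product, which the code confirms numerically.

```python
import numpy as np

def proj_perp(t):
    """Orthogonal projection onto the orthogonal complement of vector t."""
    t = t / np.linalg.norm(t)
    return np.eye(len(t)) - np.outer(t, t)

def spectral_radius(M):
    return float(np.max(np.abs(np.linalg.eigvals(M))))

alpha = 0.6                                    # assumed cosine between the signals
t1 = np.array([1.0, 0.0])
t2 = np.array([alpha, np.sqrt(1.0 - alpha**2)])
P1, P2 = proj_perp(t1), proj_perp(t2)

rho_em = spectral_radius(0.5 * (P1 + P2))      # convex combination (EM)
rho_sage = spectral_radius(P2 @ P1)            # product of projections (SAGE)
```

For this value of `alpha` the two radii are 0.8 and 0.36, so one product-of-projections cycle contracts the error more than two averaging steps (0.36 versus 0.64).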

That is, for the EM algorithm we have a convex combination of orthogonal projections and for the SAGE algorithm we have the product of those projections. Thus, this SAGE algorithm is closely related to the method of alternating projections [38], [39]. In particular, if $P = 0$ and the columns of $A$ are orthogonal, then $\rho_{SAGE} = 0$ whereas $\rho_{EM} \geq 1 - 1/p$, i.e., the SAGE algorithm converges in one iteration, while EM converges very slowly. When $p = 2$, using a Gram-Schmidt argument one can show that $t_1 = [1\ 0]'$ and $t_2 = [\alpha\ \sqrt{1 - \alpha^2}]'$ where $\alpha = |a_1'\Pi^{-1}a_2| / (\|a_1\| \|a_2\|)$ is the cosine of the complementary angle between $a_1$ and $a_2$. Thus

$$ \rho_{SAGE} = \rho\Big( \begin{bmatrix} 0 & -\alpha\sqrt{1 - \alpha^2} \\ 0 & \alpha^2 \end{bmatrix} \Big) = \alpha^2 $$

$$ \rho_{EM} = \rho\Big( \frac{1}{2} \begin{bmatrix} 1 - \alpha^2 & -\alpha\sqrt{1 - \alpha^2} \\ -\alpha\sqrt{1 - \alpha^2} & 1 + \alpha^2 \end{bmatrix} \Big) = \frac{1}{2} + \frac{\alpha}{2}. $$

[Figure 3: root-convergence factor curves for the conventional EM algorithm and the SAGE algorithm; horizontal axis: cos(complementary angle between subspaces), from 0 to 1.]

Fig. 3. Comparison of root-convergence factors for conventional EM algorithm and proposed SAGE algorithm versus complementary angle between subspaces of superimposed signals. The SAGE algorithm has a significantly improved convergence rate.

Fig. 3 illustrates that the root-convergence factor of SAGE is significantly smaller than that of EM, which substantially reduces the number of iterations required. Not only is $\rho_{SAGE} < \rho_{EM}$, but also $\rho_{SAGE} < \rho_{EM}^2$, so one SAGE iteration is better than two EM iterations, at least when $p = 2$. Thus, even though the EM algorithm appears to have the advantage that one can parallelize the M-step using $p$ processors that simultaneously update all parameters, in this case the convergence rate of the parallel algorithm is so much slower that a sequential update may be preferable. This depends, of course, on how difficult the M-step is; in the nonlinear case discussed in [2], the M-step is presumably fairly difficult, so parallelization may be advantageous. Equations (26) and (27) help one examine these types of tradeoffs.

V. DISCUSSION

We have described a generalization of the classical EM algorithm in which one alternates between several hidden-data spaces rather than using just one, and updates only a subset of the elements of the parameter vector each iteration. By updating the parameters sequentially, rather than simultaneously, we demonstrated that SAGE algorithms yield faster convergence than EM algorithms in two signal processing applications. The particular SAGE algorithms that we presented in this paper sacrifice one important characteristic of the EM algorithm: they are less amenable to a parallel implementation since they are coordinate-wise methods. However, the general SAGE method is very flexible, and work is in progress on more parallelizable algorithms using index sets $S$ consisting of several elements of $\theta$ [30]. The benefits of parallelization must be weighed against the convergence rates for each application.

It is probably no coincidence that the applications we put forth are ones in which the terminology "incomplete-data" and "complete-data" as introduced in [1] are somewhat unnatural. In most of the statistical applications discussed in [1], there is a clearly identifiable portion of the data that is "missing," and hence one natural complete-data space. In contrast, there is nothing really "incomplete" about tomographic measurements; the problem is simply that the log-likelihood is difficult to maximize. The EM algorithm is thus just a computational tool. (To further illustrate this point, note that in classical missing data problems the estimates of the missing data may be of some intrinsic interest, whereas the "complete-data" for tomography is never explicitly computed and would be of little use anyway.) SAGE algorithms may be most useful in such contexts.

We have emphasized that the SAGE algorithm improves the asymptotic convergence rate. The actual convergence rate will certainly depend on how close the initial estimate is to a fixed-point. In tomography and image restoration, fast linear algorithms can provide good initializers for penalized likelihood estimation. A greedy algorithm like SAGE is likely to be most beneficial in applications where such initializers are available.

APPENDIX A

MONOTONE CONVERGENCE IN NORM

Because the SAGE "algorithm" is so general, a single convergence theorem statement/proof cannot possibly cover all cases of interest (see, for example, the variety of special cases considered for the classical EM algorithm in [40]). Here we adopt the Taylor expansion approach of [4] since it directly illuminates the convergence rate properties and prescribes a region of monotone convergence in norm. However, this general approach has the drawback that it assumes the fixed point lies in the interior of $\Theta$. This restriction is often not a necessary condition, and at least for some applications one can often find specific convergence results without the restriction, e.g., [3] and [30]. Readers who are satisfied with the assurance of monotonicity of the objective $\Phi(\theta^i)$, as provided by Theorem 1, may wish to simply skim this Appendix. For simplicity, we discuss only the case where the index sets $S^i$ are chosen cyclically with period $K$, i.e., $S^i = S_k$ where $k = 1 + (i \text{ modulo } K)$. We also assume that $\bigcup_{k=1}^{K} S_k = \{1, \ldots, p\}$ so that each parameter is updated at least once per cycle. Before stating the convergence theorem, several definitions are needed. Consider an index set $S$, and let $m$ denote its


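cardinality. Bearing in mind our notational convention that $\phi^S(\theta_S; \bar{\theta}) = \phi^S(\theta_S; \bar{\theta}_S, \bar{\theta}_{\tilde{S}})$, we define the $m \times m$ matrices

$$ \nabla^{200}\phi^S = \nabla_{\theta_S} \nabla'_{\theta_S} \phi^S \quad \text{and} \quad \nabla^{110}\phi^S = \nabla_{\bar{\theta}_S} \nabla'_{\theta_S} \phi^S $$

and the $m \times (p - m)$ matrix

$$ \nabla^{101}\phi^S = \nabla_{\bar{\theta}_{\tilde{S}}} \nabla'_{\theta_S} \phi^S $$

where $\nabla$ denotes the (row) gradient operator and $\nabla'$ its transpose. Let $\hat{\theta}$ be a fixed point of the SAGE algorithm, and define

$$ U^S(\theta_S; \bar{\theta}) = -\int_0^1 (\nabla^{200}\phi^S)\big(t\theta_S + (1-t)\hat{\theta}_S;\ t\bar{\theta} + (1-t)\hat{\theta}\big)\,dt \tag{29} $$

$$ V^S(\theta_S; \bar{\theta}) = \int_0^1 (\nabla^{110}\phi^S)\big(t\theta_S + (1-t)\hat{\theta}_S;\ t\bar{\theta} + (1-t)\hat{\theta}\big)\,dt \tag{30} $$

$$ W^S(\theta_S; \bar{\theta}) = \int_0^1 (\nabla^{101}\phi^S)\big(t\theta_S + (1-t)\hat{\theta}_S;\ t\bar{\theta} + (1-t)\hat{\theta}\big)\,dt. \tag{31} $$

Let $R^S$ denote the $p \times p$ permutation matrix that reorders the elements of $\{S, \tilde{S}\}$ into $\{1, \ldots, p\}$. Then define the $p \times p$ composite matrix

$$ M^S(\theta_S; \bar{\theta}) = R^S \begin{bmatrix} U^S(\theta_S; \bar{\theta})^{-1} \big[ V^S(\theta_S; \bar{\theta}) \ \ W^S(\theta_S; \bar{\theta}) \big] \\ 0_{(p-m) \times m} \ \ I_{p-m} \end{bmatrix} (R^S)' \tag{32} $$

where $I_n$ denotes the $n \times n$ identity matrix.

With the above definitions, we can define the following region of monotone convergence in norm to $\hat{\theta}$.

Definition 3: $R_+ \subset \Theta$ is a region of monotone convergence in norm if there exists a nonsingular $p \times p$ matrix $T$ such that $R_+$ is an open ball with respect to the norm $\|\cdot\|_T$ and
1. For $k = 1, \ldots, K$, $U^{S_k}(\theta_{S_k}; \bar{\theta})$ is invertible for all $\bar{\theta} \in R_+$ and for all $\theta_{S_k} \in \Theta^{S_k}(\bar{\theta})$ (see (7)).
2. For $k = 1, \ldots, K$, $\|M^{S_k}(\theta_{S_k}; \bar{\theta})\|_T \leq 1$ for all $\bar{\theta} \in R_+$ and for all $\theta_{S_k} \in \Theta^{S_k}(\bar{\theta})$.
3. There exists $\alpha < 1$ such that for any $\bar{\theta}^1, \ldots, \bar{\theta}^K \in R_+$ and $\theta^k_{S_k} \in \Theta^{S_k}(\bar{\theta}^k)$, $k = 1, \ldots, K$

$$ \|M^{S_K}(\theta^K_{S_K}; \bar{\theta}^K) \cdots M^{S_1}(\theta^1_{S_1}; \bar{\theta}^1)\|_T \leq \alpha. \tag{33} $$

In general, $T$ may depend on $\hat{\theta}$, but we do not allow $T$ to vary with iteration. The hard work is to verify condition 3 (see Appendix B), but if one can do so, the reward is the following theorem.

Theorem 3: Assume i) $S^i = S_k$ where $k = 1 + (i \text{ modulo } K)$ and $\bigcup_{k=1}^{K} S_k = \{1, \ldots, p\}$, ii) $\hat{\theta}$ is a fixed point of the SAGE algorithm (5) in the interior of $\Theta$, iii) for all $\bar{\theta} \in R_+$ the maximum over $\theta_{S_k}$ of $\phi^{S_k}(\theta_{S_k}; \bar{\theta})$ is in the interior of $\Theta^{S_k}(\bar{\theta})$, iv) $\phi^{S_k}(\theta_{S_k}; \bar{\theta})$ is twice differentiable in both arguments $\forall \bar{\theta} \in \Theta$ and $\forall \theta_{S_k} \in \Theta^{S_k}(\bar{\theta})$, and v) the region of monotone convergence $R_+$ for a norm $\|\cdot\|_T$ is nonempty.
1. If $\theta^0 \in R_+$ then

$$ \|\theta^{i+1} - \hat{\theta}\|_T \leq \|\theta^i - \hat{\theta}\|_T \quad \forall i $$

and

$$ \|\theta^{(i+1)K} - \hat{\theta}\|_T \leq \alpha \|\theta^{iK} - \hat{\theta}\|_T \tag{34} $$

where $\alpha < 1$ is defined by (33). Therefore, $\|\theta^{iK} - \hat{\theta}\|_T$ converges monotonically to zero with at least linear rate.
2. The root-convergence factor [41] of the subsequence $\{\theta^{iK}\}_{i=0}^{\infty}$ is given by the spectral radius

$$ \rho\big( M^{S_K}(\hat{\theta}_{S_K}; \hat{\theta}) \cdots M^{S_1}(\hat{\theta}_{S_1}; \hat{\theta}) \big) \tag{35} $$

which is bounded above by $\alpha < 1$.

Note that by the equivalence of matrix norms (p. 29 of [37]), monotone convergence with respect to the norm $\|\cdot\|_T$ implies convergence with respect to any other norm, although probably nonmonotonically. Since the index sets are chosen cyclically, a "full iteration" consists of $K$ updates; therefore, (34) bounds the convergence rate of the subsequence $\{\theta^{iK}\}_{i=0}^{\infty}$.

Proof: Consider the ith iteration and let $S = S_k$ where $k = 1 + (i \text{ modulo } K)$. Define $z = (\theta_S; \bar{\theta})$ and $\hat{z} = (\hat{\theta}_S; \hat{\theta})$, and let $\phi^S(z) = \phi^S(\theta_S; \bar{\theta})$. Let

$$ d(z) = d(\theta_S; \bar{\theta}) = (\nabla'_{\theta_S}\phi^S)(z) $$

then by assumption iv) we can apply the Taylor formula with remainder [42] to expand $d(z)$ about $\hat{z}$

$$ d(z) = d(\hat{z}) + \int_0^1 (\nabla d)\big(tz + (1-t)\hat{z}\big)\,dt\ (z - \hat{z}). $$

Since $\hat{\theta}$ is a fixed point of the SAGE algorithm, by assumption iii) and iv) $d(\hat{z}) = 0$. Observe that by the definitions (29)-(31):

$$ \int_0^1 (\nabla d)\big(tz + (1-t)\hat{z}\big)\,dt = \big[ -U^S(\theta_S; \bar{\theta}) \ \ V^S(\theta_S; \bar{\theta}) \ \ W^S(\theta_S; \bar{\theta}) \big]. $$

By assumptions iii) and iv) $d(\theta^{i+1}_{S_k}; \theta^i) = 0$ for the SAGE algorithm (5), so

$$ U^{S_k}(\theta^{i+1}_{S_k}; \theta^i)\,(\theta^{i+1}_{S_k} - \hat{\theta}_{S_k}) = V^{S_k}(\theta^{i+1}_{S_k}; \theta^i)\,(\theta^i_{S_k} - \hat{\theta}_{S_k}) + W^{S_k}(\theta^{i+1}_{S_k}; \theta^i)\,(\theta^i_{\tilde{S}_k} - \hat{\theta}_{\tilde{S}_k}). \tag{36} $$

By property 1 (invertibility) of Definition 3

$$ \theta^{i+1}_{S_k} - \hat{\theta}_{S_k} = U^{S_k}(\theta^{i+1}_{S_k}; \theta^i)^{-1} V^{S_k}(\theta^{i+1}_{S_k}; \theta^i)\,(\theta^i_{S_k} - \hat{\theta}_{S_k}) + U^{S_k}(\theta^{i+1}_{S_k}; \theta^i)^{-1} W^{S_k}(\theta^{i+1}_{S_k}; \theta^i)\,(\theta^i_{\tilde{S}_k} - \hat{\theta}_{\tilde{S}_k}). $$

From (6) the components of $\theta_{\tilde{S}_k}$ are just copied, so after permuting using $R^S$ (32)

$$ \theta^{i+1} - \hat{\theta} = M^{S_k}(\theta^{i+1}_{S_k}; \theta^i)\,(\theta^i - \hat{\theta}) \tag{37} $$

where $M$ was defined by (32). Therefore, since $\theta^0 \in R_+$, by property 2 of Definition 3

$$ \|\theta^{i+1} - \hat{\theta}\|_T \leq \|\theta^i - \hat{\theta}\|_T $$

and therefore, it follows by induction that $\theta^i \in R_+$. A full cycle consists of one update over each of the $K$ index sets; therefore, applying (37) $K$ times

$$ (\theta^{(i+1)K} - \hat{\theta}) = M^{S_K}(\theta^{(i+1)K}_{S_K}; \theta^{(i+1)K-1}) \cdots M^{S_1}(\theta^{iK+1}_{S_1}; \theta^{iK})\,(\theta^{iK} - \hat{\theta}). $$

Thus, by property 3 of Definition 3

$$ \|\theta^{(i+1)K} - \hat{\theta}\|_T \leq \alpha \|\theta^{iK} - \hat{\theta}\|_T $$

and therefore, the subsequence $\{\theta^{iK}\}_{i=0}^{\infty}$ converges monotonically in norm to $\hat{\theta}$ as $i \to \infty$ with linear rate at most $\alpha$. By continuity of the derivatives of $\phi^{S_k}$, the matrices in the product above converge to $M^{S_k}(\hat{\theta}_{S_k}; \hat{\theta})$, which yields (35). Since the spectral radius is bounded above by any matrix norm, the root convergence factor is bounded above by $\alpha$. □

APPENDIX B

$R_+$ IS NONEMPTY

In this appendix, we show that the region of monotone convergence in norm $R_+$ is nonempty for suitably regular problems. Thus, the conditions for Theorem 3 are reasonable, and the superimposed signals example in Section IV is a concrete example. First note that from (10) one can show that

$$ (\nabla^{110}\mathcal{H}^S)(\hat{\theta}_S; \hat{\theta}) = -(\nabla^{200}\mathcal{H}^S)(\hat{\theta}_S; \hat{\theta}), \qquad (\nabla^{101}\mathcal{H}^S)(\hat{\theta}_S; \hat{\theta}) = 0 $$

(cf. (3.16) of [1]). For an index set $S$, define

$$ F_{X^S|Y} = -(\nabla^{200}\mathcal{H}^S)(\hat{\theta}_S; \hat{\theta}) $$

then from (10), one can see that the matrix $F_{X^S|Y}$ is the conditional Fisher information of $X^S$ for $\theta_S$, given $Y = y$ and given all of the other parameters $\theta_{\tilde{S}}$. Define the Hessian of the objective at $\hat{\theta}$ by

$$ H = -\nabla^2\Phi(\hat{\theta}), \qquad H^S = -(\nabla_{\theta_S}\nabla'_{\theta_S}\Phi)(\hat{\theta}), \qquad H^{S,\tilde{S}} = -(\nabla_{\theta_{\tilde{S}}}\nabla'_{\theta_S}\Phi)(\hat{\theta}). $$

(To simplify notation we leave implicit the dependence on $\hat{\theta}$.) Note that $H^S$ is the curvature of the objective with respect to $\theta_S$, and $H^{S,\tilde{S}}$ is the coupling between $\theta_S$ and $\theta_{\tilde{S}}$ induced by $\Phi$. Combining the above definitions with (29)-(31)

$$ U^S = H^S + F_{X^S|Y}, \qquad V^S = F_{X^S|Y}, \qquad W^S = -H^{S,\tilde{S}}. \tag{38} $$

If $H$ is positive definite, then $U^S$ will be invertible; therefore, by (32)

$$ M^S = R^S \begin{bmatrix} (U^S)^{-1}V^S & (U^S)^{-1}W^S \\ 0 & I \end{bmatrix} (R^S)'. $$

Substituting in (38)

$$ (R^S)' M^S R^S = \begin{bmatrix} (H^S + F_{X^S|Y})^{-1}F_{X^S|Y} & -(H^S + F_{X^S|Y})^{-1}H^{S,\tilde{S}} \\ 0 & I \end{bmatrix} = I - \begin{bmatrix} I \\ 0 \end{bmatrix} (H^S + F_{X^S|Y})^{-1} \big[ H^S \ \ H^{S,\tilde{S}} \big] = I - \begin{bmatrix} I \\ 0 \end{bmatrix} (H^S + F_{X^S|Y})^{-1} [I \ \ 0]\,(R^S)' H R^S. $$

Thus

$$ H^{1/2} M^S H^{-1/2} = I - H^{1/2} R^S \begin{bmatrix} (H^S + F_{X^S|Y})^{-1} & 0 \\ 0 & 0 \end{bmatrix} (R^S)' H^{1/2}. \tag{39} $$

For simplicity, we now consider the case where the index sets are disjoint and are chosen cyclically in the natural order, i.e., $S^i = S_k$, where $k = 1 + (i \text{ modulo } K)$ and $\{S_1, \ldots, S_K\} = \{1, \ldots, p\}$. In that case, it follows from (39) that

$$ M^{S_K} \cdots M^{S_1} = I - (D_H + D_F + L_H)^{-1} H \tag{40} $$

where $D_F$ is block-diagonal with $F_{X^{S_k}|Y}$ in the kth block, and

$$ H = L_H + D_H + L_H' \tag{41} $$

where $D_H$ is a block diagonal matrix containing the diagonal blocks of $H$ that correspond to the subsets $S_k$, and $L_H$ is the corresponding strictly lower block triangular matrix. We can thus establish that $\|M^{S_K} \cdots M^{S_1}\|_T < 1$ by using the following "splitting matrix" theorem (p. 79 of [37]).

Theorem 4: If $H$ is positive definite and $S$ is invertible, then

$$ \|I - S^{-1}H\|_T < 1 \ \text{ for } T = H^{1/2} \ \text{ if } \ S + S' > H. \tag{42} $$

From (40), for a SAGE algorithm $S = D_H + D_F + L_H$, so in light of (41), condition (42) of Theorem 4 is satisfied. Thus,

$$ \|M^{S_K} \cdots M^{S_1}\|_T < 1. $$

Using the relationships derived above, one can establish the following result [43].

Theorem 5: Let $\hat{\theta}$ be a fixed point of a SAGE algorithm, and assume that $\Phi$ is strictly concave on an open set local to $\hat{\theta}$ (so that $H$ is positive definite). Then if $\Phi$ and the functions $\phi^S$ are all twice continuously differentiable near $\hat{\theta}$, there exists a nonempty region of monotone convergence in norm $R_+$ satisfying the conditions of Definition 3 for the norm induced by $T = H^{1/2}$.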

APPENDIX C

FISHER INFORMATION

From (35) we see that the root-convergence factor of a SAGE algorithm is given by the spectral radius of a product of matrices $M^S(\hat{\theta}_S; \hat{\theta})$ of the form (32). For an EM algorithm, this spectral radius increases towards 1 as the complete-data becomes more informative, i.e., as its Fisher information increases [1], [4], [5]. In this section we demonstrate that a similar relationship holds for the convergence rate of a SAGE algorithm. Defining

$$ \delta^i = H^{1/2}(\theta^i - \hat{\theta}) $$

we see from (39) that for $\delta^i$ small

$$ \delta^{i+1} \approx \Big( I - H^{1/2} R^S \begin{bmatrix} (H^S + F_{X^S|Y})^{-1} & 0 \\ 0 & 0 \end{bmatrix} (R^S)' H^{1/2} \Big)\, \delta^i. \tag{43} $$

This last equation suggests that minimizing $F_{X^S|Y}$ will improve the rate of convergence of $\|\delta^i\|$ to 0. To demonstrate this more formally, let $D_{F_1}$ and $D_{F_2}$ be the block diagonal aggregate of the Fisher information matrices for two SAGE algorithms with $D_{F_1} < D_{F_2}$. Then one can use (40) with an argument similar to the proof of Theorem 2 to show that

$$ \rho(I - (D_H + D_{F_1} + L_H)^{-1}H) < \rho(I - (D_H + D_{F_2} + L_H)^{-1}H). $$
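The effect of the hidden-data Fisher information on the root-convergence factor can be illustrated numerically. This is a minimal sketch under stated assumptions: the 2 x 2 Hessian `H`, the scalar index sets, and the choice $D_F = cI$ for three values of `c` are hypothetical, and a single example does not prove the general inequality. It evaluates the spectral radius of the iteration matrix in (40) as `c` grows.

```python
import numpy as np

def root_convergence_factor(H, D_F):
    """Spectral radius of I - (D_H + D_F + L_H)^{-1} H, as in (40)."""
    D_H = np.diag(np.diag(H))
    L_H = np.tril(H, k=-1)
    S = D_H + D_F + L_H
    M = np.eye(H.shape[0]) - np.linalg.solve(S, H)
    return float(np.max(np.abs(np.linalg.eigvals(M))))

# Hypothetical positive definite Hessian (assumption for illustration).
H = np.array([[1.0, 0.5], [0.5, 1.0]])

# Increasing amounts of conditional Fisher information, D_F = c * I.
levels = (0.0, 0.1, 1.0)
factors = [root_convergence_factor(H, c * np.eye(2)) for c in levels]
```

With `c = 0` the iteration is plain Gauss-Seidel and the factor is 0.25 for this `H`; the factor then grows toward 1 as `c` increases, in line with the statement that more informative hidden-data spaces converge more slowly.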

Thus, less informative hidden-data spaces lead to smaller root-convergence factors and hence faster converging SAGE algorithms. In particular, once one has chosen the index sets $S_k$, the optimal hidden-data space from the point of view of asymptotic convergence rate would simply be $X^S = Y$, since then $F_{X^S|Y} = 0$. But that choice will often lead to an intractable M-step. The SAGE algorithm allows one to choose hidden-data spaces whose Fisher information matrices are much smaller than that of the usual complete data of an EM algorithm. Finally, note that from (43) we see that since $H$ is determined by $\Phi$, once the index sets are chosen, the only design issue left is to choose the hidden-data $X^S$. This choice should be made by considering the tradeoff between making $F_{X^S|Y}$ small but yet making the M-step tractable.

ACKNOWLEDGMENT

The first author gratefully acknowledges helpful discussions on the superimposed signals application with Y. Bresler and S.-F. Yau. The authors thank the reviewers for helpful suggestions, including references to [18] and [15].

REFERENCES

[1] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. Royal Stat. Soc. Series B, vol. 39, no. 1, pp. 1-38, 1977.
[2] M. Feder and E. Weinstein, "Parameter estimation of superimposed signals using the EM algorithm," IEEE Trans. Acoust., Speech, Signal Processing, vol. 36, no. 4, pp. 477-489, Apr. 1988.
[3] K. Lange and R. Carson, "EM reconstruction algorithms for emission and transmission tomography," J. Comp. Assisted Tomography, vol. 8, no. 2, pp. 306-316, Apr. 1984.
[4] A. O. Hero and J. A. Fessler, "Asymptotic convergence properties of EM-type algorithms," Communications and Signal Processing Laboratory, Dept. of EECS, Univ. of Michigan, Ann Arbor, MI, Technical Report 282, Apr. 1993.
[5] J. A. Fessler, N. H. Clinthorne, and W. Leslie Rogers, "On complete data spaces for PET reconstruction algorithms," IEEE Trans. Nucl. Sci., vol. 40, no. 4, pp. 1055-1061, Aug. 1993.
[6] J. A. Fessler and A. O. Hero, "Complete-data spaces and generalized EM algorithms," in Proc. IEEE Conf. ASSP, 1993, vol. 4, pp. 1-4.
[7] D. G. Politte and D. L. Snyder, "Corrections for accidental coincidences and attenuation in maximum-likelihood image reconstruction for positron-emission tomography," IEEE Trans. Med. Imag., vol. 10, no. 1, pp. 82-89, Mar. 1991.
[8] J. A. Fessler and A. O. Hero, "New complete-data spaces and faster algorithms for penalized-likelihood emission tomography," in Conf. Rec. IEEE Nucl. Sci. Symp. Med. Imag., 1993, pp. 1897-1901.
[9] P. J. Green, "On use of the EM algorithm for penalized likelihood estimation," J. Royal Stat. Soc. Series B, vol. 52, no. 3, pp. 443-452, 1990.
[10] T. Hebert and R. Leahy, "A Bayesian reconstruction algorithm for emission tomography using a Markov random field prior," in Proc. SPIE 1092, Med. Imag. III: Image Processing, 1989, pp. 458-466.
[11] T. Hebert and R. Leahy, "A generalized EM algorithm for 3-D Bayesian reconstruction from Poisson data using Gibbs priors," IEEE Trans. Med. Imag., vol. 8, no. 2, pp. 194-202, June 1989.
[12] J. A. Fessler, N. H. Clinthorne, and W. L. Rogers, "Regularized emission image reconstruction using imperfect side information," IEEE Trans. Nucl. Sci., vol. 39, no. 5, pp. 1464-1471, Oct. 1992.
[13] K. Lange, "Convergence of EM image reconstruction algorithms with Gibbs smoothing," IEEE Trans. Med. Imag., vol. 9, no. 4, pp. 439-446, Dec. 1990 (Corrections, June 1991).
[14] B. W. Silverman et al., "A smoothed EM approach to indirect estimation problems, with particular reference to stereology and emission tomography," J. Royal Stat. Soc. Series B, vol. 52, no. 2, pp. 271-324, 1990.
[15] A. R. De Pierro, "A generalization of the EM algorithm for maximum likelihood estimates from incomplete data," Med. Imag. Processing Group, Dept. of Radiology, Univ. of Pennsylvania, Technical Report MIPG119, 1987.
[16] G. T. Herman, A. R. De Pierro, and N. Gai, "On methods for maximum a posteriori image reconstruction with a normal prior," J. Visual Commun. Image Represent., vol. 3, no. 4, pp. 316-324, Dec. 1992.
[17] A. R. De Pierro, "A modified expectation maximization algorithm for penalized likelihood estimation in emission tomography," to be published in IEEE Trans. Med. Imag.
[18] M. Abdalla and J. W. Kay, "Edge-preserving image restoration," in Stochastic Models, Statistical Methods, and Algorithms in Image Analysis (P. Barone, A. Frigessi, and M. Piccioni, Eds.), vol. 74 of Lecture Notes in Statistics. New York: Springer, 1992, pp. 1-13.
[19] C. Bouman and K. Sauer, "Fast numerical methods for emission and transmission tomographic reconstruction," in Proc. Conf. Inform. Sci. Syst., Johns Hopkins, 1993.
[20] C. Bouman and K. Sauer, "A unified approach to statistical tomography using coordinate descent optimization," 1993, submitted to IEEE Trans. Med. Imag.
[21] T. A. Louis, "Finding the observed information matrix when using the EM algorithm," J. Royal Stat. Soc. Series B, vol. 44, no. 2, pp. 226-233, 1982.
[22] R. M. Lewitt and G. Muehllehner, "Accelerated iterative reconstruction for positron emission tomography based on the EM algorithm for maximum likelihood estimation," IEEE Trans. Med. Imag., vol. MI-5, no. 1, pp. 16-22, Mar. 1986.
[23] L. Kaufman, "Implementing and accelerating the EM algorithm for positron emission tomography," IEEE Trans. Med. Imag., vol. MI-6, no. 1, pp. 37-51, Mar. 1987.
[24] I. Meilijson, "A fast improvement to the EM algorithm on its own terms," J. Royal Stat. Soc. Series B, vol. 51, no. 1, pp. 127-138, 1989.
[25] M. Jamshidian and R. I. Jennrich, "Conjugate gradient acceleration of the EM algorithm," J. Amer. Stat. Assoc., vol. 88, no. 421, pp. 221-228, 1993.
[26] K. Sauer and C. Bouman, "A local update strategy for iterative reconstruction from projections," IEEE Trans. Signal Processing, vol. 41, no. 2, pp. 534-548, Feb. 1993.
[27] W. H. Press et al., Numerical Recipes in C. Cambridge, UK: Cambridge Univ. Press, 1988.
[28] M. Segal and E. Weinstein, "The cascade EM algorithm," Proc. IEEE,

284

vol 76, no. 10, pp. 1388-1390, Oct. 1988. [29] J. A. Fessler, "Penalized weighted least-squares image reconstruction for positron emission tomography," IEEE Trans. Med. lmag., vol. 13. no 2, pp. 290-300, June 1994. [301 J. A. Fessler and A. O. Hero, "Space-alternating generalized EM algorithms for penalized maximum-likelihood image reconstruction." Commun. and Signal Processing Lab., Dept. of Elec. Eng. and Compo Sci., Univ. of Michigan, Ann Arbor, MI, Tech. Rep. 286, Feb. 1994. [31] R. Boyles, "On the convergence of the EM algorithm," J. Royal Stat. Soc. Series B, vol. 45, no. I, pp. 47-50, 1993. [32] L. A. Shepp and Y. Verdi, "Maximum likelihood reconstruction for emission tomography:' IEEE Trans. Med. lmag .. vol. Ml-}, no. 2, pp. 113-122. Oct. 1982. [33] C. L. Byrne, "Iterative image reconstruction reconstruction algorithms based on cross-entropy minimization:' IEEE Trans. Imag. Processing, vol. 2, no. I, pp. 96-103, Jan. 1993. [34] L. Kaufman, "Maximum likelihood, least squares, and penalized least squares for PET," IEEE Trans. Med. lmag., vol. 12, no. 2, pp. 200-214, June ]993. [35] J. A. Fessler, "Object-based 3-D reconstuction of arterial trees from a few projections," Ph.D. thesis, Stanford Univ.. Stanford, CA, Aug. 1990. [36] D. Chazan, Y. Stettiner, and D. Malah, "Optimal multi-path estimation using the EM algorithm for co-channel speech separation," in Proc. IEEE Con! Acoust.. Speech. Signal Processing, 1993, vol. 2. pp. 728-731. [37] D. M. Young. Iterative Solution of Large Linear Systems. New York: Academic. 1971. [38] I. Ziskind and M. Wax, "Maximum liklihood localization of multiple sources by alternating projection," IEEE Trans. Acoust .. Speech. Signal Processing, vol. 26, no. 10, pp. 1553-1560, Oct. 1988. [391 S. Kayalar and H. L. Weinert, "Error bounds for the method of alternating projections," .\-Iath Contr. Signals Svst.. vol. l . pp. ~3-59, 1988. [~O) C. F. 1. Wu. "On the convergence properties of the EM algorithm:' Allll. Stat., vol. 
l l , no. 1. pp. 95-103, 1983. 411 J. M. Onega and W. C. Rheinboldt, lterauve Solution of Nonlinear Equations ill Several Variables. New York: Academic. 1970. [421 E. Polak, Computational Methods in Optimituuon: a Unified Approach. Orlando, FL: Academic. i 971 . [43 J A. O. Hero and J. A. Fessler. "Convergence in norm for alternating expectauon-rnaxirruzation (EM)-type algonthrns." [0 be published 10 Staustica Sinica.

285

Unitary ESPRIT: How to Obtain Increased Estimation Accuracy with a Reduced Computational Burden

Martin Haardt, Student Member, IEEE, and Josef A. Nossek, Fellow, IEEE

Abstract: ESPRIT is a high-resolution signal parameter estimation technique based on the translational invariance structure of a sensor array. Previous ESPRIT algorithms do not use the fact that the operator representing the phase delays between the two subarrays is unitary. Here, we present a simple and efficient method to constrain the estimated phase factors to the unit circle, if centro-symmetric array configurations are used. Unitary ESPRIT, the resulting closed-form algorithm, has an ESPRIT-like structure except for the fact that it is formulated in terms of real-valued computations throughout. Since the dimension of the matrices is not increased, this completely real-valued algorithm achieves a substantial reduction of the computational complexity. Furthermore, Unitary ESPRIT incorporates forward-backward averaging, leading to an improved performance compared to the standard ESPRIT algorithm, especially for correlated source signals. Like standard ESPRIT, Unitary ESPRIT offers an inexpensive possibility to reconstruct the impinging wavefronts (signal copy). These signal estimates are more accurate, since Unitary ESPRIT improves the underlying signal subspace estimates. Simulations confirm that, even for uncorrelated signals, the standard ESPRIT algorithm needs twice the number of snapshots to achieve a precision comparable to that of Unitary ESPRIT. Thus, Unitary ESPRIT provides increased estimation accuracy with a reduced computational burden.

I. INTRODUCTION

The recovery of signal parameters from noisy observations is a fundamental problem in (real-time) array signal processing. Due to their simplicity and high-resolution capability, ESPRIT-like subspace estimation schemes have been attracting considerable attention. Their parameter estimates are obtained by exploiting the rotational invariance structure of the signal subspace, induced by the translational invariance structure of the associated sensor array. This can be achieved without computation or search of any spectral measure [15], [17]. Unitary ESPRIT achieves even more accurate results than previous ESPRIT techniques by taking advantage of the unit magnitude property of the phase factors that represent the phase delays between the two subarrays [4]. It has been shown in [12] that constraining the phase factors to the unit circle can also give some improvement for correlated sources. For centro-symmetric sensor arrays with a translational invariance structure, Unitary ESPRIT provides a very simple and efficient solution to this task. Although Unitary ESPRIT effectively doubles the number of data samples, the computational complexity is reduced by transforming the required rank-revealing factorizations of complex matrices into decompositions of real-valued matrices of the same size. Thus, we obtain increased estimation accuracy with a reduced computational load. This reduction can be achieved by constructing invertible transformations that map centro-Hermitian matrices to real matrices. These transformations have been introduced in Lee's pioneering work on centro-Hermitian matrices [10]. More than a decade later, her results were used to transform the complex covariance matrix of a uniform linear array (ULA) into a real matrix of the same size [8] to reduce the computational load of adaptive beamforming schemes [9]. In this paper, we use more general centro-symmetric array configurations that have been receiving increased attention lately [22]. We derive an efficient square root version of Unitary ESPRIT that only requires real-valued computations from start to finish, by operating directly on the data instead of "squaring" it to obtain

Manuscript received June 8, 1994; revised August 30, 1994. The associate editor coordinating the review of this paper and approving it for publication was Prof. Keshab Parhi. The authors are with the Institute of Network Theory and Circuit Design, Technical University of Munich, Munich, Germany. IEEE Log Number 9410301.
Reprinted from IEEE Transactions on Signal Processing, Vol. 43, No.5, pp. 1232-1242, May 1995.

loss of accuracy that can be compensated by combining them with Unitary ESPRIT, yielding not only improved estimation accuracy, but also completely real-valued algorithms. In [5], it is shown how Unitary Schur ESPRIT dramatically improves the performance of the new Schur-type method, an adaptive subspace estimation scheme with a computational structure and complexity similar to that of a QR decomposition, except for the fact that plane and hyperbolic rotations are used. In this case, the required rank decision, i.e., an estimate of the number of signals, is automatic, and updating as well as downdating are straightforward. The fully real-valued Unitary ESPRIT concept can also be extended to spatially smoothed forward-backward estimation schemes [6], [13], [14] and is applicable to many other subspace estimation techniques (see [20] for an excellent overview). The results are comparable to the advantages obtained by operating in beamspace [24] without the necessity of converting the data from element space to beamspace.

This paper is organized as follows. It starts with a review of the definition and basic properties of centro-Hermitian matrices. These properties will be used to derive the real-valued implementation of Unitary ESPRIT. A brief review of the standard ESPRIT algorithm is given in chapter III. It can be seen as a generalization of the matrix pencil method [7]. Chapter IV introduces the Unitary ESPRIT concept for centro-symmetric array structures. In Section IV-B we show how all three required rank-revealing factorizations can be transformed into decompositions of real-valued matrices of the same size, yielding a completely real algorithm. A new reliability test, which is a substantial improvement of current high-resolution array signal processing and spectral estimation techniques, is presented in Section IV-C. Further simplifications of the algorithm are derived in Section IV-D, before a summary of Unitary ESPRIT concludes the chapter (Section IV-E). Finally, computer simulations compare the performance of Unitary ESPRIT with that of the well-known standard ESPRIT algorithm (Section V).

II. CENTRO-HERMITIAN MATRICES

First of all, let us introduce our notation and review the definition and the basic properties of centro-Hermitian matrices that have been derived by Lee [10]. Throughout this paper, column vectors and matrices are denoted by lower case and upper case boldfaced letters, respectively. $\Pi_p$ is the $p \times p$ exchange matrix with ones on its antidiagonal and zeros elsewhere. Since $\Pi_p$ is a symmetric permutation matrix, it is involutorial, i.e., $\Pi_p^2 = I_p$. With this notation, we can define centro-Hermitian matrices in analogy to centro-symmetric matrices.

Definition 1: A complex matrix $M \in \mathbb{C}^{p \times q}$ is called centro-Hermitian if

$$\Pi_p \bar{M} \Pi_q = M, \tag{1}$$

where the overbar denotes complex conjugation without transposition. Centro-Hermitian matrices of size $p \times q$ form a $p \cdot q$-dimensional linear space over $\mathbb{R}$ [10]. To show how centro-Hermitian matrices can be mapped to matrices with real entries, Lee defines left $\Pi$-real matrices in the following fashion.

Definition 2 [10]: Matrices $Q \in \mathbb{C}^{p \times q}$ satisfying

$$\Pi_p \bar{Q} = Q \tag{2}$$

are left $\Pi$-real. The unitary matrices

$$Q_{2n} = \frac{1}{\sqrt{2}} \begin{bmatrix} I_n & jI_n \\ \Pi_n & -j\Pi_n \end{bmatrix} \tag{3}$$

and

$$Q_{2n+1} = \frac{1}{\sqrt{2}} \begin{bmatrix} I_n & 0 & jI_n \\ 0^T & \sqrt{2} & 0^T \\ \Pi_n & 0 & -j\Pi_n \end{bmatrix}, \tag{4}$$

for example, are left $\Pi$-real of even and odd order, respectively. More left $\Pi$-real matrices can be obtained by post-multiplying a left $\Pi$-real matrix $Q$ by an arbitrary real matrix $R$, i.e., every matrix $QR$ is left $\Pi$-real. Now, we are in a position to state Lee's main result, which establishes an automorphism between centro-Hermitian and real matrices.

Theorem 1 [10]: Let $T_p$ and $U_q$ denote arbitrary nonsingular left $\Pi$-real matrices of size $p \times p$ and $q \times q$, respectively. Then, the bijective mapping

$$\varphi: M \mapsto T_p^{-1} M U_q$$

maps the set of all $p \times q$ centro-Hermitian matrices onto $\mathbb{R}^{p \times q}$, the set of all real matrices of the same size.

This theorem can, for instance, be used to calculate the singular value decomposition (SVD) of a centro-Hermitian matrix $M \in \mathbb{C}^{p \times q}$.

Corollary 1: Let $M$ be centro-Hermitian, and assume that the SVD of $\varphi_Q(M) = Q_p^H M Q_q \in \mathbb{R}^{p \times q}$ is given by $\varphi_Q(M) = U_\varphi \Sigma_\varphi V_\varphi^H$, where the matrices $Q_p$ and $Q_q$ are unitary as well as left $\Pi$-real. Then, an SVD¹ of $M$ is obtained as

$$M = (Q_p U_\varphi)\, \Sigma_\varphi\, (V_\varphi^H Q_q^H), \tag{5}$$

where the left and right singular vectors of $M$ are left $\Pi$-real.

Proof: The first part follows from the unitary nature of $Q_p$ and $Q_q$, the second from the fact that the singular vectors of a real matrix are real. ∎

¹ Recall that the SVD of a complex matrix is unique up to a unitary diagonal scaling matrix, if all singular values are distinct.

For future reference, we consider an efficient computation of a particular transformation $\mathcal{T}(\cdot)$. It transforms an arbitrary complex matrix $G \in \mathbb{C}^{p \times q}$ into a real $p \times 2q$ matrix, denoted by $\mathcal{T}(G)$. Notice that for every matrix $G$, the matrix

$$[\,G \;\; \Pi_p \bar{G} \Pi_q\,] \in \mathbb{C}^{p \times 2q}$$

is centro-Hermitian. Thus, the matrix

$$\mathcal{T}(G) \triangleq \varphi_Q([\,G \;\; \Pi_p \bar{G} \Pi_q\,]) = Q_p^H\, [\,G \;\; \Pi_p \bar{G} \Pi_q\,]\, Q_{2q} \tag{6}$$

is always real according to Theorem 1. Consider the case where the left $\Pi$-real matrices $Q_p$ and $Q_{2q}$ are chosen according to (3) or (4). Furthermore, let $G$ be partitioned as

$$G = \begin{bmatrix} G_1 \\ g^T \\ G_2 \end{bmatrix},$$

where the block matrices $G_1$ and $G_2$ should have the same size. Obviously, the row vector $g^T$ must be dropped if $p$ is even. Then, straightforward calculations show that the desired real-valued matrix (6) can be expressed as

$$\mathcal{T}(G) = \begin{bmatrix} \mathrm{Re}\{G_1 + \Pi \bar{G}_2\} & -\mathrm{Im}\{G_1 - \Pi \bar{G}_2\} \\ \sqrt{2}\,\mathrm{Re}\{g^T\} & -\sqrt{2}\,\mathrm{Im}\{g^T\} \\ \mathrm{Im}\{G_1 + \Pi \bar{G}_2\} & \mathrm{Re}\{G_1 - \Pi \bar{G}_2\} \end{bmatrix}, \tag{7}$$

where $\Pi$ is the exchange matrix whose size matches the number of rows of $G_1$. Here, $\mathrm{Re}\{\cdot\}$ and $\mathrm{Im}\{\cdot\}$ denote the real and the imaginary part, respectively. Once again, if $p$ is even, the center row of (7) should be dropped. Then, an efficient computation of $\mathcal{T}(G) \in \mathbb{R}^{p \times 2q}$ from the complex matrix $G$ only requires $p \times 2q$ real additions.

III. STANDARD ESPRIT

A. Standard ESPRIT Scenario

Consider the standard ESPRIT scenario [15], [17], i.e., an $M$-element sensor array composed of $m$ pairs of pairwise identical, but displaced sensors (doublets). Let $\Delta$ denote the distance between the two subarrays. Incident on both subarrays are $d$ narrow-band noncoherent planar wavefronts with signal propagation velocity $c$ and a common center frequency $\omega_0$. The $d$ impinging signals are combined to a signal vector $s(t_n)$. Assume, for the moment, that the two subarrays do not share any elements, i.e., they do not overlap. Then, the total number of sensors equals $M = 2m$, and the uncorrupted signals received at the two subarrays have the following form:

$$x(t_n) = \begin{bmatrix} x_1(t_n) \\ x_2(t_n) \end{bmatrix} = \begin{bmatrix} A \\ A\Phi \end{bmatrix} s(t_n) = A_G\, s(t_n). \tag{8}$$

$A_G \in \mathbb{C}^{M \times d}$ and $A \in \mathbb{C}^{m \times d}$ are the steering matrices of the whole array configuration (global array steering matrix) and the first subarray, respectively. Notice that the $k$th columns of both array steering matrices depend on the direction of arrival (DOA) $\theta_k$ of the $k$th source relative to the displacement between the two subarrays.² Furthermore,

$$\Phi = \mathrm{diag}\{\phi_k\}_{k=1}^{d} \in \mathbb{C}^{d \times d}$$

is a diagonal matrix of the phase delays between the sensor doublets for the $d$ wavefronts. Its diagonal elements are the phase factors

$$\phi_k = e^{\,j \omega_0 \Delta \sin\theta_k / c}, \qquad 1 \le k \le d. \tag{9}$$

² $\theta_k = 0$ corresponds to the direction perpendicular to $\Delta$.

Fig. 1. Planar array composed of m = 3 pairwise identical but displaced sensors (doublets).

Fig. 2. Three different subarray choices for a uniform linear array (ULA) of M = 6 identical sensors. (a) Maximum overlap (m = 5); (b) interleaved (m = 3); (c) mixed (m = 4).
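The data model (8), (9) can be checked numerically for an arbitrary doublet geometry. The following is a minimal sketch under stated assumptions: sensor positions are given in wavelengths, the displacement is Δ = λ/2 along the x-axis, and the DOAs are measured from the y-axis. The helper `steering`, the variable names, and all numerical values are illustrative and not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

m, d = 3, 2                         # doublets and sources (illustrative values)
theta = np.deg2rad([10.0, 25.0])    # DOAs relative to the normal of the displacement
delta = 0.5                         # assumed displacement Delta, in wavelengths


def steering(positions, theta):
    """Plane-wave steering matrix for sensors at `positions` (in wavelengths)."""
    directions = np.stack([np.sin(theta), np.cos(theta)])   # 2 x d
    return np.exp(2j * np.pi * positions @ directions)      # sensors x d


pos = rng.uniform(-2.0, 2.0, size=(m, 2))          # subarray 1, arbitrary geometry
A = steering(pos, theta)                           # steering matrix of subarray 1
A_shifted = steering(pos + [delta, 0.0], theta)    # subarray 2: subarray 1 displaced by Delta

# Phase factors (9); omega_0 * Delta / c equals 2 * pi * Delta / lambda.
Phi = np.diag(np.exp(2j * np.pi * delta * np.sin(theta)))
A_G = np.vstack([A, A @ Phi])                      # global steering matrix of (8)

s = rng.standard_normal(d) + 1j * rng.standard_normal(d)   # one snapshot s(t_n)
x = A_G @ s                                                # noise-free x(t_n)

# The geometrically displaced subarray must coincide with A @ Phi.
model_error = np.max(np.abs(A_shifted - A @ Phi))
```

The point of the sketch is that the geometry of the first subarray is arbitrary: only the common displacement enters the phase factors.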

B. More Structured Array Geometries

Recall that every row of $A_G$ corresponds to an element of the sensor array. In the case of overlapping subarrays, a particular subarray configuration is described by selection matrices that choose $m$ elements of $x(t_n) \in \mathbb{C}^M$, where $m < M$ is the number of elements in each subarray. Let $J_1$ and $J_2$ be $m \times M$ selection matrices that assign elements of $x(t_n)$ to the subarrays one and two, respectively. Fig. 2, for example, displays three different subarray choices for a uniform linear array (ULA) of $M = 6$ identical sensors. In general, the two selection matrices are chosen to be centro-symmetric with respect to one another, i.e.,

$$J_2 = \Pi_m J_1 \Pi_M, \tag{10}$$

a property that plays a key role in the derivation of the fully real implementation of Unitary ESPRIT, cf. Section IV-B. Therefore, the combined selection matrix

$$J \triangleq \begin{bmatrix} J_1 \\ J_2 \end{bmatrix} \in \mathbb{R}^{2m \times M}$$

is centro-Hermitian, i.e., $\Pi_{2m} J \Pi_M = J$.

C. General ESPRIT Principle

By collecting $N \ge d$ snapshots from each sensor, $1 \le n \le N$, measurement matrices $X_1$, $X_2$, $X$ and a signal matrix $S$ are formed, obeying

$$X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = \begin{bmatrix} A \\ A\Phi \end{bmatrix} S. \tag{11}$$

It is easy to see that every row in $X$ corresponds to an element of the sensor array. Equation (11) implies that $X_1$, $X_2$, and $X$ are rank-deficient, namely $\mathrm{rank}\, X_1 = \mathrm{rank}\, X_2 = \mathrm{rank}\, X = d$. Thus, the $d$ columns of

$$\begin{bmatrix} C_1 \\ C_2 \end{bmatrix} = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} P_{\mathrm{col}} = \begin{bmatrix} A \\ A\Phi \end{bmatrix} S P_{\mathrm{col}}$$

form a basis for the column space of $X$ if

$$S P_{\mathrm{col}} \in GL(d). \tag{12}$$

Here, $GL(d) \subset \mathbb{C}^{d \times d}$ denotes the general linear group of all nonsingular matrices of dimension $d \times d$. Observe that the column space or range of $X$, $\mathrm{range}\, X \subset \mathbb{C}^{2m}$, is usually called signal subspace. In the same way, the $d$ rows of

$$[\,\Gamma_1 \;\; \Gamma_2\,] = P_{\mathrm{row}}\, [\,C_1 \;\; C_2\,] = P_{\mathrm{row}} A\, [\,S P_{\mathrm{col}} \;\; \Phi S P_{\mathrm{col}}\,]$$

form a basis for the row space of $[\,C_1 \;\; C_2\,]$ if

$$P_{\mathrm{row}} A \in GL(d). \tag{13}$$

Therefore, the rank-reducing numbers of the matrix pencil

$$\Gamma_2 - \lambda \Gamma_1 = P_{\mathrm{row}} A\, (\Phi - \lambda I_d)\, S P_{\mathrm{col}}$$

are the diagonal elements of $\Phi$ (phase factors) and can be calculated as the generalized eigenvalues of the matrix pair $(\Gamma_2, \Gamma_1)$.

Due to these observations, the ESPRIT algorithm reduces to choosing the appropriate compression matrices that define the required bases. In the absence of noise (the case discussed so far), any matrices $P_{\mathrm{col}}$ and $P_{\mathrm{row}}$ satisfying (12) and (13) will do the job. With noisy measurements, however, we are faced with the problem of estimating the signal subspace and its dimension.

D. SVD-Based Subspace Estimate

The most robust way to estimate the required bases is to compute the singular value decomposition (SVD) of

$$\tilde{X} = [\,U_s \;\; U_n\,] \begin{bmatrix} \Sigma_s & 0 \\ 0 & \Sigma_n \end{bmatrix} [\,V_s \;\; V_n\,]^H, \tag{14}$$

where $\tilde{X}$ denotes the measurement matrix $X$ corrupted by additive, spatially uncorrelated³ noise, $\Sigma_s$ contains its $d$ dominant singular values, and the unitary matrices $U$ and $V$ are partitioned accordingly. Then, the best rank $d$ approximation of $\tilde{X}$ in the Frobenius norm is given by $\hat{X} = U_s \Sigma_s V_s^H$. In other words, our low rank estimate of $X$ is the matrix $\hat{X}$ satisfying

$$\|\tilde{X} - \hat{X}\|_F = \min_{\mathrm{rank}\, Y \le d} \|\tilde{X} - Y\|_F.$$

A basis for the estimated signal subspace is determined from the $d$ dominant left singular vectors according to

$$\begin{bmatrix} C_1 \\ C_2 \end{bmatrix} = U_s. \tag{15}$$

Then, a unitary basis for the row space of $[\,C_1 \;\; C_2\,]$ can also be obtained by computing its SVD (total least squares approach). However, it is less expensive to use $P_{\mathrm{row}} = C_1^H$, which corresponds to the standard least squares solution of the overdetermined set of equations

$$C_1 \Psi \approx C_2, \tag{16}$$

followed by an eigendecomposition of $\Psi$.

³ If the spatial covariance matrix of the additive noise is known up to a scalar factor, the SVD can be replaced by the generalized or quotient SVD (QSVD), as described in [17].

IV. UNITARY ESPRIT

A. Multiple Invariance Structure

Unitary ESPRIT is applicable to centro-symmetric array configurations. A sensor array is called centro-symmetric [22] if its element locations are symmetric with respect to the centroid and the complex characteristics of paired elements are the same. Their global array steering matrix $A_G$, therefore, satisfies

$$\Pi_M \bar{A}_G = A_G \Lambda_G \tag{17}$$

for some unitary diagonal matrix $\Lambda_G \in \mathbb{C}^{d \times d}$. Notice that the matrix $A_G \Lambda_G^{1/2}$ is left $\Pi$-real. Uniform linear arrays, for example, the most common arrays used in practice, are centro-symmetric. It is well known that the analogy between array signal processing and time series analysis (harmonic retrieval) can be obtained through uniform linear arrays (ULA's) by interpreting them as uniform sampling of a time series [16]. The centro-symmetry of the global sensor array $A_G$ and (10) imply that the steering matrices of both subarrays are also centro-symmetric, i.e.,

$$\Pi_m \bar{A} = A \Phi \Lambda_G.$$

Without additive noise, the Unitary ESPRIT data matrix

$$Z = [\,X \;\; \Pi_M \bar{X}\,]$$

admits the factorization

$$Z = \begin{bmatrix} X_1 & \Pi_m \bar{X}_2 \\ X_2 & \Pi_m \bar{X}_1 \end{bmatrix} = \begin{bmatrix} A \\ A\Phi \end{bmatrix} [\,S \;\; \Lambda_G \bar{S}\,], \tag{18}$$

which is easily seen by using the centro-symmetry of the subarrays and the unitary nature of $\Phi$. Thus, $Z$ is also rank-deficient, namely $\mathrm{rank}\, Z = d$. Equations (11) and (18) show that Unitary ESPRIT essentially doubles the number of available measurements from $N$ to $2N$. Increased estimation accuracy can, therefore, be achieved by replacing the measurement matrix $X \in \mathbb{C}^{M \times N}$ of the standard ESPRIT formulation (11) by $Z \in \mathbb{C}^{M \times 2N}$, which corresponds to forward-backward averaging of the data.
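The relations of Sections III and IV-A can be illustrated for a ULA with maximum overlap. The sketch below is not the authors' implementation: it assumes half-wavelength element spacing, noise-free data, and a global measurement matrix from which the subarrays are picked by selection matrices. It checks the selection-matrix property (10), runs the least squares version of standard ESPRIT, (15) and (16), and confirms the centro-symmetry relation (17) together with rank Z = d. All names and values are illustrative.

```python
import numpy as np


def exchange(n):
    """n x n exchange matrix Pi_n with ones on the antidiagonal."""
    return np.fliplr(np.eye(n))


rng = np.random.default_rng(1)
M, N = 6, 20
# Spatial frequencies mu_k = omega_0 * Delta * sin(theta_k) / c, assuming Delta = lambda / 2.
mu = np.pi * np.sin(np.deg2rad([-20.0, 15.0]))
d = mu.size

A_G = np.exp(1j * np.outer(np.arange(M), mu))      # ULA steering matrix, M x d
S = rng.standard_normal((d, N)) + 1j * rng.standard_normal((d, N))
X = A_G @ S                                        # noise-free measurements

# Maximum overlap, cf. Fig. 2(a): subarray 1 = first m sensors, subarray 2 = last m sensors.
m = M - 1
J1 = np.eye(m, M)
J2 = np.eye(m, M, k=1)
selection_ok = np.allclose(J2, exchange(m) @ J1 @ exchange(M))   # property (10)

# Standard ESPRIT: signal subspace, least squares solution of (16), eigenvalues of Psi.
Us = np.linalg.svd(X, full_matrices=False)[0][:, :d]
C1, C2 = J1 @ Us, J2 @ Us
Psi = np.linalg.lstsq(C1, C2, rcond=None)[0]
mu_hat = np.sort(np.angle(np.linalg.eigvals(Psi)))

# Centro-symmetry (17) and the forward-backward data matrix Z = [X, Pi_M conj(X)].
Lambda_G = np.diag(np.exp(-1j * (M - 1) * mu))
centro_ok = np.allclose(exchange(M) @ A_G.conj(), A_G @ Lambda_G)
Z = np.hstack([X, exchange(M) @ X.conj()])
rank_Z = np.linalg.matrix_rank(Z)
```

Here `Lambda_G` depends on the phase reference of the array; with the phase center at the array centroid it would reduce to the identity.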

B. Real Implementation

Due to the special algebraic structure of the noise-corrupted data matrix $\tilde{Z}$ and the structure of the subsequent total least squares (TLS) problem, the computational complexity of Unitary ESPRIT can be reduced significantly. This is achieved by transforming the three (complex-valued) rank-revealing factorizations,
• the subspace estimation step
• the subsequent total least squares problem
• the final eigenvalue decomposition (EVD)
into factorizations of real-valued matrices of the same size. Thus, real-valued computations can be maintained for all steps of the Unitary ESPRIT algorithm. The following three propositions derive the required transformations by taking advantage of the mapping between centro-Hermitian and real matrices, cf. Section II. In Remark 3, we also show how the real-valued total least squares problem can be replaced by a real-valued least squares (LS) problem.

Proposition 1 (Signal Subspace Estimation): The principal subspace of $\tilde{Z} \in \mathbb{C}^{M \times 2N}$ (and, therefore, also the principal subspace of $J\tilde{Z}$) can be obtained through a rank-revealing factorization of the real matrix $\mathcal{T}(\tilde{X}) \in \mathbb{R}^{M \times 2N}$, where the transformation $\mathcal{T}(\cdot)$ is defined in (6). Then, the complex matrices $C_1$ and $C_2$, spanning the estimated signal subspace, obey

$$\Pi_m \bar{C}_1 = C_2. \tag{19}$$

Proof: By post-multiplying the noise-corrupted matrix $\tilde{Z}$ with a unitary permutation matrix, we obtain a centro-Hermitian matrix in the following fashion:

$$\tilde{Z}_{\mathrm{CH}} = \tilde{Z} \begin{bmatrix} I_N & 0 \\ 0 & \Pi_N \end{bmatrix} = [\,\tilde{X} \;\; \Pi_M \bar{\tilde{X}} \Pi_N\,]. \tag{20}$$

According to Corollary 1, a rank-revealing factorization of $\tilde{Z}_{\mathrm{CH}}$ can, thus, be obtained through an SVD of the real matrix

$$\varphi_Q(\tilde{Z}_{\mathrm{CH}}) = Q_M^H\, [\,\tilde{X} \;\; \Pi_M \bar{\tilde{X}} \Pi_N\,]\, Q_{2N} = \mathcal{T}(\tilde{X}), \tag{21}$$

which proves the first part of this proposition. Let the $d$ dominant left singular vectors of $\varphi_Q(\tilde{Z}_{\mathrm{CH}})$ be denoted by $E_s \in \mathbb{R}^{M \times d}$. Then, the $d$ dominant left singular vectors of $\tilde{Z}_{\mathrm{CH}}$ as well as $\tilde{Z}$ are given by $Q_M E_s$. Therefore, the matrix

$$\begin{bmatrix} C_1 \\ C_2 \end{bmatrix} = J Q_M E_s = \begin{bmatrix} J_1 Q_M E_s \\ J_2 Q_M E_s \end{bmatrix}$$

provides a basis for the estimated signal subspace. With (10) and the left $\Pi$-realness of $Q_M$ we, finally, get the desired result:

$$\Pi_m \bar{C}_1 = \Pi_m J_1 \bar{Q}_M E_s = \Pi_m \Pi_m J_2 \Pi_M \bar{Q}_M E_s = J_2 Q_M E_s = C_2. \qquad ∎$$

Proposition 2 (Total Least Squares Problem): The complex-valued SVD of size $m \times 2d$ that solves the total least squares (TLS) problem $C_1 \Psi \approx C_2$, which is associated with Unitary ESPRIT, can be transformed into an SVD of the real matrix $\mathcal{T}(C_1) \in \mathbb{R}^{m \times 2d}$, where the transformation $\mathcal{T}(\cdot)$ is defined in (6). Moreover, the eigenvalues $\phi_k$ of the resulting TLS solution $\Psi_{\mathrm{TLS}} \in \mathbb{C}^{d \times d}$ will be symmetric with respect to the unit circle, i.e., there are indices $k, l \in \{1, 2, \ldots, d\}$ such that

$$\phi_k = \frac{1}{\bar{\phi}_l}. \tag{22}$$

Proof: The multidimensional TLS problem $C_1 \Psi \approx C_2$ can be solved through an SVD of

$$[\,C_1 \;\; C_2\,] = U \Sigma V^H \quad \text{with} \quad V = \begin{bmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{bmatrix} \in \mathbb{C}^{2d \times 2d}.$$

Then, the TLS solution is obtained from the $d$ right singular vectors corresponding to the $d$ smallest singular values according to

$$\Psi_{\mathrm{TLS}} = -V_{12} V_{22}^{-1}, \tag{23}$$

where we have assumed that $V_{22} \in GL(d)$, i.e., the TLS solution is unique. For the singular case, the reader is referred to [21]. Thus, the TLS problem associated with Unitary ESPRIT can be solved through an SVD of

$$[\,C_1 \;\; C_2\,] = [\,C_1 \;\; \Pi_m \bar{C}_1\,] \in \mathbb{C}^{m \times 2d}.$$

Notice that this matrix has the same structure as $\tilde{Z} = [\,\tilde{X} \;\; \Pi_M \bar{\tilde{X}}\,]$. Using, therefore, the same reasoning as in (20) and (21), the TLS problem is solved by computing an SVD of the real matrix

$$\mathcal{T}(C_1) = Q_m^H\, [\,C_1 \;\; \Pi_m \bar{C}_1 \Pi_d\,]\, Q_{2d}. \tag{24}$$

Its right singular vectors will be denoted by

$$W = \begin{bmatrix} W_{11} & W_{12} \\ W_{21} & W_{22} \end{bmatrix} \in \mathbb{R}^{2d \times 2d}.$$

Then, the right singular vectors of $[\,C_1 \;\; C_2\,]$ are determined from

$$V = \begin{bmatrix} I_d & 0 \\ 0 & \Pi_d \end{bmatrix} Q_{2d} W, \tag{25}$$

and $\Psi_{\mathrm{TLS}}$ is obtained from (23). Since the matrix $Q_{2d} W$ is left $\Pi$-real, it can be written as

$$Q_{2d} W = \begin{bmatrix} V_1 \\ \Pi_d \bar{V}_1 \end{bmatrix}$$

for some matrix $V_1 \in \mathbb{C}^{d \times 2d}$, cf. (2). With (25) we, therefore, conclude $V_{22} = \bar{V}_{12}$. Thus, if $\phi_i$ is an eigenvalue of the TLS solution $\Psi_{\mathrm{TLS}} \in GL(d)$, $1/\bar{\phi}_i$ is an eigenvalue of

$$\bar{\Psi}_{\mathrm{TLS}}^{-1} = -V_{12} \bar{V}_{12}^{-1} = \Psi_{\mathrm{TLS}},$$

which proves (22). ∎
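Propositions 1 and 2 lend themselves to a numerical cross-check. The following sketch is an illustration under assumptions, not the authors' code: it uses a noisy ULA with maximum overlap, builds $\mathcal{T}(\cdot)$ directly from definition (6) with the matrices (3) and (4), compares it with the closed form obtained by evaluating (6) blockwise as in (7), and then compares the real-valued route with the complex-valued one. The eigenvalue mapping is the transformation $-(\omega - j)/(\omega + j)$ that follows from (23) and (25) when $Q_{2d}$ is taken from (3). The function names `left_pi_real`, `T`, and `T_closed` are hypothetical.

```python
import numpy as np


def exchange(n):
    """n x n exchange matrix Pi_n."""
    return np.fliplr(np.eye(n))


def left_pi_real(n):
    """Unitary left Pi-real matrix Q_n, cf. (3) for even n and (4) for odd n (n >= 2)."""
    k = n // 2
    I, Pi = np.eye(k), exchange(k)
    if n % 2 == 0:
        Q = np.block([[I, 1j * I], [Pi, -1j * Pi]])
    else:
        z = np.zeros((k, 1))
        Q = np.block([[I, z, 1j * I],
                      [z.T, np.array([[np.sqrt(2.0)]]), z.T],
                      [Pi, z, -1j * Pi]])
    return Q / np.sqrt(2.0)


def T(G):
    """Definition (6); the complex result is returned so that realness can be checked."""
    p, q = G.shape
    G_ch = np.hstack([G, exchange(p) @ G.conj() @ exchange(q)])
    return left_pi_real(p).conj().T @ G_ch @ left_pi_real(2 * q)


def T_closed(G):
    """Closed form of (6) in terms of real and imaginary parts, cf. (7)."""
    p = G.shape[0]
    n = p // 2
    G1, G2 = G[:n], G[p - n:]
    plus = G1 + exchange(n) @ G2.conj()
    minus = G1 - exchange(n) @ G2.conj()
    rows = [np.hstack([plus.real, -minus.imag])]
    if p % 2:                                   # center row exists only for odd p
        g = G[n:n + 1]
        rows.append(np.sqrt(2.0) * np.hstack([g.real, -g.imag]))
    rows.append(np.hstack([plus.imag, minus.real]))
    return np.vstack(rows)


rng = np.random.default_rng(2)
M, N, d = 6, 10, 2
m = M - 1
mu = np.array([-0.9, 0.7])                      # illustrative spatial frequencies
A_G = np.exp(1j * np.outer(np.arange(M), mu))
S = rng.standard_normal((d, N)) + 1j * rng.standard_normal((d, N))
noise = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
X = A_G @ S + 0.1 * noise

# Proposition 1: real SVD of T(X) versus complex SVD of Z = [X, Pi_M conj(X)].
TX = T(X)
imag_residual = np.max(np.abs(TX.imag))
Es = np.linalg.svd(TX.real)[0][:, :d]
B = left_pi_real(M) @ Es
Uz = np.linalg.svd(np.hstack([X, exchange(M) @ X.conj()]))[0][:, :d]
subspace_gap = np.linalg.norm(B @ B.conj().T - Uz @ Uz.conj().T)

C1, C2 = np.eye(m, M) @ B, np.eye(m, M, k=1) @ B
gap_19 = np.linalg.norm(exchange(m) @ C1.conj() - C2)        # relation (19)

# Proposition 2: complex TLS solution (23) versus the real SVD of T(C1), cf. (24).
V = np.linalg.svd(np.hstack([C1, C2]))[2].conj().T
phi_complex = np.linalg.eigvals(-V[:d, d:] @ np.linalg.inv(V[d:, d:]))
W = np.linalg.svd(T(C1).real)[2].T
omega = np.linalg.eigvals(-W[:d, d:] @ np.linalg.inv(W[d:, d:]))
phi_real = -(omega - 1j) / (omega + 1j)
```

Only the last three lines of the real-valued route involve complex arithmetic, and they act on d scalars rather than on matrices.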

Proposition 3 (Eigenvalue Decomposition): The eigenvalues of the complex matrix $\Psi_{\mathrm{TLS}}$ can be determined from the eigenvalues of a real matrix of size $d \times d$ via the linear fractional transformation

$$f(x) = -\frac{x - j}{x + j}. \tag{26}$$

Moreover, the eigenvectors of both matrices are identical.

Proof: a) Assume, for the moment, that the left $\Pi$-real matrix $Q_{2d}$ is the one we have defined in (3). Then, (25) yields

$$V = \frac{1}{\sqrt{2}} \begin{bmatrix} I_d & jI_d \\ I_d & -jI_d \end{bmatrix} W.$$

After partitioning $V$ and $W$ as before, we therefore conclude from (23)

$$\Psi_{\mathrm{TLS}} = -(W_{12} + jW_{22})(W_{12} - jW_{22})^{-1} = -\big((-W_{12} W_{22}^{-1}) - jI_d\big)\big((-W_{12} W_{22}^{-1}) + jI_d\big)^{-1} = f(\Upsilon_{\mathrm{TLS}}) \quad \text{with} \quad \Upsilon_{\mathrm{TLS}} = -W_{12} W_{22}^{-1}. \tag{27}$$

Here, $f(x)$ denotes the linear fractional transformation (26), which is analytic for $x \ne -j$. Let

$$\Upsilon_{\mathrm{TLS}} = -W_{12} W_{22}^{-1} = T \Omega T^{-1} \quad \text{with} \quad \Omega = \mathrm{diag}\{\omega_k\}_{k=1}^{d} \tag{28}$$

be the eigenvalue decomposition (EVD) of the real matrix $\Upsilon_{\mathrm{TLS}}$. It is a well-known result from function theory that the eigenvalues of $\Psi_{\mathrm{TLS}}$ can be obtained through the same linear fractional transformation, i.e.,

$$\Psi_{\mathrm{TLS}} = f(\Upsilon_{\mathrm{TLS}}) = T f(\Omega) T^{-1} \quad \text{with} \quad f(\Omega) = \mathrm{diag}\{f(\omega_k)\}_{k=1}^{d},$$

and the corresponding eigenvectors of $\Upsilon_{\mathrm{TLS}}$ and $\Psi_{\mathrm{TLS}}$ are identical.

b) An arbitrary left $\Pi$-real matrix of dimension $2d \times 2d$ can, obviously, be written as $Q'_{2d} = Q_{2d} R_{2d}$ with a real matrix $R_{2d} \in \mathbb{R}^{2d \times 2d}$. After replacing the real matrix $W$ by $W' = R_{2d} W$, we invoke the same reasoning as above to prove this proposition for an arbitrary left $\Pi$-real transformation $Q'_{2d}$. ∎

Remark 1 (Covariance Approach): Instead of the described square root (or direct data) approach based on a real-valued SVD, cf. Proposition 1, we can use a covariance approach based on a real-valued EVD to determine the signal subspace estimate. Then, $E_s \in \mathbb{R}^{M \times d}$ denotes the $d$ principal eigenvectors of

$$\mathcal{T}(\tilde{X})\, \mathcal{T}(\tilde{X})^T \in \mathbb{R}^{M \times M}. \tag{29}$$

First forming $\mathcal{T}(\tilde{X})$ according to (7), followed by the computation of (29), is more efficient than the alternative approach suggested in [8] and [24]. There, it is proposed to compute the complex-valued sample covariance matrix $R_{XX} = \tilde{X} \tilde{X}^H \in \mathbb{C}^{M \times M}$ first. Then, $E_s$ is determined from the EVD of $\mathrm{Re}\{Q_M^H R_{XX} Q_M\}$, which is computationally more expensive than using (29).

C. Reliability Test

Proposition 2 states that the eigenvalues of $\Psi_{\mathrm{TLS}}$, i.e., the phase factors estimated via Unitary ESPRIT, are symmetric with respect to the unit circle, since they satisfy (22). This observation gives rise to a new reliability test provided by Unitary ESPRIT without demanding additional computations. This reliability test is a substantial improvement of current high-resolution array signal processing and spectral estimation techniques since usually there is no easy way to determine how reliable the resulting estimates are. Unreliable estimation results might have been caused by a false estimate of the number of sources $d$ or by the fact that there is no source signal at all (only noise).

Remark 2 (Eigenvalues with Unit Modulus): Notice first that the eigenvalues $\phi_k$ that lie on the unit circle form a subset with nonzero measure in the class of all eigenvalues fulfilling (22), i.e., being symmetric with respect to the unit circle. Owing to this and the fact that Unitary ESPRIT produces consistent DOA estimates, asymptotically all the estimated phase factors $\phi_k$ will be on the unit circle. If, however, the number of snapshots $N$ is too small or if there is only noise present, the eigenvalues of $\Psi_{\mathrm{TLS}}$ might fail to satisfy

$$|\phi_k| = 1, \qquad 1 \le k \le d, \tag{30}$$

which indicates that the subsequent estimates will be unreliable. Hence, no further computations should be carried out. Condition (30) implies that all eigenvalues $\omega_k$ of $\Upsilon_{\mathrm{TLS}}$ are real, cf. (26).⁴ Thus, if some of the $\omega_k$ occur in complex conjugate pairs, the Unitary ESPRIT reliability test has failed, and the algorithm has to be restarted with an increased window length $N$ or more reliable measurements. If, conversely, all eigenvalues $\omega_k$ are real, i.e., the reliability test has been "passed," all estimated phase factors $\phi_k$ are precisely on the unit circle.

⁴ Recall that the eigenvalues of a real matrix can either be real or they occur in complex conjugate pairs.

D. Real-Valued Least Squares

Notice that the derivation of the Unitary ESPRIT reliability test is based on a total least squares solution of (16). Thus, the computation of $\Upsilon_{\mathrm{TLS}}$ requires an SVD (or another rank-revealing factorization) of $\mathcal{T}(C_1) \in \mathbb{R}^{m \times 2d}$. By computing the less expensive least squares instead of the total least squares solution of (16), we would, however, lose the benefits of the reliability test, since (22) would no longer be satisfied. Moreover, the complex-valued least squares problem (16) cannot be transformed into a real-valued problem of the same size. The following remark, however, sets up a different, real-valued least squares problem, which can be solved instead.

Remark 3 (Least Squares Estimate): After partitioning the real matrix of (24) according to

$$\mathcal{T}(C_1) = [\,T_1 \;\; T_2\,], \qquad T_1, T_2 \in \mathbb{R}^{m \times d},$$

it is easy to see that $\Upsilon_{\mathrm{TLS}}$ is a TLS solution of the real-valued system of equations

$$T_1 \Upsilon \approx T_2. \tag{31}$$

To save computations, we can, therefore, solve (31) by computing its least squares solution $\Upsilon_{\mathrm{LS}}$. Here, the Unitary ESPRIT reliability test is still applicable, since the resulting matrix $\Upsilon_{\mathrm{LS}}$ is always real. If the reliability test has been "passed", the estimated phase factors are on the unit circle. The real-valued LS or TLS problem (31) can directly be obtained from $E_s$ by observing

$$\mathcal{T}(C_1) = Q_m^H\, [\,C_1 \;\; C_2\,] \begin{bmatrix} I_d & 0 \\ 0 & \Pi_d \end{bmatrix} Q_{2d} = Q_m^H\, [\,J_1 Q_M E_s \;\; J_2 Q_M E_s\,]\, \frac{1}{\sqrt{2}} \begin{bmatrix} I_d & jI_d \\ I_d & -jI_d \end{bmatrix} = \frac{1}{\sqrt{2}}\, [\,K_1 E_s \;\; K_2 E_s\,],$$

where the selection matrices $K_1$ and $K_2$ are defined as follows:

$$K_1 = Q_m^H \{J_1 + J_2\} Q_M, \qquad K_2 = Q_m^H \{\, j (J_1 - J_2) \,\} Q_M.$$

Since the matrices in braces are centro-Hermitian, $K_1$ and $K_2$ are always real, cf. Theorem 1. They are even sparse if the selection matrix $J_1$ is sparse and the matrices $Q_m$ and $Q_M$ are chosen according to (3) or (4). This is illustrated by the following example. For the ULA with maximum overlap sketched in Fig. 2(a), $J_1$ is given by

$$J_1 = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \end{bmatrix} \in \mathbb{R}^{5 \times 6}.$$

Thus, straightforward calculations yield

$$K_1 = \begin{bmatrix} 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & \sqrt{2} & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 \end{bmatrix}, \qquad K_2 = \begin{bmatrix} 0 & 0 & 0 & -1 & 1 & 0 \\ 0 & 0 & 0 & 0 & -1 & 1 \\ 0 & 0 & 0 & 0 & 0 & -\sqrt{2} \\ 1 & -1 & 0 & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 & 0 & 0 \end{bmatrix}.$$

Now, we are in the position to summarize the described real implementation of Unitary ESPRIT, which is given in Table I. Here, the left $\Pi$-real matrices $Q_m$ and $Q_M$ are chosen according to (3) or (4). Notice that a linear estimate of the source signal matrix $S$ (Step 7) can easily be obtained by applying the results of this section to the source signal matrix estimate derived in [4], where (without loss of generality) the additive noise is assumed to be spatially uncorrelated.

V. COMPUTER SIMULATIONS

In this section, we present some simulation results that compare Unitary ESPRIT with the standard ESPRIT algorithm, using the SVD implementation in all cases. Among others, we examine scenarios where the standard ESPRIT algorithm faces some problems, like low signal-to-noise ratios, short window lengths, and correlated source signals.

A. Signal Reconstruction

First, we examine the effect of Unitary ESPRIT on the resulting signal estimates. To this end, three impinging wavefronts are reconstructed using a single ULA of $M = 9$ sensors with maximum overlap, cf. Fig. 2(a). The three uncorrelated equi-powered QPSK signals arrive from $\theta_1 = 10°$, $\theta_2 = 20°$, and $\theta_3 = 30°$, respectively. Fig. 3 depicts the resulting output signal-to-noise-and-interference ratio (SNIR) as a function of the SNR and the number of snapshots $N$ using standard ESPRIT (dashed lines) and Unitary ESPRIT (solid lines). The values of $N$ marked on the right side of the figure correspond to the solid lines, i.e., Unitary ESPRIT. The output SNIR achieved by the standard ESPRIT algorithm for a given value of $N$ (dashed lines) can be found below the corresponding solid lines. For small values of $N$, e.g., $N = 5$ snapshots, Unitary ESPRIT achieves a significantly better performance than the standard ESPRIT algorithm. Notice that standard ESPRIT with $N = 10$ snapshots attains the same performance as Unitary ESPRIT with $N = 5$ snapshots for SNR's that are greater than 15 dB, while the performance of standard ESPRIT with $N = 20$ is comparable to the performance of Unitary ESPRIT with $N = 10$ for SNR's that are greater than 5 dB. Thus, Unitary ESPRIT essentially doubles the number of available snapshots $N$ compared to the standard ESPRIT algorithm.

B. DOA Estimation

0

0

E. Summary of the Algorithm Before presenting a summary of Unitary ESPRIT, we note an interesting relationship between the eigenvalues of the real matrix T denoted by Wk, and the estimated phase factors c/>k ei 1t k , cf. (9). Solving ei l Lk == f(Wk) for the spatial frequencies J.Lk yields the simple expression

Next, we investigate the effect of Unitary ESPRIT on the estimated phase factors 4>~. ~ 1 ::; It; ::; d. Consider a ULA with itt 6 sensors and three correlated signals impinging from £}1 20° ,()2 = 0° ~ and £}3 == 20°. Their correlation matrix is given by

= =-

=

1 (1 +

J-Lk ·.if -":= -:-In J

jWk) . 1 - JWk

= 2 arctan Wk,

k=1,2~···,d.

(33)

R;J:x

==

[~p p2i :1

2

]

(34)
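The correlated-source scenario can be reproduced numerically. The following is a minimal sketch, assuming the structure of (34) implied by the figure captions ($\rho_{12} = \rho_{13} = \rho$, $\rho_{23} = \rho^2$); the function names are hypothetical, and the zero-mean complex Gaussian sources are an illustrative choice rather than the signals used in the paper.

```python
import numpy as np

def correlation_matrix(rho):
    """3 x 3 source correlation matrix with rho_12 = rho_13 = rho, rho_23 = rho**2."""
    return np.array([[1.0, rho, rho],
                     [rho, 1.0, rho ** 2],
                     [rho, rho ** 2, 1.0]])

def correlated_sources(rho, n_snapshots, rng):
    """Draw unit-power complex Gaussian sources with the correlation above (rho < 1)."""
    chol = np.linalg.cholesky(correlation_matrix(rho))   # R = chol @ chol.T
    white = (rng.standard_normal((3, n_snapshots))
             + 1j * rng.standard_normal((3, n_snapshots))) / np.sqrt(2.0)
    return chol @ white

R = correlation_matrix(0.5)
chol = np.linalg.cholesky(R)
S = correlated_sources(0.5, 20, np.random.default_rng(0))
```

The determinant of this matrix works out to $(1 - \rho^2)^2$, so the Cholesky factor exists for every $|\rho| < 1$; the fully coherent case $\rho = 1$ would need a different construction.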

The phase factors $\phi_1$, $\phi_2$, and $\phi_3$, estimated with the standard ESPRIT algorithm and Unitary ESPRIT, are marked by crosses (+) in the complex plane as depicted in Figs. 4 and 5 for


TABLE I
SUMMARY OF UNITARY ESPRIT

1. Initialization: Form the matrix $X \in \mathbb{C}^{M \times N}$ from the available measurements.

2. Signal Subspace Estimation: Determine the real matrix $\mathcal{T}(X) \in \mathbb{R}^{M \times 2N}$ from (7), and compute the SVD of $\mathcal{T}(X)$ (square root approach) or the eigendecomposition of $\mathcal{T}(X)\mathcal{T}(X)^H$ (covariance approach). The $d$ dominant left singular vectors or eigenvectors will be called $E_s \in \mathbb{R}^{M \times d}$. Estimate the number of sources $d$, if $d$ is not known a priori [22].

3. (Total) Least Squares: Solve the overdetermined system of equations
$$K_1 E_s \Upsilon \approx K_2 E_s$$
by means of least squares (or total least squares) techniques. The selection matrices $K_1$ and $K_2$ are defined in (32).

4. Eigenvalue Decomposition: Compute the eigendecomposition of the resulting solution
$$\Upsilon = T \Omega T^{-1} \in \mathbb{R}^{d \times d}, \quad \text{where } \Omega = \mathrm{diag}\{\omega_k\}_{k=1}^{d}.$$

5. Reliability Test: If all eigenvalues $\omega_k$ are real, the estimates will be reliable. Otherwise, start again with more measurements.

6. DOA Estimation: Estimate the directions of arrival (DOA's) from
$$\mu_k = 2 \arctan \omega_k, \qquad k = 1, 2, \ldots, d,$$
according to (9).

7. Signal Reconstruction: A linear estimate of the source signal matrix $S \in \mathbb{C}^{d \times N}$ is given by
$$\hat{S} = \left( D T^{-1} E_s^H Q_M^H \right) X,$$
where $D \in \mathbb{C}^{d \times d}$ denotes an arbitrary diagonal (row) scaling matrix [4].
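Steps 1 to 6 of the table can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the authors' implementation: `left_pi_real`, `selection_matrices`, and `unitary_esprit` are hypothetical names; the sparse unitary matrices are assumed to have the usual left $\Pi$-real form (equations (3) and (4) are not reproduced here); the transformation of (7) is assumed to be $Q_M^H [X \;\; \Pi_M \bar{X} \Pi_N] Q_{2N}$; and the array is a ULA with maximum overlap, so that $J_2$ selects the last $M - 1$ sensors.

```python
import numpy as np

def left_pi_real(n):
    """Sparse unitary left Pi-real matrix Q_n (assumed form of (3)/(4))."""
    k = n // 2
    I, P = np.eye(k), np.fliplr(np.eye(k))
    if n % 2 == 0:
        Q = np.block([[I, 1j * I], [P, -1j * P]])
    else:
        z = np.zeros((k, 1))
        Q = np.block([[I, z, 1j * I],
                      [z.T, np.full((1, 1), np.sqrt(2.0)), z.T],
                      [P, z, -1j * P]])
    return Q / np.sqrt(2.0)

def selection_matrices(M):
    """Real K1, K2 for a ULA with maximum overlap (m = M - 1 subarray sensors)."""
    m = M - 1
    J2 = np.eye(M)[1:, :]                       # selects sensors 2..M
    G = left_pi_real(m).conj().T @ J2 @ left_pi_real(M)
    return 2.0 * G.real, 2.0 * G.imag           # equivalent to the braces in (32)

def unitary_esprit(X, d):
    """Return sorted spatial frequencies mu_k and the reliability flag."""
    M, N = X.shape
    Z = np.hstack([X, np.flipud(np.fliplr(X.conj()))])      # [X, Pi conj(X) Pi]
    T = (left_pi_real(M).conj().T @ Z @ left_pi_real(2 * N)).real
    Es = np.linalg.svd(T, full_matrices=False)[0][:, :d]    # step 2
    K1, K2 = selection_matrices(M)
    Ups = np.linalg.lstsq(K1 @ Es, K2 @ Es, rcond=None)[0]  # step 3 (LS version)
    omega = np.linalg.eigvals(Ups)                          # step 4
    reliable = bool(np.all(np.isreal(omega)))               # step 5
    return np.sort(2.0 * np.arctan(omega.real)), reliable   # step 6

# Noise-free demonstration with two sources on a 6-element ULA.
rng = np.random.default_rng(0)
M, N = 6, 10
mu_true = np.array([-0.8, 0.5])
A = np.exp(1j * np.outer(np.arange(M), mu_true))            # steering matrix
S = rng.standard_normal((2, N)) + 1j * rng.standard_normal((2, N))
mu_hat, reliable = unitary_esprit(A @ S, d=2)
K1, K2 = selection_matrices(6)
```

The imaginary part discarded when forming `T` vanishes in exact arithmetic, because the extended data matrix is centro-Hermitian. When the reliability flag is false, the sketch simply takes the real parts of the eigenvalues, whereas the table recommends collecting more measurements.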

Fig. 3. Output SNIR as a function of the SNR and the number of snapshots $N$ using standard ESPRIT (dashed lines) and Unitary ESPRIT (solid lines) for $\theta_1 = 10^\circ$, $\theta_2 = 20^\circ$, and $\theta_3 = 30^\circ$ ($M = 9$ sensors, 1000 trial runs). The values of $N$ marked on the right side of the figure correspond to the solid lines, i.e., Unitary ESPRIT. The output SNIR achieved by the standard ESPRIT algorithm for a given value of $N$ (dashed lines) can be found below the corresponding solid lines.

Fig. 4. Phase factors $\phi_1$, $\phi_2$, and $\phi_3$, estimated with the standard ESPRIT algorithm for $\theta_1 = -20^\circ$, $\theta_2 = 0^\circ$, $\theta_3 = 20^\circ$, and correlation coefficients $\rho_{12} = 0.5$, $\rho_{13} = 0.5$, and $\rho_{23} = 0.25$ ($M = 6$ sensors, SNR = 0 dB, $N = 20$, 80 trial runs).

a correlation coefficient of $\rho = 0.5$. The results of 80 trial runs with $N = 20$ snapshots and an SNR of 0 dB are shown. Notice that all phase factors estimated with Unitary ESPRIT are precisely on the unit circle (Fig. 5). Figs. 6 and 7 depict the estimated phase factors for a correlation coefficient of $\rho = 0.8$. In this example, the Unitary ESPRIT reliability test has failed three times. To picture these failures, the corresponding phase factor estimates are surrounded by circles (o), cf. Fig. 7. Notice that the variance of the DOA estimates that pass the Unitary ESPRIT reliability test is much


Fig. 5. Phase factors $\phi_1$, $\phi_2$, and $\phi_3$, estimated with Unitary ESPRIT for $\theta_1 = -20^\circ$, $\theta_2 = 0^\circ$, $\theta_3 = 20^\circ$, and correlation coefficients $\rho_{12} = 0.5$, $\rho_{13} = 0.5$, and $\rho_{23} = 0.25$ ($M = 6$ sensors, SNR = 0 dB, $N = 20$, 80 trial runs).

Fig. 7. Phase factors $\phi_1$, $\phi_2$, and $\phi_3$, estimated with Unitary ESPRIT for $\theta_1 = -20^\circ$, $\theta_2 = 0^\circ$, $\theta_3 = 20^\circ$, and correlation coefficients $\rho_{12} = 0.8$, $\rho_{13} = 0.8$, and $\rho_{23} = 0.64$ ($M = 6$ sensors, SNR = 0 dB, $N = 20$, 80 trial runs). Estimates that produced a failure of the reliability test are surrounded by a circle (o).

Fig. 6. Phase factors $\phi_1$, $\phi_2$, and $\phi_3$, estimated with the standard ESPRIT algorithm for $\theta_1 = -20^\circ$, $\theta_2 = 0^\circ$, $\theta_3 = 20^\circ$, and correlation coefficients $\rho_{12} = 0.8$, $\rho_{13} = 0.8$, and $\rho_{23} = 0.64$ ($M = 6$ sensors, SNR = 0 dB, $N = 20$, 80 trial runs).

lower than the variance of the DOA estimates obtained by the standard ESPRIT approach. The advantages of Unitary ESPRIT become even more evident if the root mean squared error (RMSE) of the estimated directions of arrival is plotted as a function of the correlation coefficient $\rho$. Fig. 8 shows these curves for SNR's of -3, 0, and 5 dB using 3000 trial runs. The standard ESPRIT algorithm (dashed line) is compared with Unitary ESPRIT without reliability test (dotted line) and Unitary ESPRIT with the new reliability test (solid line). It can be seen that Unitary ESPRIT improves the estimation accuracy considerably. In the case of low SNR's, the estimation accuracy is improved even further by exploiting the information provided by the new reliability test. The corresponding failure rates of the Unitary ESPRIT reliability test are plotted in Fig. 9.

Due to the forward-backward averaging effect, Unitary ESPRIT can separate two completely coherent wavefronts, which is demonstrated in the next example. Two correlated signals with correlation coefficient $\rho$ are impinging on a ULA of $M = 4$ sensors from $\theta_1 = 0^\circ$ and $\theta_2 = 20^\circ$. Fig. 10 shows the resulting RMS error of the estimated DOA's as a function of $\rho$. The performance of Unitary ESPRIT is not affected by the correlation, while the performance of standard ESPRIT deteriorates dramatically as $\rho$ increases. Notice also that the difference between the TLS and LS versions of standard ESPRIT is negligible, while the LS and the TLS versions of Unitary ESPRIT fall on top of one another. Thus, it is advisable to use the LS version of Unitary ESPRIT instead of the computationally more expensive TLS version. Finally, Figs. 11 and 12 show the RMS error of the estimated DOA's as a function of the magnitude and phase of a complex-valued correlation coefficient $\rho$, confirming the conclusions drawn from Fig. 10.

Fig. 8. Root mean squared error (RMSE) in degrees of the estimated directions of arrival as a function of the correlation coefficient $\rho$ and the SNR for $\theta_1 = -20^\circ$, $\theta_2 = 0^\circ$, and $\theta_3 = 20^\circ$ ($M = 6$ sensors, $N = 20$, 3000 trial runs). The signal correlation matrix is given by (34).

Fig. 9. Failures of the Unitary ESPRIT reliability test as a function of the correlation coefficient $\rho$ for $\theta_1 = -20^\circ$, $\theta_2 = 0^\circ$, and $\theta_3 = 20^\circ$ ($M = 6$ sensors, $N = 20$, 3000 trial runs). Once again, the signal correlation matrix is given by (34). These curves correspond to the solid lines in Fig. 8.

Fig. 10. RMSE (in degrees) of the estimated directions of arrival as a function of the correlation coefficient $\rho$ for $\theta_1 = 0^\circ$ and $\theta_2 = 20^\circ$ ($M = 4$ sensors, $N = 20$, 100 trial runs). Notice that the curves for the LS and the TLS version of Unitary ESPRIT fall on top of one another.

Fig. 11. RMSE (in degrees) of the estimated directions of arrival as a function of the magnitude and phase of the complex correlation coefficient $\rho$ for $\theta_1 = 0^\circ$ and $\theta_2 = 20^\circ$ using standard ESPRIT ($M = 4$ sensors, $N = 20$, 100 trial runs).

Fig. 12. RMSE (in degrees) of the estimated directions of arrival as a function of the magnitude and phase of the complex correlation coefficient $\rho$ for $\theta_1 = 0^\circ$ and $\theta_2 = 20^\circ$ using Unitary ESPRIT ($M = 4$ sensors, $N = 20$, 100 trial runs).

VI. CONCLUDING REMARKS

An improved version of the ESPRIT algorithm, called Unitary ESPRIT, has been presented in this paper. Unitary ESPRIT represents a simple method to constrain the estimated phase factors to the unit circle, yielding more accurate signal subspace estimates. The computational complexity is reduced significantly by exploiting the one-to-one correspondence between centro-Hermitian and real matrices, allowing a transformation to real matrices, which can be maintained for all steps of the algorithm. Unitary ESPRIT also provides a new reliability test, which is particularly useful at extremely low SNR's. Due to the inherent forward-backward averaging effect, Unitary ESPRIT can separate two completely coherent sources and provides improved estimates for correlated signals. Moreover, Unitary ESPRIT offers a great potential to improve the performance of approximate signal subspace estimation techniques, which are well suited for an adaptive implementation, since inexpensive updating strategies are known [5]. The fact that Unitary ESPRIT is efficiently formulated in terms of real-valued computations from start to finish is critically important for the extension to 2-D centro-symmetric arrays with a dual invariance structure. 2-D Unitary ESPRIT [23] provides automatically paired source azimuth and elevation angle estimates along with an efficient way to reconstruct the impinging wavefronts. Furthermore, an efficient DFT beamspace implementation of Unitary ESPRIT has also been derived in [23], enabling reduced dimension processing in beamspace, if there is a priori information on the general angular location of the DOA's.

ACKNOWLEDGMENT

The authors would like to thank Prof. M. Viberg, Gothenburg, Sweden, for his helpful comments regarding the contents of this paper.

REFERENCES

[1] C. H. Bischof and G. M. Shroff, "On updating signal subspaces," IEEE Trans. Signal Processing, vol. 40, pp. 96-105, Jan. 1992.
[2] T. F. Chan, "Rank revealing QR factorization," Linear Algebra and its Applications, vol. 88/89, pp. 67-82, 1987.
[3] J. Gotze and A. J. van der Veen, "On-line subspace estimation using a Schur-type method," IEEE Trans. Signal Processing, Nov. 1993, submitted for publication.
[4] M. Haardt and M. E. Ali-Hackl, "Unitary ESPRIT: How to exploit additional information inherent in the rotational invariance structure," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, Adelaide, Australia, Apr. 1994, pp. 229-232, vol. IV.
[5] M. Haardt and J. Gotze, "Unitary Schur-type subspace estimation," Technical University of Munich, Inst. of Network Theory Circuit Design, Munich, Germany, Tech. Rep. TUM-LNS-TR-94-6, July 1994.
[6] M. Haardt, P. Weismuller, and R. Killmann, "The identification of late fields: A multichannel high-resolution state space approach," in Proc. 14th GRETSI Symp. Signal Image Processing, Juan-les-Pins, France, Sept. 1993, pp. 1251-1254.
[7] Y. Hua and T. K. Sarkar, "On SVD for estimating generalized eigenvalues of singular matrix pencil in noise," IEEE Trans. Signal Processing, vol. 39, pp. 892-900, Apr. 1991.
[8] K. C. Huarng and C. C. Yeh, "A unitary transformation method for angle-of-arrival estimation," IEEE Trans. Signal Processing, vol. 39, pp. 975-977, Apr. 1991.
[9] K. C. Huarng and C. C. Yeh, "Adaptive beamforming with conjugate symmetric weights," IEEE Trans. Antennas Propagat., vol. 39, pp. 926-932, July 1991.
[10] A. Lee, "Centrohermitian and skew-centrohermitian matrices," Linear Algebra and its Applications, vol. 29, pp. 205-210, 1980.
[11] K. J. R. Liu, D. P. O'Leary, G. W. Stewart, and Y. J. J. Wu, "An adaptive ESPRIT based on URV decomposition," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, Minneapolis, MN, Apr. 1993, pp. 37-40, vol. IV.
[12] B. Ottersten, M. Viberg, and T. Kailath, "Performance analysis of the total least squares ESPRIT algorithm," IEEE Trans. Signal Processing, vol. 39, pp. 1122-1135, May 1991.
[13] S. U. Pillai and B. H. Kwon, "Forward/backward spatial smoothing techniques for coherent signal identification," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 8-15, Jan. 1989.
[14] B. D. Rao and K. V. S. Hari, "Weighted subspace methods and spatial smoothing: Analysis and comparison," IEEE Trans. Signal Processing, vol. 41, pp. 788-803, Feb. 1993.
[15] R. Roy and T. Kailath, "ESPRIT-Estimation of signal parameters via rotational invariance techniques," in Signal Processing Part II: Control Theory and Applications (L. Auslander, F. A. Grunbaum, J. W. Helton, T. Kailath, P. Khargonekar, and S. Mitter, Eds.). Berlin, Vienna, New York: Springer-Verlag, 1990, pp. 369-411.
[16] R. Roy, A. Paulraj, and T. Kailath, "ESPRIT-A subspace rotation approach to estimation of parameters of cisoids in noise," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, pp. 1340-1342, Oct. 1986.
[17] R. H. Roy, "ESPRIT-Estimation of Signal Parameters via Rotational Invariance Techniques," Ph.D. thesis, Stanford Univ., Stanford, CA, Aug. 1987.
[18] G. W. Stewart, "An updating algorithm for subspace tracking," IEEE Trans. Signal Processing, vol. 40, pp. 1535-1541, June 1992.
[19] A. J. van der Veen, "A Schur method for low-rank matrix approximation," SIAM J. Matrix Anal. Appl., 1994, accepted for publication.
[20] A. J. van der Veen, E. F. Deprettere, and A. L. Swindlehurst, "Subspace-based signal analysis using singular value decomposition," Proc. IEEE, vol. 81, pp. 1277-1308, Sept. 1993.
[21] S. Van Huffel and J. Vandewalle, The Total Least Squares Problem: Computational Aspects and Analysis (Frontiers in Applied Mathematics, vol. 9). Philadelphia, PA: Soc. Ind. and Applied Math., 1991.
[22] G. Xu, R. H. Roy, and T. Kailath, "Detection of number of sources via exploitation of centro-symmetry property," IEEE Trans. Signal Processing, vol. 42, pp. 102-112, Jan. 1994.
[23] M. D. Zoltowski, M. Haardt, and C. P. Mathews, "Closed-form angle estimation with rectangular arrays in element space or beamspace via Unitary ESPRIT," IEEE Trans. Signal Processing, July 1994, submitted for publication.
[24] M. D. Zoltowski, G. M. Kautz, and S. D. Silverstein, "Beamspace Root-MUSIC," IEEE Trans. Signal Processing, vol. 41, pp. 344-364, Jan. 1993.

Joint Angle and Delay Estimation (JADE) for Multipath Signals Arriving at an Antenna Array Michaela C. Vanderveen, Constantinos B. Papadias, Student Member, IEEE, and Arogyaswami Paulraj, Fellow, IEEE

Abstract- We propose a novel subspace approach to estimate the angles-of-arrival and delays of multipath signals from digitally modulated sources arriving at an antenna array. Our method uses a collection of estimates of a space-time vector channel. The Cramer-Rao bound (CRB) and simulations are provided.

I. INTRODUCTION

IN WIRELESS communications, mobiles emit signals that arrive at a base station via multiple paths. Estimating each path's angle-of-arrival (AOA) and propagation delay is necessary for several applications, such as mobile localization for emergency services. It is in fact a classical radar problem. This work proposes a novel approach (JADE) to estimating the AOA's and delays of the multipath signals using a collection of space-time channel estimates (analogous to snapshots in subspace methods for AOA estimation) which have constant parameters of interest but different path fade amplitudes. JADE can work in cases when the number of paths exceeds the number of antennas, unlike the traditional MUSIC and ESPRIT algorithms [1], [2].

II. DATA MODEL

We focus on the case for a single user first and show later how this approach can be extended to multiple users. The received baseband signal at the ith element of an m-element antenna array is given by

$$x_i(t) = \sum_{l=1}^{L} a_i(\theta_l) \beta_l(t) r(t - \tau_l) + n_i(t) \qquad (1)$$

where $L$ is the number of multipaths, $a_i(\theta_l)$ is the response of the $i$th antenna to the $l$th path arriving from angle $\theta_l$, $\beta_l(t)$ is the complex envelope of the path fading, $\tau_l$ is the path delay, $n_i(t)$ is the additive noise, and $r(\cdot)$ is the transmitted signal, given by $r(t) = \sum_i b(i) g(t - iT)$, where $\{b(i)\}$ is the sequence of data bits, $g(t)$ is the modulation waveform, and $T$ is the symbol period. We collect all the sensor responses to $\theta_l$ into an $m$-element vector $a(\theta_l) = [a_1(\theta_l) \; \cdots \; a_m(\theta_l)]^T$ and similarly the received signals into a vector $x(t)$ such that

$$x(t) = \sum_{l=1}^{L} a(\theta_l) \beta_l(t) r(t - \tau_l) + n(t). \qquad (2)$$

We sample this signal at the symbol rate (i.e., instants $kT$). Let $P$ be the number of the symbol-spaced samples of the channel impulse response, $P = 2\delta + M_d$, where $2\delta T$ is the symbol waveform duration and $M_d$ is the maximum integer path delay. We obtain

$$x(k) = H s(k) + n(k) \qquad (3)$$

where the $j$th element of the vector $s(k)$ of data is $b(k + j - \delta)$ and $H$ is an $m \times P$ channel matrix capturing the effects of the array response, delay, symbol waveform, and path fading, and taking the form:

$$H = \begin{bmatrix} a(\theta_1) & \cdots & a(\theta_L) \end{bmatrix} \begin{bmatrix} \beta_1 & & \\ & \ddots & \\ & & \beta_L \end{bmatrix} \begin{bmatrix} g(\tau_1) \\ \vdots \\ g(\tau_L) \end{bmatrix} =: A(\theta) D G(\tau)^T \qquad (4)$$

where $g(\tau_i) = [g((M_d + \delta - 1)T - \tau_i) \; \cdots \; g((M_d + \delta - P)T - \tau_i)]$ is a $P$-long row vector of samples of $g(t - \tau_i)$. Since we assume the path fadings to be constant within a data burst, we have suppressed their dependence on the sampling instant $k$. We assume we know the number of multipaths $L$,¹ the maximum path delay $M_d$, the modulation waveform $g(\cdot)$, and the structure of the array response $a(\cdot)$.

Manuscript received September 12, 1996. The associate editor coordinating the review of this letter and approving it for publication was Dr. Y. Bar-Ness. M. C. Vanderveen is with Scientific Computing Program, Stanford University, Stanford, CA 94305-9025 USA. C. B. Papadias and A. Paulraj are with Information Systems Laboratory, Stanford, CA 94305 USA. Publisher Item Identifier S 1089-7798(97)01345-8.

III. JADE

Let $h = \mathrm{vect}(H)$ be a vector of length $mP$ obtained by taking the transpose of each row of the matrix $H$ and stacking it below the transpose of the previous row. Then

$$h = (A(\theta) \circ G(\tau)) \, \mathrm{diag}(D) =: U(\theta, \tau) \beta \qquad (5)$$

where $\circ$ denotes the Khatri-Rao (columnwise Kronecker) matrix product (that is, the $l$th column of $U(\theta, \tau)$ is $a(\theta_l) \otimes g(\tau_l)$, where $\otimes$ denotes the Kronecker product). The $mP \times L$ matrix $U(\theta, \tau)$ is called the space-time matrix, and is parametrized by the AOA's and the path delays. The vector $u(\theta, \tau) = a(\theta) \otimes g(\tau)$ is called the space-time response vector to a path of unit amplitude arriving at angle $\theta$ with delay $\tau$. As $\theta$ varies over the range of angles and $\tau$ varies over the range of delays, $u(\theta, \tau)$ traces a multidimensional space-time manifold.

¹ The number of multipaths can be estimated from the data matrix, but this is beyond the scope of this paper.

Reprinted from IEEE Communications Letters, Vol. 1, No. 1, pp. 12-14, January 1997.

In this paper, the radio channel from the mobile to the antenna array is time-slotted, modeled after the GSM standard. The channel $H$ from the mobile to the antenna array can thus be assumed to be constant over each time slot, but it varies from one time slot to the next. This variation is due to the changing complex fadings $\beta_l$. However, the AOA's $\theta_l$ and delays $\tau_l$ are not changing significantly from each time slot to the next, and thus we take $U(\theta, \tau)$ to be constant over a few time slots. This assumption is reasonable in practice because we consider only a relatively small number of time slots and thus during this very short time the mobile, which is far away, appears to be almost stationary with respect to the base station.

The first step in our approach consists of estimating the channel impulse response from the user to the antenna array. This can be accomplished by using training bits or blindly. We collect data from $M$ consecutive time slots and use it to obtain estimates of $H$. If we let $q$ be the time slot index, our estimates $Y_q$ of the true channel $H_q$ take the form:

$$Y_q = H_q + V_q, \qquad q = 1, \ldots, M \qquad (6)$$

where $V_q$ is the estimation noise matrix. Applying the $\mathrm{vect}(\cdot)$ operation yields, with the obvious notation,

$$y_q = U(\theta, \tau) \beta_q + v_q, \qquad q = 1, \ldots, M \qquad (7)$$

or, if we let $Y = [y_1, \ldots, y_M]_{mP \times M}$ and similarly for $B$, $V$,

$$Y = U(\theta, \tau) B + V. \qquad (8)$$
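The identity behind (5) can be checked numerically. The sketch below is illustrative only: `khatri_rao` is a hypothetical helper, the matrices are random stand-ins for $A(\theta)$, $G(\tau)$, and the fadings, and the sizes are arbitrary. It confirms that stacking the transposed rows of $H = A D G^T$ gives $(A \circ G)\beta$.

```python
import numpy as np

def khatri_rao(A, G):
    """Columnwise Kronecker product: column l equals kron(A[:, l], G[:, l])."""
    m, L = A.shape
    P, L_g = G.shape
    if L != L_g:
        raise ValueError("A and G must have the same number of columns")
    # Entry [i, j, l] = A[i, l] * G[j, l]; row-major reshape gives row index i*P + j.
    return np.einsum("il,jl->ijl", A, G).reshape(m * P, L)

rng = np.random.default_rng(1)
m, P, L = 2, 8, 3                                   # antennas, channel taps, paths
A = rng.standard_normal((m, L)) + 1j * rng.standard_normal((m, L))
G = rng.standard_normal((P, L))                     # column l holds the samples of g(t - tau_l)
beta = rng.standard_normal(L) + 1j * rng.standard_normal(L)

H = A @ np.diag(beta) @ G.T                         # m x P channel matrix, as in (4)
h = H.reshape(-1)                                   # rows of H, transposed and stacked
U = khatri_rao(A, G)                                # mP x L space-time matrix
```

Note that NumPy's default row-major `reshape` matches the row-stacking definition of vect(H) used here; a column-stacking convention would swap the order of the Kronecker factors.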

The second step in our approach consists of estimating the $2L$ parameters of interest, namely the $\theta_l$'s and $\tau_l$'s, and eliminating the $LM$ nuisance parameters $\beta_l$'s. In other words, given the noisy estimated matrix $Y$ and the known structure of the space-time matrix $U$, we seek the desired parameters $\theta$, $\tau$ according to (8). We assume that the space-time manifold does not have any ambiguities, therefore leading to unique estimates (this is the case whenever either the array or delay manifold is unambiguous). In order for an angle-delay subspace to exist, we also need $U$ to be a tall matrix (i.e., $L < mP$). It can also be easily shown that the space-time channel matrix $U$ corresponding to a collection of $L$ distinct paths is full column rank.

IV. ALGORITHMS AND THE CRAMER-RAO BOUND

The second step in our approach consists of estimating the multipath parameters $\theta$, $\tau$ from the estimated channel $Y$, using (8). Among the various ways to solve this problem, we focus on two of them.

A. Maximum Likelihood

We will assume that the estimation noise $V$ is white and Gaussian, a fact which can be readily proved for the case of nonblind channel estimation. The entries of the complex fading matrix $B$ can be modeled as unknown deterministic quantities. Then employing deterministic maximum likelihood techniques yields the following minimization problem:

$$\min_{\theta, \tau, B} \| Y - U(\theta, \tau) B \|_F^2. \qquad (9)$$

It is well known that this is a separable optimization problem that reduces to $(\hat{\theta}, \hat{\tau}) = \arg\max \mathrm{tr}\, P_U \hat{R}_Y$, where $P_U = U (U^* U)^{-1} U^*$, $\hat{R}_Y = Y Y^* / M$, and $(\cdot)^*$ denotes the complex-conjugate transpose. This search can be done using the damped Newton method.

B. JADE-MUSIC

The technique described above involves a $2L$-dimensional search and may thus be computationally prohibitive. A faster, though suboptimal, approach is based on the MUSIC algorithm and involves only a two-dimensional search. We know that the true space-time channel vector $u(\theta, \tau)$ is orthogonal to the noise subspace $E_n$, whose columns are the eigenvectors of $\hat{R}_Y$ corresponding to the $mP - L$ smallest eigenvalues. We thus look for peaks in the two-dimensional MUSIC spectrum $(u^* u) / (u^* E_n E_n^* u)$. The peaks should occur close to the true $(\theta, \tau)$ coordinates (see Fig. 1).

Fig. 1. MUSIC-like spectrum (axes: AOA in degrees, delay in units of $T$).

C. Cramer-Rao Bound

The Cramer-Rao bound (CRB) provides a lower bound on the variance of any unbiased estimator. The bound for AOA estimation (without delay spread) was derived in [3] and is readily adapted to the present situation. Assuming the path fadings to be deterministic but unknown, we obtain for the model in (8) that

$$\mathrm{CRB}(\theta, \tau) = \frac{\sigma}{2} \left\{ \sum_{q=1}^{M} \mathrm{real}\left( B_q^* F^* P_U^{\perp} F B_q \right) \right\}^{-1} \qquad (10)$$

where $\sigma$ is the variance of the estimation noise, $B_q = I_2 \otimes \mathrm{diag}(\beta_q)$, $F = [A'(\theta) \circ G(\tau), \; A(\theta) \circ G'(\tau)]$, and $P_U^{\perp} = I - P_U$ (prime denotes differentiation with respect to the individual parameters and all matrices are evaluated at the true parameter values).

Fig. 2. Standard deviation of estimates versus CRB.
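The JADE-MUSIC search of Section IV-B can be sketched as follows. This is a minimal illustration under assumptions not fixed by the text: a half-wavelength uniform linear array for $a(\theta)$, a raised-cosine pulse truncated to $[-3T, 3T)$ with $T = 1$, a maximum integer delay of 2, symbol-rate sampling, and a small noise level chosen only to avoid an exactly singular denominator. All function names are hypothetical.

```python
import numpy as np

def raised_cosine(t, alpha=0.35, half_span=3.0):
    """Raised-cosine pulse (T = 1), set to zero outside [-half_span, half_span)."""
    t = np.asarray(t, dtype=float)
    denom = 1.0 - (2.0 * alpha * t) ** 2
    singular = np.isclose(denom, 0.0)
    g = np.sinc(t) * np.cos(np.pi * alpha * t) / np.where(singular, 1.0, denom)
    g = np.where(singular, (np.pi / 4.0) * np.sinc(1.0 / (2.0 * alpha)), g)
    return np.where((t >= -half_span) & (t < half_span), g, 0.0)

def delay_vector(tau, half_span=3, max_delay=2):
    """P = 2*half_span + max_delay symbol-spaced samples of g(t - tau)."""
    j = np.arange(1, 2 * half_span + max_delay + 1)
    return raised_cosine((max_delay + half_span - j) - tau)

def steering(theta_deg, m=2):
    """Assumed half-wavelength ULA response."""
    return np.exp(-1j * np.pi * np.arange(m) * np.sin(np.deg2rad(theta_deg)))

def st_vector(theta_deg, tau, m=2):
    """Space-time response u(theta, tau) = a(theta) kron g(tau)."""
    return np.kron(steering(theta_deg, m), delay_vector(tau))

def music_spectrum(theta_deg, tau, En, m=2):
    """(u* u) / (u* En En* u) for one (angle, delay) pair."""
    u = st_vector(theta_deg, tau, m)
    proj = En.conj().T @ u
    return np.vdot(u, u).real / np.vdot(proj, proj).real

rng = np.random.default_rng(0)
thetas, taus, slots = [-5.0, 0.0, 20.0], [1.0, 0.7, 2.0], 20
U = np.column_stack([st_vector(th, ta) for th, ta in zip(thetas, taus)])
L = U.shape[1]
B = (rng.standard_normal((L, slots)) + 1j * rng.standard_normal((L, slots))) / np.sqrt(2.0)
V = 1e-4 * (rng.standard_normal((U.shape[0], slots)) + 1j * rng.standard_normal((U.shape[0], slots)))
Y = U @ B + V                                   # channel estimates, as in (8)

R = Y @ Y.conj().T / slots
eigvals, eigvecs = np.linalg.eigh(R)            # eigenvalues in ascending order
En = eigvecs[:, : U.shape[0] - L]               # noise subspace (mP - L smallest)

peaks = [music_spectrum(th, ta, En) for th, ta in zip(thetas, taus)]
off_peak = music_spectrum(60.0, 1.5, En)
```

In practice the spectrum would be evaluated over a grid of angles and delays and its largest local maxima taken as the estimates; the sketch only compares the true coordinates with one point away from them. Note that three paths are handled with two antennas, since the relevant dimension is $mP = 16$.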


V. SIMULATION RESULTS AND EXTENSIONS

Computer simulations were run to demonstrate the performance of the JADE-MUSIC algorithm. We assume a single user, three multipaths, and a two-element antenna. The classical approaches to this problem, such as MUSIC and ESPRIT, will not yield satisfactory results, since the number of antennas is smaller than the number of multipaths. However, the JADE algorithm can handle this case successfully. The AOA's are $[-5, 0, 20]^\circ$ relative to the array broadside and the corresponding path delays are $[1.0, 0.7, 2.0]T$ seconds, where $T$ is normalized to one. The collected data are corrupted by noise with inverse variance $1/\sigma$ = 5, 10, 15, and 20 dB. The modulation waveform is a raised cosine pulse with excess bandwidth 0.35, assumed to be zero outside the interval $[-3, 3)$. We sample at intervals of $T/2$ (the purpose of oversampling is to provide improved definition of the delay manifold). Data is collected over 20 time slots, and at each time slot the channel is estimated via least squares using 27 training bits. The experimental variance of the AOA and delay estimates is computed from 100 runs. The results are summarized in Fig. 2. The bias of the estimates was on the average 1%-27% of their standard deviation (STD). Notice the STD is about 5 dB above the CRB.

A typical MUSIC spectrum for noise with inverse variance 20 dB is shown in Fig. 1. When we have more than one user in the same time slot, we can independently estimate the channel matrices $H$ using each user's unique (usually orthogonal) training signal. We can then proceed as above, with decoupled problems. If no training signals are available, we can still find the channel $H$ for each user using blind methods, which exploit finite alphabet structures and oversampling [4].

REFERENCES

[1] R. O. Schmidt, "A signal subspace approach to multiple emitter location and spectral estimation," Ph.D. dissertation, Stanford University, Stanford, CA, Nov. 1981.
[2] R. Roy, A. Paulraj, and T. Kailath, "ESPRIT-A subspace rotation approach to estimation of parameters of cisoids in noise," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, pp. 1340-1342, Oct. 1986.
[3] P. Stoica and A. Nehorai, "MUSIC, maximum likelihood and Cramer-Rao bound," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 720-741, May 1989.
[4] A. van der Veen, S. Talwar, and A. Paulraj, "Blind identification of FIR channels carrying multiple finite alphabet signals," in Proc. IEEE ICASSP, vol. 2, 1995, pp. 1213-1216.

Chapter 3

Performance Issues

A consequence of the unprecedented worldwide expansion of mobile communication systems is the fact that networks are rapidly running out of system capacity. Systems such as UMTS have promised that they will provide higher value services, and this ultimately means higher data rates and more capacity problems. In this scenario, new technologies, and in particular adaptive antennas, have emerged as strong candidates to fulfill the requirement for increased spectrum efficiency. The benefits that can be achieved with several adaptive antenna methods, but also the challenges that each method faces, are discussed in the overview paper at the beginning of this chapter. Each of the following papers deals with significant advances in the field and provides insight into the performance that can be achieved when adaptive antennas are employed in different wireless communication systems. Issues considered include increased capacity and coverage, better signal quality, support of value-added services such as user location applications, more efficient handover, and so on. Nevertheless, it has to be emphasized that different communication systems will exploit different advantages or mixtures of advantages offered by adaptive antennas, depending on the maturity of the underlying system. Work that deals with important factors that affect the performance of different adaptive antenna systems, such as mutual coupling, fading correlation, multipath conditions in different operational environments, and frequency decorrelation in duplex systems, is also presented.

Smart antennas for mobile communication systems: benefits and challenges

by G. V. Tsoulos

Spatial filtering using adaptive or smart antennas has emerged as a promising technique to improve the performance of cellular mobile systems. This paper presents an overview of smart antennas in terms of key characteristics, options, challenges and benefits, in the context of current, but also with a view towards future, generation personal communication systems.

1 Introduction

Over the last few years the demand for service provision via the wireless communication bearer has risen beyond all expectations. If the extraordinary fact that worldwide some half a billion subscribers to mobile networks are predicted by the year 2000 is put in the context of third generation system requirements (UMTS, IMT2000)¹, then the most demanding technological challenge emerges: the need to increase the spectrum efficiency of wireless networks. While great effort in current (second) generation wireless communication systems has been directed towards the development of modulation, coding and protocols, antenna-related technology has received significantly less attention up to now. To achieve the ambitious requirements introduced for future wireless systems, new 'intelligent' or 'self-configured' and highly efficient systems will most certainly be required. In the pursuit of schemes that will solve these problems, attention has recently turned to spatial filtering methods using advanced antenna techniques: adaptive or smart antennas. Filtering in the space domain can separate spectrally and temporally overlapping signals from multiple mobile units, and hence the spatial dimension can be exploited as a hybrid multiple access technique complementing FDMA, TDMA and CDMA.

Abbreviations

AIF = area improvement factor
BER = bit error ratio
BS = base station
BSRF = base station reduction factor
CDMA = code division multiple access
CIR = carrier-to-interference ratio
DCS1800 = digital communication system at 1800 MHz
dBd = decibels above dipole
DS-CDMA = direct sequence CDMA
DSP = digital signal processing
DTX = discontinuous transmission
EIRP = effective isotropic radiated power
FCC = Federal Communications Commission
FDD = frequency division duplex
FDMA = frequency division multiple access
GSM = Global System for Mobile
IMT2000 = International Mobile Telecommunications for the year 2000
IS-54/95 = interim standard (of the American National Standards Institute and referring to an accepted industry standard) for TDMA (54) or CDMA (95)
LOS = line of sight
MMSE = minimum mean square error
MS = mobile station
NLOS = non line of sight
PA = power amplifier
REF = range extension factor
SDMA = space division multiple access
SFIR = spatial filtering for interference reduction
SINR = signal-to-interference-plus-noise ratio
SNR = signal-to-noise ratio
TDD = time division duplex
TDMA = time division multiple access
UMTS = Universal Mobile Telecommunications System

2 Smart antenna approaches

Three main categories of smart antennas may be defined based on how they produce their response: switched beam, direction finding, optimum combining. The

Reprinted with permission from Electronics and Communication Engineering Journal, G. V. Tsoulos, "Smart Antennas for Mobile Communication Systems", Vol. 11, No. 2, pp. 84-94, April 1999. © 1999 by Institution of Electrical Engineers.


Table 1: Advantages and disadvantages of different smart antenna approaches

Switched beams
  Advantages:
    Easily deployed
    Tracking at beam switching rate
  Disadvantages:
    Low gain between beams
    Limited interference suppression
    False locking with shadowing, interference and wide angular spread

Direction finding
  Advantages:
    Tracking at angular change rate
    No reference signal required
    Easier downlink beamforming
  Disadvantages:
    Lower overall CIR gain
    Susceptible to signal inaccuracies, needs calibration
    Concept is not applicable to small cell NLOS environments

Optimum combining
  Advantages:
    Optimum SINR gain
    No need for accurate calibration
    Performs well even when the number of elements is smaller than the number of signals (MMSE approach)
  Disadvantages:
    Difficult downlink beamforming with FDD and fast TDD
    Needs good reference signal for optimum performance
    Requires high update rates

switched beam method employs a grid of beams and usually chooses the beam which gives the best SNR. For direction-finding techniques (or spatial reference techniques, as they are also called) all the processing is focused on the acquisition and tracking of one parameter, the directions of the users. With optimum combining the output signal-to-interference-plus-noise ratio (SINR) is the parameter optimised. Table 1 summarises the most important advantages and disadvantages of these techniques, as presented in the literature (see, for example, References 2-13).

When employing a smart antenna with one of the above mentioned methods in the general sense of a spatial filter, one further categorisation can be recognised: spatial filtering for interference reduction (SFIR) (this concept is shown in Fig. 1a) and space division multiple access (SDMA) (see Fig. 1b). With SFIR the goal is, as with a traditional omnidirectional or sectorised antenna, to support one user in each of the co-channel cells of the reuse pattern employed (users with frequency f_k in Fig. 1a), and through interference reduction in the spatial domain to achieve a lower cell repeat pattern (reuse distance) (References 2, 3, 8, 13). With SDMA an adaptive antenna system is deployed in such a way that multiple users within the same cell can operate on the same time (t) and frequency (f) channel by exploiting the spatial separation of the users (see Fig. 1b) (References 12-17). This concept can be seen as a dynamic (as opposed to fixed) sectorisation approach in which each mobile defines its own sector as it moves. The major advantages and disadvantages of these techniques are summarised in Table 2 (References 2, 3, 12-17).

Fig. 1 (a) SFIR and (b) SDMA concepts

Table 2: Advantages and disadvantages of SDMA and SFIR

3 Some constraints

A smart antenna system relies heavily on the spatial characteristics of the operational environment to improve the output signal. For big cell structures (macrocells), there are three main scattering sources (Fig. 2):

(a) Scatterers local to the mobile: if the mobile is moving, these cause Doppler spread (i.e. time selectivity), and they also cause small delay and angle spreads (usually less than 10°).
(b) Scatterers local to the base station: these contribute multipath rays with small delay spread and large angular spread.
(c) Remote scatterers: these cause independent fading on paths and contribute multipath with large delay spread (frequency selective fading) and large angular spread.

Fig. 2 Large cell propagation environment (labels: scatterers local to the mobile, scatterers local to the base station, remote scatterers)

An interesting point is that in big cell environments the angular change of the incoming signal depends generally on the velocity and the distance of the mobile from the base station (this is true when the multipath due to remote scatterers (Fig. 2) either doesn't exist or can be ignored by the adaptive algorithm). As an example, let us address the point concerning the angular change of incoming signals in a big cell environment in the context of a GSM system application. GSM is a time division multiple access radio standard with a slotted waveform structure having a 577 µs burst period and a 4·6 ms period between consecutive bursts. In this case, even with a worst case scenario of a mobile user travelling at 200 km/h (e.g. fast train) around a circle of radius 1 km, the angular velocity is less than 3·2° s^-1, which corresponds for example to a 0·014° angular change between two consecutive GSM bursts, as shown in Fig. 3. Since the angular change of the signal dictates the required update rate of the main beam then, obviously, the wider the beams are, the smaller the update rates required.

Furthermore, since it is possible to reduce substantially the transmission rate either in the up or the down link when there is little or no speech (on average each user speaks less than half of the time during a normal conversation), many existing systems employ discontinuous transmission (DTX) in order to exploit this feature. Because discontinuous transmission effectively reduces the information rate (e.g., in GSM, from 1 burst every 4·6 ms, as mentioned above - no DTX - to 1 burst every 120 ms - maximum DTX), this would mean that the angular change rate would increase considerably, as shown in Fig. 3 (from 0·014° per GSM burst to ~0·4° per GSM burst, i.e. by 27 times approximately). As a result, additional measures to compensate for this increase will be necessary (e.g. periodic transmission of dummy bursts), especially for tracking fast-moving users.

Fig. 3 Change in direction as a function of distance and speed for large cells, without DTX (1 burst/4·6 ms) and with maximum DTX (1 burst/120 ms) (curves for 100 km/h and 200 km/h; horizontal axis: distance, km)

In small cells, on the other hand, the propagation scenario is quite different. Here, there are many local scatterers in close vicinity to the mobile and the base station which result in much wider angular spread, low delay spread and moderate (pedestrian) to medium (mobile) Doppler spread. The previously used mapping through the spatial response (power versus angle) or 'spatial signature' of a big cell environment to a single 'dominant' direction cannot be done anymore because there is no connection between the direction of the signals and the physical location of the user, except from the direct paths in LOS cases. The 'spatial signature' itself has to be used for user location, but in this case there is no connection with the physical user location.

Another very important issue for adaptive antennas is the downlink transmission. In time division duplex (TDD) systems, the up and down links can be considered reciprocal, provided that the channel characteristics have not changed considerably between the receive and transmit slots, i.e. there is limited user movement between transmission and reception. Under this condition, the weights calculated by the adaptive antenna for the uplink can also be used for the downlink to achieve spatially selective filtering. Application of adaptive antennas in the downlink for frequency division duplex (FDD) systems seems to be one of the challenges related to this technology. The fundamental difference is that in FDD systems the downlink fading characteristics are independent of the uplink characteristics due to the frequency difference (typically ~40 MHz), while the angles of arrival of the multipath rays remain the same. For current FDD systems such as GSM, DCS1800, IS-54, IS-95, the processing performed in the uplink cannot be exploited directly in the downlink without any additional processing. Several approaches have been proposed in the literature which attempt to solve the downlink problem; some of them are:

(a) transmit diversity, where multiple base station antennas transmit delayed versions of the signal in order to create frequency selective fading at a single antenna at the receiver (mobile), which uses an equaliser or RAKE receiver to obtain diversity against fading
(b) transmission with feedback from the mobile. Here feedback from the mobile to the base station is used to estimate the impulse response of the radio channel, hence traditional diversity schemes or beamforming can be employed.
(c) estimation of the transmit spatial correlation matrix from parameterisation of the receive spatial correlation matrix. One such parameterisation can be performed for the angles of arrival, which can subsequently be calculated by direction-finding algorithms.
(d) subspace processing (a similar idea to c). Whereas a and b rely on the separation of the instantaneous channel vectors, with this method separation of the channel subspaces (second order spatial correlation matrix) is employed. The reason for such processing is that a subspace is a much more stable entity than a channel vector and hence, instead of tracking the instantaneous channel, subspace beamforming methods track the subspace structure.

Other methods which exploit particular characteristics of the air interface method employed can also be used to compensate for the possible imbalance caused by the downlink problem. For example, one such method could be adaptive resource allocation. This effectively means allocating more radio channels (FDMA) or bandwidth (CDMA) for the downlink so that the benefits are balanced between the two links. Furthermore, techniques which convert one form of diversity at the base station to another form at the mobile (space/time, space/path, space/coded modulation, etc.) can also offer alternative solutions for the downlink problem.

Different air interface techniques have different impacts on the design and the optimum approach for smart antennas, mainly because of the different interference scenarios. With TDMA systems, a frequency reuse pattern is usually employed, which leads to a small number of strong interferers for both the up and down links (usually 2-4). With CDMA systems a total (1) frequency reuse plan is assumed, which effectively leads to many but, due to the spectrum spreading employed, weak interferers in the uplink and usually 6-12 weak interferers in the downlink.

Fig. 4 Operational benefits achieved with a smart antenna (legible labels: smart link budget balancing, more efficient power control, power reduction, support of value added services)

Fig. 5 Probability of detection when a matched filter is employed for each antenna element (horizontal axis: SNR, dB)

Possibly the most challenging problem related to smart antennas is their practical implementation. Full exploitation of all the operational benefits that will be described in the next section would mean increased complexity for the system and hence would require a fully integrated approach in an 'intelligent' system. RF and DSP technology will have to evolve further before this can be achieved in a cost-effective manner. Nevertheless, two facts indicate that this is still a promising and valid way forward: first, the fact that there are partial or efficient implementations that can be employed in real systems to reduce complexity and, second, the fact that by the time mobile communication systems are ready to fully support smart antennas (most probably with 3rd generation systems) the required technology will be available and mature enough to fully support adaptive antennas (assuming Moore's law). Furthermore, although current implementation and installation costs are thought to be high for smart antenna systems in comparison with traditional omnidirectional or sectorised schemes, if the range-capacity gains and the other benefits that will be discussed in the next section are included in the overall cost calculations, then it becomes obvious that this technology is rather economical even for today's systems.
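The GSM tracking figures quoted in this section (a mobile at 200 km/h on a circle of radius 1 km, one burst every 4·6 ms without DTX and one every 120 ms with maximum DTX) can be checked with a short calculation. The sketch below is only an illustration: the function and constant names are hypothetical, and it assumes purely tangential motion around the base station, which is the worst case described in the text.

```python
import math

# Burst spacings taken from the text (assumed exact for this sketch).
GSM_BURST_INTERVAL_S = 4.6e-3   # no DTX: one burst every 4.6 ms
MAX_DTX_INTERVAL_S = 120e-3     # maximum DTX: one burst every 120 ms


def angular_rate_deg_per_s(speed_kmh: float, radius_km: float) -> float:
    """Angular velocity seen from the base station for a mobile moving
    on a circle of the given radius around it (tangential motion)."""
    speed_m_s = speed_kmh * 1000.0 / 3600.0
    return math.degrees(speed_m_s / (radius_km * 1000.0))


def angular_change_deg(speed_kmh: float, radius_km: float, interval_s: float) -> float:
    """Change of direction (degrees) between two consecutive bursts."""
    return angular_rate_deg_per_s(speed_kmh, radius_km) * interval_s


rate = angular_rate_deg_per_s(200.0, 1.0)
per_burst = angular_change_deg(200.0, 1.0, GSM_BURST_INTERVAL_S)
per_burst_dtx = angular_change_deg(200.0, 1.0, MAX_DTX_INTERVAL_S)

print(f"angular velocity: {rate:.2f} deg/s")
print(f"change per burst, no DTX: {per_burst:.4f} deg")
print(f"change per burst, max DTX: {per_burst_dtx:.3f} deg")
print(f"increase factor: {per_burst_dtx / per_burst:.1f}")
```

With these inputs the increase factor is simply the ratio of the two burst intervals (120/4·6, about 26), which is in line with the "approximately 27 times" read from Fig. 3.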

4 Operational benefits with smart antennas

Fig. 4 shows an overview of the operational benefits that can be achieved from the deployment of an adaptive antenna in a mobile communication network. Each of these benefits is discussed in greater detail below.

Fig. 6 BER with an adaptive antenna as a function of the achieved beamwidth and average sidelobe level

More efficient power control - smart handover

The diversity gain offered by an antenna array reduces the fading of the radio signal and hence the power control requirements are eased. With information about the location and speed of a mobile user, the decision as to which cell to hand the user to is a much easier task. Combination of this kind of information with the handover process can ultimately lead to a 'smart' instead of 'soft' or 'hard' handover. Furthermore, in order to support the soft handoff quality enhancement technique for DS-CDMA within a mixed cell environment, all cell types have to operate on identical carrier frequencies. One possible way of achieving the RF power balancing within each area of a mixed cell scenario needed to provide seamless handover and simultaneously avoiding the near-far effect is to exploit the spatial filtering properties offered by an adaptive antenna.

Support of value added services

Better signal quality - higher data rates

In noise or interference limited environments, the gain that can be achieved with an antenna array can be exchanged for signal quality enhancement, i.e. lower BER. This is demonstrated in Figs. 5 and 6. Fig. 5 shows the probability of detection for the case where a matched filter (a detector which maximises the probability of detection with additive noise) is employed for each array element. The probability of detection in this case is (Reference 29):

(1)

where Q(.) is the Q function, P_D is the probability of detection, P_F is the threshold probability of false alarm, M is the number of elements and SNR the signal-to-noise ratio. From Fig. 5 it can be seen that the detection performance can be significantly enhanced with an antenna array. For example, for 0 dB SNR, there is 10% probability of detection with a single element, 20% with two elements, 35% with four, 65% with eight, 85% with twelve, 95% with sixteen and almost 100% with twenty. Fig. 6 shows another example, this time for a DS-CDMA system, where the following approximate formula for the BER is used:

where SF is the spreading factor (64), SIR_omni is the signal-to-interference ratio with an omnidirectional antenna, and

$k = \frac{BW}{360} + SLL \left(1 - \frac{BW}{360}\right)$ (3)

with BW and SLL the ideal or effective* beamwidth (degrees) and average sidelobe level (linear values), respectively. Fig. 6 shows a plot of eqn. 2 for the case of 100 users per cell (N) in a multitier system (4 tiers) for values of achieved beamwidth and sidelobe level between 10° and 30°, and -10 dB and -20 dB, respectively. As an example from Fig. 6, if the smart antenna that is employed at the base station of the central cell can achieve a radiation pattern with a beamwidth of 20° (ideal or effective), then an improvement of 1-7 orders of magnitude for the BER can be accomplished with average sidelobe levels (ideal or effective) between -10 dB and -20 dB, respectively. Obviously, the sidelobe level achieved has such a profound effect on the BER results because it reduces the interference received at that region and hence improves the overall SIR, as seen from eqns. 2 and 3.

* The word effective implies that the value for the parameter under discussion is what can be achieved if the scattering and signal mismatching effects are considered, according to Reference 30.

Fig. 7 RMS delay spread reduction with adaptive antennas in (a) microcells and (b) picocells (axes: angle in degrees versus power in dBm and RMS delay spread in ns; plot annotation: omnidirectional antenna, power = -39 dBm, RMS delay spread = 24 ns)

Furthermore, through spatial filtering of the multipath at the base station (BS) and/or the mobile station (MS), the RMS delay spread of the channel can be reduced, which is a positive thing if the objective is to achieve higher data rates. This advantage increases in environments where the angular spread of the multipaths is wide, i.e. in small cell scenarios. Examples of RMS delay spread reduction with smart antennas in outdoor micro- and indoor picocells are shown in Fig. 7. The method used to produce the results shown in Fig. 7 comprised using the ray tracing tools from References 31 and 32 to produce the impulse response of the radio channel, then steering the antenna pattern (15 dBd, 26° 3 dB beamwidth) from 0° to 360° in steps of 5° and for each step calculating the power and the RMS delay spread. The results are for an LOS (line-of-sight) point. The relationship between high power and small RMS delay spread can be seen in these figures (e.g. 270°-300° for Fig. 7a and 180°-210° for Fig. 7b), although it must be mentioned that it is not always straightforward, especially for NLOS situations. In terms of RMS delay spread, the directional antenna has an advantage, both in LOS and NLOS (non line of sight) situations, when it is steered towards the direction of high power. When the main beam of the directional pattern is mispointed towards other than the peak power directions,

more multipath rays are received from the base station, and this increases the RMS delay spread.

Fig. 8 Ground plan of the 3D radiation pattern for a user location scenario (legend: desired user, interferers; axes: angle in degrees versus point)

Ability to support user location for emergency calls (911/999)

The FCC has recently passed a ruling which requires that all network operators must provide this service by the year 2001, with 125 m accuracy for 67% of the area covered by their network (initially). Adaptive antennas can provide user location information (direction-finding methods are naturally the most suitable for this application). The user location concept is demonstrated with the example shown in Fig. 8, for a DS-CDMA system employing an adaptive array with eight elements. Here, the desired user is moving linearly at -20°, two interferers are moving almost linearly at -60° and 30° and two interferers start from 0° and 60° and finish at 60° and 0°, respectively. It can be seen that the direction of the desired user can be found for the whole duration of the experiment (each point represents 1000 samples).

Location of fraud perpetrators

A significant portion of the wireless operators' revenues is lost every year due to fraud. Many technologies currently applied in order to solve this problem are effective in identifying the fact that fraud is occurring but none of them provides the operator with a long-term solution, i.e. the ability to remove the criminal from the activity, instead of deactivating the phone number from the switch. Since adaptive antennas can provide user location information, this now becomes feasible.

Location-sensitive billing

Adding a third dimension - location - to the current two dimensions for charging rates (usage against time of day, i.e. peak, off-peak) will provide an operator with the ability to control its network by encouraging (or discouraging) any type of usage behaviour, and ultimately optimise the desired outcome. The question here is not so much one of simply reducing the cost of the calls (which would increase the network traffic while producing the same revenue), but instead one of improving in the most efficient manner the capacity of the system. Also, this technique enables tariff plans to be tailored to individual user needs, e.g. by providing low-cost zones when the user is in his preferred zone.

On-demand location-specific services

Such services could include roadside assistance, real-time traffic updates, tourist information and electronic yellow pages, e.g. with local entertainment and dining information.

Vehicle and fleet management

Current vehicle management techniques usually suffer from the cost disadvantage of having a single-purpose transmitter network. Location and subsequent navigation of a vehicle is something that can be done with an adaptive antenna. Possible extensions to this idea could also be package monitoring and stolen vehicle recovery.

Optimum smart system planning

Today the planning of a network is typically theoretical and once implemented is verified through limited testing. One important element missing from this planning process is that designers never really know where wireless devices are located when a specific cell site is handling a problematic call. Obviously this situation can be greatly improved with the user location capabilities of adaptive antennas. Furthermore, with an adaptive antenna an advanced intelligent network, which will lessen the planning burden and improve the network efficiency based on a number of criteria optimised through the intelligence provided by the base station antenna, becomes possible. Optimisation parameters include intra- and intercell reuse planning, link balancing requirements, handover requirements, traffic density and many others. When such intelligence at the base station is enhanced by intelligence at the network level, self-configuration and overall optimisation becomes viable.

Coverage extension

Adaptive antennas can increase the network coverage through antenna directivity and interference reduction. The gain G (assuming an antenna efficiency of 100% and no mutual coupling) that can be achieved with an antenna array of M elements is (it can also be proved that this represents the SNR improvement):

$G = 10 \log_{10} M$ (dB) (4)

The additional gain (compared to the standard element gain) can obviously be exploited for range extension. With small angular spread and single slope path loss with exponent n, the range extension factor, REF, can be calculated as:

$REF = \frac{r_2}{r_1} = M^{1/n}$ (5)

or, with the Hata path loss model: (6)

where r_1 and r_2 are the default (with a single element) and extended (with multiple elements) ranges, respectively. The area improvement factor, AEF, is:

$AEF = \left(\frac{r_2}{r_1}\right)^2 = REF^2$ (7)

The inverse of the area extension factor represents the reduction factor in the number of base stations (BSRF) needed to serve the same area as with a single element. Obviously, one can get the full range extension from the directivity of the array when the angular spread of the signal is less than the antenna beamwidth (otherwise the desired signal is reduced). Fig. 9 shows that with ten antenna elements (10 dB of SNR improvement) the range can be almost doubled and the area almost quadrupled, or the number of base stations can be reduced to almost 25% of the original number. Also, it can be seen that with smaller path loss exponents (the curve for n = 3 in Fig. 9) the improvement factors will be higher, i.e. the improvements will be higher in LOS as compared to NLOS scenarios. Also, the calculated range extension with the Hata path loss model is between the predictions for path loss exponents of 3 and 4.
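The factors discussed above can be tabulated quickly for a given array size. The following is a minimal sketch, not the author's code: the function names are hypothetical, it uses the single-slope model of eqn. 5, and it assumes that covered area scales with the square of the range and that the base station reduction factor is the inverse of the area factor, as the text describes. The Hata-model variant of eqn. 6 is not reproduced here.

```python
def range_extension_factor(num_elements: int, path_loss_exponent: float) -> float:
    """REF = r2/r1 = M**(1/n) for a single-slope path loss model (eqn. 5)."""
    return num_elements ** (1.0 / path_loss_exponent)


def area_extension_factor(num_elements: int, path_loss_exponent: float) -> float:
    """Area factor, assuming area grows with the square of the range."""
    return range_extension_factor(num_elements, path_loss_exponent) ** 2


def base_station_reduction_factor(num_elements: int, path_loss_exponent: float) -> float:
    """Fraction of base stations needed to serve the same area (1/AEF)."""
    return 1.0 / area_extension_factor(num_elements, path_loss_exponent)


if __name__ == "__main__":
    M = 10  # ten elements, i.e. 10 dB of SNR improvement
    for n in (3.0, 3.5, 4.0):
        print(
            f"n={n}: REF={range_extension_factor(M, n):.2f}, "
            f"AEF={area_extension_factor(M, n):.2f}, "
            f"BSRF={base_station_reduction_factor(M, n):.2f}"
        )
```

For ten elements and an exponent of about 3·5 this gives a range factor a little below 2, an area factor a little below 4 and a base station fraction close to 0·27, which agrees with the "almost doubled", "almost quadrupled" and "almost 25%" statements above.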

Reduced transmit power

Limitations on the maximum EIRP (effective isotropic radiated power) introduced by standards could mean that the array gain cannot be exploited for coverage or area extension. Also, recent public worries over health issues originating from exposure to electromagnetic radiation (no matter how reasonable or unreasonable this may be) will almost certainly force the governing/standardisation bodies to change the current radiation standards in the future and adopt a lower emission policy, in particular for the mobile handsets. In such cases, one could exploit the base station array gain to reduce the power transmitted by the mobile. This reduction is also beneficial because it relaxes the battery requirements and hence it can increase the talk times or reduce the size/weight of the handsets. Furthermore, if the received power requirement at the mobile remains the same with an M element array at the base station, then the output power from the base station power amplifiers (PAs) can be reduced by M^-2, which will reduce the total transmitted power from the array by M^-1. The explanation for this (see, for example, Reference 34) is that the base station now employs M PAs and can simultaneously exploit the directivity of the array. Ten antenna elements will reduce the total transmitted power by 10 dB, while the output of each PA will be reduced by 20 dB. The latter has obvious cost implications, since the high-power amplifiers are expensive hardware components of a system.
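The two scaling laws in this paragraph follow from simple bookkeeping: each of the M amplifiers is scaled by M^-2, and there are M of them, so the total scales by M x M^-2 = M^-1. A small sketch of the decibel arithmetic follows; the function names are hypothetical.

```python
import math


def per_amplifier_reduction_db(num_elements: int) -> float:
    """Reduction of each PA output when it is scaled by M**-2, in dB."""
    return 20.0 * math.log10(num_elements)


def total_power_reduction_db(num_elements: int) -> float:
    """Reduction of the total array power: M amplifiers, each scaled by
    M**-2, give an overall factor of M**-1, expressed here in dB."""
    return 10.0 * math.log10(num_elements)


for m in (2, 4, 10):
    print(
        f"M={m}: each PA -{per_amplifier_reduction_db(m):.1f} dB, "
        f"total -{total_power_reduction_db(m):.1f} dB"
    )
```

For M = 10 this reproduces the 20 dB per-amplifier and 10 dB total figures quoted in the text.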

Fig. 9 Range, area and base station reduction factors as a function of the number of elements and SNR improvement

Fig. 10 Spectrum efficiency limit as a function of the SNR and the number of antenna elements

Smart link budget balancing

Examination of the up/down link budgets reveals an imbalance which originates from the difference in the power amplifiers employed at the two ends. This imbalance is usually around 10-20 dB and its actual value depends on the PA used at the base station site. As an example consider a case where there is a 13 dBW PA with a 17 dBi antenna, giving 1 kW EIRP at the base station (this example ignores losses from cables, connectors, mismatching, etc. and fading margins at the mobile and the base station). The mobile might be able to achieve 1 W EIRP, which, together with the 17 dBi from the base station antenna gain, will produce a deficit of more than 13 dB and, hence, a link imbalance. Given that the antenna gains can be exploited at both ends, it becomes obvious that the up and down links can be balanced by using the additional directivity offered by an adaptive antenna at the base station site. For the case of a power amplifier with a dBW output, S dB gain from high-sensitivity receivers and other improvements (e.g. employing diversity at the mobile), and EIRP_MS dBW at the mobile, the gain from the adaptive antenna at the base station in order to balance the two links should ideally be G (dB) = a - S - EIRP_MS. By exploiting the adaptive capabilities of a smart antenna this concept can be extended and 'smart' link management can lead to dynamically optimised power budgets for different scenarios.
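The worked example above is plain decibel arithmetic and can be reproduced as follows. This is an illustrative sketch only: the variable and function names are hypothetical, losses and fading margins are ignored as in the text, and the mobile antenna gain is assumed to be already included in its 1 W (0 dBW) EIRP.

```python
def dbw_to_watts(power_dbw: float) -> float:
    """Convert a power level in dBW to watts."""
    return 10.0 ** (power_dbw / 10.0)


def required_array_gain_db(pa_output_dbw: float,
                           receiver_gain_db: float,
                           mobile_eirp_dbw: float) -> float:
    """Ideal adaptive-antenna gain that balances the links:
    G (dB) = a - S - EIRP_MS, as given in the text."""
    return pa_output_dbw - receiver_gain_db - mobile_eirp_dbw


# Values taken from the example in the text.
PA_OUTPUT_DBW = 13.0        # base station power amplifier
BS_ANTENNA_GAIN_DBI = 17.0  # base station antenna
MOBILE_EIRP_DBW = 0.0       # 1 W EIRP at the mobile

downlink_dbw = PA_OUTPUT_DBW + BS_ANTENNA_GAIN_DBI   # BS EIRP
uplink_dbw = MOBILE_EIRP_DBW + BS_ANTENNA_GAIN_DBI   # MS EIRP plus BS receive gain
deficit_db = downlink_dbw - uplink_dbw

print(f"BS EIRP: {downlink_dbw:.0f} dBW = {dbw_to_watts(downlink_dbw):.0f} W")
print(f"uplink deficit: {deficit_db:.0f} dB")
print(f"gain needed with S = 0 dB: {required_array_gain_db(PA_OUTPUT_DBW, 0.0, MOBILE_EIRP_DBW):.0f} dB")
```

With no additional receiver gain (S = 0 dB) the balancing formula returns the same 13 dB as the deficit computed directly; any receiver-side improvement S reduces the array gain that is needed by the same amount.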

Increased capacity

Adaptive antennas can provide capacity increases through several mechanisms. Starting with Shannon's expression for the capacity of a channel with bandwidth W and with additive Gaussian noise,

$C = W \log_2 (1 + SNR)$ (bit/s) (8)

it can be shown that with an antenna array which divides the power equally into M parts and sends it into M parallel independent channels (M is the number of antenna elements), the overall capacity is now:

$C_{array} = \sum_{M} \left[ W \log_2 \left(1 + \frac{SNR}{M}\right) \right] = M W \log_2 \left(1 + \frac{SNR}{M}\right)$ (bit/s) (9)

Fig. 10 shows the spectrum efficiency (bit/s/Hz) as a function of the SNR and the number of antenna elements.
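Eqns. 8 and 9 are easy to evaluate numerically. The sketch below (function names are hypothetical, SNR is a linear ratio rather than dB) computes both, together with the large-M limit of eqn. 9, which follows from ln(1 + x) ≈ x for small x and equals W·SNR/ln 2.

```python
import math


def shannon_capacity(bandwidth_hz: float, snr_linear: float) -> float:
    """Eqn. 8: C = W log2(1 + SNR), in bit/s."""
    return bandwidth_hz * math.log2(1.0 + snr_linear)


def array_capacity(bandwidth_hz: float, snr_linear: float, num_elements: int) -> float:
    """Eqn. 9: power split equally over M parallel independent channels."""
    return num_elements * bandwidth_hz * math.log2(1.0 + snr_linear / num_elements)


def large_array_limit(bandwidth_hz: float, snr_linear: float) -> float:
    """Limit of eqn. 9 as M grows: W * SNR / ln 2, i.e. linear in SNR."""
    return bandwidth_hz * snr_linear / math.log(2.0)


if __name__ == "__main__":
    snr = 10.0 ** (10.0 / 10.0)  # 10 dB expressed as a linear ratio
    for m in (1, 2, 4, 8, 16):
        # Bandwidth of 1 Hz gives spectrum efficiency in bit/s/Hz.
        print(f"M={m:2d}: {array_capacity(1.0, snr, m):.2f} bit/s/Hz")
    print(f"limit: {large_array_limit(1.0, snr):.2f} bit/s/Hz")
```

Setting the bandwidth to 1 Hz, as in the usage above, yields the spectrum efficiency in bit/s/Hz, the quantity plotted in Fig. 10.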

Apart from the increase of the spectrum efficiency as the number of array elements increases, it can be seen that as the number of elements becomes very large, the capacity of the channel becomes simply proportional to SNR. Extending this concept by multiplexing M transmit and N receive channels can offer significant gains. In Reference 36 it is shown that the theoretical capacity is now (flat fading, stationary propagation environment):

where 'det' means determinant, I_N is the N×N identity matrix and H^{*T} is the conjugate transpose of the normalised (spatial average power loss normalised to unity) channel matrix (the ijth element of which is the transfer function of the jth transmitter to the ith receiver). In Reference 36 it is shown that with 8 transmit and 8 receive antennas and 1% outage probability with 21 dB average SNR at each receiving element, the maximum capacity is more than 40 times that of a single antenna element at the transmitter and the receiver. Furthermore, a layered space/time architecture that can achieve such capacities is reported in Reference 36.

With an omnidirectional antenna only a small portion of the transmitted power is actually received by the intended user, while at the same time most of the transmitted power constitutes interference for other potential users. Hence, the problem is dual: not only is this kind of omnidirectional communication inefficient in terms of power, but also in terms of capacity. One practical way to increase capacity is to decrease the radiated power associated with directive transmission in combination with the lower mobile emission levels possible with directive reception. In other words, by exploiting the spatial filtering that an antenna array offers, it is possible with the help of an adaptive method to confine the radio energy associated with a given mobile to a small addressed volume, thus reducing interference experienced from and to co-channel users. This is straightforward for CDMA systems as users share

the same bandwidth, and in a TDMA system it can be achieved through reduced reuse patterns or SDMA. Another approach to increase capacity is through dynamic reuse planning. With user location information, channels can be assigned dynamically to areas of high user density and hence better capacity can be achieved. The capacity enhancement achieved can be exploited in many ways, but one particular possible application is in reducing the need for very small cell sizes and consequently reducing the infrastructure costs. When considering the dilemma of whether to use adaptive antennas or smaller cells for capacity, the advantages/disadvantages shown in Table 3 can be mentioned. Nevertheless, when the requirements of future generation, UMTS-type systems are considered in terms of the performance measures capacity, coverage and services, then it is almost certain that small cell structures will be deployed in order to fulfil these requirements, possibly in combination with some sort of smart antenna technology.

Table 3: Advantages and disadvantages of smart antennas and small cells for capacity improvement

5 Conclusions

This paper has provided an overview of the potential benefits and challenges of applying smart-antenna technology to mobile communication systems. The basic advantages and disadvantages of two different approaches to smart antennas ((a) SDMA/SFIR; (b) switched beam/direction finding/optimum combining), as they are presented in open literature, were highlighted. Then the importance of specific characteristics of the radio channel under different operational environments (e.g. large-small cells, indoor-outdoor) and interference scenarios (CDMA-TDMA) were discussed. The realisation of comparable up and down link gains and the practical implementation of a fully integrated smart antenna system seem to be the two most challenging issues facing this technology at the moment. Although more efficient implementations will almost certainly be needed for future systems, partial implementations make the application of smart antennas to current generation systems possible. Furthermore, by the time that future generation systems are ready fully to support smart antennas, the development of RF and DSP technology will have reached such a level that complex implementations will be possible. Exploiting different characteristics of smart antennas can lead to several operational benefits for a communication system. These benefits have been discussed in the paper, and are summarised below:

• coverage extension
• increased capacity
• efficient power control/smart handover
• support of value added services (better signal quality, higher data rates, user location)
• optimum/smart system planning
• reduced transmit power
• smart link budget balancing

Communication systems will exploit different advantages or mixtures of advantages offered by smart antennas depending on the maturity of the underlying system. Initially, for example, costs can be reduced by exploiting the range extension capabilities of smart antennas. Then, where there is a demand for increased capacity, costs can be further decreased by avoiding extensive use of small cells and instead exploiting the capability of smart antennas to increase capacity. Finally, more advanced systems (3rd generation) will be able to benefit from smart antenna systems, but it is almost certain that more sophisticated space/time filtering approaches will be necessary, especially as these systems become mature.

Acknowledgments

The author gratefully acknowledges the assistance of Professor J. P. McGeehan in every aspect of his research activities. Thanks are also due to Dr. G. Athanasiadou for her help and the provision of ray-tracing results.

References

1 RAPELI, J.: 'UMTS: targets, system concept, and standardisation in a global framework', IEEE Pers. Commun., February 1995, 2, (1), pp. 20-28
2 TSOULOS, G.: 'Adaptive antenna technology'. IEE 3rd residential course on Digital Techniques in Radio Systems, University of Bristol, September 1997
3 MOGENSEN, P.: 'System aspects of adaptive antennas for GSM'. TSUNAMI Seminar, Aalborg University, 1997
4 PAULRAJ, A.: 'Smart antenna in wireless communications: technology overview'. 4th Workshop on Smart Antennas in Wireless Mobile Communications, Stanford University, 1997
5 WINTERS, J.: 'Signal acquisition and tracking with adaptive arrays in the digital mobile radio system IS-54 with flat fading', IEEE Trans. Veh. Technol., 1993, VT-42, pp. 377-384
6 NAGUIB, A., PAULRAJ, A., and KAILATH, T.: 'Capacity improvement with base station antenna arrays in cellular CDMA', IEEE Trans. Veh. Technol., 1994, VT-43, pp. 691-698
7 THOMPSON, J., GRANT, P., and MULGREW, B.: 'Smart antenna arrays for CDMA systems', IEEE Pers. Commun., October 1996, 3, (5), pp. 16-25
8 ZETTERBERG, P., and OTTERSTEN, B.: 'The spectrum efficiency of a base station antenna array system for spatially selective transmission', IEEE Trans. Veh. Technol., 1995, VT-44, pp. 651-660
9 KENNEDY, J., and SULLIVAN, M.: 'Direction finding and "smart antennas" using software radio architectures', IEEE Commun. Mag., May 1995, 33, (5), pp. 62-68
10 LI, Y., FEUERSTEIN, M., and REUDINK, D.: 'Performance evaluation of a cellular base station multibeam antenna', IEEE Trans. Veh. Technol., 1997, VT-46, pp. 1-9
11 HO, M., STUBER, G., and AUSTIN, M.: 'Performance of switched beam smart antennas for cellular radio systems', IEEE Trans. Veh. Technol., 1998, VT-47, pp. 10-19
12 TSOULOS, G. et al.: 'Wireless personal communications for the 21st century: European technological advances in adaptive antennas', IEEE Commun. Mag., September 1997, 35, (9), pp. 102-109
13 TSUNAMI Project Final Report, R2108/ERA/WP1.3/MR/P/096/b2, 1996
14 DAM, H.: 'Smart antennas in the GSM system'. MSc thesis, Aalborg University, 1995
15 TANGEMANN, M.: 'Near-far effects in adaptive SDMA system'. 6th Int. Symposium on Personal, Indoor and Mobile Radio Communications, 1995, Toronto, Canada, pp. 1293-1297
16 FUHL, J., and MOLISCH, A.: 'Capacity enhancement and BER in a combined SDMA/TDMA system'. Proc. 46th Vehicular Technology Conference, 1996, Atlanta, USA, 3, pp. 1481-1485
17 XU, G. et al.: 'Experimental studies of space division multiple access schemes for spectral efficient wireless communications'. Proc. IEEE Int. Communications Conf., 1st-5th May 1994, New Orleans, USA, pp. 800-804
18 WARD, C., SMITH, M., JEFFRIES, A., ADAMS, D., and HUDSON, J.: 'Characterising the radio propagation channel for smart antenna systems', Electron. Commun. Eng. J., August 1996, 8, (4), pp. 191-200
19 PAULRAJ, A., and PAPADIAS, C.: 'Space-time processing for wireless communications', IEEE Signal Process. Mag., November 1997, pp. 49-83
20 MOULY, M., and PAUTET, M.: 'The GSM system for mobile communications' (Cell & Sys., France, 1992)
21 TSOULOS, G., and ATHANASIADOU, G.: 'Extrapolation of field trial results to UMTS'. TSUNAMI II Deliverable AC020/UOB/D1.9/DS/P/192/a1, 1998
22 WINTERS, J.: 'Two signalling schemes for improving the error performance of frequency division duplex (FDD) transmission systems using transmitter antenna diversity'. 43rd IEEE Vehicular Technology Conf., 18th-20th May 1993, Secaucus, NJ, USA, pp. 85-88
23 GERLACH, D., and PAULRAJ, A.: 'Adaptive transmitting antenna arrays with feedback', IEEE Signal Process. Lett., October 1994, 1, (10), pp. 150-153
24 RAYLEIGH, G., DIGAVVI, S., JONES, V., and PAULRAJ, A.: 'A blind adaptive transmit antenna algorithm for wireless communication'. Proc. IEEE Int. Communications Conf., 1995, 3, pp. 1495-1499
25 WARD, C., HARGRAVE, P., and McWHIRTER, J.: 'A novel algorithm and architecture for adaptive digital beamforming', IEEE Trans. Antennas Propag., March 1986, 34, (3), pp. 338-346
26 GOCKLER, H., and SCHEUERMANN, H.: 'A modular approach to a digital 60-channel transmultiplexer using directional filters', IEEE Trans. Commun., July 1982, 30, (7), pp. 1588-1613
27 GEPPERT, L.: 'Solid state', IEEE Spectrum, Analysis & Forecast Issue on Technology, January 1996, pp. 51-55
28 TSOULOS, G., ATHANASIADOU, G., BEACH, M., and SWALES, S.: 'Adaptive antennas for microcellular and mixed cell environments with DS-CDMA', Kluwer Wireless Pers. Commun. J., August 1998, WPC-7, (2/3), pp. 147-169
29 JOHNSON, D., and DUDGEON, D.: 'Array signal processing' (Prentice-Hall, USA, 1993)
30 TSOULOS, G.: 'Approximate SIR and BER formulas for DS-CDMA based on the produced radiation pattern characteristics with adaptive antennas', Electron. Lett., 1998, pp. 1802-1804
31 ATHANASIADOU, G., NIX, A., and McGEEHAN, J.: 'A ray tracing algorithm for microcellular wideband propagation modelling'. IEEE Vehicular Technology Conf., 1995, Chicago, USA, pp. 261-265
32 ATHANASIADOU, G., NIX, A., and McGEEHAN, J.: 'A new 3D indoor ray tracing model with particular reference to predictions of power and RMS delay spread'. IEEE Personal Indoor and Mobile Radio Conf., September 1995, Toronto, Canada, pp. 1161-1165
33 HATA, M.: 'Empirical formula for propagation loss in land mobile radio services', IEEE Trans. Veh. Technol., 1980, VT-29, (3), pp. 317-325
34 COOPER, M., and GOLDBURG, M.: 'Intelligent antennas: spatial division multiple access'. 1996 Annual Review of Communications, pp. 999-1002 (a copy can be obtained from http://www.arraycomm.com/)
35 ANDERSEN, J.: 'Capacity of parallel channels'. TSUNAMI II Seminar, Aalborg University, 1997
36 FOSCHINI, G.: 'Layered space-time architecture for wireless communication in a fading environment when using multielement antennas', Bell Labs Tech. J., Autumn 1996, pp. 41-59


An Adaptive Array in a Spread-Spectrum Communication System RALPH T. COMPTON, JR.

Abstract-This paper describes the integration of an LMS adaptive array into a pseudonoise (PN) coded biphase modulated communication system. The paper explains how these systems may be combined and presents a systems overview of the interaction between the adaptive array and the signaling waveform. An experimental system is described, and typical performance results are presented. The hybrid system requires only very modest spectrum-spreading ratios, such as 5:1, for the adaptive array to null interference. The combined system provides the interference protection of the adaptive array during the code timing acquisition phase as well as after code lockup.

I. INTRODUCTION

IN A RECENT PAPER [1], the author described an experimental adaptive array based on the LMS algorithm [2]. That paper concentrated on the antenna pattern characteristics of the adaptive array. In the present paper, we continue our description of this system by discussing the integration of this adaptive array into a spread-spectrum communication system.

The LMS adaptive array offers the antenna designer the advantages of automatic beam tracking of a desired signal and automatic nulling of interference. The adaptive array can operate with nonuniformly spaced antenna elements on a curved surface, such as an aircraft. There is no need for each element in the array to have the same element pattern or for uniform mutual impedances between elements in the array. Obviously, these benefits are of enormous practical importance.

On the other hand, to use an adaptive array in a communication system requires a high degree of compatibility between the signaling waveforms and the adaptive array. One cannot simply put an adaptive array in an arbitrary communication system. There are several reasons for this. First, the adaptive array weights are random processes, and they modulate the desired signal. (The array is a dispersive channel [3].) The desired signal waveforms must be chosen so this modulation does not destroy the effectiveness of the communication system. Second, there must be some difference between the desired signal and interference waveforms, so these signals can be distinguished in the array. Third, there must be some method for reference-signal generation and for acquiring system timing or frequency when the array is in the system.

This paper will describe a particular communication system, a biphase modulated pseudonoise (PN) coded spread-spectrum system, in which it has been possible to integrate the adaptive array in a very successful way. The resulting hybrid system yields the full interference protection of the adaptive array (which is generally much greater than the protection afforded by the coding alone), not only after the PN-code timing is known, but during the code-acquisition phase as well. The paper will discuss the system operation at a conceptual level. That is, the objective is to present an overview of the relationship between the adaptive array and the communication system. We do not include detailed theoretical analyses of the system components. It is hoped that the reader with a background in antennas will learn how the LMS adaptive array can be integrated into a practical communication system, and that the reader with a communications background will learn of the close coupling between the signaling waveforms and the adaptive array processing. It seems clear that to exploit the full potential of adaptive arrays for interference protection in communication systems, the design of the signaling waveforms and the adaptive-array processing must be undertaken together.

Section II of the paper briefly reviews the feedback configuration of the adaptive array to establish terminology. (Readers wishing more information on the adaptive array are referred to the original papers by Widrow et al. [2], Applebaum [4], and to the earlier paper by the author [1].) Section III describes the communication signaling waveforms used in the hybrid system and the method used to generate a reference signal for the array. Section IV describes the technique used for acquiring code timing and relates the code-lockup characteristics to the array parameters. Finally, Section V describes typical experiments involving interference rejection with this system.

Manuscript received August 18, 1977; revised November 23, 1977. This work was supported in part by Contract N00014-67-0232-0009 between the Office of Naval Research, Department of the Navy, Arlington, VA, and the Ohio State University Research Foundation, Columbus, OH, and by Contract N00019-78-C-0131 between Naval Air Systems Command, Washington, DC, and the Ohio State University Research Foundation, Columbus, OH. The author is with the Electro-Science Laboratory, Department of Electrical Engineering, Ohio State University, Columbus, OH 43212.

Reprinted from Proceedings of the IEEE, Vol. 66, No. 3, pp. 289-298, March 1978.

Fig. 1. The LMS adaptive array.

Fig. 2. Adaptive-array feedback loop. (Figure labels: signals from other elements; array output s(t); reference signal R(t).)

II. THE LMS ALGORITHM ADAPTIVE ARRAY

The LMS algorithm adaptive array [2] is shown in Figs. 1 and 2. The signal from each antenna element y_j(t) is passed through a quadrature hybrid which splits it into quadrature components x_i(t). Each component x_i(t) is multiplied by a weight w_i and then summed to produce the array output s(t). The weights w_i are controlled by a feedback system that minimizes the mean-square value of the error signal ε(t). ε(t) is the difference between the array output s(t) and a locally generated signal called the reference signal R(t).¹

The feedback loops in the adaptive array have the form shown in Fig. 2. One such loop is needed for each quadrature channel in the array. This loop is often called a correlation loop, because it forms the product between the error signal ε(t) and the channel signal x_i(t) and integrates the result. It was originally shown by Widrow et al. [2] that this loop minimizes the mean-square value of the error signal.

The antenna performance is controlled by what signal is used for the reference signal R(t). We suppose the array is receiving a desired signal, thermal noise, and directional interference. If the reference signal R(t) is a perfect replica of the desired signal, the error signal ε(t) consists of just the thermal noise and interference. Minimizing ε²(t), then, minimizes the total interference and thermal noise power in the array output. (The weights that minimize ε²(t) result in pattern nulls toward directional interfering signals.) Furthermore, if the reference signal is a fixed-amplitude replica of the desired signal, minimizing ε²(t) minimizes the sum of three quantities: 1) the thermal noise power, 2) the interference power, and 3) the power in the difference waveform between the desired signal and the reference signal. As long as the signal-to-noise ratio (S/N) is not too low, the result is to match the desired signal to the reference signal and to minimize the thermal noise and interference powers, i.e., the array produces a maximum ratio of desired signal power to interference and noise power at the output.

To apply this concept to a practical communication system, one must find some way to obtain a suitable reference signal. In general, of course, there is no way to obtain a perfect replica of the desired signal ahead of time. However, the situation is not nearly as hopeless as it appears at first glance. For some types of communication signals, it is possible to obtain a suitable reference signal by processing the array output signal as shown in Fig. 3. Whether this can be done or not depends on the characteristics of the desired signal and interference waveforms. There must be some known difference between the two signals that the designer can exploit.

To see how the signal-processing loop in Fig. 3 can be designed, we first note that the loop does not in fact have to produce a perfect replica of the desired signal. Rather, its function is to produce a reference signal that is 1) highly correlated² with the desired signal at the array output, and 2) uncorrelated with interference components at the array output. Such a reference signal will cause the array to behave in the desired way, because the feedback loops in the array are correlation loops. Only the correlation between the reference signal and signals x_i(t) affects the weights w_i.

Thus the signal-processing loop in Fig. 3 should allow the desired signal component of the array output to pass through reasonably unchanged. Some distortion or delay of the desired signal is acceptable, however, so long as the reference-signal component that results remains highly correlated. In addition, the desired signal component of R(t) should be of fixed amplitude, not dependent on the amplitude of the desired signal at the array output. On the other hand, the signal-processing loop should alter the waveforms of the interference components. That is, it is not necessary for the reference signal to be free of interference. It is only necessary that the interference components in R(t) be uncorrelated with those at the array output. Hence, the signal-processing loop must change the interference waveforms in some manner so the correlation with the original waveforms is destroyed. For such signal processing to be possible usually requires a special type of desired signal. In the next section we describe a particular spread-spectrum communication signal, and show how a reference signal can be generated with this signal.

¹ R(t) is called the "desired response" by Widrow et al.

² In this paper we use the words "correlated" and "uncorrelated" in the sense defined by the feedback loops in the array: two signals are uncorrelated if their product integrated over a certain time interval is zero. The time interval of interest here is the shortest time constant of the array feedback loops.

Fig. 3. Reference-signal generation. (Figure labels: array output; error signal ε(t).)

Fig. 4. Typical φ_data(t), φ_code(t), and φ(t).

Fig. 5. The coded reference loop. (Figure labels: array output; to data bit detection; error signal ε(t).)
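The weight-control principle of this section can be illustrated numerically. The following Python sketch is not the experimental system of the paper; it is a discrete-time complex LMS loop under stated assumptions: a four-element uniform linear array with half-wavelength spacing, a desired biphase signal from broadside, a stronger Gaussian interferer from 40 degrees, and an ideal replica of the desired signal used as the reference. The function names, step size `mu`, and signal levels are all hypothetical choices made for the illustration.

```python
import numpy as np

def steering_vector(n_elements, angle_deg, spacing=0.5):
    """Assumed uniform linear array; spacing is in wavelengths."""
    k = np.arange(n_elements)
    return np.exp(2j * np.pi * spacing * k * np.sin(np.deg2rad(angle_deg)))

def lms_weights(x, ref, mu):
    """Complex LMS: drive the array output w^H x toward the reference."""
    w = np.zeros(x.shape[0], dtype=complex)
    for n in range(x.shape[1]):
        s = np.vdot(w, x[:, n])            # array output s = w^H x
        err = ref[n] - s                   # error between reference and output
        w += mu * x[:, n] * np.conj(err)   # correlate error with channel signals
    return w

def demo_lms(n_samples=5000, seed=0):
    rng = np.random.default_rng(seed)
    a_des = steering_vector(4, 0.0)        # desired signal from broadside
    a_int = steering_vector(4, 40.0)       # interferer from 40 degrees
    desired = rng.choice([-1.0, 1.0], n_samples)   # biphase symbols
    interf = 3.0 * (rng.standard_normal(n_samples)
                    + 1j * rng.standard_normal(n_samples)) / np.sqrt(2)
    noise = 0.1 * (rng.standard_normal((4, n_samples))
                   + 1j * rng.standard_normal((4, n_samples))) / np.sqrt(2)
    x = np.outer(a_des, desired) + np.outer(a_int, interf) + noise
    w = lms_weights(x, desired, mu=0.002)  # reference = ideal replica (assumption)
    # Pattern response toward the desired signal and toward the interferer
    return abs(np.vdot(w, a_des)), abs(np.vdot(w, a_int))

if __name__ == "__main__":
    gain_des, gain_int = demo_lms()
    print(f"response toward desired: {gain_des:.3f}, toward interferer: {gain_int:.4f}")
```

Under these assumptions the converged weights keep a response near unity toward the desired signal while placing a deep null toward the interferer, which is the behaviour the correlation loops are described as producing.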

III. THE SPREAD-SPECTRUM COMMUNICATION SYSTEM

Consider a spread-spectrum digital communication system using biphase modulation. The transmitted desired signal is of the form

$$s(t) = A \cos[\omega_0 t + \phi(t)]$$

where A is an amplitude constant, ω_0 is the carrier frequency, and φ(t) is a binary waveform whose value switches between 0 and π. The waveform φ(t) is made up of the (modulo 2π) sum of two bit streams, φ_data(t) and φ_code(t). φ_data(t) is the useful information sent over the communication system; it has a bit rate of f_d bit/s. φ_code(t) is the PN code; its bit rate is f_c bit/s. The code rate is higher than the data rate by the ratio

$$N = \frac{f_c}{f_d}$$

(N is an integer), which we call the spreading ratio. Fig. 4 illustrates a typical φ_data(t) and φ_code(t), and shows the resulting total phase modulation φ(t). The bit transitions in φ_data(t) coincide with bit transitions in φ_code(t). It is assumed that the details of φ_code(t) (i.e., the feedback connections, shift register length, and clocking rate f_c) used at the transmitter are also known at the receiving site, so a similar code can be generated in the receiver.

With such a desired signal, a reference signal for the adaptive array can be generated as shown in Fig. 5. In this loop, the array output signal is first mixed against a coded local-oscillator signal r_1(t). r_1(t) is obtained by modulating a CW signal with the same PN code used to modulate the desired signal at the transmitter. This code is known at the receiving site, so it can be generated and used to produce r_1(t). (We assume for the moment that the local code timing is known. The method used to establish code timing in the receiver is discussed in Section IV.) The output from this mixer contains both desired and interference components. Because the code modulation in r_1(t) matches that on the desired signal, the desired signal component at the mixer output is compressed to data bandwidth, while the interference is not. The filter bandwidth is chosen wide enough to pass the desired signal, which now has only data modulation, but not wide enough to pass the interference, which has full-code bandwidth. As a result, the filter removes all but the center portion of the interference spectrum. Next, the signal is passed through a limiter, which controls the amplitude of the reference signal. Finally, the limiter output is mixed again with the coded LO signal r_1(t), and the result is used as the reference signal.

If we trace the desired signal through this loop, we find that it has the code modulation removed at the first mixer, it passes through the data-bandwidth filter and through the limiter, which fixes its amplitude, and finally it has the code modulation put back on at the second mixer. The result is that the desired signal passes through this loop unchanged, except for the amplitude adjustment at the limiter and the envelope time delay associated with the filter.³ An interference signal without the proper PN-code modulation, however, has its waveform drastically altered by this loop. For example, a CW interfering signal which has a single line spectrum at the array output produces a reference signal component with the full bandwidth of the PN-code modulation (and with lower power than at the array output). The correlation between the interference signal at the array output and the reference signal has been essentially destroyed by the loop.

³ The filter bandwidth has to be wider than a conventional spread-spectrum filter to keep the envelope delay under [fraction illegible] bit.
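The mixer, filter, limiter, mixer chain of the coded reference loop can be mimicked at baseband. The sketch below is only a toy model under several assumptions that are not in the paper: signals are real-valued ±1 sequences sampled once per code bit, random ±1 chips stand in for the PN code, the data-bandwidth filter is a centered moving average over one data bit, the limiter is an ideal hard limiter, and the CW interferer appears as a constant of half the desired amplitude. The helper names are hypothetical.

```python
import numpy as np

def reference_loop(array_out, code, chips_per_bit):
    """Baseband sketch of the coded reference loop."""
    despread = array_out * code                       # first mixer: strip the code
    kernel = np.ones(chips_per_bit) / chips_per_bit   # assumed data-bandwidth filter
    filtered = np.convolve(despread, kernel, mode="same")
    limited = np.sign(filtered)                       # hard limiter fixes the amplitude
    return limited * code                             # second mixer: restore the code

def demo_reference_loop(n_bits=2000, chips_per_bit=5, cw_amplitude=0.5, seed=0):
    rng = np.random.default_rng(seed)
    n_chips = n_bits * chips_per_bit
    data = np.repeat(rng.choice([-1.0, 1.0], n_bits), chips_per_bit)
    code = rng.choice([-1.0, 1.0], n_chips)   # random chips stand in for a PN code
    desired = data * code                     # biphase data plus code modulation
    cw = cw_amplitude * np.ones(n_chips)      # CW interferer at the carrier
    ref = reference_loop(desired + cw, code, chips_per_bit)
    corr_desired = np.mean(ref * desired)         # normalized correlation with desired
    corr_cw = np.mean(ref * cw) / cw_amplitude    # normalized correlation with CW
    return corr_desired, corr_cw

if __name__ == "__main__":
    c_des, c_cw = demo_reference_loop()
    print(f"correlation with desired: {c_des:.3f}, with CW interferer: {c_cw:.3f}")
```

In this toy setting the reference stays highly correlated with the desired signal and only weakly correlated with the CW component, even at a 5:1 spreading ratio. The residual correlation comes from the few samples near data transitions where the filtered interference overrides the limiter decision.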


There are several reasons why the reference signal amplitude must be controlled. First, the amplitude of the reference signal determines the amplitude of the signals present at the array output, which should fall within a certain range for proper operation of the multipliers, etc., in the feedback loops. Second, the reference-signal amplitude should be fixed so the array will yield a maximum ratio of signal power to interference and noise power at the output, as discussed above. Third, the reference-signal amplitude cannot be linearly dependent on the array output amplitude, because then there is a problem with the operation of the array weights. For example, if the reference loop were linear and its gain (to the desired signal) were greater than unity, the loop would return a reference signal larger than the array output signal, causing the array weights to increase without limit. Conversely, a loop gain less than unity returns a reference signal smaller than the array output, causing the array weights to drop to zero. Thus stable operation requires a fixed reference-signal amplitude. Finally, when the reference-signal amplitude is fixed, the desired signal voltage at the array output will also be fixed, regardless of the incident power of the desired signal. This behavior is important for the delay-lock loop used to track code timing (described in the next section); it makes the threshold setting for code acquisition independent of incoming signal strength.

We note finally that the processing loop in Fig. 5 not only generates the reference signal, but also delivers the desired signal with the PN code removed at the output of the data-bandwidth filter. (The desired signal is available at this point once the interference has been nulled.) Thus, the reference-signal loop incorporates the spectrum despreading, and the signal out of the data-bandwidth filter is ready for data-bit detection. Moreover, it is important to note that the interference protection afforded by the waveform processing (the PN coding) is available in addition to that due to array nulling. The two types of interference suppression are in cascade.

A reference loop of the type shown in Fig. 5 was implemented and used to obtain the experimental data described in Section V. The desired signal was generated by using a slow-speed PN code to simulate data and a higher speed PN code for spectrum spreading. The sum bit stream was biphase modulated onto a 70-MHz carrier. The data and code frequencies (f_d and f_c) could be varied over a wide range up to about 5 Mbit/s (for different data modulation frequencies, different filter bandwidths were used in the reference signal loop), so experiments could be performed with different spreading ratios. One of the goals of the experimentation was to determine the array performance that could be obtained with very modest spreading ratios. Most of the data in Section V are for a spreading ratio of 5:1. The PN codes were obtained from a Hewlett-Packard 1930-A generator. The shift register length used to generate the codes could be varied from 3 to 20 bits. (However, as will be discussed in the sequel, PN-code sequence length appears to have no effect on array performance other than code lockup time.)
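The argument that a linear reference loop is unstable while a limited one is stable can be checked with a one-weight toy model. The sketch below assumes a single real weight, a noise-free unit-amplitude biphase input, and a discrete LMS update with a hypothetical step size `mu`; none of these values come from the paper. With a linear reference R = g·s the update reduces to w ← w[1 + μ(g − 1)], so the weight grows for g > 1 and decays for g < 1, whereas a hard-limited reference pulls the output amplitude to the limiter level.

```python
import math

def adapt_weight(reference_fn, w0=0.1, mu=0.05, n_steps=400):
    """One-weight LMS loop whose reference is derived from its own output."""
    w = w0
    for n in range(n_steps):
        x = 1.0 if n % 2 == 0 else -1.0   # unit-amplitude biphase desired signal
        s = w * x                         # array output
        err = reference_fn(s) - s         # error = reference - output
        w += mu * x * err                 # correlation-loop weight update
    return w

def demo_loop_gain():
    w_gain_high = adapt_weight(lambda s: 1.5 * s)    # linear loop, gain > 1
    w_gain_low = adapt_weight(lambda s: 0.5 * s)     # linear loop, gain < 1
    w_limited = adapt_weight(lambda s: math.copysign(1.0, s))  # hard limiter
    return w_gain_high, w_gain_low, w_limited

if __name__ == "__main__":
    hi_gain, lo_gain, limited = demo_loop_gain()
    print(f"gain 1.5: w = {hi_gain:.1f}")
    print(f"gain 0.5: w = {lo_gain:.2e}")
    print(f"limiter : w = {limited:.6f}")
```

In this sketch the weight diverges for a loop gain of 1.5, collapses toward zero for a loop gain of 0.5, and settles at the limiter amplitude when the reference is hard limited.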

IV. THE DELAY-LOCK TRACKING LOOP

The PN code used in the reference loop (Fig. 5) must be synchronized with the incoming desired signal code for the reference loop to operate properly. For large timing errors between the two codes, the desired signal will not pass through the reference loop. The array then regards the desired signal as interference and nulls it out. Experiments [6]-[8] have shown that the array tracks the desired signal properly for timing displacements between the two codes up to about one-half bit. Beyond one-half bit, the array nulls the desired signal.

To keep the codes properly aligned, the delay-lock loop shown in Fig. 6 was used. This loop is a modified version of a conventional delay-lock loop [9]-[11]. (The modifications are discussed below.) This loop splits the array output signal into two channels and mixes it in each channel with a 100 MHz LO signal biphase modulated with a PN code. The PN codes in the two LO signals are displaced one bit in time. A 30 MHz zonal filter then selects the difference frequency output, and the output of this filter is shifted to 5 MHz and filtered to data bandwidth. The 5 MHz filter outputs are squared, filtered to narrow bandwidth at 10 MHz, and envelope detected. The square root of each signal is then taken and the sum and difference of the two signals are derived. The sum voltage is used for timing acquisition, as described below, and the difference voltage controls the VCO frequency that clocks the PN-code generator.

The PN-code generator (not shown in Fig. 6) generates the two PN codes used in the two 100 MHz LO signals. These two codes are displaced one bit in time. It also generates a third version of the PN code timed halfway in between the two LO codes. This "in-between" code is used to provide the coded LO signal in the reference loop in Fig. 5. When the delay-lock loop tracks, the code on one LO runs one-half bit ahead of the incoming signal code, and the code on the other runs one-half bit behind. The code used in the reference loop is then synchronized with the received signal code.

The reason for using a delay-lock loop with squaring and square root operations was to enable the array system to lock on and track a desired signal with data modulation present on the signal, i.e., we did not wish to send a data-free preamble for code acquisition.
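The sum and difference voltages of the delay-lock loop can be illustrated with a baseband early/late correlator. The sketch below is a simplified stand-in, not the circuit of Fig. 6: the squaring, second-harmonic filtering, envelope detection and square-root chain is replaced by the magnitude of a per-data-bit correlation (an assumption that likewise removes the sign of the data), random ±1 chips stand in for the PN code, and the oversampling factor and function names are hypothetical.

```python
import numpy as np

def early_late(rx, local_code, samples_per_chip, samples_per_bit):
    """Return (sum, difference) of the early and late channel envelopes."""
    half = samples_per_chip // 2

    def envelope(shift):
        lo = np.roll(local_code, shift)    # local code shifted by half a bit
        per_bit = (rx * lo).reshape(-1, samples_per_bit).mean(axis=1)
        return np.abs(per_bit).mean()      # magnitude removes the data sign

    early = envelope(-half)                # code running one-half bit ahead
    late = envelope(+half)                 # code running one-half bit behind
    return early + late, early - late

def demo_delay_lock(tau_chips, n_bits=2000, chips_per_bit=5,
                    samples_per_chip=8, seed=1):
    rng = np.random.default_rng(seed)
    chips = rng.choice([-1.0, 1.0], n_bits * chips_per_bit)
    code = np.repeat(chips, samples_per_chip)
    samples_per_bit = chips_per_bit * samples_per_chip
    data = np.repeat(rng.choice([-1.0, 1.0], n_bits), samples_per_bit)
    rx = data * code                       # received signal with data modulation
    # Local code delayed by tau_chips relative to the incoming code
    local = np.roll(code, int(round(tau_chips * samples_per_chip)))
    return early_late(rx, local, samples_per_chip, samples_per_bit)

if __name__ == "__main__":
    for tau in (-0.25, 0.0, 0.25, 3.0):
        total, diff = demo_delay_lock(tau)
        print(f"timing error {tau:+.2f} bit: sum = {total:.2f}, difference = {diff:+.2f}")
```

In this sketch the difference output changes sign with the direction of the timing error and is close to zero when the codes are aligned, while the sum output is largest near alignment, which is why the difference can steer the clock and the sum can indicate acquisition.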
In a conventional delay-lock loop [9]-[11], the incoming signal has only PN-code modulation. When this signal is mixed with a coded LO signal whose code timing differs by an amount ΔT from the incoming signal code, the mixer output contains a spectral line whose amplitude is proportional to the autocorrelation function of the PN code for delay ΔT, i.e., the amplitude of this spectral line drops linearly with ΔT, up to one bit. The output of this mixer is then narrow-band-filtered to extract this line frequency, and the result is envelope detected. In the system described here, there is also data modulation on the signal. Hence, at the output of the first mixer the desired spectral line is now convolved with a (sin x/x)² data-modulation spectrum. Therefore, it is necessary to retain an adequate bandwidth after the first mixer to pass this data modulation. However, the signal is then squared, which eliminates the biphase data modulation and collapses the spectrum. By filtering at the second harmonic, one can use a narrow filter bandwidth to establish an adequate S/N. In this way a good estimate of the code autocorrelation function (squared) results. Finally, the square root is taken to eliminate the effect of squaring on the autocorrelation function. The major effect of using a wide-bandwidth filter after the first mixer is to increase the thermal- and self-noise power [9]-[11] over what they would be in a conventional delay-lock loop. However, the adaptive array performance is not particularly influenced by minor amounts of code jitter, as long as the total timing error remains well below [fraction illegible] bit [6]-[8]. Furthermore, the amount of code timing jitter one

Fig. 6. Delay-lock tracking loop. (Figure labels: biphase modulators; PN code 1; PN code 2; sweep voltage; acquisition threshold voltage; adjustable minimum voltage that keeps the divisor from dropping too low; center-frequency trim voltage; output to PN generator.)

obtains is controlled by the bandwidth of the filter used at the second harmonic. Hence one can trade off this jitter against other system parameters, such as allowable frequency offsets or code-slewing speed (lockup time). Code acquisition with this delay-lock loop behind the adaptive array is rather interesting. Code timing is acquired by running the local PN -code generator faster (or slower) than the incoming signal code. This code slewing is continued until the sum channel voltage indicates that the two codes are aligned. During this slewing process, since the PN code used in the reference loop (Fig. 5) is slaved to the delay-lock loop .codes, the reference-loop code is also slewed. Thus before the -desired signal code and the reference-loop code are aligned, there is no correlation between the incoming desired signal and the reference signal. Hence the array nulls the desired signal during this time. As the two codes begin to align, the reference signal begins to correlate with the desired signal, causing the array weights to change, and the array pulls the desired signal up out of the noise. Thus the desired signal appears at the array output just as the local code timing approaches its correct value. The desired signal is present at the delay-Iockloop input when it needs to be, but not before. The sum channel output rises and trips a threshhold comparator, which removes the slewing voltage from the VCO input and allows the loop to begin tracking. This technique of slewing the PN code to acquire timing is well known, except that because the delay-lock loop interacts with the adaptive array while it slews, there are several novel features, which we discuss below. First, the array provides full interference protection during the lockup phase. That is, the array nulls interference regardless of PN-code timing in the array. (Interference nulling is not related to the local PN-code timing.) 
Because the array removes interference during slewing, the delay-lock loop does not have to contend with interference at all. Hence, the integration time in the loop, the slewing speed, and the time required for lockup do not need to change when interference is being received.

Second, the desired signal that appears at the array output when the code timing is correct has a fixed amplitude, not dependent on incoming signal strength. This behavior occurs because the array forces the output desired-signal amplitude to match the reference-signal amplitude, which in turn is controlled by the limiter in the reference loop (Fig. 5). A fixed desired signal amplitude at the array output is helpful because the delay-lock loop does not have to operate over a range of signal levels. For example, a fixed threshold value, not dependent on received signal power, can be used in the sum channel for acquisition. Also, circuit linearity problems are vastly simplified, e.g., a fixed signal level makes the squaring operation feasible.

A third point of mild interest concerns how the array pulls the desired signal out of the noise when the timing approaches the correct values. Since the desired signal starts out in a null, many people have wondered how there will be any desired signal present at the array output to allow a reference signal to be generated when the timing is correct. The answer involves two factors. The first factor concerns the design of the reference loop in Fig. 5. When the array output signal is small, the voltage in the reference loop is too small for the limiter to clip the signal. Below the clipping level, the loop is linear; it produces a reference signal whose amplitude is linearly dependent on the array output signal. In this low-signal region, the loop must have a gain greater than unity. Then the reference signal will have a greater amplitude than the array output signal. This situation makes the array weight setting that nulls the desired signal a point of unstable equilibrium. In other words, if one weight changes away from this value slightly, so a small desired signal appears at the array output, a reference signal of larger amplitude will appear. This behavior reinforces the movement of that weight away from the null. In this way, the array output signal will grow until it is large enough that the limiter clips.
When the clipping level is reached, the reference signal does not increase, and the array output signal settles into its steady-state value. The second factor is that the array weights in the adaptive array are random processes. They are derived from the product of two noisy signals, x_i(t) and e(t). Thus there is always a certain amount of weight jitter, so the movement away from the desired signal null has no difficulty starting. In practice, we have found no tendency for the array to keep the desired signal nulled once the code timing is nearly correct. The weight jitter always starts the weights away from this point.

Finally, we remark that this lockup procedure requires the array speed of response to be fast enough for the desired signal to appear at the array output before the local code timing has passed by. In other words, the array time constants must be commensurate with the slewing speed. In the system used for experimentation, the adaptive array processor had time constants between 1 and 10 ms. (The array time constants depend on incoming signal power [12].) The code slewing rate (the difference between the incoming code frequency f_c and the local code frequency) could be varied. For a code rate f_c of 2.5 Mbit/s and a 1023-bit code period (i.e., a 10-bit shift-register length [5]), lockup time was typically around 1 s. Of course, with this slewing technique, the code acquisition time is linearly proportional to the code length.

Fig. 7 shows a typical scope trace of the sum channel output from the delay-lock loop as the array is being slewed. Fig. 7(a) shows the sum channel output versus time with no interference and with the array not operating (with the weights fixed). Fig. 7(b) shows the sum channel output when a CW interference signal 20 dB above the desired signal is added (the array is still inoperative). In this case, each channel of the delay-lock loop is saturated, and the desired signal pulse does not come through. Finally, Fig. 7(c) shows the sum channel output when the array is adapting during slewing. The interference is no longer present on the delay-lock loop, and the desired signal pulse reappears. The shape of this pulse has been altered somewhat from the shape in Fig. 7(a) because the array weights are changing during slewing (and hence they modulate the desired signal). These figures illustrate the interference protection of the array during the code acquisition phase.

Fig. 7. (a) Sum channel output with desired signal only (array inoperative). (b) Sum channel output with desired signal and coherent CW interference (array inoperative). (c) Sum channel output with desired signal and coherent CW interference (array adapting during acquisition).

Fig. 8. Array interference rejection. (a) Spectrum on one array element. (b) Array output spectrum after adaptation (S/I improvement = 35 dB). Code rate = 2.5 Mbit/s; data rate = 0.5 Mbit/s; interference coherent (20° spatial); S/N = 0 dB, S/I = -20 dB; vertical scale = 10 dB/cm.
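The start-up argument for the reference loop (gain above unity below the clipping level, fixed amplitude above it) can be illustrated with a toy model. The sketch below is not the circuit of Fig. 5: the first-order update rule, the step size, the clip level, and the function name `settle_output` are all assumptions chosen only to show why a null is an unstable equilibrium when the small-signal loop gain exceeds one.

```python
def settle_output(loop_gain, clip_level=1.0, seed_amplitude=1e-3,
                  step=0.1, iterations=400):
    """Toy first-order model (an assumption, not the paper's circuit).

    The array output amplitude relaxes toward the reference amplitude,
    and the reference is the amplified array output passed through a limiter.
    """
    amplitude = seed_amplitude  # small output produced by weight jitter
    for _ in range(iterations):
        # Linear below the clipping level, fixed amplitude above it.
        reference = min(loop_gain * amplitude, clip_level)
        # Feedback pulls the array output toward the reference amplitude.
        amplitude += step * (reference - amplitude)
    return amplitude


for gain in (0.5, 2.0):
    print(f"loop gain {gain}: final output amplitude {settle_output(gain):.4f}")
```

In this toy model a gain below unity lets the small seed decay back toward the null, while a gain above unity makes it grow until the limiter clips, after which the output settles at the clip level.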

ARRAY PERFORMANCE

In this section we discuss the experimental operating characteristics of the adaptive array with the coded reference loop and delay-lock timing loop. The adaptive array processor in these tests was a four-element processor operating at 70 MHz, as described in [1]. In the experiments discussed here, the desired and interference signals were each split in 4-way power dividers and fed to the processor inputs after appropriate delays to simulate various arrival angles. Independent thermal noise was also included in each simulated element signal.

We begin with a series of spectrum analyzer photographs illustrating typical interference-rejection experiments. Fig. 8(a) shows the power spectral density on one element of the array when PN-coded desired signal, thermal noise, and CW interference are present. The desired signal-to-thermal-noise ratio is 0 dB, and the desired signal-to-interference ratio is -20 dB (as measured on each element). The PN-code rate is 2.5 Mbit/s, and the data rate is 0.5 Mbit/s, so the spreading ratio is 5:1. The desired signal is in phase on each element (corresponding to a signal arriving from broadside), and the interference has a progressive phase shift of 60° between elements (corresponding to an arrival angle of 19.5° off broadside for half-wavelength element spacing). The interference frequency is coherent with the carrier of the desired signal (70 MHz). Fig. 8(b) shows the array output spectrum after adapting. As may be seen, the interference has been nulled, and the S/N has been improved due to the array gain. Measurements taken of the output-signal powers separately, with the weights frozen, showed that the signal-to-interference ratio had been improved 35 dB in this case. Fig. 9 shows the same experiment when the interference frequency was 71.2 MHz. In this case, the improvement in signal-to-interference ratio measured 36 dB.

Fig. 10 shows another case in which the interference was swept CW, swept over a 1.2-MHz band with a 1-kHz triangular sweep waveform. The desired signal was modulated with a 1 Mbit/s PN code and 200 kbit/s data (5:1 spreading ratio). Fig. 10(a) shows the power spectral density on one element of the array, and Fig. 10(b) shows the array output-power spectral density. In this test, no element noise was present.

Next we comment on the PN-sequence length. PN-sequence length appears to have no effect on the interference suppression of the array. Figs. 11 and 12 illustrate the effect on the spectrum of changing code-sequence length. Fig. 11 shows the one-element spectrum and the array output spectrum with a sequence length of 31 bits. The desired signal is modulated at a code rate of 2 Mbit/s with a spreading ratio of 10. The interference is swept CW, swept over a 1.8-MHz band at a 1200 Hz rate. The interference power is 6 dB above the desired signal power.

Fig. 12 shows the same experiment repeated with a 7-bit code sequence length. There is no measurable difference in interference suppression as a result of the change in code-sequence length, although the effect on the spectrum may be seen in Figs. 11 and 12.

Next, we show measurements of the interference suppression of the array and discuss its dependence on various factors. Fig. 13 shows a typical measured curve of output interference power as a function of input interference power, with CW interference. The abscissa is plotted as input interference-to-signal ratio, but the interference power is the quantity being varied. These curves were taken with a data rate of 250 kbit/s, a code rate of 1.25 Mbit/s (5:1 spreading ratio), and with a desired signal-to-thermal-noise ratio of 0 dB on each element. The desired signal arrives from broadside, so it is in phase in all four elements. Three curves are shown, corresponding to three different angles of arrival of the interference. For the top curve, the interference has a 15° element-to-element phase shift, corresponding to a spatial angle of 4.8° for half-wavelength element spacing. The second curve shows the case when the interference element-to-element phase shift is 30° (a spatial angle of 9.6° for half-wavelength spacing), and the third curve shows the case when the interference element-to-element phase shift is 75° (a spatial angle of 25° for half-wavelength spacing).

The trend of all three curves is similar. For low interference power, the output interference increases linearly with input interference. In this region, the interference power is small compared with the desired signal and the thermal noise; it has little effect on the array feedback. The feedback loops do nothing to null the interference, so the output interference power rises linearly with the input interference power. When the interference power increases enough, however, the interference begins to dominate the error signal, and the array feedback corrects against it. As the interference power increases further, the feedback-loop gain in the array increases.⁴ This increasing loop gain causes the output interference to drop when the input interference is increased. Finally, for large enough input interference power (approximately 25 dB above the desired signal), the output interference begins to increase again.⁵ We were unable to take data above this point because of equipment power limitations, but earlier experiments [8] have shown that the interference power begins to rise again beyond this point. Also, intermodulation products, which typically are not suppressed by the array feedback, begin to appear in this range.

Fig. 13 also illustrates how the interference suppression depends on the angular separation between the desired signal and the interference. When the interference has very low power, it has little effect on the array pattern. The array simply forms a beam toward the desired signal. Thus the difference between the three curves at the low-power end (I/S < -15 dB) just reflects the array's pattern. The closer the interference is to the desired signal, the more interference power will be present at the array output, because the interference will be higher up on the main beam of the pattern. As the interference power is increased into the region 0 dB < I/S < 10 dB, the separation between the three curves becomes larger. This indicates that the array is better able to null the interference if its angular separation from the desired signal is larger. The angles shown in Fig. 13 span the range where there is a noticeable difference in performance. When the interference phase angle is 75° and higher, the performance is essentially independent of the angular separation.

⁴ The feedback-loop gain in the adaptive array is proportional to the power of the signals in the array [11].
⁵ This behavior has been noted theoretically by several authors [13]-[16].

Fig. 9. Array interference rejection. (a) Spectrum on one element. (b) Array output after adaptation (S/I improvement = 36 dB). Code rate = 2.5 Mbit/s; data rate = 0.5 Mbit/s; interference freq. = 71.2 MHz; signal freq. = 70.0 MHz; S/N = 0 dB, S/I = -20 dB; vertical scale = 10 dB/cm.

Fig. 10. Array rejection of swept CW interference. (a) Spectrum on one element. (b) Array-output spectrum.

Fig. 11. Interference rejection: 31-bit PN-code sequence. (a) Spectrum on one element. (b) Array-output spectrum.

Fig. 12. Interference rejection: 7-bit PN-code sequence. (a) Spectrum on one element. (b) Array-output spectrum.

Fig. 13. Input I/S versus output interference. Plot labels: input S/N = 0 dB; code rate = 1.25 MHz; data rate = 0.25 MHz; interference angles of 15°, 30°, and 75° elec.; abscissa: input I/S (interference/signal ratio) in dB.

Fig. 14. Output signal level (Ps) versus input I/S (for various S/N at input). Plot labels: desired signal at 0° elec.; interfering signal at 30° elec.; S/N = 3 dB, 0 dB, and -3 dB; code rate = 1.25 MHz; data rate = 0.25 MHz; reference-loop filter BW = 1 MHz.
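The electrical phase shifts quoted for Figs. 8 and 13 map to spatial angles through the usual uniform-linear-array relation, phase = 2π(d/λ) sin θ, with θ measured from broadside. The short sketch below reproduces that conversion for half-wavelength spacing; the function name `spatial_angle_deg` and its parameter names are illustrative assumptions, not taken from the paper.

```python
import math


def spatial_angle_deg(phase_shift_deg, spacing_wavelengths=0.5):
    """Arrival angle off broadside for a given element-to-element phase shift.

    Uses phase = 2*pi*(d/lambda)*sin(theta), so
    sin(theta) = phase / (2*pi*d/lambda).
    """
    sin_theta = math.radians(phase_shift_deg) / (2 * math.pi * spacing_wavelengths)
    return math.degrees(math.asin(sin_theta))


for phase in (15, 30, 60, 75):
    angle = spatial_angle_deg(phase)
    print(f"{phase:>2} deg electrical -> {angle:.1f} deg off broadside")
```

With these assumptions the 15°, 30°, and 60° phase shifts give the 4.8°, 9.6°, and 19.5° quoted in the text, and 75° gives about 24.6°, which the text rounds to 25°.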

Not only the output interference power, but also the output desired-signal power and desired-signal-to-thermal-noise ratio depend on the input interference power. The desired signal output power is affected because the interference power influences the pattern. The feedback-loop gain in the array depends on the signal powers, so a change in interference power affects the loop gain and hence the pattern. When the interference is much stronger than the desired signal, the array nulls the interference regardless of what happens to the desired signal. If the interference happens to be too close to the desired signal, the interference null is close to the desired signal, and there is a reduction in the desired signal power at the array output because of this null.

Fig. 14 shows measured curves of output desired signal power as a function of interference power into the array for an interference-phase angle of 30°. This is a case where the interference is close enough to the desired signal that the interference null affects the desired signal. (A case with wider angular separation is shown below.) We see in this figure that raising the interference power lowers the desired signal power at the array output. Three curves are shown for three different input desired signal-to-thermal-noise ratios. The different curves were measured by holding the desired signal power constant and varying the noise power. The top curve shows the output signal power with a +3 dB desired signal-to-thermal-noise ratio on each element of the array. The middle curve shows the result when the noise power has been increased 3 dB, so the element S/N is 0 dB. The bottom curve shows the case where the noise power has been increased another 3 dB, so the element S/N is -3 dB. We see that the higher the noise into the array, the more dependent the output desired signal power is on the interference power.

It is clear from these curves that the output desired-signal power depends on the thermal noise as well as the interference power. For example, consider the ends of the curves where the interference power is very low. With a +3-dB S/N on the elements, the output desired signal power is arbitrarily defined to be 0 dB. Then a 3-dB increase in the element noise results in a 0.8-dB drop in the output desired signal power, even though the input desired signal power has not been changed. A further 3-dB increase in the noise causes an additional 0.8-dB drop.

The reason for this behavior is as follows. The error signal is made up of three components: 1) the desired signal minus the reference signal; 2) the thermal noise; and 3) the interference. To minimize the overall mean-square error, the array feedback makes a compromise between these three components. In general, the weights that yield minimum error signal do not match the array-output desired signal to the reference signal exactly. Rather, they compromise between thermal noise, interference, and desired-signal contributions to the error signal. If the thermal noise in the array is increased, the noise component in the error signal becomes larger. In response to this, the array feedback lowers the weights to reduce this noise. In the process, the output desired signal is lowered as well, so the mismatch between the array output desired signal and the reference signal increases. The final weight setting compromises between decreasing the noise and increasing the desired-signal mismatch. The overall result is a reduced desired-signal output from the array.

Furthermore, the interference power has a stronger effect on output desired signal power when the thermal noise is large. That is, as the thermal noise is increased, the mismatch between the desired signal and the reference signal becomes a less important contribution to the error signal. This result is seen in Fig. 14; the drop in desired signal power as the interference is increased is greater the larger the thermal noise (the lower the element S/N).

Fig. 15 shows the dependence of output desired signal-to-thermal-noise ratio on input-interference power for the same experiment as in Fig. 14. Again, three curves are shown, corresponding to element SNR's of +3, 0, and -3 dB. Several interesting effects may be seen in these curves. First consider the case when the interference power is low. Ideally, a four-element array should provide 6 dB gain. Thus, for example, the top curve, where the element desired-signal-to-thermal-noise ratio is +3 dB, should show an output S/N of +9 dB at low interference power. The measurement shows 7.8 dB. For a 0-dB element SNR, the output should be +6 dB, and 5.5 dB was measured. For a -3-dB element S/N, the output should be +3 dB, and the result was 2.8 dB. We see that as the noise is decreased (or the element S/N is increased), the array performance departs more from the ideal. The reason for this is the presence of multiplier offset voltages [17], which alter the weights from their ideal values. The less noise, the more effect multiplier offset voltages have.

The curves in Fig. 15 are for the case where the interference element-to-element phase shift is 30°. (The desired signal is in phase on all four elements.) Thus, Fig. 15 represents a case where the interference is too close to the desired signal for full performance from the array. The fact that the signals are too close is seen in the strong dependence of the output desired-signal-to-thermal-noise ratio on interference power.

Fig. 15. Output S/N versus input I/S (for various S/N at input). Plot labels: code rate = 1.25 MHz; data rate = 0.25 MHz; reference-loop filter = 1 MHz; desired signal at 0° elec.; interfering signal at 30° elec.; input S/N = 3 dB, 0 dB, and -3 dB.

Fig. 16. Output S/N versus input I/S (for various S/N at input). Plot labels: code rate = 2.5 MHz; data rate = 0.25 MHz; reference-loop filter = 1 MHz; desired signal at 0° elec.; interfering signal at 30° elec.; input S/N = 3 dB, 0 dB, and -3 dB.

Fig. 17. Output S/N versus input I/S. Plot labels: code rate = 2.5 MHz; data rate = 0.25 MHz; reference-loop filter = 1 MHz; desired signal at 0° elec.; interfering signal at 75° elec.
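The low-interference figures above can be compared with an idealized model. The sketch below assumes a single desired signal, independent equal-power thermal noise on each of N elements, a reference perfectly correlated with the desired signal, and steady-state minimum-mean-square-error weights with no multiplier offsets. Under those assumptions (which are ours, not the paper's) the output S/N is N times the element S/N, and the output desired-signal amplitude relative to the reference is Nξ/(1 + Nξ), where ξ is the element S/N as a power ratio. The function name `ideal_output` is hypothetical.

```python
import math

N_ELEMENTS = 4  # four-element array, as in the experiments


def ideal_output(element_snr_db, n=N_ELEMENTS):
    """Idealized interference-free MMSE array (an assumed model).

    Returns (output S/N in dB,
             output desired-signal power in dB relative to the reference power).
    """
    snr = 10 ** (element_snr_db / 10)
    output_snr_db = 10 * math.log10(n * snr)
    level_db = 20 * math.log10(n * snr / (1 + n * snr))
    return output_snr_db, level_db


# Measured output S/N values quoted in the text for low interference power.
measured_output_snr_db = {3: 7.8, 0: 5.5, -3: 2.8}

for element_snr_db, measured in measured_output_snr_db.items():
    ideal_snr_db, level_db = ideal_output(element_snr_db)
    print(f"element S/N {element_snr_db:+d} dB: ideal output S/N {ideal_snr_db:.1f} dB, "
          f"measured {measured} dB, shortfall {ideal_snr_db - measured:.1f} dB, "
          f"ideal desired-signal level {level_db:.2f} dB")
```

The model shows the same two trends as the measurements: the ideal gain is 10 log10(4), about 6 dB, and the desired-signal output falls as the element noise rises even though the input signal is unchanged. The size of the idealized drop need not match the measured 0.8-dB steps, and the model says nothing about the offset-voltage shortfall.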

Fig. 16 shows a similar set of curves, except that the spreading ratio has been increased to 10:1. A comparison of Figs. 15 and 16 shows that there is little difference in the performance as a result of changing the spreading ratio. Next, Fig. 17 shows the same measurements as in Fig. 16 except that the interference phase shift between elements has been increased to 75°. (In Fig. 16 it was 30°.) With a phase shift of 75°, it may be seen from Fig. 17 that the output desired-signal-to-thermal-noise ratio is less sensitive to the input interference power. Thus, at a 30° phase shift, the interference is too close to the desired signal; at 75°, the signals are far enough apart that the S/N is not degraded by the interference.

CONCLUSIONS

This paper has described the integration of an LMS adaptive array into a PN-coded spread-spectrum communication system. A method of reference-signal generation and a method of code-timing acquisition were described that allow the array to distinguish the desired signal from interference. The hybrid system yields full interference protection during the code-lockup phase as well as after timing has been acquired, i.e., code lockup time is not affected by the presence of interference. Typical experimental results have been presented illustrating the interference suppression characteristics of the hybrid system. Most of the results presented are for a spreading ratio of 5:1, a very modest value. With a 5:1 ratio, interference suppressions of 35 dB are typical. This performance represents much greater protection than can be achieved by spectrum spreading alone with such a spreading ratio.

The results presented illustrate the advantages of combining adaptive array techniques with waveform design. However, it is clear that for such systems the adaptive-array parameters and the signaling waveforms must be selected together to be compatible, because of the close interaction between the antenna and communication systems. In general, it is difficult to add an adaptive array to an existing spread-spectrum system where the communication-system waveforms have been chosen independently of the array.

ACKNOWLEDGMENT

The author is grateful to Professor R. J. Huff and Professor A. A. Ksienski for many helpful discussions during this program. In addition, the author appreciates the numerous useful suggestions of D. Townsend and D. Himes of NRL and R. Bauman of NASC.

REFERENCES

[1] R. T. Compton, Jr., "An experimental four-element adaptive array," IEEE Trans. Antennas Propagat., vol. AP-24, p. 697, Sept. 1976.
[2] B. Widrow, P. E. Mantey, L. J. Griffiths, and B. B. Goode, "Adaptive antenna systems," Proc. IEEE, vol. 55, p. 2143, Dec. 1967.
[3] R. S. Kennedy, Fading Dispersive Communication Channels. New York: Wiley-Interscience, 1969.
[4] S. P. Applebaum, "Adaptive arrays," IEEE Trans. Antennas Propagat., vol. AP-24, pp. 585-598, Sept. 1976.
[5] J. J. Stiffler, Theory of Synchronous Communications. Englewood Cliffs, NJ: Prentice-Hall, 1971, p. 178.
[6] K. L. Reinhard, "Adaptive array techniques for TDMA network protection," Section II of R. J. Huff, "Coherent multiplexing and array techniques," ElectroScience Lab., Dep. Electrical Eng., Ohio State Univ., Rep. 2738-3, Feb. 1971; prepared under Contract F30602-69-C-0112 for Rome Air Development Center, Griffiss AFB, NY.
[7] K. L. Reinhard, "An adaptive array for interference rejection in a coded communication system," ElectroScience Lab., Dep. Electrical Eng., Ohio State Univ., Rep. 2738-6, Apr. 1972; prepared under Contract F30602-69-0112 for Rome Air Development Center, Griffiss AFB, NY.
[8] R. J. Huff and K. L. Reinhard, "Coherent multiplexing and array techniques," ElectroScience Lab., Dep. Electrical Eng., Ohio State Univ., Rep. 2738-9, June 1972; prepared under Contract F30602-69-0112 for Rome Air Development Center, Griffiss AFB, NY.
[9] J. J. Spilker, Jr. and D. T. Magill, "The delay-lock discriminator - an optimum tracking device," Proc. IRE, vol. 49, p. 1403, Sept. 1961.
[10] J. J. Spilker, "Delay-lock tracking of binary signals," IRE Trans. Space Electron. Telem., vol. SET-9, p. 1, Mar. 1963.
[11] W. J. Gill, "A comparison of binary delay-lock tracking loop implementations," IEEE Trans. Aerosp. Electron. Syst., p. 415, July 1966.
[12] R. L. Riegler and R. T. Compton, Jr., "An adaptive array for interference rejection," Proc. IEEE, vol. 61, p. 748, June 1973.
[13] C. A. Baird, Jr., G. P. Martin, G. G. Rassweiler, and C. L. Zahm, "Adaptive processing for antenna arrays," Final Rep., Radiation Systems Division, Harris Intertype Corp., Melbourne, FL, June 1972.
[14] K. L. Reinhard, "Adaptive antenna arrays for coded communication systems," ElectroScience Laboratory, Dep. Electrical Eng., Ohio State Univ., Rep. 3364-2, Oct. 1973; prepared for Rome Air Development Center under Contract F30602-72-C-0162.
[15] B. S. Abrams, S. J. Harris, and A. E. Zeger, "Interference cancellation," RADC-TR-74-225, Final Rep., General Atronics Corp., Sept. 1974.
[16] A. E. Zeger, B. S. Abrams, and C. Luvera, "Interference cancellation system for sensors," Proc. NRL Adaptive Antenna Systems Workshop, Mar. 1974.
[17] R. T. Compton, Jr., "Multiplier offset voltages in adaptive arrays," IEEE Trans. Aerosp. Electron. Syst., vol. AES-12, p. 616, Sept. 1976.


On the Performance of a Polarization Sensitive Adaptive Array

R. T. COMPTON, JR.

Abstract-The ability of a least mean square (LMS) adaptive array to adapt to the electromagnetic polarization of incoming signals is considered. An array of two pairs of crossed dipoles is studied. A desired signal and an interference signal are assumed to arrive from arbitrary directions with arbitrary elliptical polarizations. The output signal-to-interference-plus-noise ratio (SINR) from the array is computed as a function of the signal angles of arrival and polarizations. It is shown that as long as certain special desired signal polarizations are avoided, the array is difficult to jam with a single interference signal. To produce a poor SINR, an interference signal must both arrive from the same direction and have the same polarization as the desired signal.

INTRODUCTION

ADAPTIVE arrays [1]-[3] are currently of great interest because of their ability to null interference and track desired signals automatically. Numerous papers have discussed the performance of adaptive arrays [4]. In spite of the extensive literature, however, for radio applications of these arrays (as contrasted with sonar applications), one aspect of this subject appears to have received little attention. We refer to the fact that an adaptive array can adapt to the electromagnetic polarization of signals, as well as their arrival angles. If an adaptive array uses elements responding to more than one polarization, the array feedback loops will automatically combine the signals from these elements to optimize reception, or provide a null, for particular signal polarizations. Such an array can automatically track a desired signal with one polarization while nulling interference with a different polarization.

Most analytical studies of adaptive arrays have assumed isotropic elements. This assumption, although useful for certain purposes, tacitly eliminates any consideration of the effects of signal polarization on array performance. In essence one assumes all signals arrive at the array with the same polarization. If an array receives and uses more than one polarization, its performance can be far superior to one that does not. For example, an array of isotropic elements always yields poor performance if interference arrives too close to the desired signal. When an array adapts to polarization, however, this difficulty occurs only if both signals have the same polarization as well as angle of arrival. When two signals arrive from the same direction, it is perfectly possible to null one signal and not the other, if their polarizations are different.

The purpose of this paper is to examine the performance of a polarization-sensitive adaptive array. As a model, we will consider an array of two pairs of crossed dipoles. We will compute the output signal-to-interference-plus-noise ratio (SINR) from this array when a desired signal and an interference signal arrive with arbitrary polarizations and angles of arrival.¹ We will show that in most cases interference has little effect on the array output SINR unless it arrives from the same direction and has the same polarization as the desired signal. However, there are two exceptions. If the desired signal polarization is linear, oriented either parallel or perpendicular to the vertical dipoles, the array is susceptible to interference from other angles as well. These desired signal polarizations are ones that should be avoided in a system design. Finally, we will find that when both signals arrive from broadside, the array output SINR is simply related to the separation between the signal polarizations on the Poincaré sphere.

Section II of the paper formulates the necessary equations. Section III contains the calculated results and Section IV the conclusions.

Manuscript received February 15, 1980; revised October 16, 1980. This work was supported in part by the Naval Air Systems Command under Contract N00019-79-C-0291, and in part by the Joint Services Electronics Program under Contract N00014-78-C-0049, both with the Ohio State University Research Foundation. The author is with the ElectroScience Laboratory, Department of Electrical Engineering, The Ohio State University, Columbus, OH 43212.

II. FORMULATION OF THE PROBLEM

Consider a four-element adaptive array consisting of two pairs of crossed dipoles, as shown in Fig. 1. The signal from each dipole is to be processed separately in the array. The upper and lower dipole pairs have their centers at z = +L/2 and z = -L/2, respectively. Let x_1(t) and x_3(t) be the complex signals received from the upper and lower vertical dipoles, and x_2(t) and x_4(t) the signals received from the upper and lower horizontal dipoles, respectively. Each signal x_j(t) is multiplied by a complex weight w_j and summed to produce the array output. We assume the weights w_j are controlled by an LMS processor [2], [5], so the steady-state weight vector, W = (w_1, w_2, ..., w_4)^T, is given by

$$W = \Phi^{-1} S \qquad (1)$$

where Φ is the covariance matrix

$$\Phi = E\{X^{*} X^{T}\} \qquad (2)$$

and S is the reference correlation vector

$$S = E\{X^{*} r(t)\}. \qquad (3)$$

In these equations X is the signal vector

$$X = \left(x_1(t),\, x_2(t),\, x_3(t),\, x_4(t)\right)^{T}, \qquad (4)$$

r(t) is the complex reference signal² used in the adaptive array feedback [2], [5], T denotes transpose, the asterisk denotes complex conjugate, and E{·} denotes expectation.

Assume two continuous wave (CW) signals are incident on the array, one desired and the other interference. Let θ and φ denote standard polar angles, as shown in Fig. 1. We assume the desired signal arrives from angular direction (θ_d, φ_d) and the interference from (θ_i, φ_i). Furthermore each signal is assumed to have an arbitrary electromagnetic polarization. To characterize the polarization of each signal we make the following definitions. Given a transverse electromagnetic (TEM) wave propagating into the array, we consider the polarization ellipse produced by the transverse electric field as we view the incoming wave from the coordinate origin. Note that unit vectors φ̂, θ̂, -r̂, in that order, form a right-handed coordinate system for an incoming wave. Suppose the electric field has transverse components

$$\mathbf{E} = E_\phi \hat{\phi} + E_\theta \hat{\theta}. \qquad (5)$$

(We will call E_φ the horizontal component and E_θ the vertical component of the field.) In general, as time progresses E_φ and E_θ will describe a polarization ellipse as shown in Fig. 2. Given this ellipse we define β to be the orientation angle of the major axis of the ellipse with respect to E_φ, as shown in Fig. 2. To eliminate ambiguities we define β to be in the range 0 ≤ β < π. We also define the ellipticity angle α to have a magnitude given by

$$|\alpha| = \tan^{-1} r \qquad (6)$$

where r is the axial ratio

$$r = \frac{\text{minor axis}}{\text{major axis}}. \qquad (7)$$

In addition α is defined positive when the electric vector rotates clockwise and negative when it rotates counterclockwise (when the incoming wave is viewed from the coordinate origin, as in Fig. 2). α is always in the range -π/4 ≤ α ≤ π/4. Fig. 2 depicts a situation in which α is positive. For a given state of polarization, specified by α and β, the electric field components are given by (aside from a common phase factor)

$$E_\phi = A \cos\gamma \qquad (8a)$$

$$E_\theta = A \sin\gamma \, e^{j\eta} \qquad (8b)$$

where γ and η are related to α and β by

$$\cos 2\gamma = \cos 2\alpha \, \cos 2\beta \qquad (9a)$$

$$\tan\eta = \tan 2\alpha \, \csc 2\beta. \qquad (9b)$$

¹ By arbitrary polarizations, we refer to signals that are completely polarized (i.e., elliptically polarized). We do not consider partially polarized signals [12].
² r(t) is called the "desired response" in [2].
³ These relationships are derived in [6]. Our definitions and notation correspond exactly to those in [6] if we substitute E_φ → X, E_θ → Y, η → φ.

Reprinted from IEEE Transactions on Antennas and Propagation, Vol. AP-29, No. 5, pp. 718-725, September 1981.

Fig. 1. Crossed dipole array.

Fig. 2. Polarization ellipse.

Fig. 3. Poincaré sphere.

The re lation ship am ong the fo ur angu lar va ria bles Q, (3, "(, a nd 1/ is m ost easily visualized by mak ing use of the Poin car e sphere co nce pt [ 6] . This t echnique represents the sta te of polariza tion by a point o n a sphere , su ch as point M in Fig. 3. For a given M, 2"(, 2(3, and 2Q form the sides o f a right sp he rical triangl e , as sho w n . 2"( is the side o f th e tr iangle between M and a poin t la be lled II in t he figu re ; H is the point represent ing h o rizo n ta l linear polar izat ion . Sid e 2(3 ex tends alo ng t he eq ua to r and side 2Q is vertical , i.e ., perpend icular to side 2(3. The an gle 1/ in (8) and (9) is the angle between sides 2"( and 2{3.3 The sp ecial case whe n Q = 0 in (6) a nd Fi g. :; co rr es po nds to linear polarizat ion ; in th is case th e po int /11 lies o n the eq ua tor. If in ad di tio n, (3 = O. only l:;1j) is nonzer o a nd the w ave is horizontall y polarized . This case de fines the point H in Fig . 3. If instead (3 = 71/ 2 , o nly c'o is non zero and the wave is vertically polarized . Point ,\If th en lies o n the equat or di ametrically behind H. The po les of th e sp he re co rr es po nd to circ u lar pol ari zat ion (Q = ±4 So) , with clo ck w ise circ ula r pol a riz ation (Q = +45°) a t th e uppe r po le . Thus a n arb itra ry pla ne w ave co mi ng in to the a rray m ay be cha racte rized by fo u r angu lar param et e rs an d a n am plitu d e . For exa m pl e t he de sir ed signal w ill be charac te rized by its ar rival a ngles «(Jd, tPd), its polarizat ion ell ipt icit y an gle Qd an d orientat ion angle (3d , and it s amplitude A d ( i.e ., A d is the value of A in (8 ) for the d esired signa l) . We will say the de sired signal is defined b y «(J d , epd, (Xd , (3d , Ad )' Similarly th e in terfe re nce is defined by «(J i, epi, (Xi , (3i, A i) ' We assu me eac h d ip ole in th e array is a short dipol e, i.e. 
, th e o u t p u t vo lt age f ro m eac h d ipole is pr opo rt ional to the electric field co mpo ne n t alo ng th e d ipole . Ther e fore the vert ica l and hor izontal d ipo le o ut pu ts w ill be p ro portio nal to the z- and x-com po ne n ts, res pec t ively, o f t he electric fie ld . An inc o m in g signal, wi th arb itra ry ele ct ric fiel d co m po nents EIj) and Eo, has

(5 )

Q

Fig. 3.

and (3 by [ 6]

325

x, y, z components:

$$E = E_\phi \hat{\phi} + E_\theta \hat{\theta} = (E_\theta \cos\theta \cos\phi - E_\phi \sin\phi)\hat{x} + (E_\theta \cos\theta \sin\phi + E_\phi \cos\phi)\hat{y} - (E_\theta \sin\theta)\hat{z}. \qquad (10)$$

When $E_\phi$ and $E_\theta$ are expressed in terms of $A$, $\gamma$, and $\eta$ as in (8), the electric field components become

$$E = A[(\sin\gamma \cos\theta \cos\phi \, e^{j\eta} - \cos\gamma \sin\phi)\hat{x} + (\sin\gamma \cos\theta \sin\phi \, e^{j\eta} + \cos\gamma \cos\phi)\hat{y} - (\sin\gamma \sin\theta \, e^{j\eta})\hat{z}]. \qquad (11)$$

Adding to this expression the time and space phase factors, we find that an incoming signal characterized by $(\theta, \phi, \alpha, \beta, A)$ produces a signal vector in the array (see (4)) as follows:

$$X = A e^{j(\omega t + \psi)} U, \qquad (12)$$

where $U$ is the vector

$$U = \begin{bmatrix} (-\sin\gamma \sin\theta \, e^{j\eta}) e^{jp} \\ (\sin\gamma \cos\theta \cos\phi \, e^{j\eta} - \cos\gamma \sin\phi) e^{jp} \\ (-\sin\gamma \sin\theta \, e^{j\eta}) e^{-jp} \\ (\sin\gamma \cos\theta \cos\phi \, e^{j\eta} - \cos\gamma \sin\phi) e^{-jp} \end{bmatrix}, \qquad (13)$$

$\omega$ is the frequency of the signal, $\psi$ is the carrier phase of the signal at the coordinate origin at $t = 0$, and $p$ is the phase shift of the signals at the dipoles due to spatial delay

$$p = \frac{\pi L}{\lambda} \cos\theta. \qquad (14)$$

As stated above, we assume a desired signal specified by $(\theta_d, \phi_d, \alpha_d, \beta_d, A_d)$ and an interference signal specified by $(\theta_i, \phi_i, \alpha_i, \beta_i, A_i)$ are incident on the array. In addition we assume a thermal noise voltage $n_j(t)$ is present on each signal $x_j(t)$. The $n_j(t)$ are assumed to be zero mean, to be statistically independent of each other, and to have power $\sigma^2$:

$$E\{n_i^*(t) n_j(t)\} = \sigma^2 \delta_{ij} \qquad (15)$$

where $\delta_{ij}$ is the Kronecker delta. Under these assumptions the total signal vector is given by

$$X = X_d + X_i + X_n = A_d e^{j(\omega t + \psi_d)} U_d + A_i e^{j(\omega t + \psi_i)} U_i + X_n \qquad (16)$$

where $U_d$ and $U_i$ are given by (13) with appropriate subscripts $d$ or $i$ added to each angular quantity. $\psi_d$ and $\psi_i$ are assumed to be random phase angles, each uniformly distributed on $(0, 2\pi)$ and statistically independent of the other. $X_n$ is the noise vector

$$X_n = [n_1(t), n_2(t), n_3(t), n_4(t)]^T. \qquad (17)$$

The covariance matrix in (2) is then given by

$$\Phi = \Phi_d + \Phi_i + \Phi_n \qquad (18a)$$

where

$$\Phi_d = E\{X_d^* X_d^T\} = A_d^2 U_d^* U_d^T \qquad (18b)$$

$$\Phi_i = E\{X_i^* X_i^T\} = A_i^2 U_i^* U_i^T \qquad (18c)$$

and

$$\Phi_n = \sigma^2 I \qquad (18d)$$

with $I$ the identity matrix.

To make the LMS array track the desired signal, the reference signal $r(t)$ must be a signal correlated with the desired signal and uncorrelated with the interference [2], [5]. Several techniques have been described for obtaining such a reference signal [10], [11]. Here we assume

$$r(t) = R e^{j(\omega t + \psi_d)}. \qquad (19)$$

For the reference correlation vector, (3) then yields

$$S = A_d R \, U_d^*. \qquad (20)$$

The steady-state weight vector can now be found by substituting (18) and (20) into (1). The SINR at the array output is then given by

$$\text{SINR} = \frac{P_d}{P_i + P_n} \qquad (21)$$

where $P_d$ is the output desired signal power

$$P_d = \frac{A_d^2}{2} |U_d^T W|^2, \qquad (22)$$

$P_i$ is the output interference power

$$P_i = \frac{A_i^2}{2} |U_i^T W|^2, \qquad (23)$$

and $P_n$ is the output thermal noise power

$$P_n = \frac{\sigma^2}{2} |W|^2. \qquad (24)$$

By making use of a matrix inversion lemma [9], [13], the expression for SINR in (21) can be put in the simple form

$$\text{SINR} = \xi_d \left[ U_d^T U_d^* - \frac{|U_d^T U_i^*|^2}{\xi_i^{-1} + U_i^T U_i^*} \right] \qquad (25)$$

where

$$\xi_d = \frac{A_d^2}{\sigma^2} = \text{desired signal-to-noise ratio (SNR)} \qquad (26a)$$

$$\xi_i = \frac{A_i^2}{\sigma^2} = \text{interference-to-noise ratio (INR)}. \qquad (26b)$$

⁴ $\xi_d$ and $\xi_i$ are the signal-to-noise ratios that will exist in a given array element if the incoming signal arrives broadside to that element and is linearly polarized in the direction of that element. For example, if $\alpha$ ...

Fig. 4. SINR versus $\theta_i$. $\theta_d = 90°$, $\phi_d = 90°$, $\alpha_d = 15°$, $\beta_d = 30°$, SNR = 0 dB, $\phi_i = 90°$, INR = 40 dB. (a) $\beta_i = 0°$. (b) $\beta_i = 30°$. (c) $\beta_i = 60°$. (d) $\beta_i = 90°$. (e) $\beta_i = 120°$. (f) $\beta_i = 150°$. [Six panels; vertical axis: SINR (dB), 10 to -40; horizontal axis: $\theta_i$ (degrees), 0 to 180; one curve per $\alpha_i$ from $-45°$ to $45°$.]
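The chain from polarization angles to output SINR, i.e., (9), (13), (14), and (25), can be evaluated numerically. The following is a minimal sketch under stated assumptions, not the author's program: the function names are hypothetical, and the `atan2` form used to pick the quadrant of $\eta$ in (9b) is an assumption made here so that $\beta$ near $0$ or $\pi/2$ does not divide by zero.

```python
import cmath
import math


def gamma_eta(alpha, beta):
    """Map ellipticity alpha and orientation beta (radians) to (gamma, eta)."""
    c = math.cos(2 * alpha) * math.cos(2 * beta)
    gamma = 0.5 * math.acos(max(-1.0, min(1.0, c)))          # (9a)
    # (9b): tan(eta) = tan(2*alpha) / sin(2*beta); the atan2 form is an assumption.
    eta = math.atan2(math.sin(2 * alpha), math.cos(2 * alpha) * math.sin(2 * beta))
    return gamma, eta


def signal_vector(theta, phi, alpha, beta, spacing_wavelengths=0.5):
    """Return the four-element vector U of (13); all angles in radians."""
    gamma, eta = gamma_eta(alpha, beta)
    p = math.pi * spacing_wavelengths * math.cos(theta)      # (14), L given in wavelengths
    phase = cmath.exp(1j * eta)
    vertical = -math.sin(gamma) * math.sin(theta) * phase    # z-component
    horizontal = (math.sin(gamma) * math.cos(theta) * math.cos(phi) * phase
                  - math.cos(gamma) * math.sin(phi))         # x-component
    upper, lower = cmath.exp(1j * p), cmath.exp(-1j * p)
    return [vertical * upper, horizontal * upper, vertical * lower, horizontal * lower]


def output_sinr(u_d, u_i, snr, inr):
    """Evaluate (25); snr and inr are linear power ratios (xi_d and xi_i)."""
    dd = sum(abs(x) ** 2 for x in u_d)                       # U_d^T U_d*
    ii = sum(abs(x) ** 2 for x in u_i)                       # U_i^T U_i*
    di = sum(a * b.conjugate() for a, b in zip(u_d, u_i))    # U_d^T U_i*
    return snr * (dd - abs(di) ** 2 / (1.0 / inr + ii))


def to_db(ratio):
    return 10.0 * math.log10(ratio)


# Both signals at broadside, same orientation (30 deg), ellipticity 15 deg vs 0 deg.
deg = math.radians
u_d = signal_vector(deg(90), deg(90), deg(15), deg(30))
u_i = signal_vector(deg(90), deg(90), deg(0), deg(30))
sinr_db = to_db(output_sinr(u_d, u_i, snr=1.0, inr=1e4))
print(f"output SINR = {sinr_db:.2f} dB")
```

With SNR = 0 dB and INR = 40 dB this case should come out near -8.7 dB, in line with the "higher than -9 dB" figure quoted for a 15° ellipticity difference.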

The derivation of (25) from (21) may be found in [13, appendix]. Calculation of the SINR from (25) is much easier than from (22)-(24) because (25) does not require calculation of the weight vector. In the next section, we show typical curves of the array performance based on (25).

III. RESULTS

Because of the large number of parameters required to specify both the desired and interference signals, many types of curves can be plotted. Unfortunately space does not permit an exhaustive set of curves here. However, we will show a number of typical curves, including those illustrating the worst performance.

Figs. 4 and 5 show curves of output SINR when the desired signal arrives from broadside ($\theta_d = \phi_d = 90°$). The desired signal has been chosen to have a particular elliptical polarization: $\alpha_d = 15°$ and $\beta_d = 30°$. The SNR is 0 dB and the INR is 40 dB. The element pairs are assumed spaced a half wavelength apart ($L = \lambda/2$). Fig. 4 shows the output SINR as a function of the interference polar angle $\theta_i$, with $\phi_i = 90°$ and for various interference polarizations. Specifically, Fig. 4(a) shows the SINR for $\beta_i = 0°$, Fig. 4(b) for $\beta_i = 30°$, and so forth, up to Fig. 4(f) for $\beta_i = 150°$. Each figure shows the results for $\alpha_i = -45°$, $-30°$, $-15°$, $0°$, $15°$, $30°$, and $45°$. Fig. 5 shows similar results as a function of the interference azimuthal angle $\phi_i$, with $\theta_i = 90°$.

Fig. 5. SINR versus $\phi_i$. $\theta_d = 90°$, $\phi_d = 90°$, $\alpha_d = 15°$, $\beta_d = 30°$, SNR = 0 dB, $\theta_i = 90°$, INR = 40 dB. (a) $\beta_i = 0°$. (b) $\beta_i = 30°$. (c) $\beta_i = 60°$. (d) $\beta_i = 90°$. (e) $\beta_i = 120°$. (f) $\beta_i = 150°$. [Six panels; vertical axis: SINR (dB), 10 to -40; horizontal axis: $\phi_i$ (degrees), 0 to 180; one curve per $\alpha_i$ from $-45°$ to $45°$.]

Examination of these curves shows that the worst output SINR is obtained when the interference arrives from the same direction as the desired signal (broadside) and has the same polarization as the desired signal. This result is not surprising, of course, because in this case when the array nulls the interference it also nulls the desired signal. However, the interesting thing about this case is how little difference in polarization between the signals is required to allow the array to provide substantial protection. For example, it may be seen in Figs. 4(b) or 5(b) (for $\beta_i = \beta_d = 30°$) that when $\theta_i = 90°$ and $\phi_i = 90°$, if either $\alpha_i = 0°$ or $\alpha_i = 30°$ (i.e., if $\alpha_i$ differs from $\alpha_d = 15°$ by $\pm 15°$) the SINR out of the array is higher than -9 dB. Thus with this small difference in polarization, the array can provide over 31 dB of protection against the interference, relative to the -40 dB signal-to-interference ratio at its input.

For the special case where both signals arrive from broadside, the output SINR from the array can be simply related to the polarizations of the two signals. If

$$\theta_d = \phi_d = \theta_i = \phi_i = 90°, \qquad (27)$$

(13) yields

$$U_d^T U_d^* = U_i^T U_i^* = 2 \qquad (28)$$

and also

$$|U_d^T U_i^*|^2 = 2[1 + \cos 2\gamma_d \cos 2\gamma_i + \sin 2\gamma_d \sin 2\gamma_i \cos(\eta_d - \eta_i)]. \qquad (29)$$

Let $M_d$ and $M_i$ be points on the Poincaré sphere representing the polarizations of the desired and interference signals, respectively. Then in (29), $2\gamma_d$, $2\gamma_i$ and the arc $M_dM_i$ form the sides of a spherical triangle, as shown in Fig. 6. The angle $\eta_d - \eta_i$ is the angle opposite side $M_dM_i$. Using a well-known spherical trigonometric identity [8], we have

$$\cos M_dM_i = \cos 2\gamma_d \cos 2\gamma_i + \sin 2\gamma_d \sin 2\gamma_i \cos(\eta_d - \eta_i) \qquad (30)$$

so (29) can be written

$$|U_d^T U_i^*|^2 = 2(1 + \cos M_dM_i) = 4\cos^2\left(\frac{M_dM_i}{2}\right) \qquad (31)$$

and (25) becomes

$$\text{SINR} = \xi_d \left[ 2 - \frac{4\cos^2(M_dM_i/2)}{\xi_i^{-1} + 2} \right]. \qquad (32)$$

If $\xi_i^{-1} \ll 2$, this result may be approximated by

$$\text{SINR} \approx 2 \xi_d \sin^2\left(\frac{M_dM_i}{2}\right). \qquad (33)$$

These formulas show that the SINR obtained from the array when both signals are at broadside depends only on the separation between the two points $M_d$ and $M_i$ on the Poincaré sphere. A plot of the SINR versus the spherical distance $M_dM_i$ in angular measure, as obtained from (32), is shown in Fig. 7 for SNR = 0 dB and INR = 40 dB. It is seen, for example, that a separation of $M_dM_i = 26°$ on the Poincaré sphere results in SINR = -10 dB, an improvement of 30 dB over what it would be without the array.⁵ This result holds regardless of the specific polarizations of the signals, so long as they are separated by 26° on the Poincaré sphere.

⁵ To reconcile the curve in Fig. 7 with the results in Figs. 4(b) and 5(b), one must note that point $M$ in Fig. 3 lies above the equator by an angle $2\alpha$. Thus, for example, a separation of $M_dM_i = 26°$ corresponds to a difference of only $|\alpha_d - \alpha_i| = 13°$, if $\beta_i = \beta_d$.

Fig. 7. SINR versus Poincaré sphere separation. $\theta_d = \phi_d = \theta_i = \phi_i = 90°$. [Vertical axis: SINR (dB), 10 to -40; horizontal axis: $M_dM_i$ (degrees), 0 to 180.]
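The broadside result is simple enough to tabulate directly. The sketch below evaluates the exact form (32) and the high-INR approximation (33) as functions of the Poincaré sphere separation; the function names are hypothetical and chosen only for illustration.

```python
import math


def broadside_sinr(separation_deg, snr, inr):
    """Exact broadside SINR of (32); separation is the arc M_d M_i in degrees."""
    half = math.radians(separation_deg) / 2.0
    return snr * (2.0 - 4.0 * math.cos(half) ** 2 / (1.0 / inr + 2.0))


def broadside_sinr_approx(separation_deg, snr):
    """Approximation (33), valid when 1/inr is much smaller than 2."""
    half = math.radians(separation_deg) / 2.0
    return 2.0 * snr * math.sin(half) ** 2


def to_db(ratio):
    return 10.0 * math.log10(ratio)


SNR, INR = 1.0, 1e4  # 0 dB and 40 dB, the values used for Fig. 7
for separation in (13, 26, 90, 180):
    exact = to_db(broadside_sinr(separation, SNR, INR))
    approx = to_db(broadside_sinr_approx(separation, SNR))
    print(f"{separation:3d} deg: exact {exact:6.2f} dB, approx {approx:6.2f} dB")
```

At a separation of 26° the exact value should land close to the -10 dB quoted in the text, and at 180° (orthogonal polarizations) it approaches $2\xi_d$, i.e., the interference costs nothing.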

In general, when the desired signal arrives from some direction other than broadside, the curves of SINR versus interference arrival angle are similar to those in Figs. 4 and 5. The worst performance always occurs when the interference arrives from the same direction as the desired signal and has the same polarization. When both signals arrive from the same direction off broadside, however, it is found that the SINR cannot be related to the polarization difference so simply as in (32).⁶ In this case the SINR depends on the angle of arrival as well as the polarizations. The reason is that the electric field component in the y-direction is not received by the array, because the array contains only x- and z-oriented dipoles. When the arrival angle and polarization of a signal are such that there is a y-component of electric field, the SINR is affected by the loss of power in this component to the receiving system. The amount of electric field in the y-direction depends on the angle of arrival as well as the polarization.

The curves in Figs. 4 and 5 show typical performance from the array for an arbitrarily polarized desired signal. However, it must be noted that with this array certain desired signal polarizations allow the system to be jammed over a wide range of interference angles. Namely, if the desired signal excites only two of the four dipoles, then when the interference excites only the same two dipoles, the array has no ability to discriminate between signals in the azimuthal coordinate $\phi$. This situation leaves the array vulnerable to interference from a wide range of angles. Specifically, there are two cases where poor performance occurs: when the desired signal is either vertically or horizontally polarized. For example, suppose the desired signal is vertically polarized ($\alpha_d = 0°$, $\beta_d = 90°$) and arrives from an arbitrary direction $\theta_d$, $\phi_d$. Then a vertically polarized interference signal ($\alpha_i = 0°$, $\beta_i = 90°$) will produce a poor SINR as long as it arrives from the same polar angle, i.e., if $\theta_i = \theta_d$, regardless of $\phi_i$. It is clear why this is so from the arrangement of elements in Fig. 1. For vertically polarized signals the array has no ability to discriminate in the azimuthal coordinate $\phi$.

⁶ When the signals arrive from the same direction off broadside with different polarizations, one can express the output SINR in a form similar to (32) by means of an artifice, as follows. For each signal, one defines an "equivalent signal" whose amplitude, phase, and polarization are chosen to make the equivalent signal produce the same voltages in a pair of imaginary crossed dipoles oriented perpendicular to the arrival angle as the voltages produced by the actual signal in the actual dipoles. If two such equivalent signals are defined, one for the desired signal and one for the interference, the output SINR will be related to the difference in polarization of the two equivalent signals as in (32). However, when this procedure is carried out, it is found that the transformation equations between each signal and its equivalent are complicated enough that little additional insight is gained. For calculating SINR, it appears to be simpler just to use (25).
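The vulnerability to vertically polarized jamming can be checked numerically from (13) and (25). The sketch below is an illustration under stated assumptions: it uses hypothetical helper names, fixes $\theta_d = \theta_i = 90°$ and $L = \lambda/2$, and resolves the quadrant of $\eta$ in (9b) with `atan2`, which is a choice made here rather than something stated in the text.

```python
import cmath
import math


def signal_vector(theta, phi, alpha, beta, spacing_wavelengths=0.5):
    """U of (13), with gamma and eta obtained from (9); angles in radians."""
    c = math.cos(2 * alpha) * math.cos(2 * beta)
    gamma = 0.5 * math.acos(max(-1.0, min(1.0, c)))
    eta = math.atan2(math.sin(2 * alpha), math.cos(2 * alpha) * math.sin(2 * beta))
    p = math.pi * spacing_wavelengths * math.cos(theta)
    phase = cmath.exp(1j * eta)
    vertical = -math.sin(gamma) * math.sin(theta) * phase
    horizontal = (math.sin(gamma) * math.cos(theta) * math.cos(phi) * phase
                  - math.cos(gamma) * math.sin(phi))
    upper, lower = cmath.exp(1j * p), cmath.exp(-1j * p)
    return [vertical * upper, horizontal * upper, vertical * lower, horizontal * lower]


def sinr_db(u_d, u_i, snr=1.0, inr=1e4):
    """Output SINR of (25) in dB for linear snr and inr."""
    dd = sum(abs(x) ** 2 for x in u_d)
    ii = sum(abs(x) ** 2 for x in u_i)
    di = sum(a * b.conjugate() for a, b in zip(u_d, u_i))
    return 10.0 * math.log10(snr * (dd - abs(di) ** 2 / (1.0 / inr + ii)))


deg = math.radians
broadside = deg(90)

# Vertically polarized desired signal and jammer (alpha = 0, beta = 90 deg).
u_d_vertical = signal_vector(broadside, broadside, 0.0, deg(90))
vertical_case = {
    phi_i: sinr_db(u_d_vertical, signal_vector(broadside, deg(phi_i), 0.0, deg(90)))
    for phi_i in range(0, 181, 15)
}

# Contrast: elliptical desired signal (alpha = 15, beta = 30 deg), jammer with the
# same polarization but arriving from phi_i = 30 deg.
u_d_elliptical = signal_vector(broadside, broadside, deg(15), deg(30))
elliptical_case = sinr_db(u_d_elliptical,
                          signal_vector(broadside, deg(30), deg(15), deg(30)))

for phi_i, value in vertical_case.items():
    print(f"vertical, phi_i = {phi_i:3d} deg: SINR = {value:7.2f} dB")
print(f"elliptical, phi_i = 30 deg: SINR = {elliptical_case:7.2f} dB")
```

In the vertical case only the two z-oriented dipoles are excited and their outputs do not depend on $\phi$, so the SINR should stay near -40 dB at every azimuth, whereas the elliptically polarized pair recovers once the jammer moves away in azimuth.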

A particularly bad case occurs when $\theta_d = 90°$ (and

Fig. 8. SINR versus $\phi_i$. $\theta_d = 90°$, $\phi_d = 90°$, $\alpha_d = 0°$, $\beta_d = 90°$, SNR = 0 dB, $\theta_i = 90°$, $\alpha_i = 0°$, INR = 40 dB. [Vertical axis: SINR (dB), 10 to -40; horizontal axis: $\phi_i$ (degrees), 0 to 180.]

Fig. 9. SINR versus $\phi_i$. $\theta_d = 90°$, $\phi_d = 90°$, $\alpha_d = 0°$, $\beta_d = 0°$, SNR = 0 dB, $\theta_i = 90°$, $\beta_i = 0°$, INR = 40 dB. [Vertical axis: SINR (dB), 10 to -40; horizontal axis: $\phi_i$ (degrees), 0 to 180.]

Figs. 8 and 9 illustrate the vulnerability of the system to jamming when both the desired signal and the interference have either vertical linear or horizontal linear polarization. This problem occurs because in these cases the signals excite only two of the four elements and also because the dipoles in Fig. 1 are all located at $x = 0$. That is, there is no displacement of the dipoles along the x-axis and hence the array has poor ability to provide spatial discrimination in the

$\alpha_d = 15°$, $\beta_d = 30°$ (see Fig. 2). In general we find that the interference has only minimal effect on output SINR unless it arrives from the same direction and has the same polarization as the desired signal. Furthermore, when both desired signal and interference arrive from broadside, the output SINR depends only on the difference in polarization of the two signals, according to (32). Finally, we have found that certain choices for desired signal polarization lead to poor ability of the array to reject interference. Specifically, if the desired signal is linearly polarized with its electric field entirely in the $\theta$- or
REFERENCES

[1] S. P. Applebaum, "Adaptive arrays," IEEE Trans. Antennas Propagat., vol. AP-24, pp. 585-598, Sept. 1976.
[2] B. Widrow, P. E. Mantey, L. J. Griffiths, and B. B. Goode, "Adaptive antenna systems," Proc. IEEE, vol. 55, pp. 2143-2159, Dec. 1967.
[3] W. F. Gabriel, "Adaptive arrays-an introduction," Proc. IEEE, vol. 64, pp. 239-272, Feb. 1976.
[4] Special Issue on Adaptive Antennas, IEEE Trans. Antennas Propagat., vol. AP-24, Sept. 1976.
[5] R. L. Riegler and R. T. Compton, Jr., "An adaptive array for interference rejection," Proc. IEEE, vol. 61, pp. 748-758, June 1973.
[6] G. A. Deschamps, "Geometrical representation of the polarization of a plane electromagnetic wave," Proc. IRE, vol. 39, pp. 540-544, May 1951.
[7] D. G. Brennan, "Linear diversity combining techniques," Proc. IRE, vol. 47, pp. 1075-1102, June 1959.
[8] K. L. Nielsen and J. H. Vanlonkhuyzen, Plane and Spherical Trigonometry. New York: Barnes and Noble, pp. 110-119, 1954.
[9] A. S. Householder, The Theory of Matrices in Numerical Analysis. New York: Dover, p. 3, 1964.
[10] R. T. Compton, Jr., R. J. Huff, W. G. Swarner, and A. A. Ksienski, "Adaptive arrays for communication systems: An overview of research at the Ohio State University," IEEE Trans. Antennas Propagat., vol. AP-24, pp. 599-607, 1976.
[11] R. T. Compton, Jr., "An adaptive array in a spread spectrum communication system," Proc. IEEE, vol. 66, pp. 289-298, 1978.
[12] H. C. Ko, "On the reception of quasi-monochromatic, partially polarized radio waves," Proc. IRE, vol. 50, pp. 1950-1957, Sept. 1962.
[13] A. Ishide and R. T. Compton, Jr., "On grating nulls in adaptive arrays," IEEE Trans. Antennas Propagat., vol. AP-28, pp. 467-475, July 1980.

Effect of Mutual Coupling on the Performance of Adaptive Arrays

INDER J. GUPTA AND AHARON A. KSIENSKI, FELLOW, IEEE

Abstract-The effect of mutual coupling between array elements on the performance of adaptive arrays is examined. The study includes both steady state and transient performance. An expression for the steady state output signal-to-interference-plus-noise ratio (SINR) of adaptive arrays, taking into account the mutual coupling between the array elements, is derived. The expression is used to assess the steady state performance of adaptive arrays. The transient response is studied by computing the eigenvalues associated with the signal covariance matrix. The steering vector required to maximize the output SINR of Applebaum-type adaptive arrays in the presence of mutual coupling is also given.

I. INTRODUCTION

IT HAS BEEN shown [1], [2] that the performance of an adaptive antenna array is strongly affected by the electromagnetic characteristics of the antenna array. An important electromagnetic characteristic of an antenna array is the mutual coupling between its elements. In the above work, mutual coupling between the antenna elements was, however, ignored, i.e., the antenna elements were assumed to be isolated from each other. In practice, elements of an antenna array have mutual coupling, which in turn affects the gain, beamwidth, etc., of the array. Mutual coupling becomes particularly significant as the interelement spacing is decreased.

In this paper the effect of mutual coupling on the performance of adaptive arrays is studied. It is shown that the mutual coupling does affect the performance of adaptive arrays, and these effects are significant even for large interelement spacings, i.e., for spacing of more than half a wavelength. The effect is rather drastic as the interelement spacing drops below half a wavelength. In fact, for a fixed aperture with half-wavelength spaced elements, the introduction of additional elements can degrade the array performance. The failure to recognize the presence of mutual coupling will degrade the performance of Applebaum-type adaptive arrays more than that of least mean square (LMS) arrays because the steering vector has to be modified both in phase and amplitude to include the changes in the desired signal vector due to the presence of mutual coupling.

In Section II, an analytic expression for the steady state output signal-to-interference-plus-noise ratio (SINR) of an adaptive array is derived. The expression takes into account the mutual coupling between the array elements and involves the normalized impedance matrix of the array elements. The expression is used to study the effect of mutual coupling on the performance of adaptive arrays, and it is shown that the output SINR of the array depends upon the mutual coupling between its elements. For strong mutual coupling between the array elements, the output SINR drops significantly below its value in the absence of mutual coupling. In Section III, the effect of mutual coupling on the transient performance of adaptive arrays is studied. It is shown that the presence of mutual coupling between the array elements reduces the speed of response of an adaptive array. In Section IV, the steering vector required to maximize the output SINR of Applebaum-type adaptive arrays in the presence of mutual coupling is found. Section V contains conclusions.

II. STEADY STATE PERFORMANCE OF AN ADAPTIVE ARRAY IN THE PRESENCE OF MUTUAL COUPLING

The output SINR of an adaptive array is the most commonly accepted measure of its steady state performance, and will accordingly be derived first. The basic diagram of an adaptive array is shown in Fig. 1. The output signal from each element is multiplied by a complex weight, and then these signals are summed to produce the array output $S(t)$. The weights are automatically adjusted to optimize the output SINR in accordance with a selected algorithm.

To find an expression for the output SINR, one must know the element output voltages. We will, therefore, first develop an expression for the element output voltages when the mutual coupling is taken into account. These voltages will be used as the input signals to the adaptive processor. The required expression can be obtained by considering the N-element array as an $N + 1$ terminal linear, bilateral network responding to an outside source as shown in Fig. 2. Referring to Fig. 2, each port of the N-element array is shown terminated in a known load impedance $Z_L$. The array has as its driving source a generator with open circuit voltage $V_g$ and internal impedance $Z_g$. Using standard notation, one can write the Kirchhoff relation for the $N + 1$ terminal network as

$$\begin{aligned} v_1 &= i_1 Z_{11} + \cdots + i_j Z_{1j} + \cdots + i_N Z_{1N} + i_s Z_{1s} \\ &\;\;\vdots \\ v_j &= i_1 Z_{j1} + \cdots + i_j Z_{jj} + \cdots + i_N Z_{jN} + i_s Z_{js} \\ &\;\;\vdots \\ v_N &= i_1 Z_{N1} + \cdots + i_j Z_{Nj} + \cdots + i_N Z_{NN} + i_s Z_{Ns} \end{aligned} \qquad (1)$$

where $Z_{ij}$ represents the mutual impedance between the ports (array elements) $i$ and $j$. Further, making use of the relationship between terminal current and load impedance,

$$i_j = -\frac{v_j}{Z_L}, \qquad j = 1, 2, \ldots, N. \qquad (2)$$

If all the elements in the array are in an open circuit condition then

$$i_j = 0, \qquad j = 1, 2, \ldots, N,$$

and from (1)

$$v_j = v_{oj} = Z_{js} i_s. \qquad (3)$$

Manuscript received November 8, 1982; revised March 31, 1983. The authors are with the ElectroScience Laboratory, The Ohio State University, 1320 Kinnear Road, Columbus, OH 43212.

Reprinted from IEEE Transactions on Antennas and Propagation, Vol. AP-31, No. 5, pp. 785-791, September 1983.

Fig. 1. Basic adaptive array.

Fig. 2. Antenna array as an $N + 1$ terminal network.

Substituting (2) and (3) into (1) one gets

$$\begin{bmatrix} 1 + \frac{Z_{11}}{Z_L} & \frac{Z_{12}}{Z_L} & \cdots & \frac{Z_{1N}}{Z_L} \\ \frac{Z_{21}}{Z_L} & 1 + \frac{Z_{22}}{Z_L} & \cdots & \frac{Z_{2N}}{Z_L} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{Z_{N1}}{Z_L} & \frac{Z_{N2}}{Z_L} & \cdots & 1 + \frac{Z_{NN}}{Z_L} \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_N \end{bmatrix} = \begin{bmatrix} v_{o1} \\ v_{o2} \\ \vdots \\ v_{oN} \end{bmatrix}. \qquad (4)$$

Or, more compactly,

$$Z_0 V = V_0. \qquad (5)$$

In (5), $Z_0$ is the normalized impedance matrix and $V_0$ represents the open circuit voltages at the antenna terminals. Since $Z_0$ is nonsingular, one can find the element output voltages from the open circuit voltages. The element output voltages will be given by

$$V = Z_0^{-1} V_0. \qquad (6)$$

It should be noted that the matrix $Z_0$ is a normalized impedance matrix, normalized to the load impedance. It acts like a transformation matrix, transforming the open circuit element voltages to the terminal voltages. What is normally assumed in analyzing adaptive antenna systems is that the element spacing is large enough so that the mutual coupling between the elements is small and consequently the matrix $Z_0$ becomes diagonal. If one further assumes that the self-impedances ($Z_{ii}$, $i = 1, 2, \ldots, N$) are equal, the input signal vector will be just the open circuit voltage vector multiplied by a trivial scaling factor involving the self and load impedance terms. Thus the array performance will be the same as calculated using the open circuit voltages as the input signals to an adaptive processor.

Let $m + 1$ continuous wave (CW) signals (one desired and $m$ jammers) of the same frequency be incident on the array. Then the open circuit voltages at the antenna terminals are given by

$$V_0 = X_d + \sum_{k=1}^{m} X_{ik} \qquad (7)$$

where

$$X_d = A_d e^{j(\omega_0 t + \psi_d)} U_d \qquad (8)$$

$$X_{ik} = A_{ik} e^{j(\omega_0 t + \psi_{ik})} U_{ik}. \qquad (9)$$

In (8) and (9), $A_d^2$ is the average power in the desired signal, $A_{ik}^2$ is the average power in the kth jammer, $\omega_0$ is the carrier frequency, $\psi_d$ is the carrier phase of the desired signal at the coordinate origin, $\psi_{ik}$ is the carrier phase of the kth jammer at the coordinate origin and $U_d$, $U_{ik}$ are, respectively, the desired signal vector and the kth jammer vector defined as follows:

$$U_d = \begin{bmatrix} f_1(\theta_d, \phi_d, P_d) e^{j\varphi_{d1}} \\ f_2(\theta_d, \phi_d, P_d) e^{j\varphi_{d2}} \\ \vdots \\ f_N(\theta_d, \phi_d, P_d) e^{j\varphi_{dN}} \end{bmatrix} \qquad (10)$$

where $(\theta_d, \phi_d)$ defines the desired signal direction, $P_d$ is the polarization of the desired signal, $f_j(\theta, \phi, P)$ is the pattern response of the jth element to a signal incident from direction $(\theta, \phi)$ with polarization $P$ and $\varphi_{dj}$ is the desired signal phase at the jth element, measured with respect to the coordinate origin. Similarly,

$$U_{ik} = \begin{bmatrix} f_1(\theta_{ik}, \phi_{ik}, P_{ik}) e^{j\varphi_{ik1}} \\ f_2(\theta_{ik}, \phi_{ik}, P_{ik}) e^{j\varphi_{ik2}} \\ \vdots \\ f_N(\theta_{ik}, \phi_{ik}, P_{ik}) e^{j\varphi_{ikN}} \end{bmatrix} \qquad (11)$$

where the notation is analogous to that for the desired signal vector. Using (6) and (7), the input signal to the adaptive processor will be

$$V = Z_0^{-1}\left( X_d + \sum_{k=1}^{m} X_{ik} \right). \qquad (12)$$

If thermal noise is also added to each element of the array then the total input signal to the processor will be

$$X = V + X_n = Z_0^{-1}\left[ X_d + \sum_{k=1}^{m} X_{ik} \right] + X_n \qquad (13)$$

where $X_n$ is the noise vector defined as

$$X_n = (n_1(t), n_2(t), \ldots, n_N(t))^T. \qquad (14)$$
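The transformation in (4)-(6) can be illustrated for a two-element array. The sketch below is not taken from the paper: the self and mutual impedance values are illustrative textbook-style numbers assumed here for a pair of half-wavelength dipoles, the incident wave geometry in `open_circuit_voltages` is an assumed convention, and all function names are hypothetical.

```python
import cmath
import math

# Assumed, illustrative impedances in ohms (not values from the paper).
Z_SELF = complex(73.0, 42.5)
Z_MUTUAL = complex(-12.5, -29.9)
Z_LOAD = Z_SELF.conjugate()  # conjugate-matched load


def normalized_impedance(z_matrix, z_load):
    """Build Z_0 of (4)-(5): identity plus the impedance matrix divided by Z_L."""
    n = len(z_matrix)
    return [[(1.0 if r == c else 0.0) + z_matrix[r][c] / z_load for c in range(n)]
            for r in range(n)]


def solve_2x2(m, rhs):
    """Solve m * v = rhs for a 2x2 complex system by Cramer's rule."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [(d * rhs[0] - b * rhs[1]) / det, (a * rhs[1] - c * rhs[0]) / det]


def open_circuit_voltages(phi, spacing_wavelengths=0.5):
    """Unit-amplitude plane wave at angle phi from broadside (assumed geometry)."""
    half = math.pi * spacing_wavelengths * math.sin(phi)
    return [cmath.exp(-1j * half), cmath.exp(1j * half)]


v0 = open_circuit_voltages(math.radians(30))

coupled = normalized_impedance([[Z_SELF, Z_MUTUAL], [Z_MUTUAL, Z_SELF]], Z_LOAD)
v_coupled = solve_2x2(coupled, v0)          # V = Z_0^{-1} V_0, as in (6)

uncoupled = normalized_impedance([[Z_SELF, 0.0], [0.0, Z_SELF]], Z_LOAD)
v_uncoupled = solve_2x2(uncoupled, v0)      # diagonal Z_0: a common scale factor

for k in range(2):
    print(f"element {k + 1}: |v| coupled = {abs(v_coupled[k]):.4f}, "
          f"uncoupled = {abs(v_uncoupled[k]):.4f}")
```

With a diagonal $Z_0$ both element voltages are scaled by the same factor $1/(1 + Z_{ii}/Z_L)$, which is the "trivial scaling factor" mentioned in the text; with the off-diagonal terms present the two element voltages are no longer equal in magnitude for an off-broadside wave.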

In (14), $T$ denotes transpose. In the case of an adaptive array, the signal $x_j(t)$ from the jth element is multiplied by a complex weight $w_j(t)$. The signals are then summed to produce the array output. Using the LMS algorithm [3], the steady state weight vector $W$ of the array is given by

$$W = \Phi^{-1} S \qquad (15)$$

where $\Phi$ is the covariance matrix

$$\Phi = E\{X^* X^T\} \qquad (16)$$

and $S$ is the reference correlation vector

$$S = E\{X^* R(t)\}. \qquad (17)$$

In (16) and (17), $R(t)$ is the complex reference signal in the adaptive array [3], [4], the asterisk denotes complex conjugate, and $E\{\cdot\}$ denotes expectation. From (13) and (16)

$$\Phi = E\left\{ \left[ Z_0^{-1}\left( X_d + \sum_{k=1}^{m} X_{ik} \right) + X_n \right]^* \left[ Z_0^{-1}\left( X_d + \sum_{k=1}^{m} X_{ik} \right) + X_n \right]^T \right\}$$

$$= (Z_0^{-1})^* E\left\{ \left[ \left( X_d + \sum_{k=1}^{m} X_{ik} \right) + Z_0 X_n \right]^* \left[ \left( X_d + \sum_{k=1}^{m} X_{ik} \right) + Z_0 X_n \right]^T \right\} (Z_0^{-1})^T. \qquad (18)$$

Assuming that the thermal noise voltages from the array elements are Gaussian with zero mean and are uncorrelated with each other, and the carrier phases of the narrow-band signals are uniformly distributed on $(0, 2\pi)$ and are statistically independent of each other and of the thermal noise voltages, the covariance matrix $\Phi$ is given by

$$\Phi = (Z_0^{-1})^* \left[ \sigma^2 Z_0^* Z_0^T + \sum_{k=1}^{m} A_{ik}^2 U_{ik}^* U_{ik}^T + A_d^2 U_d^* U_d^T \right] (Z_0^{-1})^T \qquad (19)$$

where $\sigma^2$ is the thermal noise power. From (19)

$$\Phi = (Z_0^{-1})^* \sigma^2 \left[ Z_0^* Z_0^T + \sum_{k=1}^{m} \xi_{ik} U_{ik}^* U_{ik}^T + \xi_d U_d^* U_d^T \right] (Z_0^{-1})^T \qquad (20)$$

where $\xi_d$ is the ratio of the desired signal power to the thermal noise power and $\xi_{ik}$ is the ratio of the kth jammer power to the thermal noise power. Let

$$R_n = Z_0^* Z_0^T + \sum_{k=1}^{m} \xi_{ik} U_{ik}^* U_{ik}^T \qquad (21)$$

then

$$\Phi = (Z_0^{-1})^* \sigma^2 (R_n + \xi_d U_d^* U_d^T) (Z_0^{-1})^T. \qquad (22)$$

Note that $R_n$ is the normalized (with respect to the thermal noise power) covariance matrix of the undesired signals (jammers and the thermal noise). To find the steady state weights (15), $\Phi^{-1}$ must be computed. The following matrix inversion lemma [5] is used to compute $\Phi^{-1}$:

$$(A - \alpha U^* U^T)^{-1} = A^{-1} - \beta A^{-1} U^* U^T A^{-1} \qquad (23)$$

where $A$ is a nonsingular $N \times N$ matrix, $U$ is an $N \times 1$ column vector and $\alpha$, $\beta$ are scalars related by

$$\alpha^{-1} + \beta^{-1} = U^T A^{-1} U^*. \qquad (24)$$

Using the matrix inversion lemma to invert $\Phi$ in (22) one gets

$$\Phi^{-1} = \frac{1}{\sigma^2} Z_0^T \left( R_n^{-1} - \tau R_n^{-1} U_d^* U_d^T R_n^{-1} \right) \left( (Z_0^{-1})^* \right)^{-1} \qquad (25)$$

where

$$\tau = \left( \xi_d^{-1} + U_d^T R_n^{-1} U_d^* \right)^{-1}. \qquad (26)$$

The array will acquire and track the desired signal if the reference signal is correlated with the desired signal and is uncorrelated with interference signals. Assuming that the reference signal $R(t)$ is given by

$$R(t) = A_r e^{j(\omega_0 t + \psi_d)} \qquad (27)$$

and using (13), (17) yields

$$S = A_r A_d (Z_0^{-1})^* U_d^*. \qquad (28)$$

Using (25) and (28) the steady state weights (15) of the array are given by

$$W = K Z_0^T R_n^{-1} U_d^* \qquad (29)$$

where

$$K = \frac{A_r A_d}{\sigma^2} \left( 1 - \tau U_d^T R_n^{-1} U_d^* \right) \qquad (30)$$

is a constant. The weights given in (29) will lead to the maximum output SINR in the presence of multiple jammers (see Appendix). Knowing the steady state weight vector, one can compute the output SINR of the array, which is given by

$$\text{SINR} = \frac{P_d}{\sum_{k=1}^{m} P_{ik} + P_n} \qquad (31)$$

where $P_d$ is the output desired signal power

$$P_d = \frac{1}{2} E\{ |(Z_0^{-1} X_d)^T W|^2 \} = \frac{A_d^2}{2} |U_d^T (Z_0^{-1})^T W|^2, \qquad (32)$$

$P_{ik}$ is the output interference power due to the kth jammer

$$P_{ik} = \frac{1}{2} E\{ |(Z_0^{-1} X_{ik})^T W|^2 \} = \frac{A_{ik}^2}{2} |U_{ik}^T (Z_0^{-1})^T W|^2, \qquad (33)$$

and $P_n$ is the output thermal noise power

$$P_n = \frac{\sigma^2}{2} |W|^2. \qquad (34)$$

Using (32)-(34), (31) yields

$$\text{SINR} = \xi_d U_d^T R_n^{-1} U_d^*. \qquad (35)$$
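The closed form (35) can be cross-checked against the power ratio (31) for a small case. The following sketch does this for a two-element array with one jammer. It is an illustration under assumptions, not the authors' computation: the impedance values, the jammer level and direction, and the phase convention in `steering` are all assumed here, and the function names are hypothetical. The element pattern value 1.28142 is the one the text quotes for its own results.

```python
import cmath
import math

Z_SELF = complex(73.0, 42.5)       # assumed self-impedance (ohms)
Z_MUTUAL = complex(-12.5, -29.9)   # assumed mutual impedance (ohms)
Z_LOAD = Z_SELF.conjugate()
PATTERN = 1.28142                  # element pattern value quoted in the text


def steering(phi, spacing_wavelengths=0.5):
    """Two-element signal vector in the form of (10); phase convention assumed."""
    half = math.pi * spacing_wavelengths * math.sin(phi)
    return [PATTERN * cmath.exp(-1j * half), PATTERN * cmath.exp(1j * half)]


def undesired_covariance(z0, jammers):
    """R_n of (21); jammers is a list of (xi_ik, U_ik) pairs."""
    n = len(z0)
    r_n = [[sum(z0[r][k].conjugate() * z0[c][k] for k in range(n)) for c in range(n)]
           for r in range(n)]                       # Z_0* Z_0^T
    for xi, u in jammers:
        for r in range(n):
            for c in range(n):
                r_n[r][c] += xi * u[r].conjugate() * u[c]   # xi U* U^T
    return r_n


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


z0 = [[1 + Z_SELF / Z_LOAD, Z_MUTUAL / Z_LOAD],
      [Z_MUTUAL / Z_LOAD, 1 + Z_SELF / Z_LOAD]]
xi_d = 10 ** 0.5                                    # 5 dB desired-signal SNR
u_d = steering(0.0)                                 # desired signal at broadside
jammers = [(100.0, steering(math.radians(30)))]     # assumed 20 dB jammer at 30 deg

r_inv = inverse_2x2(undesired_covariance(z0, jammers))
y = [sum(r_inv[r][c] * u_d[c].conjugate() for c in range(2)) for r in range(2)]

# Closed form (35): xi_d * U_d^T R_n^{-1} U_d*.
sinr_closed = xi_d * sum(u_d[r] * y[r] for r in range(2))

# Direct form (31)-(34) with W = Z_0^T y (K = 1), so that (Z_0^{-1})^T W = y.
w = [sum(z0[k][r] * y[k] for k in range(2)) for r in range(2)]
p_d = xi_d * abs(sum(u_d[r] * y[r] for r in range(2))) ** 2
p_i = sum(xi * abs(sum(u[r] * y[r] for r in range(2))) ** 2 for xi, u in jammers)
p_n = sum(abs(x) ** 2 for x in w)
sinr_direct = p_d / (p_i + p_n)

print(f"(35): {sinr_closed.real:.6f}   (31): {sinr_direct:.6f}")
```

The common factor $\sigma^2/2$ and the constant $K$ cancel in the ratio, which is why the sketch drops them. The two printed values should agree to rounding error, and the imaginary part of the closed form should vanish because $R_n$ is Hermitian.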

Equation (35) is used to compute the steady state output SINR of an adaptive array consisting of N half-wavelength, center-fed dipoles. All the dipoles are assumed to have similar radiation characteristics and are spaced at a distance d apart (Fig. 3). Note that H-plane arraying of the dipoles is done. The desired signal and all jammers are assumed to be theta polarized (Fig. 3). For the results presented in this 'paper, [k(f) , 4>, p) = 1.28142 (10). Fig. 4 shows the output SINR of an adaptive array of six dipoles as a function of the desired signal direction. The dipoles

Fig. 3. An array of N half-wavelength, center-fed dipoles.

Fig. 4. Output SINR of an array of six half-wavelength, center-fed dipoles versus the desired signal direction ($\phi_d$), plotted against $\sin(\phi_d)$ with and without mutual coupling. $\xi_d = 5$ dB, $\theta_d = 90°$, $d = 0.5\lambda$, $Z_L = Z_{ii}^*$.

Fig. 5. Mutual impedance between two half-wavelength, center-fed thin dipoles versus the spacing between the dipoles (d in wavelengths).

are spaced at a distance of half a wavelength and each dipole is terminated in a load impedance equal to the complex conjugate of the self impedance of a half-wavelength, center-fed dipole. (The self and mutual impedances between the dipoles were computed by evaluating the expressions given by Schelkunoff and Friis [6].) The input signal-to-noise ratio ($\xi_d$) is 5 dB, and the output SINR is computed in the absence of all jammers. The continuous curve in the figure shows the output SINR when the mutual coupling between the array elements is taken into account, while the broken curve represents the output SINR when the mutual coupling between the array elements is ignored. Note that the presence of mutual coupling changes the array performance and the output SINR of the array depends on the angle of arrival of the desired signal. The dependence of the array output SINR on the angle of arrival of the desired signal can be explained as follows. Mutual coupling changes the desired signal component of the element output voltages (6). The array illumination due to the desired signal is no longer uniform and depends on the angle of arrival of the desired signal, while the noise, being internal, is not affected by the mutual coupling between the array elements. The output SINR of the array, therefore, changes with the angle of arrival of the desired signal. In the above example, the interelement spacing was large (λ/2) and we found that the mutual coupling does nevertheless affect the performance of the array. For small interelement spacings, the mutual coupling between the array elements will be large (Fig. 5) and, therefore, the array performance will be affected more. This is evident in the plots of Fig. 6, where the output SINR of the array is plotted as a function of the interelement spacing. The desired signal is incident from the broadside

Fig. 6. Output SINR of an array of six half-wavelength, center-fed dipoles versus the interelement spacing (d in wavelengths). $\xi_d = 5$ dB, $Z_L = Z_{ii}^*$, $(\theta_d, \phi_d) = (90°, 0°)$.

direction (90°, 0°) and jammers are assumed to be absent. The broken curve in the plot shows the output SINR in the absence of mutual coupling while the continuous curve shows the output SINR when the mutual coupling is taken into account. Note that the mutual coupling between the array elements affects the array performance even for large interelement spacing (d > λ/2). The effect is more pronounced for small interelement spacing (d < λ/2), where the output SINR drops below the expected value (in the absence of mutual coupling) by a significant amount. The smaller the interelement spacing, the larger the drop in the SINR. Note that the SINR curve is similar to the gain curve of a broadside array of six half-wavelength, center-fed dipoles [7]. The drop in the output SINR for small interelement spacings can also be related to the reduction of the total incident energy. As the interelement spacing is decreased, the total aperture of the antenna decreases and so does the total incident energy due to the desired signal. Since the receiver internal noise remains unchanged, the signal-to-noise ratio drops. For a similar reason, the introduction of additional elements into a fixed aperture with half-wavelength spaced antenna elements can degrade the adaptive array performance. The total aperture is fixed and so is the total incident energy. The introduction of additional elements adds to the thermal noise without increasing the available signal power, and that degrades the array output SINR. Fig. 7 shows the output SINR of an adaptive array as a function of the number of antenna elements. The array is a linear array of half-wavelength, center-fed dipoles. The total aperture is fixed at 2λ and the desired signal is incident from the broadside direction. Again jammers are absent. The output SINR is computed with and without mutual coupling. In these plots only the indicated points are

Fig. 7. Output SINR of an array of half-wavelength, center-fed dipoles of fixed aperture versus the number of elements, with and without mutual coupling. $\xi_d = 5$ dB, $Z_L = Z_{ii}^*$, $(\theta_d, \phi_d) = (90°, 0°)$, total aperture = 2λ.

meaningful (the total number of antenna elements is always an integer). Note that in the absence of mutual coupling the output SINR increases with the introduction of additional elements, which is consistent with the previous work of Compton [8]. In the presence of mutual coupling, however, the array output SINR can decrease with the introduction of additional elements. One can see that the output SINR reaches a maximum for a four-element array and the array performance degrades with the introduction of additional elements. The reason for this is that one needs a minimum number of antenna elements to receive all the energy incident on a given aperture. Beyond this point the aperture becomes overcrowded, which leads to worse performance.

In the examples given so far, the array performance was computed in the absence of jammers. The presence of jammers degrades the array performance. As pointed out in our earlier work [1], [2], the degradation in the array performance can be computed using the unperturbed pattern of the array. The unperturbed pattern of an adaptive array was defined to be proportional to the receive pattern of the array responding to a single desired signal in the absence of interfering signals. In the absence of jammers, the normalized noise covariance matrix (21) becomes

$$R_n = Z_0^* Z_0^T. \tag{36}$$

Substituting (36) in (29), the steady state weight vector of the array will be

$$W = K (Z_0^{-1} U_d)^*. \tag{37}$$

Thus, the weight vector for the unperturbed pattern will be $(Z_0^{-1} U_d)^*$ and the value of the unperturbed pattern in the direction (θ, φ) for polarization p will be

$$E(\theta, \phi, p) = (Z_0^{-1} U)^T (Z_0^{-1} U_d)^* \tag{38}$$

where U is the signal vector of the array in direction (θ, φ) for polarization p. Substituting (36) into (35), the output SINR of the array in the absence of all jammers will be

$$\mathrm{SINR} = \xi_d (Z_0^{-1} U_d)^T (Z_0^{-1} U_d)^* \tag{39}$$

and is proportional to the value of the unperturbed pattern (38) in the desired signal direction. Further, following the same procedure as given in [1], it can be shown that the output SINR of the array in the presence of one jammer will be given by

$$\mathrm{SINR} = \xi_d \left[ E(\theta_d, \phi_d, p_d) - \frac{|E(\theta_{i1}, \phi_{i1}, p_{i1})|^2}{\xi_{i1}^{-1} + (Z_0^{-1} U_{i1})^T (Z_0^{-1} U_{i1})^*} \right] \tag{40}$$

where $E(\theta_{i1}, \phi_{i1}, p_{i1})$ is the value of the unperturbed pattern in the jammer direction. The same can be done for multiple jammers [2]. Thus, the degradation in the array performance can be computed using the unperturbed pattern.

In this section, the effect of mutual coupling between the array elements on the steady state performance of an LMS-type adaptive array was presented. It was shown that though LMS adaptive arrays produce the maximum obtainable output SINR (see Appendix), their performance is affected by mutual coupling. One should, therefore, take mutual coupling into account to compute the true output SINR of the array. In the next section, the transient response of an adaptive array in the presence of mutual coupling will be studied.

III. TRANSIENT RESPONSE OF AN ADAPTIVE ARRAY IN THE PRESENCE OF MUTUAL COUPLING

The speed of response of an adaptive array is controlled by the eigenvalues of its signal covariance matrix [4]. As pointed out in the last section, the presence of mutual coupling between the array elements affects the input signals to the adaptive processor and thus the covariance matrix. The eigenvalues of the covariance matrix will, therefore, be different from those in the absence of mutual coupling. In this section, the transient response of an adaptive array in terms of the eigenvalues of its covariance matrix will be studied. From (20), the covariance matrix Φ can be written as

$$\Phi = \sigma^2 \left[ I + \sum_{k=1}^{m} \xi_{ik} (Z_0^{-1} U_{ik})^* (Z_0^{-1} U_{ik})^T + \xi_d (Z_0^{-1} U_d)^* (Z_0^{-1} U_d)^T \right]. \tag{41}$$
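To make (39) and (41) concrete, here is a minimal sketch for the simplest case: a single desired signal, no jammers, and mutual coupling ignored, so that $Z_0 = I$. The unit-magnitude element responses, the variable names, and the helper functions are assumptions made for illustration; they are not the dipole model of the paper, so the numbers need not match the plotted curves. With $\sigma^2 = 1$, the matrix in (41) reduces to $I + \xi_d v^* v^T$ with $v = Z_0^{-1} U_d$, which has one nonunity eigenvalue $1 + \xi_d v^T v^*$ (eigenvector $v^*$) and unity eigenvalues otherwise.

```python
import cmath
import math

N = 6                          # number of array elements
d = 0.5                        # interelement spacing in wavelengths
phi_d = math.radians(0.0)      # broadside arrival

# v stands for Z0^{-1} U_d; with Z0 = I (coupling ignored, an assumption) it is U_d.
v = [cmath.exp(2j * math.pi * d * n * math.sin(phi_d)) for n in range(N)]

def sinr_no_jammer(xi_d):
    """Output SINR from (39): xi_d * v^T v*."""
    return xi_d * sum(abs(vn) ** 2 for vn in v)

def apply_phi(x, xi_d):
    """Return Phi x for Phi = I + xi_d v* v^T, i.e. (41) with sigma^2 = 1, no jammers."""
    inner = sum(vn * xn for vn, xn in zip(v, x))          # v^T x
    return [xn + xi_d * vn.conjugate() * inner for vn, xn in zip(v, x)]

# Steady state: xi_d = 5 dB.
sinr_db = 10 * math.log10(sinr_no_jammer(10 ** (5 / 10)))

# Transient behavior: xi_d = 10 dB, i.e. a power ratio of 10.
xi_d = 10.0
e1 = [vn.conjugate() for vn in v]                  # eigenvector v*
nonunity = (apply_phi(e1, xi_d)[0] / e1[0]).real   # equals 1 + N * xi_d
e2 = [1.0, -1.0] + [0.0] * (N - 2)                 # orthogonal to v at broadside
unity = (apply_phi(e2, xi_d)[0] / e2[0]).real      # equals 1

print(round(sinr_db, 2), nonunity, unity)
```

Under these assumptions the SINR is $N \xi_d$ (about 12.8 dB for six elements at 5 dB) and the nonunity eigenvalue is $1 + N \xi_d = 61$. Mutual coupling enters only through $v = Z_0^{-1} U_d$, so replacing `v` with a coupled signal vector changes both quantities through the same inner product $v^T v^*$.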

In the presence of m + 1 signals (one desired signal and m jammers), Φ has at least N - m - 1 eigenvectors (N is the total number of antenna elements) having unity eigenvalues (assuming $\sigma^2 = 1$), and the rest of the eigenvectors have eigenvalues larger than one. The presence of mutual coupling between the array elements will affect these eigenvalues. We will, therefore, compute the nonunity eigenvalues to study the effect of mutual coupling on the transient response of adaptive arrays. Fig. 8 shows the nonunity eigenvalue of a six half-wavelength, center-fed dipole adaptive array in the presence of one signal only (no jammer). The desired signal is incident from the broadside direction (90°, 0°) and is 10 dB stronger than the thermal noise. The eigenvalue is plotted as a function of the interelement spacing. Note that the mutual coupling between the array elements affects the eigenvalues even for large interelement spacing, but the effect is more severe when the spacing is less than half a wavelength. For small spacings, the eigenvalue drops significantly below the value obtained in the absence of mutual coupling (broken line). The drop in the eigenvalue indicates a reduction in the speed of response of the adaptive array. In other words, the array will take more time to adapt to the changes in the desired signal parameters.

The main feature of an adaptive array is nulling the undesired signals (jammers). The transient response of an adaptive array in the presence of jammers, therefore, will be considered next. Figs. 9 and 10 show the nonunity eigenvalues in the presence of one and two jammers, respectively. The angles of arrival of the two jammers are (90°, 30°) and (90°, -45°), respectively, and the jammers are 10 dB and 20 dB stronger than the desired signal. Again the eigenvalues are plotted as a function of the interelement spacing. Note that the mutual coupling between the array elements affects the eigenvalues even for large interelement spacings, but the effect is more severe for small interelement spacing (d ≤ λ/2). For such spacing, the eigenvalues drop substantially below the values that they would have had in the absence of mutual coupling (broken curve). The smaller the eigenvalues, the longer will be the transient, and the array will take more time to null the jammers, which may be undesirable. The strong fluctuations in the eigenvalues for large interelement spacings are due to the fact that one or more signals (jammer as well as the desired signal) are incident from grating lobe directions.

In this section, the effect of mutual coupling on the transient response of an adaptive array was studied. It was shown that the presence of mutual coupling between the array elements affects the transient behavior of the adaptive array and slows its response to both desired as well as jamming signals for closely spaced elements. In the next section, it is shown that mutual coupling affects the performance of Applebaum-type adaptive arrays and that the steering vector for these arrays must be modified to account for mutual coupling in order to maximize the output SINR.

Fig. 8. Nonunity eigenvalue (λ) of an array of six half-wavelength, center-fed dipoles in the presence of a desired signal versus the interelement spacing (d in wavelengths), with and without mutual coupling. $\xi_d = 10$ dB, $(\theta_d, \phi_d) = (90°, 0°)$, $Z_L = Z_{ii}^*$, no jammer.

Fig. 9. Nonunity eigenvalues (λ) of an array of six half-wavelength, center-fed dipoles in the presence of a desired signal and a jammer versus the interelement spacing (d in wavelengths), with and without mutual coupling. $\xi_d = 10$ dB, $(\theta_d, \phi_d) = (90°, 0°)$, $\xi_{i1} = 20$ dB, $(\theta_{i1}$, ...

Fig. 10. Nonunity eigenvalues (λ) of an array of six half-wavelength, center-fed dipoles in the presence of a desired signal and two jammers versus the interelement spacing (d in wavelengths), with and without mutual coupling. $\xi_d = 10$ dB, $(\theta_d, \phi_d) = (90°, 0°)$, $\xi_{i1} = 20$ dB, $\xi_{i2} = 30$ dB, $(\theta_{i1}, \phi_{i1}) = (90°, 30°)$, $(\theta_{i2}$, ...

IV. APPLEBAUM ARRAYS

In the case of an Applebaum adaptive array, one uses a steering vector or initial weights instead of a reference signal as the control signal (Fig. 1). The steady state weight vector for this type of adaptive array [9] is given by

$$W = \left[ \frac{1}{G} I + \Phi \right]^{-1} U_s \tag{42}$$

where $U_s$ is the steering vector, G is the loop gain, and I is an N × N identity matrix. Note that (42) contains the signal covariance matrix Φ. As pointed out earlier, the presence of mutual coupling between array elements will change Φ and thus will affect the steady state performance of Applebaum-type adaptive arrays. The eigenvalues of the signal covariance matrix control the speed of response of Applebaum-type adaptive arrays too. The presence of mutual coupling between array elements will, therefore, affect the transient response of Applebaum-type adaptive arrays in the same fashion as it affects the LMS array discussed in the last section.

We will now find the optimum steering vector to maximize the output SINR of Applebaum-type adaptive arrays in the presence of mutual coupling. Assuming that the loop gain G is large, the steady state weights (42) become

$$W = \Phi^{-1} U_s. \tag{43}$$

Using (25) in (43) one gets

$$W = \frac{Z_0^T}{\sigma^2} \left( R_n^{-1} - \tau R_n^{-1} U_d^* U_d^T R_n^{-1} \right) Z_0^* U_s. \tag{44}$$

If the steering vector is chosen to steer the beam in the desired signal direction, i.e., $U_s = U_d^*$, then the steady state weight vector will be given by

$$W = \frac{Z_0^T}{\sigma^2} \left( R_n^{-1} - \tau R_n^{-1} U_d^* U_d^T R_n^{-1} \right) Z_0^* U_d^*. \tag{45}$$

Comparing (29) and (45) one can see that the two are not the same, and thus this choice of the steering vector would not give the optimum SINR. If instead of the "open circuit" desired

signal vector ($U_d$), the complex conjugate of the desired signal component of the element output voltage is used to generate the steering vector, i.e.,

$$U_s = (Z_0^{-1} U_d)^*, \tag{46}$$

then from (44)

$$W = K_1 Z_0^T R_n^{-1} U_d^*, \tag{47}$$

where

$$K_1 = \frac{1}{\sigma^2} \left( 1 - \tau U_d^T R_n^{-1} U_d^* \right). \tag{48}$$

Comparing (29) and (47), one can see that the two weight vectors differ only by a scale factor. The choice of the steering vector as given in (46) will, therefore, lead to the optimum performance of the array.

Fig. 11 shows the output SINR of an adaptive array of six half-wavelength, center-fed dipoles as a function of the interelement spacing for the two choices of steering vectors. The desired signal is incident from the broadside direction and the load impedance is equal to the complex conjugate of the self impedance of a half-wavelength, center-fed dipole. Again jammers are assumed to be absent. Note that when the steering vector is chosen according to (46), the adaptive array gives a better performance for small interelement distances where the mutual coupling is the strongest. Comparing Figs. 6 and 11, one can see that the output SINR of the properly excited Applebaum array is the same as that of the LMS array. In the presence of jammers, the array output SINR will be the same as that of the LMS array and can be predicted using the unperturbed pattern. Thus, $U_s = (Z_0^{-1} U_d)^*$ leads to the optimum performance of an Applebaum-type adaptive array.

Fig. 11. Output SINR of an array of six half-wavelength, center-fed dipoles versus the interelement spacing (d in wavelengths) using the Applebaum algorithm. $\xi_d = 5$ dB, $Z_L = Z_{ii}^*$, $(\theta_d, \phi_d) = (90°, 0°)$.

V. CONCLUSION

In this work, the effect of mutual coupling between array elements on the performance of adaptive arrays was studied. It was shown that mutual coupling affects the performance of adaptive arrays even for large interelement spacings. The effect is particularly serious for small interelement spacing, where the steady state output SINR of the array is significantly lower than that obtained when mutual coupling is ignored and the speed of response of the array is reduced.

APPENDIX

From (21)

$$R_n = Z_0^* \left\{ I + \sum_{k=1}^{m} \xi_{ik} (Z_0^{-1} U_{ik})^* (Z_0^{-1} U_{ik})^T \right\} Z_0^T. \tag{49}$$

Let $U'_{ik} = Z_0^{-1} U_{ik}$. Then (49) becomes

$$R_n = Z_0^* \left\{ I + \sum_{k=1}^{m} \xi_{ik} U'^{*}_{ik} U'^{T}_{ik} \right\} Z_0^T = Z_0^* R'_n Z_0^T \tag{50}$$

where

$$R'_n = I + \sum_{k=1}^{m} \xi_{ik} U'^{*}_{ik} U'^{T}_{ik}. \tag{51}$$

Using (50) in (29) the steady state weight vector of the array is

$$W = K (R'_n)^{-1} (Z_0^{-1} U_d)^* = K (R'_n)^{-1} U'^{*}_d \tag{52}$$

where

$$U'_d = Z_0^{-1} U_d. \tag{53}$$

Comparing (52) with the optimum weight vector [9, eq. (4.1)] one can see that the two are similar. Thus the weight vector given by (29) will lead to the maximum output SINR in the presence of multiple jammers.

REFERENCES

[1] I. J. Gupta and A. A. Ksienski, "Dependence of adaptive array performance on conventional array design," IEEE Trans. Antennas Propagat., vol. AP-30, pp. 549-553, July 1982.
[2] I. J. Gupta and A. A. Ksienski, "Prediction of adaptive array performance," IEEE Trans. Aerospace Electron. Syst., vol. AES-19, no. 3, pp. 380-388, May 1983.
[3] B. Widrow, P. E. Mantey, L. J. Griffiths, and B. B. Goode, "Adaptive antenna systems," Proc. IEEE, vol. 55, no. 12, pp. 2143-2159, Dec. 1967.
[4] R. L. Riegler and R. T. Compton, Jr., "An adaptive array for interference rejection," Proc. IEEE, vol. 61, no. 6, pp. 748-758, June 1973.
[5] A. S. Householder, The Theory of Matrices in Numerical Analysis. New York: Dover, 1964, p. 123.
[6] S. A. Schelkunoff and H. T. Friis, Antenna Theory and Practice. New York: Wiley, July 1966, p. 409.
[7] C. T. Tai, "The gain of uniform arrays of isotropic sources and dipoles," Ohio State Univ. Antenna Lab. Tech. Rep. 1522-2, Mar. 1963.
[8] R. T. Compton, Jr., "A method of choosing element patterns in an adaptive array," IEEE Trans. Antennas Propagat., vol. AP-30, pp. 489-493, May 1982.
[9] S. P. Applebaum, "Adaptive arrays," IEEE Trans. Antennas Propagat., vol. AP-24, pp. 585-599, Sept. 1976.

Optimum Combining in Digital Mobile Radio with Cochannel Interference

JACK H. WINTERS, MEMBER, IEEE

Abstract-This paper studies optimum signal combining for space diversity reception in cellular mobile radio systems. With optimum combining, the signals received by the antennas are weighted and combined to maximize the output signal-to-interference-plus-noise ratio. Thus, with cochannel interference, space diversity is used not only to combat Rayleigh fading of the desired signal (as with maximal ratio combining) but also to reduce the power of interfering signals at the receiver. We use analytical and computer simulation techniques to determine the performance of optimum combining when the received desired and interfering signals are subject to Rayleigh fading. Results show that optimum combining is significantly better than maximal ratio combining even when the number of interferers is greater than the number of antennas. Results for typical cellular mobile radio systems show that optimum combining increases the output signal-to-interference ratio at the receiver by several decibels. Thus, systems can require fewer base station antennas and/or achieve increased channel capacity through greater frequency reuse. We also describe techniques for implementing optimum combining with least mean square (LMS) adaptive arrays.

I. INTRODUCTION

SPACE diversity provides an attractive means for improving the performance of mobile radio systems. With space diversity, the signals from the receiving antennas can be combined to combat multipath fading of the desired signal and reduce the relative power of interfering signals. Previous studies of mobile radio systems (e.g., [1]) have considered space diversity only for combating multipath fading of the desired signal. Interference at each receiving antenna is assumed to be independent in these studies. Under this condition, maximal ratio combining¹ [1, p. 316] produces the highest output signal-to-interference-plus-noise ratio (SINR) at the receiver. However, in most systems (in particular, cellular mobile radio systems [1]) the same interfering signals are present at each of the receiving antennas. Thus, the received signals can be combined to suppress these interfering signals in addition to combating desired signal fading and thereby achieve higher output SINR than maximal ratio combining.

The output SINR can be maximized by using adaptive array techniques at the receiver (e.g., [2]-[4]). We will not analyze the performance of the various adaptive array techniques in this paper, but only study the performance of the optimum combiner that maximizes the output SINR. Although the adaptive array (i.e., optimum combiner) has been studied extensively, it has not been previously analyzed with the fading conditions of digital mobile radio.

This paper studies the performance of the optimum combiner in digital mobile radio systems. We assume flat Rayleigh fading across the signal channel and independent fading between antennas. The average bit error rate (BER) of the optimum combiner is studied for coherent detection of phase shift keyed (PSK) signals. Analytical and computer simulation results show that optimum combining is significantly better than maximal ratio combining even when there are more interferers than receive antennas. Results for typical cellular mobile radio systems show that the optimum combiner can increase the output SINR several decibels more than maximal ratio combining.

In Section II we describe the optimum combiner. Section III studies the BER of the optimum combiner when the desired and interfering signals are subject to Rayleigh fading. We discuss analytical results for one interferer and Monte Carlo simulation results for multiple interferers. In Section IV we consider the optimum combiner performance in cellular mobile radio systems. Section V discusses the possible methods for implementing the optimum combiner in mobile radio with a least mean square (LMS) [3] adaptive array. A summary and conclusions are presented in Section VI.

II. OPTIMUM COMBINER

A. Description and Weight Equation

Fig. 1 shows a block diagram of an M element space diversity combiner. The signal received by the ith element $y_i(t)$ is split with a quadrature hybrid into an in-phase signal $x_{I_i}(t)$ and a quadrature signal $x_{Q_i}(t)$. These signals are then multiplied by a controllable weight $w_{I_i}(t)$ or $w_{Q_i}(t)$. The weighted signals are then summed to form the array output $s_o(t)$.

Manuscript received February 25, 1983; revised November 23, 1983. This paper was presented at the International Conference on Communications, Boston, MA, June 1983. The author is with the Radio Research Laboratory, AT&T Bell Laboratories, Holmdel, NJ 07733.

¹ In maximal ratio combining, the received signals are weighted proportionately to their signal-voltage-to-noise-power ratios and combined in phase.

Reprinted from IEEE Transactions on Vehicular Technology, Vol. VT-33, No. 3, pp. 144-155, August 1984.

Fig. 1. Block diagram of an M element space diversity combiner.

The space diversity combiner can be described mathematically using complex notation [5]. Let the weight vector w be given by

$$w = [w_1 \; w_2 \; \cdots \; w_M]^T \tag{1}$$

and the received signal vector x be given by

$$x = [x_1 \; x_2 \; \cdots \; x_M]^T. \tag{2}$$

The received signal consists of the desired signal, thermal noise, and interference and, therefore, can be expressed as

$$x = x_d + x_n + \sum_{j=1}^{L} x_j \tag{3}$$

where $x_d$, $x_n$, and $x_j$ are the received desired signal, noise, and jth interfering signal vectors, respectively, and L is the number of interferers. Furthermore, let $s_d(t)$ and $s_j(t)$ be the desired and jth interfering signals as they are transmitted, respectively, with

$$E[s_d^2(t)] = 1 \tag{4}$$

and

$$E[s_j^2(t)] = 1 \quad \text{for } 1 \le j \le L. \tag{5}$$

Then x can be expressed as

$$x = u_d s_d(t) + x_n + \sum_{j=1}^{L} u_j s_j(t) \tag{6}$$

where $u_d$ and $u_j$ are the desired and jth interfering signal propagation vectors, respectively. The received interference-plus-noise correlation matrix is given by

$$R_{nn} = E\left[ \left( x_n + \sum_{j=1}^{L} x_j \right)^* \left( x_n + \sum_{j=1}^{L} x_j \right)^T \right] \tag{7}$$

where the superscripts * and T denote conjugate and transpose, respectively. Assuming the noise and interfering signals are uncorrelated, we can show that

$$R_{nn} = \sigma^2 I + \sum_{j=1}^{L} E[u_j^* u_j^T] \tag{8}$$

where $\sigma^2$ is the noise power and I is the identity matrix. In (8) the expected value is taken over a period much less than the reciprocal of the fading rate (e.g., several bit intervals). Note that we have assumed that the fading rate is much less than the bit rate. Finally, the equation for the weights that maximize the output SINR is (from [6])

$$w = \alpha R_{nn}^{-1} u_d^* \tag{9}$$

where α is a constant,² and the superscript -1 denotes the inverse of the matrix.
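The weight equation (9) can be illustrated with a small numerical sketch for M = 2 antennas and one interferer. The propagation vectors `u_d` and `u_1` below are made-up values, treated as constant over the averaging interval so that the expectation in (8) reduces to a single outer product; the helper names `inv2` and `output_sinr` are introduced only for this example. Maximal ratio combining is modeled here as weights proportional to $u_d^*$, which assumes equal noise power on both branches.

```python
def inv2(m):
    """Inverse of a 2x2 complex matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def output_sinr(w, u_d, R):
    """Output SINR |w^T u_d|^2 / (w^H R w) for a weight vector w."""
    num = abs(sum(wi * ui for wi, ui in zip(w, u_d))) ** 2
    Rw = [sum(R[i][k] * w[k] for k in range(len(w))) for i in range(len(w))]
    den = sum(wi.conjugate() * rwi for wi, rwi in zip(w, Rw)).real
    return num / den

sigma2 = 1.0                       # thermal noise power per antenna
u_d = [1 + 0j, 0.6 + 0.8j]         # desired signal propagation vector (illustrative)
u_1 = [2 + 0j, -1.2 + 1.6j]        # interferer propagation vector (illustrative)

# Interference-plus-noise correlation matrix, (8) with L = 1.
R = [[sigma2 * (i == j) + u_1[i].conjugate() * u_1[j] for j in range(2)]
     for i in range(2)]
R_inv = inv2(R)

# Optimum combiner weights from (9) with alpha = 1.
w_opt = [sum(R_inv[i][k] * u_d[k].conjugate() for k in range(2)) for i in range(2)]
# Maximal ratio combining weights (equal branch noise powers assumed).
w_mrc = [u.conjugate() for u in u_d]

sinr_opt = output_sinr(w_opt, u_d, R)
sinr_mrc = output_sinr(w_mrc, u_d, R)
print(sinr_opt, sinr_mrc)
```

With these values the optimum combiner yields a noticeably higher output SINR than maximal ratio combining, because it trades some desired-signal gain for suppression of the interferer. Scaling `w_opt` by any nonzero constant leaves `sinr_opt` unchanged, which is why the value of α is immaterial.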

B. Discussion The mobile radio environment is quite different from the signal environment in which adaptive arrays (i.e., optimum combiners) are usually employed. In a typical adaptive array application in a nonfading environment, at the receiver there are only a few interfering signals, and their power is much greater than that of the desired signal. The adaptive array places nulls in the antenna pattern in the direction of these interferers, greatly suppressing these signals in the array output. In general, an M element array can null up to M - 2 interfering signals and still optimize desired signal reception. The output SINR is, therefore, substantially increased by the array. In mobile radio systems, on the other hand, at the receiver there can be several interfering signals whose power is close to that of the desired signal, and numerous interfering signals whose power is much less than that of the desired signal. Therefore, the number of interfering signals may be much greater than M, and the array may not be able to greatly suppress every interfering signal. Thus, the array output SINR may not be markedly increased by the array. To be useful in mobile radio systems, however, the adaptive array does not have to greatly suppress interfering signals or vastly increase the output SINR. Interfering signals need only to be reduced in power by a few decibels so that their power is below the sum of the power of other interferers. Furthermore, a substantial output SINR improvement is not required because a several decibel increase in output SINR can make possible large increases in the channel capacity of the system. Thus, although the signal environment in mobile radio systems is quite different from that of the typical adaptive array system, adaptive array techniques can still offer significant advantages. One other major difference is the ability of the array to resolve closely spaced transmitters. 
² Note that α does not affect the array output SINR (i.e., the array performance) and, therefore, we will not consider its value.

In a nonfading environment the adaptive array cannot suppress an interfering signal if the angular separation between the interfering and desired transmitters is too small. In this case, the desired signal phase difference between receive antennas is nearly the same as that for the interfering signal. Therefore, the array cannot both null one signal and enhance reception of the other. The use of additional antennas results in a small decrease in the required angular separation, but the problem remains. In mobile radio, because of multipath, the signal phase at one antenna is independent of the signal phase at another antenna when the antenna separation is greater than half of a wavelength (several inches at 800 MHz) [1, p. 311].³ Therefore, the adaptive array antenna pattern is meaningless. Similarly, at the receive antennas, the signal phases from two different transmit antennas are independent when the transmit antennas are more than half of a wavelength apart. Therefore, for all practical purposes, the received signal phases are independent of vehicle location. Thus, the resolution of interfering and desired signals does not depend on how closely the vehicles are located. Instead, for all locations, there is a small probability that the array cannot resolve the two signals. This occurs when the phase differences between the antennas are nearly the same for both the desired and interfering signals. With moving mobiles,⁴ however, the period of time with unresolved signals is very brief, and the performance of the adaptive array can be averaged over the fading. Furthermore, since the signal phase differences between the antennas are independent, the probability of unresolved signals with M antennas is approximately equal to the probability of unresolved signals with two antennas raised to the M - 1 power. Thus, each additional antenna greatly decreases the probability of unresolved signals, and this probability becomes negligible with only a few antennas.
In summary, in mobile radio the adaptive array cannot resolve desired and interfering signals a small percentage of the time, rather than over a given angular separation. Therefore, we need not be concerned with vehicle location for the resolving of the signals. In the following analysis we study the optimum combiner performance averaged over the Rayleigh fading.
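The scaling argument above, that the probability of unresolved signals with M antennas is roughly the two-antenna probability raised to the power M - 1, can be tabulated in a few lines. The two-antenna probability `p2 = 0.1` is a hypothetical number chosen only to show the trend; the text does not give a value.

```python
def unresolved_probability(p_two_antennas, num_antennas):
    """Approximate probability of unresolved signals with M antennas: p2 ** (M - 1)."""
    return p_two_antennas ** (num_antennas - 1)

p2 = 0.1  # hypothetical two-antenna probability, for illustration only
table = {m: unresolved_probability(p2, m) for m in range(2, 6)}
for m, p in table.items():
    print(m, p)
```

Each added antenna multiplies the probability by `p2`, so with this illustrative value four antennas already bring it down to about one in a thousand.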

³ In our analysis we assume the antennas are spaced far enough apart so that the received signal phases are independent.

⁴ If all mobiles are stationary, channel reassignment can be used to eliminate the problem.

III. OPTIMUM COMBINER PERFORMANCE WITH FADING

This section studies the performance of the optimum combiner when the received signals are subject to fading. For the analysis we assume flat fading across the channel, with the fading independent between antennas. We assume that the received signal has an envelope with a Rayleigh distribution and a phase with a uniform distribution. We study the steady state performance of the optimum combiner, first determining the output SINR distribution and then the BER for coherent detection of PSK. All results are compared to those for maximal ratio combining.

In cellular mobile radio, the interference plus noise at the receiver consists primarily of cochannel interference. In a typical system, there are numerous cochannel interfering signals, each of which affects the performance of the optimum combiner. An exact analysis of the performance is, therefore, quite complicated, especially since, with fading, each of these signals has a random amplitude. Therefore, in the analysis in this paper, we consider only the strongest interferers individually. The remaining interfering signals are combined and considered as lumped interference that is uncorrelated between antennas. Since with Rayleigh fading the in-phase and quadrature components of each of the received interfering signals have a Gaussian distribution, the components of the sum of these signals also have a Gaussian distribution, and the sum can be considered as thermal noise. Thus, under this assumption the combiner cannot suppress the lumped interference, and we are therefore analyzing a worse situation, since the actual combiner performance will be better. Therefore, although this analysis does not show the maximum improvement (over maximal ratio combining) with optimum combining, it does show most of it. Note that if we consider all the interference as lumped interference, the results are identical to those for maximal ratio combining.

The analysis involves several parameters which are defined as follows:

$$\Gamma = \frac{\text{mean received desired signal power per antenna}}{\text{mean received noise plus interference power per antenna}} \tag{10}$$

$$\Gamma_d = \frac{\text{mean received desired signal power per antenna}}{\text{mean received noise power per antenna}} \tag{11}$$

$$\Gamma_j = \frac{\text{mean received } j\text{th interferer signal power per antenna}}{\text{mean received noise power per antenna}} \tag{12}$$

$$\gamma_R = \frac{\text{local mean desired signal power at the array output}}{\text{mean noise plus interference power at the array output}} \tag{13}$$

and

$$\gamma = \frac{\text{local mean desired signal power at the array output}}{\text{local mean noise plus interference power at the array output}}. \tag{14}$$

In the above definitions, mean is the average over the Rayleigh fading, and local mean is the average over a period less than the reciprocal of the fading rate (e.g., several bit durations). It is useful to note that

$$\Gamma = \frac{\Gamma_d}{1 + \sum_{j=1}^{L} \Gamma_j}. \tag{15}$$

A. Analytical Results with One Interferer

We first consider $\gamma_R$ with one interferer. $\gamma_R$ can be determined from the weight equation [6, eq. (9)] as

$$\gamma_R = \mathbf{u}_d^{T} \mathbf{R}_{nn}^{-1} \mathbf{u}_d^{*} \tag{16}$$

where from (8)

$$\mathbf{R}_{nn} = \sigma^2 \mathbf{I} + E\left[\mathbf{u}_1^{*} \mathbf{u}_1^{T}\right]. \tag{17}$$

We note that with our fading model the components of $\mathbf{u}_d$ and $\mathbf{u}_1$ are complex Gaussian random variables that vary at the fading rate. Thus, to determine $\gamma_R$, the expected value in (17) must be averaged over the Rayleigh fading. It can also be seen that $\gamma_R$ will vary at the fading rate. The probability density function of $\gamma_R$ can be calculated to be given by [7]

$$p(\gamma_R) = \frac{e^{-\gamma_R/\Gamma_d} \left(\gamma_R/\Gamma_d\right)^{M-1} (1 + M\Gamma_1)}{\Gamma_d\,(M-2)!} \int_0^1 e^{-\left((\gamma_R/\Gamma_d) M \Gamma_1\right) t} (1-t)^{M-2}\, dt. \tag{18}$$

Thus, the cumulative distribution function is given by

$$P(\gamma_R) = \int_0^{\gamma_R/\Gamma_d} \frac{e^{-x} x^{M-1} (1 + M\Gamma_1)}{(M-2)!} \int_0^1 e^{-x M \Gamma_1 t} (1-t)^{M-2}\, dt\, dx. \tag{19}$$

In the above equations it is seen that $\gamma_R$ can be normalized by $\Gamma_d$. Therefore, from (15) we can also normalize $\gamma_R$ by $\Gamma$ and compare the performance of optimum combining to that of maximal ratio combining for fixed average received SINR. Fig. 2 shows the cumulative distribution function of $\gamma_R$ versus $\gamma_R/\Gamma$ with optimum combining for several values of $M$ and $\Gamma_1$. The $\Gamma_1 = 0$ distribution curve is also the distribution curve for maximal ratio combining. Fig. 2 shows that for fixed average received SINR the distribution function decreases as the interference power becomes a larger proportion of the total noise-plus-interference power. The decrease becomes even greater as $M$ increases. Thus, optimum combining improves the receiver's performance the most when the interferer's power is large compared to the thermal noise power and there are several antennas.

Fig. 2. The cumulative distribution function of $\gamma_R$ versus $\gamma_R/\Gamma$ for optimum combining when the desired and interfering signals are subject to fading. Results are shown for one interferer with several values of $M$ and $\Gamma_1$. The distribution function for $\gamma_R$ with fixed average received SINR is shown to decrease as the power of the interferer becomes a larger proportion of the total noise plus interference power. The decrease is even larger as $M$ increases.

As in [7] let us now consider the effect of a very high power interferer on optimum combining. Without interference (or with maximal ratio combining) $\Gamma_1 = 0$, and the cumulative distribution function is given by

$$P(\gamma_R) = \int_0^{\gamma_R/\Gamma_d} \frac{e^{-x} x^{M-1}}{(M-1)!}\, dx \tag{20}$$

or

$$P(\gamma_R) = 1 - e^{-\gamma_R/\Gamma_d} \sum_{k=1}^{M} \frac{\left(\gamma_R/\Gamma_d\right)^{k-1}}{(k-1)!} \tag{21}$$

which agrees with [1, p. 319]. With a high power interferer $\Gamma_1 = \infty$, and the cumulative distribution function is given by

$$P(\gamma_R) = \int_0^{\gamma_R/\Gamma_d} \frac{e^{-x} x^{M-2}}{(M-2)!}\, dx \tag{22}$$

or

$$P(\gamma_R) = 1 - e^{-\gamma_R/\Gamma_d} \sum_{k=1}^{M-1} \frac{\left(\gamma_R/\Gamma_d\right)^{k-1}}{(k-1)!}. \tag{23}$$

Thus, optimum combining with an infinitely strong interferer gives the same results as maximal ratio combining without the interferer and with one less antenna. In other words, optimum combining with a strong interferer and an additional antenna will always do better than maximal ratio combining without the interferer. For digital mobile radio this has the following impact. Let us consider a system where the required system performance can be achieved with maximal ratio combining at the base station receiver and adaptive retransmission (see Section V-B) for base-to-mobile transmission. Then, another mobile can be added per channel per cell by using optimum combining and adding one antenna at the base station. Thus, optimum combining provides a relatively simple means for growth in a system. The BER for coherent detection of PSK is given by

$$\text{BER} = \int_0^{\infty} p(\gamma_R)\, \tfrac{1}{2} \operatorname{erfc}\left(\sqrt{\gamma_R}\right) d\gamma_R. \tag{24}$$

From (18) and (24) the BER for optimum combining with one interferer can be calculated to be given by [7]

$$\text{BER} = \frac{(-1)^{M-1} (1 + M\Gamma_1)}{2 (M\Gamma_1)^{M-1}} \Bigg\{ -\frac{M\Gamma_1}{1 + M\Gamma_1} + \sqrt{\frac{\Gamma_d}{1 + \Gamma_d}} - \frac{1}{1 + M\Gamma_1} \sqrt{\frac{\Gamma_d}{1 + M\Gamma_1 + \Gamma_d}} - \sum_{k=1}^{M-2} (-M\Gamma_1)^k \Bigg[ 1 - \sqrt{\frac{\Gamma_d}{1 + \Gamma_d}} \Bigg( 1 + \sum_{i=1}^{k} \frac{(2i-1)!!}{i!\,(2 + 2\Gamma_d)^i} \Bigg) \Bigg] \Bigg\} \tag{25}$$

where

$$(2i-1)!! = 1 \cdot 3 \cdot 5 \cdots (2i-1). \tag{26}$$

Similarly, for maximal ratio combining the BER can be determined from (21) and (24) as [8]

$$\text{BER} = \frac{1}{2} \Bigg[ 1 - \sqrt{\frac{\Gamma}{1 + \Gamma}} \Bigg( 1 + \sum_{i=1}^{M-1} \frac{(2i-1)!!}{i!\,(2 + 2\Gamma)^i} \Bigg) \Bigg]. \tag{27}$$

Fig. 3. The average BER versus the average received SINR for optimum combining with one interferer. Results are shown for several values of $M$ and $\Gamma_1$ ($L = 1$; $\Gamma_1 = 0$, $\Gamma_1 = 1$ (0 dB), and $\Gamma_1 = 2$ (3 dB)). The improvement with optimum combining (in decibels) is shown to be nearly independent of the average received SINR for BER's less than $10^{-2}$.

Fig. 3 shows the BER versus the average received SINR ($\Gamma$) for optimum combining with one interferer. Results are shown for several values of $M$ and $\Gamma_1$. The results for $\Gamma_1 = 0$ are the same as those for maximal ratio combining. Let us define the optimum combining improvement as the decrease (in decibels) in the average received SINR required for a given BER as compared to maximal ratio combining. Fig. 3 shows that the improvement is nearly independent of $\Gamma$ for BER's less than $10^{-2}$. Thus, for most systems of interest, the improvement is independent of $\Gamma_d$ and depends only on $\Gamma_1$ and $M$. That is, the improvement depends on the interferer's power relative to the combined power of the other interferers and not the power of the desired signal.

Fig. 4. The improvement of optimum combining over maximal ratio combining versus $\Gamma_1$ with one interferer for several values of $M$ and a $10^{-3}$ BER. Results show that as $\Gamma_1$ and $M$ increase, the improvement becomes significant.

In Fig. 4 the optimum combining improvement is plotted versus $\Gamma_1$ for one interferer and several values of $M$. The results are shown for a BER of $10^{-3}$, but as discussed above, similar results can be obtained for other BER values (less than $10^{-2}$). Results show that as $\Gamma_1$ and $M$ increase, the improvement also increases. Fig. 4 also shows the maximum improvement that can be achieved if the interferer is completely nulled in the array output. The difference between the maximum and the actual improvement for a given $M$ ($M > 2$) is shown to be constant for large $\Gamma_1$. From (23) it can be seen that this difference is the increase in the required received SINR with the loss of one antenna with maximal ratio combining. For example, the required received SINR with maximal ratio combining is 2.3 dB for $M = 5$ and 4.0 dB for $M = 4$. Thus, for $M = 5$ the optimum combining improvement is 1.7 dB less than maximum for large $\Gamma_1$.
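The 1.7 dB figure in the example above can be checked numerically. The sketch below assumes the standard closed-form BER for coherent PSK with M-branch maximal ratio combining in independent Rayleigh fading (the result the text cites as (27)) and finds, by bisection, the average SINR per antenna needed for a 10^-3 BER. The function names and the bisection bounds are illustrative choices, not part of the paper.

```python
import math


def mrc_psk_ber(gamma_avg: float, m: int) -> float:
    """Average BER of coherent PSK with m-branch MRC in Rayleigh fading.

    gamma_avg is the average received SINR per antenna (linear, not dB).
    """
    mu = math.sqrt(gamma_avg / (1.0 + gamma_avg))
    total = 1.0
    term = 1.0
    for i in range(1, m):
        # Builds (2i-1)!! / (i! (2 + 2*Gamma)^i) from the previous term.
        term *= (2 * i - 1) / (i * (2.0 + 2.0 * gamma_avg))
        total += term
    return 0.5 * (1.0 - mu * total)


def required_sinr_db(target_ber: float, m: int) -> float:
    """Average SINR per antenna (dB) at which the BER equals target_ber."""
    lo, hi = -20.0, 40.0  # assumed search range in dB
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if mrc_psk_ber(10 ** (mid / 10), m) > target_ber:
            lo = mid  # BER still too high: more SINR is needed
        else:
            hi = mid
    return 0.5 * (lo + hi)


for antennas in (4, 5):
    print(antennas, "antennas:", round(required_sinr_db(1e-3, antennas), 2), "dB")
```

Under these assumptions the script gives roughly 4.0 dB for M = 4 and 2.3 dB for M = 5, so losing one antenna costs about 1.7 dB.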

B. Simulation Results with Multiple Interferers

With two or more interferers, it is extremely difficult to determine analytically the optimum combiner performance. Therefore, in this section we use Monte Carlo simulation to determine the performance of optimum combining. For the simulation we consider $\gamma$ rather than $\gamma_R$ as in the previous section. For a given bit duration, the array output SINR is given by

$$\gamma = \frac{P_d}{P_{i+n}} \tag{28}$$

where $P_d$ is the power of the desired signal and $P_{i+n}$ is the power of the interference plus noise. The desired signal power is given by

$$P_d = \left| \mathbf{w}^{\dagger} \mathbf{u}_d^{*} \right|^2 \tag{29}$$

where the superscript $\dagger$ denotes complex conjugate transpose. The power of the interference plus noise is given by

$$P_{i+n} = \Big| \mathbf{w}^{\dagger} \Big( \mathbf{u}_n + \sum_{j=1}^{L} \mathbf{u}_j \Big)^{*} \Big|^2 \tag{30}$$

where $\mathbf{u}_n$ is the vector of the thermal noise voltages at the receiver. The components of $\mathbf{u}_n$ are independent complex Gaussian random variables with a variance corresponding to the noise power. The array output SINR is then given by

$$\gamma = \frac{\left| \mathbf{w}^{\dagger} \mathbf{u}_d^{*} \right|^2}{\Big| \mathbf{w}^{\dagger} \Big( \mathbf{u}_n + \sum_{j=1}^{L} \mathbf{u}_j \Big)^{*} \Big|^2}. \tag{31}$$

The weight vector for the optimum combiner is given in (9). As can be seen from (8), $\mathbf{R}_{nn}$ is Hermitian, and therefore, from (9)

$$\mathbf{w}^{\dagger} = \alpha\, \mathbf{u}_d^{T} \mathbf{R}_{nn}^{-1}. \tag{32}$$

Thus, the SINR can be expressed as

$$\gamma = \frac{\left| \mathbf{u}_d^{T} \mathbf{R}_{nn}^{-1} \mathbf{u}_d^{*} \right|^2}{\Big| \mathbf{u}_d^{T} \mathbf{R}_{nn}^{-1} \Big( \mathbf{u}_n + \sum_{j=1}^{L} \mathbf{u}_j \Big)^{*} \Big|^2}. \tag{33}$$

With Rayleigh fading, the components of $\mathbf{u}_d$, $\mathbf{u}_n$, and $\mathbf{u}_j$ are complex Gaussian random variables with zero mean and variance $\Gamma_d$, $\sigma^2$, and $\Gamma_j$, respectively. Therefore, through Monte Carlo simulation, the probability distribution of $\gamma$ can be determined.

We now discuss the distribution of $\gamma$ with maximal ratio combining so that a comparison to optimum combining can be made. For maximal ratio combining, the weights are given by

$$\mathbf{w} = \mathbf{u}_d^{*}. \tag{34}$$

Therefore, from (31) the SINR is given by

$$\gamma = \frac{\left| \mathbf{u}_d^{T} \mathbf{u}_d^{*} \right|^2}{\Big| \mathbf{u}_d^{T} \Big( \mathbf{u}_n + \sum_{j=1}^{L} \mathbf{u}_j \Big)^{*} \Big|^2}. \tag{35}$$

[Equation (36) is illegible in the source.]

Since the components of $\mathbf{u}_n + \sum_{j=1}^{L} \mathbf{u}_j$ are the sum of independent complex Gaussian random variables, the components are also independent complex Gaussian random variables. Thus, the probability density function of $\gamma$ can be determined analytically to be given by [1, p. 367]

$$p(\gamma) = \frac{M \left( \gamma/\Gamma \right)^{M-1}}{\Gamma \left( 1 + \gamma/\Gamma \right)^{M+1}} \tag{37}$$

and the cumulative distribution function is given by

$$P(\gamma) = \left( \frac{\gamma/\Gamma}{1 + \gamma/\Gamma} \right)^{M}. \tag{38}$$

The cumulative distribution of $\gamma$ is plotted versus $\gamma/\Gamma$ in Fig. 5. Simulation results with 100 000 samples are shown for optimum combining with two interferers that have 3 dB higher power than the noise, and analytical results are shown for maximal ratio combining. With 100 000 samples there are small deviations in the simulation results only for very small values of the distribution function. Fig. 5 shows that optimum combining significantly decreases the value of the distribution function as compared to maximal ratio combining. This decrease becomes even greater as $M$ increases.

The BER can be determined from the cumulative distribution function by the equation

$$\text{BER} = \frac{1}{\pi} \int_0^1 \cos^{-1}\left( \sqrt{\gamma} \right) p(\gamma)\, d\gamma \tag{39}$$

or

$$\text{BER} = \frac{1}{\pi} \int_0^1 \cos^{-1}\left( \sqrt{\gamma} \right) \left( \frac{dP(\gamma)}{d\gamma} \right) d\gamma. \tag{40}$$

Thus, the BER for optimum combining was determined from the simulation results using the above equation. Since the cumulative distribution function can be determined for $\gamma$ normalized by $\Gamma$, from one simulation run we can determine the BER over a wide range of $\Gamma$'s. Similarly, the BER for maximal ratio combining is seen from (37) and (39) to be given by

$$\text{BER} = \frac{1}{\pi} \int_0^{1/\Gamma} \cos^{-1}\left( \sqrt{\Gamma x} \right) \frac{M x^{M-1}}{(1 + x)^{M+1}}\, dx \tag{41}$$

which is numerically equivalent to (27).

For one interferer, the BER results obtained using the above equations (with a simulation using 100 000 samples) agree with the analytical results shown in Fig. 3.
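The statement that the arccosine integral for maximal ratio combining matches the closed-form BER can be checked numerically. The sketch below is illustrative only: it assumes the closed form $\frac{1}{2}[1 - \sqrt{\Gamma/(1+\Gamma)}(1 + \sum_i \ldots)]$ for (27) and the distribution $(x/(1+x))^M$ of $x = \gamma/\Gamma$ for (38). It evaluates the integral by the midpoint rule after the substitution $\Gamma x = \cos^2\phi$, and it also mimics the "one simulation run, many values of $\Gamma$" idea by drawing normalized SINR samples once. All function names are hypothetical.

```python
import math
import random


def mrc_ber_closed_form(gamma_avg, m):
    """Closed-form BER for coherent PSK with m-branch MRC (assumed form of (27))."""
    mu = math.sqrt(gamma_avg / (1.0 + gamma_avg))
    total, term = 1.0, 1.0
    for i in range(1, m):
        term *= (2 * i - 1) / (i * (2.0 + 2.0 * gamma_avg))
        total += term
    return 0.5 * (1.0 - mu * total)


def mrc_ber_integral(gamma_avg, m, steps=4000):
    """Midpoint-rule evaluation of the arccosine integral for MRC.

    With Gamma*x = cos(phi)^2 the integrand becomes smooth on [0, pi/2]:
    acos(sqrt(Gamma*x)) = phi and dx = sin(2*phi)/Gamma dphi.
    """
    h = (math.pi / 2) / steps
    total = 0.0
    for n in range(steps):
        phi = (n + 0.5) * h
        x = math.cos(phi) ** 2 / gamma_avg
        pdf = m * x ** (m - 1) / (1.0 + x) ** (m + 1)
        total += phi * pdf * math.sin(2 * phi) / gamma_avg
    return total * h / math.pi


def sample_normalized_sinr(m, samples, seed=1):
    """Draw x = gamma/Gamma by inverting the CDF (x / (1 + x))^m."""
    rng = random.Random(seed)
    out = []
    for _ in range(samples):
        v = rng.random() ** (1.0 / m)
        out.append(v / (1.0 - v) if v < 1.0 else math.inf)
    return out


def ber_from_samples(xs, gamma_avg):
    """Average (1/pi) acos(sqrt(gamma)) over samples with gamma < 1."""
    total = 0.0
    for x in xs:
        g = gamma_avg * x
        if g < 1.0:
            total += math.acos(math.sqrt(g)) / math.pi
    return total / len(xs)


xs = sample_normalized_sinr(m=2, samples=50000)  # one run, reused below
for gamma_db in (0, 3, 6):
    g = 10 ** (gamma_db / 10)
    print(gamma_db, mrc_ber_closed_form(g, 2), mrc_ber_integral(g, 2),
          ber_from_samples(xs, g))
```

The integral and the closed form should agree to several digits, while the sample-based estimate fluctuates by a few percent at this sample size.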

For two interferers, the BER results are shown in Fig. 6. The simulation used 100 000 samples per data point. Fig. 6 shows that there is a marked improvement with optimum combining as the number of antennas increases. For example, for a BER of $10^{-3}$ and $M$ equal to 5, optimum combining requires 4.2 dB less SINR than maximal ratio combining. Thus, in this case, optimum combining with five antennas (which requires -1.9 dB for a $10^{-3}$ BER) is better than maximal ratio combining with nine antennas (which, from (27), requires -1.7 dB for a $10^{-3}$ BER).

Fig. 5. The cumulative distribution function of $\gamma$ versus $\gamma/\Gamma$ for optimum combining with two interferers that have 3 dB higher power than the noise. Analytical results for maximal ratio combining ($\Gamma_1 = \Gamma_2 = 0$) are also shown. Optimum combining is seen to significantly decrease the distribution function.

Fig. 6. The average BER versus the average received SINR for optimum combining with two interferers. Results are shown for optimum combining with two equal power interferers ($\Gamma_1 = \Gamma_2 = 2$) and for maximal ratio combining ($\Gamma_1 = \Gamma_2 = 0$). There is a marked improvement with optimum combining as the number of antennas increases.

Fig. 7 shows the optimum combiner improvement over maximal ratio combining versus the signal-to-noise ratio of one interferer when there is also a second interferer with a 3 dB signal-to-noise ratio. The simulation used 100 000 samples per data point. The results are shown for a $10^{-3}$ BER, but as seen in Fig. 6, these results are similar to the results for other BER's less than $10^{-2}$. Fig. 7 also shows the maximum improvement possible if both interfering signals are completely nulled in the receiver output (i.e., the difference between the maximal ratio combiner performance with and without interference). The improvement is within about 2 dB of the maximum with six or more antennas.

Fig. 7. The improvement of optimum combining over maximal ratio combining versus the signal-to-noise ratio of one interferer when there is also a second interferer with a 3 dB signal-to-noise ratio. The improvement is within about 2 dB of the maximum improvement with six or more antennas.

Fig. 8 shows the improvement versus the number of antennas with one to six equal power ($\Gamma_j = 3$ dB) interferers. Again, 100 000 samples per data point were used. The improvement is shown to be between 1 and 6 dB as $M$ varies from 2 to 8. Thus, optimum combining has some improvement over maximal ratio combining even with a few antennas, and the improvement greatly increases with the number of antennas.

Although the results of Fig. 8 are for equal power interferers with a particular value of $\Gamma_j$, they demonstrate the following characteristics of optimum combining that apply to other interference cases as well. First, when the number of antennas is much greater than the number of interferers, the improvement is limited. That is, in this case there is little improvement (relative to maximal ratio combining) with additional antennas. This can also be seen from Figs. 4 and 7. Second, except for the above case, the increase in the improvement (in decibels) with each additional antenna is approximately constant (about 0.6 dB for $\Gamma_j = 3$ dB). Finally, the most interesting characteristic is that there is a large improvement even when the number of interferers is greater than the number of antennas. This implies that in analyzing systems we must consider many interferers individually even if there are only a few antennas.

For example, consider the case of five antennas with six interferers, each with $\Gamma_j$ equal to 3 dB. From Fig. 8, the improvement is 2.7 dB. However, if only five interferers are considered individually, and the power of the sixth one is combined with the thermal noise, $\Gamma_j$ is -1.8 dB and the improvement is only 1.6 dB. Thus, we must consider individually as many interferers as possible to determine accurately the actual optimum combining improvement.
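The -1.8 dB value in the six-interferer example follows from renormalizing the interferer powers when one interferer is folded into the thermal noise. The sketch below works through that arithmetic; it assumes that 3 dB corresponds to a linear ratio of 2 (as in the figure legends) and uses hypothetical helper names.

```python
import math


def to_db(x):
    return 10 * math.log10(x)


def relump(gammas, keep):
    """Keep the `keep` strongest interferers; fold the rest into the noise.

    gammas are interferer-to-noise power ratios (linear). The returned
    ratios are renormalized to the enlarged noise power.
    """
    ordered = sorted(gammas, reverse=True)
    lumped_noise = 1.0 + sum(ordered[keep:])  # thermal noise is 1 by definition
    return [g / lumped_noise for g in ordered[:keep]]


def average_sinr(gamma_d, gammas):
    """Average received SINR per antenna: Gamma_d / (1 + sum of Gamma_j)."""
    return gamma_d / (1.0 + sum(gammas))


six = [2.0] * 6                      # six interferers, each 3 dB above the noise
five = relump(six, keep=5)
print("Gamma_j after lumping one interferer:", round(to_db(five[0]), 2), "dB")

# The average received SINR is unchanged by the regrouping, provided the
# desired-signal ratio is renormalized to the same enlarged noise power.
gamma_d = 4.0                        # arbitrary illustrative value
print(average_sinr(gamma_d, six), average_sinr(gamma_d / 3.0, five))
```

The first line printed is about -1.8 dB, matching the text. The second shows that lumping changes only how the interference is grouped, not the average received SINR.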

IV. PERFORMANCE IN TYPICAL SYSTEMS

This section studies the performance of optimum combining in typical cellular mobile radio systems. Using the techniques of Section III, we study optimum combining when the signals are subject to Rayleigh fading.⁵ Optimum combining is studied only at the base station receiver because multiple antennas and the associated signal processing for optimum combining are less costly to implement at the base station than on numerous mobiles. (Adaptive retransmission with time division [1], [9] can be used to improve reception at the mobile with multiple base station antennas only (see Section V-B).) As before, all results for optimum combining are compared to maximal ratio combining.

Analysis of optimum combining with numerous interferers requires a substantial amount of computer time. It is therefore nearly impossible to determine the average performance of the adaptive array in the typical cellular system with random mobile locations. Therefore, in this section we consider a worst case scenario only, i.e., the mobile transmitting the desired signal is at the point in the cell farthest from the base station, and the interfering mobiles in the surrounding cells are as close as possible to the base station of the desired mobile. Furthermore, in the analysis we consider only the six strongest interferers individually. The power of the other interferers is combined and considered as thermal noise.

The systems studied involve two different cell geometries with hexagonal cells. In one geometry the base stations are located at the cell center, and in the other geometry the base stations are at the three alternate corners of the cell and are equipped with sectoral horns. In the latter geometry, each of the base station's three antennas has a 120° beamwidth and serves the three adjoining cells. We also consider both frequency reuse in every cell and the use of three channel sets. Furthermore, because in the typical system the signal strength falls with the inverse of the distance raised to between the third and fourth power, we also consider these two extremes.⁶

The performance of optimum combining and maximal ratio combining in typical mobile radio systems is shown in Table I. For each of the systems described above Table I lists the number of antennas required to achieve a $10^{-3}$ BER and the average output SINR margin. We also show the margin with an additional antenna.

The results show that with three-corner base station geometry and frequency reuse in every cell, optimum combining more than halves the required number of antennas. Furthermore, the increase in margin with an additional antenna is much greater. With the same geometry and three channel sets, even though only a few antennas are required with maximal ratio combining, optimum combining increases the margin by 2-3 dB. With centrally located base stations and frequency reuse in every cell, optimum combining substantially reduces the number of antennas. As few as 11 antennas are required with optimum combining as compared to more than 50 with maximal ratio combining. Finally, with three channel sets, optimum combining requires one less antenna and has higher margins. Thus, the improvement with optimum combining is the largest in systems where a large number of antennas is required because of low received SINR. However, even with high SINR and few antennas, the improvement is 2 dB or more. Therefore, the results for typical cellular systems agree with those of Section III (i.e., Fig. 8).

In an actual system we would expect the optimum combining improvement to be even greater than that shown in Table I because of the following three reasons. First, all the channels in all the cells may not always be occupied. Thus, the total interference power will be less, and the power of the strongest interferers (when transmitting) relative to the power of the sum of the other interferers, $\Gamma_j$, will be higher. As shown in Section III, as $\Gamma_j$ increases, the optimum combining improvement increases. Second, with random mobile locations rather than the worst case, the total interference power will be lower. Thus, $\Gamma_j$ for the strongest interferers (those closest to the desired mobile's base station) will be higher, and therefore, so will the improvement. Third, for the results in Table I only the six strongest interferers were considered individually, and thus the results are somewhat pessimistic.

Finally, we note that in actual systems the fading can be non-Rayleigh with direct paths existing between an interfering mobile and a base station (i.e., the fading might not be independent at each antenna). Under these conditions, the performance of maximal ratio combining can be significantly degraded while optimum combining can still achieve the maximum output SINR.

⁵ In an actual mobile radio system, the signals are also subject to shadow fading [9] which greatly complicates analysis. We therefore only consider Rayleigh fading so that system comparisons can easily be made.

⁶ The calculation of the power of the signals in these cellular systems will not be described here. The method is similar to that described in [10].

[Fig. 8. Plot not reproduced; legible labels: BER = $10^{-3}$, $\Gamma_j = 2$ (3 dB) for $j = 1, \ldots, L$, horizontal axis $M$.]

TABLE I
COMPARISON OF OPTIMUM AND MAXIMAL RATIO COMBINING IN TYPICAL MOBILE RADIO SYSTEMS: THE NUMBER OF ANTENNAS REQUIRED AND THE SINR MARGIN FOR A $10^{-3}$ BER

Column headings: Base Station Geometry (3-corner; centrally located) | Frequency Reuse (1; 3) | Decay Exponent (3; 4) | Maximal Ratio Combining: Number of Antennas, SINR Margin^a (dB) | Optimum Combining: Number of Antennas, SINR Margin^a (dB)

Table entries as extracted (row and column alignment lost in the source):
14 15 / 0.5 / 1.1 / 6 7 / 0.0 1.7 / 12 13 3 4 2 3 II 69 / 0.1 0.7 0.3 2.9 1.0 5.5 0.0 1.0 / 5 6 3 4 1 3 17 II / 1.9 4.5 2.6 5.1 3.0 / 55 69 7 I 5 6 / 0.0 1.0 0.3 1.2 1.6 2.9 / II 12 5 6 4 5 / 0.5 / 1.6 / 0.4 0.9 / 1.1 / 0.1 2.4 2.7 5.3

^a Margins are accurate to within a few tenths of a decibel and were determined from simulation results using 100 000 samples.

V. IMPLEMENTATION

In this section we discuss the implementation of optimum combining in mobile radio. We consider the use of an LMS [3] adaptive array at the base station receiver and adaptive retransmission with time division for base-to-mobile transmission. For the LMS adaptive array, we discuss the dynamic range, reference signal generation, and modulation technique.
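As a rough illustration of the LMS adaptive array considered in this section, the sketch below runs the complex LMS weight update on a toy narrow-band scenario. Everything specific in it is an assumption made for illustration: four antennas, fixed (non-fading) channel vectors, BPSK symbols, an ideal reference signal equal to the desired data, and the step size `mu`. It is not the receiver analyzed in the paper, only a minimal instance of the technique.

```python
import cmath
import math
import random

M = 4                 # number of antennas (assumed)
INTERFERER_AMP = 2.0  # interferer amplitude relative to the desired signal (assumed)
NOISE_VAR = 0.1       # thermal noise power per antenna (assumed)

# Hypothetical fixed channel vectors; in the paper these fade independently.
A_D = [1 + 0j] * M
A_I = [cmath.exp(1j * math.pi / 2 * n) for n in range(M)]


def inner(w, x):
    """Array output w^H x."""
    return sum(wn.conjugate() * xn for wn, xn in zip(w, x))


def run_lms(steps=3000, mu=0.01, seed=0):
    rng = random.Random(seed)
    sigma = math.sqrt(NOISE_VAR / 2)   # std of each real noise component
    w = [0j] * M
    sq_errors = []
    for _ in range(steps):
        d = rng.choice((-1.0, 1.0))    # desired BPSK symbol
        i = rng.choice((-1.0, 1.0))    # interfering BPSK symbol
        x = [A_D[n] * d + INTERFERER_AMP * A_I[n] * i
             + complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
             for n in range(M)]
        e = d - inner(w, x)            # error = reference minus array output
        # LMS step: move the weights along x * conj(e) to reduce |e|^2.
        w = [wn + mu * xn * e.conjugate() for wn, xn in zip(w, x)]
        sq_errors.append(abs(e) ** 2)
    return w, sq_errors


def output_sinr(w):
    signal = abs(inner(w, A_D)) ** 2
    interference = (INTERFERER_AMP * abs(inner(w, A_I))) ** 2
    noise = NOISE_VAR * sum(abs(wn) ** 2 for wn in w)
    return signal / (interference + noise)


weights, sq_errors = run_lms()
print("mean squared error over last 500 steps:", sum(sq_errors[-500:]) / 500)
print("output SINR (dB):", 10 * math.log10(output_sinr(weights)))
```

In this toy setting the error power settles to a small value and the converged weights pass the desired signal while nearly cancelling the interferer, even though the interferer is 6 dB stronger than the desired signal at each antenna.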

A. The LMS Adaptive Array

1) Description: Of the various adaptive array techniques [2]-[4] that can be used in mobile radio, the LMS technique appears to be the most practical one because it is not too complex to implement and it does not require that the desired signal phase difference between antennas be known a priori at the receiver. Fig. 9 shows a block diagram of an $M$ element LMS adaptive array. It is similar to the optimum combiner of Fig. 1 except for the addition of a reference signal $r(t)$ and an error signal $e(t)$. As shown in Fig. 9, the array output is subtracted from a reference signal (described below) $r(t)$ to form the error signal $e(t)$. The element weights are generated from the error signal and the $x_{I_i}(t)$ and $x_{Q_i}(t)$ signals by using the LMS algorithm which minimizes the power of the error signal. The reference signal is used by the array to distinguish between the desired and interfering signals at the receiver. It must be correlated with the desired signal and uncorrelated with any interference. Under these conditions the minimization of the power of the error signal suppresses interfering signals and enhances the desired signal in the array output. Generation of the reference signal in digital mobile radio systems is described in Section V-A3).

Fig. 9. Block diagram of an $M$ element LMS adaptive array.

We now consider the weight equation for the LMS adaptive array in a mobile radio system. In the typical system the bit rate is 32 kbits/s, and the carrier frequency is about 840 MHz. With the signal bandwidth 1.5 times the bit rate, the relative bandwidth of the mobile radio channel is only 0.006 percent, and we can consider the signal as narrow band. For narrow-band signals, the weight equation for the LMS array is given by [6, eq. (9)], i.e., the LMS adaptive array maximizes the output SINR. However, these are the steady state weights, and in mobile radio the signal environment is continuously changing. Therefore, we must consider the transient performance of the array. That is, because the weights are constantly changing, the performance will be degraded somewhat from that of the optimum combiner. (Analysis of the transient performance is not considered in this paper.) Also, we must consider the dynamic range of the LMS adaptive array.

2) Dynamic Range: One limitation of the LMS adaptive array technique is the dynamic range over which it can operate. In an LMS adaptive array, the speed of response of the weights is proportional to the strength of the signals at the array input. For the array to operate properly, the weights must change fast enough to track the fading of the desired and interfering signals. However, the weights must also change much more slowly than the data rate so that the data modulation is not altered. It has been shown [11] that for PSK signals the maximum rate of change in the weights without significant data distortion is about 0.2 times the data rate. For the typical mobile radio system, the maximum fading rate is about 70 Hz (for a carrier frequency of 840 MHz and a vehicle speed of 55 mi/h), and the code rate is 32 kbits/s. Thus, the permissible range in signal power at the array input is given by

$$\text{Dynamic range} = \frac{0.2 \times 32 \times 10^{3}}{70} \approx 20\ \text{dB}. \tag{42}$$
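The figures quoted in this subsection (a fading rate of about 70 Hz, a relative bandwidth of 0.006 percent, and a dynamic range of about 20 dB) can be reproduced with a few lines of arithmetic. The sketch below assumes the maximum fading rate is the maximum Doppler shift $v f_c / c$; the function names are illustrative.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0   # m/s
MPH_TO_MPS = 0.44704             # metres per second in one mile per hour


def max_doppler_hz(carrier_hz, speed_mph):
    """Maximum Doppler shift, taken here as the maximum fading rate."""
    return speed_mph * MPH_TO_MPS * carrier_hz / SPEED_OF_LIGHT


def relative_bandwidth_percent(bit_rate, carrier_hz, bandwidth_factor=1.5):
    """Signal bandwidth (bandwidth_factor times the bit rate) over the carrier."""
    return 100.0 * bandwidth_factor * bit_rate / carrier_hz


def dynamic_range_db(bit_rate, fading_rate_hz, max_weight_rate_fraction=0.2):
    """Ratio of the fastest allowed weight rate to the fading rate, in dB."""
    return 10 * math.log10(max_weight_rate_fraction * bit_rate / fading_rate_hz)


carrier = 840e6
bit_rate = 32e3
print("fading rate (Hz):", round(max_doppler_hz(carrier, 55), 1))
print("relative bandwidth (%):", round(relative_bandwidth_percent(bit_rate, carrier), 4))
print("dynamic range (dB):", round(dynamic_range_db(bit_rate, 70.0), 1))
```

With these inputs the Doppler shift comes out just under 70 Hz and the dynamic range just under 20 dB, consistent with the rounded values in the text.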

The received signals in a mobile radio system vary by more than 20 dB, however, and therefore automatic transmitter power control (which could add significantly to the cost of the mobile radio) is required to control the power of the strongest signals at the receiver. With this power reasonably fixed, the dynamic range determines the power ratio of the strongest to the weakest received signal that the array can track. A 20 dB dynamic range is certainly not large, but it is more than adequate for mobile radio for the reasons described below. In the mobile radio systems studied in this paper (see Section IV), the average received SINR at each antenna is relatively small. This is because an adaptive array is not needed when the received SINR is large. For example, for maximal ratio combining with two antennas, an average received SINR at each antenna of 11 dB [1] is required for coherent detection of PSK with a 10- 3 BER. For optimum combining the required SINR is less with two antennas and, of course, even lower with more antennas. Thus, the received SINR is much less than 20 dB for all cases of interest. (It is typically between - 5 and 5 dB.) A small received SINR affects array operation as follows. First, if the power of an interfering signal is more than 20 dB below the desired signal's power at an antenna, the array need not track the interfering signal at that antenna because it has a negligible effect on the output SINR. Second, if the power of an interfering signal is more than 20 dB higher than the desired signal's power at an

antenna, the array need not track the desired signal at that antenna because the resulting weight for the antenna will be almost zero. Thus, because the received SINR is small in the systems where the LMS adaptive array is practical, a 20 dB dynamic range is adequate. Note that if the received SINR is large' (e.g., greater than 20 dB, as in a lightly loaded system), the LMS adaptive array will have the same performance as maximal ratio combining. 3) Reference Signal Generation and Modulation Technique: The LMS adaptive array must be able to distinguish between the desired signal and any interfering signals. This is accomplished through the use of a reference signal as discussed in Section V-AI). The reference signal must be correlated with the desired signal and uncorrelated with any interference. A reference signal generation technique that allows for signal discrimination is described in [12] and involves the use of pseudonoise codes with spread-spectrum techniques. To generate the spread-spectrum signal the pseudonoise code symbols, generated from a maximal length feedback shift register, are mixed with lower speed voice (data) bits, and the resulting bits are used to generate a PSK signal. The code modulation frequency is an integer multiple of the voice bit rate, and this multiple is defined as the spreading ratio k. The reference signal is generated from the biphase spread-spectrum signal using the loop shown in Fig. 10. The array output is first mixed with a locally generated signal modulated by the pseudonoise code. When the codes of the locally generated signal and the desired signal in the array output are synchronized, the desired signal's spectrum is collapsed to the data bandwidth. The mixer output is then passed through a filter with this bandwidth. The biphase desired signal is therefore unchanged by the filter. The filter output is then hard limited so that the reference signal will have constant amplitude. 
The hard-limiter output is mixed with the locally generated signal to produce a biphase reference signal. The reference signal is therefore an amplitude scaled replica of the desired signal. Any interference signal without the proper code has its waveform drastically altered by the reference loop. When the coded locally generated signal is mixed with the interference, the interference spectrum is spread by the code bandwidth. The bandpass filter further changes the interference component out of the mixer. As a result, the interference at the array output is uncorrelated with the reference signal. Thus, with spread spectrum, a reference signal is continuously generated that is correlated with the desired signal and uncorrelated with any interference. Furthermore, since pseudonoise codes are used, every mobile can be distinguished by a unique code. Unfortunately, spread spectrum increases the biphase signal bandwidth by a factor of k and therefore increases both the total cochannel interference power and the number of interferers in cellular mobile radio. For example, with frequency reuse in every cell, the cochannel interference power and the number of interferers from surrounding cells are increased by factors of k and 2k -1,

Fig. 10. Reference signal generation loop with the adaptive array. When the desired signal is a biphase spread-spectrum signal, the reference signal is correlated with it but not with any interference.

respectively. This increase in interference power is canceled by the processing gain of spread spectrum, but the increased number of interferers degrades the performance of the LMS adaptive array. Furthermore, 2(k -1) cochannel interferers are now present within the desired mobile's cell. Thus, even with a small spreading ratio (e.g., 5 or less) the performance of the LMS adaptive array with the biphase spread-spectrum signal can be worse than that of maximal ratio combining, making the LMS system impractical. The bandwidth increase with spread spectrum and its associated problems can be overcome in the following way. The biphase spread-spectrum signal is combined with an orthogonal biphase signal modulated by the voice bits only (see [13]). The data modulation rate of the orthogonal biphase signal is the same as the code modulation rate of the biphase spread-spectrum signal. The resulting fourphase signal therefore has a bandwidth determined by the data rate only, i.e., the bandwidth is not increased by the spreading ratio. Furthermore, a reference signal for the four-phase signal can be generated from its biphase spread-spectrum signal component using the loop described earlier. As shown in [14], the performance of the LMS adaptive array with the four-phase signal is close to that with the biphase signal. Therefore, with this system, we can generate a reference signal without any increase in interference power or the number of interferers and achieve an improvement with an LMS adaptive array close to that for optimum combining which is shown in Sections III and IV. We now describe the modulation technique in detail by describing three possible ways to modulate the four-phase signal. The simplest technique is for the voice bits to modulate only the orthogonal biphase signal. The biphase spread-spectrum signal then contains the code plus data bits for transferring information from the mobile to the base station. 
With this first technique, the signal bandwidth corresponds to the voice bit rate r (e.g., 32 kbitsz's). However, the energy-per-bit-to-noise (interference) density ratio Eb / No is half that of a biphase signal. Thus, the improvement with an LMS adaptive array is 3 dB less than that shown in Sections III and IV. A data channel is also available, however, with an r/ k data rate. Furthermore, since the Eb / No for the data bits is k times that for the voice bits (because of the spread spectrum), the BER for the data bits is very low. If a data channel is not required, then voice bits can

replace the data bits. With this second technique, the voice bits are split into two channels, one modulating the biphase spread-spectrum signal and the other modulating the orthogonal biphase signal. The bit rate for the latter channel is $k$ times that for the biphase spread-spectrum signal. The signal bandwidth is reduced by $k/(k+1)$ as compared to the first technique. However, the $E_b/N_0$ of the voice bits on the biphase spread-spectrum signal is $k$ times that on the orthogonal biphase signal. Through appropriate coding techniques, this difference can be used to improve the overall BER.

We can equalize the BER for both channels by decreasing the power of the biphase spread-spectrum signal by $1/k$. With this third technique the $E_b/N_0$ for the voice bits is just $k/(k+1)$ times that for a biphase signal. For example, with $k$ equal to 5, the improvement with an LMS adaptive array is 0.8 dB less than that shown in Sections III and IV. Table II summarizes the above results for the three modulation techniques.

A block diagram of the four-phase signal generation circuitry for the three modulation techniques is shown in Fig. 11. The code symbols of duration $\Delta$ are mixed with either voice or data bits of duration $k\Delta$. The resulting symbols modulate a local oscillator to generate a biphase spread-spectrum signal. As shown in the lower portion of Fig. 11, voice bits, also of duration $\Delta$, modulate the local oscillator signal shifted by 90° to generate the orthogonal biphase signal. This signal is then combined with the biphase spread-spectrum signal to obtain the four-phase signal. By adjusting the biphase spread-spectrum signal level with $\beta$ and modulating this signal with either voice or data bits, we can generate any of the three four-phase signals listed in Table II.

TABLE II
FOUR-PHASE SIGNAL PARAMETERS FOR THREE MODULATION TECHNIQUES IN AN LMS ADAPTIVE ARRAY SYSTEM

            Relative         Spread-Spectrum Biphase Signal          Orthogonal Biphase Signal
Technique   Biphase Signal   Information   Bit          E_b/N_0 (b)  Information   Bit          E_b/N_0 (b)
No.         Powers           Bits          Rate (a)                  Bits          Rate
1           1:1              Data          r/k          k/2          Voice         r            0.5
2           1:1              Voice         r/(k+1)      k/2          Voice         kr/(k+1)     0.5
3           1/k:1            Voice         r/(k+1)      k/(k+1)      Voice         kr/(k+1)     k/(k+1)

(a) The code modulation rate is k times the bit rate.
(b) Relative to biphase signals.
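The relative $E_b/N_0$ entries of Table II can be converted into the dB penalties quoted in the text. The following is a minimal sketch of that arithmetic; the function names `relative_eb_n0` and `loss_db` are illustrative and do not come from the paper.

```python
import math

def relative_eb_n0(technique, k):
    """Return (spread-spectrum channel, orthogonal channel) E_b/N_0,
    relative to a biphase signal, for the three techniques of Table II."""
    if technique in (1, 2):
        return (k / 2, 0.5)
    if technique == 3:
        return (k / (k + 1), k / (k + 1))
    raise ValueError("technique must be 1, 2, or 3")

def loss_db(factor):
    """Loss in dB for a relative E_b/N_0 factor (positive means worse than biphase)."""
    return -10 * math.log10(factor)

k = 5  # spreading ratio used in the text's example
# Loss on the orthogonal (voice) channel for each technique.
losses = {t: loss_db(relative_eb_n0(t, k)[1]) for t in (1, 2, 3)}
for t, loss in losses.items():
    print(f"technique {t}: orthogonal-channel loss {loss:.1f} dB")
```

With $k = 5$ this gives about 3 dB for the first two techniques and about 0.8 dB for the third, matching the figures stated above.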

[Fig. 11 block diagram. Inputs: CODE; DATA OR VOICE BITS (duration $k\Delta$); VOICE BITS (duration $\Delta$). Level adjustment: $\beta$. Output: FOUR-PHASE SIGNAL.]

Fig. 11. Block diagram of the four-phase signal generation circuitry for the LMS adaptive array. A biphase spread-spectrum signal, modulated by code symbols plus data or voice bits, is combined with an orthogonal biphase signal, modulated by voice bits, to generate the four-phase signal.
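The signal generation of Fig. 11 can be illustrated at complex baseband. This is only a sketch under stated assumptions: the in-phase component stands for the biphase spread-spectrum signal, the quadrature component for the orthogonal biphase signal, all symbols are ±1, and the names `four_phase_baseband` and `despread` are hypothetical rather than taken from the paper.

```python
import random

def four_phase_baseband(code, slow_bits, voice_bits, k, beta=1.0):
    """One complex sample per code symbol (duration Delta).
    Real part: code times a slow bit of duration k*Delta, scaled by beta.
    Imaginary part: a voice bit of duration Delta (90-degree-shifted carrier)."""
    assert len(code) == len(voice_bits) == k * len(slow_bits)
    samples = []
    for i, (c, v) in enumerate(zip(code, voice_bits)):
        d = slow_bits[i // k]  # the same slow bit spans k code symbols
        samples.append(complex(beta * c * d, v))
    return samples

def despread(samples, code, k):
    """Correlate the in-phase component with the code over each block of k symbols."""
    return [
        sum(samples[b * k + i].real * code[b * k + i] for i in range(k))
        for b in range(len(samples) // k)
    ]

random.seed(0)
k = 5
slow_bits = [1, -1, -1, 1]
n = k * len(slow_bits)
code = [random.choice((-1, 1)) for _ in range(n)]
voice = [random.choice((-1, 1)) for _ in range(n)]

signal = four_phase_baseband(code, slow_bits, voice, k)
recovered = despread(signal, code, k)  # each entry is k * beta * (slow bit)
```

Correlating with the code collects the energy of $k$ code symbols into each slow bit, which is the factor-of-$k$ advantage in $E_b/N_0$ of the spread-spectrum channel, while the quadrature bits pass through at the code symbol rate.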

B. Base-to-Mobile Transmission

As we have shown, the LMS technique can significantly improve signal reception at the base station. This improvement is, of course, also desired at the mobile. However, since there are many more mobiles than base stations, it is economically desirable to add the complexity of the LMS technique (particularly multiple antennas) only to the base stations. Adaptive retransmission with time division [1], [9] can be used to improve reception at the mobile with multiple base station antennas only. With adaptive retransmission, the base station transmits at the same frequency as it receives, using the complex conjugate of the receiving weights. With time division, a single channel is time shared by both directions of transmission. Thus, with the LMS technique, during mobile-to-base transmission the antenna element weights are adjusted to maximize the signal-to-noise ratio at the receiver output. During base-to-mobile transmission, the complex conjugate of the receiving weights is used so that the signals from the base station antennas combine to enhance reception of the signal at the desired mobile and to suppress this signal at other mobiles. Therefore, by keeping the time intervals for transmitting and receiving much shorter than the fading rate (e.g., transmitting in 10 bit blocks), we can achieve the advantages of the LMS technique at both the mobile and the base station.

With adaptive retransmission using the LMS technique, each base station transmits in a way that maximizes the power of the signal received by the desired mobile relative to the total power of the signal received by all other mobiles. Thus, at the mobiles, interfering base station signals are suppressed and the improvement in the performance with the LMS technique as compared to maximal ratio combining should be similar to that at the base stations. The actual improvement for a given mobile, however, depends on the interference environment of every base station. Because of the complexity of the analysis, we will not study this improvement in detail. It should be noted, though, that for base-to-mobile transmission, spread spectrum on the signal is not required because a reference signal is not generated at the mobile. Therefore, without the degradation with the modulation scheme in the mobile-to-base transmission (see Section V-A-3), the BER at the mobile may be lower than that at the base station.

VI. SUMMARY AND CONCLUSIONS

In this paper we have studied optimum combining for digital mobile radio systems. The combining technique is optimum in that it maximizes the output SINR at the receiver even with cochannel interference. We determined the BER performance of optimum combining in a Rayleigh fading environment and compared the performance to that of maximal ratio combining. Results showed that with cochannel interference there is some improvement over maximal ratio combining with only a few receiving antennas, but there is significant improvement with several


antennas. With optimum combining, the typical cellular system was seen to have greater margins and require fewer antennas than with maximal ratio combining. Finally, we described how optimum combining can be implemented in mobile radio with LMS adaptive arrays. Thus, we have shown that optimum combining is a practical means for increasing the channel capacity and performance of digital mobile radio systems.

REFERENCES

[1] W. C. Jakes, Jr., et al., Microwave Mobile Communications. New York: Wiley, 1974.
[2] R. A. Monzingo and T. W. Miller, Introduction to Adaptive Arrays. New York: Wiley, 1980.
[3] B. Widrow, P. E. Mantey, L. J. Griffiths, and B. B. Goode, "Adaptive antenna systems," Proc. IEEE, vol. 55, p. 2143, Dec. 1967.
[4] S. P. Applebaum, "Adaptive arrays," IEEE Trans. Antennas Propagat., vol. AP-24, p. 585, Sept. 1976.
[5] B. Widrow, J. McCool, and M. Ball, "The complex LMS algorithm," Proc. IEEE, vol. 63, p. 719, Apr. 1975.
[6] C. A. Baird, Jr. and C. L. Zahm, "Performance criteria for narrowband array processing," 1971 Conf. Decision Contr., Miami Beach, FL, Dec. 15-17, 1971, p. 564.
[7] V. M. Bogachev and I. G. Kiselev, "Optimum combining of signals in space-diversity reception," Telecommun. Radio Eng., vol. 34/35, p. 83, Oct. 1980.
[8] P. Bello and B. D. Nelin, "Predetection diversity combining with selectively fading channels," IRE Trans. Commun. Syst., vol. CS-10, p. 32, Mar. 1962.
[9] P. S. Henry and B. S. Glance, "A new approach to high-capacity digital mobile radio," Bell Syst. Tech. J., vol. 60, no. 8, p. 1891, Oct. 1981.
[10] Y. S. Yeh and D. O. Reudink, "Efficient spectrum utilization for mobile radio systems using space diversity," IEEE Trans. Commun., vol. COM-30, p. 447, Mar. 1982.
[11] T. W. Miller, "The transient response of adaptive arrays in TDMA systems," Electrosci. Lab., Dep. Elec. Eng., Ohio State Univ., Columbus, OH, Rep. 4116-1, p. 287, June 1976.
[12] R. T. Compton, Jr., "An adaptive array in a spread-spectrum communication system," Proc. IEEE, vol. 66, p. 289, Mar. 1978.
[13] J. H. Winters, "Increased data rates for communication systems with adaptive antennas," in Proc. IEEE Int. Conf. Commun., June 1982.
[14] J. H. Winters, "A four-phase modulation system for use with an adaptive array," Ph.D. dissertation, Ohio State Univ., Columbus, OH, July 1981.


On Optimum Combining at the Mobile

RODNEY G. VAUGHAN, MEMBER, IEEE

Abstract-Optimum combining for diversity antennas at the mobile is discussed. Effectively, the aim is to add the wanted signal vectors in a maximum ratio sense, while interferers are weighted so that their resultant is in a permanent deep fade. Even if there are not enough degrees of freedom available to accomplish this fully, an optimum solution can still be found. Many interpretations from conventional array technology do not apply to the mobile communications case and the mechanism of optimum combining for array branch signals rather than discrete spatial signals is reviewed. Physical interpretation of the formulation is emphasized throughout. Problems with the adaptive algorithm and its implementation are also identified. Sample matrix inversion is shown to be a likely algorithm to apply in vehicular mobile communications receivers. A worst case example gives an idea of the required computation rates.

Fig. 1. Scenario for conventional array. In simple scenario, array has enough degrees of freedom to steer "nulls" of array pattern toward all interferers while maintaining finite gain toward wanted signal.

I. INTRODUCTION

SIGNAL COMBINATION in mobile communications is generally discussed (cf. [8], [9]) in terms of pre- and postdetection classes of selection, switched, equal-gain, and maximum ratio. To date, adaptive techniques have not been extensively discussed in the area of mobile communications. In the antenna literature, the interpretation of the adaptive techniques revolves around antenna patterns. In particular, the adaptation of nulls toward interferers has been the explanatory mechanism, which accounts for terms such as "null steering." The wanted signals and interferers are considered as discrete, usually resolvable, points in real space, as indicated in Fig. 1. This is not the case in a typical mobile communication scenario, where wanted sources and interferers easily outnumber the degrees of freedom of the diversity array and, furthermore, are distributed and generally unresolvable, as suggested by Fig. 2. From the point of view of adapting antenna patterns, the situation appears impossible.

Fig. 2. Multipath scenario for array at mobile. Wanted signal and interferers arrive from many directions. Array pattern, after convergence to optimum combining solution, does not offer meaningful interpretation mechanism as in simple scenario case.

Manuscript received April 1, 1987; revised August 20, 1988. The author is with the Physics and Engineering Laboratory, Department of Scientific and Industrial Research, P.O. Box 31313, Lower Hutt, New Zealand. IEEE Log Number 8927107.

Reprinted from IEEE Transactions on Vehicular Technology, Vol. 33, No. 4, pp. 181-188, November 1988.

Lee [9, p. 451] briefly discusses adaptive techniques for mobile communications but bases the discussion on antenna patterns and interferer directions. Bogachev and Kiselev [3] seem to be the first to have discussed optimum combining with respect to (space) diversity antennas. They derive curves for the probability of the signal-to-noise ratio (SNR) in the presence of Rayleigh fading. Winters [17] presents a good discussion of optimum combining at the base station and also provides simulation results. For the returns offered by optimum combining in the mobile communications case, the reader is referred to Winters' article. The best improvements (over other combining methods) occur when the interference levels are high relative to the wanted signal, and when the number of diversity antenna elements is large.

As far as the author is aware, there is no discussion of optimum combining applied at the mobile, where, as at the base station, it is necessary to consider branch signals rather than discrete spatial sources. The application is in high-density cellular systems where the cochannel interference is high. A theoretical worst case occurs in a corner of a hexagonal cell where the average interference power is less than 5 dB below average signal power [19]. Local shadow effects such as those caused by large buildings can make this figure worse still, and Winters [17] suggests that a signal-to-interference-plus-noise ratio (SINR) of +5 dB to -5 dB is typical. The number of diversity antenna elements should be large, typically five or more, to obtain significant improvement (i.e., several dB in SINR) over maximum ratio combining. Glance and Greenstein [5] discuss up to six base station elements and Yeh and Reudink [19] go as far as discussing a system with 20-element mobile antennas with 24-element base station antennas, each using predetection maximum ratio combining. A disadvantage of optimum combining at the mobile rather than at the base station (using adaptive retransmission) is that the cost and complexity of implementation at every mobile is much greater overall than that of implementing only at each base station. An advantage of many-branch systems at the mobile, however, is that the antennas can be realized in a compact manner. At the base station, even a six-element array becomes very expansive.

In the following, the mobile diversity antenna is treated as an array antenna. The terminology is also array antenna-oriented. For example, the multipath environment of mobile communications is referred to as a multiple source, or distributed source scenario since the distribution of sources effectively approaches a continuum in urban environments. The scenario is referred to as stationary, which is here taken to mean that its statistics are unchanging with time and the position of the mobile. In some sections, the scenario is referred to as static, indicating that the source positions are constant relative to the mobile. A real-world mobile scenario is rarely stationary or static, but the latter simplification does not seem too drastic for the short periods over which each optimum array solution is sought.

Section II contains the basic complex formulation for the array signals. The differences between the more familiar conventional array case and the mobile communications case are discussed with specific attention given to the physical interpretation of the formulation. Section III follows an approach similar to that in Section II but covers the signal combining aspects. The optimum antenna array (specific antenna elements are not within the scope of the article) is addressed in terms of conventional array parameters. Section IV contains a brief look at adaptive algorithms for the mobile communications case, and Section V looks at aspects of implementation.
A worst case example gives some idea of the computational requirements and the limitations of adaptive algorithms.

II. BASIC FORMULATION

The distributed source scenario around the mobile in an urban environment must be considered unchanging for each adaptation cycle of the combining algorithm. With this in mind, this section considers constant (in location) sources only. In particular, signal correlation measurement intervals, or a sequence of them (to establish the covariance matrix with sufficient accuracy during convergence), are taken over a static scenario. In practice, this will not be the case (unless the mobile and its surroundings are still): the adaptation algorithm will be chasing a changing solution, and its performance will be correspondingly degraded. If the adaptation is fast compared with the rate of change of the sought solution, the degradation will not be too much. Some practical points are discussed in Section V.

Where possible, the notation here follows that of Hudson [7]. However, the multipath scenario of mobile communications calls for a different formulation, and some designations common to this article and Hudson's are not interchangeable.

The array antenna has N branches with M incident signals from M sources in the field of view of the array. Here, a source is defined as a single point source; any incident signal which is derived (e.g., diffracted, reflected, or delayed) from the point source is considered as a separate (point) source. In the mobile communications case, N is limited to less than, say, six. M is effectively unlimited since the sources are considered distributed. Many of the M sources bear the same signal because of the multipath. If there are P different signals (one wanted signal and P - 1 interferers), then each of the P signals can be considered randomly allocated to many of the M sources. With the time factor $e^{j\omega t}$ understood, the complex envelope of the signal conveyed by the nth branch from the mth source is

$$x_{nm}(t) = a_{nm}\, m_m(t)\, e^{j\phi_{nm}} \tag{1}$$

where $m_m(t)$ is the signal modulation of the mth source and $\phi_{nm}$ is the carrier phase of the mth source, in the nth branch. The total signal in the nth branch is thus

$$x_n(t) = \sum_{m=1}^{M} x_{nm}(t). \tag{2}$$

$a_{nm}$ is real and represents the signal amplitude (or rather its mean over the modulation) in the branch from a particular source. The static scenario is characterized by the $a$ and $\phi$ being independent of time. Some physical insight is offered for the mobile communication case if $a$ and $\phi$ are thought of as functions of distance moved by the mobile. After the antenna has (hopefully) adapted to a set of $a$ and $\phi$, the mobile has moved, and adaptation to a new set of $a$ and $\phi$ is required.

Array Branch Signals

Define the total (sum over all sources) branch signals in the usual way

$$X^T(t) = [\,x_1(t)\;\; x_2(t)\;\; \cdots\;\; x_N(t)\,] \tag{3}$$

and a vector of the RF (or IF) phase terms weighted by the amplitude of the source signal in the branch

$$T_m^T = [\,a_{1m}e^{j\phi_{1m}}\;\; a_{2m}e^{j\phi_{2m}}\;\; \cdots\;\; a_{Nm}e^{j\phi_{Nm}}\,]. \tag{4}$$

$T_m$ are the vector contributions of the mth source to the array. When only the phases are included ($a_{nm} = 1$, all n), $T_m = S_m$ is the source vector to the mth source, a term from conventional array technology. Now let all the sources be included in the total weighted source vector

$$Q = [\,T_1\;\; T_2\;\; \cdots\;\; T_M\,]. \tag{5}$$

Note that a column of $Q$ corresponds to a given source signal in all branches and a row corresponds to all the source signals in a given branch. Finally, define a vector containing the source modulations

$$A^T(t) = [\,m_1(t)\;\; m_2(t)\;\; \cdots\;\; m_M(t)\,] \tag{6}$$

so that

$$X(t) = QA(t) \tag{7}$$

is the vector formulation of (1). In the mobile communications case, let there be P different (uncorrelated) signals, with each signal conveyed by subsets of the M sources. The subsets contain $M_1, M_2, \ldots, M_P$ sources. Now

$$\sum_{i=1}^{P} M_i = M \tag{8}$$

and (2) can be written as

$$x_n(t) = \sum_{i=1}^{P} \sum_{m=1}^{M_i} a_{nm}\, m_{M_i}(t)\, e^{j\phi_{nm}} \tag{9}$$

$$\phantom{x_n(t)} = \sum_{i=1}^{P} a_{nM_i}\, m_{M_i}(t)\, e^{j\phi_{nM_i}} \tag{10}$$

where the summation is over i for all $M_i$. In (9), the inner summation is over each source bearing the ith signal, and the outer summation is over the different source subsets, i.e., over the different signals. The subscript $M_i$ serves as a reminder that the (subscripted) signal quantity is derived from several sources rather than an individual source in the scenario. Equation (10) shows that each branch can be considered to support P signal-bearing vectors, each of which corresponds to the vector sum of the signals received from the appropriate sources. This is an important maxim as it allows branch signals to be combined such that certain signals are maximized and/or others minimized. The latter effect is the extension of maximum ratio combining to optimum combining. Equation (10) is used in the following in a vector formulation of the covariance matrix for the mobile communications case.

The Covariance Matrix

The output signal is usually written

$$y(t) = W^T X(t) \tag{11}$$

where $W$ is the column vector of weights. The output power is the Hermitian form

$$\overline{|y(t)|^2} = W^H R\, W \tag{12}$$

in which the covariance matrix is defined as

$$R = \overline{X^*(t)\, X^T(t)} \tag{13}$$

$$\phantom{R} = Q^*\, \overline{A^* A^T}\, Q^T. \tag{14}$$

The overbar means time averaging in the presence of the static scenario. It is evident that $R$ must contain all cross products of both the weighted space vectors and modulations. Some physical insight can be obtained by examining where these cross products occur in $R$. Denote a source modulation correlation

$$\rho_{mq} = \overline{m_m^*(t)\, m_q(t)} \tag{15}$$

in which the normalization is understood. Obviously, $\rho_{mq} = \rho_{qm}^*$ and $\rho_{mm} = 1$. Now the inner term of the covariance matrix (14) is

$$\overline{A^* A^T} = \begin{bmatrix} 1 & \rho_{12} & \cdots & \rho_{1M} \\ \rho_{12}^* & 1 & & \vdots \\ \vdots & & \ddots & \\ \rho_{1M}^* & \cdots & & 1 \end{bmatrix}. \tag{16}$$

In the case where all the sources bear different (uncorrelated) signals, $\overline{A^* A^T}$ becomes the identity matrix. In the mobile communications case, the off-diagonal elements will be randomly placed (other than complying with the Hermitian construction) ones or zeros. Each source adds one to the rank of $\overline{A^* A^T}$, which makes it of mathematical interest only except in simple scenarios, i.e., those with a manageable number of sources. In the mobile communications case, not even the rank of $\overline{A^* A^T}$ is known. Inclusion of the weighted source vectors is facilitated by denoting

$$(Q)_{nm}^*\, (Q)_{qp} = \pi_{nmqp}\, \phi_{qpnm}. \tag{18}$$

$(Q)_{nm}$ is the nmth element of $Q$. $\phi_{qpnm}$ is the phase difference between the pth source in the qth branch and the mth source in the nth branch. $\pi_{nmqp}$ has the dimensions of power, and denote $\pi_{nmnm} = \pi_{nm}$. From (10), (14), and (18),

$$(R)_{nq} = \sum_{i=1}^{P} \pi_{nM_i qM_i}\, \phi_{qM_i nM_i} \tag{19}$$

and

$$(R)_{nn} = \sum_{i=1}^{P} \pi_{nM_i}. \tag{20}$$
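The step from the per-source sum (2) to the per-signal sum (10) can be checked numerically. The sketch below is illustrative only: the scenario values are random, the function names are hypothetical, and a single time instant is used so that each modulation is just a complex number.

```python
import cmath
import random

def branch_signals_by_source(a, phi, mod):
    """Eq. (2): x_n = sum over all M sources of a[n][m] * mod[m] * exp(j*phi[n][m])."""
    return [
        sum(a[n][m] * mod[m] * cmath.exp(1j * phi[n][m]) for m in range(len(mod)))
        for n in range(len(a))
    ]

def branch_signals_by_signal(a, phi, signal_of_source, signal_mod):
    """Eq. (10): for each of the P signals, first form the resultant vector of all
    sources bearing that signal, then multiply it by that signal's modulation."""
    x = []
    for n in range(len(a)):
        total = 0j
        for i, m_i in enumerate(signal_mod):
            resultant = sum(
                a[n][m] * cmath.exp(1j * phi[n][m])
                for m, s in enumerate(signal_of_source)
                if s == i
            )
            total += resultant * m_i
        x.append(total)
    return x

random.seed(1)
N, M, P = 3, 8, 2  # branches, sources, distinct signals (assumed values)
a = [[random.uniform(0.1, 1.0) for _ in range(M)] for _ in range(N)]
phi = [[random.uniform(0.0, 2 * cmath.pi) for _ in range(M)] for _ in range(N)]
signal_of_source = [random.randrange(P) for _ in range(M)]  # which signal each source bears
signal_mod = [1 + 0j, -1 + 0j]  # instantaneous modulation of the P signals
mod = [signal_mod[s] for s in signal_of_source]  # per-source modulation m_m(t)

x_sources = branch_signals_by_source(a, phi, mod)
x_signals = branch_signals_by_signal(a, phi, signal_of_source, signal_mod)
```

Both routes give the same branch signals, which is the point of (10): however many sources there are, each branch carries only P resultant vectors, one per distinct signal.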


The off-diagonal elements are seen to be a summation of vectors, each vector corresponding to one of the P signals from one of the subsets $M_i$ of the M incident sources. Little can be remarked about simplifying the physical interpretation of $R$, its form being too involved. However, some progress is possible by performing a similarity transformation of the desired-signal-only covariance matrix (see the discussion of prewhitening below).

In practice, the covariance matrix will also contain terms due to noise. The noise is uncorrelated between branches (any unwanted signal that is correlated between branches is defined as interference) so that only the principal diagonal in the covariance matrix becomes augmented. This is convenient because it ensures that $R$ is always positive definite and the existence of $R^{-1}$ is assured. Conversely, if the noise level is very low relative to the source signal levels, numerical problems may sometimes arise in seeking $R^{-1}$ from measured values of $R$. Techniques for ill-conditioned $R$ include taking the generalized inverse instead of the conventional inverse.

III. THE OPTIMUM ANTENNA

In public service mobile communications, the average (over several fades) power of interferers would rarely be above that of the wanted signal. In each antenna branch or a single port antenna, however, the instantaneous level of the interferer could be a few tens of decibels above the wanted signal owing to the Rayleigh-like fading. The optimum antenna (here the antenna is regarded as the elements, weights, and summing network) is defined to maximize the SINR. If covariance matrices $R_s$ and $M$ are defined to embody the wanted signals only and the interference-plus-noise, respectively, then the adaptive algorithm seeks an optimum weight vector $W_{opt}$ which is well-known to satisfy

$$R_s W_{opt} = k' M\, W_{opt} \tag{21}$$

where $k'$ is a constant which is, in fact, the maximum SINR. Denote a weighted space vector for the wanted source $T_0$; then

$$R_s = T_0^*\, T_0^T. \tag{22}$$

In the mobile communications case, the wanted signal and interferers are distributed across many source vectors and the small size of the array will normally prevent the isolation of only one of these, the $T_0$ above. However, $R_s$ can always be expressed as a dyadic when the sources are considered branch sources rather than spatial sources. Each branch source is the summation of the contributions from the actual sources which bear the same signal. Equation (19) depicts the situation. The weighted source vector of (4) becomes reformulated as branch signals,

$$T_{M_i}^T = [\,T_{1M_i}\;\; T_{2M_i}\;\; \cdots\;\; T_{NM_i}\,] \tag{23}$$

in which

$$T_{nM_i} = \sum_{m=1}^{M_i} a_{nm}\, e^{j\phi_{nm}} \tag{24}$$

and then the form

$$R_s = T_{M_0}^*\, T_{M_0}^T \tag{25}$$

is a valid representation even in the case of a spatially distributed set of (e.g., wanted) signal sources. Substituting (25) into (21) gives

$$k' M\, W_{opt} = T_{M_0}^*\, T_{M_0}^T W_{opt} = g_0\, T_{M_0}^* \tag{26}$$

where $g_0$ is a constant representing the amplitude gain of the array to the wanted signal or to the wanted source in the case of an isolated source. In the latter case, the right side of (26) is $g_0 \pi_0 S_0^*$. $S_0$ is called the steering vector, and $\pi_0$ is the total power (in all branches) of the desired source, quantities familiar from the conventional array case.

The optimality criteria can be expressed as the Wiener solution

$$M\, W_{opt} = k\, T_{M_0}^* \tag{27}$$

where $k$ is a scaling constant ($= g_0/k'$) which has no effect on the SINR. If $M$ is invertible (which can be arranged, if necessary, by adding noise), then the scaled weights of the optimum antenna are given by

$$W_{opt} = M^{-1}\, T_{M_0}^*. \tag{28}$$

Equations (27) and (28) are the generalized version (applicable to the mobile communications case) of the results for conventional arrays in which $T_{M_0}$ is replaced by $\pi_0 S_0$. The difference is that the conventional array employs purely phase weighting. Here, both phase and amplitude weighting are necessary. In practice, $T_{M_0}$ must be known a priori, or else it is measured, as is the covariance matrix. In mobile communications, a priori knowledge of $T_{M_0}$ is not possible and instead, knowledge of a characteristic of the wanted signal is used to estimate $T_{M_0}$. To measure $M$, the wanted signal must first be removed from each branch. Measurement of $R$, which includes the wanted signal, is clearly more practical but its unentangled inclusion in the optimum solution of (28) requires special conditions. Specifically, if

$$R = R_s + M \tag{29}$$

then (21), (26), and (29) can be combined to yield the optimum scaled weights solution

$$W_{opt} = R^{-1}\, T_{M_0}^* \tag{30}$$

which is known to be equivalent to (28) (e.g., Hudson [7, p. 74]) but features the more readily available total covariance matrix. The validity of (29) is the key resource: the desired signal and the interferers must be uncorrelated ($\rho_{mp} = 0$, $m \neq p$) or else orthogonal in the sense

$$T_{M_m}^T\, T_{M_p}^* = 0, \qquad m \neq p. \tag{31}$$

In the conventional array case, the analogy of (31) is the spatial orthogonality relationship

$$S_m^T\, S_p^* = 0, \qquad m \neq p, \tag{32}$$

which is sought in null steering by retrobeams (eigenbeams for subtraction from the quiescent array pattern). In the mobile communications case, eigenbeams for preprocessing can be formulated mathematically (only at the expense of juggling signal powers since the $T_{M_i}$ are not orthonormal), but the patterns are not in real space.
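The equivalence of the two scaled-weight solutions, (28) using the interference-plus-noise matrix and (30) using the total covariance matrix, can be checked on a two-branch example. This is a sketch under assumptions: the branch vectors and noise level are made-up values, there is one wanted signal and one uncorrelated interferer, and the helper names are hypothetical.

```python
def inv2(m):
    """Inverse of a 2x2 complex matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def outer_conj(t):
    """Dyadic T* T^T, the covariance matrix of a unit-power signal on vector T."""
    return [[t[i].conjugate() * t[j] for j in range(2)] for i in range(2)]

def power(w, r):
    """Hermitian form W^H R W: output power for weights W."""
    rw = matvec(r, w)
    return (w[0].conjugate() * rw[0] + w[1].conjugate() * rw[1]).real

def sinr(w, rs, m):
    return power(w, rs) / power(w, m)

t_wanted = [1.0 + 0.5j, 0.3 - 0.8j]   # assumed branch vector of the wanted signal
t_interf = [0.9 - 0.2j, -0.4 + 0.7j]  # assumed branch vector of one interferer
noise = 0.1                           # assumed noise power per branch

rs = outer_conj(t_wanted)
ri = outer_conj(t_interf)
m = [[ri[i][j] + (noise if i == j else 0.0) for j in range(2)] for i in range(2)]
r = [[rs[i][j] + m[i][j] for j in range(2)] for i in range(2)]  # R = R_s + M

t_conj = [z.conjugate() for z in t_wanted]
w_from_m = matvec(inv2(m), t_conj)  # weights from the interference-plus-noise matrix
w_from_r = matvec(inv2(r), t_conj)  # weights from the total covariance matrix
w_mrc = t_conj                      # maximum ratio weights, which ignore the interferer

print(sinr(w_from_m, rs, m), sinr(w_from_r, rs, m), sinr(w_mrc, rs, m))
```

The two weight vectors differ only by a real scale factor, so they give the same SINR, and that SINR is at least as large as the one obtained with maximum ratio weights.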

Prewhitening

Preprocessing can offer increased convergence rates for adaptive combiners in conventional arrays. The power of the prewhitening is in the equalization and decorrelation of all interference (-plus-noise) in each branch. The combining algorithm can then just maximize the output power. White [15] discusses adaptive beam-forming in simple scenarios but does not consider transient performance. Accelerated convergence techniques (see, e.g., Monzingo and Miller [12, pp. 188-192]) can help considerably for antennas with a relatively small number of elements, as would normally be the case in mobile communications. Preprocessing, in particular Gram-Schmidt orthogonalization (e.g., Monzingo and Miller [12, pp. 364-383]), seems to offer considerable gains, but again, the scenario is assumed to be unchanging during adaptation. Since the rate of convergence is important in mobile communications, an investigation of the applicability of preprocessing is worthwhile.

Since $M$ is always positive definite, it can be uniquely factored using the Hermitian matrices $CC = M$, and since $M$ is of full rank, $C^{-1}$ exists, so

$$C^{-1} M\, C^{-1} = I. \tag{33}$$

The operation is seen to decorrelate the interference signals between all branches (e.g., Hudson [7, p. 62]). The transformed covariance matrix of the desired signal is

$$B = C^{-1} R_s\, C^{-1}. \tag{34}$$

Maximizing the SINR is now just a question of finding the maximum eigenvalue of $B$. The transformation from $R_s$ to $B$ is between array space and eigenspace. However, the eigensources cannot be related to array space sources as is possible with conventional arrays. In fact, the eigensources are difficult to relate to the branch sources (see (4)) as well. The problem again lies with the weighted source vectors $T_{M_i}$ not being orthonormal. In the conventional array case with a simple static scenario featuring $M < N$ spatial sources, the array with a beam-former $C^{-1}$ provides a (transformed) covariance matrix which is diagonal and contains the powers of each of the sources as its element values.

In practice, much effort is required to implement the beam-former for changing scenarios. (For constant scenarios, the beam-former can be hard-wired; indeed, most applications are for this case.) Generating $C^{-1}$ is not trivial because $C$ is the square root of $M$. $M$ must first be measured, its eigenvalues found numerically, their square roots taken (positive radicals are understood in the unique factorization of $M$), and then assembled into a diagonal matrix. Finally, the inverse is taken to yield $C^{-1}$. The continuous measurement and processing required to implement the prewhitening is formidable, and any acceleration of convergence as a result of preprocessing will be well mitigated by the time taken for the preprocessing. Moreover, it is shown in Section V that the gradient search techniques are not very suitable for single-receiver systems in the vehicular mobile communications case.

Degrees of Freedom (DOF)

Clearly, an N-branch conventional array with N phase-only weights has N DOF; or if the weights are complex, there are 2N DOF. In general, not all the DOF are available for suppression of an interferer, and the number available depends on both the scenario and the configuration of the array antenna. For example, in a conventional array, the modulus part of complex weighting allows the suppression of array space near-field sources while maintaining a response to far-field sources in the same direction. In the mobile communications case, the N complex weights provide 2N DOF, but each interferer must be dealt with in both amplitude and phase. Cancellation of an interferer thus requires two real DOF. In the context of mobile communications, it would be appropriate to consider the complex weighted N-branch array to have N complex DOF, each available one of which can be used to suppress at least one interferer.

Array Patterns

The array pattern for a conventional array is defined as the amplitude gain for a given set of weights

$$g(\theta, \phi) = W^T S \tag{35}$$

where $S$ provides the directive information. If the scaled weight vector corresponds to maximizing the array amplitude gain towards a source $S_1$, then (in the absence of interferers) $W = S_1^*$ and $S_1$ is the steering vector. If a second source $S_2$ is introduced, the amplitude gain toward $S_2$ using weights $W$ is given by

$$g(\theta, \phi) = S_1^H S_2. \tag{36}$$

The array pattern then is a mapping of an inner product of $S_1$ and $S_2$. $S_1$ could be referred to as a testing source, in keeping with the position of a probe antenna in pattern measurements. The nulls of $g(\theta, \phi)$, if they exist, occur when the source $S_2$ does not affect the array output, i.e., when the sources are spatially orthogonal,

$$S_1^H S_2 = 0. \tag{37}$$

Array space patterns are difficult, if not impossible, to interpret for the mobile communications case. A signal amplitude gain quantity can be defined analogously to (35) (cf. (26)) by $g = T_{M_i}^T W$. $T_{M_i}$ carries the "directive" information but the direction is not in real array space. A test weighted source vector could be defined as above (cf. $S_2$ above), but again there seems little point because the inner product analogous to (36), $T_1^H T_2$, does not readily offer a mapping which is physically meaningful. The traditional use of array patterns to illustrate antenna adaptation does not apply to mobile communications.

IV. ADAPTIVE ALGORITHMS

The adaptive algorithms well-known from conventional arrays are all applicable for complex rather than phase-only weights. Some caution must be exercised with some conceptual and explanatory interpretations, as noted in the previous section. The suitability and implementation feasibility of the various algorithms in the mobile communications case is the remaining issue. Ricardi [14, p. 6.2] suggests three classical methods, under which all adaptive algorithms can be classified. These are power inversion, least mean squares (LMS), and direct sample matrix inversion (SMI).

SMI is conceptually the simplest technique. The covariance matrix and weighted steering vector are estimated (in this case) and the weights are calculated directly from (38) where the hat denotes estimate. This offers the quickest technique for finding the weights and is independent of the makeup of R, as long as it is well-conditioned with respect to the word size of the implementation. The covariance matrix estimate is formed with K samples

$$\hat{R}(k) = \frac{1}{K} \sum_{k'=1}^{K} X^{*}(k')\, X^{T}(k') \tag{39}$$

where X(k) is a sample of X(t) at the kth sample time. The estimate for the weighted steering vector is similar (cf. Monzingo and Miller [12, p. 300]):

where d(k) are the samples of d(t), which is a reference signal well-correlated with the desired signal. In some ways, SMI is not truly adaptive because it is an open-loop system. However, it is considered adaptive in the sense that it will produce a new solution for a new scenario. As in any open-loop system, the accuracy of the implementation must correspond to the desired accuracy of the solution. SMI is discussed further at the end of the subsection.

The power inversion and LMS algorithms employ a closed-loop system that works to converge towards the solution. Power inversion is simply a minimization of output power, often with a constrained gain to the wanted signal (or spatial source, in simple scenarios). The weighted steering vector needs to be known a priori, so the power inversion algorithm is not applicable to the mobile communications case. When the weighted steering vector is estimated with the aid of a reference characteristic of the wanted signal, the algorithm effectively becomes the LMS technique. Despite their differences in implementation, the LMS and power inversion techniques are very similar. Both algorithms are steepest descent gradient search techniques.

The convergence rate of the gradient search techniques is very complicated to predict accurately, except in simplified situations. The convergence rate is known to be sensitive to the spread of eigenvalues in the covariance matrix. As far as the author is aware, no simulations have been reported for complex vectors as weights. Simulations for conventional arrays in simple scenarios usually involve very low signal-to-interference ratios, a situation which would be unlikely in the mobile communications case. Winters' [17] base station simulations include Rayleigh fading but assume complete convergence (no combiner losses) although no transient analysis is included. From conventional array simulations (see, e.g., Monzingo and Miller [12]) convergence to within 3 dB of the optimum solution rarely occurs in less than several hundreds or even thousands of iterations. The eigenvalue spread of the covariance matrix in these simulations is generally more than would be found in mobile communications, so results will be pessimistic. On the other hand, the interference-to-signal ratio is much higher than would be found in the mobile communications case, which would make the results optimistic. Results from conventional array simulation can only give a rough guide.

Some conceptually derived time constants allow some conclusions regarding implementation. An upper limit on the rate of change of the weights is given by the data rate. Winters ([17, p. 152], after Miller [11]) notes that the maximum rate of change in the weights is about 0.2 times the data rate before significant data distortion occurs for phase-shift keyed (PSK) signaling.¹ For a data rate $r_d$ = 16 kbit/s, the maximum rate of weight update for PSK signals is then about every 0.06 ms, and the weights cannot change completely within 0.3 ms. For good adaptive performance, it is best to try to attain this rate. In terms of a Rayleigh fading envelope, this upper limit is quite severe. A worst case fading rate, given by a vehicle traveling at, say, 140 km/h with a 900-MHz carrier frequency and omnidirectional antenna, is about 120 fades/s. If the "branch scenario" (the scenario describing the "positions" of the P signals in each diversity branch) can be assumed uncorrelated at intervals of half of the fading period, i.e., every 2.1 ms, then there can be a maximum of less than 2.1/0.06 ≈ 35 iterations between independent scenarios. This worst case situation does not look favorable for the gradient search techniques. However, once the weights are close to optimum, the required number of iterations is less, but still well over 30. A dual receiver system appears necessary for the gradient search techniques.
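The time constants quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original paper: the function names are hypothetical, and the 2.1 ms decorrelation interval and 0.06 ms update interval are taken directly from the text rather than derived.

```python
"""Timing budget for weight adaptation, using the figures quoted in the text."""

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def max_doppler_hz(speed_kmh: float, carrier_hz: float) -> float:
    """Maximum Doppler shift v / lambda for a vehicle moving at speed_kmh."""
    speed_ms = speed_kmh / 3.6
    return speed_ms * carrier_hz / SPEED_OF_LIGHT


def bit_interval_ms(data_rate_bps: float) -> float:
    """Duration of one data bit in milliseconds."""
    return 1e3 / data_rate_bps


def distance_cm(speed_kmh: float, interval_ms: float) -> float:
    """Distance travelled during interval_ms, in centimetres."""
    return (speed_kmh / 3.6) * (interval_ms * 1e-3) * 100.0


def iterations_available(scenario_ms: float, update_ms: float) -> float:
    """Number of weight updates that fit between independent scenarios."""
    return scenario_ms / update_ms


if __name__ == "__main__":
    bit_ms = bit_interval_ms(16e3)
    print(f"max Doppler shift        : {max_doppler_hz(140.0, 900e6):.1f} Hz")
    print(f"bit interval             : {bit_ms:.4f} ms")
    print(f"five bit intervals       : {5 * bit_ms:.4f} ms")
    print(f"distance covered, 0.3 ms : {distance_cm(140.0, 0.3):.2f} cm")
    # 2.1 ms and 0.06 ms are the values stated in the text, not computed here.
    print(f"iterations per scenario  : {iterations_available(2.1, 0.06):.0f}")
```

The five-bit settling time comes out slightly above the rounded 0.3 ms used in the text, and the Doppler shift slightly below the rounded 120 fades/s; both roundings are the paper's own.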
One receiver would be for establishing the weights and another, using only periodically updated weights, dedicated to data throughput. The SMI technique offers a possible single-receiver solution. The processing required can be couched in the traditional units of number of multiplications. The covariance matrix is formed with K independent samples, and clearly, the larger K is, the better the estimate and the more accurate the solution since a static scenario is assumed. For an ensemble average solution to be within 3 dB of the optimum solution for 50 percent of the estimates, Reed et al. [13] suggest K > 2N - 3. The usual rule of thumb is to take K > 2N. Boronson [2] notes that K > 3N and K > 4N ensure that the solution will be within 3 dB of the optimum for 98 and 99.68 percent of the estimates, respectively. Formation of R(k) and Mo(k) requires KN(N + 1)/2 and KN complex multiplications, respectively. Inversion of R(k) and calculation of W(k) require ($N^3/2 + N^2$) and $N^2$ complex multiplications, respectively [13]. Thus for a three-element array, there are at least 86 complex multiplications. A real 16-bit multiplication can be performed within about 150 ns with current off-the-shelf hardware. For the three-element array, the multiplications alone would take about 52 µs, easily within the upper limit of 0.3 ms for the weight updating. A six-element array requires 504 complex multiplications, taking about 0.3 ms, indicating the need for a second multiplier to maintain the maximum weight update rate. It is probable that an exponential deweighting for the covariance matrix update would be more suitable for tracking the solution. Hudson ([7, p. 125] after Lunde [10]), and Monzingo and Miller [12, p. 314] give formulas for directly updating the inverse of the deweighted covariance matrix and the steering vector, respectively.

A fundamental limitation lies with the required measurement times. The K independent samples must be taken within a short enough time that the scenario appears static. On the other hand, the period taken to retrieve the K samples necessarily extends long enough for the correlations to be correctly detected. This places some lower limits on the cross-correlation bandwidths of the reference signals. If pilot tones are used, for example, a correlation time interval equal to the maximum weight response rate of 0.3 ms calls for pilots to have a theoretical minimum spacing of about 3 kHz. (In practice, there would be a much larger pilot spacing required.) This is rather inconvenient; for eight cochannels, for example, an entire 25 kHz band is used only for pilots! A point of academic interest is that pilots should be spaced by more than the upper Doppler limit of about 120 Hz; otherwise, an interferer may often be singled out as the wanted signal and vice versa. If it can be assumed that the weighted steering vector can be estimated accurately in less than about 0.3 ms, then the maximum weight response rate can be realized using SMI. At the worst case fading rate (corresponding to a vehicle traveling at 140 km/h), the response occurs about every 1.2 cm. The measured signals are taken over the preceding interval, or maybe even two intervals before weight update, if slower processing is used. The situation is depicted on a Rayleigh envelope in Fig. 3. A way around dealing with "out of date" measurements is to delay the received signal and operate on it after the processing is complete and the weights are calculated [18].

[Figure 3: Rayleigh fading envelope plotted against time (ms), distance travelled at 140 km/h (cm), and sample number, with one update cycle marked: measure; calculate and set weights; weights used in an interval are derived from measurements in the preceding interval.]

Fig. 3. Example of worst case fading (carrier frequency = 900 MHz, vehicle speed 140 km/h) with maximum weight response rate (for 16-kbit/s data rate and single receiver system) of about every 0.3 ms. At 140 km/h, weights respond about every 1.2 cm. Limitation in manageable dynamic range of signal is illustrated in fade to left, where installed weights correspond to "out of date" measurements. Limitation in realizing this maximum update rate is in estimating the weighted steering vector with sufficient accuracy.

¹In fact, the weights must respond slower than 0.2 times the data rate. In practice, the weights can be updated at the data rate, but cannot change completely in less than five data bit intervals (Winters [18]).
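The SMI multiplication counts quoted above can be reproduced with a short calculation. This is a sketch under two assumptions that the text does not state explicitly: the sample count is taken at the rule-of-thumb value K = 2N, and each complex multiplication is counted as four real multiplications. The function names are hypothetical.

```python
def smi_complex_multiplications(n: int, k: int) -> float:
    """Complex multiplications for one SMI weight update with n branches
    and k samples, using the per-step counts quoted in the text."""
    form_covariance = k * n * (n + 1) / 2   # estimate of R
    form_steering = k * n                   # estimate of the weighted steering vector
    invert_covariance = n**3 / 2 + n**2     # inversion of R
    compute_weights = n**2                  # product giving W
    return form_covariance + form_steering + invert_covariance + compute_weights


def multiply_time_us(complex_mults: float,
                     real_mult_ns: float = 150.0,
                     real_per_complex: int = 4) -> float:
    """Time for the multiplications alone, in microseconds.

    real_per_complex = 4 is an assumption (one complex product computed
    as four real products)."""
    return complex_mults * real_per_complex * real_mult_ns / 1e3


if __name__ == "__main__":
    for n in (3, 6):
        k = 2 * n  # assumed rule-of-thumb sample count
        mults = smi_complex_multiplications(n, k)
        print(f"N = {n}, K = {k}: {mults:g} complex multiplications, "
              f"{multiply_time_us(mults):.1f} us")
```

With these assumptions the three-element count is 85.5 (quoted as "at least 86") and the six-element count is 504, with multiplication times close to the 52 µs and 0.3 ms given in the text.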

V. IMPLEMENTATION

Winters' [17] base station implementation proposal uses a spread-spectrum system with an LMS algorithm. It is based on Compton's [4] description. While the system is claimed to be practical, the transient analysis of the algorithm is not addressed. A possible system could use Winters' spread-spectrum proposal with the SMI algorithm. The spreading factor should be kept modest, not only to keep the processor bandwidth down, but also because the cochannel interference increases with channel bandwidth in a cellular system, and the returns from the optimum combining system diminish. However, the spreading factor must be sufficient to get a pseudonoise sequence completed and preferably repeated within the time that the branch scenario changes, i.e., about every 0.3 ms in the worst case. With implementation at the mobile rather than the base station, shorter pseudonoise sequences are possible since there are far fewer base stations than mobiles. For example, for a sequence length of 16 bits with an information rate of 16 kbit/s, a spreading factor of at least three is required to attain a rate of one sequence per 0.3 ms.

The implementation of optimum combining is obviously much more complicated and expensive compared to conventional combining. Switched or selection combining are the easiest techniques, especially when there are just two branches. Postdetection equal-gain combining is also easy to implement, since there is no need for cophasing the signals. Postdetection maximum ratio combining is more complicated because an amplitude weighting must be included, and calculating the weights requires measurement of SNR in each branch. Predetection maximum ratio combining requires both amplitude weighting and cophasing of the signals. This demands much extra hardware for each branch. Maximum ratio combining is more often used as a theoretical performance benchmark because much progress is possible in calculating the diversity returns. A loss budget must be introduced for imperfect combining. For example, the system proposal of Yeh and Reudink [19] assumes maximum ratio combining and allows 1-dB "modem loss" for the theoretical output SINR.

Optimum combining, in the implementation considered here, requires not only the measurement and weighting hardware for predetection maximum ratio combining, but also powerful digital computation hardware for calculating the weights. Signal digitization would be one of the expensive aspects of an implementation. A possibility which has not received much attention is bandpass sampling, in which the IF (or RF) signal is digitized directly to baseband. The cost of the extra hardware necessary for optimum combining is very high indeed when compared to a switched, selection, or equal-gain system. Nevertheless, integrated packages for this type of signal processing are becoming increasingly available and digital hardware costs are still decreasing. If high spectrum efficiency becomes sufficiently important in cellular mobile systems, then optimum combining of diversity antennas at the mobile, with its significant improvement in the channel capacity compared to conventional combining types, forces itself into consideration for the system architecture.

VI. CONCLUSION

Optimum combining for diversity antennas at the vehicular mobile has been discussed in terms of traditional array parameters. The emphasis has been on physical interpretation of the mathematical formulation. A worst case example of a fast-moving vehicle operating with cellular communication frequencies indicates that the LMS, or other truly adaptive algorithms are not particularly suitable owing to the potentially short times available for adaptation. Also, it has been shown that preprocessing is not likely to be useful for accelerating convergence. However, for slower mobile speeds such as pedestrians and/or lower carrier frequencies, gradient techniques become feasible. Using a sample matrix inversion algorithm, the signal-processing requirements are not so severe and can be implemented using current hardware. In a system with conventional frequency divided 25 kHz bands for each channel, the measurement time for estimating the weighted source vector imposes serious limits on the situations for which the optimum combining is useful. Conversely, the rate of change of the scenario (effectively the speed of the vehicle or its immediate surroundings) limits the returns because of the required measurement time for the weighted source vectors. To derive the benefits of optimum combining, it is necessary to use wide-band pilots such as the pseudonoise codes in a spread-spectrum communications system.

REFERENCES

[1] S. Applebaum, "Adaptive arrays," IEEE Trans. Antennas Propagat., vol. AP-24, pp. 585-598, 1976.
[2] D. M. Boronson, "Sample size considerations for adaptive arrays," IEEE Trans. Aerospace Electron. Syst., vol. AES-16, no. 4, pp. 446-451, 1980.
[3] V. M. Bogachev and I. G. Kiselev, "Optimum combining of signals in space-diversity reception," Telecommun. Radio Eng., vol. 34/35, no. 10, p. 83, 1980.
[4] R. T. Compton, "An adaptive array in a spread spectrum communication system," Proc. IEEE, vol. 66, pp. 289-298, 1978.
[5] B. Glance and L. J. Greenstein, "Frequency selective fading effects in digital mobile radio with diversity combining," IEEE Trans. Commun., vol. COM-31, no. 9, pp. 1085-1094, 1983.
[6] P. W. Howells, "Intermediate frequency sidelobe canceller," U.S. Patent 3 202 990, 1965.
[7] J. E. Hudson, Adaptive Array Principles. London: Peregrinus, 1981.
[8] W. C. Jakes, Ed., Microwave Mobile Communications. New York: Wiley, 1974.
[9] W. C. Y. Lee, Mobile Communications Engineering. New York: McGraw-Hill, 1982.
[10] E. B. Lunde, "The forgotten algorithm in adaptive beamforming," in Aspects of Signal Processing, G. Tacconi, Ed. Reidel, 1977.
[11] T. W. Miller, "The transient response of adaptive arrays in TDMA systems," ElectroSci. Lab., Dep. Elec. Eng., Ohio State Univ., Columbus, Rep. 4116-1, 1976.
[12] R. A. Monzingo and T. W. Miller, Introduction to Adaptive Arrays. New York: Wiley, 1980.
[13] I. S. Reed, J. D. Mallet, and L. E. Brennan, "Rapid convergence rate in adaptive arrays," IEEE Trans. Aerospace Electron. Syst., vol. AES-10, no. 6, pp. 853-863, 1974.
[14] L. F. Ricardi, "Adaptive and multiple-beam antenna systems," in Proc. Summer School on Satellite Communication Antenna Technology. The Netherlands: North Holland/Elsevier, ch. 6.
[15] W. D. White, "Cascade preprocessors for adaptive antennas," IEEE Trans. Antennas Propagat., vol. AP-24, no. 5, pp. 670-684, 1976.
[16] B. Widrow, P. E. Mantey, L. J. Griffiths, and B. B. Goode, "Adaptive antenna systems," Proc. IEEE, vol. 55, 1967.
[17] J. H. Winters, "Optimum combining in digital mobile radio with cochannel interference," IEEE Trans. Veh. Technol., vol. VT-33, no. 3, pp. 144-155, 1984.
[18] J. H. Winters, private communication, Mar. 1988.
[19] Y.-S. Yeh and D. O. Reudink, "Efficient spectrum utilization for mobile radio systems using space diversity," IEEE Trans. Commun., vol. COM-30, no. 3, pp. 447-455, 1982.

The Performance of an LMS Adaptive Array with Frequency Hopped Signals

LEVENT ACAR
R. T. COMPTON, JR., Fellow, IEEE
The Ohio State University ElectroScience Laboratory

The performance of an LMS adaptive array with a frequency hopped, spread spectrum desired signal and a CW interference signal is examined. It is shown that frequency hopping has several effects on an adaptive array. It causes the array to modulate both the amplitude and the phase of the received signal. Also, it causes the array output SINR (signal-to-interference-plus-noise ratio) to vary with time and thus increases the bit error probability for the received signal. Typical curves of the desired signal modulation and the time-varying SINR at the array output are presented. It is shown how the array performance depends on hopping frequency, frequency jump size, interference frequency, signal arrival angles, and signal powers.

Manuscript received January 19, 1984; revised December 27, 1984. This work was supported in part by the Department of the Navy, Naval Air Systems Command, Washington, D.C., under Contract N00019-82-C-0190 with the Ohio State University Research Foundation.

Adaptive arrays based on the least mean square (LMS) algorithm [1] are very effective for protecting communication systems from interference. These antennas can automatically track desired signals while also nulling interference [2]. Methods have been developed for using adaptive arrays with ordinary amplitude modulated (AM) signals [3], binary frequency shift keyed (FSK) signals [4, 5, 6], binary phase shift keyed (PSK) spread spectrum signals [4, 7], and quadriphase PSK spread spectrum signals [8]. These techniques have all been demonstrated experimentally. In this paper we study the performance of an adaptive array with another type of spread spectrum signal, a frequency hopped signal [9].

Frequency hopping is a widely used method of spectrum spreading. Its purpose is to make a communication system less vulnerable to interference. For some applications, it may be desirable to combine adaptive arrays with frequency hopped signals to obtain the interference protection of both. However, very little information is available on the performance of adaptive arrays with frequency hopped signals. As we shall show in this paper, frequency hopping has several adverse effects on an LMS array. First, it causes the array to modulate both the amplitude and the phase of the received signal. Second, it makes the output SINR (signal-to-interference-plus-noise ratio) vary with time and hence increases the bit error probability for the demodulated signal. If an LMS array is to be used with frequency hopped signals, these effects must be taken into account in the design of both the array and the signal modulation.

In this paper we consider an ordinary LMS adaptive array with continuous feedback loops. We do not consider various modifications of the LMS array (such as weight storage and recall algorithms) that might be used to improve its performance with frequency hopped signals. Our purpose here is to determine when the basic LMS array has problems and to characterize the array behavior with frequency hopped signals. We use a simple model to study this problem. We consider an array with three elements, and we assume the frequency hopped signal has only a few frequencies. Such a model is adequate to illustrate the effects of frequency hopping on the array, and it allows us to explore the interaction between the hopping characteristics and the array transients.

Section II of the paper defines the problem and formulates a method for calculating array behavior with frequency hopped signals. Section III describes numerical results based on the technique in Section II. Section IV contains the conclusions.

Authors' current addresses: L. Acar, The Ohio State University, ElectroScience Laboratory, Department of Electrical Engineering, Columbus, OH 43212; R. T. Compton, Jr., Department of Electrical Engineering, Ohio State University, 2015 Neil Avenue, Columbus, OH 43210.

Reprinted from IEEE Transactions on Aerospace and Electronic Systems, Vol. AES-21, No. 3, pp. 360-371, May 1985.

II. FORMULATION

Consider an LMS adaptive array [1] with three elements, as shown in Fig. 1. Let the elements be isotropic, noninteracting, and a half wavelength apart at the center frequency of the signals. The analytic signal $\tilde{y}_j(t)$ from the jth element is mixed with a local oscillator (LO) and then passed through a narrowband filter (NBF). The purpose of the LO and NBF is to dehop the desired signal, as discussed below. The filter output $\tilde{x}_j(t)$ is the input to the jth channel of an LMS processor [1]. This processor multiplies each signal $\tilde{x}_j(t)$ by a complex weight $w_j$ and then sums the result to form the array output $\tilde{s}(t)$. The weights $w_j$ in an LMS processor are obtained by correlation feedback loops that minimize the average power in the error signal $\tilde{e}(t)$ [3]. $\tilde{e}(t)$ is the difference between the reference signal $\tilde{r}(t)$ and the array output $\tilde{s}(t)$. The reference signal determines which signals are to be retained in the array output and which are to be nulled. Received signals correlated with $\tilde{r}(t)$ will be retained and signals uncorrelated with $\tilde{r}(t)$ will be nulled. In practical communication systems, $\tilde{r}(t)$ is usually derived from the array output by nonlinear signal processing operations [4-8]. In this paper, we do not address the problem of reference signal generation. We simply assume $\tilde{r}(t)$ to be a signal correlated with the desired signal.

[Figure 1: block diagram showing the array output and the reference signal $\tilde{r}(t)$.]

Fig. 1. Three-element adaptive array.

Let $Y(t)$ be a vector containing the element signals,

$$Y(t) = [\tilde{y}_1(t),\ \tilde{y}_2(t),\ \tilde{y}_3(t)]^T \tag{1}$$

and let $X(t)$ be a vector containing the signals at the LMS processor input,

$$X(t) = [\tilde{x}_1(t),\ \tilde{x}_2(t),\ \tilde{x}_3(t)]^T \tag{2}$$

where T denotes the transpose. We assume below that the array receives a desired signal and an interference signal, and that there is also thermal noise in the element outputs. Thus, $Y(t)$ and $X(t)$ may be written

$$Y(t) = Y_d(t) + Y_i(t) + Y_n(t) \tag{3}$$

and

$$X(t) = X_d(t) + X_i(t) + X_n(t) \tag{4}$$

where $Y_d(t)$, $Y_i(t)$, and $Y_n(t)$ are vectors containing the desired, interference, and thermal noise signals from the elements, and $X_d(t)$, $X_i(t)$, and $X_n(t)$ are the corresponding vectors at the processor input. Now let us define the signals and determine these vectors.

First, we assume the desired signal is frequency hopped with a periodic hopping pattern. We suppose the hopping sequence repeats after p hops. We model the desired signal as a CW signal with constant frequency $\omega_h$ on each hop interval $T_{h-1} \le t < T_h$, where h is an integer denoting the hop interval ($1 \le h \le p$) and $T_h$ is the time at the end of interval h. The duration of hop interval h, $T_h - T_{h-1}$, is called the dwell time. We assume the dwell time is the same for each h. We refer to the separation between two $\omega_h$ that are adjacent in frequency as the frequency spacing, and we assume the $\omega_h$ are equally spaced across the band. (Two $\omega_h$ that occur sequentially in a given hopping pattern are not necessarily adjacent.) We define the hopping frequency $f_H$ to be the number of hops per second (the reciprocal of the dwell time), $f_H = (T_1 - T_0)^{-1}$, and the pattern frequency $f_p$ to be the number of complete hopping periods (or "patterns") per second, i.e., $f_p = (T_p - T_0)^{-1} = f_H/p$. We define the center frequency $\omega_c$ of the desired signal to be the arithmetic mean of the hopping frequencies $\omega_h$. (The antenna elements in Fig. 1 are assumed to be a half wavelength apart at frequency $\omega_c$.) We denote the difference between a specific signal frequency $\omega_h$ and the center frequency $\omega_c$ by $\Delta\omega_h$,

$$\Delta\omega_h = \omega_h - \omega_c. \tag{5}$$

Finally, we define the relative bandwidth $B_r$ of the desired signal to be its total bandwidth divided by its center frequency,

$$B_r = \frac{\max\{\omega_h\} - \min\{\omega_h\}}{\omega_c}. \tag{6}$$

To dehop the desired signal, we assume that the LO in Fig. 1 is hopped synchronously with the received desired signal.¹ The LO signal is

(7)

where $\omega_L$ is the center frequency of the LO. We assume that $\omega_L < \omega_c$ and that the NBF has a center frequency $\omega_c - \omega_L$. The bandwidth of the NBF, which we denote by $B_f$, is assumed smaller than the separation between adjacent hopping frequencies. All the NBFs in Fig. 1 are assumed to be identical. With this model, the vector $Y_d(t)$ is

$$Y_d(t) = A_d \begin{bmatrix} \exp\{j[(\omega_c + \Delta\omega_h)t + \psi_d]\} \\ \exp\{j[(\omega_c + \Delta\omega_h)(t - T_d) + \psi_d]\} \\ \exp\{j[(\omega_c + \Delta\omega_h)(t - 2T_d) + \psi_d]\} \end{bmatrix}, \tag{8}$$

¹We do not address the issue of timing synchronization here. Also, if the desired signal arrives from a direction other than broadside, its hopping will have a different timing on each element, because of the propagation time delay between elements. Thus, strictly speaking, the LO hopping cannot be synchronized exactly with the desired signal hopping on every element. However, we assume the interelement propagation time to be very small compared with the dwell time. In this case, differences in desired signal timing on different elements may be neglected.

where $A_d$ is the desired signal amplitude, $\psi_d$ is the desired signal carrier phase angle, and $T_d$ is the propagation time delay between two adjacent elements,

$$T_d = \frac{\pi}{\omega_c} \sin\theta_d. \tag{9}$$

$\theta_d$ is the desired signal arrival angle with respect to broadside. ($\theta$ is defined in Fig. 1.) $\psi_d$ is assumed to be a random variable uniformly distributed on $(0, 2\pi)$. After dehopping, the desired signal vector $X_d(t)$ is

$$X_d(t) = A_d \exp\{j[(\omega_c - \omega_L)t + \psi_d]\}\, U_d(h), \tag{10}$$

where

$$U_d(h) = [1,\ \exp(-j\phi_d(h)),\ \exp(-j2\phi_d(h))]^T \tag{11}$$

and $\phi_d(h)$ is the desired signal interelement phase shift during interval h,

$$\phi_d(h) = (\omega_c + \Delta\omega_h)\, T_d. \tag{12}$$

Next, consider the interference. Suppose the interference is a CW signal at frequency $\omega_i$ incident on the array from angle $\theta_i$. The interference signal vector $Y_i(t)$ is

$$Y_i(t) = A_i \begin{bmatrix} \exp\{j[\omega_i t + \psi_i]\} \\ \exp\{j[\omega_i (t - T_i) + \psi_i]\} \\ \exp\{j[\omega_i (t - 2T_i) + \psi_i]\} \end{bmatrix}, \tag{13}$$

where $A_i$ is the amplitude, $\psi_i$ is the carrier phase angle, and $T_i$ is the propagation time delay between two adjacent elements,

$$T_i = \frac{\pi}{\omega_c} \sin\theta_i. \tag{14}$$

The carrier phase angle $\psi_i$ is assumed uniformly distributed on $(0, 2\pi)$ and statistically independent of $\psi_d$. After mixing and filtering, the interference signal vector $X_i(t)$ is

$$X_i(t) = A_i'(h) \exp\{j[(\omega_i - \omega_L - \Delta\omega_h)t + \psi_i]\}\, U_i, \tag{15}$$

where

$$A_i'(h) = \begin{cases} A_i, & |\omega_i - \omega_c - \Delta\omega_h| < B_f/2 \\ 0, & \text{otherwise} \end{cases} \tag{16}$$

and

$$U_i = [1,\ \exp(-j\phi_i),\ \exp(-j2\phi_i)]^T \tag{17}$$

with $\phi_i$ the interelement phase shift,

$$\phi_i = \omega_i T_i. \tag{18}$$

Note that the frequency hopping has converted the CW interference signal at the antenna element into a pulsed signal at the processor input.

Finally, we assume the element signals $\tilde{y}_j(t)$ contain white, Gaussian noise. After mixing and filtering, the $\tilde{x}_j(t)$ then contain narrowband Gaussian noise signals. The noise vector $X_n(t)$ is

$$X_n(t) = [\tilde{n}_1(t),\ \tilde{n}_2(t),\ \tilde{n}_3(t)]^T \tag{19}$$

where the $\tilde{n}_j(t)$ are zero mean, Gaussian random processes, each with variance $\sigma^2$. We assume the $\tilde{n}_j(t)$ are statistically independent of each other and of $\psi_d$ and $\psi_i$.

Once the signal vector $X(t)$ is defined, the array weights may be found as follows. The weights satisfy the system of equations [1],

$$\frac{dW(t)}{dt} + k\Phi(t)W(t) = kS(t) \tag{20}$$

where $W(t)$ is the weight vector

$$W(t) = [w_1(t),\ w_2(t),\ w_3(t)]^T, \tag{21}$$

k is the LMS loop gain, $\Phi(t)$ is the covariance matrix,

$$\Phi(t) = E[X^{*}(t)X^{T}(t)], \tag{22}$$

and $S(t)$ is the reference correlation vector,

$$S(t) = E[X^{*}(t)\tilde{r}(t)]. \tag{23}$$

In these equations, $E(\cdot)$ denotes expectation, and * complex conjugate. Since $X_d(t)$, $X_i(t)$, and $X_n(t)$ are uncorrelated with each other, $\Phi(t)$ reduces to

$$\Phi(t) = E[X_d^{*}(t)X_d^{T}(t)] + E[X_i^{*}(t)X_i^{T}(t)] + E[X_n^{*}(t)X_n^{T}(t)] = A_d^2 U_d^{*}(h)U_d^{T}(h) + A_i'^{2}(h)\, U_i^{*}U_i^{T} + \sigma^2 I. \tag{24}$$

To find the reference correlation vector $S(t)$, we must define the reference signal $\tilde{r}(t)$. As discussed above, $\tilde{r}(t)$ must be a signal correlated with $X_d(t)$ and uncorrelated with $X_i(t)$ and $X_n(t)$. We shall assume the reference signal has the same form as the desired signal on channel 1 of the processor, but with amplitude $A_r$,

$$\tilde{r}(t) = A_r \exp\{j[(\omega_c - \omega_L)t + \psi_d]\}. \tag{25}$$

Then, from (23), $S(t)$ is

$$S(t) = A_r A_d U_d^{*}(h). \tag{26}$$

Also, we note that $\Phi(t)$ and $S(t)$ depend only on h, because they are constant during each hop interval. Hence we denote their values during interval h by $\Phi(h)$ and $S(h)$. Thus, for one period of the hopping pattern, $W(t)$ satisfies the sequence of equations,

$$
\begin{aligned}
\frac{dW(t)}{dt} + k\Phi(1)W(t) &= kS(1), & T_0 \le t < T_1 \\
\frac{dW(t)}{dt} + k\Phi(2)W(t) &= kS(2), & T_1 \le t < T_2 \\
&\ \ \vdots \\
\frac{dW(t)}{dt} + k\Phi(p)W(t) &= kS(p), & T_{p-1} \le t < T_p.
\end{aligned} \tag{27}
$$

Suppose $W(T_{h-1})$ is the value of $W(t)$ at the end of interval h - 1. Then, since $W(t)$ is continuous at each hop, $W(T_{h-1})$ is also the initial value of $W(t)$ for interval h. Hence, the solution to this sequence of equations is²

$$
\begin{aligned}
W(t) &= \exp[-k\Phi(1)(t - T_0)]\,[W(T_0) - \Phi^{-1}(1)S(1)] + \Phi^{-1}(1)S(1), & T_0 \le t < T_1 \\
W(t) &= \exp[-k\Phi(2)(t - T_1)]\,[W(T_1) - \Phi^{-1}(2)S(2)] + \Phi^{-1}(2)S(2), & T_1 \le t < T_2 \\
&\ \ \vdots \\
W(t) &= \exp[-k\Phi(p)(t - T_{p-1})]\,[W(T_{p-1}) - \Phi^{-1}(p)S(p)] + \Phi^{-1}(p)S(p), & T_{p-1} \le t < T_p
\end{aligned} \tag{28}
$$

where $W(T_0)$, $W(T_1)$, ..., $W(T_{p-1})$ are the initial weight vectors for each interval. If these initial vectors were known, we could calculate $W(t)$ at any other time from these equations.

To determine the $W(T_h)$, we proceed as follows. Because the hopping pattern is periodic, $\Phi(t)$ and $S(t)$ are both periodic functions of time. Thus, $W(t)$ satisfies a differential equation with periodic coefficients and a periodic driving term. The solution to such an equation will also be a periodic function of time after any initial transients have died out.³ In this paper we concentrate on the periodic steady-state behavior of $W(t)$. We do not consider initial transients.

Once $W(t)$ has settled into its periodic steady state, the initial weight vectors $W(T_0)$, $W(T_1)$, ..., $W(T_{p-1})$ may be found by invoking the periodicity of $W(t)$ to note that $W(T_p)$ must be the same as $W(T_0)$. Thus, using (28) to compute $W(t)$ at the end of each interval, we have the following relations among the initial vectors $W(T_h)$,

$$
\begin{aligned}
W_1 &= \exp[-k\Phi(1)(T_1 - T_0)]\,[W_0 - \Phi^{-1}(1)S(1)] + \Phi^{-1}(1)S(1) \\
W_2 &= \exp[-k\Phi(2)(T_2 - T_1)]\,[W_1 - \Phi^{-1}(2)S(2)] + \Phi^{-1}(2)S(2) \\
&\ \ \vdots \\
W_{p-1} &= \exp[-k\Phi(p-1)(T_{p-1} - T_{p-2})]\,[W_{p-2} - \Phi^{-1}(p-1)S(p-1)] + \Phi^{-1}(p-1)S(p-1) \\
W_0 &= \exp[-k\Phi(p)(T_p - T_{p-1})]\,[W_{p-1} - \Phi^{-1}(p)S(p)] + \Phi^{-1}(p)S(p)
\end{aligned} \tag{29}
$$

where, to simplify notation, we have used $W_i = W(T_i)$. Note that in the last equation we have replaced $W_p$ by $W_0$. If the initial vectors $W_i$ in (29) are regarded as unknowns, (29) has as many equations as unknowns. Rearranging (29) gives the following system of equations for the $W_i$:

$$
\begin{bmatrix}
I & 0 & \cdots & 0 & -\exp[-k\Phi(p)(T_p - T_{p-1})] \\
-\exp[-k\Phi(1)(T_1 - T_0)] & I & \cdots & 0 & 0 \\
0 & -\exp[-k\Phi(2)(T_2 - T_1)] & \ddots & & \vdots \\
\vdots & & \ddots & I & 0 \\
0 & 0 & \cdots & -\exp[-k\Phi(p-1)(T_{p-1} - T_{p-2})] & I
\end{bmatrix}
\begin{bmatrix} W_0 \\ W_1 \\ \vdots \\ W_{p-1} \end{bmatrix}
=
\begin{bmatrix}
[I - \exp[-k\Phi(p)(T_p - T_{p-1})]]\,\Phi^{-1}(p)S(p) \\
[I - \exp[-k\Phi(1)(T_1 - T_0)]]\,\Phi^{-1}(1)S(1) \\
\vdots \\
[I - \exp[-k\Phi(p-1)(T_{p-1} - T_{p-2})]]\,\Phi^{-1}(p-1)S(p-1)
\end{bmatrix}. \tag{30}
$$

²Matrix exponentials, such as $\exp[-k\Phi(1)(t - T_0)]$, are defined in [10].

³This statement can be proven in the same manner as in [11, eqs. (24)-(29)]. A general proof of this property of differential equations may be found in D'Angelo [12].
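The periodic steady state defined by (28)-(30) can be explored numerically. The following is a minimal sketch, not the authors' program. It assumes NumPy, evaluates the matrix exponentials by eigendecomposition of the Hermitian covariance matrix rather than by the Sylvester's theorem procedure the paper cites, and uses hypothetical function names. The example scenario is only loosely based on the Fig. 2 parameters: the loop gain, dwell time, amplitudes, and the half-wavelength phase-shift formulas used to build the steering vectors are all assumptions made for illustration.

```python
import numpy as np


def steering(phase: float, n: int = 3) -> np.ndarray:
    """U = [1, exp(-j*phase), exp(-j*2*phase)]^T, the form used in (17)."""
    return np.exp(-1j * phase * np.arange(n))


def expm_hermitian(mat: np.ndarray, scale: float) -> np.ndarray:
    """exp(scale * mat) for a Hermitian matrix, via its eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    return (eigvecs * np.exp(scale * eigvals)) @ eigvecs.conj().T


def propagate(w_start, cov, ref, k, duration):
    """Advance W across one hop interval using the closed form in (28)."""
    w_inf = np.linalg.solve(cov, ref)  # Phi^{-1}(h) S(h)
    return expm_hermitian(cov, -k * duration) @ (w_start - w_inf) + w_inf


def periodic_initial_weights(covs, refs, k, dwell):
    """Solve the block system (30) for W_0, ..., W_{p-1}."""
    p, n = len(covs), covs[0].shape[0]
    lhs = np.zeros((p * n, p * n), dtype=complex)
    rhs = np.zeros(p * n, dtype=complex)
    for h in range(p):  # list index h holds hop interval h + 1
        decay = expm_hermitian(covs[h], -k * dwell)
        w_inf = np.linalg.solve(covs[h], refs[h])
        row = ((h + 1) % p) * n  # equation for W_{h+1}, with W_p = W_0
        col = h * n
        lhs[row:row + n, row:row + n] += np.eye(n)
        lhs[row:row + n, col:col + n] -= decay
        rhs[row:row + n] = (np.eye(n) - decay) @ w_inf
    return np.linalg.solve(lhs, rhs).reshape(p, n)


def example_scenario():
    """Illustrative two-hop case (p = 2); every number here is an assumption."""
    theta_d, theta_i = np.radians(15.0), np.radians(30.0)
    rel_offsets = [-0.05, 0.05]        # delta_omega_h / omega_c for B_r = 0.1
    a_d, a_r, sigma2 = 2.0, 1.0, 1.0   # roughly 6 dB input SNR
    a_i = 10.0                         # 20 dB input INR
    u_i = steering(np.pi * (1 + rel_offsets[0]) * np.sin(theta_i))
    covs, refs = [], []
    for h, offset in enumerate(rel_offsets):
        u_d = steering(np.pi * (1 + offset) * np.sin(theta_d))
        a_i_h = a_i if h == 0 else 0.0  # interference passes the NBF on hop 1 only
        covs.append(a_d**2 * np.outer(u_d.conj(), u_d)
                    + a_i_h**2 * np.outer(u_i.conj(), u_i)
                    + sigma2 * np.eye(3))          # form of (24)
        refs.append(a_r * a_d * u_d.conj())        # form of (26)
    return covs, refs


if __name__ == "__main__":
    covs, refs = example_scenario()
    weights = periodic_initial_weights(covs, refs, k=0.05, dwell=1.0)
    for h, w in enumerate(weights):
        print(f"W_{h} =", np.round(w, 4))
```

Once the initial vectors are known, `propagate` gives W(t) at any time inside a hop interval, which is what the modulation and SINR expressions later in the paper require.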


This system may be solved numerically⁴ for the initial vectors $W_i = W(T_i)$, and $W(t)$ may then be found for other times from (28).

⁴To solve (30), one must evaluate matrix exponentials such as $\exp[-k\Phi(1)(T_1 - T_0)]$. These may be computed by means of Sylvester's theorem [13], according to the procedure discussed in [11, eqs. (47)-(55)].

Once the weights have been found, the array performance may be calculated. First, the time-varying weights cause the array to modulate the desired signal. The desired signal at the array output is

$$s_d(t) = W^T(t)X_d(t) \tag{31}$$

or, from (10),

$$s_d(t) = A_d W^T(t)U_d(h)\exp\{j[(\omega_c - \omega_d)t + \psi_d]\}, \qquad T_{h-1} \le t < T_h. \tag{32}$$

To characterize the desired signal modulation, we define the envelope modulation $a_d(t)$ and the phase modulation $\eta_d(t)$ by

$$a_d(t) = A_d\,|W^T(t)U_d(h)|, \qquad T_{h-1} \le t < T_h \tag{33}$$

and

$$\eta_d(t) = \angle\,[W^T(t)U_d(h)], \qquad T_{h-1} \le t < T_h. \tag{34}$$

The output signal powers also vary with time. The output desired signal power is

$$P_d = \frac{A_d^2}{2}\,|W^T(t)U_d(h)|^2, \qquad T_{h-1} \le t < T_h, \tag{35}$$

the output interference power is

$$P_i = \frac{\sigma^2\,\xi_i'(h)}{2}\,|W^T(t)U_i|^2, \qquad T_{h-1} \le t < T_h, \tag{36}$$

and the output thermal noise power is

$$P_n = \frac{\sigma^2}{2}\,|W(t)|^2. \tag{37}$$

The output SINR is

$$\mathrm{SINR} = \frac{P_d}{P_i + P_n} = \frac{\xi_d\,|W^T(t)U_d(h)|^2}{|W(t)|^2 + \xi_i'(h)\,|W^T(t)U_i|^2}, \qquad T_{h-1} \le t < T_h \tag{38}$$

where

$$\xi_d = A_d^2/\sigma^2 \tag{39}$$

is the input SNR per element, $\xi_i'(h)$ is the input INR in each processor channel during interval $h$,

$$\xi_i'(h) = \begin{cases} \xi_i, & |\omega_i - \omega_c - \Delta\omega(h)| < B_f/2\\ 0, & \text{otherwise,} \end{cases} \tag{40}$$

and $\xi_i$ is

$$\xi_i = A_i^2/\sigma^2. \tag{41}$$

In Section III we refer to $\xi_i$ as the "input INR." The reader will understand that $\xi_i$ is actually the INR on each processor channel only for those hopping intervals when the interference appears in the filter output. In Section III we use these equations to calculate the array performance with frequency hopped signals.

III. RESULTS

Using the equations above, we have computed the signal modulation and the SINR for a variety of cases. We present these results as follows. First, in Section IIIA, we show typical curves of the desired signal envelope and phase modulation and the output SINR as functions of time. To characterize these time-varying quantities in a simple way, we also define an envelope variation, a phase variation, and a bit error probability. Then, in Sections IIIB-F, we describe how each signal parameter affects the envelope and phase variations and the bit error probability. Section IIIB discusses hopping frequency, Section IIIC frequency jump size, Section IIID interference frequency, Section IIIE arrival angles, and Section IIIF signal powers.

A. Typical Curves

First we consider envelope modulation. Fig. 2 is a typical curve, computed for $\theta_d = 15°$, $\theta_i = 30°$, $\xi_d = 6$ dB, $\xi_i = 20$ dB, $B_r = 0.1$, and $p = 2$. The figure shows the output desired signal envelope versus time. The envelope is plotted in dB, relative to the envelope that would exist with no frequency hopping and no interference. The time axis is normalized so a complete hopping period begins at $t = 0$ and ends at $t = 1$. The desired signal is on frequency $\omega_1 = 0.95\omega_c$ for $0 < t < 1/2$ and on frequency $\omega_2 = 1.05\omega_c$ for $1/2 < t < 1$. The interference is on frequency $\omega_1$ (i.e., $\omega_i = 0.95\omega_c$).

Fig. 2. Desired amplitude versus time (3-element LMS array). $\theta_d = 15°$, $\theta_i = 30°$, $\xi_d = 6$ dB, $\xi_i = 20$ dB, $p = 2$, $B_r = 0.1$, $\omega_i = \omega_1$.

As may be seen, there is significant envelope modulation on the output desired signal. Moreover, step discontinuities occur in the envelope when the frequency jumps. The reason for this behavior is as follows. A jump in desired signal frequency, at a given arrival angle, is electrically equivalent to a jump in desired signal arrival angle with no change in frequency, because either situation causes a jump in the interelement phase shift. In Fig. 2, the desired signal arrives from $\theta_d = 15°$ with frequency $0.95\omega_c$ for $0 < t < 1/2$ and $1.05\omega_c$ for $1/2 < t < 1$. This situation is electrically the same as if the desired signal were always on frequency $1.05\omega_c$ and arrived from $\theta = 13.54°$ for $0 < t < 1/2$ and from $\theta = 15°$ for $1/2 < t < 1$. Fig. 3 shows the pattern of the array computed at frequency $1.05\omega_c$ for several times during

the hopping period. The interference is at $\theta = 30°$ on frequency $0.95\omega_c$, which is equivalent to interference arriving from $\theta = 26.9°$ at frequency $1.05\omega_c$. As may be seen in Fig. 3, at the end of interval 1 (at $t = 1/2$), the array has formed a null on the interference at $\theta = 26.9°$. In the second hop interval, the interference disappears (it is filtered out) and the desired signal angle jumps from 13.54° to 15°. Because this jump is toward the existing null, an instantaneous drop in the desired signal amplitude occurs at the beginning of interval 2. During interval 2, the desired signal amplitude increases as the array adapts. Then at the end of interval 2, the desired signal angle jumps back from 15° to 13.54° (away from the null), so the desired signal amplitude jumps up instantaneously. After this jump, the desired signal amplitude drops as the weights form the null at 26.9° again, since the interference has reappeared at 26.9° during interval 1.

Fig. 3. Array patterns during hopping period. $\theta_d = 15°$, $\theta_i = 30°$, $\xi_d = 6$ dB, $\xi_i = 20$ dB, $p = 2$, $B_r = 0.1$, $\omega_i = \omega_1$. Patterns computed at $\omega_2$.

In order to characterize such a time-varying waveform in a simple way for use below, we define an envelope variation as follows. Let $a_{\max}$ be the largest and $a_{\min}$ be the smallest (absolute) value of the output desired signal envelope during the hopping period. Then let

$$m = \frac{a_{\max} - a_{\min}}{a_{\max}}. \tag{42}$$

$m$ is the fractional modulation, since it is the total excursion of the envelope normalized to its peak. We refer to $m$ as the envelope variation.

Next we consider phase modulation. Fig. 4 shows a typical curve of output desired signal phase versus time for the case $\theta_d = 15°$, $\theta_i = 45°$, $\xi_d = 6$ dB, $\xi_i = 20$ dB, $B_r = 0.5$, $p = 2$, and $\omega_i = \omega_1$. As may be seen, there is substantial phase modulation on the output desired signal. Fig. 4 is typical of what usually happens: the desired signal phase jumps up or down at the beginning of each hop interval and then decays back to zero. This phase modulation is due to the frequency hopping on the desired signal and not to the presence of interference. Fig. 5 shows the output desired signal phase versus time for the same situation as in Fig. 4 but without interference. Note that almost the same phase modulation occurs in both cases. The phase modulation occurs because the desired signal interelement phase shift jumps when the frequency hops (unless the signal is incident from broadside). Hence, immediately after each hop, the array weights no longer have the correct phasing to maximize desired signal response. As the weights respond after each hop, the phase shift of the array seen by the desired signal changes with time.

Fig. 4. Desired signal phase angle versus time (3-element LMS array). $\theta_d = 15°$, $\theta_i = 45°$, $\xi_d = 6$ dB, $\xi_i = 20$ dB, $p = 2$, $B_r = 0.5$, $\omega_i = \omega_1$.

Fig. 5. Desired signal phase angle versus time (3-element LMS array). $\theta_d = 15°$, $\xi_d = 6$ dB, $p = 2$, $B_r = 0.5$, no interference.

To characterize phase modulation, we define $\eta_{\max}$ and $\eta_{\min}$ to be the maximum and minimum phase angles of the output desired signal over the hopping period ($-\pi \le \eta_{\min} \le \eta_{\max} \le \pi$) and $\beta$ to be

$$\beta = \frac{\eta_{\max} - \eta_{\min}}{2\pi}. \tag{43}$$
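The frequency/angle equivalence and the two variation measures (42) and (43) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the array is taken to be a uniform linear array so that the interelement phase shift is proportional to $\omega \sin\theta$, and the envelope is supplied as absolute (not dB) samples over one hopping period.

```python
import math

def equivalent_angle_deg(theta_deg, f_signal, f_pattern):
    """Angle at f_pattern that gives the same interelement phase shift
    as arrival angle theta_deg at f_signal (uniform linear array assumed)."""
    s = (f_signal / f_pattern) * math.sin(math.radians(theta_deg))
    return math.degrees(math.asin(s))

def envelope_variation(envelope):
    """Eq. (42): total envelope excursion normalized to its peak."""
    a_max, a_min = max(envelope), min(envelope)
    return (a_max - a_min) / a_max

def phase_variation(phase_rad):
    """Eq. (43): phase excursion in radians normalized to 2*pi."""
    return (max(phase_rad) - min(phase_rad)) / (2 * math.pi)

# Desired signal at 15 deg and interference at 30 deg, both on 0.95*wc,
# viewed on the 1.05*wc pattern as in Figs. 2 and 3.
print(round(equivalent_angle_deg(15.0, 0.95, 1.05), 2))
print(round(equivalent_angle_deg(30.0, 0.95, 1.05), 2))
```

The two printed angles should agree closely with the 13.54° and 26.9° quoted in the text.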

We refer to $\beta$ as the phase variation.

Finally, we consider the output SINR from the array. Fig. 6 shows a typical curve of SINR versus time over one hopping period. This curve is computed for the same parameters as in Fig. 2. Note that the SINR drops approximately 16 dB at the beginning of the hopping period. Although the SINR recovers quickly from this drop, such a drop can nevertheless greatly increase the bit error probability for the received signal, since the number of bit detection errors is larger by orders of magnitude during this short interval than it is when the SINR is higher.

Fig. 6. SINR versus time (3-element LMS array). $\theta_d = 15°$, $\theta_i = 30°$, $\xi_d = 6$ dB, $\xi_i = 20$ dB, $p = 2$, $B_r = 0.1$, $\omega_i = \omega_1$.

To characterize such a time-varying SINR curve, we define an average bit error probability. We arbitrarily assume that the desired signal, in addition to being frequency hopped, has binary differential phase shift keyed (DPSK) modulation [14].⁵ The bit error probability for a DPSK signal in white noise is [15]

$$P_e = \tfrac{1}{2}\exp(-E_b/N_0) \tag{44}$$

where $E_b$ is the signal energy per bit and $N_0$ is the one-sided thermal noise spectral density. For our purposes, we may replace $E_b/N_0$ by the signal-to-noise ratio,

$$\frac{E_b}{N_0} = \frac{P_d T_b}{N_0} = \frac{P_d}{N_0/T_b} = \mathrm{SNR} \tag{45}$$

where $P_d$ is the desired signal power and $T_b$ is the bit duration; i.e., since $T_b^{-1}$ is the effective noise bandwidth, $N_0/T_b$ is the noise power. Hence $P_e$ may be written

$$P_e = \tfrac{1}{2}\exp(-\mathrm{SNR}). \tag{46}$$

In addition, for this analysis we assume that interference power has the same effect on detector performance as thermal noise power, so

$$P_e = \tfrac{1}{2}\exp(-\mathrm{SINR}). \tag{47}$$

Finally, we assume that the SINR transients and the desired signal phase modulation produced by the array are slow compared with the bit length $T_b$. In this case the SINR and the signal phase may be considered constant over a time interval of $2T_b$. (Two adjacent bits are used to detect a DPSK signal.) We define the effective bit error probability $\bar{P}_e$ as the average of $P_e$ over one hopping period,

$$\bar{P}_e = \frac{1}{T_p}\int_0^{T_p} \tfrac{1}{2}\exp[-\mathrm{SINR}(t)]\,dt. \tag{48}$$

⁵Adding biphase modulation to the desired signal does not change the array weight behavior as long as the bandwidth of the phase modulation is small enough to pass through the dehopping filters. With small bandwidth, the covariance matrix $\Phi(t)$ is the same as for a CW signal because the phase modulation terms cancel out. (Also, aside from the dehopping filters, it has been shown [16] that desired signal bandwidth has almost no effect on array performance anyway, even if the bandwidth is large.) Moreover, as long as the reference signal carries the same DPSK modulation as the desired signal, which we assume, the reference correlation vector $S(t)$ is also unchanged. Since both $\Phi(t)$ and $S(t)$ are the same, the weight behavior will be the same with this signal as with a CW signal. (Some examples of how digital phase modulation can be transferred to the reference signal may be found in [4-8].)
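To make the averaging in (48) concrete, the following sketch estimates $\bar{P}_e$ from SINR samples taken over one hopping period. It is an illustration only: the function names are hypothetical, the samples are assumed to be given in dB at increasing times, and the integral is approximated with the trapezoidal rule rather than any method used by the authors.

```python
import math

def dpsk_bit_error(sinr):
    """Eq. (47): DPSK bit error probability for a linear (not dB) SINR."""
    return 0.5 * math.exp(-sinr)

def average_bit_error(times, sinr_db):
    """Trapezoidal estimate of eq. (48) from SINR samples in dB."""
    pe = [dpsk_bit_error(10 ** (s / 10)) for s in sinr_db]
    area = sum(0.5 * (pe[i] + pe[i + 1]) * (times[i + 1] - times[i])
               for i in range(len(times) - 1))
    return area / (times[-1] - times[0])

# Hypothetical SINR curve: a 16 dB dip over the first tenth of the period.
times = [0.0, 0.05, 0.1, 1.0]
sinr_db = [-6.0, -6.0, 10.0, 10.0]
print(average_bit_error(times, sinr_db), dpsk_bit_error(10.0))
```

With these made-up samples the average is dominated by the short dip, which is the effect the text describes: a brief SINR drop can raise the average bit error probability by orders of magnitude.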

We use (48) below for comparing different SINR curves.⁶ Now let us consider the effect of the various signal parameters on the signal modulation and the bit error probability.

B. The Effect of Hopping Frequency

The envelope and phase variations $m$ and $\beta$ are large at low hopping frequency and drop as the frequency increases. Bit error probability, on the other hand, is low at low frequency and increases with frequency. Both $m$ and $\bar{P}_e$ may have local peaks at intermediate frequencies.

Fig. 7 shows typical curves of $m$ versus (pattern) frequency. This set of curves was computed for $\theta_d = 15°$, $\theta_i = 45°$, $\xi_d = 6$ dB, $\xi_i = 40$ dB, $p = 2$, and $\omega_i = \omega_1$. The figure shows several curves for different hopping bandwidths $B_r$. As may be seen, for all but the smallest bandwidths, $m$ has a complicated behavior at intermediate frequencies. This behavior occurs because of the way the desired signal envelope changes as the hopping frequency varies. In particular, the time at which $a_{\min}$ occurs is in one hopping interval for low values of $f_p$ and in the other hopping interval for high values of $f_p$. Typically the smallest envelope in one interval increases with $f_p$, while the smallest envelope in the other interval decreases with $f_p$. At the value of $f_p$ where the two minima become equal, the location of $a_{\min}$ in time changes from one hop interval to the other. At this change, the slope of $a_{\min}$ versus $f_p$ reverses, so the slope of $m$ changes in Fig. 7.

Fig. 7. Envelope variation versus pattern frequency (3-element LMS array). $\theta_d = 15°$, $\theta_i = 45°$, $\xi_d = 6$ dB, $\xi_i = 40$ dB, $p = 2$, $\omega_i = \omega_1$.

⁶We recognize that (48) glosses over many subtleties that will affect detector performance in an actual system. Our intent here is simply to reduce each SINR curve to a single number to compare different SINR curves. In the absence of a specific system definition, (48) will do as well as anything.

Fig. 8 shows typical curves of phase variation $\beta$ versus $f_p$. These curves were computed for the same

parameters as in Fig. 7. In general, phase variation is highest at low hopping frequency and drops to a constant as the hopping frequency increases. At large $f_p$, the array weights are too slow to track the hopping. The nonzero asymptotic phase variation is caused by the jumps in interelement phase shift when the frequency hops.

Fig. 8. Phase variation versus pattern frequency (3-element LMS array). $\theta_d = 15°$, $\theta_i = 45°$, $\xi_d = 6$ dB, $\xi_i = 40$ dB, $p = 2$, $\omega_i = \omega_1$.

Finally, Fig. 9 shows typical curves of bit error probability $\bar{P}_e$ versus $f_p$, again for the same parameters as in Fig. 7. As may be seen, for higher bandwidths $\bar{P}_e$ simply increases with $f_p$. At lower bandwidths, $\bar{P}_e$ peaks at intermediate $f_p$ and then drops at higher $f_p$. However, $\bar{P}_e$ is always higher at large $f_p$ than at low $f_p$. This behavior is similar to what happens when an adaptive array receives pulsed interference and a desired signal with no hopping [11]. (Frequency hopping converts the CW interference into pulsed interference. The two problems differ, however, because frequency hopping also causes jumps in the desired signal interelement phase shifts.)

Fig. 9. Bit error probability versus pattern frequency (3-element LMS array). $\theta_d = 15°$, $\theta_i = 45°$, $\xi_d = 6$ dB, $\xi_i = 40$ dB, $p = 2$, $\omega_i = \omega_1$.

C. The Effect of Frequency Jump Size

The larger the frequency jumps encountered by the array, the larger the variations $m$ and $\beta$ and the greater the SINR reduction. In a frequency hopped system, the size of the frequency jumps depends not only on the frequency spacing (the total bandwidth divided by the number of frequencies), but also on the hopping pattern. For the same spacing, different hopping patterns will produce different frequency jumps. Moreover, bit error probability is affected not only by the size of the frequency jumps, but also by how often the jumps occur, since it is an integrated quantity. In general, to minimize $\bar{P}_e$, one should choose a hopping pattern that minimizes the number of large jumps and also reduces the frequency with which large jumps occur.

The effect of frequency jump size may be seen in Figs. 7, 8, and 9. These figures each show several curves for different bandwidths $B_r$. Since there are only two frequencies ($p = 2$), the total bandwidth is the same as the frequency spacing and the frequency jump size. As may be seen, as bandwidth increases, the variations $m$ and $\beta$ and the bit error probability $\bar{P}_e$ all increase. This behavior is easily understood. As the desired signal frequency jumps become larger, the jump in interelement phase shift at each hop becomes larger. A larger jump means that the array weights are farther from their optimal values at the new frequency. Thus, a larger weight transient is required after the jump. More

envelope and phase modulation is produced and the SINR is lower after the jump.

D. The Effect of Interference Frequency

Interference near the edge of the hopping bandwidth is more harmful to the array than interference near the center of the band. Figs. 10 and 11 illustrate this point. These figures show $\bar{P}_e$ versus $f_p$ for $\theta_d = 15°$, $\theta_i = 45°$, $\xi_d = 6$ dB, $\xi_i = 40$ dB, and $p = 3$. Fig. 10 is for $\omega_i = \omega_1$ and Fig. 11 is for $\omega_i = \omega_2$ (where $\omega_1 < \omega_2 < \omega_3$). The performance in Fig. 10, when the interference is at the edge of the hopping band, is much worse than that in Fig. 11, when the interference is at the center of the band.

Fig. 10. Bit error probability versus pattern frequency (3-element LMS array). $\theta_d = 15°$, $\theta_i = 45°$, $\xi_d = 6$ dB, $\xi_i = 40$ dB, $p = 3$, $\omega_i = \omega_1$.

Fig. 11. Bit error probability versus pattern frequency (3-element LMS array). $\theta_d = 15°$, $\theta_i = 45°$, $\xi_d = 6$ dB, $\xi_i = 40$ dB, $p = 3$, $\omega_i = \omega_2$.

The reason for this difference may be understood in terms of the equivalence between desired signal frequency and arrival angle discussed earlier. Suppose $0 < \theta_d < \theta_i$, as in Fig. 10. The array will produce a null in the pattern at $\theta_i$ on the interference frequency, either $\omega_1$ or $\omega_2$. Since $0 < \theta_d < \theta_i$, the equivalent desired signal arrival angle, as seen on the interference frequency, will be closest to $\theta_i$ when the desired signal is on frequency $\omega_3$. The equivalent angle will be closer to $\theta_i$ if the interference is on $\omega_1$ than if the interference is on $\omega_2$, since $\omega_3 - \omega_1$ is greater than $\omega_3 - \omega_2$. Hence, with interference on $\omega_1$, the desired signal falls farther into the interference null, the SINR is reduced more, and a higher $\bar{P}_e$ results than with interference on $\omega_2$.

Note that the edge of the band that is worse depends on the signal arrival angles. In the example above, we have $0 < \theta_d < \theta_i$, and the worst performance is obtained with the interference on $\omega_1$. If instead we have $0 < \theta_i < \theta_d$, then interference on $\omega_3$, the other band edge, will give the worse performance.

E. The Effect of Arrival Angles

The envelope variation and the bit error probability increase as the interference arrival angle approaches the desired signal arrival angle. Interference arrival angle has almost no effect on phase modulation except when the interference signal is extremely close to the desired signal.

Figs. 12 and 13 show the envelope variation $m$ as a function of pattern frequency for $\theta_d = 15°$, $\xi_d = 6$ dB, $\xi_i = 40$ dB, $B_r = 0.1$, $p = 2$, $\omega_i = \omega_1$, and for different interference angles. Fig. 12 shows $\theta_i = 0°$, 30°, 45°, 60°, and 90°, and Fig. 13 shows $\theta_i = 5°$, 10°, 13°, 17°, 20°, and 25°. It may be seen that $m$ increases as $|\theta_i - \theta_d|$ decreases. For $\theta_i$ very near $\theta_d$, the variation $m$ is quite large.

Fig. 12. Envelope variation versus pattern frequency (3-element LMS array). $\theta_d = 15°$, $\xi_d = 6$ dB, $\xi_i = 40$ dB, $p = 2$, $B_r = 0.1$, $\omega_i = \omega_1$.

Fig. 13. Envelope variation versus pattern frequency (3-element LMS array). $\theta_d = 15°$, $\xi_d = 6$ dB, $\xi_i = 40$ dB, $p = 2$, $B_r = 0.1$, $\omega_i = \omega_1$.

The phase variation $\beta$ is small unless $\theta_i$ is very near $\theta_d$. Fig. 14 shows a typical case, for $\theta_d = 15°$, $\xi_d = 6$ dB, $\xi_i = 40$ dB, $B_r = 0.5$, $p = 2$, and $\omega_i = \omega_1$. Note

also that $\beta$ is much larger for $\theta_i$ just above $\theta_d$ than for $\theta_i$ just below $\theta_d$. The reason is as discussed above: in one case the desired signal hops into the null left by the interference, whereas in the other it hops away from the null. Fig. 15 shows curves of the bit error probability for the same parameter values as in Fig. 12. It is seen that $\bar{P}_e$ is largest when $\theta_i$ is near $\theta_d$.

Fig. 14. Phase variation versus pattern frequency (3-element LMS array). $\theta_d = 15°$, $\xi_d = 6$ dB, $\xi_i = 40$ dB, $p = 2$, $B_r = 0.5$, $\omega_i = \omega_1$.

Fig. 15. Bit error probability versus pattern frequency (3-element LMS array). $\theta_d = 15°$, $\xi_d = 6$ dB, $\xi_i = 40$ dB, $p = 2$, $B_r = 0.1$, $\omega_i = \omega_1$.

F. The Effect of Signal Powers

The input INR has almost no effect on the phase modulation and very little effect on the envelope modulation. The effect of the INR on the bit error probability is primarily to shift the value of the hopping frequency for a given $\bar{P}_e$. Bit error probability is very sensitive to the input SNR.

Figs. 16 and 17 show the envelope variation and the bit error probability versus $f_p$ for $\theta_d = 15°$, $\theta_i = 45°$, $\xi_d = 6$ dB, $p = 2$, $\omega_i = \omega_1$, $B_r = 0.1$, and for several values of INR. Fig. 17 illustrates how the hopping frequency at which $\bar{P}_e$ peaks varies with the INR. Fig. 18 shows $\beta$ versus $f_p$ for the same parameters as in Figs. 16 and 17 except that $B_r = 0.5$. Fig. 19 shows the bit error probability versus $f_p$ for $\theta_d = 15°$, $\theta_i = 45°$, $\xi_i = 40$ dB, $p = 2$, $\omega_i = \omega_1$, $B_r = 0.1$, and for several values of input SNR. As may be seen, $\bar{P}_e$ is extremely sensitive to the SNR, as it would be even in a simple DPSK communication system without an adaptive array.

Fig. 16. Envelope variation versus pattern frequency (3-element LMS array). $\theta_d = 15°$, $\theta_i = 45°$, $\xi_d = 6$ dB, $p = 2$, $B_r = 0.1$, $\omega_i = \omega_1$.

Fig. 17. Bit error probability versus pattern frequency (3-element LMS array). $\theta_d = 15°$, $\theta_i = 45°$, $\xi_d = 6$ dB, $p = 2$, $B_r = 0.1$, $\omega_i = \omega_1$.

Fig. 18. Phase variation versus pattern frequency (3-element LMS array). $\theta_d = 15°$, $\theta_i = 45°$, $\xi_d = 6$ dB, $p = 2$, $B_r = 0.5$, $\omega_i = \omega_1$.

Fig. 19. Bit error probability versus pattern frequency (3-element LMS array). $\theta_d = 15°$, $\theta_i = 45°$, $\xi_i = 40$ dB, $p = 2$, $B_r = 0.1$, $\omega_i = \omega_1$.

IV. CONCLUSIONS

A frequency hopped desired signal has several effects on an LMS array. It causes the array to modulate both the envelope and the phase of the output desired signal. Also, it causes the array output SINR to vary with time below its optimal value and increases the bit error probability for the received signal. The signal parameters affect the desired signal modulation and the bit error probability as follows:

(1) Envelope and phase modulation are large for low hopping frequencies and drop as the frequency increases. Bit error probability is low at low hopping frequencies and increases with frequency. Both the envelope variation and the bit error probability may have local peaks at intermediate hopping frequencies.

(2) Envelope and phase modulation increase with the size of the frequency jumps in the hopping pattern. Bit error probability also increases as the frequency jump size increases.

(3) An interference frequency at the edge of the hopping bandwidth is more harmful to the array performance than an interference frequency at the center of the band.

(4) Envelope modulation and bit error probability increase as the interference arrival angle approaches the desired signal arrival angle. Phase modulation is not affected by interference arrival angle unless the interference is extremely close to the desired signal.

(5) Input INR has almost no effect on phase modulation and very little effect on envelope modulation. Input INR affects bit error probability by shifting the value of the hopping frequency required for a given bit error probability. Input SNR has a very large effect on bit error probability, as it would in any DPSK system, even without an adaptive array.
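Conclusion (2) depends on the size of the frequency jumps, and Section IIIC notes that the same frequency spacing can give different jumps under different hopping patterns. The sketch below, which is only illustrative, lists the jumps over one period of a periodic pattern. The function name and the channel-index patterns are hypothetical and are not taken from the paper.

```python
def jump_sizes(pattern):
    """Absolute frequency jumps over one period of a periodic hopping pattern.

    pattern holds the channel index (or frequency) used in each hop interval;
    the jump from the last interval back to the first is included.
    """
    n = len(pattern)
    return [abs(pattern[(i + 1) % n] - pattern[i]) for i in range(n)]

# Two hypothetical patterns over the same four equally spaced channels.
sawtooth = [1, 2, 3, 4]
folded = [1, 2, 4, 3]
print(jump_sizes(sawtooth), max(jump_sizes(sawtooth)))
print(jump_sizes(folded), max(jump_sizes(folded)))
```

In this made-up case the sawtooth pattern has one jump spanning the full band once per period, while the folded pattern never jumps more than two channel spacings. A comparison of this kind is one simple way to screen candidate patterns before computing $\bar{P}_e$.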

REFERENCES

[1] Widrow, B., Mantey, P.E., Griffiths, L.J., and Goode, B.B. (1967) Adaptive antenna systems. Proceedings of the IEEE, 55 (Dec. 1967), 2143.

[2] Compton, R.T., Jr. (1976) An experimental four-element adaptive array. IEEE Transactions on Antennas and Propagation, AP-24 (Sept. 1976), 697.

[3] Riegler, R.L., and Compton, R.T., Jr. (1973) An adaptive array for interference rejection. Proceedings of the IEEE, 61 (June 1973), 748.

[4] Compton, R.T., Jr., Huff, R.J., Swarner, W.G., and Ksienski, A.A. (1976) Adaptive arrays for communication systems: An overview of research at the Ohio State University. IEEE Transactions on Antennas and Propagation, AP-24 (Sept. 1976), 599.

[5] Hudson, E.C. (1980) Use of an adaptive array in a frequency-shift keyed communication system. Report 712684-1, Ohio State University ElectroScience Laboratory, Columbus, Aug. 1980.

[6] Ganz, M.W. (1982) On the performance of an adaptive array in a frequency shift keyed communication system. M.Sc. thesis, Department of Electrical Engineering, Ohio State University, Columbus, 1982.

[7] Compton, R.T., Jr. (1978) An adaptive array in a spread spectrum communication system. Proceedings of the IEEE, 66 (Mar. 1978), 289.

[8] Winters, J.H. (1982) Spread spectrum in a four-phase communication system employing adaptive antennas. IEEE Transactions on Communications, COM-30 (May 1982), 929.

[9] Dixon, R.C. (1976) Spread Spectrum Systems. New York: Wiley, 1976.

[10] Bellman, R. (1970) Introduction to Matrix Analysis. New York: McGraw-Hill, 1970.

[11] Compton, R.T., Jr. (1982) The effect of a pulsed interference signal on an adaptive array. IEEE Transactions on Aerospace and Electronic Systems, AES-18 (May 1982), 297.

[12] D'Angelo, H. (1970) Linear Time-Varying Systems: Analysis and Synthesis. Boston: Allyn and Bacon, 1970.

[13] Hildebrand, F.B. (1952) Methods of Applied Mathematics. Englewood Cliffs, N.J.: Prentice-Hall, 1952.

[14] Ziemer, R.E., and Tranter, W.H. (1976) Principles of Communications. Boston: Houghton Mifflin, 1976.

[15] Lindsey, W.C., and Simon, M.K. (1973) Telecommunication Systems Engineering. Englewood Cliffs, N.J.: Prentice-Hall, 1973.

[16] Rodgers, W.E., and Compton, R.T., Jr. (1979) Adaptive array bandwidth with tapped delay-line processing. IEEE Transactions on Aerospace and Electronic Systems, AES-15 (Jan. 1979), 21.

An LMS Adaptive Array for Multipath Fading Reduction

YASUTAKA OGAWA, Member, IEEE
MANABU OHMIYA
KIYOHIKO ITOH, Member, IEEE
Hokkaido University

Multipath fading often poses a serious hindrance in radio communication. The application of a least-mean-square (LMS) adaptive array to the problem of multipath fading reduction is discussed. However, it is known that multipath components are in general correlated with one another. We examine the effect of the correlation on the performance of the LMS adaptive array. When the correlation coefficient does not equal or approximate 1, the LMS adaptive array suppresses the multipath signals significantly by nulling. On the other hand, when the correlation coefficient nearly equals 1, the LMS adaptive array prevents the output signal power from decreasing. Therefore, the LMS adaptive array may reduce the multipath fading effectively for any correlation coefficient value. A reference signal in the LMS adaptive array is also discussed. It is shown that synchronization in the reference signal generation must be extremely accurate. Moreover, a processor configuration is proposed which may generate the reference signal with the required accuracy.

I. INTRODUCTION

Radio communications suffer from multipath fading. It has been reported that only a few multipath components are often dominant in strength and play an important role in the multipath fading phenomena [1]. Thus, an adaptive array [2] has a potential to reduce the multipath fading.

A least-mean-square (LMS) adaptive array automatically tracks a desired signal and nulls interference signals. The LMS adaptive array, however, requires a reference signal in order to control each weight. Let us assume that the desired signal contains a deterministic component which is fully known at the receiver. Then, the deterministic component may be used for the adaptive array reference signal. An example of the deterministic component is a pilot signal which is added to the transmitted communication signal [3, 6]. According to the literature [3], the LMS adaptive array may eliminate the undesired multipath signals by nulling, in a case where a modulated pilot signal has a sufficient bandwidth to discriminate between multipath propagation modes. This means that the LMS adaptive array may suppress the undesired multipath components when the incident components are not correlated with one another.

In mobile communications, however, we do not know the time delay differences between multipath components. Then, the required bandwidth of the pilot signal is in general unknown. Even if we know the time delay differences, an unrealistically wide bandwidth might be required. Thus, in order to reduce the multipath fading by the adaptive array, we must consider the effect of the correlation between multipath components on the performance of the adaptive array. The literature [4] proposed a preprocessing scheme for the adaptive array which may suppress the coherent signals. A disadvantage of this scheme is that it needs more antenna elements than the conventional adaptive array.
We show that an LMS adaptive array may reduce the multipath fading effectively for any correlation coefficient value between multipath signals. First, we examine the behavior of the LMS adaptive array in the presence of the correlated multipath signals. Second, we find the required synchronization accuracy in reference signal generation. Third, we propose a processor configuration which generates the reference signal.

Manuscript received September 11, 1985; revised March 29, 1986. Authors' address: Department of Electronic Engineering, Faculty of Engineering, Hokkaido University, Kita 13, Nishi 8, Kita-ku, Sapporo 060, Japan.

Reprinted from IEEE Transactions on Aerospace and Electronic Systems, Vol. AES-23, No. 1, pp. 17-23, January 1987.

II. FORMULATION OF THE PROBLEM

We consider the $N$-element linear LMS adaptive array shown in Fig. 1. We assume that two multipath components $\tilde{s}(t)$ and $\tilde{m}(t)$ are incident on the array from angles $\theta_s$ and $\theta_m$ relative to broadside, respectively. The antenna elements are assumed to be isotropic and a half wavelength apart. We represent both signals on the $k$th element by $\tilde{s}_k(t)$ and $\tilde{m}_k(t)$ ($k = 1, \ldots, N$). Thermal noise $\tilde{n}_k(t)$ is assumed to be present on each element signal. Then, the complex-valued element signal is given by

$$\tilde{x}_k(t) = \tilde{s}_k(t) + \tilde{m}_k(t) + \tilde{n}_k(t). \tag{1}$$

Fig. 1. LMS adaptive array.

We assume that the thermal noise components on different elements are independent and that they are also independent of the signals $\tilde{s}(t)$ and $\tilde{m}(t)$. We define an $N$-dimensional signal vector as

$$X(t) = [\tilde{x}_1(t)\;\; \tilde{x}_2(t)\;\; \cdots\;\; \tilde{x}_N(t)]^T \tag{2}$$

where $T$ denotes transpose. Furthermore, we express the complex weight for the $k$th element as $w_k$ and the $N$-dimensional weight vector as $W$, i.e.,

$$W = [w_1\;\; w_2\;\; \cdots\;\; w_N]^T. \tag{3}$$

We assume that the parameter $G$ in Fig. 1 is determined adequately so that the deviation of the weight vector $W$ from the ensemble average is negligibly small. Thus, we consider the weight vector $W$ to have the ensemble average. We represent the power of $\tilde{s}_k(t)$, $\tilde{m}_k(t)$, and $\tilde{n}_k(t)$ by $S_i$, $M_i$, and $N_i$, respectively. Namely, we have

$$S_i = \langle |\tilde{s}_k(t)|^2 \rangle, \qquad M_i = \langle |\tilde{m}_k(t)|^2 \rangle, \qquad N_i = \langle |\tilde{n}_k(t)|^2 \rangle \qquad \text{for } k = 1, \ldots, N \tag{4}$$

where $\langle \cdot \rangle$ denotes the ensemble average. It is assumed that $\tilde{m}_1(t)$ is delayed from $\tilde{s}_1(t)$ by $\tau$. Then, $\tilde{m}_1(t)$ is expressed as

$$\tilde{m}_1(t) = \sqrt{M_i/S_i}\;\tilde{s}_1(t - \tau)\,e^{-j\psi} \tag{5}$$

where $\psi$ is a phase delay which occurs for a reason other than the propagation delay difference.

In order to generate a reference signal $\tilde{r}(t)$ in the LMS adaptive array, a pilot signal is always transmitted together with an information signal [3, 6]. The power spectrum of the signal is illustrated in Fig. 2. The pilot signal is modulated by a signal which is fully known at the receiver. Both bands are located very close to each other. Namely, let $\omega_c$ and $\omega_i$ be the center angular frequencies of the pilot and information portions, respectively. Then, $|\omega_c - \omega_i|/\omega_c \ll 1$ holds. The reference signal which is generated by the reference signal processor is a replica of the pilot portion of the signal. We assume that the bandwidth of the pilot signal is almost the same as that of the information signal. For simplicity, we consider only the pilot portion. We express the reference signal as

(6)

Fig. 2. Spectrum of signal (pilot signal and information signal versus angular frequency).

When $\tau_r = 0$, the reference signal coincides with $\tilde{s}(t)$. Similarly, when $\tau_r = \tau$, it coincides with $\tilde{m}(t)$. Now, we represent the complex envelopes of $\tilde{s}_1(t)$ and $\tilde{m}_1(t)$ by $S_1(t)$ and $M_1(t)$, respectively. Then, we have

$$\tilde{s}_1(t) = S_1(t)\,e^{j\omega_c t} \tag{7}$$

$$\tilde{m}_1(t) = M_1(t)\,e^{j\omega_c t}. \tag{8}$$

From (5), (7), and (8), we may obtain

$$M_1(t) = \sqrt{M_i/S_i}\;S_1(t - \tau)\,e^{-j(\omega_c \tau + \psi)}. \tag{9}$$

We define the normalized autocorrelation function of $S_1(t)$ as

(10)

where $*$ denotes the complex conjugate. Moreover, we define the covariance matrix $R_{xx}$ and correlation vector $V_{xr}$ as

$$R_{xx} = \langle X^*(t)\,X^T(t) \rangle \tag{11}$$

$$V_{xr} = \langle X^*(t)\,\tilde{r}(t) \rangle. \tag{12}$$

Here, we assume that the bandwidth of the pilot signal is narrow enough that the interelement delay does not change the envelope. From these results, the $(p, q)$th element of $R_{xx}$ is given by

(~i; (t)~iq(t) )

= S; eJ(p-q)Q>s + + \is; M i

M; eJ(P-q)
P*(T)

+ N, bp q

ej{(p-I)
where

= 1T sin

as

( 14)

(15) and t1p q is the Kronecker's S.
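The covariance matrix (13), the steering phases (14) and (15), and the Wiener solution $W = R_{xx}^{-1} V_{xr}$ can be explored numerically. The following is only a minimal sketch for N = 2, under assumptions that are not stated in the paper: the reference is taken to be $r(t) = s_1(t)$ (that is, $\tau_r = 0$), powers are taken as plain ensemble averages of the squared magnitude, and the correlation coefficient $\rho^*(\tau) e^{-j(\omega_c\tau+\psi)}$ is passed in directly as a magnitude `c` and a phase `psi_deg`. All function names are hypothetical.

```python
import cmath
import math


def steering_phase(theta_deg):
    """Inter-element phase of (14)-(15) for half-wavelength spacing."""
    return math.pi * math.sin(math.radians(theta_deg))


def covariance_2x2(s_i, m_i, n_i, phi_s, phi_m, coef):
    """R_xx of (13) for N = 2; coef is the complex correlation coefficient."""
    cross = math.sqrt(s_i * m_i)
    r = [[0j, 0j], [0j, 0j]]
    for p in range(2):  # p and q are zero-based here, so (p - 1) becomes p
        for q in range(2):
            r[p][q] = (
                s_i * cmath.exp(1j * (p - q) * phi_s)
                + m_i * cmath.exp(1j * (p - q) * phi_m)
                + cross * coef * cmath.exp(1j * (p * phi_s - q * phi_m))
                + cross * coef.conjugate() * cmath.exp(1j * (p * phi_m - q * phi_s))
                + (n_i if p == q else 0.0)
            )
    return r


def output_dur_db(c, psi_deg, theta_s=0.0, theta_m=30.0,
                  s_i=100.0, m_i=100.0, n_i=1.0):
    """Output desired-to-undesired ratio in dB for the Wiener weights."""
    phi_s = steering_phase(theta_s)
    phi_m = steering_phase(theta_m)
    coef = c * cmath.exp(-1j * math.radians(psi_deg))
    r = covariance_2x2(s_i, m_i, n_i, phi_s, phi_m, coef)
    cross = math.sqrt(s_i * m_i)
    # Assumed correlation vector <x_k^* s_1> for a reference equal to s_1(t).
    v = [s_i * cmath.exp(1j * k * phi_s)
         + cross * coef.conjugate() * cmath.exp(1j * k * phi_m)
         for k in range(2)]
    # Solve R_xx W = V_xr by Cramer's rule (2 x 2 case only).
    det = r[0][0] * r[1][1] - r[0][1] * r[1][0]
    w0 = (v[0] * r[1][1] - r[0][1] * v[1]) / det
    w1 = (r[0][0] * v[1] - r[1][0] * v[0]) / det
    # Array response to each multipath component.
    g_s = w0 + w1 * cmath.exp(-1j * phi_s)
    g_m = w0 + w1 * cmath.exp(-1j * phi_m)
    s_o = s_i * abs(g_s) ** 2
    m_o = m_i * abs(g_m) ** 2
    return 10.0 * math.log10(max(s_o, m_o) / min(s_o, m_o))


if __name__ == "__main__":
    for c in (0.0, 0.5, 0.9):
        print(c, round(output_dur_db(c, 0.0), 2))
```

With uncorrelated components (`c = 0`) and 20 dB input ratios, this sketch returns an output ratio of about 43 dB, which is the familiar two-element null-steering result. Values for other `c` depend on the assumed reference model and should be read as illustrative only.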

Similarly, the kth element of $V_{xr}$, $\langle x_k^*(t)\, r(t)\rangle$, is given by (16). Here, it should be noted that the following equation holds.

$$\langle s_1^*(t)\, m_1(t)\rangle / \sqrt{S_i M_i} = \rho^*(\tau)\, e^{-j(\omega_c\tau+\psi)}. \tag{17}$$

This means that $\rho^*(\tau)\, e^{-j(\omega_c\tau+\psi)}$ is the complex-valued correlation coefficient between $s_1(t)$ and $m_1(t)$. As stated previously, we consider that the weight vector has the ensemble average. Thus, the steady-state weight vector is given by the Wiener solution, i.e.,

$$W = R_{xx}^{-1} V_{xr}. \tag{18}$$

This is true whether the incident signals are correlated with each other or not. We represent the array output of $s(t)$, $m(t)$, and the thermal noise by $s_o(t)$, $m_o(t)$, and $n_o(t)$, respectively. Moreover, we represent the power of them by $S_o$, $M_o$, and $N_o$, which are the corresponding ensemble averages (19).

When $S_o \ge M_o$, we consider that $s(t)$ is the desired signal and $m(t)$ is the undesired one. On the other hand, when $M_o > S_o$, we consider that $m(t)$ is the desired signal and $s(t)$ is the undesired one. Here, we represent the desired-to-undesired-signal ratio by DUR. Note that the output DUR is given by $S_o/M_o$ when $S_o \ge M_o$ and that it is given by $M_o/S_o$ when $M_o > S_o$. Moreover, we add that all of the numerical results which are shown later are computed values.

III. MULTIPATH FADING REDUCTION

Now we discuss the steady-state performance of the LMS adaptive array. In this section we assume that $\tau_r = 0$ holds. Namely, we assume that the reference signal coincides with $s(t)$. In order to simplify the notation, we introduce the real-valued symbols C ($0 \le C \le 1$) and $\Psi$ which satisfy

$$C\, e^{-j\Psi} = \rho^*(\tau)\, e^{-j(\omega_c\tau+\psi)}. \tag{20}$$

From (17), it is seen that C and $\Psi$ are the magnitude and phase delay of the correlation coefficient of $s_1(t)$ and $m_1(t)$, respectively. Fig. 3 shows the output DUR versus C for several values of $\Psi$. Since $S_o \ge M_o$ holds for these parameters, $s(t)$ is the desired signal and $m(t)$ is the undesired one. It is seen that the output DUR depends on the correlation coefficient ($C e^{-j\Psi}$). The undesired signal is, however, suppressed significantly by the LMS adaptive array provided that C does not equal or approximate 1. The output DUR is above 20 dB when $C \le 0.9$. Fig. 4 illustrates the effect of C on the array pattern. It is apparent that when $C \le 0.9$, the null is pointed almost exactly toward $m(t)$. However, when $C > 0.9$, the null is shifted or lost.

Fig. 3. Output DUR versus C. N = 2, $\theta_s = 0°$, $\theta_m = 30°$, $\tau_r = 0$, $S_i/N_i = M_i/N_i = 20$ dB.

Fig. 4. Array pattern. N = 2, $\theta_s = 0°$, $\theta_m = 30°$, $\Psi = 0°$, $\tau_r = 0$, $S_i/N_i = M_i/N_i = 20$ dB.

Now we discuss the case where C nearly equals 1. When $C \approx 1$, $\tau$ is significantly less than the reciprocal of the frequency bandwidth of the signal. In this case, the fading is not frequency-selective but frequency-flat. Furthermore, we may regard $d'_o(t)$ defined by (21) as the approximate output desired signal.

$$d'_o(t) = s_o(t) + m_o(t). \tag{21}$$

Let $D'_o$ denote the power of $d'_o(t)$, i.e., its ensemble-average power (22). The output D'NR ($D'_o/N_o$) represents the approximate output desired-signal-to-noise ratio when $C \approx 1$. Fig. 5 shows the output D'NR versus C for several values of $\Psi$. From these curves, it is seen that the LMS adaptive array prevents the output signal power from decreasing. Namely, the frequency-flat fading is reduced.

Fig. 5. Output D'NR versus C. N = 2, $\theta_s = 0°$, $\theta_m = 30°$, $\tau_r = 0$, $S_i/N_i = M_i/N_i = 20$ dB.

When C = 1, the weights are determined in such a way

that the weighted signals $w_k\{s_k(t) + m_k(t)\}$ ($k = 1, \ldots, N$) are added in-phase at the array output. This means that the LMS adaptive array realizes space diversity when C = 1. Here we consider the physical reason why the weighted signals are added in-phase at the array output. As is shown later, the waveform distortion of the output signal component $d_o(t) = \sum_{k=1}^{N} w_k\{s_k(t) + m_k(t)\}$ is negligibly small. In other words, when C = 1, the output signal component has almost the same waveform as that of the reference signal $r(t) = s_1(t)$. The LMS adaptive array shown in Fig. 1 realizes the weights which minimize the mean-square error $\langle |e(t)|^2 \rangle$. We see that the error is given by

$$e(t) = r(t) - y(t) = r(t) - d_o(t) - \sum_{k=1}^{N} w_k n_k(t).$$

The last term $\sum_{k=1}^{N} w_k n_k(t)$ is the thermal noise. If the weighted signals are added in-phase at the array output, the output signal component $d_o(t)$ coincides almost perfectly with the reference signal while keeping the weight norm $\sqrt{W^\dagger W}$ small ($\dagger$ denotes complex conjugate transpose). This means that the thermal noise power $N_i W^\dagger W$ in the error $e(t)$ has a lower value. Thus, the mean-square error is minimized by adding the weighted signals in-phase at the array output when C = 1.

According to the literature [4], the signal cancellation phenomenon occurs when the desired signal is correlated with one or more interfering signals. When the incident signals are correlated with one another, the desired signal is canceled in adaptive arrays other than the LMS adaptive array shown in Fig. 1. Even though the incident signals are not correlated with one another, the desired signal may be canceled [7]. Under some circumstances, the weights do not converge to the Wiener solution. The "non-Wiener" effects cause signal cancellation [7]. However, as stated previously, the parameter G in Fig. 1 is determined in such a way that the deviation of the weight vector from the ensemble average is negligibly small. Namely, the steady-state weight vector is given by the Wiener solution (18). Thus, the signal cancellation phenomenon due to the non-Wiener effects does not occur in the LMS adaptive array discussed in this paper. The LMS adaptive array does not cancel the desired signal even when the incident signals are correlated with one another. This is because the signal cancellation increases the mean-square error in the LMS adaptive array. It is shown analytically in the Appendix that the weighted signals $w_k\{s_k(t) + m_k(t)\}$ ($k = 1, \ldots, N$) are added in-phase at the array output when $\tau = 0$ (C = 1) and that the signal cancellation phenomenon does not occur in the LMS adaptive array.

Moreover, we investigate the distortion contained in $d_o(t)$. We define the distortion power $E_o$ as

$$E_o = \min_{a}\, \langle |d_o(t) - a\, s_1(t)|^2 \rangle / 2. \tag{23}$$

Fig. 6 shows the output E'NR ($E_o/N_o$) as a function of C for several values of $\Psi$. We see that the output E'NR is less than 0 dB. Namely, $E_o$ is less than the output thermal noise power. Therefore, we may say that although the multipath signal is not suppressed when C = 1, the waveform distortion is negligible.

IV. REFERENCE SIGNAL GENERATION

Thus far, $\tau_r$ has been assumed to be 0. Namely, we have assumed that the reference signal coincides exactly with $s(t)$. In this section we discuss the problem of the reference signal generation. In the remainder of this paper, we assume that the pilot signal is biphase modulated by a pseudonoise (PN) sequence with a long period. Then, we express the normalized autocorrelation function $\rho(\tau)$ as

$$\rho(\tau) = \begin{cases} 1 - |\tau|/T, & \text{for } |\tau| \le T \\ 0, & \text{elsewhere} \end{cases} \tag{24}$$

where T denotes a clock pulse duration. A local PN sequence generator at the receiver modulates the carrier which is recovered from the array output. However, the reference signal in general does not coincide with the incident signal in time, i.e., $\tau_r \ne 0$ and $\tau_r \ne \tau$. Thus we must synchronize it to the time of arrival of $s(t)$ or $m(t)$.

A. Synchronization Accuracy
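The correlation magnitudes used in this subsection follow directly from the triangular autocorrelation (24), since by (20) the magnitude C equals $|\rho(\tau)|$. A minimal sketch, with a hypothetical function name and the clock pulse duration normalized to 1:

```python
def pn_autocorrelation(tau, clock_period):
    """Triangular autocorrelation of (24): 1 - |tau|/T inside one chip, 0 outside."""
    x = abs(tau) / clock_period
    return 1.0 - x if x <= 1.0 else 0.0


if __name__ == "__main__":
    T = 1.0  # clock pulse duration, normalized (an assumption for illustration)
    for ratio in (0.0, 0.5, 1.0, 5.0):
        print(ratio, pn_autocorrelation(ratio * T, T))
```

A delay of half a clock pulse gives C = 0.5 (partially correlated components), while a delay of five clock pulses gives C = 0 (independent components).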

From the above assumptions, we examine the effect of $\tau_r$ on the steady-state performance of the LMS adaptive array. Fig. 7 shows the output DUR versus $\tau_r/T$. When $\tau/T = 0.5$, $s(t)$ and $m(t)$ are correlated with each other according to (17) and (24). When $\tau/T = 5$, they are independent of each other. When $\tau_r/T = 0$, the reference signal coincides with $s(t)$. Similarly, when $\tau/T = 0.5$ and $\tau_r/T = 0.5$, it coincides with $m(t)$. It is seen that satisfactory output DUR is obtained around $\tau_r/T = 0$ or $\tau_r/T = 0.5$ for $\tau/T = 0.5$. On the other hand, the output DUR has a steady and satisfactory value for $-1 < \tau_r/T < 1$ when $\tau/T = 5$. Fig. 8 shows each output power normalized by $N_i/2$ versus $\tau_r/T$ for $\tau/T = 0.5$. The period for which the output undesired signal power is suppressed below $N_o$ is about 0.2T around $\tau_r/T = 0$ or $\tau_r/T = 0.5$. From these results, we say that the synchronization of the reference signal must be extremely accurate in the case where the input multipath components are correlated with one another.

Fig. 6. Output E'NR versus C. N = 2, $\theta_s = 0°$, $\theta_m = 30°$, $\tau_r = 0$, $S_i/N_i = M_i/N_i = 20$ dB.

Fig. 7. Output DUR versus $\tau_r/T$. N = 2, $\theta_s = 0°$, $\theta_m = 30°$, $\Psi = 0°$, $S_i/N_i = M_i/N_i = 20$ dB.

B. Reference Signal Processor

Now we propose the configuration of the reference signal processor. The reference signal generation consists of two parts just like the synchronization process in a spread spectrum receiver [5]. One is acquisition and the other is tracking. The acquisition is implemented by a sliding correlator [5] which performs the search process and calculates the correlation between the received pilot signal and reference signal. The sliding correlator makes the reference signal coincide with the pilot signal within T. During the initial acquisition, the reference signal is not correlated with the input signal and all of the weights are driven to 0 if the LMS adaptive processor operates. Thus, until the initial acquisition is achieved, we do not make it operate. Namely, the weights are frozen at fixed values, for example, $w_1 = 1$, $w_2 = w_3 = \cdots = w_N = 0$.

Now we discuss the tracking process which is the second part of the synchronization. The tracking circuit operates in such a way that the reference signal coincides with the transmitted pilot signal as precisely as possible. When the multipath components are correlated with each other, the correlation function between the array output signal and generated reference signal is not a symmetric triangular function. Thus, a delay-lock loop [5] may not be employed. Also, it is difficult to achieve an accurate synchronization by use of the conventional tau-dither clock-tracking loop [5]. Thus, we must configure a new tracking circuit.

Fig. 9 shows the normalized MSE (mean-square error) in the LMS adaptive array versus $\tau_r/T$. Here, the MSE is defined as

$$\mathrm{MSE} = \langle |e(t)|^2 \rangle. \tag{25}$$

It is seen that when the reference signal coincides with $s(t)$ or $m(t)$ ($\tau_r/T = 0$ or $\tau_r/T = 0.5$), the MSE has an extremely low minimal value. Then, the MSE may be used for the recognition of the synchronization. After the initial acquisition is achieved by the sliding correlator, we make the LMS adaptive processor and tracking circuit operate.

Fig. 8. Power ratio versus $\tau_r/T$. N = 2, $\theta_s = 0°$, $\theta_m = 30°$, $\Psi = 0°$, $\tau/T = 0.5$, $S_i/N_i = M_i/N_i = 20$ dB.

Fig. 9. MSE/$N_i$ versus $\tau_r/T$. N = 2, $\theta_s = 0°$, $\theta_m = 30°$, $\Psi = 0°$, $\tau/T = 0.5$, $S_i/N_i = M_i/N_i = 20$ dB.

Fig. 10. Tracking circuit.

Fig. 11. Waveforms in tracking circuit.

The configuration of the tracking circuit is shown in Fig. 10. Note that each signal in Fig. 10 has a real value. The VCO (voltage-controlled oscillator) is controlled in such a way that the MSE has a minimal value. This circuit is analogous to the tau-dither clock-tracking loop used in a spread spectrum receiver. Fig. 11 shows an example of waveforms in the tracking circuit. We set the duration time (T') of the rectangular wave $a(t)$ longer than the convergence time of the weights. The amplitude of $a(t)$ is $\sqrt{N_i/2}$. Then, $a(t) = \pm\sqrt{N_i/2}$ holds. At a leading edge of $a(t)$, the clock phase of the reference signal is shifted back by a fraction ($\Delta\tau$) of the clock pulse duration. It is shifted forth by the same amount at a trailing edge of $a(t)$. At the output

from the LPF (low-pass filter) placed behind the squaring device, we may obtain the waveform $b(t)$ which is almost proportional to the MSE ($\langle |e(t)|^2 \rangle$). We assume that $b(t) = \sqrt{2}\,\langle |e(t)|^2 \rangle / \sqrt{N_i}$ holds. Then, we have $h(t) = \pm\langle |e(t)|^2 \rangle$. The LPF placed in front of the VCO extracts the DC component $v(t)$ from $h(t)$. Since the VCO is controlled by $v(t)$ in such a way that the MSE has a minimal value, the reference signal coincides with $s(t)$ or $m(t)$ in time. By using the tracking circuit, we may generate a highly accurate reference signal. We assume that the transfer function of the LPF in front of the VCO is $K_1/(1 + sT_1)$. Since the phase of the VCO-output signal is proportional to the integral of the control signal $v(t)$, we may have

$$\frac{d}{dt}\left(\frac{\tau_r}{T}\right) = K\, v(t). \tag{26}$$

Fig. 12 and 13 show the variations of $\tau_r/T$ and the output DUR, respectively, in a case where the tracking circuit operates. It is seen that $\tau_r/T$ reaches the range from -0.005 ($-\Delta\tau/2T$) to 0.005 ($\Delta\tau/2T$). This means that the reference signal is synchronized with the time of arrival of $s(t)$. It is also seen that the output DUR takes on satisfactory values when $\tau_r/T$ approaches 0. The output DUR ranges from 32.3 dB to 38.6 dB even after convergence. This is because $\tau_r$ is vibrated.

V. CONCLUSIONS

We have examined the fading reduction performance of the LMS adaptive array. Moreover, we have discussed the problem of the reference signal generation and we have the following results.

1) The behavior of the LMS adaptive array depends on the correlation coefficient of the incident signals. However, if the reference signal is generated properly, the LMS adaptive array may reduce the multipath fading effectively for any correlation coefficient value.

2) The synchronization in the reference signal generation must be very accurate in the case where the multipath components are correlated with one another.

3) We proposed a processor configuration which generates the reference signal. We showed satisfactory performance of the tracking circuit.

Fig. 12. $\tau_r/T$ versus time. N = 2, $\theta_s = 0°$, $\theta_m = 30°$, $\Psi = 0°$, $\tau/T = 0.5$, $S_i/N_i = M_i/N_i = 20$ dB, $\Delta\tau/T = 0.01$, $T' = 10/(G \cdot N_i)$, $T_1 = 100/(G \cdot N_i)$, $K_1 = \sqrt{2/N_i}$, $K = -5\sqrt{2}\, G \sqrt{N_i} \times 10^{-5}$.

Fig. 13. Output DUR versus time. N = 2, $\theta_s = 0°$, $\theta_m = 30°$, $\Psi = 0°$, $\tau/T = 0.5$, $S_i/N_i = M_i/N_i = 20$ dB, $\Delta\tau/T = 0.01$, $T' = 10/(G \cdot N_i)$, $T_1 = 100/(G \cdot N_i)$, $K_1 = \sqrt{2/N_i}$, $K = -5\sqrt{2}\, G \sqrt{N_i} \times 10^{-5}$.

APPENDIX. THEORETICAL ANALYSIS FOR $\tau = 0$ (C = 1)

We assume that $\tau = 0$ holds. From (10) and (20), we obtain C = 1. Namely, $m(t)$ is perfectly coherent with $s(t)$. In this case, the following equations hold.

$$s_k(t) = S_1(t)\, e^{j\omega_c t}\, e^{-j(k-1)\phi_s} \tag{A1}$$

$$m_k(t) = \sqrt{M_i/S_i}\; S_1(t)\, e^{j\omega_c t}\, e^{-j\{(k-1)\phi_m + \psi\}} \tag{A2}$$

Then, the combined signal on each antenna element is given by

$$s_k(t) + m_k(t) = b_k\, S_1(t)\, e^{j\omega_c t} \quad (k = 1, \ldots, N) \tag{A3}$$

where

$$b_k = e^{-j(k-1)\phi_s} + \sqrt{M_i/S_i}\; e^{-j\{(k-1)\phi_m + \psi\}}. \tag{A4}$$

We define an N-dimensional vector as

$$B = [b_1\; b_2\; \cdots\; b_N]^T. \tag{A5}$$

From these results, we have

$$R_{xx} = S_i B^* B^T + N_i I \tag{A6}$$

where I denotes an N x N identity matrix. Also, assuming that $\tau_r = 0$ holds, we obtain

$$V_{xr} = S_i B^*. \tag{A7}$$

Even when the signals $s(t)$ and $m(t)$ are correlated with each other, the steady-state ensemble average of the weight vector is given by the Wiener solution $R_{xx}^{-1} V_{xr}$. Then, from (A6) and (A7), we may obtain

$$W = (S_i B^* B^T + N_i I)^{-1} S_i B^* = \frac{S_i}{S_i B^T B^* + N_i}\, B^*. \tag{A8}$$

Furthermore, from (A3) and (A8), each weighted signal is given by

$$w_k\{s_k(t) + m_k(t)\} = \frac{S_i |b_k|^2}{S_i B^T B^* + N_i}\, S_1(t)\, e^{j\omega_c t} \quad (k = 1, \ldots, N). \tag{A9}$$

We see that the phase of each weighted signal has the same value. Therefore, it may be said that the weighted signals are added in-phase at the array output and that the signal cancellation phenomenon does not occur.
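The in-phase property stated in the Appendix can be checked numerically. The sketch below is only an illustration under assumed parameter values (N = 2, $\theta_s = 0°$, $\theta_m = 30°$, $\psi = 60°$, equal input powers, 20 dB signal-to-noise ratio). It builds the coefficients $b_k$ of (A4), forms the weights of (A8), and confirms that each product $w_k b_k$ is real and positive, so that all weighted signals share the phase of the common waveform. The function names are hypothetical.

```python
import cmath
import math


def combined_coefficients(n, phi_s, phi_m, psi, m_over_s):
    """b_k of (A4), with k zero-based so that (k - 1) becomes k."""
    amp = math.sqrt(m_over_s)
    return [cmath.exp(-1j * k * phi_s) + amp * cmath.exp(-1j * (k * phi_m + psi))
            for k in range(n)]


def coherent_wiener_weights(b, s_i, n_i):
    """W of (A8): S_i B* / (S_i B^T B* + N_i)."""
    norm = sum(abs(x) ** 2 for x in b)
    return [s_i * x.conjugate() / (s_i * norm + n_i) for x in b]


if __name__ == "__main__":
    phi_m = math.pi * math.sin(math.radians(30.0))
    b = combined_coefficients(2, 0.0, phi_m, math.radians(60.0), 1.0)
    w = coherent_wiener_weights(b, 100.0, 1.0)
    for wk, bk in zip(w, b):
        product = wk * bk
        print(round(product.real, 6), abs(product.imag) < 1e-12)
```

Because $w_k b_k = S_i |b_k|^2 / (S_i B^T B^* + N_i)$, the imaginary part vanishes for every element regardless of the assumed angles or the phase $\psi$.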

REFERENCES

[1] Ikegami, F., and Yoshida, S. (1980) Analysis of multipath propagation structure in urban mobile radio environments. IEEE Transactions on Antennas and Propagation, AP-28, 4 (July 1980), 531-537.

[2] Monzingo, R.A., and Miller, T.W. (1980) Introduction to Adaptive Arrays. New York: Wiley, 1980.

[3] Hansen, P.M., and Loughlin, J.P. (1981) Adaptive array for elimination of multipath interference at HF. IEEE Transactions on Antennas and Propagation, AP-29, 6 (Nov. 1981), 836-841.

[4] Shan, T.J., and Kailath, T. (1985) Adaptive beamforming for coherent signals and interference. IEEE Transactions on Acoustics, Speech, and Signal Processing, ASSP-33, 3 (June 1985), 527-536.

[5] Dixon, R.C. (1984) Spread Spectrum Systems (2nd ed.). New York: Wiley, 1984.

[6] Ogawa, Y., Ohmiya, M., and Itoh, K. (1985) An LMS adaptive array using a pilot signal. IEEE Transactions on Aerospace and Electronic Systems, AES-21, 6 (Nov. 1985), 777-782.

[7] Widrow, B., Duvall, K., Gooch, R.P., and Newman, W.C. (1982) Signal cancellation phenomena in adaptive antennas: causes and cures. IEEE Transactions on Antennas and Propagation, AP-30, 3 (May 1982), 469-478.

Optimum Combining for Indoor Radio Systems with Multiple Users

JACK H. WINTERS, MEMBER, IEEE

Reprinted from IEEE Transactions on Communications, Vol. COM-35, No. 11, pp. 1222-1230, November 1987.

Paper approved by the Editor for Radio Communication of the IEEE Communications Society. Manuscript received December 15, 1986; revised May 18, 1987. This paper was presented at the International Conference on Communications, Seattle, WA, June 1987. The author is with AT&T Bell Laboratories, Holmdel, NJ 07733. IEEE Log Number 8717084.

Abstract-This paper studies the use of optimum combining to increase the capacity of narrow-band in-building radio communication systems with multiple users. We consider systems consisting of a base station with numerous remotes in a Rayleigh fading environment and study the problem of more users requiring channels than the number of channels available. A system is described that, with multiple antennas at the base station but only one antenna at each remote, uses optimum combining to suppress interfering signals. We show that this system, with M antennas at the base station, can achieve an M-fold increase in the number of users or tolerate M - 1 interferers from other systems. Thus, with optimum combining, radio communications can be used in high-density, multiple-user environments, such as within buildings, even when only limited bandwidth is available.

I. INTRODUCTION

WIRELESS in-building communication allows the user to be mobile and eliminates wiring and rewiring when adding or moving phones, terminals, etc., and reconfiguring networks. In-building radio propagation [1]-[6] is hard to predict and continuously changing, however, which makes interference management with multiple users difficult. Furthermore, since bandwidth must be shared by all users within the coverage areas (which could overlap), the capacity of a multiple-user system can be much less than that required in many office buildings.

One technique for interference reduction is optimum combining [7]. With optimum combining, the signals received by several antennas are weighted and combined to maximize output signal to interference plus noise ratio (SINR). Thus, interfering signals are suppressed and the desired signal is enhanced. Optimum combining has been shown to substantially reduce interference in mobile radio [7] where multipath fading is present and in systems without fading [8]. For in-building radio communication, there is multipath fading as in mobile radio, but the fading rate is much slower. This makes it possible to use optimum combining in combination with other techniques to further reduce interference. In addition, optimum combining can be implemented as an adaptive technique [7], so that detailed a priori knowledge of a building's radio environment is not required and changes in the environment are automatically tracked.

In this paper, we describe a digital in-building radio communication system that allows a large number of users in a small area. We consider a system consisting of a base station with numerous remotes and show how optimum combining, in combination with other techniques, can be used to increase the maximum number of users and eliminate interference from other systems. Computer simulation results are shown for a digital system with phase-shift-keyed (PSK) modulation and coherent detection with Rayleigh and shadow (due to blockage) fading. Narrow-band channels are assumed, i.e., the channel bandwidth is assumed to be much less than the coherence bandwidth [9]. These results show that such a system with one antenna at each remote and M antennas at the base station can achieve either an M-fold increase in capacity (over systems without optimum combining) or tolerate M - 1 interferers from other systems.

Section II describes the multiple-user system proposed in this paper and calculates the interference tolerance of the system without optimum combining. In Section III, we describe optimum combining and calculate the increase in capacity and interference tolerance with optimum combining in the system. A summary and conclusions are presented in Section IV.

II. A BASIC SYSTEM FOR MULTIPLE USERS

Fig. 1 shows the system to be analyzed in this paper for in-building radio communication in a multiple-user environment. Multiple remotes communicate with a base station via radio, with the radio channel characterized by multipath (Rayleigh) and shadow fading. Each user uses a single frequency channel, i.e., frequency-division multiple access (FDMA) is used in multiple channel systems. (As discussed in Section III, the system can also have multiple users per frequency channel by using a form of space-division multiple access, i.e., through the use of optimum combining since the remotes are physically separated.) As described in detail in Section III, the base station has multiple antennas (antenna diversity), while each remote has only one antenna. As discussed below, dynamic channel assignment and transmit power control are also used.

Let us first consider dynamic channel assignment [9] to increase the average number of users in a multiple-user system and transmit power control [9] to reduce adjacent channel interference, and determine the interference tolerance of such a system without optimum combining (i.e., without antenna diversity).

For a multiple-user system with multiple channels, dynamic channel assignment [9] is required for efficient channel usage. With this method, before transmission begins, the channels are scanned to find a quiet channel (one with little or no interference) for channel assignment. Furthermore, during transmission, the assigned channel is continuously monitored for interference, and the channel assignment is changed to a quiet channel when the interference becomes too strong. The latter process must occur because the signal environment is constantly changing as the user moves, the environment changes (e.g., doors are closed or opened), or as other users move or begin transmission. Thus, with dynamic channel assignment, interference does not affect the outage performance of the system as long as there are quiet channels available.

Another technique to reduce interference among users is power control. Within the coverage region, the signal attenuation between the transmitter and receiver can vary widely, by as much as 80 dB or more. Thus, a system with a base station and multiple remotes, all transmitting at the same power level, can have received signals differing in power by as much as 80 dB at the base station, which creates an adjacent channel

interference problem. The problem can be reduced by adaptively controlling each remote's transmit power so that the received power is equal for all signals at the base station. Furthermore, to reduce adjacent channel interference at the remotes, the base station can transmit all signals with equal power.

We now consider the effect of interference on a digital communication system using PSK modulation and coherent detection. In general, for voice communications, good voice quality can be maintained at a bit error rate (BER) less than $10^{-2}$. In this paper, we conservatively consider a $10^{-3}$ BER. For data communications, we assume coding could be used to reduce the error rate to a more acceptable value. The BER for coherent detection of a PSK signal in white Gaussian noise is given by [10, p. 381]

$$\mathrm{BER} = \frac{1}{2}\,\mathrm{erfc}\left(\sqrt{S/N}\right) \tag{1}$$

where S/N is the signal-to-noise ratio. Thus, a 6.8 dB S/N is required for a $10^{-3}$ BER. Next, consider the effect of a PSK interfering signal with a phase difference $\theta$ from the desired signal. The worst case interference occurs when the bit timing for the interfering and desired signals are equal. In this case, the received demodulated signal is modified by the factor $1 + \sqrt{I/S}\cos\theta$ where I/S is the interference to desired signal power ratio,¹ and therefore, the BER is given by

$$\mathrm{BER} = \frac{1}{2}\,\mathrm{erfc}\left(\sqrt{z(\theta)}\right) \tag{2}$$

where

$$z(\theta) = \frac{S}{N}\left(1 + \sqrt{I/S}\cos\theta\right)^2. \tag{3}$$

¹ Note that we are assuming perfect phase synchronization at the receiver. This is discussed further in Section III-D.

Fig. 1. The multiple-user radio system.

Fig. 2. The interference to desired signal power ratio versus signal-to-noise ratio for a $10^{-3}$ BER. (Curves: L = 1 from (4); multiple interferers from (A-1); Gaussian approximation from (A-3); bound from (A-4).)

The phase difference $\theta$ changes with the modulating bits and varies slowly with time for small frequency offsets between the two signals. We therefore assume that $\theta$ has a uniform probability distribution. Thus, the BER averaged over $\theta$ is given by

$$\mathrm{BER} = \frac{1}{\pi}\int_0^{\pi} \frac{1}{2}\,\mathrm{erfc}\left(\sqrt{z(\theta)}\right) d\theta \tag{4}$$

where $z(\theta)$ is given by (3). Thus, from (4), we can determine the maximum I/S that can be tolerated for a given BER.

Fig. 2 shows I/S versus S/N for a $10^{-3}$ BER. (Multiple interferer results are discussed in the Appendix.) That is, the figure shows the maximum I/S that can be tolerated for a given S/N and a $10^{-3}$ BER. For the single interferer case, the maximum I/S increases from -20 to -5 dB with a 5 dB increase in S/N (from 7 to 12 dB). Thus, by increasing transmitter power, we can significantly increase the interference tolerance. However, this works only up to a limit since the single antenna system cannot tolerate an interferer stronger than the desired signal no matter how high the S/N. Because the signal propagation in buildings varies substantially with position, it is a very real possibility that interfering signals from nearby systems could be stronger than the desired signal. Thus, even if the capacity of a single antenna system were adequate for an office, interference from nearby systems could easily block channels, thereby reducing capacity or abruptly terminating transmissions. Thus, from both a capacity and interference standpoint, a single antenna system is inadequate for offices.
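The single-antenna BER expressions (1) to (4) are easy to evaluate numerically. The sketch below is an illustration only: it uses the standard library `math.erfc`, a midpoint rule for the average over $\theta$ in (4), and it writes the argument of erfc as the signed quantity $\sqrt{S/N}\,(1 + \sqrt{I/S}\cos\theta)$, which equals $\sqrt{z(\theta)}$ whenever the factor is non-negative. The function names and the number of integration steps are assumptions made for this example.

```python
import math


def ber_no_interference(snr_db):
    """Eq. (1): coherent PSK in white Gaussian noise."""
    snr = 10.0 ** (snr_db / 10.0)
    return 0.5 * math.erfc(math.sqrt(snr))


def ber_single_interferer(snr_db, isr_db, steps=2000):
    """Eq. (4): average of (2) over theta uniform on [0, pi], midpoint rule."""
    snr = 10.0 ** (snr_db / 10.0)
    isr = 10.0 ** (isr_db / 10.0)  # interference to desired signal power ratio
    total = 0.0
    for n in range(steps):
        theta = math.pi * (n + 0.5) / steps
        amplitude = math.sqrt(snr) * (1.0 + math.sqrt(isr) * math.cos(theta))
        total += 0.5 * math.erfc(amplitude)
    return total / steps


if __name__ == "__main__":
    print(ber_no_interference(6.8))
    print(ber_single_interferer(12.0, -5.0))
```

At S/N = 6.8 dB the first function returns a value close to $10^{-3}$, in line with the text. At S/N = 12 dB and I/S = -5 dB the averaged BER is again of the order of $10^{-3}$, consistent with the single-interferer discussion of Fig. 2.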

III. MULTIPLE ANTENNA SYSTEMS

A. Optimum Combining

1) Overview: Interference at the receiver can be reduced with optimum combining. With this technique, the signals received by several antennas are weighted and combined to maximize output signal to interference plus noise ratio. Thus, diversity (e.g., space [9, p. 310], direction [9, p. 311], [11], polarization [9, p. 311], [12], or field [9, p. 148] [see Section III-D]) is used to suppress interfering signals and enhance desired signal reception. Optimum combining has been shown to substantially reduce interference in systems both with [7] and without [8] signal fading. Our proposed indoor radio system falls somewhere between these two cases because, although there is fading, we compensate for it by adjusting the transmit power (see Section II).
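As a rough illustration of the weighting described above, the sketch below forms the interference-plus-noise correlation matrix $R_{nn} = \sigma^2 I + \sum_j u_j^* u_j^T$ and the weights $w = \alpha R_{nn}^{-1} u_d^*$ (the form given later in this section as (5) and (6)) for M = 2 antennas and one interferer, and compares the output SINR with that of weights matched to the desired signal alone. The propagation vectors, the noise power, $\alpha = 1$, and all function names are assumptions chosen for this example.

```python
def interference_plus_noise_2x2(sigma2, interferers):
    """R_nn = sigma^2 I + sum_j u_j^* u_j^T for two antennas."""
    r = [[sigma2 + 0j, 0j], [0j, sigma2 + 0j]]
    for u in interferers:
        for p in range(2):
            for q in range(2):
                r[p][q] += u[p].conjugate() * u[q]
    return r


def optimum_weights(r, u_d):
    """w = R_nn^{-1} u_d^* (alpha = 1), solved by Cramer's rule."""
    b = [x.conjugate() for x in u_d]
    det = r[0][0] * r[1][1] - r[0][1] * r[1][0]
    return [(b[0] * r[1][1] - r[0][1] * b[1]) / det,
            (r[0][0] * b[1] - r[1][0] * b[0]) / det]


def output_sinr(w, u_d, interferers, sigma2):
    """Output SINR for an array output formed as w^T x."""
    signal = abs(sum(wk * uk for wk, uk in zip(w, u_d))) ** 2
    interference = sum(abs(sum(wk * uk for wk, uk in zip(w, u))) ** 2
                       for u in interferers)
    noise = sigma2 * sum(abs(wk) ** 2 for wk in w)
    return signal / (interference + noise)


if __name__ == "__main__":
    sigma2 = 0.1                     # noise power per antenna (assumed)
    u_d = [1 + 0j, 1 + 0j]           # desired signal propagation vector (assumed)
    u_1 = [1 + 0j, 0 + 1j]           # equal-power interferer (assumed)
    r_nn = interference_plus_noise_2x2(sigma2, [u_1])
    w_opt = optimum_weights(r_nn, u_d)
    w_matched = [x.conjugate() for x in u_d]
    print(output_sinr(w_opt, u_d, [u_1], sigma2))
    print(output_sinr(w_matched, u_d, [u_1], sigma2))
```

For these assumed values the optimum weights give an output SINR of about 10.5 (10.2 dB), against about 1.8 for weights matched only to the desired signal, even though the interferer is as strong as the desired signal.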

Without fading, optimum combining can null M - 1 interferers with M antennas if the angular separation of the desired and interfering signals is large enough. With fading, as in mobile radio, the angular separation no longer matters because of the multipath. In fact, the receiver can suppress interfering signals and enhance desired signal reception as long as the received desired signal powers and phases differ somewhat from the received interfering signal powers and phases at more than one antenna. Thus, in a system using several antennas for space, direction, polarization, and/or field diversity, the probability of being unable to suppress an interfering signal is very small. Furthermore, since with dynamic channel assignment the channel can be changed if the interference cannot be suppressed, systems with optimum combining can overcome most interference problems. As discussed in [7], optimum combining need only be used


at the base station receiver. Adaptive retransmission with time division [9], [13] can be used to improve reception at the remote without requiring multiple remote antennas. With adaptive retransmission, the base station transmits at the same frequency as it receives, using the complex conjugate of the receiving weights. With time division, a single channel is time shared by both directions of transmission. Thus, with optimum combining, during transmission from the remote to the base station, the antenna element weights are adjusted to maximize the signal to interference plus noise ratio at the receiver output. During transmission from the base to the remote, the complex conjugate of the receiving weights are used so that the signals from the base station antennas combine to enhance reception of the signal at the desired remote and to suppress this signal at other remotes. Thus, we can achieve the advantages of optimum combining at both the remote and the base station with multiple antennas at the base station only.²

As discussed above, a system with optimum combining can suppress interfering signals with a high probability even if their power is equal to or greater than that of the desired signal. Therefore, with optimum combining, several signals can use the same channel simultaneously, thus increasing capacity. Also, signals from other systems can be suppressed even if they are stronger than the desired signal. These topics are discussed in detail in Sections III-B and III-C.

2) Description and Weight Equation: Fig. 3 shows a block diagram of an M antenna element diversity combiner. The signal received by the ith element $y_i(t)$ is split with a quadrature hybrid into an in-phase signal $x_{Ii}(t)$ and a quadrature signal $x_{Qi}(t)$. These signals are then multiplied by a controllable weight $w_{Ii}(t)$ or $w_{Qi}(t)$. The weighted signals are then summed to form the array output $s_o(t)$. Let the received interference-plus-noise correlation matrix be given by

$$R_{nn} = \sigma^2 I + \sum_{j=1}^{L} u_j^* u_j^T \tag{5}$$

where $\sigma^2$ is the noise power, I is the identity matrix, L is the number of interferers, $u_j$ is the jth interfering signal propagation vector, and the superscripts * and T denote conjugate and transpose, respectively. In (5), the correlation is over a period much less than the reciprocal of the fading rate, i.e., $u_j$ and $u_d$ [in (5)-(10)] are assumed to be reasonably constant over the period in which the bit error rate is calculated. Note that we have assumed the fading rate is much less than the bit rate. The equation for the weights that maximize the output SINR is then (from [14]) (see [7])

$$w = \alpha R_{nn}^{-1} u_d^* \tag{6}$$

where w is the complex weight vector, $\alpha$ is a constant,³ the superscript -1 denotes the inverse of the matrix, and $u_d$ is the desired signal propagation vector.

Fig. 3. Block diagram of an M antenna element diversity combiner.

² Note that for adaptive retransmission to be completely effective, all systems within range must use optimum combining and adaptive retransmission with synchronized time division (see Section III-D).

³ Note that $\alpha$ does not affect the performance of the optimum combiner, and therefore we will not consider its value.

3) Preliminary Assumptions and Analysis: In this study, we will assume independent Rayleigh fading (due to multipath) at each antenna with the same shadow or obstruction fading at each antenna for a given signal. Of course, the fading produced by multipath may not be Rayleigh in all locations in all buildings. However, it must be stressed that optimum combining always maximizes the signal to interference plus noise ratio, even if the fading is not Rayleigh. With independent Rayleigh fading at each antenna and transmit power control as discussed in Section II, the desired signal propagation vector is given by (7)

where the $u_{di}$ are independent complex Gaussian random variables and $P_{rd}$ $(= u_d^T u_d^*)$ is the total received desired signal power. Note that because of transmit power control, the components of $u_d$ are not independent. Although the phases of the components are independent, the amplitudes (and, therefore, the powers) are dependent. The interfering signal propagation vectors (the $u_j$'s) for the interfering users in a multiple users per channel system have the same characteristics as $u_d$ in (7). For interference from other systems, however, the characteristics of the $u_j$'s can vary widely. In Section III-C, we study the system performance with fixed total received power for each interferer, i.e., the $u_j$'s have the same characteristics as $u_d$ in (7), but with a total received power (in place of $P_{rd}$) that can differ from that of the desired signal.

4) SINR and BER: We are interested in achieving the lowest possible BER for the digital system. The optimum combiner, however, maximizes the SINR. With Gaussian interference and noise, maximizing the SINR does indeed minimize the BER. However, in our system, the interference is one or more PSK signals. Therefore, maximizing SINR does not necessarily minimize the BER, although it substantially reduces the BER. Thus, since no simple formula currently exists for determining the weights that minimize the BER [from (3), (4), (A-1), and (A-2) note that the BER is a complicated function of S/N and I/S],⁴ optimum combining is used. As discussed above, interference has a different effect from noise on the BER. In fact, the effect of interference depends on the noise and vice versa, as shown in Fig. 2. Thus, in our analysis, we first determined the weights that maximize SINR and then determined the I/S and S/N at the optimum combiner output. The BER can then be determined from (4) for L = 1 and (A-1) for multiple interferers. For the diversity combiner of Fig. 
3, it can be shown that the interference to desired signal power ratio I/S and the desired signal-to-noise ratio S/N at the array output are given by

$$I/S = \frac{\sum_{j=1}^{L} |w^\dagger u_j^*|^2}{|w^\dagger u_d^*|^2} \tag{8}$$

and

$$S/N = \frac{|w^\dagger u_d^*|^2}{\sigma^2 w^\dagger w} \tag{9}$$

respectively, where $w$ is given by (6) and the superscript $\dagger$ denotes complex conjugate transpose. Note that without interference (L = 0), from (5) and (6),

$$w = \frac{\alpha}{\sigma^2} u_d^* \tag{10}$$

and therefore, noting that $P_{rd} = u_d^T u_d^*$, from (9),

$$S/N = \frac{P_{rd}}{\sigma^2}. \tag{11}$$

With interference, optimum combining causes the S/N to be slightly less than that of (11), while the I/S is substantially less than that received at each antenna. Assuming an acceptable channel unless the BER exceeds $10^{-3}$, we are interested in the probability that the BER is less than $10^{-3}$ (and not interested in the average BER). That is, we are interested in the probability that a given channel can be used. This is, of course, the probability that S/N and I/S are below the curves of Fig. 2. In Sections III-B and III-C, we calculate this probability and from it determine capacity and interference tolerance.

⁴ Note that optimum combining does minimize the upper bound on the BER given in (A-4) and the BER approximation for the interference considered to be the same as Gaussian noise (A-3).

B. Multiple Users Per Channel

As discussed previously, because optimum combining can suppress signals even when their power is equal to or greater than that of the desired signals, multiple users per channel are possible. Thus, a much higher capacity than that for single antenna systems can be achieved. In this section, this capacity is determined.

The proposed system with multiple users per frequency channel has one base station with M (M > 1) antennas and multiple remotes with one antenna each. The base station has, for every remote's transmitted signal, an optimum combiner that uses the signals received by each of the M antennas. Thus, the designation of the desired and interfering signals depends only on which optimum combiner is being considered. All the signals are, of course, desired at the receiver.

The capacity of multiple users per channel systems was calculated by first using Monte Carlo simulation to determine the probability that (for a given received signal-to-noise ratio and number of antennas) a given number of users can use the same frequency channel simultaneously. From this probability, we then calculated the probability that, with a given number of simultaneous users, another user can be added to the channel. Finally, these results were used to determine the capacity of systems with a 0.01 blocking probability (i.e., 99 percent availability was considered in our study).

The analysis uses the following notation. Let K be the number of simultaneous users per channel (all with BER < $10^{-3}$). Also, let $\Gamma_d$ and $\Gamma_j$ be the average received signal-to-noise ratio per antenna for the desired and jth interfering signals, respectively. Thus, $\Gamma_d = P_{rd}/(M\sigma^2)$, and for the multiple users per channel system, $\Gamma_j = \Gamma_d$ for j = 1, ..., L and L = K - 1. Our results are given as a function of $\Gamma_d$. This is because $\Gamma_d$ determines the required transmit power of the remotes or, alternatively, with fixed maximum transmit power, the maximum range. Note that a 6.8 dB S/N is required for a $10^{-3}$ BER, and assuming a cubic law of signal strength falloff with distance, a 9 dB increase in required $\Gamma_d$ with fixed transmit power implies a 50 percent range reduction.

The probability $P_K$ that K users can simultaneously use the same channel was determined by computer simulation. A large number of cases (corresponding to randomly positioned remotes) were generated, and the probability was calculated by determining the proportion of cases in which all signals had a BER less than $10^{-3}$. Thus, for each case, the following procedure was employed. First, signal propagation vectors were generated for each signal by

1) generating independent complex Gaussian random numbers, and
2) calculating $u_d$ from (7).

Second, with these signal vectors, it was determined whether the desired signal at the output of every optimum combiner had a BER less than $10^{-3}$ by, for each signal,

1) designating the signal as the desired signal and all others as interfering signals,
2) calculating the optimum weights (6),
3) calculating S/N and I/S [(8) and (9)], and
4) determining if S/N and I/S were below the appropriate curve of Fig. 2.

Figs. 4-7 show the probability that K users can use the same channel simultaneously versus the average received desired signal-to-noise ratio per antenna with two to nine antennas. Ten thousand cases per data point were used. To conserve computer time, only up to six simultaneous users were considered. The figures show that one user per channel is always possible if $\Gamma_d$ is greater than $7 - 10 \log_{10} M$ dB, and that for K > 1, the probability of accommodating K simultaneous users increases with $\Gamma_d$. M users per channel with high probability are possible if $\Gamma_d$ is increased by up to 20 dB, with higher values of K possible only at a much lower probability. Note that as the number of antennas increases, smaller increases in $\Gamma_d$ are required for multiple users at a high probability. For example, with nine antennas, an increase in $\Gamma_d$ of only 10 dB is required for a six-fold increase in capacity. For fixed transmit power in a typical building, this represents about a 50 percent reduction in maximum range.

We now consider the probability $P_{K/K-1}$ of being able to add the Kth user (with BER < $10^{-3}$ for all K users). That is, $P_{K/K-1}$ is the probability that one more user can use the same channel given that K - 1 users are using the channel. This probability can be derived from the previous results by noting that the BER for each of the existing K - 1 users can only be increased (not decreased) by adding an additional interferer. Thus the cases where BER < $10^{-3}$ with K users are a subset of the cases where BER < $10^{-3}$ with K - 1 users, and the probability of adding the Kth user is $P_K/P_{K-1}$. Fig. 8 shows the probability that a Kth user can be added to a channel versus the received desired signal-to-noise ratio per antenna for six receive antennas. This probability is similar to the probability for K simultaneous users (Fig. 6) because the probability of adding the Kth user successfully is usually much less than that for the K - 1 user.
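The simulation procedure of this section can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the program used for Figs. 4-7: the function names are hypothetical, the noise power is normalized to one, the power-controlled propagation vectors are taken to be independent complex Gaussian entries rescaled to a fixed total power (our reading of the description of (7)), and the acceptance curves of Fig. 2 are replaced by a simple 6.8 dB threshold on the output signal to interference plus noise ratio in the spirit of the Gaussian approximation (A-3). The probabilities it produces are therefore only indicative.

```python
import numpy as np

def propagation_vector(rng, M, total_power):
    """Complex Gaussian entries rescaled so that u^T u^* = total_power (assumed form of (7))."""
    g = rng.standard_normal(M) + 1j * rng.standard_normal(M)
    return np.sqrt(total_power) * g / np.linalg.norm(g)

def combiner_output(u_d, interferers, sigma2=1.0):
    """Return (S/N, I/S) at the optimum combiner output, following (5), (6), (8), (9)."""
    M = u_d.size
    R = sigma2 * np.eye(M, dtype=complex)
    for u_j in interferers:
        R += np.outer(u_j.conj(), u_j)            # u_j^* u_j^T term of (5)
    w = np.linalg.solve(R, u_d.conj())            # (6) with alpha = 1
    sig = abs(np.vdot(w, u_d.conj())) ** 2        # |w^dagger u_d^*|^2
    intf = sum(abs(np.vdot(w, u_j.conj())) ** 2 for u_j in interferers)
    noise = sigma2 * np.vdot(w, w).real           # sigma^2 w^dagger w
    return sig / noise, intf / sig

def prob_K_users(K, M, gamma_d_db, trials=2000, sinr_min_db=6.8, seed=0):
    """Estimate P_K: fraction of random cases in which all K users pass the threshold."""
    rng = np.random.default_rng(seed)
    total_power = M * 10 ** (gamma_d_db / 10)     # Gamma_d = P_rd / (M sigma^2), sigma^2 = 1
    threshold = 10 ** (sinr_min_db / 10)          # stand-in for the Fig. 2 curves (assumption)
    passed = 0
    for _ in range(trials):
        U = [propagation_vector(rng, M, total_power) for _ in range(K)]
        all_ok = True
        for k in range(K):                        # each user in turn is the desired signal
            sn, i_s = combiner_output(U[k], U[:k] + U[k + 1:])
            if 1.0 / (1.0 / sn + i_s) < threshold:
                all_ok = False
                break
        passed += all_ok
    return passed / trials
```

With such estimates in hand, the probability of adding the Kth user follows as the ratio of two of them, for example `prob_K_users(3, 6, 15.0) / prob_K_users(2, 6, 15.0)`, provided the denominator is not zero.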
Similar results were obtained for two, four, and nine receive antennas. The blocking probability for a single channel with capacity K is defined here as the probability that a Kth user cannot be added to the system,⁵ i.e., for a one-channel system (N = 1),

$$B = 1 - P_{K/K-1}. \tag{12}$$

Thus, the call blocking probability for a single channel can be calculated directly from the above results. Fig. 9 shows the capacity (maximum number of simultaneous users) versus $\Gamma_d$ for a single-channel system with a 0.01 blocking probability. The figure shows that the increase in $\Gamma_d$ required for each additional user becomes smaller as the number of antennas increases. For example, five users with six antennas require $\Gamma_d$ = 17 dB, while with nine antennas, only 5 dB is required. Also, the results show that close to M users are possible, but only with a substantial increase in $\Gamma_d$ as compared to the single-user system. However, multiple users with a small $\Gamma_d$ penalty are possible if the capacity is much less than M.

⁵ This is actually the worst case blocking probability for the capacity K system since the blocking probability is substantially less when there are fewer than K - 1 users.

Fig. 4. Probability that K users can simultaneously use the same channel with a BER less than $10^{-3}$ versus received desired signal-to-noise ratio per antenna for M = 2 receive antennas.

Fig. 5. Probability that K users can simultaneously use the same channel with a BER less than $10^{-3}$ versus received desired signal-to-noise ratio per antenna for M = 4 receive antennas.

Fig. 6. Probability that K users can simultaneously use the same channel with a BER less than $10^{-3}$ versus received desired signal-to-noise ratio per antenna for M = 6 receive antennas.

Fig. 7. Probability that K users can simultaneously use the same channel with a BER less than $10^{-3}$ versus received desired signal-to-noise ratio per antenna for M = 9 receive antennas.

Fig. 8. Probability that a Kth user can be added to a channel that already has K - 1 users with the BER less than $10^{-3}$ for all K users versus received desired signal-to-noise ratio per antenna for M = 6 receive antennas.

Fig. 9. The capacity (maximum number of simultaneous users) versus $\Gamma_d$ for a single-channel system with a 0.01 blocking probability for several values of M.

We now study the capacity of multiple channel systems (N > 1) where N is the number of channels. Because of dynamic channel assignment, the capacity for a given blocking probability is greater than just N times the capacity of a single-channel system. In fact, with dynamic channel assignment, there may be many users in one channel and only a few in another. However, to simplify the analysis, we will assume that all channels have K users before any have K + 1 users. This is a worst case model since the capacity is greater if the number of users in each channel is more unevenly distributed. Our results are, therefore, somewhat pessimistic. Consider an N-channel system with $N - (l - 1)$ channels with K users per channel and $l - 1$ channels with K + 1 users per channel ($0 < l \le N$). Then the total number of users is $NK + (l - 1)$, and the blocking probability for the next user is given by

$$B = (1 - P_{K+1/K})^{N-(l-1)} \, (1 - P_{K+2/K+1})^{l-1}. \tag{13}$$

Fig. 10. The capacity (maximum number of simultaneous users) versus $\Gamma_d$ for an eight-channel system with a 0.01 blocking probability for several values of M.

That is, (13) is the call blocking probability for a system with capacity $NK + l$. Thus, from the previous results in this section and (13), the capacity (maximum number of users) for a given blocking probability can be determined. As an example, consider an eight-channel system. Fig. 10 shows the capacity versus $\Gamma_d$ with a 0.01 blocking probability for several values of M. This figure shows that an M-fold increase in capacity can be achieved with M antennas if $\Gamma_d$ is increased by as much as 20 dB (for M = 2). However, the required increase in $\Gamma_d$ decreases with more antennas. Furthermore, for less than an M-fold capacity increase, the $\Gamma_d$ penalty is significantly less. For example, with nine antennas, a fivefold increase in capacity is possible with only a 3 dB increase in $\Gamma_d$. Note that as the number of channels increases, for the same blocking probability, the required $\Gamma_d$ decreases.

The results can be generalized as follows. In systems with Rayleigh fading, an M-fold capacity increase is obtained because M - 1 signals are nulled by each optimum combiner. Thus, the number of signals that can be nulled is the same as that in a nonfading environment (M - 1). We might therefore expect that our results would be valid even if the fading were not Rayleigh and/or there were more than nine antennas. However, such results need to be verified in a practical system.

C. Interference

In this section, we determine the number and power of interfering signals that can be tolerated by the optimum combiner. We first describe how the results were generated and discuss the effect of interference on the optimum combiner. Next, results are shown for the maximum level of interference for a 0.01 blocking probability with L equal power interferers and M antennas. Finally, we determine the maximum number of interferers at any power that can be tolerated.
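Before turning to the interference results, the blocking calculation of (12) and (13) in Section III-B can be illustrated numerically. The sketch below evaluates (13) for a table of conditional probabilities $P_{K/K-1}$ and searches for the largest number of users whose admission keeps the blocking probability at or below a target, using the even-fill model described there. The function names and the sample probabilities are hypothetical, chosen only to exercise the formulas; they are not values read from Figs. 4-10.

```python
def blocking_probability(p_add, N, K, l):
    """Blocking probability (13).

    N - (l - 1) channels hold K users and l - 1 channels hold K + 1 users.
    p_add[k] is P_{k/k-1}, the probability that a kth user can be added to a
    channel holding k - 1 users; missing entries are treated as 0 (assumption).
    """
    p_to_K_channels = p_add.get(K + 1, 0.0)
    p_to_K1_channels = p_add.get(K + 2, 0.0)
    return (1 - p_to_K_channels) ** (N - (l - 1)) * (1 - p_to_K1_channels) ** (l - 1)

def capacity(p_add, N, target=0.01):
    """Largest user count n such that admitting each user up to n has blocking <= target."""
    n = 0
    while True:
        present = n                      # users already in the system
        K, l = present // N, present % N + 1
        if blocking_probability(p_add, N, K, l) > target:
            return n
        n += 1

# Hypothetical conditional probabilities P_{k/k-1} for one value of M and Gamma_d.
sample = {1: 1.0, 2: 0.999, 3: 0.995, 4: 0.9, 5: 0.5}
print(capacity(sample, N=1))   # single channel, reduces to (12)
print(capacity(sample, N=8))   # eight channels with dynamic channel assignment
```

For this sample table the single-channel capacity is 3 users while the eight-channel capacity is 34, which is more than eight times the single-channel value, in line with the remark that dynamic channel assignment yields more than N times the single-channel capacity.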
The probability that L interferers of equal average received power ($\Gamma_j$) block a channel for the desired signal was determined by computer simulation. A large number of cases (corresponding to randomly positioned remotes) were generated, and the probability was calculated by determining the proportion of cases in which the single desired signal had a BER greater than $10^{-3}$. The method used was the same as that described in Section III-B, except that there is only one desired signal and the power of the interferers is not necessarily equal to the desired signal power. Results for the maximum interference power for a given blocking probability were obtained by increasing the interference power (in 1 dB steps) until the blocking probability exceeded the given value.

The weights are affected by the power of the interference as shown in (5) and (6). If $\Gamma_j < 1$ (i.e., the power of the interference is less than that of the noise), the interference has little effect on the weights, and the interference-to-noise ratio at the optimum combiner output is close to that at the input. However, when $\Gamma_j > 1$, the weights are adjusted to suppress the interference in the output to a level far below the noise. In this case, increasing the received interference power decreases the interference-to-noise ratio at the optimum combiner output.

The optimum combiner can greatly suppress (far below the noise level) interferers and not greatly suppress the desired signal if the received desired signal phases differ somewhat from the received interference signal phases at more than one antenna. With multiple antennas and multipath, it is very unlikely that the phases will be the same. Therefore, the probability of the optimum combiner being unable to null the interference is negligible. However, interference nulling does reduce the output desired signal-to-noise ratio. Thus, call blocking occurs when S/N is reduced to less than 7 dB (i.e., BER > $10^{-3}$) with high received interference power. The optimum combiner can therefore tolerate interference at any power⁶ with high probability if $\Gamma_d$ is large enough.

These points are illustrated in Fig. 11 for M = 4. This figure shows the maximum $\Gamma_j/\Gamma_d$ versus $\Gamma_d$ for a blocking probability of 0.01 with eight channels. Thus, the probability of call blocking in one channel is 0.56 [$(0.56)^8 \approx 0.01$]. Results show that the system can tolerate M - 1 (= 3) interferers at any power if $\Gamma_d$ is 7 dB greater than that required without interference. With M or more interferers, the optimum combiner can only tolerate interference that has power approximately equal to that of the desired signal even with very high $\Gamma_d$. Similar results were obtained for M = 2 and 4 with N = 1 and 8.

From the above results, the $\Gamma_d$ required for the system to tolerate L interferers at any power can be determined. Fig. 12 shows the maximum number of interferers at any power versus $\Gamma_d$ for a blocking probability of 0.01 with one channel. The figure shows that close to M - 1 interferers can be tolerated with large increases in $\Gamma_d$.

⁶ In a hardware implementation, the maximum interference power that can be tolerated is usually limited to 40-80 dB.
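The behavior described in this section, where raising the received interference power above the noise level lowers the interference-to-noise ratio at the combiner output, can be checked with a few lines of NumPy. The sketch below applies (5) and (6) to one fixed two-antenna example with a single interferer; the function name, the particular vectors, and the unit noise power are assumptions made for illustration only.

```python
import numpy as np

def output_inr(u_d, u_j, sigma2=1.0):
    """Interference-to-noise ratio at the optimum combiner output for one interferer."""
    M = u_d.size
    R = sigma2 * np.eye(M) + np.outer(u_j.conj(), u_j)   # (5) with L = 1
    w = np.linalg.solve(R, u_d.conj())                    # (6) with alpha = 1
    interference = abs(np.vdot(w, u_j.conj())) ** 2       # |w^dagger u_j^*|^2
    noise = sigma2 * np.vdot(w, w).real                   # sigma^2 w^dagger w
    return interference / noise

# Fixed illustrative vectors for M = 2 (not taken from the paper).
u_d = np.sqrt(10.0 / 2) * np.array([1.0, 1.0], dtype=complex)
v = np.array([1.0, 1.0j]) / np.sqrt(2)                    # unit-norm interferer direction

for power_db in (-10, 0, 10, 20, 30):
    power = 10 ** (power_db / 10)                         # total received interference power
    print(power_db, output_inr(u_d, np.sqrt(power) * v))
```

In this example the output ratio stays close to the per-antenna input ratio while the interference is weaker than the noise, and it falls as the interference power is raised from 10 dB to 30 dB above the noise.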

Fig. 11. The maximum $\Gamma_j/\Gamma_d$ versus $\Gamma_d$ for a blocking probability of 0.01 with eight channels and four antennas.

Fig. 12. The maximum number of interferers at any power versus $\Gamma_d$ for a blocking probability of 0.01 with one channel and M = six, four, and two antennas.

Fig. 13. The maximum number of interferers at any power versus $\Gamma_d$ for a blocking probability of 0.01 with eight channels and M = six, four, and two antennas.

Fig. 13 shows the maximum number of interferers at any power versus $\Gamma_d$ for a blocking probability of 0.01 with eight channels. M - 1 interferers can be tolerated with M = 2, 4, and 6 and increases in $\Gamma_d$ of only 3, 7, and 8 dB, respectively. Thus, the results in this section show that M - 1 interferers at any power can be tolerated with a several dB increase in $\Gamma_d$ if M ≤ 6. Since these results are similar to those for a nonfading environment (where up to M - 1 interferers can be nulled), we might again expect that our results would be valid, even if the fading were not Rayleigh and/or there were more than six antennas.

D. Implementation

For the system with optimum combining to be practical, the antenna array at the base station must not require a large area. The separation for (nearly) independent fading at each antenna is one-quarter wavelength (λ/4, e.g., 8 cm at 900 MHz and 1.5 m at 50 MHz). Thus, with space diversity [9, p. 310], an array of M antennas requires a (λ/4)(M - 1) by (λ/4)(M - 1) area. Furthermore, direction [9, p. 311], [11], polarization [9, p. 311], [12], or field diversity [9, p. 148] can also be used. With these diversity schemes, antennas can be added without increasing the physical size of the antenna array. For example, with polarization diversity in addition to space diversity, the number of antennas can be tripled (three orthogonally polarized antennas for each space diversity antenna) without any change in the area of the array. Thus, with a mixture of diversity techniques, a large number of antennas can be placed in a relatively small area.

Optimum combining can be implemented for in-building systems in the same way as in mobile radio [7]. The optimum combiner can be implemented with an LMS [15], [16] adaptive array. Signals can then be distinguished at the base station by different pseudonoise codes, with these codes added to the biphase PSK signal with an orthogonal biphase PSK signal (see [17]). The pseudonoise codes that are used to distinguish signals are also useful for carrier recovery. The received signal can be mixed with the code to generate a narrow-band signal for carrier recovery. Because of the processing gain with the code, the narrow-band signal will have a high signal to interference plus noise ratio, even when I/S at the receiver output is high. Therefore, the receiver can track the signal phase with little phase jitter even when I/S at the receiver output is close to 1.

A major difference between in-building systems and mobile radio is the fading rate. In mobile radio, the fading rate is about 70 Hz. Thus, the weights must adapt in a few milliseconds. In buildings, however, the fading rate is much less. For example, a 1.5 m/s velocity (i.e., walking with the remote) produces a 4.5 Hz fading rate at 900 MHz and a 0.25 Hz fading rate at 50 MHz. Thus, the weights can be adapted much more slowly, making implementation of the LMS algorithm on a chip much easier. Furthermore, because the fading rate is less, the dynamic range of the LMS adaptive array is greater. That is, the receiver can operate with higher interference to desired signal power ratios. Using the analysis of [7], we can show that the maximum interference to desired signal power ratio is on the order of 30 dB for a 4.5 Hz fading rate as compared to 20 dB for mobile radio. If greater dynamic range is required, other (more complicated) techniques [8] may be used because rapid adaptation is not required.
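The numerical figures quoted in this section follow from elementary relations, which the short sketch below reproduces: the quarter-wavelength spacing c/(4f), the fading (maximum Doppler) rate v/λ, and, under the cubic distance law assumed in Section III-B, the range factor $10^{-\Delta/30}$ for a Δ dB increase in required $\Gamma_d$. The function names are illustrative, and the speed of light is rounded to 3 × 10^8 m/s.

```python
C = 3.0e8  # speed of light in m/s (rounded)

def wavelength(freq_hz):
    """Free-space wavelength in meters."""
    return C / freq_hz

def quarter_wave_spacing(freq_hz):
    """Antenna separation for (nearly) independent fading, lambda / 4, in meters."""
    return wavelength(freq_hz) / 4

def fading_rate(speed_mps, freq_hz):
    """Fading rate in Hz for a remote moving at speed_mps, taken as v / lambda."""
    return speed_mps / wavelength(freq_hz)

def range_factor(extra_db, exponent=3.0):
    """Fraction of the original range left after extra_db more path loss must be absorbed."""
    return 10 ** (-extra_db / (10 * exponent))

for f in (900e6, 50e6):
    print(f, quarter_wave_spacing(f), fading_rate(1.5, f))
print(range_factor(9.0))
```

Read the other way, a 70 Hz fading rate at 900 MHz corresponds to a speed of about 23 m/s, which is a vehicle rather than a person walking.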
As noted in Section III-A1), for adaptive retransmission to be completely effective (i.e., same BER at the remote as at the base station), two requirements are placed on the systems. First, all transmissions must be synchronized. That is, all remotes must transmit at the same time, and all base stations must transmit at the same time. With one base station and multiple remotes, synchronization is not a problem. However, with multiple base stations, there should be synchronization between systems within the same building. A second requirement is that all base stations use optimum combining with adaptive retransmission. If another system did not use this technique, it could interfere with the base-to-remote transmissions⁷ of other systems on a channel. However, the system without optimum combining could suffer interference on both transmission paths. Therefore, in high-density multiple-user environments, systems could not operate without optimum combining, and would be required to use optimum combining with adaptive retransmission.

⁷ It would not interfere with remote-to-base transmissions of systems with optimum combining, however, as optimum combining suppresses any interference.

In this paper, we have studied only the steady-state performance of the optimum combiner. In an actual system, the base station receiver must track both the desired and interfering signals. Although the dynamics of in-building radio communications are slow, the movement of the remotes will affect the performance of the LMS adaptive array (or any other implementation of the optimum combiner). Thus, the transient performance of the system should also be studied.

Finally, in this paper, we have studied the performance of the base station receiver only. A brief analysis (not presented in this paper) shows that the BER at the remote should be similar to that at the base station (for adaptive retransmission with time division). Computer simulation is needed, however, to verify that when the BER is less than $10^{-3}$ at the base station, it is also less than $10^{-3}$ at the remote.

IV. SUMMARY AND CONCLUSIONS

In this paper, we have studied multiple-user in-building radio communication systems. We described a multiple-user system and showed that optimum combining can be used to increase the capacity and interference tolerance of the system. Computer simulation results showed that with optimum combining, a system with one antenna at each remote and M antennas at the base station can achieve either an M-fold increase in capacity or tolerate M - 1 interferers. Finally, we discussed implementation of the system and showed that the system was practical for the office environment.

APPENDIX

Extending the results of Section II, we can see that with L interferers, the BER is

(A-1)

where

$$\bigl(\,\ldots + \cdots + \sqrt{I_L/S}\,\cos\theta_L\bigr)^2 \tag{A-2}$$

and $I_i/S$ is the interference to desired signal power ratio of the ith interferer. Note that the total interference to signal power ratio I/S is $\sum_{i=1}^{L} I_i/S$.

There are two problems with (A-1) and (A-2), however. First, the BER depends not only on the total interference to signal power ratio, but on the individual interference powers as well. However, it was concluded (although not proved) in [18] and [19] that for fixed total interference power, the highest BER is achieved with equal power interferers, i.e., $I_i/S = (1/L)\,I/S$ for i = 1, ..., L. Therefore, we considered equal power interferers as a worst case and generated an approximate lower bound for maximum I/S versus S/N for a $10^{-3}$ BER. A second problem is that for numerical evaluation of (A-1), computer time grows exponentially with L, and therefore, calculations are only practical for small values of L. Another formula for the BER is given in [18], which uses a series rather than integration. Unfortunately, the series has convergence problems (on a digital computer) for most of the cases of interest in this paper. Thus, (A-1) was used to calculate the BER, but only for L ≤ 5. Fig. 2 shows the results. Note that for large S/N with L = 5, there appears to be some error in the curve. (For L = 5, the error could not be determined because of the extensive computer time required.) However, this error does not affect our results for the reasons discussed below.

We also considered two other BER equations. First, for large L, the interference can be considered to be the same as Gaussian noise [18], and therefore, the BER is given by

$$\mathrm{BER} = \frac{1}{2}\,\mathrm{erfc}\left(\sqrt{\frac{1}{(S/N)^{-1} + I/S}}\right). \tag{A-3}$$

Results using this approximation are given in Fig. 2. Second, an upper bound on the BER with interference for any number of interferers is given by [20]

$$\mathrm{BER} \le \exp\left[-\frac{1}{(S/N)^{-1} + I/S}\right]. \tag{A-4}$$

Results using this upper bound are also shown in Fig. 2. Note that this bound is not very tight for small I/S; from this bound, the S/N is 8.4 dB at a $10^{-3}$ BER (without interference, I/S = 0), while the actual S/N required [from (1)] is 1.6 dB less.

Fig. 2 shows that the maximum I/S varies significantly with the BER equation used. (Equations (A-1) and (A-2) with equal power interferers were used for the results presented in Figs. 4-13.) However, our results for the optimum combining system (with M antennas and L interferers) for L < M in Figs. 4-13 and our conclusions do not depend on the BER equation used. This is because, for L < M, the number of degrees of freedom in the adaptive array using optimum combining is greater than or equal to the number of interferers, and therefore, the array can usually greatly suppress the interferers without affecting the desired signal. Therefore, the I/S at the array output is small, and, if the S/N is large enough, the BER is less than $10^{-3}$. Thus, the array usually operates in the small I/S region where the required S/N is about the same for all the BER equations (except for the upper bound (A-4) where the required S/N is 1.6 dB higher). We verified that our results for L ≤ M in Figs. 4-13 were not significantly changed by the I/S curve used, except that the S/N was 1.6 dB higher for the I/S curve from (A-4).

For L ≥ M, the number of degrees of freedom in the array is less than the number of interferers, and therefore, the array cannot greatly suppress all the interferers in most cases. Thus, the variation in maximum I/S at high S/N has a dramatic effect on the results. As noted above, the results in this paper are based on (A-1) with equal power interferers, and thus, our results should be conservative for L ≥ M. However, our conclusions (an M-fold increase in capacity and suppression of M - 1 interferers) are based on the L < M case, and therefore, do not depend on which BER equation is used.
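The two closed-form expressions of the Appendix are easy to evaluate numerically. The sketch below codes the Gaussian approximation (A-3) and the upper bound (A-4), as we read them, as functions of S/N and I/S in dB, and solves each for the S/N that gives a $10^{-3}$ BER without interference; the results should land near the 6.8 dB and 8.4 dB quoted in the text. The function names and the bisection helper are illustrative, and the multi-interferer expression (A-1) is not implemented.

```python
import math

def sinr(sn_db, is_db=None):
    """1 / ((S/N)^-1 + I/S), with I/S = 0 when is_db is None."""
    sn = 10 ** (sn_db / 10)
    i_s = 0.0 if is_db is None else 10 ** (is_db / 10)
    return 1.0 / (1.0 / sn + i_s)

def ber_gaussian(sn_db, is_db=None):
    """Gaussian approximation (A-3): interference treated as extra noise."""
    return 0.5 * math.erfc(math.sqrt(sinr(sn_db, is_db)))

def ber_upper_bound(sn_db, is_db=None):
    """Upper bound (A-4)."""
    return math.exp(-sinr(sn_db, is_db))

def required_sn_db(ber_fn, target=1e-3, lo=0.0, hi=30.0):
    """Bisection for the S/N (dB) at which ber_fn falls to target, without interference."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if ber_fn(mid) > target:
            lo = mid
        else:
            hi = mid
    return hi

print(required_sn_db(ber_gaussian))     # close to 6.8 dB
print(required_sn_db(ber_upper_bound))  # close to 8.4 dB
```

The gap between the two printed values is about 1.6 dB, the looseness of the bound noted in the text.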

385

REFERENCES

[1]

K. Tsujimura and M. Kuwabara, "Cordless telephone system and its propagation characteristics," IEEE Trans. Vehic. Technol., vol. VT-26, pp. 367-371, Nov. 1977.
[2] K. Yamada, S. Naka, A. Nishiyama, and T. Miyo, "2 GHz-band cordless telephone system," in Proc. 29th IEEE Vehic. Technol. Conf., Arlington Heights, IL, Mar. 1979, pp. 159-163.
[3] S. E. Alexander, "Radio propagation within buildings at 900 MHz," Electron. Lett., pp. 913-914, Oct. 14, 1982.
[4] P. S. Wells, "The attenuation of UHF radio signals by houses," IEEE Trans. Vehic. Technol., vol. VT-26, pp. 358-362, Nov. 1977.
[5] A. A. M. Saleh and R. A. Valenzuela, "A statistical model for indoor multipath propagation," in Proc. Int. Conf. Commun., 1986, pp. 27.2.1-27.2.5.
[6] D. M. J. Devasirvatham, "The delay spread measurements of wideband radio signals within a building," Electron. Lett., vol. 20, pp. 950-951, Nov. 8, 1984.
[7] J. H. Winters, "Optimum combining in digital mobile radio with cochannel interference," IEEE J. Select. Areas Commun., vol. SAC-2, pp. 528-539, July 1984 (also IEEE Trans. Vehic. Technol., vol. VT-33, pp. 144-155, Aug. 1984).
[8] R. A. Monzingo and T. W. Miller, Introduction to Adaptive Arrays. New York: Wiley, 1980.
[9] W. C. Jakes, Jr., Microwave Mobile Communications. New York: Wiley, 1974.
[10] H. Taub and D. L. Schilling, Principles of Communication Systems. New York: McGraw-Hill, 1971.
[11] K. H. Awadalla, "Direction diversity in mobile communications," IEEE Trans. Vehic. Technol., vol. VT-30, pp. 121-123, Aug. 1981.
[12] R. T. Compton, Jr., "On the performance of a polarization sensitive adaptive array," IEEE Trans. Antennas Propagat., vol. AP-29, pp. 718-725, Sept. 1981.
[13] P. S. Henry and B. S. Glance, "A new approach to high-capacity digital mobile radio," Bell Syst. Tech. J., vol. 60, pp. 1891-1904, Oct. 1981.
[14] C. A. Baird, Jr. and C. L. Zahm, "Performance criteria for narrowband array processing," in Proc. Conf. Decision Contr., Miami Beach, FL, Dec. 1971, pp. 564-565.
[15] B. Widrow, P. E. Mantey, L. J. Griffiths, and B. B. Goode, "Adaptive antenna systems," Proc. IEEE, vol. 55, pp. 2143-2159, Dec. 1967.
[16] B. Widrow, J. McCool, and M. Ball, "The complex LMS algorithm," Proc. IEEE, vol. 63, pp. 719-720, Apr. 1975.
[17] J. H. Winters, "Increased data rates for communication systems with adaptive antennas," in Proc. Int. Conf. Commun., Philadelphia, PA, June 1982, pp. 4F.3.1-4F.3.5.
[18] A. S. Rosenbaum, "Binary PSK error probabilities with multiple cochannel interferences," IEEE Trans. Commun. Technol., vol. COM-18, pp. 241-253, June 1970.
[19] V. K. Prabhu, "Error rate consideration for coherent phase-shift-keyed systems with co-channel interference," Bell Syst. Tech. J., vol. 48, pp. 743-767, Mar. 1969.
[20] G. J. Foschini and J. Salz, "Digital communications over fading radio channels," Bell Syst. Tech. J., vol. 62, pp. 429-456, Feb. 1983.


The Performance Enhancement of Multibeam Adaptive Base-Station Antennas for Cellular Land Mobile Radio Systems

SIMON C. SWALES, MARK A. BEACH, DAVID J. EDWARDS, AND JOSEPH P. McGEEHAN, MEMBER, IEEE

Abstract- The problem of meeting the proliferating demands for mobile telephony within the confinements of the limited radio spectrum allocated to these services is addressed. A multiple beam adaptive base-station antenna is proposed as a major system component in an attempt to solve this problem. This novel approach is demonstrated here by employing an antenna array capable of resolving the angular distribution of the mobile users as seen at the base-station site, and then using this information to direct beams toward either lone mobiles, or groupings of mobiles, for both transmit and receive modes of operation. The energy associated with each mobile is thus confined within the addressed volume, greatly reducing the amount of co-channel interference experienced from and by neighboring co-channel cells. In order to ascertain the benefits of such an antenna, a theoretical approach is adopted which models the conventional and proposed antenna systems in a typical mobile radio environment. For a given performance criterion, this indicates that a significant increase in the spectral efficiency, or capacity, of the network is obtainable with the proposed adaptive base-station antenna.

Manuscript received April 18, 1989. This work was supported by UK SERC. The authors are with the Centre for Communications Research, Faculty of Engineering, University of Bristol, Bristol, BS8 1TR, UK. IEEE Log Number 9034227.

I. INTRODUCTION

THE FREQUENCY SPECTRUM is, and always will be, a finite and scarce resource; thus there is a fundamental limit on the number of radio channels that can be made available to mobile telephony. Hence, it is essential that cellular land mobile radio (LMR) networks utilize the radio spectrum allocated to this facility efficiently, so that a service can be offered to as large a subscriber community as possible. Indeed, a major consideration of the second generation cellular discussions in both the US and Europe has focused on this point. However, present and proposed future generation cellular communication networks which employ either omnidirectional, or broad sector-beam, base-station antennas will be beset with the problem of severe spectral congestion as the subscriber community continues to expand.

A measure often used to assess the efficiency of spectrum utilization is the number of voice channels per megahertz of available bandwidth per square kilometer [1]. This defines the amount of traffic that can be carried and is directly related to the ultimate capacity of the network. Hence, as traffic demands increase, the spectral efficiency of the network must also increase if the quality and availability of service is not to be degraded. At present this is overcome in areas with a high traffic density by employing a technique known as cell splitting. However, the continuing growth in traffic demands has meant that cell sizes have had to be reduced to a practical minimum in many city centers in order to maintain the quality of service. As well as increasing the infrastructure costs, the number of subscribers able to access these systems simultaneously is still well below the long-term service forecasts due to the reduced trunking efficiency of the network. This places great emphasis on maximizing the spectral efficiency, or ultimate capacity, of future generation systems, and thereby fulfilling the earlier promises of performance.

There have already been significant developments in terms of spectrally efficient modulation schemes, e.g., the proposed US narrow-band digital linear system [2], [3] and the second generation Pan-European cellular network [4]. Also, in the area of antenna technology, the use of fixed coverage directional antennas has been considered [5]. In particular, the use of fixed phased array antennas, with carefully controlled amplitude tapers and sidelobe levels, for the enhanced UK TACS network (ETACS) [6] is currently under evaluation. However, the application of adaptive antenna arrays in civil land mobile radio systems has hitherto received little attention, in spite of the significant advances made in this field for both military and satellite communications.

In this paper a multiple beam adaptive base-station antenna is proposed to complement other solutions, such as spectrum efficient modulation, currently being developed to meet the proliferating demands for enhanced capacity in cellular networks. The feasibility of such a scheme is demonstrated, and a comparison made with existing conventional antennas in a realistic mobile radio environment. Geometrical and statistical propagation models are used and a unique insight is given into the benefits of utilizing adaptive base-station antennas in a cellular radio system. Finally, the concept of such a scheme is discussed and the integration of adaptive antenna array technology into a mobile communications environment considered.

II. ADAPTIVE ANTENNA ARRAYS

An adaptive antenna array may be defined as one that modifies its radiation pattern, frequency response, or other parameters, by means of internal feedback control while the antenna system is operating. The basic operation is usually described in terms of a receiving system steering a null, that is, a reduction in sensitivity in a certain angular position, toward a source of interference. The first practical implementation of electronically steering a null in the direction of an unwanted signal, a jammer, was the Howells-Applebaum sidelobe canceller for radar. This work started in the late 1950's, and a fully developed system for suppressing five jammers was reported in open literature in 1976 by Applebaum [7]. At about the same time Widrow [8] independently developed an approach for controlling an adaptive array using a recursive least squares minimization technique, now known as the LMS algorithm. Following the pioneering work of Howells, Applebaum, and Widrow, there has been a considerable amount of research activity in the field of adaptive antenna arrays, particularly for reducing the jamming vulnerability of military communication systems. However, to date, there has been little attention to the application of such techniques in the area of civil land mobile radio.

Adaptive antenna arrays cannot simply be integrated into any arbitrary communication system, since a control process has to be implemented which exploits some property of either the wanted, or interfering, signals. In general, adaptive antennas adjust their directional beam patterns so as to maximize the signal-to-noise ratio at the output of the receiver. Applications have included the development of receiving systems for acquiring desired signals in the presence of strong jamming, a technique known as power inversion [9]. Systems have also been developed for the reception of frequency hopping signals [10], [11], TDMA satellite channels [12], and spread spectrum signals [13]. Of particular interest for cellular schemes is the development of adaptive antenna arrays and signal processing techniques for the reception of multiple wanted signals [14].

Reprinted from IEEE Transactions on Vehicular Technology, Vol. 39, No. 1, pp. 56-67, February 1990.

A. Fundamentals of Operation

The adaptive array consists of a number of antenna elements, not necessarily identical, coupled together via some form of amplitude control and phase shifting network to form a single output. The amplitude and phase control can be regarded as a set of complex weights, as illustrated in Fig. 1. If the effects of receiver noise and mutual coupling are ignored, the operation of an $N$ element uniformly spaced linear array can be explained as follows. Consider a wavefront generated by a narrow-band source of wavelength $\lambda$ arriving at an $N$ element array from a direction $\theta_k$ off the array boresight. Now taking the first element in the array as the phase reference and letting $d$ equal the array spacing, the relative phase shift of the received signal at the $n$th element can be expressed as

$$\psi_{nk} = \frac{2\pi d (n-1)}{\lambda} \sin\theta_k. \tag{1}$$

Assuming constant envelope modulation of the source at $\theta_k$, the signal at the output of each of the antenna elements can be expressed as

$$x_{nk}(t) = e^{j(\omega t + \psi_{nk})} \tag{2}$$

and the total array output in direction $\theta_k$ as

$$y_k(t) = \sum_{n=1}^{N} w_n e^{j(\omega t + \psi_{nk})} \tag{3}$$

Fig. 1. An adaptive antenna array.

where $w_n$ represents the value of the complex weight applied to the output of the $n$th element. Thus by suitable choice of weights, the array will accept a wanted signal from direction $\theta_1$ and steer nulls toward interference sources located at $\theta_k$, for $k \neq 1$. Likewise, the weighting network can be optimized to steer beams (radiation pattern maxima of finite width) in a specific direction, or directions. It can be shown [15] that an $N$ element array has $N - 1$ degrees of freedom, giving up to $N - 1$ independent pattern nulls. If the weights are controlled by a feedback loop which is designed to maximize the signal-to-noise ratio at the array output, the system can be regarded as an adaptive spatial filter.

The antenna elements can be arranged in various geometries, with uniform line, circular, and planar arrays being very common. The circular array geometry is of particular interest here since beams can be steered through 360°, thus giving complete coverage from a central base-station. The elements are typically sited $\lambda/2$ apart, where $\lambda$ is the wavelength of the received signal. Spacing of greater than $\lambda/2$ improves the spatial resolution of the array; however, the formation of grating lobes (secondary maxima) can also result. These are generally regarded as undesirable.
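The phase relation (1) and the weighted sum (3) can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes a four-element line array with half-wavelength spacing and simple conjugate (phase-only) weights, and the function names are hypothetical. The common carrier term $e^{j\omega t}$ is dropped because it does not affect the magnitude of the output.

```python
import cmath
import math


def element_phase(n, theta, d_over_lambda):
    """Phase shift of equation (1) at element n (1-based) for angle theta in radians."""
    return 2.0 * math.pi * d_over_lambda * (n - 1) * math.sin(theta)


def steering_weights(num_elements, theta0, d_over_lambda):
    """Conjugate weights that co-phase a signal arriving from theta0 (an assumed choice)."""
    return [cmath.exp(-1j * element_phase(n, theta0, d_over_lambda))
            for n in range(1, num_elements + 1)]


def array_output(weights, theta, d_over_lambda):
    """Magnitude of the sum in equation (3), carrier term omitted."""
    total = sum(w * cmath.exp(1j * element_phase(n, theta, d_over_lambda))
                for n, w in enumerate(weights, start=1))
    return abs(total)


# Assumed example: N = 4 elements, d = lambda/2, beam steered to boresight.
weights = steering_weights(4, 0.0, 0.5)
peak = array_output(weights, 0.0, 0.5)                  # coherent sum in the wanted direction
null = array_output(weights, math.radians(30.0), 0.5)   # a pattern null of this weight set
print(peak, null)
```

With these assumed values the element phases at 30° advance by a quarter cycle per element, so the four contributions cancel and the response falls to a null, while the boresight response equals the number of elements.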

B. Adaptive Antenna Arrays for Cellular Base-Stations

Multiple beam adaptive antenna arrays have been considered by Davies et al. [16] for enhancing the number of simultaneous users accessing future generation cellular networks. It is suggested that each mobile is tracked in azimuth by a narrow beam for both mobile-to-base and base-to-mobile transmissions, as shown in Fig. 2. The directive nature of the beams ensures that in a given system the mean interference power experienced by any one user, due to other active mobiles, would be much less than that experienced using conventional wide coverage base-station antennas. It has already been stressed that high capacity cellular networks are designed to be interference limited, so the adaptive antenna would considerably increase the potential user capacity. This increase in system capacity of the new base-station antenna architecture was evaluated [17] by considering the spatial filtering properties of an antenna array. The results show that this type of base-station antenna could increase the spectral efficiency of the network by a factor of 30 or more. These results were obtained for a hypothetical fast frequency hopping code division multiple access cellular network [18], assuming uniform user distribution and complete frequency reuse for the omnidirectional antenna case, i.e., adjacent cells are co-channel cells. Complete frequency reuse is then assumed for each of the beams formed by the adaptive array, i.e., adjacent beams are co-channel beams. Further, it was shown that a similar enhancement of efficiency can be obtained for either an idealized multibeam antenna, or a realizable 128 element circular array [19]. It was recognized in the analysis, but not fully assessed, that this approach would greatly increase the level of co-channel interference. It was, therefore, suggested that this problem could be overcome using dynamic channel allocation to eliminate the so called common zones. This again introduces additional hand-offs, reducing the trunking efficiency and available capacity of the network, as the mobile circumnavigates the cell.

Fig. 2. Tracking of mobiles with multiple beams.

The only study previous to the work discussed above considering the use of an adaptive antenna array in land mobile radio was by Marcus and Das [20] in 1983. The analysis assumed that the base-station, or repeater, sites could be placed closer together if an antenna array formed 20 dB nulls toward co-channel sites. This effectively reduces the amount of co-channel interference at the output of the base-station as explained in Section II. It was suggested that in this system the beam steering information could be derived from the squelch tone injection which is presently used in the US FM land mobile radio.

In contrast with the null steering technique considered by Marcus and Das, here the ability of the adaptive array to steer radiation pattern maxima toward the mobiles is considered. In the limit it can be envisaged that individual beams will be formed towards each mobile as illustrated in Fig. 2. It has already been mentioned that adaptive antenna technology cannot be simply integrated into an arbitrary communication system, and at present no one particular modulation scheme, or access technique, has been selected for the third generation of cellular systems. However, some well-established trends are becoming apparent in the quest toward higher spectrally efficient modulation schemes [1] for the systems of the year 2000 and beyond. It is thus vital during the initial stages of research to develop antenna architectures which are, in essence, modulation scheme independent, so that a figure of merit can be obtained for the multibeam base-station antenna.

III. REDUCTION OF CO-CHANNEL INTERFERENCE USING ADAPTIVE ANTENNAS

In this paper the integration of an idealized adaptive array into an existing cellular network is considered. In order to ascertain the benefits of this class of antenna system compared with that of conventional omnidirectional base-station antenna systems, the following network topology has been assumed.

1) A cellular network consisting of hexagonal cells, with channel reuse every $C$ cells ($C$ is the cluster size).
2) The base-station transmitters are centrally located within each hexagonal cell.
3) There is a uniform distribution of users per cell.
4) There is a blocking probability of $B$ in all cells.
5) The omnidirectional base-station antenna has an ideal beam pattern, giving a uniform circular coverage.
6) The adaptive base-station antenna can generate any number, $m$, of ideal beams, with a beamwidth of $2\pi/m$, and a gain equal to the omni-antenna.
7) Each adaptive beam will only carry the channels that are assigned to the mobiles within its coverage area.
8) Any mobile (or group of mobiles) can be tracked by the adaptive base-station antenna.
9) The necessary base-station hardware is available to enable beamforming and tracking.
10) The same modulation scheme can be used with each antenna system.

The blocking probability $B$ in assumption 4) is the fraction of attempted calls that cannot be allocated a channel. If there are $a$ Erlangs of traffic intensity offered, the actual traffic carried is equal to $a(1 - B)$ Erlangs. The Erlang is a measure of traffic intensity, and measures the quantity of traffic on a channel or group of channels per unit time. This gives an outgoing channel usage efficiency (or loading factor) [21] of

$$\eta = \frac{a(1 - B)}{N} \tag{4}$$

where $N$ is the total number of channels allocated per cell.

Assumptions 6), 7), and 8) imply the deployment of a somewhat hypothetical adaptive antenna system. This approach can be justified since a uniform user population has been assumed for both categories of antenna system. It is recognized that the dynamic, nonuniform, user distribution will have a significant effect on the results presented here. This will be considered in a subsequent more rigorous study. Also, in the analysis which follows only the base-to-mobile link has been studied; however, it can be shown that the analysis is also valid for the mobile-to-base link.

Two different categories of co-channel interference models are used as the basis for the study presented here. The first is the geometrical model adopted by Lee [5], followed by a more rigorous statistical analysis [21]-[23].
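The loading factor of (4) can be evaluated with a few lines of code. This is a minimal sketch under stated assumptions: the paper only says that a blocking probability $B$ exists, so the use of the standard Erlang B recursion to obtain $B$ from the offered traffic is an assumption made here for illustration, and the function names are hypothetical.

```python
def erlang_b(offered_erlangs, channels):
    """Blocking probability from the Erlang B recursion (assumed traffic model)."""
    blocking = 1.0  # with zero channels every call is blocked
    for n in range(1, channels + 1):
        blocking = offered_erlangs * blocking / (n + offered_erlangs * blocking)
    return blocking


def loading_factor(offered_erlangs, blocking, channels):
    """Equation (4): eta = a(1 - B) / N."""
    return offered_erlangs * (1.0 - blocking) / channels


# Assumed example: 10 Erlangs offered to 14 channels.
a, n_channels = 10.0, 14
b = erlang_b(a, n_channels)
print(b, loading_factor(a, b, n_channels))
```

Any other source for $B$ (for example a measured value) can be passed directly to `loading_factor`, since (4) itself does not depend on how the blocking probability was obtained.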

IV. GEOMETRICAL PROPAGATION MODEL

Fig. 3. Two co-channel cells.

This approach considers the relative geometry of the transmitter and receiver locations, and takes into account the propagation path loss associated with the mobile radio channel.

A. One Co-Channel Cell

Consider one co-channel cell which forms part of a cellular network as shown in Fig. 3. By definition both the cells have the same channel allocation, and a reuse distance of $D$ separating the base-station transmitters. The co-channel reuse ratio is defined as

$$Q = D/R \tag{5}$$

where $R$ is the cell radius. This ratio has also been termed the co-channel interference reduction factor [5], since the larger it is (i.e., the further apart the cells) the less the co-channel interference for a given modulation scheme. The level of acceptable co-channel interference governs the value of this parameter and the overall spectral efficiency of the network.

The area mean signal level experienced at the mobile is assumed to be inversely proportional to the distance from the base-station raised to a power $\gamma$. With the advent of smaller cells, the propagation path loss is close to the free-space value [24]; however, it is envisaged that the proposed base-station will initially operate in larger cells. Therefore, as a starting point for the comparison to follow, the commonly used approximation that the received signal power is inversely proportional to the fourth power of range will be used [25]. Hence, the area mean signal level (in volts) received from the wanted base-station at a mobile a distance $d_w$ from the transmitter is

$$m_w = \frac{k}{d_w^2}. \tag{6}$$

Similarly, the area mean signal level from the interfering base-station transmitter at a distance $d_i$ is

$$m_i = \frac{k}{d_i^2} \tag{7}$$

assuming in each case identical radiated transmitter powers and signal propagation constants, as denoted by the constant $k$.

Co-channel interference will occur when the ratio of the received wanted signal envelope, $s_w$, to the interfering signal envelope, $s_i$, is less than some protection ratio, $p_r$, i.e.:

$$\frac{s_w}{s_i} \le p_r. \tag{8}$$

The protection ratio is defined by the modulation scheme employed [1]. Considering only the propagation path loss, the received signal envelopes are equal to the area mean signal levels, hence:

$$\frac{m_w}{m_i} = \frac{d_i^2}{d_w^2} \le p_r. \tag{9}$$

So, for a given protection ratio, a locus given by

$$\frac{d_i}{d_w} = \sqrt{p_r} \tag{10}$$

can be drawn. This defines a region where no interference will occur, and a region where it will always occur, as illustrated in Fig. 4. For the worst-case position, which is in a direct line between the transmitters as shown, the co-channel reuse ratio is

$$Q = D/R = 1 + d_i/d_w = 1 + \sqrt{p_r}. \tag{11}$$

Fig. 4. Contour defining interference regions.
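As a worked illustration of (10) and (11), the reuse ratio can be computed from a protection ratio quoted in decibels. This is a minimal sketch: the conversion assumes that $p_r$ is a ratio of signal envelopes (volts), so the decibel value is divided by 20, and the function names are hypothetical.

```python
import math


def db_to_envelope_ratio(value_db):
    """Convert decibels to a linear envelope (voltage) ratio; assumes 20*log10 scaling."""
    return 10.0 ** (value_db / 20.0)


def reuse_ratio(protection_ratio_db):
    """Equation (11): Q = 1 + sqrt(p_r), with p_r as a linear envelope ratio."""
    return 1.0 + math.sqrt(db_to_envelope_ratio(protection_ratio_db))


# Assumed example values of the protection ratio in decibels.
for p_db in (0.0, 10.0, 18.0):
    print(p_db, round(reuse_ratio(p_db), 3))
```

A protection ratio of 0 dB gives $Q = 2$, which corresponds to adjacent cells being co-channel cells; larger protection ratios push the co-channel cells further apart.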

For a given protection ratio and modulation scheme, this defines the minimum spacing between co-channel cells in order to avoid interference, and the maximum spectral efficiency obtainable. In this discussion it is assumed that the same modulation scheme is employed for both antenna systems under evaluation. This implies that the protection ratio and reuse distances are identical in both cases. Therefore, there would appear to be no apparent benefit from employing adaptive antenna technology at the base-station site. However, the occurrence of co-channel interference is a statistical phenomenon. Hence, when comparing omni- and adaptive antennas, it is necessary to introduce the concept of the probability of co-channel interference occurring, i.e., $P(s_w \le p_r s_i)$. This is often called the outage probability, which is the probability of failing to obtain satisfactory reception at the mobile in the presence of interference.

If the cells are considered to be identical, i.e., have equal blocking probabilities, then on average there will be $N\eta$ active channels in each cell ($\eta$ is as defined in (4)). So, in the case of the omnidirectional antenna, given that the wanted mobile is already allocated a channel, the probability of that channel being active in an interfering cell is the required outage probability. Hence, when the wanted mobile is in the region of co-channel interference the outage probability is given by

$$P(s_w \le p_r s_i) = \frac{\text{number of active channels}}{\text{total number of channels}} = \frac{N\eta}{N} = \eta. \tag{12}$$

Now consider the case of the adaptive antenna as previously described, with $m$ beams per base-station providing coverage of the whole cell, and with $N\eta/m$ channels per beam, given a uniform distribution of users. The same regions of co-channel interference can be defined; however, when the wanted mobile is within the region where co-channel interference may occur, the outage probability is reduced. The wanted mobile is always covered by at least one beam from the co-channel cell; hence, the outage probability is equal to the probability that one of the channels in the aligned beam is the corresponding active co-channel¹ and is given by

$$P(s_w \le p_r s_i) = \frac{\text{number of channels per beam}}{\text{total number of channels}} = \frac{N\eta/m}{N} = \frac{\eta}{m} \tag{13}$$

where the omnicase is given by $m = 1$. These results are presented graphically in Fig. 5, and show the strong influence of the number of beams, $m$, on the outage probability. The influence of the loading factor, $\eta$, is as expected, i.e., the less the loading, the fewer the number of active channels, and hence, a reduced chance of co-channel interference. This assumes that there are still $m$ beams formed even though there are only $a(1 - B)$ active channels (or users). This is only really valid if $a(1 - B) > m$ and the users are uniformly distributed within the cell. If this were not the case, and the number of beams formed was less than $m$, the outage probability would be reduced even further since the wanted mobile would not be covered by a co-channel beam all the time. This situation will not be pursued further since this analysis can be regarded as a worst-case situation.

¹ The "active co-channel" is the channel that has also been allocated to the wanted mobile.

Fig. 5. Outage as a function of the number of beams, $m$.

B. Six Co-Channel Cells

The previous approach can now be simply extended to assess the effect of six co-channel interferers, i.e., the first tier of co-channel cells in a conventional cellular scheme as shown in Fig. 6. It is considered that further tiers of interferers will not significantly affect the results except when reuse distances become small. Equation (9) can now be rewritten for this more realistic representation of the cellular network:

$$\frac{s_w}{s_I} = \frac{m_w}{m_I} = \frac{d_w^{-\gamma}}{\sum_{i=1}^{n} d_i^{-\gamma}} \le p_r \tag{14}$$

where the total mean signal level from the interfering cells, $m_I$, is the sum of the mean level from each active cell. Thus in a fully loaded system, the number of active users is six (i.e., $n = 6$). If all the $d_i$ are assumed to be equal and the wanted mobile is at the edge of a cell boundary, as for the case described in Lee [5], then the co-channel reuse factor can be expressed as

$$Q = \left[6\,(s_w/s_I)\right]^{1/\gamma}. \tag{15}$$

Fig. 6. Hexagonal cellular layout showing tiers of interferers.

Subjective tests showed that over a mobile radio channel $s_w/s_I \ge 18$ dB (i.e., $p_R = 18$ dB) gave good speech transmission for a 25-kHz FM channel operation. A value of $Q$ can now be calculated to define the minimum cluster size, $C$. Using simple geometry it would be possible to evaluate the actual $s_w/s_I$ in the worst-case locations. From this, a contour may be drawn defining regions with and without interference. For both classes of antenna systems the outage probability is still zero within the contour (i.e., when $s_w/s_I > p_r$), but outside, in the region of interference:

$$P(s_w \le p_r s_I) = \left(\frac{\eta}{m}\right)^6 \tag{16}$$

where $s_I$ is the total co-channel interference and the omnicase is given by $m = 1$. Since it is assumed that all $m$ beams per cell are formed, there are six beams aligned onto the wanted mobile at any time. The outage probability within the region of interference is then found by considering the probability that the active co-channel is in each of these beams.

C. Analysis of Results

The use of adaptive multiple beam-forming base-stations would, based upon the analysis presented so far, appear to give an improvement in performance with regard to the reduction of the probability of co-channel interference. The improvement depends on the degree of adaptivity used, i.e., the number of ideal beams formed. However, the above approach is oversimplified and gives a rather optimistic view of the situation. Firstly, the beams are assumed to be ideal, giving an equal gain over the whole beamwidth. In practice this would not be the case. Also, a hypothetical situation could be envisaged where, if $m$ is large enough to satisfy a given outage criterion,² it would appear that the ultimate reuse distance ($D/R = 2$) is possible for any modulation scheme. Hence, adjacent cells are co-channel cells, the radius of which is decided by the required coverage area of the base-station site. In spite of this, the analysis has been useful in introducing some of the important factors that affect the performance of a mobile radio network which exploits frequency reuse as a means of increasing spectral efficiency.
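The geometrical-model outage expressions (12), (13), and (16) reduce to simple arithmetic, which the following sketch tabulates. It is an illustration only, assuming the mobile lies inside the region of interference; the function names are hypothetical and the loading factor of 0.7 is an assumed example value.

```python
def outage_one_cell(eta, m=1):
    """Equations (12) and (13): outage for one co-channel cell; m = 1 is the omni case."""
    if m < 1:
        raise ValueError("the number of beams m must be at least 1")
    return eta / m


def outage_six_cells(eta, m=1):
    """Equation (16): outage for six co-channel cells in the region of interference."""
    return outage_one_cell(eta, m) ** 6


# Assumed example: loading factor 0.7 and a range of beam counts.
for beams in (1, 2, 4, 8, 16, 32):
    print(beams, outage_one_cell(0.7, beams), outage_six_cells(0.7, beams))
```

For a fully loaded system (loading factor of one) and ten beams, `outage_one_cell` returns 0.1, matching the 10% figure quoted in the footnote to this section.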

V. STATISTICAL PROPAGATION MODEL

In the previous analysis only the path loss associated with the mobile radio environment was considered when calculating the level of co-channel interference. This was useful in demonstrating the principal benefits to be offered by adaptive antennas, although it is an oversimplified approach and unrealistic for many land mobile radio environments. It was shown that a single contour defining regions of operation where co-channel interference would occur can be drawn; however, it is known that the signal levels fluctuate rapidly, generating small isolated pockets of interference in an operational system. In some adverse environments these areas may be quite close to the base-station antenna. There is seldom a line of sight path between the base-station and the mobile, and hence, radio communication is obtained by means of diffraction and reflection of the transmitted energy. This produces a complicated signal pattern causing the field strength to vary greatly throughout the cell, and the received signal at the moving mobile to fluctuate very rapidly. This is generally attributed to the superposition of two different classifications of signal fading phenomenon: fast fading (or just fading) due to the multipath nature of the received signal, and slow fading (shadowing), the slower variations of the received signal due to variations in the local terrain. In areas experiencing this type of signal variation, the area mean signal level is essentially constant. In order to model these propagation effects, they are included in a statistical fashion, the fading and shadowing described above being represented by Rayleigh and log-normal type distributions, respectively.

² If there are ten beams ($m = 10$), a 10% outage criterion could be satisfied in a fully loaded system (13).

A. One Co-Channel Cell

Various studies [26]-[28] have been undertaken to analyze co-channel interference originating from a single co-channel interfering cell in an attempt to characterize the mobile radio environment. In particular the rigorous analysis presented by French [22] has been adopted here. The fast fading is the rapid fluctuation of the signal level $s$ about the local mean $\bar{s}$ ($\bar{s} = \langle s \rangle$), and is usually described by a Rayleigh type probability density function (pdf), i.e.:

$$p(s \mid \bar{s}) = \frac{\pi s}{2\bar{s}^2} \exp\left[-\frac{\pi s^2}{4\bar{s}^2}\right]. \tag{17}$$

Shadowing of the radio signal due to the terrain, i.e., by buildings and hills, causes the local mean level $\bar{s}$ to fluctuate about the area mean. It has been generally accepted that this variation is log-normally distributed about the area mean $m_d$, where $m_d = \langle \bar{s}_d \rangle$, the mean of $\bar{s}$ in decibels. (Note, a subscript "d" indicates that a signal is in decibels.) The area mean level is approximately proportional to the inverse of the distance from the base-station raised to the power $\gamma$, as described in Section IV-A. Hence, the log-normal shadowing pdf is given by

$$p(\bar{s}_d) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(\bar{s}_d - m_d)^2}{2\sigma^2}\right]. \tag{18}$$

The standard deviation, $\sigma$, describes the degree of shadowing. This parameter typically varies from 6 to 12 dB in urban areas, the larger value being associated with very built up inner city areas. The combined pdf can now be expressed as

$$p(s) = \int_{-\infty}^{\infty} p(s \mid \bar{s})\, p(\bar{s}_d)\, d\bar{s}_d. \tag{19}$$

By substituting $\bar{s} = 10^{\bar{s}_d/20}$ (from $\bar{s}_d = 20 \log_{10} \bar{s}$) into (17), the combined pdf becomes

$$p(s) = \sqrt{\frac{\pi}{8\sigma^2}} \int_{-\infty}^{\infty} \frac{s}{10^{\bar{s}_d/10}} \exp\left[-\frac{\pi s^2}{4 \times 10^{\bar{s}_d/10}}\right] \exp\left[-\frac{(\bar{s}_d - m_d)^2}{2\sigma^2}\right] d\bar{s}_d. \tag{20}$$
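A quick numerical check can confirm that the Rayleigh pdf of (17) is normalized and that its mean equals the local mean, which is the property the derivation relies on. The sketch below uses a plain trapezoidal rule; the integration limits, step count, and the local mean of 2.0 are assumed example values, and the function names are hypothetical.

```python
import math


def rayleigh_pdf(s, local_mean):
    """Equation (17): pdf of the envelope s given the local mean."""
    a = math.pi / (4.0 * local_mean ** 2)
    return 2.0 * a * s * math.exp(-a * s * s)


def integrate(func, lower, upper, steps=20000):
    """Composite trapezoidal rule over [lower, upper]."""
    h = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    for k in range(1, steps):
        total += func(lower + k * h)
    return total * h


local_mean = 2.0             # assumed example value
upper = 12.0 * local_mean    # the pdf is negligible beyond this point
area = integrate(lambda s: rayleigh_pdf(s, local_mean), 0.0, upper)
mean = integrate(lambda s: s * rayleigh_pdf(s, local_mean), 0.0, upper)
print(area, mean)
```

The same `integrate` helper could be applied over the shadowing variable to evaluate the combined pdf of (20) for a chosen area mean and standard deviation.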

1) Outage Probability With Fading and Shadowing: The outage probability with fading and shadowing is derived in French [22] and the resulting integral is

$$P(s_w \le p_r s_i) = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} \frac{\exp(-u^2)}{1 + 10^{(z_d - 2\sigma u)/10}}\, du \tag{21}$$

where $z_d = m_{dw} - m_{di} - p_R$, $\sigma = \sigma_w = \sigma_i$, and $u$ is an internal variable.
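The integral in (21) has no closed form but is easy to evaluate numerically. The following is a minimal sketch using a trapezoidal rule over a truncated range of $u$; the truncation at $\pm 8$ and the step count are assumptions chosen because the factor $\exp(-u^2)$ is negligible beyond that range, and the function name is hypothetical.

```python
import math


def outage_fading_shadowing(z_d, sigma_db, half_width=8.0, steps=4000):
    """Trapezoidal evaluation of equation (21) for z_d and sigma given in decibels."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        u = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # end points count half
        exponent = (z_d - 2.0 * sigma_db * u) / 10.0
        total += weight * math.exp(-u * u) / (1.0 + 10.0 ** exponent)
    return total * h / math.sqrt(math.pi)


# Assumed example: sigma = 6 dB and a range of z_d values in decibels.
for z in (0.0, 10.0, 20.0, 30.0):
    print(z, outage_fading_shadowing(z, 6.0))
```

At $z_d = 0$ the integrand is symmetric in the sense that the contributions at $u$ and $-u$ sum to $\exp(-u^2)$, so the outage evaluates to one half, which is a convenient sanity check on the implementation.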

In many situations it is possible to greatly reduce the fading, e.g. , antenna diversity at the mobile, and a similar result to that above can be derived [22) for shadowing only. Note that the result in (21) is for the case of the omnidirectional basestation antenna . 2) Outage Probability with an Adaptive Antenna: The outage probability for an adaptive antenna can be simply ex pressed as

rts; :::; p-s] , m)

=

a = 6 dB n' = 0 .7

-,

10

-0

o o ....

et»; :::; P,Si)

-0

Q..

probability of an active co-channel) (

Q)

Q'l

...,o

in the aligned beam

= Pis; :::; p ,s,) ' ( ; )

:J

o (22)

i.e ., the probability that the ratio of the wanted signal to the interfering signal is less than some protect ion ratio (21) and the probability that the aligned beam actually contains the ac tive channel (13) . Again, the outage probability is redu ced by a factor m. This is illustrated graphically in Fig . 7. The loading factor is fixed at 70lJD (,., = 0.7) and the fading and shadowing case is considered for a = 6 dB . This represents a typical urban environment. The se results have been obtained by solving (21) and (22) numerically with In = l. 2. ..J.. 8. 16, and 32. Note that In = I gives the ornnicase . The outage probability varie s as expected with :::<1 and . for the omnicase is consistent with French . Note . however . that for a given Zd the outage is reduced by a factor of 111 when an adaptive antenna is considered . 3) Calculation of the Reuse Distance: When the fading and shadowing characteristics of the mobile radio channel are considered, it can be shown that no definite boundary ex ists between regions of interference , and regions of no interference. Co-channel interference can even occur dose to the wanted transmitter if. for example. the wanted signal s fades and the interfering signal peaks as illustrated in Fig . 8. Now. since co-channel inte rference is a statistical phenomena. it can be described by contours of outage probability. Using the definition for Zd and (9) yields

Fig. 7. Outage probability for one co-channel cell.

$$
\frac{d_i}{d_w} = \sqrt{10^{(Z_d + p_r)/20}}. \tag{23}
$$

So for a given $Z_d$, protection ratio $p_r$, and outage probability, contours can be drawn as shown in Fig. 9. If an adaptive antenna is used, the value of each contour is reduced by a factor of $m$; hence, for a given outage criterion, the service area is increased. This can be represented graphically by substituting (23) into (11) and expressing the co-channel reuse ratio as

$$
Q = 1 + \sqrt{10^{(Z_d + p_r)/20}}. \tag{24}
$$

From this formula the outage probability against the reuse ratio for a given protection ratio can be obtained in a manner similar to that of French.

Fig. 8. Contour defining regions of interference. (a) With no shadowing or fading. (b) With shadowing and fading.

4) Calculation of the Cluster Size: In a cellular network with a hexagonal layout, the cluster size is related to the reuse distance by

$$
C = Q^2/3. \tag{25}
$$

Note that only certain values of $C$ are possible in a hexagonal cellular network [25], i.e., $C = 3, 4, 7, 9, 12, 13, 16, 19, 21, \ldots$. Using (24) this can be expressed as

$$
C = \frac{1}{3}\left[1 + \sqrt{10^{(Z_d + p_r)/20}}\right]^2. \tag{26}
$$

Again the outage probability can now be evaluated for various cluster sizes for a given protection ratio.
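The cluster-size relation in (25) and (26) can be illustrated with a short sketch. It assumes that $Z_d$ and $p_r$ are both given in decibels, and it uses the standard fact that realizable hexagonal cluster sizes have the form $i^2 + ij + j^2$. The function names are hypothetical, and rounding the ideal value up to the next realizable size is an illustrative choice rather than a step stated in the text.

```python
import math


def valid_cluster_sizes(limit):
    """Sorted hexagonal cluster sizes C = i*i + i*j + j*j with 3 <= C <= limit."""
    sizes = set()
    bound = math.isqrt(limit) + 1
    for i in range(bound + 1):
        for j in range(i + 1):
            c = i * i + i * j + j * j
            if 3 <= c <= limit:
                sizes.add(c)
    return sorted(sizes)


def reuse_ratio(z_d_db, p_r_db):
    """Co-channel reuse ratio Q = 1 + sqrt(10^((Z_d + p_r)/20)), inputs in dB (assumed)."""
    return 1.0 + math.sqrt(10.0 ** ((z_d_db + p_r_db) / 20.0))


def ideal_cluster_size(z_d_db, p_r_db):
    """Ideal, generally non-integer, cluster size C = Q^2 / 3 as in (25) and (26)."""
    return reuse_ratio(z_d_db, p_r_db) ** 2 / 3.0


def realizable_cluster_size(z_d_db, p_r_db, limit=200):
    """Smallest valid hexagonal cluster size not below the ideal value (illustrative rounding)."""
    ideal = ideal_cluster_size(z_d_db, p_r_db)
    for c in valid_cluster_sizes(limit):
        if c >= ideal - 1e-12:
            return c
    raise ValueError("ideal cluster size exceeds limit; increase limit")
```

For example, `realizable_cluster_size(12, 8)` evaluates (26) at $Z_d + p_r = 20$ dB and rounds the ideal value of about 5.8 up to the next realizable size.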

Fig. 9. Outage probability contours and service areas for a 10% chance of interference, showing the omni service area, the adaptive service area ($m = 10$), and the interfering base-station.

5) Calculation of Spectral Efficiency: To gain a more meaningful interpretation when comparing different system architectures in a cellular network, the spectral efficiencies [1] of the various schemes are usually considered. This gives an unbiased measure of spectrum utilization, and is usually expressed as the number of channels/MHz of bandwidth/km², i.e.,

$$
\text{efficiency, } E = \frac{B_t / B_c}{B_t\,(C A)} = \frac{1}{B_c\, C A} \tag{27}
$$

where

$B_t$ total available bandwidth,
$B_c$ channel spacing in megahertz,
$C$ number of cells per cluster,
$A$ cell area (km²).

To enable a simple comparison to be made between omni- and adaptive antenna systems, it is necessary to assume that an identical modulation scheme will be employed in both cases. Thus $E \propto 1/C$, and the relative spectral efficiency can be expressed as

$$
\frac{E_{\text{adapt}}}{E_{\text{omni}}} = \frac{C_{\text{omni}}}{C_{\text{adapt}}}. \tag{28}
$$

Equation (22) was solved numerically and then the cluster size $C$, given by (26), was calculated for a fading and shadowing (6 dB variation) environment with a loading factor of 0.7. An outage criterion of 1% is used and, although this value is quite low, it serves to give some idea of the advantages that can be obtained by using this new class of base-station antenna. Two values of protection ratio, 8 and 20 dB, are considered in order to cover a variety of modulation schemes [1]. Then, using (28), the relationship between the relative spectral efficiencies was calculated for $m$ = 1, 2, 4, 8, 16, and 32, and is shown in Fig. 11.

B. Six Co-Channel Cells

In order to present a more realistic comparison between omnidirectional and adaptive base-station antennas, it is necessary to consider interference originating from multiple co-channel cells. Several different studies [21], [29]-[31] have pursued this goal, but of particular interest is the work by Muammar and Gupta [23]. This has been adopted here since the analysis follows directly from the previous discussion. However, a few alterations have been necessary in order that a more meaningful comparison could be presented. Fig. 6 shows the cellular layout of a mobile radio network for an arbitrary cell cluster size of $C$. It is recognized that there are many tiers of co-channel interferers present, but only the first, i.e., cells at a distance $D$ from the wanted base-station, is considered here. This assumption was shown to be valid in similar studies [29], [31]. The wanted mobile in the central cell receives a signal envelope $s_w$ from the wanted base-station. It also receives unwanted signals $s_i$, $i = 1, 2, \ldots, n$, from the co-channel cells, where $n$ is the number of active interfering co-channel cells (the maximum number being six in this case). The total co-channel interference is thus given by

$$
s_I = \sum_{i=1}^{n} s_i. \tag{29}
$$

When the wanted signal does not exceed this value by the protection ratio, co-channel interference will occur. In order to calculate the total probability of co-channel interference (or simply the outage probability), it is necessary to consider the probability of there being co-channel interference and $n$ interfering co-channel cells. Using conditional probability theory this can be expressed as

$$
P\left[(\text{co-channel interference}) \cap (n \text{ active co-channels})\right] = P(s_w \le p_r s_I \mid n) \cdot P(n). \tag{30}
$$

$P(n)$ is the PDF of $n$ and $P(s_w \le p_r s_I \mid n)$ is the conditional outage probability (the probability of co-channel interference given that there are $n$ active interfering cells). Hence, the total outage probability is given by

$$
P(s_w \le p_r s_I) = \sum_{n} P(s_w \le p_r s_I \mid n) \cdot P(n) \tag{31}
$$

since all possible values of $n$ must be taken into account. Here only the first tier is considered, so the maximum number for $n$ is six. The pdf of the signal envelope $s$ is as given by (20) and from here the conditional probability of co-channel interference for multiple interferers, when considering both fading and shadowing, can be derived. This result is simply quoted here without proof, as details can be found elsewhere [23].

Eq. (32): integral expression for the conditional outage probability $P(s_w \le p_r s_I \mid n)$, written in terms of an auxiliary function $K(X, u)$ that is defined through $20 \log_{10} K(X, u)$ and depends on $Z_d$ [expression not legible in this copy; see [23]]

and where $\sigma_{Nx}$ and $\sigma_{Ny}$ are defined by Muammar and Gupta [23]. The variable $Z_d$ is as defined previously, and $u$ and $X$ are internal variables. This integral can be solved using various numerical techniques and the results are presented later.

1) PDF of n, P(n): $P(n)$ is the probability that the number of active interfering co-channel cells is $n$ and so, if the channels are assumed independent and identically distributed, this has the form of a binomial pdf:

$$
P(n) = \binom{6}{n} p^n (1 - p)^{6-n} \tag{33}
$$

where $p$ is the probability of finding one interfering co-channel active. Using the loading factor $\eta$, as defined before, the probability $p$ that a single co-channel cell has an active co-channel, given that the wanted mobile has been assigned that channel already, is

$$
p = \frac{\text{number of active channels}}{\text{total number of channels}} = \frac{a(1 - B)}{N} = \eta. \tag{34}
$$

The origination probability [21], or the probability that $n$ co-channel interfering cells are using the same channel as the wanted mobile, can then be expressed as

$$
P(n) = \binom{6}{n} \eta^n (1 - \eta)^{6-n}. \tag{35}
$$

The overall outage probability can now be expressed as

$$
P(s_w \le p_r s_I) = \sum_{n} P(s_w \le p_r s_I \mid n) \binom{6}{n} \eta^n (1 - \eta)^{6-n}. \tag{36}
$$

This can now be calculated for a given outgoing channel usage efficiency over the range of $Z_d$. Alternatively, as before, the outage probability can be considered against the co-channel reuse ratio $Q$, or the cluster size $C$.

Fig. 10. Outage probability with six co-channel cells ($\sigma = 6$ dB, $\eta = 0.7$; horizontal axis $Z_d$ in dB).

2) Integration of Adaptive Antennas: With an omnidirectional antenna the probability of an active interfering co-channel cell was given by $\eta$, the outgoing channel usage efficiency or cell loading factor. Since it is assumed that at any one time all $m$ beams per cell are formed, there will always be six beams aligned onto the wanted mobile. Hence, for the adaptive antenna:

$$
\begin{aligned}
p &= \left(\text{probability that the interfering co-channel is in the beam pointing at the wanted mobile}\right) \\
&= \frac{\text{number of active channels in beam}}{\text{total number of channels}} = \frac{a(1 - B)/m}{N} = \frac{\eta}{m}.
\end{aligned}
\tag{37}
$$

Hence, the origination probability is given by

$$
P(n) = \binom{6}{n} \left(\frac{\eta}{m}\right)^n \left(1 - \frac{\eta}{m}\right)^{6-n} \tag{38}
$$

giving the total outage probability as

$$
P(s_w \le p_r s_I) = \sum_{n} P(s_w \le p_r s_I \mid n) \binom{6}{n} \left(\frac{\eta}{m}\right)^n \left(1 - \frac{\eta}{m}\right)^{6-n} \tag{39}
$$

where the omnidirectional case is given for $m = 1$. As before, when only one co-channel cell was considered, a comparison can now be made between the two base-station technologies. Fig. 10 shows the variation of the total outage probability against $Z_d$. The case with both fading and shadowing is considered here; hence, the results are obtained by numerically solving (32) and applying (39) for $m$ = 1, 2, 4, 8, 16, and 32. From (26) and (28) the cluster size and the relative spectral efficiency can now be calculated for a given outage criterion (1%), and are shown in Fig. 11. An outgoing channel usage efficiency of 0.7 and a log-normal shadowing standard deviation of 6 dB have been assumed. Protection ratios of 8 and 20 dB have also been considered.

3) Analysis of Results: It can be seen from Fig. 11 that for a given outage criterion and modulation scheme, the introduction of an adaptive array capable of forming eight tracking beams into an existing network would produce at least a threefold increase in efficiency. This can be equated to three times the number of channels per megahertz per square kilometer, or simply to three times as many users in each cell. This result has been obtained by considering the co-channel interference originating from single and multiple co-channel cells. The propagation model employed here considers both the fading and shadowing characteristics of the mobile channel. Although the application of power control of the individual tracking beams has not been considered to date, this may prove to be essential for future generation networks. It has been shown [21], [30] that base-station power control can substantially reduce co-channel interference for omnidirectional antenna systems. It can thus be envisaged that power control of the multiple beam adaptive base-station antenna would greatly reduce energy "overspill" into neighboring cells. The combination of the two adaptive techniques will be considered in a later study.

Fig. 11. Relative spectral efficiency as a function of the number of beams formed $m$, for protection ratios of 8 and 20 dB with one and six co-channel cells.

VI. THE "SMART" BASE-STATION ANTENNA

Fig. 12. Optimal beam forming.

Adaptive antennas operate by exploiting some property of the signal environment present at the array aperture [7], [8], and it is due to this ability that they are often aptly referred to as "smart" arrays [32]. In the previous theoretical analysis it was assumed that the base-station antenna could track any mobile, or group of mobiles, within its coverage area. Therefore, on reception, the array must be capable of resolving the angular distribution of the users as they appear at the base-station site. Armed with this knowledge, the base-station is then in a position to form an optimal set of beams, confining the energy directed at a given mobile within a finite volume. This concept can be further illustrated by considering the sequence of events illustrated in Fig. 12. The scenario depicted is realistic of many operational systems where there are lone mobiles, or groups of mobiles, dispersed throughout the cell. Using the spatial distribution of the users acquired by the array on reception, the antenna system can dynamically assign single narrow beams to illuminate the lone mobiles, and broad beams to the numerous groupings along major highways. It can be seen that by constraining the energy transmitted toward the mobiles, there are directions in which little or no signal is radiated. It is this phenomenon which gives rise to the reduction in the probability of co-channel interference occurring in neighboring cells, thereby increasing the spectral efficiency (or capacity) of the network as illustrated in the previous section.

The realization of such an adaptive base-station antenna requires an architecture capable of locating and tracking the mobiles, and a beam-forming network capable of producing the appropriate multiple independent beams. The former requirement can be broadly classified as that of a direction finding, or a spatial estimation, problem. These two tasks are illustrated in Fig. 13 as a source estimation or direction finding (DF) processor and a beamformer. In recent publications [33], [34] this concept was further extended to consider the implementation of such a base-station antenna. Also, the results from some initial computer simulations were presented, and these demonstrated the ability of the antenna array to resolve multiple mobile users in the signal fading conditions typical of the LMR scenario. Many of the popular superresolution DF algorithms used in radar were evaluated, and some beam-forming techniques which could generate the optimal beam set using the knowledge of the mobile distribution were discussed. Finally, a proposal was put forward for a fully adaptive base-station antenna test rig to demonstrate the principles of operation and show how such an antenna could be incorporated into the existing cellular network. The results of this work will form the basis for a future paper.

Intelligent antenna systems have also been considered for numerous other applications. Sandler and Kokar [35] have described the use of an adaptive antenna in conjunction with an artificial intelligence system as an antijamming antenna for radar. Here the antenna has a wealth of stored data which is used to "teach" the system about the various scenarios it will encounter. In this way it can adapt readily to every new situation as it is presented and "learn" from its mistakes. It is intended to exploit the synergy that exists between this application and that of the proposed cellular mobile radio base-station. Initially, the knowledge of the mobile locations within each cell could be utilized to provide an elegant hand-off mechanism as the mobiles cross cell boundaries. Also, combined with a knowledge of the local terrain and shadowing characteristics, it should be possible to extend this technique and provide a cellular network with dynamic cell boundaries. This would thus allow the optimal usage of the available system capacity.

Fig. 13. The "smart" antenna: Rx/Tx antenna array with N signal ports, source estimation processor, beamformer, and mobile location data.

VII. DISCUSSION

The full potential of adaptive antenna technology in the future generation of ubiquitous portable communication networks is yet to be realized. The goal is to be able to provide universal pocket-sized communications by the year 2000. This implies that the system must make very efficient use of the radio spectrum if it is to be made available to a large consumer base, thus making the portable equipment relatively cheap. Also, it is highly desirable that the portable communicator has a long duty cycle between battery recharging, implying power efficient modulation. The role of adaptive antennas has already been discussed in terms of the former requirement; however, it must be emphasized that spectrum efficient modulation is still a vital parameter in the design of these systems. The potential enhancement of power efficiency obtainable using spatial filtering has not been fully assessed to date, and the merits of this technique in a rural service area are of particular interest.

The study presented here has demonstrated the feasibility of an adaptive base-station antenna for cellular communications networks. A comparison made between the conventional and proposed schemes has shown that a marked improvement in spectral efficiency and capacity can be obtained, e.g., an idealized eight beam antenna could provide a threefold increase in spectral efficiency. In addition to the other advantages that can be gained, as outlined in the previous section, the infrastructure costs incurred by this base-station antenna must also be considered. The majority of these costs are associated with the acquisition of the base-station site and the construction of various buildings and antenna masts. When compared with existing schemes, such as cell splitting, the overall cost is less, since fewer base-station sites are required for an equivalent user capacity. Also, unlike the techniques of cell splitting and cell sectorization, the multiple beam adaptive array would not impair the trunking efficiency of the network.

ACKNOWLEDGMENT

We are extremely grateful to our colleagues at the Bristol University Centre for Communications Research for their stimulating intellectual support.

REFERENCES

[1] H. Hammuda, J. P. McGeehan, and A. Bateman, "Spectral efficiency of cellular land mobile radio systems," in Proc. 38th IEEE Veh. Technol. Conf., Philadelphia, PA, pp. 616-622, June 15-17, 1988.
[2] J. A. Tarallo and G. I. Zysman, "Modulation techniques for digital cellular systems," in Proc. IEEE Vehicular Technology Conf., pp. 245-248, June 15-17, 1988.
[3] J. Uddenfeldt, K. Raith, and B. Hedberg, "Digital technologies in cellular radio," in Proc. IEEE Vehicular Technology Conf., pp. 516-519, June 15-17, 1988.
[4] F. Lindell, J. Swerup, and J. Uddenfeldt, "Digital cellular radio for the future," The Ericsson Rev., no. 3, 1987.
[5] W. C. Y. Lee, "Elements of cellular mobile radio systems," IEEE Trans. Veh. Technol., vol. VT-35, pp. 48-56, May 1986.
[6] P. C. Carlier, "Antennas for cellular phones," Commun. Int., pp. 43-46, Dec. 1987.
[7] S. P. Applebaum, "Adaptive arrays," IEEE Trans. Antennas Propagat., vol. AP-24, pp. 585-598, Sept. 1976.
[8] B. Widrow, P. E. Mantey, L. J. Griffiths, and B. B. Goode, "Adaptive antenna systems," Proc. IEEE, vol. 55, pp. 2143-2159, Dec. 1967.
[9] R. T. Compton, "The power inversion adaptive array: Concept and performance," IEEE Trans. Aerospace Electron. Syst., vol. AES-15, pp. 803-814, 1979.
[10] L. Acar and R. T. Compton, "The performance of LMS adaptive array with frequency hopped signals," IEEE Trans. Aerospace Electron. Syst., vol. AES-21, pp. 360-370, May 1985.
[11] K. Bakhru and D. J. Torrieri, "The maximin algorithm for adaptive arrays and frequency-hopping communication," IEEE Trans. Antennas Propagat., vol. AP-32, pp. 919-928, Sept. 1984.
[12] R. T. Compton, R. J. Huff, W. G. Swarner, and A. A. Ksienski, "Adaptive arrays for communication systems: An overview of research at The Ohio State University," IEEE Trans. Antennas Propagat., vol. AP-24, pp. 599-607, 1976.
[13] R. T. Compton, "An adaptive antenna in a spread spectrum communication system," Proc. IEEE, vol. 66, pp. 289-298, Mar. 1978.
[14] M. A. Beach, A. J. Copping, D. J. Edwards, and K. W. Yates, "An adaptive antenna for multiple signal sources," in Proc. IEE Fifth Int. Conf. on Antennas and Propagation, University of York, 1987.
[15] J. E. Hudson, Adaptive Array Principles, IEE Electromagnetic Wave Series No. 11. Stevenage, U.K.: Peter Peregrinus, 1981.
[16] Telecom Australia, "Base-station antennas for future cellular radio systems," Rev. Activities 1985/1986, pp. 41-43, 1985-1986.
[17] W. S. Davies, R. J. Lang, and E. Vinnal, "The challenge of advanced base station antennas for future cellular mobile radio systems," presented at IEEE Int. Workshop on Digital Mobile Radio, Melbourne, Australia, Mar. 10, 1987.
[18] G. R. Cooper and R. W. Nettleton, "A spread spectrum technique for high capacity mobile communication," IEEE Trans. Veh. Technol., vol. VT-27, pp. 264-275, 1978.
[19] D. H. Archer, "Lens-fed multiple beam arrays," Microwave J., pp. 171-195, Sept. 1984.
[20] M. J. Marcus and S. Das, "The potential use of adaptive antennas to increase land mobile frequency reuse," presented at IEE 2nd Int. Conf. on Radio Spectrum Conservation Techniques, CP224, Birmingham, UK, Sept. 6-8, 1983.
[21] K. Daikoku and H. Ohdate, "Optimal channel reuse in cellular land mobile radio systems," IEEE Trans. Veh. Technol., vol. VT-32, pp. 217-224, Aug. 1983.
[22] R. C. French, "The effects of fading and shadowing on channel reuse in mobile radio," IEEE Trans. Veh. Technol., vol. VT-28, pp. 171-181, Aug. 1979.
[23] R. Muammar and S. Gupta, "Co-channel interference in high capacity mobile radio systems," IEEE Trans. Commun., vol. COM-30, pp. 1973-1978, Aug. 1982.
[24] J. H. Whitteker, "Measurements of path loss at 910 MHz for proposed microcell urban mobile systems," IEEE Trans. Veh. Technol., vol. VT-37, pp. 125-129, Aug. 1989.
[25] W. C. Jakes, Microwave Mobile Communications. New York: Wiley, 1974.
[26] W. Gosling, "Protection ratio and economy of spectrum use in land mobile radio," Proc. Inst. Elect. Eng., vol. 127, pt. F, pp. 174-178, June 1980.
[27] M. Hata, K. Kinoshita, and K. Hirade, "Radio link design of cellular land mobile communication systems," IEEE Trans. Veh. Technol., vol. VT-31, pp. 25-31, Feb. 1982.
[28] A. G. Williamson, "Coverage, co-channel interference and outage probability calculations for mobile radio systems," in Proc. IREE 20th Int. Electron. Convention and Exhibition, Melbourne, pp. 224-227, Sept./Oct. 1985.
[29] D. C. Cox, "Co-channel interference considerations in frequency reuse small-coverage-area radio systems," IEEE Trans. Commun., vol. COM-30, pp. 135-142, Jan. 1982.
[30] V. Palestini and V. Zingarelli, "Outage probability computation and cellular coverage for mobile radio," in Proc. 37th IEEE Veh. Technol. Conf., Tampa Bay, FL, pp. 468-476, 1987.
[31] Y. S. Yeh and S. C. Schwartz, "Outage probability in mobile telephony due to multiple log-normal interferers," IEEE Trans. Commun., vol. COM-32, pp. 380-388, Apr. 1984.
[32] W. F. Gabriel, "Adaptive arrays-An introduction," Proc. IEEE, vol. 64, pp. 239-272, Feb. 1976.
[33] S. C. Swales, M. A. Beach, and D. J. Edwards, "Multi-beam adaptive base-station antennas for cellular land mobile radio systems," in Proc. 39th IEEE Veh. Technol. Conf., San Francisco, CA, pp. 341-348, Apr. 29-May 3, 1989.
[34] S. C. Swales, M. A. Beach, D. J. Edwards, and J. P. McGeehan, "A multi-beam adaptive base-station antenna for cellular land mobile radio systems," in Proc. 1989 IEEE Workshop on Mobile and Cordless Telephone Communications, University of London, UK, pp. 55-61, Sept. 25-26, 1989.
[35] S. S. Sandler and M. Kokar, "Intelligent antennas," in Proc. URSI Int. Symp. on Electromagnetic Theory, Budapest, Hungary, pp. 159-161, 1986.

Combination of an Adaptive Array Antenna and a Canceller of Interference for Direct-Sequence Spread-Spectrum Multiple-Access System

RYUJI KOHNO, MEMBER, IEEE, HIDEKI IMAI, SENIOR MEMBER, IEEE, MITSUTOSHI HATORI, MEMBER, IEEE, AND SUBBARAYAN PASUPATHY, SENIOR MEMBER, IEEE

Abstract-In the realization of code division multiple access based on a spread-spectrum communication system, i.e., spread spectrum multiple access (SSMA), reduction of cochannel interference is an important problem. This paper proposes an adaptive array antenna system including a canceller of cochannel interference, which can improve performance by a combination of temporal and spatial filtering. While the adaptive array suppresses interference sources with arrival angles different from that of the desired user, the adaptive digital filter-canceller rejects those whose arrival angles are the same as that of the desired user. The proposed system can achieve stable acquisition and low error rate of demodulated data even in a heavy interference channel where a conventional array antenna system cannot achieve satisfactory acquisition.

I. INTRODUCTION

THE demand for spread-spectrum (SS) communication techniques in commercial applications has been increasing recently. An SS communication technique has advantages such as robustness against narrow-band interference as well as noise, and realizability in the form of code division multiple access, i.e., spread spectrum multiple access (SSMA) [1]. However, if enough inherent processing gain of an SS system cannot be obtained (e.g., due to restricted transmission bandwidth in a channel), cochannel interference (that is, an interfering SS signal from an undesired user due to cross-correlation among pseudonoise (PN) sequences assigned to different users in SSMA) cannot be suppressed completely. Then it will be difficult to achieve initial acquisition and phase tracking and to increase the number of simultaneously accessing users. This is known as a near-far problem. In order to reduce cochannel interference, we have proposed an adaptive canceller of cochannel interference as well as the design of a set of PN sequences with good correlation characteristics [2], [3]. However, since the adaptive canceller demodulates and respreads undesired SS signals from every user in order to generate a replica of them by using an adaptive digital filter and then subtracts the replica from received signals, the amount of hardware will be considerably large corresponding to an increase in simultaneously accessing users.

On the other hand, an adaptive array antenna is useful in suppressing interfering signals because it can adaptively control directivity of the antenna even if the desired SS signal's arrival angle is unknown [4], [5]. However, if there is a high-level interfering signal from an undesired user with the same arrival angle as that of a desired user, an array antenna cannot suppress it.

In this paper, in order to minimize these problems inherent in an adaptive canceller and an array antenna, an adaptive array including a canceller of cochannel interference for SSMA is proposed and the performance is investigated. The proposed system can suppress interfering SS signals, i.e., cochannel interference, with arrival angles different from that of a desired user by using a null steering array antenna, and eliminate by means of a canceller the residual interference and cochannel interference having an arrival angle the same as that of the desired SS signal. The canceller consists of an SS demodulator and modulator for interfering SS signals and an adaptive digital filter which can generate a replica of interfering SS signals even in a time-varying channel. The weights of array elements are adaptively updated by using more reliable reference signals which are obtained by the canceller. In particular, even if there is considerable high-level interference from an arrival angle the same as that of a desired SS signal, the proposed canceller can eliminate it more effectively. Therefore, the proposed system can achieve stable demodulation and improve the error rate of decoded data even in a heavy interference channel where a conventional array antenna system cannot achieve acquisition. This is an approach based on a combination of spatial and temporal filtering for rejection of interference in an SSMA system [6].

In Section II, a conventional adaptive array antenna for a direct-sequence (DS) SSMA system is briefly explained. In Section III, the structure and controlling algorithm of the proposed system are described in detail. Finally, computer simulation results that evaluate the proposed system in comparison to a conventional system are presented in Section IV.

Manuscript received February 17, 1989; revised November 15, 1989.
R. Kohno and H. Imai are with the Division of Electrical and Computer Engineering, Yokohama National University, 156 Tokiwadai, Hodogaya-ku, Yokohama 240, Japan.
M. Hatori is with the Department of Electrical Engineering, University of Tokyo, Hongo, Bunkyo-ku, Tokyo 113, Japan.
S. Pasupathy is with the Department of Electrical Engineering, University of Toronto, Toronto, Ont. M5S 1A4, Canada.

Reprinted from IEEE Journal on Selected Areas in Communications, Vol. 8, No. 4, pp. 675-682, May 1990.

II. AN ADAPTIVE ARRAY ANTENNA FOR A DS-SSMA

Fig. 1 shows the structure of a conventional adaptive array antenna with $N$ elements. Assume for simplicity that each element is omnidirectional and that all mutual impedances are zero. Though a quadrature hybrid is used for each element in order to split the received signal into quadrature components in a practical system, it is omitted in Fig. 1. Throughout the paper, analytic signal notation with complex weights is used [5]. $x_n(k)$ is the sample of the received signal in the $n$th element at instant $t = kT$, where $T$ is a sampling interval as well as the duration of a chip of the PN sequence; $k$ is an integer. $x_n(k)$ consists of the desired SS signal, interfering SS signals from other users, and a noise component. The complex signal $x_n(k)$ is expressed by

$$
x_n(k) = \sum_{i=0}^{M} \mathbf{h}_i^T \mathbf{r}_i(k) \exp\{-j(2\pi L_n/\lambda)\sin\theta_i\} + N_n(k) \tag{1}
$$

Fig. 1. Adaptive array antenna with N elements.

where $\mathbf{h}_i$ and $\mathbf{r}_i(k)$, which are vectors of a channel impulse response for the $i$th user and an SS signal from the $i$th user $r_i(k)$, respectively, are given by

$$
\mathbf{h}_i = [h_{i0}, h_{i1}, \ldots, h_{im}]^T, \qquad
\mathbf{r}_i(k) = [r_i(k), r_i(k-1), \ldots, r_i(k-m)]^T,
$$

$[\ ]^T$ denotes transpose and

$$
r_i(k) = d_i(k)\, c_i(k) \exp\{j(\omega_c kT + \phi_i)\}.
$$

$M + 1$ is the number of simultaneously accessing users, and $d_i(k)$ and $c_i(k)$ are the binary (+1 or -1) datum and PN sequence signals of the $i$th user at instant $kT$, respectively. $L_n$ is the distance between the first and the $n$th elements, where $L_1 = 0$. $\theta_i$ and $\phi_i$ are the arrival angle and the phase shift of the SS signal from the $i$th user, respectively. $\omega_c$ is the carrier frequency, $\lambda$ is the free-space wavelength, and $N_n(k)$ is the noise component in the $n$th element. Each $x_n(k)$ is multiplied by a complex weight $W_n(k)$ in the $n$th element and then summed to produce the array output $y(k)$, as

$$
y(k) = \sum_{n=1}^{N} \mathrm{Re}\{x_n(k)^* W_n(k)\}
     = \sum_{n=1}^{N} \left[\mathrm{Re}\{x_n(k)\}\,\mathrm{Re}\{W_n(k)\} + \mathrm{Im}\{x_n(k)\}\,\mathrm{Im}\{W_n(k)\}\right]. \tag{2}
$$

$\mathrm{Re}\{z\}$ and $\mathrm{Im}\{z\}$ are the real and imaginary parts of a complex value $z$, respectively. $*$ denotes the complex conjugate. Let the 0th user's SS signal $r_0(k)$ be the desired one, i.e., the reference signal. The error signal $e(k)$ is defined as the difference between $y(k)$ and the in-phase or real part of $r_0(k)$, i.e., $e(k) = y(k) - \mathrm{Re}\{r_0(k)\}$, which comprises the interference and noise components. If the mean square error $E[e(k)^2]$ is employed as a criterion for optimizing the complex weights $W_n(k)$ ($n = 1, 2, \ldots, N$), the optimum weights $W_n$ which minimize $E[e(k)^2]$ can be obtained by solving the well-known Wiener-Hopf equation, given by

$$
\mathbf{W} = \mathbf{R}_{xx}(0)^{-1}\, \mathbf{R}_{xr}(0) \tag{3}
$$

where, letting the real and imaginary parts of $W_n$ be $W_{Rn} = \mathrm{Re}\{W_n\}$ and $W_{In} = \mathrm{Im}\{W_n\}$, respectively,

$$
\mathbf{W} = [W_{R1}, W_{I1}, W_{R2}, W_{I2}, \ldots, W_{RN}, W_{IN}]^T.
$$

$\mathbf{R}_{xx}(0)$ and $\mathbf{R}_{xr}(0)$ are the $2N \times 2N$ covariance matrix and the $2N \times 1$ cross-correlation matrix:

$$
\mathbf{R}_{xx}(i - j) = E\begin{bmatrix}
x_{R1}(k-i)\,x_{R1}(k-j) & x_{R1}(k-i)\,x_{I1}(k-j) & \cdots & x_{R1}(k-i)\,x_{IN}(k-j) \\
x_{I1}(k-i)\,x_{R1}(k-j) & x_{I1}(k-i)\,x_{I1}(k-j) & \cdots & x_{I1}(k-i)\,x_{IN}(k-j) \\
\vdots & \vdots & \ddots & \vdots \\
x_{IN}(k-i)\,x_{R1}(k-j) & x_{IN}(k-i)\,x_{I1}(k-j) & \cdots & x_{IN}(k-i)\,x_{IN}(k-j)
\end{bmatrix} \tag{4a}
$$

$$
\mathbf{R}_{xr}(i) = E\left[\,x_{R1}(k-i)\,\mathrm{Re}\{r_0(k)\},\ x_{I1}(k-i)\,\mathrm{Re}\{r_0(k)\},\ \ldots,\ x_{RN}(k-i)\,\mathrm{Re}\{r_0(k)\},\ x_{IN}(k-i)\,\mathrm{Re}\{r_0(k)\}\,\right]^T \tag{4b}
$$

for integer $i$, $j$, with $x_{Rn}(k) = \mathrm{Re}\{x_n(k)\}$ and $x_{In}(k) = \mathrm{Im}\{x_n(k)\}$. Since the matrices $\mathbf{R}_{xx}(0)$ and $\mathbf{R}_{xr}(0)$ cannot be obtained from the observed signal, the LMS algorithm can be used in order to update the complex weights chip by chip, such as

$$
W_n(k + 1) = W_n(k) - \mu\, e(k)\, x_n(k)^* \tag{5}
$$

where $\mu$ is the step size. If every complex weight is optimized, the adaptive array will yield automatic beam tracking of the desired SS signal and adequate suppression of interfering SS signals because of directing its nulls at them. As a result, the interfering signals will be attenuated by the nulls while the desired SS signal will not be nulled. However, an array antenna system can reject completely only narrow-band signals. In the adaptive array antenna using DS-SSMA, the null depth may not be sufficient to achieve the desired interference rejection unless the system is modified to operate over wide bandwidths. Hence, there will be some residual interference [4], [5]. Moreover, the adaptive array antenna cannot suppress the interfering SS signal whose arrival angle is the same as that of the desired SS signal.

III. AN ADAPTIVE ARRAY INCLUDING A CANCELLER OF INTERFERENCE FOR A DS-SSMA SYSTEM

Fig. 2. A DS-SSMA system with an adaptive array antenna including a canceller of interference (parallel canceller structure).

This section proposes a DS-SSMA receiving system using an adaptive array which can demodulate a desired SS signal robustly even if heavy interfering SS signals have the same arrival angle as that of the desired SS signal in a time-varying channel.
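As a point of reference for the array stage of this system, the chip-by-chip LMS weight update (5) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' simulator: the `steering` helper, the noise-free two-element half-wavelength geometry, the source angles, the interferer amplitude, and the use of the true desired chip as reference are all choices made here.

```python
import numpy as np

def steering(theta_deg, n_elem=2, spacing=0.5):
    # Assumed model: phase progression across a uniform line array,
    # element spacing given in wavelengths, angle measured from broadside.
    n = np.arange(n_elem)
    return np.exp(-2j * np.pi * spacing * n * np.sin(np.deg2rad(theta_deg)))

def lms_step(w, x, ref, mu):
    # One update of (5): W(k+1) = W(k) - mu * e(k) * conj(x(k)).
    y = np.dot(w, x)           # array output, sum_n W_n x_n (no conjugation)
    e = y - ref                # error against the reference chip
    return w - mu * e * np.conj(x), e

rng = np.random.default_rng(0)
a_des, a_int = steering(45.0), steering(-45.0)   # desired / interfering directions
w = np.array([1.0 + 0.0j, 0.0 + 0.0j])           # start with one active element
for _ in range(4000):
    d = rng.choice([-1.0, 1.0])                  # desired chip, also the reference
    i = rng.choice([-1.0, 1.0])                  # interfering chip
    x = d * a_des + 2.0 * i * a_int              # noise-free element signals
    w, e = lms_step(w, x, d, mu=0.01)

gain_des = abs(np.dot(w, a_des))   # response toward the desired user
gain_int = abs(np.dot(w, a_int))   # response toward the interferer
```

With two elements the array has a single null to spend, and in this sketch it ends up on the interferer at a different angle. An interferer placed at the same angle as the desired user would share its steering vector and could not be separated this way, which is the case the canceller is meant to handle.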

A. The Structure of the System

Fig. 2 shows the structure of the DS-SSMA system with an adaptive array including a canceller of interference; it is possible to extend it to other SS modulations. A primitive SS receiving system having neither an array antenna nor a canceller cannot demodulate the desired SS signal, even by using the inherent processing gain of the SS system, when the desired SS signal power is much lower than the interfering SS signal power. The adaptive array antenna system (mentioned in the previous section) is effective in suppressing cochannel interference with arrival angles different from that of a desired user, but the residual interference and cochannel interference having the same arrival angle as the desired SS signal are a major problem in SSMA.

In order to solve this problem, we propose an adaptive array system including a canceller of interference, which can eliminate the interfering SS signals having the same arrival angle as that of the desired SS signal by using adaptive digital filters (ADF's). The proposed system can also cancel the residual cochannel interference having arrival angles different from that of a desired user, which an adaptive array antenna cannot completely suppress.

When the interfering signals from M undesired users remain at the output of the array antenna, the adaptive canceller has two types of structure: the parallel structure shown in Fig. 2 and the serial one in Fig. 3. H for each antenna element in Fig. 2 is a quadrature hybrid splitting the received signal into quadrature components. In the canceller, the interfering SS signal from the ith user (i = 1, 2, ..., M) is demodulated and respread by SSDEMi and SSMODi, respectively. ADFi (i = 1, 2, ..., M) is used to identify the entire channel characteristics of both the channel for the ith user and the array, and to generate a replica of the distorted interference component in the array output. Then every replica of interference is subtracted from the delayed output signal of the array. The output signal from the canceller is then fed to ADF0, which compensates the distortion of the desired SS signal from the 0th user, and demodulated by SSDEM0. The final output data are respread by SSMOD0 in order to produce the reference signal for the array and the ADF's.

Fig. 3. Serial canceller structure.

In the serial structure of the canceller, interfering SS signals are cancelled in the order of decreasing received power, because it is easy to achieve acquisition, demodulation, and cancellation of an interfering SS signal having greater power, and its cancellation makes it possible or easy to cancel other interfering SS signals. Therefore, the serial structure may perform more robust cancellation than the parallel one. However, the latter, shown in Fig. 2, can achieve more stable adaptability of the array than the former, because time-delays within the feedback loop updating the array weights will result in instability unless the loop gain or the array speed of response is reduced.

When there are a few strong interfering SS signals, cancelling only those strong signals is sufficient to achieve stable acquisition and reliable demodulation of the desired SS signal. Then the amount of hardware will not be very large. Note that this system can achieve more stable acquisition, demodulation, and cancellation of the interfering SS signals than a conventional array system without a canceller even when the D/I ratio (of the desired to undesired SS signals' power) at the array output is small. This is because interfering SS signals can be demodulated more correctly in the case of such a small D/I ratio. On the other hand, when the D/I ratio is so large that the system can achieve stable acquisition and reliable demodulation without such a canceller, the cancellation function of ADFi (i = 1, 2, ..., M) in Figs. 2 and 3 may be stopped. Moreover, it is possible to control ADF0 so as to reject other intentional jamming and narrow-band interference, similar to an interference rejection filter [7].
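The demodulate, respread, and subtract idea behind one canceller branch can be illustrated with a deliberately small example. It is a sketch under assumptions made here, not the paper's implementation: the two length-7 codes are hypothetical stand-ins for the period-31 M-sequences, the channel is a single noise-free tap, and the adaptive filter ADFi is replaced by a one-shot correlation estimate of the interferer's data times gain.

```python
import numpy as np

# Hypothetical length-7 spreading codes (chosen for illustration only).
c0 = np.array([1, 1, 1, -1, -1, 1, -1], dtype=float)   # desired user
c1 = np.array([1, -1, 1, 1, -1, -1, -1], dtype=float)  # interfering user

def despread(signal, code):
    # Correlate one symbol interval with a code, normalized by the code length.
    return float(np.dot(signal, code)) / len(code)

def cancel_one_user(y, code):
    # Estimate (data x gain) of the interferer by correlation, respread it
    # with the same code to form a replica, and subtract the replica.
    replica = despread(y, code) * code
    return y - replica

d0, d1, gain1 = 1.0, -1.0, 10.0          # desired bit, interfering bit, interferer amplitude
y = d0 * c0 + gain1 * d1 * c1            # array output for one symbol (no noise)

conventional = np.sign(despread(y, c0))                           # no canceller
with_canceller = np.sign(despread(cancel_one_user(y, c1), c0))    # after cancellation
```

Here the interferer is 20 dB stronger than the desired signal and the codes have a cross-correlation of 1/7, so plain despreading decides the wrong bit, while despreading after the replica subtraction recovers the transmitted bit.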

B. Controlling Algorithm

In the proposed combination of an array antenna and a canceller, it is important how one updates the array weights and the tap coefficients of the ADF's. If the array weights are updated independently of the canceller by using the array output $y(k)$ and its demodulated and remodulated SS signal, as shown in Fig. 1, time-delays within the feedback loop will be small. However, unless PN acquisition is achieved at the array output and a proper reference signal is obtained, the array and the following canceller cannot be correctly controlled. In order to improve the performance, the canceller output can be used in updating the array and the ADF's, as shown in Figs. 2 and 3. Even if there are strong interfering SS signals at the array output, the canceller system can accomplish acquisition of those signals and correctly demodulate and cancel them. Thus, it will be easy to achieve acquisition of the desired SS signal and obtain a correct reference signal at the canceller output. Since the time-delays within the feedback loop increase, it may be necessary to reduce the loop gain and make the array slow in order to avoid instability if channel characteristics vary quickly with the movement of an undesired user.

Assume that interfering SS signals from M undesired users still remain at the array output. They consist of the interfering SS signals having the same arrival angle as that of the desired SS signal and the residual interfering SS signals with different arrival angles which the array cannot completely eliminate by its directivity. Then the output signal of the array $y(k)$ becomes

$$y(k) = \sum_{i=0}^{M} \mathrm{Re}\{f_i^T r_i(k)\} + n(k) \qquad (6)$$

where the 0th user is the desired user. $f_i$ is an impulse response vector of the entire channel, which consists of the channel for the $i$th user and the array antenna, and can be written as

$$f_i = \sum_{n=1}^{N} \mathrm{Re}\{h_i W_n \exp\{-j(2\pi L_n/\lambda)\sin\theta_i\}\}$$

and

$$f_i = [f_{i0}, f_{i1}, \ldots, f_{im}]^T.$$

Let $B_i$ and $\hat{r}_i(k)$ be the tap coefficient vector of ADFi and an estimate of $r_i(k)$, i.e., the reference vector, for $i = 1, 2, \ldots, M$. Then the output signal of the canceller $u(k)$ is

$$u(k) = y(k) - \sum_{i=1}^{M} \mathrm{Re}\{B_i^T \hat{r}_i(k)\} \qquad (7)$$

where

$$B_i = [B_{i0}, B_{i1}, \ldots, B_{im}]^T, \qquad \hat{r}_i(k) = [\hat{r}_i(k), \hat{r}_i(k-1), \ldots, \hat{r}_i(k-m)]^T.$$

In order to update both the complex weights of the array antenna and the tap coefficients of the ADF's adaptively, we can use the error $e(k)$, which is defined as the difference between $u(k)$ equalized by ADF0 and the reliable reference $\mathrm{Re}\{\hat{r}_0(k)\}$; that is,

$$e(k) = C^T \mathbf{u}(k) - \mathrm{Re}\{\hat{r}_0(k)\} \qquad (8)$$

where $C$ is the tap coefficient vector of ADF0,

$$C = [C_0, C_1, \ldots, C_L]^T$$

and

$$\mathbf{u}(k) = [u(k), u(k-1), \ldots, u(k-L)]^T.$$

When the mean square error $J = E[e(k)^2]$ is used as a criterion, the optimum tap coefficients of $C$ and $B_i$ ($i = 1, 2, \ldots, M$) can be obtained by solving the following equations:

$$\frac{\partial J}{\partial B_i} = 0 \quad \text{for } i = 1, 2, \ldots, M \qquad (9a)$$

$$\frac{\partial J}{\partial C} = 0. \qquad (9b)$$

Therefore, the optimum $B_i$ is equal to the channel impulse response $f_i$ ($i = 1, 2, \ldots, M$), i.e.,

$$B_i = f_i \quad \text{for } i = 1, 2, \ldots, M \qquad (10)$$

and the optimum $C$ is expressed by

$$C = R_{uu}^{-1} R_{ur} \qquad (11)$$

where $R_{uu}$ and $R_{ur}$ are the $(L+1) \times (L+1)$ covariance matrix and the $(L+1) \times 1$ cross-correlation matrix,

$$R_{uu} = \begin{bmatrix} W^T R_{xx}(0) W & W^T R_{xx}(1) W & \cdots & W^T R_{xx}(L) W \\ W^T R_{xx}(-1) W & W^T R_{xx}(0) W & \cdots & W^T R_{xx}(L-1) W \\ \vdots & \vdots & \ddots & \vdots \\ W^T R_{xx}(-L) W & W^T R_{xx}(-L+1) W & \cdots & W^T R_{xx}(0) W \end{bmatrix}$$

$$R_{ur} = [W^T R_{xr}(0),\ W^T R_{xr}(1),\ \ldots,\ W^T R_{xr}(L)]^T.$$

$T$ denotes transpose, and $R_{xx}(i-j)$ (for $i, j = 1, 2, \ldots, L+1$) and $R_{xr}(i)$ are defined by (4a) and (4b), respectively. Assuming that there is no noise, the optimum $C$ will be equal to the inverse characteristic of $f_0$. Thus, the optimum transfer function of ADF0, $C(z)$, is given by

$$C(z) = \frac{1}{\sum_{l=0}^{m} f_{0l}\, z^{-l}}.$$

In order to obtain these optimum values recursively, the LMS algorithm can be used, such as

$$B_i(k+1) = B_i(k) + \mu_b\, e(k)\, \hat{R}_i(k)\, C(k) \quad \text{for } i = 1, 2, \ldots, M \qquad (12a)$$

$$C(k+1) = C(k) - \mu_c\, e(k)\, \mathbf{u}(k) \qquad (12b)$$

where $C(k)$ and $B_i(k)$ ($i = 1, 2, \ldots, M$) are the tap coefficient vectors $C$ and $B_i$ at instant $kT$, and $\hat{R}_i(k)$ is an $L \times m$ matrix given by

$$\hat{R}_i(k) = \begin{bmatrix} \mathrm{Re}\{\hat{r}_i(k)\} & \mathrm{Re}\{\hat{r}_i(k-1)\} & \cdots & \mathrm{Re}\{\hat{r}_i(k-L)\} \\ \mathrm{Re}\{\hat{r}_i(k-1)\} & \mathrm{Re}\{\hat{r}_i(k-2)\} & \cdots & \mathrm{Re}\{\hat{r}_i(k-1-L)\} \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{Re}\{\hat{r}_i(k-m)\} & \mathrm{Re}\{\hat{r}_i(k-m-1)\} & \cdots & \mathrm{Re}\{\hat{r}_i(k-m-L)\} \end{bmatrix}.$$

$\mu_b$ and $\mu_c$ are the step sizes, and $e(k)$ is defined by (8).

In a controlling process of the proposed array system including a canceller, the array weights should be fixed (e.g., only one weight is one and the others are zero) until the canceller can achieve PN acquisition for a desired SS signal, because the array does not have a proper reference signal during the timing search, when the local timing is in error. In order to achieve the acquisition for the desired SS signal, at first, acquisition for the undesired SS signals at the array output is performed, and then those undesired SS signals are eliminated by updating the tap coefficients of the ADF's. After the acquisition for the desired SS signal is accomplished, the array weights will be updated by using a reliable reference signal which can be obtained by the interference cancellation. Since the array can suppress interfering SS signals with arrival angles different from that of the desired user by a nulling operation, its suppression will facilitate tracking and demodulation of the desired SS signal.

IV. SIMULATION RESULTS

This section shows computer simulation results in order to evaluate the performance of the proposed system in comparison with a conventional system, i.e., an adaptive array antenna system without a canceller. In the simulations, every user in DS-SSMA uses a different M-sequence, i.e., a maximal linear feedback shift register sequence, with a period of 31. A model of the channel impulse response shown in Fig. 4 is used. Complex weights of the antenna and tap coefficients of the ADF's are updated by using known training signals during the beginning 500 data bits and after that by using demodulated data, and each result is calculated for 10 000 data bits. A sliding correlation scheme is used for the acquisition of PN sequences. If the correlator output is greater than the

desired threshold level (i.e., a lock level) more than three consecutive times during an interval of a PN sequence period, the phase of the local PN sequence will be locked; the lock level is normalized by the maximum value of the autocorrelation of the PN sequence.

In the simulation, the number of multiple accessing users is three: the SS signal of one undesired user arrives from the same angle $\theta = 45°$ with respect to broadside as the desired user, and that of the other undesired user arrives from a different angle $\theta = -45°$ in Figs. 5-8. The number of antenna elements is two, i.e., N = 2. The element spacing is one-half wavelength. The number of taps in ADF0, L, is 21 and the step size $\mu$ is 0.01. The received signal-to-noise power ratio (S/N) in each element is 10 dB. In Figs. 5-9, the D/I ratio for the same arrival angle (that is, the power ratio of the desired signal to the undesired signal with the same arrival angle) is changed, while the undesired signal with a different arrival angle has the same average power as that of the desired signal. In Fig. 10, the D/I ratio for the different arrival angle (that is, the power ratio of the desired signal to the undesired signal with an arrival angle different from that of the desired one) is changed, while the D/I ratio for the same arrival angle is one.

Fig. 4. A model of the channel impulse response: T = sampling interval.

Fig. 5. Convergence property of mean square error as a function of the D/I ratio.

Fig. 6. Demodulated data error rate versus D/I ratio.

Fig. 7. Demodulated chip error rate versus D/I ratio.

Fig. 8. Demodulated chip error rate versus lock level for PN acquisition.

Fig. 9. Demodulated chip error rate as a function of the arrival angle of an undesired SS signal.

Fig. 10. Demodulated chip error rate as a function of the angular speed of a moving undesired user.

Since an N-element array has N - 1 degrees of freedom in its pattern, the 2-element array can form only one null. In the simulation model, the array may only form a null so as to suppress the interfering SS signal with an arrival angle different from that of the desired SS signal. The proposed array system can carry out such a nulling operation, avoiding the effect of the interfering SS signal with the same arrival angle as that of the desired user, because the system uses the canceller output signal, which includes no such interfering SS signal, in updating the array weights as shown in Figs. 2 and 3.

Fig. 5 shows the convergence property of the mean square error $E[e(k)^2]$ of the proposed array system with the canceller and of a conventional array system with ADF0 and without a canceller, as a function of the D/I ratio for the same arrival angle. Note that the proposed system can achieve more stable convergence than the conventional system.

Figs. 6 and 7 show error rates of decoded data and chips as a function of the D/I ratio for the same arrival angle, respectively. In the conventional system, the data error rate is more than $10^{-1}$ for D/I ≤ 0.8 in spite of the inherent processing gain of the system, because the chip error rate is considerably large. On the other hand, there is no error in data and chips of the proposed system for D/I ≥ 0.4. The error rate of the proposed system is large in the range D/I < 0.4, since mistakes during lock of the PN acquisition may occur in SSDEM0 before the canceller achieves complete rejection of interference.

Fig. 8 shows the chip error rate as a function of the lock level, and thus illustrates the robustness of the performance of PN acquisition. Here, the D/I ratios for the same arrival angle and the different one are one. The proposed system can accomplish PN acquisition over a wide range of 0.2 ≤ lock level ≤ 0.95, while the conventional system can do that only around a lock level = 0.4.

Fig. 9 shows the chip error rate as a function of the arrival angle of an undesired SS signal in the range 0 ≤ $\theta$ ≤ 90°, where the D/I ratio for the same arrival angle is one or two. Even in the case of the D/I ratio = 2.0, the error rate of the conventional system is larger than 10^[exponent illegible in the source], because an interfering SS signal having an arrival angle the same as that of the desired SS signal prevents the array from forming a proper null. Since the proposed system can eliminate such an interfering SS signal, the nulling operation can be improved.

Fig. 10 shows the chip error rate as a function of the angular speed of a moving undesired user in order to investigate adaptability to space-variation of a channel. The undesired user having arrival angles different from that of the desired user is moved in the range $\theta = -45 \pm 20°$, where the D/I ratio for the same arrival angle is one and the D/I ratio for the different arrival angle is one or 0.3. In the case of the latter D/I ratio = 0.3, the conventional system cannot achieve acquisition, but the proposed system can do that and track the space-variation for an angular speed of the moving undesired user < $10^{-2}$ (degree/chip). Hence, it is noted that the proposed system can achieve more stable adaptability than the conventional system in spite of larger time-delays within the feedback loop updating the array and the ADF's.
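The acquisition rule used in the simulations (a period-31 M-sequence, a normalized lock level, and a lock declared after more than three consecutive threshold crossings) can be sketched as below. The specific recurrence, the helper names, and the noise-free received signal are assumptions made for illustration; the paper does not state which generator polynomials were used.

```python
import numpy as np

def m_sequence_31():
    # Assumed generator: a[n] = a[n-2] XOR a[n-5], i.e. the primitive
    # polynomial x^5 + x^3 + 1, which yields a period of 2^5 - 1 = 31.
    a = [1, 1, 1, 1, 1]
    while len(a) < 31:
        a.append(a[-2] ^ a[-5])
    return 1 - 2 * np.array(a)          # map bits {0, 1} to chips {+1, -1}

def acquire(rx, code, lock_level, needed=4):
    # Sliding correlation: try each local phase, correlate period by period,
    # and lock once the normalized output exceeds lock_level on `needed`
    # consecutive periods ("more than three consecutive times").
    n = len(code)
    periods = len(rx) // n
    for phase in range(n):
        local = np.roll(code, phase)
        hits = 0
        for p in range(periods):
            corr = np.dot(rx[p * n:(p + 1) * n], local) / n
            hits = hits + 1 if corr > lock_level else 0
            if hits >= needed:
                return phase
    return None

code = m_sequence_31()
autocorr = [int(np.dot(code, np.roll(code, s))) for s in range(31)]
rx = np.tile(np.roll(code, 7), 6)       # six periods, unknown delay of 7 chips
locked_phase = acquire(rx, code, lock_level=0.4)
```

The two-valued autocorrelation (31 at zero lag, -1 elsewhere) is what leaves room for a wide range of lock levels when interference is absent; residual interference raises the off-phase correlator output and narrows that range.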

V. CONCLUSION

The proposed adaptive array system can improve the demodulated data error rate and achieve stable PN acquisition with greater tolerance in selecting the lock level for a correlator output; this is due to its ability to reject cochannel interference in DS-SSMA, which cannot be suppressed by a conventional adaptive array. However, for proper operation, it is necessary to know which interfering SS signals of undesired users arrive from the same angle as that of the desired user in the canceller. This can be found by observing the output level of the correlator which uses every PN sequence assigned to the SSMA users selectively.

REFERENCES

[1] M. K. Simon, J. K. Omura, R. A. Scholtz, and B. K. Levitt, Spread Spectrum Communications. Rockville, MD: Computer Science, 1985.
[2] R. Kohno, H. Imai, M. Hatori, and S. Pasupathy, "Adaptive cancellation of interference in direct-sequence spread-spectrum multiple access systems," in Proc. IEEE Global Telecommun. Conf. 1987, vol. 1, pp. 630-634, Nov. 1987.
[3] R. Kohno and H. Imai, "On pseudo-noise sequences for direct-sequence QPSK spread spectrum communication systems," Trans. IECE Japan, vol. J65-A, no. 1, pp. 69-76, Jan. 1982.
[4] R. T. Compton, Jr., "An adaptive array in a spread-spectrum communication system," Proc. IEEE, vol. 66, pp. 289-298, Mar. 1978.
[5] R. T. Compton, Jr., Adaptive Antennas: Concepts and Performance. Englewood Cliffs, NJ: Prentice-Hall, 1988.
[6] L. B. Milstein and R. A. Iltis, "Signal processing for interference rejection in spread spectrum communications," IEEE ASSP Mag., vol. 3, pp. 18-31, Apr. 1986.
[7] L. Li and L. B. Milstein, "Rejection of narrow-band interference in PN spread-spectrum systems using transversal filters," IEEE Trans. Commun., vol. COM-30, pp. 925-928, May 1982.

Direction Finding in the Presence of Mutual Coupling

Benjamin Friedlander, Fellow, IEEE, and Anthony J. Weiss, Senior Member, IEEE

Reprinted from IEEE Transactions on Antennas and Propagation, vol. 39, no. 3, pp. 273-284, March 1991.

Abstract-An eigenstructure-based method for direction finding in the presence of sensor mutual coupling, gain, and phase uncertainties is presented. The method provides estimates of the directions-of-arrival (DOA) of all the radiating sources as well as calibration of the gain and phase of each sensor and the mutual coupling in the receiving array. Calibration sources at known locations are not required. Conditions are provided for the existence of a solution. The proposed algorithm is described in detail, and its behavior is illustrated by numerical examples.

Manuscript received September 29, 1988; revised September 19, 1989. This work was supported by the Army Research Office under Contracts DAAL03-86-C-0018 and DAAL03-89-C-0007, sponsored by the U.S. Army Communications Electronics Command, Center for Signals Warfare. B. Friedlander is with Signal Processing Technology, Ltd., 703 Coastland Drive, Palo Alto, CA 94303, and the University of California, Davis, CA. A. J. Weiss is with the Department of Electrical Engineering, Tel Aviv University, Tel Aviv, Israel. IEEE Log Number 9039141.

I. INTRODUCTION

Direction-finding techniques based on eigenstructure methods have been discussed extensively in the literature since the beginning of the last decade. Computer simulations and a relatively limited number of experimental systems have demonstrated that in certain cases these techniques have superior performance compared to conventional direction-finding techniques.

In spite of the potential advantages of eigenstructure methods, their application to real systems has been very limited. One of the main reasons for this situation is the practical difficulties associated with calibrating the data collection system. Eigenstructure-based direction-finding techniques such as MUSIC [8] require precise knowledge of the signals received by the sensor array from a standard source located at any direction. The collection of the received signal vectors for all possible directions is often called the array manifold. The performance of the eigenstructure-based system depends strongly on the accuracy of this array manifold.

The process of measuring the array manifold can be time consuming and expensive. Calibrating an antenna array designed for two-dimensional (azimuth and elevation) direction finding with the accuracy required by these superresolution techniques poses numerous practical problems. The amount of memory required for storing the array manifold once it has been measured may also increase the size and cost of the system, even if interpolation techniques are used to reduce the number of points that need to be stored [10].

In addition to the problem of initial array calibration, there is the problem of maintaining array calibration. Many factors contribute to changing the response of the sensor array over time: gradual changes in the behavior of the sensor itself and of the electronic circuitry between the sensor and the output of the digitizer (due to thermal effects, aging of components, etc.), changes due to the environment around the sensor array (e.g., the effect of metal objects near an antenna array on its beam pattern), and changes in the location of the sensors (e.g., an antenna array located on the vibrating wing of an aircraft or a hydrophone array towed behind a ship). In many practical situations it is impossible to maintain array calibration to the accuracy required for the proper operation of these eigenstructure-based techniques. This results in significant degradation in system performance, sometimes to the point that these superresolution techniques perform no better (or worse) than conventional direction-finding methods.

A practical approach to alleviating the problems introduced by imprecise array calibration is to use the received signals to adjust or fine-tune the array calibration. Self-calibrating or self-cohering antenna arrays have been developed and tested by Steinberg [11] and others. Schultheiss et al. [5], [6] have studied in detail the self-calibration issue in the context of passive sensor arrays with imprecisely known sensor locations. Self-calibration techniques for eigenstructure-based array processing techniques seem to have received little attention. Lo and Marple [7] discussed a calibration technique that requires calibrating sources whose directions are known (at least two sources are required), and therefore their technique is not a true self-calibrating technique. Paulraj and Kailath [1] presented a method for direction-of-arrival (DOA) estimation by eigenstructure methods for an array with unknown sensor gains and phases. Their method does not require calibrating sources with known directions, but is limited to uniformly spaced linear arrays.

In this paper we address the problem of estimating the direction-of-arrival of plane waves impinging on a sensor array whose elements have unknown (or imprecisely known) coupling, gains and phases. We develop an eigenstructure-based method for simultaneously estimating the DOA's and the unknown coupling, gain, and phase parameters.

II. MUTUAL COUPLING MODEL

Ignoring for a moment the mutual coupling between the array sensors, we first formulate the data-signal for an ideal array with no mutual coupling. Consider N radiating sources observed by an arbitrary array of M sensors. The signal at the output of the mth sensor can be described by

$$x_m(t) = \sum_{n=1}^{N} \alpha_m s_n(t - \tau_{mn} - \psi_m) + u_m(t), \quad -T/2 \le t \le T/2, \quad m = 1, 2, \ldots, M \qquad (1)$$

where $\{s_n(t)\}_{n=1}^{N}$ are the radiated signals, $\{u_m(t)\}_{m=1}^{M}$ are sample waveforms from additive noise processes, and $T$ is the observation interval. The parameters $\{\tau_{mn}\}$ are delays associated with the signal propagation time from the nth source to the mth sensor. These parameters are of interest since they contain information about the source locations relative to the array. Finally, the parameters $\alpha_m$ and $\psi_m$ are the gain and the delay associated with the mth sensor. A convenient separation of the parameters to be estimated is obtained by representing the signals using Fourier coefficients defined by

(2)

where $\omega_l = 2\pi(l_1 + l)/T$, $l = 1, 2, \ldots, L$, and $l_1$ is a constant. In principle the number of coefficients required to capture all the signal information is infinite. However, if we consider signals with energy concentrated in a finite spectral band, we can use only $L < \infty$ coefficients. Moreover, in this work we are interested in narrow-band signals with spectrum concentrated around $\omega_0$, with a bandwidth that is small compared to $2\pi/T$, and hence $L = 1$. Taking the Fourier coefficients of (1) and suppressing the dependence on $\omega_0$ we obtain

$$X_m = \alpha_m e^{-j\omega_0 \psi_m} \sum_{n=1}^{N} e^{-j\omega_0 \tau_{mn}} S_n + V_m; \quad m = 1, 2, \ldots, M \qquad (3)$$

where $S_n$ and $V_m$ are the Fourier coefficients of $s_n(t)$ and $u_m(t)$, respectively. Equation (3) may be expressed using vector notation as follows:

$$X(j) = \Gamma \cdot A \cdot S(j) + V(j); \quad j = 1, 2, \ldots, J \qquad (4)$$

where $j$ is the index of different (independent) samples and

$$X(j) = [X_1(j), X_2(j), \ldots, X_M(j)]^T,$$

$$S(j) = [S_1(j), S_2(j), \ldots, S_N(j)]^T,$$

$$V(j) = [V_1(j), V_2(j), \ldots, V_M(j)]^T,$$

$$\Gamma = \mathrm{diag}\{\alpha_1 e^{-j\omega_0 \psi_1}, \ldots, \alpha_M e^{-j\omega_0 \psi_M}\},$$

$$A_{mn} = e^{-j\omega_0 \tau_{mn}}; \quad m = 1, 2, \ldots, M; \quad n = 1, 2, \ldots, N.$$

To further simplify the exposition we assume that the sensors and sources are coplanar and the sources are far enough from the observing array so that the signal wavefronts are effectively planar over the array. It is easy to verify that the delays $\tau_{mn}$ are given by

$$\tau_{mn} = -d_{mn}/c, \qquad d_{mn} = x_m \sin\theta_n + y_m \cos\theta_n \qquad (5)$$

where $c$ is the propagation velocity, $d_{mn}$ is the distance from sensor $m$ to sensor number one (reference sensor) in the direction of the nth source, $(x_m, y_m)$ are the coordinates of the mth sensor, $\theta_n$ is the DOA of the nth source relative to the $y$ axis, and the origin of the Cartesian coordinate system coincides with sensor number one; see Fig. 1. From (4) and (5), it follows that the elements of the matrix $A$ are given by

$$A_{mn} = e^{j(\omega_0/c)(x_m \sin\theta_n + y_m \cos\theta_n)}. \qquad (6)$$

Fig. 1. Problem geometry.

It also follows that $\{A_{1n}\}_{n=1}^{N} = 1$ and that only the nth column of $A$ depends on $\theta_n$.

We are now ready to address the mutual coupling between the array elements. In the foregoing theory it was assumed that each element in the array acts independently of all the others. This assumption is often invalid in practice. Reflected radiation from one element couples to its neighbors, as do currents that propagate along the surface of the array. The output voltage of each array element is the sum of the primary voltage due to the incident radiation, plus all the contributions from various coupling sources from each of its neighbors. Hence, the actual voltage at the output of the array is given by the following modification of (4):

$$X(j) = C \cdot \Gamma \cdot A \cdot S(j) + V(j); \quad j = 1, 2, \ldots, J \qquad (7)$$

where $C$ is an $M \times M$ complex matrix. Unfortunately, the matrix $C$ tends to change with time due to environmental factors such as temperature, humidity, pressure, vibrations and nearby objects. It is therefore desirable to estimate the matrix $C$ without interrupting the ongoing DF mission. Since DOA estimation is a complicated process even when $C$ is perfectly known, we concentrate here on estimating a first order approximation for the matrix $C$, as well as for $\Gamma$ and the DOA's.

A. Linear Arrays

In general, the matrix $C$ has no special structure. However, for a linear uniform array that is well balanced, a banded matrix provides an excellent model. The rationale behind this model is the fact that the mutual coupling coefficients are inversely proportional to the distance between the elements. Therefore, the mutual coupling between two elements that are far enough from each other can often be approximated as zero. Moreover, we expect that good linear uniform arrays will exhibit a banded Toeplitz mutual coupling matrix (i.e., the coupling between any two equally spaced sensors is the same).
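To make the linear-array model concrete, the following sketch builds the steering matrix of (6) for a uniform linear array and a banded Toeplitz mutual coupling matrix. It is an illustration only: the function names, the half-wavelength spacing, the coupling values, and the choice of a symmetric band are assumptions made here, and the gain/phase matrix is taken as the identity.

```python
import numpy as np

def steering_matrix(xy, doas_deg, wavelength):
    # A[m, n] = exp(j * (2*pi/wavelength) * (x_m sin(theta_n) + y_m cos(theta_n))),
    # which is (6) with omega_0 / c written as 2*pi / wavelength.
    theta = np.deg2rad(np.asarray(doas_deg, dtype=float))
    x, y = xy[:, 0:1], xy[:, 1:2]
    return np.exp(1j * 2 * np.pi / wavelength * (x * np.sin(theta) + y * np.cos(theta)))

def banded_toeplitz_mcm(coeffs, m):
    # C[p, q] = coeffs[|p - q|] inside the band and 0 outside it
    # (assumed symmetric: equal coupling in both directions).
    c = np.zeros(m, dtype=complex)
    c[:len(coeffs)] = coeffs
    idx = np.abs(np.subtract.outer(np.arange(m), np.arange(m)))
    return c[idx]

M, wavelength = 5, 1.0
xy = np.column_stack([np.arange(M) * wavelength / 2, np.zeros(M)])  # sensor 1 at the origin
A = steering_matrix(xy, [10.0, 40.0], wavelength)                   # two hypothetical DOAs
C = banded_toeplitz_mcm([1.0, 0.3 + 0.2j, 0.05 - 0.1j], M)          # hypothetical coupling values
B = C @ A                                                           # coupled array response
```

With sensor number one at the origin the first row of `A` is all ones, as noted after (6), and only three complex numbers describe `C` here, which is the kind of parameter reduction the banded Toeplitz model provides.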

B. Circular Arrays

Using the same set of considerations leads to the conclusion that the mutual coupling matrix for a uniform circular array consists of three bands: a center band, a band at the upper right-hand corner, and a band at the lower left corner. Moreover, it is expected that a good circular uniform array will exhibit a circulant mutual coupling matrix (MCM).
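The three-band structure described for the circular array can be seen directly by building a small circulant matrix. The sketch below uses hypothetical coupling values for a six-element array and assumes symmetric coupling; neither comes from the paper.

```python
import numpy as np

def circulant_mcm(first_row):
    # C[p, q] = first_row[(q - p) mod M]: each row is a cyclic shift of the first.
    c = np.asarray(first_row, dtype=complex)
    m = len(c)
    idx = (np.arange(m)[None, :] - np.arange(m)[:, None]) % m
    return c[idx]

# Hypothetical first row for M = 6; symmetry requires c[k] == c[M - k].
c = [1.0, 0.3 + 0.1j, 0.05j, 0.0, 0.05j, 0.3 + 0.1j]
C = circulant_mcm(c)
```

Coupling between elements three positions apart is set to zero here, so the nonzero entries form a center band plus wrap-around bands in the upper right and lower left corners, and the whole matrix is fixed by its first row.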


C. Existence of a Solution

The proposed method for estimating the DOA's and the MCM is based on the eigendecomposition of the sample covariance matrix of the vector of received signals. We make the standard assumptions underlying the MUSIC algorithm and other eigenstructure-based methods for direction finding.

1) The signals and the noise processes are stationary and ergodic over the observation period.
2) The columns of $B = C \cdot \Gamma \cdot A$ are linearly independent.
3) The signals are not perfectly correlated.
4) The noise is uncorrelated with the signals and its covariance matrix is full rank and is known except for a multiplication constant $\sigma_n^2$.

The covariance matrices of the signal, noise, and observation vectors are given by

$$R_s = E\{SS^H\}$$

$$\sigma_n^2 \Sigma_0 = E\{VV^H\}$$

$$R_x = E\{XX^H\} = C\Gamma A R_s A^H \Gamma^H C^H + \sigma_n^2 \Sigma_0 \qquad (8)$$

where $(\cdot)^H$ represents the Hermitian transpose operation.

While conditions for uniqueness are still an open research problem, it is easy to derive necessary conditions for the existence of a solution. Referring to the basic equation (8), we observe that $R_x$ can be perfectly described by $2MN - N^2 + 1$ parameters. These parameters are the $N + 1$ different (real) eigenvalues and $2NM - N^2 - N$ parameters that define the $N$ complex eigenvectors describing the signal subspace, which satisfy $N(N + 1)/2$ complex orthogonality constraints. On the other hand, we have $r \cdot N$ unknown location parameters ($r = 1$ for an azimuth only system, $r = 2$ for an azimuth and elevation system, etc.), $N^2$ unknown parameters that define the Hermitian matrix $R_s$, a single unknown parameter $\sigma_n^2$, $2(M - 1)$ parameters associated with $\Gamma$, and $P$ parameters associated with $C$. Thus, the problem is not strictly well posed (in the sense of [8, p. 84]) unless

$$2MN - N^2 + 1 \ge r \cdot N + N^2 + 1 + 2(M - 1) + P \qquad (9)$$

or, equivalently,

$$M \ge \frac{2N^2 + r \cdot N + P - 2}{2(N - 1)}.$$

Thus, for an azimuth only system (i.e., $r = 1$) with two sources and with $P = 2$, we need an array of at least five sensors. However, if the sensor gains and phases are known, or if $R_s$ is known, then a smaller number of sensors will suffice.

III. ESTIMATING THE DOA's, GAINS, PHASES, AND MCM

The proposed method is based on the eigendecomposition of the sample covariance matrix of the vector of received signals. In addition to the standard assumptions underlying the MUSIC algorithm, we assume that relation (9) is satisfied.

Theorem 1: Let $\lambda_i$ and $u_i$, $i = 1, 2, \ldots, M$, be the eigenvalues and corresponding eigenvectors of the matrix pencil $(R_x, \Sigma_0)$ (i.e., the solutions of $R_x u = \lambda \Sigma_0 u$), where the $\lambda_i$ are listed in descending order. Then,

1) $\lambda_{N+1} = \lambda_{N+2} = \cdots = \lambda_M = \sigma_n^2$.
2) Each of the columns of $B \triangleq C\Gamma A$ is orthogonal to the matrix $U = [u_{N+1}, u_{N+2}, \ldots, u_M]$.

Proof: The proof is a straightforward extension of the proof in [4].

This theorem suggests that one should first estimate $R_x$ and use the estimates of the eigenvectors to estimate the number of signals. Once $N$ is known, reasonable estimates of $\{\theta_n\}$, $\Gamma$, and $C$ may be obtained by minimizing the cost function

$$J_c = \|\hat{U}^H C \Gamma A\|_F^2 \qquad (11)$$

which is the squared Euclidean norm of the matrix $\hat{U}^H C \Gamma A$. Here $\hat{U}$ stands for the estimate of the matrix $U$. If $\hat{U}$ were a perfect estimate of $U$ (i.e., $\hat{U} = U$), then the minimum value of $J_c$ ($J_c = 0$) would be achieved for the true $C$, $\Gamma$, and $\{\theta_n\}$. When $\hat{U}$ is an imperfect estimate of $U$, the minimization of $J_c$ will provide estimates of $C$, $\Gamma$, and $\{\theta_n\}$, the true MCM, gain-phase parameters and DOA's. The accuracy of these estimates has been investigated by simulations. A detailed error analysis has not been carried out as yet.

The proposed minimization algorithm is based on a three step procedure. First, we assume that the gain/phase and mutual coupling coefficients are (approximately) known, and we estimate $\{\theta_n\}$ using the principles of the standard MUSIC algorithm. Given estimates of $\{\theta_n\}$, we then minimize $J_c$ over the gain/phase parameters. Given $\{\theta_n\}$ and $\Gamma$, we minimize $J_c$ over the MCM components. These minimization steps can be repeated until $J_c$ converges.

Before presenting the algorithm for minimizing $J_c$ we introduce three useful lemmas.

Lemma 1: For any $M \times 1$ complex vector $X$ and any $M \times M$ complex diagonal matrix $D$ we have

$$D \cdot X = Q_1(X) \cdot d$$

where the components of the $M \times 1$ vector $d$ and the $M \times M$ matrix $Q_1(X)$ are given by

$$d_i = D_{ii}, \quad i = 1, 2, \ldots, M$$

$$[Q_1(X)]_{ij} = X_i \cdot \delta_{ij}, \quad i, j = 1, 2, \ldots, M.$$

Lemma 2: For any $M \times 1$ complex vector $X$ and any $M \times M$ complex symmetric circulant matrix $A$ we have

$$A \cdot X = Q_2(X) \cdot a$$

where the components of the $L \times 1$ vector $a$ are given by

$$a_i = A_{1i}, \quad i = 1, 2, \ldots, L$$

where $L = M/2 + 1$ when $M$ is even and $L = M/2 + 1/2$ when $M$ is odd. The $M \times L$ matrix $Q_2(X)$ is the sum of the following four $M \times L$ matrices:

$$[W_1]_{pq} = \begin{cases} X_{p+q-1}, & p + q \le M + 1 \\ 0, & \text{otherwise} \end{cases}$$

$$[W_2]_{pq} = \begin{cases} X_{p-q+1}, & p \ge q \ge 2 \\ 0, & \text{otherwise} \end{cases}$$

$$[W_3]_{pq} = \begin{cases} X_{M+1+p-q}, & p < q \le l \\ 0, & \text{otherwise} \end{cases}$$

$$[W_4]_{pq} = \begin{cases} X_{p+q-M-1}, & 2 \le q \le l,\ p + q \ge M + 2 \\ 0, & \text{otherwise} \end{cases}$$

where $l = M/2$ for even $M$ and $l = (M + 1)/2$ for odd $M$.

a;.

2MN - N

N

2 H L IIO Cr a(On) 11 n=l

Jc =

1'

M and I

p
s: q

S:

I, P

+q

2:

otherwise.

= (M + 1)/2 for

odd M.

M

+

2
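Lemmas 1 and 2 are easy to check numerically. The following is a minimal sketch, not part of the original paper: the helper names `q1`, `q2`, and `symmetric_circulant` are hypothetical, and the index conditions are taken from the statement of Lemma 2 above with 1-based $p$, $q$ mapped to 0-based array indices.

```python
import numpy as np

def q1(x):
    """Q_1(x) of Lemma 1: a diagonal matrix carrying x on its diagonal."""
    return np.diag(x)

def q2(x):
    """Q_2(x) of Lemma 2, accumulated as W1 + W2 + W3 + W4."""
    M = len(x)
    L = M // 2 + 1 if M % 2 == 0 else (M + 1) // 2
    ell = M // 2 if M % 2 == 0 else (M + 1) // 2
    Q = np.zeros((M, L), dtype=complex)
    for p in range(1, M + 1):          # 1-based row index, as in the text
        for q in range(1, L + 1):      # 1-based column index
            if p + q <= M + 1:                       # W1: X_{p+q-1}
                Q[p - 1, q - 1] += x[p + q - 2]
            if p >= q >= 2:                          # W2: X_{p-q+1}
                Q[p - 1, q - 1] += x[p - q]
            if p < q <= ell:                         # W3: X_{M+1+p-q}
                Q[p - 1, q - 1] += x[M + p - q]
            if 2 <= q <= ell and p + q >= M + 2:     # W4: X_{p+q-M-1}
                Q[p - 1, q - 1] += x[p + q - M - 2]
    return Q

def symmetric_circulant(a, M):
    """M x M symmetric circulant matrix whose first row starts a_1, ..., a_L."""
    idx = np.arange(M)
    d = np.abs(idx[:, None] - idx[None, :])
    return a[np.minimum(d, M - d)]

rng = np.random.default_rng(0)
for M in (5, 6):
    L = M // 2 + 1 if M % 2 == 0 else (M + 1) // 2
    x = rng.standard_normal(M) + 1j * rng.standard_normal(M)
    a = rng.standard_normal(L) + 1j * rng.standard_normal(L)
    d = rng.standard_normal(M) + 1j * rng.standard_normal(M)
    A = symmetric_circulant(a, M)
    print(M, np.allclose(np.diag(d) @ x, q1(x) @ d), np.allclose(A @ x, q2(x) @ a))
```

The point of both lemmas is that a product that is linear in the unknown matrix entries can be rewritten as a known matrix times the vector of unknowns, which is what turns each step of the algorithm into a quadratic problem in that vector.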

Lemma 3: For any $M \times 1$ complex vector $X$ and any $M \times M$ banded complex symmetric Toeplitz matrix $A$ we have

$$A \cdot X = Q_3(X) \cdot a$$

where the $L \times 1$ vector $a$ is given by

$$a_i = A_{1i}, \qquad i = 1, 2, \cdots, L$$

and $L$ is the highest superdiagonal that is different from zero. The $M \times L$ matrix $Q_3(X)$ is given by the sum of the following two $M \times L$ matrices:

$$[W_1]_{pq} = \begin{cases} X_{p+q-1}, & \text{for } p + q \le M + 1 \\ 0, & \text{otherwise} \end{cases}$$

$$[W_2]_{pq} = \begin{cases} X_{p-q+1}, & \text{for } p \ge q \ge 2 \\ 0, & \text{otherwise.} \end{cases}$$

Proof: The proof of the lemmas is based on the special properties of diagonal matrices, circulant matrices and Toeplitz matrices. (See Appendix I.)

The proposed algorithm for minimizing the cost function may be described as follows.

A. Initialization

1) Set the iteration counter to zero: $k = 0$.
2) Select initial values for the gain-phase matrix $\Gamma$ and an initial value for the MCM (i.e., $C$). Usually the initial values are based on some previous knowledge (e.g., last measured values or predictions based on an idealized model).
3) Use all available data vectors to compute the data covariance matrix estimate:

$$\hat{R}_x = \frac{1}{J} \sum_{j=1}^{J} X(j) X(j)^H. \tag{12}$$

4) Perform eigenanalysis and construct $\hat{U}$ according to Theorem 1.

Step 1: Estimating DOA's:

1) Search for the $N$ highest peaks of the spatial spectrum defined by

$$P^{(k)}(\theta) = \left\| \hat{U}^H C^{(k)} \Gamma^{(k)} a(\theta) \right\|^{-2}. \tag{13}$$

These peaks are associated with the $N$ DOA's $\{\theta_n^{(k)}\}_{n=1}^{N}$. Also note that substituting $\{\theta_n^{(k)}\}$, so estimated, in $J_c$ (given by (11)) guarantees that $J_c^{(k)}$ is minimized for given $C^{(k)}$ and $\Gamma^{(k)}$.

Step 2: Estimating Gain-Phase:

1) Fixing the DOA's and the MCM we now minimize $J_c$ with respect to the gain and phase of each of the sensors. Using Lemma 1 in (11) we obtain

$$J_c = \sum_{n=1}^{N} a(\theta_n)^H \Gamma^H C^H \hat{U} \hat{U}^H C \Gamma a(\theta_n) = \delta^H \left\{ \sum_{n=1}^{N} Q_1(n)^H C^H \hat{U} \hat{U}^H C Q_1(n) \right\} \delta \tag{14}$$

where

$$\delta = [\Gamma_{11}, \Gamma_{22}, \cdots, \Gamma_{MM}]^T, \qquad Q_1(n) = \mathrm{diag}\{a(\theta_n)\}.$$

Hence, we want to minimize (14) with respect to $\delta$ under the constraint $\delta^H w = 1$, where $w = [1, 0, 0, \cdots, 0]^T$. The result of this quadratic minimization problem under linear constraints is well known and given by

$$\delta = Z_k^{-1} w / (w^T Z_k^{-1} w) \tag{15}$$

where $Z_k$ is the matrix

$$Z_k \triangleq \sum_{n=1}^{N} Q_1(n)^H C^H \hat{U} \hat{U}^H C Q_1(n). \tag{16}$$

2) Compute the gain-phase matrix $\Gamma^{(k+1)} = \mathrm{diag}\{\delta\}$ from the vector $\delta$ given by (15).

Thus, we minimized the cost function $J_c$ by holding the MCM and the DOA's fixed and searching over the space of the sensor gain-phase parameters.

Step 3: Estimating the Mutual Coupling Matrix:

1) In this step we hold the DOA's and sensor gain-phase fixed and find the MCM that minimizes the cost function $J_c$. The minimization step capitalizes on Lemma 2 if $C$ is circulant (circular array) and on Lemma 3 if $C$ is Toeplitz (linear array). In the following we assume that $C$ is a complex symmetric circulant matrix. Using Lemma 2 in (11) for the cost function $J_c$ we obtain

$$J_c = \sum_{n=1}^{N} a(\theta_n)^H \Gamma^H C^H \hat{U} \hat{U}^H C \Gamma a(\theta_n) = \sum_{n=1}^{N} \tilde{a}(\theta_n)^H C^H \hat{U} \hat{U}^H C \tilde{a}(\theta_n) = c^H \left\{ \sum_{n=1}^{N} Q_2(n)^H \hat{U} \hat{U}^H Q_2(n) \right\} c \tag{18}$$

where we used the following notation:

$$\tilde{a}(\theta_n) = \Gamma a(\theta_n), \qquad Q_2(n) = Q_2(\tilde{a}(\theta_n)), \qquad c_i = C_{1i}, \quad i = 1, 2, \cdots, L.$$

Note that $Q_2(X)$ and $L$ are defined in Lemma 2. Relation (18) represents (again) a quadratic minimization problem under linear constraints. The linear constraints represent the assumed model of $C$ (e.g., $c_{11} = 1$). Hence if the constraint equation is $W^T c = u$ then

$$c = G^{-1} W (W^T G^{-1} W)^{-1} u \tag{19}$$

where $G$ is the matrix

$$G \triangleq \sum_{n=1}^{N} Q_2(n)^H \hat{U} \hat{U}^H Q_2(n) \tag{20}$$

and $W$ represents the linear constraints.

2) Reconstruct the MCM matrix from the vector $c$ given by (19).

B. Convergence Check

Compute $J_c^{(k+1)}$ using the estimated DOA's, sensor gain-phase and the MCM. If

$$J_c^{(k)} - J_c^{(k+1)} > \epsilon \quad \text{(a preset threshold)}$$

then update the iteration counter $k = k + 1$ and go back to Step 1. If

$$J_c^{(k)} - J_c^{(k+1)} \le \epsilon$$

stop. The algorithm performs the iterations until $J_c$ converges. Note that at each step the cost function is reduced, so that

$$J_c^{(0)} > J_c^{(1)} > \cdots > J_c^{(k)} \ge 0.$$

Hence, $J_c^{(k)}$ is a convergent sequence and convergence is guaranteed.

[Fig. 2. Spatial spectrum of the proposed algorithm after iterations 1, 2, 4, 30. SNR = 30 dB, 500 snapshots. Horizontal axis: direction of arrival (degrees).]

[Fig. 3. Value of the cost function versus iteration number. SNR = 30 dB, 500 snapshots.]
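The gain-phase update in Step 2 is a quadratic minimization under a single linear constraint, whose closed form is (15). The sketch below illustrates only that closed form. It assumes a randomly generated Hermitian positive definite matrix in place of the data-dependent $Z_k$ of (16), and the function name `constrained_min` is hypothetical.

```python
import numpy as np

def constrained_min(Z, w):
    """Minimise d^H Z d subject to d^H w = 1 for real w, as in (15)."""
    zw = np.linalg.solve(Z, w)      # Z^{-1} w without forming the inverse
    return zw / (w @ zw)            # divide by w^T Z^{-1} w

rng = np.random.default_rng(0)
M = 6
X = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
Z = X @ X.conj().T + np.eye(M)      # stand-in for Z_k: Hermitian, positive definite
w = np.zeros(M)
w[0] = 1.0                          # w = [1, 0, ..., 0]^T fixes the reference sensor

delta = constrained_min(Z, w)
Gamma_next = np.diag(delta)         # gain-phase matrix rebuilt from delta
cost = (delta.conj() @ Z @ delta).real
print(abs(np.vdot(delta, w) - 1.0) < 1e-12, cost)
```

With $w = [1, 0, \cdots, 0]^T$ the constraint pins the first sensor to unit gain and zero phase, which removes the scale ambiguity that would otherwise let $\delta = 0$ minimize the cost. The MCM update (19) has the same structure with several constraints at once.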

IV. NUMERICAL EXAMPLES

To illustrate the behavior of the algorithm, consider a circular uniform array of six omnidirectional sensors separated by half a wavelength of the actual narrow-band source signals. We used simulated signal vectors $S(j)$ and noise vectors $V(j)$ drawn from a complex Gaussian distribution with zero mean and covariance matrices $\sigma_s^2 \cdot I$ and $\sigma_n^2 \cdot I$, respectively. We assumed that each sensor is significantly coupled with its nearest neighbors while the coupling with other sensors can be ignored. This assumption reduces the MCM to a $6 \times 6$ matrix with five nonzero diagonals. The gain of the sensors was selected using the following relation:

$$\alpha_i = [(\beta_i - 0.5)\sigma_a \cdot \sqrt{12} + 1], \qquad i = 1, 2, \cdots, M$$

where $\beta_i$ is a uniformly distributed random number between zero and one, and $\sigma_a^2$ is the variance of the gain. The phase of each of the sensors was selected according to

$$\psi_i = [(\gamma_i - 0.5)\sigma_\psi \cdot \sqrt{12}], \qquad i = 1, 2, \cdots, M$$

where $\gamma_i$ is a uniformly distributed random number between zero and one and $\sigma_\psi^2$ is the variance of the sensors' phase.

Figs. 2-6 describe an experiment with the following parameters:

SNR $= 10 \log_{10}(\sigma_s^2 / \sigma_n^2) = 30$ dB
$J = 500$ snapshots
$\sigma_a = 0.2$
$\sigma_\psi = 20°$
coupling coefficient $= 0.2 + j0$
DOA 1 $= -30°$
DOA 2 $= -5°$
DOA 3 $= 35°$
gains = 1.000, 0.5800, 0.8215, 1.077, 0.7970, 0.9620
phases = 0°, -3.8°, 14.6°, 34.4°, -3.4°, 41.4°

Fig. 2 shows the spatial spectrum of the proposed procedure at the first, second, fourth and 30th iteration. It is clear that major DOA errors are corrected. Fig. 3 shows the reduction of the cost function value during the iterations, until convergence is obtained. Fig. 4 shows $(\|\Gamma_i - \Gamma_{\mathrm{true}}\| / \|\Gamma_{\mathrm{true}}\|) \cdot 100$, which is a measure of the relative gain/phase errors of all the sensors. $\Gamma_i$ is the gain/phase matrix at iteration $i$ while $\Gamma_{\mathrm{true}}$ is the true gain/phase matrix. Fig. 5 shows the relative coupling coefficient error as a function of iterations. Before the first iteration the error is 100% and it reduces to 1.9%. Fig. 6 shows the DOA errors for the three sources as a function of the iteration number.

[Fig. 4. Relative gain/phase errors versus iteration number. SNR = 30 dB, 500 snapshots.]

[Fig. 5. Coupling coefficient error versus iteration number. SNR = 30 dB, 500 snapshots.]

[Fig. 6. DOA errors versus iteration number. SNR = 30 dB, 500 snapshots.]

To demonstrate the statistical efficiency of the proposed procedure we performed the following Monte Carlo experiments. The six-sensor circular array described above was used, with three far-field narrow-band emitters. The gains and phases were selected as before, with $\sigma_a = 0.02$ and $\sigma_\psi = 2°$. The DOA's were $\gamma_1 = 0°$, $\gamma_2 = 120°$, $\gamma_3 = 240°$. The coupling coefficient between any two adjacent sensors was $cc = 0.2(x + jy)$ where $x$ and $y$ are two i.i.d. random variables with uniform distribution over the interval $[-0.5, 0.5]$. The coupling coefficient for any nonadjacent sensors was assumed to be zero. We performed 30 experiments for each signal-to-noise ratio, for SNR = 10, 20, 30 dB. In each experiment 500 snapshots of data were collected and processed by the algorithm. The values of the sensor gains and phases were kept constant throughout these simulations. For each SNR we used the results of the 30 experiments to compute the estimated root mean square error (RMSE) and the bias. Figs. 7-11 depict the RMSE's and compare them to the corresponding Cramer-Rao lower bound (computed as shown in Appendix II). These figures clearly indicate that the proposed algorithm is statistically efficient even for fairly low SNR's, at least for the test case considered here. The bias was small compared to the RMSE in each case.

[Fig. 7. RMSE of the magnitude of the coupling coefficient versus SNR. The solid line depicts theoretical values computed by the Cramer-Rao lower bound. The point estimates are the means and 90% confidence intervals from Monte Carlo experiments for three values of the SNR. Each Monte Carlo experiment consisted of 30 runs of 500 snapshots each.]
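The factor $\sqrt{12}$ in the gain and phase relations above normalizes the uniform variable: a uniform number on $[0, 1]$ has variance $1/12$, so $(\beta_i - 0.5)\sqrt{12}$ has unit standard deviation and the scale factor becomes the standard deviation of the perturbation. A minimal sketch of that draw follows; the function name `draw_gains_phases`, the seed, and the sample size are illustrative assumptions, not the authors' simulation code.

```python
import numpy as np

def draw_gains_phases(M, sigma_a, sigma_psi_deg, rng):
    """Draw sensor gains (mean 1) and phases in degrees (mean 0)."""
    beta = rng.uniform(0.0, 1.0, M)
    gamma = rng.uniform(0.0, 1.0, M)
    gains = (beta - 0.5) * sigma_a * np.sqrt(12.0) + 1.0
    phases_deg = (gamma - 0.5) * sigma_psi_deg * np.sqrt(12.0)
    return gains, phases_deg

rng = np.random.default_rng(0)

# A large sample, only to show that the spreads match sigma_a and sigma_psi.
gains, phases_deg = draw_gains_phases(200_000, 0.2, 20.0, rng)
print(gains.mean(), gains.std(), phases_deg.std())

# Linear power ratio corresponding to the 30 dB SNR of Figs. 2-6.
snr_linear = 10.0 ** (30.0 / 10.0)
print(snr_linear)
```

For the six-sensor experiment one would call the function with `M = 6`. Note that the tabulated values list unit gain and zero phase for the first sensor, which suggests it serves as the reference; this sketch does not impose that.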

[Fig. 8. RMSE of the phase of the coupling coefficient versus SNR. The solid line depicts theoretical values computed by the Cramer-Rao lower bound. The point estimates are the means and 90% confidence intervals from Monte Carlo experiments for three values of the SNR. Each Monte Carlo experiment consisted of 30 runs of 500 snapshots each.]

[Fig. 9. RMSE of the gain of sensor 2 versus SNR. Solid line, point estimates, and experiment size as in Fig. 8.]

[Fig. 10. RMSE of the phase of sensor 2 versus SNR. Solid line, point estimates, and experiment size as in Fig. 8.]

[Fig. 11. RMSE of the direction of arrival of the source at 0° versus SNR. Solid line, point estimates, and experiment size as in Fig. 8.]

We have repeated this type of experiment with different array geometries and different initial gain and phase errors, using error values which are large compared to what one may expect to find in practice. We found the algorithm to be fairly insensitive to the initial gain and phase error values. As long as the number of snapshots was reasonably large (as in the examples above), the performance of the algorithm was very close to the Cramer-Rao lower bound. As the number of snapshots is decreased, the performance starts to deteriorate and departs from the CRB, as expected. A small-sample performance analysis of this problem is unavailable at this time.

V. CONCLUSION

In this work the eigenstructure approach has been used to obtain estimates of directions of arrival as well as estimates of gain, phase and mutual coupling of the observing array sensors. We have shown in previous publications [12], [13] that the basic MUSIC method does not perform well when the array properties are not known accurately. Therefore, the estimation of the sensor characteristics is essential in order to estimate the DOA's accurately. The algorithm presented here is able to calibrate the array parameters (mutual coupling, and sensor gains and phases) without prior knowledge of the array manifold, using only "signals of opportunity" and avoiding the need for deploying auxiliary sources at known locations. We found some necessary conditions for the existence of a solution. Deriving sufficient conditions for convergence to the global minimum, and obtaining a unique solution, are still open research problems.

APPENDIX I
PROOFS OF LEMMAS

A. Proof of Lemma 1

The proof of Lemma 1 can be obtained by direct multiplication.

B. Proof of Lemma 2

By definition an $M \times M$ symmetric circulant matrix $A$, with $i,j$th element $A_{ij}$, has the following relations between its elements:

$$A_{ij} = A_{pq} \quad \text{if } (j - i)|_M = (q - p)|_M \tag{21}$$

and

$$A_{ij} = A_{ji} \tag{22}$$

where we used the notation:

$$(j - i)|_M = \begin{cases} M + j - i, & \text{when } i > j \\ j - i, & \text{when } i \le j. \end{cases} \tag{23}$$

From these relations we see that a symmetric circulant matrix contains at most $L$ distinct elements $a_1, a_2, \cdots, a_L$ where

$$L = \begin{cases} (M + 2)/2, & \text{when } M \text{ is even} \\ (M + 1)/2, & \text{when } M \text{ is odd} \end{cases} \tag{24}$$

and for even $M$, $A$ is the circulant matrix whose first row is

$$[a_1, a_2, \cdots, a_{L-1}, a_L, a_{L-1}, \cdots, a_2]$$

while for odd $M$, $A$ is the circulant matrix whose first row is

$$[a_1, a_2, \cdots, a_L, a_L, \cdots, a_2].$$

It is easy to see that the matrix $A$ may be represented by the sum

$$A = \sum_{k=-M+1}^{M-1} \mathrm{diag}(A, k)$$

where $\mathrm{diag}(A, k)$ is an $M \times M$ zero matrix except for the $k$th diagonal, which is equal to the corresponding diagonal of $A$. Since the elements on each of the diagonals are equal we have for even $M$:

$$A = \sum_{k=0}^{L-1} a_{k+1}\,\mathrm{diag}(J_M, k) + \sum_{k=-L+1}^{-1} a_{-k+1}\,\mathrm{diag}(J_M, k) + \sum_{k=L}^{M-1} a_{2L-k-1}\,\mathrm{diag}(J_M, k) + \sum_{k=-M+1}^{-L} a_{2L+k-1}\,\mathrm{diag}(J_M, k) \tag{25}$$

and for odd $M$:

$$A = \sum_{k=0}^{L-1} a_{k+1}\,\mathrm{diag}(J_M, k) + \sum_{k=-L+1}^{-1} a_{-k+1}\,\mathrm{diag}(J_M, k) + \sum_{k=L}^{M-1} a_{2L-k}\,\mathrm{diag}(J_M, k) + \sum_{k=-M+1}^{-L} a_{2L+k}\,\mathrm{diag}(J_M, k) \tag{26}$$

where $J_M$ is an $M \times M$ matrix of ones. Now, note that

$$\mathrm{diag}(J_M, k) \cdot x = \begin{cases} [x_{k+1}, x_{k+2}, \cdots, x_M, 0, \cdots, 0]^T, & k \ge 0 \\ [0, \cdots, 0, x_1, \cdots, x_{M+k}]^T, & k < 0 \end{cases} \;\triangleq\; x^{(k)}. \tag{27}$$

Using the notation $a = [a_1, a_2, \cdots, a_L]^T$ we have

$$x^{(k)} a_j = \mathrm{column}(x^{(k)}, j) \cdot a \tag{28}$$

where $x^{(k)}$ is the right side of (27) and $\mathrm{column}(x, j)$ represents an $M \times L$ zero matrix except for the $j$th column, which is equal to $x$. Using (25) we obtain, for even $M$,

$$Ax = \left\{ \sum_{k=0}^{L-1} \mathrm{column}(x^{(k)}, k+1) \right\} a + \left\{ \sum_{k=-L+1}^{-1} \mathrm{column}(x^{(k)}, 1-k) \right\} a + \left\{ \sum_{k=L}^{M-1} \mathrm{column}(x^{(k)}, 2L-k-1) \right\} a + \left\{ \sum_{k=-M+1}^{-L} \mathrm{column}(x^{(k)}, 2L+k-1) \right\} a \tag{29}$$

and for odd $M$,

$$Ax = \left\{ \sum_{k=0}^{L-1} \mathrm{column}(x^{(k)}, k+1) \right\} a + \left\{ \sum_{k=-L+1}^{-1} \mathrm{column}(x^{(k)}, 1-k) \right\} a + \left\{ \sum_{k=L}^{M-1} \mathrm{column}(x^{(k)}, 2L-k) \right\} a + \left\{ \sum_{k=-M+1}^{-L} \mathrm{column}(x^{(k)}, 2L+k) \right\} a. \tag{30}$$

It is now easy to verify that

$$W_1 = \sum_{k=0}^{L-1} \mathrm{column}(x^{(k)}, k+1), \qquad W_2 = \sum_{k=-L+1}^{-1} \mathrm{column}(x^{(k)}, 1-k),$$

$$W_3 = \begin{cases} \sum_{k=L}^{M-1} \mathrm{column}(x^{(k)}, 2L-k-1), & \text{even } M \\ \sum_{k=L}^{M-1} \mathrm{column}(x^{(k)}, 2L-k), & \text{odd } M \end{cases} \qquad W_4 = \begin{cases} \sum_{k=-M+1}^{-L} \mathrm{column}(x^{(k)}, 2L+k-1), & \text{even } M \\ \sum_{k=-M+1}^{-L} \mathrm{column}(x^{(k)}, 2L+k), & \text{odd } M. \end{cases}$$

Hence,

$$Ax = (W_1 + W_2 + W_3 + W_4)\,a = Q_2(x)\,a.$$

Q.E.D.

C. Proof of Lemma 3

The proof of Lemma 3 goes along the same steps as the proof of Lemma 2.

APPENDIX II
THE CRAMER-RAO BOUND FOR GAUSSIAN SIGNALS

Consider a complex Gaussian vector $x$ with zero mean,

$$E\{x\} = 0 \tag{31}$$

and covariance

$$E\{xx^H\} = R. \tag{32}$$

The unknown parameters $\theta$ are imbedded in the covariance $R$. The logarithm of the pdf for $J_1$ statistically independent observations can be written, up to an additive constant, as

$$L(\theta) = -J_1 \ln\{\det(R)\} - \sum_{j=1}^{J_1} x(j)^H R^{-1} x(j) = -J_1 \ln\{\det(R)\} - J_1\,\mathrm{tr}\{R^{-1}\hat{R}\} \tag{33}$$

where

$$\hat{R} = \frac{1}{J_1} \sum_{j=1}^{J_1} x(j)\,x(j)^H. \tag{34}$$

The $m,n$th element of the Fisher information matrix is given by

$$J_{mn} = -E\left\{ \frac{\partial^2 L}{\partial\theta_m\,\partial\theta_n} \right\}. \tag{35}$$

To evaluate (35) we use the following relations:

$$d(R^{-1}) = -R^{-1} \cdot dR \cdot R^{-1} \tag{36}$$
$$d(\ln\{\det(R)\}) = \mathrm{tr}\{R^{-1} \cdot dR\} \tag{37}$$
$$E\{\hat{R}\} = R. \tag{38}$$

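Taking the first derivative of $L(\theta)$ we obtain

$$\partial L/\partial\theta_m = J_1\,\mathrm{tr}\{R^{-1}\,\partial R/\partial\theta_m\,R^{-1}\hat{R}\} - J_1\,\mathrm{tr}\{R^{-1}\,\partial R/\partial\theta_m\} = J_1\,\mathrm{tr}\{R^{-1}\,\partial R/\partial\theta_m\,(R^{-1}\hat{R} - I)\}.$$

The second derivative is given by

$$\partial^2 L/\partial\theta_m\partial\theta_n = J_1\,\mathrm{tr}\{\partial(R^{-1}\,\partial R/\partial\theta_m)/\partial\theta_n \cdot (R^{-1}\hat{R} - I) + R^{-1}\,\partial R/\partial\theta_m \cdot (-R^{-1}\,\partial R/\partial\theta_n\,R^{-1}\hat{R})\}.$$

Taking the expectation of both sides we obtain

$$J_{mn} = -E\{\partial^2 L/\partial\theta_m\partial\theta_n\} = J_1\,\mathrm{tr}\{R^{-1}\,\partial R/\partial\theta_m\,R^{-1}\,\partial R/\partial\theta_n\}. \tag{39}$$

Thus, the number of observations enters the result only as a multiplicative constant. To simplify the exposition we assume $J_1 = 1$. The modification for other values of $J_1$ is straightforward.

A. The FIM for a General Passive Array

For the parameter estimation problem posed earlier in this work, the covariance matrix of the data vector $x$ can be written as

$$R = C\Gamma A R_s A^H \Gamma^H C^H + \sigma_n^2 I \tag{40}$$

where $A$ is the direction matrix. To simplify the derivation we assume that $\sigma_n^2$ and $R_s$ are known. Dividing both sides of (40) by $\sigma_n^2$ we obtain

$$\tilde{R} \triangleq R/\sigma_n^2 = C\Gamma A P A^H \Gamma^H C^H + I \tag{41}$$

where

$$P = R_s/\sigma_n^2. \tag{42}$$

It is easy to verify that

$$J(R) = J(\tilde{R}) \tag{43}$$

and therefore we use $\tilde{R}$ instead of $R$ to derive the FIM. Moreover, we use the following notation to simplify the formulas:

$$\tilde{A} \triangleq C\Gamma A \tag{44}$$
$$W \triangleq \tilde{A}^H \tilde{A} \tag{45}$$
$$Q \triangleq (P^{-1} + W)^{-1}. \tag{46}$$

We also use the relation

$$PWQ = QWP = P - Q \tag{47}$$

which implies that

$$P\tilde{A}^H\tilde{R}^{-1} = Q\tilde{A}^H. \tag{48}$$

B. Derivatives with Respect to DOA

In the ensuing development, we make use of the notational device

$$\tilde{A}_{\gamma_j} \triangleq \partial\tilde{A}/\partial\gamma_j \tag{49}$$

for the partial derivative of the matrix $\tilde{A}$ with respect to the DOA $\gamma_j$ of the $j$th source. Using (49), we first write the partial derivative of the covariance matrix with respect to the $j$th DOA as

$$\partial\tilde{R}/\partial\gamma_j = \tilde{A}_{\gamma_j} P \tilde{A}^H + \tilde{A} P \tilde{A}_{\gamma_j}^H. \tag{50}$$

Substituting in (39) and noting that

$$\mathrm{tr}(A^H) = \mathrm{conj}\{\mathrm{tr}(A)\} \tag{51}$$

we obtain

$$J_{\gamma_i\gamma_j} = 2\,\mathrm{Re}\{\mathrm{tr}\{\tilde{A}_{\gamma_i}P\tilde{A}^H\tilde{R}^{-1}\tilde{A}P\tilde{A}_{\gamma_j}^H\tilde{R}^{-1}\} + \mathrm{tr}\{\tilde{A}_{\gamma_i}P\tilde{A}^H\tilde{R}^{-1}\tilde{A}_{\gamma_j}P\tilde{A}^H\tilde{R}^{-1}\}\}. \tag{52}$$

Observe that

$$\tilde{A}_{\gamma_j} = \dot{\tilde{A}}\,e_j e_j^T \tag{53}$$

where the unit vector $e_j$ is the $j$th column of the $N \times N$ identity matrix, and $\dot{A}$ is the matrix of derivatives given by

$$\dot{A} = [\partial a(\gamma_1)/\partial\gamma_1, \cdots, \partial a(\gamma_N)/\partial\gamma_N]. \tag{54}$$

We use the notation

$$\dot{\tilde{A}} \triangleq C\Gamma\dot{A} \tag{55}$$

to simplify our formulas. Using (53) and (55), (52) becomes

$$J_{\gamma_i\gamma_j} = 2\,\mathrm{Re}\{\mathrm{tr}\{\dot{\tilde{A}}e_ie_i^TP\tilde{A}^H\tilde{R}^{-1}\tilde{A}Pe_je_j^T\dot{\tilde{A}}^H\tilde{R}^{-1}\} + \mathrm{tr}\{\dot{\tilde{A}}e_ie_i^TP\tilde{A}^H\tilde{R}^{-1}\dot{\tilde{A}}e_je_j^TP\tilde{A}^H\tilde{R}^{-1}\}\}$$
$$= 2\,\mathrm{Re}\{e_i^TP\tilde{A}^H\tilde{R}^{-1}\tilde{A}Pe_j \cdot e_j^T\dot{\tilde{A}}^H\tilde{R}^{-1}\dot{\tilde{A}}e_i + e_i^TP\tilde{A}^H\tilde{R}^{-1}\dot{\tilde{A}}e_j \cdot e_j^TP\tilde{A}^H\tilde{R}^{-1}\dot{\tilde{A}}e_i\}. \tag{56}$$

Hence,

$$J_{\gamma\gamma} = 2\,\mathrm{Re}\{(P\tilde{A}^H\tilde{R}^{-1}\tilde{A}P) \odot (\dot{\tilde{A}}^H\tilde{R}^{-1}\dot{\tilde{A}})^T + (P\tilde{A}^H\tilde{R}^{-1}\dot{\tilde{A}}) \odot (P\tilde{A}^H\tilde{R}^{-1}\dot{\tilde{A}})^T\} \tag{57}$$

where $J_{\gamma\gamma}$ is the submatrix of the FIM associated with the DOA derivatives and $\odot$ denotes the Hadamard product of two matrices, defined by

$$(A \odot B)_{ij} = A_{ij}B_{ij}. \tag{58}$$

Equation (57) may be further simplified using (46)-(48) as follows:

$$J_{\gamma\gamma} = 2\,\mathrm{Re}\{(P - Q) \odot (\dot{\tilde{A}}^H\tilde{R}^{-1}\dot{\tilde{A}})^T + (Q\tilde{A}^H\dot{\tilde{A}}) \odot (Q\tilde{A}^H\dot{\tilde{A}})^T\}. \tag{59}$$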
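The DOA block (59) can be cross-checked against the generic trace formula (39) on a small example. The sketch below is an illustration only: it assumes a half-wavelength uniform linear array steering vector (the paper's examples use a circular array), a tridiagonal stand-in for $C$, and randomly drawn $\Gamma$ and $P$. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 6, 2
gam = np.deg2rad(np.array([-10.0, 25.0]))            # DOA's in radians

# Assumed steering model: half-wavelength uniform linear array.
m = np.arange(M)[:, None]
A = np.exp(1j * np.pi * m * np.sin(gam)[None, :])    # direction matrix, M x N
A_dot = 1j * np.pi * m * np.cos(gam)[None, :] * A    # column-wise d a / d gamma, as in (54)

Gam = np.diag(rng.uniform(0.8, 1.2, M) * np.exp(1j * rng.uniform(-0.3, 0.3, M)))
C = np.eye(M) + 0.1 * (np.eye(M, k=1) + np.eye(M, k=-1))   # stand-in coupling matrix
At = C @ Gam @ A                                     # (44)
Adt = C @ Gam @ A_dot                                # (55)

X = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
P = X @ X.conj().T + np.eye(N)                       # Hermitian positive definite, as in (42)
R = At @ P @ At.conj().T + np.eye(M)                 # normalised covariance (41)
Ri = np.linalg.inv(R)
W = At.conj().T @ At                                 # (45)
Q = np.linalg.inv(np.linalg.inv(P) + W)              # (46)

def dR(j):
    """Derivative of R with respect to the j-th DOA, as in (50) with (53)."""
    E = np.zeros((N, N))
    E[j, j] = 1.0
    T = Adt @ E @ P @ At.conj().T
    return T + T.conj().T

# Generic trace formula (39) with J_1 = 1.
J_trace = np.array([[np.trace(Ri @ dR(i) @ Ri @ dR(j)).real
                     for j in range(N)] for i in range(N)])

# Closed form (59) using the Hadamard (element-wise) product.
B = Q @ At.conj().T @ Adt
J_had = 2.0 * np.real((P - Q) * (Adt.conj().T @ Ri @ Adt).T + B * B.T)

print(np.allclose(P @ W @ Q, P - Q), np.allclose(J_trace, J_had))
```

Agreement of the two matrices exercises relations (47) and (48) as well, since (59) is obtained from (57) only through them.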

where

(61 )

E. DOA-Phase Cross Terms

Here F is a diagonal matrix containing the exponents of the sensors' phases while G is a diagonal matrix of the sensors' gains, thus (62) r == OF == FC.

F as

It is useful to define

We first write the cross-term equivalent of (69), J'Yi¢j == 2 Re

tr

{

" H A'YiPA R

the matrix of derivatives:

(

'YI
= 2Re

-(fAPAHR-1C) x (fAPAHR-1C)T}

=2Re{(C-

= 2Im{[(p-

(C-1APAHR-1C)T}

Q)AHC- H)

{-J[ (p - Q)(C-1A)H]

+J(QAHC)

{(C-IPAHR-IAPAHC-H) x (CHR-1C)T

1A(P-

-(QAHC)

x (CHR-1C)T

- ( C- IAQA HC) x (c- IAQA HC) T}

J,/aj

Q)(C-1A(] x (CHR-1A)T X

= 2 Re {tr {A"PAHR-IAPA~R-I}

·APA HC-HG-1e eTCHR-

T'

T

-

-1-1 C A.

T

J

(68)

= 2Re

{tr

{A a,.PAHR-1APA.:!RUoj

t.: = 2Rc

}

= 2Re

Substituting (68) in (69) we obtain I

j

J

(70)

2 Re {( C- le-lAPA HR-1APA HC-HG- I) x(CHR-lC)T

+

X(CHR-1C)T

+

{

"

H

AatPA R

-I

'H

APA
-I}

(78)

J ajcPj == 2Re {-tr {Cete;G-IC-IAPAHR-I

Q)( G-1C-1A) H)

"APAC-Heje;CHR-lj}

(G-1C-1AQAHc)

x(G-1C-1AQAHC)T}.

(77)

Substituting (68) and (64) one obtains

X (G-1C-1APAHR-1C) T}

-

Jat
+tr {Aa;PAHR-IAjPAHR-I}}.

(G-IC-1APAHR-lC)

= 2 Re { (G-1C-1A( P

(CHR~IA)T

X

The cross-term equivalent of (69) is given by

+tr {CejefC-tC-tAPAHR-t

t.: =

{[(P - Q)(G-1C-1A)H]

G. Gain-Phase Cross- Terms

-APAHC-HC-Ie e~CHR-I}

-Ceje3"C-1C-tAPAHR-I}}

(76)

{(PAHR-IAPAHC-HG-I) x (CHR-1A)T

+(QAHC) x (G-1C-1AQAHA)T}.

{tT {Ceje;C-lC-lAPAHR-1 )

1}

+(PAHR-1C) x (G-1C-1APAHR-1A)T}

+tr {Aa;PAHR-IA"jPAHR-I}} . (69) la'a' == 2Re

j

+tr{Ae I eTPAHR-ICe.eTO-Ic-tAPAR-ll} , j j f

Thus, repeating the considerations leading to (60) we obtain t

(75)

J-Yiaj == 2Re {tr {Ae/e;PAHR- 1

!.w . (67)

Hence AO~i == Ce,ejGFA == Ce,e1FA == CeielG

(74)

(C-1AQAHA)T}.

Substituting (53). (55). and (58). we obtain

and

==

(C-1AQAHA{}

+tr {A"PAHR-IA"JPAHR-I}}.

(66)

6 22 ( ex 2) • " . " • 6,'-'1.\,1 ( ex .\1 )}

X

(CHR-'A)T

X

The cross-terms equivalent of (69) is given by

We first define

I) •

(73)

F. DOA-Gain Cross-Terms

(65)

D. Derivatives with Respect to Gain

~ diag {Gil ( a

}

+J(PAHR-1C) X (C-1APAHR-1A)T}

J = 2Re {(fAPAHR-IAPAHf H) x (CHR-1C)T

X

JJ

J-y

+tr { - Ceje;r APA HR-1CejeJr APA HR- 1} }

-(C-1APAHR-1C)

II

+tr {jAeje;PA HR-lCe jeJC-lAPA HR- 1 }

1} J ¢icPj == 2Re {tr {Ce.eTrAPAHR-IAPAHrHe 1 , j.eTCHRj

"

(72)

J .. ==2Re{tr{-jAe.e!PAHR-IAPAHC-He.e~CHR-I}

64)

Substituting (64) in (60) we obtain

1aiaj

"H

Substituting (64) and (53), (55) we obtain

" T" . T . T -1 Aet>j==CejejFGA==jCejejrA==jCejejC A.

c

-1APA¢jR -I}

+tr {A1'iPAHR-IAcI>jPAHR-l}.

Thus (61) becomes

= 2Re

{

+tr {Ce 1·e!OIC- lAPA HR1

(71) 415

-CejeJC-lAPA HR-1j}}

1

(79)

= 2 Re {tr {AI-tPAHR-lCeje;G-lC-lAPAHR-I}

latt> = 2Re {_j(O-IC-IAPAHR-IAPAHC-fl

x)

1} +tr {A IL PAHR-IAPAHC-HG-le j .e'fCHRj

X (CHR-1C)T +j(O-IC-1APAHR-1C)

X(C-1APA HR-1C) T}

= 2Re {e'fO-1C-1APAHR-IA j IL PAHR-1Ce.j

+e~CHR-IA PAHR-IAPAHC-HO-le.}

= 21m {[ O-IC-1A(P - Q)(C-1A)H]

}

+e;eHR-1Ap.(p - Q)AHC-HO-le j

(80)

X(C-1AQAHC)T}.

p.

and

+CHR-1Ar(p - Q)AHC-HG-'}}

(81)

jr

r are real variables. Using the notation

Jp.tt>j

A~ ~ (aC/ aIL)rA

= C~C-IA

(82)

A r ~ (ac/at)rA

= CrC-1A

(83)

=.., Re {tr {A -

~

P~4 H R - lA IJ.PA H R lAPA:R -

(84)

I} }

= 2 Re {tr {jAp.PA HR-1Ce jeJC-1APA HR- f }

-jeJ~HR-IAp.PA HR-1APA He- H ej}

+ CHR-IAp.PA HR-1APA HC- H}

(85)

{Al-tPA HR-1APA fR-1}.

="'Re{tr{A

p.

(86)

+CHR-1Ap.(P - Q)AHC- H }

+cHR-1Ar(p -

+tr { AI£PAHR-IAPA~R-l} }

PA H R - lAe j

J=

(87)

+eJAHR-IAAIJ.PAHR-IAPej} .

JIJ.~

= [A II'

A 22' ••

"

[1]

(88)

Jr~ = 2Re {diag {QAHArQAHA}

J P.Q). = 2 Re {tr

{A

lAo

lA r (p

PA H R -

- Q)} } .

J~~

l-yr

lad>

Jap.

Jar

Jtt>~

J<pa

J 4Jtt>

JtiJp.

Jtt>r

Jp.~

lp.a

Jp.¢

il-t1L

ip.r

ira

J rtP

ir~

i rr

l~cr

(97)

REFERENCES

= 2 Re {diag { QA HA p. QA HA }

+ diag { AH R -

(96)

Remark: If one or more parameters are assumed known (e.g., gain/phase of a reference sensor) the corresponding columns and rows in J must be removed.

+diag {AHR-lAp.PAHR-IAP}}

- Q)} } .

l-y¢

lOla

-;

A MM]'

= 2Re {diag {PAHR-IAI1PAHR-IA}

+ diag { AH R - lAp. (P

Q) AHC- H}}.

.;

J~~

1} +tr{A IJ.PAHR-lAPe } eTAHR}

Introducing the notation diag ( A) we obtain.

(95)

}

The FIM is given by

{AIJ.PAHR-IAeje;PAHR-I}

~

}

Jrt/J = 21m {diag { -C-1AQA HA r QA He

PAHR-1A , j PAHR- 1 }

= 2 Re {eJJ:PA H R - lA

(94)

= 21m {diag {-C-lAQAHA~QAHC

J~r = 2Re {tr {A~PAHR-IArP44HR-I}

= 2Re {tr

(93)

JIJ.4> = 21m {diag{ -C-IAPAHR-IA",.PAHR-IC

+tr {A~,-P44 HR-IAPA7R-l}}.

-

{AI£PAHR-IAPA~R-I}}

= 2 Re {jere-lAPA HR-1AIJ. PA HR-1Ce j}

I}

J rr = 2 Re {tr {A;-PAHR-IArPAHR-I}

JWI}

= 2 Re {tr {A IL PA H R - lAtPj PA H R - I } +tr

Jp.f/Jj

(92)

+tr{-jA lAo PAHR-IAPAHC-He.er:CHR-l} j }

+ tr { A)J. PA H R -

+ tr

(91)

J ra = 2 Re {diag {G - I C- lA QA H A r QA H C

we obtain

J~I-t

(90)

+CHR-1AIL(p - Q)AHe-HG- 1}}

To simplify the analysis we concentrate here on a circulant matrix with only a single coupling coefficient given by

where

}

JI1Q = 2Re [diag {G-IC-lAQAHAILQAHC

H. Derivatives with Respect to Mutual Coupling Coefficient

p.e

j

= 2Re {eJG-lC-JAQAHAp.QAHCej

x(eHR-1C)T - (G-1e-1AQAHC)

C 12 =

11

(89)

fA Q-j PA H R - I }

+tr {AI£PAHR-IAPA~R-I}} 416

[1] A. Paulraj and T. Kailath, "Direction of arrival estimation by eigenstructure methods with unknown sensor gain and phase," in Proc. IEEE ICASSP'85, Tampa, FL, 1985, pp. 640-643.
[2] A. M. Bruckstein, T.-J. Shan, and T. Kailath, "The resolution of overlapping echoes," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, pp. 1357-1367, Dec. 1985.
[3] A. J. Weiss, A. S. Willsky, and B. C. Levy, "Eigenstructure approach for array processing with unknown intensity coefficients," IEEE Trans. Acoust., Speech, Signal Processing, vol. 36, Oct. 1988.
[4] H. Wang and M. Kaveh, "Coherent signal subspace processing for detection and estimation of angles of arrival of multiple wideband sources," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, pp. 823-831, Aug. 1985.
[5] Y. Rockah and P. M. Schultheiss, "Array shape calibration using sources in unknown locations - Part I: Far-field sources," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-35, pp. 286-299, Mar. 1987.
[6] Y. Rockah and P. M. Schultheiss, "Array shape calibration using sources in unknown locations - Part II: Near-field sources and estimator implementation," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-35, pp. 724-735, June 1987.
[7] J. T.-H. Lo and S. L. Marple, Jr., "Eigenstructure methods for array sensor localization," in Proc. IEEE ICASSP 1987, Dallas, TX, 1987, pp. 2260-2263.
[8] R. O. Schmidt, "A signal subspace approach to multiple emitter location and spectral estimation," Ph.D. dissertation, Stanford University, Stanford, CA, 1981.
[9] A. J. Weiss and B. Friedlander, "Array shape calibration using sources in unknown locations - Maximum likelihood approach," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 1958-1966, Dec. 1989.
[10] R. O. Schmidt, "Multilinear array manifold interpolation," Tech. Memo ESL-TM166J, ESL Inc., Sunnyvale, CA, Sept. 1983.
[11] B. D. Steinberg, Principles of Aperture and Array System Design Including Random and Adaptive Array. New York: Wiley, 1976.
[12] B. Friedlander and A. J. Weiss, "Eigenstructure methods for direction finding with sensor gain and phase uncertainty," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, Apr. 1988, pp. 2681-2684. (Also, J. Circuits, Systems and Signal Processing, vol. 9, no. 3, pp. 271-300, 1990.)
[13] B. Friedlander, "A sensitivity analysis of the MUSIC algorithm," IEEE Trans. Acoust., Speech, Signal Processing, vol. 38, pp. 1740-1751, Oct. 1990.

Improving the Performance of a Slotted ALOHA Packet Radio Network with an Adaptive Array

James Ward, Member, IEEE, and R. T. Compton, Jr., Fellow, IEEE

Abstract-The use of an adaptive antenna array is presented as a means to improve the performance of a slotted ALOHA packet radio network. An adaptive array creates a strong capture effect at a packet radio terminal by automatically steering the receive antenna pattern toward one packet and nulling other contending packets in a slot. A special code preamble and randomized arrival times within each slot allow the adaptive array to lock onto one packet in each slot. The throughput and delay performance of a network with an adaptive array is computed by applying the standard Markov chain analysis of slotted ALOHA [1], [2]. It is shown that throughput levels comparable to CSMA are attainable with an adaptive array without the need for stations to be able to hear each other. The performance depends primarily on the number of adaptive array nulls, the array resolution, and the length of the randomization interval within each slot.

I. INTRODUCTION

ALOHA packet radio communication systems are of interest because they provide a simple way of multiplexing many users into a single radio channel. In these systems radio terminals transmit packets to each other whenever they have information to send, regardless of whether other terminals may be transmitting at the same time. Because terminals do not coordinate their transmissions, packets from different terminals frequently collide. A collision destroys all packets involved, and these packets must then be retransmitted after a random delay. Collisions limit the maximum throughput at one receiver in an ALOHA system to 18% if the system is unslotted and to 36% if it is slotted [3].

Because of these low throughputs, much effort has been devoted to finding improved packet radio protocols. One well-known improvement is carrier sense multiple access (CSMA) [4], in which terminals listen to the channel before transmitting to determine if it is busy. If the channel is busy, transmission is delayed until the channel becomes idle. Kleinrock and Tobagi have shown that choosing the retransmission probability carefully in a CSMA system can yield high throughputs [4]. However, the usefulness of CSMA depends on whether all terminals in the network can hear one another. When this is not the case, as in satellite or mobile communications, CSMA is less effective.

Paper approved by the Editor for CATV of the IEEE Communications Society. Manuscript received February 10, 1990. This work was supported in part by the U.S. Army Research Office, Research Triangle Park, NC, and by the Office of Naval Research, Arlington, VA, under Contracts DAAL0389-K-0073 and NOO014-89-J-I007 with The Ohio State University Research Foundation, Columbus, OH. J. Ward was with the ElectroScience Laboratory, The Ohio State University. He is now with M.I.T. Lincoln Laboratory, Lexington, MA 02173. R. T. Compton, Jr., is with the ElectroScience Laboratory, The Ohio State University, Columbus, OH 43212. IEEE Log Number 9106306.

In the standard slotted ALOHA analysis, it is assumed that if two or more packets arrive in the same slot, none of them is received correctly. In reality, the correct reception of a packet depends not only on whether interfering packets are present, but also on the received power of each packet. Roberts [5] first noted that if one of the packets is of much higher received power than the others, it may still be correctly received. This "power capture" effect improves the throughput and delay performance of a packet radio system. Power capture has been studied by Abramson [3] and Namislo [7] when it occurs naturally as a result of different propagation distances from transmitter to receiver and/or channel fading. Lee [6] considered assigning random signal levels to the stations to induce the capture effect.

Also, since the received power from a given direction is proportional to the receiver antenna response in that direction, directional antennas can be used to create the capture effect at the receiver. Binder et al. [8] have considered using directional antennas to resolve potential crosslink conflicts in a multiple satellite packet system. In their work the direction to which an antenna is steered is obtained a priori from a form of scheduling used to set up each communication link. Their scheduling procedure, in addition to providing direction information, also reduces the contention somewhat at the expense of increased packet delay.

In this paper, we examine the use of an adaptive antenna array to create a capture effect and thus improve the performance of a slotted ALOHA system. An adaptive array is an antenna system that controls its own pattern in response to the signal environment [9], [10]. An adaptive array can capture a packet by pointing the peak antenna response toward that packet while simultaneously forming pattern nulls on other interfering packets [11].
An adaptive array can do this automatically without requiring any a priori direction information. Thus, there is no need for prearranged scheduling in a system with an adaptive array and the delay performance should be improved. Furthermore, an adaptive array provides a much stronger capture effect than an ordinary directional antenna, because pattern nulls are placed in the directions of contending packets. We shall show that the use of an adaptive array can provide throughput and delay performance comparable to that of CSMA. Moreover, with an adaptive array there is no need for users to be able to hear each other.

In Section II we describe the communication system we shall consider. Section III gives a brief overview of adaptive arrays. Section IV describes how an adaptive array can acquire the first packet to arrive in a slot while nulling subsequent packets in that slot. In Section V we calculate the throughput and delay performance of a packet system using an adaptive array. Section VI presents numerical results, and Section VII contains our conclusions.

Reprinted from IEEE Transactions on Communications, Vol. 40, No. 2, pp. 292-300, February 1992.

Fig. 1. A single-hop packet radio system.
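The 18% and 36% limits quoted in the Introduction can be checked with a few lines of Python. This is a minimal sketch that assumes the classical infinite-population Poisson traffic model, in which the throughput is G·exp(-2G) for unslotted ALOHA and G·exp(-G) for slotted ALOHA at offered load G. That model and the function names are assumptions made for illustration; they are not taken from this paper.

```python
import math

def slotted_aloha_throughput(g):
    """Throughput G*exp(-G) at offered load g (packets/slot), Poisson model."""
    return g * math.exp(-g)

def unslotted_aloha_throughput(g):
    """Throughput G*exp(-2G) at offered load g, Poisson model."""
    return g * math.exp(-2.0 * g)

# Scan the offered load on a fine grid and keep the best throughput.
loads = [i / 1000.0 for i in range(1, 3001)]
best_slotted = max(slotted_aloha_throughput(g) for g in loads)
best_unslotted = max(unslotted_aloha_throughput(g) for g in loads)

print(f"slotted ALOHA maximum throughput:   {best_slotted:.3f}")
print(f"unslotted ALOHA maximum throughput: {best_unslotted:.3f}")
```

The maxima occur at G = 1 and G = 1/2 and equal 1/e and 1/(2e), which round to the 36% and 18% figures cited from [3].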

Fig. 2. An adaptive array.

II. THE COMMUNICATION SYSTEM MODEL

We consider a simple ALOHA system in which a repeater links a network of radio terminals, as shown in Fig. 1. In this network terminals transmit messages to each other through the repeater. We assume time is slotted and that the network uses a slotted ALOHA packet radio protocol. Transmissions between terminals occur randomly in each time slot. Each terminal transmits a packet in a given slot whenever it has one to send, without regard for whether other terminals may be transmitting in that same slot.

All packets are transmitted to the central repeater, which retransmits them back to the network. The repeater is assumed to be a store-and-forward repeater. It demodulates each packet and checks it for errors. If there are no errors, the packet is retransmitted on the downlink. If there are errors, the packet is discarded. The repeater downlink is on a different frequency than the uplink, so both the repeater and the local terminals can transmit and receive at the same time. Since only the repeater transmits on the downlink, there is no contention on the downlink.

Each terminal monitors all downlink packets. By examining the address contained in each packet, a terminal determines whether it is the intended recipient of that packet. A terminal retains packets addressed to itself and discards others. Moreover, when a terminal transmits a packet of its own over the repeater, it listens for that packet on the downlink to determine if the packet was successfully forwarded. If the packet is not heard on the downlink, it is assumed that the packet suffered a collision on the uplink, and the packet is retransmitted after a delay of some random number of slots.

We assume the receiving antenna at the repeater is an adaptive array.¹ The purpose of the adaptive array is to aim the repeater antenna pattern at the first packet to arrive in each slot and then to null subsequent interfering packets in that slot, to prevent them from destroying the first packet. This technique will allow one packet to be received successfully, even when several packets arrive in the same slot. (In a conventional ALOHA system, all packets are destroyed when a collision occurs.) The method used to form the antenna pattern is described in Section IV below.

Now let us consider this system in more detail. We begin in the next section by reviewing the adaptive array concepts needed.

¹ The transmitting antenna at the repeater is assumed to cover all the users of the network so that each terminal can hear all downlink packets.

III. ADAPTIVE ARRAYS

An adaptive array is an antenna system that controls its own pattern, by means of feedback, while the antenna operates [9], [12], [13]. The signal from each element in an adaptive array is multiplied by a weight and then summed to produce the array output signal. A control system adjusts the weights to maximize the signal-to-interference-plus-noise ratio (SINR) at the array output. After adapting, the pattern of an adaptive array has a beam pointed at the desired signal and has nulls on interfering signals. In a packet radio system, the desired signal is just the first packet in each slot. The interfering signals are the other packets contending for channel access in that slot.

Fig. 2 shows an adaptive array with $N_e$ elements. The signal $\tilde{x}_j(t)$ from element $j$ is multiplied by a weight $w_j$ and then summed to produce the array output signal $\tilde{s}(t)$. The weights are controlled by a feedback system that minimizes the mean-square value of the error signal $\tilde{\varepsilon}(t)$, which is the difference between the array output $\tilde{s}(t)$ and a signal $\tilde{r}(t)$ called the reference signal. The reference signal is a locally generated signal that determines which received signals are retained in the array output and which are nulled. Minimizing the mean-square value of $\tilde{\varepsilon}(t)$ is equivalent (for narrow-band signals) to maximizing the SINR at the array output and causes the array to steer a beam toward any signal correlated with the reference signal and to null any signal uncorrelated with it [9].

It may be shown [9] that the optimal (maximum SINR) array weights are given by

$$W = \Phi^{-1} S \tag{1}$$

where $W$ is the weight vector,

$$W = [w_1, w_2, \ldots, w_{N_e}]^T, \tag{2}$$

$\Phi$ is the covariance matrix,

$$\Phi = E[X^* X^T], \tag{3}$$

and $S$ is the reference correlation vector,

$$S = E[X^* \tilde{r}(t)]. \tag{4}$$

In these equations, $X$ is the signal vector, i.e., a vector containing the element signals,

$$X = [\tilde{x}_1(t), \tilde{x}_2(t), \ldots, \tilde{x}_{N_e}(t)]^T, \tag{5}$$

$E[\cdot]$ denotes expectation, $*$ denotes complex conjugate, and $T$ denotes transpose. The weights in (1) are known as the Wiener weights.

A well-known method of controlling the weights in an adaptive array is the sample matrix inverse technique of Reed, Mallett, and Brennan [14]. In this technique, the element signals are sampled periodically in I and Q (inphase and quadrature) channels and an estimate of the covariance matrix is computed from the sampled signals. If $X(j)$ denotes the value of the signal vector $X$ at the $j$th sample time, the sample covariance matrix is computed from

$$\hat{\Phi} = \sum_{j=1}^{K} X^*(j) X^T(j) \tag{6}$$

where $K$ is the number of samples used. The notation $\hat{\Phi}$ is used to indicate that (6) is an estimate of $\Phi$. Similarly, the reference correlation vector is estimated from

$$\hat{S} = \sum_{j=1}^{K} X^*(j)\, \tilde{r}(j) \tag{7}$$

where $\tilde{r}(j)$ is sample $j$ of the reference signal $\tilde{r}(t)$. The optimal weights are then estimated by solving the system of equations

$$\hat{\Phi} W = \hat{S} \tag{8}$$

for the weight vector. Reed et al. [14] have shown that this technique produces an average SINR within 3 dB of the optimal SINR if the samples $X(j)$ are statistically independent and if the number of samples $K$ is approximately twice the number of array elements.

When several signals are incident on the array, the reference signal $\tilde{r}(t)$ determines which signals are retained in the array output and which are nulled. Any signal correlated with $\tilde{r}(t)$ is retained in the array output and any signal uncorrelated with $\tilde{r}(t)$ is nulled [9]. To use an adaptive array in a communication system, the main challenge is to find a way to obtain a reference signal correlated with the desired signal and uncorrelated with the interference. In Section IV we describe a method for doing this with packets.

An adaptive array has two limitations that are important for this application. The first is that an array with $N_e$ elements has only $N_e - 1$ degrees of freedom in its pattern [9]. Each null or beam maximum formed by the array requires one degree of freedom. In our case, the array needs to form a beam maximum on one packet and nulls on all other packets in a slot. Thus, an $N_e$-element array using one degree of freedom to form the required beam maximum can also form nulls on up to $N = N_e - 2$ packets. When there are more interfering signals than the available degrees of freedom, the array will not be able to null them all [9].

Another limitation of adaptive arrays is that a given array has only a certain ability to resolve signals in space. If the arrival angles of an interfering packet and the desired packet are too close, the array cannot simultaneously null the interference and form a beam on the desired packet. In this case, the array output desired signal-to-noise ratio drops and the adaptive array may not capture the desired packet. To characterize the resolution capability of an adaptive antenna, we define the resolution width $\theta_r$ to be the minimum angular separation between two signals at which the adaptive array can place a pattern maximum on one signal and null the other. The resolution width $\theta_r$ is taken to be $\theta_b/2$, where $\theta_b$ is the beamwidth of the array, i.e., the angular separation between the first nulls on each side of the mainbeam. $\theta_b$ depends primarily on the array aperture size but also to a lesser extent on the element patterns and the number of elements. In the analysis below, we relate the performance of the packet radio system to the number of nulls available and to the resolution capability of the array.

With this background, we now describe a technique for operating an adaptive array in a packet radio system.
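The sample matrix inverse estimate in (6)-(8) can be sketched numerically. The example below is illustrative only: the four-element uniform linear array with half-wavelength spacing, the arrival angles, the signal amplitudes, the noise level, and the helper name `steering` are all assumptions chosen for the sketch, and K = 200 samples is used (rather than the roughly 2·N_e samples discussed above) so that the result is not dominated by estimation noise.

```python
import numpy as np

rng = np.random.default_rng(0)

def steering(n_elem, angle_deg, spacing=0.5):
    """Hypothetical helper: response of a uniform linear array (spacing in wavelengths)."""
    k = np.arange(n_elem)
    return np.exp(2j * np.pi * spacing * k * np.sin(np.deg2rad(angle_deg)))

n_elem, K = 4, 200                       # assumed array size and sample count
a_des = steering(n_elem, 0.0)            # desired packet arrives from 0 degrees
a_int = steering(n_elem, 40.0)           # interfering packet arrives from 40 degrees

ref = rng.choice([-1.0, 1.0], size=K)          # reference samples r(j), matched to the desired code
jam = 3.0 * rng.choice([-1.0, 1.0], size=K)    # stronger interferer with an independent code
noise = 0.1 * (rng.standard_normal((n_elem, K))
               + 1j * rng.standard_normal((n_elem, K)))

# Column j of X is the signal vector X(j) of (5).
X = np.outer(a_des, ref) + np.outer(a_int, jam) + noise

Phi_hat = X.conj() @ X.T                 # (6): sum over j of X*(j) X^T(j)
S_hat = X.conj() @ ref                   # (7): sum over j of X*(j) r(j)
W = np.linalg.solve(Phi_hat, S_hat)      # (8): solve Phi_hat W = S_hat

# The array output is W^T X, so the pattern gain toward a direction is |W^T a|.
gain_des = abs(W @ a_des)
gain_int = abs(W @ a_int)
print(f"gain toward desired packet:     {gain_des:.3f}")
print(f"gain toward interfering packet: {gain_int:.3f}")
```

With these assumed parameters the weights keep the signal that is correlated with the reference and place a deep pattern null on the uncorrelated one, which is the behaviour the acquisition scheme of Section IV relies on.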

IV. ACQUISITION

The main difficulty in using an adaptive array in a packet radio system is the acquisition problem, i.e., the problem of forming the beam on the first packet and nulling subsequent packets in the slot. Each packet to be received by the array will arrive at an unknown time and from an unknown direction. The array must form its pattern on a packet very rapidly, in time to receive the message portion of the packet. To allow an adaptive array to do this, we add a special two-part preamble to the beginning of the packet. The first part of this preamble will be used to trigger the acquisition process, and the second part will be used to form the array pattern on the packet.

Fig. 3 shows the organization of a packet. A packet will be formed by first adding an address preamble to the beginning of a fixed number of message bits, as shown in the top of Fig. 3. The address preamble will identify the destination terminal and may contain other information such as the originating terminal or a packet number. Next, the combined address and message segments will be encoded with an $(n, k)$ linear block code [15], which will be used for error detection at the repeater. Finally, after encoding, an additional two-part preamble will be added to the beginning of the packet. This preamble, called the acquisition preamble, will be used to lock the array pattern on the packet.

The acquisition preamble will consist of two consecutive code sequences, called Codes 1 and 2. Code 1 will be a 13 bit Barker code [16], which has a highly peaked aperiodic autocorrelation function as shown in Fig. 4. Code 2 will be one or more periods of a pseudonoise (PN) code [17]. The periodic autocorrelation function of such a code has a sharp

peak of height $N_c$ at zero shift (and at shifts of any multiple of the code period) and then drops to a constant value of $-1$ for shifts over 1 bit, where $N_c$ is the code period, as shown in Fig. 5.

To allow the packet acquisition, the width of the slot $T_s$ will be made larger than the packet width $T_p$ by an uncertainty interval $T_u$, as shown in Fig. 6. To exploit the autocorrelation properties of the preamble codes, the starting times of packet transmissions from all terminals will be randomized over the interval $T_u$, as in [18]. The uncertainty interval also makes the acquisition process fair (by preventing stations closest to the repeater from always acquiring the repeater first) and gives the designer control over the probability that two packets arrive at almost the same instant.

The adaptive array will operate as follows. At the beginning of each slot, when the repeater is ready to acquire a new packet, the array weights will be set so the array pattern covers all users in the net. Such a pattern is easily obtained by turning one array weight on and the rest off. With one weight on, the array pattern is just the pattern of the element that is turned on. This element pattern will be chosen so it covers the entire net. We call this the uniform coverage mode. In this mode, any user can access the system.

To acquire an incoming packet, we use the following technique. At the array output is a filter matched to Code 1, followed by a threshold detector, and then a reference signal generation circuit, as shown in Fig. 7. Assume first that only one packet arrives during the slot. With the array in its uniform coverage mode, the incoming packet will pass through the array and into the matched filter. The output of this filter will contain a sharp peak at the end of Code 1. This peak will serve as a timing spike to trigger generation of a reference signal during Code 2. The reference signal will be a signal modulated by the same PN code as in Code 2. The timing spike will start the reference signal at the proper time so it is correlated with the received packet during Code 2. The reference signal will continue only during Code 2. The array pattern will be adapted during Code 2. Because the reference signal code is synchronized with Code 2 in the packet, the array will optimize its weights for reception of the packet.² At the end of Code 2, the array weights will be frozen. The array pattern will then be held fixed during the address and message portions of the packet.

² The reference signal does not have to be locked in frequency or phase to the received packet for this process to work. The only requirements are that the PN codes be synchronized to within about one fourth of a code bit, and that the difference between the reference signal frequency and the received signal frequency be less than the reciprocal of the adaptation time [19], [20].

Now suppose two or more packets are received in the same slot. Each of these packets will cause a timing spike at the matched filter output. But only the first timing spike will trigger reference signal generation and begin array adaptation. Timing spikes due to later packets will be ignored by the system, because the acquisition circuit will be designed so that once it has been triggered, it will not trigger again in the same slot. Because the reference signal code will be aligned with Code 2 of the first packet, it will be essentially uncorrelated with the second packet as long as the second packet is at least one bit later than the first. This is so because the autocorrelation function of a PN code has a very low value for shifts of 1 bit or more. (See Fig. 5.) The second packet and all later packets will therefore be regarded as interference by the adaptive array and will be nulled. At the end of Code 2, the array pattern will be optimized for receiving the first packet and will have nulls on later packets.

If the second packet arrives less than one bit after the first, the first two packets will be correlated. The adaptive array will not null the second packet in this case and there will be no throughput. In this case we say that the first packet is not acquired. With the uncertainty interval $T_u$ properly chosen, however, the probability of this event is small. The throughput analysis below takes this possibility into account.

The uncertainty interval $T_u$ and the durations of Codes 1 and 2 will be chosen so that all packets in a given slot begin no later than during the Code 2 preamble of the first packet in the slot. For this reason it is possible to finish adapting the array weights at the end of Code 2 and fix the array pattern during the address and message segments. The adapted pattern at the end of Code 2 will have nulls on the interfering packets, and these will be retained for the rest of the slot.

In the analysis below, we assume that the packet SNR is high enough so that if a packet is present, it is always detected by the acquisition circuitry. We also assume that the possibility of a false alarm, i.e., the triggering of a reference signal without the presence of a corresponding packet, is negligible. We assume the array acquires the first packet to arrive in each slot as long as another packet does not arrive in that slot less than one bit after the first. However, even if a packet is acquired, it may still not be successful. An acquired packet will be unsuccessful in either of two cases:

1) when more interfering packets arrive during a slot than the number of available nulls, or
2) when another packet arrives too close in angle to the acquired packet.

At the end of each slot, the array is reset into its uniform coverage mode, and the acquisition cycle starts over for the next slot. We now consider the throughput and delay performance of a packet radio repeater using an adaptive array with this acquisition technique.

Fig. 3. Packet organization.

Fig. 4. Autocorrelation function of a 13 bit Barker code.

Fig. 5. Autocorrelation function of a PN code.

Fig. 6. Slot width, packet width, and uncertainty interval.

Fig. 7. Packet acquisition circuitry.

V. THROUGHPUT AND DELAY ANALYSIS

To determine the throughput and delay performance, we apply the Markov chain analysis of a slotted ALOHA network [1], [2] to include the effects of the adaptive array and the acquisition process. We consider a finite population of $M$ terminals transmitting to a central repeater equipped with an adaptive array. At the beginning of each slot, each terminal is either blocked or unblocked, depending on whether its previously transmitted packet was unsuccessful or successful. An unblocked terminal transmits a packet with probability $p_n$ in a slot. Only unblocked terminals generate new packets. A blocked terminal retransmits its backlogged packet with probability $p_r$ in each slot until successful, at which time it becomes unblocked and resumes transmitting new packets. Typically, $p_r > p_n$ so that backlogged packets are quickly cleared. At the end of each slot, the downlink transmission provides immediate feedback to the terminals regarding the success of their packets.

Let $X_k$ denote the number of blocked terminals at the beginning of slot $k$. The number of blocked terminals at the end of the slot depends only on the number at the beginning of the slot and the events occurring during the slot. Thus, the time-varying state of the network can be described by a Markov chain, where the state represents the number of blocked terminals. At slot $k$, the state $X_k$ can vary between 0 and $M$. We shall compute the one-step transition matrix $P = [P_{i,j}]$ and then the equilibrium probabilities of the Markov chain describing this system.

In a given slot, there will be a total of $n_t = n_n + n_r$ packets transmitted, where $n_n$ and $n_r$ are the numbers of new and previously backlogged packets transmitted in the slot. Given the state $X_k = i$, $n_n$ and $n_r$ are independent binomial random variables with distributions

$$Q_n(s|i) \triangleq \Pr\{n_n = s \mid X_k = i\} = \binom{M-i}{s} p_n^{s} (1 - p_n)^{M-i-s} \tag{9}$$

$$Q_r(s|i) \triangleq \Pr\{n_r = s \mid X_k = i\} = \binom{i}{s} p_r^{s} (1 - p_r)^{i-s}. \tag{10}$$

Thus, the distribution of the total number of packets per slot is

$$Q_t(l|i) \triangleq \Pr\{n_t = l \mid X_k = i\} = \sum_{s=0}^{l} Q_n(s|i)\, Q_r(l-s|i). \tag{11}$$

Let $P_s(l)$ be the probability that a packet is successful given that $l$ packets are transmitted in the slot. The success probabilities $P_s(l)$, which depend on the adaptive array characteristics and the acquisition parameters, will be determined below. Given $P_s(l)$, the transition probabilities $P_{i,j}$ may be found by enumerating the possible ways that each transition may occur.

• $j < i-1$, $i = 2, \ldots, M$: Not possible, since at most one backlogged packet can be cleared in a slot.

$$P_{i,j} = 0. \tag{12}$$

• $j = i-1$, $i = 1, \ldots, M$:
1) $n_n = 0$, $n_r \ge 1$, and one backlogged packet is successful.

$$P_{i,i-1} = Q_n(0|i) \sum_{l=1}^{i} Q_r(l|i)\, P_s(l). \tag{13}$$

• $j = i+k$, $i = 0, \ldots, M$, $k = 0, \ldots, M-i$:
1) $n_n = k+1$, $n_r \ge 0$, and one packet is successful.
2) $n_n = k$, $n_r \ge 0$, and none of the transmitted packets are successful.

$$P_{i,i+k} = Q_n(k+1|i) \sum_{l=0}^{i} Q_r(l|i)\, P_s(l+k+1) + Q_n(k|i) \sum_{l=0}^{i} Q_r(l|i)\, \bigl(1 - P_s(l+k)\bigr). \tag{14}$$

This Markov chain analysis is similar to that of Namislo [7]. (Namislo determines the success probabilities for a fading environment by using a Monte-Carlo simulation. We will derive them directly for the adaptive array.)

To compute the $P_s(l)$, we first note the distinction between acquired packets and successful packets. An acquired packet is one for which the array acquisition circuitry generates a reference signal that is not correlated with any other packets. Note that for a packet to be successful, it must first be acquired by the array. Once a packet is acquired, it is successful only if the adaptive array can form a beam on the acquired packet and place pattern nulls in the directions of the other contending packets.

Given that there are $l$ packets in a slot, we characterize each packet by an arrival time $t_i$, $i = 1, \ldots, l$, within a slot and an arrival angle $\theta_i$, $i = 1, \ldots, l$. In accordance with the acquisition procedure in Section IV, we assume that the $t_i$ are i.i.d. random variables uniformly distributed on the uncertainty interval $[0, T_u]$ within the slot. We also assume packet arrival angles are i.i.d. random variables (independent of the arrival times) uniformly distributed in azimuth $[0, 2\pi]$ about the central repeater node. Then

$$P_s(l) = P_a(l)\, P_{s|a}(l) \tag{15}$$

where $P_a(l)$ is the probability that a packet is acquired given $l$ packets are incident, and $P_{s|a}(l)$ is the probability that a packet is successful given it is acquired and $l$ packets are present in the slot. The $P_a(l)$ depend on the arrival times and the length of the uncertainty interval, while the $P_{s|a}(l)$ depend on the arrival angles, the resolution capability of the adaptive array, and the number of available nulls.

With the preamble code structure described in Section IV, the first packet in a slot is acquired as long as all subsequent packets in that slot arrive at least one bit duration $T_b$ later than the first packet. If the first packet is not acquired, no packets are acquired for that slot. Thus,

$$P_a(l) = l \Pr\{t_2 > t_1 + T_b,\; t_3 > t_1 + T_b,\; \ldots,\; t_l > t_1 + T_b\} \tag{16}$$

where the factor of $l$ accounts for the fact that any of the $l$ packets transmitted can be the first packet in the slot. If only a single packet is transmitted in a slot, it is acquired, so

$$P_a(1) = 1. \tag{17}$$

For $l \ge 2$, we use the uniform distribution of the transmission times to write

$$P_a(l) = l \int_0^{T_u - T_b} \frac{1}{T_u} \left( \frac{T_u - T_b - t_1}{T_u} \right)^{l-1} dt_1 = \left( 1 - \frac{T_b}{T_u} \right)^{l}. \tag{18}$$

Thus, from (17) and (18),

$$P_a(l) = \begin{cases} 1, & l = 1 \\ \left(1 - \dfrac{1}{u}\right)^{l}, & l > 1 \end{cases} \tag{19}$$

where $u = T_u/T_b$ is the length of the uncertainty interval in bits.

Once a packet is acquired, two conditions must be satisfied for it to be successful. First, there must be no more than $N = N_e - 2$ additional packets transmitted in the slot, because the adaptive array can place pattern nulls in at most $N$ directions. Second, no other packet can arrive from an angle within $\theta_b/2$ of the acquired packet arrival angle. If this happens, the adaptive array will be unable to resolve the acquired and interfering packets and there will be no throughput for the slot. The $P_{s|a}(l)$ may be computed as follows. First, we have

$$P_{s|a}(1) = 1 \tag{20}$$

since with only one packet present there are no other packets to interfere with the acquired packet. Moreover, because the adaptive array has only $N$ nulls, we set

$$P_{s|a}(l) = 0, \quad l > N + 1. \tag{21}$$

To find $P_{s|a}(l)$ for $2 \le l \le N+1$, recall that $\theta_1$ is the arrival angle of the acquired packet and define $D_1 = [\theta_1 - \theta_b/2,\; \theta_1 + \theta_b/2]$. Then

$$P_{s|a}(l) = \Pr\{\theta_2 \notin D_1, \theta_3 \notin D_1, \ldots, \theta_l \notin D_1\} = E_{\theta_1}\bigl[\Pr\{\theta_2 \notin D_1, \ldots, \theta_l \notin D_1 \mid \theta_1\}\bigr] = E_{\theta_1}\left[\prod_{i=2}^{l} \Pr\{\theta_i \notin D_1 \mid \theta_1\}\right] \tag{22}$$

where $E_{\theta_1}[\cdot]$ denotes an expectation over the random variable $\theta_1$, and we have taken advantage of the independence of the arrival angles. However,

$$\Pr\{\theta_i \notin D_1 \mid \theta_1\} = \left(1 - \frac{\theta_b}{2\pi}\right), \tag{23}$$

which is independent of $\theta_1$. Thus, (22) becomes

$$P_{s|a}(l) = \left(1 - \frac{\theta_b}{2\pi}\right)^{l-1}, \quad 2 \le l \le N + 1. \tag{24}$$

Hence, from (15), (19), and (24), the success probabilities are

$$P_s(l) = \begin{cases} 0, & l = 0 \\ 1, & l = 1 \\ \left(1 - \dfrac{1}{u}\right)^{l} \left(1 - \dfrac{\theta_b}{2\pi}\right)^{l-1}, & 2 \le l \le N + 1 \\ 0, & l > N + 1. \end{cases} \tag{25}$$
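The closed form for the success probabilities can be cross-checked against a direct simulation of the assumptions behind it: arrival times uniform over the uncertainty interval, arrival angles uniform in azimuth, acquisition when every later packet starts more than one bit after the first, and success when every other packet is more than half a beamwidth away from the acquired one. The following is a minimal sketch; the function names, the parameter values (l = 3, N = 6, a 10° beamwidth, u = 62), the trial count, and the seed are illustrative assumptions.

```python
import numpy as np

def success_prob(l, n_nulls, theta_b_deg, u):
    """Closed-form P_s(l); theta_b in degrees, u = T_u / T_b."""
    if l == 0 or l > n_nulls + 1:
        return 0.0
    if l == 1:
        return 1.0
    return (1.0 - 1.0 / u) ** l * (1.0 - theta_b_deg / 360.0) ** (l - 1)

def simulate_success(l, n_nulls, theta_b_deg, u, trials=200_000, seed=0):
    """Monte Carlo estimate of P_s(l) under the same modelling assumptions."""
    if l == 0 or l > n_nulls + 1:
        return 0.0
    rng = np.random.default_rng(seed)
    times = rng.uniform(0.0, u, size=(trials, l))        # arrival times in bit durations
    angles = rng.uniform(0.0, 360.0, size=(trials, l))   # arrival angles in degrees
    rows = np.arange(trials)
    first = np.argmin(times, axis=1)                     # index of the first packet
    others = np.ones((trials, l), dtype=bool)
    others[rows, first] = False                          # mask of the contending packets
    # Acquired: every other packet starts more than one bit after the first.
    gap = times - times[rows, first][:, None]
    acquired = ((gap > 1.0) | ~others).all(axis=1)
    # Resolved: every other packet is more than theta_b/2 away in azimuth (circular distance).
    diff = np.abs(angles - angles[rows, first][:, None]) % 360.0
    sep = np.minimum(diff, 360.0 - diff)
    resolved = ((sep > theta_b_deg / 2.0) | ~others).all(axis=1)
    return float(np.mean(acquired & resolved))

exact = success_prob(3, 6, 10.0, 62.0)
estimate = simulate_success(3, 6, 10.0, 62.0)
print(f"closed form: {exact:.4f}   simulation: {estimate:.4f}")
```

Setting `n_nulls = 0` in `success_prob` gives 1 for a single packet and 0 otherwise, which is the ordinary slotted ALOHA collision model.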

Given that the system is in state $j$, the probability of a successful packet transmission is the conditional throughput $S(j)$, given by

$$S(j) = \sum_{l=1}^{M} Q_t(l|j)\, P_s(l). \tag{26}$$

The average number of new packets entering the system in state $j$ is

$$S_{in}(j) = (M - j)\, p_n. \tag{27}$$

The Markov chain described above is irreducible. Since we assumed a finite population, all states are recurrent non-null. The states are also aperiodic. Consequently, this Markov chain has a limiting distribution denoted by

$$\pi = [\pi(0), \pi(1), \ldots, \pi(M)] \tag{28}$$

where

$$\pi(j) = \Pr\{X_\infty = j\} = \lim_{n \to \infty} \Pr\{X_{k+n} = j \mid X_k = i\}. \tag{29}$$

The steady-state probabilities are found by solving the linear system of equations [21]

$$\pi = \pi P \tag{30}$$

along with the constraint that

$$\sum_{j=0}^{M} \pi(j) = 1. \tag{31}$$

Once the $\pi(j)$ are found, they can be used to determine the average throughput, delay, and backlog of the system. Given $\pi(j)$, the average number of blocked terminals $\bar{B}$ is

$$\bar{B} = \sum_{j=0}^{M} j\, \pi(j) \tag{32}$$

and the average throughput is

$$\bar{S} = \sum_{j=0}^{M} S(j)\, \pi(j). \tag{33}$$

In the steady state, the average input rate equals the average throughput, so

$$\bar{S}_{in} = \sum_{j=0}^{M} S_{in}(j)\, \pi(j) = \bar{S}. \tag{34}$$

We use Little's theorem [23] to express the average delay $D$ experienced by a new packet as

$$D = \frac{\bar{B}}{\bar{S}_{in}} = \frac{\bar{B}}{\bar{S}}. \tag{35}$$

We now use these results to examine the performance of a slotted ALOHA system with an adaptive array.

VI. RESULTS

First we examine the conditional throughput $S(j)$ of systems with and without an adaptive array. We consider a network of 50 users. We start with an example where $p_n = 0.002$ and $p_r = 0.2$. For this case, $M p_n = 0.1$, which is a low traffic situation where slotted ALOHA may typically be used. Fig. 8(a) shows the conditional throughput $S(j)$ and the new packet input rate $S_{in}(j)$ versus the state $j$. Curves for various numbers of adaptive array nulls are also shown. For these curves we have $\theta_b = 10°$ and $u = 62$. There is a significant increase in conditional throughput as the adaptive array is added and the number of nulls is increased. Also, note that there is a fixed number of nulls above which little further improvement is gained.

The stability problems of ALOHA systems have been well documented [1], [2], [22]. The finite population ALOHA model is said to be stable if there is a single intersection point of the $S(j)$ and $S_{in}(j)$ curves and this intersection point is in a region of low delay. In Fig. 8 we have intentionally chosen $p_r$ high enough so that the system without an adaptive array is unstable. The curves with an adaptive array are stable. Moreover, for an adaptive array with 4 nulls or more, the value of $p_n$ could be raised substantially without introducing instability. Thus, it is seen that the adaptive array has a stabilizing effect on the system.

Fig. 8. Conditional throughput comparison, plotted against $j$, the number of blocked users. For the curves with an adaptive array: $\theta_b = 10°$, $u = 62$. (a) $M = 50$, $p_n = 0.002$, $p_r = 0.2$. (b) $M = 50$, $p_n = 0.006$. Without the adaptive array, $p_r = 0.1$; $p_r = 0.2545$ with the adaptive array.

In Fig. 8(b) we compare two stable cases for $p_n = 0.006$. In each case we have chosen the largest retransmission probability possible for stable operation. Without the adaptive array, $p_r$ is set to 0.1, which results in an average throughput of $\bar{S} = 0.288$ packets/slot, an average backlog of $\bar{B} = 2.01$ users, and an average delay of $D = 6.98$ slots/packet. The maximum possible average throughput is $M p_n = 0.3$ packets/slot. With an adaptive array, $p_r$ is set to 0.2545, resulting in $\bar{S} = 0.299$, $\bar{B} = 0.24$, and $D = 0.806$ slots. For such low traffic scenarios, the adaptive array provides only a slight increase in throughput but a marked improvement in the delay performance.

The main advantage of using an adaptive array in an ALOHA network is the ability to handle much higher traffic rates and operate at a much higher throughput than is possible in a standard ALOHA system. In Fig. 9, we consider a case with $p_n = 0.018$, so that on average, more than $M p_n = 0.9$ packets (new plus backlogged) are transmitted per slot. We fix $p_r = 0.2$. To have a stable system, the adaptive array needs at least 5 nulls. For $N \ge 5$, the average throughput is 0.8 packets/slot. This example shows how performance can be improved by increasing the adaptive array capabilities. We note that a throughput of 0.8 is comparable to typical values attainable by CSMA [24], and with slotted ALOHA under other capture mechanisms [5]-[7].

Fig. 9. Conditional throughput for a network of 50 users with $p_n = 0.018$ as the number of adaptive array nulls is varied. $\theta_b = 10°$ and $u = 62$.

In general, performance improves as the number of adaptive array nulls increases or as the array beamwidth is reduced. Increasing the number of available nulls allows more collisions to occur without reducing the number of successful packets. Reducing the array beamwidth allows the array to successfully null interfering packets over a larger angular region.
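The Markov chain analysis of Section V can be assembled into a short numerical sketch: build the transition matrix from the binomial transmission counts and the success probabilities, solve for the stationary distribution, and evaluate the average backlog, throughput, and delay. The function names, the least-squares solver, and the two parameter sets are choices made for illustration; the parameters follow the two cases described for Fig. 8(b), and `n_nulls = 0` is used here to model the system without an adaptive array.

```python
import math
import numpy as np

def success_prob(l, n_nulls, theta_b_deg, u):
    """P_s(l); n_nulls = 0 reduces to ordinary slotted ALOHA."""
    if l == 0 or l > n_nulls + 1:
        return 0.0
    if l == 1:
        return 1.0
    return (1 - 1 / u) ** l * (1 - theta_b_deg / 360.0) ** (l - 1)

def binom_pmf(s, n, p):
    """Binomial probability of s transmissions among n terminals."""
    if s < 0 or s > n:
        return 0.0
    return math.comb(n, s) * p ** s * (1 - p) ** (n - s)

def analyze(M, p_n, p_r, n_nulls, theta_b_deg=10.0, u=62.0):
    """Return (P, pi, S_bar, B_bar, D) for a finite population of M terminals."""
    ps = [success_prob(l, n_nulls, theta_b_deg, u) for l in range(M + 2)]
    P = np.zeros((M + 1, M + 1))
    S_cond = np.zeros(M + 1)
    for i in range(M + 1):
        qn = [binom_pmf(s, M - i, p_n) for s in range(M - i + 2)]  # last entry is 0
        qr = [binom_pmf(s, i, p_r) for s in range(i + 1)]
        if i >= 1:
            # One backlogged packet succeeds and no new packet is sent.
            P[i, i - 1] = qn[0] * sum(qr[l] * ps[l] for l in range(1, i + 1))
        for k in range(M - i + 1):
            # Backlog grows by k: k+1 new packets with one success, or k new with none.
            win = qn[k + 1] * sum(qr[l] * ps[l + k + 1] for l in range(i + 1))
            lose = qn[k] * sum(qr[l] * (1 - ps[l + k]) for l in range(i + 1))
            P[i, i + k] = win + lose
        qt = np.convolve(qn[:-1], qr)          # distribution of the total packet count
        S_cond[i] = sum(qt[l] * ps[l] for l in range(M + 1))       # conditional throughput
    # Stationary distribution: pi (P - I) = 0 together with sum(pi) = 1.
    A = np.vstack([P.T - np.eye(M + 1), np.ones((1, M + 1))])
    b = np.zeros(M + 2)
    b[-1] = 1.0
    pi = np.linalg.lstsq(A, b, rcond=None)[0]
    B_bar = float(np.arange(M + 1) @ pi)       # average backlog
    S_bar = float(S_cond @ pi)                 # average throughput
    return P, pi, S_bar, B_bar, B_bar / S_bar  # delay from Little's theorem

P0, pi0, S0, B0, D0 = analyze(50, 0.006, 0.1, n_nulls=0)      # no adaptive array
P6, pi6, S6, B6, D6 = analyze(50, 0.006, 0.2545, n_nulls=6)   # adaptive array, 6 nulls
print(f"no array:   S={S0:.3f}  B={B0:.2f}  D={D0:.2f}")
print(f"with array: S={S6:.3f}  B={B6:.2f}  D={D6:.3f}")
```

Two useful sanity checks on any implementation of this kind are that every row of the transition matrix sums to one and that the computed throughput equals (M - B)·p_n, which is the steady-state balance between input rate and throughput.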
Performance is also improved as the length of the uncertainty interval is increased. (Of course, a longer uncertainty interval requires a longer slot width and reduces the number of message bits transmitted per unit timc.) As the adaptive array capabi lities (resolu tion. number of nulls) are increased, average throughputs close to unity can be approached. The

Fig. 10. Average $S$, $D$, and $B$ performance. Without the adaptive array, $p_r = 0.1$. With the adaptive array, $p_r = 0.2$, $\theta_b = 10^\circ$, $u = 62$. (a) Average delay versus $p_n$. (b) Average throughput versus $p_n$. (c) Average backlog versus $p_n$. [Each panel plots curves for N = 6 and for N = 0 (no adaptive array) against the new transmission probability (packets/slot), from 0.005 to 0.04.]

limiting case of $\theta_b = 0^\circ$, $u = \infty$ ($n = 0$), $N = M - 1$ corresponds to perfect capture, where one packet is successful in every slot in which at least one packet is transmitted.

Fig. 10 compares the average delay, throughput, and backlog performance of systems with and without an adaptive array for the case $\theta_b = 10^\circ$, $u = 62$, and $N = 6$. The retransmission probability $p_r$ is 0.1 without the adaptive array and 0.2 with the array. These curves were obtained by varying $p_n$ and computing $S$, $D$, and $B$ from (32)-(35). We see from Fig. 10(a) that the delay with the adaptive array is always better than without it, and the difference is greater as $p_n$ is increased (as the input traffic is increased). Fig. 10(b) shows the average throughput. For low traffic, both systems are stable and provide nearly the maximum possible throughput. However, the system without the adaptive array becomes unstable at relatively small $p_n$, while the throughput with the adaptive array keeps increasing, to a maximum of near 0.83. Finally, if $p_n$ is increased too far, the system with the adaptive array also becomes saturated and the network becomes highly backlogged. The average backlog for the two cases is shown in Fig. 10(c). Again, these curves indicate that by using an adaptive array, we can achieve acceptable delay at throughput levels that are much higher than are possible in a standard ALOHA system.

VII. CONCLUSION

In this paper we have shown how an adaptive antenna array may be used to improve the performance of a slotted ALOHA packet radio network. The adaptive array creates a capture effect by separating packets in angle and thereby preventing collisions at the receiver. We described how an adaptive array could be used in a slotted system and analyzed the performance of such a system. Typical performance results were presented. It was shown that this technique achieves a performance level comparable to CSMA. Unlike CSMA, however, a slotted ALOHA system with an adaptive array does not require that all users be able to hear each other in order to attain high throughput. The performance is determined primarily by the array resolution, the number of nulls, and the length of the uncertainty interval in each slot.

ACKNOWLEDGMENT

Significant contributions to this work were made by M. Azizoglu, Dr. F. D. Garber, G. M. Huffman, and Dr. H. C. Yu under a previous NASA contract [11] at The Ohio State University. The authors gratefully acknowledge their contributions.

REFERENCES

[1] L. Kleinrock and S. S. Lam, "Packet switching in a multiaccess broadcast channel: Performance evaluation," IEEE Trans. Commun., vol. COM-23, pp. 410-422, Apr. 1975.
[2] A. B. Carleial and M. E. Hellman, "Bistable behavior of ALOHA-type systems," IEEE Trans. Commun., vol. COM-23, pp. 401-410, Apr. 1975.
[3] N. Abramson, "The throughput of packet broadcasting channels," IEEE Trans. Commun., vol. COM-25, pp. 117-128, Jan. 1977.
[4] L. Kleinrock and F. A. Tobagi, "Packet switching in radio channels: Part I: Carrier sense multiple access modes and their throughput-delay characteristics," IEEE Trans. Commun., vol. COM-23, pp. 1400-1416, Dec. 1975.
[5] L. G. Roberts, "ALOHA packet system with and without slots and capture," Comput. Commun. Rev., vol. 5, no. 2, pp. 28-42, Apr. 1975.
[6] C. C. Lee, "Random signal levels for channel access in packet broadcast networks," IEEE J. Select. Areas Commun., vol. SAC-5, pp. 1026-1034, July 1987.
[7] C. Namislo, "Analysis of mobile radio slotted ALOHA networks," IEEE Trans. Vehic. Technol., vol. VT-33, pp. 199-204, Aug. 1984.
[8] R. Binder, S. D. Huffman, I. Gurantz, and P. A. Vena, "Crosslink architectures for a multiple satellite system," Proc. IEEE, vol. 75, pp. 74-82, Jan. 1987.
[9] R. T. Compton, Jr., Adaptive Antennas: Concepts and Performance. Englewood Cliffs, NJ: Prentice-Hall, 1988.
[10] R. A. Monzingo and T. W. Miller, Introduction to Adaptive Arrays. New York: Wiley, 1980.
[11] M. Azizoglu, R. T. Compton, Jr., F. D. Garber, G. M. Huffman, and H. C. Yu, "Adaptive arrays in satellite packet radio communication systems," Final Rep. 718163-1, The Ohio State University, Dep. Elec. Eng., ElectroSci. Lab., Nov. 1987.
[12] S. P. Applebaum, "Adaptive arrays," IEEE Trans. Antennas Propagat., vol. AP-24, pp. 585-598, Sept. 1976.
[13] B. Widrow, P. E. Mantey, L. J. Griffiths, and B. B. Goode, "Adaptive antenna systems," Proc. IEEE, vol. 55, pp. 2143-2159, Dec. 1967.
[14] I. S. Reed, J. D. Mallett, and L. E. Brennan, "Rapid convergence rate in adaptive arrays," IEEE Trans. Aerospace Electron. Syst., vol. AES-10, pp. 853-862, Nov. 1974.
[15] S. Lin and D. J. Costello, Error Control Coding: Fundamentals and Applications. Englewood Cliffs, NJ: Prentice-Hall, 1983.
[16] M. I. Skolnik, Radar Handbook. New York: McGraw-Hill, 1970.
[17] S. W. Golomb, Shift Register Sequences. San Francisco, CA: Holden-Day, 1967.
[18] D. A. Davis and S. A. Gronemeyer, "Performance of slotted ALOHA random access with delay capture and randomized time of arrival," IEEE Trans. Commun., vol. COM-28, pp. 703-710, May 1980.
[19] D. M. DiCarlo and R. T. Compton, Jr., "Reference loop phase shift in adaptive arrays," IEEE Trans. Aerospace Electron. Syst., vol. AES-14, pp. 599-607, July 1978.
[20] D. M. DiCarlo, "Reference loop phase shift in an n-element adaptive array," IEEE Trans. Aerospace Electron. Syst., vol. AES-15, pp. 576-582, July 1979.
[21] E. Cinlar, Introduction to Stochastic Processes. Englewood Cliffs, NJ: Prentice-Hall, 1975.
[22] D. Bertsekas and R. G. Gallager, Data Networks. Englewood Cliffs, NJ: Prentice-Hall, 1987.
[23] J. D. C. Little, "A proof for the queueing formula: $L = \lambda W$," Oper. Res., vol. 9, pp. 383-387, May 1961.
[24] L. Kleinrock, Queueing Systems, Volume II: Computer Applications. New York: Wiley, 1976.

Signal Acquisition and Tracking with Adaptive Arrays in the Digital Mobile Radio System IS-54 with Flat Fading

Jack H. Winters, Senior Member, IEEE

Abstract-This paper considers the dynamic performance of adaptive arrays in wireless communication systems. With an adaptive array, the signals received by multiple antennas are weighted and combined to suppress interference and combat desired signal fading. In these systems, the weight adaptation algorithm must acquire and track the weights even with rapid fading. Here, we consider the performance of the Least-Mean-Square (LMS) and Direct Matrix Inversion (DMI) algorithms in the North American digital mobile radio system IS-54. We show that implementation of these algorithms permits the use of coherent detection, which improves performance by 1 dB over differential detection. Results for two base station antennas with flat Rayleigh fading show that the LMS algorithm has large tracking loss for vehicle speeds above 20 mph, but the DMI algorithm can acquire and track the weights to combat desired signal fading at vehicle speeds up to 60 mph with less than 0.2 dB degradation from ideal performance with differential detection. Similarly, interference is also suppressed, with performance gains over maximal ratio combining within 0.5 dB of the predicted ideal gain.

I. INTRODUCTION

Antenna arrays with optimum combining reduce the effects of multipath fading of the desired signal and suppress interfering signals, thereby increasing both the performance and capacity of wireless systems. To be practical, though, the implemented combining algorithms must be able to rapidly acquire and track the desired and interfering signals.

Most previous theoretical and computer simulation studies of the increase in performance and capacity with optimum combining, e.g., [1]-[6], assumed ideal tracking of the desired and interfering signals. In the computer simulation study where block-by-block adaptation was considered [7], the data rate was at least 5 orders of magnitude greater than the fading rate. Although this is appropriate for the indoor radio system studied in [7], which used kbps data rates at 900 MHz, digital mobile radio systems have a much lower data-to-fading-rate ratio. For example, in the North American digital cellular system IS-54 [8] with a data rate of 24.3 ksymbols/s in the 800 MHz band, at 60 mph the data-to-fading-rate ratio is only 300, while in the Western European GSM [8] it is around 2000. In a previous experiment [4]-[6] that demonstrated the feasibility of optimum combining with a three-fold increase in capacity (suppression of two equal-power interferers with eight antennas), the Least-Mean-Square (LMS) algorithm tracked these signals with data-to-fading-rate ratios as low as 25. However, the tracking error loss could not be measured because of A/D quantization noise. Furthermore, this experimental system had many more antennas than interferers, which is not typical of most wireless systems.

Here we consider the dynamic performance of adaptive arrays in wireless communication systems. Specifically, we consider the performance of the LMS and Direct Matrix Inversion (DMI) algorithms in tracking the desired and interfering signals in the digital mobile radio system IS-54. We show that implementation of these algorithms permits the use of coherent detection, which improves performance by 1 dB over differential detection. Results for two base station antennas and flat Rayleigh fading show that the LMS algorithm has large tracking loss at speeds above 20 mph. However, the DMI algorithm can acquire and track the weights to combat desired signal fading at vehicle speeds up to 60 mph with less than 0.2 dB degradation from the ideal (perfect tracking) performance of optimum combining with differential detection. Similarly, interference is also suppressed, with performance gains over maximal ratio combining within 0.5 dB of the predicted ideal gain.

In Section II, we determine the performance of optimum combining with ideal signal tracking. In Section III we study the performance of the LMS and DMI algorithms for acquisition and tracking of the signals in IS-54. A summary and conclusions are presented in Section IV.

Manuscript received September 8, 1992; revised October 26, 1992. The author is with AT&T Bell Laboratories, Holmdel, NJ 07733. IEEE Log Number 9211016.

Reprinted from IEEE Transactions on Vehicular Technology, Vol. 42, No. 4, pp. 377-384, November 1993.

II. IDEAL PERFORMANCE

A. Weight Equation

Fig. 1 shows a block diagram of an $M$ antenna element adaptive array. The complex baseband signal received by the $i$th element in the $k$th symbol interval, $x_i(k)$, is multiplied by a controllable complex weight $w_i(k)$. The weighted signals are then summed to form the array output $s_o(k)$. The output signal is subtracted from a reference signal $r(k)$ (described in Section III) to form an error signal $\varepsilon(k)$. Weight generation circuitry determines the weights from the received signals and the error signal.

Fig. 1. Block diagram of an M element adaptive array.

In this paper, we are interested in determining the weights that minimize the mean-square error, i.e., $E[|\varepsilon(k)|^2]$. Let the weight vector $\mathbf{w}$ be given by

$$\mathbf{w} = [w_1 \; w_2 \; \cdots \; w_M]^T \tag{1}$$

where the superscript $T$ denotes transpose, and the received signal vector $\mathbf{x}$ is given by

$$\mathbf{x} = [x_1 \; x_2 \; \cdots \; x_M]^T. \tag{2}$$

The received signal consists of desired signal, thermal noise, and interference and, therefore, can be expressed as

$$\mathbf{x} = \mathbf{x}_d + \mathbf{x}_n + \sum_{j=1}^{L} \mathbf{x}_j \tag{3}$$

where $\mathbf{x}_d$, $\mathbf{x}_n$, and $\mathbf{x}_j$ are the received desired signal, noise, and $j$th interfering signal vectors, respectively, and $L$ is the number of interferers. Furthermore, let $s_d(k)$ and $s_j(k)$ be the desired and $j$th interfering signals, with

$$E[s_d^2(k)] = 1 \tag{4}$$

$$E[s_j^2(k)] = 1 \quad \text{for } 1 \le j \le L. \tag{5}$$

Then $\mathbf{x}$ can be expressed as

$$\mathbf{x} = \mathbf{u}_d s_d(k) + \mathbf{x}_n + \sum_{j=1}^{L} \mathbf{u}_j s_j(k) \tag{6}$$

where $\mathbf{u}_d$ and $\mathbf{u}_j$ are the desired and $j$th interfering signal propagation vectors, respectively.

The received signal (desired-signal-plus-interference-plus-noise) correlation matrix is given by

$$\mathbf{R}_{xx} = E\left[\left(\mathbf{x}_d + \mathbf{x}_n + \sum_{j=1}^{L}\mathbf{x}_j\right)^{*}\left(\mathbf{x}_d + \mathbf{x}_n + \sum_{j=1}^{L}\mathbf{x}_j\right)^{T}\right] \tag{7}$$

where the superscript $*$ denotes complex conjugate and the expectation is taken with respect to the signals $s_d(k)$ and $s_j(k)$. Assuming the desired signal, noise, and interfering signals are uncorrelated, the expectation is evaluated to yield

$$\mathbf{R}_{xx} = \mathbf{u}_d^{*}\mathbf{u}_d^{T} + \sigma^2\mathbf{I} + \sum_{j=1}^{L}\mathbf{u}_j^{*}\mathbf{u}_j^{T} \tag{8}$$

where $\sigma^2$ is the noise power and $\mathbf{I}$ is the identity matrix. Note that $\mathbf{R}_{xx}$ varies with the fading and that we have assumed that the fading rate is much less than the bit rate.

We define the received desired signal-to-noise ratio $\rho$ as

$$\rho = \frac{E[|u_{di}|^2]}{\sigma^2}, \quad i = 1, \ldots, M \tag{9}$$

the interference-to-noise ratio (INR) as

$$\mathrm{INR} = \frac{E[|u_{ji}|^2]}{\sigma^2}, \quad i = 1 \text{ to } M, \; j = 1 \text{ to } L \tag{10}$$

and the signal-to-interference-plus-noise ratio (SINR) as

$$\mathrm{SINR} = \frac{\rho}{1 + \mathrm{INR} \cdot L} \tag{11}$$

where $u_{di}$ and $u_{ji}$ are the $i$th elements of $\mathbf{u}_d$ and $\mathbf{u}_j$, respectively, and the expected value now is with respect to the propagation vectors.

The equation for the weights that minimize the mean-square error (and maximize the output SINR) is [9]

$$\mathbf{w} = \mathbf{R}_{xx}^{-1}\mathbf{u}_d^{*} \tag{12}$$

where the superscript $-1$ denotes the inverse of the matrix. Note that scaling of the weights by a constant does not change the output SINR. In (12), we have assumed that $\mathbf{R}_{xx}$ is nonsingular so that $\mathbf{R}_{xx}^{-1}$ exists. If not, we can use pseudoinverse techniques [10] to solve for $\mathbf{w}$. These optimum combining weights are the same as those in [5], as shown in Appendix A.
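As a concrete illustration of (8) and (12), the following Python sketch builds the correlation matrix for a two-element array and one interferer, solves for the weights with the closed-form 2 x 2 inverse, and compares the output SINR with that of maximal ratio combining weights (taken here as $\mathbf{u}_d^{*}$). The function names and the propagation vectors are illustrative assumptions, not values from the paper.

```python
# Sketch of optimum combining for M = 2, following (8) and (12).
# Function names and the example propagation vectors are assumptions.

def correlation_matrix(u_d, interferers, noise_power):
    """R_xx = u_d^* u_d^T + sigma^2 I + sum_j u_j^* u_j^T, as in (8)."""
    m = len(u_d)
    r = [[complex(noise_power) if a == b else 0j for b in range(m)]
         for a in range(m)]
    for u in [u_d] + list(interferers):
        for a in range(m):
            for b in range(m):
                r[a][b] += u[a].conjugate() * u[b]
    return r


def optimum_weights_2(r, u_d):
    """w = R_xx^{-1} u_d^* from (12), using the closed-form 2 x 2 inverse."""
    det = r[0][0] * r[1][1] - r[0][1] * r[1][0]
    c0, c1 = u_d[0].conjugate(), u_d[1].conjugate()
    return [(r[1][1] * c0 - r[0][1] * c1) / det,
            (r[0][0] * c1 - r[1][0] * c0) / det]


def output_sinr(w, u_d, interferers, noise_power):
    """SINR of the array output s_o = w^T x for unit-power signals."""
    def power(u):
        return abs(sum(wi * ui for wi, ui in zip(w, u))) ** 2
    noise = noise_power * sum(abs(wi) ** 2 for wi in w)
    return power(u_d) / (noise + sum(power(u) for u in interferers))


u_d = [1 + 0j, 0.5 + 0.5j]   # desired-signal propagation vector (made up)
u_1 = [2 + 0j, -1 + 1j]      # single interferer propagation vector (made up)
sigma2 = 1.0                 # noise power

w_opt = optimum_weights_2(correlation_matrix(u_d, [u_1], sigma2), u_d)
w_mrc = [u.conjugate() for u in u_d]  # maximal ratio combining weights

print("optimum combining SINR:", output_sinr(w_opt, u_d, [u_1], sigma2))
print("maximal ratio SINR:    ", output_sinr(w_mrc, u_d, [u_1], sigma2))
```

Multiplying `w_opt` by any nonzero complex constant leaves `output_sinr` unchanged, which is the scaling property noted after (12).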

B. Optimum Combiner Performance

We determine the performance of ideal optimum combining in the digital mobile radio system IS-54 in the following manner. We first determine the bit error rate (BER) with the IS-54 modulation technique, $\pi/4$-shifted differential quadrature phase shift keying (DQPSK), for ideal maximal ratio combining. With maximal ratio combining, the received signals are combined to maximize the signal-to-noise ratio at the array output, which is the optimum combining algorithm when interference is not present. Analytical results are presented for both differential detection and coherent detection, since both cases are studied in Section III. We then determine the reduction in the receive SINR required for a given BER with optimum combining as compared to maximal ratio combining when interference is present. This gain with optimum combining is determined using analytical results with one interferer and Monte Carlo simulation with $L \ge 2$. Although these gains are generated only for coherent detection of binary PSK (BPSK), these results are also applicable to both coherent and differential detection of DQPSK, but at different BER's. This is because, for a given receive SINR, the output SINR with optimum combining is independent of the modulation and detection technique, as can be seen from the equations in Section II-A. Thus the gain with optimum combining and BPSK at a given receive SINR will be similar to that of DQPSK at the same receive SINR; only the corresponding BER will be different. Note that IS-54 specifies a maximum BER of $2 \times 10^{-2}$, and, therefore, our results are generated for BER's around $10^{-2}$.


With coherent detection of DQPSK and maximal ratio combining, the average BER with flat Rayleigh fading is approximately given by

$$\overline{\mathrm{BER}} = 2P_E - 2P_E^2 \tag{13}$$

where (from [11])

$$P_E = 2^{-M}\left(1 - \sqrt{\frac{\rho}{2+\rho}}\right)^{M}\sum_{k=0}^{M-1}\binom{M-1+k}{k}\,2^{-k}\left(1 + \sqrt{\frac{\rho}{2+\rho}}\right)^{k}. \tag{14}$$

With differential detection of DQPSK and maximal ratio combining, the average BER with flat Rayleigh fading can be shown to be given by

$$\overline{\mathrm{BER}} = \int_0^{\infty} p(\gamma)\,P_E(\gamma)\,d\gamma \tag{15}$$

where $P_E(\gamma)$ is given by (from [12])

$$P_E(\gamma) = e^{-\gamma}\left[\frac{1}{2}I_0\!\left(\gamma/\sqrt{2}\right) + \sum_{k=1}^{\infty}\left(\sqrt{2}-1\right)^{k} I_k\!\left(\gamma/\sqrt{2}\right)\right] \tag{16}$$

where $I_k$ is the $k$th order modified Bessel function of the first kind, and $p(\gamma)$ is the probability density of the signal-to-noise ratio after maximal ratio combining and is given by [13]

$$p(\gamma) = \frac{\gamma^{M-1}e^{-\gamma/\rho}}{\rho^{M}(M-1)!}. \tag{17}$$

Note that with differential detection of differential BPSK (DBPSK), the average BER is $\frac{1}{2}(1+\rho)^{-M}$.

Fig. 2 shows the average bit error rate versus $\rho$ (SINR with INR $= -\infty$ dB) for DQPSK and $M = 1, 2$ with maximal ratio combining. Results are also shown for DBPSK, which requires 3 dB lower $\rho$ for the same BER with coherent detection. Note that for $M = 1$, with both DQPSK and DBPSK, differential detection requires a 0.4 dB higher $\rho$ for a given BER than coherent detection. For $M = 2$, differential detection of DBPSK requires a 0.7 dB higher $\rho$ than coherent detection, while differential detection of DQPSK requires a 1.0 dB higher $\rho$ than coherent detection. Differential detection of DQPSK requires a 11.2 dB SINR for a $10^{-2}$ BER.

Fig. 2. Average BER versus $E_s/N_0$ for coherent and differential detection of DQPSK and DBPSK with M = 1 and 2 with flat Rayleigh fading and maximal ratio combining.

Now, let us consider the BER with ideal optimum combining. The BER with optimum combining and flat Rayleigh fading in the presence of noise only is given by the results above for maximal ratio combining. With one interferer that also experiences flat Rayleigh fading, the BER for coherent detection of BPSK is given by [1, eq. (25)]. For multiple interferers with flat Rayleigh fading, this BER can be determined by Monte Carlo simulation as described in [1].

Fig. 3 shows the gain in dB of ideal optimum combining over maximal ratio combining with two antennas for one to six interferers versus the interference-to-noise ratio (INR). This gain was determined from the reduction in the required SINR for a $10^{-3}$ BER at the receiver with coherent detection of BPSK. The gain occurs because optimum combining is suppressing interference in addition to increasing desired signal-to-noise ratio. A $10^{-3}$ BER was chosen because the 11.1 dB SINR required for a $10^{-3}$ BER with maximal ratio combining and coherent detection of BPSK [11], [14] is close to the 11.2 dB SINR required for a $10^{-2}$ BER with maximal ratio combining and differential detection of DQPSK. Thus Fig. 3 also shows the gain for a $10^{-2}$ BER with differential detection of DQPSK. As shown in [1], the gain does not vary significantly for BER's between $10^{-2}$ and $10^{-3}$.

Fig. 3. Gain in dB of ideal optimum combining over maximal ratio combining with two antennas for one to six interferers versus INR at a $10^{-3}$ BER for coherent detection of BPSK.

With two antennas, optimum combining can completely suppress one interferer. Thus the maximum gain with optimum combining and one interferer is $10\log_{10}(10^{\mathrm{INR}/10} + 1)$ dB, which is approximately INR for large INR. However, this gain can only be achieved without desired signal fading. With fading, as shown in [4]-[6], the complete suppression of one interferer results in the loss of one order of diversity against multipath fading, which corresponds to a 12.9 dB increase in the SINR required at a $10^{-3}$ BER [11], [14]. Thus to achieve gain, optimum combining must trade off a partial loss in diversity improvement for partial interference suppression. The resulting gain is approximately half (in dB) the maximum gain possible without desired signal fading. With more than one interferer and two receive antennas, the gain is seen to be much lower. However, the gain is almost 1 dB even with six interferers.

III. PERFORMANCE OF LMS AND DMI IN IS-54

In the digital mobile radio system IS-54, the frequency reuse factor (number of channel frequency sets) is 7. However, as shown in [6], it may be possible to reduce the frequency reuse factor to 4 (nearly doubling the system capacity) through the use of optimum combining of the signals from the two existing receive base station antennas. For this result in [6], though, we assumed ideal optimum combining, i.e., perfect tracking of the desired and interfering signals by the combining algorithm at the base station. Below, we consider the dynamic performance of optimum combining in IS-54.
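The ideal-performance figures quoted in Section II-B, which serve as the baseline for the dynamic results that follow, can be spot-checked numerically. The Python sketch below evaluates the closed-form average error probability of coherent BPSK with $M$-branch maximal ratio combining in flat Rayleigh fading, uses it at $\rho/2$ for coherent detection of DQPSK (one reading of (13)-(14), taken here as an assumption), and also evaluates the DBPSK expression and the maximum one-interferer gain. All function names are hypothetical.

```python
import math


def pe_coherent_mrc(snr, m):
    """Average error probability of coherent BPSK with M-branch maximal
    ratio combining in flat Rayleigh fading; snr is the per-branch SNR."""
    mu = math.sqrt(snr / (1.0 + snr))
    total = sum(math.comb(m - 1 + k, k) * ((1.0 + mu) / 2.0) ** k
                for k in range(m))
    return ((1.0 - mu) / 2.0) ** m * total


def ber_coherent_dqpsk(rho, m):
    """BER = 2 P_E - 2 P_E^2 as in (13), with P_E evaluated at rho / 2.
    The rho / 2 scaling is an assumption about how (14) uses rho."""
    pe = pe_coherent_mrc(rho / 2.0, m)
    return 2.0 * pe - 2.0 * pe * pe


def ber_differential_dbpsk(rho, m):
    """Average BER of differentially detected DBPSK: (1/2)(1 + rho)^(-M)."""
    return 0.5 * (1.0 + rho) ** (-m)


def max_gain_one_interferer_db(inr_db):
    """Maximum gain 10 log10(10^(INR/10) + 1) dB, without desired-signal fading."""
    return 10.0 * math.log10(10.0 ** (inr_db / 10.0) + 1.0)


def from_db(x_db):
    """Convert a ratio in dB to a linear ratio."""
    return 10.0 ** (x_db / 10.0)


# Coherent BPSK, M = 2, at the 11.1 dB quoted for a 1e-3 BER.
print("BPSK, 11.1 dB:", pe_coherent_mrc(from_db(11.1), 2))
# Coherent DQPSK, M = 2, 1.0 dB below the 11.2 dB quoted for differential detection.
print("DQPSK, 10.2 dB:", ber_coherent_dqpsk(from_db(10.2), 2))
print("max gain at INR = 10 dB:", max_gain_one_interferer_db(10.0))
```

Under these assumptions the first value comes out close to $10^{-3}$ and the second close to $10^{-2}$, in line with the 11.1 dB and 1.0 dB figures in the text.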

A. Weight Generation

The weights can be calculated by a number of techniques. Here, we will consider two techniques: the Least-Mean-Square (LMS) and the Direct Matrix Inversion (DMI) algorithms [9]. For digital implementation of the LMS algorithm, the weight update equation is given by

$$\mathbf{w}(k+1) = \mathbf{w}(k) + \mu\,\mathbf{x}^{*}(k)\,\varepsilon(k) \tag{18}$$

where $\mu$ is a constant adjustment factor, $\mathbf{x}(k)$ is the received signal vector in the $k$th bit interval, and the error is given by

$$\varepsilon(k) = r(k) - s_o(k) \tag{19}$$

where

$$s_o(k) = \mathbf{w}^{T}\mathbf{x}(k). \tag{20}$$

With DMI, the weights are given by [9]

$$\mathbf{w} = \hat{\mathbf{R}}_{xx}^{-1}\,\hat{\mathbf{r}}_{xd} \tag{21}$$

where the estimated receive signal correlation matrix is given by

$$\hat{\mathbf{R}}_{xx} = \frac{1}{K}\sum_{j=1}^{K}\mathbf{x}^{*}(j)\,\mathbf{x}^{T}(j) \tag{22}$$

where $K$ is the number of samples used, and the estimated reference signal correlation vector is given by

$$\hat{\mathbf{r}}_{xd} = \frac{1}{K}\sum_{j=1}^{K}\mathbf{x}^{*}(j)\,r(j). \tag{23}$$

Note that, as before, we have assumed that $\hat{\mathbf{R}}_{xx}$ is nonsingular. If not, pseudoinverse techniques can be used [10].

The LMS algorithm is the least computationally complex weight adaptation algorithm. However, the rate of convergence to the optimum weights depends on the eigenvalues of $\mathbf{R}_{xx}$, i.e., on the power of the desired and interfering signals [9]. Thus weaker interference will be acquired and tracked at a slower rate than the desired signal, and the desired signal will be tracked at a slower rate during a fade (when accurate tracking is most important).

The DMI algorithm is the most computationally complex algorithm because it involves matrix inversion. However, DMI has the fastest convergence, and the rate of convergence is independent of the eigenvalues of $\mathbf{R}_{xx}$, i.e., signal power levels. One issue with the DMI algorithm is its modification for tracking time-varying signals. Here we consider calculating the weights at each symbol interval using one of two data weighting functions: 1) a sliding window (fixed $K$ in (22) and (23)) and 2) an exponential forgetting function on $\hat{\mathbf{R}}_{xx}$ and $\hat{\mathbf{r}}_{xd}$, namely,

$$\hat{\mathbf{R}}_{xx}(k+1) = \beta\,\hat{\mathbf{R}}_{xx}(k) + \mathbf{x}^{*}(k)\,\mathbf{x}^{T}(k) \tag{24}$$

$$\hat{\mathbf{r}}_{xd}(k+1) = \beta\,\hat{\mathbf{r}}_{xd}(k) + \mathbf{x}^{*}(k)\,r(k) \tag{25}$$

where $\beta$ is the forgetting factor.

For $M = 2$, the DMI algorithm has about the same computational complexity as the LMS algorithm. In particular, weight calculation from the inversion of the 2 x 2 correlation matrix (21) does not even require division by the determinant, since this is only a weight scale factor that does not affect the output SINR. For larger $M$, since the complexity of matrix inversion grows with $M^3$ (versus $M$ for LMS), DMI becomes very computation intensive. However, the matrix inversion can be avoided by using recursive techniques based on least-square estimation or Kalman filtering methods [9], which greatly reduce complexity (to the order of $M^2$) but have performance that is similar to DMI [9]. Similarly, pseudoinverse techniques [10] can be used if $\mathbf{R}_{xx}^{-1}$ does not exist. Therefore, our performance results for DMI should also apply to these recursive techniques.

Next, consider reference signal generation. Since this signal is used by the adaptive array to distinguish between the desired and interfering signals, it must be correlated with the desired signal and uncorrelated with any interference. Now, the digital mobile radio system IS-54 [8] uses time division multiple access (TDMA), with three user signals in each channel and each user transmitting two blocks of 162 symbols in each frame. For mobile to base transmission, each block includes a 14-symbol synchronization sequence starting at the 15th symbol. This sequence is common to all users in a given time slot (block), but is different for each of the six time slots per frame. Since base stations operate asynchronously, signals from other cells have a high probability of having different timing (since there are 972 symbols per frame) and being uncorrelated with the sequence in the desired signal. Thus, as proposed in [6], for weight acquisition we will use the known 14-symbol synchronization sequence as the reference signal. DMI is used to determine the initial weights using this sequence, since accurate initial weights are required. Note that the weights must be reacquired for each block, because with a 24.3 ksymbols/s data rate and fading rates as high as 81 Hz, the fading may change completely between blocks received from a given user.

After weight acquisition, the output signal consists mainly of the desired signal, and (during proper operation) the data is detected with a bit error rate that is not more than $10^{-2}$ to $10^{-1}$. Thus we can use the detected data as the reference signal, using either the LMS or DMI algorithm for tracking.¹ In our simulation results shown below, we did not consider the effect of data errors on the reference signal; i.e., the reference signal symbols were the same as the transmitted symbols.

Note that since the modulation technique is DQPSK, the error of interest is only the relative phase between adjacent symbols, rather than the error vector $r(k) - s_o(k)$ in (19). Indeed, the LMS algorithm can use the phase error of each symbol, i.e., $\angle r(k) - \angle s_o(k)$, where $\angle y$ is the phase of $y$, as the error signal.² This results in no amplitude control of $s_o(k)$, but the amplitude is not used for DQPSK detection anyway. However, we found better tracking with the error vector (19) and, therefore, used (19) for our results shown below. Note that with the DMI algorithm we do not have the option of using the phase error; we must use the error vector (19).
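The two weight generation techniques can be contrasted in a few lines of code. The following Python sketch implements one LMS step per (18)-(20), sliding-window DMI per (21)-(23) for $M = 2$, and the exponential forgetting update per (24)-(25). It then runs both algorithms on a noise-free toy signal with one interferer. The propagation vectors, the unit-modulus symbol sequences, the step size, and all function names are assumptions chosen for illustration; this is not the IS-54 simulation described in the paper.

```python
# Sketch of the LMS update (18)-(20) and sliding-window DMI (21)-(23), M = 2.
# Names, propagation vectors, and symbol sequences are illustrative assumptions.

def array_output(w, x):
    """s_o(k) = w^T x(k), as in (20)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def lms_update(w, x, r, mu):
    """One LMS step: eps = r - s_o from (19), then w + mu x^* eps from (18)."""
    eps = r - array_output(w, x)
    return [wi + mu * xi.conjugate() * eps for wi, xi in zip(w, x)]


def dmi_weights_2(samples, refs):
    """w = R^-1 r_xd from (21), with R and r_xd averaged over K samples."""
    k = len(samples)
    r = [[sum(x[a].conjugate() * x[b] for x in samples) / k for b in range(2)]
         for a in range(2)]
    rxd = [sum(x[a].conjugate() * d for x, d in zip(samples, refs)) / k
           for a in range(2)]
    det = r[0][0] * r[1][1] - r[0][1] * r[1][0]
    return [(r[1][1] * rxd[0] - r[0][1] * rxd[1]) / det,
            (r[0][0] * rxd[1] - r[1][0] * rxd[0]) / det]


def forgetting_update(r, rxd, x, ref, beta):
    """Exponential forgetting per (24)-(25); returns the updated R and r_xd."""
    new_r = [[beta * r[a][b] + x[a].conjugate() * x[b] for b in range(2)]
             for a in range(2)]
    new_rxd = [beta * rxd[a] + x[a].conjugate() * ref for a in range(2)]
    return new_r, new_rxd


QPSK = [1 + 0j, 1j, -1 + 0j, -1j]
u_d = [1 + 0j, 0.3 - 0.8j]        # desired propagation vector (made up)
u_1 = [0.5 + 0.5j, -0.7 + 0.2j]   # interferer propagation vector (made up)


def received(j):
    """Noise-free x(j) = u_d s_d(j) + u_1 s_1(j), plus the reference s_d(j)."""
    s_d, s_1 = QPSK[j % 4], QPSK[(j // 2) % 4]
    return [a * s_d + b * s_1 for a, b in zip(u_d, u_1)], s_d


# DMI over a 14-symbol window, mirroring the length of the sync sequence.
window = [received(j) for j in range(14)]
w_dmi = dmi_weights_2([x for x, _ in window], [d for _, d in window])

# LMS started from zero weights and run over many symbols.
w_lms = [0j, 0j]
for j in range(1000):
    x, d = received(j)
    w_lms = lms_update(w_lms, x, d, mu=0.05)

print("DMI response to desired / interferer:",
      abs(array_output(w_dmi, u_d)), abs(array_output(w_dmi, u_1)))
print("LMS response to desired / interferer:",
      abs(array_output(w_lms, u_d)), abs(array_output(w_lms, u_1)))
```

In this noise-free toy case, DMI passes the desired signal with unit gain and nulls the interferer using only the 14-symbol window, while LMS needs many more symbols to approach the same solution, which echoes the convergence discussion above.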

¹ We do tracking in each block (starting from the synchronization sequence) in the forward direction for symbols 29 to 162, and in the reverse direction for symbols 14 to 1.

² Since DQPSK also has constant amplitude, the constant modulus algorithm can also be used to generate an error signal, i.e., $\varepsilon(k) = s_o(k) - s_o(k)/|s_o(k)|$, for the LMS algorithm, as shown in [15]. A reference signal is, therefore, not needed, but this means that the receiver can acquire and track an interfering signal rather than the desired signal, and, therefore, the algorithm cannot be used for optimum combining when interference is present.

B. Results

To determine the performance of the acquisition and tracking algorithms in IS-54, we used IS-54 computer simulation programs written by S. R. Huszar and N. Seshadri. We modified the transmitter, fading simulator, and receiver programs for flat Rayleigh fading with one interferer and added our optimum combining algorithms with both coherent and differential detection. Specifically, the transmitted desired signal consisted of blocks of 162 symbols with $\pi/4$-shifted DQPSK modulation. The symbols in each block were randomly generated 2-bit symbols for symbols 1-14 and 29-162, and a synchronization sequence for symbols 15-28. This signal, sampled at 8 times the symbol rate, was filtered by a square root cosine rolloff filter with a rolloff factor of 0.35. For the interfering signal, randomly generated symbols, independent of the desired signal symbols, were used for the data, and a synchronization sequence that is orthogonal to that of the desired signal was used for symbols 15-28. The relative timing of the interfering and desired signals was adjustable in increments of 1/8 of the symbol duration. Independent, flat Rayleigh fading for each signal at the two receive antennas was generated by multiplying each signal by a complex Gaussian random number, which varied at the fading rate [13]. The received signals were then weighted, combined, and filtered by a square root cosine rolloff filter, followed by coherent or differential detection.

Let us first consider the performance with DMI for acquisition and LMS for tracking with differential detection without interference. Fig. 4 shows the BER versus SINR for vehicle speeds of 0, 20, and 60 mph at 900 MHz, corresponding to fading rates of 0, 27, and 81 Hz. Computer simulation results are shown for the BER over 178 blocks (approximately 28000 symbols, which should be adequate for BER $> 10^{-3}$), along with theoretical results for maximal ratio combining (15). At 0 mph, the fading channel was constant over each block, but independent between blocks. Also, LMS tracking was not used at 0 mph, and thus the results show the accuracy of weight acquisition by DMI. DMI is seen to have less than 1-dB implementation loss for BER's between $10^{-3}$ and $10^{-1}$. At 20 and 60 mph, the tracking performance of the LMS algorithm is poor. For SINR below 14 dB, the LMS algorithm tracks so poorly that the best BER is obtained with $\mu = 0$, i.e., if LMS tracking is not used. This lack of tracking causes little degradation at 20 mph, but a several dB loss in performance at 60 mph. For SINR above 14 dB, the LMS algorithm improves performance, with the best $\mu$ equal to 0.08. At 20 mph, the performance with the LMS algorithm is about the same as at 0 mph. However, at 60 mph there is a 4.2-dB implementation loss at $10^{-2}$ BER. Thus the LMS algorithm is not satisfactory for optimum combining in IS-54.³

³ In [15] it was shown that the LMS algorithm was satisfactory for diversity combining and equalization using the constant modulus algorithm for error signal generation in GSM, with data-to-fading-rate ratios as low as 1700. However, as mentioned before, this technique cannot distinguish between the desired and interfering signals.

Fig. 4. BER versus SINR for vehicle speeds of 0, 20, and 60 mph with DMI for acquisition and LMS for tracking.

Next, consider DMI for both acquisition and tracking with differential detection without interference. Fig. 5 shows the average BER versus SINR with DMI and vehicle speeds of 0 and 60 mph. For these results, we used DMI with a 14-symbol sliding window ($K = 14$ in (22) and (23)), which gave us the best results for a $10^{-2}$ BER at 60 mph. At this BER, DMI has a negligible increase in implementation loss at 60 mph as compared to 0 mph.

Fig. 5. BER versus SINR for vehicle speeds of 0 and 60 mph with DMI for acquisition and tracking.

Although differential detection is typically used in mobile radio because of phase tracking problems, we can also use coherent detection with optimum combining. This is because optimum combining requires coherent combining of the received signals, which means that the weights must track the received signal phase, and the array output signal phase should match the phase of the coherent reference signal. Thus coherent detection of the array output is possible, which, as shown in Section II, decreases the required SINR for a $10^{-2}$ BER by 1.0 dB with ideal phase tracking. With the LMS algorithm, however, tracking is so poor that coherent detection is worse than differential detection. On the other hand, with the DMI algorithm, there is improvement with coherent detection. Fig. 5 shows that coherent detection decreases the required SINR for a $10^{-2}$ BER by 1 dB, resulting in performance that is 0.3 dB better than the theoretical performance of differential detection (but 0.7 dB worse than ideal coherent detection). At 60 mph, the performance degrades by an additional 0.5 dB; i.e., the performance is 0.2 dB worse than ideal differential detection (and 1.2 dB worse than ideal coherent detection). Thus the use of coherent rather than differential detection cancels most of the implementation loss of DMI at 60 mph.

Finally, consider the dynamic performance of optimum combining for interference suppression. For the results shown below, the symbol timing for the desired and interfering signals was the same. Our results showed that this was the worst case, since there was a slight improvement in performance with timing offset between the two signals (see below). With the LMS algorithm, even at 20 mph the performance does not improve with the INR, showing that the algorithm is not accurately tracking the interferer. However, with DMI, the performance improvement with INR agrees with ideal tracking results. Fig. 6 shows the average BER versus SINR at 0 mph with one interferer with INR $= -\infty$, 0, 3, 6, and 10 dB. DMI with a 14-symbol sliding window and coherent detection was used as before.
4Note that this is significant in comparison to the 3.6 dB gain with optimum comb ining in IS-54 with 2 receive antennas [6], Also , it is almost half of the 2.5 dB gain needed for a frequenc y reuse factor of 3 rather than 4 (and an additional 33% capacity increase).
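The fading rates quoted above follow from the maximum Doppler shift f_d = v f_c / c. The sketch below reproduces them, together with the data-to-fading-rate ratio of about 300 at 60 mph. The 24.3 ksymbol/s IS-54 symbol rate and the helper name `doppler_hz` are assumptions that do not appear in this section.

```python
# Illustrative check of the quoted fading rates (not the authors' code).
C_LIGHT = 299_792_458.0   # speed of light, m/s
MPH_TO_MPS = 0.44704      # miles per hour to metres per second
SYMBOL_RATE = 24_300.0    # assumed IS-54 symbol rate, symbols/s


def doppler_hz(speed_mph: float, carrier_hz: float) -> float:
    """Maximum Doppler shift f_d = v * f_c / c for a vehicle speed in mph."""
    return speed_mph * MPH_TO_MPS * carrier_hz / C_LIGHT


if __name__ == "__main__":
    for mph in (0, 20, 60):
        fd = doppler_hz(mph, 900e6)
        ratio = SYMBOL_RATE / fd if fd > 0 else float("inf")
        print(f"{mph:2d} mph: fading rate {fd:5.1f} Hz, "
              f"data-to-fading-rate ratio {ratio:.0f}")
```

The ratio is the number of symbols transmitted per Doppler cycle, which is one way to read why a tracking algorithm has far fewer symbols to average over at 60 mph than at 20 mph.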

Fig. 6. BER versus SINR with one interferer for a vehicle speed of 0 mph with DMI for acquisition and tracking. [Plot: BER versus SINR (dB); curves for ideal maximal ratio combining and several INR values.]

Fig. 7. BER versus SINR with one interferer for a vehicle speed of 60 mph with DMI for acquisition and tracking, and K = 14. [Plot: BER versus SINR (dB); curves for ideal maximal ratio combining and several INR values.]

The required SINR for a 10^-2 BER is 10.2, 9.5, 8.6, and 6.5 dB for INR = 0, 3, 6, and 10 dB, respectively, which is within 0.5 dB of the predicted gain shown in Fig. 3. Fig. 7 shows the average BER versus SINR at 60 mph with one interferer. Again, a 14-symbol sliding window was used, since this gave the best results at a 10^-2 BER. At a 10^-2 BER these results show a gain with INR that is within 0.5 dB of the gain shown in Fig. 3. The implementation loss increases with the SINR, though, resulting in poor performance at a 10^-3 BER with K = 14. However, note that the optimum window size for a given BER is determined by a tradeoff of two effects. As the window size decreases, the weights have more error due to the averaging of fewer samples, but less error caused by channel variation over the window. Our results showed that as SINR increases, the performance is improved by decreasing K.

Fig. 8. BER versus SINR with one interferer for a vehicle speed of 60 mph with DMI for acquisition and tracking, and K = 7. [Plot: BER versus SINR (dB); curves for ideal maximal ratio combining, noise only, and several INR values.]

Fig. 9. BER versus SINR with one interferer for a vehicle speed of 60 mph with DMI for acquisition and tracking, and exponential weighting with β = 0.675. [Plot: BER versus SINR (dB); curves for ideal maximal ratio combining, noise only, and several INR values.]

Fig. 8 shows the performance with K = 7, which gave the best results at a 10^-3 BER. At this BER, with interference, the improvement of optimum combining is seen to be close to that with K = 14 at a 10^-2 BER (Fig. 7). Furthermore, with K = 7 at a 10^-2 BER, the improvement with interference is similar to that with K = 14. However, with noise only, the BER for a given SINR is higher with K = 7 than with K = 14, because fewer samples are averaged to determine the weights.

Fig. 9 shows the performance of DMI with exponential weighting for β = 0.675. This β gave the best results for BER = 10^-2 and 10^-3. With noise only, the BER is seen to be lower than with either K = 14 or 7, and at a 10^-2 BER the performance is close to that of ideal maximal ratio combining with coherent detection (i.e., 1.0 dB lower SINR than the curve shown for ideal maximal ratio combining with differential detection). With interference at a 10^-2 BER, the gain with optimum combining is close to the predicted ideal gain; i.e., the performance is slightly better than DMI with a sliding window and K = 14. However, at a 10^-3 BER with interference, the performance is slightly worse than that shown in Fig. 8 with K = 7. Thus either the sliding window or the exponential weighting technique can be used to generate the optimum combining weights accurately, even at 60 mph.

Finally, Fig. 10 shows the effect of timing offset between the desired and interfering signals. Results were generated for a 10 dB SINR at 60 mph with K = 14, as in Fig. 7. These results show that the BER varies with timing offset by less than 12% (< 0.4 dB improvement in SINR at a 10^-2 BER), with the best performance when the interfering and desired signals are offset by approximately half the symbol duration.
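The two weighting schemes compared above can be sketched for a two-antenna array as follows. This is a generic sample-matrix-inversion illustration, not the paper's equations (21)-(23), which are outside this section. The conjugation convention (output y = w^H x, weights solving R w = r), the channel vectors `h` and `g`, the symbol patterns, and all function names are assumptions made for the example.

```python
import cmath
import random


def solve2(R, r):
    """Solve the 2x2 complex system R w = r with the explicit inverse."""
    (a, b), (c, d) = R
    det = a * d - b * c
    return [(d * r[0] - b * r[1]) / det, (a * r[1] - c * r[0]) / det]


def dmi_exponential(xs, ds, beta):
    """Weights from exponentially weighted sums of x x^H and x d*."""
    R = [[0j, 0j], [0j, 0j]]
    r = [0j, 0j]
    for x, d in zip(xs, ds):
        for m in range(2):
            r[m] = beta * r[m] + x[m] * d.conjugate()
            for n in range(2):
                R[m][n] = beta * R[m][n] + x[m] * x[n].conjugate()
    return solve2(R, r)


def dmi_sliding(xs, ds, K):
    """Weights from the K most recent samples with equal weighting."""
    return dmi_exponential(xs[-K:], ds[-K:], beta=1.0)


def combine(w, x):
    """Array output y = w^H x."""
    return sum(wm.conjugate() * xm for wm, xm in zip(w, x))


def qpsk(index):
    """Unit-magnitude QPSK symbol for index 0..3."""
    return cmath.exp(1j * (cmath.pi / 4 + index * cmath.pi / 2))


# Hypothetical static channels and fixed symbol patterns (14 symbols).
h = (1 + 0.5j, -0.3 + 0.8j)      # desired-signal channel
g = (0.4 - 0.9j, 1.1 + 0.2j)     # interferer channel
d_idx = [0, 1, 2, 3, 0, 2, 1, 3, 3, 1, 0, 2, 1, 0]
i_idx = [1, 1, 0, 3, 2, 2, 0, 1, 3, 0, 2, 3, 1, 2]
rng = random.Random(0)
ds = [qpsk(k) for k in d_idx]
us = [qpsk(k) for k in i_idx]
xs = [
    tuple(
        h[m] * d + g[m] * u + complex(rng.gauss(0, 1e-3), rng.gauss(0, 1e-3))
        for m in range(2)
    )
    for d, u in zip(ds, us)
]

if __name__ == "__main__":
    for name, w in (("sliding, K = 14", dmi_sliding(xs, ds, 14)),
                    ("exponential, beta = 0.675", dmi_exponential(xs, ds, 0.675))):
        print(name,
              "| desired gain", abs(combine(w, h)),
              "| interferer gain", abs(combine(w, g)))
```

With a static channel and very little noise, both schemes pass the desired signal and null the interferer. The tradeoff discussed in the text only appears once the channel changes within the window, which this static sketch deliberately leaves out.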

Fig. 10. Effect of timing offset on the BER for a vehicle speed of 60 mph with DMI for acquisition and tracking, K = 14, and SINR = 10 dB. [Plot: BER versus timing offset (fraction of symbol duration); curves for noise only and INR = 0, 3, 6, and 10 dB.]

IV. SUMMARY AND CONCLUSIONS

In this paper, we have studied the dynamic performance of adaptive arrays in wireless communication systems. Specifically, we studied the performance of the LMS and DMI weight adaptation algorithms in IS-54 with data-to-fading-rate ratios as low as 300. We showed that implementation of optimum combining allows the use of coherent detection, which improves performance by over 1 dB as compared to differential detection. Although the performance of the LMS algorithm was not satisfactory, results showed that the DMI algorithm acquired the weights in the synchronization sequence interval and tracked the desired signal for vehicle speeds up to 60 mph with less than 0.2 dB degradation from the ideal performance with differential detection at a 10^-2 BER. Similarly, an interfering signal was also suppressed, with performance gains over maximal ratio combining within 0.5 dB of the predicted ideal gain. Thus our results indicate that we can obtain close to the ideal performance improvement of optimum combining even in rapidly fading environments.

ACKNOWLEDGMENT

We gratefully acknowledge useful discussions with G. D. Golden, J. Salz, and N. Seshadri.

APPENDIX A

To relate the weight equation (12) to that of [5, Eq. (11)], we need to consider three differences between the analysis given here and in [5]. First, in [4]-[6], we considered the generation of N = L + 1 separate outputs at the receiver, each with minimum mean-square error, while here we consider only the output of the desired signal. Using the notation of [4]-[6], the channel matrix C that relates the transmitted signal vector (including the L interferers) to the received signal vector x at a given time is given in our notation by

(A-1)

Thus the weight matrix W for the optimum linear combiner that generates N output signals is given by (from (12))

$$W = \alpha R_{xx}^{-1} C^{*} \tag{A-2}$$

with the vector s at the output of the combiner given by

$$s = W^{T} x. \tag{A-3}$$

Note that the weight vector w of (12) is just the first column of W. Now, we can show that

$$R_{xx} = \sigma^{2} I + C C^{\dagger} \tag{A-4}$$

and from (A-2),

$$W = \alpha\left[\sigma^{2} I + C C^{\dagger}\right]^{-1} C^{*}. \tag{A-5}$$

A second difference is that in [4]-[6] we considered the zero-forcing weights, which can be obtained from (A-5) in the limit σ² → 0, i.e.,

$$W = \lim_{\sigma^{2} \to 0} \alpha\left[\sigma^{2} I + C C^{\dagger}\right]^{-1} C^{*}. \tag{A-6}$$

Note that [CC†]^-1 exists only when N = M. Otherwise, the inverse becomes the pseudoinverse. Finally, the weight matrix of [5], which we will denote as W_[5], was defined as the transpose of the weight matrix given here, i.e.,

(A-7)

and is given in [5] as

$$W_{[5]} = \lim_{\sigma^{2} \to 0} \left[\sigma^{2} I + C^{\dagger} C\right]^{-1} C^{\dagger}. \tag{A-8}$$

Although (A-6) and (A-8) look similar, note that CC† in (A-6) is an M x M matrix, while C†C in (A-8) is an N x N matrix. However, the weights can be shown to be equal (to within a scalar multiple) in the limit σ² → 0. The change in the weight equation was made to put it in the form for DMI (21).

REFERENCES

[1] J. H. Winters, "Optimum combining in digital mobile radio with cochannel interference," IEEE J. Select. Areas Commun., vol. SAC-2, July 1984.
[2] J. H. Winters, "Optimum combining for indoor radio systems with multiple users," IEEE Trans. Commun., vol. COM-35, Nov. 1987.
[3] J. H. Winters, "On the capacity of radio communication systems with diversity in a Rayleigh fading environment," IEEE J. Select. Areas Commun., vol. SAC-5, June 1987.
[4] J. H. Winters, J. Salz, and R. D. Gitlin, "The capacity of wireless communication systems can be substantially increased by the use of antenna diversity," in Proc. 1st Int. Conf. Universal Personal Commun., Sept. 1992, pp. 28-32.
[5] J. H. Winters, J. Salz, and R. D. Gitlin, "The capacity increase of wireless communication systems with antenna diversity," in Proc. 1992 Conf. Inform. Sciences Syst., vol. II, Mar. 18-20, 1992, pp. 853-858.
[6] J. H. Winters, J. Salz, and R. D. Gitlin, "Adaptive antennas for digital mobile radio," Proc. IEEE Adaptive Antenna Syst. Symposium, Melville, NY, Nov. 1992, pp. 81-87.
[7] S. A. Hanna, M. El-Tanany, and S. A. Mahmoud, "An adaptive combiner for co-channel interference reduction in multi-user indoor radio systems," in Proc. IEEE Veh. Technol. Conf., St. Louis, MO, May 19-22, 1991, pp. 222-227.
[8] D. J. Goodman, "Trends in cellular and cordless communications," IEEE Commun. Mag., vol. 29, pp. 31-40, June 1991.
[9] R. A. Monzingo and T. W. Miller, Introduction to Adaptive Arrays. New York: Wiley, 1980.
[10] A. Dembo and J. Salz, "On the least squares tap adjustment algorithm in adaptive digital echo cancellers," IEEE Trans. Commun., vol. 38, pp. 622-628, May 1990.
[11] P. Bello and B. D. Nelin, "Predetection diversity combining with selectively fading channels," IRE Trans. Commun. Syst., vol. CS-10, p. 32, Mar. 1962.
[12] J. G. Proakis, Digital Communications. New York: McGraw-Hill, 1983, p. 175.
[13] W. C. Jakes, Jr., et al., Microwave Mobile Communications. New York: Wiley, 1974.
[14] J. H. Winters, "Switched diversity with feedback for DPSK mobile radio systems," IEEE Trans. Veh. Technol., vol. VT-32, pp. 134-150, Feb. 1983.
[15] T. Ohgane, "Characteristics of CMA adaptive array for selective fading compensation in digital land mobile radio communications," Electron. Commun. Japan, Part 1, vol. 74, no. 9, pp. 43-53, 1991.


Effect of Fading Correlation on Adaptive Arrays in Digital Mobile Radio

Jack Salz, Member, IEEE, and Jack H. Winters, Senior Member, IEEE

Abstract-In this paper, we investigate the effect of correlations among the fading signals at the antenna elements of an adaptive array in a digital wireless communication system. With an adaptive array, the signals received by multiple antennas are optimally weighted and combined to suppress interference and combat desired signal fading. Previous results for flat and frequency-selective fading assumed independent fading at each antenna. Here, we present a model of local scattering around a mobile where the received multipath signals arrive at the base station within a given beamwidth, and derive a closed-form expression for the correlation as a function of antenna spacing. Results show that the degradation in performance with correlation in an adaptive array that combats fading and suppresses interference is only slightly larger than that for combating fading alone, i.e., with maximal ratio combining. This degradation is small even with correlation as high as 0.5.

Manuscript received November 14, 1992; revised February 25, 1993. The authors are with AT&T Bell Laboratories, Holmdel, NJ 07733 USA. IEEE Log Number 9403803.

I. INTRODUCTION

Antenna arrays with optimum combining combat multipath fading of the desired signal and suppress interfering signals, thereby increasing both the performance and capacity of wireless systems. This increase is reduced, however, by correlation of the fading signals between the receive antennas. Previous theoretical and computer simulation studies of optimum combining (e.g., [1]-[6]) assumed independent fading of the desired and interfering signals at each receive antenna. Such independence occurs if multipath reflections are uniformly distributed around receive antennas that are spaced at least a half wavelength apart. However, the signals often arrive at the receive antennas mainly from a given direction. For example, in rural or suburban mobile radio, a high base station antenna typically has a line-of-sight to within the vicinity of the mobile, with local scattering around the mobile generating signals that arrive mainly within a given range of angles or beamwidth. This problem was studied in [7], where theoretical and experimental results showed the relationship of angle of arrival and beamwidth with the correlation of fading between antennas. Specifically, as the angle of arrival approaches end-fire (parallel to the array) and the beamwidth decreases, the antenna spacing must be increased to reduce correlation. When this correlation is high (> 0.8), because the signals at the antennas tend to fade at the same time, the diversity benefit of antenna arrays against fading (i.e., with maximal ratio combining) is significantly reduced [8]. On the other hand, because independent fading is not required for interference suppression, antenna arrays can suppress interference even with complete correlation (= 1), i.e., in line-of-sight systems without multipath. In particular, theoretical and computer simulation results [1], [3], [4], [9], [10] have shown that with M antennas, M - 1 interferers can be completely suppressed in both fading (with zero correlation) and nonfading (with complete correlation) environments. Thus, we need to understand the antenna array performance with joint fading reduction and interference suppression. In addition, the effect of correlation with frequency-selective fading, when equalization is also used, must be evaluated.

This paper considers the effect of correlation of the signal fading at the antennas of an adaptive array with optimum combining to combat desired signal fading and suppress interference, and optimum linear equalization to combat frequency-selective fading. We first present a model of local scattering where the received multipath signals arrive within a given beamwidth. We derive a closed-form expression for the fading correlation with this model as a function of the angle of arrival, beamwidth, and antenna spacing. Using these theoretical results with Monte Carlo simulation, we then generate results for the effect of beamwidth (i.e., correlation) on the adaptive array performance with given antenna spacing and random angles of arrival. Results are presented for optimum combining with flat fading, as well as for frequency-selective fading, using a two-path delay spread model with joint optimum combining and linear equalization. Computer simulation results show that the degradation in performance with correlation in an adaptive array that combats fading and suppresses interference is slightly larger than that for combating fading alone, i.e., with maximal ratio combining. This degradation is small even with correlation as high as 0.5.
Results for an adaptive array with either flat Rayleigh fading or frequency-selective fading show that with an antenna spacing of four wavelengths, there is little performance degradation as long as the beamwidth of the received signals is greater than 20°. Further increases in antenna spacing would reduce this beamwidth even more. In Section II, we describe optimum combining and equalization with antenna arrays and discuss how fading correlation can occur. The model and theoretical analysis of wireless systems with fading correlation is presented in Section III. In Section IV, we describe the computer simulation technique and present results on the performance degradation with correlation. A summary and conclusions are presented in Section V.

Reprinted from IEEE Transactions on Vehicular Technology, Vol. 43, No.4, pp. 1049-1057, November 1994.

II. BACKGROUND

Fig. 1 shows a digital wireless communication system employing adaptive arrays where a base with M antennas receives signals from N users. These N users operate in the same bandwidth simultaneously and include signals destined to the base, as well as those destined to other bases, but interfering with the desired signals, as in cellular systems. Let the complex channel transfer function from user "i" to antenna "j" be denoted as C_ij(ω). Then, the channel vector from user "i" to the base antennas is C_i(ω) = [C_i1(ω) ... C_iM(ω)]^T, where the superscript T denotes transpose, and the M x N channel matrix between the N users and the base is given by

$$C(\omega) = \left[C_{1}(\omega)\; C_{2}(\omega)\; \cdots\; C_{N}(\omega)\right]. \tag{1}$$

Fig. 1. Wireless communication system employing adaptive arrays, where a base with M antennas receives signals from N users.

In this paper, we are interested in linear processing at the base station of the M received signals to generate an output signal that corresponds to the data from one desired signal (user 1). Specifically, we consider ideal optimum combining and linear equalization, where the M received signals are combined to minimize the mean-square error (MSE) in the output. For ideal optimum combining, we assume perfect tracking of the desired and interfering signals by the combining algorithm at the base station. For ideal linear equalization, we consider a synchronous tapped delay line with an infinite number of taps, as shown in Fig. 1. This equalizer is the optimum linear equalizer under the assumption that the desired signal spectrum is bandwidth-limited to the data rate (1/T). As shown before [1], with ideal optimum combining and linear equalization, the minimum MSE for user "1" for given C(ω) is given by

$$\mathrm{MSE}[C] = \sigma_{a}^{2}\,\frac{T}{2\pi}\int_{-\pi/T}^{\pi/T}\left[I + \rho\,C^{\dagger}(\omega)\,C(\omega)\right]^{-1}_{11}\,d\omega, \tag{2}$$

where I is an N x N identity matrix, ρ is the signal-to-noise ratio, and σ_a² = E[|a_n|²], where the a_n's are the complex data symbols. The superscript † denotes complex conjugate transpose, and [·]^-1_11 stands for the "11" component of the inverse of a matrix. For coherent detection of binary phase shift keyed (BPSK) or quadrature amplitude modulated (QAM) signals, the error rate can then be upper bounded by

$$P_{e} \le E_{C}\left[\exp\left(-\frac{1 - \mathrm{MSE}[C]/\sigma_{a}^{2}}{\mathrm{MSE}[C]/\sigma_{a}^{2}}\right)\right], \tag{3}$$

where E_C[·] is the expected value with respect to the channel matrices. With multipath, the C_ij(ω)'s are modeled as complex Gaussian random variables at each frequency ω. The variation of C_ij(ω) with ω depends on the delay spread model of the channel. In this paper, we examine numerically two such models: 1) flat fading, i.e., C_ij(ω) = c_ij for all ω, where equalization is not needed, and 2) a two-path delay spread model,

$$C_{ij}(\omega) = c^{1}_{ij} + c^{2}_{ij}\,e^{-j\omega\tau_{i}}, \tag{4}$$

where τ_i is the time delay between the two paths for the i-th user, c¹_ij and c²_ij are complex Gaussian random variables, and the fading in the two paths with different time delays is independent, i.e., c¹_ij is independent of c²_ij (but the c_ij's are not necessarily independent).

Fig. 2. Wireless environment where all signals from a mobile arrive at the base station within ±Δ of angle φ.

Previous papers have assumed that the c_ij's are independent. Such independence occurs if multipath reflections are uniformly distributed around the receive antennas that are spaced at least a half wavelength apart (this situation is examined in detail below). However, the signals often arrive at the receive antennas mainly from a given direction. For example, in rural or suburban mobile radio, a high base station antenna typically has a line-of-sight to within the vicinity of the mobile, with local scattering around the mobile generating signals that arrive mainly within a given range of angles or beamwidth. Fig. 2 shows a typical scenario where all signals from a mobile arrive at the base station within ±Δ of angle φ. This problem was studied in [7], where theoretical and experimental results showed the relationship of angle of arrival and beamwidth with the correlation of fading between antennas. Specifically, [7] assumed that the probability density function for the angle
of arrival of the i-th ray is given by

$$p(\phi_{i}) = \frac{Q}{\pi}\cos^{n}(\phi_{i} - \phi), \qquad -\frac{\pi}{2} + \phi \le \phi_{i} \le \frac{\pi}{2} + \phi, \tag{5}$$

where n is an even integer chosen to determine the beamwidth and Q is a normalizing constant chosen to make p(φ_i) a density function. The correlation of the fading between two antennas spaced D apart is then [7]

$$R_{xx} = \int_{-\pi/2+\phi}^{\pi/2+\phi} \cos\left(\frac{2\pi D}{\lambda}\sin(\phi_{i} - \phi)\right) p(\phi_{i})\,d\phi_{i} \tag{6}$$

and

$$R_{xy} = \int_{-\pi/2+\phi}^{\pi/2+\phi} \sin\left(\frac{2\pi D}{\lambda}\sin(\phi_{i} - \phi)\right) p(\phi_{i})\,d\phi_{i}, \tag{7}$$

where λ = 2πc/ω, c is the speed of light, R_xx is the correlation between the real parts of c_ij and c_ik, and R_xy is the correlation between the real part of c_ij and the imaginary part of c_ik. Unfortunately, (6) and (7) must be evaluated numerically. Therefore, in the next section, we present a generic model, where the probability density function of φ_i is assumed to be uniform,

$$p(\phi_{i}) = \begin{cases} \dfrac{1}{2\Delta}, & -\Delta + \phi \le \phi_{i} \le \Delta + \phi \\ 0, & \text{elsewhere.} \end{cases} \tag{8}$$

This allows for the derivation of a closed-form expression for the correlation coefficient, with results that agree with the results obtained numerically using the model of [7] (see Section IV).

III. CHANNEL MODEL

We develop a mathematical model for multipath media applicable in wireless digital communications employing antenna array processors. The model is useful for the evaluation of signal correlations among the antenna array elements, which are critically important in determining ultimate system performance. The degree of correlation depends on the element spacings and signal scattering angles resulting from the physical surroundings. Using the model of a uniform probability density function of φ_i (8), in the Appendix we derive closed-form expressions for the correlation of the fading between the j-th and k-th antennas (as compared to (6) and (7)). Note that using these expressions, (A-19) and (A-20), when Δ = π, the correlation between adjacent elements reduces to J_0(z), where z = 2πD/λ. The implication of these results is that when reflections are allowed to arrive at the antenna array from all directions, the correlation of signals at adjacent antenna array elements is determined from J_0(z) = 0, which implies that z = 2πD/λ ≈ 2.4, or D/λ ≈ 0.382. This sets the minimum spacing between antenna elements yielding zero correlation.

The overall channel matrix C can now be expressed in terms of the M-column vectors (see (1)),

$$C(\omega) = \left[C_{1}(\omega)\; C_{2}(\omega)\; \cdots\; C_{N}(\omega)\right], \tag{9}$$

where the column vector C_k(ω) represents the transmission characteristics from user "k" to all the antenna elements. Since each user is characterized by its own surroundings, and if the users are not on top of one another to within wavelengths, it is reasonable to assume that the columns in (9) are statistically independent. Consequently, we need only to characterize the correlation properties of a typical user, which we have already accomplished. Expressing the complex column vector C_k(ω) = x_k(ω) + j y_k(ω), where x_k and y_k are the real and imaginary M-column vectors associated with user "k," we define the 2M augmented column vector as

$$\tilde{C}_{k} = \left[x_{k1}\; y_{k1}\; \cdots\; x_{kM}\; y_{kM}\right]^{T} \tag{10}$$

and seek to evaluate the 2M x 2M correlation matrix

$$R_{k} = E\left[\tilde{C}_{k}\,\tilde{C}_{k}^{T}\right]. \tag{11}$$

Defining the 2 x 2 matrix

$$D_{i-j} = \begin{bmatrix} R_{xx}(i-j) & R_{xy}(i-j) \\ -R_{xy}(i-j) & R_{yy}(i-j) \end{bmatrix}, \qquad i, j = 1, \ldots, M, \tag{12}$$

where the entries are given in (A-19) and (A-20), it is easy to see that R_k can be represented in terms of these block 2 x 2 matrices as follows:

$$R_{k} = \sigma_{k}^{2}\begin{bmatrix} I_{2\times 2} & D_{1} & D_{2} & \cdots & D_{M-1} \\ D_{1}^{T} & I_{2\times 2} & D_{1} & \cdots & D_{M-2} \\ D_{2}^{T} & D_{1}^{T} & I_{2\times 2} & \cdots & D_{M-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ D_{M-1}^{T} & D_{M-2}^{T} & D_{M-3}^{T} & \cdots & I_{2\times 2} \end{bmatrix}, \tag{13}$$

where σ_k² is the received signal power for the k-th user (see (A-16), with a subscript "k" denoting the k-th user; (A-16) applies to a typical user). Clearly, R_k is a Toeplitz matrix.

IV. RESULTS

A. Correlation and Δ
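The zero-correlation spacing quoted for omnidirectional arrival can be checked numerically. The sketch below evaluates J_0 from its standard integral representation and bisects for the first zero. The function names and the bracketing interval [2, 3] are choices made for this illustration.

```python
import math


def bessel_j0(z: float, n: int = 400) -> float:
    """J0(z) = (1/pi) * integral over [0, pi] of cos(z sin t) dt (midpoint rule)."""
    h = math.pi / n
    return sum(math.cos(z * math.sin((k + 0.5) * h)) for k in range(n)) / n


def first_zero(lo: float = 2.0, hi: float = 3.0, tol: float = 1e-10) -> float:
    """Bisection for the first root of J0, assumed to be bracketed by [lo, hi]."""
    f_lo = bessel_j0(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = bessel_j0(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    z0 = first_zero()
    print("first zero of J0:", z0)
    print("spacing D/lambda:", z0 / (2.0 * math.pi))
```

The resulting spacing is slightly under 0.4 wavelength, which is consistent with the half-wavelength rule of thumb for independent fading under uniform scattering.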

Let us first consider the correlation as a function of the antenna spacing D/λ, angle of arrival φ, and beamwidth Δ. When the signal arrives from broadside (φ = 0°), R_xy = 0 for all D/λ. Thus, the envelope correlation, R = (|R_xx|² + |R_xy|²)^(1/2), is just |R_xx|. R_xx versus antenna spacing is shown in Fig. 3 for Δ = 180°, 90°, 40°, 20°, 10°, and 3°. These results agree with results using the model of [7] with Δ equivalent to the 3 dB beamwidth of [7]. The figure shows that, as Δ decreases, the first zero in the correlation occurs at larger antenna spacing. Specifically, the first zero-crossing occurs at D/λ ≈ 30/Δ, with Δ in degrees. Thus, these results depend mainly on the product (D/λ)Δ, and show that independent fading occurs when the antenna beamwidth from two elements of the array is about the same as the beamwidth of the arriving signal.

Fig. 3. Correlation of the real portion of the fading versus antenna spacing for φ = 0°.

When the signal arrives from other than broadside, φ ≠ 0°, the antenna spacing for low correlation increases and the envelope correlation is never zero for almost all values of φ ≠ 0° and Δ < 180° (since zero envelope correlation requires that R_xx (A-27) and R_xy (A-28) have zero crossings at exactly the same spacing). Fig. 4 shows R_xx versus antenna spacing for the worst case of φ = 90°. For R_xy, the peak value of the oscillations is similar to that shown in Fig. 4. Note that the correlation decreases much more slowly with antenna spacing at φ = 90° than at φ = 0°.

Fig. 4. Correlation of the real portion of the fading versus antenna spacing for φ = 90°.

Fig. 5 shows the antenna spacing required for the envelope correlation to remain below 0.5 as a function of φ and Δ. The required spacing is only a few wavelengths up to very small beamwidths, unless φ is close to 90°. Experimental measurements of the beamwidth in mobile radio are presented and discussed in [7], [12]. These results show that, as expected, the beamwidth decreases with the antenna height. Fortunately, in most cases, antenna spacings on the order of only 10λ (several feet at 900 MHz) are required to obtain low correlation.

Fig. 5. Antenna spacing required for the envelope correlation to remain below 0.5 as a function of φ and Δ.

The effect of correlation on reducing the effectiveness of antenna diversity against desired signal fading is shown in [8]. With maximal ratio combining and two antennas, small correlation (< 0.3) has a negligible effect on performance, and the degradation is small unless the correlation is large (> 0.8).

The effect of correlation on the effectiveness of antenna arrays for interference suppression is as follows. With M antenna elements, the array has M - 1 degrees of freedom. Thus, as shown by theoretical and computer simulation results [1], [3], [4], [9], [10], an M antenna element array can null out M - 1 interfering signals independent of the fading correlation (i.e., with or without fading). The only factor that changes with the environment is the required spatial separation of the interfering signals: without fading, the signals must be separated from the desired signal by the antenna beamwidth, while with fading (with Δ = 180°), the signals need only be separated by about half a wavelength. For Δ < 180°, we note the following. Spacing the receive antennas at greater than λ/2 decreases the beamwidth of the array but also creates grating lobes, i.e., the antenna pattern repeats every 90°/(D/λ). Because of these grating lobes, with large antenna spacing to reduce fading correlation, interfering signals outside of the antenna beamwidth cannot always be suppressed. However, because of the multipath fading, most interfering signals within the beamwidth, but separated by at least half a wavelength, can be suppressed. Therefore, as before, only the required spatial separation of the interfering signals changes with the environment. Thus, as the signal beamwidth decreases (i.e., as the correlation increases), the effectiveness of adaptive arrays in suppressing interference alone does not change, but the effectiveness against fading does.
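The dependence of the correlation on spacing, angle of arrival, and beamwidth described in this subsection can be explored by direct numerical integration over a uniform angle-of-arrival density. The sketch below is not the closed form of the Appendix. It assumes that the arrival angle θ is measured from broadside, that the phase difference between the two antennas is (2πD/λ) sin θ, and that θ is uniform on [φ - Δ, φ + Δ]. All function names are hypothetical.

```python
import math


def correlation(d_over_lambda, phi_deg, delta_deg, n=2000):
    """Return (R_xx, R_xy) for arrival angles uniform on [phi - delta, phi + delta]."""
    z = 2.0 * math.pi * d_over_lambda
    phi, delta = math.radians(phi_deg), math.radians(delta_deg)
    step = 2.0 * delta / n
    rxx = rxy = 0.0
    for k in range(n):                      # midpoint rule over the beamwidth
        theta = phi - delta + (k + 0.5) * step
        arg = z * math.sin(theta)
        rxx += math.cos(arg)
        rxy += math.sin(arg)
    return rxx / n, rxy / n


def envelope(d_over_lambda, phi_deg, delta_deg):
    """Envelope correlation (R_xx^2 + R_xy^2)^(1/2)."""
    rxx, rxy = correlation(d_over_lambda, phi_deg, delta_deg)
    return math.hypot(rxx, rxy)


def first_zero_spacing(delta_deg, phi_deg=0.0, step=0.01):
    """Smallest D/lambda at which R_xx changes sign (coarse scan, then bisection)."""
    lo = step
    while correlation(lo + step, phi_deg, delta_deg)[0] > 0.0:
        lo += step
    hi = lo + step
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        if correlation(mid, phi_deg, delta_deg)[0] > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for delta in (180.0, 90.0, 40.0, 20.0, 10.0):
        d0 = first_zero_spacing(delta)
        print(f"beamwidth {delta:5.1f} deg: first zero at D/lambda = {d0:.3f}, "
              f"30/beamwidth = {30.0 / delta:.3f}")
```

Under these assumptions the broadside first zero tracks the 30/Δ rule for narrow beams, while at φ = 90° the envelope correlation at the same spacing stays high, in line with the slower decay described for the end-fire case.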

10. 1 ,-----r-

- - - . . . , -. -- - - - , - --

p

ffi lD

~

--,

18dB

Flat Fading

10.2

M=N=2

B. Performan ce with Fading and Interferen ce The effec t of corr elati on on an adaptive array that jointl y suppresses interference and reduces fadi ng effec ts was detennined in the foll owing manner. For fixed D[); and the same fixed 06. for the desired and interfering signals, we use Monte Carl o sim ulation to derive 10.000 channel matri ce s C with random ¢ and fadin g and then calculate the performance averaged ove r the se matri ces (i.e.. (f and fading ). We ass ume that the user s are rand oml y located (se parated by at least half a wavelength ) and thu s ¢ is an independ ent rand om variable for eac h user with a uniform pro babil ity de ns ity function. i.e .. -

t;

< (/J

~

n

( 14 )

pIs('w lwrc .

The performance mea sur es we co nsider are the ave rage erro r rate . as well as the outage probability. i.e.. the probabilit y that the error rate exceeds a gi ven value . The error rate for a given i/J and fading was calculated as follow s. For give n (p for eac h user . D.. and D[ ); the correlation matrix RI.: / (J~ for ea ch user was calculated using ( 13). To generate C , we first ge nera te a 21\1 vec to r A. I.: for each user with eac h element Uki bei ng an inde penden t. zero -mea n Gau ssian rando m va ria ble with a variance of 1/2. Thu s.

= [al.:!

A I.:

... (lud

T

( 15)

for the k- th user. The A:-th column of C. CI.: . is then • I I:!

Ck Note that

; 2 is give n by . I /

k

(J-

I.:

:1:

(16)

(J ie

~

R~:2 = :1: where

= -HI.:.)- ,h-

[

~ .

o

= [Xl ' . . :1:2M ( ,

and eigenvalues of

o()

0

J ,\ U E

o and

.1:,

]

T I:

( 17) .

and '\ , are the eigenvectors

~, respectively. For frequency- selective t;

fading with two-path delay-spread (4), the above procedure was repeated twice to obtain the $c_k$'s for each of the two paths. The MSE is then given by (2) and the error rate by (3).

We first consider the effect of correlation with flat fading, two receive antennas, and one interferer with the same power as the desired signal. Fig. 6 exhibits the average error rate versus $\Delta$ with $\rho$ = 18 dB and 27 dB, and $D = 0.382\lambda$ and $3.82\lambda$. Note that $D = 0.382\lambda$ corresponds to zero correlation when the signal arrives uniformly from all angles ($\Delta = 180°$). At $D = 0.382\lambda$, the performance is degraded slightly at $\Delta = 90°$ and becomes much worse with smaller $\Delta$. However, at $D = 3.82\lambda$, there is little degradation until $\Delta$ is 10°-20°. Thus, increasing the antenna spacing by a factor of 10 decreases the tolerable $\Delta$ by about a factor of 10 as well (corresponding to the decrease in antenna beamwidth as discussed in Section IV-A). As shown in Fig. 5, at 20°, the correlation is about 0.5 in the worst case of $\phi = 90°$. Fig. 6 also shows that the degradation with $\Delta$ is larger with higher $\rho$, but the above conclusions are the same. Similar results were obtained for the outage probability.

[Fig. 6. Average error rate versus $\Delta$ with flat fading and $M = N = 2$. Curves for $\rho$ = 18 dB and 27 dB, $D/\lambda$ = 0.382 and 3.82.]

Fig. 7 shows the outage probability versus $\Delta$ with flat fading, $N - 1$ equal-power interferers, and $M = N + 1$. Results are shown for the probability of exceeding a $10^{-2}$ error rate, with $\rho$ = 17 dB, and $D = 0.382\lambda$ and $3.82\lambda$ as in Fig. 6. As compared to Fig. 6, these results show that for $D = 0.382\lambda$ correlation degrades the performance more when there is an additional antenna. Additional results we obtained for $M = N + 2$ and $M = N + 3$ show that the degradation with correlation grows even larger with more antennas. In Fig. 7, the $M = 2$ results are without interference and thus correspond to the performance with maximal ratio combining. The performance with $D = 0.382\lambda$ is seen to be degraded somewhat more by correlation when interference must also be suppressed (i.e., $M = 3$ and 4 versus $M = 2$ results). However, in all cases when the spacing is increased to $D = 3.82\lambda$, the performance remains constant (see footnote 1) as long as $\Delta$ is greater than about 20°, i.e., the correlation is below 0.5.

Footnote 1: The outage probability is seen in Fig. 7 to increase slightly with increasing $\Delta$ at one point for both $M = 3$ and 4 with $D/\lambda = 3.82$, but this is just a numerical aberration due to using only 10,000 channel matrices in the simulations.

[Fig. 7. Outage probability versus $\Delta$ with flat fading and $M = N + 1$. Curves for $M = N + 1$ = 2 to 4, $\rho$ = 17 dB.]

Finally, we consider the effect of correlation with frequency-selective fading when joint optimum combining and equalization is used. Fig. 8 shows the average error rate versus $\Delta$ with two-path delay spread and $M = N = 2$. Results are for $\rho$ = 17 dB, $D = 0.382\lambda$ and $3.82\lambda$, and delay $\tau = 0$, $0.7T$, and $T$ for the desired and interfering signals. Note that the error rate decreases with $\tau$, due to the diversity benefit of frequency-selective fading with equalization, as shown in [1]. This improvement increases with $\tau$ until $\tau = T$ and then remains constant since the two paths are resolvable for $\tau \ge T$. The figure also shows that there is some improvement even if only the interference has frequency-selective fading, but the best improvement occurs when both the desired and interfering signals have frequency-selective fading. A large portion of the maximum possible improvement is obtained when $\tau = 0.7T$. Fig. 8 shows that for $D = 0.382\lambda$, the degradation with correlation increases with frequency-selective fading. As before, however, with $D = 3.82\lambda$, the performance is not degraded until $\Delta$ is less than about 20°.

[Fig. 8. Average error rate versus $\Delta$ with two-path delay spread and $M = N = 2$. $\rho$ = 17 dB.]

Thus, correlation degrades the performance of an adaptive array that combats fading, suppresses interference, and equalizes frequency-selective fading somewhat more than an array that only combats fading. Correlation up to 0.5 causes little degradation, but higher correlation significantly decreases performance. Although our results show that this degradation increases with the number of antennas, these results are for a linear array, which causes all fading to be highly correlated when signals arrive from endfire, i.e., as $\phi \to 90°$. Since this problem can be reduced when $M > 2$ by not arranging the antennas linearly, we may be able to avoid this increase in degradation with the number of antennas. However, in all cases, increased antenna spacing reduces the $\Delta$ at which degradation occurs.

V. CONCLUSIONS

In this paper, we have investigated the effect that fading correlation has on the performance of an adaptive array in a digital mobile radio system. We described a mathematical model of local scattering around the mobile where the received multipath signals arrive at the base station within a given beamwidth and derived a closed-form expression for the correlation as a function of antenna spacing. Monte Carlo simulation results show that the degradation in performance with correlation in an adaptive array that combats fading, suppresses interference, and equalizes frequency-selective fading is only slightly larger than that for combating fading alone, i.e., with maximal ratio combining. This degradation is small even with correlation as high as 0.5. Our results show that, with an antenna spacing of four wavelengths, there is little performance degradation as long as the beamwidth of the received signals is greater than 20°. This tolerable beamwidth can be reduced even further by larger antenna spacing since this beamwidth is inversely proportional to the antenna spacing.

APPENDIX

DERIVATION OF CLOSED-FORM EXPRESSIONS FOR $R_{xx}$ AND $R_{xy}$

The most fundamental description of a linear, quasistationary, multipath medium in wireless systems employing antenna arrays is the impulse response from user "i" to array element output "j." Such a typical impulse response can be represented as the superposition of a large number of impulses,

$$h(t) = \sum_n g_n \delta(t - t_n) \tag{A-1}$$

where the $g_n$'s and the $t_n$'s are the strengths and delays of the possible paths. Clearly, in a time-varying situation, these parameters will depend on time. In a system of $N$ users and $M$ antenna elements, we must describe $N \times M$ such responses. Thus, if the input to the medium of a typical user is $s(t)e^{i\omega_0 t}$, where $\omega_0$ is the angular carrier frequency, the output of a typical antenna element becomes

$$s_o(t) = e^{i\omega_0 t} \sum_n g_n s(t - t_n) e^{-i\omega_0 t_n}. \tag{A-2}$$

We now derive the correlations among array elements for

Following the seminal work of Turin [11], the set of all in's is partitioned into L disjoint sets .il t , £ == 1 ... L. With each set A€, we associate a representative delay Tf such that i-. EAp if :,;(t - tn) ~ 8(l - Te). In other words, the differences Tf - t n are much smaller than the reciprocal bandwidth of s( t). With these approximations in mind, we rewrite (A-2) in the form 8o

L s(l - L

a single user by assuming plane waves at the array. This is

a reasonable assumption when users and antenna array are separated by many wavelengths. Suppose the reference wavefront plane coincides with element 1 (see Fig. 2). Then, the wave arriving at element 2 suffers a delay relative to the first element,

L

(l ) == el~'l)t

Tp)

£==1

.f}nC-1W·"t"

(A-3)

T

L

+ w) L

bfe"(~'
(A-4)

Sok,j(t) = eiwol

Thus, a typical baseband-equivalent frequency transfer characteristic from user "I;" to antenna element "j" can be represented in the form L

: r/,j

)Lu...

1. . f1. -

1 . . . . 'V .J.: -- 1 ... iv 'I . 1

•

Sill

¢.

I(p\

~

7f

(A-7)

and (n - l)T at element "ti," Thus, if we denote the output signals at antenna elements "k" and "j" by .';ok(t) and .';oJ(t), respectively, due to the transmission of a signal of the form, .)(t) e'r> t, located at an angle (P, we can write

(=1

. ( ) - ~1J..:j Ck.J W ~ J{ (,

C

HI

where ne is the set of integers such that I; ri f~4t· Denoting Lnr gne-iwotn == be and taking Fourier transforms of both sides of (A-3), we obtain the standard L-ray, or frequencyselective multipath description of fading channels,

So(w) == S(wn

D

== -

L

L set -

Tf)b~ktj)

f==1

(A-8)

where

(A-5)

1/

{=l

For this model to be useful, a statistical characterization of the set of $M \times N$ frequency functions $C_{k,j}(\omega)$ must be provided. In our application, we shall assume that the terms in the various sums defining the $b_\ell$'s are random quantities and so it is reasonable to assert that the $b_\ell$'s are complex random variables. Furthermore, we assume that there are a large number of terms in each sum and that each sum includes different random terms and, consequently, from the central-limit theorem, the $b_\ell$'s, $\ell = 1 \ldots L$, may be regarded as i.i.d. complex, zero-mean, Gaussian random variables. If we let $\omega_0 t_n = \theta_n$ in the sums defining $b_\ell$, we write the real and imaginary parts as

1

where (/)1/ is the angle of arrival of the n-th ray. As we have already argued, the h;,(\) 's are complex i.i.d. Gaussian random variables associated with array numbers n, and therefore the sought-after correlations are determined by each h~(\) and different O"s. Thus, we seek the correlation coefficients between the following random variables: (k)

(k)

. (k)

h['

==:r p

+ I,Yr' l: ==

IJ (j) t

- ,,(.d -.L (

" (J). + 1,.tJ p •J

1.. ...J.\;1

and -·1 - ., .. 1"/ ~

where '1,(0:) == R(:"\b(n) ./(' " e

(A-6)

Now, it is reasonable to regard $\theta_n$ modulo $2\pi$ as i.i.d. uniformly distributed random variables, with the consequence that $x_\ell$ and $y_\ell$ are independent and so $|b_\ell|$ is Rayleigh distributed and $\angle b_\ell$ is uniform. This is then the rationale for regarding $C_{k,j}(\omega)$ in (A-5) as a complex Gaussian process in the frequency domain.

For our application, the correlation among the elements of the $C_{k,j}$'s is of paramount importance. In order to facilitate the evaluation of these parameters, we must return to the basic definition of the $b_\ell$'s in (A-6). We begin by considering the following geometrical model. This entails placing the users and the antenna array in a reasonable geometrical relationship. Without loss of generality, assume that the antenna array is linear with $M$ elements with identical spacing, $D$, between elements. We label the elements in ascending order. Users are located at arbitrary angles and distances with respect to the antenna array as depicted in Fig. 2. With each user, we associate a scattering angle of size $2\Delta$. This implies that all subpaths from the user to the antenna array are restricted to emanate from within this angle.
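The geometry just described can be explored numerically. The sketch below is illustrative only and makes assumptions not taken from the text: that a plane wave arriving at angle $\beta$ produces a phase difference of $2\pi (D/\lambda)(k - j)\sin\beta$ between elements $k$ and $j$, and that the received power is spread uniformly over the sector $(\phi - \Delta, \phi + \Delta)$. Under those assumptions the normalized in-phase correlation is the average of the cosine of that phase over the sector. The function name and its parameters are hypothetical.

```python
import math

def normalized_rxx(d_over_lambda, phi_deg, delta_deg, k_minus_j=1, steps=20000):
    """Average cos(z*(k-j)*sin(beta)) for beta uniform on (phi-delta, phi+delta).

    Assumes a uniform angular power density over the scattering sector.
    """
    z = 2.0 * math.pi * d_over_lambda
    phi = math.radians(phi_deg)
    delta = math.radians(delta_deg)
    h = 2.0 * delta / steps
    total = 0.0
    for i in range(steps):
        beta = phi - delta + (i + 0.5) * h  # midpoint rule
        total += math.cos(z * k_minus_j * math.sin(beta))
    return total / steps

# Signal arriving uniformly from all angles, D = 0.382 wavelengths
r_full = normalized_rxx(0.382, 0.0, 180.0)
# Narrow 10-degree half-beamwidth at broadside, same spacing
r_narrow = normalized_rxx(0.382, 0.0, 10.0)
# Same narrow sector with ten times the spacing
r_wide_spacing = normalized_rxx(3.82, 0.0, 10.0)
print(r_full, r_narrow, r_wide_spacing)
```

With these assumptions the first value should be close to zero (the spacing of 0.382 wavelengths sits near the first zero of the Bessel function $J_0$), while narrowing the sector raises the correlation and widening the spacing lowers it again.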

(A-9)

and ') L{ ?ie(0) -_ I III b(O:) P ,n - 1.-.- . ...ivl .

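We note that since the $\theta_n$'s are i.i.d. uniform, the real and imaginary parts of $b_\ell^{(n)}$ are independent for any $n$. We now calculate for any $n$

$$E\big[(x_\ell^{(n)})^2\big] = E\big[(y_\ell^{(n)})^2\big] = \frac{1}{2}\sum_{n_\ell} E[g_n^2]. \tag{A-10}$$

It is now straightforward to calculate the four correlation coefficients

$$E\big[x_\ell^{(k)} x_\ell^{(j)}\big] = E\big[y_\ell^{(k)} y_\ell^{(j)}\big] = \frac{1}{2}\sum_{n_\ell} E\Big[g_n^2 \cos\Big((k - j)\,2\pi\frac{D}{\lambda}\sin\phi_n\Big)\Big] \tag{A-11}$$

and

$$E\big[x_\ell^{(k)} y_\ell^{(j)}\big] = -E\big[x_\ell^{(j)} y_\ell^{(k)}\big] = \frac{1}{2}\sum_{n_\ell} E\Big[g_n^2 \sin\Big((k - j)\,2\pi\frac{D}{\lambda}\sin\phi_n\Big)\Big] \tag{A-12}$$

where

$$\omega_0\frac{D}{c} = 2\pi f_0\frac{D}{c} = 2\pi\frac{D}{\lambda}.$$

According to our hypothesis, there are a large number of terms in the sums indicated in (A-10) and (A-11) and if we make the additional physically reasonable assumption that the $\phi_n$'s are dense in the range $(\phi - \Delta, \phi + \Delta)$, the sums can be expressed as integrals of the form, independent of $\ell$,

$$R_{xx}(k, j) = \int_{\phi - \Delta}^{\phi + \Delta} \sigma^2(\beta)\cos\Big((k - j)\,2\pi\frac{D}{\lambda}\sin\beta\Big)\,d\beta \tag{A-13}$$

and

$$R_{xy}(k, j) = \int_{\phi - \Delta}^{\phi + \Delta} \sigma^2(\beta)\sin\Big((k - j)\,2\pi\frac{D}{\lambda}\sin\beta\Big)\,d\beta \tag{A-14}$$

where the density function of the returned strengths $\sigma^2(\beta)$ must satisfy

$$\int_{\phi - \Delta}^{\phi + \Delta} \sigma^2(\beta)\,d\beta = \sigma^2. \tag{A-15}$$

Making the reasonable assumption that this density function is a constant over the angle segments, we then obtain the relationship

$$\sigma^2 = \frac{1}{2}\sum_{n_\ell} E[g_n^2] = \frac{1}{2}E\big[|b_\ell|^2\big], \quad \text{for all } \ell, \tag{A-16}$$

which is consistent with the definitions in (A-8). Now, by making use of the well-known series representations,

$$\cos(z\sin\beta) = J_0(z) + 2\sum_{m=1}^{\infty} J_{2m}(z)\cos(2m\beta)$$

$$\sin(z\sin\beta) = 2\sum_{m=0}^{\infty} J_{2m+1}(z)\sin[(2m + 1)\beta] \tag{A-17}$$

where the $J_m$'s are Bessel functions of integer order and

$$z = 2\pi\frac{D}{\lambda}, \tag{A-18}$$

we can integrate (A-13) and (A-14) and obtain the following convenient formulas for the desired correlation coefficients:

$$\tilde{R}_{xx}(k, j) = J_0(z(k - j)) + 2\sum_{m=1}^{\infty} J_{2m}(z(k - j))\cos(2m\phi)\,\frac{\sin(2m\Delta)}{2m\Delta} \tag{A-19}$$

and

$$\tilde{R}_{xy}(k, j) = 2\sum_{m=0}^{\infty} J_{2m+1}(z(k - j))\sin[(2m + 1)\phi]\,\frac{\sin((2m + 1)\Delta)}{(2m + 1)\Delta} \tag{A-20}$$

where the normalized $\tilde{R}$'s are defined as $\tilde{R} = R/\sigma^2$. It can be readily checked that

$$\tilde{R}_{xx}(k, k) = 1$$

and

$$\tilde{R}_{xy}(k, k) = 0$$

as they must be for "physically consistent" considerations.

ACKNOWLEDGMENT

We wish to acknowledge Professor Bar-David from the Technion for generously sharing his expert knowledge with us while he was a consultant at AT&T Bell Laboratories, during the evolution of this investigation. Also, thanks are due to our colleagues at AT&T Bell Laboratories, Noach Amitay and Jim Mazo, for listening to our frequent arguments and sharing their expertise with us.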

REFERENCES

[1] J. H. Winters, J. Salz, and R. D. Gitlin, "The impact of antenna diversity on the capacity of wireless communication systems," IEEE Trans. Commun., vol. 42, no. 4, pp. 1740-1751, Apr. 1994.
[2] J. H. Winters, "Signal acquisition and tracking with adaptive arrays in the digital mobile radio system IS-54 with flat fading," IEEE Trans. Veh. Technol., vol. 42, no. 4, pp. 377-384, Nov. 1993.
[3] J. H. Winters, "Optimum combining in digital mobile radio with cochannel interference," IEEE J. Select. Areas Commun., vol. SAC-2, no. 4, pp. 528-539, July 1984.
[4] J. H. Winters, "Optimum combining for indoor radio systems with multiple users," IEEE Trans. Commun., vol. COM-35, no. 11, pp. 1222-1230, Nov. 1987.
[5] J. H. Winters, "On the capacity of radio communication systems with diversity in a Rayleigh fading environment," IEEE J. Select. Areas Commun., vol. SAC-5, no. 5, pp. 871-878, June 1987.
[6] S. A. Hanna, M. El-Tanany, and S. A. Mahmoud, "An adaptive combiner for co-channel interference reduction in multi-user indoor radio systems," in Proc. IEEE Veh. Technol. Conf., St. Louis, MO, May 19-22, 1991, pp. 222-227.
[7] W. C. Y. Lee, "Effects on correlation between two mobile radio base-station antennas," IEEE Trans. Commun., vol. COM-21, pp. 1214-1224, Nov. 1973.
[8] W. C. Jakes, Jr., et al., Microwave Mobile Communications. New York: Wiley, 1974.
[9] R. A. Monzingo and T. W. Miller, Introduction to Adaptive Arrays. New York: Wiley, 1980.
[10] R. T. Compton, Jr., Adaptive Antennas: Concepts and Performance. Englewood Cliffs, NJ: Prentice-Hall, 1988.
[11] G. L. Turin, "Communication through noisy, random-multipath channels," MIT Lincoln Lab., Tech. Rep. 116, May 1956.
[12] W. C. Y. Lee, Mobile Communications Engineering. New York: McGraw-Hill, 1982, pp. 275-280.

Capacity Improvement with Base-Station Antenna Arrays in Cellular CDMA

Ayman F. Naguib, Student Member, IEEE, Arogyaswami Paulraj, Fellow, IEEE, and Thomas Kailath, Fellow, IEEE

Abstract-In this paper, the use of an antenna array at the base station for cellular CDMA is studied. We present a performance analysis for a multicell CDMA network with an antenna array at the base station for use in both base-station to mobile (downlink) and mobile to base-station (uplink) links. We model the effects of path loss, Rayleigh fading, log-normal shadowing, multiple access interference, and thermal noise, and show that by using an antenna array at the base station, both in receive and transmit, we can increase system capacity several fold. Simulation results are presented to support our claims.

Manuscript received December 1, 1993; revised April 10, 1994. This research was supported in part by the SDIO/IST Program managed by the Army Research Office under grant DAAH04-93-G-OO~9. This paper was presented in part in the 27th Asilomar Conference on Computer, Signals, and Systems. The authors are with the Information Systems Laboratory, Stanford University, Stanford, CA 94305 USA. IEEE Log Number 9403~06.

I. INTRODUCTION

THE increasing demand for mobile communication services without a corresponding increase in RF spectrum allocation motivates the need for new techniques to improve spectrum utilization. One approach for increased spectrum efficiency in digital cellular is the use of spread spectrum code-division multiple-access (CDMA) technology [1], [2]. Despite the high capacity offered by CDMA technology, the expected demand is likely to outstrip the projected capacity with the introduction of Personal Communication Networks (PCN). One approach that shows real promise for substantial capacity enhancement is the use of spatial processing with a cell site antenna array [3]-[11]. By using spatial processing at the cell site, we can estimate the array response vector and use optimum directional receive and transmit beams to improve system performance and increase capacity. Such improved antenna processing can be incorporated into the proposed CDMA transmission standards.

The increase in system capacity by using antenna arrays in CDMA comes from reducing the amount of co-channel interference from other users within its own cell and neighboring cells. This reduced interference translates into an increase in capacity. The currently proposed IS-95 CDMA standard already incorporates a degree of spatial processing through the use of simple sectored antennas at the cell site. It employs three receive and transmit beams of width 120° each to cover the azimuth. Sectoring nearly triples system capacity in CDMA. While it might appear that even narrower sectors might yield further capacity gains, simple planar wavefront assumptions used in sectoring are not valid for narrow beams that employ large apertures. Simple sectoring, therefore, suffers from significant losses and motivates the need for "smart antennas" that adapt to a dynamic spatial channel seen by the cell site antenna array.

In this paper, we study the capacity improvement of a multicell CDMA cellular system with a base-station antenna array for both the downlink and the uplink. As in the proposed IS-95 CDMA standard, we assume that the uplink and the downlink occupy different frequency bands. We adopt the Rayleigh fading and log-normal shadowing model in [12] to model signal level. In this model, the fast fading around the local mean has a Rayleigh distribution. Due to shadowing, the local mean fluctuates around the area mean with a log-normal distribution and standard deviation $\sigma_s$, which varies between 6 and 12 dB, depending on the degree of shadowing. We also assume that the received signal power falls off with distance according to a fourth power law. That is, the path loss between the user and the cell site is proportional to $r^{-4}$, where $r$ is the distance between user and cell site.

In the next section, we analyze system capacity in the uplink, where each signal propagates through a distinct path and arrives at the base station with independent fading. In Section III, we also analyze system capacity in the downlink, where all signals received at the mobile from the same base station undergo the same fading. Next, in Section IV we present simulation results. Finally, Section V contains our concluding remarks.
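The signal-level model adopted in the Introduction (fourth-power path loss, log-normal shadowing of the local mean, Rayleigh fast fading) can be made concrete with a short sampling sketch. This is an illustration under stated assumptions, not the authors' simulator: the function name, the unit-mean normalization of the Rayleigh power, and the choice of distance units are all hypothetical.

```python
import math
import random

def received_power(r, sigma_s_db, rng):
    """Draw one received-power sample at distance r (arbitrary units).

    Assumed model: power = r**-4 * 10**(xi/10) * |h|**2, where xi is a
    zero-mean Gaussian in dB (log-normal shadowing) and |h|**2 is
    exponential with unit mean (Rayleigh amplitude fading).
    """
    shadow_db = rng.gauss(0.0, sigma_s_db)
    local_mean = r ** -4 * 10.0 ** (shadow_db / 10.0)
    rayleigh_power = rng.expovariate(1.0)
    return local_mean * rayleigh_power

# Same random draws at two distances isolate the path-loss term:
rng_near = random.Random(1)
rng_far = random.Random(1)
p_near = received_power(1.0, 8.0, rng_near)
p_far = received_power(2.0, 8.0, rng_far)
loss_db = 10.0 * math.log10(p_near / p_far)
print(loss_db)  # doubling the distance costs 40*log10(2) dB
```

Reusing the seed for both distances keeps the shadowing and fading draws identical, so the printed difference reflects only the fourth-power law.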

Reprinted from IEEE Transactions on Vehicular Technology, Vol. 43, No. 3, pp. 691-698, August 1994.

II. MOBILE-TO-BASE LINK

We assume that the cell site alone uses a multielement antenna array to receive and transmit signals from and to the mobile. No antenna arrays are considered for the mobile due to practical difficulties in implementing such a concept. Consider a scenario where there are $N$ users randomly distributed around each cell site at varying ranges. We assume that the receiver is code locked onto every user but does not know the direction-of-arrival (DOA) of these users. Each user transmits a PN code modulated bit stream with a spreading factor (processing gain) of $L$. Let $P$ be the received signal power at the cell site, let the system noise power (excluding interference from other in-band users) be $\sigma^2$ and, finally, let $M$ be the number of antenna elements. Assuming perfect instantaneous power control, the interference from a mobile within a given mobile's cell will arrive at the cell site with the same power $P$. Since mobiles in other cells are power controlled by their cell sites, the interference power from such mobiles, when active, at the desired user's cell site is given by [1]

$$P_{i_k} = P\left(\frac{r_{i_k}^{(k)}}{r_{i_k}^{(0)}}\right)^4 \frac{\|\alpha_{i_k}^{(0)}\|^2}{\|\alpha_{i_k}^{(k)}\|^2} = P\beta_{i_k}^2 \tag{1}$$

where $r_{i_k}^{(k)}$ is the distance from the $i_k$-th user in the $k$-th cell to its cell site, $\alpha_{i_k}^{(k)}$ is a zero-mean complex Gaussian random variable that represents the corresponding amplitude fade along that path and combines both the Rayleigh fading and log-normal shadowing effects (i.e., $\|\alpha_{i_k}^{(k)}\|$ has a Rayleigh distribution whose mean square value $E\{\|\alpha_{i_k}^{(k)}\|^2\}$ is log-normal; i.e., $10\log_{10} E\{\|\alpha_{i_k}^{(k)}\|^2\}$ is normally distributed with zero mean and variance $\sigma_s^2$), $r_{i_k}^{(0)}$ is the distance between the same $i_k$-th mobile in the $k$-th cell and the desired user's cell site (i.e., cell site 0), and finally $\alpha_{i_k}^{(0)}$ is the corresponding amplitude fade. Note that in [1], only the effects of the log-normal shadowing are considered. Note also that since the mobile will be controlled by the cell site that has minimum attenuation, $\beta_{i_k} \le 1$ [1].

Fig. 1 shows a desired signal and interference signals from mobiles within cells and outer cells for both omnidirectional beams and directional beams. Clearly, directional beams reduce the interference power and boost the signal to interference-plus-noise ratio.

[Fig. 1. Interference in uplink with and without beamforming. Legend: desired signal; own-cell interference, constant power; outside-cell interference, variable power.]

To be able to form such beams, we need to estimate the array response vector, or the spatial signature, of the desired user mobile. Using this estimate of the array response vector, we can form a beam towards each mobile. Assuming a narrowband signal model, the $M \times 1$ output of an array of $M$ sensors at the cell site can be written as

$$x(t) = \sum_{i_0=1}^{N} \psi_{i_0}\sqrt{P}\, b_{i_0}\!\left(\left\lfloor \frac{t - \tau_{i_0}}{T} \right\rfloor\right) c_{i_0}(t - \tau_{i_0})\, a_{i_0} + \sum_{k=1}^{K}\sum_{i_k=1}^{N} \psi_{i_k}\sqrt{P}\,\beta_{i_k}\, b_{i_k}\!\left(\left\lfloor \frac{t - \tau_{i_k}}{T} \right\rfloor\right) c_{i_k}(t - \tau_{i_k})\, a_{i_k} + n(t) \tag{2}$$

where $K$ is the number of interfering cells, $a_{i_k}$ is the $M \times 1$ array response vector for the signal arriving from the $i_k$-th mobile in the $k$-th cell and we assume that $a_{i_k}^* a_{i_k} = 1$, $c_{i_k}(t)$ is the code used by that user, $b_{i_k}(\cdot)$ is the bit of duration $T$, $\tau_{i_k}$ is the propagation delay, $\psi_{i_k}$ is a Bernoulli variable with probability of success $\nu$ that models the voice activity of the same user (i.e., a user will be talking with probability $\nu$), and $n$ is the thermal noise vector with zero mean and covariance

$$E\{n(t)\,n^*(\tau)\} = \frac{\sigma^2}{M} I, \quad t = \tau \tag{3}$$

$$E\{n(t)\,n^*(\tau)\} = 0, \quad t \ne \tau. \tag{4}$$

Equation (3) implies that the noise is both temporally and spatially white . For the desired user . let Ow T". (' " . and b, (.) be the array response vector, the time delay. the used code, and the transmitted bits, which are assumed to be i. i.d . binary random variables taking values ± I with equal probability, respectively . The antenna outputs are correlated with the desired user's code c., to yield one sample vector per bit. Without loss of generality . as sume that T o = O. The post-correlation signal vector for the desired user's L-th bit is given by z; (l)

= \'1x(t) Co (I) dt

11

N

= So (l )0" + in2::2 h =

K

.v

+ 2:: 2::

k=1 ;, = \

where t l

= (/ -

I)T,

(2

Ii" (l )ai"

1/;;J;,(l)ah

= LT,

+

nr(t)

(5 )

and (6~

K

fjJ/) =

Var {n2}

r

It ~Th J )

JPl3jk bjk (

,12 Co (r)n(r) J/

dt,

= g {xx *}

(12)

(13)

N

I(

t_2: o == 2 ~ioHa:at"lJ2, K

11

(14)

N

= k2: 2: ~h{3~ lIa(;ah 11 2 . = I u. == ,

(15)

The probability of outage is defined as the probability of the bit error rate exceeding a certain threshold P0 required for acceptable performance. As noted in [1], with efficient modems and powerful convolutional codes, adequate performance (BER < IO-'~) is achieved with Eb / (No + (J < 7 dB. Let S be the Eh/(N(} + 1 value required to achieve the level of performance. then the outage probabil ity is 0 )

Pr (BER > Po)

=

Pr (

h

E < No + 1 0

Pr (II + f, > ~ _ (J~). S MP

III

."J'

= LJPbo(l) + L: I"

-= 2

V;t"("a(;ar,,

.V

2:-= I u;_2:== I

if; u; l., a (~ar, + a (;n T (I ) .

( 10)

The first term So (I) is due to signals from the desired user, the second term n I is due to interference from users within its own cell. the third term n~ is due to interference from users outside the cell: both are zero mean, and I1r is due to the additive thermal noise, which is normal with zero mean and variance equal to Var tnT} == La 2/M. Additionally, we assume that each user's code consists of a sequence of L i.i.d. binary random variables taking values ± I with equal probability. As noted in [15], with asynchronous transmission, random-sequence codes give approximately the same analytical results for nonrandomly chosen codes. Under this assumption and using the results in [15], we can show that the variances of nl and n2 are given by N

Var {n I}

2.

where I, and 12 are the interference-to-signal power ratios due to own cell and outer cell users respectively, and are given by

do (I) == a {; Zo(' )

k

11

~

as the generalized principal eigenvector of the matrix pair (R u • R:..:-.). Using this estimate of a". the post-correlation antenna outputs are combined via bearnforrnins to estimate the signal from the desired user. The decision variable. which is the output of the bearnforrner. is then given by

+

Ik

(9)

R~)~} = e{Z(1:(~}

K

0

L

In order to combine the array outputs to estimate the signal from the desired mobile, we need to determine the array response vector for the wavefront arriving from this user. In general, in COMA systems the number of users will far exceed the number of antennas. Therefore, subspace methods of direction-of-arrival estimation (e.g., MUSIC [13] and ESPRIT [14]) are not applicable. In [3], we showed that the array response vector of the desired mobile au can be estimated from the pre-correlation and post-correlation array covariances Rand R.. where r.r ~",-",

= So (/) +

Ik,

pend on the voice activity of the users, their array response vectors, and fading and shadowing effects. The faded energy-per-bit to interference-plus-noise densities ratio can be written as

cjl(r - T1k)CO(t) dt,

1

e;

ik,

(7) These variances are themselves random variables that de-

(8) nr(!) =

N

= LP k2: 1/;·13~'ie lIa*a, = , 2: = 1

= LP j"L;;2 1f,,,IIa:aj"ll\

(11)

446

s)

(16)

This expression gives the outage probability as a function of the random variables $I_1$ and $I_2$. The distribution of the random variables $I_1$ and $I_2$ depends on the number of active users, their relative distances, their array response vectors, array parameters, and fading and shadowing effects. The capacity of the system in terms of maximum cell loading can be found by finding the maximum $N$ such that, for the required BER, $P_{out}$ will not exceed the preset threshold. To obtain $P_{out}$ as a function of $N$, we need to specify the array (i.e., the number of sensors, spacing between them, and their arrangement) to be able to find the distribution of $\|a_0^* a_{i_k}\|^2$, and hence the distribution of $I_1$ and $I_2$.

To simplify the evaluation, we use the following first order approximation. As we pointed out earlier, the effect of forming a beam towards the desired user is to reduce the effective number of interferers to those mobiles that fall within the beam formed towards the desired mobile. Since the number of those interferers is random, we approximate this effect by replacing the $\|a_0^* a_{i_k}\|^2$ term in $I_1$ and $I_2$ by a Bernoulli random variable $\chi_{i_k}$ that has a probability of success $B/2\pi$, where $B$ is the effective beamwidth, and this probability is equal to $E\{\|a_0^* a_{i_k}\|^2\}$. This random variable represents the interference activity of the users, i.e., a mobile will cause interference to the desired mobile if it falls within its beam. In this case, we can write $I_1$ and $I_2$ as

$$I_1 = \sum_{i_0=2}^{N} \psi_{i_0}\chi_{i_0} = \sum_{i_0=2}^{N} \phi_{i_0} \tag{17}$$

$$I_2 = \sum_{k=1}^{K}\sum_{i_k=1}^{N} \psi_{i_k}\chi_{i_k}\beta_{i_k}^2 = \sum_{k=1}^{K}\sum_{i_k=1}^{N} \phi_{i_k}\beta_{i_k}^2 \tag{18}$$

where $\phi_{i_k} = \psi_{i_k}\chi_{i_k}$ is a Bernoulli random variable with probability of success $\tilde{\nu} = \nu B/2\pi$. The distribution of $f = \|\alpha_{i_k}^{(0)}\| / \|\alpha_{i_k}^{(k)}\|$ is given by [12]

$$\Pr(f < r) = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} \frac{\exp(-u^2)}{1 + r^{-2}\, 10^{-2\sigma_s u/10}}\, du. \tag{19}$$
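The distribution of the fade ratio $f$ can be evaluated numerically. The sketch below is a hedged illustration: it assumes (19) has the form $\Pr(f < r) = \pi^{-1/2}\int \exp(-u^2)\,/\,(1 + r^{-2}10^{-2\sigma_s u/10})\,du$, and integrates it with a plain trapezoidal rule. The function name, grid size, and truncation limit are hypothetical choices, not taken from the paper.

```python
import math

def ratio_cdf(r, sigma_s_db, n=2001, u_max=6.0):
    """Trapezoidal evaluation of the assumed form of (19).

    r          : threshold on the amplitude-fade ratio f
    sigma_s_db : shadowing standard deviation in dB
    The Gaussian weight exp(-u^2) is negligible beyond |u| = u_max.
    """
    h = 2.0 * u_max / (n - 1)
    total = 0.0
    for i in range(n):
        u = -u_max + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        shadow = 10.0 ** (-2.0 * sigma_s_db * u / 10.0)
        total += weight * math.exp(-u * u) / (1.0 + shadow / (r * r))
    return total * h / math.sqrt(math.pi)

# Two sanity checks that follow from the integrand itself:
p_equal = ratio_cdf(1.0, 8.0)      # symmetric case
p_no_shadow = ratio_cdf(2.0, 0.0)  # reduces to r^2 / (1 + r^2)
print(p_equal, p_no_shadow)
```

Two properties make the sketch easy to check. At $r = 1$ the integrand pairs at $u$ and $-u$ sum to $\exp(-u^2)$, so the result is one half for any $\sigma_s$. With $\sigma_s = 0$ the shadowing term disappears and the expression collapses to $r^2/(1 + r^2)$, the distribution of the ratio of two independent Rayleigh amplitudes with equal mean power.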

For a large number of users, the random variable $I_2$ (interference due to $K \cdot N$ users) can be approximated by a Gaussian random variable with mean $\mu_I N$ and variance $\sigma_I^2 N$ that depend on $\tilde{\nu}$, the degree of shadowing $\sigma_s$, and the number of interfering cells $K$. We have evaluated the mean and variance of $I_2$ using Monte Carlo integration considering only the first two tiers of interfering cells (i.e., $K = 18$) and these were found to be given by

$$\mu_I = 0.523\,\tilde{\nu} \tag{20}$$

$$\sigma_I^2 = 0.463\,\tilde{\nu} - 0.274\,\tilde{\nu}^2. \tag{21}$$

Also, the random variable $I_1$ has a binomial distribution with parameters $(N - 1, \tilde{\nu})$. Let $L/S - \sigma^2/(PM) = \delta$. Since $I_1$, $I_2$, and all $\phi_{i_k}$ are independent, we can use the results in [1] to show that

$$P_{out} = \sum_{k=0}^{N-1} \binom{N-1}{k} \tilde{\nu}^k (1 - \tilde{\nu})^{N-1-k}\, Q\!\left(\frac{\delta - k - \mu_I N}{\sqrt{\sigma_I^2 N}}\right) \tag{22}$$

where

$$Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-y^2/2}\, dy.$$

This equation gives the outage probability as a function of the number of mobiles per cell that can be supported. Note that this reduces to the result in [1] when no antenna arrays are used at the cell site. The results of evaluating (22) as a function of cell loading and beamwidth are shown in Fig. 3. Also, simulations to evaluate the accuracy of the above approximation are shown in Fig. 5 and discussed in Section IV.

III. BASE-TO-MOBILE LINK

Consider now the base-to-mobile link. We assume a similar scenario as in the uplink. With an antenna array at the cell site, the cell site must also beamform on the downlink in order to effectively increase the system capacity. To be able to form such beams, the cell site needs to have an estimate of the transmit array response vector to each mobile. However, in the current standard, frequencies for the uplink and downlink differ by 45 MHz. In this case, the receive and transmit response vectors can be significantly different [16], [17]. Hence, reciprocity between uplink and downlink does not hold and the beamformer weights used for reception cannot be used for transmission. A method of performing transmission beamforming is the feedback method [18], [19], where training signals, or tones, are periodically transmitted from the cell site to all mobiles on the downlink. From the received signal information that the mobiles feed back to the cell site on the uplink, it is possible to estimate the downlink spatial channel, and thus estimate the transmit array response vector.

All signals received at the mobile from the same base station will have propagated over the same path, hence they will experience the same fading and path loss. Therefore, we assume that the cell site transmits the same power to all mobiles controlled by that cell site. With this assumption, the power of each signal arriving at the desired mobile from the $k$-th cell site is given by


(23)

where a t' represents the fading and shadowing experience by all signals arriving at the desired mobile from the k-th cell site. and ,.~Ol is the distance between the desired mobile from its cell site. As in [1]. we assume that the power received by the mobile from its cell site is the largest among all other signals from other cell sites (otherwise the mobile would switch to the cell site whose received power is maximum). That is. we assume that k

(22)

Ja~ ,v

Q(x) = -

p . p~

(20)

0.523~.

=

1. . . . • K.

(24)

Fig. 2 shows desired signal and interference powers seen by the desired mobile for both omni- and directional beams. Assuming N users per cell randomly distributed around each cell site at varying ranges, we can write the received signal at the mobile of interest as xo(t) =

iO~1 '1/;;" JP13"b +

f f

k = I i~ = I

jo (

[t -TT,,,

(J;,JP13k b" ( I t _

I

)

-

T

edt -

T,~

TIJa,:'a~»)

\ )

(25)

where n(t) is the background noise received by the mobile, and a ~:) is the transmit array response vector of the desired mobile as seen by the k-th cell site. All other notations remain the same as in the previous section. The mobile correlates the received signal by its code to yield

terference due to signals from its own cell site nl is zero . However, we assume here that there will be cross-correlation between those signals, which represents a worst case. Hence, similar to the uplink case, we can show that the variance of n I and n2 is given by


N

Var {nIl

1ol Il 2 = LP o illL:= 2 !/;·lIa*a 'I}

Var {n2l

= L k=2:I r, il 2:= I

10

K

(28)

0

N

!/;illla~a~k lI12,

(29)

and the energy-per-bit to interference-plus-noise densities ratio can be written as

L

,

0'-


(30)

+ G 1 + G2

Po

where G 1 and G2 are the interference-to-signal power ratios due to their own cell and outer cell signals, respectively , and are given by (31) K

-

\ x \

x

\

X

P "lIr

= s,,(/) +

III

(l)

+

II~ (/)

+

"r(l)

.v

= L(3" JPb,,(/) + 2: !/;,) ,,,a;' a::" I"

K

=~

.V

L: L:= I O',Ji,a,:a :;' + k= I 'I

"r(ll.

(26)

where 1'1 is defined as before but with (3k instead of (3'1 and "r(t)

I{

k

P"

-

IIa "* a ()tkl 112 .

(32 )

= Pr (BER > P,,) = Pr

(0'2

P"

+ G1 +

G~

>

!::.) . S

(33)

the decision variable

+

k = I il = I

The corresponding outage probability is then given by

X

Fig. 2. Interference in downlink with and without beamforming.

d;

P

.V

G~:- " ~ " LJ 1/;

= 1, (c,,(t)n(r)

dt ,

(27)

As in the uplink, the first term $s_0$ is due to the desired signal from the cell site to the desired user; the second term $n_1$ is due to interference from the same cell site into the desired user, which is zero mean; the third term $n_2$ is due to interference from other cell sites into the desired user; and $n_T$ is due to the additive thermal noise, which is normal with zero mean and variance $\mathrm{Var}\{n_T\} = L\sigma^2$.

In the proposed CDMA standard 15-95, orthogonal codes (Walsh codes) are used on the downlink for all users within a cell, i.e .. in the ideal case (no multi path) there is no cross-correlation between those signals and the in-


Unlike the uplink case, the distribution of $G_2$ does not lend itself to analysis (here we have only K independent fading variables, while in the uplink case we had K · N fading variables, and when N is large we were able to model $I_2$ as Gaussian). Therefore, we resort to simulations to estimate $P_{out}$ as a function of cell loading and number of sensors, from which we can obtain the system capacity (maximum cell loading) as a function of the number of sensors. The results of these simulations are shown in Figs. 6-9 and are discussed in Section IV.

IV. SIMULATION AND NUMERICAL RESULTS

In all of our simulations and numerical results, we consider only the first two tiers of interfering cells, which means that K = 18 cells. We assume that the voice activity factor v is 0.375. We assume that for adequate performance, the required BER is $10^{-3}$, which corresponds to $E_b/(N_0 + I_0)$ of 7 dB. We also assume that the processing gain L is 128. Finally, we assume that the standard deviation of the log-normal shadowing is 8 dB. For the uplink, the outage probability was computed using (22). The results are summarized in Fig. 3. This figure shows that by using an antenna array to form narrow beams towards the desired mobiles, a many-fold increase in system capacity can be obtained. For example,

Fig. 3. Uplink outage probability as a function of beamwidth. (Plot of outage probability versus N, the number of users per cell, for the omnidirectional case and for several beamwidths.)

Fig. 4. Actual and approximate beam patterns.

for 0.01 outage probability, the uplink system capacity goes up from 31 users per cell for the single antenna case to about 320 users per cell if we use an array such that we have beams with a beamwidth of 30°. Also, to evaluate the accuracy of the approximation we used, we simulated the system (based on (14)-(16)) with a cell site circular antenna array with nine elements and radius equal to λ/2, corresponding to a half-power beamwidth of 42°. In fact, the beamwidth was taken to be slightly more than 42° to account for the interference energy picked up through the side lobes of the array pattern. Fig. 4 shows the actual array pattern versus the approximate pattern. In Fig. 5, we plot the outage probability computed both from (22) and from simulations, which indicate good agreement between the simulation results and the approximation.

For the downlink, results for the outage probability were obtained by simulations based on (31)-(33). A circular array with one, five, and seven elements and λ/2 spacing was used in the simulation. With all other parameter values as above, the histogram of $E_b/(N_0 + I_0)$ is obtained for each M and N value from 20,000 runs. In each run, 19 Rayleigh random variables with mean square values that have a log-normal distribution with a standard deviation of 8 dB are generated, and the maximum of these is taken to be that of the desired mobile's cell site. Also, we assume that the mobile is positioned on the boundary between cells, which represents a worst case situation. Some of the generated histograms of $G_1 + G_2$ are shown in Figs. 6, 7, and 8. The generated histograms are used to estimate the probability of outage as a function of cell loading and number of sensors. These results are summarized in Fig. 9, which also shows a many-fold increase in capacity by using antennas to form narrow beams towards the desired user.

Note that, as we mentioned before, if orthogonal codes are used on the downlink and in the case of no multipath, interference will be primarily due to outside cell interference and the corresponding cell loading N at which outage will occur will be larger.
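The per-run fading draw described above (19 Rayleigh variables with log-normally distributed mean square values, the largest assigned to the serving cell site) can be sketched in Python as follows. This is only an illustration of that one step under stated assumptions: the function names are hypothetical, the shadowing is taken as zero-mean Gaussian in dB, and the squared Rayleigh amplitude is drawn as an exponential variable with the shadowed mean.

```python
import random


def draw_cell_site_powers(num_sites=19, shadow_sigma_db=8.0, rng=None):
    """Draw one received-power sample per cell site for a single run.

    Each sample is a squared Rayleigh amplitude, i.e. an exponential
    variable, whose mean is log-normal with the given dB spread.
    """
    rng = rng or random.Random()
    powers = []
    for _ in range(num_sites):
        mean_square = 10.0 ** (rng.gauss(0.0, shadow_sigma_db) / 10.0)
        powers.append(rng.expovariate(1.0 / mean_square))
    return powers


def split_home_and_interferers(powers):
    """Treat the strongest site as the serving cell site; the rest interfere."""
    home_index = max(range(len(powers)), key=powers.__getitem__)
    interferers = [p for i, p in enumerate(powers) if i != home_index]
    return powers[home_index], interferers
```

Repeating this draw many times and accumulating the resulting interference-to-signal ratios would give histograms of the kind described in the text; the array-pattern weighting of each interferer is deliberately left out of this sketch.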

Fig. 5. Uplink outage probability: simulation vs. approximation. (Plot of outage probability versus N, the number of users per cell, comparing simulations with the analytical approximation.)

Fig. 6. Histogram of $G_1 + G_2$ for M = 1, N = 30. (Histogram over interference power; number of mobiles per cell N = 30, number of array sensors M = 1, number of runs R = 20000.)

Fig. 7. Histogram of $G_1 + G_2$ for M = 5, N = 130. (Histogram over interference power; number of mobiles per cell N = 130, number of array sensors M = 5, number of runs R = 20000.)

Fig. 8. Histogram of $G_1 + G_2$ for M = 7, N = 170. (Histogram over interference power; number of mobiles per cell N = 170, number of array sensors M = 7, number of runs R = 20000.)

Fig. 9. Downlink outage probability vs. number of array sensors. (Plot of outage probability versus N, the number of mobiles per cell.)

V. CONCLUSIONS

We have studied the capacity improvement for CDMA cellular communications systems with a base-station antenna array for both uplink and downlink. The outage probability was evaluated as a function of cell loading, array parameters, fading and shadowing effects, and voice activity. Our analytical and simulation results show that there can be a substantial increase in system capacity by incorporating antenna arrays at the base station. Our approach uses spatial processing to determine the dynamic spatial wavefront at the cell site and constructs a robust beamformer. The model used in this paper does not include the effects of multipath, which will be presented in a different paper.

REFERENCES

[1] K. S. Gilhousen, I. M. Jacobs, R. Padovani, A. Viterbi, L. A. Weaver, and C. Wheatley, "On the capacity of a cellular CDMA system," IEEE Trans. Veh. Technol., vol. 40, no. 2, pp. 303-312, May 1991.
[2] A. M. Viterbi and A. J. Viterbi, "Erlang capacity of a power controlled CDMA system," IEEE J. Select. Areas Commun., vol. 11, no. 6, pp. 892-900, Aug. 1993.
[3] B. Suard, A. Naguib, G. Xu, and A. Paulraj, "Performance analysis of CDMA mobile communication systems using antenna arrays," in Proc. ICASSP'93, vol. VI, Minneapolis, MN, pp. 153-156, April 1993.
[4] A. F. Naguib, A. Paulraj, and T. Kailath, "Capacity improvement with base-station antenna arrays in cellular CDMA," in Proc. 27th Asilomar Conf. on Signals, Systems and Computers, vol. II, Pacific Grove, CA, pp. 1437-1441, Nov. 1993.
[5] J. C. Liberti and T. S. Rappaport, "Reverse channel performance improvements in CDMA cellular communication systems employing adaptive antennas," in Proc. GLOBECOM'93, vol. I, pp. 42-47, 1993.
[6] V. Weerackody, "Diversity for the direct-sequence spread spectrum system using multiple transmit antennas," in Proc. ICC'93, vol. III, Geneva, Switzerland, May 1993.
[7] P. Balaban and J. Salz, "Optimum diversity combining and equalization in data transmission with application to cellular mobile radio, Part I: Theoretical considerations," IEEE Trans. Commun., vol. 40, no. 5, pp. 885-894, May 1992.
[8] S. C. Swales, M. A. Beach, D. J. Edwards, and J. P. McGeehan, "The performance enhancement of multibeam adaptive base station antennas for cellular land mobile radio systems," IEEE Trans. Veh. Technol., vol. 39, no. 1, pp. 56-67, Feb. 1990.
[9] J. Winters, J. Salz, and R. Gitlin, "The capacity of wireless communication systems can be substantially increased by the use of antenna diversity," in Proc. Conf. on Information Science and Systems, vol. II, Princeton, NJ, pp. 853-858, Oct. 1992.
[10] S. Anderson, M. Millnert, M. Viberg, and B. Wahlberg, "An adaptive array for mobile communication systems," IEEE Trans. Veh. Technol., vol. 40, no. 1, pp. 230-236, Feb. 1991.
[11] R. Kohno, H. Imai, and S. Pasupathy, "Combination of an adaptive antenna array and a canceller of interference for direct-sequence spread spectrum multiple-access system," IEEE J. Select. Areas Commun., vol. 8, no. 4, pp. 675-682, May 1990.
[12] R. C. French, "The effect of fading and shadowing on channel reuse in mobile radio," IEEE Trans. Veh. Technol., vol. VT-28, no. 8, pp. 171-181, Aug. 1979.
[13] R. O. Schmidt, "A signal subspace approach to multiple-emitter location and spectral estimation," Ph.D. dissertation, Stanford Univ., Stanford, CA 94305, 1981.
[14] A. Paulraj, R. Roy, and T. Kailath, "Estimation of signal parameters by rotational invariance techniques (ESPRIT)," in Proc. of 19th Asilomar Conf. on Circuits, Systems and Comp., 1985.
[15] W.-P. Yung, "Direct sequence spread-spectrum code-division-multiple access cellular systems in Rayleigh fading and log-normal shadowing channel," in Proc. ICC'91, vol. II, pp. 871-876, 1991.
[16] G. Xu, H. Liu, W. Vogel, H. Lin, S. Jeng, and G. Torrence, "Experimental studies of space-division-multiple-access schemes for spectral efficient wireless communications," submitted to SuperCom/ICC'94, May 1994.
[17] J. H. Winters, "Signal acquisition and tracking with adaptive arrays in wireless systems," in Proc. 43rd Veh. Technol. Conf., vol. I, pp. 85-88, Nov. 1993.
[18] D. Gerlach and A. Paulraj, "Base-station transmitting antenna arrays with mobile to base feedback," in Proc. 27th Asilomar Conf. on Signals, Systems and Computers, Pacific Grove, CA, pp. 1432-1436, Nov. 1993.
[19] Y. Akaiwa, "Antenna selection diversity for framed digital signal transmission in mobile radio channel," in Proc. VTC'89, vol. I, pp. 470-473, 1989.

Analytical Results for Capacity Improvements in CDMA

Joseph C. Liberti, Jr., Student Member, IEEE, and Theodore S. Rappaport, Senior Member, IEEE

Abstract-In this paper, we examine the performance enhancements that can be achieved by employing spatial filtering in code division multiple access (CDMA) cellular radio systems. The goal is to estimate what improvements are possible using narrow-beam adaptive antenna techniques, assuming that adaptive algorithms and the associated hardware to implement these systems can be realized. Simulations and analytical results are presented which demonstrate that steerable directional antennas at the base station can dramatically improve the reverse channel performance of multicell mobile radio systems, and new analytical techniques for characterizing mobile radio systems which employ frequency reuse are described using the wedge-cell geometry of [1]. We also discuss the effects of using directional antennas at the portable unit. Throughout this paper we will use phased arrays and steerable, fixed pattern antennas to approximate the performance of adaptive antennas in multipath-free environments.

Manuscript received September 30, 1993; revised March 31, 1994.
The authors are with the Mobile and Portable Radio Research Group, Bradley Department of Electrical Engineering, Virginia Tech, Blacksburg, VA 24061.
IEEE Log Number 9403205.
Reprinted from IEEE Transactions on Vehicular Technology, Vol. 43, No. 3, pp. 680-690, August 1994.

I. INTRODUCTION

CURRENT day mobile radio systems are becoming congested due to growing competition for spectrum. Many different approaches have been proposed to maximize data throughput while minimizing spectrum requirements for future wireless personal communications services [2], [3]. One way to increase capacity without added spectrum is to reduce cell sizes [4]. For this reason, cell sizes in emerging cellular communication systems are much smaller than cells used in land mobile cellular systems designed previously. This, however, also leads to increased infrastructure (base station) costs. Furthermore, to maximize capacity in CDMA systems, power control is required [5].

The reverse link (the link from the mobile unit to the base station) presents the most difficulty in CDMA cellular systems for several reasons. First of all, the base station has complete control over the relative power of all of the transmitted signals on the forward link; however, because of different radio propagation paths between each user and the base station, the transmitted power from each portable unit must be dynamically controlled to prevent any single user from driving the interference level too high for all other users [1]. Second, transmit power is limited by battery consumption at the portable unit, therefore there are limits on the degree to which power may be controlled. Finally, to maximize performance, all users on the forward link may be synchronized much more easily than users on the reverse link [6].

Adaptive antennas at the base station and possibly at the portable unit may mitigate these problems. In the limiting case of infinitesimal beamwidth and infinitely fast tracking ability, adaptive antennas can provide for each user a unique channel that is free from interference. All users within the system would be able to communicate at the same time using the same frequency channel, in effect providing space division multiple access (SDMA) [7]. In addition, a perfect adaptive antenna system would be able to track individual multipath components and combine them in an optimal manner to collect all of the available signal energy [8]. In this paper, we will investigate the effects of spatial filtering by simulating a phased array and by simulating antenna patterns with fixed patterns but adjustable boresight angles. Furthermore, multipath is not considered.

Clearly, the perfect adaptive antenna system described above is not feasible since it requires infinitely large antennas (or alternatively, infinitely high frequencies). This raises the question of what gains might be achieved using reasonably sized antenna arrays which operate at UHF and microwave frequencies.

While both TDMA and CDMA systems have been proposed for emerging personal communication systems, CDMA is more naturally suited to the pseudo-SDMA environment. This is because co-channel users do not have to be synchronized with each other in a CDMA system. As the advantages of SDMA are realized, the interference levels seen by each simultaneous CDMA user drop, and the bit-error performance will improve for each CDMA user. On the other hand, when no SDMA is achieved, CDMA performance is no worse than the case where omnidirectional antennas are used at both the base station and the portable unit. In a single cell TDMA system, users must be reassigned to new time slots to take any advantage of SDMA.

For interference limited asynchronous reverse channel CDMA over an additive white Gaussian noise (AWGN) channel, operating with perfect power control with no interference from adjacent cells and with omnidirectional antennas used at the base station, the bit error rate (BER), $P_b$, is approximated by [6]

$$P_b \approx Q\left(\sqrt{\frac{3N}{K-1}}\right) \tag{1.1}$$

where K is the number of users in a cell and N is the spreading factor. Q(Y) in (1.1) is the standard Q-function, the probability that y > Y when y is a zero-mean, unit variance, Gaussian distributed random variable. Equation (1.1) assumes that the signature sequences are random and that K is sufficiently large to allow the Gaussian approximation described in [6] to be applied. To illustrate how directive antennas can improve the reverse link in a single cell CDMA system, consider the case in which each portable unit has an omnidirectional antenna, and the base station tracks each user in the cell using a directive beam. Assume that a beam pattern, G(
Fig. 1. A generalized adaptive antenna array with N elements. The inputs from each antenna are mixed down to an intermediate frequency and divided into I and Q components. The I and Q components from each antenna are filtered by an adaptive linear filter ($ALF_{i,j}$ is the ALF corresponding to the ith element and the jth user). The I outputs from each ALF are summed to provide $I_{out}$. Similarly, the Q outputs from each ALF are summed to provide $Q_{out}$. $I_{out}$ and $Q_{out}$ form the signal which is available to the receiver.

Fig. 2. An idealized flat-top power pattern with a 60° beamwidth and a -6 dB side lobe level. This pattern has no variation in the θ direction (the elevation plane) for 0 ≤ θ < π. This coordinate system is used throughout this paper.

a

Fig. 1) of the base station antenna array, which is steered to user 0, is given by

$$I = E\left[\sum_{i=1}^{K-1} G(\varphi_i)\, P_{r;i}\right] \tag{1.2}$$

where

( 1.3)

¹ While this work considers adaptive antennas at the base station, power control could be implemented using a reference omnidirectional antenna at the base station to receive all mobile signals.

Assuming that users are independently and identically distributed throughout the cell, the average total interference power received at the central base station may be expressed as

$$I = P_c (K - 1) \int_0^{R} \int_0^{2\pi} f(r, \varphi)\, G(\varphi)\, d\varphi\, dr \tag{1.4}$$

where $f(r, \varphi)$ is the probability density function describing the geographic distribution of users throughout the cell. Assuming that users are uniformly distributed in the cell, we have

$$I = \frac{P_c (K - 1)}{2\pi} \int_0^{2\pi} G(\varphi)\, d\varphi. \tag{1.5}$$

The directivity of an antenna which has no variation in the $\theta$ direction is [11]

$$D = \frac{2\pi}{\int_0^{2\pi} G(\varphi)\, d\varphi}. \tag{1.6}$$
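As a quick numerical check of (1.6), the short Python sketch below integrates an idealized flat-top pattern (unit gain in the main beam, uniform side lobe gain elsewhere) over the azimuth angle. The helper names and the midpoint integration are illustrative choices, not part of the original analysis.

```python
import math


def flat_top(beamwidth_deg, sidelobe_gain):
    """Return G(phi): 1 inside the main beam centered at phi = 0, sidelobe_gain outside."""
    half_width = math.radians(beamwidth_deg) / 2.0

    def pattern(phi):
        # Fold the angle into [-pi, pi) and measure its offset from boresight.
        offset = abs((phi + math.pi) % (2.0 * math.pi) - math.pi)
        return 1.0 if offset <= half_width else sidelobe_gain

    return pattern


def directivity(pattern, steps=36000):
    """Evaluate D = 2*pi / integral of G(phi) over [0, 2*pi) with a midpoint rule."""
    d_phi = 2.0 * math.pi / steps
    integral = sum(pattern((i + 0.5) * d_phi) for i in range(steps)) * d_phi
    return 2.0 * math.pi / integral


def to_db(ratio):
    return 10.0 * math.log10(ratio)
```

For a 60° main beam with a side lobe level of 0.25 this gives a directivity of about 2.67 (roughly 4.3 dB), and an omnidirectional pattern gives exactly 1, in line with the definition.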

Therefore the average total interference seen by a user in the central cell is given by

$$I = \frac{P_c (K - 1)}{D}. \tag{1.7}$$

In order to develop simple bit error rate expressions for simultaneous asynchronous interference limited CDMA users when directive antennas are used, we assume that the bit-error-rate expression of (1.1) can be expressed as

$$P_b \approx Q\left(\sqrt{3N \times \mathrm{CIR}}\right) \tag{1.8}$$

where N is the spreading factor, and CIR is the ratio of the power of the desired signal to the total interference. In (1.8), it is assumed that M interfering users, each with a received power level of P/M, have the same effect on bit error performance as one interfering user with a received power P. This assumption is known to be inaccurate when the powers of users are widely different and when the number of users is small [12]; however, it provides a first order approximation for the case of a large number of users. Using the fact that the power of the desired signal, weighted by the array pattern, is $P_c$ and using (1.7), the bit error rate for user 0 is given by

$$P_b \approx Q\left(\sqrt{\frac{3ND}{K - 1}}\right). \tag{1.9}$$

Thus, (1.9) holds for any single cell system with perfect power control when the base station antenna pattern has no variation in the $\theta$ direction. Equation (1.9) is useful in showing that the probability of error for a CDMA system is related to the beam pattern of a receiver. If we use the idealized antenna pattern illustrated in Fig. 2 to approximate a realizable directive antenna pattern, then it is immediately apparent that the gain of the antenna directly contributes to the performance of a CDMA system. For instance, if K = 250 and N = 511, with omnidirectional antennas at the base station, an average bit error rate of $6.6 \times 10^{-3}$ is obtained per user. Using the flat-top beam pattern shown in Fig. 2 with a side lobe level of 0.25 and a main beamwidth of 60°, the directivity of the antenna is 2.67 or 4.3 dB. The bit error rate with the directive antenna at the base station is $2.5 \times 10^{-5}$, a BER improvement of two orders of magnitude. This example illustrates the possible improvements that can be achieved using adaptive antennas at the base station.

In the remainder of this paper, we remove the constraint that users in adjacent cells do not interfere with the received signal, and develop a general analysis technique which is confirmed by simulation. Section II describes analytical techniques used to determine bit error rates in cellular CDMA systems employing adaptive antennas. Section III presents simulations in which we compare the performance of five base station antenna configurations, three of which use adaptively steerable antennas at the base station. It is assumed that the portable units use omnidirectional antennas. We also compare the simulation results with the analytical results developed in Section II. In Section IV, the effects of adaptive antennas at the portable unit are examined using several different base station configurations. Furthermore, we demonstrate the two distinctly different effects achieved by using directive antennas at the portable unit versus using directive antennas at the base station. Finally, Section V summarizes the results of this paper.

II. REVERSE CHANNEL PERFORMANCE WITH ADAPTIVE ANTENNAS AT THE BASE STATION

The use of adaptive antennas at the base station receiver is a logical first step in improving capacity for several reasons. First of all, space and power constraints are not nearly as critical at the base station as they are at the portable unit. Second, the physical size of the array does not pose as much difficulty at the base station as at the portable unit. Note that adaptive antennas may also be used at the base station for directing energy in the forward channel, in which case the analysis is similar to the reverse channel case because of the perfect power control assumption. The only difference on the forward link is that interferers are other base stations, rather than portable users. Since the transmitter and receiver typically operate in two different frequency bands in a duplex manner, the adaptive antennas at the base station transmitter would be adjusted by performing a transformation on the tap weights adapted for the receiver, and copying the new weights to the transmitting antennas [9]. This is reasonable if an assumption of retrodirectivity on similar frequency bands is appropriate. If the multipath components arriving in the reverse channel do not have the same angles of arrival as those in the forward channel, then it is no longer appropriate to derive the transmitter tap weights from the received signal.

Equation (1.9) is only valid when a single cell is considered. To consider the effects of adaptive antennas when CDMA users are simultaneously active in several adjacent cells, we must first define the geometry of the cell region. For simplicity, we consider the geometry proposed in [1] with a single layer of surrounding cells, as illustrated in Figs. 3 and 4. Let $d_{i,j}$ represent the distance from the ith user to base j as illustrated in Fig. 3. Let $d_{i,0}$ represent the distance from the ith user to base station 0, the center base station. Assume that path loss in dB between user i and base j is given by a simple distance dependent path loss relationship such that the power received at base station j, from the transmitter of user i, $P_{r;i,j}$, is given by

$$P_{r;i,j} = P_{T;i} \left(\frac{4\pi d_{ref}}{\lambda}\right)^{-2} \left(\frac{d_{i,j}}{d_{ref}}\right)^{-n} \tag{2.1}$$

Fig. 3. The wedge cell geometry proposed in [1].

where n is the path loss exponent, typically ranging between 2 and 4, and $d_{ref}$ is a close-in reference distance

[u.

If we assume that perfect power control is applied to the ith user, and all other users in cell j, by base j, such that power $P_{c;j}$ is received at base j, then the power transmitted by user i, $P_{T;i}$, is given by

$$P_{T;i} = P_{c;j} \left(\frac{4\pi d_{ref}}{\lambda}\right)^{2} \left(\frac{d_{i,j}}{d_{ref}}\right)^{n}. \tag{2.2}$$

The power received at base station 0 from user i is given by

$$P_{r;i,0} = P_{T;i} \left(\frac{4\pi d_{ref}}{\lambda}\right)^{-2} \left(\frac{d_{i,0}}{d_{ref}}\right)^{-n}. \tag{2.3}$$

Substituting (2.2) into (2.3), the power received at base 0 from user i, in adjacent cell j, is given by

$$P_{r;i,0} = P_{c;j} \left(\frac{d_{i,j}}{d_{i,0}}\right)^{n}. \tag{2.4}$$

Fig. 4. Geometry for determining $d_{i,j}$ as a function of $d_{i,0}$, the distance between user i and the central base station, and $\varphi_{i,0}$, the angle of user i relative to the line between the central base station and base station j.

To analyze (2.4), we consider the geometry shown in Fig. 4. From the law of cosines,

$$d_{i,j}^2 = d_{i,0}^2 + (2R)^2 - 4R\, d_{i,0} \cos\varphi_{i,0}. \tag{2.5}$$

Substituting (2.5) into (2.4), the power received at base 0 from user i is given by

$$P_{r;i,0} = P_{c;j} \left(1 + \left(\frac{2R}{d_{i,0}}\right)^2 - \frac{4R}{d_{i,0}} \cos\varphi_{i,0}\right)^{n/2}. \tag{2.6}$$

To determine the average out-of-cell interference power incident on the central base station, we assume that users are uniformly distributed in a typical adjacent cell from $r = R$ to $r = 3R$ and from $\varphi = -\pi/8$ to $\pi/8$. Thus, we use a modified geometry from [1] where eight equal area cells surround the center cell. The probability density function (pdf) for the spatial distribution of users in a single adjacent cell is given by

$$f(r, \varphi) = \frac{r}{\pi R^2}, \qquad R < r < 3R, \quad -\frac{\pi}{8} < \varphi < \frac{\pi}{8}. \tag{2.7}$$

Let X represent the expected value of the interference power from a single user in one of the adjacent cells when omnidirectional base station antennas are used.

$$X = \int_R^{3R} \int_{-\pi/8}^{\pi/8} P_{c;j}\, f(r, \varphi) \left(1 + \left(\frac{2R}{r}\right)^2 - \frac{4R}{r} \cos\varphi\right)^{n/2} d\varphi\, dr \tag{2.8}$$
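The power-control chain behind the integrand above can be checked numerically. The Python sketch below, with hypothetical function names and arbitrary illustrative values for the reference distance and wavelength, computes the transmit power a user needs to be received at level $P_c$ by its own base station, propagates that power to the central base station, and compares the result with the closed-form law-of-cosines expression.

```python
import math


def received_power(p_tx, distance, d_ref, wavelength, n):
    """Distance-dependent path loss: power received at the given distance."""
    return p_tx * (4.0 * math.pi * d_ref / wavelength) ** -2 * (distance / d_ref) ** -n


def required_tx_power(p_target, distance, d_ref, wavelength, n):
    """Transmit power needed so that p_target is received at the given distance."""
    return p_target * (4.0 * math.pi * d_ref / wavelength) ** 2 * (distance / d_ref) ** n


def interference_via_chain(p_c, d_i0, phi, cell_radius, n, d_ref=1.0, wavelength=0.33):
    """Power seen at the central base from a user power-controlled by adjacent base j."""
    # Law of cosines: the two base stations are 2R apart.
    d_ij = math.sqrt(d_i0**2 + (2.0 * cell_radius) ** 2
                     - 4.0 * cell_radius * d_i0 * math.cos(phi))
    p_tx = required_tx_power(p_c, d_ij, d_ref, wavelength, n)
    return received_power(p_tx, d_i0, d_ref, wavelength, n)


def interference_closed_form(p_c, d_i0, phi, cell_radius, n):
    """Closed form: P_c * (1 + (2R/d)^2 - (4R/d) cos(phi))^(n/2)."""
    ratio = 1.0 + (2.0 * cell_radius / d_i0) ** 2 - (4.0 * cell_radius / d_i0) * math.cos(phi)
    return p_c * ratio ** (n / 2.0)
```

The reference distance and wavelength cancel in the chain, which is why they do not appear in the closed form; a user on the cell boundary facing the central base station (distance R, angle 0) is received at exactly $P_c$ by both base stations.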

If it is assumed that all nine base stations control power such that Pc: j = Pc, then given a value of n, we can ex press the expected value of central cell interference power for a single adjacent cell user as X

=

{3P e

(2 .9)

where (3 =

[3R [11"18

JR J-1C18 Iir. 4R - -;: cos cp

({J)

((

(2R)2 1 + -;:

))n/2 dr dcp

VALUES OF

(2.10)

No No + Na1M 1

!=---

(K -

l)Pc

for

K»

:::

= (K - l)P c + 8K{3Pc-

e

2

0.14962

3

0.08238

4

0.05513

E[Pr~I.O

10

<

<

r

R]

r r

21r

R

= Pc

r 2 G(lP) dr dlP

Jo Jo

wR

= P,

(2.14)

D

where D is the directivity of the beam with pattern G('P) and the average received power at the base. Pr : i . O from an interfering user in the central cell is directly a function of the base station directive gain. Then the average interference power at the array port of the antenna array at the base station, as shown in Fig. 1. due to a single user in an adjacent cell is given by

E[P r : 1•O IR

< r < 3R]

1 7 i·~R 1~:~ =-2: 8 p =0

R

-

. p( ( I + (

s

7r

(

P7r) r 'P+- --~ 4 7rR-

G

2rR)2-

4R - cos

,.

(~)

)" .: dr d4P. (2.15)

(2.12)

where we have assumed that there are K users in each of the nine cells. For 11 = 4, from Table I, {3 = 0.05513, and, from (2.12), f = 0.693, implying that 31 % of the interference power received at the central base station is due to users in adjacent cells. Note that, when omnidirectional antennas are used at both the base station and the portable unit, the value of the reuse factor . f.. is determined by the cell geometry, the power control scheme . and the path loss exponent. When omnidirectional antennas are used at both the base station and the portable unit, the total interference seen on the reverse link by the central base station is the sum of the interference from users within the central cell, (K - l)Pc ' and users in adjacent cells, 8K{3Pc-

I

n

EXPONENT, n AS

1

--1 + 813

1

Loss

(2.10)

is thus given by

(2.11 )

where No is the total interference, seen by a desired user in the central cell, at the central base station on the reverse link, Nat is the total interference seen "by the desired central cell user from all users in a single adjacent cell. M I is the number of cells which are immediately adjacent to the central cell, which is always eight for the geometry considered in this paper. This reuse factor is a measure of the impact of users in adjacent cells on the performance of the link between a user in the central cell and the central base station. When power control is perfonned as described in this section, such that the power received from each mobile unit in the base station controlling that unit is P, . then (2.11) may be expressed as

(K - l)P c + 8K{3Pc

TABLE I A FUNCTION OF THE PATH DETERMINED BY

Table I lists the values of (3 for several values of n. When omnidirectional antennas are used at both the base station and the portable unit, 13 is related to the reuse factor, [, which is defined in [1], for a single layer of adjacent cells, as

f =

(3 AS

Let us assume that for the mth user in the central cell, an antenna beam from the base station with pattern G(φ) may be formed with maximum gain in the direction of user m. It is assumed that perfect power control is applied such that all base stations control the reverse link received power to the same level, P_c. The average interference power contributed by a single user in the central cell is thus given by

(2.13)

Here a special case is considered. If G(φ) is piecewise constant over the region (2p - 1)(π/8) < φ < (2p + 1)(π/8) for p = 0, ..., 7, then the antenna pattern may be expressed as

$$G(\varphi) = \sum_{p=0}^{7} G_p\, U\!\left(\varphi - \frac{p\pi}{4}\right) \qquad (2.16)$$

where

$$U(\varphi) = \begin{cases} 1, & -\pi/8 < \varphi < \pi/8 \\ 0, & \text{otherwise} \end{cases} \qquad (2.17)$$

Substituting (2.16) into (2.15), we obtain

$$E[P_{r;i,0} \mid R < r < 3R] = \frac{P_c}{8} \sum_{p=0}^{7} G_p \int_{-\pi/8}^{\pi/8} \int_{R}^{3R} \left(1 + \left(\frac{2R}{r}\right)^{2} - \frac{4R}{r}\cos\varphi\right)^{n/2} \frac{r}{\pi R^{2}}\, dr\, d\varphi \qquad (2.18)$$

The directivity of the antenna pattern described by (2.16) is

$$D = \frac{8}{\sum_{p=0}^{7} G_p} \qquad (2.19)$$

Therefore, (2.18) may be rewritten, using (2.10) and (2.19), as

$$E[P_{r;i,0} \mid R < r < 3R] = \frac{P_c\,\beta}{D} \qquad (2.20)$$
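The factor β in (2.20) can be evaluated numerically. The sketch below assumes that β is the double integral over the wedge R < r < 3R, -π/8 < φ < π/8 that appears in (2.18), with user density r/(πR²). It uses a plain midpoint rule, and the function names and step counts are illustrative choices rather than anything taken from the paper. Under that assumption it reproduces the value β = 0.05513 quoted for n = 4.

```python
import math


def beta(n, steps_r=400, steps_phi=200, R=1.0):
    """Midpoint-rule estimate of the wedge integral assumed to define beta.

    Integrand: (1 + (2R/r)^2 - (4R/r) cos(phi))^(n/2) * r / (pi R^2)
    over R < r < 3R and -pi/8 < phi < pi/8.
    """
    dr = 2.0 * R / steps_r
    dphi = (math.pi / 4.0) / steps_phi
    total = 0.0
    for i in range(steps_r):
        r = R + (i + 0.5) * dr
        for j in range(steps_phi):
            phi = -math.pi / 8.0 + (j + 0.5) * dphi
            base = 1.0 + (2.0 * R / r) ** 2 - (4.0 * R / r) * math.cos(phi)
            # base is a squared distance ratio, so it is never negative in exact
            # arithmetic; max() only guards against floating-point round-off.
            total += max(base, 0.0) ** (n / 2.0) * r / (math.pi * R ** 2)
    return total * dr * dphi


def mean_adjacent_interference(P_c, n, gains):
    """Right-hand side of (2.20), P_c * beta / D, with D = 8 / sum(G_p) from (2.19)."""
    D = 8.0 / sum(gains)
    return P_c * beta(n) / D


beta_4 = beta(4)
beta_2 = beta(2)
print(f"beta(n=4) = {beta_4:.5f}")
print(f"beta(n=2) = {beta_2:.5f}, f = {1.0 / (1.0 + 8.0 * beta_2):.4f}")
```

For an omnidirectional pattern (all eight G_p equal to 1) the directivity is 1 and the mean interference reduces to P_c β, as expected.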

It can be shown that (2.20) remains valid when the beam pattern, G(φ), is rotated in the φ plane. Therefore, (2.20) is appropriate when G(φ) is piecewise constant over (2p - 1)(π/8) < φ - φ_d < (2p + 1)(π/8) for any angle φ_d between -π/8 and π/8. Using (2.20) with (1.7), the total interference power at the array port (in Fig. 1) of the center base station receiver is given by

$$I = \frac{(K-1)P_c + 8KP_c\beta}{D} \qquad (2.21)$$

Substituting (2.21) into (1.8), using the fact that the desired signal power at the array port is P_c, we obtain an average bit-error probability for the CDMA system employing a piecewise constant directive beam:

$$P_b = Q\left(\sqrt{\frac{3ND}{K(1+8\beta)-1}}\right) \qquad (2.22)$$

For K ≫ 1, P_b is approximated by

$$P_b \approx Q\left(\sqrt{\frac{3ND}{K(1+8\beta)}}\right) \qquad (2.23)$$

Equation (2.23) relates the probability of error to the number of users per cell, the directivity of the base station antenna, and the propagation path loss exponent through the value of β. It is assumed that perfect power control is applied as described in Section I, with all base stations controlling reverse link received power to the same level.

III. SIMULATION OF ADAPTIVE ANTENNAS AT THE BASE STATION FOR REVERSE CHANNEL PERFORMANCE

To explore the utility of (2.23) and to verify its accuracy, we considered five base station antenna patterns, which are illustrated in Fig. 5. These antenna patterns are assumed to be directed such that maximum gain is in the direction of the desired mobile users. The first base station antenna pattern is an omnidirectional pattern which models that used in traditional cellular systems. This configuration, shown in Fig. 5(a), was used as a model for standard omnidirectional systems without adaptive antennas. In order to make a fair comparison between the effects of various antenna types on bit error rate as a function of directivity, and given the fact that the simulations were performed in two dimensions only, antenna gains cited in this section are defined by (1.6), which is restated here:

$$D = \frac{2\pi}{\int_{0}^{2\pi} G(\varphi)\, d\varphi} \qquad (3.1)$$

Fig. 5. The five base station antenna patterns used in this study. These patterns are shown for the case when the desired user is at an angle φ = 60° from the X axis. Shown here are (a) the omnidirectional pattern, (b) the 120° sectorized pattern, (c) the flat-topped pattern, (d) the three element binomial phased array (referred to as the "adaptive" pattern in this paper), and (e) the binomial phased array pattern overlaid with a 120° sectorization pattern (referred to as the "adaptive-sectorized" pattern).

The second configuration, illustrated in Fig. 5(b), used 120° sectorization at the base station. In our model, the base station used three sectors, one covering the region from 30° to 150°, the second covering the region from 150° to 270°, and the third covering the region from -90° to 30°. The first sector is illustrated in Fig. 5(b), since this sector would be active when the desired user is at an angle of 60°. In this system, only interfering users within view of the same sector as the desired user were included in the CIR calculation. The effective gain of this antenna is 4.8 dB.

The third simulated base station configuration, shown in Fig. 5(c), used a "flat-topped" beam pattern similar to that shown in Fig. 2. The main beam was 30° wide with uniform gain in the main lobe. Side lobes were simulated by assuming a uniform side lobe gain which was 6 dB below the main beam gain. From (1.6), the directivity of this beam is 5.1 dB.

The fourth configuration, which used a simple three element linear array, is illustrated in Fig. 5(d). This is the beam pattern formed by a binomial phased array with elements spaced a half wavelength apart. The axis of the array is in the φ = 0° direction. Like all linear arrays, this array exhibits a pattern which is symmetric about the axis of the array (the X-axis, as shown in Fig. 2); therefore, a mirror image of the main beam is also present, as illustrated in Fig. 5(d). This array is not capable of adaptively nulling interfering signals; therefore, we expect the performance of this array to be poorer than that of a truly adaptive system. On the other hand, we did assume that the array was able to direct one of the two main beam components in the direction of the desired user. For each desired user, the phase was computed for each element of the array and the new beam pattern was formed at the center cell base station. While the three-dimensional gain of a binomial phased array is constant at 4.3 dB regardless of scan angle, the two-dimensional gain defined by (3.1), which is more appropriate for comparison given our assumption of users in the horizontal plane only, varies between 2.6 and 6.0 dB, depending on scan angle, with the higher gain corresponding to broadside scan angles.

The pattern for the fifth simulated base station configuration, a sectorized adaptive antenna, is shown in Fig. 5(e). Beginning with the sectorized system whose pattern is illustrated in Fig. 5(b), we added a three element linear phased array to each sector. The linear array for each sector is aligned such that the broadside direction is in the same direction as the center of the sector. This base station configuration actually uses a total of nine elements; however, only three elements are used to track any given user. For example, in Fig. 5(e), the desired user is at an angle of 60°.
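The two-dimensional directivity in (3.1) is easy to evaluate for patterns made of constant-gain angular segments. The sketch below is illustrative only: the helper names are not from the paper, and the -6 dB side lobe level is taken as a power ratio of 0.25, an assumption that reproduces the directivity of 3.2 quoted for the flat-topped pattern.

```python
import math


def directivity_2d(segments):
    """Two-dimensional directivity 2*pi / integral(G) for a piecewise-constant pattern.

    segments: list of (width_in_degrees, relative_power_gain) pairs covering
    the full 360 degrees, with the peak gain normalized to 1.
    """
    total_width = sum(width for width, _ in segments)
    if abs(total_width - 360.0) > 1e-9:
        raise ValueError("segments must cover exactly 360 degrees")
    integral = sum(width * gain for width, gain in segments)
    return 360.0 / integral


def to_db(ratio):
    return 10.0 * math.log10(ratio)


SIDE_LOBE = 0.25  # assumed power ratio for a side lobe level 6 dB below the main beam

patterns = {
    "omni": [(360.0, 1.0)],
    "sectorized (120 deg)": [(120.0, 1.0), (240.0, 0.0)],
    "flat-topped (30 deg)": [(30.0, 1.0), (330.0, SIDE_LOBE)],
    "flat-topped (60 deg)": [(60.0, 1.0), (300.0, SIDE_LOBE)],
}

results = {name: directivity_2d(segs) for name, segs in patterns.items()}
for name, d in results.items():
    print(f"{name:22s} D = {d:.2f} ({to_db(d):.1f} dB)")
```

Under these assumptions the sectorized and 30° flat-topped patterns come out at 4.8 dB and 5.1 dB, matching the gains cited above, and a 60° flat-topped beam comes out at 4.3 dB.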

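The CIR for the ith user in cell 0 was calculated from

$$\mathrm{CIR}_i = \frac{P_{i,0,0}}{\displaystyle\sum_{n=0,\; n \neq i}^{K-1} P_{n,0,0} + \sum_{m=1}^{8} \sum_{n=0}^{K-1} P_{n,m,0}} \qquad (3.2)$$

where the numerator is the desired signal power, the first sum in the denominator is the in-cell interference, and the double sum is the out-of-cell interference.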

The bit error rate for the ith user in cell 0 on the reverse link was determined by first calculating the CIR for the ith user from (3.2) and then using that value in (1.8), which is restated here:

$$P_{b,i} = Q\left(\sqrt{3N \times \mathrm{CIR}_i}\right) \qquad (3.3)$$

where N is the spreading factor. For each of the simulations performed in this study, a spreading factor of N = 511 was used. It was assumed that any portable unit in the nine-cell region (except for the desired user) contributed to the interference level of the desired user in the central cell. This calculation was carried out for every user in the central cell, and the resulting bit error rates were averaged to obtain an average bit error rate for the cell. For instance, if there were 2700 users in the nine cells and 300 users in the central cell, then the bit error rate was determined for the 300 users in the central cell, and 2699 interfering users contributed to each CIR computation. Each base station configuration was simulated for user densities ranging from 25 to 500 users per cell, in steps of 25.

Fig. 6 shows average bit error rates resulting from the simulation for the five previously described antenna patterns for several values of path loss exponent, n. The three element linear array, whose pattern is shown in Fig. 5(d), was able to achieve almost an order of magnitude improvement in BER despite the large back lobe. By eliminating the large back lobe, but still retaining significant side lobes, the flat-topped pattern, shown in Fig. 5(c), achieves a BER which is more than two orders of magnitude lower than the BER when omnidirectional antennas are used at the base station, for fewer than 200 users per cell.

The average bit error rate alone is not a sufficient metric of system performance. Rather, the distribution of BERs over the user population is a second-order measure which provides insight about the performance of a CDMA cellular system. Table II relates the average BER to the BER which is not exceeded by 50, 90, 95, and 99% of the users. Note that, for a given bit error rate, two to four times as many users may be supported using directional antennas as with omnidirectional antennas.

TABLE II
RELATIONSHIP BETWEEN THE AVERAGE BIT ERROR RATE AND Pe,x, WHERE Pe,x IS DEFINED SUCH THAT x% OF THE USERS IN THE CENTRAL CELL HAVE A BIT ERROR RATE WHICH IS LESS THAN Pe,x. THIS IS FOR THE CASE OF K = 200 AND A PATH LOSS EXPONENT OF n = 2. NOTE THAT THERE IS A MUCH WIDER RANGE OF BIT ERROR RATES FOR THE HIGHER GAIN ANTENNAS. FOR EXAMPLE, 2 USERS, OR 1% OF THE USER POPULATION, EXPERIENCED A BER WHICH WAS WORSE THAN 1.5e-3 WHEN THE SECTORIZED ANTENNA PATTERN WAS CONSIDERED

Base Station Pattern    Avg BER    Pe,50     Pe,90     Pe,95     Pe,99
Omni                    3.0e-2     3.1e-2    3.2e-2    3.2e-2    3.2e-2
Sectorized              6.1e-4     5.3e-4    1.0e-3    1.1e-3    1.5e-3
Adaptive                2.9e-3     2.6e-3    6.4e-3    7.2e-3    7.7e-3
Flat-topped             4.0e-4     3.9e-4    5.2e-4    5.6e-4    6.5e-4
Adaptive-Sectorized     1.6e-7     6.5e-8    3.0e-7    4.7e-7    2.4e-6

Fig. 6. BER using adaptive antennas at the base station for (a) n = 2, (b) n = 3, and (c) n = 4. These results were developed through simulation by averaging the BER of every user in the center cell. (Axes: BER versus number of users per cell, K.)

Fig. 7. Plots of analytical results using equation (2.23) with two-dimensional directivities of 1.0, 2.67, 3.0, and 3.2 for the omni, adaptive, sectorized, and flat-topped patterns, respectively. (Panels: (a) n = 2, (b) n = 3, (c) n = 4; axes: BER versus number of users per cell, K.)

Fig. 8. BER for the omni and flat-topped beam systems as a function of n.

It is useful to note that these increases in performance were made by applying relatively modest requirements to the base-station adaptive antenna. The flat-topped antenna was specified to have a 30° beamwidth and a side lobe level that was only 6 dB below the main lobe. It should be noted that these bit-error-rate improvements are primarily due to the directivity of the antenna array. The improvements are also dependent on the geographical distribution of interfering users, but in the case of uniformly distributed users, as noted in Section II, the improvement is approximately equivalent to increasing the carrier-to-interference ratio by the gain of the directional antenna. Even more drastic improvements were available when sectorization was combined with the adaptive antenna approach. Adding the three element array to the sectorized system, as shown in Fig. 5(e), provided a reduction in BER of three orders of magnitude for 200 users per cell.

Fig. 7 shows results calculated from (2.23) for four of the antenna patterns shown in Fig. 5. By comparing Figs. 6 and 7, it can be seen that for omnidirectional antennas, 120° sectorization, and the flat-topped pattern, the bit error rates calculated from (2.23) match the simulation results exactly, even for a relatively small number of users (K = 25, 50). For the case of the binomial phased array, the analytical results for P_b are optimistic by almost an order of magnitude when K < 200 for all values of n. For K > 350, the analytical results for P_b are only smaller than the simulation results by a factor of 0.3 or less. Unlike the omnidirectional, sectorized, and flat-topped patterns, the binomial phased array did not exhibit constant two-dimensional gain as a function of scan angle. Therefore, the use of the three-dimensional directive gain as an "average" gain in (2.23) is an approximation. By comparing Figs. 6 and 7, it may be concluded that a smaller value of average directive gain might result in a better match between the simulated and analytical results for the binomial phased array. Nevertheless, these figures demonstrate the accuracy of (2.23) when compared with extensive simulations.

As noted in [13], use of a path loss exponent of n = 4 can result in overly optimistic estimates of system capacity and performance. The different base station antenna configurations demonstrate varying sensitivity to the path loss exponent, n. As illustrated in Fig. 8, the flat-topped beam system is highly sensitive to changes in the path loss

exponent. This is reasonable to expect since, when the CIR is large, the bit error rate is more sensitive to relatively small changes in interference power.

IV. SIMULATION OF ADAPTIVE ANTENNAS AT THE PORTABLE UNIT TO IMPROVE REVERSE CHANNEL PERFORMANCE

In this section, we examine how the reverse channel is affected by using adaptive antennas at a portable transmitter. A flat-topped beam shape, as illustrated in Fig. 2, was used to model an adaptive antenna at the portable transmitter. Since space is extremely limited on the portable unit, the gain achievable by the portable unit antenna will be considerably less than that at the base station. For this study, it was assumed that the portable unit could achieve a beamwidth of 60° with a side lobe level that was 6 dB down from the main beam. This corresponds to an antenna with a directivity of 4.3 dB. The pattern is similar to that shown in Fig. 5(c), except that the beamwidth is wider in this case. It was assumed that each portable unit was capable of perfectly aligning the boresight of its adaptive antenna with the base station associated with that portable unit. In this manner, portable units could radiate maximum energy to the desired base station while reducing battery power in proportion to the directivity of the portable antenna.

Portable units with adaptive antennas were simulated for each of the five base station patterns described in Section III. As in Section III, average values of P_b were found by averaging the bit error rates of each user in the central cell, subjected to interference from the central cell and all immediately adjacent cells. The resulting bit error rates for these systems are shown in Fig. 9. Note that, comparing Fig. 6 and Fig. 9, the bit error rates for the reverse channel are improved when directive antennas are used at the portable unit. For omnidirectional base stations, the BER is only decreased by a small amount (20% or less) for K > 200 when steerable directive antennas are used at the portable unit. However, for highly directive base station antenna patterns such as the adaptive-sectorized pattern, the BER was decreased by an order of magnitude for K > 300.

Fig. 9. BER for five different base station configurations using adaptive antennas at the portable unit for (a) n = 2, (b) n = 3, and (c) n = 4. These results were developed through simulation by averaging the BER of every user in the central cell. (Axes: BER versus number of users per cell, K.)

In Fig. 10, we have defined the BER factor as the ratio of the BER with adaptive antennas at all portable units to the BER without adaptive antennas at the portable units. A small BER factor indicates that adding adaptive antennas improved the BER significantly. For example, a BER factor of 0.5 indicates that using an adaptive antenna at the mobile unit resulted in a reduction in BER of 50% compared with the case of omnidirectional antennas at the mobile unit. As shown in Fig. 10, the adaptive-sectorized base station pattern improved greatly by adding adaptive antennas at the portable unit. The resulting BER for this base station configuration when using adaptive antennas at the portable unit was decreased by an order of magnitude compared with the BER when omnidirectional antennas were used at the portable unit. In general, the more directive base station configurations benefitted more from adding adaptive antennas at the portable unit. Using a 60° beamwidth flat-topped pattern with a -6 dB side lobe level at the portable unit, the reverse channel BER for omnidirectional base stations was only improved slightly over the case of omnidirectional antennas at the portable unit. For directive antennas at the base station, the improvements were more dramatic, as illustrated in Fig. 10.

Fig. 10. BER factor, defined as the ratio of the BER with adaptive antennas at the portable unit to the BER without adaptive antennas at the portable unit, for five different base station configurations. This comparison is made for n = 4.

The relatively small improvements obtained by using adaptive antennas at the portable unit can be explained by the fact that, when omnidirectional antennas are used at the mobile unit, no more than 1 - 0.455, or 0.545, of the total interference power is due to users in adjacent cells (see Table III, where f = 1/(1 + 8β)). When using adaptive antennas at the mobile unit, all users in the central cell will appear no different to the central base station than if they had used omnidirectional antennas. Thus, adaptive

antennas at the portable unit will only reduce out-of-cell interference levels. Therefore, the maximum improvement in CIR, on the reverse link, that can be achieved by using adaptive antennas rather than omnidirectional antennas at the portable unit is only 3.5 dB.

Table III shows several values of the reuse factor, f, defined in (2.12) as the ratio of in-cell interference to total interference, for several base station patterns when omnidirectional antennas are used at the portable unit. Similarly, Table IV shows values of f when steerable, directional antennas, with directivities of 4.3 dB, are used at the portable units. Comparing Tables III and IV, it can be concluded that the use of adaptive antennas at the base station does nothing to improve the reuse factor, f; however, the use of adaptive antennas at the portable unit does allow f to be improved. When omnidirectional antennas are used at the portable unit, f is entirely determined by the cell geometry, the power control scheme, and the path loss exponent, n, which is a function of propagation and not easily controlled by system designers. Using adaptive antennas at the portable unit, it is possible to tailor f to a desired value which is greater than the reuse factor obtained using omnidirectional antennas at the portable unit. Ideally, driving f to unity would allow system design to be much less sensitive to the intercell propagation environment, when perfect power control is assumed. This is an important result for CDMA cellular systems because it indicates that use of adaptive antennas at the portable unit could help to allow greater capacity through more efficient reuse, and for more frequent reuse of signature sequences throughout a large coverage area.

TABLE III
RATIO OF IN-CELL INTERFERENCE TO TOTAL INTERFERENCE, f, AS A FUNCTION OF PATH LOSS EXPONENT, FOR FIVE BASE STATION ANTENNA PATTERNS WITH OMNIDIRECTIONAL ANTENNAS AT THE PORTABLE UNIT

Base station antenna pattern    n = 2     n = 3     n = 4
Omni                            0.4535    0.6012    0.6927
Sectorized                      0.4532    0.6008    0.6924
Adaptive                        0.4524    0.6002    0.6920
Flat-topped                     0.4534    0.6011    0.6926
Adaptive-sectorized             0.4531    0.6007    0.6922
1/(1 + 8β) (Eq. 2.13)           0.4552    0.6028    0.6939
(values of β from Table 2.1)

TABLE IV
RATIO OF IN-CELL INTERFERENCE TO TOTAL INTERFERENCE, f, AS A FUNCTION OF PATH LOSS EXPONENT, FOR FIVE BASE STATION ANTENNA PATTERNS WITH ADAPTIVE ANTENNAS AT THE PORTABLE UNIT. THIS DATA IS FROM THE SIMULATION DESCRIBED IN SECTION IV

Base station antenna pattern    n = 2     n = 3     n = 4
Omni                            0.6752    0.8155    0.8826
Sectorized                      0.6749    0.8153    0.8824
Adaptive                        0.6753    0.8152    0.8822
Flat-topped                     0.6751    0.8154    0.8826
Adaptive-sectorized             0.6747    0.8150    0.8823

V. CONCLUSIONS

It was shown in this study that adaptive antennas with relatively modest beamwidth requirements and no interference nulling capability, both at the base station and at the portable unit, can provide large improvements in BER as compared to omnidirectional systems. Analytical expressions which relate the average BER of a CDMA user to the antenna directivity and propagation environment were derived and used to determine capacity improvements offered by a number of antenna patterns. It was demonstrated in Section III that the linear phased array provided an order of magnitude of improvement over the omnidirectional base station. The low-gain (5.1 dB) flat-topped pattern provided almost two orders of magnitude of improvement over the omnidirectional system. In addition, it was shown that up to three orders of magnitude of improvement can be achieved by adding a simple three element linear array to a three-sector base station. In terms of capacity, the results of Section III indicate that using adaptive antennas at the base station can allow the number of users to increase by a factor of 2 to 4, while maintaining an average BER of 10^-3 on the reverse link.

The bit error rate on the reverse channel is further improved by adding adaptive antennas at the portable unit. Using a 4.3 dB gain antenna at the portable unit, the bit error rate for the directive base station configurations (but not the omnidirectional base station) was reduced by at least half relative to the bit error rate achieved without directive antennas at the portable unit. For the highly directive adaptive-sectorized base station, the improvement was over an order of magnitude for user densities less than 425 users/cell when each user employed an adaptive antenna. Since the directivity of portable unit adaptive antennas is limited by the size of a handheld device, improvements achieved on the reverse channel at the portable unit are not as dramatic as gains achieved by adaptive antennas at the base station. In addition, cost issues may limit the application of portable unit adaptive antennas. However, the reduction in reverse channel BER may be critical in extremely high traffic environments. In addition, the portable unit is required to track only the current base station, while adaptive antennas at the base station must track every user in the cell.

Most importantly, Tables III and IV show the increase in reuse efficiency which portable adaptive antennas provide. By using modest gains at the portable unit, such antennas ameliorate the loss in capacity due to intercell propagation through interference control. In short, adaptive antennas at the base station can have a major effect on bit-error-rate performance, but cannot impact the reuse factor, f. Conversely, it has been shown in this paper that adaptive antennas at the portable unit can provide no more than a 3.5 dB improvement in reverse channel CIR; however, they allow the reuse factor, f, to be altered. It should be noted, however, that the use of directional antennas at the portable unit can only result in an increase in reuse factor of approximately 1/3.

It was assumed throughout this study that the adaptive algorithms and hardware could be designed to meet the specified requirements on beamwidth, side lobe level, and tracking ability. It should be noted that, unlike the arrays discussed in this paper, a properly designed adaptive array can null out interference. Conversely, tracking a large number of users with an adaptive array is nontrivial, and it was assumed that each of the base station arrays described here was able to track all of the portable units without error. The multipath channel was not considered in detail in this study; however, it will be significant in developing algorithms for successful adaptive antenna steering. Rather than tracking users, the adaptive array in a multipath environment must track the angle of arrival of multipath components in order to distinguish the maximum signal. This problem is currently under investigation. Furthermore, efforts are currently underway to develop bit error rate expressions which are accurate for small numbers of simultaneous CDMA users with non-identical power levels.
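The capacity claim can be illustrated by inverting the analytical approximation (2.23) for the number of users K at a target bit error rate. The sketch below is not the authors' simulation. The function names are illustrative, N = 511 and β = 0.05513 (n = 4) are the values quoted in the paper, and the n = 2 value of β is inferred here from the 0.4552 entry in the last row of Table III, which is an assumption of this example.

```python
import math
from statistics import NormalDist


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def ber_analytic(K, D, beta, N=511):
    """Approximation (2.23): P_b ~ Q(sqrt(3 N D / (K (1 + 8 beta))))."""
    return q_function(math.sqrt(3.0 * N * D / (K * (1.0 + 8.0 * beta))))


def users_at_target_ber(target, D, beta, N=511):
    """Solve (2.23) for K at a target BER (real-valued, not rounded)."""
    x = NormalDist().inv_cdf(1.0 - target)  # the x for which Q(x) = target
    return 3.0 * N * D / (x * x * (1.0 + 8.0 * beta))


beta_n4 = 0.05513
k_omni = users_at_target_ber(1e-3, D=1.0, beta=beta_n4)
k_flat = users_at_target_ber(1e-3, D=3.2, beta=beta_n4)
print(f"omni: K = {k_omni:.1f}, flat-topped: K = {k_flat:.1f}, ratio = {k_flat / k_omni:.2f}")

# Cross-check against the omni row of Table II (K = 200, n = 2).
beta_n2 = (1.0 / 0.4552 - 1.0) / 8.0  # inferred from Table III, an assumption
ber_omni_n2 = ber_analytic(200, D=1.0, beta=beta_n2)
print(f"omni, n = 2, K = 200: P_b = {ber_omni_n2:.3f}")
```

Because K in (2.23) scales linearly with D at a fixed BER, the capacity ratio equals the directivity ratio, which is consistent with the factor of 2 to 4 cited above for directivities between 2.67 and 3.2.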

REFERENCES

[1] T. S. Rappaport and L. B. Milstein, "Effects of radio propagation path loss on DS-CDMA cellular frequency reuse efficiency for the reverse channel," IEEE Trans. Veh. Technol., vol. 41, no. 3, Aug. 1992.
[2] G. R. Cooper and R. W. Nettleton, "A spread-spectrum technique for high-capacity mobile communications," IEEE Trans. Veh. Technol., vol. VT-27, Nov. 1978.
[3] A. Salmasi, "An overview of advanced wireless telecommunication systems employing code division multiple access," Conf. Mobile, Portable & Personal Commun., Kings College, England, Sept. 1990.
[4] W. C. Y. Lee, Mobile Cellular Telecommunications Systems. New York: McGraw Hill, 1989.
[5] K. S. Gilhousen et al., "On the capacity of a cellular CDMA system," IEEE Trans. Veh. Technol., vol. 40, May 1991.
[6] M. B. Pursley, "Performance evaluation for phase-coded spread spectrum multiple-access communications with random signature sequences," IEEE Trans. Commun., vol. COM-25, Aug. 1977.
[7] W. A. Gardner, S. V. Schell, and P. A. Murphy, "Multiplication of cellular radio capacity by blind adaptive spatial filtering," IEEE Conf. Sel. Topics Wireless Commun. Mobile, Vancouver, B.C., Canada, June 1992.
[8] S. C. Swales, M. A. Beach, D. J. Edwards, and J. P. McGeehan, "The performance enhancement of multibeam adaptive base-station antennas for cellular land mobile radio systems," IEEE Trans. Veh. Technol., vol. 39, Feb. 1990.
[9] R. T. Compton, Adaptive Antennas. Englewood Cliffs, NJ: Prentice Hall, 1988.
[10] B. Agee, "Solving the near-far problem: Exploitation of spatial and spectral diversity in wireless personal communication networks," in Proceedings Third Virginia Tech Symp. Wireless Personal Commun., June 1993.
[11] W. L. Stutzman and G. A. Thiele, Antenna Theory and Design. New York: Wiley, 1981.
[12] R. K. Morrow and J. S. Lehnert, "Bit-to-bit error dependence in slotted DS/SSMA packet systems with random signature sequences," IEEE Trans. Commun., vol. 37, Oct. 1989.
[13] L. B. Milstein, T. S. Rappaport, and R. Barghouti, "Performance evaluation for cellular CDMA," IEEE JSAC, vol. 10, May 1992.
[14] B. Widrow, P. E. Mantey, L. J. Griffiths, and B. B. Goode, "Adaptive antenna systems," Proc. IEEE, vol. 55, no. 12, Dec. 1967.
[15] R. Kohno, H. Imai, M. Hatori, and S. Pasupathy, "Combination of an adaptive array antenna and a canceller of interference for direct-sequence spread-spectrum multiple-access system," IEEE JSAC, vol. 8, May 1990.
[16] S. Anderson, M. Millnert, M. Viberg, and B. Wahlberg, "An adaptive array for mobile communication systems," IEEE Trans. Veh. Technol., vol. 40, Feb. 1991.


Adaptive Transmitting Antenna Arrays with Feedback Derek Gerlach and Arogyaswami Paulraj

Abstract- We address the problem of transmitting multiple cochannel signals from an antenna array to several receivers so that each receiver gets its intended signal with minimum crosstalk from the remaining signals. In addition to the usual "information" mode, we propose a "probing" mode during which probing signals received at the mobiles are fed back to the transmitter. These probing signals are used to identify an unknown propagation environment, enabling the transmitter to form the necessary transmission beampatterns.

I. INTRODUCTION

Adaptive receiving antennas have been widely applied to military and communication problems to eliminate unwanted interference or separate multiple signals. The aim of receive beamforming is to form a spatial filter that passes the desired signals and suppresses unwanted components. A receiving beamformer can observe its own output and modify its spatial filtering to improve the signal suppression/enhancement [1]. By contrast, the aim of transmit beamforming is to launch a signal into a propagation environment so that each receiver gets its desired signal without crosstalk from the signals intended for other receivers. This task is complicated by the presence of reflecting bodies of which the transmitter has no knowledge.

The proposed adaptive transmit beamforming approach uses feedback of the signals received at the mobiles. This feedback makes possible the transmission of multiple signals to multiple receivers with low crosstalk, even in the presence of an unknown multipath environment. While receiving adaptive antenna arrays have been widely studied [2], [3], the transmit problem is equally important and has received little attention so far, except in [4]. In this letter, we formulate the adaptive transmit array problem and present simulations of its signal separation performance. We consider antenna arrays at the transmitter only, and the receiver has a single omnidirectional antenna.

Manuscript received April 18, 1994; approved July 15, 1994. This work was supported by the Army Research Office under Grant DAAH04-93-G0029 and by ARGOSystems, Inc. under Subcontract 59613. The associate editor coordinating the review of this letter and approving it for publication was Prof. Moura. The authors are with the Information Systems Lab, Stanford University, Stanford, CA 94305 USA. IEEE Log Number 9405627.

Fig. 1. Multiple information bearing signals transmitted from an array to multiple mobile receivers.

II. PROBLEM STATEMENT AND ASSUMPTIONS

The goal of adaptive transmit antenna arrays is to send multiple cochannel signals from an antenna array through a propagation environment to several receivers so that each receiver only gets its intended signal with minimum crosstalk. Let

$$s_1(t), \ldots, s_d(t) \qquad (1)$$

be the d information bearing signals intended for d remote receivers. Let the antenna array consist of m transmitting elements, and let the complex vector channel from the array to the kth receiver be given by a_k:

$$a_k = [a_{1k}, a_{2k}, \ldots, a_{mk}]^T \qquad (2)$$

where a_ik is the complex channel response from the ith element to the kth receiver. The channel vector a_k represents the total channel, including the transmitter electronics, antenna array, and reflections within the medium. In order to ensure that the vector channel is adequately described by a single vector, we need the following assumption:

$$\delta_{mp} \ll BW^{-1} \qquad (3)$$

where δ_mp is the maximum differential delay due to multipath in the propagation medium, and BW is the information signal bandwidth (same for all information signals). This narrowband condition is present in today's advanced mobile phone system (AMPS), which has a 25-kHz bandwidth. Digital systems which meet (3) do not suffer from ISI.

Reprinted from IEEE Signal Processing Letters, Vol. 1, No. 10, pp 150-152, October 1994.


We can now define a channel matrix

$$A = [\mathbf{a}_1, \ldots, \mathbf{a}_d] \qquad (4)$$

where A is an m x d complex matrix in which the (i, k) entry gives the complex channel gain from the ith transmitting element to the kth receiver. The channel matrix provides a complete description of all reflections and scattering in the environment. In order to transmit the d signals to the receivers, let $\mathbf{w}_j$ be the beamforming weight vector for the jth information signal. Let

$$W = [\mathbf{w}_1, \ldots, \mathbf{w}_d]. \qquad (5)$$

If we consider the array output due to only the jth signal $s_j(t)$ with its corresponding weight vector $\mathbf{w}_j$, then the signal received at the kth receiver will be

$$\mathbf{w}_j^{*}\mathbf{a}_k\, s_j(t) = c_{jk}\, s_j(t) \qquad (6)$$

where $c_{jk}$ represents the information signal amplitude received at the kth receiver, which was intended for the jth receiver, and * denotes conjugate transpose, so that

$$c_{jk} = \mathbf{w}_j^{*}\mathbf{a}_k. \qquad (7)$$

Equation (7) is a vector version of the familiar statement

(receiver amplitude) = (transmitter amplitude) x (channel gain).

If we let $[C]_{jk} = c_{jk}$, we have

$$W^{*}A = C. \qquad (8)$$

Diagonal elements of C are desired signal levels, and off-diagonal elements of C are crosstalk amplitudes. To ensure that each information signal is received only by its intended receiver with unit amplitude, we would like

$$C = I. \qquad (9)$$

Since each element adds a degree of freedom, to achieve (9), we must have $m \geq d$.

III. INCORPORATION OF FEEDBACK

The channel matrix A summarizes the channel including both the antenna array and the propagation environment, and matrix A is not known to the transmitter. We therefore cannot directly find W such that $W^{*}A = I$. We propose to use feedback from the receivers to estimate A and hence W. Once A is estimated, a W that achieves (9) is

$$W = A^{+*} \qquad (10)$$

where + denotes pseudoinverse. To estimate A, we introduce the concept of probing and information modes. In the probing mode, the transmitter transmits probing signals, whose responses at the receivers are measured and fed back to the transmitter, from which A is estimated. During probing, transmission of the usual (information) signals is temporarily halted. Instead, the array is excited in turn by several probing signals, and each receiver measures the relative complex amplitude response of each probing signal. These complex responses measured at the receivers form the entries of the probing response matrix B. Next, the complex amplitude data is fed back to the transmitter. Knowing B, the base can estimate the matrix A. Let the l probing signals be

$$p_1(t), \ldots, p_l(t). \qquad (11)$$

Unlike the usual information signals, each probing signal is sent on an orthogonal channel (time, frequency, code) so that the receivers may measure the response of each probing signal. As before, each probing signal is transmitted according to its own probing vector

$$V = [\mathbf{v}_1, \ldots, \mathbf{v}_l]. \qquad (12)$$

The response at the kth receiver due to the jth probing signal is given by

$$\mathbf{v}_j^{*}\mathbf{a}_k\, p_j(t) = b_{jk}\, p_j(t) \qquad (13)$$

where $b_{jk}$ is the amplitude received at the kth receiver due to the jth probing signal. If we let $[B]_{jk} = b_{jk}$, we have

$$B = V^{*}A. \qquad (14)$$

The probing signals are agreed on by the transmitter and receiver. One choice for $p_j(t)$ is

$$p_j(t) = \begin{cases} \exp(j\omega_c t) & \text{if } t \in [(j-1)T,\ jT] \\ 0 & \text{otherwise} \end{cases} \qquad (15)$$

where $\omega_c$ is the carrier frequency, and T is the probing signal duration. In any case, since the l probing signals are orthogonal, using a bank of l matched filters matched to the probing signals, each receiver can measure a column of B corrupted by measurement noise:

$$\hat{B} = V^{*}A + E \qquad (16)$$

where the entries of E are assumed to be zero mean i.i.d. Gaussian random variables. Next, each mobile digitizes the received probing signal amplitudes and feeds these back to the base on its own reverse (digital modulation and assumed error free) channel. These reverse channels are assumed to be available, and for the purposes of this discussion, they can be on different frequency channels or even a wireline channel. The spatial reuse of the reverse channel is an independent problem and has been well studied [3]. The cell site assembles $\hat{B}$ and then computes $\hat{A}$. Knowing V, which are the inputs to the channel, and $\hat{B}$, which are the noise-corrupted probing responses, the transmitter identifies A using a least squares estimate

$$\hat{A} = V^{+*}\hat{B} \qquad (17)$$

where + denotes the pseudoinverse operation. Once $\hat{A}$ is in hand, the transmitter can determine W using (10). Each additional probing vector provides another equation involving A. Since A has m rows, the condition to uniquely determine A is $l \geq m$. Additional probing vectors will improve the accuracy of the least-squares estimate (17).
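The estimation chain of (14), (17) and (10) can be sketched numerically. The example below is a minimal illustration, not the authors' implementation: it assumes noiseless feedback, a hypothetical 3-element array serving 2 receivers with 4 probing vectors, and small hand-written matrix helpers (`ctranspose`, `matmul`, `inverse`, `pinv_tall`) so that it runs with the Python standard library only.

```python
def ctranspose(M):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[complex(M[i][j]).conjugate() for i in range(len(M))]
            for j in range(len(M[0]))]


def matmul(X, Y):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def inverse(M):
    """Gauss-Jordan inverse of a small square complex matrix (partial pivoting)."""
    n = len(M)
    aug = [[complex(v) for v in row] + [complex(i == j) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def pinv_tall(M):
    """Pseudoinverse (M* M)^-1 M* of a tall matrix with independent columns."""
    Mh = ctranspose(M)
    return matmul(inverse(matmul(Mh, M)), Mh)


def estimate_channel(V, B):
    """Least-squares estimate A_hat = V^{+*} B from B = V* A, as in (17)."""
    return matmul(pinv_tall(ctranspose(V)), B)


def transmit_weights(A):
    """Weights W = A^{+*}, as in (10), so that W* A = I when m >= d."""
    return ctranspose(pinv_tall(A))


# Hypothetical channel: m = 3 elements, d = 2 receivers (values are arbitrary).
A = [[1 + 0.5j, 0.2 - 0.1j], [0.3j, 0.8 + 0.2j], [-0.4 + 0.1j, 0.5j]]
# l = 4 probing vectors as columns: each element in turn, then one mixed probe.
V = [[1, 0, 0, 1], [0, 1, 0, 1j], [0, 0, 1, -1]]

B = matmul(ctranspose(V), A)        # noiseless probing responses, (14)
A_hat = estimate_channel(V, B)      # channel estimate, (17)
W = transmit_weights(A_hat)         # information weights, (10)
C = matmul(ctranspose(W), A)        # resulting crosstalk matrix, (8)
print(C)
```

In this noiseless case the estimate reproduces A and the crosstalk matrix C is the identity up to rounding; adding a noise term to B, as in (16), would make the off-diagonal entries of C nonzero.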

Channel reuse in an access method therefore consists of the following five steps:

1) Transmit information signals to the receivers using information weight vectors W.
2) Monitor the level of crosstalk at the receivers periodically, and halt the information transmission when the crosstalk exceeds a threshold of acceptability.
3) Enter the probing mode:
   a) Choose probing vectors V.
   b) Transmit the probing vectors V, and measure the response matrix B at the receivers.
   c) Feed back the probing response matrix B to the transmitter.
   d) Estimate A via $A = V^{+*}B$.
   e) Form information signal weights according to $W = A^{+*}$.
4) Resume information transmission with the new choice of weight vectors.
5) Go to step 2.

The frequency with which it will be necessary to enter the probing mode will be determined by the receivers' speeds and the propagation medium complexity. If two mobiles sharing a channel approach each other, then the channel matrix will become singular. Since this method is not designed to accommodate singular channels, the transmitter should hand off one receiver to a new channel. To accommodate the probing signals in a TDMA system, a portion of each slot should be devoted to probing. If the mobile motion, and hence the required probing rate, is slower, probing could occur every nth slot.

IV. SIMULATIONS

Simulations were carried out to evaluate the performance of an adaptive array sharing a single channel among two receivers simultaneously. Using a six-element circular array with a 15.0° beamwidth, a beampattern for each signal was created according to (17) and (10). The propagation environment contained 20 local scatterers for each mobile, placed randomly in a 250 wavelength vicinity of the mobile. Energy arrived at the receivers only via local scatterers, and no line of sight was present. The receivers moved around the transmitter in a circular path of radius 5000 wavelengths (carrier = 900 MHz) at 2.5 mi/hr, maintaining a fixed angular separation. To track A, the transmitter periodically alternated between probing and information mode. Because the channel varied as the receivers moved, the interference was least immediately after each probe and worst just before the next probe. Two probing rates of 8 and 16 probes/wavelength were used, and each real entry of A was specified with 4.2 bits, for a net feedback rate of 1379 and 2753 bps, respectively. Fig. 2 shows the probability that the channel's SINR was below 7.3 dB for various mobile spacings. The 7.3 dB threshold corresponds to a BER of $10^{-3}$ for BPSK. As the mobile spacing increased, the channel quality increased because the two channel vectors were less parallel.

Fig. 2. Outage probability versus mobile spacing (horizontal axis: mobile spacing in beamwidths, one beamwidth = 15 degrees). Top curve: 8 probes/wavelength. Bottom curve: 16 probes/wavelength.

V. CONCLUSION

We have proposed an adaptive transmit antenna array that uses feedback to achieve low signal crosstalk at the intended receivers. Simulations show that at low mobile speeds (2.5 mi/hr), adequate signal separation requires feedback data rates in the thousands of bits per second, making the approach most applicable for static or slow-moving receivers. Methods of reducing the feedback rates are needed.
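To see how the feedback rate scales with mobile speed and probing density, a back-of-the-envelope estimate can be sketched as follows. The payload of 12 real numbers per probe (six complex channel gains per receiver) is an assumption made for this example, so the result should only be read as being of the same order as the 1379 and 2753 bps quoted in the simulations, not as a reproduction of them.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0
METERS_PER_MILE = 1609.344


def probes_per_second(speed_mph, carrier_hz, probes_per_wavelength):
    """Probing rate needed when probes are spaced by distance travelled."""
    speed_m_s = speed_mph * METERS_PER_MILE / 3600.0
    wavelength_m = SPEED_OF_LIGHT_M_S / carrier_hz
    return probes_per_wavelength * speed_m_s / wavelength_m


def feedback_rate_bps(speed_mph, carrier_hz, probes_per_wavelength,
                      reals_per_probe, bits_per_real):
    """Feedback bit rate: probes per second times bits fed back per probe."""
    rate = probes_per_second(speed_mph, carrier_hz, probes_per_wavelength)
    return rate * reals_per_probe * bits_per_real


# 2.5 mi/hr at 900 MHz, 4.2 bits per real entry as in the simulations;
# reals_per_probe = 12 is a hypothetical payload chosen for illustration.
rate_8 = feedback_rate_bps(2.5, 900e6, 8, reals_per_probe=12, bits_per_real=4.2)
rate_16 = feedback_rate_bps(2.5, 900e6, 16, reals_per_probe=12, bits_per_real=4.2)
print(round(rate_8), round(rate_16))
```

The rate grows linearly with both speed and probes per wavelength, which is why the approach is most attractive for static or slow-moving receivers.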


REFERENCES

[1] B. Widrow and S. Stearns, Adaptive Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1985.
[2] A. Naguib and A. Paulraj, "Performance of CDMA cellular networks with base-station antenna arrays," in Proc. Int. Zurich Seminar Digital Commun. (Zurich, Switzerland), Mar. 1994.
[3] B. Sublett, R. Gooch, and S. Goldberg, "Separation and bearing estimation of co-channel signals," in Proc. MILCOM '89, May 1989, pp. 629-634.
[4] D. Gerlach and A. Paulraj, "Spectrum reuse using transmitting antenna arrays with feedback," in Proc. Int. Conf. Acoust., Speech, Signal Processing (Adelaide, Australia), Apr. 1994, pp. 97-100.

Adaptive Antennas for Third Generation DS-CDMA Cellular Systems

George V. Tsoulos, Mark A. Beach, Simon C. Swales
Centre for Communications Research, University of Bristol, Bristol, UK
Fax: +44 117 9255265, Tel: +44 117 9287740

Abstract: This paper considers the performance of a DS-CDMA system employing adaptive antenna technology at the base station site for both an Umbrella and a Micro-cell in a hierarchical cell structure. The possible advantages and problems from such a deployment are discussed. By exploiting the capabilities of Ray Tracing to provide the complex channel impulse response, a new adaptive antenna simulation model is presented along with some initial results for the performance of well known adaptive algorithms in a multiple interference scenario. These provide insight into how the adaptive antenna operates when used in conjunction with DS-CDMA and illustrate the potential benefits. Finally, propagation measurements are provided in order to validate some of the claimed capabilities.

1. INTRODUCTION

Figure 1: Hierarchical cell structure concept.

The need for mobile radio systems with increased spectrum efficiency is paramount in the drive towards third generation systems [1]. Currently favoured solutions in today's systems include the deployment of smaller cells as well as fixed sector, or multi-beam, antennas at the base station (BS) site. In terms of modulation schemes and access techniques, spread spectrum modulation with Code Division Multiple Access (CDMA), and especially Direct Sequence (DS) CDMA, looks to be amongst the favoured approaches. The recognition that the ambitious requirements of UMTS and FPLMTS cannot be fulfilled with the known cellular architectures (macro, micro, pico cells) led to the idea of a hierarchical cell structure [2 - 3]. The key issue for this type of cell architecture is to apply multiple cell layers to each service area, with the size of each layered cell tailored to match the required traffic demand and environmental constraints (Fig. 1). In essence, microcells will provide the basic radio coverage, but they will be overlaid with Umbrella cells to maintain the ubiquitous and continuous coverage required. Especially for the DS-CDMA system, this mixed cell technique addresses situations where a performance degradation may occur, e.g. fast moving users requiring handover, or black spots in coverage.

Advanced antenna techniques, such as adaptive antennas, are an area that has recently gathered momentum [4 - 7] as another possible way to increase the efficiency of a given system. Adaptive antennas, based on spatial filtering at the base station, separate the spectrally and temporally overlapping signals from multiple mobile units. This can be exploited in many ways, such as:

- Support a mixed architecture.
- Combat the near-far effect.
- Support higher data rates.
- Combine all the available received energy (multipath).

In the following section, a brief discussion is presented on the application of adaptive antennas in an Umbrella cell. The conclusions are taken from an earlier publication [5], but include some additional propagation measurements to support previous claims. The remaining sections focus on the use of adaptive antennas in a microcellular environment operating with DS-CDMA. This work includes the development of a detailed Ray Tracing based simulation model and the presentation of some initial results.

Reprinted from Proceedings of 45th Vehicular Technology Conference, Vol. 1, pp. 45-49, July 1995.


2. AN ADAPTIVE BASE STATION ANTENNA FOR THE UMBRELLA CELL OF A MIXED CELL STRUCTURE

The potential advantages offered by employing an adaptive antenna at an Umbrella BS site with a DS-CDMA system can be summarised as follows:

• Mitigation of the near-far effect.
• Capacity enhancement.
• More efficient handover.
• "In-fill" coverage for the dead-spots.
• Ability to support high data rates.

These were discussed in greater detail in an earlier publication [5], although in order to support the last claim, some propagation measurements have been carried out. The measurements were performed with a Fast Fourier Transform (FFT) Dual Channel Sounder at 1.823 GHz [8]. The RMS delay spread was calculated using a 10 dB power window on each measured impulse response profile. The results are shown in figure 2, while figure 3 shows the map of the area where the measurements were performed. For the umbrella cell base station, which was on the roof of a building approximately 50 m high, two antennas were used: one omnidirectional end-fed dipole (identical to the mobile antenna) and one directional shrouded yagi with 15 dBd gain. It can be seen from figure 2 that the RMS delay spread is much less for the case of the directional antenna, with a reduction which can be up to 1/5. The reduced delay spread results in less intersymbol interference and, therefore, provides the possibility of supporting higher bit rate services.

Figure 2: Wideband measurements (horizontal axis: distance in meters).

Figure 3: Map of the area under investigation.

3. AN ADAPTIVE BASE STATION ANTENNA FOR SMALL CELLS

The angle of arrival (AOA) of the radio signal, along with its multipath components, directly affects the degree of spatial selectivity that can be applied by the antenna system, i.e. whether to form a single narrow beam or adopt an optimum combining approach. In a large cell application, the use of an AOA based approach for a beamformer would potentially be more desirable, since the AOA of the signals has a relatively narrow angular spread [9]. In a microcellular environment, the angular spread of the signal from a single user is much greater (figures 6b, 6c), due to the lower height of the BS antenna and the close proximity of the scattering objects. Also, the AOA of the signals will change rapidly, with the dominant direction not always towards the desired user, as in the large cells case. Therefore, in the microcellular case, the optimum combining approach seems to be more flexible, providing increased capacity, as will be shown in the following sections.

4. SIMULATION MODEL

The simulation model can be separated into two basic blocks:

a) The block which generates the impulse response of the channel under investigation. This is done with the help of a Ray Tracing simulation tool developed by the University of Bristol [10]. The input parameters include the number of reflections and diffractions, the transmitted power, antenna radiation patterns, etc. The resultant output file includes the time delay, the angle of arrival and the power of each received ray.
b) The block which simulates the adaptive antenna array, illustrated in figure 4.

Figure 4: Adaptive Antenna Array (N antenna elements with inputs $x_1(k), \ldots, x_N(k)$, an adaptive control processor driven by the reference $r_0(k)$, and the array output $y(k)$).

$x_n(k)$ is the sample of the total received signal at the nth element at instant t = kT, where T is the sampling interval, as well as being the chip duration of the PN sequence. $x_n(k)$ consists of the desired and interfering DS-CDMA signals and random noise, and it can be expressed as:

$$x_n(k) = \sum_{m=1}^{M}\sum_{r=1}^{R} h_{mr}\, e^{jkd(n-1)\sin(\theta_r)}\, r_m(k - t_r) + N(k) \qquad (1)$$

where $h_{mr}$ and $r_m(k)$ are the elements of the vectors of the impulse response and the DS-CDMA signal from the mth user respectively:

$$\mathbf{h}_m = [h_{m1}, h_{m2}, \ldots, h_{mr}, \ldots, h_{mR}]^T,$$
$$\mathbf{r}_m = [r_m(k), r_m(k - t_1), \ldots, r_m(k - t_r), \ldots, r_m(k - t_R)]^T,$$
$$r_m(k) = d_m(k)\, PN_m(k)\, e^{j\varphi_m},$$

with $d_m(k)$ the binary data and $\varphi_m$ the carrier phase of user m. N(k) represents the random Gaussian thermal noise. M is the total number of users, R is the total number of rays, d is the interelement distance, $\theta_r$ and $t_r$ are the angle of arrival and the delay of each ray r respectively, and $[\ ]^T$ denotes the transpose. Although the total received signal at the nth antenna element is calculated by considering the interelement phase shift for each incoming ray, $(n-1)kd\sin(\theta_r)$, depending on the environment under investigation, it can also be calculated directly from the ray tracing tool. The output from the adaptive array in vector notation is $y(k) = \mathbf{w}^T(k)\mathbf{x}(k)$, where $\mathbf{w}(k)$ and $\mathbf{x}(k)$ are the weight and element vectors respectively. Using (1), this gives:

$$y(k) = \sum_{n=1}^{N} w_n(k)\, x_n(k)$$

where N is the total number of antenna elements. The desired, or reference, signal $r_0(k)$ is simply the PN sequence from one user (i.e. no data modulation is considered at the moment), and the error signal is defined as the difference between the array output and the desired signal, $e(k) = y(k) - r_0(k)$. This model for the adaptive antenna offers the capability of selecting one from several adaptive processing algorithms, such as the LMS, NLMS, RLS, SQRLS and the DMI [11 - 13].

5. SIMULATION RESULTS

The aim of the simulations is to investigate the performance of the adaptive algorithms on an environment basis and to provide insight into the mechanism followed by the adaptive antenna when operating in conjunction with DS-CDMA. Parameters used in the simulations include: averaging over 15 runs, 8 antenna elements with half wavelength spacing, a 1023 chip M-sequence with 1.25 MHz chipping rate, a step of 0.01 for the LMS and NLMS algorithms and a value of 1 for the forgetting factor of the RLS and the SQRLS algorithms. It can be seen from figure 5 that, as expected, the recursive least squares algorithms converge very fast (within around 50 samples, while neither the LMS nor the NLMS has reached the same level even after ten times that time). The RLS and the SQRLS algorithms have very similar behaviour, with the SQRLS giving the best output and being more robust. The choice of an adaptive algorithm must be made on the basis that the algorithm must be able to rapidly acquire and track the signals in a variety of mobile scenarios. Therefore the obvious choice is either of the RLS and SQRLS algorithms. In the following simulations the RLS algorithm is used.
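The array model and error signal described in this section can be illustrated with a small normalised LMS (NLMS) loop. This is a toy sketch under stated assumptions, not the ray-traced simulation of the paper: it uses a hypothetical 4-element half-wavelength array, one desired and one interfering chip stream arriving as single rays from 0 and 40 degrees, and a step size of 0.5 chosen for quick convergence. The function names are invented for the example.

```python
import cmath
import math
import random


def steering_vector(n_elements, angle_deg, spacing_wavelengths=0.5):
    """Interelement phase shifts exp(j*k*d*(n-1)*sin(theta)) for a linear array."""
    kd = 2.0 * math.pi * spacing_wavelengths
    s = math.sin(math.radians(angle_deg))
    return [cmath.exp(1j * kd * n * s) for n in range(n_elements)]


def run_nlms(n_elements=4, n_samples=500, step=0.5, noise_std=0.05, seed=1):
    """Adapt weights so that y(k) = w^T x(k) tracks the reference chips r0(k)."""
    rng = random.Random(seed)
    a_des = steering_vector(n_elements, 0.0)    # desired user direction
    a_int = steering_vector(n_elements, 40.0)   # interferer direction
    w = [0j] * n_elements
    sq_errors = []
    for _ in range(n_samples):
        r0 = rng.choice((-1.0, 1.0))            # reference PN chip (no data)
        ri = rng.choice((-1.0, 1.0))            # interfering chip
        x = [a_des[n] * r0 + a_int[n] * ri
             + complex(rng.gauss(0.0, noise_std), rng.gauss(0.0, noise_std))
             for n in range(n_elements)]
        y = sum(wn * xn for wn, xn in zip(w, x))    # array output y(k)
        e = y - r0                                  # error e(k) = y(k) - r0(k)
        power = sum(abs(xn) ** 2 for xn in x) + 1e-12
        # Normalised gradient step on |e|^2 for an output of the form w^T x.
        w = [wn - step * e * xn.conjugate() / power for wn, xn in zip(w, x)]
        sq_errors.append(abs(e) ** 2)
    return w, sq_errors


weights, sq_errors = run_nlms()
tail_mse = sum(sq_errors[-50:]) / 50
print(sq_errors[0], tail_mse)
```

The squared error starts at one (zero initial weights) and settles to a small residual set by the noise; an RLS recursion would replace the normalised gradient step with an update of an inverse correlation matrix estimate, which is what gives it the faster convergence reported in the text.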

Figure 5: Mean weight convergence for different algorithms, with 16 users (vertical axis: mean weight for the second antenna element; horizontal axis: samples).

The results depicted in figure 6 show that the array is capable of adapting to the given user scenario even with as many as 24 users. It has to be mentioned here that the SINR values shown in figure 6a are the mean values after convergence. By comparing the results depicted in figures 6b and 6c, the concept of the "smart" antenna is revealed: although the array might be expected to direct its main lobe towards the ray with the maximum incoming power, its first sidelobe towards the ray with the next highest power and so on, it does not do so for the case of figure 6c. The reason for this behaviour is that the criterion used by the adaptive algorithm is the optimum SINR. This is achieved by steering the main lobe towards the second desired ray rather than steering it towards the first desired ray, because there is much more interference around the first ray, which would be accommodated by the main lobe and hence would decrease the output SINR.

Figure 6: (a) Output SINR for the RLS algorithm as a function of the number of users, with the input SINR shown for comparison. (b) & (c) Produced radiation patterns for 8 and 24 users respectively (horizontal axis: angle).

If a better output SINR than the one depicted in figure 6a is needed, then an increase in the number of antenna elements would offer great improvement, as depicted in figure 7.

Figure 7: Output SINR for the RLS algorithm and 16 users as a function of the number of antenna elements (plot annotation: input SINR = -19.61 dB).

Simulations showed that the influence of thermal noise (modelled as white Gaussian random noise) on the adaptive antenna performance is negligible. For example, for a microcellular environment with 16 users and the RLS or the SQRLS algorithm, there is a reduction of less than 0.4 dB in the output SINR. This maximum reduction corresponds to the near worst-case situation of an input SNR of 3 dB. This behaviour can be explained on the grounds that the influence of thermal noise in a system can be neglected when traffic in the system is close to its capacity limit, because then interference power becomes the dominant factor in determining communication quality and channel capacity. This holds even more strongly for DS-CDMA.

6. DISCUSSION

The advantage of using an adaptive antenna with a DS-CDMA system is two-fold. First, the output SINR is greatly improved, which corresponds to an improvement in the capacity of DS-CDMA that can be substantial. Second, the produced radiation pattern has a directionality which varies according to the environment under investigation. For an umbrella cell scenario, due to the small number of signals and their very narrow angular spread, the produced radiation pattern can be very directional, which can be exploited in a number of ways as described in [5]. Even for the microcells, where the number of users is great and the angular distribution of the incoming signals very wide, the produced radiation pattern is going to be better than an omnidirectional pattern (even slightly). Obviously, the pattern oriented analysis of the benefits achieved with an adaptive antenna (discussed in [5]) cannot be applied to the case of microcells.

In a system like DS-CDMA, the optimisation process must be repeated cyclically for each desired user. This can be done either in parallel with the help of a bank of beamformers or with one time-shared beamformer. Considering, as an example, the case of a channel which is sampled every 1 ms, the following can be mentioned:

• For the case of an umbrella cell with 10 users, the time available to the Beamformer to optimise its response for each user in a serial mode corresponds to 125 samples for a 1.25 MHz PN sequence (1250 chips per 1 ms interval shared among 10 users). This means that if fast algorithms are used, the use of one Beamformer in a serial mode can be possible for this kind of cell structure.
• For the case of a microcell with 24 users, the samples available for convergence when one Beamformer is used are limited to 52. This indicates the need for a bank of Beamformers and parallel beamforming.

7. CONCLUSIONS

Work presented in this paper discussed the application of adaptive antennas in a third generation DS-CDMA mixed cell architecture system, at both the umbrella and the microcell base stations. It was shown that an adaptive antenna can be used in order to enhance the performance of a DS-CDMA system. In the microcellular environment, simulation results were presented which employed a Ray Tracing tool to provide the radio channel characteristics. Work currently under way is investigating the performance of an adaptive antenna in different cellular environments with moving users. Also, different forms of adaptive antennas are considered as a function of the environment in which they operate, in an attempt to provide a unified approach for all the different environments.

ACKNOWLEDGEMENTS

George V. Tsoulos wishes to thank the Centre for Communications Research (University of Bristol) for his postgraduate bursary. The authors would like to thank Professor J.P.McGeehan for his continuous encouragement and the provision of laboratory facilities. Also, the authors would like to thank C.M.Simmonds for the propagation measurements and the postprocessing of the results. Finally, many thanks to G.E.Athanasiadou for her help with the Ray Tracing and M.P.Fitton for his help with the field trials.

REFERENCES

[1] IBC Common Functional Specification, "Mobile Communications: General Aspects and Evolution", Specification RACE D731, Issue D, Dec. 1993.
[2] Hakan Eriksson et al., "Multiple Access Options for Cellular Based Personal Communications", 43rd VTC, Secaucus, New Jersey, USA, May 18 - 20 1993, pp. 957-962.
[3] S. Chia, "The Universal Mobile Telecommunications System", IEEE Communications Magazine, pp. 54-62, December 1992.
[4] J.S.Winters, "Signal acquisition and tracking with adaptive arrays in digital mobile radio system IS-54 with flat fading", IEEE Transactions on VT, Vol. VT-42, No. 4, November 1993, pp. 377-384.
[5] G.V.Tsoulos, M.A.Beach, S.C.Swales, "Application of Adaptive Antenna Technology to Third Generation Mixed Cell Radio Architectures", 44th VTC, June 8-10 1994, Stockholm, Sweden, pp. 615-619.
[6] G.V.Tsoulos, M.A.Beach, S.C.Swales, "Adaptive Antennas for Third Generation Cellular Systems", 9th ICAP, 4 - 7 April 1995, Eindhoven, the Netherlands.
[7] Race Tsunami Project, "Requirements for Adaptive Antennas for UMTS", R2108/ART/WP2.1/DS/I/004/b1, 22 April 1994.
[8] M.A.Beach, S.Chard, J.Cheung, T.Martin and T.Wiltshire, "Description of the advanced handover experiment", PLATON R2007, 1993.
[9] S.C.Swales and M.A.Beach, "Direction Finding in the Cellular Land Mobile Radio Environment", IEE Fifth International Conference on Radio Receivers & Associated Systems, RRAS90, University of Cambridge, England, 23rd - 27th July 1990, pp. 192-196.
[10] G.E.Athanasiadou, A.R.Nix, J.P.McGeehan, "A Ray Tracing Algorithm for Microcellular Wideband Modelling", 45th VTC, Chicago, USA, July 1995.
[11] S.Haykin, Adaptive Filter Theory, 2nd edition, Prentice Hall, 1991.
[12] R.Monzingo, T.Miller, Introduction to Adaptive Arrays, John Wiley, 1980.
[13] J.Proakis et al., Advanced Digital Signal Processing, Macmillan Publications, 1992.


The Spectrum Efficiency of a Base Station Antenna Array System for Spatially Selective Transmission

Per Zetterberg, Student Member, IEEE, and Bjorn Ottersten, Member, IEEE

Abstract- In this paper we investigate the spectrum efficiency gain using transmitting antenna arrays at the base stations of a mobile cellular network. The proposed system estimates the angular positions of the mobiles from the received data, and allows multiple mobiles to be allocated to the same channel within a cell. This is possible by applying a transmit scheme which directs nulls against co-channel users within the cell. It is shown that multiple mobiles per cell is an efficient way of increasing capacity in comparison with reduced channel reuse distance and narrow beams (without directed nulls). The effect of the spatial spread angle of the locally scattered rays in the vicinity of the mobile is also investigated.

Manuscript received May 25, 1994; revised February 8, 1995. This work was supported in part by the Swedish National Board for Industrial and Technical Development (NUTEK). The authors are with the Royal Institute of Technology, S-100 44 Stockholm, Sweden. IEEE Log Number 9413245.

I. INTRODUCTION

USING antenna arrays, at the base stations, to perform spatially selective reception and transmission is a newly proposed way of increasing the capacity of a cellular network [2], [13], [15]. In [13] and [15], the reduction of the channel reuse factor is investigated as a means of increasing capacity. The analysis in [13] assumes that ideal sectorized beams are formed in the direction of the mobile, thereby reducing the probability of co-channel interference. In [15] the antenna array outputs are linearly combined to produce the least mean square error at the output. Since the weights are updated at the fading rate, not only is co-channel interference suppressed but the fading is also mitigated. For base to mobile communication, reuse of weights adapted during reception is proposed in [15]. However, this requires a system which uses time division duplex (TDD); that is, contiguous timeslots are allocated for mobile to base and base to mobile communications (at the same frequency). Since outdoor systems such as TACS, GSM, DCS-1800 and IS-54 use different frequency bands for receive and transmit [12], the base to mobile scheme described in [15] cannot be applied in any of these systems. Since it is desirable to increase capacity also in the base to mobile link, we here investigate a transmit scheme which does not rely on reuse of weights adapted during reception. The technique here is based on array response and directional information. In the proposed scheme, the angular positions of the mobiles are estimated during reception and then used to calculate the transmit weights for array transmission. The problem of angle estimation is not addressed in this paper, and we refer to [7],

[9], [10] and [14] where algorithms for solving this task may be found. In order to calculate the transmit weights, the antenna transfer function, at the transmit frequency, is assumed known. In this paper we introduce a new approach for increasing capacity. While [13] and [15] explore reduced reuse distances we here also investigate the reuse of channels within the cells (with unchanged distance between co-channel cells). This permits the use of a simple dynamic channel allocation scheme which avoids major interferers from getting close (in angle) to the desired mobile. Transmit weights are chosen such that a main beam is pointed at the desired mobile with nulls in the direction of co-channel interferers within the cell, but not outside the cell. It is possible to direct nulls against cochannel users in other cells also. However, the implementation of such a scheme in the downlink of a TDMA system might be difficult due to synchronization problems. In this paper, a comparison between reduced cluster sizes and multiple mobiles per channel is also made. Our results indicate that the latter technique is more effective. However, it should be kept in mind that the transmit scheme directs nulls only against co-channel users within the cell. In the analysis, the spread angle of the locally scattered rays in the vicinity of the mobile is a crucial factor. We find that it is possible to increase the capacity between two and twelve times using up to 20 antenna elements. The capacity is largely dependent on the spread angle of the locally scattered rays in vicinity of the mobile and on the number of antennas at the base stations. The paper is organized as follows: Section II explains the cellular network, the base station antenna array transmission system and the propagation modeling. The weight selection algorithm used in base to mobile communication is derived in Section III-A. 
In Section III-B, the channel allocation scheme is presented and finally, Monte Carlo studies are used in Section IV-B to determine the spectrum efficiency gain.

II. PRELIMINARIES

A. The Cellular Network

The coverage area of the mobile radio system is assumed to be divided into a network of hexagons [8], where each hexagonal cell is covered by a base station site. The channel reuse factor (cluster size) will be denoted by $C$, the cell radius by $R$, and the channel reuse distance by $D$. The parameters $C$, $R$ and $D$ are related through $D = \sqrt{3C}\,R$. Thus, a large channel reuse factor, $C$, implies a large distance between co-channel cells, $D$. An increase in $C$ means decreased interference but also a decrease in the number of channels available in the cells. In this paper, the hexagonal cells are divided into three 120° sector subcells. Each of these subcells uses a fixed third of the channels available in the hexagonal cell. The subcells are covered by 120° base station antenna arrays. The sectorization reduces the number of interfering cells in the first tier of interferers from six to two. The concept is illustrated in Fig. 1 for the case $C = 4$. Interferers in the second tier and further away will be neglected.

Reprinted from IEEE Transactions on Vehicular Technology, Vol. 44, No. 3, pp. 651-660, August 1995.

Fig. 1. The Cellular Network with channel reuse factor four, C = 4.

Fig. 2. Illustration of the transmission system.
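The relation between cluster size, reuse distance and the channels left in each sector can be illustrated with a few lines of Python. This is only a sketch: the function names and the figure of 840 system channels are assumptions chosen for illustration and do not come from the paper.

```python
import math


def reuse_distance(cluster_size: int, cell_radius: float) -> float:
    """Co-channel reuse distance D = sqrt(3*C) * R for a hexagonal layout."""
    return math.sqrt(3 * cluster_size) * cell_radius


def channels_per_subcell(total_channels: int, cluster_size: int, sectors: int = 3) -> int:
    """Channels in one sector when the spectrum is split evenly over the
    C cells of a cluster and then over the sectors of each cell."""
    return total_channels // (cluster_size * sectors)


if __name__ == "__main__":
    # 840 channels is a made-up system size used only for this example.
    for c in (1, 4, 7):
        print(c, round(reuse_distance(c, 1.0), 3), channels_per_subcell(840, c))
```

A larger C pushes the co-channel cells further apart but leaves fewer channels in each 120° subcell, which is the trade-off the section describes.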

B. The Base Station Antenna Array Transmission System

The base station transmission system is based on four algorithmic blocks as depicted in Fig. 2. One of the building blocks is the direction finder. This algorithm estimates the angular positions $\Phi$ and the angular spreads $\Sigma$ (to be defined) of the mobiles in the subcell from the received data. As mentioned in the introduction, multiple mobiles will be allocated to the same channel within the subcell. The channel allocator determines which mobiles should be allocated to the same channel. This algorithm uses only the angular positions of the mobiles, $\Phi$, as input. The channel allocation is represented by the $n_c \times d$ matrix $\Theta$, where $n_c$ is the number of channels within each subcell and $d$ is the number of simultaneous mobiles on the channel within the subcell. Each row of $\Theta$ contains the angular positions of the mobiles on the corresponding channel. The elements of an arbitrary row of $\Theta$ will be denoted by the $1 \times d$ vector $\boldsymbol{\theta}$. The corresponding angular spreads will be denoted by $\boldsymbol{\sigma}$.

The weight selector calculates matrices of weights to be used in transmission. One matrix is calculated for each channel. The angular positions, $\boldsymbol{\theta}$, and angular spreads, $\boldsymbol{\sigma}$, of the mobiles allocated on the channel in the cell are used in this calculation. In order to simultaneously transmit $d$ different messages, $\{s_1(t), \ldots, s_d(t)\}$, to $d$ different mobiles, the messages are spatially multiplexed. This operation can be represented by the multiplication of the $m \times d$ dimensional matrix $W^c(\boldsymbol{\theta}, \boldsymbol{\sigma})$ with the $d$ dimensional vector $s(t) = [s_1(t), \ldots, s_d(t)]^T$, i.e.

$$x(t) = W^c(\boldsymbol{\theta}, \boldsymbol{\sigma})\, s(t) \tag{1}$$

where $c$ denotes complex conjugate. The resulting $m$ dimensional vector, $x(t)$, is the input to the $m$ antenna elements. The implementation of the transformation can be done in analog or digital hardware and at different intermediate frequency bands. The exact interpretation of the messages $s_k(t)$ depends on this implementation, but the analysis of this paper is independent of this.

Fig. 3. The Uniform Linear Array (ULA) and the spatial multiplexer.

C. The Antenna Array Configuration

The antenna array of the base station is assumed to be linear with uniformly spaced antenna elements. This form of antenna configuration is known as a Uniform Linear Array (ULA). The individual elements of the array are ideal sectorized antennas with a sector of 120°. The active sectors of the antennas are positioned towards the broadside of the array (see Fig. 3) and the spacing between the antenna elements is set to $\lambda/\sqrt{3}$, where $\lambda$ is the wavelength of the carrier wave. The number of antennas in the configuration, $m$, is an important parameter of the system. In Fig. 3 the polar coordinates $(r, \alpha)$ are introduced. The elements of the $1 \times d$ dimensional vector of mobile positions, $\boldsymbol{\theta}$, are given in terms of the angle $\alpha$.

D. Propagation Modeling

In this section, we define the channel model between the antenna elements of the array and a receiver at the position $(r, \alpha)$. The transfer function consists of three factors: path loss, shadowing, and fast fading. The path loss and the shadowing are common to all the antenna elements. The path loss is modeled as $(1/r)^{\gamma}$, where $\gamma$ is the path loss exponent. The shadowing is modeled by a factor $L$ which has a log-normal distribution [5]. The standard deviation of $10 \log L$ is denoted $\sigma_{dB}$ (the mean is zero). The fading gain and the phase of the $m$ antenna elements of the array are stacked into a vector denoted $v(\alpha, \sigma)$. The vector $v(\alpha, \sigma)$ is a random vector with a distribution depending on $\alpha$ and $\sigma$ (where $\sigma$ is to be defined). When the receiver and transmitter are located in different cells the fading of the antenna elements is assumed to be fully correlated, or equivalently the local scatterers in the vicinity of the mobile have negligible radius (which implies $\sigma = 0°$) in comparison with the distance between the base station and the receiver [6]. Mathematically we model this as

$$v(\alpha, 0°) = \{\text{response of a single ray}\} = F\,a(\alpha) = F\left[1,\ \exp\!\left(-j\tfrac{2\pi}{\sqrt{3}}\sin(\alpha)\right),\ \ldots,\ \exp\!\left(-j(m-1)\tfrac{2\pi}{\sqrt{3}}\sin(\alpha)\right)\right]^T \tag{2}$$

where $F$ is the common fading of the antenna elements and $a(\alpha)$ models the phase differences of the antenna elements due to propagation path differences. The complex random variable $F$ has a Rayleigh distributed amplitude and is uniformly distributed in phase over $[0, 2\pi]$. The Rayleigh distribution is normalized such that $E\{|F|^2\} = 1$. The function $a(\alpha)$ in (2) is assumed to be known, although in practice $a(\alpha)$ may have to be obtained by calibration. When the receiver is in the same cell as the base station, the fading in the antenna elements is not assumed fully correlated. The model used in this situation is discussed in more detail in the next section.

Fig. 4. Illustration of local scattering.
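The single-ray response in (2) is easy to evaluate numerically. The sketch below builds the steering vector $a(\alpha)$ for the element spacing $\lambda/\sqrt{3}$ and draws one realization of $F\,a(\alpha)$. The function names are illustrative assumptions, and $F$ is drawn as a unit-power complex Gaussian, which gives the Rayleigh amplitude and uniform phase stated in the text.

```python
import cmath
import math
import random
from typing import List


def steering_vector(alpha_deg: float, m: int) -> List[complex]:
    """a(alpha) of eq. (2) for an m-element ULA with spacing lambda/sqrt(3)."""
    omega = 2 * math.pi / math.sqrt(3) * math.sin(math.radians(alpha_deg))
    return [cmath.exp(-1j * k * omega) for k in range(m)]


def single_ray_response(alpha_deg: float, m: int, rng: random.Random) -> List[complex]:
    """One realization of v(alpha, 0) = F * a(alpha) with E{|F|^2} = 1."""
    s = math.sqrt(0.5)  # per-component standard deviation for unit total power
    f = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
    return [f * a for a in steering_vector(alpha_deg, m)]


if __name__ == "__main__":
    print(steering_vector(60.0, 4))  # phase step is pi at the sector edge
    print(single_ray_response(10.0, 4, random.Random(0)))
```

With this spacing the phase step between adjacent elements sweeps exactly $[-\pi, \pi]$ as $\alpha$ goes from −60° to 60°, so the sector edges map to a phase step of $\pm\pi$.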

E. Modeling of Local Scattering

In this section, the model of the fast varying factor of the transfer function between the antenna elements of the array and a mobile receiver in the cell is presented. Consider the situation when the signal received by the mobile is built up by $N$ locally scattered rays in the vicinity of the mobile, as depicted in Fig. 4. Assume that each of these rays has an individual stochastic phase $\varphi_k$ and an angular perturbation $\Delta\alpha_k$, so that

$$v(\alpha, \sigma) = \frac{1}{\sqrt{N}} \sum_{k=1}^{N} e^{j\varphi_k}\, a(\alpha + \Delta\alpha_k). \tag{3}$$

The phase shifts, $\varphi_k$, of the rays are assumed to be uniformly distributed over $[0, 2\pi]$, and the angular perturbations, $\Delta\alpha_k$, are assumed to be distributed such that

$$\frac{2\pi}{\sqrt{3}} \sin(\alpha + \Delta\alpha_k) \in N\!\left(\frac{2\pi}{\sqrt{3}} \sin(\alpha),\ \bar{\sigma}\right) \tag{4}$$

where

$$\bar{\sigma} \overset{\text{def}}{=} \frac{\sqrt{3}\,\pi^2}{270°} \cos(\alpha)\,\sigma. \tag{5}$$

Equation (4) means that $(2\pi/\sqrt{3}) \sin(\alpha + \Delta\alpha_k)$, when treated as a random variable, is normally distributed. Equation (4) can be closely approximated as

$$\Delta\alpha_k \in N(0, \sigma) \tag{6}$$

where $\Delta\alpha_k$ is given in degrees. This means that the spatial spread of the energy which is received by the mobile has an approximately normal shape with standard deviation $\sigma$. The normal distribution has previously been used in the propagation study [1]. As will be seen in the simulations, $\sigma$ is a critical parameter for the system. The parameter $\bar{\sigma}$, which is related to $\sigma$ through (5), represents the angular spread in terms of the beamwidth of the array. Since the beamwidth of a linear array increases with $\alpha$, $\bar{\sigma}$ decreases with $\alpha$. We will refer to $\sigma$ as the physical spread and to $\bar{\sigma}$ as the spread in terms of the beamwidth.

Since the number of rays, $N$, is assumed large, it is natural to assume that the entries of the vector $v(\alpha, \sigma)$ are jointly normally distributed. The following expression is obtained for the mean:

$$E\{v(\alpha, \sigma)\} = \sqrt{N}\, E\{e^{j\varphi_k}\}\, E\{a(\alpha + \Delta\alpha_k)\} = 0. \tag{7}$$

The second equality follows since $\varphi_k$ is uniformly distributed over $[0, 2\pi]$ and thus $E\{e^{j\varphi_k}\} = 0$. The covariance matrix is derived next. By definition

$$R_{vv}(\alpha, \sigma) = E\{v(\alpha, \sigma)\, v^*(\alpha, \sigma)\} = E\left\{ \left(\frac{1}{\sqrt{N}} \sum_{k=1}^{N} e^{j\varphi_k} a(\alpha + \Delta\alpha_k)\right) \left(\frac{1}{\sqrt{N}} \sum_{k=1}^{N} e^{j\varphi_k} a(\alpha + \Delta\alpha_k)\right)^{*} \right\} \tag{8}$$

where $(\cdot)^*$ denotes complex conjugate transpose. Since the terms of $v(\alpha, \sigma)$ are independent and equally distributed, the following expression is obtained:

$$R_{vv}(\alpha, \sigma) = E\{a(\alpha + \Delta\alpha_k)\, a^*(\alpha + \Delta\alpha_k)\}. \tag{9}$$
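The local scattering model in (3)-(8) can be explored numerically. The sketch below draws realizations of $v(\alpha, \sigma)$ from (3) using the approximation (6), estimates the covariance matrix by averaging, and compares it with a closed form. The closed form is an assumption of this sketch: it is what the Gaussian model (4) implies through the characteristic function of a normal variable, namely $[R_{vv}]_{pq} = e^{-j(p-q)\omega} e^{-(p-q)^2 \bar{\sigma}^2 / 2}$ with $\omega = (2\pi/\sqrt{3})\sin(\alpha)$, and it is not copied from the paper. All function names are illustrative.

```python
import cmath
import math
import random


def electrical_angle(alpha_deg):
    """(2*pi/sqrt(3)) * sin(alpha): phase step between adjacent elements."""
    return 2 * math.pi / math.sqrt(3) * math.sin(math.radians(alpha_deg))


def sigma_bar(alpha_deg, sigma_deg):
    """Eq. (5): spread in terms of the beamwidth, sigma given in degrees."""
    return math.sqrt(3) * math.pi ** 2 / 270.0 * math.cos(math.radians(alpha_deg)) * sigma_deg


def sample_v(alpha_deg, sigma_deg, m, n_rays, rng):
    """One realization of eq. (3) with angular perturbations drawn as in eq. (6)."""
    v = [0j] * m
    for _ in range(n_rays):
        phase = cmath.exp(1j * rng.uniform(0.0, 2 * math.pi))
        w = electrical_angle(alpha_deg + rng.gauss(0.0, sigma_deg))
        for p in range(m):
            v[p] += phase * cmath.exp(-1j * p * w)
    return [x / math.sqrt(n_rays) for x in v]


def covariance_monte_carlo(alpha_deg, sigma_deg, m, n_rays, n_samples, rng):
    """Sample average of v v^* over n_samples realizations."""
    r = [[0j] * m for _ in range(m)]
    for _ in range(n_samples):
        v = sample_v(alpha_deg, sigma_deg, m, n_rays, rng)
        for p in range(m):
            for q in range(m):
                r[p][q] += v[p] * v[q].conjugate()
    return [[x / n_samples for x in row] for row in r]


def covariance_closed_form(alpha_deg, sigma_deg, m):
    """ASSUMED closed form implied by the Gaussian model (4); see lead-in."""
    w = electrical_angle(alpha_deg)
    sb = sigma_bar(alpha_deg, sigma_deg)
    return [[cmath.exp(-1j * (p - q) * w) * math.exp(-0.5 * ((p - q) * sb) ** 2)
             for q in range(m)] for p in range(m)]


if __name__ == "__main__":
    rng = random.Random(1)
    print(covariance_monte_carlo(0.0, 1.6, 3, 20, 2000, rng)[0][1])
    print(covariance_closed_form(0.0, 1.6, 3)[0][1])
```

The correlation between elements decays with both the element separation and $\bar{\sigma}$, which is why the spread angle matters so much for how sharply the array can form nulls.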

Conditioning on $\left\{\frac{2\pi}{\sqrt{3}}\left(\sin(\alpha + \Delta\alpha_k) - \sin(\alpha)\right) = v\right\}$ and using (4) yields the expression (10) for $R_{vv}(\alpha, \sigma)$.

The phase shifts, $\varphi_k$, in (3) are of course frequency dependent. However, within the bandwidth of the messages, $s_n(t)$, they can be considered to be constant.

F. Outage Probability

Consider the reception at the $k$th mobile on an arbitrary channel. Let $\boldsymbol{\theta}$ be the vector of angular positions of the mobiles at the desired base station, and let $\boldsymbol{\theta}^1$ and $\boldsymbol{\theta}^2$ be the corresponding position vectors at the interfering base stations. Assume further that the position of the $k$th mobile is $(r, \theta_k)$ seen from the desired array and $(r_1, \alpha_1)$, $(r_2, \alpha_2)$ seen from the interfering arrays. The situation is depicted in Fig. 5. By the assumptions made earlier, the signal, $u(t)$, received by the $k$th mobile is given by (11), where $w_n(\boldsymbol{\theta}, \boldsymbol{\sigma})$ is the $n$th column of the weighting matrix $W(\boldsymbol{\theta}, \boldsymbol{\sigma})$ and $s_n(t)$ and $s_n^i(t)$ are the messages transmitted by the desired and interfering base stations respectively.

Fig. 5. The geometry involved in reception at the mobile (k = 2, d = 3).

A natural assessment of the instantaneous Signal to Interference Ratio (SIR) of the $k$th mobile is the squared amplitude of the desired message divided by the sum of the squared amplitudes of the interferers, i.e.:

$$\mathrm{SIR} = \frac{P_d}{P_{hi} + P_{ci}} \tag{12}$$

where $P_d$, $P_{hi}$ and $P_{ci}$ are given by (13)-(15). The variable $P$ is the squared amplitude and is essentially the power of the signals. The subscripts $d$, $hi$ and $ci$ denote "desired," "host interference" and "co-channel interference" respectively. The Outage Probability (OP) will be used as a measure of the quality of the link between the base station and the mobile. It is the probability that the SIR is below a certain threshold $g$, or formally

$$\mathrm{OP} = \Pr\{\mathrm{SIR} < g\}. \tag{16}$$

This is a reasonable measure since it is known that most digital modulation schemes perform well above some threshold in SIR.

G. The Distribution of the Mobiles

To analyze the capacity of the system, some assumptions on the positions of the mobiles must be made. Clearly, if all the mobiles in the cell have the same angular position it will be impossible to host more than one mobile per channel and cell. A natural assumption is that the users are uniformly distributed with respect to area. The resulting angular distribution is plotted in Fig. 6. To simplify the analysis the uniform area distribution will not be used. Instead we will assume that $\alpha$ has a density function such that $(2\pi/\sqrt{3}) \sin(\alpha)$ is uniformly distributed over $[-\pi, \pi]$, or equivalently, that $\alpha$ has the probability density function

$$f_\alpha(v) = \frac{2\pi\sqrt{3}}{3 \cdot 360°} \cos(v), \qquad -60° \le v \le 60°. \tag{17}$$

This distribution is also plotted in Fig. 6.

III. TWO ALGORITHMS

A. Selection of Weights

In this section, we design the algorithm or function which calculates the weighting matrices used in transmission. The algorithm uses only the angular positions of the mobiles in the subcell, $\boldsymbol{\theta} = [\theta_1, \ldots, \theta_d]$, and the angular spreads of those mobiles, $\boldsymbol{\sigma} = [\sigma_1, \ldots, \sigma_d]$, as input. Note that the assumptions described in this section are for the purpose of designing an algorithm. The assumptions in the analysis will differ somewhat.

Fig. 6. Illustration of the angular distribution of the mobiles (curves: assumed distribution and area-uniform distribution; horizontal axis: angle in degrees, from −60 to 60).

The objective of the weight selection algorithm is to provide weighting vectors, $w_k(\boldsymbol{\theta}, \boldsymbol{\sigma})$, such that the Outage Probability (OP) is the lowest possible. We reason as follows: The transmission of the $k$th message $s_k(t)$ will yield contributions to the signal received by $3d$ different mobiles. These mobiles are: the desired ($k$th) mobile, the other $d - 1$ mobiles in the cell which use the same channel, and $2d$ mobiles in the first tier of interferers. Choose $w_k(\boldsymbol{\theta}, \boldsymbol{\sigma})$ such that the quotient of the mean power of the desired contribution to the mean power of the undesired contributions is maximized, thus specifying $w_k(\boldsymbol{\theta}, \boldsymbol{\sigma})$ up to a scalar factor. This scalar factor is chosen such that the nominal gain in the direction of the $k$th mobile is one. Assume that all mobiles in the cell allocated to the channel are one half cell radius from the base station, that is $r = R/2$. Consequently, from (8) and (13) the expectation of the squared amplitude of the desired contribution can be written as

$$E\left\{\left(\tfrac{R}{2}\right)^{-\gamma} L_k \left|w_k^*(\boldsymbol{\theta}, \boldsymbol{\sigma})\, v(\theta_k, \sigma_k)\right|^2\right\} = c_k\, w_k^*(\boldsymbol{\theta}, \boldsymbol{\sigma})\, E\{v(\theta_k, \sigma_k)\, v^*(\theta_k, \sigma_k)\}\, w_k(\boldsymbol{\theta}, \boldsymbol{\sigma}) = c_k\, w_k^*(\boldsymbol{\theta}, \boldsymbol{\sigma})\, R_{vv}(\theta_k, \sigma_k)\, w_k(\boldsymbol{\theta}, \boldsymbol{\sigma}) \tag{18}$$

where

$$c_k = \left(\tfrac{R}{2}\right)^{-\gamma} E\{L_k\}. \tag{19}$$

The expectation of the squared amplitude of $s_k(t)$ at the $n$th mobile ($n \in \{1, \ldots, k-1, k+1, \ldots, d\}$) in the cell is similarly given by

$$E\left\{\left(\tfrac{R}{2}\right)^{-\gamma} L_n \left|w_k^*(\boldsymbol{\theta}, \boldsymbol{\sigma})\, v(\theta_n, \sigma_n)\right|^2\right\} = c_n\, w_k^*(\boldsymbol{\theta}, \boldsymbol{\sigma})\, R_{vv}(\theta_n, \sigma_n)\, w_k(\boldsymbol{\theta}, \boldsymbol{\sigma}) \tag{20}$$

where

$$c_n = \left(\tfrac{R}{2}\right)^{-\gamma} E\{L_n\} \tag{21}$$

and $L_n$ is the log-normal fading at the $n$th mobile. Assume that the angle $\alpha_i$ of a beam which hits a co-channel user in a neighboring cell is distributed according to (17). Assume further that the distance between the base station and this mobile is equal to the distance between co-channel cells, $D$ (Section II-A). It follows that the mean square of the interference generated by the base station at this mobile is given by

$$E\left\{\left(\tfrac{1}{D}\right)^{\gamma} L_i\, |F_i|^2 \left|w_k^*(\boldsymbol{\theta}, \boldsymbol{\sigma})\, a(\alpha_i)\right|^2\right\} = c_i\, w_k^*(\boldsymbol{\theta}, \boldsymbol{\sigma}) \left(\int_{-60°}^{60°} f_\alpha(v)\, a(v)\, a^*(v)\, dv\right) w_k(\boldsymbol{\theta}, \boldsymbol{\sigma}) = c_i\, w_k^*(\boldsymbol{\theta}, \boldsymbol{\sigma})\, w_k(\boldsymbol{\theta}, \boldsymbol{\sigma}) \tag{22}$$

where

$$c_i = \frac{E\{L_i\}\, E\{|F_i|^2\}}{(3R^2C)^{\gamma/2}}. \tag{23}$$

Using $E\{L_k\} = E\{L_n\} = E\{L_i\}$ and $E\{|F_i|^2\} = 1$ the following criterion is now obtained:

$$w_k(\boldsymbol{\theta}, \boldsymbol{\sigma}) = \arg\max_{w} \frac{w^* M_1 w}{w^* M_2 w} \tag{24}$$

where

$$M_1 = (3C)^{\gamma/2}\, 2^{\gamma}\, R_{vv}(\theta_k, \sigma_k) \tag{25}$$

$$M_2 = (3C)^{\gamma/2}\, 2^{\gamma} \sum_{n=1,\, n \ne k}^{d} R_{vv}(\theta_n, \sigma_n) + 2dI. \tag{26}$$

The maximum of the criterion function (24) is given by the largest solution $\mu$ to the generalized eigenvalue problem

$$M_1 e = \mu M_2 e \tag{27}$$

where it should be noted that $M_1$ and $M_2$ are positive definite matrices and $e$ is an eigenvector corresponding to $\mu$. The vector $w_k(\boldsymbol{\theta}, \boldsymbol{\sigma})$ is given by

$$w_k(\boldsymbol{\theta}, \boldsymbol{\sigma}) = \frac{1}{a^*(\theta_k)\, e}\, e. \tag{28}$$

(Numerical algorithms for solving the generalized eigenvalue problem can be found in [3].)
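The weight selection in (24)-(28) can be sketched in plain Python. The example below forms $M_1$ and $M_2$ as in (25)-(26), finds the dominant generalized eigenvector by power iteration on $M_2^{-1} M_1$, and scales it as in (28). Several points are assumptions of this sketch rather than the paper's method: power iteration stands in for the library routines of [3], the covariance uses the closed form implied by the Gaussian model (4), and all function names are hypothetical.

```python
import cmath
import math


def steering(alpha_deg, m):
    w = 2 * math.pi / math.sqrt(3) * math.sin(math.radians(alpha_deg))
    return [cmath.exp(-1j * p * w) for p in range(m)]


def covariance(alpha_deg, sigma_deg, m):
    # ASSUMED closed form of R_vv implied by the Gaussian model (4).
    w = 2 * math.pi / math.sqrt(3) * math.sin(math.radians(alpha_deg))
    sb = math.sqrt(3) * math.pi ** 2 / 270 * math.cos(math.radians(alpha_deg)) * sigma_deg
    return [[cmath.exp(-1j * (p - q) * w) * math.exp(-0.5 * ((p - q) * sb) ** 2)
             for q in range(m)] for p in range(m)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x


def matvec(a, x):
    return [sum(a[i][j] * x[j] for j in range(len(x))) for i in range(len(a))]


def select_weights(thetas, sigmas, k, m, cluster=4, gamma=4.0, iters=200):
    """Weight vector for mobile k, following (24)-(28)."""
    d = len(thetas)
    scale = (3 * cluster) ** (gamma / 2) * 2 ** gamma
    m1 = [[scale * v for v in row] for row in covariance(thetas[k], sigmas[k], m)]
    m2 = [[(2.0 * d if p == q else 0.0) + 0j for q in range(m)] for p in range(m)]
    for n in range(d):
        if n != k:
            r = covariance(thetas[n], sigmas[n], m)
            for p in range(m):
                for q in range(m):
                    m2[p][q] += scale * r[p][q]
    a_k = steering(thetas[k], m)
    e = list(a_k)  # starting vector for the power iteration
    for _ in range(iters):
        e = solve(m2, matvec(m1, e))
        norm = math.sqrt(sum(abs(x) ** 2 for x in e))
        e = [x / norm for x in e]
    gain = sum(a.conjugate() * x for a, x in zip(a_k, e))  # a^*(theta_k) e
    return [x / gain for x in e]


def response(w, alpha_deg):
    """w^* a(alpha): array response of weight vector w towards alpha."""
    a = steering(alpha_deg, len(w))
    return sum(wi.conjugate() * ai for wi, ai in zip(w, a))


if __name__ == "__main__":
    w = select_weights([0.0, 30.0], [0.0, 0.0], k=0, m=4)
    print(abs(response(w, 0.0)), abs(response(w, 30.0)))
```

With zero angular spread the pattern has unit gain towards the desired mobile and a deep null towards the other mobile on the channel. Power iteration is used here only to keep the example free of third-party dependencies.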

B. The Channel Allocator

The channel allocator allocates $d$ mobiles on each of the $n_c$ channels available in the cell, using the information of the angular positions of the mobiles. The objective is to keep the mobiles operating on the same channel well separated in angle. The channel allocation is represented by the $(n_c, d)$ dimensional matrix $\Theta$. Each of the $n_c$ rows of this matrix corresponds to one of the $n_c$ channels available in the cell. The $d$ entries of a row contain the angular positions of the mobiles allocated on the channel. The channel allocator works as follows: Assume that the positions of the $n_c \times d$ mobiles are given by $\Phi = \{\phi_1, \ldots, \phi_{n_c \times d}\}$. The $\Theta$ matrix is then assigned according to (29). This means that $\Theta$ is filled column wise, starting from the upper left corner and finishing at the lower right corner. If the distribution of the rows of $\Theta$ is the same, the interference will be independent of the channel number. This is accomplished by randomly permuting the rows of $\Theta$ such that every possible permutation has equal probability. Furthermore, the columns of $\Theta$ are circulated to remove dependence. The distribution of the positions, $\boldsymbol{\theta}$, of the mobiles on an arbitrary channel is derived and presented in [17]. In particular it is shown that

$$[\bar{\theta}_k - \bar{\theta}_1]_{2\pi} \xrightarrow{2m} \frac{2\pi}{d}(k - 1), \quad \text{as } n_c \to \infty \tag{30}$$

where

$$\bar{\theta}_k = \frac{2\pi}{\sqrt{3}} \sin(\theta_k) \tag{31}$$

and $\xrightarrow{2m}$ denotes convergence in second mean norm. It is also shown in [17] that

$$\bar{\theta}_1 \in \text{Uniform}[0, 2\pi] \tag{32}$$

i.e. $\bar{\theta}_1$ is uniformly distributed over $[0, 2\pi]$. In a practical application the reallocation should be made as often as possible. The computational demands seem to be very small. The most computationally demanding task is the sorting of the $d \times n_c$ mobile positions.

IV. RESULTS

A. Simulation of the Instantaneous SIR

In order to estimate the OP for a mobile, a large number of realizations of the instantaneous SIR is drawn. The following theorem provides a computationally simple formula to obtain these realizations.

Theorem 1: If the number of channels per subcell, $n_c$, is large and the angular spread in terms of the beamwidth, $\bar{\sigma}$ (see Section II-E), is the same at all mobiles (seen from a desired base), i.e.,

$$\sigma = \frac{\sigma_0}{\cos(\alpha)} \tag{33}$$

then the SIR can be simulated using the equations (34)-(42), where

$$B(\bar{\alpha}) = \text{diag}\left([1, e^{-j\bar{\alpha}}, \ldots, e^{-j(m-1)\bar{\alpha}}]\right), \tag{41}$$

$\bar{\alpha}_i \in \text{Uniform}[0, 2\pi]$, $\boldsymbol{\theta}^v$ is given by (38)-(39) and $\boldsymbol{\sigma}^v$ is given by (40). Here $\text{diag}(A)$ denotes a diagonal matrix with elements given by $A$ and $\sin^{-1}(\cdot)$ is the inverse of $\sin(\cdot)$.

Proof: See Appendix B.

The most important steps when using (34)-(42) are:

1) Calculate the matrices $R_{vv}(\theta_k, \sigma_k)$ using (5) and (10).
2) Calculate a square root factorization of $R_{vv}(0, \sigma_0)$, i.e. determine $R_{vv}^{1/2}(0, \sigma_0)$ such that $R_{vv}^{1/2}(0, \sigma_0)\, R_{vv}^{*/2}(0, \sigma_0) = R_{vv}(0, \sigma_0)$.
3) Calculate $w_1(\boldsymbol{\theta}^v, \boldsymbol{\sigma}^v), \ldots, w_d(\boldsymbol{\theta}^v, \boldsymbol{\sigma}^v)$ where $\boldsymbol{\theta}^v$ is given by (38)-(39) and $\boldsymbol{\sigma}^v$ is given by (40).
4) Calculate $r_1$ and $r_2$ using the cellular geometry and the position of the mobile $(r, \alpha)$ (see Fig. 5).
5) Simulate $v(0, \sigma_0)$ using the formula $v(0, \sigma_0) = R_{vv}^{1/2}(0, \sigma_0)\,\zeta$, where $\zeta$ is an $m$ dimensional vector of independent complex elements with real and imaginary parts independently normally distributed $N(0, 1/\sqrt{2})$.
6) The remainder is a straightforward use of (34)-(42).

Note that steps 1-4 only have to be performed one time whereas steps 5 and 6 should be repeated for every new sample of SIR.

The method used to determine the OP of the system is to draw a large number of independent realizations of the instantaneous SIR and count the fraction of times that the SIR falls below the threshold $g$, in accordance with (16). In order to draw these realizations we must specify a number of user parameters. In the next section the influence of the parameters $m$, $d$, $C$, and $\sigma_0$ on the OP will be investigated. (Remember that $m$ is the number of antenna elements in the arrays, $d$ is the number of mobiles per channel in each cell, $C$ is the channel reuse factor, and $\sigma_0$ is the width of the local scatterers.) The remaining parameters are kept constant at the following values: The position of the mobile is given by $(r, \alpha) = (R/2, 0)$ and the angular spread in terms of the beamwidth is the same at all mobiles. The rationale for the assumptions on $r$, $\alpha$ and $\sigma$ will be discussed in Section IV-C. The threshold used in the OP calculation (16) is $g = 9$ dB (simulations using $g = 3$ dB gave qualitatively the same results). The standard deviation of the log-normal fading is $\sigma_{dB} = 6$ dB. The path loss exponent used is $\gamma = 4$. The number of channels in each cell, $n_c$, is large, i.e. $n_c = \infty$. (Simulations have indicated that this assumption can be considered to be valid when $n_c > 100$.) The number of samples of the instantaneous SIR used to estimate each OP is 40000. This yields a theoretical standard deviation in the OP estimate of approximately 0.08% when the true outage probability is 2.5%.

Fig. 7. The OP for the case m = 7 and σ0 = 1.6°, plotted against the spectrum efficiency (E), with the reference OP level marked. (The values of d corresponding to the points in the plot are: C = 1: d = 1; C = 4: d = 1, ..., 5; C = 7: d = 2, ..., 5.)

B. The Spectrum Efficiency of the System

As promised in the introduction of this paper, the capacity gain which can be obtained with the proposed system will be investigated. As a starting point for this analysis the case $m = 1$, $d = 1$, $C = 4$ will be used as a reference. This can be thought of as the system of today, with which the new system is compared. The capacity of the system can be enhanced either by increasing the number of mobiles per channel and cell, $d$, or by decreasing the channel reuse factor $C$. To make a fair comparison of these two alternatives the (relative) spectrum efficiency factor $E$, defined as

$$E = 4\,\frac{d}{C} \tag{43}$$

will be used. Thus, for our reference system $E = 1$ and the OP has been estimated to be 2.65%.

Example 1: In order to determine the maximum spectrum efficiency with seven antenna elements, $m = 7$, the OP is estimated for all combinations of $d \in \{1, \ldots, 7\}$ and $C \in \{1, 4, 7\}$. The result when $\sigma_0 = 1.6°$ is shown in Fig. 7. From this figure it is concluded that the maximum value of spectrum efficiency which can be obtained without having the outage probability exceed the reference 2.65% is $E = 4$. This value is obtained when $C = 4$ and $d = 4$.

Example 2: The same procedure as in Example 1 is repeated here for all $m \in \{1, \ldots, 20\}$ and $\sigma_0 \in \{0°, 0.4°, 0.8°, 1.6°, \ldots, 4.8°\}$. In these simulations, the channel reuse factor four ($C = 4$) was found to be an optimizing value in all the cases. Thus, the maximum spectrum efficiency is equal to the maximum number of mobiles per channel in each cell for the case $C = 4$. In some cases, the channel reuse factor one was also optimal. In Fig. 8, the minimum number of antennas required to obtain an OP less than 2.65% has been plotted as a function of spectrum efficiency. The eight different curves correspond to the eight different values of $\sigma_0$. It is noted that the number of antennas required increases rapidly as $\sigma_0$ increases. In particular it is seen that in order to multiply the capacity by five, six antenna elements are needed when $\sigma_0 = 0°$ whereas 16 are required when $\sigma_0 = 4.0°$.

Fig. 8. The spectrum efficiency when C = 4 (σ is given at α = 0).

From Example 2 we deduce that the performance of the proposed system degrades with increased angular spread. This is in contrast to the reception technique [11], which improves with angular spread. The reason is that as the angular spread increases, the wavefront becomes less planar, which can provide improved performance through diversity gain. Furthermore, when receiving with an antenna array (as in [11]) it is possible to estimate the channel transfer function between the mobiles and the antenna elements, while during transmission (as in this paper) this is impossible due to lack of feedback.

C. Influence of the Mobile Position

In Theorem 1 and in Examples 1 and 2, it is assumed that the spread in terms of the beamwidth of the array, $\bar{\sigma}$, is the same at all angles $\alpha$ (recall Fig. 3) and independent of the distance between the mobile and the array. In this section we show how the results can be interpreted in the physically more reasonable case of a spread which is inversely proportional to the distance between the array and the mobile, $r$, and independent of the angle $\alpha$. Thus, assume that the physical spread $\sigma$ is given by

$$\sigma = \frac{R}{2r}\,\sigma_0 \tag{44}$$

where $\sigma_0$ is the spread at $r = R/2$. Assume further that the weight selection algorithm (Section III-A) still uses the spread $\sigma = \sigma_0/\cos(\alpha)$ when it calculates the correlation matrices, independently of the actual value of $\sigma$. It is possible to show that Theorem 1 still applies except that $v(0, \sigma_0)$ should be replaced by $v(0, \sigma_0 \cos(\alpha)/r)$, where $\alpha$ is the angular position of the mobile. The results of Example 1 and Example 2 still apply in the point $(r, \alpha) = (R/2, 0)$. However, under the assumptions here, the proposed system will most likely have a lower OP than the reference system in most of the cell, because when $\alpha$ and/or $r$ increase, the spread in terms of the beamwidth, $\bar{\sigma}$, decreases and this improves the SIR of the proposed system. To test the validity of this conjecture the OP has been estimated in a large number of positions $(r, \alpha)$ for a large number of cases. In Fig. 9, the result of this validation process in the cases $m = 7$, $d = 4$, $C = 4$, $\sigma_0 = 1.6°$ (Fig. 9a) and $m = 9$, $d = 1$, $C = 1$, $\sigma_0 = 1.6°$ (Fig. 9b) is displayed. The plots illustrate a subcell with the base station in the point $(0, 0)$. The symbol "+" indicates that the proposed system has a lower OP than the reference system, and "o" indicates the reverse. In the case $m = 7$, $d = 4$, $C = 4$, $\sigma_0 = 1.6°$ the proposed system is better than the reference system in approximately 85% of the area, while the corresponding number in the case $m = 9$, $d = 1$, $C = 1$, $\sigma_0 = 1.6°$ is 75%. The results in other cases are similar. To overcome the problem with the area close to the base, reuse partitioning [4] could be used.

Fig. 9. a) m = 7, d = 4, C = 4, σ0 = 1.6°, b) m = 9, d = 1, C = 1, σ0 = 1.6°. (Axis units: cell radii.)

V. CONCLUSIONS

In this paper, we have investigated the spectrum efficiency gain of a base station antenna array system for base to mobile communication. The propagation model includes path loss, shadowing and fast fading. The effect of the spread angle of the locally scattered rays in the vicinity of the mobile is also taken into account. Also, we introduce a new approach for increasing capacity in which multiple mobiles are allocated on the same channel within a cell. The proposed transmit pattern is such that a main beam is pointed at the desired mobile with nulls in the direction of co-channel mobiles within the cell, but not outside of it. For this system, multiple mobiles per cell increases capacity more than reduced reuse distance. It is found that in order to increase capacity $d$ times, approximately $1.7d$ antennas are needed when $\sigma_0 = 0.8°$ whereas approximately $3.2d$ antennas are needed when $\sigma_0 = 4°$. (The approximations are valid for $m \le 20$.)

APPENDIX A

LEMMAS TO THEOREM 1

From the definition of $B(\bar{\alpha})$ in (41) we immediately obtain

$$B^{-1}(\bar{\alpha}) = B^*(\bar{\alpha}) = B(-\bar{\alpha}) \tag{46}$$

and

$$B(\bar{\alpha} + \Delta) = B(\Delta)\, B(\bar{\alpha}). \tag{47}$$

Using (5), (10) and (41) we obtain

$$R_{vv}(\alpha, \sigma_0/\cos(\alpha)) = B\!\left(\tfrac{2\pi}{\sqrt{3}} \sin(\alpha)\right) R_{vv}(0, \sigma_0)\, B^*\!\left(\tfrac{2\pi}{\sqrt{3}} \sin(\alpha)\right). \tag{48}$$

Let us define $\bar{R}_{vv}(\bar{\alpha}, \sigma_0)$ as

$$\bar{R}_{vv}(\bar{\alpha}, \sigma_0) = B(\bar{\alpha})\, R_{vv}(0, \sigma_0)\, B^*(\bar{\alpha}). \tag{49}$$
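The identities (46) and (47), and the way they combine with the definition (49), can be checked numerically. The sketch below does so for an arbitrary Hermitian matrix standing in for $R_{vv}(0, \sigma_0)$. The function names and the test matrix are illustrative assumptions.

```python
import cmath


def b_diag(alpha_bar, m):
    """Diagonal of B(alpha_bar) = diag([1, e^{-j a}, ..., e^{-j (m-1) a}]), eq. (41)."""
    return [cmath.exp(-1j * p * alpha_bar) for p in range(m)]


def conjugate_by_b(alpha_bar, r):
    """Return B(alpha_bar) R B*(alpha_bar) for a square matrix R (list of lists)."""
    m = len(r)
    b = b_diag(alpha_bar, m)
    return [[b[p] * r[p][q] * b[q].conjugate() for q in range(m)] for p in range(m)]


def max_abs_diff(x, y):
    """Largest element-wise deviation between two matrices of equal size."""
    return max(abs(a - b) for row_x, row_y in zip(x, y) for a, b in zip(row_x, row_y))


if __name__ == "__main__":
    m, a, delta = 4, 0.7, 1.9
    # (46): the inverse of a unit-modulus diagonal is its conjugate, i.e. B(-a).
    print(max(abs(x.conjugate() - y) for x, y in zip(b_diag(a, m), b_diag(-a, m))))
    # (47): B(a + delta) = B(delta) B(a), element by element on the diagonal.
    print(max(abs(x - y * z) for x, y, z in
              zip(b_diag(a + delta, m), b_diag(delta, m), b_diag(a, m))))
    # Arbitrary Hermitian stand-in for R_vv(0, sigma_0).
    r0 = [[1.0 if p == q else 0.5 ** abs(p - q) + 0j for q in range(m)] for p in range(m)]
    lhs = conjugate_by_b(a + delta, r0)
    rhs = conjugate_by_b(delta, conjugate_by_b(a, r0))
    print(max_abs_diff(lhs, rhs))
```

All three printed deviations should be at the level of floating-point rounding. The last check shows that shifting the argument of $\bar{R}_{vv}$ amounts to a further conjugation by $B(\Delta)$.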

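From (48) and (49) we obtain

$$R_{vv}(\alpha, \sigma_0/\cos(\alpha)) = \bar{R}_{vv}\!\left(\tfrac{2\pi}{\sqrt{3}} \sin(\alpha), \sigma_0\right). \tag{50}$$

Using (47) and (49) we arrive at

$$\bar{R}_{vv}(\bar{\alpha} + \Delta\bar{\alpha}, \sigma_0) = B(\Delta\bar{\alpha})\, \bar{R}_{vv}(\bar{\alpha}, \sigma_0)\, B^*(\Delta\bar{\alpha}). \tag{51}$$

Define $\bar{w}_k(\bar{\boldsymbol{\theta}}, \sigma_0)$ as

$$\bar{w}_k(\bar{\boldsymbol{\theta}}, \sigma_0) = \arg\max_{x:\ x^* B(\bar{\theta}_k)\, a(0) = 1} \frac{x^* \bar{M}_1 x}{x^* \bar{M}_2 x} \tag{52}$$

where

$$\bar{M}_1 = (3C)^{\gamma/2}\, 2^{\gamma}\, \bar{R}_{vv}(\bar{\theta}_k, \sigma_0) \tag{53}$$

$$\bar{M}_2 = (3C)^{\gamma/2}\, 2^{\gamma} \sum_{n=1,\, n \ne k}^{d} \bar{R}_{vv}(\bar{\theta}_n, \sigma_0) + 2dI. \tag{54}$$

From (24)-(26), (50) and (52)-(54) we obtain

$$w_k(\boldsymbol{\theta}, \sigma_0/\cos(\theta_1), \ldots, \sigma_0/\cos(\theta_d)) = \bar{w}_k\!\left(\tfrac{2\pi}{\sqrt{3}} \sin(\theta_1), \ldots, \tfrac{2\pi}{\sqrt{3}} \sin(\theta_d), \sigma_0\right). \tag{55}$$

Since $B(\bar{\alpha})$ and $\bar{R}_{vv}(\bar{\alpha}, \sigma_0)$ are periodic with periodicity $2\pi$ (from (41) and (51)) we obtain from (52)-(55) that

$$\bar{w}_k(\bar{\boldsymbol{\theta}}, \sigma_0) = \bar{w}_k(\bar{\theta}_1 + k_1 2\pi, \ldots, \bar{\theta}_d + k_d 2\pi, \sigma_0) \tag{56}$$

where $k_1, \ldots, k_d$ are integers.

Theorem 2: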

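$$\bar{w}_k(\bar{\boldsymbol{\theta}} + \Delta, \sigma_0) = B(\Delta)\, \bar{w}_k(\bar{\boldsymbol{\theta}}, \sigma_0). \tag{57}$$

Proof: Using (51) and (52)-(54) we obtain that $\bar{w}_k(\bar{\theta}_1 + \Delta, \ldots, \bar{\theta}_d + \Delta, \sigma_0) := \bar{w}_k(\bar{\boldsymbol{\theta}} + \Delta, \sigma_0)$ is given by

$$\bar{w}_k(\bar{\boldsymbol{\theta}} + \Delta, \sigma_0) = \arg\max_{x:\ x^* B(\Delta) B(\bar{\theta}_k)\, a(0) = 1} \frac{x^* B(\Delta) \bar{M}_1 B^*(\Delta) x}{x^* B(\Delta) \bar{M}_2 B^*(\Delta) x}. \tag{58}$$

Introducing $y = B^*(\Delta) x$ yields

$$\bar{w}_k(\bar{\boldsymbol{\theta}} + \Delta, \sigma_0) = B(\Delta) \arg\max_{y:\ y^* B(\bar{\theta}_k)\, a(0) = 1} \frac{y^* \bar{M}_1 y}{y^* \bar{M}_2 y} \tag{59}$$

where $y = B^*(\Delta) x$. From (52) and (59) it is obvious that

$$\bar{w}_k(\bar{\boldsymbol{\theta}} + \Delta, \sigma_0) = B(\Delta)\, \bar{w}_k(\bar{\boldsymbol{\theta}}, \sigma_0). \tag{60}$$

APPENDIX B

PROOF OF THEOREM 1

The proof here is based on the definitions and results of Appendix A. Without loss of generality the proof is given for mobile number one, that is $k = 1$ in (12)-(15). Applying (55) and (57) (with $\Delta = -\frac{2\pi}{\sqrt{3}} \sin(\theta_1)$) to $w_n^*(\boldsymbol{\theta}, \sigma_0/\cos(\theta_1), \ldots, \sigma_0/\cos(\theta_d))\, v(\theta_1, \sigma_0/\cos(\theta_1))$ yields

$$w_n^*(\boldsymbol{\theta}, \sigma_0/\cos(\theta_1), \ldots, \sigma_0/\cos(\theta_d))\, v(\theta_1, \sigma_0/\cos(\theta_1)) = \bar{w}_n^*([0, \bar{\theta}_2 - \bar{\theta}_1, \ldots, \bar{\theta}_d - \bar{\theta}_1], \sigma_0)\, B(-\bar{\theta}_1)\, v(\theta_1, \sigma_0/\cos(\theta_1)) \tag{61}$$

where

$$\bar{\theta}_k = \frac{2\pi}{\sqrt{3}} \sin(\theta_k). \tag{62}$$

Since $v(\theta_1, \sigma_0/\cos(\theta_1))$ is multivariate normally distributed with mean zero and covariance $R_{vv}(\theta_1, \sigma_0/\cos(\theta_1))$ we obtain from (46), (50) and (51) that

$$B(-\bar{\theta}_1)\, v(\theta_1, \sigma_0/\cos(\theta_1)) \in N\!\left(0,\ B(-\bar{\theta}_1)\, \bar{R}_{vv}(\bar{\theta}_1, \sigma_0)\, B^*(-\bar{\theta}_1)\right) = N\!\left(0,\ R_{vv}(0, \sigma_0)\right). \tag{63}$$

Applying (56) and (63) to (61) yields

$$w_n^*(\boldsymbol{\theta}, \sigma_0/\cos(\theta_1), \ldots, \sigma_0/\cos(\theta_d))\, v(\theta_1, \sigma_0/\cos(\theta_1)) \overset{\text{dist}}{=} \bar{w}_n^*([0, [\bar{\theta}_2 - \bar{\theta}_1]_{2\pi}, \ldots, [\bar{\theta}_d - \bar{\theta}_1]_{2\pi}], \sigma_0)\, v(0, \sigma_0) \tag{64}$$

where $\overset{\text{dist}}{=}$ denotes that the distributions of the left and right hand side are equal. Using the property (30), (55) and (56) on (64) yields

$$w_n^*(\boldsymbol{\theta}, \sigma_0/\cos(\theta_1), \ldots, \sigma_0/\cos(\theta_d))\, v(\theta_1, \sigma_0/\cos(\theta_1)) \overset{\text{dist}}{=} w_n^*(\boldsymbol{\theta}^v, \boldsymbol{\sigma}^v)\, v(0, \sigma_0) \tag{65}$$

where $\boldsymbol{\theta}^v$ is given by (38)-(39) and $\boldsymbol{\sigma}^v$ is given by (40).

Similarly we obtain (66) for the interfering base stations, where $\bar{\theta}_1^i = \frac{2\pi}{\sqrt{3}} \sin(\theta_1^i)$ and $\bar{\alpha}_i = \frac{2\pi}{\sqrt{3}} \sin(\alpha_i)$. Since $\bar{\theta}_1^i$ is independent of $\alpha_i$ and uniformly distributed over $[0, 2\pi]$ from (32), the argument $[\bar{\theta}_1^i - \bar{\alpha}_i]_{2\pi}$ will also be uniformly distributed over $[0, 2\pi]$ and independent of $\bar{\theta}_1^i$. Thus using (66)

$$w_n^*(\boldsymbol{\theta}^i, \sigma_0/\cos(\theta_1^i), \ldots, \sigma_0/\cos(\theta_d^i))\, a(\alpha_i) \overset{\text{dist}}{=} w_n^*(\boldsymbol{\theta}^v, \boldsymbol{\sigma}^v)\, B(\bar{\alpha}_i)\, a(0) \tag{67}$$

where $\bar{\alpha}_i$ is uniformly distributed over $[0, 2\pi]$, $\boldsymbol{\theta}^v$ is given by (38)-(39) and $\boldsymbol{\sigma}^v$ is given by (40). Applying (65) and (67) to (12)-(15) with (33) yields the desired result. □

ACKNOWLEDGMENT

Special thanks to Dr. T. Trump, Dr. U. Forsen, and Dr. M. Almgren for valuable comments and discussions.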

REFERENCES

[1] F. Adachi, M. T. Feeney, A. G. Williamson, and J. D. Parsons, "Crosscorrelation between the envelopes of 900 MHz signals received at a mobile radio base station site," IEE Proc., vol. 133, pt. F, no. 6, Oct. 1986.
[2] S. Anderson, M. Millnert, M. Viberg, and B. Wahlberg, "An adaptive array for mobile communication systems," IEEE Trans. Veh. Technol., vol. 40, pp. 230-236, Feb. 1991.
[3] G. H. Golub and C. F. Van Loan, Matrix Computations. Baltimore: Johns Hopkins Press, 1983.
[4] S. W. Halpern, "Reuse partitioning in cellular systems," Proc. Veh. Technol. Conf. VTC-85, 1985, pp. 322-327.
[5] W. C. Jakes, Ed., Microwave Mobile Communication. New York: Wiley, 1974, pp. 79-131.
[6] W. C. Y. Lee, Mobile Communications Design Fundamentals. New York: Wiley, 1993, pp. 157-159.
[7] J. Li and R. T. Compton, Jr., "Maximum likelihood angle estimation for signals with known waveforms," IEEE Trans. Signal Processing, vol. 41, pp. 2850-2862, Sept. 1993.
[8] V. H. MacDonald, "The cellular concept," Bell Syst. Tech. J., vol. 58, pp. 15-41, Jan. 1979.
[9] S. J. Orfanidis, Optimum Signal Processing, An Introduction. Singapore: McGraw-Hill, 1990.
[10] B. Ottersten, M. Viberg, and T. Kailath, "Analysis of subspace fitting and ML techniques for parameter estimation from sensor array data," IEEE Trans. Signal Processing, vol. 40, no. 3, pp. 590-600, March 1992.
[11] J. Salz and J. H. Winters, "Effect of fading correlation on adaptive arrays in digital wireless communications," in Proc. Centre International de Conferences de Geneve (ICC-93), Geneva, Switzerland.
[12] R. Steele, Ed., Mobile Radio Communications. London: Pentech Press, 1992, pp. 82.
[13] S. C. Swales, M. A. Beach, D. J. Edwards and J. P. McGeehan, "The performance enhancement of multibeam adaptive base station antennas for cellular land mobile radio systems," IEEE Trans. Veh. Technol., vol. 39, pp. 56-67, Feb. 1990.
[14] T. Trump, Maximum likelihood estimation of nominal DOA and angle spread using an array of sensors, Tech. Rep., (IR-S3-SB-9422), Access: see [17] below.
[15] J. H. Winters, "Optimum combining in digital mobile radio with cochannel interference," IEEE Trans. Veh. Technol., vol. 33, pp. 144-155, 1984.
[16] J. H. Winters, "On the capacity of radio communication systems with diversity in a Rayleigh fading environment," IEEE J. Select. Areas Commun., vol. SAC-5, no. 5, pp. 871-878, June 1987.
[17] P. Zetterberg, "The Spectrum Efficiency of a Basestation Antenna Array System for Spatially Selective Transmission," Report Version, (IR-S3-SB-9403), available by Mosaic: Document URL: http://www2.e.kth.se/s3/signaVINDEX.html or by anonymous ftp to: elixir.e.kth.se directory/pub/signal/reports.
[18] P. Zetterberg and B. Ottersten, "Experiments Using an Antenna Array in a Mobile Communications Environment," Proc. 7th SP Workshop on Statistical Signal & Array Processing, 1994 (IR-S3-SB-9412), Access: see [17] above.

480

Capacity Enhancement and BER in a Combined SDMA/TDMA System

Josef Fuhl and Andreas F. Molisch
Institut für Nachrichtentechnik und Hochfrequenztechnik, Technische Universität Wien, Gusshausstrasse 25/389, A-1040 Wien, Austria
Phone: (+43) 1 58801 3546; Fax: (+43) 1 587 05 83; email: [email protected]

Abstract - This paper considers the performance of a TDMA system employing smart antennas at the base station. Two adaptation schemes are analyzed: the switched beam approach and an adaptive array based on an adaptation algorithm that maximizes the Signal-to-Interference-and-Noise Ratio. For an SDMA system the switched beam approach performs worse than the adaptive array. Adaptive arrays based on gradient-vector estimation (e.g. LMS) are not suitable for mobile radio. The class of Least Squares (LS, RLS, SQRLS) algorithms shows satisfactory performance. For a linear array with 8 elements a minimum angular separation of 10° between two users is sufficient for the adaptive array to achieve as good performance as a system serving one user per traffic channel.

I. INTRODUCTION

The growing number of users of cellular communication systems necessitates measures to increase the performance of such systems, i.e. their coverage and capacity. Currently, there is considerable interest in adaptive base station (BS) antenna arrays for 2nd and 3rd generation mobile communication systems. A possible 2-step implementation procedure for smart antennas may be as follows:

• Spatial Filtering at the Uplink and at the Downlink (SFU-SFD): Smart antennas are used both at the uplink (mobile station (MS) transmits, BS receives) and at the downlink (BS transmits, MS receives). Only one user is served in one traffic channel. The aim is increased coverage and decreased interference for cochannel cells.

• Space Division Multiple Access (SDMA): With the use of adaptive directional antennas and additional hardware and software at the BS, users in different angular positions can be served in the same frequency band and in the same timeslot, i.e. on the same traffic channel. The data intended for each user are separately processed in base-band in such a way as to give the user-specific antenna pattern. The signals are added (linearly superposed) and modulated onto the RF carrier, which is radiated from the antenna. This approach leads directly to increased spectral efficiency of the system. However, it can be added to an existing 2nd generation mobile communication system only if there are also changes and redefinitions in the switching and signaling system. The concept of a cell in its traditional sense has to be redefined.

Many papers have been published on SFU, see e.g. [1], [2], [3], [4]. They show that for a single user the bit error rate (BER) can be decreased by pointing the "main beam" of the antenna towards the current location of the user. This contribution is devoted to a true SDMA scenario. We consider the canonical situation in which two users are served on the same traffic channel; consequently the capacity of such a system will nearly be doubled. Our simulations are based on a channel model including directions of arrival and on the air interface of the 2nd generation standards GSM and DCS 1800. We show how the BER is changed by adding the second user, as a function of Signal-to-Noise Ratio (SNR) and the number of antenna elements.

The paper is organized as follows: Section II discusses the channel model used for the simulations. Section III addresses the simulation setup for the whole system. In Section IV we take a look at the performance of different adaptation schemes. Section V gives the simulation results for various parameter combinations. Section VI concludes this work.

II. CHANNEL MODEL

We utilize a channel model including directions of arrival (DOA) and fading [5], [6], [7], and [8] (Fig. 1). It is suitable for both rural and urban areas. Fading correlation at the receive array is automatically included by the model. Like all the references above, our considerations are restricted to a two-dimensional channel model (i.e. the horizontal plane), but as mentioned in [8] this does not impose severe restrictions for mobile radio applications. Many scatterers in the vicinity of the mobile combine to one fading signal, spread out in angle over several degrees depending on the distance of the mobile from the BS. By this model we extend the concept of DOA to a nominal DOA associated with an angular spread, in contrast to the widely used discrete DOAs.

In order to model the propagation physically we partition the propagation area into two different regions [8], [7]: (1) regions without scatterers; (2) circular regions where the scatterers are located. The motivation for choosing circular regions where the scatterers are located lies in measurements conducted by [5] and [6]. The radius $R$ of these regions is about $100\lambda$ to $200\lambda$, where $\lambda$ denotes the wavelength [6]. This model can be easily generalized as shown in [8]. The overall impulse response for this channel at the location of the m-th antenna element $\mathbf{r}_m = [x_m, y_m, z_m]^T$

Reprinted from Proceedings of the 46th Vehicular Technology Conference, Vol. 3, pp. 1481-1485, 1996.


($1 \le m \le M$) is given by

$$h(\mathbf{r}_m, \tau, t, \varphi) = \sum_{l=1}^{L} h_l(\mathbf{r}_m, \tau, t, \varphi), \tag{1}$$

where $L$ is the number of scattering points, $\tau$ is the delay (relative time), $t$ is the absolute time, and $\varphi$ is the azimuth angle. The quantity $h_l(\mathbf{r}_m, \tau, t, \varphi)$ denotes the impulse response of the l-th path at the location $\mathbf{r}_m$ of the m-th antenna element. The impulse response is different for every location of an antenna element.

Fig. 1. Channel model.

A. Angular Spread

We define the angular spread $\alpha_s$ as the angle under which the diameter of the scattering region is seen at the BS (Fig. 1). It is given by

$$\alpha_s = 2 \arctan\left(\frac{R}{r_m}\right), \tag{2}$$

where $R$ is the radius of the scatterer circle and $r_m$ is the distance between BS and the mobile. We assume that the users are uniformly distributed within the cell area. The Cumulative Distribution Function (CDF) $F(\alpha_{s,k})$ of the angular spread $\alpha_{s,k}$ of the incoming signals from user $k$, $1 \le k \le K$, where $K$ denotes the number of users, seen at the BS can be calculated as [9]

(3)

where $r_1$ ($r_2$) is the distance from the BS to the nearest (farthest) location of the user, and $\alpha_0$, given by

$$\alpha_0 = 2 \arctan\left(\frac{R}{r_2}\right), \tag{4}$$

is the angular spread associated with a user located at the cell fringe. Fig. 2 gives the CDF of the angular spread with the outer cell radius $r_2$ as a parameter.

Fig. 2. CDF of the angular spread for $R = r_1 = 100\lambda$, with the outer cell radius $r_2$ as parameter: $r_2 = 1000\lambda$, $r_2 = 5000\lambda$, and $r_2 = 20000\lambda$.
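The angular-spread definition in (2) and its distribution over the cell can be sketched numerically. The following is a minimal illustration, not the expression of [9]: it assumes that "uniformly distributed within the cell area" means uniform per unit area over the ring between $r_1$ and $r_2$, and the function names are hypothetical. Under that assumption the CDF follows from $P(r \ge R/\tan(\alpha/2))$, which the code compares against a Monte Carlo estimate.

```python
import math
import random


def angular_spread(scatter_radius, distance):
    """Eq. (2): angle (radians) under which the scatterer circle is seen at the BS."""
    return 2.0 * math.atan(scatter_radius / distance)


def uniform_area_cdf(alpha, scatter_radius, r_near, r_far):
    """CDF of the angular spread, ASSUMING users uniform per unit area in r_near..r_far."""
    alpha_min = angular_spread(scatter_radius, r_far)   # user at the cell fringe, Eq. (4)
    alpha_max = angular_spread(scatter_radius, r_near)  # user closest to the BS
    if alpha <= alpha_min:
        return 0.0
    if alpha >= alpha_max:
        return 1.0
    # Spread <= alpha is equivalent to distance >= r_alpha.
    r_alpha = scatter_radius / math.tan(alpha / 2.0)
    return (r_far ** 2 - r_alpha ** 2) / (r_far ** 2 - r_near ** 2)


def empirical_cdf(alpha, scatter_radius, r_near, r_far, trials=20000, seed=1):
    """Monte Carlo estimate of the same CDF (hypothetical helper for cross-checking)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Uniform per unit area: r^2 is uniform between r_near^2 and r_far^2.
        r = math.sqrt(rng.uniform(r_near ** 2, r_far ** 2))
        if angular_spread(scatter_radius, r) <= alpha:
            hits += 1
    return hits / trials


# Parameters of Fig. 2 in wavelengths: R = r1 = 100, r2 = 5000.
print(math.degrees(angular_spread(100.0, 5000.0)))           # spread at the cell fringe
print(uniform_area_cdf(math.radians(5.0), 100.0, 100.0, 5000.0))
```

All lengths are expressed in wavelengths, so the wavelength itself never enters the computation.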

III. SIMULATION MODEL

We use a model with a protocol closely related to the 2nd generation standards GSM and DCS 1800. The only difference is that two or more users are served on the same traffic channel (i.e. at the same frequency and time slot). Therefore each user's mobile has to be assigned a unique training sequence, each of which must comply with the GSM (DCS 1800) specifications. This 26-bit midamble, originally intended for estimation of the impulse response of the channel (equalizer training), is now also used for user separation and identification. We assume perfect time alignment of the received sequences. We consider a narrowband channel where transmission suffers from flat fading only. As parameters for the channel we assumed $L = 20$ scattering points per user. All these points are randomly located within the scattering circle. The radius of the scattering circle has been set to $r_1 = R = 100\lambda$ and the outer cell radius to $r_2 = 5000\lambda$.
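The channel parameters above can be turned into a small numerical sketch of one flat-fading channel realization. This is an illustration under stated assumptions rather than the authors' simulator: scatterers are drawn uniformly over the disc around the mobile, each contributes equal power with an independent uniform phase, the array lies along the y-axis with broadside along the x-axis, and the function names are hypothetical.

```python
import cmath
import math
import random


def scatterer_angles(distance_wl, nominal_doa_deg, radius_wl=100.0,
                     n_scatterers=20, seed=0):
    """Directions (radians, from broadside) of scatterers around a mobile.

    The BS is at the origin; the mobile is at distance_wl wavelengths in
    direction nominal_doa_deg. Scatterers are uniform over a disc (assumption).
    """
    rng = random.Random(seed)
    phi0 = math.radians(nominal_doa_deg)
    mx, my = distance_wl * math.cos(phi0), distance_wl * math.sin(phi0)
    angles = []
    for _ in range(n_scatterers):
        rho = radius_wl * math.sqrt(rng.random())   # uniform over the disc area
        psi = 2.0 * math.pi * rng.random()
        angles.append(math.atan2(my + rho * math.sin(psi), mx + rho * math.cos(psi)))
    return angles


def flat_fading_vector(angles, m_elems=8, spacing_wl=0.5, seed=1):
    """Sum of plane waves with random phases at an M-element linear array."""
    rng = random.Random(seed)
    h = [0j] * m_elems
    for phi in angles:
        # Equal-power paths with independent uniform phases (assumption).
        gain = cmath.exp(2j * math.pi * rng.random()) / math.sqrt(len(angles))
        step = 2.0 * math.pi * spacing_wl * math.sin(phi)
        for m in range(m_elems):
            h[m] += gain * cmath.exp(1j * step * m)
    return h


angles = scatterer_angles(distance_wl=1000.0, nominal_doa_deg=20.0)
h = flat_fading_vector(angles)
print([round(abs(x), 3) for x in h])   # per-element fading amplitudes
```

Because all paths arrive within a narrow angular sector, neighboring elements see strongly correlated fading, which is the fading correlation the model is said to include automatically.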

IV. ADAPTATION SCHEMES

Two different adaptation schemes are investigated in this paper: the switched beam approach and an adaptive array based on Least Squares (LS) adaptation.

Switched beam uses a set of $P$ ($P \ge K$) different beam positions (Fig. 3) to separate the users. The output signals from the different positions are demodulated and the reference sequence (training sequence or user identifier) is compared to the training sequences used. Different criteria like minimum Bit-Error-Rate (BER), maximum received power, or a combination of both can be utilized to determine the best suited pattern. The signal containing the training sequence which is closest to the desired sequence (in terms of the above-mentioned criteria) is taken for reception of the specific user.
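The beam selection step can be sketched as follows. This is a simplified illustration, not the authors' implementation: the selection metric is the magnitude of the correlation between each beam output and the user's training sequence, used here as a stand-in for the BER and power criteria named above, and the beam grid, user angles, and training sequences are hypothetical (noise-free, two users, $M = 8$, $d = \lambda/2$).

```python
import cmath
import math


def steering_vector(theta_deg, m_elems=8, spacing_wl=0.5):
    """Response of a uniform linear array to a plane wave (0 deg = broadside)."""
    step = 2.0 * math.pi * spacing_wl * math.sin(math.radians(theta_deg))
    return [cmath.exp(1j * step * m) for m in range(m_elems)]


def beam_output(weights, snapshot):
    """Normalized beamformer output w^H x / M for one array snapshot."""
    return sum(w.conjugate() * x for w, x in zip(weights, snapshot)) / len(weights)


def select_beam(snapshots, training, beam_angles_deg):
    """Index of the fixed beam whose output correlates best with the training sequence."""
    m_elems = len(snapshots[0])
    best_index, best_metric = -1, -1.0
    for p, angle in enumerate(beam_angles_deg):
        w = steering_vector(angle, m_elems)
        corr = sum(beam_output(w, x) * d for x, d in zip(snapshots, training))
        metric = abs(corr) / len(training)
        if metric > best_metric:
            best_index, best_metric = p, metric
    return best_index


# Hypothetical scenario: P = 7 fixed beams, users at +30 and -30 degrees,
# each with its own 26-symbol BPSK training sequence (mutually orthogonal here).
BEAMS = [-45.0, -30.0, -15.0, 0.0, 15.0, 30.0, 45.0]
D1 = [1.0 if n % 2 == 0 else -1.0 for n in range(26)]
D2 = [1.0 if (n // 2) % 2 == 0 else -1.0 for n in range(26)]
A1, A2 = steering_vector(30.0), steering_vector(-30.0)
SNAPSHOTS = [[a1 * d1 + a2 * d2 for a1, a2 in zip(A1, A2)]
             for d1, d2 in zip(D1, D2)]

print(BEAMS[select_beam(SNAPSHOTS, D1, BEAMS)])   # beam chosen for user 1
print(BEAMS[select_beam(SNAPSHOTS, D2, BEAMS)])   # beam chosen for user 2
```

The sketch also makes the limitation of the approach visible: the selected beam is restricted to the fixed grid, so it cannot steer a null onto the other user.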

Fig. 3. Switched beam.

Fig. 4. Adaptive array.

Adaptive arrays are based on maximization of the Signal-to-Noise-and-Interference Ratio (SNIR) using a known sequence [10] (Fig. 4). Algorithms based on gradient-vector estimation (LMS) and/or Least Squares (LS, RLS, SQRLS and its non-deterministic counterpart, the time-space domain Wiener filter) may be utilized. Fig. 5a shows the averaged mean power of the error for the LMS. The number of antenna elements is $M = 8$, their spacing $d = \lambda/2$. The step size $\mu_{LMS}$ of the LMS algorithm was set to half of the maximum step size that still yields convergence. The Signal-to-Noise Ratio (SNR) was set to 20 dB. Since the LMS does not converge within the 26-bit training sequence, we used the sequence repeatedly. The slow convergence of the LMS algorithm is remarkable. This is due to the eigenvalue spread of the covariance matrix of the antenna outputs. For smart antenna applications the correlation matrix is usually ill-conditioned (i.e. its eigenvalues are widely spread), therefore the convergence speed of the LMS is low. Fig. 5b shows the averaged mean power of the error for the RLS, SQRLS, and LS algorithm. The forgetting factors of both RLS and SQRLS were set to 1. They show that the RLS and the SQRLS perform in the same way for infinite precision arithmetic. For finite precision arithmetic the SQRLS is preferable [11]. These graphs illustrate that LMS is not suitable for a mobile radio environment with short training sequences [12], [4], but LS (Wiener filter) or (SQ)RLS algorithms are well suited.

Fig. 5. Averaged mean power $e^2$ of the error for the considered adaptation schemes. (a) LMS; (b) - - : LS, - : RLS, and o : SQRLS. Note the different scale on the x-axis.

V. SIMULATION RESULTS

A. Single User

Fig. 6 shows the BER for the considered adaptation strategies for various SNRs with the number of antenna elements $M$ as parameter. The curve for $M = 1$ agrees well with theory (BER $\approx 1/(2\,\mathrm{SNR}_{lin})$ for $\mathrm{SNR}_{lin} > 10$, where $\mathrm{SNR}_{lin}$ is the SNR on a linear scale). Switched beam performs better than the adaptive array. This may be attributed to the non-stationary channel. The nominal DOA of the impinging signals is the only quantity being stationary. This gives advantages for DOA-based approaches for this specific scenario.
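The adaptation strategies compared here differ mainly in how quickly they converge within the 26-symbol midamble (Section IV). The following noise-free sketch illustrates the effect of eigenvalue spread on LMS versus RLS. It is not the simulation behind Fig. 5: the scenario (desired user at broadside, one interferer at 30° that is 20 dB stronger, alternating BPSK training, RLS initialization constant) is a hypothetical choice made so that the covariance matrix is ill-conditioned, and the RLS recursion follows the standard textbook form with forgetting factor 1.

```python
import cmath
import math


def steering_vector(theta_deg, m_elems=8, spacing_wl=0.5):
    """Response of a uniform linear array to a plane wave (0 deg = broadside)."""
    step = 2.0 * math.pi * spacing_wl * math.sin(math.radians(theta_deg))
    return [cmath.exp(1j * step * m) for m in range(m_elems)]


def inner(w, x):
    """Inner product w^H x of two complex lists."""
    return sum(wi.conjugate() * xi for wi, xi in zip(w, x))


def lms(snapshots, training, mu):
    """LMS: w <- w + mu * x * conj(e). Returns weights and a-priori error magnitudes."""
    w = [0j] * len(snapshots[0])
    errors = []
    for x, d in zip(snapshots, training):
        e = d - inner(w, x)
        errors.append(abs(e))
        w = [wi + mu * xi * e.conjugate() for wi, xi in zip(w, x)]
    return w, errors


def rls(snapshots, training, delta=1e-2):
    """RLS with forgetting factor 1 and P(0) = I / delta (delta is an assumption)."""
    m = len(snapshots[0])
    w = [0j] * m
    p = [[(1.0 / delta if r == c else 0.0) + 0j for c in range(m)] for r in range(m)]
    errors = []
    for x, d in zip(snapshots, training):
        px = [sum(p[r][c] * x[c] for c in range(m)) for r in range(m)]   # P x
        gain = [v / (1.0 + inner(x, px).real) for v in px]              # k = P x / (1 + x^H P x)
        e = d - inner(w, x)
        errors.append(abs(e))
        w = [wi + ki * e.conjugate() for wi, ki in zip(w, gain)]
        # P <- P - k x^H P, using x^H P = (P x)^H because P is Hermitian.
        p = [[p[r][c] - gain[r] * px[c].conjugate() for c in range(m)] for r in range(m)]
    return w, errors


A_DES, A_INT = steering_vector(0.0), steering_vector(30.0)
TRAINING = [1.0 if n % 2 == 0 else -1.0 for n in range(26)]
# Interferer sends a constant symbol with 10x amplitude (20 dB stronger).
SNAPSHOTS = [[a * d + 10.0 * b for a, b in zip(A_DES, A_INT)] for d in TRAINING]
# Half of the maximum step size 2 / lambda_max, with lambda_max = 100 * 8 here.
MU = 0.5 * 2.0 / (100.0 * 8)

_, LMS_ERR = lms(SNAPSHOTS, TRAINING, MU)
_, RLS_ERR = rls(SNAPSHOTS, TRAINING)
print(LMS_ERR[-1], RLS_ERR[-1])   # error at the last training symbol
```

In this setting the LMS step size is dictated by the strong interferer, so the weak desired-signal mode adapts by only a few percent per symbol, whereas RLS reaches the least-squares solution after a handful of snapshots.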

Fig. 6. BER versus SNR for a single-user system. - - : switched beam; - : LS; o : RLS, SQRLS. A linear array with $d = \lambda/2$ is used. $M$ is the number of antenna elements.

B. Two Users

The simplest SDMA scenario is two users within the same traffic channel ($K = 2$). With a conventional detector without any equalization the BER for one user is 0.25 for a Signal-to-Noise Ratio (SNR) of SNR = $\infty$ [9]. Fig. 7 shows the BER curves for one of the two users served in the same traffic channel. If two users are close in space they can hardly be separated by a conventional SDMA approach. Even if they could be separated at the uplink, downlink transmission to the mobiles without causing significant interference to at least one of the users is not possible. These close-by users could only be served on the same traffic channel if more sophisticated signal processing at the mobile is used [13]. So, if the two users are too close in angle, i.e. their angle difference reaches a given threshold value $\varphi_{th}$, a "handover" of one user to a different frequency or timeslot has to be performed. This threshold, of course, influences the capacity: the higher the threshold, the lower the capacity. For RLS, SQRLS, Wiener filter, and the Least Squares approach a threshold of $\varphi_{th} = 10°$ with $M = 8$ antenna elements is sufficient to obtain nearly the same performance as for the single-user case. Although Fig. 5 suggests otherwise, we found that LS and (RLS, SQRLS) perform equally well BER-wise. The figures demonstrate that the switched beam approach performs worse compared to LS adaptation. Switched beam uses a given set of beams and selects the best of them according to a chosen criterion. Since this approach cannot place the maximum in the "best", i.e. desired, direction and a concomitant zero in the unwanted angular section, interference is always higher than for the adaptive array approach. This leads to the increase in BER especially for high-SNR conditions. For one specific scenario Fig. 8 shows the antenna patterns serving the two users separately, derived by LS adaptation. The adaptation algorithm indeed places a maximum into the direction of the wanted signals and nulls the interfering ones.

Fig. 7. BER versus SNR for a two-user system. - - : switched beam; - : LS; o : RLS, SQRLS. (a) Threshold ...

Fig. 8. Antenna patterns to separate the two users, derived by the use of LS adaptation. - : pattern to serve user 1; - - : pattern to serve user 2. The antenna is a linear array with $M = 8$ elements spaced by $d = \lambda/2$.

VI. CONCLUSIONS

The performance of two adaptation schemes for smart antennas for mobile communications is analyzed. The switched beam approach performs worse compared to an adaptive array, especially at high-SNR conditions. For the adaptive array only the class of LS adaptation algorithms (LS, RLS, and SQRLS) shows satisfactory performance. The convergence speed of the class of gradient-vector estimation algorithms (like the famous LMS) is too low for application to mobile radio problems. Adaptive array approaches show surprisingly low sensitivity to close-by interference. This means that only small angular threshold values for handovers are necessary, which results in a larger capacity increase than for the switched beam approach.

VII. ACKNOWLEDGMENT

Support of this work by the Austrian PTT is gratefully acknowledged. The views expressed in this paper are those of the authors and do not necessarily reflect the views within the Austrian PTT. We thank Prof. Bonek for stimulating discussions.

VIII. REFERENCES

[1] J. H. Winters. "Signal Acquisition and Tracking with Adaptive Arrays in the Digital Mobile Radio System IS-54 with Flat Fading." IEEE Trans. Vehicular Technology, vol. 42, no. 4, November 1993, pp. 377-384.
[2] O. Munoz-Medina, J. Fernandez-Rubio. "Adaptive Antennas in Mobile-Satellite Communications." RACE Mobile Telecommunications Workshop, May 17-19, 1994, Amsterdam, pp. 754-759.
[3] I. R. Carden, M. Barrett. "Adaptive Antennas for Second and Third Generation Mobile Systems." RACE Mobile Telecommunications Workshop, May 17-19, 1994, Amsterdam, pp. 728-732.
[4] G. V. Tsoulos, M. A. Beach, S. C. Swales. "Adaptive Antennas for Third Generation DS-CDMA Cellular Systems." Proc. VTC'95, Chicago, Illinois, USA, July 25-28, pp. 45-49.
[5] P. Eggers. "Einfluß der Geländestreuung auf die Strahlungseigenschaften von Basisstationsantennen für Mobilfunksysteme", Nachrichtentech., Elektron., Berlin 45 (1995) 4, pp. 57-62.
[6] W. C.-Y. Lee. "Effects on Correlation Between Two Mobile Radio Base-Station Antennas." IEEE Trans. on Communications, vol. COM-21, no. 11, November 1973, pp. 1241-1221.
[7] O. Norklit, J. Bach Andersen. "Mobile Radio Environments and Adaptive Arrays". Proc. 4th Personal, Indoor and Mobile Radio Conference, PIMRC '94, The Hague, Netherlands, pp. 725-728.
[8] J. J. Blanz, P. W. Baier and P. Jung. "A Flexibly Configurable Statistical Channel Model for Mobile Radio Systems with Directional Diversity". Proc. 2. ITG-Fachtagung Mobile Kommunikation '95, Neu-Ulm, September 26-28, 1995, pp. 93-100.
[9] J. Fuhl, E. Bonek. "Adaptation Schemes for Smart Antennas - A Comparison Based on a Channel Model Including Directions-Of-Arrival", COST 231 TD(96), Belfort, France, 24-26 January, 1996.
[10] S. Haykin. Adaptive Filter Theory (Second Edition). Prentice-Hall, Inc., Englewood Cliffs (New Jersey), 1991.
[11] F. M. Hsu. "Square Root Kalman Filtering for High-Speed Data Received over Fading Dispersive HF Channels", IEEE Trans. on Inf. Theory, vol. IT-28, no. 5, September 1982, pp. 753-763.
[12] J. Fuhl, A. F. Molisch. "Space Domain Equalisation for Second and Third Generation Mobile Radio Systems", Proc. 2. ITG-Fachtagung Mobile Kommunikation, Neu-Ulm, September 26-28, 1995, pp. 85-92.
[13] S. W. Wales. "Technique for cochannel interference suppression in TDMA mobile radio systems", IEE Proc.-Commun., vol. 142, no. 2, April 1995, pp. 106-114.

Performance of Wireless CDMA with M-ary Orthogonal Modulation and Cell Site Antenna Arrays Ayman F. Naguib, Member, IEEE, and Arogyaswami Paulraj, Fellow, IEEE

Abstract - In this paper, an antenna array-based base station receiver structure for wireless direct-sequence code-division multiple-access (DS/CDMA) with M-ary orthogonal modulation is proposed. The base station uses an antenna array Beamformer-RAKE structure with noncoherent equal gain combining. The receiver consists of a "front end" beamsteering processor feeding a conventional noncoherent RAKE combiner. The performance of the proposed receiver with closed loop power control in multipath fading channels is evaluated. Expressions for the system uncoded bit-error probability (BEP) as a function of number of users, number of antennas, and angle spread are derived for different power control scenarios. The system capacity in terms of number of users that can be supported for a given uncoded BEP is also evaluated. Analysis results show a performance improvement in terms of system capacity due to the use of antenna arrays and the associated signal processing at the base station. In particular, analysis results show an increase in system capacity that is proportional to the number of antennas. They also show an additional performance improvement due to space diversity gain provided by the array for nonzero angle spreads.

I. INTRODUCTION

Direct-sequence code-division multiple-access (DS/CDMA) is an emerging technology for civilian wireless communications. CDMA offers improved performance in terms of capacity or coverage area over frequency-division multiple-access (FDMA) or time-division multiple-access (TDMA) based cellular networks [1]-[4]. One approach to further increase the system performance is the use of spatial processing with base station antenna arrays [5]-[8]. By using spatial processing at cell sites, we can use optimum directional receive and transmit beams for each user to improve coverage or increase capacity. The increase in system performance by using antenna arrays in CDMA comes from reduction of cochannel interference from own cell and neighboring cells. In general, this reduction in cochannel interference can be used to improve other system performance measures such as coverage area, transmitted mobile power, and system capacity.

In [9], we proposed a base station antenna array Beamformer-RAKE receiver that exploits the spatial structure in the multipath received signal in addition to the time diversity to provide a more efficient combining of paths. This receiver incorporates a "front-end" beamsteering processor feeding a conventional coherent RAKE combiner, which assumes knowledge of the amplitudes and phases of path gains. This requires the transmission of a pilot signal to obtain good amplitude and phase estimates. However, transmitting a pilot in each user's reverse link signal, whose power is greater than the data-modulated portion of the signal, reduces efficiency to less than 50%. Instead, either differential phase shift keying (DPSK), which does not require phase coherence, or M-ary orthogonal modulation with noncoherent reception should be used. For $M > 8$, where $M$ is the number of orthogonal signals, orthogonal modulation is better than DPSK [10], [11].

Analysis results for CDMA communications systems employing DPSK are reported in [12]-[15]. The analysis in [15] also assumes a base station antenna array receiver structure. Analysis for DS/CDMA with M-ary orthogonal modulation but without antenna arrays has appeared in [16]-[20]. The use of very low rate orthogonal codes in spread spectrum multiple access is considered in [16]. The analysis in [17] is done for the additive white Gaussian noise (AWGN) channel. However, the analysis in [18] models the multiple-access interference (MAI) as Gaussian noise and considers a multipath Rayleigh-fading channel [19]. The analysis in [20] is done for a general multipath energy distribution.

In this paper, we propose an antenna array-based base station receiver structure for DS/CDMA with M-ary orthogonal modulation and noncoherent RAKE combining and study its performance. The receiver consists of an antenna array followed by a bank of Walsh correlators. The base-band received signal vector and the post-correlation signal vector are used to estimate the channel vector for each path. The output of the correlators is then fed to an optimum beamformer followed by a noncoherent RAKE combiner. The output of the RAKE combiner is then used to estimate the transmitted data. Assuming that the channel parameters are almost constant over several symbol periods, the output of the RAKE is also fed back to the channel vector estimation algorithm to determine the winning post-correlation signal vector, which corresponds to the actual transmitted Walsh symbol. We may note that, although the analysis results in this paper draw upon some of the analysis results in [19] and [20], they are different in the sense that they are done for a base station receiver with an antenna array and include the effect of closed loop power control.

Manuscript received May 1, 1995; revised November 22, 1995. This paper was presented in part at the IEEE ICC'95 Conference, Seattle, WA, June 1995. A. F. Naguib was with the Information Systems Laboratory, Department of Electrical Engineering, Stanford University, Stanford, CA 94305 USA. He is now with the Information Sciences Research Center, AT&T Laboratories, Murray Hill, NJ 07974 USA (email: [email protected]). A. Paulraj is with the Information Systems Laboratory, Department of Electrical Engineering, Stanford University, Stanford, CA 94305 USA. Publisher Item Identifier S 0733-8716(96)05239-0.

Reprinted from IEEE Journal on Selected Areas in Communications, Vol. 14, No.9, pp. 1770-1783, December 1996.


Fig. 1. Mobile transmitter block diagram.

This paper is organized as follows. In Section II, we present the channel and received signal model. In Section III, we describe the antenna array receiver structure. System analysis and derivation of the signal and decision variables statistics are given in Section IV. The probability of error analysis is given in Section V. Numerical and simulation results are presented in Section VI. Finally, Section VII contains our conclusions and remarks.

II. SIGNAL AND CHANNEL MODEL

The mobile transmitter block diagram [21] is shown in Fig. 1. The binary data at the output of the interleaver are grouped into groups of $J = \log_2 M$ bits. Each group is mapped into one of $M$ orthogonal Walsh sequences $W(t)$. The resulting signal is then spread using the user's long PN code $c_i(t)$. The signal is further multiplied in both I and Q channels by the short PN codes $a^{(I)}(t)$ and $a^{(Q)}(t)$, respectively. The PN modulated Q channel signal is delayed by half a chip period $T_c/2$. The two spread signals are upconverted to radio frequency for transmission. The power of the transmitted signal is adjusted according to both the open and closed loop power control mechanisms (see, e.g., [21]). Then we can write the signal transmitted by the ith mobile as

$$s_i(t) = \psi_i \sqrt{P_i}\,\Bigl( W^{(h)}(t)\, c_i(t)\, a^{(I)}(t) \cos(\omega_c t) + W^{(h)}(t - T_o)\, c_i(t - T_o)\, a^{(Q)}(t - T_o) \sin(\omega_c t) \Bigr), \quad 0 < t \le T_w \tag{1}$$

where $P_i$ is the transmitted power per symbol per dimension, $T_w$ is the symbol period, $\omega_c$ is the carrier angular frequency, $T_o = T_c/2$ is the time offset between the I and Q channels, and finally $\psi_i$ is a Bernoulli random variable that models the voice activity of the ith user (we assume that a user will be on with probability $\nu$ and will be off with probability $1 - \nu$). $W^{(h)}(t)$ is the hth orthogonal Walsh function, $h = 1, \ldots, M$. Let the processing gain be $G = T_w/T_c$. For simplicity of notation, we shall denote the product of the user's PN code and the I or Q channel PN code as

$$c_i^{(I)}(t) = c_i(t)\, a^{(I)}(t) \quad \text{and} \quad c_i^{(Q)}(t) = c_i(t)\, a^{(Q)}(t).$$

To simplify our analysis, the PN codes $c_i^{(I)}(t)$ and $c_i^{(Q)}(t)$ are represented by [20], [22]

$$c_i^{(I)}(t) = \sum_{k=-\infty}^{\infty} c_{i,k}^{(I)}\, p(t - kT_c) \tag{2}$$

$$c_i^{(Q)}(t) = \sum_{k=-\infty}^{\infty} c_{i,k}^{(Q)}\, p(t - kT_c) \tag{3}$$

where $c_{i,k}^{(I)}$ and $c_{i,k}^{(Q)}$ are assumed to be independent and identically distributed (i.i.d.) random variables taking values $\pm 1$ with equal probability, and $p(t)$ is the chip pulse shape, which can be any time-limited waveform. Here we assume that $p(t)$ is rectangular although our results can be easily extended for any time-limited waveform.

To simplify our analysis, we assume that we have a constant deterministic power-delay profile and that the log-normal slow fading is the same for all multipath components. Therefore, we can write the complex lowpass equivalent representation of the vector multipath fading channel from the ith user to the base station antenna array as [23]

$$\mathbf{h}_i(t, \tau) = \sqrt{S_i} \sum_{l=1}^{L_i} \delta(t - \tau_{l,i})\, \mathbf{a}_{l,i} \tag{4}$$

where $L_i$ is the number of multipath components for the ith user, $S_i$ is the log-normal shadowing experienced by the same user, $\tau_{l,i}$ is the time delay of the lth multipath component, and $\mathbf{a}_{l,i}$ is the $K \times 1$ channel vector of the base station antenna array to signals in the lth path from the ith user, where $K$ is the number of antennas. Without loss of generality, throughout this paper we shall assume that we have a uniform linear array (ULA) of omnidirectional sensors in each sector at the base station. Each of the resolvable multipath components (i.e., those separated by more than $T_c$ from one another) will actually be themselves a linear combination of several

487

L

parallel demodulators

1- '

t ·····--·----·.. - ··.········.···--·--····-···.,

l--

1

Downconverter 10-.... and LPF

-44

Walsh

H ,

·

•• •

Walsh

:

~equence

Corre lators

~equence

Correlator s

1-

..

~

L parallel demodulators ,( ········--·· ·· ·--··---- ···-·--------- ······- -1

...

rlH Walsh ~quence ~ Downconverter fand LPF

~

, ~ .................

+{ : I

1 ~

~

pre-correlation

signal vector X

Fig. 2. Basc

•••

II""

I I

Walsh:equence CorreIators

~

, ,,I , ,,I

j

.

4

stat ion recei ver

Mvary Decoder,

•

(M)

data

Measure Frame Error Rate(FER) i+-

.

Threshold

P-- Closed Loop

post-correlation Z signal vector

'. t

Deinterleaver, and Viterbi Decoder

~

Power Control Algorithm Up/Dow n

WeightVector Estimation

weight vectors

Z' ••• , WL

Wi' W

I Select Index, I ofMaximum

select post-correlation signal vector

Power Control Command

I

block diagram.

unresolvable paths of varying amplitudes and phases which arrive at the base station within angle spread ±.Ll./,i at angle BI " with respect to the array. The elements of al,; are zero mean complex Gaussian correlated random variables , each having an autocorrelation function Ju(21f!dT), where fd is maximum Doppler shift, and if 1= k otherwise.

(5)

This implies that the channel vectors of two differen t paths are independent. The matrix RI,i is a K x K Hermitian Toeplitz matrix that describes the correlation between the elements of the channel vector of the lth multipath component ar , which is a function of the angle spread .Ll././, the mean angle of arrival &/.,' the waveleng th A, and the spacing between sensors d. If we assume a uniform angle of arnval over the angle spread, then it can be shown that [24], [23], the real and imaginary parts of R I " (m , 71,) are given by

= Jo(r(m,n)) + 2 L 00

Re{R/,,(m, n)}

••

ET

l.. . .. ..... . . .. .. .. .. . .. . . ....... . ... .. .__•••••• J

.~

•••

Optimum Beamfonning and Incoherent I-RAKE Combining

......•

,

ZJ

ZJ

JIl"".

Correlators

(1)

Jl

•••

\7

K

Optimum Beamforming and Incoherent I-RAKE Combining

1• • • ••• • • • • •• •• _. - • • • • • • • • • •• • • • • - . .... . .. .....

• •

L.--

.

JII"'.

1=1

hl+l (r(m, 71,)) sin((21 + 1)B1,i )

/=0

x sinc«2l + l).Ll.l ,,)

(7)

and r(m,n) = 21fd ·Im - nl /'\, We assume that the channel parameters vary slowly as compared to the symbo l duration T", so that they are constant over several symbol durations, Therefore, after downconverting to baseband , we can write the K x 1 complex baseband received signal vector for the ith user as

= jSiPi'lfJ; L L,

x ;(t )

c~h)(t -

Tl ")ejh 'al ,,

(8)

1=1

where

¢l "

=

WeTl ,.

and c~") ( t) is defined as

Let N be the number of cochannel mobiles. The total received signa l at the cell site is the sum of all users' signals plus noise and is given by

h/(r(m,n))

x cos(2lBL / ) sinc(2l6i,/)

= 2L 00

Im{ R/,;(m,n)}

L x.;(t) + net) . N

(6)

488

x(t) ::-:

.=1

(10)

Fig. 3. Correlators.

The vector $n(t) = n_c(t) + j n_s(t)$ is the $K \times 1$ additive Gaussian noise vector with zero mean and covariance

where $\sigma_n^2$ is the noise variance per antenna.

III. RECEIVER MODEL

The block diagram of the base station antenna array receiver is shown in Fig. 2. It has a "Beamformer-RAKE" structure in which several multipath components are tracked in both time and space. After down-converting to baseband, the outputs of the LPF are fed into a bank of $M$ Walsh correlators, shown in Fig. 3. Assuming that the $h$th Walsh symbol was transmitted, where $h = 1,\ldots,M$, the pre-correlation and post-correlation $K \times 1$ signal vectors $x(t)$ and $y_{l,i}^{(h)}$ are used to estimate the channel vector $a_{l,i}$ and the corresponding $K \times 1$ optimum beamforming weight vector $w_{l,i}$ from the pre-correlation and post-correlation array covariances $R_{xx}$ and $R_{yy,l,i}$, using the code filtering approach derived in [9] and [23]. The details of this algorithm, and of how $R_{xx}$ and $R_{yy,l,i}$ are tracked as the channel varies, can be found in [23] and [25]. For each multipath component we have $M$ different post-correlation signal vectors $y_{l,i}^{(h)}$, $h = 1,\ldots,M$. The vectors $y_{l,i}^{(h)}$ are fed to an optimum beamformer. The outputs of the $L_i$ beamformers for the $h$th Walsh function, $w_{l,i}^{*} y_{l,i}^{(h)}$, are then fed into an incoherent RAKE combiner. The output of the incoherent RAKE combiner, $z_i^{(h)}$, is the decision variable for the $h$th Walsh function. The beamformer and the incoherent RAKE combiner for the $h$th Walsh function are shown in Fig. 4.

Fig. 4. Optimum beamforming and incoherent RAKE.

In order to update the post-correlation array covariance $R_{yy,l,i}$ (which is used in estimating $a_{l,i}$ and $w_{l,i}$), the receiver needs the post-correlation vector $y_{l,i}^{(h)}$ corresponding to the true transmitted Walsh symbol $W^{(h)}(t)$. However, at this stage the receiver has no prior knowledge of which post-correlation vector $y_{l,i}^{(h)}$ is the right one. Here, the receiver relies on the inherent correlation of the multipath vector channel and on the assumption that the channel remains almost constant over several symbol periods. The receiver therefore uses a delayed update of $R_{yy,l,i}$ (and hence a delayed estimate of the channel vector and of the optimum beamforming weight vector). This is done by using the decision on the current Walsh symbol, $\hat{h}$, to select the post-correlation vector $y_{l,i}^{(\hat{h})}$ that updates $R_{yy,l,i}$ and yields the optimum weight vector $w_{l,i}$. This weight vector $w_{l,i}$ is then used for beamforming for the next symbol. The decision variables $z_i^{(1)},\ldots,z_i^{(M)}$ at the output of the incoherent RAKE are fed to an $M$-ary decoder, deinterleaver, and Viterbi convolutional decoder.

Without loss of generality, let us assume that the first user is the desired user, and let $\tau_{k,1}$ be the time delay of the $k$th tracked multipath, which is assumed to be estimated perfectly, with $k = 1,\ldots,L_1$. Then we can write the post-correlation signal vector $y_{k,1}^{(n)}$ for the $k$th tracked multipath component of the first user as

$$y_{k,1}^{(n)} = \frac{1}{\sqrt{T_w}} \int_{\tau_{k,1}}^{\tau_{k,1}+T_w} x(t)\,c_1^{(n)*}(t-\tau_{k,1})\,dt \tag{12}$$

$$y_{k,1}^{(n)} = d_{k,1}^{(n)} + u_{k,1}^{(n)}, \quad \text{if } n = h \tag{13}$$

$$y_{k,1}^{(n)} = u_{k,1}^{(n)}, \quad \text{if } n \neq h \tag{14}$$

where

$$d_{k,1}^{(n)} = 2\sqrt{T_w S_1 P_1}\,\psi_1\,e^{j\phi_{k,1}}\,a_{k,1} \tag{15}$$

and

$$u_{k,1}^{(n)} = m_{k,1}^{(n)} + s_{k,1}^{(n)} + n_{k,1}^{(n)}. \tag{16}$$

Here $d_{k,1}^{(n)}$ is the desired signal vector, $m_{k,1}^{(n)}$ is the MAI signal vector, $s_{k,1}^{(n)}$ is the self-interference (SI) signal vector due to other multipath components of the first user, and $n_{k,1}^{(n)}$ is due to the AWGN. Let $A_i = \sqrt{S_i P_i}\,\psi_i$. Also, for the $k$th tracked multipath, let the optimum beamforming weights determined using the previously estimated Walsh symbol be $w_{k,1}$. For an equal gain combining incoherent RAKE, the decision variable of the first user corresponding to the $n$th Walsh symbol is given by [10]

$$z_1^{(n)} = \sum_{k=1}^{L_1} \left| w_{k,1}^{*}\,y_{k,1}^{(n)} \right|^2, \qquad n = 1,\ldots,M. \tag{17}$$

Now, to select which post-correlation signal vector $y_{k,1}^{(n)}$ should be used in estimating the post-correlation array covariance $R_{yy,l,i}$, a hard decision is made on which Walsh symbol was transmitted:

$$\hat{h} = \arg\max_{n}\; z_1^{(n)}. \tag{18}$$

However, for the data, a symbol-by-symbol $M$-ary decoder is used [11]. Both approaches yield exactly the same decisions for the $M$-ary symbol and both are optimal (i.e., a maximum likelihood rule) for an AWGN channel. Since the MAI is not necessarily Gaussian, this decision rule is actually not optimal. However, when the number of cochannel users is large, the MAI can be modeled as Gaussian noise and therefore this decision rule can be used. The primary reason for using the symbol-by-symbol approach for the data is to provide improved performance with error-correcting codes by using soft decision decoding [10], [11]. Note also that we cannot use the output after convolutional decoding and deinterleaving to select the post-correlation signal vector. The reason is that we would have to wait for a decision to be made on the current symbol and then convolutionally encode and interleave again. By the time this process is over (which takes at least twice the duration of one frame of bits), the channel will have changed, and the estimated channel vector and the channel vector of the new symbol will be quite different. This leads to a degradation in the beamformer output SINR.

IV. SIGNAL STATISTICS

In order to derive the uncoded bit error probability (BEP), we need to derive the statistics of the decision variables $z_1^{(1)}, z_1^{(2)}, \ldots, z_1^{(M)}$. First, we will examine the different terms in $u_{k,1}^{(n)}$, i.e., the multiple access interference signal vector due to other cochannel users $m_{k,1}^{(n)}$, the self-interference signal vector $s_{k,1}^{(n)}$ due to the user's own multipath components, and the noise vector $n_{k,1}^{(n)}$.
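The combining and hard-decision steps of Section III can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the toy channel vectors, and the small deterministic perturbation standing in for interference are all hypothetical, the weights are taken as the normalized channel vectors, and Walsh indices are 0-based.

```python
import math


def inner(w, y):
    """Beamformer output w^* y: conjugate the weights, sum over the K antennas."""
    return sum(wi.conjugate() * yi for wi, yi in zip(w, y))


def decision_variables(weights, post_corr):
    """Equal-gain incoherent RAKE: z[n] = sum over paths k of |w_k^* y_k^(n)|^2.

    weights   : list of L weight vectors (each a list of K complex numbers)
    post_corr : post_corr[n][k] is the K-vector y_k^(n) for Walsh index n, path k
    """
    return [sum(abs(inner(w, y)) ** 2 for w, y in zip(weights, per_path))
            for per_path in post_corr]


def hard_decision(z):
    """Index of the largest decision variable (0-based Walsh index)."""
    return max(range(len(z)), key=lambda n: z[n])


# Toy example (assumed numbers): K = 2 antennas, L = 2 paths, M = 4 Walsh symbols.
channels = [[1 + 1j, 1 - 1j], [0.5j, -1 + 0j]]
weights = []
for a in channels:
    norm = math.sqrt(sum(abs(v) ** 2 for v in a))
    weights.append([v / norm for v in a])      # w_k = a_k / |a_k|

h = 2                                          # transmitted Walsh index
post_corr = []
for n in range(4):
    per_path = []
    for k, a in enumerate(channels):
        # small deterministic perturbation standing in for interference plus noise
        u = [0.1 * (1j ** (n + k)), 0.1 * (1j ** (n + k + 1))]
        d = [2 * v for v in a] if n == h else [0j, 0j]
        per_path.append([dv + uv for dv, uv in zip(d, u)])
    post_corr.append(per_path)

z = decision_variables(weights, post_corr)
print(z, hard_decision(z))
```

In the receiver described above, the index returned by `hard_decision` would select which post-correlation vector updates the array covariance for the next symbol.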

A. Noise Analysis

The noise term $n_{k,1}^{(n)}$ is given by

$$n_{k,1}^{(n)} = \frac{1}{\sqrt{T_w}} \int_{\tau_{k,1}}^{\tau_{k,1}+T_w} n(t)\,c_1^{(n)*}(t-\tau_{k,1})\,dt = \left(n_{k,1}^{(n),II} + n_{k,1}^{(n),QQ}\right) + j\left(-n_{k,1}^{(n),IQ} + n_{k,1}^{(n),QI}\right) \tag{19}$$

where $n_{k,1}^{(n),II}$, $n_{k,1}^{(n),QQ}$, $n_{k,1}^{(n),IQ}$, and $n_{k,1}^{(n),QI}$ are the four terms obtained by correlating the in-phase and quadrature noise components with the in-phase and quadrature code components. We can easily show that $n_{k,1}^{(n),II}$ is a Gaussian random vector with zero mean and covariance $(\sigma_n^2/2)\,I$. Similarly, we can show that $n_{k,1}^{(n),QQ}$, $n_{k,1}^{(n),IQ}$, and $n_{k,1}^{(n),QI}$ are all uncorrelated zero-mean Gaussian random vectors with the same covariance. Hence $n_{k,1}^{(n)}$ is a zero-mean Gaussian random vector with covariance $2\sigma_n^2 I$.

B. Self and Multiple-Access Interference Analysis

The self-interference due to other multipath components is given by

$$s_{k,1}^{(n)} = A_1 \sum_{l=1,\,l\neq k}^{L_1} I_{k,1,l,1}^{(n)}\,e^{j\phi_{l,1}}\,a_{l,1} \tag{28}$$

$$= A_1 \sum_{l=1,\,l\neq k}^{L_1} \left[\left(I_{k,1,l,1}^{(n),II} + I_{k,1,l,1}^{(n),QQ}\right) + j\left(-I_{k,1,l,1}^{(n),IQ} + I_{k,1,l,1}^{(n),QI}\right)\right] e^{j\phi_{l,1}}\,a_{l,1} \tag{29}$$

where $I_{k,1,l,i}^{(n),II}$, $I_{k,1,l,i}^{(n),QQ}$, $I_{k,1,l,i}^{(n),IQ}$, and $I_{k,1,l,i}^{(n),QI}$ are the corresponding in-phase and quadrature cross-correlation terms between the code of the $k$th tracked path of the first user and the delayed code of the $l$th path of the $i$th user. Also, we can write the MAI due to other users' signals as

$$m_{k,1}^{(n)} = \sum_{i=2}^{N} A_i \sum_{l=1}^{L_i} \left[\left(I_{k,1,l,i}^{(n),II} + I_{k,1,l,i}^{(n),QQ}\right) + j\left(-I_{k,1,l,i}^{(n),IQ} + I_{k,1,l,i}^{(n),QI}\right)\right] e^{j\phi_{l,i}}\,a_{l,i}. \tag{37}$$

Let the spread in-phase waveform of the $i$th user be written as the chip-pulse train

$$\sum_{r=-\infty}^{\infty} q_{i,r}^{(h)}\,p(t - rT_c) \tag{38}$$

where $q_{i,r}^{(h)} = W_r^{(h)} c_{i,r}^{I}$ and $p(t)$ is the chip waveform. It follows from (2) and (3) that $q_{i,r}^{(h)}$, $r = -\infty,\ldots,\infty$, is an i.i.d. binary random sequence taking values $\pm 1$ with equal probability. Hence, it follows that $I_{k,1,l,i}^{(n),II}$ is zero mean. Using (38), we can rewrite $I_{k,1,l,i}^{(n),II}$ as

$$I_{k,1,l,i}^{(n),II} = \frac{1}{\sqrt{T_w}} \sum_{b=0}^{G-1} q_{1,b}^{(n)} \int_{\tau_{k,1}+bT_c}^{\tau_{k,1}+(b+1)T_c} p(t - bT_c - \tau_{k,1}) \sum_{r=-\infty}^{\infty} q_{i,r}^{(h)}\,p(t - rT_c - \tau_{l,i})\,dt \tag{39}$$

where $\beta_{k,1,l,i} = \tau_{l,i} - \tau_{k,1}$ modulo $T_c$, and $R_p(s)$ is the partial autocorrelation of the chip waveform, defined as

$$R_p(s) = \int_0^{s} p(t)\,p(t + T_c - s)\,dt, \qquad 0 \le s \le T_c. \tag{40}$$

For rectangular pulses, $R_p(s) = s$. For asynchronous networks, a reasonable assumption is that the $\beta_{k,1,l,i}$ are independent and uniformly distributed over $[0, T_c)$. Let

$$\Gamma_b = q_{i,b-1}^{(h)}\,R_p(\beta_{k,1,l,i}) + q_{i,b}^{(h)}\,R_p(T_c - \beta_{k,1,l,i}). \tag{41}$$

Using the results in [22], we can show that $\{\Gamma_b\}_{b=0,\ldots,G-1}$ are independent random variables and hence

$$\mathrm{Var}\{I_{k,1,l,i}^{(n),II}\} = \frac{2}{3}\,T_c. \tag{42}$$

Similarly, we can also show that $I_{k,1,l,i}^{(n),II}$, $I_{k,1,l,i}^{(n),QQ}$, $I_{k,1,l,i}^{(n),IQ}$, and $I_{k,1,l,i}^{(n),QI}$ are all zero-mean uncorrelated random variables with the same variance given by (42).

Remark: In deriving the variance of $I_{k,1,l,i}^{(n),II}$, we used the assumption that the chip pulse shape is rectangular. In reality the channel is bandlimited (due to low pass filtering following the down converter) and the received signal cannot be a square wave. Under the condition that the same amount of energy is received regardless of the channel used, the received signal through a bandlimited channel will have a higher peak value, resulting in a higher level of MAI due to larger fluctuations. In [19] and [22], it was shown that if the bandlimited channel has ideal low pass filter characteristics with a bandwidth $B = 1/T_c$, then we would have

$$\mathrm{Var}\{I_{k,1,l,i}^{(n),II}\} = T_c. \tag{43}$$

The total interference vector

$$i_{k,1}^{(n)} = m_{k,1}^{(n)} + s_{k,1}^{(n)} \tag{44}$$

is modeled as a zero-mean complex Gaussian random vector with covariance $R_{ii,k,1}^{(n)} = E\{i_{k,1}^{(n)}\,i_{k,1}^{(n)*}\}$. Although this assumption does not always hold for CDMA analysis, it was shown in [22] that it is valid for large $G$. Moreover, simulation results presented later in Section VI show that for large $N \cdot L$, if we assume that the angles of arrival of the multipath components are uniformly distributed over the sector, the total interference vector $i_{k,1}^{(n)}$ will be spatially white. In this case

(45)

where

(46)

and $C$ is a constant equal to two for a bandlimited channel and $4/3$ for a rectangular pulse shape. For the remainder of our analysis we will assume the case of a bandlimited channel, i.e., $C = 2$. Similarly, the covariance of $n_{k,1}^{(n)}$ will then be $\mathrm{Var}\{n_{k,1}^{(n)}\} = 2\sigma_n^2 I$. The covariance of the interference-plus-noise vector $u_{k,1}^{(n)}$ is then given by

$$R_{uu,k,1}^{(n)} = \sigma^2 I. \tag{47}$$

C. Decision Statistics

Consider the pre-correlation and post-correlation signal vectors $x(t)$ and $y_{k,1}^{(n)}$. With the assumption that the MAI is spatially white, the optimum beamforming weights can be shown to be

$$w_{k,1} = \zeta\,a_{k,1} \tag{48}$$

where $\zeta$ is some arbitrary constant (that does not change the beamformer output SINR). For simplicity of the analysis, we set $\zeta = 1/\sqrt{a_{k,1}^{*} a_{k,1}}$. Define the beamformer output for the $k$th multipath component of the first user, $w_{k,1}^{*} y_{k,1}^{(n)}$, as $U_{k,1}^{(n)}$:

$$U_{k,1}^{(n)} = 2A_1\sqrt{T_w}\,|a_{k,1}|\,e^{j\phi_{k,1}} + w_{k,1}^{*} u_{k,1}^{(n)}, \quad \text{for } n = h \tag{49}$$

$$U_{k,1}^{(n)} = w_{k,1}^{*} u_{k,1}^{(n)}, \quad \text{for } n \neq h \tag{50}$$

where $|a_{k,1}| = \sqrt{a_{k,1}^{*} a_{k,1}}$. We can easily show that $V_{k,1}^{(n)} = w_{k,1}^{*} u_{k,1}^{(n)}$ is a zero-mean complex Gaussian random variable with variance $\sigma^2$. For simplicity of notation, let $L_1 = L$. Then the decision variables for the first user are

$$z_1^{(n)} = \sum_{k=1}^{L} \left| U_{k,1}^{(n)} \right|^2. \tag{51}$$

From [10], and conditioned on $A_1$ and $a_{l,1}$, $l = 1,\ldots,L$, we can show that for $n = h$, $z_1^{(n)}$ has a noncentral $\chi^2$ distribution with $2L$ degrees of freedom and noncentrality parameter

$$E = 4A_1^2\,T_w \sum_{l=1}^{L} |a_{l,1}|^2. \tag{52}$$

The noncentrality parameter $E$ is the symbol energy. For $n \neq h$, $z_1^{(n)}$ has a $\chi^2$ distribution with $2L$ degrees of freedom. Therefore, we can write the conditional probability density function (pdf) of $z_1^{(n)}$ as

$$f_{z_1^{(n)}}(z \mid \gamma_s) = \begin{cases} \dfrac{1}{\sigma^2}\left(\dfrac{z}{\sigma^2\gamma_s}\right)^{(L-1)/2} \exp\left(-\dfrac{z}{\sigma^2} - \gamma_s\right) I_{L-1}\!\left(2\sqrt{\dfrac{\gamma_s z}{\sigma^2}}\right), & n = h \\[2ex] \dfrac{z^{L-1}}{\sigma^{2L}\,\Gamma(L)}\,e^{-z/\sigma^2}, & n \neq h \end{cases} \tag{53}$$

where $I_{L-1}(\cdot)$ is the modified Bessel function of the first kind of order $L-1$, $\Gamma(\cdot)$ is the Gamma function, and

$$\gamma_s = \frac{E}{\sigma^2}. \tag{54}$$

We may recognize $\gamma_s$ as the symbol energy to interference-plus-noise ratio.

Remark: In this analysis, we used the assumption that the channel vector remains constant over two symbol periods. Also, we assumed that $a_{k,1}$ is estimated perfectly. In reality the channel is time varying and the array covariances are estimated using few samples. This will lead to errors in the estimated channel vector and hence a reduction in $\gamma_s$. In [25], we studied these losses as a function of the maximum Doppler shift $f_d$ and angle spread $\Delta$. These results show that, with the optimum forgetting factor for recursively estimating the array covariances, the loss in $\gamma_s$ is less than a few tenths of a dB. Therefore, the analysis results obtained here can be regarded as an upper bound on the system performance.
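The first moments implied by the decision statistics above are easy to confirm by simulation. The sketch below is a minimal Monte Carlo check, assuming beamformer outputs of the form "mean plus complex Gaussian with variance $\sigma^2$" as in (49) and (50). The expected values $\sigma^2(L + \gamma_s)$ for $n = h$ and $\sigma^2 L$ for $n \neq h$ are standard properties of noncentral and central $\chi^2$ variables, not numbers taken from the paper, and all parameter values are illustrative.

```python
import math
import random


def simulate_mean_z(means, sigma2, trials, rng):
    """Sample mean of z = sum_k |mu_k + v_k|^2, with v_k complex Gaussian of variance sigma2."""
    s = math.sqrt(sigma2 / 2.0)          # standard deviation per real dimension
    total = 0.0
    for _ in range(trials):
        z = 0.0
        for mu in means:
            u = mu + complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
            z += abs(u) ** 2
        total += z
    return total / trials


# Assumed example: L = 4 paths, sigma^2 = 1, each path mean has magnitude sqrt(2),
# so the symbol energy is E = 8 and gamma_s = E / sigma^2 = 8.
L, sigma2 = 4, 1.0
means_h = [math.sqrt(2.0) * complex(math.cos(k), math.sin(k)) for k in range(L)]
means_other = [0j] * L
gamma_s = sum(abs(m) ** 2 for m in means_h) / sigma2

rng = random.Random(1)
mean_h = simulate_mean_z(means_h, sigma2, 20000, rng)
mean_other = simulate_mean_z(means_other, sigma2, 20000, rng)
print(mean_h, sigma2 * (L + gamma_s))    # simulated vs. expected mean for n = h
print(mean_other, sigma2 * L)            # simulated vs. expected mean for n != h
```

The gap between the two means is what the hard decision exploits, and it grows linearly with $\gamma_s$.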

V. PROBABILITY OF ERROR ANALYSIS

In this section we derive the uncoded BEP with hard decision. As mentioned earlier, with forward error correction (FEC) and interleaving, which is not considered in this paper, symbol-by-symbol soft decision decoding yields better performance. To derive the probability of error, without loss of generality, let us assume that $h = 1$, i.e., the first Walsh symbol $W^{(1)}(t)$ is transmitted. Then the probability of symbol error is

$$P_M(\gamma_s) = 1 - P_c = 1 - P\left(z_1^{(2)} < z_1^{(1)},\; z_1^{(3)} < z_1^{(1)},\; \ldots,\; z_1^{(M)} < z_1^{(1)}\right) \tag{55}$$

$$= 1 - \int_0^{\infty} \left[P\left(z_1^{(2)} < z \mid z_1^{(1)} = z\right)\right]^{M-1} f_{z_1^{(1)}}(z \mid \gamma_s)\,dz \tag{56}$$

$$= 1 - \int_0^{\infty} \left[1 - e^{-z/\sigma^2} \sum_{l=0}^{L-1} \frac{1}{l!}\left(\frac{z}{\sigma^2}\right)^{l}\right]^{M-1} f_{z_1^{(1)}}(z \mid \gamma_s)\,dz \tag{57}$$

where $f_{z_1^{(1)}}(z \mid \gamma_s)$ is the $n = h$ density in (53), and

$$P\left(z_1^{(2)} < z \mid z_1^{(1)} = z\right) = \int_0^{z} f_{z_1^{(2)}}(x)\,dx \tag{58}$$

$$= 1 - e^{-z/\sigma^2} \sum_{l=0}^{L-1} \frac{1}{l!}\left(\frac{z}{\sigma^2}\right)^{l}. \tag{59}$$

Finally, the corresponding uncoded BEP $P_b(\gamma_s)$ is given by

$$P_b(\gamma_s) = \frac{2^{J-1}}{2^{J}-1}\,P_M(\gamma_s) \tag{60}$$

where $J = \log_2 M$ is the number of bits per Walsh symbol.

The symbol error probability and the corresponding uncoded BEP derived above are conditional probabilities and are functions of $\gamma_s$, which is itself a function of the channel vectors $a_{1,1},\ldots,a_{L,1}$, the shadowing and path loss $S_1$, and the first user's transmitted power $P_1$. Also note that because of power control (both open-loop and closed-loop), $P_1$, $S_1$, and $a_{1,1},\ldots,a_{L,1}$ are generally dependent variables. The dependency among these variables is in general a function of the maximum Doppler shift $f_d$ of the first user. A reasonable assumption is that the combination of open-loop and closed-loop power control is perfect in eliminating the slow fading due to shadowing and path loss [26]-[28]. Based on the mobile speed (or $f_d$), we consider the following different cases.

A. Low Doppler Frequency

At low Doppler frequencies and high diversity orders, fast closed-loop power control can eliminate most of the channel variation due to multipath fast fading. In the case of ideal power control, $\gamma_s$ is a fixed quantity and is given by

$$\gamma_s = \bar{\gamma}_s = \bar{\gamma}\,L\,K \tag{61}$$

where $\bar{\gamma}$ is the symbol energy to interference-plus-noise ratio per path per antenna. Also, the density function of $z_1^{(n)}$ for $n = h$ given in (53) becomes an unconditional density. Therefore, the symbol error probability is

$$P_M = 1 - \int_0^{\infty} \left[1 - e^{-z/\sigma^2} \sum_{l=0}^{L-1} \frac{1}{l!}\left(\frac{z}{\sigma^2}\right)^{l}\right]^{M-1} \frac{1}{\sigma^2}\left(\frac{z}{\sigma^2\bar{\gamma}_s}\right)^{(L-1)/2} \exp\left(-\frac{z}{\sigma^2} - \bar{\gamma}_s\right) I_{L-1}\!\left(2\sqrt{\frac{\bar{\gamma}_s z}{\sigma^2}}\right) dz. \tag{62}$$

However, due to the delay in the control loop, the finite step size by which the mobile can increase or decrease its power, and errors on the downlink, power control cannot be ideal (see, e.g., [28]). Therefore, the symbol error probability obtained above needs to be averaged over the probability density function of $\gamma_s$, which is not known in general. However, an approximation to the uncoded BEP can be obtained using the approach in [29] and [20]. First, let $C_v$ denote the coefficient of variation of $\gamma_s$, defined as

$$C_v = \frac{\sqrt{\mathrm{Var}\{\gamma_s\}}}{E\{\gamma_s\}}. \tag{63}$$

In [29] and [20], it is shown that for a low $C_v$ (see [20] for the admissible range), a reasonable approximation of the symbol error probability is

$$\bar{P}_M \approx \frac{2}{3}\,P_M(\bar{\gamma}_s) + \frac{1}{6}\,P_M\!\left(\bar{\gamma}_s + \sqrt{3}\,\sigma_\gamma\right) + \frac{1}{6}\,P_M\!\left(\bar{\gamma}_s - \sqrt{3}\,\sigma_\gamma\right) \tag{64}$$

where $\bar{\gamma}_s$ and $\sigma_\gamma^2$ are the mean and variance of $\gamma_s$. We can see that $\sigma_\gamma$ also represents the power control error. Then the corresponding uncoded BEP is

$$\bar{P}_b = \frac{2^{J-1}}{2^{J}-1}\,\bar{P}_M. \tag{65}$$

B. High Doppler Frequency

For high Doppler frequency and/or long loop delay, the fading statistics of the received signal after power control remain the same as those of the multipath fast fading with only perfect average power control. In this case, we have

$$\gamma_s = \bar{\gamma} \sum_{l=1}^{L} |a_{l,1}|^2. \tag{66}$$

The distribution of $\gamma_s$ depends on the angle spread $\Delta$ through $a_{l,1}$. We consider the following three cases.

1) Small Angle Spread: For zero (or relatively small) angle spread $\Delta$, the channel vector of the $l$th multipath component can be expressed as

$$a_{l,1} = \alpha_{l,1}\,v_{l,1} \tag{67}$$

where $\alpha_{l,1}$ is a zero-mean complex Gaussian random variable and, for a ULA, $v_{l,1}$ is a Vandermonde vector [30] given by

$$v_{l,1} = \left[1 \;\; e^{-j2\pi \sin\theta_{l,1} D/\lambda} \;\; \cdots \;\; e^{-j2\pi \sin\theta_{l,1} D(K-1)/\lambda}\right]^{T}. \tag{68}$$

In this case, we can show that $\gamma_s$ has a $\chi^2$ distribution with $2L$ degrees of freedom, i.e.,

$$f_{\gamma_s}(\gamma) = \frac{\gamma^{L-1}}{(\bar{\gamma}K)^{L}\,(L-1)!}\,e^{-\gamma/(\bar{\gamma}K)} \tag{69}$$

and the corresponding unconditional pdf of $z_1^{(n)}$ for $h = n$ is (see the Appendix)

$$f_{z_1^{(n)}}(z) = \frac{z^{L-1}}{\sigma^{2L}\,(1+\bar{\gamma}K)^{L}\,(L-1)!}\,e^{-z/(\sigma^2(1+\bar{\gamma}K))}. \tag{70}$$

2) Large Angle Spread: For large angle spread, the elements of $a_{l,1}$ become uncorrelated and hence $\sum_{l=1}^{L} |a_{l,1}|^2$ is a sum of $KL$ i.i.d. random variables, each having a $\chi^2$ distribution with two degrees of freedom. Therefore, $\gamma_s$ is distributed as a $\chi^2$ random variable with $2KL$ degrees of freedom:

$$f_{\gamma_s}(\gamma) = \frac{\gamma^{KL-1}}{\bar{\gamma}^{KL}\,(KL-1)!}\,e^{-\gamma/\bar{\gamma}}. \tag{71}$$

The corresponding unconditional probability density function of $z_1^{(n)}$ for $h = n$ is (see the Appendix)

$$f_{z_1^{(n)}}(z) = \sum_{l=0}^{(K-1)L} R_l\,\frac{z^{KL-l-1}}{(KL-l-1)!}\,e^{-z/(\sigma^2(1+\bar{\gamma}))} \tag{72}$$

where

$$R_l = \frac{1}{(\sigma^2(1+\bar{\gamma}))^{KL-l}} \binom{(K-1)L}{l} \left(\frac{\bar{\gamma}}{1+\bar{\gamma}}\right)^{(K-1)L-l} \left(\frac{1}{1+\bar{\gamma}}\right)^{l}. \tag{73}$$

3) Other Values of Angle Spread: For other values of $\Delta$, we can easily show (see the Appendix) that the symbol energy to interference-plus-noise ratio is

$$\gamma_s = \sum_{l=1}^{L} \sum_{i=1}^{K} \bar{\gamma}_{li}\,|u_{li}|^2 \tag{74}$$

where $u_{11},\ldots,u_{LK}$ are i.i.d. zero-mean complex Gaussian random variables and $\bar{\gamma}_{li} = \bar{\gamma}\,\lambda_{li}$, where $\{\lambda_{li}\}_{i=1,\ldots,K}$ are the eigenvalues of $R_{l,1}$, the covariance matrix of the first mobile's $l$th multipath component. Let $\{\bar{\gamma}_{li}\}_{l=1,\ldots,L,\;i=1,\ldots,K}$ be equal to $\{\bar{\gamma}_i\}_{i=1,\ldots,KL}$. Also, if we assume that the $\bar{\gamma}_i$'s are distinct (this is true if the angles of arrival are sufficiently different), then $\gamma_s$ is distributed as [10]

$$f_{\gamma_s}(\gamma) = \sum_{k=1}^{KL} \frac{\pi_k}{\bar{\gamma}_k}\,e^{-\gamma/\bar{\gamma}_k} \tag{75}$$

where

$$\pi_k = \prod_{i=1,\,i\neq k}^{KL} \frac{\bar{\gamma}_k}{\bar{\gamma}_k - \bar{\gamma}_i}, \qquad k = 1,\ldots,LK. \tag{76}$$

Hence, it can be shown that the corresponding unconditional pdf of $z_1^{(n)}$ for $h = n$ is (see the Appendix)

$$f_{z_1^{(n)}}(z) = \left(\sum_{i=1}^{KL} \frac{\pi_i}{\sigma^2(1+\bar{\gamma}_i)}\,e^{-z/(\sigma^2(1+\bar{\gamma}_i))}\right) * g(z) \tag{77}$$

where

$$g(z) = \frac{z^{L-2}}{\sigma^{2(L-1)}\,(L-2)!}\,e^{-z/\sigma^2} \tag{78}$$

and $*$ denotes the convolution operation.

Using the unconditional density of $z_1^{(n)}$ for $h = n$ in (70), (72), and (77), the average symbol error probability is given by

$$\bar{P}_M = 1 - \int_0^{\infty} \left[1 - e^{-z/\sigma^2} \sum_{l=0}^{L-1} \frac{1}{l!}\left(\frac{z}{\sigma^2}\right)^{l}\right]^{M-1} f_{z_1^{(1)}}(z)\,dz \tag{79}$$

and the corresponding average uncoded BEP is given by (65).
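The error-probability expressions of this section can be evaluated numerically. The sketch below is a hedged illustration, not the authors' code: it works in the normalized variable $x = z/\sigma^2$, replaces the Bessel form of the $n = h$ density by its equivalent Poisson-weighted series (a standard identity for the noncentral $\chi^2$ density), integrates with a plain trapezoid rule, and uses assumed values for $\bar{\gamma}_s$ and $C_v$. All function names are hypothetical.

```python
import math


def cdf_central(x, L):
    """Eq. (59) with x = z / sigma^2: central chi-square CDF with 2L degrees of freedom."""
    term, s = 1.0, 1.0
    for l in range(1, L):
        term *= x / l
        s += term
    return 1.0 - math.exp(-x) * s


def pdf_noncentral(x, gamma, L):
    """n = h density of x = z / sigma^2, as sum_k e^-gamma gamma^k/k! * x^(L-1+k) e^-x/(L-1+k)!."""
    if x <= 0.0:
        return math.exp(-gamma) if L == 1 else 0.0
    if gamma == 0.0:
        return math.exp((L - 1) * math.log(x) - x - math.lgamma(L))
    total, k_peak = 0.0, math.sqrt(gamma * x)
    for k in range(2000):
        log_t = (-gamma - x + k * math.log(gamma) + (L - 1 + k) * math.log(x)
                 - math.lgamma(k + 1) - math.lgamma(L + k))
        t = math.exp(log_t)
        total += t
        if k > k_peak and t <= 1e-17 * total:
            break
    return total


def symbol_error_prob(gamma, L, M, step=0.05):
    """Symbol error probability as in (57)/(62), trapezoid rule on a truncated range."""
    upper = gamma + L + 12.0 * math.sqrt(2.0 * gamma + L) + 30.0
    n = int(upper / step)
    acc = 0.0
    for i in range(n + 1):
        x = i * step
        f = cdf_central(x, L) ** (M - 1) * pdf_noncentral(x, gamma, L)
        acc += f if 0 < i < n else 0.5 * f
    return 1.0 - acc * step


def bit_error_prob(p_m, M):
    """Eq. (60)/(65): scale the symbol error probability by 2^(J-1) / (2^J - 1)."""
    J = int(round(math.log2(M)))
    return (2 ** (J - 1)) / (2 ** J - 1) * p_m


def holtzman_average(p_func, mean, std):
    """Three-point approximation (64) of the average of p_func over gamma_s."""
    r3 = math.sqrt(3.0) * std
    return (2.0 / 3.0) * p_func(mean) + (p_func(mean + r3) + p_func(mean - r3)) / 6.0


# Assumed example: L = 4, M = 64, mean gamma_s = 10, coefficient of variation 0.08.
L, M, gamma_bar_s, c_v = 4, 64, 10.0, 0.08
p_ideal = symbol_error_prob(gamma_bar_s, L, M)
p_avg = holtzman_average(lambda g: symbol_error_prob(g, L, M), gamma_bar_s, c_v * gamma_bar_s)
print(bit_error_prob(p_ideal, M), bit_error_prob(p_avg, M))
```

A quick sanity check on such a routine is the limit $\gamma_s \to 0$, where all $M$ decision variables are identically distributed and the symbol error probability must equal $1 - 1/M$.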

VI. NUMERICAL AND SIMULATION RESULTS

First, we study the accuracy of the approximation that the MAI signal vector can be assumed to be a spatially white complex Gaussian random vector. The base station receiver in Fig. 2 was simulated. In our simulation, we assumed a processing gain $G = 256$, $L = 4$, $N = 40$, $M = 64$, and $\psi = 0.375$. We also assumed ideal power control. We assumed that the base station has three sectors, each with a five-element ULA, as shown in Fig. 5. The angles of arrival $\{\theta_{k,i}\}$ were assumed to be uniform over $[0, 120°]$ (i.e., uniform over the sector). The angle spreads $\{\Delta_{k,i}\}$ were assumed uniform over $[0, 60°]$. The results of 10000 post-correlation signal vectors were used to estimate the statistics of the MAI signal vector. Figs. 6 and 7 show the empirical pdf of both the I and Q components of the MAI at the first antenna. From both figures we can see the validity of the Gaussian approximation. Also, the covariance matrix of the MAI vector, $\hat{R}_{ii,k,1}^{(n)}$, was estimated and is shown below:

$$\hat{R}_{ii,k,1}^{(n)} = \begin{bmatrix}
1.0043 & 0.0062+0.0029i & 0.0263+0.0004i & 0.0212+0.0003i & 0.0204+0.0113i \\
0.0062-0.0029i & 1.0039 & -0.0198-0.0006i & 0.0090+0.0009i & 0.0102-0.0027i \\
0.0263-0.0004i & -0.0198+0.0006i & 1.0050 & -0.0039+0.0011i & 0.0111-0.0115i \\
0.0212-0.0003i & 0.0090-0.0009i & -0.0039-0.0011i & 1.0055 & -0.0156+0.0019i \\
0.0204-0.0113i & 0.0102+0.0027i & 0.0111+0.0115i & -0.0156-0.0019i & 1.0027
\end{bmatrix}$$

The Frobenius norm [31] of the error, $\|e\|_F = \|\hat{R}_{ii,k,1}^{(n)} - I\|_F$, was estimated as 0.0058, which also shows that the MAI can be assumed to be spatially white.

Fig. 5. Simulation scenario.

Fig. 6. I-channel: first antenna interference distribution (simulations versus normal density).

Fig. 7. Q-channel: first antenna interference distribution (simulations versus normal density).

Next, we study the system uncoded BEP. The closed-loop power control was simulated according to the model in [28] and [27]. In this simulation, we assumed that the mobile can increase or decrease its transmitted power by 0.5 dB at a time and that the power control command is sent every $6T_w$. We assumed that the loop delay is also $6T_w$ and that the return channel error rate is 0.05. Figs. 8 and 9 show the power-controlled signal level distribution versus the distribution of the simulated multipath Rayleigh fading at the RAKE output for $f_d = 5$ and 100 Hz. From these two figures we can see that closed-loop power control eliminates most of the channel variations at low Doppler frequencies, while at high Doppler frequencies the received signal after power control has the same distribution as that of the simulated multipath fast fading. For more details on the power control performance of the above receiver, the reader is referred to [28].

Fig. 8. Power-controlled signal distribution versus simulated multipath fading: $f_d = 5$ Hz. Horizontal axis: normalized signal level (dB).

Fig. 9. Power-controlled signal distribution versus simulated multipath fading: $f_d = 100$ Hz, $K = 5$, and $L = 4$. Horizontal axis: normalized signal level (dB).

For the ideal power control case, the probability of error was computed using (60) and (62). For the power control case with $f_d = 5$ Hz, the approximation in (64) was used. The resulting $P_b$ is plotted for $L = 4$ and $K = 1$, $K = 3$, and $K = 5$ in Fig. 10. Note that this is independent of the angle spread $\Delta$ since, with $f_d = 5$ Hz, the fading is slow enough to be tracked by the power control loop for all values of $\Delta$ (see [28]). If we assume that the required uncoded BEP is $\le 10^{-3}$, then for $K = 1$ the maximum number of allowable users per sector $N_{max}$ is 29 for ideal power control and 2? for power control with $f_d = 5$ Hz. For $K = 3$, these numbers go up to 85 and 82, respectively. This shows the improved performance due to beamforming.

Fig. 10. $P_b$ for $f_d = 5$ Hz and closed-loop power control (curves for ideal power control and for power control with $f_d = 5$ Hz; $K = 1, 3, 5$). Horizontal axis: $N$, number of users.

For the high $f_d$ case, as mentioned earlier, the distribution of $\gamma_s$, and hence the distribution of $z_1^{(n)}$ for $n = h$, depends on the angle spread $\Delta$. Fig. 11 shows the pdf of $z_1^{(n)}$ for different values of $\Delta$ and with ideal power control. From this figure we can see that the higher the angle spread is, the closer to the ideal power control case the pdf becomes. This can be explained as follows. At zero angle spread, the received signal in any multipath component will experience the same fading at all antennas and the antenna array will not provide any space diversity for this multipath component. As the angle spread increases, the signal fading at different antennas becomes more and more uncorrelated, which leads to less variation in the RAKE output. Figs. 12-14 show the uncoded BEP for $\Delta = 0°$, $\Delta = 3°$, and $\Delta = 60°$ for a high maximum Doppler frequency $f_d$ (high enough that the statistics of the received signal after power control remain the same as those of the multipath fast fading). For $\Delta = 0°$ we can see that for $P_b = 10^{-3}$ the maximum number of allowable users reduces (compared to the perfect power control case) to 10, 29, and 55 for $K = 1$, 3, and 5, respectively, which corresponds to a 65% reduction in system capacity. This capacity reduction is due to the multipath fading which was not eliminated by the closed-loop power control. With a single antenna, the statistics of $\gamma_s$ do not depend on the angle spread. Therefore, the maximum number of allowable mobiles for $K = 1$ is the same, ten mobiles per cell, for any value of angle spread. However, for angle spread $\Delta = 3°$ this number goes up to 44 and 90 mobiles for $K = 3$ and $K = 5$, respectively. This is due to the additional space diversity gain provided by the array. For $\Delta = 60°$, these numbers go even higher, to 58 and 110, which is, again, due to the space diversity gain provided at large angle spreads. The results of Figs. 12-14 are summarized in Tables I and II. These two tables give the percent reduction in capacity at $P_b = 10^{-2}$ and $P_b = 10^{-3}$, respectively, relative to the case with ideal power control. We make the following observation. For a fixed angle spread, increasing the number of antennas will provide more space diversity gain. Similarly, for a fixed number of antennas, the larger the angle spread, the larger the space diversity gain provided by the array will be.

Fig. 11. PDF of $z_1^{(1)}$ at high $f_d$ (curves: ideal power control; fading with $\Delta = 0°$, $3°$, and $60°$). Horizontal axis: decision variable.

Fig. 12. $P_b$ for high $f_d$ and $\Delta = 0°$, and power control ($K = 1, 3, 5$). Horizontal axis: $N$, number of users.

Fig. 13. $P_b$ for high $f_d$ and $\Delta = 3°$, and power control ($K = 1, 3, 5$). Horizontal axis: $N$, number of users.

Fig. 14. $P_b$ for high $f_d$ and $\Delta = 60°$, and power control ($K = 1, 3, 5$). Horizontal axis: $N$, number of users.

TABLE I
PERCENT REDUCTION IN CAPACITY AT $P_b = 10^{-2}$ FOR HIGH $f_d$

K    N_max (ideal power control)    Reduction at Δ = 0°    Δ = 3°    Δ = 60°
1    38                             50%                    50%       50%
3    111                            50%                    34.6%     19.6%
5    187                            50%                    26.5%     13.8%

TABLE II
PERCENT REDUCTION IN CAPACITY AT $P_b = 10^{-3}$ FOR HIGH $f_d$

K    N_max (ideal power control)    Reduction at Δ = 0°    Δ = 3°    Δ = 60°
1    29                             65%                    65.5%     65.5%
3    85                             65.5%                  48.8%     31.8%
5    142                            65.5%                  36.9%     21.0%

VII. CONCLUSION

In this paper, we proposed an antenna array-based base station receiver structure for DS/CDMA wireless systems with $M$-ary orthogonal modulation and studied its performance. We developed the vector multipath channel and received signal models. The average uncoded BEP was evaluated as a function of the number of users, number of antennas, and angle spread for different power control scenarios. For the case of low maximum Doppler shift, we presented an approximation for the uncoded BEP as a function of the mean symbol energy to interference-plus-noise ratio and the power control error. For the case of high maximum Doppler shift, we derived exact expressions for the uncoded BEP for different cases of angle spread. In all cases, an improvement in system performance proportional to the number of sensors is observed. Additional improvement is obtained due to space diversity gain at high angle spreads.

APPENDIX

To derive the unconditional probability density functions in (70), (72), and (77), first we recall the characteristic function representation of the conditional density in (53) [10]:

$$f_{z_1^{(n)}}(z \mid \gamma_s) = \frac{1}{2\pi j} \int_{\epsilon - j\infty}^{\epsilon + j\infty} \frac{\exp\{sz - \gamma_s\sigma^2 s/(\sigma^2 s + 1)\}}{(\sigma^2 s + 1)^{L}}\,ds. \tag{80}$$

Then the unconditional density of $z_1^{(n)}$ is given by

$$f_{z_1^{(n)}}(z) = \frac{1}{2\pi j} \int_{\epsilon - j\infty}^{\epsilon + j\infty} \frac{e^{sz}}{(\sigma^2 s + 1)^{L}}\,F_{\gamma_s}\!\left(\frac{\sigma^2 s}{\sigma^2 s + 1}\right) ds \tag{81}$$

where $F_{\gamma_s}(t) = E\{e^{-\gamma_s t}\}$ is the characteristic function of $\gamma_s$. Then:

1) For small angle spread, $f_{\gamma_s}(\gamma)$ is given by (69) and $F_{\gamma_s}(t) = (\bar{\gamma}K t + 1)^{-L}$, so that the transform of the density of $z_1^{(n)}$ is $F_z(s) = (\sigma^2(1+\bar{\gamma}K)s + 1)^{-L}$, whose inverse is (70).

2) For large angle spread, $f_{\gamma_s}(\gamma)$ is given by (71) and $F_{\gamma_s}(t) = (\bar{\gamma} t + 1)^{-KL}$. Now we have

$$F_z(s) = \frac{(\sigma^2 s + 1)^{(K-1)L}}{(\sigma^2(1+\bar{\gamma})s + 1)^{KL}} = \sum_{l=0}^{(K-1)L} \frac{R_l}{\left(s + \dfrac{1}{\sigma^2(1+\bar{\gamma})}\right)^{KL-l}} \tag{84}$$

where

$$R_l = \frac{1}{(\sigma^2(1+\bar{\gamma}))^{KL}\,l!} \left.\frac{d^{l}}{ds^{l}}(\sigma^2 s + 1)^{(K-1)L}\right|_{s = -1/(\sigma^2(1+\bar{\gamma}))} = \frac{1}{(\sigma^2(1+\bar{\gamma}))^{KL-l}} \binom{(K-1)L}{l} \left(\frac{\bar{\gamma}}{1+\bar{\gamma}}\right)^{(K-1)L-l} \left(\frac{1}{1+\bar{\gamma}}\right)^{l}. \tag{85}$$

It follows that the density of $z_1^{(n)}$ is given by (72).

3) For other values of angle spread, first we need to derive the density function for $\gamma_s$ itself. We have

$$\gamma_s = \bar{\gamma} \sum_{l=1}^{L} |a_{l,1}|^2. \tag{87}$$

Let $a_{l,1} = R_{l,1}^{1/2} u_l$, where $R_{l,1}$ is the $K \times K$ covariance matrix of $a_{l,1}$ and $u_l$ is a $K \times 1$ zero-mean complex Gaussian random vector with covariance matrix $I$. Since $R_{l,1}$ is Hermitian, we can rewrite $R_{l,1}$ as

$$R_{l,1} = U_l \Lambda_l U_l^{*} \tag{88}$$

where $U_l$ is an orthonormal matrix and $\Lambda_l$ is a diagonal matrix of the eigenvalues of $R_{l,1}$. Let $\Lambda_l = \mathrm{diag}\{\lambda_{l1}, \lambda_{l2}, \ldots, \lambda_{lK}\}$. Then we can write $|a_{l,1}|^2$ as

$$|a_{l,1}|^2 = u_l^{*} U_l \Lambda_l U_l^{*} u_l = \tilde{u}_l^{*} \Lambda_l \tilde{u}_l = \sum_{i=1}^{K} \lambda_{li}\,|\tilde{u}_{li}|^2 \tag{89}$$

where $\tilde{u}_l = U_l^{*} u_l$ is also a zero-mean complex Gaussian random vector with covariance $I$. Then we can rewrite $\gamma_s$ as

$$\gamma_s = \sum_{l=1}^{L} \sum_{i=1}^{K} \bar{\gamma}_{li}\,|\tilde{u}_{li}|^2 \tag{90}$$

where $\bar{\gamma}_{li} = \bar{\gamma}\,\lambda_{li}$ and, therefore, $\gamma_s$ has the density function in (75). It follows that

$$F_z(s) = G(s) \sum_{i=1}^{KL} \frac{\pi_i}{\sigma^2(1+\bar{\gamma}_i)s + 1} \tag{91}$$

where

$$G(s) = \frac{1}{(\sigma^2 s + 1)^{L-1}} \tag{92}$$

is the transform of $g(z)$ in (78). Inverting the product gives the convolution in (77).
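The large-angle-spread result of the Appendix lends itself to a quick numerical sanity check. The sketch below assumes the density has the gamma-mixture form $\sum_l R_l\,z^{KL-l-1} e^{-z/b}/(KL-l-1)!$ with $b = \sigma^2(1+\bar{\gamma})$ and $R_l$ as in (73)/(85), and verifies that it integrates to one and has mean $\sigma^2(L + \bar{\gamma}KL)$, which follows from $E\{z \mid \gamma_s\} = \sigma^2(L + \gamma_s)$. The function names and parameter values are illustrative assumptions.

```python
import math


def mixture_weights(K, L, gamma_bar):
    """Binomial weights c_l = R_l * b^(KL - l), with b = sigma^2 * (1 + gamma_bar)."""
    m = (K - 1) * L
    p = 1.0 / (1.0 + gamma_bar)
    return [math.comb(m, l) * (1.0 - p) ** (m - l) * p ** l for l in range(m + 1)]


def pdf_z(z, K, L, gamma_bar, sigma2=1.0):
    """Assumed form of the unconditional density for large angle spread (z > 0)."""
    b = sigma2 * (1.0 + gamma_bar)
    total = 0.0
    for l, c in enumerate(mixture_weights(K, L, gamma_bar)):
        shape = K * L - l
        # gamma density with integer shape KL - l and scale b, evaluated in the log domain
        log_f = (shape - 1) * math.log(z) - z / b - shape * math.log(b) - math.lgamma(shape)
        total += c * math.exp(log_f)
    return total


def moments(K, L, gamma_bar, sigma2=1.0, upper=200.0, step=0.01):
    """Midpoint-rule estimates of the total probability and the mean of z."""
    area, mean = 0.0, 0.0
    for i in range(int(upper / step)):
        z = (i + 0.5) * step
        f = pdf_z(z, K, L, gamma_bar, sigma2)
        area += f * step
        mean += z * f * step
    return area, mean


# Assumed example: K = 3 antennas, L = 4 paths, gamma_bar = 0.5, sigma^2 = 1.
K, L, gamma_bar = 3, 4, 0.5
area, mean = moments(K, L, gamma_bar)
print(area, mean, L + gamma_bar * K * L)   # mean should match sigma^2 (L + gamma_bar K L)
```

With $K = 1$ the mixture collapses to a single gamma term with scale $\sigma^2(1+\bar{\gamma})$, which coincides with the small-angle-spread density (70) for one antenna, as expected when the array provides no diversity.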

REFERENCES

[1] D. L. Schilling, "Wireless communications going into the 21st century," IEEE Trans. Veh. Technol., vol. 43, no. 3, pp. 645-652, Aug. 1994.
[2] A. M. Viterbi and A. J. Viterbi, "Erlang capacity of a power controlled CDMA system," IEEE J. Select. Areas Commun., vol. 11, no. 6, pp. 892-900, Aug. 1993.
[3] D. L. Schilling, "Broadband spread spectrum multiple access for personal cellular communications," in Proc. VTC'93, May 1993, pp. 819-822.
[4] K. S. Gilhousen, I. M. Jacobs, R. Padovani, A. Viterbi, L. A. Weaver, and C. Wheatley, "On the capacity of a cellular CDMA system," IEEE Trans. Veh. Technol., vol. 40, no. 2, pp. 303-312, May 1991.
[5] S. Simanapalli, "Adaptive array methods for mobile communications," in Proc. VTC'94, Stockholm, Sweden, June 1994.
[6] A. F. Naguib, A. Paulraj, and T. Kailath, "Capacity improvement with base-station antenna arrays in cellular CDMA," IEEE Trans. Veh. Technol., vol. VT-43, no. 3, pp. 691-698, Aug. 1994.
[7] J. C. Liberti and T. S. Rappaport, "Analytical results for capacity improvement in CDMA," IEEE Trans. Veh. Technol., vol. 43, no. 3, pp. 680-690, Aug. 1994.
[8] S. C. Swales, M. A. Beach, D. J. Edwards, and J. P. McGeehan, "The performance enhancement of multibeam adaptive base station antennas for cellular land mobile radio systems," IEEE Trans. Veh. Technol., vol. 39, no. 1, pp. 56-67, Feb. 1990.
[9] A. Naguib and A. Paulraj, "Performance of CDMA cellular networks with base-station antenna arrays," in Proc. International Zurich Seminar on Digital Communications, Zurich, Switzerland, Mar. 1994, pp. 87-100.
[10] J. G. Proakis, Digital Communications, 2nd ed. New York: McGraw Hill, 1989.
[11] A. J. Viterbi, Principles of Spread Spectrum Multiple Access Communications. Reading, MA: Addison-Wesley, 1995.
[12] G. Turin, "The effects of multipath and fading on the performance of direct-sequence CDMA systems," IEEE J. Select. Areas Commun., vol. SAC-2, pp. 597-603, July 1984.
[13] M. Kavehrad and B. Ramamurthi, "Direct-sequence spread spectrum with DPSK modulation and diversity for indoor wireless communications," IEEE Trans. Commun., vol. COM-35, no. 2, pp. 224-236, Feb. 1987.
[14] M. M. J. Wang and L. B. Milstein, "Predetection diversity for CDMA indoor radio communications," in Virginia Tech Symposium on Wireless Personal Commun., Blacksburg, VA, June 1992, pp. 13.1-13.10.
[15] A. F. Naguib and A. Paulraj, "Effect of multipath and base-station antenna arrays on uplink capacity of cellular CDMA," in Proc. GLOBECOM'94, San Francisco, CA, 1994, pp. 395-399.
[16] A. J. Viterbi, "Very low rate convolutional code for maximum theoretical performance of spread-spectrum multiple access channels," IEEE J. Select. Areas Commun., vol. 8, no. 4, pp. 641-649, May 1990.
[17] K. I. Kim, "On the error probability of a DS/SSMA with a noncoherent M-ary orthogonal modulation," in Proc. VTC'92, Denver, CO, 1992, pp. 482-485.
[18] Q. Bi, "Performance analysis of a CDMA cellular system," in Proc. VTC'92, Denver, CO, 1992, pp. 483-486.
[19] Q. Bi, "Performance analysis of a CDMA cellular system in the multipath fading environment," in Proc. PIMRC'92, Boston, MA, 1992, pp. 108-111.
[20] L. M. A. Jalloul and J. M. Holtzman, "Performance analysis of DS/CDMA with noncoherent M-ary orthogonal modulation in multipath fading channels," IEEE J. Select. Areas Commun., vol. 12, no. 5, pp. 862-870, Sept. 1994.
[21] Qualcomm Inc., Wideband Spread Spectrum Digital Cellular System, Apr. 1992, Proposed EIA/TIA Interim Standard.
[22] D. J. Torrieri, "Performance of direct-sequence systems with long pseudonoise sequences," IEEE J. Select. Areas Commun., vol. 10, no. 4, pp. 770-781, May 1992.
[23] A. F. Naguib, "Adaptive antennas for CDMA wireless networks," Ph.D. dissertation, Stanford University, Stanford, CA, 1996.
[24] J. Salz and J. H. Winters, "Effect of fading correlation on adaptive arrays in digital mobile radio," IEEE Trans. Veh. Technol., vol. 43, no. 4, pp. 1049-1057, Nov. 1994.
[25] A. F. Naguib and A. Paulraj, "Recursive adaptive beamforming for wireless CDMA," in Proc. ICC'95, Seattle, WA, June 1995, pp. 1515-1519.
[26] A. Jalali and P. Mermelstein, "Effect of diversity, power control, and bandwidth on the capacity of microcellular CDMA systems," IEEE J. Select. Areas Commun., vol. 12, no. 5, pp. 952-961, June 1994.
[27] S. Ariyavisitakul and L. F. Chang, "Signal and interference statistics of a CDMA system with feedback power control," IEEE Trans. Commun., vol. 41, no. 11, pp. 1626-1634, Nov. 1993.
[28] A. F. Naguib and A. Paulraj, "Power control in wireless CDMA: Performance with cell site antenna arrays," in Proc. GLOBECOM'95, Singapore, 1995, pp. 225-229.
[29] J. M. Holtzman, "A simple, accurate method to calculate spread-spectrum multiple access error probabilities," IEEE Trans. Commun., vol. 40, no. 1, pp. 461-464, Mar. 1992.
[30] S. U. Pillai, Array Signal Processing. New York: Springer Verlag, 1989.
[31] G. H. Golub and C. F. Van Loan, Matrix Computations, 2nd ed. Baltimore, MD: Johns Hopkins University Press, 1989.

Abstract

Base station antenna arrays are a promising method for providing large capacity increases in cellular mobile radio systems. This article considers channel-modeling issues, receiver structures, and algorithms, and looks at the potential capacity gains that can be achieved.

Smart Antenna Arrays for CDMA Systems
JOHN S. THOMPSON, PETER M. GRANT, AND BERNARD MULGREW

Wireless communications have become a significant area of growth within the last few years. There is a diverse range of products and services currently on the market, but cellular or personal communications services (PCS) radio networks probably have the highest public profile. These services provide highly mobile, widely accessible two-way voice and data communications links [1]. In general, the most complex and expensive part of the radio path for these systems is the base station. As a result, manufacturers have been designing networks that have high efficiency in terms of the bandwidth occupied and the number of users per base station [2]. This trend has been at the expense of high-power transmitters and receivers which employ very computationally expensive signal processing techniques.

The second generation of cellular telephones, based on digital signaling and time and frequency division multiple access, has recently been introduced around the world. Typical examples include the European Global System for Mobile Telecommunications (GSM) and the North American IS-54 access protocols [3]. However, there is also considerable interest in code division multiple access (CDMA) techniques for cellular systems [4]. The IS-95 standard for CDMA cellular systems was published in 1992; there is also interest in CDMA systems for the U.S. 1.9 GHz PCS bands [5] and the European third-generation universal mobile telephone system (UMTS) [6].

Whatever the relative merits of a given cellular system, it seems that considerable system capacity gains are available from exploiting the different spatial locations of cellular users [7]. There are a number of methods to achieve this, from simple sectorization schemes [8] to complex adaptive antenna array techniques [9]. This article will consider antenna arrays for the mobile-to-base-station or reverse link of a CDMA cellular system.

It begins with an introduction to CDMA communications systems and also addresses the general topic of antenna array receivers. Channel modeling is then discussed, because this will influence the design of CDMA receivers. The specific form of receiver algorithms will then be discussed, and some performance comparisons are provided. Finally, the most important question for implementing antenna array systems is what capacity gains are achievable. Some simple analysis is presented to provide an initial answer.

Direct-Sequence CDMA

CDMA techniques are based on spread spectrum communications, which were originally developed for military applications. A simple definition of a spread spectrum signal is that its transmission bandwidth is much wider than the bandwidth of the original signal [10]. There are a number of ways to achieve this, but this article will focus on direct-sequence spread spectrum techniques, which are used in IS-95-based systems.

The reverse links for all users within a CDMA system can be conducted over the same radio frequency (RF) bandwidth so that complete frequency reuse for that link is obtained throughout all cells [4]. To distinguish one user's transmission from another, each mobile modulates the voice data symbols by a pseudo-noise (PN) code. Each symbol is composed of W binary "chips," which have a much shorter period than that of the original data symbols, so the signal bandwidth is considerably increased.

The generic form of the reverse link for a binary phase shift keying (BPSK) spread spectrum system, using noncoherent detection at the receiver, is shown in Fig. 1a. A PN code c(t), such as that shown in Fig. 1b, is used to modulate the baseband data x(t), and the resulting signal is transmitted to the base station. The receiver employs noncoherent quadrature demodulation to recover the signal amplitude and phase. This signal is correlated with the PN code to provide a delayed estimate of the transmitted data x̂(t - t_d), where t_d denotes the time delay. A typical PN code auto-correlation function observed at the correlator output is shown in Fig. 1c: it takes on significant values only within one chip of the code arrival time.

The reverse link of a CDMA system such as that specified by IS-95 has a number of essential characteristics for effective multiple access communication. A detailed introduction to spread spectrum and CDMA techniques can be found in [11-13], but here only points relevant to this discourse will be addressed.

Modulation Scheme - The reverse link of an IS-95 system employs 64-ary orthogonal data modulation, transmitted using offset quadrature phase shift keying (QPSK) [11]. This article will be concerned with assessing general trends rather than providing specific results for different cellular systems. So, for simplicity, a system employing differential phase shift keying (DPSK) modulation will be considered [14].

Spread Spectrum Bandwidth - The chip rate of the spread spectrum signal is an important parameter, and is inversely proportional to the chip period t_c. A number of different chip rates have been proposed for such systems, but a chip period of approximately 800 ns (chip rate 1.25 MHz) will be assumed for this article. Such a system is often called narrowband CDMA, because the baseband bandwidth is much smaller than the RF carrier frequency, which is usually at least 900 MHz for cellular systems.
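The spreading and correlation steps described in this section can be illustrated with a minimal Python sketch. It is not the IS-95 code design: the PN code here is simply a random ±1 sequence, and the names `make_pn_code`, `spread`, `despread`, and `periodic_autocorrelation`, as well as the choice W = 128, are assumptions made for illustration. The sketch spreads a few BPSK symbols, recovers them with a chip-aligned correlator, and compares the autocorrelation peak at zero shift with the largest value at any other whole-chip shift.

```python
import random

def make_pn_code(num_chips, seed=1):
    """Return a random +/-1 chip sequence (a stand-in for a real PN code)."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(num_chips)]

def spread(symbols, code):
    """Multiply each +/-1 data symbol by the full chip sequence."""
    return [s * c for s in symbols for c in code]

def despread(chips, code):
    """Correlate each block of len(code) chips with the code and normalize."""
    w = len(code)
    out = []
    for start in range(0, len(chips), w):
        block = chips[start:start + w]
        out.append(sum(b * c for b, c in zip(block, code)) / w)
    return out

def periodic_autocorrelation(code, shift):
    """Normalized periodic autocorrelation at an integer chip shift."""
    w = len(code)
    return sum(code[i] * code[(i + shift) % w] for i in range(w)) / w

W = 128                      # chips per symbol (illustrative value)
code = make_pn_code(W)
data = [1, -1, -1, 1]        # BPSK data symbols
tx = spread(data, code)      # W chips are sent for every data symbol
rx = despread(tx, code)      # chip-aligned correlator output, one value per symbol

peak = periodic_autocorrelation(code, 0)
off_peak = max(abs(periodic_autocorrelation(code, k)) for k in range(1, W))
print("recovered symbols:", rx)
print("autocorrelation peak:", peak, "largest off-peak magnitude:", off_peak)
```

In this noiseless, perfectly aligned case the correlator returns the data symbols unchanged, while a misalignment of one or more whole chips yields only a small residual correlation, which is the property the text attributes to Fig. 1c.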

Reprinted from IEEE Personal Communications, Vol. 3, No. 5, pp. 16-25, October 1996.


Multipath Diversity - In urban areas, multipath propagation is common, whereby the receiver observes a large number of copies of the transmitted signal, each with a different time delay. The noise-like autocorrelation function of a PN code, shown in Fig. 1c, means that the correlator receiver can resolve multipath components which are spaced by 1 chip period up to the symbol period. This provides a form of multipath diversity, which can be exploited by using a RAKE receiver at the output of the code correlators [15].

Asynchronous Operation - The reverse link of a CDMA system is usually asynchronous, in the sense that the arrival times for each user's code are different. This means that the receiver for each user will observe interference from all other users in the system, since the transmitted codes will not be orthogonal. Hence, the number of users that can be simultaneously accommodated in one cell is interference-limited.

Figure 1. a) The general form of a direct-sequence spread spectrum system; b) a typical PN code; c) a typical auto-correlation function for a PN code.

Power Control - A corollary to the above is that power control is essential on the reverse link, to minimize multiple access interference. Otherwise, mobile transmitters far away from the cell's base station will be swamped by interference generated by users closer to the receiver. If all signals arrive with the same power, the receiver's tolerance to CDMA interference is proportional to the processing gain [12], W = t_s/t_c, where t_s is the symbol period.

The obvious way to increase the capacity of a CDMA system is to reduce the levels of multiple access interference. This may be achieved by directly canceling the interference [16] or by employing a multi-user receiver which simultaneously demodulates all users [17]. In this article, an approach based on antenna array receivers will be considered.
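As a rough numerical illustration of why capacity is interference-limited, the sketch below evaluates the processing gain W = t_s/t_c and a very simple signal-to-interference estimate. Several things here are assumptions rather than results from the article: the symbol period of 100 µs is the approximate figure quoted later for an IS-95 type system, the model treats every one of the other equal-power users as contributing interference reduced by exactly W after despreading (background noise, voice activity, and other-cell interference are ignored), and the 7 dB target is a hypothetical operating point.

```python
import math

def processing_gain(symbol_period_s, chip_period_s):
    """W = t_s / t_c, the number of chips per data symbol."""
    return symbol_period_s / chip_period_s

def sir_after_despreading_db(gain, num_users):
    """Signal-to-interference ratio for equal-power users.

    Assumed model: each of the (num_users - 1) interferers is suppressed
    by the processing gain at the correlator output.
    """
    return 10.0 * math.log10(gain / (num_users - 1))

def max_users(gain, required_sir_db):
    """Largest number of equal-power users that still meets the SIR target."""
    required = 10.0 ** (required_sir_db / 10.0)
    return int(math.floor(gain / required)) + 1

t_c = 800e-9      # chip period assumed in the article
t_s = 100e-6      # approximate symbol period (assumption, see lead-in)
W = processing_gain(t_s, t_c)

print("processing gain W =", round(W))
print("users supported at a 7 dB target:", max_users(W, 7.0))
print("SIR with 26 users (dB):", round(sir_after_despreading_db(W, 26), 2))
```

Under this model, doubling the processing gain roughly doubles the number of users supported at a fixed target, which is the proportionality stated in the text. It also shows why unequal received powers are damaging: a single strong interferer consumes the budget of many power-controlled ones.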

Why Use an Antenna Array?

An antenna array consists of M identical¹ antenna receivers, whose operation and timing are usually controlled by one central processor. The geometry of the antenna locations can vary widely, but the most common configurations are to place the antennas around a circle (circular array) or along a straight line (linear array). Antenna arrays have frequently been proposed for the operation of radar and communications systems in a military context [18]: it is possible to perform direction finding tasks and to null out enemy interferers. However, in the context of civilian cellular systems, the aim of the antenna array receiver is purely to provide acceptable error performance and hence maximize the signal-to-interference and noise ratio (SINR) for each user in the system.

An antenna array containing M elements can provide a mean power gain of M over white noise, but suppression of interference from other cellular users is dependent on the form of the received data. A good model for the received data in a power-controlled CDMA system is a strong desired signal corrupted by a large number of small cross-correlation interference terms, which arrive with a uniform distribution from throughout the cell.² The array can null out M - 1 interferers, but for a CDMA system this is unlikely to significantly improve the received SINR [9]³ because of the very large number of interference components. In general, a better methodology is to estimate the form of the received signal and determine the matched filter solution [19]. This form of receiver can exploit any spatial diversity present, while suppressing the mean level of CDMA interference by a factor proportional to M.

¹ This is not strictly necessary: orthogonally polarized elements, for example, could be used instead.
² This assumption permits general results to be obtained for system capacity. However, the validity of this assumption will depend on the layout of the cell for practical situations.
³ An exception is if a mobile undergoes a power control error and transmits ...

Assuming that the antenna array provides significantly improved SINR levels at the base station receiver, the number of channel errors, measured by the bit error ratio (BER), will reduce. This provides the cellular operator with some degrees of freedom which may be used for the following purposes [9, 20, 21]:
• To increase the number of active users for a given BER quality threshold
• To improve the BER performance for a given number of users within a cell
• To reduce the SINR required at each antenna to achieve a target BER, thus reducing the transmit power required by the mobile handset for the reverse link
• To increase the range of the base station receiver and thus the cell size
• To permit a less stringent form of reverse link power control while maintaining acceptable BER performance

While antenna arrays provide many advantages, these must be offset against the cost and complexity of their implementation. There are a number of points which must be taken into account here [22, 23]:
• The hardware/software requirements increase as M demodulators are required for each user.
• The M receivers must be accurately synchronized in time to provide effective performance.
• The computational complexity of array processing algorithms can be very large.
• The array size will be constrained by the available space for a base station. Usually, the spacing of antenna elements varies from one-half to tens of RF carrier wavelengths.

• Practical antenna arrays may be adversely affected by channel modeling errors, calibration errors, phase drift, and noise which is correlated between antennas.

With these points in mind, this article will move on to consider channel modeling aspects of antenna arrays. This will motivate a discussion on the likely form of a CDMA antenna array base station receiver.

Channel Modeling Considerations

In order to correctly specify the structure of an antenna array base station receiver, it is important to understand the characteristics of the radio channels that are likely to occur. There are many different types of channel model appropriate to different radio systems and scenarios, but here a channel model will be developed for a large CDMA macrocell operating in an urban environment. For simplicity, the case of a single antenna receiver will be initially considered: the results will be generalized for an antenna array receiver in the next section.

One of the most important methods for characterizing a radio channel is to determine its impulse response. This provides an indication of the severity of multipath propagation, which occurs due to multiple copies of the signal arriving at the receiver with different amplitudes and time delays. In dense urban areas there are many buildings and obstacles that give rise to multipath propagation, and the range of times of arrival can be significant. A typical impulse response for an urban area, drawn from the European COST-207 channel models [24], is shown in Fig. 2a.

Figure 2. a) A typical channel impulse response for an urban area; b) the discrete tap channel model. (Horizontal axes: excess time delay in microseconds.)

The correlation function of a typical CDMA code takes significant values within ±1 chip of the time of arrival of the code. This means that a CDMA correlator receiver is able to resolve multipath components of the signal which are spaced in time by 1 chip up to the symbol period. As a result, the channel impulse response measured by a CDMA receiver is often modeled by discrete channel taps [25], spaced in time by 1 PN code chip. A typical discrete tap model is shown in Fig. 2b. Each channel tap may be characterized by the following parameters: the statistical distribution of the received signal envelopes and phases, the temporal variation of each tap, and the spatial variation of each tap.

The symbol period for an IS-95 type system is quite long (approx. 100 µs) compared to the impulse response duration (typically a few µs), so it is commonly assumed that the number of significant channel taps K ≪ W. It is often assumed that each channel tap is wide-sense-stationary for the mobile moving over short distances, up to a few tens of carrier wavelengths. This means that the signal variation is due purely to phase changes in a set of independent multipath components which cannot be separated in time [26, Ch. 2]. The resolvable channel taps are assumed to be uncorrelated as each tap arises due to contributions from different multipath scatterers.

The exact distributions of the signal envelopes are a function of the signal bandwidth but, for a narrowband CDMA system, two distributions are often proposed. If a dominant line of sight (LOS) path exists between the transmitter and receiver, the first received signal component will follow a Rician distribution. However, in urban areas, this is often not true: each channel tap consists of a number of independent multipath scatterers with the same probability distribution. Applying the central limit theorem, the received signal envelope statistics can be assumed to follow the Rayleigh distribution. As the mobile moves, the signal strength regularly changes by 20-30 dB: the phenomenon of a sudden loss of signal power in this context is often called fading.

Over a longer period of time, the average (averaged over the Rician or Rayleigh fading) received signal power levels vary according to shadowing effects, which have been found to follow a log-normal distribution. The standard deviations quoted for this distribution vary between 4 and 12 dB according to the type of environment encountered [27]. In addition, the average received power varies inversely with the transmitter-receiver distance R, raised to the power n. Again, the value of n varies widely, but for urban areas its value is often approximated as 4 [26, Ch. 2]. In this article, it will be assumed that power control can adequately compensate for these effects but that it is unable to compensate for Rayleigh "fast" fading: this is somewhat pessimistic compared to the quoted results for IS-95 power control systems [11].

The time variation of each channel tap depends on the motion of the mobile. As the mobile moves through spatial locations with different field strengths, each multipath component of the received signal is subject to a Doppler shift in frequency. For a CDMA signal not subject to data modulation, calculating the power spectrum of each channel tap shows the distribution of Doppler frequencies for the constituent multipaths. The maximum Doppler frequency ν_m is proportional to the vehicle speed v, according to the equation

$$\nu_m = \frac{v}{\lambda} \qquad (1)$$

where λ is the carrier wavelength. There are two different forms of multipath scattering, according to the excess time delay of the given channel tap [13]:

Small Excess Time Delays - The channel tap may be modeled as the accumulation of multipath components received from scatterers close to the mobile. This gives rise to the classical Doppler power spectrum of the received multipath components, which is illustrated in Fig. 3. The equation for the Doppler power spectrum S(ν) is given by [26, Ch. 2]

$$S(\nu) = \frac{a}{\sqrt{1 - \left(\nu/\nu_m\right)^2}}, \qquad |\nu| \le \nu_m \qquad (2)$$

where ν is the Doppler frequency and a is a scaling factor.

Figure 3. The classical Doppler model: a) geometry of multipath scattering; b) the associated Doppler power spectrum for the channel tap.

Large Excess Time Delays - The classical Doppler model does not provide a plausible geometric model for this type of scattering. Instead, multipath energy is more likely to have a narrow Doppler spread, having arisen from reflections off isolated obstacles such as buildings or hills. One such reflection may be modeled as having a Gaussian distributed power spectrum [24], as shown in Fig. 4. The bearing of the multipath component may be determined by drawing an ellipse, sometimes called the "ellipse of Kassini," whose major axis is the multipath length and whose foci are the transmitter and receiver [28], as shown in Fig. 4. A typical channel tap with a large time delay may comprise several such reflections.

Figure 4. a) The geometry for scattering with large excess time delays; b) the Gaussian-distributed Doppler power spectrum.

The instantaneous variation of signal power in space for a channel tap depends on the angles of arrival of the multipath components. The distribution of multipath energy with angle has been simulated using several different methods in the literature. Lee [29] used a model based on a cosine function raised to a high power to represent the angular width (or angle spread); a Gaussian distribution of multipath energy with angle has also been used [30]. However, one of the simplest models is due to Salz and Winters [31], who use a uniform distribution of energy with angle. This model will be used for the remainder of this article. It is shown in Fig. 5a, which gives the geometry of the model for near-in scattering. The angular width 2Δ depends on the scattering circle radius r and the distance to the base station R. The center bearing θ_0 is simply that of the mobile; for scattering with large excess time delays, a similar geometry applies, replacing the circle of scatterers with the reflector giving rise to that component.

Figure 5. a) The physical geometry for the Salz/Winters model; b) the uniform distribution of multipath energy with angle.

Modeling the Received Signal

In order to model a multipath channel, it will be assumed that each channel tap consists of a number of independent components. Assume that the first chip of the first channel tap for the first transmitted symbol begins to arrive at the receiver at time t = 0. Then the kth channel tap for the nth symbol will be detected at time nt_s + (k - 1)t_c, where t_s and t_c are the symbol and chip periods, respectively. This tap, when measured at the output of the desired PN code correlator for a single receiver, will be denoted as x_1(k, nt_s + (k - 1)t_c) and may be expanded as:

$$x_1\bigl(k, nt_s + (k-1)t_c\bigr) = d(n) \sum_{i=1}^{Q_k} A_{ki} \exp\bigl\{ j\bigl(2\pi\nu_i[nt_s + (k-1)t_c] + \phi_i\bigr) \bigr\} \qquad (3)$$

where Q_k denotes the number of multipaths that make up the kth tap, d(n) denotes the nth transmitted symbol, and {A_ki, ν_i, φ_i} denote the amplitude, Doppler frequency, and phase of the ith multipath. In this article it will be assumed that for a fixed value of k the amplitudes A_ki are all equal, the phases φ_i are uniformly distributed over [0, 2π] radians, and the Doppler frequencies ν_i are distributed to give the appropriate Doppler spectrum shown in Fig. 3 or 4. Equation (3) makes explicit the fact that each channel tap is a continuous function of time, although the PN code auto-correlation function means that it may only be observed once per transmitted symbol. Equation (3) assumes that K ≪ W so that there is no significant inter-symbol interference.

This model can easily be extended to the case of an antenna array, using the narrowband CDMA assumption from the second section. The array will be considered to lie in the horizontal plane, with maximum length and breadth of several carrier wavelengths. In this case, each multipath arrives at all the array elements at the same time, but with an appropriate phase shift at each antenna. The form of the phase shifts is specified by the M x 1 array steering vector a(θ), which may be thought of as the array response to a unit impulse arriving from bearing θ. The channel model now consists of an M x 1 vector for each channel tap: the kth channel tap vector for the nth symbol may be written as [32]:

$$\begin{aligned}
\mathbf{x}\bigl(k, nt_s + (k-1)t_c\bigr) &= \bigl[x_1(k, nt_s + (k-1)t_c), \ldots, x_M(k, nt_s + (k-1)t_c)\bigr]^T \\
&= d(n) \sum_{i=1}^{Q_k} A_{ki} \exp\bigl\{ j\bigl(2\pi\nu_i[nt_s + (k-1)t_c] + \phi_i\bigr) \bigr\}\, \mathbf{a}(\theta_i) \\
&= d(n)\, A_k\, \mathbf{q}\bigl(k, nt_s + (k-1)t_c\bigr) \exp(j\phi_k)
\end{aligned} \qquad (4)$$

where θ_i denotes the bearing of the ith multipath, x_m(k, nt_s + (k - 1)t_c) is the channel tap output at the mth antenna, A_k denotes the signal amplitude, and x^T denotes the vector transpose operation. The first entry of q is specified to be a positive real number, so the term exp(jφ_k) represents the carrier phase of the kth multipath at the first antenna at time t.

The statistics of a given channel tap vector x(k, t) are of interest. The phase of each entry of x(k, t) is uniformly distributed over [0, 2π], so the mean vector of x(k, t) is the zero vector. The second-order moments of x(k, t) are specified by its M x M mean covariance matrix Ψ_k, which is defined as:

$$\Psi_k = E\bigl[\mathbf{x}(k,t)\,\mathbf{x}^H(k,t)\bigr] = \frac{A_k^2}{2\Delta} \int_{\theta_0 - \Delta}^{\theta_0 + \Delta} \mathbf{a}(\phi)\,\mathbf{a}^H(\phi)\, d\phi \qquad (5)$$

where E[·] denotes the statistical expectation operator. The integral term arises from the fact that the distribution of multipath energy is uniform over the bearings [θ_0 - Δ, θ_0 + Δ]. The leading diagonal entries of this matrix specify the mean power levels of the channel tap at each antenna element; the other entries specify the cross-correlation between the channel tap signals at two different antennas.

In practice, measurements of x(k, t) are corrupted by three sources of interference:
• Background noise
• Auto-correlation interference from all the other taps of the given user's channel
• Cross-correlation interference from other users operating over the same bandwidth in the CDMA system

In a cellular system, the effect of the third is usually much worse than that of the first or second, and it is the limiting factor on CDMA capacity. Neglecting the first and second, the measured data vector y(nt_s + (k - 1)t_c), of size M x 1, from the antenna array at time nt_s + (k - 1)t_c may be written simply as the sum x(k, nt_s + (k - 1)t_c) + η(nt_s + (k - 1)t_c). The M x 1 vector η(t) represents the total multiple-access interference measured across the array. Initially, it will be assumed that η(t) consists of spatially and temporally white Gaussian noise of zero mean and variance σ².

Signal-Combining Methods

It is clear from the previous section that the antenna array has to process a number of copies of the desired signal, each of which is corrupted by undesirable interference. If there are K significant channel taps observed at the M elements of the array, there are K x M separate data samples to be considered when making a decision on each transmitted symbol. This situation requires what is called a "multichannel" receiver, equivalent to receiving the information over K x M separate narrowband flat-fading channels. The best approach to this problem is to weight each channel appropriately and combine them together before making a data decision. In this section, methods for combining an arbitrary number of channels will be considered; in the next section, specific receiver structures will be described.

The main difficulty in designing such a receiver is to decide how to scale each data sample before combining. Consider L multichannels, denoted as z(l, n), which are to be combined. The nth sample for the lth multichannel is of the form d(n) A_l exp{jφ_l} plus an interference term, where A_l and φ_l denote the amplitude and carrier phase of the lth multichannel.

Selection Diversity - If the receiver has to process a number of multichannels simultaneously, one method is simply to choose the multichannel which is presumed to have the largest signal power. This approach is quite simple, while permitting some performance improvement over single antenna receivers. However, this method does not provide the optimum improvement in SINR that can be obtained. The chosen multichannel must also be subject to data demodulation to determine d(n).

Maximal Ratio Combining - If the interference observed on each separate multichannel is assumed to be uncorrelated, maximal ratio combining is the method which maximizes the SINR of the combined signals. If the interference has the same standard deviation for all multichannels, this method scales each signal by the complex scalar A_l exp{-jφ_l}. Practical methods for estimating the carrier phases {φ_l} ...

Noncoherent Combining - If the receiver employs DPSK detection, the carrier phase reference is simply the data sample obtained for the previous symbol. The magnitude of the previous data sample also provides an estimate of the amplitude A_l, so the multichannels may be combined as follows:

$$\hat{d}(n) = \sum_{i=1}^{L} \Re\bigl\{ z(i,n)\, z^*(i,n-1) \bigr\} \qquad (6)$$

where d̂(n) is the estimate of the current transmitted symbol, z* indicates the complex conjugate operation, and ℜ denotes the real part of a complex number. As the combiner weights z*(i, n - 1) are noisy, the SINR of the combined signals tends to be poorer than for maximal ratio combining. For currently used dual-diversity antennas (L = 2), the loss in SINR at the system operating point is typically less than 1 dB; however, as L increases the losses become greater [34, p. 302]. Multichannel combiners for other modulation schemes are described in [34, Ch. 7].

Wiener Filtering - The three techniques described so far are based on maximizing the signal power at the combiner output. It is also possible to use a Wiener filter, which attempts to suppress interference and maximize the SINR at the combiner output. The performance of this technique is likely to be better than that for maximal ratio combining when the interference is correlated between multichannels. This technique is discussed in more detail later.
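The first three combining rules can be compared side by side in a short Python sketch. It is only an illustration under stated assumptions: the multichannel gains A_l exp(jφ_l) are drawn at random and held fixed from symbol to symbol, the selection and maximal ratio combiners are given perfect knowledge of those gains, and the function names (`make_channels`, `receive`, and the three combiners) are hypothetical. The noncoherent combiner follows Eq. (6) and needs no channel knowledge, because the previous sample acts as the reference.

```python
import cmath
import math
import random

def make_channels(num_channels, rng):
    """Draw a fixed complex gain A_l * exp(j*phi_l) for each multichannel."""
    return [rng.uniform(0.5, 1.5) * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
            for _ in range(num_channels)]

def receive(symbol, gains, noise_std, rng):
    """z(l, n) = d(n) * A_l * exp(j*phi_l) + complex white Gaussian noise."""
    return [symbol * g + complex(rng.gauss(0.0, noise_std), rng.gauss(0.0, noise_std))
            for g in gains]

def selection_combine(z, gains):
    """Use only the multichannel with the largest signal power."""
    best = max(range(len(z)), key=lambda l: abs(gains[l]))
    return (z[best] * gains[best].conjugate()).real

def maximal_ratio_combine(z, gains):
    """Scale each sample by A_l * exp(-j*phi_l) (the conjugate gain) and sum."""
    return sum((zl * g.conjugate()).real for zl, g in zip(z, gains))

def noncoherent_combine(z_now, z_prev):
    """Eq. (6): sum over multichannels of Re{ z(l, n) * conj(z(l, n-1)) }."""
    return sum((a * b.conjugate()).real for a, b in zip(z_now, z_prev))

rng = random.Random(0)
gains = make_channels(4, rng)          # L = 4 multichannels
noise_std = 0.0                        # set > 0 to study the SINR loss

# Coherent combiners applied to a single symbol d(n) = +1.
z = receive(1, gains, noise_std, rng)
sel = selection_combine(z, gains)
mrc = maximal_ratio_combine(z, gains)

# DPSK: each data bit is carried by the change between consecutive symbols.
bits = [1, -1, -1, 1, -1]
d_prev = 1
z_prev = receive(d_prev, gains, noise_std, rng)
decoded = []
for b in bits:
    d = d_prev * b
    z_now = receive(d, gains, noise_std, rng)
    decoded.append(1 if noncoherent_combine(z_now, z_prev) > 0 else -1)
    d_prev, z_prev = d, z_now

print("selection output:", sel, "maximal ratio output:", mrc)
print("DPSK decisions:", decoded)
```

Without noise, the maximal ratio output equals the sum of all the channel powers, whereas selection keeps only the largest one. Raising `noise_std` makes the weights z*(l, n - 1) in Eq. (6) noisy, which is the source of the SINR loss mentioned in the text.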

Antenna Array Receiver Structures

The purpose of the antenna array receiver is to estimate the transmitted data sequence d(n), based on the interference-corrupted measurements y(nt_s + (k - 1)t_c) of the K channel taps. There are two approaches that one might consider for combining the data samples, as described below.

1D RAKE Filter - In a single antenna CDMA receiver, noncoherent combining is often used to combine the K channel taps, a method which is normally called "RAKE filtering" [15]. A simple approach to designing an antenna array receiver is to employ a RAKE filter to combine all the K × M entries of the vectors {y(nt_s + (k - 1)t_c)}. For DPSK modulation, this is performed using Eq. (6) with L = K × M. However, as pointed out above, noncoherent combining of a large number of multichannels can give rise to large losses in SINR compared to maximal ratio combining.

2D RAKE Filter - A more effective and compact approach to dealing with the channel tap vectors {y(nt_s + (k - 1)t_c)} is to apply a separate spatial filter to each tap vector in turn. The receiver can exploit any structure that might be present, such as the directions of arrival of the multipath components, their Doppler frequencies, and so on. This permits the receiver to perform coherent combining of the tap vector elements, improving performance over the 1D RAKE filter. Denoting the kth filter as h_k, K complex outputs are generated for the nth symbol by the vector inner products {h_k^H y(nt_s + (k - 1)t_c)}, where ^H denotes the complex conjugate transpose operation. This means that only K outputs need to be combined using the DPSK demodulation method of Eq. (6). The approach has been called the "2D RAKE filter" because the receiver operates two separate sets of combiners in time and space. It is shown in Fig. 6: the receiver picks out the largest channel taps and selects appropriate spatial filters in each case. The outputs from the spatial filter banks are then combined in a conventional RAKE filter, ready for decision making. Base stations are frequently split into three sectors, to provide 120° coverage, which in this case corresponds to the bearings [30°, 150°] as shown.

Figure 6. The 2D RAKE filter combiner operating in a 120° coverage sector.
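The two-stage structure of the 2D RAKE filter can be sketched in a few lines. The example below applies one spatial filter per channel tap and then combines the K filter outputs with a differential (DPSK) decision rule. Since Eq. (6) is not reproduced here, the decision rule used is the standard differential detector (real part of the current output times the conjugate of the previous one), which is an assumption of this sketch. The function names and test vectors are hypothetical.

```python
def inner(h, y):
    """Vector inner product h^H y (conjugate transpose of h times y)."""
    return sum(hc.conjugate() * yc for hc, yc in zip(h, y))


def rake_2d_decision(filters, taps_now, taps_prev):
    """Spatially filter each tap vector, then combine the K outputs over time.

    filters   : K spatial filters h_k, each a list of M complex weights
    taps_now  : K tap vectors for the current symbol
    taps_prev : K tap vectors for the previous symbol
    Returns the binary DPSK decision (+1 or -1) and the decision metric.
    """
    metric = 0.0
    for h, y_now, y_prev in zip(filters, taps_now, taps_prev):
        z_now = inner(h, y_now)      # spatial combining, current symbol
        z_prev = inner(h, y_prev)    # spatial combining, previous symbol
        metric += (z_now * z_prev.conjugate()).real  # temporal (RAKE) combining
    return (1 if metric >= 0 else -1), metric


# Noise-free example with K = 2 taps and M = 2 antennas (illustrative values).
q1 = [1 + 0j, 1j]
q2 = [0.5 + 0j, -0.5 + 0j]
previous = [q1, q2]                                  # previous symbol
current = [[-x for x in q1], [-x for x in q2]]       # phase flipped: data bit -1
bit, metric = rake_2d_decision([q1, q2], current, previous)
```

Only K products are accumulated in the final loop, compared with K × M for the 1D RAKE filter, because the M antenna samples of each tap have already been combined coherently.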

There are a number of methods to determine the form of the spatial filter h_k. Many of these techniques employ the M × M estimated covariance matrix R_k of the signal y(nt_s + (k - 1)t_c), which is defined as:

$$R_k = \frac{1}{N} \sum_{n=1}^{N} y(nt_s + (k-1)t_c)\, y^H(nt_s + (k-1)t_c) \tag{7}$$
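A minimal sketch of the averaging in Eq. (7) is given below, assuming the N tap vectors for one channel tap have already been collected into a list of length-M complex snapshots. The function name and the two-snapshot example are hypothetical.

```python
def sample_covariance(snapshots):
    """Estimate R_k as the average of the outer products y y^H over N snapshots."""
    n = len(snapshots)
    m = len(snapshots[0])
    r = [[0j] * m for _ in range(m)]
    for y in snapshots:
        for a in range(m):
            for b in range(m):
                r[a][b] += y[a] * y[b].conjugate()
    return [[value / n for value in row] for row in r]


# Two snapshots from a two-element array (illustrative values only).
R = sample_covariance([[1 + 0j, 1j], [1 + 0j, -1j]])
```

The estimate is Hermitian by construction, so only the upper triangle strictly needs to be computed in an efficient implementation.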

The notation N indicates the number of consecutive symbols used for averaging. For the chosen filter to operate effectively, the form of the array response vector q(k, nt_s + (k - 1)t_c) should not change significantly over this time. In this article, it will be assumed that power control ensures that the SINR of each channel tap is large enough for each vector q(k, nt_s + (k - 1)t_c) to be correctly identified without the transmitter employing an initial training sequence. This task may be performed by blind channel identification techniques, such as:

Beamspace Transformation - This technique applies J fixed spatial filters, denoted as length M vectors w_j, to the data [23]. By measuring the average power at the outputs, given by the product w_j^H R_k w_j, the receiver may act accordingly. A simple approach is to apply selection diversity by choosing the filter with the largest power output to pick out the desired signal.

Bearing Estimation Techniques - As the vector q(k, nt_s + (k - 1)t_c) is composed of a number of steering vectors, it is possible to apply bearing estimation techniques to the covariance matrix R_k to pick out the major directional components, as proposed in [36]. There are a number of well-known high-resolution techniques such as ESPRIT and MUSIC; however, these algorithms perform poorly in the presence of highly correlated multipath signals, which frequently occur in the urban communication channels being considered here. A simpler approach is to use the conventional beamformer (CBF) density spectrum, which is defined as:

The beamspace and bearing estimation techniques point one or more narrow beams at the incoming signals from the mobile. This choice is optimal only if each channel tap appears as a point source. If multipath scattering of the signal gives rise to a significant angular width of the signal, the performance will degrade. However, the eigenfilter method always gives weights that maximize the combiner signal power, so it should perform better in the presence of significant angular width. If the signal power of a given tap is not significantly larger than the interference power, the above techniques are likely to incorrectly pick out an interferer instead of the desired tap vector. However, given that CDMA interference generally comprises the contributions of a large number of users, it seems unlikely that any technique could correctly pick out the channel tap with sufficient SINR for the purposes of data decision making. Having noted this point, the following section will now look in more detail at the operation of these blind techniques.
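The claim that the eigenfilter always maximizes the combiner signal power can be checked numerically. The sketch below builds a covariance matrix from three steering vectors spread around broadside (a crude stand-in for angular spread, not the article's channel model), finds the principal eigenvector by power iteration, and compares its output power with that of a single beam steered at the centre bearing. The half-wavelength uniform linear array matches the simulation set-up described in the article; the bearings, noise level, and function names are assumptions of this example.

```python
import cmath
import math


def steering(m_elems, bearing_deg):
    """ULA steering vector, half-wavelength spacing, bearing measured from the array axis."""
    phase = math.pi * math.cos(math.radians(bearing_deg))
    return [cmath.exp(1j * phase * m) for m in range(m_elems)]


def covariance_from_paths(paths, noise_power):
    """Sum of outer products of the path vectors plus white noise on the diagonal."""
    m = len(paths[0])
    r = [[(noise_power if a == b else 0.0) + 0j for b in range(m)] for a in range(m)]
    for v in paths:
        for a in range(m):
            for b in range(m):
                r[a][b] += v[a] * v[b].conjugate()
    return r


def mat_vec(r, v):
    return [sum(rij * vj for rij, vj in zip(row, v)) for row in r]


def normalise(v):
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / norm for x in v]


def output_power(r, w):
    """Combiner output power w^H R w for a unit-norm weight vector w."""
    rw = mat_vec(r, w)
    return sum(wi.conjugate() * x for wi, x in zip(w, rw)).real


def eigenfilter(r, start, iterations=200):
    """Principal eigenvector of R by power iteration, started from `start`."""
    w = normalise(start)
    for _ in range(iterations):
        w = normalise(mat_vec(r, w))
    return w


M = 8
paths = [steering(M, bearing) for bearing in (80.0, 90.0, 100.0)]
R = covariance_from_paths(paths, noise_power=0.1)
beam = normalise(steering(M, 90.0))      # single beam at the centre bearing
eig = eigenfilter(R, start=beam)         # eigenfilter weights
beam_power = output_power(R, beam)
eig_power = output_power(R, eig)
```

Starting the iteration from the steered beam means the eigenfilter power can only increase from the beam's value, which mirrors the argument in the text: a narrow beam is optimal only when the tap looks like a point source.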

Algorithm Performance

In order to provide an initial comparison of these algorithms, a very simple classical Doppler one-tap channel model will be considered.⁴ The receiver contains an eight-element uniform linear array (ULA), whose elements are spaced by one half of the carrier wavelength.⁵ The channel tap angular width 2Δ (defined in Fig. 5) was varied from 0° to 60° and arrived from the array broadside, bearing 90° (i.e., perpendicular to the array, as shown in Fig. 6). Results from [31] suggest that for this bearing the signal correlation between elements falls most rapidly with the angular width 2Δ. The symbol rate of the CDMA signal was 10 ksymbols/s, and the maximum Doppler frequency was set to 0 Hz,⁶ 50 Hz, or 200 Hz. The maximum achievable SINR level, defined as A_1^2/σ^2, was set to 14 dB, and N = 50 snapshots were used to estimate R_1. The beamspace method employed eight orthogonal spatial

$$P(\theta) = a^H(\theta)\, R_k\, a(\theta) \tag{8}$$

Again, the steering vector for the largest value of P(θ) may be selected; this approach is quite similar to jitter diversity [37].

Eigenfilter Techniques - In order to identify the vector q(k, nt_s + (k - 1)t_c), it is possible to calculate the eigenvalue decomposition of R_k. Provided the SINR is large, the M × 1 eigenvector u_1 corresponding to the largest eigenvalue of R_k provides an estimate of q(k, nt_s + (k - 1)t_c). This is the statistical method of principal component analysis and was used in [38].
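The conventional beamformer approach amounts to scanning a steering vector over the sector and picking the bearing with the largest output power a^H(θ) R_k a(θ). The sketch below does this for a noise-free point source plus white noise on an eight-element half-wavelength array. The source bearing of 75°, the noise level, the 0.5° scan grid, and the function names are illustrative assumptions.

```python
import cmath
import math


def steering(m_elems, bearing_deg):
    """ULA steering vector, half-wavelength spacing, bearing measured from the array axis."""
    phase = math.pi * math.cos(math.radians(bearing_deg))
    return [cmath.exp(1j * phase * m) for m in range(m_elems)]


def cbf_spectrum(r, m_elems, bearings_deg):
    """Evaluate a^H(theta) R a(theta) at each candidate bearing."""
    spectrum = []
    for bearing in bearings_deg:
        a = steering(m_elems, bearing)
        ra = [sum(rij * aj for rij, aj in zip(row, a)) for row in r]
        spectrum.append(sum(ai.conjugate() * x for ai, x in zip(a, ra)).real)
    return spectrum


M = 8
source = steering(M, 75.0)
# Covariance of one point source plus white noise of power 0.1.
R = [[source[a] * source[b].conjugate() + (0.1 if a == b else 0.0)
      for b in range(M)] for a in range(M)]
grid = [30.0 + 0.5 * i for i in range(241)]   # scan the sector [30, 150] degrees
P = cbf_spectrum(R, M, grid)
estimate = grid[P.index(max(P))]
```

With a single point source the peak falls on the true bearing. Once the tap has appreciable angular width, the peak still gives a usable beam, but the beam can no longer capture all of the signal power.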


4 Although these results are for a single tap, they provide some insight into the performance of the algorithms when they are applied to the individual channel taps of a multipath channel.

5 This antenna spacing was chosen to avoid problems with the CBF bearing estimation approach. Spatial aliasing effects can occur if the antenna spacing is chosen to be much larger than half the carrier wavelength.

6 In the case of 0 Hz Doppler, for each realization of the fading channel, the channel tap vector does not change over the time of observation.

filters: the bearings were chosen as those which provided the poorest SINR for angular width 2Δ = 0°. As the antenna array provides a gain of 9 dB, the SINR at one antenna is 5 dB. Results from [34, p. 302] suggest that applying the 1D RAKE filter to a simple nonfading 8-multichannel scenario with each multichannel having SINR of 5 dB results in a loss of 2.5-3 dB (i.e., an SINR of 11-11.5 dB at the 1D RAKE filter output). This provides a baseline to compare the other algorithms. The average SINR at the output of the beamformers selected by the three algorithms has been determined from Monte Carlo simulations using 10,000 trials in all cases. The desired signal was generated according to Eq. (4) with Q_k = 100.

The results for no Doppler effects (0 Hz) are shown in Fig. 7a, with a horizontal line showing the maximum attainable SINR in all cases. The results demonstrate that the eigenfilter method consistently performs well and achieves output SINR values close to the optimum. By contrast, the beamspace and CBF algorithms degrade as the angular width 2Δ increases, because the angular width of the signal energy becomes wider than the main lobe of the spatial filters {a(θ)}; for large values of 2Δ, the performance is not much better than for the 1D RAKE technique. In the case of the beamspace technique, selecting and combining the two or three filters with the largest power levels may improve performance.

In practice, the mobile will often be in motion, giving rise to a positive Doppler frequency. The performance of the three algorithms has also been measured for Doppler frequencies of 50 and 200 Hz in Figs. 7b and 7c. Assuming the carrier frequency is 900 MHz, this corresponds to vehicle speeds of roughly 30 and 120 mph (the difference between a car moving in an urban area and someone communicating from a train). Otherwise, the simulation conditions are unchanged from before.

As the maximum Doppler frequency increases, the performance of all three techniques degrades for large multipath scattering angles. This occurs because the form of the signal vector q(k, nt_s + (k - 1)t_c) can vary significantly as the phases of the constituent multipaths change over the sampling epoch. It is interesting to note that the performance of the eigenfilter technique degrades significantly for 200 Hz Doppler, much more so than the CBF or beamspace techniques. However, the eigenfilter approach still provides the best absolute SINR. Reducing the number of snapshots N used to form R_1 may improve the SINR performance, since the signal vector is more likely to be stationary over the sampling epoch; however, this increases the vulnerability of the algorithms to the effect of background noise.

The SINR comparison shown here between techniques based on beam-steering (i.e., the beamspace and CBF methods) and adaptive arrays (i.e., the eigenfilter approach) broadly corresponds to previously published results [20]. However, no account has yet been taken here of the statistical distribution of the signal at the beamformer output, which will significantly affect the error performance of the receiver. Calculating the eigenvalues of Ψ_k defined in Eq. (5) indicates the amplitudes of the independent Rayleigh processes present in a given channel tap x(k, t), which can be tracked to obtain spatial diversity gain. As the probability of several independent Rayleigh processes simultaneously entering a large fade is much less than that for one Rayleigh process, the receiver is much more likely to correctly estimate the transmitted data sequence.

The cumulative distribution function (CDF) indicates the likelihood of the received signal fading below a given threshold: this has been measured for the three algorithms in another Monte Carlo simulation. The simulation conditions are the same as for Fig. 7, except that the scattering width has been fixed at 20° and the maximum Doppler frequency is 0 Hz. The CDFs for the algorithms have been calculated from 100,000 trials, and the results are shown in Fig. 8. The theoretical CDF for x(k, t) has been calculated by substituting the eigenvalues of Ψ_k into the probability density equation [34, Eq. (7.5.26)].

Figure 7. The average SINR performance vs. angular width 2Δ of the beamspace, CBF and eigenfilter algorithms for maximum Doppler frequencies of a) 0 Hz; b) 50 Hz (30 mph); c) 200 Hz (120 mph).
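The diversity argument above, that several independent Rayleigh processes rarely fade together, can be put into numbers with a short calculation. The sketch assumes unit-mean, independent, identically distributed Rayleigh-faded powers, so that each power is exponentially distributed; this is a simplification relative to the unequal eigenvalues of Ψ_k used for the theoretical curve in Fig. 8. The function name is hypothetical.

```python
import math


def fade_probability(threshold_db, n_processes):
    """Probability that all n independent Rayleigh-faded powers lie below a threshold.

    Assumption of this sketch: each power has unit mean, so a single
    process has the exponential CDF 1 - exp(-x), with the threshold x
    expressed relative to the mean power.
    """
    x = 10.0 ** (threshold_db / 10.0)
    single = 1.0 - math.exp(-x)
    return single ** n_processes


# Probability of a fade 10 dB below the mean power.
one_process = fade_probability(-10.0, 1)
two_processes = fade_probability(-10.0, 2)
```

Under these assumptions a 10 dB fade occurs a little under 10 percent of the time for one process and a little under 1 percent of the time for two, which is why the shape of the CDF matters as much as the mean SINR.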


The CDF for the eigenfilter is close to the optimum achievable, except at low SINRs, where the performance of the algorithm degrades slightly due to the effect of noise. Interestingly, both the CBF and beamspace approaches are able to exploit the spatial diversity of the wide angle multipath scattering, and the curves are only slightly worse than for the eigenfilter. As a result, the error performance of the three algorithms may be quite similar, any difference being due to background noise effects rather than the CDF of the filter outputs.

As a contrast, the CDFs are also shown for 0° scattering and for a single fixed spatial filter placed at the source bearing. In the former case, the channel tap vector entries are completely correlated. There is no spatial diversity obtainable for this case, and the filter output follows a Rayleigh distribution. In the latter situation, the filter cannot be altered when a large fade occurs at the source bearing; indeed, it is a well-known result of multivariate statistics that the filter output will again follow a Rayleigh distribution [39]. For both curves, large fades are much more likely than for the eigenfilter, CBF or beamspace algorithms: the receiver error performance will be poor unless the SINR is very high or other multipath components are available to provide diversity.

In general, the correlation between antennas reduces as the angular width 2Δ increases or, in the case of the eigenfilter method, the antenna spacing is increased. As the mobile-to-base-station distance R increases, the value of 2Δ will generally reduce for a fixed value of the scattering radius r (Fig. 5). The available spatial diversity will reduce accordingly, so the range extension offered by antenna arrays may be affected by this factor [20]. Increasing the order of diversity produces diminishing returns in terms of the error performance of the receiver. If the decision variable CDF is measured for a receiver combining several channel taps, the improvement in the CDF due to increasing the angular width 2Δ of each tap will not be as dramatic as that shown in Fig. 8. However, some performance gain will still be obtained.

The main drawback of increasing the value of 2Δ is that the correlation between the reverse and forward link channel vectors reduces [9]. If it is intended to use an antenna array to transmit to the mobiles on the forward link, using the reverse link weights for retransmission on the forward link may not be very effective. Alternative approaches such as transmission diversity [20], time division duplex reception/transmission, or transmission feedback training [40] may be more appropriate.

Another source of degradation in practice is the effect of multiple access interference. Until now, the algorithms which have been used attempt to maximize the signal power at the outputs of the combiners for the incoming channel taps. This approach maximizes the SINR when the interference is spatially and temporally white; however, if the CDMA interference does not follow this distribution, the performance of these algorithms will degrade further. In order to overcome this problem, the Wiener filtering or optimum combining techniques may be used to suppress the interference and maximise the combiner's SINR [9]. The Wiener filter solution for h_k may be expressed as:

$$h_k = R_k^{-1} r_k \tag{9}$$

where r_k is the cross-correlation of the desired data with y(nt_s + (k - 1)t_c) over N symbols. The main difficulty here is to determine the desired signal. It is possible to feed back previous decisions; however, there will be losses in performance when decision errors occur. To minimize the effects of such errors, it is possible to employ periodic training sequences, although this reduces the efficiency of the reverse link transmissions.
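A minimal sketch of the Wiener solution of Eq. (9) follows. It estimates the cross-correlation vector from N snapshots and known (or fed-back) data symbols, then solves R_k h_k = r_k by Gaussian elimination rather than forming the inverse explicitly. The convention r_k = (1/N) Σ y d*, the helper names, and the small test matrices are assumptions of this example.

```python
def cross_correlation(snapshots, symbols):
    """Estimate r_k as the average of y(n) d*(n) over N symbols."""
    n = len(snapshots)
    m = len(snapshots[0])
    return [sum(y[a] * d.conjugate() for y, d in zip(snapshots, symbols)) / n
            for a in range(m)]


def solve(matrix, rhs):
    """Solve a complex linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [[complex(v) for v in row] + [complex(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[row][c] -= factor * aug[col][c]
    x = [0j] * n
    for row in range(n - 1, -1, -1):
        acc = aug[row][n] - sum(aug[row][c] * x[c] for c in range(row + 1, n))
        x[row] = acc / aug[row][row]
    return x


def wiener_filter(r_matrix, r_vector):
    """Spatial filter h_k satisfying R_k h_k = r_k."""
    return solve(r_matrix, r_vector)


# Small Hermitian example (illustrative values only).
R = [[2 + 0j, 1j], [-1j, 2 + 0j]]
r = [2 + 1j, 2 - 1j]
h = wiener_filter(R, r)
```

In practice the symbols passed to `cross_correlation` would come from a training sequence or from previous decisions, with the trade-offs described in the text.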

Figure 8. The cumulative distribution functions for the beamspace, CBF, and eigenfilter algorithms, and for a fixed beamformer ("Fixed") with a scattering width of 20°. The theoretical distributions for scattering widths of 20° ("Theory") and 0° ("0 deg") are also shown.

In order to avoid training sequences, a blind method for maximizing the SINR, based on the eigenfilter method, has been proposed [38]. Both pre- and post-correlation data at the receiver are used to estimate the interference covariance matrix Q_k. It is subtracted from R_k in order to improve the estimation of the underlying signal vector q(k, t). The principal eigenvector u_1 is obtained from the modified covariance matrix, and the spatial filter is given by the equation Q_k^{-1} u_1, where Q^{-1} denotes the matrix inverse operation.

Performance of CDMA Antenna Arrays

The performance of antenna array base stations has been analyzed in a number of publications, including [36, 41]. Two papers have analyzed base station schemes specific to CDMA systems [19, 42]. More recently, analysis has been presented for IS-95 M-ary modulation systems [43]. In this section, the performance of a CDMA system based on the eigenfilter method and a DPSK RAKE filter will be presented, using the mean BER of a given user as a quality measure. A single-cell system will be considered with a number of active users operating over the same RF bandwidth. The following assumptions have been made about the CDMA system:
• Each user is assumed to observe a K = 4 tap channel, according to the impulse response in Fig. 2. The normalized channel tap power levels {s_k} become 0, -3, -6, and -9 dB, respectively. The channel is "slowly fading" with maximum Doppler frequency 0 Hz for each channel tap. The receiver is assumed to be able to perfectly track the desired user's channel.
• Each channel tap is corrupted by CDMA interference from all other users. Assuming that the mean power of all users is the same, the PN code filter for the desired user will suppress the power level of each other user by a factor of 3W/2 for rectangular pulse shaping [44].
• The CDMA interference is assumed to have a uniform distribution over the range of bearings [30°, 150°]. The normalized output power γ from the desired user's spatial filter h for one interferer at bearing θ is simply h^H a(θ) a^H(θ) h / (h^H h). It is important to determine the statistical moments of γ; for example, the mean γ̄ is given by:


Figure 9. The BER results plotted for different numbers of users and antenna array sizes (M = 1, 2, 4, and 8).

$$\bar{\gamma} = \frac{3}{2\pi} \int_{\pi/6}^{5\pi/6} \frac{h^H a(\phi)\, a^H(\phi)\, h}{h^H h}\, d\phi \tag{10}$$
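The mean interference level for one interferer, averaged uniformly over the 120° sector, can be evaluated numerically. The sketch below does so with a midpoint rule for a spatial filter h. It assumes a half-wavelength uniform linear array with bearings measured from the array axis, and uses a filter matched to a point source at broadside as the example; the function names and the number of integration steps are hypothetical.

```python
import cmath
import math


def steering(m_elems, bearing_rad):
    """ULA steering vector, half-wavelength spacing, bearing measured from the array axis."""
    phase = math.pi * math.cos(bearing_rad)
    return [cmath.exp(1j * phase * m) for m in range(m_elems)]


def normalised_power(h, bearing_rad):
    """h^H a a^H h / (h^H h) for one interferer at the given bearing."""
    a = steering(len(h), bearing_rad)
    numerator = abs(sum(hc.conjugate() * ac for hc, ac in zip(h, a))) ** 2
    denominator = sum(abs(hc) ** 2 for hc in h)
    return numerator / denominator


def mean_gamma(h, steps=2000):
    """Midpoint-rule average of the normalised power over bearings [30, 150] degrees."""
    low, high = math.pi / 6.0, 5.0 * math.pi / 6.0
    width = (high - low) / steps
    total = sum(normalised_power(h, low + (i + 0.5) * width) for i in range(steps))
    return 3.0 / (2.0 * math.pi) * total * width


single_antenna = mean_gamma([1 + 0j])                      # no spatial filtering
eight_antennas = mean_gamma(steering(8, math.pi / 2.0))    # filter matched to broadside
```

For a single antenna the normalized power is 1 at every bearing, so the mean is 1. For a larger array, the value should be read against the array gain of M that the same filter delivers to the desired signal.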

For each antenna size and scattering width, the largest value of γ̄ has been selected to provide a pessimistic capacity estimate. The performance of adaptive array filters can be difficult to determine analytically, but here the power suppression level γ has been modeled as Gaussian using the central limit theorem. Its first two moments for P active users are γ̄(P - 1) and σ_γ^2(P - 1), where σ_γ^2 is the variance of γ for one interferer. As h always represents the matched filter for the given channel tap x(k, t), h has the zero vector as its mean vector and its mean covariance matrix is Ψ_k/M. The following steps were then used to calculate the BER performance in each case:
• The mean SINR at the spatial filter output for the jth tap is given by the equation

$$\rho_j = \frac{W s_j}{(2/3)\, \gamma \sum_{k=1}^{K} s_k} \tag{11}$$

where s_k denotes the normalized power of the kth channel tap and the processing gain W = 128.
• The eigenvalues of the matrix Ψ_k indicate the number of independent Rayleigh processes at each channel tap. If there are a total of Z significant Rayleigh components for all channel taps, denote their amplitudes as {a_u}.
• Assuming that the additive interference is Gaussian distributed, zero mean, and uncorrelated between channel taps, a closed form expression for the BER may be obtained. The output of each channel tap at each antenna follows a Rayleigh distribution. The BER for a given user may be obtained by simple modification of a result from [34, Ch. 7]:

$$\mathrm{BER} = \frac{1}{2^{2K-1}} \sum_{m=0}^{K-1} \sum_{n=0}^{K-1-m} \binom{2K-1}{n} \sum_{u=1}^{Z} \frac{1}{\rho_u} \left( \frac{1}{1 + 1/\rho_u} \right)^{m+1} \prod_{i=1,\, i \neq u}^{Z} \frac{a_u}{a_u - a_i} \tag{12}$$
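To give a feel for how a closed-form diversity BER of this kind behaves, the sketch below evaluates the textbook expression for binary DPSK with L combined channels in independent Rayleigh fading with distinct mean SNRs. This is the standard multichannel result rather than the article's modified expression: it assumes the number of combined channels equals the number of Rayleigh components, and the function name and SNR values are hypothetical.

```python
from math import comb


def dpsk_diversity_ber(mean_snrs):
    """BER of binary DPSK with L-channel combining in Rayleigh fading.

    mean_snrs : list of L distinct mean SNRs (linear scale), one per channel.
    Assumption of this sketch: independent channels with distinct means,
    so the partial-fraction weights pi_k are well defined.
    """
    order = len(mean_snrs)
    total = 0.0
    for k, snr_k in enumerate(mean_snrs):
        # Partial-fraction weight for channel k.
        pi_k = 1.0
        for i, snr_i in enumerate(mean_snrs):
            if i != k:
                pi_k *= snr_k / (snr_k - snr_i)
        for m in range(order):
            binomial_sum = sum(comb(2 * order - 1, n) for n in range(order - m))
            total += binomial_sum * (pi_k / snr_k) * (snr_k / (1.0 + snr_k)) ** (m + 1)
    return total / 2 ** (2 * order - 1)


single = dpsk_diversity_ber([9.0])        # one Rayleigh channel
dual = dpsk_diversity_ber([9.0, 4.0])     # two channels with unequal mean SNR
```

For one channel the expression reduces to the familiar 1/(2(1 + SNR)). Adding a second, weaker channel still lowers the error rate noticeably, which is the diversity effect exploited by the receiver.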

Equation (12) may be integrated over the distribution of γ to determine the final result for each number of users. In order to compare the performance of different antenna array sizes, Eq. (12) has been evaluated for M = 1, 2, 4, and 8. In each case, the maximum value of γ̄ for a scattering width 2Δ = 0.1 has been calculated, assuming the source bearing is in the range [30°, 150°]. The results are shown for the desired user's BER vs. total number of users P in Fig. 9. In this case, the results demonstrate a significant performance improvement for a given number of users, by increasing the size of the antenna array. If a mean BER of 10^-3 is taken as the threshold of acceptable performance, the capacity according to this measure increases from six users for one antenna to 63 for eight antenna elements. For cellular systems with a uniform distribution of users throughout the network, an additional noise term of approximately 50 percent of the single-cell CDMA interference is present. This effect will reduce the capacity by approximately 1/3; however, it can be compensated for by including error correction coding and data interleaving [11]. In addition, the system capacity may be doubled by using voice activity detection, which only permits mobile transmissions when the user is speaking; typically, one person speaks during 40 percent of a telephone conversation.

Conclusions

This article has provided an introduction to the subject of antenna arrays for narrowband CDMA base station receivers. A number of points have been discussed, and are summarized below:
• The topic of antenna arrays has been introduced, noting that they can reduce cellular interference levels and improve capacity. Results in this article suggest that employing M antennas can multiply the reverse link capacity by a factor of roughly M. However, this requires additional base station hardware and software.
• Channel modeling aspects have been described: in urban areas, several channel taps are often resolvable. Each channel tap arises geometrically through Classical Doppler or Gaussian Doppler models and has an angular width related to the distance and width of the scattering.
• The channel taps observed at antenna arrays may be modeled as the summation of array steering vectors. In urban areas, it is common for each vector entry to fade according to the Rayleigh distribution.
• The 2D RAKE filter appears to be a promising approach to handling a CDMA antenna array receiver. Several algorithms have been compared, with the eigenfilter approach performing consistently well. As the angular width of a channel tap increases, the receiver is able to exploit more spatial diversity.
• All the algorithms degrade in the presence of high Doppler frequency signals, particularly when the tap's angular width is large. In these cases, the receiver may need to employ short data lengths for channel estimation.
• The BER performance of an antenna array receiver has been estimated and significant capacity increases demonstrated. In general, the angular distribution of interference will affect the ability of a channel tap's spatial filter to suppress interference. The location of a mobile is particularly critical in determining the interference suppression levels.

There are a number of points which remain to be addressed. An important issue for antenna arrays is the scattering width of multipath components. Values have been estimated for narrowband systems (e.g., [30]), but no results appear to exist in the literature for frequency-selective CDMA systems. Efficient implementations of channel-identification algorithms are required, and more work is needed on their performance in realistic CDMA channels. The interaction of the reverse and forward links is important in practical systems, particularly to ensure that the forward link can handle the increased traffic that antenna arrays can offer on the reverse link.

Acknowledgments

This work was sponsored by EPSRC, an MOD CASE sponsorship, and Nortel Technology. The authors would also like to thank the anonymous reviewers for their helpful comments on this article.

References

[1] J. E. Padgett, C. G. Gunther, and T. Hattori, "Overview of Wireless Personal Communications," IEEE Commun. Mag., vol. 33, no. 1, Jan. 1995, pp. 28-42.
[2] D. C. Cox, "Wireless Personal Communications: What Is It?" IEEE Pers. Commun., vol. 2, no. 2, Apr. 1995, pp. 20-35.
[3] D. D. Falconer, F. Adachi, and B. Gudmundson, "Time Division Multiple Access Methods for Wireless Personal Communications," IEEE Commun. Mag., vol. 33, no. 1, Jan. 1995, pp. 50-57.
[4] A. J. Viterbi, "The Orthogonal-Random Waveform Dichotomy for Digital Mobile Communication," IEEE Pers. Commun., vol. 1, no. 1, 1st qtr., 1994, pp. 18-24.
[5] C. I. Cook, "Development of Air Interface Standards for PCS," IEEE Pers. Commun., vol. 1, no. 4, 4th qtr., pp. 30-34.
[6] P. G. Andermo and L. M. Ewerbring, "A CDMA-Based Radio Access Design for UMTS," IEEE Pers. Commun., vol. 2, no. 1, Feb. 1995, pp. 48-53.
[7] M. Barrett and R. Arnott, "Adaptive Antennas for Mobile Communications," IEE Elect. and Commun. Eng. J., vol. 5, no. 4, Aug. 1994, pp. 203-14.
[8] G. K. Chan, "Effects of Sectorization on the Spectrum Efficiency of Cellular Radio Systems," IEEE Trans. Vehic. Tech., vol. 41, no. 3, Aug. 1992, pp. 217-25.
[9] J. H. Winters, J. Salz, and R. D. Gitlin, "The Impact of Antenna Diversity on the Capacity of Wireless Communication Systems," IEEE Trans. Commun., vol. 42, nos. 2, 3, and 4, Feb./Mar./Apr. 1994, pp. 1740-50.
[10] J. L. Massey, "Information Theory Aspects of Spread Spectrum Communications," Proc. 3rd IEEE Int'l. Symp. Spread Spectrum Techniques and Apps. (ISSSTA), Oulu, Finland, July 1994, pp. 16-21.
[11] R. Padovani, "Reverse Link Performance of IS-95 Based Cellular Systems," IEEE Pers. Commun., vol. 1, no. 3, 3rd qtr., pp. 28-34.
[12] M. K. Simon et al., Spread Spectrum Communications Handbook (Revised Ed.), New York: McGraw-Hill, 1994.
[13] W. C. Y. Lee, "Overview of Cellular CDMA," IEEE Trans. Vehic. Tech., vol. 40, no. 2, May 1991, pp. 291-302.
[14] S. Haykin, Digital Communications, New York: John Wiley, 1988.
[15] R. Price and P. E. Green, "A Communications Technique for Multipath Channels," Proc. IRE, vol. 2, Mar. 1958, pp. 555-70.
[16] R. Kohno, "Spatial and Temporal Filtering for Co-Channel Interference in CDMA," Proc. 3rd ISSSTA, Oulu, Finland, July 1994, pp. 51-60.
[17] S. Verdu, "Adaptive Multiuser Detection," Proc. 3rd ISSSTA, Oulu, Finland, July 1994, pp. 43-50.
[18] B. D. Van Veen and K. M. Buckley, "Beamforming: A Versatile Approach to Spatial Filtering," IEEE ASSP, Apr. 1988, pp. 4-24.
[19] A. F. Naguib, A. Paulraj, and T. Kailath, "Capacity Improvement With Base-Station Antenna Arrays in Cellular CDMA," IEEE Trans. Vehic. Tech., vol. 43, no. 3, Aug. 1994, pp. 691-8.
[20] J. H. Winters, "The Diversity Gain of Transmit Diversity in Wireless Systems with Rayleigh Fading," Proc. ICC '94, New Orleans, LA, May 1994, pp. 1121-25.
[21] A. F. Naguib and A. Paulraj, "Performance Enhancement and Trade-offs of Smart Antennas in CDMA Cellular Networks," IEEE Vehic. Tech. Conf. (VTC), Chicago, IL, July 1995, pp. 40-44.
[22] S. Haykin et al., "Some Aspects of Array Signal Processing," IEE Proc., pt. F, vol. 139, no. 1, Feb. 1992, pp. 1-26.
[23] J. E. Hudson, Adaptive Array Principles, Stevenage, U.K.: Peter Peregrinus, 1981.
[24] Commission of the European Communities, "Digital Land Mobile Radio Communications: COST-207 Final Report," Ch. 2, 1988.
[25] G. L. Turin et al., "A Statistical Model of Urban Multipath Propagation," IEEE Trans. Vehic. Tech., vol. 21, no. 1, Feb. 1972, pp. 1-9.
[26] R. Steele (ed.), Mobile Radio Communications, London: Pentech Press, 1992.
[27] J. D. Parsons, The Mobile Radio Propagation Channel, London: Pentech Press, 1992.
[28] A. S. Bajwa and J. D. Parsons, "Small-Area Characterisation of UHF Urban and Suburban Mobile Radio Propagation," IEE Proc., pt. F, vol. 129, no. 2, Apr. 1982, pp. 102-9.
[29] W. C. Y. Lee, "Effects on Correlation between Two Mobile Radio Base-Station Antennas," IEEE Trans. Commun., vol. 21, no. 11, Nov. 1973, pp. 1214-23.
[30] F. Adachi et al., "Correlation between the Envelopes of 900 MHz Signals Received at a Mobile Radio Base Station Site," IEE Proc., pt. F, vol. 133, no. 6, Oct. 1986, pp. 506-12.
[31] J. Salz and J. H. Winters, "Effect of Fading Correlation on Adaptive Arrays in Digital Wireless Communications," Proc. ICC '93, Geneva, Switzerland, May 1993, pp. 1768-74.
[32] G. Raleigh et al., "Characterisation of Fast Fading Vector Channels for Multi-Antenna Communications Systems," Proc. 28th IEEE ASILOMAR Conf., Pacific Grove, CA, Nov. 1994, pp. 853-57.
[33] D. G. Brennan, "Linear Diversity Combining Techniques," Proc. IRE, June 1959, pp. 1075-1102.
[34] J. G. Proakis, Digital Communications, New York: McGraw-Hill, 1989.
[35] A. F. Naguib and A. Paulraj, "Performance of CDMA Cellular Networks with Base-Station Antenna Arrays," Proc. Int'l. Zurich Seminar on Digital Commun., 1994.
[36] S. Anderson et al., "An Adaptive Array for Mobile Communications Systems," IEEE Trans. Vehic. Tech., vol. 40, no. 1, Feb. 1991, pp. 230-36.
[37] C. Farsakh and J. A. Nossek, "Application of Space Division Multiple Access to Mobile Radio," Proc. 5th IEEE Int'l. Symp. Personal Indoor and Mobile Communications (PIMRC), The Hague, Holland, Sept. 1994, pp. 1736-39.
[38] B. Suard et al., "Performance of CDMA Mobile Communication Systems Using Antenna Arrays," Proc. IEEE Int'l. Conf. Acoustics, Speech and Signal Processing (ICASSP), Minneapolis, MN, Apr. 1993, pp. IV 153-56.
[39] R. J. Muirhead, Aspects of Multivariate Statistical Theory, New York: Wiley, 1982.
[40] D. Gerlach and A. Paulraj, "Adaptive Transmitting Antenna Arrays With Feedback," IEEE Sig. Processing Letts., vol. 1, no. 10, Oct. 1994, pp. 50-52.
[41] S. C. Swales et al., "The Performance Enhancement of Multibeam Adaptive Base Station Antennas for Cellular Land Mobile Radio Systems," IEEE Trans. Vehic. Tech., vol. 39, no. 1, Feb. 1990, pp. 56-67.
[42] J. C. Liberti and T. S. Rappaport, "Analytical Results for Capacity Improvements in CDMA," IEEE Trans. Vehic. Tech., vol. 43, no. 3, Aug. 1994, pp. 680-90.
[43] A. F. Naguib and A. Paulraj, "Performance of DS/CDMA With M-ary Orthogonal Modulation Cell Site Antenna Arrays," Proc. ICC '95, Seattle, WA, June 1995, pp. 697-702.
[44] R. Meidan, R. Kohno, and L. B. Milstein, "Spread Spectrum Access Methods for Wireless Communications," IEEE Commun. Mag., vol. 33, no. 1, Jan. 1995, pp. 58-67.


Efficient Direction and Polarization Estimation with a COLD Array

Jian Li, Petre Stoica, and Dunmin Zheng

Abstract-This paper considers angle and polarization estimation by means of a cocentered orthogonal loop and dipole (COLD) array. We show that by using the COLD array, the performance of both angle and polarization estimation can be greatly improved, as compared to using a crossed dipole array. We present an asymptotically statistically efficient method of direction estimation (MODE) algorithm that can be used with the COLD array for both angle and polarization estimation of correlated (including coherent) or uncorrelated incident signals. Numerical examples are given to show the better estimation performance of the MODE algorithm than that of the multiple signal classification (MUSIC) and the noise subspace-fitting (NSF) algorithms.

I. INTRODUCTION

WHEN array signal processing algorithms are devised to estimate incident signal parameters with uniformly polarized or diversely polarized arrays, it is important to take advantage of array geometries and receiving properties of antenna elements. Although many algorithms have been developed for array signal processing recently, the characteristics of specific antenna sensors are only beginning to attract more attention. Previous work on angle and polarization estimation using crossed dipoles [1], [2] and orthogonal dipoles and loops [3] are examples of using specific antenna sensors to estimate the angles and polarizations of incident narrowband electromagnetic plane waves. In this paper, we study the advantages of an arbitrary linear array that consists of cocentered orthogonal loop and dipole (COLD) pairs. By using the COLD array, the performance of both angle and polarization estimation can be improved significantly, as compared to using a cocentered crossed-dipole (CCD) array. We consider the case where all incident narrowband electromagnetic (EM) plane waves are completely polarized. A completely polarized EM wave is a limiting case of a more general type of EM wave, viz. a partially polarized EM wave. The state of polarization of a partially polarized EM wave is a function of time, while a completely polarized wave has a fixed state of polarization (see [4] and the references therein).

Manuscript received February 10, 1995; revised October 9, 1995. J. Li and D. Zheng were supported in part by the NSF Grant MIP-9308302. P. Stoica was supported in part by Goran Gustafsson Foundation, the Swedish Research Council for Engineering Sciences (TFR), and the Swedish National Board for Technical Development (NUTEK). J. Li and D. Zheng are with the Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32611 USA. P. Stoica is with the Systems and Control Group, Department of Technology, Uppsala University, P.O. Box 27, S-751 03 Uppsala, Sweden. Publisher Item Identifier S 0018-926X(96)02638-5.

We present an asymptotically statistically efficient signal subspace-based method of direction estimation (MODE) algorithm [5], [6] for both angle and polarization estimation. Since the MODE algorithm is a signal subspace-based approach, it is asymptotically statistically efficient for both correlated (including coherent) and uncorrelated incident signals. We show with numerical examples that the estimation performance of MODE is better, especially for highly correlated or coherent signals, than that of multiple signal classification (MUSIC) and noise subspace fitting (NSF) [7]. (We remark that the signal subspace eigenvector-based MODE algorithm and the NSF algorithm are asymptotically statistically equivalent whenever the signals are noncoherent. For coherent signals, MODE remains asymptotically statistically efficient, whereas NSF is no longer asymptotically statistically efficient [8]. This observation suggests that when the correlation coefficient is close to one, NSF may need a much larger number of data samples than MODE to converge to the asymptotics, and hence, for a given finite N, MODE is likely to perform better than NSF in such a case of highly correlated signals.)

In Section II, we define the array geometry, describe the properties of the COLD array, and formulate the problem of interest. In Section III, we describe how the MODE algorithm can be used to estimate the angles and states of polarization of incident signals with a COLD array. In Section IV, the asymptotic statistical performance of MODE is given. In Section V, several numerical examples are presented to compare the MODE algorithm with MUSIC and NSF for both angle and polarization estimation. The Cramer-Rao bound (CRB) for the COLD array is also compared with that for the CCD array. Finally, Section VI gives our conclusions.

II. COLD ARRAY AND PROBLEM FORMULATION

Consider a 2L-element linear array consisting of L COLD pairs, as shown in Fig. 1. The signal received from each antenna sensor is to be processed separately for direction and polarization estimation. The $l$th COLD pair, $l = 1, 2, \ldots, L$, has its center on the y-axis at an arbitrary $y = \delta_l$. For the $l$th COLD pair, the dipole parallel to the z axis is referred to as the z-axis dipole and the loop parallel to the x-y plane as the x-y plane loop. Assume $K$ (with $K < L$) narrowband plane waves impinge on the array from angular directions described by $\theta$ and $\phi$.
Reprinted from IEEE Transactions on Antennas and Propagation, Vol. 44, No.4, pp 539-547, April 1996.

Fig. 1. A linear COLD array. [Drawing omitted; axes shown: x, y, z.]

... wave with an arbitrary elliptical electromagnetic polarization [9]. Assume that the electric field of an incoming signal has transverse components

$$\mathbf{E} = E_\theta \mathbf{e}_\theta + E_\phi \mathbf{e}_\phi \tag{1}$$

where unit vectors $\mathbf{e}_\theta$, $\mathbf{e}_\phi$, and $-\mathbf{e}_r$, in that order, form a right-hand coordinate system for the incoming signals and $E_\theta$ and $E_\phi$ will describe a polarization ellipse. For a given signal polarization, specified by constants $\gamma$ and $\eta$, the electric-field components are given by [aside from a common narrowband phase factor $s_0(t)$] (see footnote 1)

$$E_\theta = E \cos\gamma \tag{2}$$

$$E_\phi = E \sin\gamma \, e^{j\eta} \tag{3}$$

where $E$ denotes the amplitude of the incident signal. The $\gamma$ and $\eta$ can be used to compute $\alpha$ and $\beta$, which are the ellipticity and orientation angles of the polarization ellipse, respectively. $\gamma$ is always in the range $0 \le \gamma \le \pi/2$ and $\eta$ is in the range $-\pi \le \eta < \pi$. $\alpha$ and $\beta$ can also be used to compute $\gamma$ and $\eta$ [1], [10].

We assume that each dipole in the array is a short dipole (i.e., the length of the dipole is equal to or less than one-tenth of a wavelength) with the same length $L_{sd}$ and each loop is a small loop (i.e., the perimeter of the loop is equal to or less than three-tenths of a wavelength) with the same area $A_{sl}$. Thus, the output voltages from each dipole and loop are proportional to the electric-field components parallel to dipole and loop, respectively. An incoming signal described by arbitrary electric-field components $E_\theta$ and $E_\phi$ can be written as

$$\mathbf{E} = E\left[(\cos\gamma)\,\mathbf{e}_\theta + (\sin\gamma\, e^{j\eta})\,\mathbf{e}_\phi\right]. \tag{4}$$

Let us define the spatial phase factor

$$q_l = e^{\,j \frac{2\pi \delta_l}{\lambda_0} \sin\theta \sin\phi} \tag{5}$$

where $\lambda_0$ is the wavelength of the signal. The effective heights of the short dipoles and small loops are given by [11]

$$h_{sd} = L_{sd} \sin\phi \tag{6}$$

and

$$h_{sl} = -j\,\frac{2\pi A_{sl}}{\lambda_0} \sin\phi \tag{7}$$

respectively. Including the time and space phase factors in (4), we find that an incoming signal characterized by $(\theta, \phi, \gamma, \eta, E)$ produces a signal vector in the COLD pair centered at $y = \delta_l$ as follows:

$$\mathbf{z}_l(t) = E\, \boldsymbol{\mu}_{\mathrm{COLD}}\, s_0(t)\, q_l \tag{8}$$

where

$$\boldsymbol{\mu}_{\mathrm{COLD}} = \begin{bmatrix} j\,\dfrac{2\pi A_{sl}}{\lambda_0} \sin\phi \cos\gamma \\[2mm] -L_{sd} \sin\phi \sin\gamma\, e^{j\eta} \end{bmatrix}. \tag{9}$$

An advantage of the COLD array is that its antenna elements are not sensitive to the azimuth angle $\theta$ of the signal because both the loops and dipoles have the same $\sin\phi$ field pattern, as may be seen from (9). Hence, the signal vector $\boldsymbol{\mu}_{\mathrm{COLD}}$ in (9) is independent of $\theta$. We assume that the antennas and the incident signals are coplanar, i.e., $\phi = 90^\circ$. Thus, (9) becomes

$$\boldsymbol{\mu}_{\mathrm{COLD}} = \begin{bmatrix} V_\theta \cos\gamma \\ V_\phi \sin\gamma\, e^{j\eta} \end{bmatrix} \tag{10}$$

where

$$V_\theta = j\,\frac{2\pi A_{sl}}{\lambda_0} \tag{11}$$

$$V_\phi = -L_{sd}. \tag{12}$$

Note that $V_\theta$ and $V_\phi$ represent the complex voltages induced at the loop and dipole outputs by a signal with a unit electric field parallel to the loops and dipoles, respectively. Let $s(t) = E s_0(t) V_\theta \cos\gamma$. The $\mathbf{z}_l(t)$ in (8) can be rewritten as

$$\mathbf{z}_l(t) = \mathbf{u}\, s(t)\, q_l \tag{13}$$

where

$$\mathbf{u} = \begin{bmatrix} 1 \\ r \end{bmatrix} = \begin{bmatrix} 1 \\ \dfrac{V_\phi}{V_\theta} \tan\gamma\, e^{j\eta} \end{bmatrix}. \tag{14}$$

We assume that $K$ signals, specified by incident angles $\theta_k$, $k = 1, 2, \ldots, K$, are incident on the array. In addition, we assume a thermal noise voltage vector $\mathbf{n}_l(t)$ is present at each output vector $\mathbf{z}_l(t)$. The $\mathbf{n}_l(t)$ are assumed to be zero-mean circularly symmetric complex Gaussian random processes that are statistically independent of each other and to have covariance matrix $\sigma^2 \mathbf{I}$, where $\mathbf{I}$ denotes the identity matrix. Under these assumptions, the total output vector received by the COLD pair centered at $y = \delta_l$ is given by

$$\mathbf{z}_l(t) = \sum_{k=1}^{K} \mathbf{u}_k s_k(t) q_{lk} + \mathbf{n}_l(t), \qquad l = 1, 2, \ldots, L \tag{15}$$

where $\mathbf{u}_k$ and $q_{lk}$ are given by (14) and (5), respectively, with subscript $k$ added to each angular quantity. Further, $s_k(t) = E_k s_{0k}(t) V_\theta \cos\gamma_k$, where $E_k s_{0k}(t)$ denotes the $k$th narrowband signal. The incident signals may or may not be correlated (including completely correlated, i.e., coherent) with each other.

Footnote 1: For a narrowband BPSK (binary phase-shift keyed) signal, for example, $s_0(t) = e^{j[\omega_0 t + \psi(t)]}$, where $\omega_0$ is the carrier frequency and $\psi(t)$ is the modulating phase.
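The per-pair model in (13)-(15) can be sketched numerically. The following is a minimal illustration, not the authors' code: the function names are hypothetical, the voltages are normalized to $V_\theta = j$ and $V_\phi = -1$ (an assumption in the spirit of the unit-magnitude normalization used later in the simulations), the wavelength is set to one, and the sign convention of the spatial phase factor is assumed.

```python
import numpy as np

def cold_pair_response(gamma, eta, v_theta=1j, v_phi=-1.0):
    """Normalized response u = [1, r]^T of one COLD pair for phi = 90 deg, as in (14).

    v_theta and v_phi are assumed, normalized loop and dipole voltages.
    """
    r = (v_phi / v_theta) * np.tan(gamma) * np.exp(1j * eta)
    return np.array([1.0, r], dtype=complex)

def spatial_phase(delta, theta, wavelength=1.0):
    """Spatial phase factor for a pair centered at y = delta (phi = 90 deg, assumed sign)."""
    return np.exp(1j * 2.0 * np.pi * delta * np.sin(theta) / wavelength)

def noise_free_outputs(deltas, thetas, gammas, etas, s):
    """Stack the noise-free pair outputs sum_k u_k s_k q_lk of (15) into a (2L,) vector."""
    z = np.zeros(2 * len(deltas), dtype=complex)
    for l, delta in enumerate(deltas):
        for theta, gamma, eta, s_k in zip(thetas, gammas, etas, s):
            u_k = cold_pair_response(gamma, eta)
            z[2 * l:2 * l + 2] += u_k * s_k * spatial_phase(delta, theta)
    return z

# Eight pairs at half-wavelength spacing, one signal arriving 0.3 rad off broadside.
deltas = 0.5 * np.arange(8)
z = noise_free_outputs(deltas, [0.3], [np.pi / 4], [np.pi / 2], [2.0])
```

Because the pair response does not depend on $\theta$, the magnitudes of the two outputs are the same at every pair; only the phase factor changes along the array.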

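Let $\mathbf{z}(t)$, $\mathbf{s}(t)$, and $\mathbf{n}(t)$ be column vectors containing the received signals, incident signals, and noise, respectively, i.e.,

$$\mathbf{z}(t) = \begin{bmatrix} \mathbf{z}_1(t) \\ \vdots \\ \mathbf{z}_L(t) \end{bmatrix}, \qquad \mathbf{s}(t) = \begin{bmatrix} s_1(t) \\ \vdots \\ s_K(t) \end{bmatrix}, \qquad \mathbf{n}(t) = \begin{bmatrix} \mathbf{n}_1(t) \\ \vdots \\ \mathbf{n}_L(t) \end{bmatrix}. \tag{16}$$

The received signal vector has the form

$$\mathbf{z}(t) = \mathcal{A}\,\mathbf{s}(t) + \mathbf{n}(t) \tag{17}$$

where $\mathcal{A}$ is a $2L \times K$ matrix

$$\mathcal{A} = [\mathbf{a}_1 \;\; \mathbf{a}_2 \;\; \cdots \;\; \mathbf{a}_K] = [\mathbf{A} \otimes \mathbf{I}]\,\mathbf{U} \tag{18}$$

with $\otimes$ representing the Kronecker product,

$$\mathbf{A} = \begin{bmatrix} q_{11} & q_{12} & \cdots & q_{1K} \\ q_{21} & q_{22} & \cdots & q_{2K} \\ \vdots & \vdots & & \vdots \\ q_{L1} & q_{L2} & \cdots & q_{LK} \end{bmatrix} \tag{19}$$

and

$$\mathbf{U} = \begin{bmatrix} \mathbf{u}_1 & & \mathbf{0} \\ & \ddots & \\ \mathbf{0} & & \mathbf{u}_K \end{bmatrix}. \tag{20}$$

We assume that the element signals are sampled at $N$ distinct times $t_n$, $n = 1, 2, \ldots, N$. The random noise vectors $\mathbf{n}(t_n)$ at different sample times are assumed to be independent of each other. The problem of interest herein is to determine the azimuth arrival angles $\theta_k$ and the states of polarization described by $(\gamma_k, \eta_k)$, or $(\alpha_k, \beta_k)$, $k = 1, 2, \ldots, K$, from the measurements $\mathbf{z}(t_n)$, $n = 1, 2, \ldots, N$.

III. ANGLE AND POLARIZATION ESTIMATION USING MODE

The MODE [5], [6] and (in a related form) the weighted subspace fitting (WSF) [12] algorithms were derived for angle estimation with uniformly polarized arrays. We present below how to use the signal subspace-based MODE algorithm with the COLD array for both angle and polarization estimation. Let

$$\hat{\mathbf{R}} = \frac{1}{N} \sum_{n=1}^{N} \mathbf{z}(t_n)\,\mathbf{z}^H(t_n) \tag{21}$$

where $(\cdot)^H$ denotes the complex conjugate transpose and $\hat{\mathbf{R}}$ denotes the estimate of the following array covariance matrix

$$\mathbf{R} = E[\mathbf{z}(t)\,\mathbf{z}^H(t)]. \tag{22}$$

It has been shown in [5], [12], and [8] that an asymptotically (for large $N$) statistically efficient estimator of the angles $\boldsymbol{\theta} = [\theta_1, \theta_2, \ldots, \theta_K]^T$ and the polarization parameters $\mathbf{r} = [r_1, r_2, \ldots, r_K]^T$ can be obtained by minimizing the following function

$$f(\boldsymbol{\theta}, \mathbf{r}) = \mathrm{Tr}\left[\mathbf{P}^{\perp}_{\mathcal{A}}\, \hat{\mathbf{E}}_s \tilde{\boldsymbol{\Lambda}}^2 \hat{\boldsymbol{\Lambda}}^{-1} \hat{\mathbf{E}}_s^H\right] \tag{23}$$

where the symbol $\mathbf{P}^{\perp}_{\mathcal{A}}$ stands for the orthogonal projector onto the null space of $\mathcal{A}^H$, and the columns in $\hat{\mathbf{E}}_s$ are the signal subspace eigenvectors of $\hat{\mathbf{R}}$ that correspond to the $\bar{K}$ largest eigenvalues of $\hat{\mathbf{R}}$, with $\bar{K} = \min[N, \mathrm{rank}(\mathbf{S})]$. Here, $\mathbf{S}$ is the source covariance matrix

$$\mathbf{S} = E[\mathbf{s}(t)\,\mathbf{s}^H(t)]. \tag{24}$$

We assume that $\bar{K}$ is known (if $\bar{K}$ is unknown, it can be estimated from the data as described, for example, in [13]). Note that if no components of the signal vector $\mathbf{s}(t)$ are fully correlated to one another then $\bar{K} = K$ (provided $N > K$). Further, the $\hat{\boldsymbol{\Lambda}}$ in (23) is a diagonal matrix with diagonal elements $\hat{\lambda}_1 \ge \hat{\lambda}_2 \ge \cdots \ge \hat{\lambda}_{\bar{K}}$, which are the largest eigenvalues of $\hat{\mathbf{R}}$, and

$$\tilde{\boldsymbol{\Lambda}} = \hat{\boldsymbol{\Lambda}} - \hat{\sigma}^2 \mathbf{I} \tag{25}$$

where

$$\hat{\sigma}^2 = \frac{1}{2L - \bar{K}} \sum_{i=\bar{K}+1}^{2L} \hat{\lambda}_i = \frac{1}{2L - \bar{K}} \left[\mathrm{tr}(\hat{\mathbf{R}}) - \sum_{i=1}^{\bar{K}} \hat{\lambda}_i\right]. \tag{26}$$

We show below that we can concentrate out $\mathbf{r}$ first, and hence, reduce the dimension of the parameter space over which we need to search to minimize (23). It is shown in the Appendix that

$$\mathbf{P}^{\perp}_{\mathcal{A}} = \mathbf{P}^{\perp}_{\mathbf{A} \otimes \mathbf{I}} + \mathbf{P}_{(\mathbf{A}^{\dagger H} \otimes \mathbf{I})\mathbf{V}} \tag{27}$$

where

$$\mathbf{P}^{\perp}_{\mathbf{A} \otimes \mathbf{I}} = \mathbf{I} - \mathbf{P}_{\mathbf{A} \otimes \mathbf{I}}, \qquad \mathbf{A}^{\dagger} = (\mathbf{A}^H \mathbf{A})^{-1} \mathbf{A}^H \tag{28}$$

and

$$\mathbf{V} = \begin{bmatrix} \mathbf{v}_1 & & \mathbf{0} \\ & \ddots & \\ \mathbf{0} & & \mathbf{v}_K \end{bmatrix} \tag{29}$$

with

$$\mathbf{v}_k = \begin{bmatrix} -r_k^* \\ 1 \end{bmatrix}, \qquad k = 1, 2, \ldots, K. \tag{30}$$

Thus, minimizing $f(\boldsymbol{\theta}, \mathbf{r})$ in (23) is equivalent to minimizing

$$f(\boldsymbol{\theta}, \mathbf{r}) = \mathrm{Tr}\left[\left(\mathbf{P}^{\perp}_{\mathbf{A} \otimes \mathbf{I}} + \mathbf{P}_{(\mathbf{A}^{\dagger H} \otimes \mathbf{I})\mathbf{V}}\right) \hat{\mathbf{E}}_s \tilde{\boldsymbol{\Lambda}}^2 \hat{\boldsymbol{\Lambda}}^{-1} \hat{\mathbf{E}}_s^H\right]. \tag{31}$$

Let

$$\mathbf{W} = \left\{\mathbf{V}^H [(\mathbf{A}^H \mathbf{A})^{-1} \otimes \mathbf{I}] \mathbf{V}\right\}^{-1} \tag{32}$$

be formed from some consistent estimates of $\boldsymbol{\theta}$ and $\mathbf{r}$. Since $\mathbf{P}_{(\mathbf{A}^{\dagger H} \otimes \mathbf{I})\mathbf{V}} \hat{\mathbf{E}}_s = O(1/\sqrt{N})$, $\{\mathbf{V}^H[(\mathbf{A}^H \mathbf{A})^{-1} \otimes \mathbf{I}]\mathbf{V}\}^{-1}$ can be replaced by $\mathbf{W}$ without affecting the asymptotics of the MODE estimator [5], [6]. Then we have

$$f(\boldsymbol{\theta}, \mathbf{r}) \simeq f_1(\boldsymbol{\theta}) + f_2(\boldsymbol{\theta}, \mathbf{r}) \tag{33}$$

where

$$f_1(\boldsymbol{\theta}) = \mathrm{Tr}\left[\mathbf{P}^{\perp}_{\mathbf{A} \otimes \mathbf{I}}\, \hat{\mathbf{E}}_s \tilde{\boldsymbol{\Lambda}}^2 \hat{\boldsymbol{\Lambda}}^{-1} \hat{\mathbf{E}}_s^H\right] \tag{34}$$

and

$$f_2(\boldsymbol{\theta}, \mathbf{r}) = \mathrm{Tr}\left[(\mathbf{A}^{\dagger H} \otimes \mathbf{I}) \mathbf{V} \mathbf{W} \mathbf{V}^H (\mathbf{A}^{\dagger} \otimes \mathbf{I})\, \hat{\mathbf{E}}_s \tilde{\boldsymbol{\Lambda}}^2 \hat{\boldsymbol{\Lambda}}^{-1} \hat{\mathbf{E}}_s^H\right]. \tag{35}$$

The MODE estimates $\{\hat{\boldsymbol{\theta}}, \hat{\mathbf{r}}\}$ are obtained by minimizing $f$, i.e.,

$$\{\hat{\boldsymbol{\theta}}, \hat{\mathbf{r}}\} = \arg\min_{\boldsymbol{\theta}, \mathbf{r}} \left[f_1(\boldsymbol{\theta}) + f_2(\boldsymbol{\theta}, \mathbf{r})\right]. \tag{36}$$

Let

$$\mathbf{A}_h = \text{odd columns of } (\mathbf{A}^{\dagger H} \otimes \mathbf{I}) \tag{37}$$

and

$$\mathbf{A}_v = \text{even columns of } (\mathbf{A}^{\dagger H} \otimes \mathbf{I}). \tag{38}$$

Let $\mathbf{V}_h$ and $\mathbf{V}_v$ be the following $K \times K$ diagonal matrices

$$\mathbf{V}_h = \mathrm{diag}(-r_1^*, -r_2^*, \ldots, -r_K^*) \tag{39}$$

and

$$\mathbf{V}_v = \mathbf{I}. \tag{40}$$

Then

$$(\mathbf{A}^{\dagger H} \otimes \mathbf{I}) \mathbf{V} = \mathbf{A}_h \mathbf{V}_h + \mathbf{A}_v \mathbf{V}_v. \tag{41}$$

Thus, $f_2(\boldsymbol{\theta}, \mathbf{r})$ in (35) can be rewritten as

$$\begin{aligned} f_2(\boldsymbol{\theta}, \mathbf{r}) = {} & \mathrm{Tr}\left[\mathbf{V}_h^H \mathbf{A}_h^H \hat{\mathbf{E}}_s \tilde{\boldsymbol{\Lambda}}^2 \hat{\boldsymbol{\Lambda}}^{-1} \hat{\mathbf{E}}_s^H \mathbf{A}_h \mathbf{V}_h \mathbf{W}\right] + \mathrm{Tr}\left[\mathbf{V}_h^H \mathbf{A}_h^H \hat{\mathbf{E}}_s \tilde{\boldsymbol{\Lambda}}^2 \hat{\boldsymbol{\Lambda}}^{-1} \hat{\mathbf{E}}_s^H \mathbf{A}_v \mathbf{V}_v \mathbf{W}\right] \\ & + \mathrm{Tr}\left[\mathbf{V}_v^H \mathbf{A}_v^H \hat{\mathbf{E}}_s \tilde{\boldsymbol{\Lambda}}^2 \hat{\boldsymbol{\Lambda}}^{-1} \hat{\mathbf{E}}_s^H \mathbf{A}_h \mathbf{V}_h \mathbf{W}\right] + \mathrm{Tr}\left[\mathbf{V}_v^H \mathbf{A}_v^H \hat{\mathbf{E}}_s \tilde{\boldsymbol{\Lambda}}^2 \hat{\boldsymbol{\Lambda}}^{-1} \hat{\mathbf{E}}_s^H \mathbf{A}_v \mathbf{V}_v \mathbf{W}\right]. \end{aligned} \tag{42}$$

Since $\mathbf{V}_h$ and $\mathbf{V}_v$ are diagonal matrices, (42) can be written in the following matrix form

$$f_2(\boldsymbol{\theta}, \mathbf{r}) = \begin{bmatrix} \mathbf{v}_h^H & \mathbf{e}^T \end{bmatrix} \mathbf{Q}(\boldsymbol{\theta}) \begin{bmatrix} \mathbf{v}_h \\ \mathbf{e} \end{bmatrix} \tag{43}$$

where

$$\mathbf{Q}(\boldsymbol{\theta}) = \begin{bmatrix} (\mathbf{A}_h^H \hat{\mathbf{E}}_s \tilde{\boldsymbol{\Lambda}}^2 \hat{\boldsymbol{\Lambda}}^{-1} \hat{\mathbf{E}}_s^H \mathbf{A}_h) \odot \mathbf{W}^T & (\mathbf{A}_h^H \hat{\mathbf{E}}_s \tilde{\boldsymbol{\Lambda}}^2 \hat{\boldsymbol{\Lambda}}^{-1} \hat{\mathbf{E}}_s^H \mathbf{A}_v) \odot \mathbf{W}^T \\ (\mathbf{A}_v^H \hat{\mathbf{E}}_s \tilde{\boldsymbol{\Lambda}}^2 \hat{\boldsymbol{\Lambda}}^{-1} \hat{\mathbf{E}}_s^H \mathbf{A}_h) \odot \mathbf{W}^T & (\mathbf{A}_v^H \hat{\mathbf{E}}_s \tilde{\boldsymbol{\Lambda}}^2 \hat{\boldsymbol{\Lambda}}^{-1} \hat{\mathbf{E}}_s^H \mathbf{A}_v) \odot \mathbf{W}^T \end{bmatrix} \triangleq \begin{bmatrix} \mathbf{Q}_1(\boldsymbol{\theta}) & \mathbf{Q}_2(\boldsymbol{\theta}) \\ \mathbf{Q}_2^H(\boldsymbol{\theta}) & \mathbf{Q}_3(\boldsymbol{\theta}) \end{bmatrix} \tag{44}$$

with $\odot$ denoting the Hadamard-Schur matrix product (i.e., the elementwise multiplication),

$$\mathbf{v}_h = [-r_1^*, -r_2^*, \ldots, -r_K^*]^T \tag{45}$$

and

$$\mathbf{e} = [1 \;\; 1 \;\; \cdots \;\; 1]^T. \tag{46}$$

Note that the polarization parameters are contained only in $\mathbf{v}_h$. By setting $\partial f_2 / \partial \mathbf{v}_h = 0$, we obtain

$$\hat{\mathbf{v}}_h = -\mathbf{Q}_1^{-1}(\boldsymbol{\theta})\, \mathbf{Q}_2(\boldsymbol{\theta})\, \mathbf{e}. \tag{47}$$

Using (47) in (43) gives

$$f_3(\boldsymbol{\theta}) = \mathbf{e}^T \left[\mathbf{Q}_3(\boldsymbol{\theta}) - \mathbf{Q}_2^H(\boldsymbol{\theta})\, \mathbf{Q}_1^{-1}(\boldsymbol{\theta})\, \mathbf{Q}_2(\boldsymbol{\theta})\right] \mathbf{e} \tag{48}$$

which is a concentrated function depending only on $\boldsymbol{\theta}$. The MODE estimates $\{\hat{\boldsymbol{\theta}}, \hat{\mathbf{r}}\}$ can be obtained by

$$\hat{\boldsymbol{\theta}} = \arg\min_{\boldsymbol{\theta}} \left[f_1(\boldsymbol{\theta}) + f_3(\boldsymbol{\theta})\right] \tag{49}$$

and using $\hat{\boldsymbol{\theta}}$ in (47) to obtain $\hat{\mathbf{r}}$.

To summarize, we have the following MODE algorithm for angle and polarization estimation:

Step 1) Obtain initial estimates of $\boldsymbol{\theta}$ and $\mathbf{r}$ (see the discussions below).
Step 2) Determine $\hat{\boldsymbol{\theta}}$ by minimizing $f_1(\boldsymbol{\theta}) + f_3(\boldsymbol{\theta})$ as shown in (49) with $\mathbf{W}$ in (32) formed from the initial estimates obtained in Step 1).
Step 3) Calculate $\hat{\mathbf{r}}$ by using the $\hat{\boldsymbol{\theta}}$ obtained in Step 2) in (47).
Step 4) Determine the $\hat{\gamma}_k$ and $\hat{\eta}_k$ from $\hat{\mathbf{r}}$ with

$$\hat{\gamma}_k = \tan^{-1}\left(\left|\frac{\hat{r}_k V_\theta}{V_\phi}\right|\right) \tag{50}$$

and

$$\hat{\eta}_k = \arg\left(\frac{\hat{r}_k V_\theta}{V_\phi}\right), \qquad k = 1, 2, \ldots, K. \tag{51}$$

For signals that are not highly correlated or coherent with each other, the initial estimates of $\boldsymbol{\theta}$ and $\mathbf{r}$ in Step 1 may be obtained by using MUSIC [14], which requires a one-dimensional search over the parameter space. For highly correlated or coherent signals, the initial estimate of $\boldsymbol{\theta}$ may be determined by setting $\mathbf{W} = \mathbf{I}$ and minimizing $f_1(\boldsymbol{\theta}) + f_3(\boldsymbol{\theta})$, as shown in (49). The initial estimate of $\mathbf{r}$ can be calculated by using the initial estimate of $\boldsymbol{\theta}$ in (47). The initial estimates obtained by using MUSIC for noncoherent signals or MODE with $\mathbf{W} = \mathbf{I}$ are known to be consistent [6], [15].

IV. STATISTICAL PERFORMANCE ANALYSIS

We present below the asymptotic (for large $N$) statistical performance of MODE for both direction and polarization estimation with the COLD array. Before we present the analysis results, however, we first describe the method we use to describe the accuracy of the polarization estimates. For reasons discussed in [1], we define the polarization-estimation error to be the spherical distance between the two points $M$ and $\hat{M}$ on the Poincare sphere that represent the actual state of polarization $(\gamma, \eta)$ and the estimated state of polarization $(\hat{\gamma}, \hat{\eta})$, respectively. Let $\xi$ be the angular distance between $M$ and $\hat{M}$. Then [1]

$$\cos\xi = \cos 2\gamma \cos 2\hat{\gamma} + \sin 2\gamma \sin 2\hat{\gamma} \cos(\eta - \hat{\eta}) \tag{52}$$

where $\xi$ is always in the range $0 \le \xi \le \pi$. Applying the first-order approximation to the left side of (52) yields

$$\xi_k^2 = 4(\gamma_k - \hat{\gamma}_k)^2 + \sin^2(2\gamma_k)\,(\eta_k - \hat{\eta}_k)^2. \tag{53}$$

The asymptotic variances of the polarization estimates are obtained with (53) and the accuracy results on $\hat{\gamma}$ and $\hat{\eta}$ given below.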
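The polarization-error measure in (52) and its small-error form (53) are easy to compare numerically. The sketch below is illustrative only; the function names are hypothetical, and the clipping step is an added numerical safeguard that is not part of the paper.

```python
import numpy as np

def poincare_distance(gamma, eta, gamma_hat, eta_hat):
    """Angular distance on the Poincare sphere between two polarization states, per (52)."""
    c = (np.cos(2 * gamma) * np.cos(2 * gamma_hat)
         + np.sin(2 * gamma) * np.sin(2 * gamma_hat) * np.cos(eta - eta_hat))
    # Clip guards against rounding pushing the cosine slightly outside [-1, 1].
    return np.arccos(np.clip(c, -1.0, 1.0))

def poincare_distance_small_error(gamma, eta, gamma_hat, eta_hat):
    """Small-error approximation of the same distance, per (53)."""
    return np.sqrt(4 * (gamma - gamma_hat) ** 2
                   + np.sin(2 * gamma) ** 2 * (eta - eta_hat) ** 2)

exact = poincare_distance(0.6, 0.4, 0.601, 0.402)
approx = poincare_distance_small_error(0.6, 0.4, 0.601, 0.402)
```

For estimation errors of a few milliradians the two values agree closely, which is the regime in which (53) is used to convert the variances of $\hat{\gamma}$ and $\hat{\eta}$ into a polarization-error variance.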

Let

$$\boldsymbol{\tau} = [\boldsymbol{\theta}^T \;\; \boldsymbol{\gamma}^T \;\; \boldsymbol{\eta}^T]^T. \tag{54}$$

It follows from [5], [8] that the asymptotic (for large $N$) statistical distribution of $\hat{\boldsymbol{\tau}}$ is Gaussian, with mean $\boldsymbol{\tau}$ and covariance matrix equal to the corresponding stochastic Cramer-Rao bound (CRB). The $ij$th element of $\mathrm{CRB}^{-1}$ is given by

$$[\mathrm{CRB}^{-1}]_{ij} = \frac{2N}{\sigma^2}\, \mathrm{Re}\left[\mathrm{tr}\left\{\mathcal{A}_j^H \mathbf{P}^{\perp}_{\mathcal{A}} \mathcal{A}_i \mathbf{S} \mathcal{A}^H \mathbf{R}^{-1} \mathcal{A} \mathbf{S}\right\}\right] \tag{55}$$

where $\mathcal{A}_i = \partial \mathcal{A} / \partial \tau_i$ with $\tau_i$ denoting the $i$th element of $\boldsymbol{\tau}$.

V. NUMERICAL RESULTS

We present below several examples showing the performance of using the MODE algorithm with the COLD array and comparing the asymptotic statistical-performance analysis results with the Monte-Carlo simulation results. We compare MODE with MUSIC and NSF for both angle and polarization estimation. The simulation results were obtained by using 50 Monte-Carlo simulations. In the examples, we assume that there are $K = 2$ incident signals and both signals are assumed to have the same amplitude $E_k$, such that $|V_\theta E_k| = |V_\phi E_k| = 1$, $k = 1, 2$. Hence, the signal-to-noise ratio (SNR) used in the simulations is $-10 \log_{10} \sigma^2$ dB. The array is assumed to have $L = 8$ COLD pairs that are uniformly spaced with the spacing between two adjacent COLD pairs equal to a half wavelength. We also compare the estimation performance of using the COLD array with that of using a CCD array with the same array geometry. The CCD array consists of crossed y- and z-axes dipoles. The counterpart of (9) for the CCD array can be written as

$$\boldsymbol{\mu}_{\mathrm{CCD}} = \begin{bmatrix} L_{sd} \cos\gamma \cos\theta + L_{sd} \sin\gamma\, e^{j\eta} \sin\theta \cos\phi \\ -L_{sd} \sin\phi \sin\gamma\, e^{j\eta} \end{bmatrix}. \tag{56}$$

In the following examples, the antennas and the incident signals are assumed to be coplanar, i.e., $\phi = 90^\circ$, for both the COLD array and CCD array. For $\phi = 90^\circ$, (56) becomes

$$\boldsymbol{\mu}_{\mathrm{CCD}} = \begin{bmatrix} L_{sd} \cos\gamma \cos\theta \\ -L_{sd} \sin\gamma\, e^{j\eta} \end{bmatrix}. \tag{57}$$

(We remark that if the antennas and the incident signals are not coplanar, we will need two-dimensional CCD or COLD arrays for angle and polarization estimation, which is the case not considered herein. For this case, however, the COLD array will not always perform better than the CCD array.)

First, we present two examples that illustrate how the angle separation between the two incident signals affects both the direction and polarization estimates. We begin with the case of two signals with identical circular polarizations ($\alpha_1 = \alpha_2 = 45^\circ$). Fig. 2 shows the root-mean-squared errors (RMSEs) of the estimates of the first signal as a function of angle separation $\Delta\theta$ when two correlated signals with correlation coefficient 0.99 arrive at the array from angles $\theta_1 = -\Delta\theta/2$ and $\theta_2 = \Delta\theta/2$. We note that MODE performs better than MUSIC and NSF. Further, MODE achieves the best possible unbiased performance, i.e., the corresponding CRB, as the angle separation increases. Because the signals arrive from angles near the broadside of the arrays, the CRBs for the COLD and CCD arrays are similar. This case corresponds to small incident angles, which make $\boldsymbol{\mu}_{\mathrm{CCD}}$ similar to $\boldsymbol{\mu}_{\mathrm{COLD}}$, as may be seen by comparing (10) and (57).

Fig. 2. Root-mean-squared errors (RMSEs) of estimates versus $\Delta\theta$ for the first of the two signals when $\theta_1 = -\Delta\theta/2$, $\theta_2 = +\Delta\theta/2$, $\alpha_1 = \alpha_2 = 45^\circ$, $\beta_1 = \beta_2 = 0^\circ$, correlation coefficient 0.99, $N = 400$, and SNR = 10 dB (the CRBs for the CCD array nearly coincide with those for the COLD array): (a) Direction estimates and (b) polarization estimates. [Plots omitted. Curves: MUSIC with COLD, NSF with COLD, MODE with COLD, CRB for COLD. Horizontal axis: Angle Separation (deg).]

In Fig. 3, we consider the case where the signals with identical horizontal polarizations ($\alpha_1 = \alpha_2 = 0^\circ$, $\beta_1 = \beta_2 = 0^\circ$) arrive from angles away from the broadside of the array. In this case, the CCD array is outperformed by the COLD array. This result occurs because the signal outputs at the y-axis dipoles are attenuated by a factor of $\cos\theta$ for the CCD array [see (56)]. For the COLD array, however, the signal outputs at both the dipoles and the loops are independent of the incident angle $\theta$ [see (10)]. Note also that the RMSEs of the angle estimates first decrease and then increase even as the angle separation increases. This result occurs because the incident angle of the second signal approaches $90^\circ$ for very large $\Delta\theta$ and the RMSEs of angle estimates are approximately proportional to $1/\cos\theta_2$ for large $\Delta\theta$ [15]. We note again that MODE gives better performance than MUSIC and NSF and achieves the CRB as $\Delta\theta$ increases. We have also found that the CRBs for the COLD array are much lower than those for the CCD array, especially when $\theta$ approaches $90^\circ$.

Fig. 3. Root-mean-squared errors (RMSEs) of estimates versus $\Delta\theta$ for the second of the two signals when $\theta_1 = 50^\circ$, $\theta_2 = 50^\circ + \Delta\theta$, $\alpha_1 = \alpha_2 = 0^\circ$, $\beta_1 = \beta_2 = 0^\circ$, correlation coefficient 0.99, $N = 400$, and SNR = 10 dB: (a) Direction estimates and (b) polarization estimates. [Plots omitted. Curves: MUSIC with COLD, NSF with COLD, MODE with COLD, CRB for CCD, CRB for COLD. Horizontal axis: Angle Separation (deg).]

Second, we consider how the polarization separation affects the estimator performance. Consider the case where two incident signals with a correlation coefficient 0.99 arrive from angles $\theta_1 = 50^\circ$ and $\theta_2 = 70^\circ$. We assume that the corresponding ellipticity angles are $\alpha_1 = 45^\circ - \Delta\alpha$ and $\alpha_2 = 45^\circ$ and the orientation angles are $\beta_1 = \beta_2 = 0^\circ$. The polarization separation between the two polarization states is $2\Delta\alpha$. Fig. 4 shows the RMSEs of the direction and polarization estimates as a function of $\Delta\alpha$ for the second signal. Note that MODE performs better than MUSIC and NSF since the signals are highly correlated. Since the second signal arrives from a large angle away from the broadside of the array, the performance of the COLD array is again much better than that of the CCD array.

Fig. 4. Root-mean-squared errors (RMSEs) of estimates versus $\Delta\alpha$ for the second of the two signals when $\theta_1 = 50^\circ$, $\theta_2 = 70^\circ$, $\alpha_1 = 45^\circ - \Delta\alpha$, $\alpha_2 = 45^\circ$, $\beta_1 = \beta_2 = 0^\circ$, correlation coefficient 0.99, $N = 400$, and SNR = 10 dB: (a) Direction estimates and (b) polarization estimates. [Plots omitted. Curves: MUSIC with COLD, NSF with COLD, MODE with COLD, CRB for CCD, CRB for COLD. Horizontal axis: Polarization Separation (deg).]

Third, we consider how the correlation coefficient between the two incident signals affects both the direction and polarization estimates. Fig. 5 shows the RMSEs of the direction and polarization estimates as a function of the correlation coefficient. The two signals arrive from $\theta_1 = -6^\circ$ and $\theta_2 = 6^\circ$, and we have $\alpha_1 = \alpha_2 = 45^\circ$ and $\beta_1 = \beta_2 = 0^\circ$. We note that for highly correlated or coherent signals, the performance of MODE is much better than that of MUSIC and NSF.

Fig. 5. Root-mean-squared errors (RMSEs) of estimates versus source-correlation coefficient for the first of the two signals when $\theta_1 = -6^\circ$, $\theta_2 = 6^\circ$, $\alpha_1 = \alpha_2 = 45^\circ$, $\beta_1 = \beta_2 = 0^\circ$, $N = 400$, and SNR = 10 dB (the CRBs for the CCD array nearly coincide with those for the COLD array): (a) Direction estimates and (b) polarization estimates. [Plots omitted. Curves: MUSIC with COLD, NSF with COLD, MODE with COLD, CRB for COLD. Horizontal axis: Correlation Coefficient.]

Fourth, we consider the performance of the estimators as a function of SNR. In Fig. 6, we consider the case where $\theta_1 = 50^\circ$, $\theta_2 = 70^\circ$, $\alpha_1 = \alpha_2 = 0^\circ$, $\beta_1 = \beta_2 = 0^\circ$, correlation coefficient = 0.99, and $N = 400$. In this case, the advantage of MODE over MUSIC and NSF is increasingly evident at low SNR. We note again that the COLD array gives much lower CRB than the CCD array, especially for angle estimation.

Fig. 6. Root-mean-squared errors (RMSEs) of estimates versus SNR for the second of the two signals when $\theta_1 = 50^\circ$, $\theta_2 = 70^\circ$, $\alpha_1 = \alpha_2 = 0^\circ$, $\beta_1 = \beta_2 = 0^\circ$, correlation coefficient = 0.99, and $N = 400$: (a) Direction estimates and (b) polarization estimates. [Plots omitted. Curves: MUSIC with COLD, NSF with COLD, MODE with COLD, CRB for CCD, CRB for COLD. Horizontal axis: SNR (dB).]

Fifth, we consider the performance of the MODE estimator as a function of the number of snapshots $N$. We consider the case where $\theta_1 = 50^\circ$, $\theta_2 = 70^\circ$, $\alpha_1 = \alpha_2 = 45^\circ$, $\beta_1 = \beta_2 = 0^\circ$, correlation coefficient = 0.99, and SNR = 10 dB. It can be seen from Fig. 7 that although MODE is an asymptotically (for large $N$) statistically efficient estimator, MODE estimates achieve the CRB even for very moderate $N$. MODE again performs better than MUSIC and NSF.

Finally, we remark that although we make the Gaussian-noise assumption in the problem formulation, the Gaussian-noise assumption is not critical to the algorithms. As a matter of fact, the asymptotic distributional properties of all of the algorithms are most likely to remain the same, even in the non-Gaussian noise case. However, in the latter case, MODE is no longer asymptotically statistically efficient and the CRB given in (55) is no longer the true CRB matrix, but rather the asymptotic theoretical performance (TP) of MODE. Fig. 8 shows an example when the Gaussian noise assumption is violated. Fig. 8 is the same as Fig. 7, except that the noise is the contaminated Gaussian noise [16]. More specifically, the probability density function of $\mathbf{n}(t)$ has the form $(1 - \epsilon) N(\mathbf{0}, \sigma^2 \mathbf{I}) + \epsilon N(\mathbf{0}, 9\sigma^2 \mathbf{I})$, where $N(\mathbf{a}, \mathbf{B})$ denotes the Gaussian probability density function with mean $\mathbf{a}$ and covariance matrix $\mathbf{B}$. The SNR here is defined as $-10 \log_{10}[\sigma^2 (1 + 8\epsilon)]$ dB. We used $\epsilon = 0.7$ in the example. Note that MODE can also achieve its asymptotic theoretical performance for moderately large $N$ for this case and MODE has the best performance.

VI. CONCLUSION

We have presented a cocentered orthogonal loop and dipole (COLD) array. We have shown that with the COLD array, both angle and polarization estimation performance can be significantly improved as compared to using a similar crossed dipole array. We have also described how to use the asymptotically statistically efficient MODE algorithm for both angle and polarization estimation with the COLD array. Numerical examples have been given to show that MODE gives more accurate angle and polarization estimates than MUSIC and NSF, especially for highly correlated or coherent signals.
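The contaminated Gaussian noise used for Fig. 8 in Section V, and the SNR definition that goes with it, can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the mixture component is drawn once per snapshot so that the whole vector $\mathbf{n}(t)$ comes from either $N(\mathbf{0}, \sigma^2\mathbf{I})$ or $N(\mathbf{0}, 9\sigma^2\mathbf{I})$, which is one reading of the stated density.

```python
import numpy as np

def contaminated_gaussian_noise(m, n_snapshots, sigma2, eps, rng):
    """Draw an (m, n_snapshots) array of circular complex contaminated Gaussian noise.

    Each snapshot (column) has covariance sigma2*I with probability 1 - eps
    and 9*sigma2*I with probability eps.
    """
    inflated = rng.random(n_snapshots) < eps            # one Bernoulli draw per snapshot
    var = np.where(inflated, 9.0 * sigma2, sigma2)      # per-snapshot variance
    w = rng.standard_normal((m, n_snapshots)) + 1j * rng.standard_normal((m, n_snapshots))
    return np.sqrt(var / 2.0) * w                       # broadcasts the variance over rows

def snr_db(sigma2, eps):
    """SNR in dB for unit-power signals, using the average noise power sigma2*(1 + 8*eps)."""
    return -10.0 * np.log10(sigma2 * (1.0 + 8.0 * eps))

rng = np.random.default_rng(0)
noise = contaminated_gaussian_noise(16, 20000, 0.1, 0.7, rng)
mean_power = np.mean(np.abs(noise) ** 2)
```

The factor $1 + 8\epsilon$ is the average variance of the mixture, $(1 - \epsilon)\sigma^2 + 9\epsilon\sigma^2$, divided by $\sigma^2$, so the empirical mean power of the draws should approach $\sigma^2(1 + 8\epsilon)$.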

Fig. 7. Root-mean-squared errors (RMSEs) of estimates versus $N$ for the second of the two signals when $\theta_1 = 50^\circ$ and $\theta_2 = 70^\circ$, $\alpha_1 = \alpha_2 = 45^\circ$, $\beta_1 = \beta_2 = 0^\circ$, correlation coefficient 0.99, and SNR = 10 dB: (a) Direction estimates and (b) polarization estimates. [Plots omitted. Curves: MUSIC with COLD, NSF with COLD, MODE with COLD, CRB for CCD, CRB for COLD. Horizontal axis: N.]

Fig. 8. Root-mean-squared errors (RMSEs) of estimates versus $N$ for the second of the two signals in the presence of contaminated Gaussian noise when $\theta_1 = 50^\circ$ and $\theta_2 = 70^\circ$, $\alpha_1 = \alpha_2 = 45^\circ$, $\beta_1 = \beta_2 = 0^\circ$, correlation coefficient = 0.99, and SNR = 10 dB: (a) Direction estimates and (b) polarization estimates. [Plots omitted. Curves: MUSIC with COLD, NSF with COLD, MODE with COLD, TP for MODE with CCD, TP for MODE with COLD. Horizontal axis: N.]

APPENDIX

Proof of (27): Let $\mathbf{B}$ be an $L \times (L - K)$ matrix whose columns span the null space of $\mathbf{A}^H$. Also, let a $(2L - K) \times 2L$ matrix $\boldsymbol{\Delta}^H$ be defined as

$$\boldsymbol{\Delta}^H = \begin{bmatrix} \mathbf{B}^H \otimes \mathbf{I} \\ \mathbf{V}^H (\mathbf{A}^{\dagger} \otimes \mathbf{I}) \end{bmatrix}. \tag{58}$$

We first show that the columns of the matrix $\boldsymbol{\Delta}$ span the null space of $\mathcal{A}^H$. Since $\mathbf{B}^H \mathbf{A} = \mathbf{0}$ and $\mathbf{v}_k^H \mathbf{u}_k = 0$, we obtain

$$\boldsymbol{\Delta}^H \mathcal{A} = \begin{bmatrix} (\mathbf{B}^H \mathbf{A} \otimes \mathbf{I}) \mathbf{U} \\ \mathbf{V}^H (\mathbf{A}^{\dagger} \mathbf{A} \otimes \mathbf{I}) \mathbf{U} \end{bmatrix} = \begin{bmatrix} \mathbf{0} \\ \mathbf{V}^H \mathbf{U} \end{bmatrix} = \mathbf{0}. \tag{59}$$

This result shows that the columns of $\boldsymbol{\Delta}$ belong to $\mathcal{N}(\mathcal{A}^H)$. Moreover

$$\boldsymbol{\Delta}^H \boldsymbol{\Delta} = \begin{bmatrix} (\mathbf{B}^H \mathbf{B}) \otimes \mathbf{I} & (\mathbf{B}^H \mathbf{A}^{\dagger H} \otimes \mathbf{I}) \mathbf{V} \\ \mathbf{V}^H (\mathbf{A}^{\dagger} \mathbf{B} \otimes \mathbf{I}) & \mathbf{V}^H [(\mathbf{A}^H \mathbf{A})^{-1} \otimes \mathbf{I}] \mathbf{V} \end{bmatrix} = \begin{bmatrix} (\mathbf{B}^H \mathbf{B}) \otimes \mathbf{I} & \mathbf{0} \\ \mathbf{0} & \mathbf{V}^H [(\mathbf{A}^H \mathbf{A})^{-1} \otimes \mathbf{I}] \mathbf{V} \end{bmatrix}. \tag{60}$$

Thus $\boldsymbol{\Delta}$ has full column rank $2(L - K) + K = 2L - K$. Since

$$\dim[\mathcal{N}(\mathcal{A}^H)] = 2L - K \tag{61}$$

it follows that the columns of $\boldsymbol{\Delta}$ span the entire null space $\mathcal{N}(\mathcal{A}^H)$. Hence

$$\mathbf{P}^{\perp}_{\mathcal{A}} = \mathbf{P}_{\boldsymbol{\Delta}} = \boldsymbol{\Delta} (\boldsymbol{\Delta}^H \boldsymbol{\Delta})^{-1} \boldsymbol{\Delta}^H = (\mathbf{B} \otimes \mathbf{I}) [(\mathbf{B}^H \mathbf{B})^{-1} \otimes \mathbf{I}] (\mathbf{B}^H \otimes \mathbf{I}) + (\mathbf{A}^{\dagger H} \otimes \mathbf{I}) \mathbf{V} \left\{\mathbf{V}^H [(\mathbf{A}^H \mathbf{A})^{-1} \otimes \mathbf{I}] \mathbf{V}\right\}^{-1} \mathbf{V}^H (\mathbf{A}^{\dagger} \otimes \mathbf{I}). \tag{62}$$

Since $\mathbf{P}_{\mathbf{B} \otimes \mathbf{I}} = \mathbf{P}^{\perp}_{\mathbf{A} \otimes \mathbf{I}}$, we conclude that (27) must hold true.
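The projector decomposition proved above can be checked numerically. The sketch below is an illustration rather than part of the proof: it uses a random full-rank matrix as a stand-in for the $L \times K$ phase matrix (the identity only needs full column rank), random complex polarization parameters $r_k$, and a hypothetical helper `proj`.

```python
import numpy as np

rng = np.random.default_rng(0)
L, K = 4, 2

# Stand-in for the L x K matrix of spatial phase factors (any full-rank matrix works here).
A = rng.standard_normal((L, K)) + 1j * rng.standard_normal((L, K))
r = rng.standard_normal(K) + 1j * rng.standard_normal(K)

I2 = np.eye(2)
U = np.zeros((2 * K, K), dtype=complex)   # block diagonal with u_k = [1, r_k]^T
V = np.zeros((2 * K, K), dtype=complex)   # block diagonal with v_k = [-conj(r_k), 1]^T
for k in range(K):
    U[2 * k:2 * k + 2, k] = [1.0, r[k]]
    V[2 * k:2 * k + 2, k] = [-np.conj(r[k]), 1.0]

def proj(X):
    """Orthogonal projector onto the column space of X."""
    return X @ np.linalg.solve(X.conj().T @ X, X.conj().T)

A_full = np.kron(A, I2) @ U               # the 2L x K matrix of (18)
A_pinv = np.linalg.pinv(A)                # equals (A^H A)^{-1} A^H for full column rank

lhs = np.eye(2 * L) - proj(A_full)
rhs = (np.eye(2 * L) - proj(np.kron(A, I2))) + proj(np.kron(A_pinv.conj().T, I2) @ V)
max_abs_error = np.max(np.abs(lhs - rhs))
```

The two sides agree to rounding error because the columns of $(\mathbf{A}^{\dagger H} \otimes \mathbf{I})\mathbf{V}$ lie in the range of $\mathbf{A} \otimes \mathbf{I}$ and are orthogonal to the columns of the full $2L \times K$ matrix, which is exactly what $\mathbf{v}_k^H \mathbf{u}_k = 0$ provides.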

REFERENCES

[1] J. Li and R. T. Compton, Jr., "Angle and polarization estimation using ESPRIT with a polarization sensitive array," IEEE Trans. Antennas Propagat., vol. 39, pp. 1376-1383, Sept. 1991.
[2] Y. Hua, "A pencil-MUSIC algorithm for finding two-dimensional angles and polarizations using crossed dipoles," IEEE Trans. Antennas Propagat., vol. 41, pp. 370-376, Mar. 1993.
[3] J. Li, "Direction and polarization estimation using arrays with small loops and short dipoles," IEEE Trans. Antennas Propagat., vol. 41, pp. 379-387, Mar. 1993.
[4] J. Li and P. Stoica, "Efficient parameter estimation of partially polarized electromagnetic waves," IEEE Trans. Signal Processing, vol. 42, pp. 3114-3125, Nov. 1994.
[5] P. Stoica and K. C. Sharman, "Maximum likelihood methods for direction-of-arrival estimation," IEEE Trans. Acoust., Speech, Signal Processing, vol. 38, pp. 1132-1143, July 1990.
[6] P. Stoica and K. C. Sharman, "Novel eigenanalysis method for direction estimation," in IEE Proc., Pt. F, vol. 137, Feb. 1990, pp. 19-26.
[7] A. Swindlehurst and M. Viberg, "Subspace fitting with diversely polarized antenna arrays," IEEE Trans. Antennas Propagat., vol. 41, pp. 1687-1694, Dec. 1993.
[8] B. Ottersten, M. Viberg, P. Stoica, and A. Nehorai, "Exact and large sample ML techniques for parameter estimation and detection in array processing," in Radar Array Processing, ch. 4, S. Haykin, J. Litva, and T. J. Shepherd, Eds. New York: Springer-Verlag, 1993.
[9] C. A. Balanis, Antenna Theory: Analysis and Design. New York: Harper & Row, 1982.
[10] G. A. Deschamps, "Geometrical representation of the polarization of a plane electromagnetic wave," in Proc. IRE, May 1951, vol. 39, pp. 540-544.
[11] R. C. Johnson and H. Jasik, Antenna Engineering Handbook. New York: McGraw-Hill, 1984.
[12] M. Viberg and B. Ottersten, "Sensor array processing based on subspace fitting," IEEE Trans. Acoust., Speech, Signal Processing, vol. 39, pp. 1110-1121, May 1991.
[13] M. Wax and T. Kailath, "Detection of signals by information theoretic criteria," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, pp. 387-392, Apr. 1985.
[14] E. Ferrara, Jr. and T. Parks, "Direction finding with an array of antennas having diverse polarizations," IEEE Trans. Antennas Propagat., vol. AP-31, pp. 231-236, Mar. 1983.
[15] P. Stoica and A. Nehorai, "MUSIC, maximum likelihood, and Cramer-Rao bound," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 720-741, May 1989.
[16] P. J. Huber, Robust Statistics. New York: Wiley, 1981.

Upper Bounds on the Bit-Error Rate of Optimum Combining in Wireless Systems

Jack H. Winters, Fellow, IEEE, and Jack Salz, Member, IEEE

Abstract- This paper presents upper bounds on the bit-error rate (BER) of optimum combining in wireless systems with multiple cochannel interferers in a Rayleigh fading environment. We present closed-form expressions for the upper bound on the bit-error rate with optimum combining, for any number of antennas and interferers, with coherent detection of BPSK and QAM signals, and differential detection of DPSK. We also present bounds on the performance gain of optimum combining over maximal ratio combining. These bounds are asymptotically tight with decreasing BER, and results show that the asymptotic gain is within 2 dB of the gain as determined by computer simulation for a variety of cases at a $10^{-3}$ BER. The closed-form expressions for the bound permit rapid calculation of the improvement with optimum combining for any number of interferers and antennas, as compared with the CPU hours previously required by Monte Carlo simulation. Thus these bounds allow calculation of the performance of optimum combining under a variety of conditions where it was not possible previously, including analysis of the outage probability with shadow fading and the combined effect of adaptive arrays and dynamic channel assignment in mobile radio systems.

Index Terms- Bit-error rate, optimum combining, Rayleigh fading, smart antennas.

I. INTRODUCTION

Antenna arrays with optimum combining combat multipath fading of the desired signal and suppress interfering signals, thereby increasing both the performance and capacity of wireless systems. With optimum combining, the received signals are weighted and combined to maximize the signal-to-interference-plus-noise ratio (SINR) at the receiver. Optimum combining yields superior performance over maximal ratio combining, whereby the signals are combined to maximize signal-to-noise ratio, in interference-limited systems. However, while with maximal ratio combining the bit-error rate can be expressed in closed form [1], with optimum combining a closed-form expression is available only with one interferer [2], [3]. With multiple interferers, Monte Carlo simulation has been used [3]-[5], but this requires on the order of CPU hours even with just a few interferers. Thus the improvement of optimum combining has only been studied for a few simple cases, and detailed comparisons (e.g., in terms of outage probability) have not been done.

In [6], we showed that, with $M$ antenna elements, the received signals can be combined to eliminate $L$ ($L < M$) interferers in the output signal while obtaining an $M - L$ diversity improvement, i.e., the performance of maximal ratio combining with $M - L$ antennas and no interference. However, this "zero-forcing" solution gives far lower output SINR than optimum combining in most cases of interest and cannot be used when $L \ge M$.

In this paper we present a closed-form expression for the upper bound on the bit-error rate (BER) with optimum combining in wireless systems. We assume flat fading across the channel and independent Rayleigh fading of the desired and interfering signals at each antenna.¹ Equations are presented for the upper bound on the BER for coherent detection of quadrature amplitude modulated (QAM) and binary phase-shift-keyed (BPSK) signals, and for differential detection of differential phase-shift-keyed (DPSK) signals. From these equations, a lower bound on the improvement of optimum combining over maximal ratio combining is derived. In Section II we derive the upper bound on the BER. In Section III we compare the upper bound to Monte Carlo simulation results. A summary and conclusions are presented in Section IV.

Paper approved by N. C. Beaulieu, the Editor for Wireless Communication Theory of the IEEE Communications Society. Manuscript received September 21, 1993; revised November 28, 1996. This paper was presented in part at the 1994 IEEE Vehicular Technology Conference, Stockholm, Sweden, June 8-10, 1994. J. H. Winters is with AT&T Labs-Research, Red Bank, NJ 07701 USA. J. Salz, retired, was with AT&T Labs-Research, Crawford Hill Laboratory, Holmdel, NJ 07733 USA. Publisher Item Identifier S 0090-6778(98)09388-X.

Fig. 1. Block diagram of an $M$-element adaptive array.

II. UPPER BOUND DERIVATION

Fig. 1 shows a block diagram of an $M$-element adaptive array. The complex baseband signal received by the $i$th antenna element in the $k$th symbol interval, $x_i(k)$, is multiplied by a controllable complex weight $w_i$, and the weighted signals are summed to form the array output signal $s_o(k)$.

¹ As shown in [7], the gain of optimum combining is not significantly degraded with fading correlation up to about 0.5. Thus our bounds, based on independent fading, are reasonably accurate and useful even in environments with fading correlation up to this level.

Reprinted from IEEE Transactions on Communications, Vol. 46, No. 12, pp. 1619-1624, December 1998.
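As a rough illustration of the weight-and-sum structure in Fig. 1, the sketch below forms $s_o(k) = \sum_i w_i x_i(k)$ for a hypothetical two-element array. The channel gains and the conjugate (co-phasing) weights are assumptions chosen only to make the example concrete; they correspond to maximal ratio combining without interference, not to the optimum combining weights analyzed in this paper.

```python
import cmath
import math


def array_output(weights, samples):
    """Return the array output s_o(k) = sum_i w_i * x_i(k)."""
    if len(weights) != len(samples):
        raise ValueError("need exactly one weight per antenna element")
    return sum(w * x for w, x in zip(weights, samples))


# Hypothetical two-element example: one unit symbol arrives through two
# different complex channel gains (no noise or interference is modeled).
channel = [cmath.rect(1.0, math.radians(30)), cmath.rect(0.5, math.radians(-80))]
symbol = 1 + 0j
samples = [h * symbol for h in channel]

# Conjugate weights co-phase the branches (an assumption for illustration).
weights = [h.conjugate() for h in channel]
s_o = array_output(weights, samples)
```

With these weights the branch contributions add in phase, so the output equals the sum of the squared channel magnitudes times the symbol.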

With optimum combining, the weights are chosen to maximize the output SINR, which also minimizes the mean-square error (MSE), which is given by [8]

$$\mathrm{MSE} = \left(1 + \mathbf{u}_d^\dagger \mathbf{R}_{nn}^{-1} \mathbf{u}_d\right)^{-1} \tag{1}$$

where $\mathbf{R}_{nn}$ is the received interference-plus-noise correlation matrix given by

$$\mathbf{R}_{nn} = \sigma^2 \mathbf{I} + \sum_{j=1}^{L} \mathbf{u}_j \mathbf{u}_j^\dagger \tag{2}$$

$\sigma^2$ is the noise power, $\mathbf{I}$ is the identity matrix, $\mathbf{u}_d$ and $\mathbf{u}_j$ are the desired and $j$th interfering signal propagation vectors, respectively, and the superscript $\dagger$ denotes complex conjugate transpose. Here we have assumed the same average received power for the desired signal at each antenna (that is, microdiversity rather than macrodiversity) and that the noise and interfering signals are uncorrelated, and without loss of generality, have normalized the received signal power, averaged over the fading, to 1. Note that the MSE varies at the fading rate.

For coherent detection of BPSK or QAM, the BER is bounded by [9]

$$P_e \le e^{1/\sigma_s^2}\, E\!\left[e^{-1/\mathrm{MSE}}\right] = e^{(1/\sigma_s^2) - 1}\, E\!\left[e^{-\mathbf{u}_d^\dagger \mathbf{R}_{nn}^{-1} \mathbf{u}_d}\right] \tag{3}$$

where now the expected value is taken over the fading parameters of the desired and interfering signals, and $\sigma_s^2$ is the variance of the BPSK or QAM symbol levels (e.g., $\sigma_s^2 = 1$ and 2 for BPSK and quaternary phase-shift keying (QPSK), respectively). For differential detection of DPSK, assuming Gaussian noise and interference,² the BER is given by [1]

$$P_e = \frac{1}{2}\, E\!\left[e^{-\mathbf{u}_d^\dagger \mathbf{R}_{nn}^{-1} \mathbf{u}_d}\right]. \tag{4}$$

Thus the BER expression for both cases differs only by a constant, and we will now consider the term $E[e^{-\mathbf{u}_d^\dagger \mathbf{R}_{nn}^{-1} \mathbf{u}_d}]$. As shown in the Appendix, this term can be upper-bounded by

$$E\!\left[e^{-\mathbf{u}_d^\dagger \mathbf{R}_{nn}^{-1} \mathbf{u}_d}\right] = E\!\left[\prod_{n=1}^{M} \frac{1}{1 + 1/\lambda_n}\right] \le E\!\left[\prod_{n=1}^{M} \lambda_n\right] = E\big[|\mathbf{R}_{nn}|\big] \tag{5}$$

where $|\mathbf{R}_{nn}|$ denotes the determinant of $\mathbf{R}_{nn}$, and $\lambda_n$ is the $n$th eigenvalue of $\mathbf{R}_{nn}$.

Since (5) is the key inequality in our bound (and is the only inequality we use in determining the bound for differential detection of DPSK), let us examine its accuracy. The bound is tight if $\lambda_n \ll 1$, and since the $\lambda_n$'s are proportional to the interference signal powers, the bound is tight for large received SINR, i.e., low BER's. Although for all cases $(1 + (1/\lambda_n))^{-1} < 1$ and thus BER < 0.5, for $\lambda_n > 1$ the BER as given by the bound may exceed 0.5. Thus with small received SINR, occasionally BER's greater than 0.5 may be averaged into the average BER, reducing the tightness of the bound. Also, note that with only noise at the receiver, $\lambda_n = \sigma_n^2$, where $\sigma_n^2$ is the variance of the noise normalized to the received desired signal power, and from (4) and (5)

$$P_e < \frac{(\sigma_n^2)^M}{2} = \frac{1}{2\rho^M} \tag{6}$$

where $\rho$ is the received SINR, while the actual BER is $1/\big(2(1 + \rho)^M\big)$ [1]. Thus even without interference, the bound differs from the actual BER, and this difference increases as the received SINR decreases.

Let us consider the case of interference only. In this case, $|\mathbf{R}_{nn}|$, which is given by (2), may also be expressed as

$$|\mathbf{R}_{nn}| = |\mathbf{Q}^\dagger \mathbf{Q}| = \sum \pm\, \mathbf{D}_1^\dagger \mathbf{D}_{m_1} \mathbf{D}_2^\dagger \mathbf{D}_{m_2} \cdots \mathbf{D}_M^\dagger \mathbf{D}_{m_M} \tag{7}$$

where $\mathbf{Q} = (\mathbf{D}_1, \cdots, \mathbf{D}_M)$, $\mathbf{D}_m = \big((\mathbf{u}_1)_m \cdots (\mathbf{u}_L)_m\big)^T$, $(\mathbf{u}_j)_m$ is the $m$th element of $\mathbf{u}_j$, the sum is extended over all $M!$ permutations of the $\mathbf{D}_m$'s, $\mathbf{D}_{m_i}$ is the $i$th element of the permutation of the $\mathbf{D}_m$'s, the "+" sign is assigned for even permutations (i.e., an even number of swaps of $\mathbf{D}_m$'s in the permutation), and the "-" sign for odd permutations. Now

$$E\big[\mathbf{D}_m^\dagger \mathbf{D}_m\big] = \sum_{j=1}^{L} \sigma_j^2 \tag{8}$$

where $\sigma_j^2$ is the average power of the $j$th interferer normalized to the desired signal power, and

$$E\big[\mathbf{D}_m^\dagger \mathbf{D}_n \mathbf{D}_n^\dagger \mathbf{D}_m\big] = \sum_{j=1}^{L} \sigma_j^4. \tag{9}$$

Similarly, from (7), it can be shown that

$$E\big[|\mathbf{Q}^\dagger \mathbf{Q}|\big] = \sum_q a_q^{(M)} \left(\sum_{j=1}^{L} \sigma_j^2\right)^{M - \sum_k i_k l_k} \left(\sum_{j=1}^{L} \sigma_j^{2 i_1}\right)^{l_1} \left(\sum_{j=1}^{L} \sigma_j^{2 i_2}\right)^{l_2} \cdots \tag{10}$$

where the sum is over all sets of positive integers $i_k$ and $l_k$ that exist such that $M \ge \cdots > i_2 > i_1$, with $\sum_k i_k l_k \le M$. For example, when $M = 5$, there are 6 sets of $\{i_k, l_k\}$ such that $\sum_k i_k l_k \le M$ (see Table I). All sets are of the form $\{i_1, l_1\}$, e.g., $\{i_1 = 3, l_1 = 1\}$ for $3 \cdot 1 < 5$, except for the set $\{i_1 = 2, l_1 = 1, i_2 = 3, l_2 = 1\}$ for $2 \cdot 1 + 3 \cdot 1 = 5$. $a_q^{(M)}$ is an integer coefficient corresponding to the $q$th set with $M$ antennas. Note that $a_q^{(M)}$ is obtained by summing the coefficients ($\pm 1$'s) for similar terms in $E[|\mathbf{Q}^\dagger \mathbf{Q}|]$. $a_q^{(M)}$ can be determined as shown below. Since $\sum_{j=1}^{L} \sigma_j^2 = 1/\rho$, and $a_q^{(M)} = 1$ when $\sum_k i_k l_k = 0$, (10) can also be expressed as

$$E\big[|\mathbf{Q}^\dagger \mathbf{Q}|\big] = \rho^{-M}\left[1 + \sum_q a_q^{(M)} \left(\sum_{j=1}^{L} (\rho \sigma_j^2)^{i_1}\right)^{l_1} \left(\sum_{j=1}^{L} (\rho \sigma_j^2)^{i_2}\right)^{l_2} \cdots\right] \tag{11}$$

² Since the stronger the interference, the more that optimum combining suppresses it, with the Gaussian assumption we overestimate the probability of strong interference. Note that this is consistent with the derivation of an upper bound on the BER.

where now $M \ge \cdots > i_2 > i_1 > 1$.

TABLE I
VALUES OF $a_q^{(M)}$ FOR $M = 2$ TO 5

M    i1   l1   i2   l2   a_q^(M)
2    2    1    -    -    -1
3    2    1    -    -    -3
3    3    1    -    -    +2
4    2    1    -    -    -6
4    2    2    -    -    +3
4    3    1    -    -    +8
4    4    1    -    -    -6
5    2    1    -    -    -10
5    2    2    -    -    +15
5    3    1    -    -    +20
5    2    1    3    1    -20
5    4    1    -    -    -30
5    5    1    -    -    +24

TABLE II
VALUES OF $a_q^{(M)}$ FOR $M = 6$ AND 7

M    i1   l1   i2   l2   a_q^(M)
6    2    1    -    -    -15
6    2    2    -    -    +45
6    2    3    -    -    -15
6    3    1    -    -    +40
6    3    2    -    -    +40
6    4    1    -    -    -90
6    5    1    -    -    +144
6    6    1    -    -    -120
6    2    1    3    1    -120
6    2    1    4    1    +90
7    2    1    -    -    -21
7    2    2    -    -    +105
7    2    3    -    -    -105
7    3    1    -    -    +70
7    3    2    -    -    +280
7    4    1    -    -    -210
7    5    1    -    -    +504
7    6    1    -    -    -840
7    7    1    -    -    +720
7    2    1    3    1    -420
7    2    1    4    1    +630
7    2    1    5    1    -504
7    3    1    4    1    -420
7    2    2    3    1    +210

To determine the $a_q^{(M)}$'s, first note that if $\sigma_j^2 = \sigma^2$, $j = 1, \cdots, L$, then $\sum_{j=1}^{L} \sigma_j^{2k} = L\sigma^{2k}$, and (11) becomes

$$E\big[|\mathbf{Q}^\dagger \mathbf{Q}|\big] = \left(L^M + \sum_{k=2}^{M} \beta_k L^{M-k+1}\right)\sigma^{2M} \tag{12}$$

where the $\beta_k$'s and the $a_q^{(M)}$'s can be seen to be closely related. From [6], $P_e = 0$ for $L < M$, and thus the $\beta_k$'s are the coefficients of the $M$th-order polynomial in $L$, $L(L - 1)(L - 2) \cdots (L - M + 1)$. This result is not only useful when all interferers have equal power, but also serves as a consistency check on our calculated values of $a_q^{(M)}$.

The values of $a_q^{(M)}$ were generated using a computer program to examine every permutation in (7) for given $M$. The number of each type of $i_1, l_1, i_2, l_2, \ldots$ term was calculated to determine $a_q^{(M)}$. Tables I and II list these values for $M = 2$-7. Note that only $i_1$ and $l_1$ terms exist for $M \le 4$ and $i_2$ and $l_2$ terms also exist for $M \ge 5$. Values of $a_q^{(M)}$ for higher $M$ can also be easily calculated. However, since the amount of computer time to generate the values of $a_q^{(M)}$ increases exponentially with $M$, our program could only generate these values in a reasonable amount of computer time for up to $M = 10$ (where a hundred CPU hours on a SPARCstation 20 would be required).

From (3), the upper bound on the BER with coherent detection of BPSK or QAM is now given by

$$P_e \le e^{(1/\sigma_s^2) - 1} \rho^{-M}\left[1 + \sum_q a_q^{(M)} \left(\sum_{j=1}^{L} (\rho \sigma_j^2)^{i_1}\right)^{l_1} \left(\sum_{j=1}^{L} (\rho \sigma_j^2)^{i_2}\right)^{l_2} \cdots\right] \tag{13}$$

and from (4), the upper bound on the BER with differential detection of DPSK is given by

$$P_e \le \frac{1}{2} \rho^{-M}\left[1 + \sum_q a_q^{(M)} \left(\sum_{j=1}^{L} (\rho \sigma_j^2)^{i_1}\right)^{l_1} \left(\sum_{j=1}^{L} (\rho \sigma_j^2)^{i_2}\right)^{l_2} \cdots\right]. \tag{14}$$

For the case of noise with $L$ interferers, consider the noise as an infinite number of weak interferers with total power equal to the noise. That is, let

$$\sigma_j^2 = \frac{\sigma_n^2}{K - L}, \qquad j = L + 1, \cdots, K. \tag{15}$$

Then, as $K \to \infty$,

$$\lim_{K \to \infty} \sum_{j=L+1}^{K} \sigma_j^{2 i_k} = 0 \tag{16}$$

for $i_k > 1$. Therefore, with noise, the BER bound is the same as in (13) and (14), but with $\rho$ including the noise. In this case, if we define the received desired signal-to-noise ratio as $\Gamma_d = \sigma_n^{-2}$ and the $j$th interferer signal-to-noise ratio as $\Gamma_j = \sigma_j^2 / \sigma_n^2$, so that $\rho = \Gamma_d / (1 + \sum_{j=1}^{L} \Gamma_j)$ and $\rho \sigma_j^2 = \Gamma_j / (1 + \sum_{i=1}^{L} \Gamma_i)$, then (14) becomes [similarly for (13)]

$$P_e \le \frac{1}{2\rho^M}\left[1 + \sum_q a_q^{(M)} \left(\sum_{j=1}^{L} \left(\frac{\Gamma_j}{1 + \sum_{i=1}^{L} \Gamma_i}\right)^{i_1}\right)^{l_1} \left(\sum_{j=1}^{L} \left(\frac{\Gamma_j}{1 + \sum_{i=1}^{L} \Gamma_i}\right)^{i_2}\right)^{l_2} \cdots\right]. \tag{17}$$

Since $1/(2\rho^M)$ is the bound with maximal ratio combining, the term in the brackets is the improvement of optimum combining over maximal ratio combining based on the BER bound. Defining the gain of optimum combining as the reduction in the required $\rho$ for a given BER, from (17), this gain in decibels is given by

$$\text{Gain (dB)} = -\frac{10}{M} \log_{10}\left[1 + \sum_q a_q^{(M)} \left(\sum_{j=1}^{L} \left(\frac{\Gamma_j}{1 + \sum_{i=1}^{L} \Gamma_i}\right)^{i_1}\right)^{l_1} \left(\sum_{j=1}^{L} \left(\frac{\Gamma_j}{1 + \sum_{i=1}^{L} \Gamma_i}\right)^{i_2}\right)^{l_2} \cdots\right]. \tag{18}$$

This gain is therefore independent of the desired signal power (because the bound is asymptotically tight as $\rho \to \infty$). However, this is the gain of the BER bound with optimum combining over the BER bound with maximal ratio combining. Since the required $\rho$ for a given BER with maximal ratio combining is less than the bound, the true gain may differ from (18) and to obtain a bound on the gain, the gain in (18) must be reduced accordingly. For example, with differential detection of DPSK, to obtain a bound the gain given in (18) is reduced by the factor $(\rho/(1 + \rho))^M$. Note that as $\rho \to \infty$, this factor reduces to one and the gain approaches (18). Thus we will refer to (18) as the asymptotic gain.

Fig. 2. Gain versus BER for coherent detection of BPSK: comparison of analytical results to the asymptotic gain.
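The permutation count that produces the coefficients $a_q^{(M)}$ can be sketched in a few lines. The sketch rests on one reading of (7)-(10), stated here as an assumption: each permutation contributes its sign, and a cycle of length $i$ in the permutation produces one factor $\sum_j \sigma_j^{2i}$, so $a_q^{(M)}$ is the signed number of permutations with $l_k$ cycles of length $i_k$. The function names are hypothetical, and this is not the authors' program.

```python
from collections import Counter
from itertools import permutations


def cycle_lengths(perm):
    """Return the cycle lengths of a permutation given as a tuple of indices."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if not seen[start]:
            n, j = 0, start
            while not seen[j]:
                seen[j] = True
                j = perm[j]
                n += 1
            lengths.append(n)
    return lengths


def signed_cycle_type_counts(m):
    """Map ((i_1, l_1), (i_2, l_2), ...) -> signed permutation count for M = m.

    Only cycles longer than 1 appear in the key; the empty key () is the
    identity permutation, whose coefficient is 1.
    """
    counts = Counter()
    for perm in permutations(range(m)):
        lengths = cycle_lengths(perm)
        sign = (-1) ** (m - len(lengths))  # parity of the permutation
        nontrivial = Counter(n for n in lengths if n > 1)
        counts[tuple(sorted(nontrivial.items()))] += sign
    return dict(counts)


def equal_power_polynomial(m):
    """Collect the coefficients by power of L for equal-power interferers.

    Each power sum contributes one factor of L, which gives the
    consistency check against L(L-1)...(L-M+1) described in the text.
    """
    poly = Counter()
    for key, a in signed_cycle_type_counts(m).items():
        used = sum(i * l for i, l in key)
        poly[(m - used) + sum(l for _, l in key)] += a
    return dict(poly)
```

Under this reading, `signed_cycle_type_counts(5)` gives the six nontrivial entries listed for $M = 5$ in Table I, and `equal_power_polynomial(4)` gives the coefficients of $L(L-1)(L-2)(L-3)$. The run time grows as $M!$, which matches the remark that exhaustive enumeration becomes expensive near $M = 10$.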

III. COMPARISON TO EXACT THEORY AND SIMULATION

In this section, we compare the bound to theoretical results for $L = 1$ and simulation results for $L \ge 2$. Fig. 2 compares theoretical results (from [1]-[3]) for the gain to the asymptotic gain (18) versus BER with coherent detection of BPSK. Results are generated for $M = 2$ and 5, and $\Gamma_1 = 3$ and 10 dB. In all cases the gain monotonically decreases to the asymptotic gain as the BER decreases. The gain approaches the asymptotic gain more slowly with decreasing BER for larger $M$ and also, at low BER's, the accuracy of the asymptotic gain decreases with higher $\Gamma_1$. Thus the accuracy of the asymptotic gain decreases as the $\rho$ required for a given BER with optimum combining decreases, as predicted by the approximation in Section II.

Fig. 3 compares theoretical and Monte Carlo simulation [5] results for the gain to the asymptotic gain with $M = 2$ and $L = 1$, 2, and 6. Results are plotted versus $\Gamma_j$, where all $L$ interferers have equal power, for coherent detection of BPSK at a $10^{-3}$ BER.³ In all cases, the asymptotic gain has the same shape as the gain and is within 1.7 dB for $L = 1$, 1.0 dB for $L = 2$, and 0.4 dB for $L = 6$. Since optimum combining gives the largest gain when the interference power is concentrated in one interferer and the least gain when the interference power is equally divided among many interferers, $L = 1$ and $L = 6$ represent the best and worst cases for the gain in an interference-limited cellular system. Thus from the results in Fig. 3, we would expect the asymptotic gain to be within 0.4-1.7 dB of the actual gain for all cases in cellular systems with $M = 2$.

Fig. 3. Gain with $M = 2$ for 1, 2, and 6 equal-power interferers versus signal-to-noise ratio of each interferer: comparison of analytical and Monte Carlo simulation results with coherent detection of BPSK [5] to the asymptotic gain.

³ This BER was used because the results in [5] were obtained for this BER. As shown in [5], the gain does not change significantly for BER's between $10^{-2}$ and $10^{-3}$, the range of interest in most mobile radio systems.
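A small sketch can show how an asymptotic gain of the kind plotted in Fig. 3 might be evaluated numerically. It assumes the bracketed term of the bound is the signed sum over permutations in which every cycle longer than 1 contributes $\sum_j x_j^i$ with $x_j = \Gamma_j / (1 + \sum_i \Gamma_i)$, while noise (treated as infinitely many weak interferers) leaves the length-1 cycles equal to 1. The function names and this normalization are assumptions based on that reading of (15)-(18), not values taken from the paper.

```python
import math
from itertools import permutations


def cycle_lengths(perm):
    """Return the cycle lengths of a permutation given as a tuple of indices."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if not seen[start]:
            n, j = 0, start
            while not seen[j]:
                seen[j] = True
                j = perm[j]
                n += 1
            lengths.append(n)
    return lengths


def bracket_term(m, x):
    """Evaluate 1 + sum_q a_q * prod_k (sum_j x_j**i_k)**l_k by enumeration.

    x holds the normalized interferer powers rho * sigma_j^2 (assumed form).
    """
    total = 0.0
    for perm in permutations(range(m)):
        lengths = cycle_lengths(perm)
        sign = (-1) ** (m - len(lengths))
        term = 1.0
        for n in lengths:
            if n > 1:  # length-1 cycles contribute the total power, i.e. 1
                term *= sum(xj ** n for xj in x)
        total += sign * term
    return total


def asymptotic_gain_db(m, interferer_snrs_db):
    """Asymptotic gain -(10/M) log10(bracket) for interferer SNRs in dB."""
    gammas = [10.0 ** (g / 10.0) for g in interferer_snrs_db]
    denom = 1.0 + sum(gammas)
    x = [g / denom for g in gammas]
    return -10.0 / m * math.log10(bracket_term(m, x))
```

For example, `asymptotic_gain_db(2, [10.0] * 6)` evaluates the sketch for six equal-power interferers at 10 dB each with two antennas; how closely such numbers track the published curves depends on the assumptions above.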

Now, consider the lower bound on the gain obtained from the BER bound (17), as compared to the asymptotic gain. Without interference, differential detection of DPSK with maximal ratio combining and $M = 2$ requires $\rho \approx 13.3$ dB (theoretically [10]) for a $10^{-3}$ BER, while the BER bound (17) gives $\rho \approx 13.5$ dB. Thus the lower bound on the gain (from (17)) at a $10^{-3}$ BER is 0.2 dB less than the asymptotic gain for any interference scenario; in particular, the lower bound on the gain is 0.2 dB less than the results shown in Fig. 3. Similarly, coherent detection of BPSK with maximal ratio combining and $M = 2$ requires $\rho \approx 11.1$ dB for a $10^{-3}$ BER, while the BER bound (13) gives 15.0 dB. Thus the bound is most accurate with differential detection of DPSK and low BER's.

Fig. 4 compares Monte Carlo simulation results [3] for the gain to the asymptotic gain for $L = 2$ and 6. Results are plotted versus $M$ with $\Gamma_j = 3$ dB for all interferers and coherent detection of BPSK at a $10^{-3}$ BER. Again the asymptotic gain has the same shape as the simulation results. The cases include both many more interferers than antennas and many more antennas than interferers, but in all cases the asymptotic gain is within 1.8 dB of simulation results.

Fig. 4. Gain versus $M$ with two and six equal power interferers: comparison of Monte Carlo simulation results with coherent detection of BPSK [3] to the asymptotic gain.

IV. CONCLUSIONS

In this paper we have presented upper bounds on the bit-error rate (BER) of optimum combining in wireless systems with multiple cochannel interferers in a Rayleigh fading environment. We presented closed-form expressions for the upper bound on the bit-error rate with optimum combining, for any number of antennas and interferers, with coherent detection of BPSK and QAM signals, and differential detection of DPSK. We also presented bounds on the performance gain of optimum combining over maximal ratio combining and showed that these bounds are asymptotically tight with decreasing BER. Results showed that the asymptotic gain is within 2 dB of the gain as determined by computer simulation for a variety of cases at a $10^{-3}$ BER. These cases include interference scenarios that cover the range of worst to best cases for the gain of optimum combining in cellular systems with $M = 2$. The bound is most accurate with differential detection of DPSK and high SINR, corresponding to low BER and a few antennas. Because of the 2-dB accuracy, the bound is most useful where the optimum combining improvement is the largest, which is the case of most interest. The closed-form expression for the bound permits rapid calculation of the improvement with optimum combining for any number of interferers and antennas, as compared with the CPU hours previously required by Monte Carlo simulation. These bounds allow calculation of the performance of optimum combining under a variety of conditions where it was not possible previously, including analysis of the outage probability with shadow fading and the combined effect of adaptive arrays and dynamic channel assignment in mobile radio systems.

APPENDIX

Diagonalizing $\mathbf{R}_{nn}$ by a unitary transformation $\mathbf{\Psi}$, we obtain

$$\mathbf{\Psi} \mathbf{R}_{nn} \mathbf{\Psi}^\dagger = \mathrm{diag}(\lambda_1, \cdots, \lambda_M) \tag{19}$$

where $\mathrm{diag}(\cdot)$ denotes an $M \times M$ matrix with nonzero elements only on the diagonal, or

$$\mathbf{R}_{nn}^{-1} = \mathbf{\Psi}^\dagger\, \mathrm{diag}(\lambda_1^{-1}, \cdots, \lambda_M^{-1})\, \mathbf{\Psi} \tag{20}$$

and

$$\mathbf{u}_d^\dagger \mathbf{R}_{nn}^{-1} \mathbf{u}_d = \mathbf{u}_d^\dagger \mathbf{\Psi}^\dagger\, \mathrm{diag}(\lambda_1^{-1}, \cdots, \lambda_M^{-1})\, \mathbf{\Psi} \mathbf{u}_d. \tag{21}$$

Let

$$\mathbf{C} = (c_1, \cdots, c_M)^T = \mathbf{\Psi} \mathbf{u}_d. \tag{22}$$

Then

$$\mathbf{u}_d^\dagger \mathbf{R}_{nn}^{-1} \mathbf{u}_d = \sum_{n=1}^{M} \frac{|c_n|^2}{\lambda_n} \tag{23}$$

and

$$E\!\left[e^{-\mathbf{u}_d^\dagger \mathbf{R}_{nn}^{-1} \mathbf{u}_d}\right] = E\!\left[\exp\!\left(-\sum_{n=1}^{M} \frac{|c_n|^2}{\lambda_n}\right)\right] = E\!\left[\prod_{n=1}^{M} \exp\!\left(-\frac{|c_n|^2}{\lambda_n}\right)\right]. \tag{24}$$

Since with independent Rayleigh fading at each antenna, the elements of $\mathbf{u}_d$ are independent and identically distributed (i.i.d.) complex Gaussian random variables, the elements of $\mathbf{C}$ are also i.i.d. complex Gaussian random variables with the same mean and variance. Furthermore, the $\lambda_n$'s are independent of the $c_n$'s. Thus we can average over the desired and interfering signal vectors separately, i.e.,

$$E\!\left[\prod_{n=1}^{M} \exp\!\left(-\frac{|c_n|^2}{\lambda_n}\right)\right] = E_\lambda\!\left[\prod_{n=1}^{M} E_{c_n}\!\left[\exp\!\left(-\frac{|c_n|^2}{\lambda_n}\right)\right]\right]. \tag{25}$$

Since the $c_n$'s are complex Gaussian random variables with zero mean and unit variance

$$E_{c_n}\!\left[\exp\!\left(-\frac{|c_n|^2}{\lambda_n}\right)\right] = \frac{1}{1 + \dfrac{1}{\lambda_n}} \tag{26}$$

and

$$E\!\left[e^{-\mathbf{u}_d^\dagger \mathbf{R}_{nn}^{-1} \mathbf{u}_d}\right] = E_\lambda\!\left[\prod_{n=1}^{M} \frac{1}{1 + \dfrac{1}{\lambda_n}}\right]. \tag{27}$$

Since the $\lambda_n$'s are nonnegative

$$\frac{1}{1 + \dfrac{1}{\lambda_n}} < \lambda_n \tag{28}$$

and, therefore,

$$E\!\left[e^{-\mathbf{u}_d^\dagger \mathbf{R}_{nn}^{-1} \mathbf{u}_d}\right] \le E_\lambda\!\left[\prod_{n=1}^{M} \lambda_n\right] = E_\lambda\big[|\mathbf{R}_{nn}|\big] \tag{29}$$

where $|\mathbf{R}_{nn}|$ denotes the determinant of $\mathbf{R}_{nn}$.

REFERENCES

[1] W. C. Jakes, Jr. et al., Microwave Mobile Communications. New York: Wiley, 1974.
[2] V. M. Bogachev and I. G. Kiselev, "Optimum combining of signals in space-diversity reception," Telecommun. Radio Eng., vol. 34/35, no. 10, pp. 83, Oct. 1980.
[3] J. H. Winters, "Optimum combining in digital mobile radio with cochannel interference," IEEE J. Select. Areas Commun., vol. SAC-2, no. 4, July 1984.
[4] J. H. Winters, "Optimum combining for indoor radio systems with multiple users," IEEE Trans. Commun., vol. COM-35, no. 11, Nov. 1987.
[5] J. H. Winters, "Signal acquisition and tracking with adaptive arrays in the digital mobile radio system IS-54 with flat fading," IEEE Trans. Veh. Technol., Nov. 1993.
[6] J. H. Winters, J. Salz, and R. D. Gitlin, "The impact of antenna diversity on the capacity of wireless communication systems," IEEE Trans. Commun., Apr. 1994.
[7] J. Salz and J. H. Winters, "Effect of fading correlation on adaptive arrays in digital wireless communications," IEEE Trans. Veh. Technol., vol. 43, pp. 1049-1057, Nov. 1994.
[8] R. A. Monzingo and T. W. Miller, Introduction to Adaptive Arrays. New York: Wiley, 1980.
[9] G. J. Foschini and J. Salz, "Digital communications over fading radio channels," Bell Syst. Tech. J., vol. 62, pp. 429-456, Feb. 1983.
[10] J. H. Winters, "Switched diversity with feedback for DPSK mobile radio systems," IEEE Trans. Veh. Technol., vol. VT-32, pp. 134-150, Feb. 1983.

The Range Increase of Adaptive Versus Phased Arrays in Mobile Radio Systems

Jack H. Winters, Fellow, IEEE, and Michael J. Gans

Abstract- In this paper, we compare the increase in range with multiple-antenna base stations using adaptive array combining to that of phased array combining. With adaptive arrays, the received signals at the antennas are combined to maximize signal-to-interference-plus-noise ratio (SINR) rather than only form a directed beam. Although more complex to implement, adaptive arrays have the advantage of higher diversity gain and antenna gain that is not limited by the scattering angle of the multipath at the mobile. Here, we use computer simulation to illustrate these advantages for range increase in both narrow-band and spread-spectrum mobile radio systems. For example, our results show that for a 3° scattering angle (typical in urban areas), a 100-element array base station can increase the range 2.8 and 5.5-fold with a phased array and an adaptive array, respectively. Also, for this scattering angle, the range increase of a phased array with 100 elements can be achieved by an adaptive array with only ten elements.

Index Terms- Adaptive arrays, mobile communications, multipath channels, phased arrays.

I. INTRODUCTION

Multiple antennas at the base station can provide increased received signal gain and, thus, range in mobile radio systems. Two approaches for combining the received signals are the phased array, which creates an antenna beam directed at the mobile, and the adaptive array, which maximizes signal-to-interference-plus-noise ratio (SINR). Here, we compare the range increase of phased arrays to that of the more complex adaptive array technique for both narrow-band and spread-spectrum systems.

Previous papers have studied the increase in gain with phased arrays [1]-[6]. With phased arrays, the signals received by each antenna are weighted and combined to create a beam in the direction of the mobile. The same performance can also be achieved by sectorized antennas, whereby a different antenna is used to form each beam. As the number of antennas increases, the received signal gain (range) increases proportionally to the number of antennas, but only until the beamwidth of the array is equal to that of the angle of multipath scattering around the mobile. Beyond that point, the increased gain of more antennas is reduced by the loss of power from scatterers outside the beamwidth. The range can even be reduced with narrower beamwidths because the resulting reduction in delay spread can cause a loss of diversity gain in systems using equalization, e.g., in spread-spectrum systems using a RAKE receiver.

This limitation in range increase can be overcome by the use of adaptive arrays [5]-[9]. With adaptive arrays, the signals received by each antenna are weighted and combined to maximize the output SINR. Although the most widely studied advantage of adaptive arrays is interference suppression [7]-[10], maximizing SINR also forms an antenna pattern matched to the wavefront (which is not a plane wave for nonzero scattering angle) and therefore provides a range increase that is not limited by the scattering angle. In addition, adaptive arrays can provide higher diversity gain than phased arrays, since all the receive antennas can be used for diversity combining. Thus, for a given number of antennas, adaptive arrays can provide greater range, or require fewer antennas to achieve a given range.

In this paper, we describe the limitations of phased arrays for range increase and describe how these limitations can be overcome using adaptive arrays.¹ We use computer simulation to illustrate our results for the range increase in both narrow-band and spread-spectrum mobile radio systems. For example, our results show that for a 3° scattering angle, a 100-element array base station can increase the range 2.8 and 5.5-fold with a phased array and an adaptive array, respectively. Also, for this scattering angle, the range increase of a phased array with 100 elements can be achieved by an adaptive array with only ten elements. In Section II, we discuss the theoretical performance of phased and adaptive arrays. We present a mobile radio system model and illustrate the performance results by computer simulation in Section III.

Manuscript received September 19, 1994; revised July 19, 1998. J. H. Winters is with AT&T Labs-Research, Red Bank, NJ 07701 USA. M. J. Gans is with Lucent Bell Labs, Holmdel, NJ 07733 USA. Publisher Item Identifier S 0018-9545(99)01067-1.

II. DESCRIPTION OF PHASED AND ADAPTIVE ARRAYS

A. Phased Array

Fig. 1 shows a block diagram of a phased array with omnidirectional elements linearly spaced at $\lambda/2$, where $\lambda$ is the signal wavelength. The signals received by the antennas are weighted and combined to form a beam at angle $\phi$, i.e., the signal at the $i$th antenna is phase shifted by $\pi(i - 1)\sin\phi$, $i = 1, \cdots, M$.
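The co-phasing rule above can be sketched as follows. The example assumes an ideal uniform linear array of omnidirectional elements at half-wavelength spacing with no mutual coupling; the function names, and the sign convention in which the applied phase cancels the arrival phase, are assumptions made for illustration.

```python
import cmath
import math


def steering_phases(m, angle_deg):
    """Phase pi*(i-1)*sin(angle) for elements i = 1..M at lambda/2 spacing."""
    s = math.sin(math.radians(angle_deg))
    return [math.pi * i * s for i in range(m)]


def array_power_gain(m, steer_deg, arrival_deg):
    """Power gain, relative to one element, toward arrival_deg when the
    array is co-phased for steer_deg (ideal elements, no coupling)."""
    steer = steering_phases(m, steer_deg)
    arrive = steering_phases(m, arrival_deg)
    # Each element's arrival phase is cancelled by its steering phase.
    total = sum(cmath.exp(1j * (a - s)) for a, s in zip(arrive, steer))
    return abs(total) ** 2 / m
```

When the arrival angle equals the steering angle, all terms add in phase and the gain equals the number of elements; away from the beam the terms partially cancel.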

For the mobile radio base station, the antenna beam should be narrow in elevation and the antenna characteristics should be independent of azimuth. A narrow elevation angle can be I Note that we consider range increase as a convenient way to express the effect of gain increase, and it also corresponds to a decrease in required number of base stations to cover a given area.

Reprinted from IEEE Transactions on Vehicular Technology, Vol. 48, No.2, pp. 353-362, March 1999.

525

signals should also be weighted by the voltage gain in the given direction to maximize signal-to-noise ratio (SNR) in the array output. These weighted signals are summed to generate the array output, with the output SNR for a beam with direction ¢ given by

L8 J.'1

r ec 1 .

si((p)

t=l

AI

Ll s i (¢ )1

2

(1)

2

•

•2

•

•

3

i=l

M

where

Fig. 1. Linear phased array with omnidirectional elements linearly spaced at ,\/2.

• • • •

• • • •

• • • •

• • • •

1~21 (a)

• • •

• • • •

• • • • 'JJ2

• • •

1

(b)

Fig. 2. (a) Array with linear elements on four panels in a square and (b) with elements on a cylinder.

created by using a vertical array of antenna elements for each horizontal element. The azimuth dependence can be reduced by placing the linear elements on four panels in a square, as shown in Fig. 2(a) [11]. However, a cylindrical array, as shown in Fig. 2(b), is usually used to create azimuth independence. Each antenna element is typically spaced at A/2, since smaller spacing reduces gain by creating a wider beamwidth with increased mutual coupling, while wider spacing can also reduce gain by decreasing the beamwidth and creating grating lobes, i.e., gain in directions other than the desired angle-ofarrival. The effect of antenna spacing on mutual coupling is studied in Appendix A. To create a beam in a given direction, the signals from the antenna elements are cophased, based on a plane wave arrival. Since to reduce mutual coupling between elements, each element should have higher gain in the direction pointing away from the center of the cylinder (see Appendix A), the

i,

8'i (

"rec,

is the complex received signal voltage at antenna

¢) is the expected (based on antenna location) antenna

voltage gain and phase (relative to the other antennas) for a signal arriving from angle rjJ, and the superscript * denotes complex conjugate. The weights can be implemented at radio frequency (RF) by different cable lengths for the fixed phase offsets and fixed attenuators for the amplitude weighting. The weighted signals for each beam are then combined, with a separate combiner and signal for each. beam. For each mobile radio user, the receiver then selects the beam output with the largest power to use for signal demodulation. However, this technique can require a large amount of hardware, including amplifiers, with large !VI, but the complexity can be reduced somewhat by combining only a portion of the antenna outputs-the signals from the antennas with the largest gain in a given direction-for each beam. Alternatively, the signal from each antenna can be brought to baseband and analog-digital (AID) converted, with the combining done in software. Although this method is similar to adaptive array processing, with the phased array the combining software needs to determine only one parameter, the angle-of-arrival ¢ (which changes slowly with time), for each mobile radio user. The same performance as the phased array can be achieved by using sectorized antennas, i.e., separate antennas for each beam, as is currently done at many mobile radio base stations. However, to create uniform coverage using sectorized antennas or phased arrays with predetermined (fixed) beams, overlapping beams should be used. (This is also useful for obtaining diversity-see below.) This doubles the number of antennas (with sectorized antennas) or the combining hardware (with phased arrays with fixed beams) without increasing the gain. Arrays increase the range by providing additional received signal gain due to two factors-antenna gain and diversity gain. With an M -element phased array and a point source, the antenna gain is lVI, neglecting mutual coupling (see Appendix A). 
The range increase is the gain raised to the inverse of the propagation loss exponent γ, typically a fourth power loss. Thus, with a point source, the range increase due to the antenna gain of an M-element array is $M^{1/\gamma}$. However, signal scattering around the mobile means that the signal received at the base station cannot always be considered as coming from a point source. As shown in Fig. 3, with scattering the signal arrives from a range of angles, called the scattering angle. Typically, the mobile signal is scattered mainly by objects within 1000 ft of the mobile,

Fig. 3. Mobile radio environment with scattering around the mobile, where all signals from a mobile arrive within a scattering angle α.

Fig. 4.

but this distance can vary widely, e.g., with reflections off mountains [12]. Furthermore, this scattering angle increases with decreasing base-station height. Measured results for rural areas with 130-ft antenna heights show scattering angles of only a few tenths of a degree, while suburban and urban areas have much larger scattering angles [13]. Measured results in urban areas of Tokyo, Japan, for ranges up to 7 km [14], show a 3° scattering angle at a 50-m antenna height increasing to 360° at a 1-m height (as on the mobile). In addition, digital mobile radio systems in North America (IS-136) and Europe (GSM) are designed to handle delay spreads up to 41 and 16 μs, respectively, which, with an 8-mi cell radius, correspond to scattering angles of 52° and 21°, respectively. Also, these scattering angles are for 900-MHz mobile radio systems, while at 2 GHz the range is reduced by about 50% (from the Hata model [15], for an antenna height of 50 m at the base station and 1 m at the mobile, medium-small city, and 8-mi cell radius), corresponding to a two-fold scattering angle increase. We expect that microcells will have even larger scattering angles because of the lower antenna height. Here, we do not consider what the likely distribution of scattering angles will be for any given system, but show results obtained for a wide range of scattering angles.

Since receive signal power is lost when the beamwidth, which is approximately 360°/M (for a cylindrical array), is less than the scattering angle, the signal gain will be less than M in the phased array with large enough M. For example, for a uniform distribution of power within a scattering angle of α degrees, the maximum signal gain is given by an array with M = 360/α elements. Additional elements increase the antenna gain, but the power lost outside the beam reduces the signal gain by the same amount (under the uniform power distribution assumption).
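The gain limit described above can be illustrated with a short sketch. It assumes the idealized uniform power distribution from the text, so that the signal gain grows as M until the beamwidth 360°/M equals the scattering angle and stays flat afterwards. The function names and the `min` formulation are illustrative assumptions, not taken from the paper.

```python
def max_signal_gain(num_elements, scattering_angle_deg):
    """Idealized phased-array signal gain (linear, not dB).

    Assumes power uniformly distributed within the scattering angle, so the
    gain saturates at 360/alpha once the beam is narrower than that angle.
    """
    return min(float(num_elements), 360.0 / scattering_angle_deg)


def range_increase(gain, loss_exponent=4.0):
    """Range increase for a given linear gain: gain ** (1 / gamma)."""
    return gain ** (1.0 / loss_exponent)


if __name__ == "__main__":
    for m in (4, 18, 100):
        g = max_signal_gain(m, scattering_angle_deg=20.0)
        print(m, g, round(range_increase(g), 2))
```

With a 20° scattering angle, this idealization gives no further signal gain beyond 18 elements, which is the saturation behavior the paragraph describes.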
Thus, with phased arrays the signal gain, and the corresponding range increase, is limited. The other factor for receive signal gain is the diversity gain. Multipath fading results in a higher average output SNR required to achieve a given average receiver performance (e.g.,

Cylindrical array using angle diversity.

BER in digital systems) than without fading. The fading in the output signal can be reduced by using multiple receive antennas and combining the received signals. We define diversity gain as the improvement in link margin beyond the factor of M for array gain. For example, for a $10^{-2}$ BER averaged over Rayleigh fading with coherent detection of PSK, a 9.5-dB higher average output SNR is required than without fading. Two antennas provide up to a 5.4-dB diversity gain, while 3, 4, and 6 antennas provide up to 6.8, 7.6, and 8.3 dB, respectively, with maximal ratio combining. Thus, six antennas can provide within 1.2 dB of the maximum diversity gain (i.e., the 9.5-dB gain achieved when the fading is eliminated).

However, to achieve the full diversity gain, the fading at the antennas must be nearly independent. This requires that the spacing between antennas is at least the distance such that the beamwidth of an antenna with this aperture is approximately the scattering angle. For example, a spacing of 10-20λ is used for the typical scattering angle of a few degrees [12], [14], [16]. For a cylindrical phased array, such an antenna spacing between elements is impractical and would create numerous grating lobes without providing the antenna gain commensurate with the diameter of the array (or providing diversity gain). However, when the beamwidth of the array is comparable to the scattering angle (i.e., the total array aperture size corresponds to a beamwidth given by the scattering angle), different beams can cover part of the same scattering angle and thereby angle diversity can be used [4], [13], as shown in Fig. 4. For the square array, another set of flat arrays could be spaced 10-20λ apart on each side to provide diversity, as shown in Fig. 5. Note that this is not practical with cylindrical arrays, as the arrays would partially block each other.
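The diversity-gain definition above can be explored numerically. The sketch below is an assumption-laden illustration, not the authors' program: it uses the standard closed-form average BER for coherent BPSK with L-branch maximal ratio combining over independent, equal-power Rayleigh branches, and solves for the SNR that gives a $10^{-2}$ BER by bisection. All function names are hypothetical. Values for larger numbers of branches need not match the quoted figures to the last tenth of a decibel, since the exact modulation and averaging assumptions behind those figures are not restated here.

```python
import math


def ber_no_fading(snr):
    """BER of coherent PSK without fading: 0.5 * erfc(sqrt(S/N))."""
    return 0.5 * math.erfc(math.sqrt(snr))


def ber_rayleigh_mrc(branch_snr, branches):
    """Average BER of coherent BPSK with MRC over independent Rayleigh
    branches that all have the same average SNR (standard closed form)."""
    mu = math.sqrt(branch_snr / (1.0 + branch_snr))
    series = sum(
        math.comb(branches - 1 + k, k) * ((1.0 + mu) / 2.0) ** k
        for k in range(branches)
    )
    return ((1.0 - mu) / 2.0) ** branches * series


def required_snr_db(ber_func, target=1e-2):
    """SNR (dB) at which a decreasing BER curve reaches the target."""
    lo, hi = 1e-3, 1e6
    for _ in range(200):
        mid = math.sqrt(lo * hi)  # bisection on a logarithmic scale
        if ber_func(mid) > target:
            lo = mid
        else:
            hi = mid
    return 10.0 * math.log10(hi)


def fading_penalty_db(target=1e-2):
    """Extra average SNR needed with Rayleigh fading and one antenna."""
    faded = required_snr_db(lambda s: ber_rayleigh_mrc(s, 1), target)
    return faded - required_snr_db(ber_no_fading, target)


def diversity_gain_db(branches, target=1e-2):
    """Link-margin improvement beyond the factor-of-M array gain."""
    single = required_snr_db(lambda s: ber_rayleigh_mrc(s, 1), target)
    per_branch = required_snr_db(lambda s: ber_rayleigh_mrc(s, branches), target)
    total = per_branch + 10.0 * math.log10(branches)
    return single - total


if __name__ == "__main__":
    print("fading penalty (dB):", round(fading_penalty_db(), 2))
    for m in (2, 3, 4, 6):
        print(m, "branches:", round(diversity_gain_db(m), 2), "dB")
```

The total SNR is taken as the per-branch SNR times the number of branches, so the factor of M for array gain is removed before the diversity gain is reported, as in the definition above.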
Similarly, to provide diversity with sectorized antennas, a separate set of antennas can be spaced 10-20λ apart (as is used today) with overlapping sectors to provide more uniform coverage over all azimuth angles. In all cases, though, diversity gain requires additional hardware. To minimize the added cost, usually only dual diversity with selection combining is considered. Note that for the example case of a $10^{-2}$ BER,

selection diversity with two antennas provides only about 3.9 dB of the maximum-possible 9.5-dB diversity gain (which is also 1.5 dB less than maximal ratio combining with two antennas). Frequency-selective fading due to delay spread can also be used to provide diversity by using equalization [9] in narrow-band systems, or a RAKE receiver in spread-spectrum systems [17]. In this case, the diversity gain of additional antennas is reduced. For example, a three-finger RAKE is used in the IS-95 CDMA system (three fingers on the downlink, but four fingers on the uplink). With received signal energy uniformly distributed over three code symbol periods (2.4 μs), maximal ratio combining of the three fingers provides three-fold diversity, or a 6.8-dB diversity gain at a $10^{-2}$ BER, and dual antenna diversity provides up to 1.5 dB (the overall combining is equivalent to six-branch maximal ratio combining) of the remaining 2.7-dB maximum diversity gain. Note, however, that, compared to a narrow-band receiver, one finger of this CDMA receiver is 4.8 dB lower in signal power, i.e., the RAKE receiver does not give any increase in average SNR (antenna gain). Finally, note that beamwidths smaller than the scattering angle can reduce the delay spread, and therefore the diversity gain, in systems with phased arrays.

Fig. 5. Square array using space diversity.

B. Adaptive Array

With an adaptive array, the received signals are combined to maximize the output SINR. Thus, the array can null interference in narrow-band systems² (as discussed below), but here we consider only the increase in range due to higher antenna gain. Without interference, the output SNR of an M-element adaptive array is given by

(2)

² For spread-spectrum systems, nulling of all strong interferers is generally not possible since the number of interferers is typically much greater than the number of antennas.

Although (2) is simpler than the SNR equation for the phased array (1), the adaptive array is more complex to implement because the weights are not fixed, but depend on the received signals. Thus, variable gains and phase shifters are needed for each signal on every antenna. These can be implemented in hardware at RF or IF, or in software at baseband. For the software implementation, the signals from each antenna can also be digitized using block processing. Another complication is the need to acquire and track the weights. As compared to the phased array where the beam or the weights only need to track the angle of the mobile, the adaptive array weights must track the rapid fading of the signal. Algorithms to generate the weights include the constant modulus algorithm (CMA) [18], least-mean-squared (LMS) algorithm [19], and the direct matrix inversion (DMI) algorithm [19]. It should be noted, though, that when interference is not a concern, i.e., when range increase is the issue as in this paper, simpler techniques may be possible for determining the weights.

With the adaptive array, though, the array pattern is matched to the multipath wavefront. That is, there is no antenna gain limitation due to multipath scattering angle, as with phased arrays, and an M-fold diversity gain can also be obtained. Achieving this diversity gain requires adequate antenna spacing, however. With a base-station array oriented broadside to a small angle, α degrees, of scatterers around the mobile and with power arriving uniformly at the base from within α, the magnitude of the correlation coefficient between two array elements spaced x wavelengths apart is approximately [see also [14], which approximates the envelope correlation $\rho_e(x)$ by the square of the complex phasor correlation, $|\rho(x)|^2$]

$$|\rho(x)| \approx \left| \frac{\sin(\pi^2 \alpha x / 180)}{\pi^2 \alpha x / 180} \right| \qquad (3)$$

Thus, an antenna spacing of (360°/(πα))(λ/2) is required for independent fading at each antenna, but spacings of about half of this still give low-enough fading correlation (<0.7) that nearly the full diversity gain can be achieved. However, even with a spacing of (360°/(πα))(λ/4), the required array size can be too large. For example, a 3° scattering angle requires a 10-ft antenna spacing at 900 MHz, and, thus, in particular, a 100-element cylindrical array would require a 330-ft diameter. However, since only a few-fold diversity is needed to obtain most of the maximum diversity gain, an array with a diameter of a few times the required antenna spacing (20-30 ft in the above example) should obtain almost all the maximum-possible diversity gain.

Finally, we note that, although not studied in this paper, the adaptive array can also suppress interference. With the narrow beams of large arrays, the number of interferers is greatly reduced in both narrow-band and spread-spectrum systems. Since an M-element array can eliminate N interferers with an M − N diversity gain, large arrays can eliminate any significant interference with little loss of diversity or antenna gain. Thus, these arrays can not only greatly increase the range when there is little interference, but they can also be used for future expansion by permitting the capacity to be greatly increased without increasing the number of base stations.
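The correlation expression (3) and the spacing rule derived from it can be checked numerically. The sketch below assumes that (3) reads |ρ(x)| ≈ |sin(π²αx/180) / (π²αx/180)|, with α in degrees and x in wavelengths; the function names are hypothetical.

```python
import math


def fading_correlation(spacing_wavelengths, scattering_angle_deg):
    """Magnitude of the fading correlation between two elements, per (3)."""
    arg = math.pi ** 2 * scattering_angle_deg * spacing_wavelengths / 180.0
    if arg == 0.0:
        return 1.0
    return abs(math.sin(arg) / arg)


def spacing_for_independent_fading(scattering_angle_deg):
    """First zero of (3) in wavelengths: (360 / (pi * alpha)) * (1 / 2)."""
    return 360.0 / (math.pi * scattering_angle_deg) / 2.0


if __name__ == "__main__":
    alpha = 3.0  # scattering angle in degrees
    x0 = spacing_for_independent_fading(alpha)
    print("independent-fading spacing (wavelengths):", round(x0, 2))
    print("correlation at half that spacing:",
          round(fading_correlation(x0 / 2.0, alpha), 3))
```

At half the independent-fading spacing the argument of (3) is π/2, so the correlation is 2/π, which is below the 0.7 threshold mentioned in the text.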

III. RESULTS

A. Model

To verify and illustrate the above conclusions, we used Monte Carlo simulation with the following model (see Fig. 3). We considered transmission from a mobile to a base station. The multipath model consisted of 20 scatterers uniformly distributed in a circular area of radius r around the mobile. These scatterers had equal transmitted power, with a fourth law power loss from each scatterer to the base station. The phase of each multipath reflection at each antenna was determined from the path length. Received power variation due to shadow fading was not considered.

The base-station array was a cylindrical array of M equally spaced cardioid antennas [20], with each antenna pointing out from the center of the array, and one element at 0°. The mobile was at 90°. Note that for M = 2, the mobile at 90° results in equal gain from the two antennas, while with a mobile at 0° only one antenna has nonzero gain. Thus, for M = 2, the results depend strongly on the angle of the mobile (i.e., dual diversity at 90° versus no diversity at 0°). However, for M ≥ 4, the effect of angle is negligible, and therefore this angle was fixed at 90°. We considered spacings between elements of λ/2 or greater, and therefore neglected the effect of mutual coupling (see Appendix A).

With the phased array, the weights were set to generate a beam that was pointed directly at the mobile. From (A-8) and (A-10), these weights are given by

$$s_i^{*}(90^\circ) = \sqrt{2}\,\cos\left\{ \frac{\pi}{4}\left[ \sin\left(\frac{2\pi(i-1)}{M}\right) - 1 \right] \right\} e^{-j(2\pi r/\lambda)\sin(2\pi(i-1)/M)}, \quad i = 1, \ldots, M \qquad (4)$$

and the SNR is then given by (1). With the adaptive array, the weights are $s_i^{*}$, $i = 1, \ldots, M$, and the SNR is given by (2). We consider coherent detection of phase-shift-keyed (PSK) signals, for which the BER is given by

$$\mathrm{BER} = \tfrac{1}{2}\,\mathrm{erfc}\left(\sqrt{S/N}\right) \qquad (5)$$
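The array geometry and the phased array weights can be sketched as follows. This is an illustration under stated assumptions: the element response is taken as the product of the cardioid voltage gain (A-10) and the phase term (A-8), the weights in (4) are taken as the complex conjugate of that response for a mobile at 90°, and the array radius follows from the element spacing through (A-9). All function names are hypothetical.

```python
import cmath
import math


def element_azimuth(i, num_elements):
    """Pointing angle of element i (1-based), with element 1 at 0 rad."""
    return 2.0 * math.pi * (i - 1) / num_elements


def cardioid_gain(i, num_elements, phi):
    """Cardioid voltage gain of element i toward azimuth phi, per (A-10)."""
    offset = math.cos(phi - element_azimuth(i, num_elements))
    return math.sqrt(2.0) * math.cos(math.pi / 4.0 * (offset - 1.0))


def element_response(i, num_elements, radius_wavelengths, phi):
    """Complex response: cardioid gain times the phase term of (A-8)."""
    phase = (2.0 * math.pi * radius_wavelengths
             * math.cos(phi - element_azimuth(i, num_elements)))
    return cardioid_gain(i, num_elements, phi) * cmath.exp(1j * phase)


def phased_weight_90(i, num_elements, radius_wavelengths):
    """Phased array weight of element i for a mobile at 90 degrees, per (4)."""
    s = math.sin(element_azimuth(i, num_elements))
    gain = math.sqrt(2.0) * math.cos(math.pi / 4.0 * (s - 1.0))
    return gain * cmath.exp(-1j * 2.0 * math.pi * radius_wavelengths * s)


def radius_from_spacing(spacing_wavelengths, num_elements):
    """Array radius in wavelengths from d = 2 r sin(pi / M), per (A-9)."""
    return spacing_wavelengths / (2.0 * math.sin(math.pi / num_elements))


def ber_psk(snr):
    """BER for coherent PSK at a given linear S/N, per (5)."""
    return 0.5 * math.erfc(math.sqrt(snr))


if __name__ == "__main__":
    m = 8
    r = radius_from_spacing(0.5, m)
    for i in range(1, m + 1):
        print(i, phased_weight_90(i, m, r))
```

For M = 2 the cardioid gains toward a mobile at 90° are equal for both elements, while toward 0° the second element has zero gain, which is the dependence on mobile angle noted in the model description.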

We used Monte Carlo simulation to determine the BER averaged over 10000 cases. Note that the BER depends on the ratio of transmit power to receive noise power. This ratio was adjusted to obtain a $10^{-2}$ average BER for the baseline case of an omnidirectional transmit antenna with the mobile at a given range and scattering radius. With this ratio and the scattering angle fixed, we generated results for the M-element phased and adaptive arrays, increasing the range until the BER exceeded $10^{-2}$, thus giving the range increase. All the following results for range increase and diversity gain are referenced to $10^{-2}$ average BER. Note that the increase in range is not strongly dependent on the modulation and detection technique considered, but will vary significantly with the power loss exponent and the BER. Specifically, the range increase will be greater than we show in the next section if the power loss exponent is less than four or the required BER is less than $10^{-2}$.

We considered both the low data rate case (no delay spread) and the delay spread case. For the delay spread case, the signal delay for each scattered signal depends on the distance from the mobile to the scatterer plus the distance from the scatterer to each base-station antenna. For the spread-spectrum system with delay spread, we studied the use of a three-finger RAKE receiver for both the phased and adaptive arrays. To simulate the RAKE receiver, the computer program first convolved the delayed impulse of each scatterer with the spread-spectrum correlation function given by

$$f(t) = \begin{cases} 1 - \dfrac{|t_d - t|}{0.8}, & \text{for } |t_d - t| \le 0.8\ \mu\text{s} \\ 0, & \text{elsewhere} \end{cases} \qquad (6)$$

where $t_d$ is the time delay corresponding to the distance from the center of the base station to the mobile. The responses from the 20 scatterers were then summed to obtain the signal at each antenna. These signals were weighted and combined by the phased array weights or the adaptive array weights ($s_i^{*}$, $i = 1, \ldots, M$). Note that the adaptive array weights vary as a function of delay. We then determined the three largest peaks in the output response that were separated by integer multiples of the code rate and combined these three signals to maximize the output SNR. That is, these three peaks were cophased and weighted by their signal amplitudes before combining.

For the phased array, we considered three different models. In the first model, we considered a single beam pointed at the mobile, i.e., the phased array weights as given in (4). Thus, our model corresponds to phased array combining with a RAKE receiver after the combiner, followed by maximal ratio combining of the RAKE output. To model the IS-95 CDMA system with a phased array, we also considered a RAKE receiver on each antenna, followed by phased array combining of the RAKE outputs, with the beam direction optimized for each delay [rather than set to 90° as in (4)]. Thus, a separate beam was formed for each of the RAKE fingers. Finally, we modified the second model to consider the beam direction optimized over M different, equally spaced angles, which models sectorized antennas.

For the adaptive array, our model corresponds to a RAKE receiver on each antenna branch, with adaptive array combining of the antenna signals followed by adaptive array combining of the three highest output peaks, with the receiver timing optimized to maximize the output SNR. For the no delay spread case, in our simulations we used a 40000-ft range as the baseline case, with the scattering radius given by the required scattering angle.
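The RAKE steps described above can be sketched in a few lines. This is an illustration only: it assumes that (6) is a triangular correlation function with a base half-width of one 0.8-μs code symbol, and it assumes equal noise power on every finger, so that cophasing and amplitude weighting give an output SNR equal to the sum of the finger SNRs. The function names and the scatterer representation are hypothetical.

```python
def chip_correlation(t_us, delay_us, chip_us=0.8):
    """Triangular spread-spectrum correlation function, per (6)."""
    offset = abs(delay_us - t_us)
    if offset <= chip_us:
        return 1.0 - offset / chip_us
    return 0.0


def antenna_response(t_us, scatterers, chip_us=0.8):
    """Sum of scatterer responses at time t_us.

    scatterers is a list of (complex_amplitude, delay_us) pairs.
    """
    return sum(a * chip_correlation(t_us, d, chip_us) for a, d in scatterers)


def mrc_output_snr(finger_amplitudes, noise_power=1.0):
    """Output SNR after cophasing and amplitude-weighting the fingers."""
    return sum(abs(a) ** 2 for a in finger_amplitudes) / noise_power


if __name__ == "__main__":
    # Two illustrative scatterers: (complex amplitude, delay in microseconds).
    paths = [(1.0 + 0.0j, 0.0), (0.0 + 0.5j, 0.8)]
    fingers = [antenna_response(k * 0.8, paths) for k in range(3)]
    print(fingers, mrc_output_snr(fingers))
```

In the simulation described in the text the three fingers are the three largest peaks of the combined output; here the fingers are simply sampled at consecutive code symbol periods to keep the example short.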
However, our results can be generalized to any range, as they depend only on the scattering angle and not the absolute values of the range and scattering radius. Therefore, in the next section, we present our results only in terms of the normalized range. Similarly, although we generated results for a one foot wavelength, our results can be generalized to any wavelength. Therefore, our results on antenna spacings are only in terms of λ. Also, for the delay spread case, our simulations used a 1.25-Mbps data rate (as in the IS-95 CDMA system). The scattering radius was set to 1200 ft (which is typical in mobile radio in suburban and urban areas), which results in a delay spread of three symbols. This radius was chosen because, as shown in the next section, this is the minimum delay spread for which the maximum diversity gain is achieved with the three-finger RAKE receiver. Thus, the scattering radius was chosen to maximize the RAKE diversity gain as well as the effect of a narrow beamwidth on the performance. Again, our results do not depend on the absolute values of the range and scattering radius and are therefore presented in terms of normalized range and scattering angle.

Finally, note that by keeping the scattering radius constant as we increase the range (which would be typical in mobile radio), the scattering angle decreases. For example, a 10° scattering angle with the baseline case is only about 3° with a three-fold range increase. With fixed scattering radius, the predicted range increase discussed in the previous section must therefore be modified. It was noted before that, for a given scattering angle α, the maximum gain is 360/α, and therefore the maximum range R, normalized to the omnidirectional-antenna range $R_0$, is given by

$$\frac{R}{R_0} = \left(\frac{360}{\alpha}\right)^{1/4} \qquad (7)$$

But since the scattering radius is kept constant, the scattering angle at range R is less than the baseline scattering angle $\alpha_0$ at $R_0$, specifically

$$\alpha = \alpha_0 \frac{R_0}{R} \qquad (8)$$

Therefore, from (7) and (8), the maximum range increase is given by

$$\frac{R}{R_0} = \left(\frac{360}{\alpha_0}\right)^{1/3} \qquad (9)$$

This increase [with the corresponding $M = (360/\alpha_0)^{4/3}$] is greater than the maximum range increase of $(360/\alpha)^{1/4}$ for the fixed scattering angle case, e.g., the range increase is 4.9 for $\alpha_0$ = 3° versus 3.3 for α = 3°.
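The step from (7) and (8) to (9) can be checked numerically. The sketch below assumes that (8) is the small-angle relation α = α₀R₀/R; it compares the closed form with a fixed-point iteration of (7) and (8). The generalization to an arbitrary loss exponent γ, giving an exponent of 1/(γ − 1), is an extension added here for illustration, and the function names are hypothetical.

```python
def max_range_fixed_angle(alpha_deg, loss_exponent=4.0):
    """Normalized maximum range for a fixed scattering angle, per (7)."""
    return (360.0 / alpha_deg) ** (1.0 / loss_exponent)


def max_range_fixed_radius(alpha0_deg, loss_exponent=4.0):
    """Normalized maximum range for a fixed scattering radius, per (9)."""
    return (360.0 / alpha0_deg) ** (1.0 / (loss_exponent - 1.0))


def max_range_fixed_radius_iterative(alpha0_deg, loss_exponent=4.0, iterations=100):
    """Same quantity obtained by iterating (7) and (8) to a fixed point."""
    r = 1.0
    for _ in range(iterations):
        alpha = alpha0_deg / r                            # (8): angle shrinks with range
        r = (360.0 / alpha) ** (1.0 / loss_exponent)      # (7): range for that angle
    return r


if __name__ == "__main__":
    for alpha0 in (3.0, 10.0, 20.0):
        r = max_range_fixed_radius(alpha0)
        print(alpha0, round(r, 2), "scattering angle at that range:", round(alpha0 / r, 2))
```

For a 3° baseline angle the two cases give about 4.9 and 3.3, in line with the example in the text, and for a 20° baseline angle the limit is about 2.6.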

B. Results for Range Increase

Fig. 6 shows the normalized maximum range versus the number of antenna elements for phased and adaptive arrays with λ/2 antenna spacing, neglecting the delay spread. Results are shown for different fixed scattering radii, with the scattering angle for the baseline case of one antenna element given. We also show the theoretical range due to the antenna gain ($M^{1/4}$) without diversity, and due to antenna gain and M-fold diversity. Also, the predicted maximum range with phased arrays is shown. With the phased array, the range is shown to be limited to the predicted range limitation. However, the range improvement is degraded due to the scattering angle for M less than the theoretical value corresponding to the range limitation, and it requires many times more antennas to actually reach this limitation. For example, with a 20° scattering angle, the predicted range limitation is 2.6, corresponding to 46 antennas, but with 46 antennas the range is only 2.3. Note that at a range of 2.6, the scattering angle is reduced to about 8° for the 20° baseline curve.

Fig. 6. Normalized maximum range versus the number of antenna elements for phased and adaptive arrays with λ/2 antenna spacing, neglecting the delay spread.

For the adaptive array, the range exceeds the no-diversity theoretical range for all scattering angles, due to antenna diversity. The diversity gain increases with the scattering angle and M, as expected. However, the diversity gain does not increase for scattering angles greater than about 20°. Thus, because the adaptive array has greater range with increased scattering angle, the difference between the adaptive and phased array increases dramatically with scattering angle.

Next consider the effect of antenna spacing. With the phased array, our results show that the range does not increase with wider spacing, and, in fact, the range decreases if the spacing is wide enough. With the adaptive array, the range increases with antenna spacing, up to that corresponding to the maximum diversity gain. Fig. 7 shows the increase in range with spacing for M = 2, 10, and 100 and baseline scattering angles of 3°, 10°, and 20°. Theoretical results for the range with maximum diversity gain are also shown. With baseline scattering angles of 10° or more, the maximum range can be achieved with a spacing of about 10λ. Note that a baseline scattering angle of 10° corresponds to scattering angles of 6.2°, 3.4°, and 1.8° at the maximum range with M = 2, 10, and 100, respectively. Consider the extreme example of a very large array. For a baseline scattering angle of 3°, with 100 elements a spacing of 10λ achieves a 5.15 range increase versus the maximum 5.46, even though the scattering angle at this range is only 0.58° (the array diameter would be 350 ft at 900 MHz and 160 ft at 2 GHz). Thus, with large arrays the antenna spacing can be much less than that required with two antennas to achieve nearly the full diversity gain. As a further example, a 100-element array increases the range about 2.8 times with a phased array and a scattering angle at the maximum range of 3° (about an 8.4°

baseline scattering angle) versus 5.5 times for an adaptive array with 10λ antenna spacing. Also, for this scattering angle, the range increase of a phased array with 100 elements can be achieved by an adaptive array with only ten elements.

Fig. 7. Increase in range of adaptive arrays with antenna spacing for M = 2, 10, and 100 and baseline scattering angles of 3°, 10°, and 20°, neglecting the delay spread.

For the delay spread case with the RAKE receiver, let us first consider the effect of the scattering radius on the diversity gain of the RAKE receiver. Fig. 8 shows the diversity gain versus the maximum delay spread for a three-finger RAKE with a single antenna at the base station. For our model, the maximum delay spread is given by twice the scattering radius in symbol periods. That is, the minimum delay is given by the delay from the mobile to the base station, while the maximum delay is given by a scatterer at the far edge of the scattering radius along the line between the mobile and the base station. The maximum delay is therefore the propagation time corresponding to twice the scattering radius. The diversity gain is seen in Fig. 8 to be within 0.1 dB of the maximum possible diversity gain (three-fold diversity) for scattering radii corresponding to delay spreads of three symbols or greater. Therefore, in our simulations, we set the scattering radius to three symbols. Note that with our model, the maximum delay spread does not decrease with the beamwidth of the array because the maximum delay variation is along the line between the mobile and the base station.

Fig. 8. Diversity gain versus the maximum delay spread for a three-finger RAKE with a single antenna at the base station.

Fig. 9. Normalized maximum range versus the number of antenna elements for phased and adaptive arrays with λ/2 antenna spacing and a three-finger RAKE receiver.

Fig. 9 shows the normalized maximum range versus the number of antenna elements for phased (with the IS-95 CDMA system model) and adaptive arrays with λ/2 antenna spacing and a three-finger RAKE receiver. As in Fig. 6, results are shown for different fixed scattering radii, with the scattering angle for the baseline case of one antenna element given. However, in Fig. 9 the baseline case includes a three-finger RAKE with its 6.8-dB diversity gain. Thus, the actual range in the baseline case is 1.48 (= $10^{6.8/40}$) times greater than in Fig. 6. We also show the theoretical range increase due to antenna gain ($M^{1/4}$) and due to antenna gain and 3M-fold diversity (versus three-fold diversity due to the RAKE receiver). With the phased array and a single beam pointed at the mobile, the range limitation is similar to that of the narrow-band system (Fig. 6). However, with a separate beam for each RAKE finger, Fig. 9 shows that the range limitation is negligible for scattering angles less than 20°, but there is

degradation in the range increase for scattering angles of 45° and 60° with more than about 40 antennas. This degradation is somewhat larger when fixed sectorized antennas, rather than continuously adjustable phased array antennas, are used, as Fig. 9 shows for the case of a 60° scattering angle. With the adaptive array, the range exceeds the theoretical range due to antenna gain and three-fold diversity, showing the additional diversity gain. Thus, there is a significant improvement with adaptive arrays for large scattering angles and large M. Furthermore, in all cases the diversity gain of adaptive arrays increases with larger spacing, as shown in Fig. 9 for 5λ spacing with scattering angles of 3° to 0°.

IV. CONCLUSIONS

In this paper, we have compared the increase in range with multiple-antenna base stations using adaptive array combining to that of phased array combining. Our computer simulation considered a multipath model with a uniform distribution of scatterers within a given radius around the mobile, and determined the increase in range with arrays for $10^{-2}$ average BER with coherent detection of PSK. From our results we make the following conclusions.

• Phased arrays were shown to have a range increase limitation given by the scattering angle. For scattering angles of a few tenths of a degree (typical in rural areas), this limitation is significant only for arrays with more than 100 elements, while with larger scattering angles (typical in suburban and urban areas), the range increase limitation can occur with far fewer elements.
• For spread-spectrum systems, using a RAKE receiver with phased arrays, the maximum range increase degradation was much less than that of narrow-band systems.
• In both narrow-band and spread-spectrum systems, adaptive arrays had no range limitation and could achieve diversity gain with λ/2 antenna spacing with sufficiently many elements. Almost full diversity gain could be achieved with large arrays with antenna spacings of only a few wavelengths for scattering angles as low as 1°.

APPENDIX A

A. Effect of Antenna Spacing on Mutual Coupling

With an M-element array, the maximum gain is M without mutual coupling. Because of mutual coupling, however, this gain will vary with antenna spacing. Specifically, this gain is given by the directivity, i.e., the ratio of the peak to average gain for a signal arriving with a flat wavefront [20]

$$D = \frac{\max_{\theta,\phi} |E(\theta,\phi)|^2}{\dfrac{1}{4\pi} \displaystyle\int_0^{2\pi}\!\!\int_0^{\pi} \sin\theta\, |E(\theta,\phi)|^2 \, d\theta \, d\phi} \qquad \text{(A-1)}$$

where $E(\theta,\phi)$ is the voltage gain at elevation angle θ and azimuth angle φ. For the base-station antennas, we will assume that the variation in gain with elevation angle is independent of azimuth, i.e.,

$$E(\theta,\phi) = E_e(\theta)\,A(\phi) \qquad \text{(A-2)}$$

where $E_e(\theta)$ and $A(\phi)$ are the variation in gain with elevation angle and azimuth angle, respectively. Thus, from (A-1) and (A-2), the directivity is given by

$$D = \frac{|E_e(\theta_{\max})\,A(\phi_{\max})|^2}{\dfrac{1}{4\pi} \displaystyle\int_0^{\pi} \sin\theta\, |E_e(\theta)|^2 \, d\theta \int_0^{2\pi} |A(\phi)|^2 \, d\phi} \qquad \text{(A-3)}$$

where $\theta_{\max}$ and $\phi_{\max}$ are the peak-gain elevation and azimuth angles, respectively. If we consider the typical base-station antenna with a very narrow elevation beamwidth, then the directivity can be expressed as

$$D = G_{el}\, \frac{|A(\phi_{\max})|^2}{\dfrac{1}{2\pi} \displaystyle\int_0^{2\pi} |A(\phi)|^2 \, d\phi} \qquad \text{(A-4)}$$

where $G_{el}$ is the gain due to the elevation beamwidth. Since we are interested in the effect of mutual coupling on the horizontal antenna spacing, we will set $G_{el} = 1$ for simplicity. Now, the voltage gain for a signal arriving at angle φ onto an M-element array (using maximal ratio combining to maximize gain) is given by

$$A(\phi) = \sum_{i=1}^{M} s_i(\phi)\, s_i^{*}(\phi_{\max}) \qquad \text{(A-5)}$$

where $s_i(\phi)$ is the complex signal gain (with phase relative to the other antennas) at the ith antenna element. We have assumed that the elevation beamwidth is very narrow, so that waves arriving at angles far enough from the equator to affect the relative phase between elements have negligible effect in the directivity calculation. Thus, the directivity is given by

$$D = \frac{\left| \displaystyle\sum_{i=1}^{M} |s_i(\phi_{\max})|^2 \right|^2}{\dfrac{1}{2\pi} \displaystyle\int_0^{2\pi} \left| \sum_{i=1}^{M} s_i(\phi)\, s_i^{*}(\phi_{\max}) \right|^2 d\phi} \qquad \text{(A-6)}$$

First consider a linear array of omnidirectional elements with narrow elevation beamwidth, as with a vertical colinear array of dipoles (VCAD). In this case, the complex element responses are given by

for $i = 1, \ldots, M$ (A-7)

where d is the element spacing and φ is the angle relative to broadside. Fig. 10 shows the directivity versus antenna spacing for an M-element array with the desired angle of arrival at broadside, $\phi_{\max} = 0$. There are large fluctuations in directivity with antenna spacing (particularly at spacings which correspond to the onset of new grating lobes), showing substantial mutual coupling, with a spacing of λ having about half the gain in decibels of a spacing of λ/2. For a cylindrical array of equally spaced VCAD's with radius r and element 1 at 0°, the complex receive signal response at the ith element for a signal arriving at angle φ

is given by

$$s_i(\phi) = e^{j(2\pi r/\lambda)\cos[\phi - 2\pi(i-1)/M]}, \quad \text{for } i = 1, \ldots, M \qquad \text{(A-8)}$$

The spacing between adjacent elements is given by

$$d = 2r \sin\frac{\pi}{M} \qquad \text{(A-9)}$$

Fig. 10. Directivity versus antenna spacing for an M-element linear array with $\phi_{\max}$ = 0°.

Fig. 11. Directivity versus antenna spacing for an M-element cylindrical array with dipole elements and $\phi_{\max}$ = 90°.

Fig. 11 shows the directivity versus antenna spacing for an M-element array with $\phi_{\max}$ = 90°. Note that for spacings greater than λ/2, the gain variation with spacing is large for small M, but is less than 2 dB for M = 100. Thus, with the cylindrical array, the mutual coupling becomes much less than that of the linear array as M increases. This can be considered to be a result of adjacent elements being similar to endfire or broadside arrays, depending on their location around the circumference. Since grating lobes arise at different spacings for endfire than for broadside arrays, the mutual coupling fluctuations are somewhat reduced. To decrease the mutual coupling for small M, consider the use of cardioid pattern antennas (with narrow elevation beamwidth), rather than VCAD's, with each element pointed away from the center of the array (see Fig. 12). With the cardioid antenna, the voltage gain for the ith element at angle φ is given by [20]

$$v_i(\phi) = \sqrt{2}\,\cos\left( \frac{\pi}{4}\left( \cos\left[\phi - \frac{2\pi(i-1)}{M}\right] - 1 \right) \right), \quad \text{for } i = 1, \ldots, M \qquad \text{(A-10)}$$

Fig. 13 shows the directivity versus antenna spacing for an M-element array with cardioid elements and $\phi_{\max}$ = 90°. Note that for spacing greater than λ/2, the gain variation with spacing is greatly reduced with small M. Our results show that the directivity variation is approximately the same for other values of $\phi_{\max}$ as well. Thus, with the cylindrical array of cardioid elements, the mutual coupling generates a gain variation of less than 2 dB for spacings greater than λ/2 for all values of M. We will therefore ignore the mutual coupling in our simulations and assume a gain of M.

Fig. 12. Cylindrical array using cardioid-pattern antennas, with each element pointed away from the center of the array.

Fig. 13. Directivity versus antenna spacing for an M-element cylindrical array with cardioid elements and $\phi_{\max}$ = 90°.

ACKNOWLEDGMENT

It is a pleasure to acknowledge helpful suggestions by L. J. Greenstein.

REFERENCES

[1] S. C. Swales, M. A. Beach, D. J. Edwards, and J. P. McGeehan, "The performance enhancement of multibeam adaptive base-station antennas for cellular land mobile radio systems," IEEE Trans. Veh. Technol., vol. 39, pp. 56-67, Feb. 1990.
[2] G. K. Chan, "Effects of sectorization on the spectrum efficiency of cellular radio systems," IEEE Trans. Veh. Technol., vol. 41, pp. 217-225, Aug. 1992.
[3] J. C. Liberti and T. S. Rappaport, "Reverse channel performance improvements in CDMA cellular communication systems employing adaptive antennas," in Proc. Globecom '93, Houston, TX, Nov. 29-Dec. 2, 1993, pp. 42-47.
[4] S. P. Stapleton and G. S. Quon, "A cellular base station phased array antenna system," in Proc. Veh. Technol. Conf., Secaucus, NJ, May 18-20, 1993, pp. 93-96.
[5] B. Khalaj, A. Paulraj, and T. Kailath, "Antenna arrays for CDMA systems with multipath," in Proc. Milcom '93, Boston, MA, pp. 624-628.
[6] A. F. Naguib and A. Paulraj, "Performance of CDMA cellular networks with base-station antenna arrays," in Proc. Int. Zurich Seminar on Digital Communications, Mar. 1994, pp. 87-100.
[7] J. H. Winters, "Optimum combining in digital mobile radio with cochannel interference," IEEE J. Select. Areas Commun., vol. SAC-2, pp. 528-539, July 1984.
[8] J. H. Winters, "Signal acquisition and tracking with adaptive arrays in the digital mobile radio system IS-54 with flat fading," IEEE Trans. Veh. Technol., vol. 42, pp. 377-384, Nov. 1993.
[9] J. H. Winters, J. Salz, and R. D. Gitlin, "The impact of antenna diversity on the capacity of wireless communication systems," IEEE Trans. Commun., vol. 42, no. 2/3/4, pp. 1740-1751, 1994.
[10] T. Ohgane, H. Sasaoka, N. Matsuzawa, K. Tekeda, and T. Shimura, "A development of GMSK/TDMA system with CMA adaptive array for land mobile communications," in Proc. Veh. Technol. Conf., May 1991, pp. 172-177.
[11] "Smart antennas," Northern Telecom product brochure, 1992.
[12] W. C.-Y. Lee, "Effects on correlation between two mobile radio base-station antennas," IEEE Trans. Commun., vol. COM-21, pp. 1214-1224, Nov. 1973.
[13] W. C. Jakes, Jr. et al., Microwave Mobile Communications. New York: Wiley, 1974.
[14] Y. Yamada, K. Kagoshima, and K. Tsunekawa, "Diversity antennas for base and mobile stations in land mobile communication systems," IEICE Trans., vol. E 74, pp. 3202-3209, Oct. 1991.
[15] M. Hata, "Empirical formula for propagation loss in land mobile radio," IEEE Trans. Veh. Technol., vol. VT-29, pp. 317-335, Aug. 1980.
[16] J. Salz and J. H. Winters, "Effect of fading correlation on adaptive arrays in digital wireless communications," IEEE Trans. Veh. Technol., vol. 43, pp. 1049-1057, Nov. 1994.
[17] R. Price and P. E. Green, "A communication technique for multipath channels," Proc. IRE, vol. 46, pp. 555-570, Mar. 1958.
[18] J. R. Treichler and B. G. Agee, "A new approach to multipath correction of constant modulus signals," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-31, pp. 459-472, Apr. 1983.
[19] R. A. Monzingo and T. W. Miller, Introduction to Adaptive Arrays. New York: Wiley, 1980.
[20] W. L. Stutzman and G. A. Thiele, Antenna Theory and Design. New York: 1981, pp. 115, 116, and 141.


A Comparison of Two Systems for Downlink Communication with Base Station Antenna Arrays

Per Zetterberg

Abstract-In this paper, we compare the performance of the downlink of two systems with antenna arrays at the base stations. One system uses a third of the available spectrum per cell (1/3), but reuses the channels three times within each cell. The other system reuses all the spectrum in all cells (1/1), but does not reuse channels within a cell. Thus, the maximum number of users in the two systems is the same. In order to account for the interference, the 1/3 system directs antenna pattern nulls against same cell cochannel users, while the 1/1 system directs nulls against cochannel users in other cells. The performances of the two systems are compared as a function of the azimuth angular width (seen from the base) of the multipaths generated at the mobile, using simulations and analytical derivations. The results herein indicate that the system with 1/1 reuse has a higher performance than the system with 1/3 reuse if fast intercell handover is used or if high dynamic range power control, nulling, and synchronization are employed.

Index Terms-Adaptive arrays, antenna arrays, array signal processing, land mobile radio cellular systems, radio spectrum management, space division multiplexing.

I. INTRODUCTION

THE USE of antenna arrays at base stations has been proposed as a means of increasing the capacity (spectrum efficiency) of mobile cellular networks. For high-mobility systems such as GSM, AMPS, and D-AMPS, the capacity enhancement achievable with an antenna array is likely to be given by the downlink enhancement. This is the case because the processing applicable to the downlink is limited to beam steering types of techniques, while in the uplink high-performance diversity combining techniques can be used. Downlink beamforming has previously been studied in the papers [4]-[7], [10], [16], and [21]. The transmit techniques in [5] and [6] are based on instantaneous channel knowledge while the techniques in [4], [7], [16], and [21] are based on knowledge of the (complex) fading correlation between the antenna elements. Multicell systems are treated in [10], [16], and [21]. The three papers all show that substantial capacity gains can be achieved but that the downlink performance degrades with increasing angular dispersion (spreading of the signal energy in azimuth angle seen from the base). Reduced channel reuse distances (cluster size) are investigated in all three papers but [21] also looks at allocating multiple mobiles on the same channel within the same cell. The results in [21] indicate that multiple mobiles in a cell are more efficient than reduced reuse distances. However, directed nulls against users in other cells are not considered in [21].

In this paper we compare a system which uses a third of the available spectrum in a 120° sector cell (1/3) with a system that uses all the available spectrum (1/1) in all 120° sectors. However, the 1/3 system reuses the channels three times within each cell and thus the maximum number of users in the two systems is identical. The 1/1 system is considered both in versions with and versions without intercell nulls (i.e., nulls directed toward cochannel users in other cells). The influence of uplink power control is also investigated. The uplink power control actually influences the downlink performance since it affects which interfering mobiles the base is able to identify, i.e., determine their presence and their direction of arrival. Simulations are performed assuming the air interface of GSM. Plots of power control ranges and downlink outage probabilities are given. The outage probabilities of the two systems are compared as a function of the azimuth angular width (seen from the base) of the multipath propagation. This is important since the performance of beam steering arrays is very sensitive to the multipath propagation. The results herein indicate that the approach with 1/1 reuse has higher performance than the approach with 1/3 reuse if fast intercell handover is used or if high dynamic range power control, nulling, and synchronization are employed.

This paper is original in that it compares different channel planning and beamforming strategies, and that it provides analytical results for systems employing nulling. While being fairly close to the simulation results, the analytical expressions take only a few minutes to evaluate, whereas the simulations take several days to perform. Furthermore, the derivation of the analytical results gives insight into the systems.

The paper is organized as follows: the cellular geometry and the propagation model are introduced in Section II. An overview of the signal processing performed at the base station is also given in Section II. In Section III, a transmission technique that exploits the propagation model is derived. The concepts of same sector frequency reuse and reduced cluster size (channel reuse distance) are introduced together with their associated algorithms in Section IV. Simulation and analytical results are presented in Section V. Details of the simulations and the analytical expressions are given in Appendixes A and C, respectively.

Manuscript received December 15, 1995; revised March 30, 1998. This work was supported in part by the Swedish National Board for Industrial and Technical Development (NUTEK). Part of this work was done at the Center for PersonKommunikation (CPK), Aalborg University, Denmark. The author was with the Royal Institute of Technology (KTH), Stockholm, Sweden. He is now with Radio Design AB, S-164 28 Kista, Sweden. Publisher Item Identifier S 0018-9545(99)05739-4.

Reprinted from IEEE Transactions on Vehicular Technology, Vol. 48, No.5, pp. 1356-1370, September 1999.

II. PRELIMINARIES

The following three sections introduce the cellular geometry, the propagation model, and the signal processing at the base station, respectively.

A. Cellular Systems

The coverage area of a mobile radio system is divided into a network of cells, where each cell is covered by a base station. The cells are assigned a subset of the available spectrum such that the same frequency cells are sufficiently separated spatially. In the theoretical analysis of such systems it is common to assume hexagonal cells with the base station in the center of the cell, as depicted in Fig. 1. Contrary to the usual procedure, the cell labeling in Fig. 1 does not refer to the spectra used in the cells but will be used as an aid in describing algorithms and results in the following sections. In this report, the hexagonal cells are also divided into three 120° sector subcells or sectors. The subcells are covered by the linear arrays. The three subcells of a cell are labeled with the cell number as prefix and "a," "b," or "c" as suffixes. The "a," "b," and "c" subcells are always oriented upwards, to the right, and to the left, respectively. Two frequency reuse patterns denoted 1/1 and 1/3 are considered. In 1/1 reuse, all sectors use all the available bandwidth while in 1/3 reuse, "a," "b," and "c" subcells use disjoint subsets of the available spectrum. The sectors that are assumed to significantly disturb the downlink in subcell 1a are 1b, 1c, 2b, 2c, 3b, 7c, 4a, 5a, and 6a, and 4a, 5a, 6a when 1/1 and 1/3 reuse is employed, respectively.

Fig. 1. Part of the cellular system.

Fig. 2. The uniform linear array (ULA) and the polar coordinate system.

B. Propagation Modeling

Consider a base station which employs the same or two different antenna arrays for up- and downlink. The gain and phase of the antenna elements (the latter relative to antenna element number one), assuming a single emitter at azimuth direction $\theta$ (as seen from the base), are assumed to be given by the complex-vector valued "array manifold" functions $\mathbf{a}^{UL}(\theta)$ and $\mathbf{a}^{DL}(\theta)$, in up- and downlink, respectively. In the case of a linear uplink array of uniformly distributed identical antenna elements as depicted in Fig. 2, $\mathbf{a}^{UL}(\theta)$ is given by

$$\mathbf{a}^{UL}(\theta) = p^{UL}(\theta) \times \left[1,\ \exp\!\left(-j2\pi f^{UL}\Delta^{UL}\sin(\theta)/c\right),\ \ldots,\ \exp\!\left(-j2(m-1)\pi f^{UL}\Delta^{UL}\sin(\theta)/c\right)\right]^{T} \tag{1}$$

where $p^{UL}(\theta)$, $\Delta^{UL}$, and $f^{UL}$ are the element pattern, antenna spacing, and carrier frequency of the uplink, respectively. The parameter $m$ is the number of antenna elements, which for simplicity is assumed to be the same in the up- and downlink array. The multidimensional complex-valued base-band time-dependent downlink impulse response between the antenna array at the base and the single antenna at the mobile is denoted $\mathbf{h}^{DL}(t, \tau)$, where $t$ is time and $\tau$ is delay. The vector of signals transmitted on the multiple antennas of the base antenna array, $\mathbf{x}^{DL}(t)$, to a certain mobile, is given by¹

$$\mathbf{x}^{DL}(t) = \mathbf{w}\, s(t) \tag{2}$$

where $\mathbf{w}$ is a vector of complex weights and $s(t)$ is the waveform modulated by the transmitted bits. From these assumptions it follows that the signal received at the mobile, $u^{DL}(t)$, is given by

$$u^{DL}(t) = \int_{-\infty}^{\infty} \mathbf{w}^{*}\mathbf{h}^{DL}(t, \tau)\, s(t - \tau)\, d\tau \tag{3}$$

where $*$ denotes complex conjugate transpose. If the transmitted signal $s(t)$ is generated from a linear modulation, it can be written as

$$s(t) = \sum_{q} I_q\, p_s(t - qT_b) \tag{4}$$

where $I_q$ are symbols drawn from some complex alphabet, $p_s(t)$ is the modulation pulse shape, and $T_b$ is the symbol period. Combining (3) and (4), the following relation between the symbols transmitted from the base, $I_q$, and the signal sampled at times $t = qT_b + \Delta T^{DL}$ at the mobile, is obtained:

$$u^{DL}(qT_b + \Delta T^{DL}) = \sum_{n=-\infty}^{\infty} \mathbf{w}^{*}\mathbf{h}_n^{DL}(qT_b + \Delta T^{DL})\, I_{q-n} \tag{5}$$

where the "taps," $\mathbf{h}_n^{DL}(t)$, are given by

$$\mathbf{h}_n^{DL}(t) = \int_{-\infty}^{\infty} \mathbf{h}^{DL}(t, \tau)\, p_s(nT_b + \Delta T^{DL} - \tau)\, d\tau. \tag{6}$$

¹In the case of transmission to multiple mobiles on the same carrier at the same time, the signals to the different mobiles are superpositioned, i.e., $\mathbf{x}^{DL}(t) = \sum_i \mathbf{w}_i s_i(t)$.

Without loss of generality, the normalizations (7) and

$$\int_{-\infty}^{\infty} |p_s(\tau)|^2\, d\tau = 1 \tag{8}$$

are introduced. The following discrete time version of (8) is also assumed:

$$T_b \sum_{n=-\infty}^{\infty} |p_s(nT_b + \Delta T)|^2 = 1 \quad \text{for all } \Delta T. \tag{9}$$

Equation (9) follows from (8) if $p_s(\tau)$ is band limited to $f \le 1/2T_b$. In practice this is not the case;² however, we judge that the errors imposed by the assumption are minor.

²Calculations we have made using the pulse shape of a linear approximation of the GSM modulation have shown that the left side of (9) varies between 0.0094 and 1.0015 using different $\Delta T$.

Assuming that the propagation can be described as the sum of $N$ rays, the impulse response $\mathbf{h}^{DL}(t, \tau)$ can be written as

$$\mathbf{h}^{DL}(t, \tau) = \sum_{l=1}^{N} g_l \exp(j2\pi d_l^{DL} t + j\beta_l^{DL})\, \mathbf{a}^{DL}(\theta_l)\, \delta(\tau - \tau_l) \tag{10}$$

where $\theta_l$, $g_l$, $d_l^{DL}$, $\beta_l^{DL}$, and $\tau_l$ are the azimuth direction (as seen from the base), complex gain, Doppler frequency, phase offset, and delay of the $l$th ray, respectively, and $\delta(\tau)$ is the Dirac function. Assuming that the azimuth angle of the $l$th ray is $\phi_l$, as seen from the mobile (using the speed vector of the mobile as reference), the Doppler frequency $d_l^{DL}$ is given by

$$d_l^{DL} = \frac{f^{DL}\cos(\phi_l)\, v}{c} \tag{11}$$

where $f^{DL}$, $v$, and $c$ are the downlink carrier frequency, the mobile speed, and the speed of light, respectively. The ray parameters $N$, $\theta_l$, $g_l$, $d_l^{DL}$, and $\tau_l$ vary with time. However, they are assumed to be practically fixed during the time the mobile moves a couple of wavelengths. Combining (6) and (10) yields

$$\mathbf{h}_n^{DL}(t) = \sum_{l=1}^{N} g_l \exp(j2\pi d_l^{DL} t + j\beta_l^{DL})\, \mathbf{a}^{DL}(\theta_l)\, p_s(nT_b + \Delta T^{DL} - \tau_l). \tag{12}$$

Using (5) and (7) the following expression for the energy, per symbol, received at the mobile is obtained:

$$E(t) = T_b \sum_{n=-\infty}^{\infty} |\mathbf{w}^{*}\mathbf{h}_n^{DL}(t)|^2. \tag{13}$$

We define the local-mean symbol energy, $\bar{E}$, at time $t_0$ as

$$\bar{E} = \frac{1}{T}\int_{t=t_0-(T/2)}^{t_0+(T/2)} E(t)\, dt \tag{14}$$

where $T$ is a time-interval short enough for the ray parameters to remain fixed and long enough for the integration to converge. Combining (9) and (12)-(14) yields

$$\bar{E} = \sum_{l=1}^{N} |g_l|^2\, |\mathbf{w}^{*}\mathbf{a}^{DL}(\theta_l)|^2. \tag{15}$$

Defining the downlink channel covariance matrix $\mathbf{R}^{DL}$ as

$$\mathbf{R}^{DL} = \sum_{l=1}^{N} |g_l|^2\, \mathbf{a}^{DL}(\theta_l)\, \mathbf{a}^{DL,*}(\theta_l) \tag{16}$$

(15) can be rewritten as

$$\bar{E} = \mathbf{w}^{*}\mathbf{R}^{DL}\mathbf{w}. \tag{17}$$

Let us now define the path gain $G$ as the energy delivered at the desired mobile assuming transmission with a single antenna and unit power. Consider first transmission using antenna element 1 of the array. In this case it follows from (17) that the desired energy at the mobile is given by

$$[\mathbf{R}^{DL}]_{1,1} \tag{18}$$

where $[\mathbf{M}]_{r,c}$ denotes the element in row $r$ and column $c$ of the matrix $\mathbf{M}$. If transmission is performed using antenna element $r$ we obtain similarly

$$[\mathbf{R}^{DL}]_{r,r}. \tag{19}$$

We define $G$ as the energy delivered at the $k$th mobile assuming transmission with a single antenna in the array, averaged over the elements of the array, i.e.,

$$G = \frac{1}{m}\sum_{r=1}^{m} [\mathbf{R}^{DL}]_{r,r} \tag{20}$$

$$\phantom{G} = \frac{1}{m}\operatorname{Trace}\{\mathbf{R}^{DL}\}. \tag{21}$$

Similarly (as in the downlink), the vector of uplink signals $\mathbf{x}^{UL}(t)$, sampled at times $t = qT_b + \Delta T^{UL}$ at the base, is given by

$$\mathbf{x}^{UL}(qT_b + \Delta T^{UL}) = \sum_{n=-\infty}^{\infty} \mathbf{h}_n^{UL}(qT_b + \Delta T^{UL})\, I_{q-n}^{UL} \tag{22}$$

where $\mathbf{h}_n^{UL}(t)$ is given by

$$\mathbf{h}_n^{UL}(t) = \sqrt{P^{UL}} \sum_{l=1}^{N} g_l \exp(j2\pi d_l^{UL} t + j\beta_l^{UL})\, \mathbf{a}^{UL}(\theta_l)\, p_s(nT_b + \Delta T^{UL} - \tau_l) \tag{23}$$

and $P^{UL}$ is the mobile transmit power. From the uplink signals $\mathbf{x}^{UL}(qT_b + \Delta T^{UL})$, it is possible to estimate the uplink bit stream $I_q^{UL}$ by exploiting training sequences and channel coding [1], [20] (even if cochannel interference and noise are present). Given $I_q^{UL}$ it is then straightforward to estimate


$\mathbf{h}_n^{UL}(t)$. Unfortunately, it is impossible to estimate the instantaneous downlink impulse response $\mathbf{h}_n^{DL}(t)$ from the corresponding uplink entity $\mathbf{h}_n^{UL}(t)$. The explanation for this can be found in the assumption that the up- and downlink carrier frequencies differ by more than 10 MHz, which causes the instantaneous up- and downlink impulse responses to become unequal and uncorrelated functions of time [22]. We therefore propose that the downlink beamforming vectors are calculated using the downlink channel covariance matrix of the desired and cochannel users as input. We also propose that this matrix is obtained by first estimating the uplink channel covariance matrix defined by

$$\mathbf{R}^{UL} = P^{UL}\sum_{l=1}^{N} |g_l|^2\, \mathbf{a}^{UL}(\theta_l)\, \mathbf{a}^{UL,*}(\theta_l) \tag{24}$$

and then employing one of the four methods listed in Appendix D to form an estimate of $\mathbf{R}^{DL}$.³ The uplink channel covariance matrix can be estimated from estimates of the uplink channel by exploiting the following relationship:

$$\mathbf{R}^{UL} = \frac{1}{T}\int_{t=t_0-(T/2)}^{t_0+(T/2)} T_b \sum_{n=-\infty}^{\infty} \mathbf{h}_n^{UL}(t)\, \mathbf{h}_n^{UL,*}(t)\, dt. \tag{25}$$

This relationship is obtained from (23) and (24).

³This actually requires $P^{UL}$ to be known. Otherwise $\mathbf{R}^{DL}$ becomes unknown to the degree of a scalar factor.

Similarly as in the downlink case, we define the path gain as the power received in a single antenna element of the array, assuming unit power is employed in the mobile, averaged over the elements of the array. We also assume that the path gain is reciprocal, i.e.,

$$G = \frac{\operatorname{Trace}\{\mathbf{R}^{DL}\}}{m} \tag{26}$$

$$\phantom{G} = \frac{\operatorname{Trace}\{\mathbf{R}^{UL}\}}{P^{UL}\, m}. \tag{27}$$

This implies that the gain is the same in the up- and downlink arrays.

1) Special Case: Flat Rayleigh with Gaussian-Distributed Power: The algorithms and systems introduced in the paper are based on the very general propagation model described above. However, the simulations and the analysis are based on the assumption of frequency flat channels, Rayleigh-distributed signal envelopes, Gaussian distributed azimuth angular power distribution, and uniform linear arrays in both up- and downlink. The frequency flat assumption appears mathematically as

$$\tau_l = \tau_0 \quad \text{for } l = 1, \ldots, N \tag{28}$$

using the framework above. Combining (12) and (28) yields

$$\mathbf{h}_n^{DL}(t) = \mathbf{v}^{DL}(t)\, p_s(nT_b + \Delta T^{DL} - \tau_0) \tag{29}$$

where $\mathbf{v}^{DL}(t)$ is given by

$$\mathbf{v}^{DL}(t) = \sum_{l=1}^{N} g_l \exp(j2\pi d_l^{DL} t + j\beta_l^{DL})\, \mathbf{a}^{DL}(\theta_l). \tag{30}$$

Next, it is assumed that the number of rays $N$ is large, with the power of all rays given by (31) and the angles $\theta_l$ Gaussian distributed with mean at the position of the mobile, $\theta$, and standard deviation $\sigma$. From these assumptions it follows that $\mathbf{v}^{DL}(t)$ is a complex Gaussian zero-mean circular symmetric random vector with covariance matrix $\mathbf{R}^{DL}$, i.e.,

$$E\{\mathbf{v}^{DL}(t)\} = \mathbf{0} \tag{32}$$

$$E\{\mathbf{v}^{DL}(t)\,(\mathbf{v}^{DL}(t))^{T}\} = \mathbf{0} \tag{33}$$

$$E\{\mathbf{v}^{DL}(t)\,(\mathbf{v}^{DL}(t))^{*}\} = \mathbf{R}^{DL}. \tag{34}$$

The assumption of Gaussian-distributed $\theta_l$ yields

$$\mathbf{R}^{DL} = G\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\nu^2/(2\sigma^2)\right)\mathbf{a}^{DL}(\theta + \nu)\, \mathbf{a}^{DL,*}(\theta + \nu)\, d\nu. \tag{35}$$

For the case of a linear array of uniformly distributed antenna elements, $\mathbf{R}^{DL}$ can be closely approximated as [22]

$$\mathbf{R}^{DL} = G\,\mathbf{R}(\theta, \sigma) \tag{36}$$

where the row $r$, column $c$ element of the complex matrix-valued function $\mathbf{R}(\theta, \sigma)$ is defined as

$$[\mathbf{R}(\theta, \sigma)]_{r,c} = \exp\!\left(-\tfrac{1}{2}\tilde{\sigma}^2 (c - r)^2\right) \times \exp\!\left(j(c - r)\,\frac{2\pi f^{DL}\Delta^{DL}}{c}\sin(\theta)\right) \tag{37}$$

where $\Delta^{DL}$ and $f^{DL}$ are the antenna spacing and carrier frequency of the downlink, the $c$ in the denominator is the speed of light, and $\tilde{\sigma}$ is given by

$$\tilde{\sigma} = \frac{\pi^2 \Delta^{DL} f^{DL}}{c \times 90°}\cos(\theta)\,\sigma. \tag{38}$$

The path gain $G$ is assumed to be given by

$$G = \left(\frac{R}{r}\right)^{\gamma} |p(\theta)|^2\, L \tag{39}$$

where $R$ is the cell radius, $r$ is the base-mobile distance, $\gamma$ is the path-loss exponent, $p(\theta) = \cos(\theta)$ is the antenna diagram of the identical antenna elements (assumed identical in the up- and downlink), and $L$ is a lognormally distributed random variable. This means that $10\log(L)$ is a normally distributed random variable with mean zero and standard deviation $\sigma_L$. The factor $L$ represents the path-loss variations due to shadowing objects such as buildings and hills. In practice this loss varies with time. In the simulations and analysis of this paper, it is, however, assumed constant for the time-interval considered. In the uplink we obtain similarly

$$\mathbf{h}_n^{UL}(t) = \sqrt{P^{UL}}\,\mathbf{v}^{UL}(t)\, p_s(nT_b + \Delta T^{UL} - \tau_0) \tag{40}$$

where $\mathbf{v}^{UL}(t)$ is a zero-mean complex Gaussian random vector. The uplink channel covariance matrix is also given by (36)-(39), but with "DL" replaced by "UL" and multiplied by the mobile transmit power $P^{UL}$. However, due to the assumption of different up- and downlink frequencies, $\mathbf{v}^{DL}(t)$ and $\mathbf{v}^{UL}(t)$ are uncorrelated.
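The closed form (36)-(38) is easy to evaluate numerically. The following Python sketch builds $\mathbf{R}(\theta, \sigma)$ for a uniform linear array and recovers the path gain through the trace relation (21). It is only an illustration and not the author's simulation code: the function names, the 900-MHz carrier, and the half-wavelength spacing used in the usage lines are assumptions made for the example, and $\sigma$ is taken to be in degrees, as the factor $90°$ in (38) suggests.

```python
import cmath
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def spread_covariance(theta_deg, sigma_deg, m, spacing_m, freq_hz):
    """Return the m x m matrix R(theta, sigma) of (37) as nested lists.

    theta_deg: mean azimuth of the mobile, degrees from broadside.
    sigma_deg: standard deviation of the angular spread, degrees.
    """
    theta = math.radians(theta_deg)
    # Electrical phase step between adjacent elements per unit sin(theta).
    phase_step = 2.0 * math.pi * freq_hz * spacing_m / SPEED_OF_LIGHT
    # Equation (38): spread expressed as electrical phase per element step.
    sigma_tilde = (math.pi ** 2 * spacing_m * freq_hz
                   / (SPEED_OF_LIGHT * 90.0)) * math.cos(theta) * sigma_deg
    matrix = []
    for r in range(m):
        row = []
        for c in range(m):
            decay = math.exp(-0.5 * sigma_tilde ** 2 * (c - r) ** 2)
            phase = cmath.exp(1j * (c - r) * phase_step * math.sin(theta))
            row.append(decay * phase)
        matrix.append(row)
    return matrix


def path_gain_from_covariance(r_dl):
    """Path gain G = Trace{R} / m, as in (21)."""
    m = len(r_dl)
    return sum(r_dl[i][i] for i in range(m)).real / m


# Usage (assumed numbers): 4 elements, half-wavelength spacing at 900 MHz.
wavelength = SPEED_OF_LIGHT / 900e6
R = spread_covariance(20.0, 5.0, 4, wavelength / 2, 900e6)
```

The diagonal of the returned matrix is one, so scaling it by $G$ as in (36) gives a covariance matrix whose trace is $Gm$. The magnitude of the off-diagonal elements falls as the spread $\sigma$ grows, which is the loss of spatial coherence that limits beam steering.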


C. Outline of the Signal Processing at the Base Stations

In the previous section, we have argued that the downlink beamforming should be based on the downlink covariance matrix $\mathbf{R}^{DL}$ rather than the instantaneous downlink impulse responses. The reason is simply that the instantaneous downlink impulse response cannot be estimated by the base without feedback, i.e., using uplink signals only. The previous section has also given some indication of how the matrix $\mathbf{R}^{DL}$ could be estimated. In this section, we will try to clarify that issue further.

Consider a system consisting of multiple base stations and mobiles. Let us denote the up- and downlink channel covariance matrices associated with the propagation between the $k$th base station and the $i$th mobile $\mathbf{R}^{UL}_{k,i}$ and $\mathbf{R}^{DL}_{k,i}$, respectively. Let us further denote tap $n$ of the uplink impulse response between the $i$th mobile and $k$th base by $\mathbf{h}^{UL}_{n,k,i}(t)$. The path gain is denoted $G_{k,i}$. For the purpose of downlink beamsteering, the $k$th base needs the downlink channel covariance matrix of its desired user and of the cochannel users. We therefore propose that the $k$th base station estimates the downlink channel covariance matrix of the users in its own and neighboring cochannel cells by employing the "receiver module" sketched in Fig. 3. This module estimates the downlink channel covariance matrix of the $i$th user given the training sequence and frequency hopping pattern of that user (in the case of a TDMA system employing frequency hopping). Thus, such a receiver module has to be applied for all users within the desired cell as well as those of the neighboring cells.

The input to the module is the sampled uplink signals, and the training sequence and frequency hopping pattern of the considered mobile. The module internally employs seven signal processing units. The first of these is the mixer. This module down-converts and filters the received signals to produce a base-band signal. The frequencies and filter employed are determined from the frequency hopping pattern and power spectrum of the considered transmission. Given the base-band signal, the "uplink combiner" estimates the transmitted bit stream (possibly soft outputs) by employing methods such as in [1] and [20]. The uplink combiner employs the training sequence of the considered user. The output of the uplink combiner is fed into a decoder which utilizes the channel coding to remove errors in the bit estimation process. The decoder also delivers an indicator $i_{k,i}$, which shows whether the decoded bits are reliable or not. Given the decoded bit stream, the "postdecoding channel estimation" unit forms an estimate of the transmission channel $\mathbf{h}^{UL}_{n,k,i}(t)$, assuming that the actually transmitted symbols agree with those obtained from the decoder. With $\mathbf{h}^{UL}_{n,k,i}(t)$ at hand, $\mathbf{R}^{UL}_{k,i}$ is straightforwardly estimated using (25). Finally, the $\mathbf{R}^{UL}_{k,i}$ estimate is mapped onto an estimate of $P_i^{UL}\mathbf{R}^{DL}_{k,i}$. In Appendix D, four methods for doing this are proposed. With $P_i^{UL}$ known, an estimate of $\mathbf{R}^{DL}_{k,i}$ can finally be obtained. Thus, the $k$th base must somehow be informed of the power used by the $i$th mobile.

Fig. 3. Block diagram of the "receiver module."

In the following section, we introduce a transmission algorithm which uses the channel covariance matrices of the desired and interfering users as input. This method involves the solution of a generalized eigenvalue problem. In the case of frequency hopping, such a problem must be solved for each time-slot and each mobile. As a result, the proposed system will require an extreme amount of computational power. Thus, simplifications for the proposed algorithms have to be made before implementing the system in practice.

III. A TRANSMISSION ALGORITHM

Con sider the situation where q base stations are transmitting simultaneously (on the same frequency) to Q mobiles." The three sec tors of a cell-site are regarded as three different base stations. Let the kth mobile be connected to the k base station. The base then transmit s using the weighting vector W k . The purpose of the section is to derive a good choice of W k, assuming that the downlink channel correlation matrix of the desired and some of the cochannel user s are known . The notations introduced in Section II-C are used . Seen from the viewpoint of the Q - 1 cochannel mobiles the signal transmitted from base k to mobile k is interference. Therefore, these mobiles are referred to as the interfered mobile s. Before givin g a criterion function for the choice of W k the ad hoc restriction is made that the "antenna arra y gain ," in the direction of the desired mobil e should be unity, i.e., _ Ek. k =

Ck . k

=

Trace {

Rr.l }

m

(4 1)

From (17 ) E k . k is given by E k. k

• DL = w kR k . kw/c .

(42)

Simulations have indicated that the constraint (4 I) is a good choice although no optimality is claimed. Assume for a moment that no other base station than base station k creates interference. Then the (mean) desired signal (or carrier) power 4This assumption seems to be inconsistent with the allocation of multiple mobile s on the same channel at the same base . However, by interpreting the results of the section pragmatically, the obtained algorithm can be used also in that case. see Section IV·A2.

539

to the (mean) interference signal power, SIR i , at the ith mobile, is given by SIR'i

== E i , i

Ek,i

w~RDLw .:I, 'I, 'l,1,

is made, i.e., the sum of the interference to carrier ratios at these mobiles is assumed to be proportional to II w k 11 2 . This approximation is reasonable if the positions of these mobiles are well separated. The constant in (49) will be obtained from loose reasoning and will vary between systems.

«;*RDL k,i w k Trace{R~t }

where the last equality follows since the ith base station also has a unity antenna array gain toward its desired mobile (all base stations employ the same transmission algorithm). Now, the criterion function for deriving W~: is selected as Wk

==

a~g min {. t

1,=1. :f.k

SIR;-l}"

(44)

Thus, in a sense, WI..: is chosen to minimize the sum of the inverse of the signal to interference ratios of the mobiles in the system. Combining (43) and (44) yields Wk

==

(45)

arg lllin{wZMwk} w

where the matrix, M, is given by

q

M =

'Tn

~

r:

{DL}

Trace R /,

l,=l,:j;k

t

DL

Rk,l,"

(46)

In practice, the interference at the ith mobile will of course not only come from base k, but from all bases. However, the criterion (45), which was derived at ignoring such interference, is still used. The solution to (46), subject to the constraint (41), is given by

W

where e maximizes

==

Trace { R~.\ } --~--e

me*RDL k,k e

DLe e*R k,k e*Me .

(47)

(48)

(RP\,

L.J

'i=p+1, #k

m DL { DL} R k i Trace R. . I

t , I,

rv rv

constant x I

(49)

In this section, we propose a number of systems which all use the transmission algorithm of the previous section. The systems differ in terms of channel allocation, power control, synchronization, and identification of interfered users. In terms of capacity enhancement strategy the systems can be divided into two approaches: same cell frequency reuse (SCFR) and reduced cluster size (RCS). The two groups of systems are described in the following sections.

A. Same Cell Frequency Reuse (SCFR) In this approach the capacity enhancement is achieved by allocating d mobiles within a cell, on the same channel. Thus, the base stations transmit to d mobiles simultaneously on the same channel. To make this feasible, dynamic channel allocation is applied to separate the cochannel users in the cell (with respect to the mobiles dominant direction of arrival). In principle, the deployment of SCFR in the downlink does not imply that the same capacity enhancement strategy has to be used in the uplink. However, if it is used, channel allocation and power control must be employed to combat near far effects in the uplink processing. Since some cellular systems dictate a one to one relation between uplink and downlink channels, e.g., GSM, we assume that the mobiles simultaneously accessing the same channel in the downlink also do so in the uplink. 1) Channel Allocation and Power Control: The mobiles in the cell are sorted with respect to their path gain to their desired base. Then, they are divided into power groups say r 1 ~ r 2, . . . such that all the mobiles in r 1 are stronger than all the mobiles in r 2 and so on. Power control is applied such that all mobiles on the same channel are received equally strongly (averaged over fast fading). The common power level is given by O.5G m a x +O.5G m in , where G In ax and G In in are the maximum and minimum path gain among the mobiles in a decibel scale. The power groups are allocated in timefrequency space such that their dissimilar power does not negatively affect the uplink processing. The dynamic channel allocation algorithm of [21] is applied on each of the power groups. This algorithm begins by sorting the mobiles in the power group with respect to their (}k, k angle, where (}k, k is the dominating azimuth direction in the propagation between the kth mobile and its desired base-station. 
In the special case treated in Section II-Bl, Ok, k is simply the mean of the Gaussian angular power distribution. Assuming that the sorted list of angular positions is "71, ... , tt«, x d» the mobiles corresponding to the angles T/i, "7i+n c ' ••• , "7i+(d-1)n c are allocated to the same channel. If frequency hopping is applied, the same channel users hop together from frequency to frequency.

CPr

It can be shown that e is the generalized eigenvector associated with the matrix pair M), corresponding to the largest eigenvalue, [13]. Methods to compute e may be found in [8], where it should be noted that R~\ is Hermitian and positive semidefinite and M is Hermitian' and positive definite (from the assumptions below). The parameters required to calculate W k are the downlink covariance matrix of the desired and cochannel mobiles and the desired signal power at the cochannel mobiles. In practice, the parameters required for the cochannel mobile will only be known for a subset of them, say mobile 1, ... , p. These mobiles will be referred to as the identified interfered mobiles. In order to account for the remaining mobiles the approximation Q ~

IV. TWO CAPACITY ENHANCEMENT APPROACHES

(43)

IrnwZRp,~Wk

-


2) User Tracking and Nulling: The users on the same channel within the cell are considered as the identified interfered users in the framework of Section III. Since the considered base station and the base station of the identified interfered user are the same base, it follows that $R^{DL}_{k,i} = R^{DL}_{i,i}$ in (46). The constant in (49) is chosen from the following crude reasoning: Assume that only mobiles in the first tier of cochannel cells are interfered. Assume that the path gains to these mobiles are given by $G_{k,i} = (R/D)^{\gamma}$, where $D$ is the distance between the considered base and the first tier of cochannel base stations. Assume further that these mobiles have received desired power $P_i = 1$. Geometry yields that the number of significant unidentified interfered mobiles is approximately $d \times l$, where $l = 6$ and $l = 3$ using 1/1 and 1/3 reuse planning, respectively. With these assumptions we obtain

$$\sum_{i=p+1,\, i \neq k}^{q} R^{DL}_{k,i} = (R/D)^{\gamma} \sum_{i=p+1,\, i \neq k}^{q} \frac{m\,R^{DL}_{k,i}}{\mathrm{Trace}\{R^{DL}_{k,i}\}} \approx (R/D)^{\gamma}\, l\, d\, I \qquad (50)$$

where the last approximation is accurate if the unidentified interfered users are well spread as seen from the kth base. The constant in (49) is thus chosen as

$$\mathrm{constant} = (R/D)^{\gamma}\, l\, d. \qquad (51)$$

B. Reduced Cluster Size (RCS)

In this approach the capacity enhancement is obtained by changing the frequency allocation plan, i.e., the cluster size. For instance, by going from a 1/3 to a 1/1 reuse pattern, a threefold capacity enhancement is obtained. The antenna array base stations are employed to compensate for the increased intercell interference. Several versions of this approach are investigated. The versions differ in their power control settings and in whether directed nulls are used. The development below assumes that uplink cochannel mobiles are downlink cochannel mobiles, i.e., mobiles that are interferers in the uplink are interfered in the downlink. For TDMA systems this implies that the time slots of the base stations are synchronized.

Fig. 4. Illustration of the considered situation (user k, base k, and the interfered user of the neighboring base).

1) Power Control: Consider the situation with two mobiles: the kth and the ith mobile, as illustrated in Fig. 4. The kth mobile is connected to the kth base station and the ith mobile to the ith base station. The path gain of the desired signal divided by the path gain of the interfering signal in the uplink at base k is given by $G_{k,k}/G_{k,i}$. If the kth and ith mobile use transmit power $P_k^{UL}$ and $P_i^{UL}$, respectively, the quotient of the effective uplink path gains at base k, $QPG_k^{UL}$, is given by

$$QPG_k^{UL} = \frac{G_{k,k}\,P_k^{UL}}{G_{k,i}\,P_i^{UL}}. \qquad (52)$$

The quotient between the downlink path gains at the ith interfered mobile is given by

$$QPG_i^{DL} = \frac{G_{i,i}}{G_{k,i}}. \qquad (53)$$

Herein, the power control law

$$P_i^{UL} = P_0\, G_{i,i}^{-e} \qquad (54)$$

is used, where $e$ is a design parameter. If $e = 1$ then

$$QPG_k^{UL} = \frac{G_{i,i}}{G_{k,i}}. \qquad (55)$$
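The relation between the uplink and downlink quotients can be checked numerically. The sketch below assumes linear-scale path gains and a power control law of the form $P^{UL} = P_0 G^{-e}$ applied to each mobile's gain to its own base; the function names and the example gain values are illustrative only.

```python
def uplink_power(g_own: float, p0: float = 1.0, e: float = 1.0) -> float:
    """Transmit power of a mobile whose path gain to its own base is g_own."""
    return p0 * g_own ** (-e)


def qpg_uplink(g_kk: float, g_ki: float, p_k: float, p_i: float) -> float:
    """Quotient of effective uplink path gains at base k (desired / interferer)."""
    return (g_kk * p_k) / (g_ki * p_i)


def qpg_downlink(g_ii: float, g_ki: float) -> float:
    """Quotient of downlink path gains at mobile i (own base / base k)."""
    return g_ii / g_ki


if __name__ == "__main__":
    g_kk, g_ii, g_ki = 1e-9, 4e-10, 1e-11  # hypothetical linear path gains
    for e in (1.0, 0.3):
        ul = qpg_uplink(g_kk, g_ki, uplink_power(g_kk, e=e), uplink_power(g_ii, e=e))
        print(f"e={e}: uplink quotient {ul:.2f}, downlink quotient {qpg_downlink(g_ii, g_ki):.2f}")
```

With $e = 1$ the transmit powers cancel the own-base gains, so the uplink quotient seen at base k coincides with the downlink quotient at mobile i; with $e = 0.3$ the two differ, which is why the base then needs extra information about the interfered mobile.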

Thus, $QPG_k^{UL} = QPG_i^{DL}$. This implies that a mobile which is received strongly at base k will be vulnerable to transmission from base k. This is a desirable property since it implies that the base will have a good chance of identifying (i.e., estimating the channel covariance matrix of) the mobiles with poor downlink signal-to-interference ratio and subsequently suppressing the interference transmitted toward these mobiles. Another advantage of $e = 1$ is that all mobiles within the cell are received equally strongly at the base, and the adjacent channel interference is therefore small. In practice, the use of $e = 1$ may be prohibited by the power control range allowed by the mobile. In that case a smaller $e$ has to be employed.

2) Dynamic Channel Allocation: When $e \neq 1$ the power of the mobiles may vary significantly, which causes adjacent channel interference problems. We propose that this problem is solved using dynamic channel allocation in the following manner: Sort the mobiles in the subcell with respect to their path gain to the base, $G_{k,k}$. Divide the mobiles into power groups, say $\Gamma_1, \Gamma_2, \ldots$, such that all the mobiles in $\Gamma_1$ are stronger than all the mobiles in $\Gamma_2$ and so on. Finally, allocate the groups in time-frequency space such that the dissimilar power of the power groups does not negatively affect the uplink processing. If $e = 1$ is used, random channel allocation is simply employed.

3) User Tracking and Nulling: We will consider two versions with respect to nulling: with and without directed nulls.

i) With directed nulls: The required entities for the identified interfering users in (49) are $R^{DL}_{k,i}/G_{i,i}$. When power control with $e = 1$ is used, the following equality holds:

$$P_i^{UL} R^{DL}_{k,i} = P_0\, R^{DL}_{k,i}/G_{i,i}. \qquad (56)$$

Since $P_i^{UL} R^{DL}_{k,i}$ is obtained from the receiver module, see Section II-C, and $P_0$ is a constant common to all cells, the required entity $R^{DL}_{k,i}/G_{i,i}$ is obtained without knowledge of $P_i^{UL}$. If $e \neq 1$, however, the kth base must be informed of the uplink power employed by the ith mobile. In order to choose the constant in (49) we assume that the base is able to identify interfering users with an effective path gain larger than $P_{\min}$, and that the value of $P_{\min}$ is known by the base. This means that the users $i$ which satisfy

$$P_i^{UL} G_{k,i} < P_{\min} \qquad (57)$$

are not identified by the kth base. Equation (57) yields

G. C-t>G ru.,

fJ

1./

-

1

1.1.

T

I=p+l.-:j:.k

(J

<

L

k. i

(58)

RDL

n-1Gf'-1 k.l P IfllllrO Ti,1 -G T J.:. I

i=p+ 1. -:j:.1.. ~ T

R DL G

~

Pin in P()- I C'w- l I

(59)

(60)

T

where $r$ is the number of significant unidentified cochannel users, and $G$ is the median value of $G_{i,i}$. The approximation (59) follows from the assumption that the number of unidentified cochannel users is large and that they are evenly distributed in azimuth angle as seen from the kth base. Based on geometry, $r$ is set to 11 in the 1/1 frequency reuse case and to 5 in the 1/3 frequency reuse case. The median value $G$ is assumed known. In the special case of propagation according to Section II-B1, it has been estimated that $G = 0.1$, under the assumptions used in Section V.

ii) Without directed nulls: In this case, no interfering mobiles are identified in the framework of Section III, i.e., $p = 0$. It follows that the constant in (49) can be chosen arbitrarily. With this approach synchronized bases are not a requirement.

V. RESULTS

The downlink outage probability and uplink power control requirements of the proposed systems are investigated by means of simulations and analytical calculations in the two examples below. The propagation model of Section II-B1 is employed. Other issues, e.g., uplink dynamic range requirements, are treated in [22].

Example 1: A 1/3 same cell frequency reuse (SCFR) system with $d = 3$ users per channel is compared with a 1/1 reduced cluster size (RCS) system. The comparison is made by simulations and analytical derivations. The RCS system is considered in four versions: with or without intercell nulls and using $e = 1$ or $e = 0.3$ (Section IV-B1). The air interface is GSM with frequency hopping and discontinuous transmission [9]. Each simulation considers 48 users in each subcell using six 200-kHz carriers with eight time slots multiplexed on each. This is approximately seven times less bandwidth than the corresponding one-antenna-per-sector system would require for the 48 users [3]. The user activity factor is 0.5, i.e., a user transmits in 50% of its time slots. Linear arrays of ten antenna elements, $m = 10$, with half-wavelength spacing, $\Delta = 0.5\lambda$, are employed to cover the 120-degree sector cells. The propagation model of Section II-B1 applies. The propagation exponent used is $\gamma = 3.5$ and the standard deviation of the lognormal fading is $\sigma_L = 8$. However, the lognormal fading between a mobile and two base stations is correlated with correlation coefficient 0.5, i.e., $E\{10\log(L_{i_1,j})\,10\log(L_{i_2,j})\} = \sigma_L^2/2$. The angular spread $\sigma$ is independent of the distance from the base. The three values $\sigma = 0°$, $3°$, and $6°$ are considered. An analysis of measurements collected in an area of three- to five-story buildings with a high base-station installation (40 m) has indicated that this model yields reasonable performance predictions [22].

The handover is based on geometry and antenna pattern. This means that a mobile is connected to its geometrically closest base with a bias according to the antenna element patterns (a more precise description is given in item 1 of Appendix A). In practice this handover strategy corresponds to a case where the signal strength measurements are passed through a low-pass filter with a very large time constant before being used for handover decisions. The simulations are static in that the positions of the users and the lognormal fading are assumed fixed during the simulation. The SCFR system uses four power groups with twelve mobiles each (see Section IV-A1). Group 1 uses the first and second time slots, group 2 the third and fourth, and so on. The RCS system with nulling uses a $P_{\min}$ which is half of the mean power of the six desired users in a TDMA time slot (see Section IV-B3). Two values of $e$ are considered, $e = 1$ and $e = 0.3$ (see Section IV-B1). When $e = 1$ random channel allocation is employed, whereas when $e = 0.3$ the channel allocation of Section IV-B2 is used with eight power groups consisting of six mobiles each. Group number one is allocated to the first time slot in the TDMA frame, group number two to the second, and so on. More details of the simulation procedure are given in Appendix A. Simulations for the described systems are made 40 times according to the simulation method described in Appendix A.

Histograms of the power transmitted from the mobiles using the power control methods described in Sections IV-A1 and IV-B1 are shown in Fig. 5. The upper, middle, and lower subplots consider the SCFR system, the RCS system with $e = 1$, and the RCS system with $e = 0.3$, respectively. The powers are in all three cases normalized such that the mean power transmitted from a mobile is 0 dB. In Fig. 6 the estimated outage probability, defined as the probability that the instantaneous signal-to-interference ratio is less than 3 dB more than 20% of the time (motivated in Appendix B), is plotted as a function of the multipath angular spread $\sigma$, see Section II-B1. In Appendix C approximate analytical expressions of the outage probability are derived for the SCFR system, the RCS system with nulls (only the case $e = 1$), and the RCS system without nulls ($e$ independent). Results using these expressions are also plotted in Fig. 6.

Fig. 5. Distribution of the power control settings (horizontal axis: power transmitted from the mobile in dB); upper: SCFR, middle: RCS with $e = 1$, lower: RCS with $e = 0.3$.

Fig. 6. Outage probability as a function of the angular spread $\sigma$ (in degrees) with geometry-based handover. c1: RCS-WON, 1/1, analytical. c2: RCS-WON, 1/1, $e = 1$, simulation. c3: RCS-WON, 1/1, $e = 0.3$, simulation. c4: RCS-WIN, 1/1, $e = 0.3$, simulation. c5: SCFR, 1/3, $d = 3$, simulation. c6: RCS-WIN, 1/1, $e = 1$, simulation. c7: RCS-WIN, 1/1, $e = 1$, analytical. c8: SCFR, 1/3, $d = 3$, analytical. WIN: with nulling; WON: without nulling.

Example 2: All the simulations and computations performed in Example 1 are repeated but assuming signal-strength based handover. This means that the mobiles are connected to the base station with the lowest path loss (except for some hysteresis), see item 1 of Appendix A. Figs. 7 and 8 are the counterparts of Figs. 5 and 6, respectively.
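The correlated lognormal fading assumed in Example 1 (standard deviation 8, correlation coefficient 0.5 between the links from one mobile to two base stations) can be generated in several ways. The paper does not state which construction it uses; the sketch below uses the common "shared plus individual Gaussian component" construction, which is an assumption made here for illustration. The function name and parameters are hypothetical.

```python
import math
import random
from typing import List, Optional


def correlated_shadowing_db(n_bases: int,
                            sigma_db: float = 8.0,
                            corr: float = 0.5,
                            rng: Optional[random.Random] = None) -> List[float]:
    """Draw shadow fading (in dB) from one mobile to n_bases base stations.

    Each value has standard deviation sigma_db, and any two values have
    covariance corr * sigma_db**2 (sigma_db**2 / 2 when corr = 0.5).
    """
    if rng is None:
        rng = random.Random()
    common = rng.gauss(0.0, 1.0)  # component shared by all links of this mobile
    return [
        sigma_db * (math.sqrt(corr) * common
                    + math.sqrt(1.0 - corr) * rng.gauss(0.0, 1.0))
        for _ in range(n_bases)
    ]
```

The shared component models obstructions near the mobile that affect every link, while the individual components model the differing paths to each base station.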

VI. CONCLUSIONS AND DISCUSSION

The following sections list conclusions drawn from the observations made above, and discuss critical assumptions.

A. Power Control

The results of Examples 1 and 2 show that the dynamic power control range in the mobiles must be larger than 50 dB in the $e = 1$ case, while ≈30 dB is sufficient in the $e = 0.3$ and SCFR cases. As a reference, the GSM standard supports a power control range of 30 dB [9].

Fig. 7. Distribution of the power control settings (horizontal axis: power transmitted from the mobile in dB); upper: SCFR, middle: RCS with $e = 1$, lower: RCS with $e = 0.3$.

Fig. 8. Outage probability as a function of $\sigma$ with signal-strength based handover. c1: SCFR, 1/3, $d = 3$, simulation. c2: RCS-WON, K = 1, S = 1, analytical. c3: RCS-WON, K = 1, S = 1, $e = 0.3$, simulation. c4: RCS-WIN, K = 1, S = 1, $e = 0.3$, simulation. c5: RCS-WON, K = 1, S = 1, $e = 1$, simulation. c6: SCFR, 1/3, $d = 3$, analytical. c7: RCS-WIN, 1/1, $e = 1$, analytical. c8: RCS-WIN, 1/1, $e = 1$, simulation. WIN: with nulling; WON: without nulling.

B. Dependence of Downlink Performance on Uplink Power Control

The results of Examples 1 and 2 show that the uplink power control is critical for the downlink performance of systems with downlink intercell nulling. In particular, the results show that the power control parameter $e = 1$ yields much better results than $e = 0.3$. Thus, the conjecture of Section IV-B1 appears correct; it stated that the base will be able to identify and null the mobiles with poor downlink quality if $e = 1$ is applied. Is this result general? If the identification threshold $P_{\min}$ is made sufficiently small (i.e., the base can identify very weak mobiles), then $e = 0.3$ will perform equally well. This means that the conclusion may not hold for every system. However, the result indicates the importance of an issue which is typically overlooked. It should also be noted that the beamforming used in the paper takes the desired signal strength at the identified interfering mobiles into account in the criterion function (in order to achieve this, information has to be transmitted between the bases in the $e = 0.3$ case but not in the $e = 1$ case, Section IV-B3). If this is not the case, the effect may be worse since deep nulls will point toward users who already have a good signal-to-interference ratio. This problem does not arise in systems with only two users and two base stations, and analysis and experiments under such conditions can therefore be misleading.

C. Downlink Outage Performance

The simulation results herein indicate that the RCS systems perform better than the SCFR system if signal-strength handover is applied, or if the uplink power control completely compensates for the path loss, i.e., $e = 1$ (see Section IV-B1), and nulling is applied. This would not have been so in the geometry handover case if more users and channels had been simulated [22], because this would have separated the same-frequency users in azimuth and thus made the SCFR system more robust against angular spreading. On the other hand, the simulation and analysis assume basically uniformly distributed users, which is favorable for the SCFR system. The simulation also assumes that all multipaths are confined to an area relatively close to the mobile. If this is not the case, a larger degradation is expected in the SCFR than in the RCS cases, since the SCFR system tries to separate mobiles in azimuth to avoid the influence of angular spreading.

APPENDIX A
SIMULATION PROCEDURE

The enumeration below describes the simulation procedure used in the paper.

1) The positions of the 11 x 3 x 48 users in cells 1-9 and 12-13 are generated as follows: The position of user $i$ is randomized with equal probability in the area

$$0 \le \left(\frac{\cos(30°)}{\cos(\theta_{i,i})}\right)^{2/\gamma} \frac{r_{i,i}}{R} \le 2, \qquad (61)$$

where $r_{i,i}$ and $\theta_{i,i}$ are the distance and angle to the desired base station, respectively [the factor $(\cos(30°)/\cos(\theta_{i,i}))^{2/\gamma}$ models the cell radius as a function of the angle $\theta$ when the antenna patterns are given by $p(\theta) = \cos(\theta)$]. The lognormal fading to each neighboring base station is randomized and the corresponding path gain is calculated. The position of the user and the lognormal fading are regenerated (randomized) if a "mistake" is detected. In the geometry-handover case a mistake has occurred if the average path gain defined by (39), using $L = 1$, is larger for some other base than base $i$. In the signal-strength handover case a failure has occurred if the strongest path gain, i.e., $\max_k G_{k,i}$, is more than 3 dB stronger than the desired-base path gain $G_{i,i}$. If there are $n$ base stations which are stronger than $0.5\,G_{i,i}$ (including the ith base), then a "mistake" is generated with probability $(n - 1)/n$. Thus a random number is drawn to determine if a "failure" has occurred or not. This procedure is used to simulate the case with a fast handover and a hysteresis of 3 dB.

2) The channel allocation algorithms are invoked (Sections IV-A1 and IV-B2). In the SCFR approach four power groups with 12 mobiles each are used. Group number one uses the first and second time slots, group number two the third and fourth, and so on. In the RCS approach with $e = 0.3$, eight power groups (one for each time slot in the TDMA frame) are used. With $e = 1$ random channel allocation is employed. All simulations assume that the TDMA slots of the base stations are synchronized, although this is critical only for the reduced cluster size approach with directed nulls. However, the TDMA frames are desynchronized in the sense that each base has a random offset of one through eight bursts.

3) Weighting vectors (Section III) are calculated for all users in sectors 1a-c, 2b-c, 3b, 4a, 5a, 6a, and 7c in the 1/1 reuse case and 1a, 4a, 5a, and 6a in the 1/3 reuse case. In the same-cell reuse approach only one weighting vector per user is necessary. This applies also to the reduced cluster size approach if nulling is not applied. With nulling, however, multiple weighting vectors per user must be calculated. This is because frequency hopping is applied and the identified interfered users thus change between time slots. In order to calculate the weighting vectors it is therefore necessary to determine which users are identified by the base. This requires, in turn, that the power control settings are calculated. Thus the power control settings at the mobiles are first calculated. Then it is determined which interfered users are stronger than $P_{\min}$ (and thus identified, see Section IV-B3). For subcell 1a only users inside subcells 1b, 1c, 2a-c, 3a-c, and 7a-c are candidates. Once the identified users have been determined the weighting vectors for all possible cases are calculated.

4) For each of the 48 users in subcell 1a it is investigated whether they are experiencing acceptable speech quality or not. Based on the reasoning in Appendix B, we assume that this is obtained if the instantaneous signal-to-interference ratio exceeds 3 dB in at least 80% of the time slots. The fraction is calculated as follows: The mean desired power averaged over fading, $G_{i,i}$, for the considered user is calculated using (39). A random frequency hopping pattern is simulated by randomizing the cochannel users in cells 1-7 with neighboring cells 10 000 times. For each of the 10 000 hops the cochannel users are drawn with equal probability among the mobiles allocated in the time slot. The mean interference (averaged over fading) at user $i$ is calculated for each hop using the formula

$$I_i = \sum_{k} \eta_k\, G_{k,i}\, w_k^{*} R_{vv}(\theta_{k,i}, \sigma_{k,i})\, w_k, \qquad (62)$$

where $w_k$ is the weighting vector of the kth user and $G_{k,i}$, $\theta_{k,i}$, and $\sigma_{k,i}$ are the propagation parameters between the ith user and the kth user's desired base (which can be the same base in the SCFR case). The sectors selected in the sum of (62) are 1a-c, 2b-c, 3b, 4a, 5a, 6a in the 1/1 RCS case, and 1a, 4a, 5a, 6a in the 1/3 SCFR reuse case. Note that with SCFR there are $d$ cochannel users per sector (Section IV-A). To simulate discontinuous transmission the factor $\eta_k$ is randomized independently for each hop ($\Pr\{\eta_k = 1\} = 1 - \Pr\{\eta_k = 0\} = \eta_{DTX} = 0.5$). The same-cell cochannel users, which are assumed to be active all the time, constitute an exception. Note that the users which use the same frequency within the cell are the same in each time slot (Section IV-A1). When the mean desired and interfering signals have been calculated, the probability for the instantaneous signal-to-interference ratio to exceed 3 dB is calculated using (63). This probability is averaged over the hops to produce the sought fraction. Finally, the number of users with acceptable speech quality is counted and the outage probability is estimated as the fraction of users in subcell 1a with unacceptable quality.
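Step 4 of the procedure can be sketched in a few lines. The sketch is a simplification under stated assumptions: the per-user interference terms $G_{k,i} w_k^{*} R_{vv} w_k$ are taken as precomputed numbers, one list of candidates per interfering sector, and the always-active same-cell users of the SCFR case are not modeled. The function names and the data layout are hypothetical. The two probability helpers implement (63) and its approximation (64) from Appendix B.

```python
import random
from typing import List, Optional, Sequence


def prob_sir_below(sir_mean: float, sir_t: float) -> float:
    """Formula (63): Pr{instantaneous SIR <= sir_t}, Rayleigh fading, one interferer."""
    return 1.0 / (1.0 + sir_mean / sir_t)


def prob_sir_below_linearized(sir_mean: float, sir_t: float, sir_0: float) -> float:
    """Formula (64): approximation of (63) that agrees with it at sir_mean = sir_0."""
    return sir_0 / ((1.0 + sir_0 / sir_t) * sir_mean)


def fraction_above_threshold(desired_power: float,
                             candidates_per_sector: Sequence[Sequence[float]],
                             sir_t: float = 10 ** 0.3,
                             eta_dtx: float = 0.5,
                             hops: int = 10000,
                             rng: Optional[random.Random] = None) -> float:
    """Average over random hops of Pr{instantaneous SIR > sir_t}.

    candidates_per_sector[s] holds the mean interference power that each
    candidate cochannel user of sector s would cause at the considered user.
    """
    if rng is None:
        rng = random.Random()
    total = 0.0
    for _ in range(hops):
        interference = 0.0
        for candidates in candidates_per_sector:
            if not candidates:
                continue
            power = rng.choice(list(candidates))  # cochannel user drawn uniformly
            if rng.random() < eta_dtx:            # discontinuous transmission
                interference += power
        if interference == 0.0:
            total += 1.0
        else:
            total += 1.0 - prob_sir_below(desired_power / interference, sir_t)
    return total / hops
```

A user would then be counted as having acceptable quality when the returned fraction is at least 0.8, following the 80% criterion stated in step 4.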

APPENDIX B
INSTANTANEOUS OUTAGE PROBABILITY

Previous results have shown that 9 dB average signal-to-interference ratio is sufficient to provide reasonable speech quality in GSM (neglecting noise) [15]. We assume that the relevant property for the receiver is the probability that the instantaneous signal-to-interference ratio is less than 3 dB. Assuming flat Rayleigh fading and one interferer, this fraction can be computed using the formula (see [14])

$$\Pr\{SIR_{instantaneous} \le SIR_t\} = \frac{1}{1 + SIR/SIR_t} \qquad (63)$$

with $SIR_t = 10^{0.3}$ and $SIR = 10^{0.9}$. This yields $\Pr\{SIR_{instantaneous} \le SIR_t\} = 0.2$. The formula (63) applies in the case of a single interferer only. However, with multiple interferers, the interference is usually dominated by the strongest interferer and we use (63) as an approximation in these cases. In Appendix C, analytical approximations of the outage probability are derived. In these derivations the following approximation of (63) is used:

$$\Pr\{SIR_{instantaneous} \le SIR_t\} = \frac{SIR_0}{(1 + SIR_0/SIR_t)\,SIR} \qquad (64)$$

which is a "linearization" of (63) around $SIR = SIR_0$. The natural choice of $SIR_0$ under the assumptions here is $SIR_0 = 10^{0.9}$.

APPENDIX C
ANALYTICAL RESULTS

In this section, we derive analytical approximations of the outage probability for the SCFR system, the RCS system with intercell nulling (only the $e = 1$ case), and the RCS systems without nulling. The analysis uses the same assumptions as the simulations with a few exceptions. Among those are the antenna spacing, the number of users in the system, and the spatial distribution of the users. The downlink antenna spacing is slightly increased to $\Delta = \lambda/\sqrt{3}$, and the number of users is assumed large (infinite). The spatial probability density of the user positions (seen from the desired base) is assumed to be given by

$$f(r_{i,i}, \theta_{i,i}) = \begin{cases} \mathrm{const.} \times r_{i,i} \cos^{(1-4/\gamma)}(\theta_{i,i}), & \text{if } \left(\cos(30°)/\cos(\theta_{i,i})\right)^{2/\gamma} (r_{i,i}/R) \le r_0,\ |\theta_{i,i}| \le 60° \\ 0, & \text{elsewhere} \end{cases} \qquad (65)$$

where the choice of $r_0$ is defined by the handover algorithm assumed. The reason for the choice of the distribution (65) is that it enables the derivation of an analytical solution for the outage probability while at the same time being very close to uniform at $\gamma = 3.5$ to $4.0$. In Appendixes C-A to C-C below, approximate expressions for the outage probability (probability of unacceptable speech quality) conditioned on the user position are obtained for the three cases. In order to obtain the unconditioned outage probability, the subcells are divided into "elements" $\Omega(i_1, i_2)$, defined by

$$\Omega(i_1, i_2) = \left\{ (r_{i,i}, \theta_{i,i}) : 0.05\,i_1 \le \left(\frac{\cos(30°)}{\cos(\theta_{i,i})}\right)^{2/\gamma} \frac{r_{i,i}}{R} \le 0.05 + 0.05\,i_1 \ \text{and}\ 5 i_2 - 60 < \theta_{i,i} \le 5 i_2 - 55 \right\}. \qquad (66)$$

This partitioning is illustrated in Fig. 9 using $i_1 = 0, \ldots, 17$, $i_2 = 0, \ldots, 23$, i.e., $r_0 = 0.9$. The outage probability is calculated for a central point in each element. Finally, the unconditioned outage probability is obtained as the sum of the central point outage probabilities, weighted by the fraction of users in the element. These fractions can be calculated analytically [22]. The intention is that the elements should be small enough that the outage probability is approximately constant within an element. It is easily shown that the desired signal strength (disregarding lognormal fading) along the borders of the "annular elements" [where annular element $i_1$ is defined as $\bigcup_{i_2} \Omega(i_1, i_2)$] is constant when the element patterns are given by $p(\theta) = \cos(\theta)$. If the user distributions of all subcells in the system are added, only small spots are left "empty" if $r_0 = 0.9$ is used. Thus $r_0 = 0.9$ will be used when "geometry based handover" is assumed. Previous results [9] have shown that the gain of signal strength handover (described in item 1 of Appendix A) over geometry based handover is about 4 dB. We model this effect by choosing $r_0 = 0.7$, thereby moving the mobiles closer to the base (by a distance corresponding to 4 dB). In Appendixes C-A, C-B, and C-C below, RCS with nulling, RCS without nulling, and SCFR are treated, respectively. The RCS version with nulling is only considered in the case $e = 1$ ($e$ is defined in Section IV-B1).

Fig. 9. The division of the subcell.

A. Reduced Cluster Size (RCS) with Nulling and $e = 1$

We assume that all bases erroneously estimate their desired mobile to have zero angular spread, i.e., $\sigma_{k,k} = 0°$ for all $k$ (see footnote 5). This yields (67) for $R_{k,k}$ in (47) and (48), where $a(\theta)$ is given by

$$a(\theta) = \left[1,\ \exp\!\left(-j \tfrac{2\pi}{\sqrt{3}} \sin(\theta)\right),\ \ldots,\ \exp\!\left(-j \tfrac{2\pi}{\sqrt{3}} (m-1) \sin(\theta)\right)\right]^{T}. \qquad (68)$$

With $R_{k,k}$ given by (67) it can be shown that the transmit vector at the kth base is given by (69). Assuming that only the kth and ith user are identified by the kth base, the matrix $M$ is given by

$$M = B\!\left(\tfrac{2\pi}{\sqrt{3}} \sin(\theta_{k,i})\right) D_{k,i}\, B^{*}\!\left(\tfrac{2\pi}{\sqrt{3}} \sin(\theta_{k,i})\right) \qquad (70)$$

where $D_{k,i}$ and $B(x)$ are defined by

$$D_{k,i} = G_{i,i}^{-1} G_{k,i}\, R_{vv}(0, \sigma_{k,i} \cos(\theta_{k,i})) + (r - 1) P_{\min} I \qquad (71)$$

and

$$B(x) = \mathrm{diag}\left(1, \exp(-jx), \ldots, \exp(-j(m-1)x)\right) \qquad (72)$$

respectively. Using the equations above we obtain that the interfering power (averaged over fading) at the ith mobile (from the kth base) is given by

$$G_{k,i}\, w_k^{*} R_{vv}(\theta_{k,i}, \sigma_{k,i})\, w_k = G_{k,i}\, \frac{a^{*}(0) B^{*}(\alpha_k) D_{k,i}^{-1} R_{vv}(0, \sigma_{k,i} \cos(\theta_{k,i})) D_{k,i}^{-1} B(\alpha_k) a(0)}{\left(a^{*}(0) B^{*}(\alpha_k) D_{k,i}^{-1} B(\alpha_k) a(0)\right)^{2}} \qquad (73)$$

where

$$\alpha_k = \left[\tfrac{2\pi}{\sqrt{3}} \sin(\theta_{k,k}) - \tfrac{2\pi}{\sqrt{3}} \sin(\theta_{k,i})\right] \text{ modulo } 2\pi. \qquad (74)$$

Footnote 5: This is a pessimistic assumption if the angular spread of the desired user is large, because in that case there exist possibilities to avoid transmitting toward the interfered mobiles by pointing the main beam toward the multipaths. However, the angular spreading considered herein is so small that this effect is negligible.
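The role of the diagonal matrix $B(x)$ is easier to see numerically: for the uniform linear array with spacing $\lambda/\sqrt{3}$, steering toward $\theta$ is the same as applying $B$ with argument $(2\pi/\sqrt{3})\sin(\theta)$ to the broadside response $a(0)$. The following sketch checks that identity and computes the relative phase $\alpha_k$. It uses only the standard library and represents $B(x)$ by its diagonal; the function names are illustrative and not from the paper.

```python
import cmath
import math
from typing import List

# 2*pi*spacing/wavelength for the analysis spacing lambda/sqrt(3)
PHASE_SCALE = 2.0 * math.pi / math.sqrt(3.0)


def steering_vector(theta_rad: float, m: int) -> List[complex]:
    """Array response a(theta) of an m-element uniform linear array."""
    return [cmath.exp(-1j * PHASE_SCALE * n * math.sin(theta_rad)) for n in range(m)]


def rotation_diagonal(x: float, m: int) -> List[complex]:
    """Diagonal entries of B(x) = diag(1, exp(-jx), ..., exp(-j(m-1)x))."""
    return [cmath.exp(-1j * n * x) for n in range(m)]


def relative_phase(theta_kk_rad: float, theta_ki_rad: float) -> float:
    """alpha_k: electrical angle between desired and interfered user, modulo 2*pi."""
    diff = PHASE_SCALE * math.sin(theta_kk_rad) - PHASE_SCALE * math.sin(theta_ki_rad)
    return diff % (2.0 * math.pi)


if __name__ == "__main__":
    m, theta = 10, math.radians(20.0)
    a_theta = steering_vector(theta, m)
    b_diag = rotation_diagonal(PHASE_SCALE * math.sin(theta), m)
    a_zero = steering_vector(0.0, m)
    err = max(abs(b * a0 - at) for b, a0, at in zip(b_diag, a_zero, a_theta))
    print("max deviation between B(.)a(0) and a(theta):", err)
```

This is why only the difference $\alpha_k$ of the two electrical angles appears in the interference expression: the common rotation cancels between the beamformer and the interfered user's covariance.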

The impact of frequency hopping and discontinuous transmission is modeled by averaging (74) over the distribution of $\alpha_k$ (which is shown to be uniform on $[0, 2\pi]$ in [22]) and assuming that the mobiles are active with probability $\eta_{DTX}$. Using the results of Appendix B, and assuming that all base stations have identified the ith mobile but no other mobile, the probability


$M$, $D_{k,i}$, and $D(x, y)$ all be equal to the identity matrix, i.e., $M = D_{k,i} = D(x, y) = I$.

that the instantaneous SIR is not exceeding SIR o is obtained as 1 - Pr{ Outage}

== Pr

{

SIRo SIR o

1+--

'TJDTX

c:

"

j

C. Same Cell Frequency Reuse (SCFR)

.

The derivation of the analytical approximation of the outage probability (conditioned on the user position) for the SCFR system is very similar to the derivation of [21, Theorem 1]. The details of the derivation can be found in [22].

t

SIR t

(75)

~t

FOUR

::::: Pr { 10 log(G k , i) - 10 log(G i , , )

< 10 10 (

g 9

(-1

TJDTX

ApPENDIX

TO

R DL

D

TRANSLATIONS METHODS

1) If the up- and downlink manifolds are the same, i.e.,

(1 + SIRo/SIR t ) SIR t o

(81)

'Vi}

fJiCOS(Oi))),

R UL

(76)

it follows from (16) and (24) that the up- and downlink multipath covariance matrices are the same (except for the power scaling), i.e., $R^{UL} = P^{UL} R^{DL}$, and the translation problem is thus eliminated. This requires two different antenna arrays for up- and downlink. The two arrays should have the same structure but be scaled to their respective wavelengths. This idea was first proposed in [16], and is referred to as "the matched array approach" in that paper. 2) If the same array is used in up- and downlink, i.e.,

where $g(z, y)$ is defined as shown in (77)-(79). This approximation is possible as the interference usually is dominated by one base station. This holds more strongly in systems with antenna arrays than otherwise. Assuming that the lognormal fading (between a mobile and several base stations) is correlated with correlation coefficient $c$, the outage probability at the ith user conditioned on its position is obtained as given in (80). When (80) is used in Examples 1 and 2, only neighboring cells are taken into account in the product. Furthermore, only the sector of each cell directed most closely toward the mobile is considered, since overly pessimistic results would otherwise be obtained. This is because (80) does not assume fully correlated lognormal fading between a mobile and the three sectors of a base station site. Also, with $\Delta = \lambda/2$ there are fewer side lobes outside the ±60° region than with $\Delta = \lambda/\sqrt{3}$ [which (80) assumes]. The parameters $SIR_0$ and $SIR_t$ are set to $SIR_0 = 9$ dB and $SIR_t = 3$ dB, respectively.

(82)

and (83)

and the relative duplex separation $(f^{UL} - f^{DL})/(f^{UL} + f^{DL})$ is small, then there may exist a compensation matrix $A_{compensate}$ such that

B. Reduced Cluster Size (RCS) Without Nulling

see [22]. If (84) is valid, $R^{DL}$ may be approximated as

By again assuming that all bases estimate the spreading of their desired mobile to be zero, i.e., $\sigma_{k,k} = 0°$ for all $k$ (see footnote 6), the results of the previous section can be used by letting the matrices

pULRDL-"A -..

Footnote 6: Simulations we have made have shown that the loss from neglecting the angular spreading in the RCS case without nulling is typically less than 1 dB, assuming linear arrays with eight to ten elements.

g(z, y) == max{f(x, y) z

~

compensate

RDLA*compensate'

3) If the spatial distribution of power is well approximated by a finite number of rays (which is less than the number

z}

(77)

f(x, y) =x (21f a*(O)B*(a)i>-l(x, Y)~v(O, y)f>-l(x, y)B(a)a(O) da }ii=O (a*(0)B*(a)D-1(x, y)B(a)a(O))2

(78)

D(x, y) == xRvv(O, y) + (r - l)Pmin I Pr{Outagelr"i,Oi,d=lx

~ roo

v 21r } x=O

II Q (m k'

exp ( i -

(85)

2-

(79) x2

20 dB (1 - c)

10 log

)

(g( 7JD~X (SIR ol + SIR;-l )t, O"k, i a dB v!1="C

k#i


cos( 8k ,i))) - mi, i-X)

dx

(80)

of antenna elements), i.e.,

$$\hat{\mathbf{R}}^{UL} \approx \sum_{n=1}^{N} P^{UL} |h_n|^2 \, \mathbf{a}^{UL}(\theta_n) \left(\mathbf{a}^{UL}(\theta_n)\right)^{*}; \qquad N < m \tag{86}$$

then the powers $|h_n|^2$ and directions $\theta_n$ of these rays can be estimated from $\hat{\mathbf{R}}^{UL}$ using a conventional direction finding technique, e.g., [2], [11], [12], [17], and [18]. These estimates may then be used to calculate $P^{UL}\mathbf{R}^{DL}$ using

$$P^{UL}\mathbf{R}^{DL} \approx P^{UL} \sum_{n=1}^{N} |h_n|^2 \, \mathbf{a}^{DL}(\theta_n) \left(\mathbf{a}^{DL}(\theta_n)\right)^{*}. \tag{87}$$

4) If a uniform linear array is used in the uplink, i.e., $\mathbf{a}^{UL}(\theta)$ is given by (1), and the model described in Section II-B1 applies, then the method of [19] may be employed to estimate the signal power, as well as $\theta$ and $\sigma$. With these estimates at hand, the transmit matrix $P^{UL}\mathbf{R}^{DL}$ may be explicitly calculated.

REFERENCES

[1] S. Andersson, U. Forssén, and J. Karlsson, "Ericsson/Mannesmann GSM field-trials with adaptive antennas," in Proc. IEEE Veh. Technol. Conf., Phoenix, AZ, May 1997, pp. 1587-1591.
[2] Y. Bresler and A. Macovski, "Exact maximum likelihood parameter estimation of superimposed exponential signals in noise," IEEE Trans. Acoust., Speech, Signal Processing, vol. 34, pp. 1081-1089, Oct. 1986.
[3] C. Carneheim, S. O. Jonsson, M. Ljungberg, M. Madfors, and J. Naslund, "FH-GSM frequency hopping GSM," in Proc. IEEE Veh. Technol. Conf., Stockholm, Sweden, June 1994, pp. 1155-1159.
[4] C. Farsakh and J. A. Nossek, "Channel allocation and downlink beamforming in an SDMA mobile radio system," in IEEE Int. Symp. Personal, Indoor and Mobile Radio Communications, Sept. 1995, pp. 687-691.
[5] D. Gerlach and A. Paulraj, "Adaptive transmitting antenna arrays with feedback," IEEE Signal Processing Lett., vol. 1, pp. 150-152, Oct. 1994.
[6] __, "Adaptive transmitting antenna arrays with feedback," IEEE Trans. Veh. Technol., 1995, submitted.
[7] __, "Base station transmitting antenna arrays for multipath environments," Signal Processing, vol. 54, no. 1, pp. 59-73, 1996.
[8] G. H. Golub and C. F. Van Loan, Matrix Computations. Baltimore, MD: The Johns Hopkins University Press, 1983.
[9] M. Mouly and M. B. Pautet, "The GSM system for mobile communications," Michel Mouly and Marie-Bernadette Pautet, 49 rue Louise Bruneau, F-91120 Palaiseau, France, 1992. ISBN 2-9507190-0-7.
[10] T. Ohgane, "Spectral efficiency evaluation of adaptive base station for land mobile cellular systems," in Proc. IEEE Veh. Technol. Conf., 1994, pp. 1470-1474.
[11] S. J. Orfanidis, Optimum Signal Processing, An Introduction. Singapore: McGraw-Hill, 1990.
[12] B. Ottersten, M. Viberg, and T. Kailath, "Analysis of subspace fitting and ML techniques for parameter estimation from sensor array data," IEEE Trans. Signal Processing, vol. 40, no. 3, pp. 590-600, Mar. 1992.
[13] B. Parlett, The Symmetric Eigenvalue Problem. Englewood Cliffs, NJ: Prentice-Hall, 1980.
[14] R. Prasad and A. Kegel, "Improved assessment of interference limits in cellular radio performance," IEEE Trans. Veh. Technol., vol. 40, pp. 412-419, May 1991.
[15] K. Raith and J. Uddenfeldt, "Capacity of digital cellular TDMA systems," IEEE Trans. Veh. Technol., vol. 40, pp. 323-332, May 1991.
[16] G. Raleigh, S. N. Diggavi, V. K. Jones, and A. Paulraj, "A blind adaptive transmit antenna algorithm for wireless communication," in Proc. IEEE Int. Conf. Communications, 1995.
[17] R. Roy, A. Paulraj, and T. Kailath, "ESPRIT-A subspace rotation approach to estimation of parameters of cisoids in noise," IEEE Trans. Acoust., Speech, Signal Processing, no. 34, p. 1340, 1986.
[18] R. O. Schmidt, "Multiple emitter location and signal parameter estimation," in RADC Spectral Estimation Workshop, Griffiths AFB, NY, 1979, pp. 243-258; reprinted in IEEE Trans. Antennas Propagat., vol. AP-34, pp. 281-290, Mar. 1986.
[19] T. Trump and B. Ottersten, "Maximum likelihood estimation of nominal direction of arrival and angular spread using an array of sensors," Signal Processing, vol. 50, nos. 1/2, pp. 57-69, Apr. 1996.
[20] J. H. Winters, "Optimum combining in digital mobile radio with cochannel interference," IEEE Trans. Veh. Technol., vol. 33, pp. 144-155, Aug. 1984.
[21] P. Zetterberg and B. Ottersten, "The spectrum efficiency of a basestation antenna array system for spatially selective transmission," IEEE Trans. Veh. Technol., vol. 44, pp. 651-660, Aug. 1995.
[22] P. Zetterberg, "Mobile cellular communications with base station antenna arrays: Spectrum efficiency, algorithms and propagation models," Ph.D. thesis, Royal Institute of Technology, Stockholm, Sweden, June 1997.


Chapter 4

Implementation Issues

Possibly the most challenging problem related to adaptive antennas is their practical implementation, from both a technical and a cost point of view. In real-world adaptive antenna systems there are a number of sources of random errors, ranging from antenna element misplacement and mutual coupling to amplitude and phase mismatches and quantization errors. This chapter includes work that deals with these issues from both a theoretical and a practical implementation angle. It starts with a tutorial paper from Dudgeon that describes mathematically and intuitively the fundamentals of digital array processing. Then an adaptive algorithm and its efficient pipelined architecture in the form of a triangular systolic array, particularly applicable to VLSI design, are described. A multiple input, multiple output orthogonalization algorithm, its systolic implementation, and its comparison with the well-known Gram-Schmidt orthogonalization procedure are discussed in the paper by Yuen et al. DuFort considers the design of optimum beamforming networks, and Er and Cantoni et al. present a unified approach to designing robust array processors. The article by Hansen discusses design trade-offs and a procedure for selecting design parameters for Rotman lenses.

Neural beamforming has been suggested as a means to increase the performance of an adaptive antenna (it has been shown that neural networks can control arrays in an accurate manner even with element and network errors) and reduce manufacturing and maintenance costs. The paper by Mailloux and Southall presents a comparison between a neural network and a Butler matrix performing the same direction finding task, and the paper by Southall et al. discusses a direction finding system implemented with a neural beamforming network and presents some test results.

Several papers in this chapter deal with mismatch problems with adaptive antennas. Among the issues discussed are nonlinearity effects in digital manifold phased arrays (Mathews), array imperfections and methods to cope with the reduction of the nulling capabilities (Jablon), mutual coupling compensation (Steyskal and Herd), forward-backward averaging methods for array manifold errors (Zatman), and the use of orthogonal codes for remote antenna calibration (Silverstein).

Fundamentals of Digital Array Processing

DAN E. DUDGEON, MEMBER, IEEE

Abstract-With the advent of high-speed digital electronics, it has become feasible to use digital computers and special purpose digital processors to perform the computational tasks associated with signal reception using an antenna or directional array. The purpose of this paper is mainly tutorial, to describe mathematically and intuitively the fundamental relationships necessary to understand digital array processing. It is hoped that those readers with a background in antenna theory or array processing will see some of the advantages digital processing can offer, while those with a background in digital signal processing will recognize the array processing area as a potential application for multidimensional signal processing theory.

I. INTRODUCTION

Much of the theoretical work being done today in the area of multidimensional signal processing is motivated by the need to process signals carried by propagating wave phenomena. For radar to be successful, it was necessary to develop directional transmitting and receiving antennas so that azimuth as well as range and range rate information could be ascertained from the radar return. Similarly, this problem is also encountered in active sonar and ultrasonics applications. In applications where the source signal is not precisely controlled (such as exploratory seismology) or where the received signal is externally generated (such as passive sonar, bioelectrical measurements, or earthquake seismology), it is desired to elicit characteristics of the received signal (its signature) as well as its direction and speed of propagation.

In recent times, it has become more and more feasible to perform the signal processing operations associated with array processing using digital computers or special purpose digital processing hardware. Correspondingly, digital signal processing theory has grown to encompass these various applications. The following references are representative of recent articles on digital processing in the fields of radar [1], seismology [2], sonar [3], ultrasound [4], and bioelectrical measurement [5].

The point of this paper is to examine the fundamental array processing techniques, in particular the concept of beamforming to determine the speed and direction of propagation of an incoming wave, from the point of view of a multidimensional signal processing problem. We shall see the close relationship between conventional sampled-data systems and the sensor array as a receiver sampling the waveform in space. Accordingly, Section II reviews some essential points about sampled-data systems and digital signal processing techniques. In Section III, a linear array of sensors is used as a basis for discussing the weighted delay-and-sum beamformer with attention given to how to choose the appropriate weights. In Section IV, the relationship between the computation of beam spectra and the computation of a two-dimensional (2-D) discrete Fourier transform is examined. Section V looks at extending the results of Section III to higher dimensions. As an example of results from digital signal processing which can be applied to digital beamforming, the problem of designing the sensor weights for a multidimensional beamformer is discussed in Section VI. In the case of a Cartesian array of sensors, an ingenious mapping due to McClellan [6] can be used to design and implement beams with nearly spherically symmetric main lobes in a computationally efficient manner.

Manuscript received July 26, 1976; revised November 12, 1976. The author is with the Computer Systems Division, Bolt Beranek and Newman, Inc., 50 Moulton Street, Cambridge, MA 02138.

II. IMPORTANT CONCEPTS IN DIGITAL PROCESSING

In this section, several important concepts from digital signal processing theory will be reviewed. These concepts will be presented in terms of a one-dimensional (1-D) signal for ease of understanding, but they are easily generalized to multidimensional signals. The reader is directed to [7] as a text on digital signal processing and to [8] as a review of 2-D filtering concepts.

The fundamental assumption of digital processing is that input signals are bandlimited to frequencies below one-half the sampling rate. If a continuous-time signal is sampled at a rate too slow (undersampling) for the frequency content of the signal, the Nyquist sampling theorem tells us that frequencies above one-half the sampling rate in the continuous-time signal will act like frequencies below one-half the sampling rate. This phenomenon is known as aliasing, and it is explained in detail in [7] as well as in a variety of texts and papers on sampled-data systems. Although we have been speaking of a 1-D time signal, the same statements apply to signals which are a function of distance or other continuous independent variables.

Reprinted from Proceedings of the IEEE, Vol. 65, No.6, pp. 898-904, June 1977.


We can represent a 1-D digital signal by s(n), where n is an integer. By doing so we are effectively normalizing the sampling rate to be unity. The Fourier transform of such a digital signal is defined by

$$S(\omega) = \sum_{n} s(n)\, e^{-j\omega n}. \tag{1}$$

The reader will quickly recognize that the Fourier transform is continuous and periodic in the radian frequency variable $\omega$ with period $2\pi$. If only samples of $S(\omega)$ are desired, (1) can be evaluated at the points $\omega = 2\pi k/N$, where k is an integer variable taking on values from 0 to N - 1 and N is an integer constant:

$$S\!\left(\frac{2\pi k}{N}\right) = \sum_{n} s(n) \exp\!\left(-j\,\frac{2\pi n k}{N}\right). \tag{2}$$

If the signal s(n) is zero for n outside the range from 0 to N - 1, then the sum over n in (2) extends only from 0 to N - 1. In this case (2) defines a discrete Fourier transform (DFT) which is invertible; that is, the samples s(n) may be recovered from the values $S(2\pi k/N)$ by

$$s(n) = \frac{1}{N} \sum_{k=0}^{N-1} S\!\left(\frac{2\pi k}{N}\right) \exp\!\left(j\,\frac{2\pi n k}{N}\right) \qquad \text{for } n = 0, \ldots, N-1.$$

The DFT of a signal may be computed by an efficient algorithm (an FFT), the details of which are contained in texts [7], [9], [10], and papers [11], [12]. The advantage of the FFT algorithm is that the computation of $S(2\pi k/N)$ is proportional to $N \log_2 N$ rather than $N^2$ as in a direct evaluation of the DFT [see (2)].

One type of digital filter which is important to the understanding of digital array processing is the finite impulse-response (FIR) filter. Again we shall briefly review the 1-D case, which is covered in detail in [7], [9], [13], and [14]. The name FIR refers to the fact that the impulse response of the filter is nonzero only over a finite domain of the independent variable. For example, if a filter has an impulse response h(n) such that

$$h(n) = 0 \qquad \text{for } n < 0 \text{ and } n \ge N$$

where n and N are integer-valued, then the filter is said to be an FIR filter. A special class of FIR filters are those which are said to be "linear phase." The impulse response of such a filter (assumed to be purely real) possesses even (or odd) symmetry about the midpoint of its nonzero region. An example of such an impulse response is

$$h(n) = 0 \quad \text{for } n < 0 \text{ and } n \ge N, \qquad h(n) = h(N - n - 1). \tag{3}$$

Because of this symmetry, the phase response of such a filter is exactly linear and produces no phase distortion. A further specialization can be made by requiring h(n) to be even about h(0). In that case, the frequency response of the filter is purely real.

In addition to the perfect phase characteristics of linear phase FIR filters, they have the additional advantage of being easily and quickly designed by a computer program [15]. This program approximates a given ideal frequency response optimally in a weighted mini-max sense; that is, the weighted maximum deviation from the ideal is minimized.

Fig. 1. A plane wave impinges upon a linear array of N sensors at an angle $\alpha$. (Figure labels: direction of propagation; N sensors.)
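The DFT pair of Section II can be checked numerically. The following is a minimal sketch in plain Python that evaluates (2) directly (the order-$N^2$ computation, not an FFT) and then applies the inverse formula; the function names `dft` and `inverse_dft` and the test signal are illustrative choices, not taken from the paper.

```python
import cmath

def dft(s):
    """Evaluate (2): S(2*pi*k/N) = sum_n s(n) * exp(-j*2*pi*n*k/N) for k = 0..N-1."""
    N = len(s)
    return [sum(s[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]

def inverse_dft(S):
    """Recover s(n) = (1/N) * sum_k S(2*pi*k/N) * exp(+j*2*pi*n*k/N)."""
    N = len(S)
    return [sum(S[k] * cmath.exp(2j * cmath.pi * n * k / N) for k in range(N)) / N
            for n in range(N)]

# Hypothetical test signal, nonzero only for n = 0..7.
signal = [1.0, 2.0, 0.0, -1.0, 0.5, 3.0, -2.0, 0.25]
spectrum = dft(signal)
recovered = inverse_dft(spectrum)

# The round trip should reproduce the samples up to floating-point error.
max_error = max(abs(r - x) for r, x in zip(recovered, signal))
print(f"largest round-trip error: {max_error:.2e}")
```

Each of the N output samples costs N multiply-adds here, which is the $N^2$ behavior that the FFT reduces to roughly $N \log_2 N$.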

III. BEAMFORMING

The reasons for studying the formation of beams from an array of sensors are straightforward. In several of the applications mentioned in Section I, particularly passive sonar, the objective is to use the signals received by the sensors in a phased manner so as to preferentially detect signals coming from a particular direction (i.e., signals coming in on a particular beam). In addition, by averaging over many sensors, the signal-to-noise ratio (SNR) is increased to aid in the measurement of other signal parameters [19].

An appropriate analogy is that beamforming is related to multidimensional spectral analysis in the same way that bandpass filtering and 1-D spectral analysis are related. Both beamforming and spectral analysis can be used to segregate received energy by frequency, direction, and speed of propagation. (An excellent discussion of the latter approach is contained in [30].) Our discussion will concentrate on the beamforming approach, since it is the author's opinion that that formulation more accurately reflects the type of processing done in a real-time digital array processing system.

In many physical situations, the signal one is interested in receiving and analyzing can be modeled as a propagating plane wave. In such a signal model, the value along a line (or plane) perpendicular to the direction of propagation is constant. If we assume that the plane wave is propagating with a speed c and in a direction at an angle $\alpha$ to the y-axis (Fig. 1), then the signal value r at a particular place (x, y) and time t may be written

$$r(x, y, t) = s\!\left(t - \frac{x \sin\alpha + y \cos\alpha}{c}\right).$$

Note that this is really a function of one independent variable. Consequently, the function $s(\cdot)$ along with the direction and speed of propagation completely specifies the model signal. In order to focus on the structure of the beamforming computation, we shall assume that the signal $s(\cdot)$ is deterministic, not stochastic, and that the SNR is high enough that we may ignore the contribution of the noise. Later we shall indicate the way in which knowledge of the signal and noise statistics can be used in beamformer design.

A simple way to try to measure $s(\cdot)$ and the direction of propagation is by using a delay-and-sum beamformer [19]. In Fig. 1, a plane wave impinges upon an array of N sensors uniformly separated by a distance D. If we let $\alpha$ denote the angle of incidence of the wave, then we would expect the signal received at the (i + 1)st sensor to be a delayed version of the signal received at the ith sensor (in the absence of noise and other waves). The amount of delay is $D \sin\alpha / c$. If we want to look for a signal coming from an angle $\alpha$, we can form the sum

$$g(a, t) = \frac{1}{N} \sum_{i=0}^{N-1} r_i\!\left(t - \frac{iDa}{c}\right)$$

where $a = \sin\alpha$ and $r_i$ is the received signal from the ith sensor. Suppose, however, that an incoming wave has a different angle of incidence $\alpha_0 \ne \alpha$ and speed $c_0 \ne c$. Then, with $a_0 = \sin\alpha_0$,

$$r_i(t) = s\!\left(t + \frac{iDa_0}{c_0}\right).$$

Substituting for $r_i(t)$,

$$g(a, t) = \frac{1}{N} \sum_{i=0}^{N-1} s\!\left(t - iD\left(\frac{a}{c} - \frac{a_0}{c_0}\right)\right).$$

Fig. 2. The array pattern for a delay-and-sum beamformer with N = 7 sensors.

If we let s(t) be represented by its continuous Fourier transform $S(\omega)$,

$$s(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} S(\omega)\, e^{j\omega t}\, d\omega$$

and let $k = \omega a / c$ (and correspondingly $k_0 = \omega a_0 / c_0$), then

$$g(a, t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} S(\omega)\, W(k - k_0)\, e^{j\omega t}\, d\omega \tag{4}$$

where

$$W(v) = \frac{1}{N} \sum_{i=0}^{N-1} e^{-jivD}.$$

We shall call W(v) the array pattern. Note that (4) has the form of an output signal being equal to the inverse Fourier transform of the product of the Fourier transform of the input signal and the frequency response of a filter. Thus for particular values of $a$, $a_0$, D, c, and $c_0$, the pattern W tells us how the frequencies in the input signal are weighted to form the output signal. For the case at hand, it is easily shown that

$$W(v) = \frac{\sin(NvD/2)}{N \sin(vD/2)} \exp\!\left[-j\,\frac{(N-1)vD}{2}\right].$$
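As a quick numerical check of the closed form above, the sketch below compares it with the direct sum defining W(v). It is a minimal illustration in plain Python; the function names, the unit spacing D = 1.0, and the test points are assumptions made for the example, while N = 7 matches Fig. 2.

```python
import cmath
import math

def array_pattern_sum(v, N, D):
    """Direct evaluation of W(v) = (1/N) * sum_{i=0}^{N-1} exp(-j*i*v*D)."""
    return sum(cmath.exp(-1j * i * v * D) for i in range(N)) / N

def array_pattern_closed(v, N, D):
    """Closed form: sin(N*v*D/2) / (N*sin(v*D/2)) * exp(-j*(N-1)*v*D/2)."""
    half = v * D / 2.0
    if abs(math.sin(half)) < 1e-12:
        # At v = 0 and at multiples of 2*pi/D every term of the sum equals 1.
        return complex(1.0, 0.0)
    return math.sin(N * half) / (N * math.sin(half)) * cmath.exp(-1j * (N - 1) * half)

N, D = 7, 1.0  # N = 7 as in Fig. 2; D = 1.0 is an arbitrary unit spacing
test_points = [0.0, 0.3, 1.1, 2.5, -0.7]
worst = max(abs(array_pattern_sum(v, N, D) - array_pattern_closed(v, N, D))
            for v in test_points)

# The first null of the uniform pattern falls at v = 2*pi/(N*D).
first_null = 2.0 * math.pi / (N * D)
print(f"largest mismatch: {worst:.2e}")
print(f"|W| at first null: {abs(array_pattern_sum(first_null, N, D)):.2e}")
```

The special case in `array_pattern_closed` covers the points where the ratio of sines becomes 0/0; there the direct sum shows that the pattern equals one, which is also where grating lobes appear.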

The magnitude of W(v) is plotted in Fig. 2 for N = 7. Notice that it is periodic in v with a period of $2\pi/D$. Since

$$v = \omega\left(\frac{a}{c} - \frac{a_0}{c_0}\right) \tag{5}$$

the width of the central lobe of the array pattern decreases with increasing temporal frequency ($\omega$) and with increasing sensor spacing D. However, if $\omega$ or D become too large, the array pattern will exhibit other large lobes in addition to the main lobe because of the periodicity of W(v). Historically, these extraneous lobes have been called "grating" lobes because of the analogy with optical diffraction gratings.

We shall review the phenomenon of grating lobes from the point of view of spatial sampling of a propagating wave. First, we shall assume that the wave s(t) is of a single frequency $\omega_0$ (monochromatic). This is no real restriction since a more general waveform can be decomposed into an integral of weighted sinusoids. If such a wave is traveling at a speed $c_0$, then it will have a wavelength $\lambda_0 = 2\pi c_0 / \omega_0$. If the wave were incident at an angle $\alpha$ (Fig. 1), and we were to measure its value along the line connecting the sensors at one instant of time, we would observe a sinusoidal variation as a function of position along the line of sensors. The spatial period of this variation can be shown to be $\lambda_0 / \sin\alpha$. If the spacing D happened to equal $\lambda_0 / \sin\alpha$, the sensor measurements would all be identical and one might mistakenly conclude that the wave arrived perpendicular to the array rather than at the correct angle $\alpha$. This is precisely the problem of aliasing described in the previous section, except that here we are sampling a waveform that is a function of position (by choosing the sensor spacing D) rather than sampling a waveform that is a function of time (by choosing the sampling period T). Consequently, if it is expected that a signal will have a component with a wavelength as short as $\lambda_{\min}$, then the sensor spacing should be at most $\lambda_{\min}/2$ to avoid the effects of spatial aliasing (e.g., grating lobes).

In designing a beamformer, the objective (in the most straightforward case) is to have W(v) be as close to an impulse as possible using only a finite number of sensors. Traditionally, measures of closeness are the width of the central lobe (the smaller the better) and the height of the sidelobes (also the smaller the better). One way to decrease the central lobe width and the sidelobe height is to increase N, the number of sensors. Obviously, this expedient can be used only so much before economic constraints or a breakdown in the signal model come into play. Another way to alter the performance of the beamformer is to weight the sensor signals individually before summing them. If w(i) is the sensor weighting for the ith sensor, then the beamformer output can be represented by

$$g(a, t) = \frac{1}{N} \sum_{i=0}^{N-1} w(i)\, s\!\left(t - iD\left(\frac{a}{c} - \frac{a_0}{c_0}\right)\right).$$

Following the previous derivation we can again write

$$g(a, t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} S(\omega)\, W(k - k_0)\, e^{j\omega t}\, d\omega$$

where now

$$W(v) = \frac{1}{N} \sum_{i=0}^{N-1} w(i)\, e^{-jivD}$$

is a generalization of the previous definition of W which includes the sensor weights w(i). The problem of determining the sensor weights so that the array pattern has some desired characteristic is the same as designing a good data window for spectral estimation [16], or designing a prototype low-pass filter for use in a digital filter bank [29]. Either problem may be stated in terms of an FIR filter design problem and the FIR mini-max design technique [13], as well as others, brought to bear on it.
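The trade-off between central lobe width and sidelobe height can be illustrated with a short sketch. It evaluates the weighted array pattern for uniform weights and for a Hamming-type data window, which is used here only as a familiar example of a window and is not the mini-max design mentioned in the text. The helper names, the grid search for the first null, and the choice D = 1.0 are assumptions made for the illustration.

```python
import cmath
import math

def weighted_pattern(v, weights, D):
    """W(v) = (1/N) * sum_i w(i) * exp(-j*i*v*D) for a uniform line array."""
    N = len(weights)
    return sum(w * cmath.exp(-1j * i * v * D) for i, w in enumerate(weights)) / N

def first_null_and_peak_sidelobe(weights, D=1.0, steps=4000):
    """Scan 0 <= v <= pi/D; return (location of first null, peak sidelobe / main peak)."""
    grid = [k * math.pi / (D * steps) for k in range(steps + 1)]
    peak = abs(weighted_pattern(0.0, weights, D))
    mags = [abs(weighted_pattern(v, weights, D)) / peak for v in grid]
    for k in range(1, steps):
        # The first local minimum of |W| marks the edge of the central lobe.
        if mags[k] < mags[k - 1] and mags[k] <= mags[k + 1]:
            return grid[k], max(mags[k:])
    return grid[-1], 0.0

N = 7
uniform = [1.0] * N
hamming = [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (N - 1)) for i in range(N)]

null_u, side_u = first_null_and_peak_sidelobe(uniform)
null_h, side_h = first_null_and_peak_sidelobe(hamming)
print(f"uniform: first null at v = {null_u:.3f}, peak sidelobe = {side_u:.3f}")
print(f"hamming: first null at v = {null_h:.3f}, peak sidelobe = {side_h:.3f}")
```

For these seven sensors the tapered weights lower the sidelobes considerably, at the cost of a wider central lobe, which is the same compromise faced in data-window design.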


Array patterns can also be designed to take advantage of knowledge about the expected distribution of noise which an array is likely to see. The sensor weights w(i) can be adjusted to maximize the SNR. This is analogous to designing a Wiener filter given the spectral estimates of the signal spectrum and noise spectrum. Intuitively, the filter will have a frequency response which passes parts of the spectrum where the SNR is high and rejects the parts where the SNR is poor [17]. Furthermore, it is possible to adapt the sensor weights as the received signal varies, thus attempting to maintain a high SNR under nonstationary conditions.

IV. DIGITAL BEAMFORMING AND 2-D DFT's

Thus far we have treated time as a continuous independent variable and we have treated the spatial independent variable as being discrete, since measurements can only be obtained at the sensor positions. Now, however, we shall constrain the time variable to be discrete as well by insisting that sensor measurements be made at intervals of T seconds. In doing so, we must remain aware that there will be aliasing problems if received signals possess any frequency components above 1/(2T) Hz as seen by the sensor. We shall further assume that all of the sensors are sampled simultaneously. Thus we can denote the output of the N sensors as $r_i(nT)$, $i = 0, \ldots, N-1$. As before we can form beams by weighting, delaying, and summing the sensor measurements:

$$g(a, nT) = \frac{1}{N} \sum_{i=0}^{N-1} w(i)\, r_i(nT - i d_n T)$$

where $a = \sin\alpha$ and $d_n T = (D \sin\alpha)/c$. Because the time variable has been discretized, $d_n$ must be an integer. This puts constraints on the values of $\sin\alpha$ which are allowed, namely

$$\sin\alpha = \frac{d_n T c}{D}, \qquad d_n \text{ an integer}.$$

Consequently, only a finite number of beams may be formed. It should be mentioned here that it is possible to interpolate other beams between these beams. This is equivalent to interpolating the sensor outputs $r_i(nT)$ to a higher sampling rate [18] so that T is reduced and the inter-sensor delay $d_n T$ can have a higher number of possible values. Making T small, however, increases the number of samples to be processed and, correspondingly, the amount of computation per unit time. For any practical beamforming processor, a lower limit for T is dictated by computation speed and the amount of available data storage.

For notational convenience, in the following derivations we shall assume T = 1. This may be viewed as taking our unit of time measurement to be one sample period. Consequently, frequencies will be measured in rad/sample period rather than rad/s.

Quite often, one is more interested in the time evolution of the short-period spectrum of a beam signal than in the beam signal itself. This is equivalent to passing the beam signal through a bank of bandpass filters and examining their output. Early on it was recognized that the FFT could be used to make such computations efficiently [26], [27]. Recent work in the speech area has further demonstrated an efficient way of realizing a digital filter bank using the FFT [29]. We shall proceed to derive the formula for the time evolution of a beam spectrum, showing its relationship to the beam signal, the sensor spectra, and finally to the multidimensional spectrum of the sensor signals.

Recalling that T = 1, we see that the beam signal may be written as before

$$g\!\left(\frac{d_n c}{D}, n\right) = \frac{1}{N} \sum_{i=0}^{N-1} w(i)\, r_i(n - i d_n).$$

The short-period spectrum of such a beam signal may be written as

$$G\!\left(\frac{2\pi k}{M}, \frac{d_n c}{D}, n\right) = \sum_{m=n}^{M+n-1} v(m - n)\, g\!\left(\frac{d_n c}{D}, m\right) \exp\!\left[-j\,\frac{2\pi k m}{M}\right]$$

where $v(\cdot)$ is a spectral window as discussed in [16], [29]. Because of the limits of the summation, the FFT algorithm cannot be applied directly, but the formula is easily rewritten as

$$G\!\left(\frac{2\pi k}{M}, \frac{d_n c}{D}, n\right) = \exp\!\left[-j\,\frac{2\pi k n}{M}\right] \sum_{m=0}^{M-1} v(m)\, g\!\left(\frac{d_n c}{D}, m + n\right) \exp\!\left[-j\,\frac{2\pi k m}{M}\right].$$

As indicated in [29], the FFT may be used to calculate the above sum. The exponential term external to the summation may even be incorporated into the FFT by making use of the circular shift property [7]. Using similar reasoning we may write the short-period sensor spectrum for the ith sensor as

$$R_i\!\left(\frac{2\pi k}{M}, n\right) = \exp\!\left[-j\,\frac{2\pi k n}{M}\right] \sum_{m=0}^{M-1} v(m)\, r_i(m + n) \exp\!\left[-j\,\frac{2\pi k m}{M}\right].$$

Consequently, we may write the beam spectrum in terms of the sensor spectra as

$$G\!\left(\frac{2\pi k}{M}, \frac{d_n c}{D}, n\right) = \frac{1}{N} \sum_{i=0}^{N-1} w(i)\, R_i\!\left(\frac{2\pi k}{M}, n - i d_n\right) \exp\!\left[-j\,\frac{2\pi k i d_n}{M}\right]. \tag{6}$$

The form of the above equation also suggests an FFT, but the form of the exponential term is a bit troublesome since $k i d_n / M$ is not necessarily an integer multiple of 1/N.

To gain more insight into the structure of the computation of the beam spectrum, we shall turn to a formulation using the 2-D short-period spectrum of the sensor signals. The usefulness of thinking in terms of the multidimensional spectrum when approaching array processing problems has been previously recognized [28], [30]. We define the 2-D short-period spectrum of the sensor signals as

$$R\!\left(\frac{2\pi k}{M}, \frac{2\pi l}{N}, n\right) = \frac{1}{N} \sum_{i=0}^{N-1} w(i) \sum_{m=n}^{n+M-1} v(m - n)\, r_i(m) \exp\!\left[-j 2\pi\left(\frac{li}{N} + \frac{km}{M}\right)\right].$$

As before this can be written as

$$R\!\left(\frac{2\pi k}{M}, \frac{2\pi l}{N}, n\right) = \exp\!\left[-j\,\frac{2\pi k n}{M}\right] \sum_{i=0}^{N-1} \sum_{m=0}^{M-1} \frac{1}{N}\, w(i)\, v(m)\, r_i(m + n) \exp\!\left[-j 2\pi\left(\frac{li}{N} + \frac{km}{M}\right)\right].$$

In this form, R may be evaluated using a 2-D FFT in the same manner that G was calculated by a 1-D FFT. [As an aside, note that the separable window function w(i)v(m)/N could be generalized to a 2-D window w(i, m).] Using the relationship for $R_i$, the above equation may be written more concisely

$$R\!\left(\frac{2\pi k}{M}, \frac{2\pi l}{N}, n\right) = \frac{1}{N} \sum_{i=0}^{N-1} w(i)\, R_i\!\left(\frac{2\pi k}{M}, n\right) \exp\!\left[-j\,\frac{2\pi l i}{N}\right]. \tag{7}$$

By comparing (6) and (7), the relationship between the short-period spectrum of the beam signal and the 2-D short-period spectrum of the sensor signals becomes evident. First, R must be evaluated for $l = k d_n N / M$. This value for l may not be an integer, so we are faced with the problem of interpolating between FFT points as we were in (6). The problem can be circumvented by adjusting the FFT lengths so that M divides N evenly. Second, we must make the approximation

$$R_i\!\left(\frac{2\pi k}{M}, n\right) \approx R_i\!\left(\frac{2\pi k}{M}, n - i d_n\right).$$

By returning to the defining relation for $R_i$, the reader will readily see that this approximation requires that the short-period spectrum of $r_i(n)$ be the same over the M points from n to n + M - 1 as it is over the M points from $n - i d_n$ to $n - i d_n + M - 1$. (Note that there is no rotating phase factor in the approximation, since both short-period spectra are referenced to the same time origin at n = 0.) Intuitively, one would expect the approximation to be valid for well-behaved signals if $i d_n \ll M$. The approximation will be exact if $r_i(n)$ is periodic with a period of M samples. With these two points in mind, we can write

$$G\!\left(\frac{2\pi k}{M}, \frac{d_n c}{D}, n\right) \approx R\!\left(\frac{2\pi k}{M}, \frac{2\pi k d_n}{M}, n\right)$$

relating the short-period beam spectrum and the 2-D short-period spectrum.

V. MULTIDIMENSIONAL BEAMFORMING

In the previous section, the fundamental techniques for processing signals received by a linear array of sensors were described. The advantages of spacing the sensor locations a uniform distance from one another were discussed, namely that a uniform spacing allows one to take full advantage of signal processing techniques (filter design algorithms, FFT's, etc.) developed for sampled-data systems where the sampling period is uniform. In this section, we shall outline the similarities between multidimensional array processing and the techniques of multidimensional signal processing. The primary emphasis will be to point out how existing techniques could be applied to the design of beamformers and how problems in multidimensional array processing can be approached by reformulating them in ways easily understood by 2-D signal processing researchers.

Based on the discussion of the last section, it is easy to imagine a 2-D or 3-D array of sensors located in space with the intention of detecting and recording propagating wave disturbances. As before we shall assume that the signals we are interested in may be adequately modeled by a plane wave propagating with a speed c in a direction $\mathbf{a} = a_x \mathbf{i}_x + a_y \mathbf{i}_y + a_z \mathbf{i}_z$ whose amplitude varies as a function of time s(t) if we record the wave from a single stationary sensor [23]. (The vectors $\mathbf{i}_x$, $\mathbf{i}_y$, and $\mathbf{i}_z$ are of unit length in the direction of the x, y, and z axes.) Thus the received signal for the ith sensor located at position $\mathbf{p}_i = x_i \mathbf{i}_x + y_i \mathbf{i}_y + z_i \mathbf{i}_z$ will be

$$r_i(\mathbf{p}_i, t) = s\!\left(t - \frac{\mathbf{a} \cdot \mathbf{p}_i}{c}\right).$$

(Note that $\mathbf{a}$ is a unit length vector so that $a_x^2 + a_y^2 + a_z^2 = 1$.) The quantity $\mathbf{a}/c$ is often called the "slowness" vector since it points in the direction of propagation with a magnitude of one over the speed [30]. If we want to look for signals coming from a particular direction $\mathbf{a}_0$ at a speed $c_0$, we can add up the weighted and delayed sensor signals

$$g(\mathbf{a}, t) = \sum_{i} w(i)\, r_i\!\left(\mathbf{p}_i, t + \frac{\mathbf{a}_0 \cdot \mathbf{p}_i}{c_0}\right) = \sum_{i} w(i)\, s\!\left(t - \mathbf{p}_i \cdot \left(\frac{\mathbf{a}}{c} - \frac{\mathbf{a}_0}{c_0}\right)\right).$$

Making the substitution

$$s(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} S(\omega)\, e^{j\omega t}\, d\omega$$

and using the definition of the wave number vector [20], [30]

$$\mathbf{k} = \frac{\omega \mathbf{a}}{c}$$

we see that

$$g(\mathbf{a}, t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} S(\omega)\, W(\mathbf{k} - \mathbf{k}_0)\, e^{j\omega t}\, d\omega \tag{8}$$

where the multidimensional array pattern is now given by

$$W(\mathbf{v}) = \sum_{i} w(i) \exp(-j\, \mathbf{p}_i \cdot \mathbf{v}) = \sum_{i} w(i) \exp\!\left[-j(v_x x_i + v_y y_i + v_z z_i)\right].$$
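The multidimensional array pattern can be evaluated for any set of sensor positions. The sketch below is a minimal illustration in plain Python: it computes $W(\mathbf{k} - \mathbf{k}_0)$ for a hypothetical four-sensor planar array, once with the incoming wave matching the steering direction and once with a 30 degree mismatch. The array geometry, the 1 kHz frequency, and the 1500 m/s propagation speed are assumed values chosen only for the example.

```python
import cmath
import math

def array_pattern(v, positions, weights):
    """W(v) = sum_i w(i) * exp(-j * p_i . v) for sensors at arbitrary positions p_i."""
    return sum(w * cmath.exp(-1j * (p[0] * v[0] + p[1] * v[1] + p[2] * v[2]))
               for p, w in zip(positions, weights))

def wavenumber_difference(omega, a, c, a0, c0):
    """k - k0 = omega * (a/c - a0/c0), computed component by component."""
    return tuple(omega * (ai / c - a0i / c0) for ai, a0i in zip(a, a0))

# Hypothetical 2 x 2 planar array in the z = 0 plane, 0.5 m spacing, equal weights.
positions = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.0, 0.5, 0.0), (0.5, 0.5, 0.0)]
weights = [0.25] * 4

omega = 2.0 * math.pi * 1000.0   # assumed 1 kHz tone
c = c0 = 1500.0                  # assumed propagation speed in m/s
steer = (0.0, 0.0, 1.0)          # steering direction a0 (unit vector)

theta = math.radians(30.0)
off_axis = (math.sin(theta), 0.0, math.cos(theta))  # incoming direction a (unit vector)

matched = array_pattern(wavenumber_difference(omega, steer, c, steer, c0),
                        positions, weights)
mismatched = array_pattern(wavenumber_difference(omega, off_axis, c, steer, c0),
                           positions, weights)
print(f"|W| matched:    {abs(matched):.4f}")
print(f"|W| mismatched: {abs(mismatched):.4f}")
```

When the incoming direction and speed equal the steering values, $\mathbf{k} - \mathbf{k}_0$ vanishes and the pattern equals the sum of the weights; any mismatch moves the argument away from the origin and reduces the response.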

As pointed out by Kelley [20], a wide-band signal processed by a beamforming operation will be altered since the argument of the array pattern function depends on frequency. Equation (8) demonstrates this. It has the same interpretation as its counterpart (4), namely that the signal spectrum S is weighted by the array pattern W, which is a function of the difference in wave-number vectors $\mathbf{k} - \mathbf{k}_0$, and implicitly the sensor positions $\mathbf{p}_i$. The multidimensional beamforming operation is therefore performing the task of a multidimensional bandpass filter in frequency-wave number space, rejecting signals whose direction and speed of propagation are sufficiently different from the bandpass center. Contrast this to the processing method of Halpeny and Childers [30], where a spectrum analysis approach is used to the same end.

Let us now assume that the sensors are located on a Cartesian grid. For simplicity, we shall further assume that the intersensor spacing is D in all dimensions. In general, the spacing can be different for each dimension. The sensors will be indexed on $n_x$, $n_y$, $n_z$, and their positions will be

$$\mathbf{p}(n_x, n_y, n_z) = n_x D\, \mathbf{i}_x + n_y D\, \mathbf{i}_y + n_z D\, \mathbf{i}_z.$$

The array pattern becomes

$$W(\mathbf{v}) = \sum_{n_x} \sum_{n_y} \sum_{n_z} w(n_x, n_y, n_z) \exp\!\left[-jD(n_x v_x + n_y v_y + n_z v_z)\right]. \tag{9}$$

VI. DESIGNING MULTIDIMENSIONAL DIGITAL ARRAY PATTERNS

Fig. 3. Contours for McClellan's transformation with $v_z = 0$ (after [6], [24]).

In this section we shall borrow multidimensional FIR filter design techniques from the discipline of digital signal processing and apply them to the problem of designing an array pattern when the sensors are positioned at points on a Cartesian grid. As before, the array pattern should have a narrow central lobe and small sidelobes. Ideally, it would be an impulse. A variety of design techniques for 2-D FIR filters (which can be extended to higher dimensions) are reviewed in [8], but we shall restrict ourselves to two techniques which should suffice in most cases.

The easiest way to design a multidimensional array pattern is to consider separable solutions of the form

$$W(\mathbf{v}) = W_x(v_x)\, W_y(v_y)\, W_z(v_z);$$

then

$$w(n_x, n_y, n_z) = w_x(n_x)\, w_y(n_y)\, w_z(n_z).$$

Consequently, the problem has been broken into a number of 1-D design problems which may be solved as indicated earlier. A separable design technique is suited to the symmetry of the Cartesian grid (resulting in central lobes which are roughly rectangular) but not necessarily suited to the desired array pattern. In particular, for certain applications, it may be desirable to have an array pattern which exhibits circular (or spherical) symmetry. In the past this has been accomplished by using sensors arranged in circular (more accurately, polygonal) arrays [19]-[22]. To design the sensor weights in general, an optimization must be applied to (9) to force W to approximate some ideal array pattern. However, an ingenious technique due to McClellan [6] was developed to design 2-D linear phase FIR filters with nearly circular symmetry from 1-D linear phase FIR filters. McClellan's technique can be easily extended to higher dimensions, as shown in the example below, and thus applied to the beamformer design problem when the sensors are equally spaced on a Cartesian grid. Let us assume that we have designed a 1-D zero-phase array pattern (by zero-phase, recall that we mean that the sensor weights have symmetry)

$$w(i) = w(-i) \quad \text{for } i = 0, \ldots, \frac{N-1}{2}.$$

We shall further assume that N is odd so that the array pattern can be classified as a type 1 zero-phase FIR filter [14]. Then

$$W(\nu) = \sum_{i=0}^{(N-1)/2} a(i)\cos(i\nu)$$

where a(0) = w(0), and a(i) = 2w(i) for i = 1, 2, ..., (N - 1)/2. Following McClellan's derivation, W can be written as

$$W(\nu) = \sum_{i=0}^{(N-1)/2} a'(i)\,[\cos\nu]^{i} \tag{10}$$

by using the appropriate trigonometric identities. Now, for the 3-D case, we make the substitution

$$\cos\nu = \tfrac{1}{4}\cos\nu_x + \tfrac{1}{4}\cos\nu_y + \tfrac{1}{4}\cos\nu_z + \tfrac{1}{4}\cos\nu_x\cos\nu_y + \tfrac{1}{4}\cos\nu_x\cos\nu_z + \tfrac{1}{4}\cos\nu_y\cos\nu_z + \tfrac{1}{4}\cos\nu_x\cos\nu_y\cos\nu_z - \tfrac{3}{4}. \tag{11}$$

Fig. 4. Deviation from spherical symmetry of a 3-D McClellan transformation. The dotted line represents the ideal and the actual mapping along the axes. The dashed line represents the mapping along the path ν_x = ν_y in the ν_z = 0 plane. The solid line represents the mapping along the path ν_x = ν_y = ν_z.
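The substitution (11) can be checked numerically. The following is a minimal sketch, assuming NumPy; the function names and the sampling of the path parameter are illustrative choices, not part of the original paper. It evaluates the mapped 1-D frequency ν along the three paths plotted in Fig. 4 and compares the ν_z = 0 slice with McClellan's 2-D transformation.

```python
import numpy as np

def mcclellan_3d(vx, vy, vz):
    """cos(nu) as given by the 3-D substitution (11)."""
    cx, cy, cz = np.cos(vx), np.cos(vy), np.cos(vz)
    return 0.25 * (cx + cy + cz + cx * cy + cx * cz + cy * cz + cx * cy * cz) - 0.75

def mcclellan_2d(vx, vy):
    """McClellan's 2-D circular transformation, used here only as a reference."""
    cx, cy = np.cos(vx), np.cos(vy)
    return 0.5 * (cx + cy + cx * cy) - 0.5

def mapped_nu(vx, vy, vz):
    """1-D frequency nu reached from the point (vx, vy, vz)."""
    # Clipping guards against rounding slightly outside [-1, 1].
    return np.arccos(np.clip(mcclellan_3d(vx, vy, vz), -1.0, 1.0))

theta = np.linspace(0.0, np.pi / 2, 5)      # path parameter (illustrative sampling)
zero = np.zeros_like(theta)

nu_axis = mapped_nu(theta, zero, zero)       # along a coordinate axis
nu_plane = mapped_nu(theta, theta, zero)     # nu_x = nu_y, nu_z = 0
nu_body = mapped_nu(theta, theta, theta)     # nu_x = nu_y = nu_z

for t, a, p, b in zip(theta, nu_axis, nu_plane, nu_body):
    print(f"theta={t:.3f}  axis={a:.3f}  plane diagonal={p:.3f}  body diagonal={b:.3f}")
```

Along a coordinate axis the right side of (11) collapses to cos ν_x, so the mapping is exact there; the other two columns show how the diagonal paths behave.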

Equation (11) is such that if ν_z = 0, the transformation of variables is identical to McClellan's 2-D circular transformation. Furthermore, the expression is symmetric in ν_x, ν_y, and ν_z. Substituting (11) into (10) will yield the 3-D array pattern

$$W(\nu_x, \nu_y, \nu_z) = \sum_{n_x=0}^{(N-1)/2}\;\sum_{n_y=0}^{(N-1)/2}\;\sum_{n_z=0}^{(N-1)/2} a''(n_x, n_y, n_z)\,\cos n_x\nu_x\,\cos n_y\nu_y\,\cos n_z\nu_z \tag{12}$$

where the a'' coefficients are derivable from a', the transformation (11), and trigonometric identities. Finally, the sensor weights w(n_x, n_y, n_z) can be derived from the a'' coefficients by comparing (12) and (9). A plot of constant values of ν in the (ν_x, ν_y)-plane is shown in Fig. 3. Fig. 4 shows the frequency variable ν plotted against a parameter θ for three paths in (ν_x, ν_y, ν_z)-space. The dotted line represents the transformation for θ = ν_x, ν_y = ν_z = 0. The dashed line represents the transformation for θ = ν_x = ν_y, ν_z = 0. Finally, the solid line represents the transformation for θ = ν_x = ν_y = ν_z. Deviation from the dotted line is indicative of deviation from spherical symmetry along the three paths.

The design and implementation of McClellan transformation filters have recently been studied in detail [24], [25]. The remarkable fact emerges that the method of deriving the 3-D weights w(n_x, n_y, n_z) from the 1-D weights w(i) can be combined with the actual filtering operation so that the amount of computation needed to calculate the beam signals is significantly reduced (proportional to N rather than N^3 in this case). We see, therefore, that one advantage of locating sensors on a Cartesian grid is the availability of design and implementation techniques for array patterns which exhibit good circular symmetry and reduce computation.

As before, in certain applications one may be more interested in short-period beam spectra than in beam signals. If the sensor locations are on a Cartesian grid, then the FFT algorithm may be used to compute beam spectra in a way similar to that described in the previous section. The dimensionality of the FFT will be higher to reflect the number of degrees of freedom in direction-frequency space a signal may have.

The use of multidimensional filter design techniques (in particular, McClellan's transformation) and multidimensional filter banks using FFT's represent two important examples of results from the field of digital signal processing which can be applied to digital beamforming and array processing. The reader should bear in mind that the preceding discussion is more an illustrative than a comprehensive presentation of digital signal processing techniques applied to array processing problems.
Much work remains to be done.

ACKNOWLEDGMENT

The author wishes to express his thanks to H. Briscoe and R. Estrada of BBN for several educational discussions of practical beamforming systems.

REFERENCES

[1] P. E. Blankenship and E. M. Hofstetter, "Digital pulse compression via fast convolution," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-23, pp. 189-201, Apr. 1975.
[2] L. C. Wood and S. Treitel, "Seismic signal processing," Proc. IEEE, vol. 63, pp. 649-661, Apr. 1975.
[3] F. J. Harris, "A maximum entropy filter," Naval Undersea Center, Rep. TP 441, Jan. 1975.
[4] K. R. Erikson, F. J. Fry, and J. P. Jones, "Ultrasound in medicine-A review," IEEE Trans. Sonics Ultrason., vol. SU-21, pp. 144-170, July 1974.
[5] L. J. Pinson and D. G. Childers, "Frequency-wavenumber spectrum analysis of EEG multielectrode array data," IEEE Trans. Biomed. Eng., vol. BME-21, pp. 192-206, May 1974.
[6] J. H. McClellan, "The design of two-dimensional digital filters by transformations," in Proc. 7th Annu. Princeton Conf. Information Sciences and Systems, 1973, pp. 247-251.
[7] A. V. Oppenheim and R. W. Schafer, Digital Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1975.
[8] R. M. Mersereau and D. E. Dudgeon, "Two-dimensional digital filtering," Proc. IEEE, vol. 63, pp. 610-623, Apr. 1975.
[9] L. R. Rabiner and B. Gold, Theory and Application of Digital Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1975.
[10] B. Gold and C. M. Rader, Digital Processing of Signals. New York: McGraw-Hill, 1969.
[11] W. T. Cochran et al., "What is the fast Fourier transform?," IEEE Trans. Audio Electroacoust., vol. AU-15, pp. 45-55, June 1967.
[12] G. D. Bergland, "A guided tour of the fast Fourier transform," IEEE Spectrum, vol. 6, July 1969.
[13] J. H. McClellan and T. W. Parks, "A unified approach to the design of optimum FIR linear phase digital filters," IEEE Trans. Circuit Theory, vol. CT-20, pp. 697-701, Nov. 1973.
[14] L. R. Rabiner, J. H. McClellan, and T. W. Parks, "FIR digital filter design techniques using weighted Chebyshev approximation," Proc. IEEE, vol. 63, pp. 595-610, Apr. 1975.
[15] J. H. McClellan, T. W. Parks, and L. R. Rabiner, "A computer program for designing optimum FIR linear phase digital filters," IEEE Trans. Audio Electroacoust., vol. AU-21, pp. 506-526, Dec. 1973.
[16] G. M. Jenkins and D. G. Watts, Spectral Analysis and its Applications. San Francisco, CA: Holden-Day, 1968, ch. 7.
[17] J. P. Burg, "Three-dimensional filtering with an array of seismometers," Geophysics, vol. 29, no. 5, pp. 693-713, 1964.
[18] R. W. Schafer and L. R. Rabiner, "A digital signal processing approach to interpolation," Proc. IEEE, vol. 61, pp. 692-702, June 1973.
[19] J. Capon, R. J. Greenfield, and R. T. Lacoss, "Design of seismic arrays for efficient on-line beamforming," Lincoln Lab. Tech. Note 1967-26, June 27, 1967.
[20] E. J. Kelly, "Response of seismic arrays to wide-band signals," Lincoln Lab. Tech. Note 1967-30, June 29, 1967.
[21] J. L. Allen, "The theory of array antennas," Lincoln Lab. Tech. Rep. no. 323, July 25, 1963.
[22] D. K. Cheng, "Optimization techniques for antenna arrays," Proc. IEEE, vol. 59, pp. 1664-1674, Dec. 1971.
[23] E. J. Kelly, Jr., "The representation of seismic waves in frequency-wave number space," Lincoln Lab. Tech. Note 1964-15, Mar. 6, 1964.
[24] R. M. Mersereau, W. F. G. Mecklenbrauker, and T. F. Quatieri, Jr., "McClellan transformations for two-dimensional digital filtering: I. Design," IEEE Trans. Circuits Syst., vol. CAS-23, pp. 405-414, July 1976.
[25] W. F. G. Mecklenbrauker and R. M. Mersereau, "McClellan transformations for two-dimensional digital filtering: II. Implementation," IEEE Trans. Circuits Syst., vol. CAS-23, pp. 414-422, July 1976.
[26] J. R. Williams, "Fast beam forming algorithm," J. Acoust. Soc. Amer., vol. 44, no. 5, pp. 4154-55, 1968.
[27] P. Rudnick, "Digital beam forming in the frequency domain," J. Acoust. Soc. Amer., vol. 46, no. 5 (part I), pp. 1089-1090, 1969.
[28] S. Haykin and J. Kesler, "Relation between the radiation pattern of an array and the two-dimensional discrete Fourier transform," IEEE Trans. Antennas Propagat., vol. AP-23, no. 3, pp. 419-420, May 1975.
[29] M. R. Portnoff, "Implementation of the digital phase vocoder using the fast Fourier transform," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-24, pp. 243-248, June 1976.
[30] O. S. Halpeny and D. G. Childers, "Composite wavefront decomposition via multidimensional digital filtering of array data," IEEE Trans. Circuits Syst., vol. CAS-22, pp. 552-563, June 1975.

A Novel Algorithm and Architecture for Adaptive Digital Beamforming

CHRISTOPHER R. WARD, PHILIP J. HARGRAVE, AND JOHN G. McWHIRTER

Abstract-A novel algorithm and architecture are described which have specific application to high performance, digital, adaptive beamforming. It is shown how a simple, linearly constrained adaptive combiner forms the basis for a wide range of adaptive antenna subsystems. The function of such an adaptive combiner is formulated as a recursive least squares minimization operation and the corresponding weight vector is obtained by means of the Q-R decomposition algorithm using Givens rotations. An efficient pipelined architecture to implement this algorithm is also described. It takes the form of a triangular systolic/wavefront array and has many desirable features for very large scale integration (VLSI) system design.

Manuscript received June 5, 1985; revised October 4, 1985. This work was supported by the Procurement Executive, U.K. Ministry of Defence. C. R. Ward and P. J. Hargrave are with Standard Telecommunication Laboratories Ltd., London Road, Harlow, Essex, U.K. CM17 9NA. J. G. McWhirter is with Royal Signals and Radar Establishment, St. Andrews Road, Great Malvern, Worcestershire, U.K. WR14 3PS. IEEE Log Number 8407019.

I. INTRODUCTION

THE OBJECTIVE of an adaptive antenna is to select a set of amplitude and phase weights with which to combine the outputs from the elements in an array so as to produce a far-field pattern that, in some sense, optimizes the reception of a desired signal. The substantial improvements in system antijam performance offered by this form of array processing have meant that it is now becoming an essential requirement for many military radar, communications, and navigation systems.

The key components of an adaptive antenna system are illustrated in Fig. 1(b). The amplitude and phase weights are selected by a beampattern controller that continuously updates them in response to the element outputs. In some systems the output from the beamformer is also monitored to provide a feedback control. In all cases the resulting array beampattern is continuously adjusted to ensure cancellation of interference and jamming sources.

The most commonly employed technique for deriving the adaptive weight vector uses a closed loop gradient descent algorithm where the weight updates are derived from estimates of the correlation between the signal in each channel and the summed output of the array. This process can be implemented in an analog fashion using correlation loops [1] or digitally in the form of the Widrow least mean square (LMS) algorithm [2]. The value of this approach should not be underestimated. Gradient descent algorithms are very cost-effective and extremely robust but unfortunately they are not suitable for all applications. The major problem with an adaptive beamformer based on a gradient descent process is one of poor convergence for a broad dynamic range signal environment. This constitutes a fundamental limitation for many modern systems where features such as improved antenna platform dynamics (in the tactical aircraft environment, for example), sophisticated jamming threats, and agile waveform structures (as produced by frequency hopped, spread spectrum formats) produce a requirement for adaptive systems having rapid convergence and high cancellation performance.

In recent years, there has been considerable interest in the application of direct solution or "open loop" techniques to adaptive antenna processing in order to accommodate these increasing demands. In the context of adaptive antenna processing, these algorithms have the advantage of requiring only minimal input data to accurately describe the external environment and provide an antenna pattern capable of suppressing a wide dynamic range of jamming signals. Open loop algorithms may be explained most concisely by expressing the adaptive process as a least squares minimization problem. In fact, the least squares algorithm may be considered to define the optimal path of adaptation.

In this paper we describe a novel algorithm and architecture for high performance, digital, adaptive beamforming. The adaptive combiner function is formulated as a recursive least squares minimization process and the corresponding set of linear equations is solved using the Q-R decomposition algorithm. It is further shown how the Q-R algorithm can be implemented using an efficient pipelined architecture in the form of a triangular systolic array.

II. BASIC CONFIGURATIONS

The form of adaptive combiner which we consider in this paper is illustrated in Fig. 1(b). The inputs to the combiner take the form of a primary signal y(t) and a set of N - 1 (complex) auxiliary signals x(t). The weight vector w is adjusted to minimize the power of the combined output signal which is given by

$$e(t) = x^T(t)\,w + y(t). \tag{1}$$

This type of adaptive linear combiner may be used in a wide range of adaptive antenna applications. It is well known, for example, how it may be applied to adaptive sidelobe cancellation. In this case the primary signal constitutes the output from a main (high gain) antenna while the auxiliary signals are obtained from an array of N - 1 auxiliary antennas. The adaptive combiner serves to modify the beampattern of the overall antenna system by directing deep nulls toward jamming waveforms received via the sidelobes of the main antenna.
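The sidelobe cancellation use of (1) can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the single-jammer scenario, the channel gains `g_aux` and `g_main`, and the noise level are all hypothetical, and the weights are obtained with NumPy's batch least squares solver (anticipating the least squares formulation of Section III) rather than with the recursive algorithm developed later.

```python
import numpy as np

rng = np.random.default_rng(0)
n_snap, n_aux = 200, 3            # number of snapshots and of auxiliary channels (N - 1)

# Hypothetical scenario: one strong jammer seen by every channel, plus weak receiver noise.
jam = (rng.standard_normal(n_snap) + 1j * rng.standard_normal(n_snap)) / np.sqrt(2)
g_aux = np.exp(1j * rng.uniform(0, 2 * np.pi, n_aux))    # assumed auxiliary gains toward the jammer
g_main = 0.1 * np.exp(1j * 0.3)                           # assumed main-antenna sidelobe gain

def noise(shape, power=1e-4):
    """Complex white noise of the given average power (an assumed level)."""
    return np.sqrt(power / 2) * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))

X = jam[:, None] * g_aux[None, :] + noise((n_snap, n_aux))   # rows are x^T(t)
y = g_main * jam + noise(n_snap)                              # primary signal y(t)

# Minimise sum_t |x^T(t) w + y(t)|^2, i.e. solve X w ~ -y in the least squares sense.
w, *_ = np.linalg.lstsq(X, -y, rcond=None)
e = X @ w + y                                                 # combined output, as in (1)

power_before = np.mean(np.abs(y) ** 2)
power_after = np.mean(np.abs(e) ** 2)
print(f"primary power {power_before:.2e} -> combined output power {power_after:.2e}")
```

In this toy scenario the combined output power falls to roughly the receiver noise level, which is the nulling effect described above.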

Reprinted from IEEE Transactions on Antennas and Propagation, Vol. AP-34, No.3, pp. 338-346, March 1986.

Fig. 1. Key components of an adaptive antenna processor. (a) Constraint preprocessor. (b) Adaptive combiner.

It is also well known how this form of adaptive combiner may be used in conjunction with a suitable reference signal to control a more general antenna array in which all of the elements are essentially equivalent. The reference signal, which is assumed to be correlated with the desired signal, provides the (negative) primary input to the combiner while the signals received by the antenna array provide the N - 1 auxiliary inputs. In this case the weighted sum of the auxiliary inputs provides as close a match as possible to the reference signal and hence produces the desired output from the beamformer.

The basic combiner illustrated in Fig. 1(b) may also be used in the so-called "power inversion" mode which has particular application to communications. In this case the N antenna elements are assumed to be omnidirectional and of comparable gain. The received signals are fed into the combiner, one of them going to the primary channel and thus having its weight coefficient constrained to unity. The other N - 1 signals enter the auxiliary channels with their adaptive weights initialized to zero and so, prior to adaptation, the overall beampattern is determined solely by the (omnidirectional) response of the "primary element." This "end-element clamped" configuration provides no inherent mechanism to inhibit the adaptive process from nulling the desired signal. However, the system is only allowed to adapt when the desired signal is known to be absent. When it is present, the weight vector is frozen, thus allowing signal reception. This is referred to as the "power inversion" mode of operation because the differential interference powers received by the antenna elements are inverted by the combiner.

A particularly important application of adaptive antenna arrays requires the power of an N element combined signal

$$e(t) = x^T(t)\,w \tag{2}$$

to be minimized subject to a linear beam constraint of the form

$$c^T w = \mu. \tag{3}$$

This constraint ensures that the gain of the antenna array maintains a constant value μ in a given look direction specified by the vector c. It is worth pointing out that the "end-element clamped" configuration described above constitutes a particularly simple form of linearly constrained process in which the constraint vector is given by

$$c^T = (0, 0, \ldots, 0, 1). \tag{4}$$

However, the incorporation of a general linear constraint is not so straightforward. A number of techniques have been proposed in the literature but in all cases the resulting implementation is extremely cumbersome. For example, Widrow et al. [3] suggested injecting an artificial look direction signal into the antenna array receiver channels and introducing a corresponding reference signal into the adaptive process. This technique then requires an additional "slave processor" to apply the adapted weight vector. Frost [4] also showed how a general linear constraint could be incorporated into the adaptive process using projection operator techniques, but the resulting algorithm is rather expensive in terms of computation.

We will now show how the general linear constraint in (3) may be incorporated in a much simpler way. It may be assumed without loss of generality that c_N = 1 and so (3) may be expressed as

$$w_N = \mu - \bar{c}^{\,T}\bar{w} \tag{5}$$

where $\bar{c}$ and $\bar{w}$ denote the first N - 1 elements of the vectors c and w, respectively. Equations (2) and (3) can therefore be combined in the form

$$e(t) = [\bar{x}(t) - x_N(t)\,\bar{c}\,]^T\,\bar{w} + \mu\,x_N(t) \tag{6}$$

where $\bar{x}(t)$ denotes the vector of signals received by the first N - 1 channels of the N element array. Since the constraint has been absorbed explicitly by eliminating the coefficient w_N and thereby removing the Nth degree of freedom, the power of the combined signal e(t) may now be minimized with respect to the unconstrained N - 1 element weight vector $\bar{w}$. The form of (6) is therefore identical to that of (1) and so the output power minimization may be carried out using the type of adaptive combiner illustrated in Fig. 1(b). The term μ x_N(t) corresponds to the primary signal y(t) while the transformed vector $\bar{x}(t) - x_N(t)\,\bar{c}$ corresponds to the vector of auxiliary signals x(t). This input data transformation may be implemented using a simple linear preprocessor array of the type depicted in Fig. 1(a). In effect the Nth antenna signal is arbitrarily chosen as the primary combiner input. The corresponding antenna element is assumed to have omnidirectional coverage and the constraint preprocessor ensures that any signal which enters it from the required look direction is removed from the auxiliary channels before they enter the combiner. The adaptive nulling of this look direction signal is thus prevented.

From the discussion in this section it should be clear that the type of adaptive combiner illustrated in Fig. 1(b) has a wide range of applications in adaptive beamforming. In the remainder of this paper we concentrate on the development of a novel direct solution adaptive control technique which applies specifically to this basic configuration.
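The input data transformation performed by the constraint preprocessor can be sketched in a few lines. This is an illustrative example only, assuming NumPy: the function names `constraint_preprocess` and `full_weights`, the random data, and the steering vector used to build c are hypothetical, and a batch least squares solve stands in for the adaptive combiner. It forms the auxiliary inputs and primary signal from N-channel snapshots, then recovers the eliminated weight w_N from the constraint.

```python
import numpy as np

def constraint_preprocess(X_full, c, mu):
    """Turn N-channel snapshots into combiner inputs.

    X_full : (n_snap, N) array whose rows are the N-channel snapshots x^T(t).
    c      : (N,) constraint vector, assumed already scaled so that c[-1] == 1.
    mu     : required gain in the look direction.
    Returns the auxiliary data (n_snap, N-1) and the primary signal (n_snap,).
    """
    c = np.asarray(c, dtype=complex)
    if not np.isclose(c[-1], 1.0):
        raise ValueError("scale the constraint so that c_N = 1")
    x_N = X_full[:, -1]
    X_aux = X_full[:, :-1] - np.outer(x_N, c[:-1])   # x_bar(t) - x_N(t) c_bar
    y = mu * x_N                                     # primary signal mu * x_N(t)
    return X_aux, y

def full_weights(w_bar, c, mu):
    """Recover the N-element weight vector by restoring w_N = mu - c_bar^T w_bar."""
    w_N = mu - c[:-1] @ w_bar
    return np.append(w_bar, w_N)

# Hypothetical data: 4 channels, random snapshots, look direction 20 degrees off broadside.
rng = np.random.default_rng(3)
N, n_snap, mu = 4, 100, 1.0
X_full = rng.standard_normal((n_snap, N)) + 1j * rng.standard_normal((n_snap, N))
steer = np.exp(1j * np.pi * np.arange(N) * np.sin(np.deg2rad(20.0)))
c = steer / steer[-1]                                # rescaled so that c_N = 1

X_aux, y = constraint_preprocess(X_full, c, mu)
w_bar, *_ = np.linalg.lstsq(X_aux, -y, rcond=None)   # unconstrained (N-1)-element problem
w = full_weights(w_bar, c, mu)
print("c^T w =", c @ w)
```

The recovered N-element weight vector meets the look direction constraint by construction, and its output on the raw snapshots equals the combiner output on the preprocessed data.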

III. LEAST SQUARES MINIMIZATION

The function of the adaptive combiner in Fig. 1(b) will now be formulated in terms of least squares minimization. We denote the combined array output at time t_i by

$$e(t_i) = x^T(t_i)\,w + y(t_i) \tag{7}$$

where x(t_i) is the vector of (complex) auxiliary signals at time t_i and y(t_i) is the corresponding sample of the (complex) primary signal. The residual signal power at time t_n is estimated by the quantity E^2(n) where

$$E^2(n) = \sum_{i=1}^{n} \delta^{\,n-i}\,|e(t_i)|^2. \tag{8}$$

For the sake of generality this unnormalized estimator includes a simple "forget factor" δ which generates an exponential time window and localizes the averaging procedure. Introducing a more compact matrix notation, the estimator defined in (8) may be expressed in the form

$$E(n) = \|e(n)\| \tag{9}$$

where

$$e(n) = B(n)\begin{pmatrix} e(t_1) \\ e(t_2) \\ \vdots \\ e(t_n) \end{pmatrix} \tag{10}$$

and

$$B(n) = \mathrm{diag}\{\beta^{\,n-1}, \beta^{\,n-2}, \ldots, 1\} \tag{11}$$

with β^2 = δ. Now from (7) it follows that the vector of residuals may be written in the form

$$e(n) = X(n)\,w + y(n) \tag{12}$$

where

$$X(n) = B(n)\begin{pmatrix} x^T(t_1) \\ x^T(t_2) \\ \vdots \\ x^T(t_n) \end{pmatrix} \tag{13}$$

and

$$y(n) = B(n)\begin{pmatrix} y(t_1) \\ y(t_2) \\ \vdots \\ y(t_n) \end{pmatrix}. \tag{14}$$

X(n) is simply the matrix of all data received by the weighted elements up to time t_n and y(n) is the corresponding vector of data in the primary or reference channel. The matrix B(n) takes account of the exponential time window and, for convenience, it has simply been absorbed into the definition of e(n), y(n), and X(n).

Determining the weight vector w(n) which minimizes E(n) is referred to as least squares estimation [5]. The conventional approach to this problem is to derive an analytic expression for the complex gradient of the quantity E^2(n) and determine the weight vector w(n) for which it vanishes. Now from (9) and (12) we have for the complex gradient

$$\nabla_w\big(E^2(n)\big) = 2X^H(n)\,\big(X(n)\,w + y(n)\big) \tag{15}$$

and setting the right side of this equation equal to zero leads to the well-known Wiener-Hopf equation:

$$M(n)\,w(n) + p(n) = 0 \tag{16}$$

where

$$M(n) = X^H(n)\,X(n) \tag{17}$$

is the (estimated) covariance matrix and

$$p(n) = X^H(n)\,y(n) \tag{18}$$

is the estimated cross-correlation vector. The solution to (16) for nonsingular M(n) is clearly given by

$$w(n) = -M^{-1}(n)\,p(n) \tag{19}$$

and this provides an analytic expression for the optimum weight vector at time t_n.

In their classic paper, Reed, Mallett, and Brennan [6] suggested that the weight vector be obtained by solving (16) directly and showed that the problems of poor convergence associated with closed loop algorithms may be avoided in this way. This approach leads directly to the type of signal processing architecture which is illustrated schematically in Fig. 2. It comprises a number of distinct components: one to form and store the covariance matrix estimate, one to compute the solution of (16), and one to apply the resulting weight vector to the received signal data. These data must be stored in a suitable memory while the weight vector is being computed. The system also requires a number of high speed data communication buses and a sophisticated control unit to deliver the appropriate sequence of instructions to each component. This type of architecture is obviously complicated, extremely difficult to design, and not very suitable for very large scale integration (VLSI).

Fig. 2. Sample matrix inversion architecture.

Not only does the analytic solution given in (16) lead to a complicated circuit architecture, it is also very poor from the numerical point of view. The problem of solving a system of linear equations like those defined in (16) can be ill-conditioned and hence numerically unstable. Ill-conditioning occurs if the matrix has a very small determinant, in which case the true solution can be subjected to large perturbations and still satisfy the equation quite accurately. The degree to which a system of linear equations is ill-conditioned is determined by the condition number of the coefficient matrix. The condition number of a matrix A is defined by

$$C_n(A) = \lambda_1 / \lambda_p \tag{20}$$

where λ_1 and λ_p are the largest and smallest singular values, respectively, of the matrix A. The larger C_n(A), the more ill-conditioned is the system of equations. It follows from (17) that

$$C_n(M(n)) = C_n(X^H(n)X(n)) = C_n^2(X(n)) \tag{21}$$

and so the condition number of the estimated covariance matrix M(n) is much greater than that of the corresponding data matrix X(n). Any numerical algorithm which avoids forming the covariance matrix explicitly and operates directly on the data is likely to be much better conditioned.

IV. Q-R DECOMPOSITION

An alternative approach to the least squares estimation problem which is particularly good in the numerical sense is that of orthogonal triangularization [7]. This is typified by the method known as Q-R decomposition which we generalize here to the complex case. An n × n unitary¹ matrix Q(n) is generated such that

$$Q(n)\,X(n) = \begin{pmatrix} R(n) \\ 0 \end{pmatrix} \tag{22}$$

where R(n) is an (N - 1) by (N - 1) upper triangular matrix. Then, since Q(n) is unitary we have

$$E(n) = \|e(n)\| = \|Q(n)\,e(n)\| = \left\| \begin{pmatrix} R(n) \\ 0 \end{pmatrix} w(n) + \begin{pmatrix} u(n) \\ v(n) \end{pmatrix} \right\| \tag{23}$$

where

$$u(n) = P(n)\,y(n) \quad \text{and} \quad v(n) = S(n)\,y(n). \tag{24}$$

P(n) and S(n) are simply the matrices of dimension (N - 1) by n and (n - N + 1) by n, respectively, which partition Q(n) in the form

$$Q(n) = \begin{pmatrix} P(n) \\ S(n) \end{pmatrix}. \tag{25}$$

It follows that the least squares weight vector w(n) must satisfy the equation

$$R(n)\,w(n) + u(n) = 0 \tag{26}$$

and hence

$$E(n) = \|v(n)\|. \tag{27}$$

¹ A matrix A is defined in this paper as being unitary if A^H A = I. Matrix A is termed orthogonal if A^T A = I.

Since the matrix R(n) is upper triangular, (26) is much easier to solve than the Wiener-Hopf equation described earlier. The weight vector w(n) may be derived quite simply by a process of back-substitution. Equation (26) is also much better conditioned since the condition number of R(n) is given by

$$C_n(R(n)) = C_n(Q(n)X(n)) = C_n(X(n)). \tag{28}$$

This property follows directly from the fact that Q(n) is unitary.
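The conditioning argument of (21) and (28), and the back-substitution step for (26), can be checked numerically. The following is a minimal sketch assuming NumPy; the random data, the artificially strong first channel, and the helper `back_substitute` are illustrative assumptions. Note that `numpy.linalg.qr` returns the factorization X = Q1 R, so the conjugate transpose of Q1 plays the role of P(n) in (24).

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 50, 4                       # n snapshots, m = N - 1 auxiliary channels

X = rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))
X[:, 0] *= 1e3                     # hypothetical strong channel, giving a wide dynamic range
y = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# Covariance-domain route: form M and p, then solve the Wiener-Hopf equation (16).
M = X.conj().T @ X
p = X.conj().T @ y
w_smi = -np.linalg.solve(M, p)

# Data-domain route: triangularize X directly and solve R w + u = 0, as in (26).
Q1, R = np.linalg.qr(X)            # reduced factorization: Q1 is n x m, R is m x m
u = Q1.conj().T @ y

def back_substitute(R, b):
    """Solve R w = b for an upper triangular R, last unknown first."""
    k = len(b)
    w = np.zeros(k, dtype=complex)
    for i in range(k - 1, -1, -1):
        w[i] = (b[i] - R[i, i + 1:] @ w[i + 1:]) / R[i, i]
    return w

w_qr = back_substitute(R, -u)

cond_X = np.linalg.cond(X)         # ratio of largest to smallest singular value, as in (20)
cond_M = np.linalg.cond(M)
cond_R = np.linalg.cond(R)
print(f"C_n(X) = {cond_X:.3e}, C_n(R) = {cond_R:.3e}, C_n(M) = {cond_M:.3e}")
```

In double precision both routes agree here; the squared condition number of M is what makes the covariance route fragile once the word length is reduced.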

Givens Rotations

The triangularization process may be carried out using either Householder transformations [7] or Givens rotations [8], [9], [10]. However, the Givens rotation method is particularly suitable for the adaptive antenna application since it leads to a very efficient algorithm whereby the triangularization process is recursively updated as each new row of data enters the problem. A complex Givens rotation is an elementary transformation of the form

$$\begin{pmatrix} c & s^* \\ -s & c \end{pmatrix} \begin{pmatrix} 0 & \cdots & 0 & r_i & \cdots & r_k & \cdots \\ 0 & \cdots & 0 & x_i & \cdots & x_k & \cdots \end{pmatrix} = \begin{pmatrix} 0 & \cdots & 0 & r_i' & \cdots & r_k' & \cdots \\ 0 & \cdots & 0 & 0 & \cdots & x_k' & \cdots \end{pmatrix} \tag{29}$$

where the rotation coefficients, c and s, satisfy

$$-s\,r_i + c\,x_i = 0, \qquad s^*s + c^*c = 1, \qquad c^* = c \tag{30}$$

and are synonymous with the cosine and sine of an angular rotation in the multidimensional complex space. These relationships uniquely specify the rotation coefficients as

$$c = \frac{|r_i|}{\sqrt{|x_i|^2 + |r_i|^2}} \tag{31}$$

and

$$s = \frac{x_i}{r_i}\,c. \tag{32}$$

A sequence of such elimination operations may be used to triangularize the matrix X(n) in the following recursive manner. Assume that the matrix X(n - 1) has already been reduced to triangular form by the unitary transformation

$$Q(n-1)\,X(n-1) = \begin{pmatrix} R(n-1) \\ 0 \end{pmatrix}. \tag{33}$$

Now define the unitary matrix

$$\bar{Q}(n-1) = \begin{pmatrix} Q(n-1) & 0 \\ 0 & 1 \end{pmatrix}. \tag{34}$$

Clearly

$$\bar{Q}(n-1)\,X(n) = \bar{Q}(n-1)\begin{pmatrix} \beta X(n-1) \\ x^T(t_n) \end{pmatrix} = \begin{pmatrix} \beta R(n-1) \\ 0 \\ x^T(t_n) \end{pmatrix} \tag{35}$$

and so the triangularization process may be completed by the following sequence of operations. Rotate the (N - 1)-element vector x^T(t_n) with the first row of βR(n - 1) so that the leading element of x^T(t_n) is eliminated, producing a reduced vector x^{T'}(t_n). The first row of βR(n - 1) will, of course, be modified in the process. Then rotate the (N - 2)-element reduced vector x^{T'}(t_n) with the second row of βR(n - 1) so that the leading element of x^{T'}(t_n) is eliminated, and so on until every element has been eliminated. The resulting triangular matrix R(n) then corresponds to a complete triangularization of the matrix X(n) as defined in (22). The corresponding unitary matrix Q(n) is simply given by the recursive expression

$$Q(n) = \hat{Q}(n)\,\bar{Q}(n-1) \tag{36}$$

where $\hat{Q}(n)$ is a unitary matrix representing the sequence of Givens rotation operations described above, i.e.,

$$\hat{Q}(n)\begin{pmatrix} \beta R(n-1) \\ 0 \\ x^T(t_n) \end{pmatrix} = \begin{pmatrix} R(n) \\ 0 \\ 0 \end{pmatrix}. \tag{37}$$

It is not difficult to deduce in addition that

$$\hat{Q}(n)\begin{pmatrix} \beta u(n-1) \\ \beta v(n-1) \\ y(t_n) \end{pmatrix} = \begin{pmatrix} u(n) \\ \beta v(n-1) \\ \alpha(n) \end{pmatrix} \tag{38}$$

and this shows how the vector u(n) can be updated recursively using the same sequence of Givens rotations. The least squares weight vector w(n) may then be derived by solving (26). The solution is not defined, of course, if n < (N - 1) but the recursive triangularization procedure may, nonetheless, be initialized by setting R(0) = 0 and u(0) = 0.

Direct Extraction of Residuals

In many least squares problems, and particularly in the adaptive antenna application, the main objective is to compute the least squares residual since the corresponding weight vector is not of direct interest. Previous work by McWhirter [11] has described a modified version of the Q-R recursive least squares algorithm in which the least squares residual is produced directly at each stage of the recursive process without any need to derive the weight vector explicitly. The modified algorithm is much more robust since it avoids the solution of a system of linear equations which could be ill-conditioned. Furthermore, since the back-substitution circuit and the separate beamforming network are both eliminated, it offers a significant reduction in the complexity of the subsequent hardware implementation. The derivation of this technique may be summarized as follows. Since

$$Q(n)\,e(n) = \begin{pmatrix} R(n) \\ 0 \end{pmatrix} w(n) + \begin{pmatrix} u(n) \\ \beta v(n-1) \\ \alpha(n) \end{pmatrix} \tag{39}$$

and the weight vector must satisfy (26), it follows that the residual vector e(n) is given by

$$Q(n)\,e(n) = \hat{Q}(n)\,\bar{Q}(n-1)\,B(n)\begin{pmatrix} e(t_1) \\ e(t_2) \\ \vdots \\ e(t_n) \end{pmatrix} = \begin{pmatrix} 0 \\ \beta v(n-1) \\ \alpha(n) \end{pmatrix}. \tag{40}$$

But Q(n) is unitary and so we have

$$B(n)\begin{pmatrix} e(t_1) \\ e(t_2) \\ \vdots \\ e(t_n) \end{pmatrix} = Q^H(n)\begin{pmatrix} 0 \\ \beta v(n-1) \\ \alpha(n) \end{pmatrix}. \tag{41}$$

Considering only the nth element of the vectors in (41) it is then possible to deduce that the current residual e(t_n) is given by

$$e(t_n) = \gamma(n)\,\alpha(n) \tag{42}$$

where

$$\gamma(n) = \prod_{i=1}^{N-1} c_i \tag{43}$$

is the product of all cosine parameters generated during the sequence of Givens rotations used to eliminate the vector x^T(t_n). Equation (43) follows from the fact that $\hat{Q}(n)$ is simply the product of (N - 1) elementary rotations of the form

$$\hat{Q}(n) = Q_{N-1}(n)\,Q_{N-2}(n) \cdots Q_1(n). \tag{44}$$

The ith elementary rotation is simply given by

$$Q_i(n) = \begin{pmatrix} I & & & \\ & c_i & & s_i^* \\ & & I & \\ & -s_i & & c_i \end{pmatrix} \tag{45}$$

where the only nonzero off-diagonal elements occur in the ith row and the ith column. The result may be obtained by considering the effect of a reversed sequence of conjugate elementary rotations on the nth element of the right-hand vector in (41). The parameter γ(n) may readily be computed during the recursive update of the matrix R(n) while the scalar quantity α(n) is available as a direct byproduct of the corresponding update for the vector u(n). The current residual e(t_n) may therefore be evaluated in a very cost-effective manner.

In order to avoid complicating this discussion on adaptive beamforming, we have only considered the most direct form of the Givens rotation algorithm. However, it is important to point out that a very efficient "square root free" Givens algorithm has been derived by Gentleman. The square root free algorithm is equally applicable to the type of adaptive beamformer described in this paper and would almost certainly be used in any practical application. The essential details relating to its use may be found in [10] and [11].
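The recursive update and the direct residual extraction described in this section can be sketched in software as follows. This is a minimal illustration assuming NumPy, not the systolic array implementation and not the square root free variant; the function names, the random test data, and the value of β are hypothetical. One detail is an assumption of this sketch: when the stored element r_i is zero (as at initialization with R(0) = 0), the ratio x_i / r_i is undefined, so the rotation is chosen with unit phase instead.

```python
import numpy as np

def givens(r, x):
    """Rotation coefficients (c, s) that annihilate x against r, with c real.

    For r == 0 the rotation is chosen with unit phase (an assumption of this
    sketch); if both entries vanish the identity rotation is returned.
    """
    norm = np.hypot(abs(r), abs(x))
    if norm == 0.0:
        return 1.0, 0.0
    phase = r / abs(r) if r != 0 else 1.0
    return abs(r) / norm, x * np.conj(phase) / norm

def qrd_rls_update(R, u, x, y, beta=1.0):
    """Absorb one snapshot (x, y) into the triangular factor R and vector u.

    Returns the updated R and u together with the current residual, formed
    as the product of the cosines times the rotated primary sample.
    """
    R = beta * R.astype(complex)
    u = beta * u.astype(complex)
    x = np.array(x, dtype=complex)
    alpha = complex(y)
    gamma = 1.0
    for i in range(len(u)):
        c, s = givens(R[i, i], x[i])
        Ri, xi = R[i, i:].copy(), x[i:].copy()
        R[i, i:] = c * Ri + np.conj(s) * xi      # rotated row of R
        x[i:] = -s * Ri + c * xi                 # leading element of x is now zero
        u[i], alpha = c * u[i] + np.conj(s) * alpha, -s * u[i] + c * alpha
        gamma *= c
    return R, u, gamma * alpha

# Hypothetical test data: 3 auxiliary channels, 40 snapshots, forget factor beta.
rng = np.random.default_rng(2)
m, n_snap, beta = 3, 40, 0.98
X = rng.standard_normal((n_snap, m)) + 1j * rng.standard_normal((n_snap, m))
y = rng.standard_normal(n_snap) + 1j * rng.standard_normal(n_snap)

R = np.zeros((m, m), dtype=complex)
u = np.zeros(m, dtype=complex)
for k in range(n_snap):
    R, u, e_last = qrd_rls_update(R, u, X[k], y[k], beta)

# Batch reference: exponentially windowed least squares over all snapshots.
b = beta ** np.arange(n_snap - 1, -1, -1)
w_ref, *_ = np.linalg.lstsq(b[:, None] * X, -b * y, rcond=None)
e_ref = X[-1] @ w_ref + y[-1]
w_rec = np.linalg.solve(R, -u)                   # weights from R w + u = 0
print("residual from recursion:", e_last, " from batch solution:", e_ref)
```

The residual is obtained without ever forming the weight vector; the final solve for `w_rec` is included only to compare against the batch solution.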

Sensitivity to Arithmetic Precision

An important aspect of any signal processing algorithm is its sensitivity to limited arithmetic precision. We have recently carried out a detailed computer simulation study to compare the effect of limited precision on the performance of two adaptive cancellation processors, one based on sample matrix inversion and the other on the recursive Q-R algorithm. The results indicate quite distinctly the improved performance offered by the data domain Q-R method under conditions of finite resolution arithmetic compared with the sample matrix inversion technique. Fig. 3(a) shows a simple schematic representation of the two computer simulations. In both cases the sequence of data samples was generated and applied to the constraint preprocessor. The preprocessor applied a look direction constraint toward the desired signal and was implemented at full computer precision. The transformed data were then truncated to the chosen arithmetic precision, this word length being retained throughout the subsequent Q-R decomposition or sample matrix inversion computation. To ensure a fair comparison between the two basic approaches (i.e., covariance versus data domain), the effective sample matrix inversion solution was actually computed by performing a Q-R decomposition on the covariance matrix estimate in (16). In both cases the back-substitution was performed at full computer precision.
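The reason a covariance domain method needs a longer word length can be illustrated with a small textbook-style matrix. The sketch below is not the authors' simulation: it assumes IEEE single precision (emulated with the standard `struct` module) as a stand-in for a limited word length, and the 3 x 2 data matrix X = [[1, 1], [eps, 0], [0, eps]] and the helper names are chosen purely for illustration. Forming X^T X squares the small quantity eps, which is then lost when added to 1, whereas rotating the data directly keeps eps at its original scale.

```python
import math
import struct


def f32(value):
    """Round a Python float to IEEE single precision."""
    return struct.unpack("f", struct.pack("f", value))[0]


def covariance_determinant(eps):
    """det(X^T X) for X = [[1, 1], [eps, 0], [0, eps]], rounding every step.

    X^T X = [[1 + eps^2, 1], [1, 1 + eps^2]].
    """
    e = f32(eps)
    diag = f32(1.0 + f32(e * e))
    return f32(f32(diag * diag) - 1.0)


def data_domain_r22(eps):
    """Second diagonal element of R when Givens rotations act on X directly."""
    e = f32(eps)
    r11 = f32(math.hypot(1.0, e))          # eliminate the eps in column 1
    s = f32(e / r11)
    x12 = f32(-s * 1.0)                    # what the rotation leaves in column 2
    return f32(math.hypot(x12, e))         # eliminate the eps in the third row
```

For `eps = 1e-4`, the rounded covariance matrix is exactly singular while the triangular factor obtained in the data domain still has a second diagonal element of about 1.4e-4, so only the data domain route retains the information needed to solve for the weights.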

Fig. 3(b) shows a typical comparative result which corresponds to a 24-bit floating point word length (16-bit mantissa and eight-bit exponent). Here, we plot the expected signal-to-noise ratio at the output of an eight-element array as an increasing number of data samples are used to compute the adapted weight vector solutions. In this example, we have modelled the effect of three equal power jamming signals received individually at levels of 0 dB relative to a thermal noise floor of -50 dB at the antenna array elements. The complex envelope of each jammer was described by an independent, narrow-band Gaussian process. The model also incorporated a desired signal received by the array at a level of 15 dB above the thermal noise floor but approximately 40 dB below the total received jamming. From Fig. 3(b) it can be seen that the initial rate of adaptation is extremely rapid for both sample matrix inversion and the data domain Q-R algorithm. In both cases a good level of jamming cancellation is obtained after about ten to 20 data samples. However, with sample matrix inversion there is clear evidence of an unstable weight vector, as reflected by extreme fluctuations in the adaptive response curve. In contrast, the data domain Q-R method shows no sign of numerical instability and it is found that, over the timescale shown on these plots, the signal-to-noise ratio performance gets progressively better as the covariance information (in the form of the updated R matrix) gains more and more statistical accuracy with time. For this scenario it was found that the sample matrix inversion technique required a floating point word length of 32 bits (24-bit mantissa and eight-bit exponent) to achieve comparable performance with the data domain Q-R algorithm. It cannot be assumed, of course, that this word length would be sufficient for any arbitrary dynamic range environment.
One should only conclude that the word length required by the sample matrix inversion approach will always be significantly greater than that for the data domain Q-R method.

V. SYSTOLIC ARRAY IMPLEMENTATION

Kung and Gentleman [12] have shown how the Givens rotation algorithm described above may be implemented in a very efficient pipelined manner using a triangular systolic array. The implementation of a five-channel adaptive beamforming network using this architecture is shown in Fig. 4. It may be considered to comprise three distinct sections: the basic triangular array labeled ABC, the right-hand column of cells labeled DE, and the final processing cell labeled F. The entire array is controlled by a single clock and comprises three types of processing cell. Each cell receives its input data from the directions indicated on one clock cycle, performs the specified function, and delivers the appropriate output values to neighboring cells as indicated on the next clock cycle. Apart from the introduction of an extra parameter into the boundary cell, the function of the boundary and internal cells is precisely that required to implement the Givens rotations described above. Each cell within the basic triangular array stores one element of the recursively evolving triangular matrix R(n), which is initialized to zero at the outset of the least squares calculation and then updated every clock cycle. As a result of this initialization the value of r within each boundary cell is entirely real. Cells in the right-hand column store one element of the evolving vector u(n), which is also initialized to zero and updated every clock cycle. Each row of cells within the array performs a basic Givens rotation between one row of the stored triangular matrix and a vector of data received from above so that the leading element of the received vector is eliminated as detailed in (29). The reduced data vector is then passed downwards through the array. This arrangement ensures that as each row x^T(t_n) of the matrix X moves down through the array it interacts with the previously stored triangular matrix R(n - 1) and undergoes the sequence of rotations Q(n) described in the earlier analysis. All of its elements are thereby eliminated (one on each row of the array) and an updated triangular matrix R(n) is generated and stored in the process. As each element of the vector y moves down through the right-hand column of cells it undergoes the same sequence of Givens rotations, interacting with the previously stored vector u(n - 1) and generating an updated vector u(n) in the process. The resulting output, which emerges from the bottom cell in the right-hand column, is simply the value of the parameter α(n) in (42). The other value γ(n) required for direct computation of the least squares residual e(t_n) is generated recursively by the additional parameter γ which appears in the definition of the boundary cell function. The value of γ (initialized to one) is simply multiplied by the "cosine" parameter in each boundary cell and passed on to the boundary cell in the next row two clock cycles later. The extra delay, which is a direct consequence of the temporal data skew, may be achieved by using an additional storage element which is indicated by a black dot in Fig. 4 and would be incorporated within the boundary processor. The required value γ(n) emerges from the final boundary cell and is simply multiplied by the corresponding output value α(n) to produce the desired residual. This operation takes place within the final processing cell F.

Fig. 3. Comparison of data and covariance domain algorithms. (a) Simulation flow diagrams. (b) Typical signal-to-noise ratio response versus number of data samples (0 to 100) for N = 8, with three jammers at 0 dB, desired signal at -35 dB, and thermal noise floor at -50 dB.

Fig. 4. Triangular systolic array for adaptive beamforming. Input data wavefronts (x_11, x_12, ..., y_1, y_2, ...) enter from above with a temporal skew, and the beamformed residual emerges from the final cell F. Cell functions as far as they can be read from the figure:
Boundary cell: r(k) = \sqrt{|x_{in}(k)|^2 + |r(k-1)|^2}; c_{out}(k) = r(k-1)/r(k); s_{out}(k) = x_{in}(k)/r(k); \gamma_{out}(k) = \gamma_{in}(k)\, c_{out}(k).
Internal cell: x_{out}(k) = -s_{in}(k)\, r(k-1) + c_{in}(k)\, x_{in}(k); r(k) = c_{in}(k)\, r(k-1) + s_{in}^{*}(k)\, x_{in}(k); c_{out}(k) = c_{in}(k); s_{out}(k) = s_{in}(k).

A consequence of the highly pipelined nature of the systolic array and the need to impose a time-skew on the input data is the presence of an overall delay or latency in the system response. Each output residual e(t_n) corresponds to a data vector whose first element was input to the network 2(N - 1) clock periods previously. The systolic array described in this section clearly exhibits many desirable properties such as regularity and local interconnections which render it comparatively simple to implement.
Furthermore, the control overhead is extremely low since the processing cells operate synchronously and the only control required is a simple globally distributed clock. However, the need to distribute a common clock signal to every processor without incurring any appreciable clock skew is one possible disadvantage of the systolic array approach, particularly in large multiprocessor systems. It is possible, however, to implement the same basic design as a wavefront array processor of the type proposed by S. Y. Kung et al. [13]. In a wavefront array processor, the required computation is distributed in exactly the same way over an array of elementary processors as it would be on the corresponding systolic array. Unlike its systolic counterpart, however, the wavefront array does not operate synchronously. Instead, the operation of each processor is controlled locally and depends on the necessary input data being available and on its previous outputs having been accepted by the appropriate neighboring processors. As a result, it is not necessary to impose a temporal skew on the data input to a wavefront processor. Instead, the associated processing wavefront develops naturally within the array. In order to operate in the wavefront array mode, every processing element must incorporate some additional circuitry to implement a bidirectional handshake on each of its input/output links and thus ensure that the necessary communication protocol is observed. This represents an overhead which is not negligible but can easily be absorbed within the overall processing.
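The cell-level behaviour of the triangular array can be mimicked in software. The following is a hedged sketch rather than the authors' design: the class names `BoundaryCell`, `InternalCell`, and `TriangularArray` are hypothetical, the data are complex, no exponential weighting is applied, and the clocking and temporal skew of the hardware are deliberately ignored, so each snapshot simply passes through the rows in order. The `explicit_residual` method, which solves R w = -u by back-substitution, is an assumption added only to check the directly extracted residual.

```python
import math
import random


class BoundaryCell:
    """Generates the rotation that annihilates its input; the stored r stays real."""

    def __init__(self):
        self.r = 0.0

    def step(self, x_in, gamma_in):
        r_new = math.sqrt(self.r ** 2 + abs(x_in) ** 2)
        if r_new == 0.0:
            c, s = 1.0, 0j
        else:
            c, s = self.r / r_new, x_in / r_new
        self.r = r_new
        return c, s, gamma_in * c


class InternalCell:
    """Applies the rotation passed along its row to one stored element."""

    def __init__(self):
        self.r = 0j

    def step(self, x_in, c, s):
        x_out = c * x_in - s * self.r
        self.r = c * self.r + s.conjugate() * x_in
        return x_out


class TriangularArray:
    def __init__(self, p):
        self.p = p
        self.boundary = [BoundaryCell() for _ in range(p)]
        # Row i: p - 1 - i cells of R followed by one cell holding u[i].
        self.internal = [[InternalCell() for _ in range(p - i)] for i in range(p)]

    def process(self, x, y):
        """Pass one snapshot down the array and return the residual gamma * alpha."""
        data = list(x) + [y]
        gamma = 1.0
        for i in range(self.p):
            c, s, gamma = self.boundary[i].step(data[0], gamma)
            data = [cell.step(v, c, s) for cell, v in zip(self.internal[i], data[1:])]
        return gamma * data[0]

    def explicit_residual(self, x, y):
        """Residual x.w + y with w from back-substitution on R w = -u (check only)."""
        p = self.p
        w = [0j] * p
        for i in reversed(range(p)):
            row = self.internal[i]
            acc = -row[-1].r - sum(row[j - i - 1].r * w[j] for j in range(i + 1, p))
            w[i] = acc / self.boundary[i].r
        return sum(xi * wi for xi, wi in zip(x, w)) + y
```

After enough snapshots for R to be full rank, `process` and `explicit_residual` should agree to rounding error, while the boundary cells keep real stored values throughout.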

Obtaining the Weight Vector

It is worth pointing out that, as well as being capable of operating in the direct beamforming mode, the triangular array in Fig. 4 can also be used in conjunction with some additional circuitry to compute the weight solution explicitly. The scheme which was originally proposed by Kung and Gentleman [12] uses the triangular systolic array in conjunction with a linear systolic array which solves for the weight vector by back-substitution. This method could clearly be used with the circuit in Fig. 4 by providing suitable means for extracting the triangular matrix R(n) from the array. However, the weight vector, if required, can be obtained in a much simpler way as a further byproduct of the direct residual extraction technique. The method, which we refer to as "weight flushing," may be explained fairly simply as follows. As the nth data vector x(t_n) and the corresponding input y(t_n) pass through the triangular array in Fig. 4 they update the parameters of the system from their state at time n - 1 to the new state at time n. The vector x(t_n) also undergoes a simple linear projection with the implicit updated weight vector w(n) to produce the corresponding output residual

\[ e(t_n) = x^{T}(t_n)\, w(n) + y(t_n). \qquad (46) \]

Assume that the state of the system is subsequently "frozen" by preventing any further adaptation and define a simple N - 1 element projection vector of the form

\[ \phi_i^{T} = (0 \cdots 0 \;\, 1 \;\, 0 \cdots 0) \qquad (47) \]

with unit ith element. If the vector φ_i is now input to the array as though it were another vector of auxiliary samples and the corresponding primary input is set equal to zero, it follows from (46) that the associated output "residual" must be given by

\[ \phi_i^{T} w(n) = w_i(n). \qquad (48) \]

It is therefore possible to "flush" the entire weight vector w(n) out of the array by inputting to the N - 1 auxiliary channels the sequence of vectors φ_i (i = 1, 2, ..., N - 1), i.e., by inputting a simple unit diagonal matrix. For the sake of brevity in this paper we have not explained in detail how the adaptive process may be "frozen" in practice. However, the technique is quite straightforward and may be implemented in a very direct manner. It is particularly simple when the square root free Givens rotation algorithm is being used.
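Weight flushing can be tried out numerically. The sketch below is an illustration under explicit assumptions, since the text does not say how freezing is realized: here a "frozen" pass computes the rotations from the stored state without writing anything back, and the output is formed as alpha / gamma, which is the residual with respect to the stored (unchanged) weights. The function names `rotate`, `flush_weights`, `back_substitute`, and `adapt` are hypothetical, the data are real-valued, and no exponential weighting is used.

```python
import math
import random


def rotate(R, u, x, y, store=True):
    """Pass the row (x, y) through the Givens rotations defined by (R, u).

    Returns (alpha, gamma). With store=False the state is left untouched,
    which is how this sketch emulates a frozen array.
    """
    p = len(x)
    x = list(x)
    gamma = 1.0
    for i in range(p):
        r = math.hypot(R[i][i], x[i])
        c, s = (1.0, 0.0) if r == 0.0 else (R[i][i] / r, x[i] / r)
        for j in range(i, p):
            rij, xj = R[i][j], x[j]
            if store:
                R[i][j] = c * rij + s * xj
            x[j] = -s * rij + c * xj
        ui = u[i]
        if store:
            u[i] = c * ui + s * y
        y = -s * ui + c * y
        gamma *= c
    return y, gamma


def flush_weights(R, u):
    """Feed unit vectors with zero primary input through the frozen array."""
    p = len(u)
    weights = []
    for i in range(p):
        phi = [1.0 if k == i else 0.0 for k in range(p)]
        alpha, gamma = rotate(R, u, phi, 0.0, store=False)
        weights.append(alpha / gamma)   # residual w.r.t. the stored weights
    return weights


def back_substitute(R, u):
    """Reference solution of R w = -u, for comparison only."""
    p = len(u)
    w = [0.0] * p
    for i in reversed(range(p)):
        w[i] = (-u[i] - sum(R[i][j] * w[j] for j in range(i + 1, p))) / R[i][i]
    return w


def adapt(num_samples=30, p=4, seed=3):
    """Run the adaptive update on random data and return the final state."""
    rng = random.Random(seed)
    R = [[0.0] * p for _ in range(p)]
    u = [0.0] * p
    for _ in range(num_samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(p)]
        rotate(R, u, x, rng.gauss(0.0, 1.0))
    return R, u
```

Under these assumptions, `flush_weights(*adapt())` should match `back_substitute(*adapt())` to rounding error, which is the content of (48).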

VI. CONCLUSION

This paper has described a novel algorithm and associated systolic/wavefront array architecture for high performance, digital, adaptive beamforming. The adaptive beamformer enjoys all the desirable architectural features of a systolic or wavefront array. As each row of data moves down through the array it is fully absorbed into the statistical estimation process, the triangular matrix R(n) is updated accordingly, and the corresponding residual is produced automatically. The circuit architecture is greatly enhanced by avoiding the need to derive an explicit solution for the least squares weight vector w(n). This leads to a considerable reduction in the amount of computation and circuitry required since it is no longer necessary to clock out each triangular matrix R(n), carry out the back-substitution, or form the vector product x^T(t_n) w(n) explicitly.

The adaptive beamformer described in Sections IV and V is also based on a very stable and well-conditioned numerical algorithm. Indeed, the method of Q-R decomposition by Givens rotations is widely accepted as one of the very best techniques for solving linear least squares problems. However, the final triangular linear system may, in general, be ill-conditioned, and avoiding the back-substitution process also enhances the numerical properties of the adaptive combiner. In particular, the systolic array implementation of the Q-R algorithm produces the correct (zero) residual even if n < (N - 1) and the matrix X is not of full rank. This sort of unconditional stability is most important in the design of real time signal processing systems.

As part of the United Kingdom's research program into advanced algorithms and architectures for adaptive antenna array signal processing, Standard Telecommunication Laboratories and the Royal Signals and Radar Establishment are jointly developing an experimental wavefront array processor. This digital processor will be configured primarily as an adaptive antenna test-bed and will have the ability to process six input channels of data in real time. Each node of the wavefront array processor will be based on an existing digital signal processor chip and hence will provide a useful degree of programmability whilst maintaining a node throughput rate which will allow a comprehensive range of real-time tests and trials. Eventually, the development of high performance processing nodes by VLSI design will permit the practical realization of such parallel processing architectures in extremely compact hardware form. In addition, the VLSI circuitry in conjunction with advanced technology will provide processing throughput rates far in excess of those obtainable by current DSP components and will therefore be matched to future wideband radar and communications applications.

ACKNOWLEDGMENT

The authors thank the Directors of Standard Telecommunication Laboratories Ltd. for permission to publish this paper.

REFERENCES

[1] S. P. Applebaum, "Adaptive arrays," IEEE Trans. Antennas Propagat., vol. AP-24, pp. 585-598, 1976.
[2] B. Widrow and J. M. McCool, "A comparison of adaptive algorithms based on the methods of steepest descent and random search," IEEE Trans. Antennas Propagat., vol. AP-24, pp. 615-637, 1976.
[3] B. Widrow, P. E. Mantey, L. J. Griffiths, and B. B. Goode, "Adaptive antenna systems," Proc. IEEE, vol. 55, no. 12, pp. 2143-2159, Dec. 1967.
[4] O. L. Frost, "An algorithm for linearly constrained adaptive array processing," Proc. IEEE, vol. 60, pp. 661-675, 1971.
[5] C. L. Lawson and R. J. Hanson, Solving Least-Squares Problems. Englewood Cliffs, NJ: Prentice-Hall, 1974.
[6] I. S. Reed, J. D. Mallett, and L. E. Brennan, "Rapid convergence rate in adaptive arrays," IEEE Trans. Aerospace Electron. Syst., vol. AES-10, pp. 853-863, 1974.
[7] G. H. Golub, "Numerical methods for solving linear least-squares problems," Num. Math., no. 7, pp. 206-216, 1965.
[8] W. Givens, "Computation of plane unitary rotations transforming a general matrix to triangular form," J. Soc. Ind. Appl. Math., no. 6, pp. 26-50, 1958.
[9] S. Hammarling, "A note on modifications to the Givens plane rotation," J. Inst. Math. Appl., vol. 1, pp. 215-218, 1974.
[10] W. M. Gentleman, "Least-squares computations by Givens transformations without square-roots," J. Inst. Math. Appl., vol. 12, pp. 329-336, 1973.
[11] J. G. McWhirter, "Recursive least-squares minimization using a systolic array," Proc. SPIE, vol. 431, Real-Time Signal Processing VI, 1983.
[12] H. T. Kung and W. M. Gentleman, "Matrix triangularization by systolic arrays," Proc. SPIE, vol. 298, Real-Time Signal Processing IV, 1981.
[13] S. Y. Kung, K. S. Arun, R. J. Gal-ezer, and D. V. Bhaskar Rao, "Wavefront array processor: Language, architecture and applications," IEEE Trans. Comput., vol. C-31, no. 11, pp. 1054-1066, 1982.

Nonlinearities in Digital Manifold Phased Arrays

BRUCE D. MATHEWS, MEMBER, IEEE

Abstract-In digital beamforming (DBF), the phase shifter is functionally replaced with a receiver and digital phase rotation. A Taylor series expansion of mixer nonlinearities is used to generate receiver intermodulation spectra as a function of the element position and the iso-Doppler wavefront directions of signal arrival across the array. The dominant intermodulation distortion at each element experiences linear phase errors across the array proportional to the harmonic number and the desired steering direction phase gradient. The array distortion signals are reduced relative to the desired signal by the array factor sidelobe isolation when desired collimation directions exceed a few beamwidths of scan off the array normal vector. The result of the nonlinear down conversion analysis is extended to in-phase and quadrature imbalances and batch manufacturing tolerances for element receivers.

I. INTRODUCTION

THE MOMENTUM AND expectations of advancing digital circuit technology [1]-[4] encourage attention to alternate phased array architectures. Digital beamforming (DBF) (see Fig. 1) utilizes the conversion of the analog microwave signals to digital numbers for preserving the spatial phase information of a wavefront across the array [3]. Whether determined adaptively or a priori [4], the steering weighting and collimation processes are subsequently performed as digital complex number arithmetic. However, large numbers of receivers are necessary to realize such a mechanization. The nonlinearity subject of this paper follows from both the need to simplify these receivers for realizing affordable, producible systems and a phase error mechanism consequential to DBF which relaxes critical radar receiver design criteria. The array factor will depend upon design features of the receiver as a generalized phase shifter with nonlinear features from downconversion.

Harmonic intermodulation of signals due to the nonlinearities of the final mixer is the principal source of distortion in the radar receiver [5]. The superheterodyne receiver uses several stages of down conversion and intermediate frequency (IF) processing to minimize undesirable signals and to efficiently transform the signal for digital formatting. The final down conversion to baseband requires caution when the dominant signal from terrain backscatter, i.e., main beam clutter, has a nonzero Doppler spectrum component geometrically determined by the squint angle between the transmitted line of sight and the velocity vector of the radar platform. Mixers are nonlinear devices. As viewed by a baseband spectrum analyzer, intermodulation products from nonlinearities in the final mixer relocate, as broadened replicas of the clutter spectrum, at harmonics of the offset error in positioning clutter to zero IF. In receiver development, a two-tone test is performed to observe the amplitudes of the intermodulation at the harmonics of the difference frequency. This distortion determines the spurious free dynamic range [6], [7] and is particularly problematic for sensitive detection performance under large clutter conditions approaching receiver saturation.

Manuscript received December 3, 1985; revised April 18, 1986.
The author is with the Systems Development Division, Westinghouse Electric Corporation, P.O. Box 746, Baltimore, MD 21203.
IEEE Log Number 8610030.
Reprinted from IEEE Transactions on Antennas and Propagation, Vol. AP-34, No. 11, pp. 1346-1355, November 1986.

This paper first reviews in heuristic form the interchange of the phase steering and down conversion processes and introduces the mechanisms of distortion generation in the receiver. After an overview of the calculation, the clutter signal wavefront is modeled. The receiver down conversion and the digital collimation processes are described mathematically. The paper closes with summary extensions and graphic comparisons of the conventional and digital manifold phased array spectral results. These suggest DBF receivers are more tolerant of distortion and/or may employ alternate design approaches.

Fig. 1. With a digital manifold, any number of weighting/steering functions may be used to form the beam. (Block diagram labels: N x 1 element signals, buffer, a priori weights, beamformer, output to IPP buffer and corner turn processor.)

II. DISCUSSION

Consider the receive operation of a linear array of elements in response to the signal wavefronts from terrain scattering of a transmit narrow beam from a nonstationary platform. In the conventional electronically steered phased array, an incident wavefront induces a linear phase error across the aperture. Commanded microwave phase shifters remove this linear phase error from a desired direction, and the element signals are accumulated through a precise analog phase and power combining network to yield a single output signal. A triple conversion superheterodyne, in-phase and quadrature sampling receiver using stable coherent local oscillator signals is employed to present an in-phase (I) and quadrature (Q) representation to a waveform digital matched filter processor [8].

In digital beamforming, the order of the functions of the receiver and the array are interchanged, and the phase shifter is replaced by a phase rotation arithmetic at each element. The output from each element receiver includes distortion generated by the nonlinear processes of the receiver on the elemental signal. An I and Q digital sampling of the signal preserves the directional phase information of the incident signal component at the element. The element signals are collimated by furnishing a uniquely, judiciously chosen in-phase and quadrature pair of digital numbers, i.e., the real and imaginary parts of a complex number, to each element. These numbers have been called digital steering weights [3] and are applied to phase rotate the element signals so that the linear phase error across the array of elements for a specified direction of wavefront arrival is removed. In Fig. 1, these steering weights can be derived from and applied to the incident signals if a buffer can store the signals until the weights are computed. The collimation process is completed by a vector accumulation of the element signals.

The distortion resulting from the nonlinear processes of the receiver may be analyzed, as are almost all nonlinearities, by power series expansions which lead to power products of the clutter signal and the local oscillator (LO). After low pass filtering, only certain combinations of clutter and oscillator products are important, in particular, those combinations which place the resulting center pulse repetition frequency (PRF) line clutter spectrum very near dc. The desired portion of the mixer output corresponds to the first-order product of the clutter and the first-order product of the oscillator. Combinations involving higher order self-products of the clutter signal or LO may be termed self-mixing products. A feature of self-mixing is a harmonic relation in frequency and in phase. This may be illustrated by a simple nonlinearity and trigonometric identities. Consider a fourth-order nonlinearity

\[ (\cos\alpha + \cos\beta)^4 = \cos^4\alpha + 4\cos^3\alpha\cos\beta + 6\cos^2\alpha\cos^2\beta + 4\cos\alpha\cos^3\beta + \cos^4\beta. \qquad (1) \]
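The harmonic relation in phase noted above has a direct consequence for collimation, which a short numerical sketch can make concrete. The model here is a deliberate simplification and not the paper's derivation: it assumes a uniform linear array with element phase phi_n = 2*pi*(d/lambda)*n*sin(theta_0), a steering rotation psi_n = -phi_n chosen for the fundamental, and a self-mixing term whose spatial phase is simply the harmonic number times phi_n. The function name `collimated_magnitude` and all parameter values are hypothetical, and the local oscillator phase effects in the I and Q channels are ignored.

```python
import cmath
import math


def collimated_magnitude(num_elements, spacing_wavelengths, scan_deg, harmonic):
    """Magnitude of the array sum of a term with spatial phase harmonic * phi_n,
    after the steering rotation psi_n = -phi_n chosen for the fundamental."""
    total = 0j
    for n in range(num_elements):
        phi_n = (2.0 * math.pi * spacing_wavelengths * n
                 * math.sin(math.radians(scan_deg)))
        psi_n = -phi_n                      # steering phase for the desired signal
        total += cmath.exp(1j * (harmonic * phi_n + psi_n))
    return abs(total)


if __name__ == "__main__":
    for scan in (0.0, 5.0, 20.0):
        fundamental = collimated_magnitude(16, 0.5, scan, harmonic=1)
        second = collimated_magnitude(16, 0.5, scan, harmonic=2)
        print(f"scan {scan:5.1f} deg: fundamental {fundamental:6.2f}, "
              f"second harmonic {second:6.2f}")
```

In this simplified model the fundamental always sums to the number of elements, while the second harmonic keeps a residual phase gradient and is attenuated by the array factor once the beam is scanned away from broadside. At broadside there is no gradient to exploit and both terms add coherently.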

Let α = ω_1 t + φ be representative of the return of a main beam iso-Doppler clutter patch. Suppose β = ω_2' t = (ω_2 + ω_ε)t is the baseband positioning local oscillator. The middle term on the right-hand side (RHS) of (1) may be written

\[ 6\cos^2\alpha\cos^2\beta = \tfrac{3}{2}\left[1 + \cos(2\alpha) + \cos(2\beta) + \tfrac{1}{2}\cos\bigl(2(\alpha+\beta)\bigr) + \tfrac{1}{2}\cos\bigl(2(\alpha-\beta)\bigr)\right]. \qquad (2) \]

When the signal is closely positioned to dc (i.e., small ω_ε), low pass filtering eliminates all terms in (2) except the final term on the RHS, and yields a downconverted version of the original signal located spectrally at 2ω_ε with a harmonically distorted phase. For radar systems on moving platforms, the intermodulation introduces broadened replicas of the incident clutter spectrum into the pass band, relocated at harmonics of the frequency offset error. The harmonic intermod distortion coherently integrates. Fig. 2 dramatizes fast Fourier transform (FFT) folded PRF lines of a large clutter signal at the threshold detection stage of a processor with a large dynamic range for both A/D converter and coherent integration, a radar spectrum analyzer of perhaps 100 dB dynamic range. The received spectrum results from the illumination of terrain in relative motion. The 3 dB spectral width is proportional to the antenna beamwidth and to the Doppler gradient at the pointing geometry (see (21)). The spectral width typically is less than 1000 Hz and is much less than the PRF of the signal waveform. Fig. 2 has been drawn with exaggerated positioning error to resolve the intermodulation products. A good mixer may have conversion losses exceeding 60 dB below signal level. As the signal level increases, the relative levels of the intermod products will increase. Contemporary radars employ control loops to minimize desensitization of moving target detection and false alarms in presumed clutter free regions by precisely positioning clutter signals at zero IF. In the instance of the conventional receiver, the phase associated with the clutter signal is of little significance. In the digital manifold phased array, this phase contains the spatial information necessary for collimation. The second harmonic distortion terms will have spatial phase arguments across the aperture of twice the linear phase error of the fundamental, desired term. This should be clear by letting, in the previous example,

\[ \phi = \phi_0 + \frac{2\pi x}{\lambda}\sin\theta_0 \qquad (3) \]

where
φ_0 arbitrary phase
x the element relative location in the array
λ wavelength
θ_0 the direction of arrival off-broadside,

and considering the final term of (2) as before.

Fig. 2. For perfect positioning, the clutter spectrum is well confined. Poor positioning leads to spreading by harmonic relocation. Higher level signals experience a more rapid proportionate increase in harmonic content. (Curves: perfect positioning, near saturation; poor positioning, low level; poor positioning, near saturation. The detection threshold and the positioning harmonics are marked.)

The desired signal will be collimated by specifying rotation phase ψ and in-phase and quadrature operations [3]

\[ V_i' = W_i V_i - W_q V_q \qquad (4) \]
\[ V_q' = W_q V_i + W_i V_q \qquad (5) \]

where
V_i, V_q the in-phase and quadrature voltage signals
W_i, W_q the in-phase and quadrature steering weights

\[ W_i = \cos\psi \qquad (6) \]
\[ W_q = \sin\psi. \qquad (7) \]

From (1), the desired downconverted component is

\[ V_i = \cos(\alpha - \beta) \qquad (8) \]
\[ V_q = \sin(\alpha - \beta) \qquad (9) \]

hence

\[ V_i' = \cos(\alpha - \beta + \psi) \qquad (10) \]
\[ V_q' = \sin(\alpha - \beta + \psi) \qquad (11) \]

since

\[ \cos(A + B) = \cos A\cos B - \sin A\sin B, \qquad \sin(A + B) = \sin A\cos B + \cos A\sin B. \qquad (12) \]

Selecting

\[ \psi = -\frac{2\pi x}{\lambda}\sin\theta_0 \qquad (13) \]

at all elements will place the desired clutter spectra from each element in phase. With distortion, digital beamforming removes only part of the element phase error so that the distortion signal, (2), is not fully collimated across the array. Accentuating positioning errors will resolve the distortion components to the result of Fig. 3 for a small off-broadside scan angle. The above argument for the digital manifold phased array distortion is simplified. The use of local oscillators delayed π/2 rad between I and Q channels of the receiver introduces a nuance into the spatial interpretation due to the various self-mixing phases of the local oscillator in the steering arithmetic. Finally, the main beam Doppler clutter will defocus due to slight off-boresight angles of arrival, which can be included in the result. The remainder of the paper deals with a rigorous derivation of distortion for the digital manifold phased array.

Fig. 3. Harmonic distortion is suppressed by the relative array factors for the harmonics. (Axes: clutter position error, Hz, versus level, dB; reference curve: conventional receiver at saturation.)

III. OVERVIEW OF THE CALCULATION

The mathematics of harmonic distortion analysis produces a bookkeeping maze and a potential distraction. The following are the salient points in the calculation for the output spectrum of a nonlinear manifold phased array radar.

1) Model a single PRF line of the scattered signal spectral density function of a gated Doppler wavefront arrival direction and transmit illumination. Assume a Gaussian spectral amplitude function and justify a wavefront arrival phase argument linear with Doppler frequency.
2) Expand the time signals for down conversion in-phase and quadrature mixers using a Taylor series for the clutter and local oscillator signals. Form an infinite series over the mixer products in exponential notation.
3) Fourier transform into the spectral domain. Use convolution properties of the Gaussian terms for simplification of self-mixing terms.
4) Collect only those terms which peak within the low-pass filter band.
5) Apply the steering weights for a desired pointing angle using complex phase rotation arithmetic as prescribed in (4) and (5).
6) Identify phase error terms in the resulting spectra. Manifold accumulate the element signals into an array factor for each signal component using phased array linear phase error results.
7) Retain dominant amplitude terms.

IV. SIGNAL MODELING

The largest signal in the airborne radar receiver is usually clutter, i.e., backscatter from terrain. This signal is coherent within the time dwells of data collection and has well defined wavefronts of arrival. This section introduces a mathematical description of this signal which permits interpretation as an infinite number of wavefronts with a Doppler component related to the angle of arrival. This formulation permits a generation of distortion in the spectral domain and a representation of wavefront phase errors due to angle of arrival along the array. Clutter may be simulated as a finite superposition of sinusoids [9]. The amplitudes of these components are determined by scattering models, the radar equation, and the transmit illuminating antenna field pattern [8]. Such simulating sinusoid amplitude coefficients are calculated using area patches and imply a density generating function exploited with numerical integration [10]. The magnitude of such a simulating clutter signal is measured with a spectrum analyzer which effects a finite Doppler bandwidth and/or angle resolution on the clutter cell. For an illuminating transmit beam, the density function will acquire an amplitude taper about the direction of transmit collimation which is commonly approximated by a Gaussian function in angle. By expanding in a first-order Taylor series, the Gaussian argument may be either Doppler frequency or angle off boresight. Near the boresight direction θ_0, for a Doppler frequency f_d and platform speed v,

Id = 2v/X cos 00

(14)

O(!d)=arc cos (Xfd/2v)

(15)

sin 8(/) = sin 80 + (/ -!d) cos 80 d81

df

J=/d

.

(16)
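The Doppler-to-angle mapping of (14)-(16) can be checked numerically. The following is a minimal sketch, not the author's code: the function names are hypothetical, the velocity and wavelength are the values quoted in the Fig. 7 caption, and the 0.3 rad boresight angle and 50 Hz offset are illustrative assumptions. It compares the first-order expansion (16) with the exact value obtained by inverting (14).

```python
import math

def doppler_from_angle(theta, v, wavelength):
    """Eq. (14): Doppler frequency (Hz) of clutter arriving at angle theta (rad)."""
    return 2.0 * v / wavelength * math.cos(theta)

def angle_from_doppler(fd, v, wavelength):
    """Eq. (15): angle of arrival (rad) for a given Doppler frequency."""
    return math.acos(wavelength * fd / (2.0 * v))

def sin_theta_linearized(f, theta0, v, wavelength):
    """Eq. (16): first-order Taylor expansion of sin(theta(f)) about f_d."""
    fd = doppler_from_angle(theta0, v, wavelength)
    # Derivative of (15) with respect to f, evaluated at f = f_d.
    dtheta_df = -wavelength / (2.0 * v * math.sin(theta0))
    return math.sin(theta0) + (f - fd) * math.cos(theta0) * dtheta_df

v, wavelength = 250.0, 0.03      # values quoted in the Fig. 7 caption
theta0 = 0.3                     # illustrative boresight angle (assumption)
fd = doppler_from_angle(theta0, v, wavelength)
f = fd + 50.0                    # a Doppler component 50 Hz off main-beam center
exact = math.sin(angle_from_doppler(f, v, wavelength))
approx = sin_theta_linearized(f, theta0, v, wavelength)
print(f"f_d = {fd:.1f} Hz, exact = {exact:.6f}, linearized = {approx:.6f}")
```

For small offsets from the main-beam Doppler the two values agree closely, which is the property the linear phase argument relies on.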

Consequently, the wavefront phase delay may be approximated, for the main beam, by (17)

where main beam center Doppler the Doppler frequency the location vector for the nth element in the array the propagation vector in the direction of the Doppler frequency component and, for a linear array where


(18)

d = element spacing (19)

$\beta_n = \dfrac{\pi x_n}{v \tan\theta_0}$ (20)

$\alpha_n$ and $\beta_n$ are the phase delay of main beam center and the phase delay slope for off boresight Doppler clutter for the nth element, respectively. The Fourier transform of the signal voltage input to the mixer is

$s_n(f) \propto \left\{ e^{+i[\alpha_n + \beta_n (f - f_1) + \varphi]}\, e^{-\frac{1}{2}[(f - f_1)/\Delta f]^2} + e^{-i[\alpha_n - \beta_n (f + f_1) + \varphi]}\, e^{-\frac{1}{2}[(f + f_1)/\Delta f]^2} \right\}$ (21)

where

$A_0$ = the voltage amplitude density coefficient of element peak clutter as may be calculated from the radar equation
$V_0$ = a scaling voltage magnitude
$\Delta f$ = the spectral width of main beam clutter due to Doppler gradient, $\Delta f = (2v/D)\tan\theta_0$ for an electronically scanned transmit beam of aperture $D = Nd$
$\varphi$ = a random component of phase, invariant in space and time, stable for short observations.
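As a worked illustration of the clutter width defined above, the sketch below evaluates $\Delta f = (2v/D)\tan\theta_0$ and the Gaussian amplitude taper about the main-beam Doppler. The function names are hypothetical; the velocity and aperture are those quoted in the Fig. 7 caption, while the scan angle and center frequency are assumptions chosen only for illustration.

```python
import math

def clutter_width(v, aperture, theta0):
    """Spectral width of main-beam clutter: delta_f = (2 v / D) tan(theta0)."""
    return 2.0 * v / aperture * math.tan(theta0)

def gaussian_envelope(f, f1, delta_f):
    """Gaussian amplitude taper exp(-1/2 ((f - f1) / delta_f)^2) about f1."""
    return math.exp(-0.5 * ((f - f1) / delta_f) ** 2)

v, aperture = 250.0, 0.75        # values quoted in the Fig. 7 caption
theta0 = 0.3                     # illustrative scan angle in radians (assumption)
delta_f = clutter_width(v, aperture, theta0)
f1 = 1000.0                      # hypothetical main-beam clutter center, Hz
print(f"delta_f = {delta_f:.1f} Hz")
print(f"envelope one width off center = {gaussian_envelope(f1 + delta_f, f1, delta_f):.4f}")
```

The width grows with the tangent of the scan angle, so clutter broadens as the beam is steered away from the direction where the Doppler gradient vanishes.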

Fig. 4. IF signals input to mixer. The Gaussian shape is due to the transmit illumination pencil beam. A well-defined spatial phase is due to the geometry of the Doppler effect in the main beam.

At the mixer input of the digital manifold element receiver, the signals are spectrally related as shown in Fig. 4. The phase of the clutter signal depends upon the element location in the array through the subscript n. In the analysis of the mixer, the random component of phase is suppressed, although the reader may mentally note its effect.

V. MATHEMATIC TREATISE OF NONLINEAR MANIFOLD PHASED ARRAY RECEIVER PROCESSES

In receiver operation, interest must ultimately focus upon the signal properties that affect the extraction of information. In radar, that information is often obtained from a digital signal processor which completes a matched gated Doppler filter for the signal and provides thresholded detection. The product of this analysis is the spectrum of an FFT process of known coherent gain. The analysis (see Fig. 5) begins with the generation of the intermodulation distortion through a final stage mixer ostensibly positioning the pulse/PRF line spectrum at baseband video. An ideal low-pass filter rejects out-of-band components. The collimation of the element signals begins with a complex arithmetic phase rotation on each element and is completed by a complex vector accumulation across the aperture analogous to classic phased array theory. Presumably the desired portion of the downconverted spectrum will be equivalent to the conventional phased array result, but the distortion components will have differing array factors. Mixers are nonlinear devices. The p-n junction semiconductor diode is the foremost example [5], [11]. As measured across a load, the junction current produces a voltage transfer

Fig. 5. Digital manifold phased array element processes include I and Q down conversion by a local oscillator and beamsteering phase rotation arithmetic.

(22)

The voltage across the junction terminals $v_J$ will be due to a dc bias, a local oscillator, and the signal from the previous IF stages. For the in-phase mixer,

$v_J(t) = v_B + v_0 \cos(2\pi f_0 t) + A_0 s_j(t)$ (23)

where

$v_B$ = the dc bias voltage
$v_0$ = the local oscillator voltage magnitude
$f_0$ = the local oscillator frequency
$s_j(t)$ = the clutter normalized time domain voltage signal at the jth element of the array due to a superposition of wavefronts
$A_0$ = the clutter signal amplitude.

For the quadrature mixer,

$v_J(t) = v_B + v_0 \cos(2\pi f_0 t + \pi/2) + A_0 s_j(t)$. (24)

The subscript j, denoting the element location in the array, will be suppressed. The quadrature channel will be noted for how it differs from the in-phase. Placing (23) into (22) and expanding the nonlinearity into a Taylor series and using exponential notation for the cosinusoid [5]-[7]


(25)

For the quadrature channel, $\alpha$ is still replaced by $\alpha + \pi/2$, and a factor $e^{i(m-2n)\pi/2}$ is included in (34). When the argument (33) equals zero, the replica of clutter, i.e., the Gaussian amplitude envelope, reaches its peak value. For a pulse spectrum containing many PRF lines, the low-pass filter bandwidth is large compared to the PRF, and the clutter spectrum (see Fig. 6) is centered at

where

(m) 11

the Taylor coefficient evaluated at the bias point. (26)

m! =(m-n)!n!'

11=/0

l: x; QD

cos " (21C'lot ).

(28)

m=O

n - 2k+ m - n - 2/= 0

For the quadrature term, the bracketed term will include in the argument of the exponent, $i(n-2k)\pi/2$, and a $\pi/2$ in the argument of the cosine in (28). Equation (25) portrays the self-mixing and up/down conversion characteristic of mixers. A low-pass filter will eliminate out-of-band components. Continuing the analysis in the spectral domain, the Fourier transform is

$V(f) = \int_{-\infty}^{\infty} v(t)\, e^{-i 2\pi f t}\, dt.$
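The up/down conversion and self-mixing behavior described above can be seen with the lowest-order nonlinearity alone. The sketch below is an illustration under stated assumptions, not the paper's derivation: it keeps only a second-order Taylor term with a hypothetical coefficient `k2`, uses two hypothetical tone frequencies for the local oscillator and the signal, and measures the resulting spectral lines with a single-bin discrete Fourier sum.

```python
import cmath
import math

def tone_amplitude(samples, freq, fs):
    """Single-bin DFT magnitude at freq (Hz), scaled so a unit cosine returns 1.0."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * freq * k / fs)
              for k, x in enumerate(samples))
    return 2.0 * abs(acc) / n

fs, n = 1000.0, 1000             # one second of data, 1 Hz bin spacing
f0, f1 = 200.0, 210.0            # hypothetical LO and signal frequencies, Hz
v0, a0 = 1.0, 0.1                # LO and signal amplitudes (assumptions)
k2 = 0.5                         # hypothetical second-order Taylor coefficient

v = [v0 * math.cos(2 * math.pi * f0 * k / fs) +
     a0 * math.cos(2 * math.pi * f1 * k / fs) for k in range(n)]
out = [k2 * x * x for x in v]    # square-law term of the expanded nonlinearity

down = tone_amplitude(out, f1 - f0, fs)    # down-converted product
up = tone_amplitude(out, f1 + f0, fs)      # up-converted product
lo_self = tone_amplitude(out, 2 * f0, fs)  # local oscillator self-mixing
print(down, up, lo_self)
```

The difference-frequency line is the desired downconverted term; the sum-frequency and self-mixing lines are the out-of-band products that the low-pass filter is meant to remove.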

m

s"'(I)e- / 2 7i /

1

. (,n) [ ill ] m

dt=e+ I ;3/ ~

I

1=0

-(Xl

--

ili

(37)

The voltage spectrum out of the filter and input to the digital phase rotation calculator is lJ3/(f)=c3/(f)+2Ao

'lm-l

n

I~I ~ ~o 00

k4 (2m) (n)k (2m I-n) 2m

m

n

where

m-I

.Jm

(36)

l=m-k.

(29)

e+ /(m - 21)(a - {3/ I ) _ ") _ _ _ _ _ _ e-I/21/-(m-2/)/I/~/~"'I"

= 2(/ + k)

will peak within the pass band. Evidently only even nonlinearities place signals in the pass band. Using (36) and rewriting 2m for m,

Attention will now be focused on the Fourier integral of the generalized power products of the signal and local oscillator. This integral simplifies for the Gaussian form of (21). By an induction of completing the exponential argument square, the reader may verify

I:.

(35)

An ideal low-pass filter does not perturb the phase arguments, transfers in-band components with no attenuation, and eliminates out-of-band components. It is very practical to ignore the tails of those spectral components of the distortion which peak outside the pass band. Since $f_1$ and $f_0$ are ostensibly equal, only those terms with indices

the binomial coefficient resulting from the expansion of the exponential argument in the Taylor series and the local oscillator. (27) C2[=

d/·

~ BW ~ PRF ~

x n ,k=1 - (n - 2k)(fo- 11) (30)

(39)

(40)

The spectrum into the low-pass filter becomes

=C1/(f) + 2A o l: QD

lJ1/(f)

",-1

l:

n

m-n

~ ~

m=J 11=0 k=O 1=0

. e + I «m -

n - 2/)a + {3xm ,n ,k ,l J

Where

2:

K

e- l.x7n,n,k,/12(m - n )Af2 J

(31)

For conventional radar receiver analysis, (38) is the spectrum presented to the digital signal processor. The array factor is incorporated into the definition of $A_0$. Equations (38) and (39) indicate that the peaks of the distortion spectra are located at harmonics of the positioning error $f_0 - f_1$ and include so-called real and image spectra. A useful, semiempirical approximation considers dominant terms, (m = n = k), (m = n, k = 0). Then

~ = ~ Aot:.j xm.n,k,/=I -

C2/(f ) =

(n - 2k)/o - (m - n - 2/)/1'

::;0 ~o K2: (m) n o[f-(m-2n)jo]· 00

m

~ F U3/ ~Ao ~ m con con m = I

(32)

t::. m - 1 _l;0_ 1 "

m

e-I/2(/±m(/I-/0)l~f~q2

(41)

where

(33) (34)


~o= ~ Aot:.f ~;

Usat

(42)

Q; (f) = ajco(f)[cos ,-/ LOCAL OSCILLATOR SINGLE PRF LINE OF CLUTIER

where

Fig. 6. Spectral characteristics. The mixing products at IF near $f_0$ will place desired downconverted terms, as well as certain distortion, into the pass band. The low-pass filter bandwidth is larger than the spectral width of a single PRF line of clutter but smaller than the local oscillator frequency $f_0$. This design rejects many undesired mixing by-products.
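Which mixing products fall inside the low-pass band can be enumerated directly. The sketch below is a bookkeeping illustration only: it assumes each product of order m with indices (n, k, l) is centered at $(n - 2k) f_0 + (m - n - 2l) f_1$, which is how the arguments of the form $f - (n-2k)f_0 - (m-n-2l)f_1$ are read here, and the frequencies and bandwidth are hypothetical integers in hertz.

```python
def product_centers(max_order, f0, f1):
    """Yield (m, n, k, l, center) for every mixing product up to max_order.

    The center frequency (n - 2k) f0 + (m - n - 2l) f1 is an assumed reading
    of the index bookkeeping, with 0 <= n <= m, 0 <= k <= n, 0 <= l <= m - n.
    """
    for m in range(1, max_order + 1):
        for n in range(m + 1):
            for k in range(n + 1):
                for l in range(m - n + 1):
                    yield m, n, k, l, (n - 2 * k) * f0 + (m - n - 2 * l) * f1

f0, f1 = 1_000_000, 1_002_000    # hypothetical LO and clutter line frequencies, Hz
bandwidth = 50_000               # hypothetical low-pass bandwidth, Hz

in_band = [t for t in product_centers(6, f0, f1) if abs(t[4]) <= bandwidth]
in_band_orders = sorted({t[0] for t in in_band})
print("orders with in-band products:", in_band_orders)
```

Under these assumptions only even orders contribute, and every surviving product sits at a multiple of the positioning error $f_0 - f_1$, consistent with the statement that only even nonlinearities place signals in the pass band.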

F

_[uovsatJ 4

m-

m

- 1- -d~; 2m (,)2 m. d VI

Gm,n,k=

u2

(43)

w~ =

Qj

sin

(2; (2;

Xj

sin

~2m-n

e- 1/ 2 [! + (n -

2k )(! t - ! o )/ tl! ..J2m - n J2

et>o)

(50)

(51)

$\sum_{n=0}^{N-1} e^{+in\psi} = e^{+i(N-1)\psi/2}\, \dfrac{\sin(N\psi/2)}{\sin(\psi/2)}$ (52)

because, from (49), (50), and (20), (21), all the terms} depend upon

(53)

Xj=(j-l)d.

For uniform illumination, {I (f)

QJ

=

1 and

.'V

= co(f)

~ [cos OJ - sin OJ] ;=1

'Zm-l

co

n

2; 2; Gm,n,k

+ 1/2 ~

(45)

m= 1 n=Q k=O Xj

sin

et>o)

(46)

.

(1

[ [

+

+ in- 2k + J )e- /(N -

[<1 _i

Q' (f) = co(f)

n-

I lit n,k

sin (N'V -k.) .

:.

sin i' n,k

"k+ 1)e-,(N- I)"',i,k Sin. (N'iY ;k) SIn

N

2; [cos

'It:

k

(48)


J]

(54)

OJ + sin f)j]

OD

2m-l

n

+ 1/2 ~ ~ ~ c.c.: m= 1 n=O

k=O

Gm,n,ke+it/tn,k,j

k=Q

]

j=l

n

+ Qj ~ ~ ~ m= 1 n=O

k=O

n=O

the fundamental mixer outputs will be collimated across the array for the signals returning from the direction of transmit illumination. The prescription (45), (46), (4), (5) will be calculated for the general element. Recalling the notation differences between quadrature and in-phase channels, the post steering element spectra are

2m- 1

n=Q

~2m-n-1

N- 1 ~

(47)

OD

m= 1

Combine the brackets in (48), (49), i.e., the beam steering terms, into the exponential phase notation of the element signal. A straightforward summation over the elements, j, may be then undertaken since the exponential has a linear spatial phase error. Assume that all element amplitudes are identical except for the taper coefficients. The summation over elements uses the standard phased array result for uniform illumination
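The standard uniform-illumination result invoked here is the closed form of a geometric series of unit phasors with a linear phase progression. A minimal numerical check, assuming N equally weighted elements and a phase step $\psi$ per element (the function names are hypothetical):

```python
import cmath
import math

def array_sum_direct(n_elem, psi):
    """Direct summation of exp(i n psi) over n = 0 .. N-1."""
    return sum(cmath.exp(1j * n * psi) for n in range(n_elem))

def array_sum_closed(n_elem, psi):
    """Closed form exp(i (N-1) psi / 2) sin(N psi / 2) / sin(psi / 2)."""
    half = math.sin(psi / 2.0)
    if abs(half) < 1e-12:        # collimated case: every phasor equals 1
        return complex(n_elem)
    return cmath.exp(1j * (n_elem - 1) * psi / 2.0) * math.sin(n_elem * psi / 2.0) / half

for psi in (0.0, 0.3, 1.7):
    direct = array_sum_direct(16, psi)
    closed = array_sum_closed(16, psi)
    print(psi, abs(direct), abs(closed))
```

When the phase step is zero the sum equals N, the collimated gain; any residual linear phase error moves the evaluation point down the sin(Nx)/sin(x) pattern.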

is the mixer conversion gain for the mth harmonic as measured for a two-tone test with a receiver saturating sinusoid input $v_{sat}$. Equation (41) is useful because it describes the growth of the harmonics with approaching saturation through (42), and (44) provides an empirical quantity for simplifying the multiple sums of (38). For digital manifold phased arrays, the phase arguments of (38) have yet to be accumulated into an array factor, and the amplitude $A_0$ is calculated using an element factor only. From each element, the ordered data pair $(v_I, v_Q)$ is to be phase rotated in the complex plane so that the signals from a desired wavefront of arrival will be in phase and will sum into an array factor equivalent to the conventional phased array. The steering prescription given in (4), (5) is time independent and may be generalized to include an amplitude tapering coefficient $a_j$ for controlling sidelobes. When the phase steering commands at the jth element are set to

cos

n

(2m) (n)k (2mm-k-n) n

2A o 2m 4m K

.

(44)

Qj

2m-l

l/;n,k.j = (n - 2k)cxj -l3j [f + (n - 2k)(fl - fo)]. I'

and

w~=

co

+aj ~ ~ ~ Gm~n,ke+i"'n,k.j

.:1t

L.PF.

1.0

OJ + sin OJ]

_

[

(1 - i n -

2k + l)e-i(N-l)'i'

sin (N~ n,+k) Itk. SIn

i':k

J]

(55)

~here

.;'k=

sin

:d

[(n-2k) sin (Jo - sin epo -

A

2u tan (Jo

[f + (n - 2k)(fl - fo)]]

sin (56)

sin

+ sin epo -

A

2u tan (Jo

[f + (n - 2k)(f, - fo)]].

(57)

The result of (54), (55) is just the electric field array factor of the aperture and should be regarded as very satisfying. If the weight coefficients are other than uniform, of course, another function is substituted for the $\sin(Nx)/\sin(x)$ with an angle of arrival argument determined from the linear phase error. The phase error arguments (56), (57) determine the collimation properties of the spectrum components. When equal to zero, there are no phase errors across the array, and the array is said to be collimated. Otherwise, there is a residual linear phase error leading to the usual near and far sidelobe features of antenna patterns. The first two terms in (56), (57) indicate the alignment between the transmit illumination scattering and the phase rotation of the digital manifold. The final term models the linear phase error of mainbeam clutter and indicates that the relocated distortion spectra harmonically defocus at frequencies lying in directions off boresight. A simplifying result will be discussed to clarify the impact of manifolding on intermodulation distortion. Consider, as in (41), only the dominant amplitude components of the spectra as apparent in spectrum analyzer measurements with the collimation commanded for the direction of transmission. In general, $K_m$ is a decreasing function with increasing m, and, except near saturation, higher order terms are smaller from the multiple convolutions. Let

$\phi_0 = -\theta_0$ (58)

$m = n = k$ (59)

$m = n,\ k = 0$ (60)

then,

(N7rd [(m + 1) sin 0 + AU- m(fl - fo)]J ) 0

A

(7rd

[(m + 1) sin (Jo + AU- m(f, - fo)]J) 2u tan 80

A

(N7rd

[(m+ 1) sin (Jo- A[f+m(f,-fo)]J)

A

. (7rd [(m + 1)

sIn

sin

-

A

(N:d

2u tan 80

L AF~m-1 m CD

0

m=l

sin sin

0

uo.Jm

(N1rd A

(1rd

H~

[(m-1) sin (Jo+AU-m(fl-fo)]]) A 2u tan 00

0+ A[f-+ m(/l - 10)]J) --0

2u tan (Jo

[(m _ l) sin (Jo _

AU + m(fl - fo)]J )

1\

sin

(7rd

2u tan 80

[(m-1) sin (Jo- A[f+m(fl-fo)]J) 2u tan 00

A

(61) where

= e: 1/2(/- m(/1 -/o)/a/..Jm1 2

(62)

H,;' = e- 1/ 2 (1 +111(/I-/O)/~f.J; ,2

(63)

H,~

and other terms are defined in (42), (43). In (61), for m = 1, i.e., the desired down conversion term, the second array factor term of the real spectrum is cancelled, since

(64)

Likewise, for the image, the first term is cancelled. The remaining array factors have zero linear phase error except for the defocusing of off-boresight arrival. In general, for the distortion products, there will be array factors with a steering compensation (m - 1), and an anomalous, spoiled factor (m + 1). For odd harmonics, one of these beams is cancelled, while for even harmonics, both will be present. The directive gain is determined from the unnormalized array factor using angle arguments

(1 +;I-m)

[em _ l) sin (Jo + A[f- m(fl - fo)]J ) 2u tan 00

.

SIn

$\arcsin[(m + 1)\sin\theta_0]$ (65)

$\arcsin[(m - 1)\sin\theta_0]$ (66)
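The loss of directive gain for the distortion products can be illustrated with the uniform array factor evaluated at the angle arguments (65), (66). The sketch below is a simplification under stated assumptions: half-wavelength element spacing, N = 50 elements (a 0.75 m aperture at a 0.03 m wavelength), a hypothetical 0.1 rad scan angle, and evaluation at the distortion peak with the Doppler defocusing term neglected. The function names are hypothetical.

```python
import math

def array_factor(n_elem, psi):
    """Normalized uniform array factor |sin(N psi) / (N sin(psi))|."""
    s = math.sin(psi)
    if abs(s) < 1e-12:
        return 1.0
    return abs(math.sin(n_elem * psi) / (n_elem * s))

def harmonic_gain(m, sign, n_elem, d_over_lambda, theta0):
    """Relative gain of harmonic m for the residual steering term (m + sign) sin(theta0)."""
    psi = math.pi * d_over_lambda * (m + sign) * math.sin(theta0)
    return array_factor(n_elem, psi)

n_elem, d_over_lambda, theta0 = 50, 0.5, 0.1   # illustrative assumptions
for m in range(1, 6):
    print(m,
          round(harmonic_gain(m, -1, n_elem, d_over_lambda, theta0), 4),
          round(harmonic_gain(m, +1, n_elem, d_over_lambda, theta0), 4))
```

Only the m = 1 term with the (m - 1) factor is collimated; every other combination is evaluated off the main lobe of the array factor, which is the mechanism behind the reduced distortion level for the digitally beamformed array.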

l(f)~

2u tan 80

at the distortion relocation peaks. This nuance from the heuristic Section II arises from the phasing of the quadrature channel as perturbed by local oscillator self-mixing and transferred through the phase rotation steering prescription.

VI. RESULTS AND EXTENSIONS

In this section, results are delineated, and some further remarks are made regarding performance when some of the ideal assumptions of Section V are relaxed.


Consider a nose mounted array with a Gaussian array factor. Let the downconversion frequency error be

$f_1 - f_0 = \dfrac{2v}{\lambda}[\cos\theta_0 - \cos\theta_1]$ (67)

where

$\theta_0$ = the direction angle to the main beam clutter of transmit
$\theta_1$ = the direction cosine corresponding to positioning error.

(68)

By choosing this error large, the distortion spectra may be resolved. For collimation in the direction of transmit, Fig. 7 compares the distortion spectra as calculated for the first five dominant terms of (61) with harmonic conversion power gains below desired of a well designed mixer, e.g.,

$G_2 = -65$ dB, $G_3 = -75$ dB, $G_4 = -80$ dB, $G_5 = -85$ dB.

(69)

(70) (71) (72) (73)

The higher harmonic distortion peaks for near saturation levels will lie about -145 dB relative to the mainbeam clutter peak. For extreme downlook dynamic ranges and long coherent integration, the distortion is well below the noise limited threshold level. Fig. 7 shows the distortion level at the processor for DBF has been significantly reduced. The effects of increasing the scan angle near broadside are shown in Fig. 8. The nonlinear case at 10 mrad scan is essentially identical to the conventional array analysis, as expected. When scanned to larger angles, the clutter width and the positioning error increase, due to the geometry change, as expected. The feature of interest is the relative level of the distortion peaks. At each harmonic peak, the directive gain is decreased for the increased scan. It may also be observed that the even harmonic terms are broader than the odd harmonics. The even harmonic distortion has two beams, each of which will peak at slightly different frequencies with the net effect of broadening the lobe. The higher odd harmonics further illustrate this peaking of directive gain. Due to the selection of m + 1 or m - 1, however, the relative maximum in directive gain translates the lobe maximum slightly from a pure harmonic of the positioning error. This effect tends to narrow the odd harmonics lobe. The m - 1 selection for the fifth harmonic leads to a directive gain equal to the third harmonic m + 1 terms, and the spectra differ only due to the convolution differences.

In the derivation of these results, a number of ideal assumptions were made. Of particular interest in specifying the receivers are effects of channel imbalances. Phase and amplitude errors which vary randomly from element receiver to element receiver will produce the same sort of effects on array factors as root mean square (rms) errors due to manifold tolerances or phase shifter quantization in conventional phased arrays. Premixer imbalances will be multiplied by the harmonic number, and the sidelobe level for the distortion products will rise.

Fig. 7. The harmonic distortion experiences a sidelobe direction gain. This result was generated using a velocity of 250 m/s, a wavelength of 0.03 m, an aperture of 0.75 m, and a Gaussian taper in azimuth producing a 0.058 rad beamwidth. (Curves: conventional array, low level; conventional array, near saturation; DBF, near saturation. Frequency axis in kilohertz.)

Fig. 8. Near broadside distortion spectra. With increasing scan, the harmonic distortion decreases as the linear phase error increases, and the array factor moves into the sidelobe region. These results were generated with a velocity of 250 m/s using a wavelength of 0.03 m, an aperture of 0.75 m, and a Gaussian weighted antenna pattern in azimuth with a beamwidth of 58 mrad.

If the rms sidelobe level for the conventional pattern is, from an rms error [12],

$\epsilon$ = phase error standard deviation

N = number of elements,

(74)

then the relative gain for the mth harmonic distortion spectra peak for large scan angles is


(75)

and higher harmonics should completely defocus. Post mixer element mismatches are equivalent to the conventional phased array element error effects. Conventional receiver in-phase and quadrature channel mismatch tolerances are usually held tight to fully cancel the image portion of the downconverted spectrum [13]. An average amplitude or phase error in the I and Q channels will factor out of the array factor summation over elements (43), (54) and produce finite image cancellation. For a batch manufacturing procedure, such a bias should arise only from the statistics of the sampled mean. The random I and Q imbalances for the total lot of manufactured receivers will lead to a probability that the mean of the smaller number of receivers collected into a radar system will be nonzero. Let $\epsilon_{\max}$ = the maximum tolerable imbalance, radians, allowable for a given image cancellation. The standard deviation of the receiver imbalances must be small to ensure a large confidence bound on the sampled mean statistic of the system imbalance. For a 0.99 probability that a system will pass the given image cancellation requirement,

$\dfrac{\sigma_{\text{mismatch}}}{\sqrt{N}} < \dfrac{\epsilon_{\max}}{2.572}$ (76)

because the sampled mean is a standard normal random variable when normalized by the square root of the sample size (N is the number of receivers). When the in-phase and quadrature element channels are randomly mismatched, the effects appear in the array factor. A curious consequence of random imbalance is the appearance of finite anomalous beams for the odd harmonics. The array factor for the previously cancelled m + 1 term for the desired clutter would appear with a small amplitude coefficient more resemblant of an imperfect null. The other consequences are analogous to the understood element imbalance effects applying to quiescent and/or adaptive array factors [14].

VII. CONCLUSION

In DBF radar, receiver generated distortion has accentuated linear phase error relative to the signal. The problem of this distortion to the radar is a desensitization of moving target detection near large clutter portions of the spectrum. For DBF, the array gain of any harmonic component of this distortion is fortuitously less as the spreading is greater. Nonlinearity specification may tolerate a relaxation approaching the relative sidelobe level of the array design, and the need for clutter positioning may be questioned. The final mixer stage involves the largest amplitude signals, highest percentage bandwidth, the largest consumption of power, and the greatest dissipation of heat. The result here argues that the specification of nonlinearities for this stage of the receiver may be moderated due to the array effects. The main consequence for the receiver is a reduced local oscillator drive for the final mixer, perhaps by an order of magnitude. In DBF, this also means the relaxation of active manifolding for this LO. Components throughout the receive chain are additionally permitted lower intercept point and compression specifications. A further consequence of DBF for the receiver is the altered effect of I and Q channel imbalances. As the number of receivers approaches the number of elements, image rejection is determined by the ensemble character of the receiver imbalances; a single receiver will not have a great effect. Tolerating imbalances with zero mean and finite standard deviation statistics becomes a permitted receiver manufacturing perspective. Receivers for DBF must become less conspicuous in size, weight, power and cost. In addressing receiver design, the array off-broadside cancellation of distortion relaxes specifications for many critical components. This relaxation is a necessity for entertaining increased scales of linear circuit integration and high volume production methods.

ACKNOWLEDGMENT

The author is indebted to Dr. K. DeMartino, now with Dynamics Research, Wilmington, MA, for the clarity of his exemplary receiver analyses.

REFERENCES

[1] S. M. Sze, "Semiconductor device development in the 1970's and 1980's-A perspective," Proc. IEEE, vol. 69, no. 9, pp. 1121-1131, Sept. 1981.
[2] Special Issue on Micron and Submicron Circuit Engineering, Proc. IEEE, vol. 71, no. 5, May 1983.
[3] P. Barton, "Digital beamforming for radar," Proc. Inst. Elec. Eng., vol. 127, pt. F, no. 4, pp. 266-277, 1980.
[4] H. Steyskal, "Synthesis of antenna patterns with prescribed nulls," IEEE Trans. Antennas Propagat., vol. AP-30, no. 2, pp. 273-279, Mar. 1982.
[5] W. R. Gretsch, "The spectrum of intermodulation generated in a semiconductor diode junction," Proc. IEEE, vol. 54, no. 11, pp. 1528-1535, Nov. 1966.
[6] J. R. Reid, "Spurious free dynamic range in wideband high sensitivity amplifiers," Microwave J., pp. 26-32, Sept. 1965.
[7] J. W. Steiner, "An analysis of radio frequency interference due to mixer intermodulation products," IEEE Trans. Electromagn. Compat., pp. 62-68, Jan. 1964.
[8] M. I. Skolnik, Introduction to Radar Systems. New York: McGraw-Hill, 1962, p. 145.
[9] L. E. Brennan and J. D. Mallet, "Efficient simulation of external noise incident on arrays," IEEE Trans. Antennas Propagat., vol. AP-24, no. 9, pp. 740-746, Sept. 1976.
[10] M. B. Ringel, "An advanced computer calculation of ground clutter in an airborne pulse Doppler radar," presented at NAECON '77, Dayton, OH, May 1977.
[11] S. M. Sze, Physics of Semiconductor Devices. New York: Wiley, 1969, p. 105.
[12] R. J. Mailloux, "Phased array theory and technology," Proc. IEEE, vol. 70, no. 3, p. 261, Mar. 1982.
[13] H. Urkowitz, "Bandpass filtering with low pass filters," J. Franklin Inst., vol. 276, no. 1, pp. 1-13, July 1963.
[14] J. T. Mayhan and F. W. Floyd, "Factors affecting the performance of adaptive antenna systems," in Proc. 1980 Adaptive Antenna Symp., RADC, Rome, NY, pp. 154-179.

Adaptive Beamforming with the Generalized Sidelobe Canceller in the Presence of Array Imperfections

NEIL K. JABLON, MEMBER, IEEE

Abstract-Antenna designers often employ linearly constrained adaptive beamforming as an antijamming measure. With minimal a priori knowledge of the signal environment, this technique nulls out jammers while simultaneously preserving the quality of the main lobe so that a friendly look-direction signal can be received with unity gain. Unfortunately, in the absence of special strategies, linearly constrained adaptive beamforming is hypersensitive to array imperfections when the input signal-to-noise ratio exceeds a certain threshold. This hypersensitivity manifests itself as a nulling of the friendly signal as if it were a jammer. Luckily, the signal nulling problem can be easily remedied by artificial receiver noise injection. A particularly simple and general structure for linearly constrained adaptive beamforming was proposed during the 1970's, and is known as the generalized sidelobe canceller. A detailed analysis of the generalized sidelobe canceller in the presence of array imperfections is discussed, and two new artificial receiver noise injection algorithms are proposed. Computer simulations are included to demonstrate that use of these new algorithms alleviates the signal nulling problem without seriously compromising jammer nulling. For the special case of the Capon maximum-likelihood beamformer, simple approximations are presented for: 1) the Wiener output signal-to-interference-plus-noise ratio (SINRo), 2) the antenna element error variance that causes a 3 dB loss of SINRo from its value for an ideal array, and 3) the optimal artificial receiver noise that maximizes SINRo.

Manuscript received August 29, 1985; revised March 10, 1986. This work was supported by the Naval Air Systems Command under Contract N00019-85-C-0018, and by the Fannie and John Hertz Foundation Graduate Fellowship Program. This paper is based on a dissertation submitted by the author to the Department of Electrical Engineering, Stanford University, Stanford, CA, in partial fulfillment of the requirements for the Ph.D. degree. The author was with the Information Systems Laboratory, Electrical Engineering Department, Stanford University, Stanford, CA. He is now with the Data Communications Research Department, AT&T Information Systems, Middletown, NJ 07748. IEEE Log Number 8609031.

I. INTRODUCTION

ANTENNA DESIGNERS OFTEN employ linearly constrained adaptive beamforming as an antijamming measure. With minimal a priori knowledge of the signal environment, this technique nulls out jammers while simultaneously preserving the quality of the main lobe so that a friendly look-direction signal can be received with unity gain. Unfortunately, in the absence of special strategies, linearly constrained adaptive beamforming is hypersensitive to array imperfections when the input signal-to-noise¹ ratio exceeds a certain threshold. This hypersensitivity manifests itself as beamformer nulling of the friendly (look-direction) signal as if it were a jammer. This paper presents a detailed study of this hypersensitivity by considering a particularly simple and general structure for linearly constrained adaptive beamforming proposed during the 1970's, and known as the generalized sidelobe canceller (GSC). Luckily, the signal nulling problem can be easily remedied by artificial receiver noise injection. In this paper, two new artificial receiver noise injection algorithms are derived for the GSC. Computer simulations are presented to demonstrate that for a GSC with array imperfections, the use of these new algorithms alleviates the signal nulling problem without seriously compromising jammer nulling. The performance of the GSC is studied in the presence of random element amplitude and phase errors, for an environment which consists of a look-direction signal (hereafter referred to as just the signal), one jammer, and additive white receiver noise. Both signal and jammer are assumed narrow band, which means that the reciprocals of their bandwidths are large compared to the transit times of their wavefronts across the array. Important additional assumptions made include omnidirectional antenna elements, random element errors that remain constant during the adaptation period, statistically independent and zero-mean wide-sense stationary signal, jammer, and receiver noise, and a linear, homogeneous, and isotropic propagation medium. The latter three properties mean that the medium characteristics are independent of signal magnitude, position, and direction of propagation, respectively.

¹ Throughout this paper, "noise" refers to additive receiver noise only, which does not include jamming.

Reprinted from IEEE Transactions on Antennas and Propagation, Vol. AP-34, No. 8, pp. 996-1012, August 1986.

The hypersensitivity phenomenon is discussed in detail using Wiener filter theory to analyze steady state behavior, and computer simulations to check the results. The Wiener analysis is largely based on the use of a quantity known as output signal-to-interference-plus-noise ratio (SINRo), which is a ratio of all wanted to unwanted power at the beamformer output. In the adaptive antenna literature, SINRo is widely accepted as a valid measure of output signal quality. Specifically, this paper contains the following contributions to the literature on the GSC in the presence of random element gain (amplitude and phase) errors.
• The exact Wiener weight vector and steady state output signal-to-interference-plus-noise ratio. A simple approximate expression is presented for output signal-to-interference-plus-noise ratio of the Capon [1] maximum-likelihood beamformer.
• A detailed explanation of look-direction signal nulling. Solid evidence is provided that when input signal-to-noise ratio exceeds a certain threshold, even a conventional delay-and-sum beamformer outperforms the GSC for essentially all jammer angles of interest.
• An equation for the antenna element gain error variance that results in a 3 dB decrease in output signal-to-interference-plus-noise ratio of the Capon beamformer from its value when the array is ideal. This equation should be helpful in setting reasonable antenna element error tolerances for linearly constrained adaptive beamformers.
• Two on-line algorithms for GSC artificial receiver noise injection, derived from the novel viewpoint of imposing a large penalty on signal components that "leak" into the sidelobe cancelling signal [2]. A surprising result which falls out of this derivation is that the adaptive algorithm must be modified in such a way as to artificially add colored receiver noise. One algorithm extends the leaky least mean square (LMS) algorithm of Widrow and Stearns [2], and the other is appropriate for applications where the state vector autocovariance matrix is estimated directly, such as the sample matrix inversion (SMI) method considered by Reed et al. [3].
• Computer simulations to demonstrate the effectiveness of the above two algorithms.

• An equation for the "optimal" amount of artificially injected receiver noise, in the case of the Capon beamformer.
• Suggestions for how the theory can be extended to the two important cases of multiple jammers and wide-band adaptive array processing.

The analysis of linearly constrained adaptive beamforming in this paper has at least two major advantages over most previous approaches. The first is that although restrictions are introduced later on, the initial analysis assumes arbitrary size of element amplitude and phase errors. Thus, the correctness of the general approach is not dependent on "small" errors. The second is that by using the GSC, a linearly constrained adaptive beamformer that works with an unconstrained algorithm, the analysis is relatively easy to follow.

Capon et al. [1] in 1967 were apparently the inventors of linearly constrained adaptive beamforming. They developed, analyzed, and provided experimental data to support the maximum-likelihood (ML) concept, defined as using a set of weights that minimizes beamformer output power subject to a simple unity gain constraint for signals coming from an assumed look direction. In practice, linearly constrained adaptive beamformers are most often implemented in ML form. In 1977, based largely on a beamformer first published by Applebaum and Chapman [4], Griffiths [5] proposed the GSC, which he and Jim [6] later showed was equivalent to a Frost [7] beamformer under certain conditions. The GSC is able to implement look-direction unity gain (zero-order) constraints just like the Frost beamformer, but in addition is easily generalizable to deal with main lobe derivative constraints of any order. Griffiths and Jim [6] pointed out that, "new methods of adaptive beamforming are suggested by the generalized sidelobe cancelling structure," for example combined temporal/spatial constraints.
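The maximum-likelihood concept described above, minimizing output power subject to unity gain in the look direction, has the well-known closed-form solution $w = R^{-1}s / (s^H R^{-1} s)$ for a covariance matrix R and look-direction steering vector s. The sketch below is an illustration for an ideal array only, not the paper's GSC implementation: the four-element half-wavelength array, the single jammer at 20 degrees, the power levels, and all function names are assumptions, and the covariance deliberately excludes the look-direction signal.

```python
import cmath
import math

def steering(n_elem, theta, d_over_lambda=0.5):
    """Steering vector of a uniform linear array for arrival angle theta (rad)."""
    return [cmath.exp(2j * math.pi * d_over_lambda * i * math.sin(theta))
            for i in range(n_elem)]

def solve(a, b):
    """Solve a x = b for complex a, b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        acc = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = acc / m[r][r]
    return x

def inner(u, v):
    """Hermitian inner product u^H v."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))

def ml_weights(r, s):
    """Minimum output power weights with unity look-direction gain."""
    ri_s = solve(r, s)
    return [x / inner(s, ri_s) for x in ri_s]

n_elem = 4
s = steering(n_elem, 0.0)                       # broadside look direction
a_j = steering(n_elem, math.radians(20.0))      # hypothetical jammer direction
noise_power, jammer_power = 1.0, 100.0          # illustrative powers
r = [[jammer_power * a_j[i] * a_j[k].conjugate() + (noise_power if i == k else 0.0)
      for k in range(n_elem)] for i in range(n_elem)]

w = ml_weights(r, s)
look_gain = inner(w, s)
jammer_gain = abs(inner(w, a_j))
quiescent_jammer_gain = abs(inner(s, a_j)) / n_elem
print(abs(look_gain), jammer_gain, quiescent_jammer_gain)
```

With an ideal array the constraint holds exactly and the jammer response falls well below the delay-and-sum value; the hypersensitivity studied in the paper arises when the assumed steering vector no longer matches the perturbed array.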
Since the GSC uses an unconstrained rather than a constrained algorithm to adapt the weights, it may be possible to adapt much faster. The GSC also will be less sensitive to coefficient quantization effects, because the dynamic range of the signals in the adaptive portion of the beamformer is compressed. In general, the Frost beamformer and the GSC have different state vector autocovariance matrices and consequently different eigenstructures. Griffiths [5] stated that as long as the GSC signal blocking matrix has dimension one less than the number of antenna elements and its columns are linearly independent, then the Frost beamformer and the GSC will lead to the same steady state SINRo in a stationary environment, based on a comparison of Wiener solutions (i.e., infinitely slow adaptation). However, algorithm performance measures which are formulated in terms of eigenvalues, such as transient response time and misadjustment (due to weight jitter) [2] will be different for the Frost and GSC implementations. Griffiths and Jim [8], [9] wrote several reports dealing with the GSC. They pointed out that the GSC is a particularly suitable structure to use for studying the effect of random element amplitude and phase error effects on linearly constrained adaptive beamformers. By means of some simple


analysis and extensive simulation, they reached the conclusion that Gaussian amplitude and phase errors had the effect of reducing interelement correlation, leading to a lowering of main lobe gain and decreased ability to discriminate against jammers. The negative effects of these errors were most pronounced for high input signal-to-noise ratio ($\mathrm{SNR}_i$).

Others who have made important contributions to analysis of the "imperfect array problem" include Zahm [10], Cox [11], Takao et al. [12], Vural [13], [14], Mayhan [15], Monzingo and Miller [16], Hudson [17], Compton [18]-[20], Bar-Ness [21], Gupta and Ksienski [22], and Godara [23]. Remedies to the imperfect array problem have been proposed by Applebaum and Chapman [4], Griffiths and Jim [6], Cox [11], Takao et al. [12], Vural [13], [14], Hudson [17], Zahm [24], White [25], Charitat [26], [27], Widrow and McCool [28], [29], Er and Cantoni [30], [31], Ahmed and Evans [32], and Compton [33].

Although Zahm's [24] strategy was originally introduced as a solution not to the mismatch problem, but to the problem of preventing unconstrained adaptive beamformers from nulling out friendly signals, his technique is one of the most widely used ones for dealing with imperfect arrays. His idea was to artificially inject receiver noise in such a way that the weight vector was computed based on a higher receiver noise level than was actually present. However, the artificially injected receiver noise does not actually appear at the beamformer output. This prevents the beamformer from nulling out signals close to the look direction. If properly done, the effect on jammer nulling is minimal.

The outline of this paper is as follows: Section II derives the GSC Wiener weight vector. Section III uses the Wiener weight vector to derive the steady state $\mathrm{SINR}_o$, and mathematically demonstrates its hypersensitivity to array imperfections. Section IV derives the two new on-line algorithms for artificial receiver noise injection by borrowing the concept of penalty functions from optimization theory, and presents computer simulation data to support the effectiveness of these algorithms in making the GSC robust to array imperfections. In Section V it is explained how the results of this paper can be extended to the two important cases of a multiple jammer environment and wide-band adaptive array processing. Section VI contains the conclusions.

II. WIENER WEIGHT VECTOR

In this section, the Wiener solution of the GSC in the presence of random array imperfections is derived, and compared to the corresponding relation for an ideal array. The analysis here is an extension of the author's work in [34], where the GSC was analyzed in the absence of array imperfections using adaptive noise cancelling techniques (Widrow et al. [2]). Our analysis only considers the steady state Wiener solution. In other words, adaptation is assumed to be infinitely slow, so that algorithm-dependent effects such as misadjustment [2] and non-Wiener signal cancellation [2] do not come into play.

A. Derivation of Exact Expression

In this subsection, we derive the GSC Wiener weight vector in the presence of array imperfections.

The GSC is shown schematically in Fig. 1, consisting of $K$ elements which ideally would all be omnidirectional with identical amplitude and phase. A simple model for random antenna element imperfections is to let each element have a random complex gain $g_i$ $(i = 1, \ldots, K)$, assumed to remain constant during the period of adaptation. The use of a complex gain implicitly takes into account random element amplitude and phase errors. Denoting the zero-mean random element amplitude error at element $i$ by $\Delta a_i$ and the zero-mean phase error by $\Delta p_i$, the complex gain can be written as

$$g_i = (1 + \Delta a_i)\, e^{j \Delta p_i} \triangleq 1 + \Delta g_i \tag{1}$$

where $j \triangleq \sqrt{-1}$, and $\Delta g_i$ is the zero-mean complex gain error. The assumption of unity nominal gain for each antenna element in no way hinders the generality of this approach. The adaptive beamformer with and without array imperfections is illustrated in Fig. 2. $\Delta p_i$ includes the phase error due to random element misplacement, which changes with signal direction [35], so therefore $\Delta g_i$ and $g_i$ change with signal direction. It is also conceivable that $\Delta a_i$ could be a function of signal direction, for example if the antenna element pattern was not truly omnidirectional. In the presence of errors, the adaptive beamformer still adapts in such a way as to minimize mean square error (MSE), or equivalently output power, but the fact that it is unaware of the errors corrupting the data causes a degradation in performance.

Later on it will be useful to work with the diagonal matrix $\mathbf{G}$ representing the complex element gains, and also the diagonal matrix $\Delta\mathbf{G}$ representing the complex gain errors. Defining $\mathbf{I}$ as the identity matrix

$$\mathbf{G} \triangleq \operatorname{diag}\{1 + \Delta g_i\} = \mathbf{I} + \Delta\mathbf{G}. \tag{2}$$

$\mathbf{G}$ and $\Delta\mathbf{G}$ will be different for the signal and jammer, since they are assumed to come from different directions. Therefore, subscripts will be used on $\mathbf{G}$ and $\Delta\mathbf{G}$. A subscript $s$ will correspond to the look-direction and $j$ the jammer direction.

For the narrow-band case, presteering the array to a known look-direction is accomplished by use of a phase shifter at the output of each antenna element. In order to steer the array to the look direction $\theta_s$, a presteering delay of $-\tau_{i,s}$ is needed at element $i$. In the absence of presteering, a jammer arriving from angle $\theta_j$ would undergo a time delay at each element of $\tau_{i,j}$. Thus, the look-direction signal after presteering can be treated as coming from the array broadside. The jammer undergoes a total time delay $T_i$ at element $i$ of

$$T_i \triangleq \tau_{i,j} - \tau_{i,s}, \qquad i = 1, \ldots, K. \tag{3}$$

Imperfections in the presteering electronics can also be included as part of the amplitude and phase error terms $\Delta a_i$ and $\Delta p_i$. After passing through the presteering delays, the signal received at each element is corrupted by additive zero-mean white noise, as shown in Fig. 3. This additive receiver noise is assumed to be independent identically distributed (i.i.d.) from element to element.

The beamformer consists of two branches. The upper is termed the desired response branch, and its purpose is to form


Fig. 1. Block diagram of narrow-band generalized sidelobe canceller (conventional delay-and-sum beamformer, preprocessor B, and adaptive noise canceller). Additive receiver noises following the steering delays are not shown.

Fig. 2. Adaptive beamformer with and without array imperfections. (a) Without imperfections. (b) With direction-dependent imperfections.

Fig. 3. Model of ith receiver channel (i = 1, ..., K).

the desired response $d_k$, which is the primary input to the adaptive noise canceller. In the absence of array imperfections, the desired response branch is constrained to have a unit look-direction gain. In general, this branch is a conventional delay-and-sum beamformer, with $K$ nonadaptive weights being fixed in such a way that the array beamwidth and average sidelobe level are both satisfactory [6]. This paper assumes uniform $1/K$ weighting, but other weightings could be considered with minor modifications to the analysis. The lower branch of the beamformer is the sidelobe cancelling branch. Its purpose is to form the sidelobe cancelling signal $y_k$ by providing $\tilde{K}$ reference inputs to the adaptive noise canceller. $y_k$ contains estimates of the jamming components in the desired response, so that after subtracting $y_k$ from $d_k$, the beamformer output $z_k$ is a "cleaner" representation of the signal. Note the use of complex conjugate weights $\bar{w}_{i,k}$ $(i = 1, \ldots, \tilde{K})$ in computing $y_k$. These weights can be updated by several different methods, for example the complex LMS algorithm of Widrow et al. [2]. The sidelobe cancelling branch is preceded by the signal

blocking matrix $\mathbf{B}$, a preprocessor designed to block the signal so that the sidelobe cancelling branch cannot learn it. The preprocessor has $K$ inputs and $\tilde{K}$ outputs. In this paper it is assumed that $\tilde{K} < K$. The simplest example of a preprocessor is adjacent element differencing, yielding zero gain in the look-direction (in the absence of array imperfections). Griffiths and Jim [5], [6] showed that use of the latter blocking processor makes the converged GSC behave like a Frost beamformer. For this preprocessor $\tilde{K} = K - 1$, which results in $K - 1$ degrees of freedom being available in the sidelobe cancelling branch to form nulls in jammer directions. The restrictions on $\mathbf{B}$ are [6]

$$\mathbf{B}\mathbf{1}_K = \mathbf{0}_{\tilde{K}} \tag{4}$$

$$\operatorname{rank}(\mathbf{B}) = \tilde{K}. \tag{5}$$

$\mathbf{1}_K$ is a vector of length $K$ whose elements are equal to 1's, and $\mathbf{0}_{\tilde{K}}$ is a zero vector of length $\tilde{K}$. The rank of a matrix is just the number of linearly independent rows or columns. In the sequel, matrices and vectors will be symbolized by bold uppercase and lowercase letters, respectively. Complex conjugates will be represented by overbars, transposes by a superscript $T$, Hermitian (complex conjugate) transposes by a superscript $H$, and steady state quantities which are based on using the Wiener weight vector by a superscript asterisk. $E_t[\cdot]$ represents time expectation. Define the state vector $\mathbf{u}_k$ and weight vector $\mathbf{w}_k$ at time sample $k$ as follows:

$$\mathbf{u}_k \triangleq [u_{1,k} \; \cdots \; u_{\tilde{K},k}]^T \tag{6}$$

$$\mathbf{w}_k \triangleq [w_{1,k} \; \cdots \; w_{\tilde{K},k}]^T. \tag{7}$$

Also define the autocovariance matrix $\mathbf{R}_{uu}$ and crosscovariance vector $\mathbf{r}_{ud}$:

$$\mathbf{R}_{uu} \triangleq E_t[\mathbf{u}_k \mathbf{u}_k^H] \tag{8}$$

$$\mathbf{r}_{ud} \triangleq E_t[\mathbf{u}_k \bar{d}_k]. \tag{9}$$

The unique Wiener weight vector $\mathbf{w}^*$ (which minimizes MSE), is then [2]

$$\mathbf{w}^* = \mathbf{R}_{uu}^{-1} \mathbf{r}_{ud}. \tag{10}$$

The complex snapshot vector at the $k$th time sample $\mathbf{x}_k$ is defined as the vector of the received signal and jammer, following presteering and including the effects of both receiver noise and array imperfections:

$$\mathbf{x}_k \triangleq [x_{1,k} \; \cdots \; x_{K,k}]^T. \tag{11}$$

$\mathbf{B}$ then transforms $\mathbf{x}_k$ into $\mathbf{u}_k$:

$$\mathbf{u}_k = \mathbf{B}\mathbf{x}_k. \tag{12}$$

The snapshot vector is the sum of a component $\mathbf{s}_k$ due to the signal, a component $\mathbf{j}_k$ due to the jammer, and a component $\mathbf{n}_k$ due to the receiver noise:

$$\mathbf{x}_k = \mathbf{s}_k + \mathbf{j}_k + \mathbf{n}_k. \tag{13}$$

Utilizing the plane wave assumption for the signal, it is possible to write an expression for $\mathbf{s}_k$ in terms of the signal $s_k$ at time sample $k$ and the diagonal matrix $\mathbf{G}_s$ representing the effect of the element gains on $s_k$:

$$\mathbf{s}_k = s_k \mathbf{G}_s \mathbf{1}. \tag{14}$$

The same plane wave assumption can be used to represent $\mathbf{j}_k$ in terms of the jammer $j_k$ at time sample $k$, the diagonal matrix $\mathbf{G}_j$ representing the effect of the element gains on $j_k$, and a diagonal matrix $\boldsymbol{\Lambda}_j$ (whose components should not be confused with the amplitude error terms $\Delta a_i$) which accounts for the phase shift in components of $\mathbf{j}_k$ due to presteering:

$$\mathbf{j}_k = j_k \mathbf{G}_j \boldsymbol{\Lambda}_j \mathbf{1} \tag{15}$$

$$\boldsymbol{\Lambda}_j \triangleq \operatorname{diag}\{e^{-j\omega T_1}, \ldots, e^{-j\omega T_K}\}. \tag{16}$$

$\omega$ is the common center frequency of the narrow-band signal and jammer, in rad/s. $\mathbf{n}_k$ in terms of the individual receiver noises $n_{i,k}$ is

$$\mathbf{n}_k \triangleq [n_{1,k} \; \cdots \; n_{K,k}]^T. \tag{17}$$

$s_k$ and $j_k$ can both be represented in complex envelope notation. A sampled signal $x_k$ in complex envelope notation is given by

$$x_k = a_k\, e^{j\omega T_s k}\, e^{j\psi_k}. \tag{18}$$

$a_k$ represents the signal amplitude, $e^{j\omega T_s k}$ the (noninformation bearing) carrier, and $e^{j\psi_k}$ the phase. $T_s$ is the sampling interval, in seconds. $\sigma^2$, the signal power as measured at any element in the absence of array imperfections, then becomes

(19)

Papoulis [36] showed that the random phase $\psi_k$ must be $\sim U(0, 2\pi)$ for $x_k$ to be stationary, where $\sim U(a, b)$ represents a random variable uniformly distributed on the real interval $[a, b]$. Representing $s_k$ and $j_k$ in complex envelope notation, $a_{s,k}$ and $a_{j,k}$ are the amplitudes of the signal and jammer, respectively. $\sigma_s^2$ and $\sigma_j^2$ are their powers. $\psi_{s,k}$ and $\psi_{j,k}$ are their phases. Finally, $a_{s,k}$ and $a_{j,k}$ are statistically independent, as are $\psi_{s,k}$ and $\psi_{j,k}$, and the latter four quantities are all assumed to be varying slowly enough with respect to the sampling interval so that $s_k$ and $j_k$ can both be considered narrow band, since the narrow-band assumption for $x_k$ implies that $a_k$ and $\psi_k$ are correlated over successive sampling intervals. Using (2), (4), and (12)-(17), the state vector can be rewritten as

$$\mathbf{u}_k = s_k \mathbf{B}\,\Delta\mathbf{G}_s \mathbf{1} + j_k \mathbf{B}\mathbf{G}_j \boldsymbol{\Lambda}_j \mathbf{1} + \mathbf{B}\mathbf{n}_k. \tag{20}$$

It is worthwhile reemphasizing that in (20) the noise $\mathbf{n}_k$ is affected by neither the element errors nor the presteering, because it was modeled as being added in after both the signal/jammer reception and presteering delays. The appropriateness of this model will vary with the application at hand. From (8), (20), and the independence of signal, jammer, and receiver noise

(21)

where the receiver noise power $\sigma_n^2 \triangleq E_t[|n_{i,k}|^2]$ was assumed equal for all channels.
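The role of the blocking-matrix constraints (4), (5) and of the leakage term in (20) can be checked numerically. The sketch below assumes the adjacent element differencing preprocessor mentioned above and an angle-independent gain error draw; the error level, the random seed, and the helper name `adjacent_difference_matrix` are illustrative assumptions, not values from the paper.

```python
import numpy as np

def adjacent_difference_matrix(K):
    """(K-1) x K adjacent element differencing matrix: row i is e_i - e_(i+1)."""
    B = np.zeros((K - 1, K))
    idx = np.arange(K - 1)
    B[idx, idx] = 1.0
    B[idx, idx + 1] = -1.0
    return B

K = 10
B = adjacent_difference_matrix(K)
ones = np.ones(K)

# Constraints (4) and (5): B 1_K = 0 and rank(B) = K - 1.
blocks_ideal_signal = np.allclose(B @ ones, 0.0)
rank_B = np.linalg.matrix_rank(B)

# Hypothetical element errors (illustrative magnitudes only).
rng = np.random.default_rng(0)
sigma = 0.01
delta_a = sigma * rng.standard_normal(K)         # amplitude errors
delta_p = sigma * rng.standard_normal(K)         # phase errors, in radians
g_s = (1.0 + delta_a) * np.exp(1j * delta_p)     # complex gains, as in (1)
delta_g_s = g_s - 1.0

# Signal part of the state vector in (20): B G_s 1 equals B dG_s 1 because B 1 = 0.
leak_direction = B @ g_s
leak_power = np.vdot(leak_direction, leak_direction).real
```

For an ideal array `leak_direction` would be the zero vector; any nonzero gain error lets a scaled copy of the signal into the sidelobe cancelling branch, which is the leakage discussed in Section II-B.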

Equations (2) and (12)-(17) can be used again for the desired response $d_k$ (i.e., the adaptive noise canceller primary input) to obtain

$$d_k = \frac{1}{K}\mathbf{x}_k^T \mathbf{1} = s_k \alpha_s + j_k \alpha_j + \frac{1}{K}\mathbf{n}_k^T \mathbf{1}. \tag{22}$$

$\alpha_s$ is the value of the "normalized" array (spatial) factor in the look direction, and $\alpha_j$ is the analogous value in the jammer's direction:

$$\alpha_s \triangleq \frac{1}{K}\mathbf{1}^T \mathbf{G}_s \mathbf{1} \tag{23}$$

$$\alpha_j \triangleq \frac{1}{K}\mathbf{1}^T \mathbf{G}_j \boldsymbol{\Lambda}_j \mathbf{1}. \tag{24}$$

The "normalized" array factor $\alpha$ in any general direction is

$$\alpha \triangleq \frac{1}{K}\mathbf{1}^T \mathbf{G} \boldsymbol{\Lambda} \mathbf{1} \tag{25}$$

where the nonsubscripted steering matrix $\boldsymbol{\Lambda}$ is for any direction $\theta$. $\boldsymbol{\Lambda}$ is formed by substituting $\theta$ for $\theta_j$ in the expression for $\tau_{i,j}$ appropriate for the particular array geometry used, and then using (3) in (16). $\alpha$ is both the "normalized" antenna pattern of a conventional beamformer implemented with an imperfect array, and the "normalized" unadapted directivity pattern of the GSC with an imperfect array, assuming that the adaptive weights are all set to zero. Substituting (20) and (22) into (9)

$$\mathbf{r}_{ud} = 2\sigma_s^2 \bar{\alpha}_s \mathbf{B}\,\Delta\mathbf{G}_s \mathbf{1} + 2\sigma_j^2 \bar{\alpha}_j \mathbf{B}\mathbf{G}_j \boldsymbol{\Lambda}_j \mathbf{1}. \tag{26}$$

Inverting (21)² and multiplying by (26), the exact Wiener solution becomes

(27)

with $W_{os}$ and $W_{oj}$ given in terms of $\alpha_s$, $\alpha_j$, $\mathrm{SNR}_i$, $\mathrm{INR}_i$, $\delta_s$, $\delta_j$, $\delta_{sj}$, and $\delta_{js}$, with the latter six quantities to be defined below. The input signal-to-noise and interference-to-noise ratios can be measured at any element to be

$$\mathrm{SNR}_i \triangleq \frac{\sigma_s^2}{\sigma_n^2} \tag{28}$$

$$\mathrm{INR}_i \triangleq \frac{\sigma_j^2}{\sigma_n^2}. \tag{29}$$

The quantities $\delta_s$, $\delta_j$, $\delta_{sj}$, and $\delta_{js}$ are also scalar, and will be called the $s$-, $j$-, $sj$-, and $js$-signal blocking matrix factors, respectively:

$$\delta_s \triangleq (\Delta\mathbf{G}_s \mathbf{1})^H \mathbf{B}^T (\mathbf{B}\mathbf{B}^T)^{-1} \mathbf{B} (\Delta\mathbf{G}_s \mathbf{1}) \tag{30}$$

$$\delta_j \triangleq (\mathbf{G}_j \boldsymbol{\Lambda}_j \mathbf{1})^H \mathbf{B}^T (\mathbf{B}\mathbf{B}^T)^{-1} \mathbf{B} (\mathbf{G}_j \boldsymbol{\Lambda}_j \mathbf{1}) \tag{31}$$

$$\delta_{sj} \triangleq (\Delta\mathbf{G}_s \mathbf{1})^H \mathbf{B}^T (\mathbf{B}\mathbf{B}^T)^{-1} \mathbf{B} (\mathbf{G}_j \boldsymbol{\Lambda}_j \mathbf{1}) \tag{32}$$

$$\delta_{js} \triangleq (\mathbf{G}_j \boldsymbol{\Lambda}_j \mathbf{1})^H \mathbf{B}^T (\mathbf{B}\mathbf{B}^T)^{-1} \mathbf{B} (\Delta\mathbf{G}_s \mathbf{1}) = \bar{\delta}_{sj}. \tag{33}$$

Making use of (23), (24) and (28)-(33), the exact expressions for $W_{os}$ and $W_{oj}$ are

(34)

(35)

B. Discussion

By studying the simpler expression for $\mathbf{u}_k$ that results when all array elements are ideal (i.e., $\Delta g_i = 0$, $i = 1, \ldots, K$ in (20)), it is seen that the effect of array imperfections is to allow the signal to "leak" into the sidelobe cancelling branch despite the signal blocking matrix [6]. This leakage will be denoted by $l_k$, and with reference to Fig. 1 and (20) can be written as

$$l_k = s_k \mathbf{w}_k^H \mathbf{B}\,\Delta\mathbf{G}_s \mathbf{1}. \tag{36}$$

As long as $l_k \neq 0$, the sidelobe cancelling branch uses it to try to estimate the signal. For "high" $\mathrm{SNR}_i$, that estimate will be fairly good, and the signal inadvertently gets treated like a jammer. One would expect this type of behavior, because the beamformer only cares about minimizing output power (i.e., MSE), in any way it can. Therefore, without having extra information supplied to it concerning the nature of $s_k$ (e.g., a pilot signal [2]), without advance knowledge of the imperfections, and without any other special remedies being taken, the beamformer has no recourse but to walk the plank of signal annihilation.

² Inversion of $\mathbf{R}_{uu}$ is accomplished by two nested applications of the matrix inversion lemma [37]. Considering the multiple jammer case, if there were $N$ jammers, an exact closed form expression for the Wiener weight vector would require $N + 1$ nested applications of the matrix inversion lemma, which would greatly increase complexity without adding much additional insight into the hypersensitivity phenomenon under study here.

III. OUTPUT SIGNAL-TO-INTERFERENCE-PLUS-NOISE RATIO

The purposes of this section are fourfold. First, the exact formula for Wiener output SINR is derived. Second, we find the conditions under which the signal will be nulled. Third, an approximate equation is presented for output SINR of the Capon beamformer. Fourth, the ideas in this section are clarified by carrying out a "Wiener simulation," which is done by randomly generating array imperfections, and then using (27) to compute the Wiener weight vector that minimizes MSE assuming that the beamformer is unaware of the imperfections.
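The "Wiener simulation" procedure just described can be sketched numerically. The code below does not use the closed form (27); as an assumption, it solves the normal equations (10) directly, with the covariance terms built from (20) and (26). Powers are taken as E|.|^2, so the common factor of 2 is dropped, which does not affect ratios. The function name `wiener_sinr_db`, the random seed, and the gain error level are hypothetical; the array and power levels follow the ten-element example used later in this section.

```python
import numpy as np

def wiener_sinr_db(g, theta_j_deg, P_s=1.0, P_j=100.0, P_n=1e-3, d_over_lambda=0.5):
    """Steady state output SINR (dB) of the GSC with adjacent element differencing.

    g holds the complex element gains, assumed here to be the same for the
    signal and jammer directions. The look direction is broadside.
    """
    K = g.size
    idx = np.arange(K - 1)
    B = np.zeros((K - 1, K))
    B[idx, idx] = 1.0
    B[idx, idx + 1] = -1.0

    phase = 2 * np.pi * d_over_lambda * np.arange(K) * np.sin(np.deg2rad(theta_j_deg))
    v_s = g.astype(complex)              # G_s 1
    v_j = g * np.exp(-1j * phase)        # G_j Lambda_j 1
    alpha_s, alpha_j = v_s.mean(), v_j.mean()
    b_s, b_j = B @ v_s, B @ v_j          # signal leakage and jammer terms of (20)

    R = (P_s * np.outer(b_s, b_s.conj()) + P_j * np.outer(b_j, b_j.conj())
         + P_n * (B @ B.T))
    r = P_s * np.conj(alpha_s) * b_s + P_j * np.conj(alpha_j) * b_j
    w = np.linalg.solve(R, r)            # Wiener weights, as in (10)

    P_os = P_s * abs(alpha_s - np.vdot(w, b_s)) ** 2
    P_oj = P_j * abs(alpha_j - np.vdot(w, b_j)) ** 2
    c = np.ones(K) / K - B.T @ w.conj()  # per-element noise weights at the output
    P_on = P_n * np.vdot(c, c).real
    return 10 * np.log10(P_os / (P_oj + P_on))

K = 10
rng = np.random.default_rng(1)
sigma_g2 = 1e-4                          # -40 dB gain error variance (illustrative)
da = np.sqrt(sigma_g2 / 2) * rng.standard_normal(K)
dp = np.sqrt(sigma_g2 / 2) * rng.standard_normal(K)

sinr_ideal_db = wiener_sinr_db(np.ones(K), 17.0)
sinr_imperfect_db = wiener_sinr_db((1 + da) * np.exp(1j * dp), 17.0)
```

Under these assumptions the ideal array should come out close to 10 log10(SNR_i K) = 40 dB, while the imperfect array should fall far below it, which is the hypersensitivity analyzed in Section III-B.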

A. Derivation of Exact Formula

The exact expression for Wiener output SINR is derived below. The output power $P_o$ of the GSC is

$$P_o \triangleq \tfrac{1}{2} E_t[|z_k|^2] = \tfrac{1}{2}\left(E_t[|d_k|^2] - E_t[d_k \bar{y}_k] - E_t[\bar{d}_k y_k] + E_t[|y_k|^2]\right). \tag{37}$$

With reference to Fig. 1, the sidelobe cancelling signal $y_k$ is

$$y_k = \mathbf{w}_k^H \mathbf{u}_k. \tag{38}$$

When each term in (37) is evaluated using (20), (22), and (38), it follows that $P_o$ can be expressed as the sum of three terms. The first is $P_{os}$, the output power due to the signal only,


the second term is $P_{oj}$, the output power due to the jammer only, and finally $P_{on}$ is the output power due to the receiver noise only. When $\mathbf{w}^*$ is used in (38) to calculate the steady state sidelobe cancelling signal $y_k^*$, and then $y_k^*$ is used in (37) to calculate the steady state values of $P_{os}$, $P_{oj}$, and $P_{on}$ (denoted by $P_{os}^*$, $P_{oj}^*$, and $P_{on}^*$, respectively), one obtains

(39)

(40)

(41)

$\mathrm{SINR}_o^*$ (i.e., the steady state $\mathrm{SINR}_o$) is then

$$\mathrm{SINR}_o^* \triangleq \frac{P_{os}^*}{P_{oj}^* + P_{on}^*}. \tag{42}$$

$\mathrm{SINR}_o$ of a conventional beamformer, symbolized by $\mathrm{SINR}_{o,c}$, can be derived by setting $W_{os}$ and $W_{oj}$ to zero in (39)-(41), and then substituting the result into (42). There is no need to explicitly indicate that $\mathrm{SINR}_{o,c}$ is steady state, since no adaptive weights are involved in computing it:

$$\mathrm{SINR}_{o,c} = \frac{\mathrm{SNR}_i\, |\alpha_s|^2}{\mathrm{INR}_i\, |\alpha_j|^2 + \dfrac{1}{K}}. \tag{43}$$

Recall that the analysis so far involved no approximations with respect to relative power levels of signal, jammer, and receiver noise. Additionally, no approximations were made as far as the magnitude of the array imperfections was concerned.

B. Hypersensitivity to Array Imperfections

This subsection will demonstrate signal nulling when $\mathrm{SNR}_i$ is "high." Two assumptions will first be made. They are

$$\mathrm{SNR}_i \gg \frac{1}{\delta_s} \tag{44}$$

$$\mathrm{INR}_i \gg \frac{1}{\delta_j} \tag{45}$$

where $\delta_s$ and $\delta_j$ were given in (30), (31). The first assumption means that $\mathrm{SNR}_i$ is high enough for the signal to be nulled out. $\delta_s$ measures the imperfectness of the array from the signal viewpoint, so that when $\delta_s$ is low (good array), only high $\mathrm{SNR}_i$ will result in signal nulling, and when $\delta_s$ is high (bad array), even low $\mathrm{SNR}_i$ will result in signal nulling.³ The second assumption means that $\mathrm{INR}_i$ is high enough for the jammer to be nulled out. For strong jammers, the second assumption will essentially always be true, since $\delta_j \cong K$ away from the look direction [35]. Armed with the assumptions (44), (45), it is straightforward to show that [35]

$$\mathrm{SINR}_o^* \cong 0. \tag{46}$$

This result provides solid evidence that the signal gets treated like a jammer. Furthermore, under the two assumptions made, the GSC performs even worse than the conventional beamformer, which at least will have nonzero $\mathrm{SINR}_o$ given by (43).

³ In an actual implementation, even though (44) may be satisfied, nulling of the signal may be prevented somewhat by the limited dynamic range of the weights. Throughout this paper, it is assumed that the weights have infinite dynamic range.

C. Approximation by Output Signal-to-Noise Ratio

This subsection will present a simple approximation for $\mathrm{SINR}_o^*$ that is valid for the Capon beamformer. If the designer wishes to check the accuracy of the assumptions (44), (45), some $\mathbf{B}$ must be chosen so that (30), (31) can be evaluated. A class of signal blocking matrices known as central difference matrices is formed by using $r$ cascaded columns of differencing, as shown in Fig. 4. Notationally, they are written as $\mathbf{B}_K^{(r-1)}$, where the subscript $K$ indicates the number of antenna elements, and the superscript $(r - 1)$ the use of a main lobe zero $(r - 1)$-st derivative constraint in the look direction (when the array is ideal). The quantity $\tilde{K}$ (the dimension of the state and weight vectors) then becomes $(K - r)$. The simplest case is a zero superscript, or $r = 1$, which is just adjacent element differencing, and configures the GSC to be maximum-likelihood, implementing a simple unity gain constraint in the look direction. The $(K - 1) \times K$ matrix $\mathbf{B}_K^{(0)}$ is

$$\mathbf{B}_K^{(0)} \triangleq \begin{bmatrix} 1 & -1 & 0 & \cdots & 0 \\ 0 & 1 & -1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & 1 & -1 \end{bmatrix}. \tag{47}$$

From Fig. 4, it should be clear that the $(K - r) \times K$ matrix $\mathbf{B}_K^{(r-1)}$ then becomes

$$\mathbf{B}_K^{(r-1)} = \mathbf{B}_{K-r+1}^{(0)}\, \mathbf{B}_{K-r+2}^{(0)} \cdots \mathbf{B}_K^{(0)}. \tag{48}$$

Due to the importance of the Capon beamformer in practice, $\mathbf{B}_K^{(0)}$ will be chosen to illustrate the form (30)-(33) take. From [35]:

(49)

which allows (30)-(32) to be written in purely scalar form:


(50)

(51)

(52)

where the superscript (0) has the same meaning as for $\mathbf{B}_K^{(0)}$. From these last three equations it is easy to see that as long as $\Delta g_{i,s} \ll 1$ and $\Delta g_{i,j} \ll 1$, which is realistic:

(53)

As shown in Section III-A, $\mathrm{SINR}_o^*$ is a complicated function of the number of array elements, the array geometry, the array imperfections, the look direction, the jammer angle of arrival, the input signal-to-noise ratio, the input interference-to-noise ratio, and the signal blocking matrix. Therefore, any assumptions which simplify the expression for $\mathrm{SINR}_o^*$ while preserving its accuracy will help tremendously. Fortunately, for the Capon beamformer, there is such an assumption, and in addition to being intuitively pleasing, it can also be shown numerically to be quite reasonable. The assumption is that $\mathrm{SINR}_o^*$ can be approximated by the Wiener output signal-to-noise ratio ($\mathrm{SNR}_o^*$). In a nutshell, this means that the Wiener output signal and receiver noise power are the same as if there were no jammer present, and the Wiener output jammer power is zero. For these approximations to hold, the jammer must fall outside the main lobe of the unadapted beampattern.

In order to test the validity of the above approximations, $P_{os}^*$, $P_{oj}^*$, and $P_{on}^*$ were plotted as a function of the gain error variance (cf. (54) below) for a ten-element equally spaced line array with $d/\lambda = 0.5$, a broadside look direction, $\mathrm{SNR}_i = 30$ dB, and $\mathrm{INR}_i = 50$ dB. The array geometry is shown in Fig. 5. The jammer angle chosen was the worst one outside the main lobe, at $\theta_j = 17°$. As with all other "Wiener simulations" in this paper, the computations were performed on a DEC VAX 11/780, using DOUBLE PRECISION program variables.

Fig. 4. Cascaded columns of differencing.

Fig. 5. Array geometry.

Letting $E_a[\cdot]$ represent expectation over an ensemble of antenna elements that are i.i.d., the gain error variance $\sigma_g^2$ (which measures the variance of the fractional gain deviation from its nominal value) is defined as

$$\sigma_g^2 \triangleq E_a[|\Delta g_i|^2], \qquad i = 1, \ldots, K \tag{54}$$

and the amplitude error variance and phase error variance in a similar manner:

$$\sigma_a^2 \triangleq E_a[\Delta a_i^2], \qquad i = 1, \ldots, K \tag{55}$$

$$\sigma_p^2 \triangleq E_a[\Delta p_i^2], \qquad i = 1, \ldots, K. \tag{56}$$

The gain errors were randomly chosen as the sum of equal variance, independent, zero-mean Gaussian amplitude and phase errors:

(57)

where both $\sigma_a^2$ and $\sigma_p^2$ must be "small" for (57) to hold. "Small" means

$$(1 + \Delta a_i)\, e^{j \Delta p_i} \cong 1 + \Delta a_i + j \Delta p_i, \qquad i = 1, \ldots, K. \tag{58}$$

For this "Wiener simulation," only angle independent amplitude and phase errors were generated, so that $\Delta g_{i,s} = \Delta g_{i,j}$ $(i = 1, \ldots, K)$. In the plots referred to above, $P_{os}^*$, $P_{oj}^*$, and $P_{on}^*$ were compared with and without the jammer present for $\sigma_g^2$ in the range $-100$ to $-20$ dB, where

$$\sigma_g^2\ (\mathrm{dB}) \triangleq 10 \log_{10} \sigma_g^2. \tag{59}$$
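The "small error" condition (58) and the way the gain errors are generated can be illustrated with a short Monte Carlo check. The sketch assumes that, for independent zero-mean amplitude and phase errors, the total gain error variance is the sum of the two variances, which follows from (58). The trial count, seed, and the -40 dB error level are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials = 200_000

sigma_g2_db = -40.0                        # target gain error variance, as in (59)
sigma_g2 = 10 ** (sigma_g2_db / 10)
sigma_a = sigma_p = np.sqrt(sigma_g2 / 2)  # equal variance amplitude and phase errors

da = sigma_a * rng.standard_normal(n_trials)
dp = sigma_p * rng.standard_normal(n_trials)

dg_exact = (1 + da) * np.exp(1j * dp) - 1  # exact complex gain error, from (1)
dg_linear = da + 1j * dp                   # first-order form on the right of (58)

measured_db = 10 * np.log10(np.mean(np.abs(dg_exact) ** 2))
max_deviation = np.max(np.abs(dg_exact - dg_linear))
```

At this error level the measured variance should agree with the target to within a small fraction of a decibel, and the largest deviation between the exact and linearized gain errors should be second order in the error size.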

The respective curves of $P_{os}^*$ and $P_{on}^*$ were so close to each other that they were virtually indistinguishable. $P_{oj}^*$ was at least 70 dB below both $P_{os}^*$ and $P_{on}^*$, so that $\mathrm{SINR}_o^*$ can be approximated by $\mathrm{SNR}_o^*$, resulting in

$$\mathrm{SINR}_o^* \cong \frac{\mathrm{SNR}_i\, K}{1 + \mathrm{SNR}_i^2\, K(K-1)\, \sigma_g^2} \tag{60}$$

which in the case of an ideal array is $\mathrm{SNR}_i K$, as it should be, and drops down 3 dB at the threshold gain error variance of

$$\sigma_{g,\mathrm{threshold}}^2 \cong \frac{1}{\mathrm{SNR}_i^2\, K(K-1)}. \tag{61}$$

The similarity between (60) and Hudson's equation [17, eq. (6.2.8)] is considered in [35]. The form of (61) is also consistent with Compton's [20] observations. This is elaborated on in [35]. Furthermore, (61) can be used to define the threshold input signal-to-noise ratio needed in order to have inadvertent signal nulling, for a given number of array elements and gain error variance. The result is a strong suggestion that the signal will be nulled under far milder conditions than given by (44) (it is shown in [35] that $\delta_s^{(0)} \cong (K - 1)\sigma_g^2$, which allows direct comparison between the conditions for signal nulling in (44) and (61)). The simulation data in Table I (obtained by ensemble averaging 50 curves of $\mathrm{SINR}_o^*$ versus $\sigma_g^2$ and then measuring $\sigma_{g,\mathrm{threshold}}^2$) shows that (61) is accurate to within 1 dB when the gain errors are the sum of equal variance, independent, zero-mean Gaussian amplitude and phase errors that are angle independent, as discussed earlier. The array geometry is also the same as the one discussed earlier. The measurement error for the simulation data is ±1 dB. For high $\mathrm{SNR}_i$, the threshold gain error variance is extremely small. In fact, since both Mayhan [15] and Schrank [38] have stressed that $\sigma_g^2 < -40$ dB is usually considered difficult to achieve in practice, it is clear from (61) that for high $\mathrm{SNR}_i$, it will not be possible to control $\sigma_g^2$ to a tight enough precision to avoid signal nulling. For example, if $\mathrm{SNR}_i$ is only 20 dB, and a ten-element array is used, (61) says that $\sigma_g^2$ needs to be about $-60$ dB, which is an almost impossible specification. Therefore, when $\mathrm{SNR}_i$ is high, some special strategy must be taken to avoid signal nulling in linearly constrained adaptive beamforming.

TABLE I
THRESHOLD GAIN ERROR VARIANCE

K    d/λ    θ_s (deg)   θ_j (deg)   SNR_i (dB)   INR_i (dB)   σ²_g,threshold (dB)
                                                               Formula    Simulation
10   0.5    0           17          10           50           -39.5      -39
10   0.5    0           17          20           50           -59.5      -59
10   0.5    0           17          30           50           -79.5      -79
10   0.5    0           17          30           40           -79.5      -79
10   0.5    0           17          30           30           -79.5      -79
10   0.5    0           44          30           50           -79.5      -79

D. Wiener Simulation Example

This subsection will contain a "Wiener simulation" example. In order to graphically illustrate the ideas in this section, a numerical example was done for a ten-element equally spaced line array with a broadside look direction and $d/\lambda = 0.5$, where $d$ is the ideal interelement spacing, and $\lambda$ the radiation wavelength. The array geometry is the same as indicated previously in Fig. 5. Signal blocking was accomplished by adjacent element differencing ($r = 1$ in Fig. 4), which put the GSC into a maximum-likelihood configuration. The signal had a power level of $\sigma_s^2 = 0$ dBW,⁴ the jammer $\sigma_j^2 = 20$ dBW, and the receiver noise $\sigma_n^2 = -30$ dBW. The imperfections were assumed to be due only to random element misplacement, obeying a two-dimensional zero-mean Gaussian distribution whose radius had a standard deviation $\sigma_r$ of $0.01\lambda$ (only 2 percent of the ideal interelement spacing, a specification that may not even be attained in practice for many applications), taking the element position in an ideal array as the centroid, shown in Fig. 6. These random misplacements cause a direction-dependent phase error at each antenna element. However, in [35] it is shown that for far-field signals, under the three assumptions of the random $x$ misplacement $\Delta x$ and random $y$ misplacement $\Delta y$ both being "small," statistically independent, and of equal variance, the phase error variance is independent of angle of arrival,⁵ and is given by

(62)

Fig. 6. Geometry of random element misplacement (ideally placed element, misplaced element, offsets Δx and Δy, radius standard deviation σ_r).

⁴ Decibels relative to 1 W. The formula is: X(dBW) = 10 log₁₀ X(W).
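The numbers quoted around (60)-(62) can be reproduced with a few lines of arithmetic. The sketch below reads (61) as 1 / (SNR_i^2 K (K - 1)), the form that matches the "Formula" column of Table I. As a labeled assumption, it also takes the misplacement phase error variance in (62) to be (2π σ_r / λ)^2, which is consistent with the -24 dB figure quoted in the text but is not taken from the garbled source equation. All function names are hypothetical.

```python
import math

def db(x):
    """Convert a power ratio to decibels."""
    return 10.0 * math.log10(x)

def threshold_gain_error_variance_db(snr_i_db, K):
    """3 dB threshold of (61), read as 1 / (SNR_i^2 K (K - 1)), in dB."""
    snr = 10.0 ** (snr_i_db / 10.0)
    return -db(snr ** 2 * K * (K - 1))

def sinr_approx_db(snr_i_db, K, sigma_g2_db):
    """Approximation (60) for the Wiener output SINR of the Capon beamformer."""
    snr = 10.0 ** (snr_i_db / 10.0)
    sigma_g2 = 10.0 ** (sigma_g2_db / 10.0)
    return db(snr * K / (1.0 + snr ** 2 * K * (K - 1) * sigma_g2))

K = 10
formula_column = {10: -39.5, 20: -59.5, 30: -79.5}   # Table I, SNR_i (dB) -> threshold (dB)
computed = {snr_db: round(threshold_gain_error_variance_db(snr_db, K), 1)
            for snr_db in formula_column}

# Drop in output SINR when the gain error variance sits exactly at the threshold.
ideal_db = sinr_approx_db(30, K, -300.0)             # effectively an ideal array
at_threshold_db = sinr_approx_db(30, K, threshold_gain_error_variance_db(30, K))
drop_db = ideal_db - at_threshold_db

# Assumed form of (62): phase error variance for sigma_r = 0.01 wavelengths.
sigma_r_over_lambda = 0.01
phase_error_variance_db = db((2.0 * math.pi * sigma_r_over_lambda) ** 2)
```

Comparing `phase_error_variance_db` with the SNR_i = 30 dB threshold shows the misplacement error sitting more than 50 dB above the level at which signal nulling sets in.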

In this particular simulation there were no amplitude errors, so $\sigma_g^2$ is also given by (62), and is easily calculated to be $-24$ dB, a rather large error for this system, in light of the roughly $-80$ dB threshold gain error variance from (61).

⁵ Although this result may seem strange, it turns out to be a direct consequence of the fact that $\sin^2\theta + \cos^2\theta = 1$.

As a basis for future comparison, Fig. 7 plots $\mathrm{SINR}_o^*$ for an array having ideally placed elements. This plot was generated by sweeping the jammer from $\theta_j = -90°$ to $\theta_j = +90°$ in 1° steps, and at each angle computing $\mathrm{SINR}_o^*$ by (42), with $\mathrm{SINR}_{o,c}$ (computed by (43)) included as a reference in order to demonstrate the performance improvement due to adaptation. Clearly, in the absence of array imperfections, the GSC works very well.

Fig. 7. Wiener output signal-to-interference-plus-noise ratio versus jammer angle for Capon beamformer implemented with ideal array. (Horizontal axis: jammer angle of arrival, θ_j (deg); curves: adaptive and conventional; K = 10, d/λ = 1/2, SNR_i = 30 dB, INR_i = 50 dB, broadside look direction.)

Fig. 8 uses the same parameters as Fig. 7, except that $\mathrm{SINR}_o^*$ was computed with the effect of the unknown random element misplacement included. Although the conventional beamformer is hardly affected by slight element misplacement, the GSC $\mathrm{SINR}_o^*$ is seen to drop by over 50 dB at most jammer angles, which is a very serious loss. In fact, for the GSC, $\mathrm{SINR}_o^*$ falls so low that even the nonadaptive conventional beamformer outperforms it for essentially all jammer angles of interest. These observations are consistent with the analysis presented earlier in this section.

The array factors for both cases, computed by (25), are shown in Fig. 9. They are virtually the same for both the array with ideally placed elements and the array with misplaced elements. The worst jammer angle outside the main lobe is pointed out, which is $\theta_j = 17°$. A jammer coming from this angle is at the maximum of the peak sidelobe of the unadapted array. In Fig. 10, the far-field directivity pattern of the GSC for the above jammer angle is plotted, without the effects of element misplacement considered. In Fig. 11, the far-field directivity pattern of the GSC for the above jammer angle is plotted, with the effects of element misplacement considered. From Figs. 10 and 11, it is clear that for the ideal array, only the jammer is nulled, with the gain in the look-direction satisfying the 0 dB constraint. However, for the array with slightly misplaced elements, not only is the jammer nulled, but due to the small pointing error resulting from element misplacement, the signal is nulled as well, and in the process the main lobe has been destroyed. The high directivity shown at most angles for the array with misplaced elements does not violate any physical principles. It is merely indicative of the large weight values needed to null the signal. The beamformer chooses these large weight values because they minimize MSE, without regard for the effects on the directivity pattern.

IV. ARTIFICIAL RECEIVER NOISE INJECTION

The goals of this section are to derive the new leaky algorithms for artificial receiver noise injection in the GSC by means of a novel approach, present an expression for the "optimum" level of this noise along with some "Wiener simulations," and discuss results of "data simulation" experiments which support the theory set forth in this paper. Contrasting the terms "Wiener simulation" and "data simulation," the latter means that the performance of the GSC was simulated not by computing the Wiener solution, but by using random input signals whose temporal and spatial characteristics were specified as input variables to a data simulation program. This type of simulation allows one to observe both transient and steady state GSC behavior.
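The distinction between the two kinds of simulation can be illustrated with a deliberately small sketch. It is not the author's simulation program: it assumes a single complex weight, a desired signal of the hypothetical form d = conj(h)·u + n, and invented function names. The "Wiener" route uses the known statistics directly, while the "data" route draws random samples and forms sample averages.

```python
import random


def complex_gaussian(rng, power):
    """Draw one zero-mean circular complex Gaussian sample with the given power."""
    s = (power / 2.0) ** 0.5
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def wiener_weight_exact(h, input_power):
    """'Wiener simulation': use the known statistics.

    For the assumed model d = conj(h)*u + n with u and n independent,
    r_ud = E[u d*] = h * input_power and R_uu = input_power.
    """
    r_ud = h * input_power
    r_uu = input_power
    return r_ud / r_uu


def wiener_weight_from_data(h, input_power, noise_power, num_samples, seed=0):
    """'Data simulation': estimate the same weight from random input samples."""
    rng = random.Random(seed)
    r_ud = 0j
    r_uu = 0.0
    for _ in range(num_samples):
        u = complex_gaussian(rng, input_power)
        n = complex_gaussian(rng, noise_power)
        d = h.conjugate() * u + n
        r_ud += u * d.conjugate()   # running sum for the cross-covariance
        r_uu += abs(u) ** 2         # running sum for the input power
    return r_ud / r_uu


if __name__ == "__main__":
    h = 0.5 - 0.25j
    print("exact    :", wiener_weight_exact(h, 1.0))
    print("from data:", wiener_weight_from_data(h, 1.0, 0.1, 20000))
```

The data-based estimate fluctuates around the exact value and tightens as the number of samples grows, which is the kind of transient behavior a Wiener simulation cannot show.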

A. Derivation of New Leaky Algorithms

In this subsection, the new leaky algorithms for artificial receiver noise injection in the GSC will be derived. The Wiener weight vector computed in (27) solved the unconstrained optimization problem

\[ \min_{\mathbf{w}} \; E_t\big[|z_k|^2\big]. \tag{63} \]

[Figure omitted. Horizontal axis: jammer angle of arrival, $\theta_j$ (deg). Plot annotations: look-direction marker; "Conventional" curve; $K = 10$, $d/\lambda = 1/2$, $\theta_s = 0^\circ$, SNR$_i$ = 30 dB, INR$_i$ = 50 dB.]

Fig. 8. Wiener output signal-to-interference-plus-noise ratio versus jammer angle for Capon beamformer implemented with imperfect array.

[Figure omitted. Horizontal axis: far-field angle, $\theta$ (deg). Plot annotations: look-direction marker; jammer at $\theta_j = 17^\circ$; curves for ideal array and imperfect array; $K = 10$, $d/\lambda = 1/2$, $\theta_s = 0^\circ$, $\sigma_r/\lambda = 0.01$.]

Fig. 9. Array spatial factors of ideal and imperfect arrays.
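The statement that a jammer near 17° sits on the peak sidelobe of the unadapted array can be checked numerically. The sketch below is only an illustration under stated assumptions: a uniformly weighted, ideal ten-element line array with half-wavelength spacing and a broadside look direction, with function names invented here. It does not reproduce the paper's expression (25), which is not shown in this section.

```python
import cmath
import math


def array_factor_db(theta_deg, num_elements=10, spacing_wavelengths=0.5, look_deg=0.0):
    """Normalized array factor (dB) of a uniformly weighted, equally spaced line array."""
    psi = 2.0 * math.pi * spacing_wavelengths * (
        math.sin(math.radians(theta_deg)) - math.sin(math.radians(look_deg))
    )
    total = sum(cmath.exp(1j * n * psi) for n in range(num_elements))
    magnitude = abs(total) / num_elements
    return 20.0 * math.log10(max(magnitude, 1e-12))  # clip to avoid log(0) at nulls


def peak_sidelobe(start_deg=11.6, stop_deg=23.5, step_deg=0.01):
    """Scan between the first two nulls and return (angle, level) of the first sidelobe peak."""
    best_angle, best_level = start_deg, array_factor_db(start_deg)
    steps = int(round((stop_deg - start_deg) / step_deg))
    for i in range(steps + 1):
        angle = start_deg + i * step_deg
        level = array_factor_db(angle)
        if level > best_level:
            best_angle, best_level = angle, level
    return best_angle, best_level


if __name__ == "__main__":
    angle, level = peak_sidelobe()
    print("first sidelobe peak near %.2f deg at %.2f dB" % (angle, level))
```

Under these assumptions the first sidelobe peaks a little below 17°, at roughly 13 dB below the main lobe, which is consistent with the worst-case jammer angle quoted in the text.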

[Figure omitted. Horizontal axis: far-field angle, $\theta$ (deg). Plot annotations: look-direction signal; jammer; $d/\lambda = 1/2$, $\theta_s = 0^\circ$, $\theta_j = 17^\circ$, SNR$_i$ = 30 dB, INR$_i$ = 50 dB.]

Fig. 10. Single jammer far-field directivity pattern of Capon beamformer implemented with ideal array.

[Figure omitted. Horizontal axis: far-field angle, $\theta$ (deg). Plot annotations: look-direction signal; jammer; $K = 10$, $d/\lambda = 1/2$, $\theta_s = 0^\circ$, $\theta_j = 17^\circ$, SNR$_i$ = 30 dB, INR$_i$ = 50 dB, $\sigma_r/\lambda = 0.01$.]

Fig. 11. Single jammer far-field directivity pattern of Capon beamformer implemented with imperfect array.

It was previously shown that the degradation of the GSC performance in the presence of array imperfections was due to leakage $l_k$ of the signal into the sidelobe cancelling branch, as given by (36). In order to eliminate this leakage, consider the constrained optimization problem

\[ \min_{\mathbf{w}} \; E_{t,a}\big[|z_k|^2\big] \quad \text{subject to} \quad l_k = 0 \tag{64} \]

where $E_{t,a}[\cdot]$ represents a double expectation, taken over both time and an ensemble of i.i.d. antenna elements. Equation (64) is ill-posed, and therefore cannot be solved directly. The reason is that knowledge of the signal leakage in advance is the same as knowledge of the unknown array imperfections, so the only way to be sure of satisfying the constraint would be to set all adaptive weights to zero. The result would be a conventional beamformer. However, it is well known in optimization theory that the solution of a constrained optimization problem can be approximated by solving a corresponding unconstrained optimization problem. From Luenberger [39], one learns that for the method of steepest descent, the procedure is to convert the constrained optimization problem

\[ \min_{\mathbf{w}} \; f(\mathbf{w}_k) \quad \text{subject to} \quad q(\mathbf{w}_k) = 0 \tag{65} \]

to the unconstrained optimization problem

\[ \min_{\mathbf{w}} \; \big( f(\mathbf{w}_k) + c\,P(\mathbf{w}_k) \big) \tag{66} \]

where $f(\mathbf{w}_k)$, $q(\mathbf{w}_k)$, and $P(\mathbf{w}_k)$ all represent suitable functions for the problem. In system optimization theory, $P(\mathbf{w}_k)$ is known as a penalty function, and $c$ as a penalty constant, the latter generally chosen to be "large." For a stochastic problem, one usually chooses $P(\mathbf{w}_k)$ as $E[|q(\mathbf{w}_k)|^2]$, with the expectation taken as appropriate. Then by making the penalty constant large in (66), the minimization of $c\,P(\mathbf{w}_k)$ dominates the minimization of $(f(\mathbf{w}_k) + c\,P(\mathbf{w}_k))$. Therefore, the solution to (66) must have small $q(\mathbf{w}_k)$. Indeed

\[ \lim_{c \to \infty} q(\mathbf{w}_k) = 0 \tag{67} \]

which leads back to the corresponding constrained minimization problem (65). Based on the above argument, the logical choice for the constrained optimization problem is the one in (64), namely

\[ q(\mathbf{w}_k) = l_k. \tag{68} \]

By using (36) and the conjecture that for large penalty constant, the weight vector will be fairly independent of the array imperfections, the penalty function $P(\mathbf{w}_k)$ can be written as [35]

\[ P(\mathbf{w}_k) = E_{t,a}\big[|l_k|^2\big] \approx \sigma_s^2\, E_t\big[\mathbf{w}_k^H \mathbf{B} \mathbf{R}_{gg} \mathbf{B}^T \mathbf{w}_k\big] \tag{69} \]

where $\mathbf{R}_{gg}$ is defined as the covariance matrix of element imperfections. (70)

Thus it is hoped that (69) represents a good approximation to the penalty function. Unfortunately, in an on-line implementation of stochastic steepest descent, the computation of $P(\mathbf{w}_k)$ as given by (69) could still prove to be quite cumbersome, due to the presence of the time expectation operator. Consequently, it is proposed that the instantaneous value of the weight vector at the time sample $k$ be used to generate an approximation to the expectation on the last line of (69), in the manner

\[ \hat{P}(\mathbf{w}_k) = \sigma_s^2\, \mathbf{w}_k^H \mathbf{B} \mathbf{R}_{gg} \mathbf{B}^T \mathbf{w}_k \tag{71} \]

where $\hat{P}(\mathbf{w}_k)$ represents the instantaneous estimate of $P(\mathbf{w}_k)$. The approximation used to obtain (71) becomes successively better as $\mathbf{w}_k$ converges to $\mathbf{w}_i^*$, the biased Wiener weight vector when the penalty function is used. If the element imperfections are assumed zero-mean i.i.d., then from (70)

\[ \mathbf{R}_{gg} = \sigma_g^2 \mathbf{I}. \tag{72} \]

Substitution of (72) into (71) results in

\[ \hat{P}(\mathbf{w}_k) = \sigma_s^2 \sigma_g^2\, \mathbf{w}_k^H \mathbf{B} \mathbf{B}^T \mathbf{w}_k. \tag{73} \]

The complex LMS algorithm, which can be used to update the weights in the GSC, was shown by Widrow et al. [2] to be

\[ \mathbf{w}_{k+1} = \mathbf{w}_k + 2\mu\,\epsilon_k^*\,\mathbf{u}_k = \big(\mathbf{I} - 2\mu\,\mathbf{u}_k \mathbf{u}_k^H\big)\mathbf{w}_k + 2\mu\, d_k^*\,\mathbf{u}_k \tag{74} \]

where $\mu$ is called the adaptation constant, and $\epsilon_k$ is the error signal at time sample $k$, which for the GSC is chosen to be the beamformer output $z_k = d_k - y_k$. If (74) is used for the GSC, the mean weight vector will converge to $\mathbf{w}^*$ [2], and all the analysis done so far in this paper concerning hypersensitivity to array imperfections will apply. The complex LMS algorithm can be derived by considering the squared error function $\hat{\xi}_k$, or performance surface estimate at time sample $k$, of the form

\[ \hat{\xi}_k = \xi^* + (\mathbf{w}_k - \mathbf{w}^*)^H \mathbf{u}_k \mathbf{u}_k^H (\mathbf{w}_k - \mathbf{w}^*) \tag{75} \]

with $\xi_k$ denoting the MSE (i.e., performance surface) at time sample $k$, and $\xi^*$ representing the minimum (attainable) mean-square error (MMSE). By taking $\mu$ times the negative gradient of $\hat{\xi}_k$ with respect to $\mathbf{w}_k$, and adding it to $\mathbf{w}_k$, Widrow et al. [2] developed the complex LMS algorithm. Due to the form of (75), the MSE is a $K$-dimensional concave upward parabolic bowl, which means that taking the negative gradient of $\hat{\xi}_k$ on the average leads to descent in the bowl's steepest direction. Use of the estimated penalty function $\hat{P}(\mathbf{w}_k)$ yields the modified performance surface estimate $\hat{\rho}_k$, which is

\[ \hat{\rho}_k = \hat{\xi}_k + c\,\hat{P}(\mathbf{w}_k) = \xi^* + (\mathbf{w}_k - \mathbf{w}^*)^H \mathbf{u}_k \mathbf{u}_k^H (\mathbf{w}_k - \mathbf{w}^*) + c\,\sigma_s^2 \sigma_g^2\, \mathbf{w}_k^H \mathbf{B} \mathbf{B}^T \mathbf{w}_k \tag{76} \]

with $\rho_k$ denoting the MSE at time sample $k$ when the modified performance surface is used.
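To make the complex LMS recursion (74) concrete, the following is a minimal sketch in plain Python. The test signal is an assumption made purely for illustration: a noiseless desired signal generated by a hypothetical "true" weight vector and a small deterministic set of input vectors. The function names are likewise invented here.

```python
def lms_step(w, u, d, mu):
    """One complex LMS update: w <- w + 2*mu*conj(eps)*u, with eps = d - w^H u."""
    y = sum(wi.conjugate() * ui for wi, ui in zip(w, u))  # y_k = w_k^H u_k
    eps = d - y                                           # error signal
    w_next = [wi + 2.0 * mu * eps.conjugate() * ui for wi, ui in zip(w, u)]
    return w_next, eps


def run_lms_demo(w_true, mu=0.1, cycles=100):
    """Cycle through three fixed input vectors and adapt toward w_true."""
    inputs = [[1 + 0j, 0j], [0j, 1 + 0j], [1 + 0j, 1j]]
    w = [0j, 0j]
    for _cycle in range(cycles):
        for u in inputs:
            # Noiseless desired signal produced by the assumed true weights.
            d = sum(wt.conjugate() * ui for wt, ui in zip(w_true, u))
            w, _eps = lms_step(w, u, d, mu)
    return w


if __name__ == "__main__":
    print(run_lms_demo([0.5 - 0.2j, -0.3 + 0.1j]))
```

In this noiseless setting the weight error obeys $\mathbf{e}_{k+1} = (\mathbf{I} - 2\mu\,\mathbf{u}_k\mathbf{u}_k^H)\mathbf{e}_k$, which is the second form of (74) with the driving term removed, so the weights contract toward the assumed true vector.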

Taking $\mu$ times the negative gradient of $\hat{\rho}_k$ with respect to $\mathbf{w}_k$, and adding it to $\mathbf{w}_k$ in the manner of Chestek [40]:

\[ \mathbf{w}_{k+1} = \big(\mathbf{I} - \gamma\,\mathbf{B}\mathbf{B}^T\big)\mathbf{w}_k + 2\mu\,\epsilon_k^*\,\mathbf{u}_k = \Big(\mathbf{I} - 2\mu\big(\mathbf{u}_k \mathbf{u}_k^H + \tfrac{\gamma}{2\mu}\,\mathbf{B}\mathbf{B}^T\big)\Big)\mathbf{w}_k + 2\mu\, d_k^*\,\mathbf{u}_k. \tag{77} \]

From (77), one immediately obtains the equality

\[ \gamma = 2\mu\, c\, \sigma_s^2 \sigma_g^2. \tag{78} \]

The reader is warned that in general, the optimum $c$ is a complicated function of the system parameters (cf. Section IV-B).

The constant $\gamma$ is generally close to zero and must satisfy $\gamma > 0$.^6 This new algorithm will be called the GSC leaky LMS algorithm, after Widrow and Stearns' [2] leaky LMS algorithm

\[ \mathbf{w}_{k+1} = (1 - \gamma)\,\mathbf{w}_k + 2\mu\,\epsilon_k^*\,\mathbf{u}_k = \Big(\mathbf{I} - 2\mu\big(\mathbf{u}_k \mathbf{u}_k^H + \tfrac{\gamma}{2\mu}\,\mathbf{I}\big)\Big)\mathbf{w}_k + 2\mu\, d_k^*\,\mathbf{u}_k \tag{79} \]

with $(1 - \gamma)$ termed the leakage parameter. When the leakage parameter is unity, leaky LMS simplifies to complex LMS, as given by (74). Widrow and Stearns' [2] analysis of leaky LMS can with minor modifications be used to show that the algorithm (77) has the effect of biasing the Wiener weight vector toward

\[ \mathbf{w}_i^* = \Big(\mathbf{R}_{uu} + \tfrac{\gamma}{2\mu}\,\mathbf{B}\mathbf{B}^T\Big)^{-1} \mathbf{r}_{ud} \tag{80} \]

which is just like having equivalent receiver noise power $\sigma_e^2$ (cf. (21)) of

\[ \sigma_e^2 = \sigma_n^2 + \sigma_i^2 \tag{81} \]

where the injected receiver noise power $\sigma_i^2$ is simply

\[ \sigma_i^2 = \frac{\gamma}{2\mu}. \tag{82} \]

The term $(\gamma/2\mu)\,\mathbf{B}\mathbf{B}^T$ in (80) is known as artificially injected receiver noise, because it is the same as the covariance of white noise after transformation by the signal blocking matrix. It results from solving the modified unconstrained optimization problem (66), and is not due to any noise sources introduced into the adaptation circuitry.

Since the GSC leaky LMS algorithm is a special case of Chestek's [40] soft-constrained LMS algorithm, his results can be used to guarantee the convergence of both the mean weight vector and MSE by choosing $\mu$ in accordance with

\[ 0 < \mu < \frac{1}{3\,\mathrm{tr}(\mathbf{R}_{uu}) + \tfrac{\gamma}{2\mu}\,\mathrm{tr}(\mathbf{B}\mathbf{B}^T)} \tag{83} \]

where $\mathrm{tr}(\cdot)$ denotes matrix trace [37], defined as the sum of the diagonal elements, and also equal to the sum of the eigenvalues. The bound in (83) is easily calculable in an on-line implementation, since $\mathrm{tr}(\mathbf{R}_{uu})$ is just the total input power to the sidelobe cancelling branch, and $\gamma$, $\mu$, and $\mathbf{B}$ are all specified by the designer.

The advantage of using an LMS-type algorithm in the form (77) over other adaptive schemes is that the computation per iteration required to update the weights is only on the order of $K$ complex multiplications, since $\mathbf{B}\mathbf{B}^T$ is generally band-diagonal (for some choices of $\mathbf{B}$ it is tridiagonal). Sometimes one wishes to estimate $\mathbf{R}_{uu}$ by other more computationally intensive means, such as in the sample matrix inversion (SMI) algorithm considered by Reed et al. [3]. Then artificial receiver noise can still be injected by means of the artificial noise covariance matrix estimation algorithm

\[ \hat{\mathbf{R}}_{uu,i} = \hat{\mathbf{R}}_{uu} + \sigma_i^2\,\mathbf{B}\mathbf{B}^T \tag{84} \]

where $\hat{\mathbf{R}}_{uu}$ is the data-dependent estimate of $\mathbf{R}_{uu}$, and $\hat{\mathbf{R}}_{uu,i}$ is the data-dependent estimate of $\mathbf{R}_{uu}$, modified by the artificial noise injection.

6 Strictly speaking, there is nothing that prevents the algorithm (77) from being used in the reverse mode, meaning $\gamma < 0$. When used this way, the effect is to artificially subtract receiver noise, rather than add it. However, if one is not careful, some eigenvalues of ($\mathbf{R}_{uu}$ + ...
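A single iteration of the GSC leaky LMS recursion (77) can be sketched as follows. This is an illustrative sketch, not the author's implementation: the blocking matrix is assumed here to be a first-difference matrix (rows of the form [1, -1, 0, ...]), for which $\mathbf{B}\mathbf{B}^T$ is tridiagonal, and the function names and the symbol `gamma` for the leak constant are choices made for this example.

```python
def blocking_gram(num_weights):
    """B B^T for an assumed first-difference blocking matrix B.

    With rows [1, -1, 0, ...], B B^T is tridiagonal: 2 on the diagonal
    and -1 on the first off-diagonals.
    """
    return [
        [2.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0)
         for j in range(num_weights)]
        for i in range(num_weights)
    ]


def gsc_leaky_lms_step(w, u, d, mu, gamma, bbt):
    """One update w <- (I - gamma*B*B^T) w + 2*mu*conj(eps)*u, with eps = d - w^H u."""
    n = len(w)
    y = sum(w[i].conjugate() * u[i] for i in range(n))
    eps = d - y
    # Leak term (B B^T) w; for a band-diagonal B B^T only a few entries per row are nonzero.
    leak = [sum(bbt[i][j] * w[j] for j in range(n)) for i in range(n)]
    return [w[i] - gamma * leak[i] + 2.0 * mu * eps.conjugate() * u[i] for i in range(n)]


if __name__ == "__main__":
    bbt = blocking_gram(2)
    # With gamma = 0 the update reduces to ordinary complex LMS.
    print(gsc_leaky_lms_step([0j, 0j], [1 + 0j, 1j], 1 + 0j, 0.1, 0.0, bbt))
    # With zero input, the leak alone shrinks the weights.
    print(gsc_leaky_lms_step([1 + 0j, 1 + 0j], [0j, 0j], 0j, 0.1, 0.01, bbt))
```

The dense inner sum is used only for clarity; exploiting the band structure of $\mathbf{B}\mathbf{B}^T$ is what keeps the per-iteration cost on the order of $K$ complex multiplications, as the text notes.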

B. Optimum Injected Noise Power

This subsection will present an expression for the optimum artificially injected receiver noise power in the Capon beamformer, along with a "Wiener simulation" to verify that the artificial receiver noise injection strategy works. In order to use the algorithm (77), $\sigma_i^2$ must first be determined. If $\sigma_i^2$ is chosen to be too small, the adaptive weights will hardly be affected and the signal will continue to be nulled. On the other hand, if $\sigma_i^2$ is chosen to be too large, the adaptive weights will all be driven toward zero and the jammer will not be nulled. In fact, as $\sigma_i^2$ tends toward infinity, the performance of the GSC approaches that of a conventional beamformer.

In [35] an expression is derived for the "optimum" artificially injected receiver noise power in the Capon beamformer, symbolized by $\sigma_{i,\mathrm{opt}}^2$. "Optimum" here means that the Wiener output SINR is maximized. The major assumptions in the derivation are that the Wiener output signal power is the same as if there were no jammer present, the Wiener output jammer power is the same as if there were no signal present, the Wiener output receiver noise power is the same as if there were no jammer present, the jammer is outside the main lobe of the unadapted beam pattern, and the leakage of the signal into the sidelobe cancelling branch is small compared to the equivalent receiver noise. Using these five major assumptions, and a few more minor ones, an approximate expression for $\sigma_{i,\mathrm{opt}}^2$ is obtained. (85)
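The trade-off described above, in which too much injected noise drives the adaptive weights toward zero, can be seen in a small numerical sketch of a biased Wiener solution of the form $(\mathbf{R}_{uu} + \sigma_i^2 \mathbf{B}\mathbf{B}^T)^{-1}\mathbf{r}_{ud}$. Everything numeric here is hypothetical: the 2-by-2 covariance, the cross-covariance vector, and the tridiagonal $\mathbf{B}\mathbf{B}^T$ are made-up values chosen only to show the trend, and the function names are invented.

```python
def solve_2x2(a, b):
    """Solve a 2x2 complex linear system a x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    x0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - a[1][0] * b[0]) / det
    return [x0, x1]


def biased_wiener_weights(r_uu, r_ud, sigma_i_sq, bbt):
    """Weights (R_uu + sigma_i^2 * B B^T)^{-1} r_ud for a two-weight example."""
    loaded = [[r_uu[i][j] + sigma_i_sq * bbt[i][j] for j in range(2)] for i in range(2)]
    return solve_2x2(loaded, r_ud)


def injected_noise_sweep(levels=(0.0, 1.0, 100.0, 1.0e4)):
    """Return the Euclidean norm of the biased weights for each injected noise level."""
    r_uu = [[2.0 + 0j, 0.5 + 0.5j], [0.5 - 0.5j, 1.0 + 0j]]  # hypothetical Hermitian covariance
    r_ud = [1.0 + 0j, 0.5j]                                   # hypothetical cross-covariance
    bbt = [[2.0, -1.0], [-1.0, 2.0]]                          # assumed tridiagonal B B^T
    norms = []
    for level in levels:
        w = biased_wiener_weights(r_uu, r_ud, level, bbt)
        norms.append((abs(w[0]) ** 2 + abs(w[1]) ** 2) ** 0.5)
    return norms


if __name__ == "__main__":
    for level, norm in zip((0.0, 1.0, 100.0, 1.0e4), injected_noise_sweep()):
        print("sigma_i^2 = %8.1f  ->  |w| = %.6f" % (level, norm))
```

For very large injected noise the weight norm falls off roughly as $1/\sigma_i^2$, which matches the remark that the GSC then behaves like a conventional (nonadaptive) beamformer.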

TABLE II
OPTIMUM ARTIFICIALLY INJECTED RECEIVER NOISE POWER

                                                                        $\sigma_{i,\mathrm{opt}}^2/\sigma_n^2$ (dB)
 K    d/λ   θ_s (deg)   θ_j (deg)   SNR_i (dB)   INR_i (dB)   σ_g² (dB)   Formula   Simulation
10    0.5       0          17           10           50          -40         18         18
10    0.5       0          17           20           50          -40         23         23
10    0.5       0          17           30           50          -40         28         28
10    0.5       0          17           30           40          -40         26         25
10    0.5       0          17           30           30          -40         23         23
10    0.5       0          17           30           50          -20         33         34
10    0.5       0          17           30           50          -30         31         31
10    0.5       0          17           30           50          -50         26         25
10    0.5       0          30           30           50          -40         29         29
10    0.5       0          44           30           50          -40         30         30

Table II shows that this formula is accurate to within ±1 dB when the imperfections are due to equal variance, independent, zero-mean Gaussian amplitude and phase errors that are angle independent, and a ten-element equally spaced line array with $d/\lambda = 0.5$, a broadside look direction, and receiver noise power of $\sigma_n^2 = -30$ dBW are considered. The data in this table were obtained by ensemble averaging 50 curves of Wiener output SINR versus $\sigma_i^2$, and then measuring the two values of $\sigma_i^2$ where the SINR was down 3 dB from its maximum. $\sigma_{i,\mathrm{opt}}^2$ was then selected as being in between these two "3 dB points." This procedure relies on an approximation that should be excellent for $\sigma_i^2 \gg \sigma_n^2$. The measurement error for the simulation data is ±1 dB. The ±1 dB accuracy for the formula should be good enough for most applications, since a plot of the Wiener output SINR versus $\sigma_i^2/\sigma_n^2$ has a fairly broad 3 dB width. For the cases in the table, this width was measured to be 36, 26, 16, 12, 10, 9, 13, 23, 18, and 20 dB, respectively. In addition, the loss in SINR from its value when ideal elements were used was very small. More precisely, for the cases in the table this loss was measured to be only 0, 0, 0, 1, 1, 2, 1, 0, 0, and 0 dB, respectively. Therefore, one does not need to pick $\sigma_i^2$ very accurately to obtain close to optimum performance, and secondly, a good choice of $\sigma_i^2$ will essentially restore the output signal-to-interference-plus-noise ratio to its value in the case of an ideal array.

In practice, although it is pleasing that an optimum exists, one does not have the a priori information needed to compute it. There are two obvious ways to deal with this situation. The first is to use an on-line search procedure to determine $\sigma_{i,\mathrm{opt}}^2$. Hudson [17] describes this type of procedure as having "overall speed comparable to matrix inversion," which means computation proportional to the cube of the number of antenna elements. The second way to deal with the lack of a priori knowledge is to either guess or substitute worst-case estimates for the unknown parameters in (85). Then it is hoped that the robustness of $\sigma_{i,\mathrm{opt}}^2$ will permit this procedure to yield acceptable results. An example is provided in [35].

In order to demonstrate how effective artificial receiver noise injection can be, the procedure will be applied to the imperfect array of Section III-D, which exhibited hypersensitivity to random element misplacement having a standard deviation equal to only 2 percent of the ideal interelement spacing. All parameters in this example are identical to those used in Section III-D except for the strategy of artificial receiver noise injection. Use of (85) suggests $\sigma_{i,\mathrm{opt}}^2/\sigma_n^2 = 32.2$ dB, which was rounded up slightly to $\sigma_i^2/\sigma_n^2 = 35$ dB in order to check the effect of an inadvertent 3 dB error when applying the formula. Fig. 12 shows the resulting far-field directivity pattern of this adaptive beamformer, using the latter amount of artificial receiver noise injection. Comparing to Fig. 11, it is seen that the signal nulling problem within the main lobe is now completely eliminated, while at the same time sacrificing less than 10 dB of jammer nulling. From the viewpoint of Wiener filter theory, this result verifies that artificial receiver noise injection alleviates the signal nulling problem without seriously compromising jammer nulling.

C. Data Simulation Examples

"Data simulations" to support the analysis performed earlier can be found in [35]. These data simulations compared the conventional LMS algorithm with the GSC leaky LMS algorithm. The basic finding was that use of the GSC leaky LMS algorithm was effective in solving the signal nulling problem, and had the additional bonus of eliminating weight pathology due to finite precision effects. However, the number of time samples required for convergence was several times greater for the GSC leaky LMS algorithm.

V. EXTENSION OF RESULTS

The purpose of this section is to explain how the results of this paper can be extended to the two important cases of a multiple jammer signal environment and wide-band adaptive array processing.

A. Multiple Jammers

The only results in this paper which are totally independent of the number of jammers are the new leaky algorithms (77) and (84). All other results can be extended to the case of multiple jammers either by analytically inverting the more complicated expression for a multiple jammer autocovariance matrix, or by simply solving for the autocovariance matrix inverse numerically. The inverse of the autocovariance matrix is then multiplied by the crosscovariance vector to obtain the Wiener weight vector, after which it is straightforward to calculate any other quantities based on $\mathbf{w}^*$. If $\mathbf{R}_{uu}$ is inverted analytically, certain approximations may become critical for obtaining tractable results, such as assuming that all jammers are large compared to the signal.

Of course, another method of predicting performance in a multiple jammer environment is simply to write the data simulation software in such a way that the temporal and spatial characteristics of more than one jammer can be entered as input variables. This procedure was carried out by the author to confirm that even in a multiple jammer environment, linearly constrained adaptive beamformers still null the signal as long as array imperfections are present, no special remedies are used, and at least one extra degree of freedom is available.
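The numerical route for multiple jammers mentioned above, solving with the autocovariance matrix rather than inverting it analytically, can be sketched as follows. The scenario is hypothetical: an ideal four-element half-wavelength line array in element space, two jammers at made-up angles and powers, unit receiver noise, and a broadside steering vector standing in for the cross-covariance vector. The helper names are invented for this example.

```python
import cmath
import math


def steering(theta_deg, num_elements, spacing_wavelengths=0.5):
    """Steering vector of an ideal equally spaced line array (assumed geometry)."""
    psi = 2.0 * math.pi * spacing_wavelengths * math.sin(math.radians(theta_deg))
    return [cmath.exp(1j * n * psi) for n in range(num_elements)]


def covariance(jammers, noise_power, num_elements):
    """R = noise_power * I + sum over jammers of power * a a^H."""
    r = [[(noise_power if i == j else 0.0) + 0j for j in range(num_elements)]
         for i in range(num_elements)]
    for theta_deg, power in jammers:
        a = steering(theta_deg, num_elements)
        for i in range(num_elements):
            for j in range(num_elements):
                r[i][j] += power * a[i] * a[j].conjugate()
    return r


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (complex entries)."""
    n = len(b)
    m = [list(a[i]) + [b[i]] for i in range(n)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0j] * n
    for row in range(n - 1, -1, -1):
        acc = m[row][n] - sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = acc / m[row][row]
    return x


def response(w, theta_deg):
    """Magnitude of w^H a(theta)."""
    a = steering(theta_deg, len(w))
    return abs(sum(wi.conjugate() * ai for wi, ai in zip(w, a)))


if __name__ == "__main__":
    jammers = [(17.0, 1.0e5), (-40.0, 1.0e4)]  # (angle in deg, power); hypothetical values
    r = covariance(jammers, 1.0, 4)
    w = solve(r, steering(0.0, 4))
    print("look response :", response(w, 0.0))
    print("jammer 1      :", response(w, 17.0))
    print("jammer 2      :", response(w, -40.0))
```

Solving the linear system directly avoids forming the inverse explicitly, and adding a third jammer only means appending another (angle, power) pair, which is the flexibility the text attributes to the numerical approach.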

[Figure omitted. Horizontal axis: far-field angle, $\theta$ (deg). Plot annotations: look-direction signal; jammer; $K = 10$, $d/\lambda = 1/2$, SNR$_i$ = 30 dB, INR$_i$ = 50 dB, $\sigma_i^2/\sigma_n^2 = 35$ dB, $\sigma_r/\lambda = 0.01$.]

Fig. 12. Single jammer far-field directivity pattern of Capon beamformer implemented with imperfect array, when approximately 3 dB more than the optimum amount of artificial receiver noise is injected.

B. Wide-Band Adaptive Array Processing

The extension to the wide-band case is straightforward, assuming that wide-band adaptive noise cancelling techniques are now used, which give the optimal Wiener weightings as a function of frequency. Although these optimal Wiener weightings are "ideal, based on the assumption of an infinitely long, two-sided (noncausal) adaptive transversal filter," Widrow et al. [2] showed that their performance could be closely approximated by using all-zero filters. Gooch and Shynk [41] recently demonstrated the potential for even better synthesis of the Wiener weightings by using pole-zero filters. When applying the results of this paper to the wide-band case, one only needs to keep track of the change in wavelength $\lambda$ as a function of frequency (since it affects both presteering and any possible random element misplacement), and the signal, jammer, noise, and imperfection power and variance parameters (the $\sigma^2$ terms and $|\alpha_j|^2$) must all be interpreted as functions of frequency. These conditions mean that the array imperfections must be viewed as frequency-dependent, and at some frequencies certain assumptions may no longer hold.

VI. CONCLUSION

This paper tackled the problem of hypersensitivity of linearly constrained adaptive beamforming to array imperfections for "high" input signal-to-noise ratio, by considering a particularly simple and general structure known as the generalized sidelobe canceller. The aforementioned hypersensitivity manifests itself as nulling of the friendly look-direction signal as if it were a jammer. Modeling the array imperfections as random element amplitude and phase errors constant during the period of adaptation, the hypersensitivity phenomenon was discussed in detail using Wiener filter theory to analyze steady state behavior, and computer simulations to check the results. Artificial receiver noise injection algorithms were derived for the generalized sidelobe canceller, and simulations were carried out to demonstrate their ability to provide the beamformer with robustness to array imperfections. For the special case of the Capon maximum-likelihood beamformer, simple approximations were presented for the Wiener output signal-to-interference-plus-noise ratio, the random element gain (amplitude and phase) error variance which leads to a 3 dB degradation in this Wiener output signal-to-interference-plus-noise ratio from its value when an ideal array is assumed, and the optimal amount of artificially injected receiver noise. Suggestions for how the theory could be extended to the two important cases of multiple jammers and wide-band adaptive array processing were discussed. Ideas for further investigation can be found in [35].

ACKNOWLEDGMENT

The author is grateful to his principal advisor, Dr. Bernard Widrow, for suggesting "adaptive beamforming with imperfect arrays" as a Ph.D. thesis topic, and for his supervision during the course of this research. The experience of working with Dr. Arogyaswami Paulraj, who served as associate advisor, was equally rewarding.

REFERENCES

[1] J. Capon, R. J. Greenfield, and R. J. Kolker, "Multidimensional maximum-likelihood processing of a large aperture seismic array," Proc. IEEE, vol. 55, no. 2, pp. 192-211, Feb. 1967.
[2] B. Widrow and S. D. Stearns, Adaptive Signal Processing. Prentice-Hall, 1985.
[3] I. S. Reed, J. D. Mallett, and L. E. Brennan, "Rapid convergence rate in adaptive arrays," IEEE Trans. Aerosp. Electron. Syst., vol. AES-10, no. 6, pp. 853-863, Nov. 1974.
[4] S. P. Applebaum and D. J. Chapman, "Adaptive arrays with main beam constraints," IEEE Trans. Antennas Propagat., vol. AP-24, no. 5, pp. 650-662, Sept. 1976.
[5] L. J. Griffiths, "An adaptive beamformer which implements constraints using an auxiliary array preprocessor," in Aspects of Signal Processing, pt. 2, G. Tacconi, Ed. Dordrecht, Holland: Reidel, 1977, pp. 517-522.
[6] L. J. Griffiths and C. W. Jim, "An alternative approach to linearly constrained adaptive beamforming," IEEE Trans. Antennas Propagat., vol. AP-30, no. 1, pp. 27-34, Jan. 1982.
[7] O. L. Frost, III, "An algorithm for linearly constrained adaptive array processing," Proc. IEEE, vol. 60, no. 8, pp. 926-935, Aug. 1972.
[8] C. W. Jim and L. J. Griffiths, "Random gain and phase error effects in optimal array structures," Dept. Elec. Eng., Univ. Colorado, Boulder, Tech. Rep. EE 77-2, Sept. 1977.
[9] L. J. Griffiths and C. W. Jim, "A generalized sidelobe cancelling structure for adaptive arrays," Signal Processing Lab., Dept. Elec. Eng., Univ. Colorado, Boulder, Tech. Rep. SPL 78-2, Nov. 1978.
[10] C. L. Zahm, "Effects of errors in the direction of incidence on the performance of an adaptive array," Proc. IEEE, vol. 60, no. 8, pp. 1008-1009, Aug. 1972.
[11] H. Cox, "Resolving power and sensitivity to mismatch of optimum array processors," J. Acoust. Soc. Am., vol. 54, no. 3, pp. 772-785, 1973.
[12] K. Takao, H. Fujita, and T. Nishi, "An adaptive array under directional constraint," IEEE Trans. Antennas Propagat., vol. AP-24, no. 5, pp. 662-669, Sept. 1976.
[13] A. M. Vural, "A comparative performance study of adaptive array processors," in IEEE ICASSP'77 Rec., May 1977, pp. 695-700.
[14] A. M. Vural, "Effects of perturbations on the performance of optimum/adaptive arrays," IEEE Trans. Aerosp. Electron. Syst., vol. AES-15, no. 1, pp. 76-87, Jan. 1979.
[15] J. T. Mayhan, "Some techniques for evaluating the bandwidth characteristics of adaptive nulling systems," IEEE Trans. Antennas Propagat., vol. AP-27, no. 3, pp. 363-373, May 1979.
[16] R. A. Monzingo and T. W. Miller, Introduction to Adaptive Arrays. New York: Wiley, 1980, ch. 11, pp. 461-475.
[17] J. E. Hudson, Adaptive Array Principles. New York: Peter Peregrinus and IEEE, 1981, chs. 4 and 6, appendix 4, pp. 113-121, 155-194, and pp. 248-249.
[18] R. T. Compton, Jr., "Multiplier offset voltages in adaptive arrays," IEEE Trans. Aerosp. Electron. Syst., vol. AES-12, no. 5, pp. 616-627, Sept. 1976.
[19] R. T. Compton, Jr., "Pointing accuracy and dynamic range in a steered beam adaptive array," IEEE Trans. Aerosp. Electron. Syst., vol. AES-16, no. 3, pp. 280-287, May 1980.
[20] R. T. Compton, Jr., "The effect of random steering vector errors in the Applebaum adaptive array," IEEE Trans. Aerosp. Electron. Syst., vol. AES-18, no. 5, pp. 392-400, Sept. 1982.
[21] Y. Bar-Ness, "Steered beam and LMS interference canceler comparison," IEEE Trans. Aerosp. Electron. Syst., vol. AES-19, no. 1, pp. 30-39, Jan. 1983.
[22] I. J. Gupta and A. A. Ksienski, "Effect of mutual coupling on the performance of adaptive arrays," IEEE Trans. Antennas Propagat., vol. AP-31, no. 5, pp. 785-791, Sept. 1983.
[23] L. C. Godara, "The effect of phase-shifter errors on the performance of an antenna-array beamformer," IEEE J. Ocean. Eng., vol. OE-10, no. 3, pp. 278-284, July 1985.
[24] C. L. Zahm, "Application of adaptive arrays to suppress strong jammers in the presence of weak signals," IEEE Trans. Aerosp. Electron. Syst., vol. AES-9, no. 2, pp. 260-271, Mar. 1973.
[25] W. D. White, "Artificial noise in adaptive arrays," IEEE Trans. Aerosp. Electron. Syst., vol. AES-14, no. 2, pp. 380-384, Mar. 1978.
[26] J. G. Charitat, Jr., "The effects of error in the adaptive antenna reference," Proc. IEEE, vol. 70, no. 9, pp. 1128-1129, Sept. 1982.
[27] J. G. Charitat, Jr., "An algorithm for adaptive antennas and superresolution systems with faulty steering vectors," IEEE Trans. Antennas Propagat., vol. AP-34, no. 3, Mar. 1986.
[28] B. Widrow and J. M. McCool, "A comparison of adaptive algorithms based on the methods of steepest descent and random search," unpublished manuscript.
[29] J. M. McCool, "A constrained adaptive beamformer tolerant of array gain and phase errors," in Aspects of Signal Processing, pt. 2, G. Tacconi, Ed. Dordrecht, Holland: Reidel, 1977, pp. 517-522.
[30] M. H. Er and A. Cantoni, "Derivative constraints for broad-band element space antenna array processors," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-31, no. 6, pp. 1378-1393, Dec. 1983.
[31] M. H. Er and A. Cantoni, "A new approach to the design of broad-band element space antenna array processors," IEEE J. Ocean. Eng., vol. OE-10, no. 3, pp. 231-240, July 1985.
[32] K. M. Ahmed and R. J. Evans, "An adaptive array processor with robustness and broad-band capabilities," IEEE Trans. Antennas Propagat., vol. AP-32, no. 9, pp. 944-950, Sept. 1984.
[33] R. T. Compton, Jr., "An adaptive array in a spread-spectrum communication system," Proc. IEEE, vol. 66, no. 3, pp. 289-298, Mar. 1978.
[34] N. K. Jablon, "Steady state analysis of the generalized sidelobe canceller by adaptive noise cancelling techniques," IEEE Trans. Antennas Propagat., vol. AP-34, no. 3, pp. 330-338, Mar. 1986.
[35] N. K. Jablon, "Adaptive beamforming with imperfect arrays," Ph.D. dissertation, Elec. Eng. Dept., Stanford Univ., Stanford, CA, Aug. 1985.
[36] A. Papoulis, Probability, Random Variables, and Stochastic Processes. New York: McGraw-Hill, 1965, ch. 9, p. 303.
[37] T. Kailath, Linear Systems. Englewood Cliffs, NJ: Prentice-Hall, 1980, Appendix, pp. 655 and 658.
[38] H. E. Schrank, "Low sidelobe phased array antennas," IEEE Antennas Propagat. Soc. Newsletter, vol. 25, no. 2, pp. 5-9, Apr. 1983.
[39] D. G. Luenberger, Introduction to Linear and Nonlinear Programming, 2nd ed. Reading, MA: Addison-Wesley, 1984, ch. 7, pp. 221-222.
[40] R. A. Chestek, "The addition of soft constraints to the LMS algorithm," Ph.D. dissertation, Elec. Eng. Dept., Stanford Univ., Stanford, CA, May 1979, chs. 5, 8, and 9, pp. 13, 19, and 30.
[41] R. P. Gooch and J. J. Shynk, "Wide-band adaptive array processing using pole-zero digital filters," IEEE Trans. Antennas Propagat., vol. AP-34, no. 3, pp. 355-368, Mar. 1986.

An Efficient Algorithm and Systolic Architecture for Multiple Channel Adaptive Filtering STANLEY M. YUEN, KENNETH ABEND,

SENIOR MEMBER, IEEE, AND

A bstracl-A multiple input-multiple output orthogonalization algorithm and its efficient systolic implementation are presented. The processing architecture is developed using a basic two input-two output decorrelation processing element (PE) as the primitive building block. Its features are discussed and compared to the recently published approach based on the well-known modified Gram-Schmidt (MGS) orthogonalizalion procedure.

' A

I. INTRODUCTION

DAPTIVE FILTERING has been a subject of intense research since the early work of the stochastic gradient or the least mean square (LMS) algorithm of Widrow [1], [2]. The virtue of the LMS algorithm lies in its computational simplicity. However. it suffers slow initial convergence and thus poor adaptivity in a rapidly time-varying environment. This drawback has served as motivation for deriving other methods of adaptive filtering which can provide faster cc nvergence and are not so sensitive to the signal statistics. One method of obtaining faster convergence is to adopt an exact least squares (LS) approach rather than the statistical approach. The family of recursive least squares (RLS) algorithms represents one group of techniques which are theoretically less sensitive to the statistical propenies of the data [31-[6]. There are basically two important differences between the gradient-based type of algorithms and the RLS (a.so known as Kalman) family. First of all. Kalman-type algorithms minimize an exact error criterion constructed from the actual input data . in contrast to the statistical error criterion for the gradient-based methods. Secondly, the error criterion is satisfied at every point in time for the Kalman family of algorithms. whereas the error criterion is achieved at convergence or steady state for the gradient -based techniques, as in the case of the LMS algorithm. Although the Kalman-type algorithms possess attractive convergence property, they have two major drawbacks. One is their large computational complexity and the second is their sensitivity to round-off noise. The latter may cause an algorithm to become unstable after a large number of iterations. To be more specific, the Kalman-type algorithms require O(N2) operations per time update for computing an Nth order filter, whereas only O(N) operations are needed for the LMS algorithm. To remedy the problem of round-off noise, a Manuscript received February 12. 1987: revised September 1~ 1987. 
S. M. Yuen is with the Electronic Systems Department, RCA Government Electronic Systems Division. Moorestown, NJ 08057. K. Abend and R. S. Berkowitz are with the Department of Electrical Engineering. University of Pennsylvania. Philadelphia, PA L9104. IEEE Log Number 8819825.

RAYMOND S. BERKOWITZ,

FELLOW, IEEE

family of algorithms based on orthogonal transformations can be used. They include the Givens, Householder, and modified G~am-Schmidt (MGS) transformations. These algorithms deal WIth data matrices with condition numbers equal to the square root of the condition number of the input signal covariance matrix. The condition number of a matrix of interest is defined as the ratio of its largest and smallest nonzero singular values and has an interpretation of being an error magnification factor. Consequently, these orthogonalization-based algorithms are less sensitive to round-off noise. In many advanced signal processing applications, the use of regularly structured processing is considered as the most feasible approach to obtain real time performance. The development of versatile processing nodes by sophisticated very large-scale integration (VLSI) design can lead to a new generation of adaptive processors which can achieve real time throughput rate as well as flexibility. As a result, researchers have investigated various implementation aspects of orthogo~alization-based algorithms in the context of parallel processmg. For example, time-recursive versions of the Givens transformation and the modified Gram-Schmidt algorithm have been developed and discussed in the context of systolic array implementation [7]-[10]. The MGS approach, in particular, h~s received a tremendous amount of attention in many radar SIgnal processing applications. Besides having a modular and regular processing architecture, the MGS algorithm possesses both time and order recursive properties [10]. Furthermore, it has been shown to yield good performance simultaneously in arithmetic efficiency, stability, and convergence times [11], [12]. 
The MGS procedure has been considered in the literature as an orthogonalization preprocessor for the LMS algorithm [13], as a linear predictor for temporal input [14], as a sidelobe canceller [15], and for clutter rejection in a nonstationary radar environment [16], [17]. More recently, the MGS orthogonalization algorithm and its corresponding triangular processing architecture have been generalized for efficient multiple channel adaptive filtering [18], [19]. The purpose of this paper is to introduce an alternative orthogonalization algorithm which results in a more efficient architecture for filtering applications in which there are as many output channels as there are input channels. One example is adaptive pulse Doppler processing in radar. For completeness, the basic theory and the architecture proposed by Gerlach [19] based on the MGS procedure are reviewed in Section II. We then systematically develop the new alternative algorithm and the corresponding efficient processing structure in Section III, starting with a simple two

Reprinted from IEEE Transactions on Antennas and Propagation, Vol. 36, No. 5, pp. 629-635, May 1988.


input-two output decorrelation processing element as the primitive building block. Finally, the differences between the two approaches are discussed and future work relevant to the new algorithm and processing architecture is addressed in Section IV.

II. REVIEW OF BASIC THEORIES

The LS Problem

The LS problem is known by different names in different scientific disciplines. In the IEEE literature, the solution of the LS problem is associated with a number of equations and vector space concepts. The purpose of this section is to review briefly the essential equations and fundamental concepts, and to demonstrate that they are indeed interchangeable in the interpretation of the LS solution. Assume that we have an N channel system with the measurement vector at a given time instant k represented as

$$X(k) = [x_1(k), x_2(k), \ldots, x_N(k)]' = [x_1(k), y'(k)]', \qquad k = 0, 1, \ldots, K. \tag{1}$$

We desire to determine a weight vector W which minimizes the sum of squared errors defined as

$$\epsilon(K) = \sum_{k=0}^{K} e^2(k) = \sum_{k=0}^{K} \left( x_1(k) - y'(k) W \right)^2 \tag{2}$$

where $W = [W_2, W_3, \ldots, W_N]'$. The minimization of (2) is equivalent to the solution of the LS problem of minimizing the Euclidean length

$$\| X_1 - YW \| \tag{3}$$

where

$$X_1 = \begin{pmatrix} x_1(0) \\ x_1(1) \\ \vdots \\ x_1(K) \end{pmatrix} \quad \text{and} \quad Y = \begin{pmatrix} x_2(0) & x_3(0) & \cdots & x_N(0) \\ x_2(1) & x_3(1) & \cdots & x_N(1) \\ \vdots & \vdots & & \vdots \\ x_2(K) & x_3(K) & \cdots & x_N(K) \end{pmatrix}.$$

It is well known that the solution W satisfies the condition

$$Y'(X_1 - YW) = 0. \tag{4}$$

The vector $X_1 - YW$ is called the residual vector and is orthogonal to the columns of Y. Hence we can compute W from the normal equations

$$Y'YW = Y'X_1. \tag{5}$$

Equation (5) is often called the LS estimator and is akin to the Wiener-Hopf equation derived using the criterion of least mean squares error [20]. Equation (4) can also be written explicitly as follows:

$$\begin{aligned}
\sum_{k=0}^{K} x_2(k)x_1(k) - W_2 \sum_{k=0}^{K} x_2(k)x_2(k) - \cdots - W_N \sum_{k=0}^{K} x_2(k)x_N(k) &= 0 \\
\sum_{k=0}^{K} x_3(k)x_1(k) - W_2 \sum_{k=0}^{K} x_3(k)x_2(k) - \cdots - W_N \sum_{k=0}^{K} x_3(k)x_N(k) &= 0 \\
&\;\;\vdots \\
\sum_{k=0}^{K} x_N(k)x_1(k) - W_2 \sum_{k=0}^{K} x_N(k)x_2(k) - \cdots - W_N \sum_{k=0}^{K} x_N(k)x_N(k) &= 0.
\end{aligned} \tag{6}$$

Since $X_1$ and $X_1 - YW$ are not orthogonal, their inner product can be expressed as

$$\sum_{k=0}^{K} x_1(k)x_1(k) - W_2 \sum_{k=0}^{K} x_1(k)x_2(k) - \cdots - W_N \sum_{k=0}^{K} x_1(k)x_N(k) = \mu \tag{7}$$

where $\mu$ is a nonzero quantity. Next we combine (6) and (7) so that (4) is replaced with the matrix equation

$$R_{xx} w = \begin{pmatrix} \mu \\ 0 \\ \vdots \\ 0 \end{pmatrix} \tag{8}$$

where $w = [1, -W']'$ and $R_{xx}$ is the N x N sample covariance matrix of the input channels calculated based on K + 1 observations.

According to the theory of signal-to-noise (S/N) optimization in the field of adaptive arrays, the optimum weight vector $W_{opt}$ is the value of w that satisfies

$$R_{xx} w = \mu S^*. \tag{9}$$

$S = [S_1, S_2, \ldots, S_N]'$ is often called the steering vector and $\mu$ in this case can be an arbitrary constant. Equation (9) is known as the Applebaum maximum signal-to-noise criterion [21]. In a linear array antenna with equally spaced elements, the components of S are determined by the direction of the desired signal. Although $\mu$ in (9) can be arbitrary, $\mu$ in (8) is not, and is chosen so that the first element of w in (8) is a one. The key point in the derivation of (9) is the application of the Cauchy-Schwarz inequality. The similarity between (8) and (9) is obvious. Although the approaches to deriving (8) and (9) are somewhat different, it is clear that they provide the same solution of the LS problem. In fact, (8) can be considered as a special case of (9) with $S = [1, 0, \ldots, 0]'$. An alternative interpretation of (8) is that it is obtained by transforming the input signals of (9) such that the effective steering vector in the transformed signal space has the same simple form $S = [1, 0, \ldots, 0]'$.

Multiple Channel Adaptive Filtering Using MGS

The direct implementation of (8) corresponds to the inversion of the covariance matrix. However, it is well known that problems occur in the solution of the weights if $R_{xx}$ is ill-conditioned. A better approach is the use of the MGS orthogonalization technique, which has been reported to have good numerical properties [22], [23]. Its processing architecture is shown in Fig. 2, using the simple decorrelation processor (DP) of Fig. 1 as the building block.

Fig. 1. Decorrelation processor [14].

Fig. 2. Modified Gram-Schmidt N-channel decorrelator [15].

To understand the operation of a single DP, we consider two channels of complex valued input data: $y_1$ and $y_2$. The objective here is to form an output channel $y_1'$ which is decorrelated with $y_2$. This is equivalent to setting

$$\overline{y_1' y_2^*} = 0 \tag{10}$$

where the overbar and asterisk denote the time average and the complex conjugate, respectively. We can also express

$$y_1' = y_1 - l_{2,1}\, y_2 \tag{11}$$

and $l_{2,1}$ is calculated so that (10) is satisfied. It is easy to see that

$$l_{2,1} = \frac{\overline{y_1 y_2^*}}{\overline{|y_2|^2}}. \tag{12}$$

In an actual application, a finite number of samples would be taken for each input channel, thus (12) is estimated as

$$l_{2,1} = \frac{\sum_{k=0}^{K} y_1(k)\, y_2^*(k)}{\sum_{k=0}^{K} |y_2(k)|^2}. \tag{13}$$
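The single DP and its level-by-level cascade in Fig. 2 can be illustrated with a short batch-mode sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`correlation`, `dp`, `mgs_decorrelate`) and the sample data are hypothetical, the weight is the finite-sample ratio of correlation sums, and the cascade always uses the last remaining channel as the reference at each level.

```python
def correlation(a, b):
    """Sample correlation: sum of a(k) * conj(b(k)) over the available samples."""
    return sum(x * y.conjugate() for x, y in zip(a, b))


def dp(y1, y2):
    """Decorrelation processor: remove from y1 the part correlated with y2."""
    weight = correlation(y1, y2) / correlation(y2, y2)  # finite-sample weight
    return [a - weight * b for a, b in zip(y1, y2)]


def mgs_decorrelate(channels):
    """Cascade DPs level by level; returns channel 1 decorrelated with all others."""
    work = [list(c) for c in channels]
    while len(work) > 1:
        reference = work.pop()  # last remaining channel drives this level
        work = [dp(c, reference) for c in work]
    return work[0]


# Hypothetical test data: three channels, five complex samples each.
x1 = [1 + 2j, 0.5 - 1j, -1 + 0.5j, 2 + 0j, -0.5 - 0.5j]
x2 = [0.3 + 1j, 1 - 0.2j, -0.7 + 0.1j, 0.9 + 0.4j, 0.2 - 1j]
x3 = [1 - 1j, -0.4 + 0.8j, 0.6 + 0.6j, -1 + 0.2j, 0.1 + 0.3j]

output = mgs_decorrelate([x1, x2, x3])
residuals = [abs(correlation(output, x)) for x in (x2, x3)]
print(residuals)  # both values should be at rounding-error level
```

The residual correlations with the second and third inputs vanish up to rounding, which is the "totally decorrelated" property claimed for the final output of the cascade.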

In Fig. 2, $X_N$ is decorrelated with $X_1, X_2, \ldots, X_{N-1}$ in the first level of the processing structure. In the second level, the output channel which results from decorrelating $X_N$ with $X_{N-1}$ is decorrelated with the other outputs of the first level of DP's, and the process continues as indicated in the figure. At the end, a final output is generated and it is totally decorrelated with the input $X_2, X_3, \ldots, X_N$. It is clear that the MGS decorrelation (orthogonalization) procedure is not unique with respect to the order in which $X_2, X_3, \ldots, X_N$ are decorrelated from $X_1$. In multiple input-multiple output adaptive filtering, there can be as many output channels as input channels. Specifically, given there are N input channels, it might be desirable to generate N output channels so that each one of the N output channels is totally decorrelated with the rest of the N - 1 input channels. For example, this concept can be applied to radar Doppler processing in which a bank of filters is used to cover the entire Doppler band, and each Doppler subband is processed such that it is totally decorrelated with the rest of the other subbands. Mathematically, this corresponds to solving the matrix equation

$$R_{xx} V = I \tag{14}$$

where V is the optimal weighting matrix and I is simply an N x N identity matrix. The S/N associated with each of the output channels is maximized by the corresponding column vector of V, and the nth output channel has a desired signal vector

$$(0, 0, \ldots, 0, 1, 0, \ldots, 0)$$

with the one in the nth position. Based on the fact that there is no logic behind the ordering of the input channels in the MGS procedure, Gerlach was able to develop an efficient processing architecture for multiple channel adaptive filtering [15]. The design is illustrated in Fig. 3 for the case of eight input and eight output channels. The key point in the design is that arithmetic efficiency is achieved by taking advantage of computational redundancies and substructure sharing that can occur for different output channels.

III. DERIVATION OF THE NEW MULTIPLE CHANNEL ORTHOGONALIZATION ARCHITECTURE

Using the basic decorrelation processor of Fig. 1, it is possible to configure other orthogonal networks. To derive a more regular architecture for multiple channel adaptive filtering, we begin by considering possible modifications at the primitive or building block level. An intuitive approach of achieving structural compactness is to employ the two input-two output processing element (PE) of Fig. 4 as the primitive building block. The only difference between the DP and the PE is the second orthogonal output associated with the PE building block. This second output, however, requires a smaller number of arithmetic operations than the only output of the DP. This is especially true in the case of batch processing in which $E\{y_j^*(t) y_i(t)\}$ and $E\{y_i^*(t) y_j(t)\}$ are obtained by time averaging. Once one of the two expectation operations is estimated by summing N time samples, the other is easily obtained by taking the conjugate. As N becomes large, the use of the PE as the basic building block would result in improved arithmetic efficiency.

Fig. 3. Complete realization of an eight-channel decorrelation network [15].

Fig. 4. Two input-two output building block. For inputs $y_i$ and $y_j$ the two outputs are

$$y_i - \frac{E(y_j^* y_i)}{E(|y_j|^2)}\, y_j \qquad \text{and} \qquad y_j - \frac{E(y_i^* y_j)}{E(|y_i|^2)}\, y_i.$$

With the PE primitive defined, we then proceed to construct an orthogonalization processing network. One structure which naturally takes advantage of the symmetry property of the PE is the tree-like network of Fig. 5. An eight input channel structure is illustrated in this example. The extension to an arbitrary number of input channels is obvious. Furthermore, the use of eight input channels also provides a one-to-one comparison between the newly derived tree-like architecture and the architecture based on MGS discussed in the previous section. The numbering system $N_1(N_2, N_3, \ldots)$ used at the output of each PE gives a clear picture of the orthogonalization procedure carried out by the tree-like processing architecture. The notation implies that the $N_1$th output channel is totally decorrelated with the input: $N_2, N_3, \ldots$ etc. In the first row or level of decomposition, $X_1$ is decorrelated with $X_2$ (and vice versa), $X_2$ is decorrelated with $X_3$ (and vice versa), and so on. Next, the output channel which results from decorrelating $X_1$ with $X_2$ is decorrelated with the proper output channel which results from decorrelating $X_3$ with $X_2$. The decorrelation process continues as seen in Fig. 5 until two final output channels are generated. It is important to emphasize that the two inputs to any given PE must be compatible, i.e., the set of input channel indices enclosed by the two pairs of parentheses must be identical. Thus the first channel $X_1$ and the last channel $X_8$ are totally decorrelated with the input set $(X_2, X_3, \ldots, X_8)$ and $(X_1, X_2, \ldots, X_7)$, respectively, in Fig. 5.

The processing architecture of Fig. 5 generates two of the N desired output channels. In the case of multiple input-multiple output adaptive filtering, the remaining N - 2 channels can be efficiently generated using the technique of "sliding-window" substructure sharing as shown in Fig. 6. Four windows in our example of eight input channels correspond to the following four ordered input sequences:

Fig. 5. Tree-like orthogonalization network.

Fig. 6. "Sliding window" substructure sharing.
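The two input-two output PE of Fig. 4 and the tree-like network of Fig. 5 can be sketched in batch mode as follows. This is a minimal illustration under assumptions, not the authors' implementation: the names `inner`, `pe` and `tree_network` and the random test data are hypothetical, expectations are replaced by sums over the available samples, and each level pairs the output "channel i decorrelated with the channels between" with the output "channel i+m decorrelated with the same channels", which is the compatibility condition stated above.

```python
import random


def inner(a, b):
    """Sample correlation: sum of a(k) * conj(b(k))."""
    return sum(x * y.conjugate() for x, y in zip(a, b))


def pe(yi, yj):
    """Two input-two output processing element.

    Only one cross-correlation is summed; the other is its conjugate.
    """
    cross = inner(yi, yj)
    w_i = cross / inner(yj, yj)               # weight applied to y_j
    w_j = cross.conjugate() / inner(yi, yi)   # weight applied to y_i
    out_i = [a - w_i * b for a, b in zip(yi, yj)]
    out_j = [b - w_j * a for a, b in zip(yi, yj)]
    return out_i, out_j


def tree_network(channels):
    """Return (first channel decorrelated with all others,
               last channel decorrelated with all others)."""
    fwd = [list(c) for c in channels]  # fwd[i]: channel i, cleaned of i+1 .. i+m
    bwd = [list(c) for c in channels]  # bwd[i]: channel i+m, cleaned of i .. i+m-1
    while len(fwd) > 1:
        pairs = [pe(fwd[i], bwd[i + 1]) for i in range(len(fwd) - 1)]
        fwd = [p[0] for p in pairs]
        bwd = [p[1] for p in pairs]
    return fwd[0], bwd[0]


# Hypothetical data: 4 channels, 16 complex samples each, fixed seed.
rng = random.Random(0)
channels = [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(16)]
            for _ in range(4)]

first, last = tree_network(channels)
leak_first = max(abs(inner(first, c)) for c in channels[1:])
leak_last = max(abs(inner(last, c)) for c in channels[:-1])
print(leak_first, leak_last)  # both should be at rounding-error level
```

For N channels this sketch uses N - 1 levels with one fewer PE per level, which matches the triangular shape of a single window; the sharing between windows described in the text is not modeled here.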

The first window generates the eighth and the first decorrelated output channels, the second window generates the second and the third decorrelated output channels, and so forth. In other words, the two output channels generated correspond to the first and the last elements of the ordered input sequence associated with a given window. The concept of substructure sharing using the "sliding window" approach is a useful tool in reducing the total number of PE's as well as the total number of arithmetic operations. Besides the "sliding window" technique, we can achieve further sharing of substructures by exploiting one other structural symmetry in Fig. 6. This symmetry is illustrated by the two dashed triangles enclosing two identical substructures. As a result, one of the two substructures can be completely eliminated, and the final processing architecture requires $N^2 - 3N/2$ ($N^2 - (3N - 1)/2$) PE's if N is even (odd), where N is the number of input channels. The final design is given in Fig. 7. In the case of batch processing, we see that as the data are processed through a given row, the input data may be discarded and the output data become the new input data set for the next row of PE's. Hence the two-dimensional structure of Fig. 6 can be collapsed to just a single row of PE's using simple feedback as shown in Fig. 8. Fig. 7 illustrates that as outputs leave at one side of the parallelogram structure they enter at the other side. Thus we can imagine that Fig. 7 and Fig. 8 represent a cylindrical systolic architecture and a simple ring structure, respectively, in three dimensions.

Fig. 7. Efficient orthogonalization architecture for multiple channel adaptive filtering.

Fig. 8. Hardware compaction for batch processing.

IV. SUMMARY AND FUTURE RESEARCH

The main features of the newly developed multiple channel orthogonalization architecture are summarized as follows.

1) In contrast to the architecture based on MGS orthogonalization, it requires no broadcasting of data and any given processing node in the structure only communicates with its neighboring nodes in a pipelining fashion. Hence the design is "purely" systolic.

2) In terms of the total number of arithmetic operations, it is at least as efficient as the MGS approach. A detailed comparison will be given in a future paper.

3) The new architecture is developed in a very systematic and bottom-up fashion, starting with a simple two input-two output decorrelation processing element as the building block.

4) It is an extremely regular and compact processing structure. This is particularly true for batch processing, since in this case the original two-dimensional systolic array can be collapsed into a linear array with just N processing elements, where N is the number of input channels.

5) No unscrambling of the output channels is needed. The MGS approach, on the other hand, requires a commutation algorithm so that the final output channels are properly aligned with the input channels.

6) The technique based on the MGS approach is most efficient for $2^m$ input channels, where m is a positive integer. The architecture presented in this paper, however, places no restriction on the number of input channels.

In the field of adaptive filtering and estimation using VLSI parallel processing, many problems still exist. We are currently investigating the following research topics relevant to the new multiple channel adaptive filtering technique:

1) This paper focuses on the development of an efficient and compact processing architecture for multiple channel adaptive filtering based on the concept of orthogonalization. For simplicity of illustration in the development, batch processing is emphasized. The time-recursive version is yet to be investigated in detail. One advantage of the time-recursive version of the algorithm is that it only has a latency of N computing cycles for the first output to be generated, whereas $N \cdot N_s$ cycles are required in the case of batch processing. N and $N_s$ correspond to the number of processing channels and the number of time samples used for decorrelation, respectively.

2) In real time applications, divisions take more time than multiplications, since most of the special hardware components are optimized to perform multiplication and addition. It is desirable to modify the new algorithm and architecture so that the number of divisions is minimized.

3) It has been reported that the MGS algorithm possesses good numerical properties. It is important to examine the numerical properties of the new algorithm and compare them with those of the MGS algorithm.

4) The application of geometrical vector space concepts for deriving the rapidly converging recursive least squares adaptive filters is well known. Although the new algorithm derived in this paper is done from an architectural perspective, it is worthwhile to rederive the new algorithm using the geometrical approach.
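The hardware and latency figures quoted in Sections III and IV reduce to simple arithmetic. The sketch below assumes the PE count of Section III is read as N^2 - 3N/2 for even N and N^2 - (3N - 1)/2 for odd N, and that latency is counted in computing cycles as in item 1) above; the function names are hypothetical.

```python
def pe_count(n_channels):
    """Number of PEs in the final architecture (assumed reading of Section III)."""
    if n_channels % 2 == 0:
        return n_channels ** 2 - (3 * n_channels) // 2
    return n_channels ** 2 - (3 * n_channels - 1) // 2


def batch_latency(n_channels, n_samples):
    """Cycles before the first output in batch mode: N * N_s."""
    return n_channels * n_samples


def recursive_latency(n_channels):
    """Cycles before the first output in the time-recursive version: N."""
    return n_channels


print(pe_count(8))             # 52 PEs for the eight-channel example
print(batch_latency(8, 100))   # 800 cycles, assuming 100 time samples
print(recursive_latency(8))    # 8 cycles
```

Under this reading the count equals N - 2 full rows of N PEs plus a last row of N/2 PEs (rounded up for odd N), which is consistent with the four windows used for eight channels.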

REFERENCES

[1] B. Widrow and M. E. Hoff, "Adaptive switching circuits," in 1960 WESCON Conv. Rec., pt. 4, pp. 96-140.
[2] B. Widrow et al., "Stationary and nonstationary learning characteristics of the LMS adaptive filter," Proc. IEEE, vol. 64, pp. 1156-1162, Aug. 1976.
[3] Lee et al., "Recursive least squares ladder estimation algorithms," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-29, pp. 627-641, June 1981.
[4] J. M. Cioffi and T. Kailath, "Fast, first-order, least squares algorithms for adaptive filtering," IEEE Proc. ICASSP '83, Boston, MA, Apr. 1983, pp. 679-682.
[5] F. Long and J. G. Proakis, "A generalized multichannel least squares lattice algorithm with sequential processing stages," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-32, no. 2, pp. 381-389, Apr. 1984.
[6] J. D. Pack and E. H. Satorius, "Least squares adaptive lattice algorithms," NOSC Tech. Rep. TR423, Apr. 1979.
[7] J. McWhirter, "Recursive least-squares minimization using a systolic array," SPIE paper 431-15, 1983.
[8] H. T. Kung and W. M. Gentleman, "Matrix triangularization by systolic arrays," Proc. SPIE, vol. 298, 1981.
[9] S. Y. Kung, "VLSI array processors," IEEE Acoust., Speech, Signal Processing Mag., vol. 2, pp. 4-22, July 1985.
[10] F. Ling et al., "A recursive modified Gram-Schmidt algorithm for least-squares estimation," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, pp. 829-835, Aug. 1986.
[11] B. Friedlander, "Lattice filters for adaptive processing," Proc. IEEE, vol. 70, no. 8, pp. 829-867, Aug. 1982.
[12] I. S. Reed, J. D. Mallet, and L. E. Brennan, "Rapid convergence rate in adaptive arrays," IEEE Trans. Acoust., Speech, Signal Processing, vol. AES-10, pp. 853-863, Nov. 1974.
[13] R. A. Monzingo and T. W. Miller, Introduction to Adaptive Arrays. New York: Wiley, 1980, ch. 9.
[14] N. Ahmed and D. H. Youn, "On a realization and related algorithm for adaptive prediction," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-28, pp. 493-497, Oct. 1980.
[15] F. F. Kretschmer and B. L. Lewis, "A digital open-loop adaptive processor," IEEE Trans. Acoust., Speech, Signal Processing, vol. AES-14, pp. 165-170, Jan. 1978.
[16] F. F. Kretschmer, B. L. Lewis, and F. L. C. Lin, "Adaptive MTI and doppler filter bank clutter processing," in Proc. IEEE 1984 Nat. Radar Conf., Atlanta, GA, Mar. 1984.
[17] A. Farina and F. A. Studer, "Application of Gram-Schmidt algorithm to optimum radar signal processing," Proc. Inst. Elec. Eng., vol. 131, pt. F, no. 2, Apr. 1984.
[18] K. Gerlach, "Multiple channel adaptive filtering using a fast orthogonalization network: an application to efficient pulsed doppler radar processing," NRL Rep. 8840, 1984.
[19] K. Gerlach, "Fast orthogonalization networks," IEEE Trans. Antennas Propagat., vol. AP-34, no. 3, pp. 458-462, Mar. 1986.
[20] S. Haykin, "Nonlinear methods of spectral analysis," Topics in Appl. Phys., vol. 34, 1983.
[21] S. P. Applebaum, "Adaptive arrays," IEEE Trans. Antennas Propagat., vol. AP-24, no. 5, Sept. 1976.
[22] C. L. Lawson and R. J. Hanson, Solving Least Squares Problems. Englewood Cliffs, NJ: Prentice-Hall, 1974.
[23] A. Bjorck, "Solving linear least squares problems by Gram-Schmidt orthogonalization," BIT, vol. 7, pp. 1-21, Jan. 1967.

Mutual Coupling Compensation in Small Array Antennas

HANS STEYSKAL, MEMBER, IEEE, AND JEFFREY S. HERD, MEMBER, IEEE

Abstract-A technique to compensate for mutual coupling in a small array is developed and experimentally verified. Mathematically, the compensation consists of a matrix multiplication performed on the received signal vector. This, in effect, restores the signals as received by the isolated elements in the absence of mutual coupling. The technique is most practical for digital beamforming antennas where the matrix operation can be readily implemented.

INTRODUCTION

The radiation pattern of an array of identical antenna elements is usually taken to be the product of an element factor and an array factor, based on the presumption that all elements have equal radiation patterns. Unfortunately, this may not be true for a practical array, where, due to mutual coupling, each element "sees" a different environment. The nature of the error thus incurred can be displayed by expressing the individual array element pattern $f_n(u)$ as the sum of one average array element pattern $f_a(u)$ and a pattern deviation $\delta f_n(u)$, which leads to the total array pattern

$$F(u) = \sum_n a_n f_n(u) e^{jnkdu} = f_a(u) \sum_n a_n e^{jnkdu} + \sum_n a_n\, \delta f_n(u)\, e^{jnkdu}. \tag{1}$$

Here $a_n = |a_n| \exp(j\phi_n)$ denotes the complex element weight, k the wavenumber, d the uniform element spacing and u the sine of the angle $\theta$ from broadside, respectively. The first term on the right side of (1) represents the idealized pattern, and the second represents the error. One effect of this error pattern is to introduce a noise floor that precludes synthesis of high-quality patterns with very low sidelobes or deterministic pattern nulls. Other effects appear in signal processing arrays, such as adaptive or superresolution systems, which can be extremely sensitive to small errors due to the nonlinear processing involved. Since real-life signal processing arrays usually are comparatively small arrays, where element pattern differences are relatively large, this is a significant problem.

It is clear from (1) that the element coefficients $\{a_n\}$ always can be chosen such as to compensate for the pattern error at one particular angle. It is less obvious that the error normally can be corrected for all angles simultaneously. Furthermore, since this correction is scan independent, it also applies in the case of electronic scanning. It is the purpose of this communication to discuss such a technique and to present some experimental results. The key to the technique is an alternative formulation to (1), which recognizes that 1) any composite array pattern can be considered as a weighted sum of the isolated element patterns and 2) the effect of mutual coupling is simply to parasitically excite all elements, even though only one element is driven. Thus, by driving the array with modified element excitations, such that the desired array aperture distribution is obtained in the presence of these parasitics, the mutual coupling can be compensated for. This compensation principle has been reported for a slot array [1] and dipole arrays [2]-[4]. The former is the only one that considers the case of scanning and presents some experimental data; all four rely on computed coupling coefficients. The present study differs in that it rephrases the approach for the receiver mode, appropriate for a digital beamforming antenna where the technique is most practical, and it describes an alternative method to determine the mutual coupling coefficients that does not require analytically simple or reciprocal array elements. It also presents experimental data for a scanned waveguide array.

Manuscript received November 29, 1989; revised June 12, 1990. The authors are with the Electromagnetics Directorate, Rome Air Development Center, Hanscom AFB, MA 01731. IEEE Log Number 9038579.

THEORY

We consider an array of single-mode elements, meaning that the element aperture currents (electric or magnetic) may change in amplitude but not in shape, as a function of radiation direction. In the receive mode, the signal at the output of the individual antenna element has several constituents: a dominant one due to the direct incident plane wave, and several lesser ones due to scattering of the incident wave at neighboring elements. As depicted in Fig. 1, we can write the received signal at element m as

$$v_m(u) = c_{mm} E_m f^i(u) + \sum_{n,\, n \neq m} c_{mn} E_n f^i(u). \tag{2}$$

Fig. 1. The received signal $v_m$ at element m consists of a directly transmitted and several scattered components.

The incident field $E_m$ at element m impresses an aperture current amplitude $E_m f^i(u)$, where $f^i(u)$ is the isolated element pattern, i.e., the pattern of the current mode assumed in the element aperture. This aperture current will produce an element output voltage $c_{mm} E_m f^i(u)$, where $c_{mm}$ denotes the coupling from the aperture to the output transmission line. The effect of the neighboring elements is described similarly, with $c_{mn}$ denoting the coupling of aperture mode n to element output m. From a mathematical point of view, (2) simply expresses the linear relationship between the aperture excitations and the element output voltages. The physical meaning of the $c_{mn}$ will be discussed below. We introduce the notation

$$E_n f^i(u) = v_n^d(u) \tag{3}$$

since this represents the desired, coupling-unperturbed signal received by the single element at the aperture. Thus for our uniformly spaced array of identical elements

$$E_0 e^{jnkdu} f^i(u) = v_n^d(u) \tag{3a}$$

where $E_0$ is the amplitude of the plane wave incident from direction u.
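Because (2) is linear in the unperturbed signals of (3a), the effect of coupling and its removal can be checked numerically. The sketch below is an illustration under assumptions rather than the authors' procedure: the 3-element coupling matrix `C` is hypothetical, the isolated element pattern is taken as unity, the spacing of 0.517 wavelengths is borrowed from the experimental array described later, and the helper names (`solve`, `unperturbed`, `received`) are invented for the example. Solving the linear system plays the role of multiplying by the inverse coupling matrix.

```python
import cmath
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def unperturbed(n_elem, kd, u, e0=1.0):
    """Desired signals E0 * exp(j n k d u), with the element pattern set to 1."""
    return [e0 * cmath.exp(1j * n * kd * u) for n in range(n_elem)]


def received(c, vd):
    """Coupling-perturbed outputs: each is a linear combination of the vd."""
    return [sum(c_mn * v_n for c_mn, v_n in zip(row, vd)) for row in c]


# Hypothetical coupling matrix for a 3-element array.
C = [[1.0, 0.2 + 0.1j, 0.05j],
     [0.2 + 0.1j, 1.0, 0.2 + 0.1j],
     [0.05j, 0.2 + 0.1j, 1.0]]
KD = 2 * math.pi * 0.517  # k d for a spacing of 0.517 wavelengths

worst = 0.0
for u in (-0.8, -0.3, 0.0, 0.5, 0.9):       # several arrival directions
    vd = unperturbed(3, KD, u)
    v = received(C, vd)
    restored = solve(C, v)                   # same C for every direction
    worst = max(worst, max(abs(x - y) for x, y in zip(restored, vd)))
print(worst)  # should be at rounding-error level
```

The same matrix restores the signals for every direction tried, which illustrates why the compensation does not depend on the scan angle under the single-mode assumption.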

Reprinted from IEEE Transactions on Antennas and Propagation, Vol. 38, No. 12, pp. 1971-1975, December 1990. U.S. Government work not protected by U.S. Copyright.


Substituting (3) in (2) leads to

$$v = C v^d. \tag{4}$$

On the left side, the vector v represents the coupling-perturbed signals $\{v_n\}$ at the element output ports, which via the coupling matrix C is related to the vector $v^d$, representing the unperturbed desired signals $\{v_n^d\}$. Thus compensation for the mutual coupling can be accomplished by simply multiplying the received signal v by the inverse coupling matrix $C^{-1}$,

$$v^d = C^{-1} v. \tag{5}$$

This concept is depicted in Fig. 2, where a network corresponding to $C^{-1}$ is attached to the array antenna. Note that the coupling compensation is scan independent, i.e., the same matrix $C^{-1}$ applies universally for all directions of the incoming wave, as a consequence of our single-mode assumption. Multimode elements, as considered in [2]-[4], would require a scan dependent coupling compensation. When the received and compensated signals $v^d$ are weighted and summed in the conventional beamforming network, shown in Fig. 2, we obtain the array pattern F(u), defined as the ratio of the output voltage and the incident wave amplitude $E_0$,

$$F(u) = \frac{1}{E_0} \sum_n a_n v_n^d = f^i(u) \sum_n a_n e^{jnkdu}. \tag{6}$$

The array pattern (6) now has the desired form of a product of an element factor and an array factor. A comparison with (1) shows that, with the transformation performed, we have succeeded in dissolving the error pattern, the second term on the right side of (1). The matrix $C^{-1}$ may be difficult or impractical to realize by an analog network, but it can be readily realized in a digital beamforming antenna system. It then allows all subsequent beamforming operations to be performed with ideal element signals, such as are usually assumed in pattern synthesis.

DETERMINATION OF THE MUTUAL COUPLING COEFFICIENTS

There appear to be two different methods to determine the coupling coefficients: one by Fourier decomposition of the measured array element patterns and another by coupling measurements between the array ports. The former requires driving the antenna only in one mode, either transmit or receive, and thus applies to nonreciprocal antenna systems. The latter requires driving each element in both modes and therefore is less practical, as discussed below.

Fig. 2. Illustration of coupling compensation and beam forming in an array antenna. Interelement coupling at the array face, represented by $(c_{mn})$, leads to received signals $v_n$ at array element outputs that are linear combinations of the desired, coupling-unperturbed signals $v_n^d$. Multiplication by $(c_{mn})^{-1}$ restores these signals, which are then weighted and summed to form the desired beam.

Fig. 3. Illustration of the scattering matrices S and S' of the array. Line sections between aperture plane AA' and terminal plane BB' are matched and reciprocal, with transmission coefficients $t_n$.

In the Fourier decomposition method we measure the complex voltage patterns $g_m(u)$ of the elements in their array environment,

$$g_m(u) = f^i(u) \sum_n c_{mn} e^{jnkdu}, \tag{7}$$

cf. (2), and, recognizing that the $c_{mn}$ are the Fourier coefficients of these patterns, determine these coefficients numerically according to

$$c_{mn} = \frac{kd}{2\pi} \int_{-\pi/kd}^{\pi/kd} \frac{g_m(u)}{f^i(u)}\, e^{-jnkdu}\, du. \tag{8}$$

In order to do this, $f^i(u)$ must not have a null in the integration interval. However, since the isolated element pattern normally is very wide, this is no serious limitation. Another restriction on (8) is that the element spacing be larger than $\lambda/2$. Otherwise the integration interval extends beyond visible space, i.e., beyond the interval -1 < u < 1 where $g_m(u)$ and $f^i(u)$ are known. For the case of element spacings $d < \lambda/2$ we can still perform a spectral analysis of

$$\frac{g_m(u)}{f^i(u)} = \sum_n c_{mn} e^{jnkdu} \tag{9}$$

to determine the coefficients $c_{mn}$, but the convenient orthogonality of the harmonic functions is lost and accuracy becomes a major issue. An advantage of this method is that it does not require reciprocal antenna elements. Thus, it is applicable to receive-only arrays, such as used for digital beamforming, where the element includes an entire microwave receiver. Furthermore, any channel imbalances, i.e., differences in insertion amplitude and phase between the element aperture and the element output terminal, manifest themselves in the self-terms $c_{mm}$ and are also compensated for. In this sense the technique is similar to a conventional array calibration.

In the second method, the matrix C is obtained from the related scattering matrix $S = (s_{mn})$ of the array. This relation is developed


here for the simplest case of a waveguide array fed by matched generators. For the general case the relation is complicated and not very useful. We consider a uniformly spaced array of waveguide elements, shown in Fig. 3, and determine the array element pattern of element m. This element is excited with a wave of amplitude $a_m$; all other elements are passive. Assuming a reference plane AA' for the antenna element terminals that coincides with the element apertures, the aperture voltages thus are

(10)

where the Kronecker delta $\delta_{mn} = 1$ for n = m, and $\delta_{mn} = 0$ otherwise. The radiated far field

complicated than pattern measurements with Fourier decompositions, in reality often is the less practical method.

EXPERIMENTS

The coupling compensation technique outlined in the preceding section was applied to an eight-element linear array of X-band rectangular waveguides in a ground plane. Each element was in turn a column of 8 rectangular waveguides in a common H-plane, combined via a fixed 1:8 power divider. The array axis thus was parallel to the E-plane and in this plane the element spacing d = 1.25 cm = 0.517 $\lambda$. The isolated element pattern corresponds to a normalized uniform aperture distribution

$$f^i(u) = \frac{\sin\left(\frac{kl}{2} u\right)}{\frac{kl}{2} u} \tag{17}$$

where $r_n$ is the distance of element n to the observation point and the usual far-field approximations have been made. Comparing this expression with (7) and requiring that the transmit and receive patterns are identical shows that

apart from a constant factor of no interest. Thus

$$C = I + S \tag{12}$$

where I denotes the identity matrix. In real arrays, the scattering matrix cannot normally be measured directly at the element apertures, as assumed above, but only from a reference plane a certain distance behind the apertures. A more realistic case therefore is as shown in Fig. 3, where sections of transmission line are included between the apertures and the reference plane BB', from which the modified scattering matrix S' is measured. These feed lines have different insertion phase and loss. However, for simplicity, we still assume them to be matched and reciprocal, so that they can be characterized by single transmission coefficients $t_m$. Defining a diagonal matrix T, (13)

where l is the interior waveguide height, in our case l = 1.02 cm = 0.417 λ. The complex voltage patterns $g_m(u)$ of the array elements were measured under matched load conditions and recorded at 1/2-degree intervals with a digital receiver. The coupling coefficients $c_{mn}$ were then numerically evaluated according to (8) and the inverse matrix $C^{-1}$ was computed. In a second, similar measurement, the received voltages $v_m(u)$ were again recorded. Then, in an off-line simulation of a digital beamforming system, they were multiplied with $C^{-1}$ for coupling compensation, and amplitude and phase weighted for pattern shaping and scanning, as shown in Fig. 2. Examples of element patterns for a central and an edge element are shown in Fig. 4. We note that there is indeed a considerable difference in shape, which is attributable to mutual coupling effects, and also in overall power level, which is mainly due to a difference in feed line losses. In Figs. 5(a) and 5(b) we show synthesized 30-dB Chebyshev patterns as obtained without and with the mutual coupling compensation. Apparently the compensation technique gives about 10 dB improvement in sidelobe level, with the result that the actual pattern is quite close to the theoretical one. The remaining difference indicates that the array excitation tolerance errors equal about -35 dB in amplitude and 1° in phase. Figs. 6(a) and 6(b) show the same 30-dB Chebyshev patterns scanned to -30°. Without compensation the sidelobe level is still

it is easy to show that

$$S' = TST \tag{14}$$

and, from (4) and (12), that the received signals at plane BB' are (15)

The modified coupling matrix C' at plane BB' thus is

$$C' = T + S'T^{-1} \tag{16}$$

which shows that in this case we need to measure not only the scattering matrix S' but also the transmission coefficients $\{t_m\}$. This may or may not be possible, depending on the design of the actual array. Clearly for the general case, where the feed lines between the element apertures and output terminals are not matched, the required measurements become still more extensive. Thus, measurement of the network parameters, which intuitively would seem less

Fig. 4. Measured pattern magnitudes $|g_m(u)|$ for center (solid line) and edge (dashed line) elements of the eight-element array, plotted against angle (deg).

Fig. 5. 30-dB Chebyshev pattern (a) without and (b) with coupling compensation. Scan angle 0°. [Plots of measured and theoretical patterns versus angle (deg).]

limited to about -20 dB. When we apply the compensation, using the same $C^{-1}$ matrix multiply as for the broadside pattern, we again reduce the sidelobe level by about 10 dB and closely reproduce the desired pattern.

Fig. 6. 30-dB Chebyshev pattern (a) without and (b) with coupling compensation. Scan angle -30°. [Plots of measured and theoretical patterns versus angle (deg).]

CONCLUSION

We have developed and experimentally verified a technique to compensate for mutual coupling in an array. The technique should be helpful primarily for small arrays, where the array element patterns differ significantly due to edge effects. In large arrays, where each element sees essentially the same environment, these effects become negligible. Mathematically, the compensation consists of a matrix multiplication performed on the received signal vector. This, in effect, restores the signals as received by the isolated elements in the absence of coupling. An attractive feature is that this matrix is fixed and thus is valid for all desired pattern shapes and scan directions. Although it may be difficult to realize in analog form, it can be readily implemented in a digital beamforming antenna system.
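The matrix multiplication described above can be illustrated with a minimal sketch. It assumes a hypothetical three-element array with an invented scattering matrix S (not measured data), forms C = I + S as in (12), and recovers the uncoupled signals by solving the linear system, which is equivalent to multiplying by $C^{-1}$. The helper names `solve` and `matvec` are illustrative only.

```python
import cmath
import math

def solve(A, b):
    """Solve A x = b for a small complex system (Gauss-Jordan, partial pivoting)."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

def matvec(A, x):
    """Matrix-vector product for lists of complex numbers."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Hypothetical scattering matrix (invented values, for illustration only).
S = [[0.05 + 0.02j, 0.20 - 0.10j, 0.05 + 0.03j],
     [0.20 - 0.10j, 0.04 + 0.01j, 0.20 - 0.10j],
     [0.05 + 0.03j, 0.20 - 0.10j, 0.05 + 0.02j]]
N = len(S)
# Coupling matrix C = I + S, as in (12).
C = [[(1.0 if i == j else 0.0) + S[i][j] for j in range(N)] for i in range(N)]

# Assumed ideal element signals for a unit plane wave at 30 degrees,
# with element spacing 0.517 wavelengths; element patterns are ignored.
u = math.sin(math.radians(30.0))
v_ideal = [cmath.exp(2j * math.pi * 0.517 * n * u) for n in range(N)]

v_coupled = matvec(C, v_ideal)        # what the coupled array delivers
v_compensated = solve(C, v_coupled)   # same result as C^{-1} times v_coupled

coupling_error = max(abs(c - i) for c, i in zip(v_coupled, v_ideal))
residual = max(abs(c - i) for c, i in zip(v_compensated, v_ideal))
```

Because C does not depend on the arrival angle, the same solve step applies to every scan direction, which is the property the conclusion highlights.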


REFERENCES

[1] T. Chiba, "On the realization of the synthesized pattern of scanning arrays," presented at Int. Conf. on Radar, Paris, France, Dec. 1978.
[2] B. Strait and K. Hirasawa, "Array design for a specified pattern by matrix methods," IEEE Trans. Antennas Propagat., vol. AP-17, Mar. 1969.
[3] N. Inagaki and K. Nagai, "Exact design of an array of dipole antennas giving the prescribed radiation patterns," IEEE Trans. Antennas Propagat., vol. AP-18, Jan. 1971.
[4] Y-W. Kang and D. Pozar, "Correction of error in reduced sidelobe synthesis due to mutual coupling," IEEE Trans. Antennas Propagat., vol. AP-33, Sept. 1985.

A Unified Approach to the Design of Robust Narrow-Band Antenna Array Processors

MENG HWA ER, MEMBER, IEEE, AND ANTONIO CANTONI, SENIOR MEMBER, IEEE

Abstract-A unified approach to the design of robust narrow-band antenna array processors is presented. The approach is based on the idea of minimizing the weighted mean-square-deviation between the desired response and the response of the processor over variations in parameters. Three specific examples of robust design are considered: robustness against directional mismatch, robustness against array geometry errors, and robustness against channel phase errors. Initially a general quadratic constraint on the weights is developed. However, it is then shown that the quadratic constraint can be replaced by linear constraints or at most linear constraints plus norm constraint. This latter set of constraints is no more complex than those required for designs which do not incorporate robustness features explicitly. Numerical results show that the proposed approach appears to offer a unified treatment for directly designing narrow-band processors which are robust against various types of errors and mismatches between signal model and actual scenario.

I. INTRODUCTION

OPTIMUM ANTENNA ARRAY processing with multiple linear constraints is now a well-known technique. In the simplest case, a single constraint is imposed, namely unity gain response in the look direction; the weight vector is then calculated by minimizing the array processor output power subject to this constraint. Under actual operating conditions, the assumptions of plane wave signals and an ideal propagation medium do not hold, and signal suppression can arise from causes such as beam steer angle errors, phase errors in the receivers, multipath propagation and wavefront distortion. Hence, the study and development of robust antenna array processors has long been an important topic of research. The studies of the effects of these errors on the performance of narrow-band array processors have been reported extensively in the literature [1]-[17]. To combat the problems due to beamsteer angle errors, the use of multiple linear constraints [3]-[6], [8], [22] and nonlinear norm-type constraints [8], [20], [23], [24] has been described in the literature. To combat the adverse effects due to sensor amplitude and phase errors, the use of norm constraints on the weight vector has been considered in [8], [18], [21].

The purpose of this paper is to present a unified approach to the design of robust narrow-band antenna array processors. The approach is based on the idea of minimizing the weighted mean-square-deviation between the desired response and the response of the beamformer over variation in parameters. Three specific types of robust design are considered: robustness against directional mismatch, robustness against array geometry errors, and robustness against phase errors. Initially a general quadratic constraint on the weights is developed. However, it is then shown that the quadratic constraint can be replaced by linear constraints or at most linear constraints plus norm constraint. This latter set of constraints is no more complex than those required for designs not explicitly incorporating robustness features.

The paper is organized as follows. In Section II, the complex notation for representing a signal incident on the array for narrow-band processing is introduced. In Section III, the formulation of optimum beamformers with robustness capabilities against various types of errors and mismatches is presented. Three examples are considered: robustness against directional mismatch, robustness against array geometry error, and robustness against phase errors. In Section IV, the optimization of the beamformer with robustness capabilities is considered. In Section V, an alternative beamforming structure based on a partitioned processor interpretation is formulated. In Section VI, some numerical results are presented and discussed. Section VII contains the conclusions.

Manuscript received January 23, 1987; revised June 21, 1988. M. H. Er is with the School of Electrical and Electronic Engineering, Nanyang Technological Institute, Nanyang Avenue, Singapore, 2263. A. Cantoni is with the Department of Electrical and Electronic Engineering, University of Western Australia, Nedlands 6009, W.A. Australia. IEEE Log Number 8929472.

Fig. 1. Structure of a narrow-band beamformer.

II. NOTATION

Fig. 1 shows a typical configuration of a narrow-band beamformer with L elements, L complex weights and complex output. The quadrature filter (QF), which is ideally a Hilbert transformer, can be approximated using a filter which provides 90° phase shift over the frequency band of the signal or simply a delay line with delay $1/(4 f_0)$ if the signal is sufficiently narrow band.
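The delay-line approximation to the quadrature filter can be checked numerically. The sketch below assumes an ideal delay line with frequency response $\exp(-j 2\pi f \tau)$ and a hypothetical center frequency of 1 GHz; the function name `delay_response` is illustrative. It shows a phase of -90° exactly at $f_0$ and a small departure at a frequency 1 percent higher, which is why the signal must be sufficiently narrow band.

```python
import cmath
import math

f0 = 1.0e9                # hypothetical center frequency in Hz (assumption)
tau = 1.0 / (4.0 * f0)    # quarter-period delay, as stated in the text

def delay_response(f, delay):
    """Frequency response of an ideal delay line: exp(-j 2 pi f delay)."""
    return cmath.exp(-2j * math.pi * f * delay)

# Phase shift in degrees at the center frequency and 1 percent above it.
phase_at_f0 = math.degrees(cmath.phase(delay_response(f0, tau)))
phase_off_center = math.degrees(cmath.phase(delay_response(1.01 * f0, tau)))
```

The phase error grows linearly with the frequency offset (0.9° for a 1 percent offset in this sketch), whereas an ideal Hilbert transformer would hold 90° across the band.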

Reprinted from IEEE Transactions on Antennas and Propagation. Vol. 38, No.1, pp. 17-23, January 1990.


Let W be the L-dimensional complex vector of adjustable weights defined by

$$W = [w_1, w_2, \ldots, w_L]^T. \tag{1}$$

The response of the beamformer to a plane wave front of frequency $f_0$ with unity amplitude arriving from direction $(\theta, \phi)$ is given by

$$G(f_0, \theta, \phi) = W^H S(f_0, \theta, \phi) \tag{2}$$

where H denotes complex conjugate transpose and $S(f_0, \theta, \phi)$ is the L-dimensional complex space vector defined by

$$S(f_0, \theta, \phi) \triangleq [\exp(j2\pi f_0\tau_1), \exp(j2\pi f_0\tau_2), \ldots, \exp(j2\pi f_0\tau_L)]^T \tag{3}$$

where $f_0$ is the frequency of the plane wave and the $\{\tau_i, i = 1, 2, \ldots, L\}$ are the propagation delays between the plane wavefront and the array elements. Let X(t) be the L-dimensional complex vector defined by

$$X(t) = [x_1(t), x_2(t), \ldots, x_L(t)]^T \tag{4}$$

where $\{x_i(t), i = 1, 2, \ldots, L\}$ are the L complex signals at the outputs of the quadrature filters; then the weighted complex output of the beamformer at a time instant t is given by the complex scalar

$$y(t) = W^H X(t). \tag{5}$$

For a stationary source field and a given weight vector W, the mean output power is given by

$$E[|y(t)|^2] = W^H R W \tag{6}$$

where $R = E[X(t)X^H(t)]$ is the L x L dimensional array correlation matrix.

III. ROBUST PROBLEM FORMULATION

Assume that the response of the narrow-band beamformer defined by (2) is a function of parameters $\{p_i, i = 1, 2, \ldots, m\}$. The weighted mean-square-deviation between the desired response A and the response of the beamformer over variation in the parameter vector $p \triangleq (p_1, p_2, \ldots, p_m)$ can be defined as

$$e^2 = \frac{1}{\beta}\int_{p_0 - \Delta/2}^{p_0 + \Delta/2} \cdots \int \Omega(p)\,|A - G(f_0, \theta, \phi)|^2\, dp \tag{7}$$

where $\Omega(p)$ is a nonnegative weighting function for deterministic types of parameters or a probability density function (pdf) for those parameters which are modeled as random variables, and $\beta$ is a scalar given by

$$\beta = \int_{p_0 - \Delta/2}^{p_0 + \Delta/2} \cdots \int \Omega(p)\, dp. \tag{8}$$

Substituting (2) into (7) and after some manipulation, one obtains

$$e^2 = W^H Q W - (P^H W + W^H P) + |A|^2 \tag{9}$$

where Q is the L x L dimensional Hermitian matrix defined by

$$Q = \frac{1}{\beta}\int_{p_0 - \Delta/2}^{p_0 + \Delta/2} \cdots \int \Omega(p)\, S S^H\, dp \tag{10}$$

and P is the L-dimensional complex vector defined by

$$P = \frac{1}{\beta}\int_{p_0 - \Delta/2}^{p_0 + \Delta/2} \cdots \int \Omega(p)\, A^{*} S\, dp. \tag{11}$$

Note that (9) is a quadratic function in W and it can be factorized as

$$e^2 = (W_o - W)^H Q (W_o - W) + \alpha \tag{12}$$

where $W_o$ is the optimum vector which minimizes the mean-square-deviation defined by (9) and is given by

$$Q W_o = P. \tag{13}$$

$\alpha$ corresponds to the minimum mean-square-deviation defined by (14)

Robustness in the design can be achieved by introducing a quadratic constraint on the weight vector, namely

$$(W_o - W)^H Q (W_o - W) \le \epsilon \tag{15}$$

where, in (16), $0 < \xi < 1$ defines a normalized deviation over variation in parameters. Three specific types of robust designs are considered in the sections which follow.
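The factorization in (12) follows from completing the square. The short derivation below is a sketch under two stated assumptions: that Q is Hermitian and invertible, and that the quadratic form obtained by substituting (2) into (7) has the constant term $|A|^2$ (which reduces to 1 for the unity desired response used in the examples). The closed form for $\alpha$ at the end is derived here and is not quoted from the paper.

```latex
\begin{align*}
(W_o - W)^H Q (W_o - W)
  &= W^H Q W - W^H Q W_o - W_o^H Q W + W_o^H Q W_o \\
  &= W^H Q W - W^H P - P^H W + P^H Q^{-1} P
  && \text{using } Q W_o = P,\; Q^H = Q, \\[4pt]
e^2 &= W^H Q W - (P^H W + W^H P) + |A|^2 \\
    &= (W_o - W)^H Q (W_o - W) + \underbrace{|A|^2 - P^H Q^{-1} P}_{\alpha}.
\end{align*}
```

Since Q is nonnegative definite, the first term is never negative, so the deviation is smallest at $W = W_o$ and the quadratic constraint (15) bounds how far the deviation may rise above that minimum.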

A. Directional Mismatch

The first example of the new approach is the design of a beamformer with beam-broadening capability. In many cases of interest, for example, in communications systems, the direction of arrival of the desired signal is known only to within some angular tolerance. Also, in passive sonar detection applications, when there are only a finite number of beams to span the total bearing angle, any signal that is not exactly matched to one of the beamsteer directions will be treated as an unwanted interference signal by the processor, and therefore will tend to be suppressed. It is desirable to have the ability to control the beamwidth of the beamformer in the look direction if required. The use of multiple linear constraints [3]-[6], [8], [22] and nonlinear norm-type constraints [8], [20], [23], [24] to achieve beam broadening has been described in the literature. This section presents another approach for designing a narrow-band optimum beamformer with robustness against directional mismatch. Beam broadening in the $\phi$ domain can be achieved by integrating the mean-square-deviation between the desired unity response (A = 1) and the response of the beamformer over a spatial region of interest $[\phi_l, \phi_u]$ as follows:

$$e^2 = \frac{1}{\Delta\phi}\int_{\phi_l}^{\phi_u} |1 - G(f_0, \theta, \phi)|^2\, d\phi = W^H Q W - (P^H W + W^H P) + 1 \tag{17}$$

where

$$Q = \frac{1}{\Delta\phi}\int_{\phi_l}^{\phi_u} S(f_0, \theta, \phi)\, S^H(f_0, \theta, \phi)\, d\phi \tag{18}$$

$$P = \frac{1}{\Delta\phi}\int_{\phi_l}^{\phi_u} S(f_0, \theta, \phi)\, d\phi \tag{19}$$

where $\phi_u = \phi_0 + \Delta\phi/2$, $\phi_l = \phi_0 - \Delta\phi/2$, and $\Delta\phi$ defines the spatial region in the look direction over which the desired unity response is to be preserved.

Fig. 2. Azimuth plane of a double-ring circular array having ten sensor elements with five elements equally spaced on each ring.

B. Element Spacing Deviation

The second example is the design of a beamformer with robustness against element spacing errors. An optimum beamformer can be very sensitive to errors in the assumed array element spacing. Recall that the steering vector S is a function of element spacing through the phase relation of the signal. Thus any deviation of the element spacing from its assumed design value could create an erroneous steering vector which is different from the one assumed in the constraint on the weight vector. Signal suppression may occur when the mean output power is minimized. Robustness in the design can be achieved by integrating the mean-square-deviation over some tolerance bound $\Delta r$ about the nominal values $r_i$ as follows:

$$e^2 = \frac{1}{(\Delta r)^L}\int_{r_L - \Delta r/2}^{r_L + \Delta r/2} \cdots \int_{r_1 - \Delta r/2}^{r_1 + \Delta r/2} |1 - G(f_0, \theta_0, \phi_0)|^2\, dr_1 \cdots dr_L = W^H Q W - (P^H W + W^H P) + 1 \tag{20}$$

where

$$Q = \frac{1}{(\Delta r)^L}\int_{r_L - \Delta r/2}^{r_L + \Delta r/2} \cdots \int_{r_1 - \Delta r/2}^{r_1 + \Delta r/2} S(f_0, \theta_0, \phi_0)\, S^H(f_0, \theta_0, \phi_0)\, dr_1 \cdots dr_L \tag{21}$$

$$P = \frac{1}{(\Delta r)^L}\int_{r_L - \Delta r/2}^{r_L + \Delta r/2} \cdots \int_{r_1 - \Delta r/2}^{r_1 + \Delta r/2} S(f_0, \theta_0, \phi_0)\, dr_1\, dr_2 \cdots dr_L. \tag{22}$$

For example, for the double-ring ten-element circular array shown in Fig. 2, the (i, j)th element of the Q matrix when the mean-square-deviation is integrated over some tolerance bound $\Delta\tau$ about the nominal value $\tau_0$ is given by

$$[Q]_{ij} = \exp[j2\pi f_0 \tau_0 h_{ij}]\, \mathrm{sinc}(\pi f_0 \Delta\tau\, h_{ij}), \quad i, j = 1, 2, \ldots, L \tag{23}$$

where $h_{ij}$ is given by (24) for the index ranges $1 \le i, j \le 5$; $6 \le i, j \le 10$; $1 \le i \le 5$, $6 \le j \le 10$; and $6 \le i \le 10$, $1 \le j \le 5$, and the ith element of the P vector is given by (25) for $i = 1, 2, \ldots, L$, with separate expressions (26) for $1 \le i \le 5$ and $6 \le i \le 10$, where the element angles are

$$\alpha_i = (i - 1)72°, \quad i = 1, 2, \ldots, 10. \tag{27}$$
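The sinc form of (23) can be checked numerically. The sketch below assumes the convention sinc(x) = sin(x)/x and treats the geometric factor $h_{ij}$ as an arbitrary real number, since its piecewise definition is not reproduced here; all numerical values and function names are illustrative assumptions. It compares the closed form with a midpoint-rule average of $\exp(j 2\pi f_0 \tau h)$ over the tolerance interval.

```python
import cmath
import math

def sinc(x):
    """Unnormalized sinc, sin(x)/x, with the limit value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def q_closed_form(f0, tau0, dtau, h):
    """Closed form of (23) for one matrix element."""
    return cmath.exp(2j * math.pi * f0 * tau0 * h) * sinc(math.pi * f0 * dtau * h)

def q_numeric(f0, tau0, dtau, h, steps=20000):
    """Midpoint-rule average of exp(j 2 pi f0 tau h) over [tau0 - dtau/2, tau0 + dtau/2]."""
    total = 0j
    for k in range(steps):
        tau = tau0 - dtau / 2.0 + (k + 0.5) * dtau / steps
        total += cmath.exp(2j * math.pi * f0 * tau * h)
    return total / steps

# Hypothetical values: normalized frequency, nominal delay, tolerance, geometry factor.
f0, tau0, dtau, h = 1.0, 0.25, 0.05, 0.7
difference = abs(q_closed_form(f0, tau0, dtau, h) - q_numeric(f0, tau0, dtau, h))
```

The magnitude of each off-diagonal element shrinks as the tolerance $\Delta\tau$ grows, which is how the averaging over the tolerance bound enters the design.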

C. Channel Phase Errors

The third example is the design of a beamformer with robustness against channel phase errors. The effect of channel phase errors on the performance of optimum beamformers is of great importance and has been extensively analyzed in the literature [6], [12]-[17]. Phase errors can arise in the array electronics or due to wavefront distortion in the propagation medium, and hence any signal that was originally a plane wave may appear to the beamformer as an interference from some possibly nonreal direction. The beamformer will then attempt to null this signal even though it may be the desired signal one wishes to detect. Very little work has appeared in the literature for combating phase errors. The use of norm constraint on the weight vector to combat channel phase errors has been investigated in [8], [18], [21]. In this section a new approach based on the generalized response deviation constraint given by (7) is used to design an optimum beamformer with robustness against channel phase errors. Let S' be the steering vector in the presence of phase errors defined as

$$S' = \{\exp[j(\phi_1 + \zeta_1)], \exp[j(\phi_2 + \zeta_2)], \ldots, \exp[j(\phi_L + \zeta_L)]\}^T \tag{28}$$

where $\{\phi_i = 2\pi f_0 \tau_i, i = 1, 2, \ldots, L\}$ and $\{\zeta_i, i = 1, 2, \ldots, L\}$ are the phase errors associated with the L channels. It is assumed that $\{\zeta_i, i = 1, 2, \ldots, L\}$ are independent random variables, uniformly identically distributed in $[-\delta\phi/2, \delta\phi/2]$, and are of zero mean and variance $(\delta\phi)^2/12$. Robustness in the design can be formulated as follows:

$$e^2 = \frac{1}{\beta}\int_{-\delta\phi/2}^{\delta\phi/2} \cdots \int_{-\delta\phi/2}^{\delta\phi/2} \Omega(\zeta_1, \ldots, \zeta_L)\, |1 - S'^H W|^2\, d\zeta_1 \cdots d\zeta_L \tag{29}$$

where $\beta$ is a scalar given by

$$\beta = \int_{-\delta\phi/2}^{\delta\phi/2} \cdots \int_{-\delta\phi/2}^{\delta\phi/2} \Omega(\zeta_1, \ldots, \zeta_L)\, d\zeta_1 \cdots d\zeta_L \tag{30}$$

and $\Omega(\zeta_1, \ldots, \zeta_L)$ is the joint pdf of $\{\zeta_i, i = 1, 2, \ldots, L\}$ and has the property that (31)

Since $\{\zeta_i, i = 1, 2, \ldots, L\}$ are assumed to be mutually independent, it follows that

$$\Omega(\zeta_1, \zeta_2, \ldots, \zeta_L) = \prod_{i=1}^{L} \Omega_i(\zeta_i) \tag{32}$$

where $\Omega_i(\zeta_i)$ is the pdf of $\zeta_i$. Substituting (28) and (32) into (29) and after some manipulation, one obtains

$$e^2 = W^H Q W - (P^H W + W^H P) + 1 \tag{33}$$

where

$$[Q]_{ij} = \begin{cases} 1, & \text{if } i = j \\ \mathrm{sinc}^2\!\left(\dfrac{\delta\phi}{2}\right)\exp[j(\phi_i - \phi_j)], & \text{if } i \ne j \end{cases} \quad i, j = 1, 2, \ldots, L. \tag{34}$$

IV. OPTIMUM BEAMFORMER WITH ROBUSTNESS CAPABILITY

The optimum weight vector is the solution to the following complex constrained optimization problem:

$$\min_{W}\; W^H R W \tag{36}$$

$$\text{subject to } (W_o - W)^H Q (W_o - W) \le \epsilon. \tag{37}$$

Let

$$V \triangleq W_o - W \tag{38}$$

then the optimization problem defined by (36) and (37) can be reformulated as

$$\min_{V}\; (W_o - V)^H R (W_o - V) \tag{39}$$

$$\text{subject to } V^H Q V \le \epsilon. \tag{40}$$

Using the standard primal-dual method [25], it can be shown that the optimal vector which solves the problem defined by (39) and (40) is given by (41), where the optimum Lagrange multiplier is the root of a rational function of the multiplier, namely (42). The optimum weight vector is then obtained by substituting (41) into (38) and is given by (43).

V. AN ALTERNATIVE STRUCTURE

The beamformer described previously can also be considered in terms of a constrained partitioned form shown in Fig. 3, where $W_o$ is a coefficient vector of the upper filter. This vector ensures that a closest approximation, in the minimum-mean-square sense, to the desired response is achieved in the beamsteer direction in spite of variation in parameters. $W_p$ is a coefficient vector of the lower filter, and is chosen to minimize the total mean output power. The output g(t) consists of the difference between a main beam $y_o(t)$ and an auxiliary beam $y_p(t)$. To prevent signal suppression, the adjustable weights of the lower filter must be constrained.

Fig. 3. Structure of the new partitioned narrow-band beamformer.


One possibility of ensuring that the lower filter has minimum response in the beamsteer direction is to introduce a quadratic constraint on the weight vector as

$$W_p^H Q W_p \le \epsilon. \tag{44}$$

This ensures that the output of the lower filter contains little look direction signal component. Instead of quadratically constraining the weights of the partitioned beamformer, a set of linear constraints can also be used to approximate the effect of the quadratic constraint defined by (44). Since ideally the desired response of the beamformer is primarily determined by the upper filter, to prevent signal suppression, the adjustable weight of the lower filter must be constrained to have zero response in the beamsteer direction and thus

$$W_p^H Q W_p = 0. \tag{45}$$

If Q has rank equal to $n_1$, then it is clear that the necessary and sufficient conditions for (45) to be satisfied are

$$E_i^H W_p = 0, \quad i = 1, 2, \ldots, n_1 \tag{46}$$

where $\{E_i, i = 1, 2, \ldots, n_1\}$ are the $n_1$ orthonormalized eigenvectors corresponding to the $n_1$ nonzero eigenvalues of Q. If Q has full rank, or because in practice the eigenvalue evaluation will not yield exactly zero eigenvalues, one can impose $n_0$ linear constraints of the form

$$E_i^H W_p = 0, \quad i = 1, 2, \ldots, n_0 \tag{47}$$

where $n_0$ is the smallest integer chosen such that the percentage trace of the Q matrix defined by

$$\text{percent tr} \triangleq \left(\sum_{i=1}^{n_0} \lambda_i\right) \Big/ \left(\sum_{i=1}^{L} \lambda_i\right) \times 100 \text{ percent} \tag{48}$$

is greater than or equal to some threshold. The constrained optimization problem can then be expressed as

$$\min_{W_p}\; E[|g(t)|^2] \tag{49}$$

$$\text{subject to } E_i^H W_p = 0, \quad i = 1, 2, \ldots, n_0. \tag{50}$$

The robustness of the processor is achieved through the eigenvectors $\{E_i, i = 1, 2, \ldots, n_0\}$ which are obtained from Q. It can be readily established that

$$W_p^H Q W_p \le \|W_p\|^2 \sum_{i=n_0+1}^{L} \lambda_i. \tag{51}$$

The choice of $n_0$ according to (48) helps achieve a small upper bound on $W_p^H Q W_p$ as required by (44). However, complete control is not achieved because of the $\|W_p\|^2$ in (51). Thus one is led to an optimization problem which includes a norm constraint as well as the linear constraints defined by (47). The new optimization problem is

$$\min_{W_p}\; E[|g(t)|^2] \tag{52}$$

$$\text{subject to } E_i^H W_p = 0, \quad i = 1, 2, \ldots, n_0 \tag{53}$$

$$\|W_p\|^2 \le \delta. \tag{54}$$

The norm constraint defined by (54) ensures that

$$W_p^H Q W_p \le \delta \sum_{i=n_0+1}^{L} \lambda_i \le \epsilon \tag{55}$$

by a suitable choice of $n_0$ and $\delta$.

Fig. 4. Output SNR for the new quadratically constrained optimum beamformer with $\Delta\phi = 0°$ and $\Delta\phi = 6°$ and $\xi = 0.1$ percent in the presence of a 6 dB 0° directional source, a 0 dB spherically isotropic noise and -30 dB white noise.
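The selection of $n_0$ by the percentage-trace rule in (48) can be sketched as follows. The example assumes the eigenvalues of Q are real and nonnegative and are ranked from largest to smallest before accumulation (an assumption about ordering that the text does not spell out); the eigenvalue list and the function name `choose_n0` are invented for illustration.

```python
def choose_n0(eigenvalues, threshold_percent):
    """Smallest n0 whose leading eigenvalues reach the given percentage of the trace."""
    ranked = sorted(eigenvalues, reverse=True)   # largest eigenvalues first (assumed)
    trace = sum(ranked)
    running = 0.0
    for n0, value in enumerate(ranked, start=1):
        running += value
        if 100.0 * running / trace >= threshold_percent:
            return n0
    return len(ranked)

# Hypothetical eigenvalues of Q (invented values).
example_eigenvalues = [5.0, 3.0, 1.5, 0.4, 0.1]
n0 = choose_n0(example_eigenvalues, 90.0)

# Sum of the eigenvalues left unconstrained; this is the factor that
# multiplies the squared weight norm in the bound on the lower-filter response.
residual_sum = sum(sorted(example_eigenvalues, reverse=True)[n0:])
```

A higher threshold leaves a smaller residual sum but uses more linear constraints, so fewer degrees of freedom remain for minimizing the output power.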

VI. COMPUTER STUDIES

The purpose of this section is to present some numerical results to demonstrate the performance achievable when the new approach is applied to the design of robust narrow-band array processors. The array geometry used in the computer studies is the double-ring circular array shown in Fig. 2. The beamsteer direction is assumed to be in the plane of the array and the 0° look direction is aligned with one arm of the array shown in Fig. 2. The parameter r is specified relative to the frequency of interest through the dimensionless spatial sampling factor $\mu$, which is defined by

$$\mu \triangleq \frac{r}{\lambda_0} \tag{56}$$

where $\lambda_0$ is the wavelength corresponding to $f_0$. In the computer studies, $\mu$ was set at 0.25.

Fig. 4 shows the array output signal-to-noise ratio for the quadratically constrained beamformer with $\Delta\phi = 0°$ and $\Delta\phi = 6°$, respectively. The source scenario was assumed to consist of a 0° directional source of power 6 dB. The ambient noise field is spherically isotropic noise of 0 dB with uncorrelated noise at a level of -30 dB. In the plots, an optimum beam is scanned through a number of bearing angles ranging from 0° to 25°. It can be seen that the signal suppression problem due to directional mismatch is quite severe. Incorporating $\Delta\phi = 6°$ in the design reduces the severity in the vicinity of the look direction. It is worth mentioning that with the new approach, the width of the main beam can be specified, and a good compromise can be reached between a reasonable signal acceptance angle and the ability of the beamformer to reject directional interferences.

Fig. 5 shows the array gain of the new beamformer as a function of interring spacing errors. In the plots, the array correlation matrix R is computed based on an array geometry with different interring spacings ranging from 0 to 25 percent smaller than $r_0 = 0.25\lambda_0$, which is the value assumed in the constraint equation. The source scenario was assumed to consist of a 0 dB 180° directional narrow-band source, 0 dB spherically isotropic noise, and -30 dB white noise. The power of the 0° directional source was assumed to be 6 dB. It can be seen that the processor without robustness incorporated is very sensitive to interring spacing errors, especially at a high input signal-to-noise ratio. On the other hand, the beamformer with $\Delta r$ incorporated in the design is able to retain the array gain for small errors.

Fig. 5. Array gain in the 0° direction as a function of interring spacing error for the new quadratically constrained optimum beamformer with $\xi = 0.1$ percent in the presence of a 6 dB 0° directional source, a 0 dB 180° directional source, a 0 dB spherically isotropic noise and -30 dB white noise.

Fig. 6 shows the array gain of the new beamformer as a function of sensor phase errors. In the plots, the array correlation matrix R is computed based on some random time delays at each sensor element as follows. Let $\phi°$ be the rms phase error, in degrees, at $f_0$. A constant phase error on all sensors will not degrade the system, so a vector of sensor time errors $\delta_j$ is assumed, where $\delta_j > 0$ corresponds to phase advance (57), where the random term is scaled by $1/Q_p$ and is drawn from a random variable between -1 and +1 with a constant probability distribution. The constant $Q_p$ is chosen so that (58). The source scenario was assumed to consist of a 0 dB 180° directional source and 0 dB spherically isotropic noise and -30 dB white noise. The power of the 0° directional source was assumed to be 12 dB. The array gain is averaged over 50 different realizations of the random number generator. It is clear that the beamformer without $\delta\phi$ incorporated is very sensitive to phase errors, especially at high input signal-to-noise ratio. On the other hand, the array gain is practically constant up to 5° phase errors when $\delta\phi = 15°$ is used in the design, even for a high input signal-to-noise ratio.

Fig. 6. Array gain in the 0° direction as a function of phase errors for the new quadratically constrained optimum beamformer with $\xi = 0.1$ percent in the presence of a 12 dB 0° directional source, a 0 dB 180° directional source, a 0 dB spherically isotropic noise and -30 dB white noise.

VII. CONCLUSION

This paper has presented a unified approach to the design of robust narrow-band array processors. Three types of robust design have been considered: robustness against directional mismatch, robustness against array geometry error, and robustness against channel phase errors. Initially a general quadratic constraint on the weights is developed. Subsequently, the quadratic constraint is approximated by linear constraints or at most linear constraints plus norm constraint. This latter set of constraints is no more complex than those required for designs which do not incorporate robustness features explicitly. Numerical results show that the proposed approach appears to offer a unified treatment for directly designing narrow-band processors which are robust against various types of errors and mismatches between signal model and actual scenario.

ACKNOWLEDGMENT

The authors are grateful for the anonymous reviewers' comments, which helped to improve the presentation of the paper.

REFERENCES

[1] C. L. Zaham, "Effects of errors in the direction of incidence on the performance of an adaptive array," Proc. IEEE, vol. 60, no. 8, pp. 1008-1009, Aug. 1972.
[2] H. Cox, "Resolving power and sensitivity to mismatch of optimum array processors," J. Acoust. Soc. Am., vol. 54, no. 3, pp. 772-785, 1973.
[3] S. P. Applebaum and D. J. Chapman, "Adaptive arrays with main beam constraints," IEEE Trans. Antennas Propagat., vol. AP-24, no. 5, pp. 650-662, Sept. 1976.
[4] K. Takao, H. Fujita, and T. Niski, "An adaptive array under directional constraint," IEEE Trans. Antennas Propagat., vol. AP-24, no. 5, pp. 662-669, Sept. 1976.
[5] A. M. Vural, "A comparative performance study of adaptive array processors," in IEEE ICASSP '77 Rec., May 1977, pp. 695-700.
[6] A. M. Vural, "Effects of perturbations on the performance of optimum/adaptive arrays," IEEE Trans. Aerosp. Electron. Syst., vol. AES-15, no. 1, pp. 76-87, Jan. 1979.
[7] R. T. Compton, Jr., "Pointing accuracy and dynamic range in a steered beam adaptive array," IEEE Trans. Aerosp. Electron. Syst., vol. AES-16, no. 3, pp. 280-281, May 1980.
[8] J. E. Hudson, Adaptive Array Principles. New York: Peter Peregrinus, 1981.
[9] R. A. Mucci and R. G. Pridham, "Impact of beam steering errors on shifted sideband and phase shift beamforming techniques," J. Acoust. Soc. Am., vol. 69, no. 5, pp. 1360-1368, May 1981.
[10] R. T. Compton, Jr., "The effect of random steering vector errors in the Applebaum adaptive array," IEEE Trans. Aerosp. Electron. Syst., vol. AES-18, no. 5, pp. 392-400, Jan. 1983.
[11] Y. Bar-Ness, "Steered beam and LMS interference canceler comparison," IEEE Trans. Aerosp. Electron. Syst., vol. AES-19, no. 1, pp. 30-39, Jan. 1983.
[12] R. N. McDonald, "Degraded performance of nonlinear array processors in the presence of data modeling errors," J. Acoust. Soc. Am., vol. 51, no. 5, pp. 1186-1193, Apr. 1977.
[13] D. J. Ramsdale and R. A. Howerton, "Effect of element failure and random errors in amplitude and phase on the sidelobe level attainable with a linear array," J. Acoust. Soc. Am., vol. 68, no. 3, pp. 901-906, Sept. 1980.
[14] A. H. Quazi, "Array beam response in the presence of amplitude and phase fluctuations," J. Acoust. Soc. Am., vol. 72, no. 1, pp. 171-180, July 1982.
[15] D. R. Farrier, "Gain of an array of sensors subjected to processor perturbations," Proc. Inst. Elec. Eng., vol. 130, pt. H, no. 4, pp. 251-254, June 1983.
[16] L. C. Godara, "The effect of phase-shifter errors on the performance of an antenna array beamformer," IEEE J. Ocean. Eng., vol. OE-10, no. 3, pp. 278-284, July 1985.
[17] N. K. Jablon, "Adaptive beamforming with the generalized sidelobe canceller in the presence of array imperfection," IEEE Trans. Antennas Propagat., vol. AP-34, no. 8, pp. 996-1012, Aug. 1986.
[18] E. N. Gilbert and S. P. Morgan, "Optimum design of directive antenna arrays subject to random variations," Bell Syst. Tech. J., vol. 34, pp. 637-663, May 1955.
[19] J. M. McCool, "A constrained adaptive beamformer tolerant of array gain and phase errors," in Aspects of Signal Processing, pt. 2, G. Tacconi, Ed. Dordrecht, Holland: Reidel, 1977, pp. 517-522.
[20] J. N. Maksym, "A robust formulation of an optimum cross-spectral beamformer for linear array," J. Acoust. Soc. Am., vol. 65, no. 4, pp. 971-975, Apr. 1979.
[21] H. Cox, R. M. Zesking, and T. Kooij, "Sensitivity constrained optimum endfire array gain," in Proc. IEEE ICASSP, 1985, paper 46.12.
[22] A. K. Steele, "Comparison of directional and derivative constraints for beamformers subject to multiple linear constraints," Proc. Inst. Elec. Eng., vol. 130, pts. F, H, no. 1, pp. 41-45, Feb. 1983.
[23] J. W. R. Griffiths and J. C. Hudson, "An introduction to adaptive processing in a passive sonar system," in Aspects of Signal Processing, G. Tacconi, Ed. Dordrecht, Holland: Reidel, 1977, pp. 299-308.
[24] K. M. Ahmed and R. J. Evans, "An adaptive array processor with robustness and broadband capabilities," IEEE Trans. Antennas Propagat., vol. AP-32, no. 9, pp. 944-950, Sept. 1984.
[25] D. G. Luenberger, Optimization by Vector Space Methods. New York: Wiley, 1969.
[26] M. H. Er, "Optimum antenna array processors with linear and quadratic constraints," Ph.D. dissertation, Dept. Elec. Comput. Eng., Univ. Newcastle, N.S.W. 2308, Australia, May 1985.

Design Trades for Rotman Lenses

R. C. Hansen, Fellow, IEEE

Abstract- The foundation of a satisfactory Rotman lens design is geometric. The effects of the seven design parameters (focal angle, focal ratio, beam angle ratio, maximum beam angle, beam port curve ellipticity, array element spacing, and focal length/λ) on the shape, and on the geometric phase and amplitude errors of a Rotman lens are described. The advantages of beam port shaping to reduce phase error, and of pointing port horns at the opposite apex (instead of normal to the curve) to reduce off-axis beam amplitude asymmetries, are shown numerically. A design procedure for selecting these parameters is given, and a new calculation of lens gain is presented.

I. INTRODUCTION

Multiple beam antennas have proved useful for various applications such as ECM, and the Rotman lens is often used. Design of these lenses must involve both geometric trades and mutual coupling effects between the lens ports. The latter is relatively difficult to control, but the former is crucial to the realization of an efficient and compact lens. Thus a careful geometric optics design should be accomplished first; then adjustments must be made to reduce mutual coupling effects. This paper describes the geometric design trades.

The Rotman lens has six basic design parameters: focal angle α, focal ratio β, beam angle to ray angle ratio γ, maximum beam angle ψ_m, focal length f_1, and array element spacing d. The last two are in wavelengths, and γ is a ratio of sines. A seventh design parameter allows the beam port arc to be elliptical instead of circular. Since the design equations are implicit and transcendental, with only one sequence of solution, the interplay of design parameters is difficult to discern. In this paper a series of lens plots is used to show the effects of each parameter.

Geometric phase and amplitude errors over the element port arc vary primarily with α and β, and with an implicit parameter which is the normalized element port arc height. Representative plots show how these errors depend upon the parameters. For lenses where the beam port arc and feed port arc are identical, resulting in a completely symmetric lens, the design equations are greatly simplified [8]. However, these lenses are seldom used because their design options are much more constrained.

A new calculation of lens gain is presented, with the lens connected to an array of isotropic elements. Port spillover, and phase and amplitude errors are included in the gain calculation, but not impedance mismatches due to mutual coupling. Finally, a design procedure is outlined.

Fig. 1. Ray geometry.

II. LENS PARAMETERS

The lens equations equate path lengths from the foci to the array elements; see [7] or [6] for a derivation of these. Using the nomenclature of Fig. 1, it is convenient to normalize all dimensions by the principal focal length f_1. This is also the lens width at the center. The focal angle α is subtended by the upper and lower foci at the center of the element port curve. It is assumed here that the foci are symmetrically disposed about the axis, and that the lens is also symmetric. Then the parameter β is the ratio (see footnote 1) of upper (and lower) focal length f_2 to f_1:

$$ \beta = \frac{f_2}{f_1}. \tag{1} $$

Clearly the lens width in wavelengths, f_1/λ, is another parameter. Now the angle of the beam radiated by the array is ψ, and if one of the off-axis foci is excited, then the ratio of array beam angle ψ to lens ray angle α, taken as a ratio of sines, is γ:

$$ \gamma = \frac{\sin\psi}{\sin\alpha}. \tag{2} $$

An indirect parameter of utility is ζ, which relates the distance y_3 of any point on the array from the axis to f_1. This parameter controls the portion of the phase and amplitude error curves that the lens experiences. It is expressed:

$$ \zeta = \frac{y_3 \gamma}{f_1}. \tag{3} $$

Note that the line lengths w of Fig. 1 are an integral and essential part of the lens. The maximum beam angle, ψ_m, is an important parameter, as is the array element spacing in wavelengths d/λ. The ζ_max that corresponds is given by

$$ \zeta_{max} = \frac{(NE - 1)\gamma d}{2 f_1} \tag{4} $$

where NE is the number of elements in the linear array, since y_max = (NE - 1)d/2. An upper limit on ζ occurs when the tangent to the element port curve is vertical; this also gives w = 0. This value of ζ is given by

(5)

Fig. 2 gives this limiting value versus β, for several values of α. Since the useful range of ζ is roughly from 0.5 to 0.8, a range of β appropriate for a given α may be inferred. The geometric lens equation is a quadratic in the line length w that connects an element port to the corresponding array element:

$$ a w^2 + b w + c = 0 \tag{6} $$

where the coefficients involve the parameters ζ, α, β, and γ:

$$ a = 1 - \frac{(1 - \beta)^2}{(1 - \beta C)^2} - \frac{\zeta^2}{\beta^2} \tag{7} $$

$$ b = -2 + \frac{2\zeta^2}{\beta} + \frac{2(1 - \beta)}{1 - \beta C} - \frac{\zeta^2 S^2 (1 - \beta)}{(1 - \beta C)^2} \tag{8} $$

$$ c = -\zeta^2 + \frac{\zeta^2 S^2}{1 - \beta C} - \frac{\zeta^4 S^4}{4(1 - \beta C)^2} \tag{9} $$

and C = cos α, S = sin α. Usually the number of beams and number of elements, the maximum beam angle and element spacing, are specified from the system requirements. Thus, the task is to select the optimum α, β, γ, and f_1/λ.

Fig. 2. Upper limit on parameter zeta. (Plot of ZETA versus BETA; curve label ALFA = 40.)

Fig. 3. Effect of focal angle. β = 0.9, γ = 1.1, psim = 50, f_1 = 4 WV, d = 0.5 WV.

1 Note that this ratio β is the inverse of the ratio g used by Rotman, and by McGrath [4]. As it is convenient to normalize all dimensions by f_1, the ratio of f_2/f_1 is more appropriate.

Manuscript received December 4, 1989; revised September 11, 1990. The author is at P.O. Box 570215, Tarzana, CA 91357. IEEE Log Number 9041791.

Reprinted from IEEE Transactions on Antennas and Propagation, Vol. 39, No. 4, pp. 464-472, April 1991.

III. EFFECT OF PARAMETERS ON LENS SHAPE AND PORT POSITIONS
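Each element port in lens plots of this kind needs the line length w from the lens quadratic (6) of Section II. The following is a minimal sketch, not the author's program: it assumes the coefficients a, b, c in the forms a = 1 - (1 - β)²/(1 - βC)² - ζ²/β², b = -2 + 2ζ²/β + 2(1 - β)/(1 - βC) - ζ²S²(1 - β)/(1 - βC)², c = -ζ² + ζ²S²/(1 - βC) - ζ⁴S⁴/[4(1 - βC)²], with all lengths normalized by f_1. The function name and the choice of root (the one that vanishes on the lens axis) are assumptions made for illustration.

```python
import math


def lens_line_length(zeta, alpha_deg, beta):
    """Return (w, (a, b, c)) for the lens quadratic a*w**2 + b*w + c = 0.

    zeta      : normalized array coordinate, zeta = y3 * gamma / f1
    alpha_deg : focal angle in degrees
    beta      : focal ratio f2 / f1
    """
    C = math.cos(math.radians(alpha_deg))
    S = math.sin(math.radians(alpha_deg))
    q = 1.0 - beta * C

    a = 1.0 - (1.0 - beta) ** 2 / q ** 2 - zeta ** 2 / beta ** 2
    b = (-2.0 + 2.0 * zeta ** 2 / beta + 2.0 * (1.0 - beta) / q
         - zeta ** 2 * S ** 2 * (1.0 - beta) / q ** 2)
    c = (-zeta ** 2 + zeta ** 2 * S ** 2 / q
         - zeta ** 4 * S ** 4 / (4.0 * q ** 2))

    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        raise ValueError("no real line length for these parameters")

    # Assumption: use the root that goes to w = 0 on the axis (zeta = 0).
    w = (-b - math.sqrt(disc)) / (2.0 * a)
    return w, (a, b, c)


if __name__ == "__main__":
    # Parameters of the kind used in the figures: alpha = 40, beta = 0.9.
    for zeta in (0.0, 0.25, 0.5, 0.7):
        w, _ = lens_line_length(zeta, 40.0, 0.9)
        print(f"zeta = {zeta:4.2f}  w/f1 = {w:.4f}")
```

The port coordinates themselves need the remaining lens equations of [7], which are not reproduced here.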

The lens shape is important, both in conserving space and in reducing loss. For example, a wide lens tends to have path lengths that are more nearly equal, and allows the beam port curve and the element port curve to have different heights, and even different curvatures. As in Fig. 1, the width is along the lens axis. Wide lenses have large spillover loss, and higher transmission line loss. A compact lens tends to minimize spillover losses; roughly equal port curve heights now become important, to avoid severe asymmetric amplitude tapers and large phase errors. Curvatures of the two port curves may be different; use of array element spacing greater than half-wavelength allows more beam ports than element ports to be used. For this case, the beam port curve may be more curved, and the element port curve flatter. The effects of the seven parameters will be shown through a series of charts. Six charts are shown in Figs. 3-8. Beam ports, which are ticked, are on the left. Element ports, also ticked, are on the right. Foci are indicated by asterisks. The focal length is normalized to unity, so that each tick mark on the axis (and on


VAL UES OF GAMA SHOWN

Fig. 4. Effect of focal ratio. α = 40, γ = 1.1, psim = 50, f_1 = 4 WV, d = 0.5 WV.

the ordinate) is 0.05. From the ordinate scale the element positions may be inferred. Each lens curve is extended past the outermost port by half the width of that port. These examples have nine beam ports, and 11 element ports, and of course 11 elements in an equally spaced linear array.

With all other variables fixed, increasing α opens the beam port curve, and closes the element port curve. Port positions are roughly unchanged. But the outer foci locations change markedly as expected. The three lens plots of Fig. 3 illustrate these effects. It can be seen that a value of α can be selected that roughly equalizes the heights of the two curves. Of course, α must be selected in conjunction with the other variables, to minimize phase errors over the aperture. The outer foci should be comfortably inside the beam port curve.

Increasing β has an effect similar to increasing α; the beam port curve opens, and the element port curve closes. Fig. 4 contains three lens plots to show this. Again, port positions are roughly unchanged. Also the focal locations change relatively little. Again, a value of β can be selected that roughly equalizes the curve heights.

There are pairs of α and β that produce closely the same lens shape, and port positions. However, the foci vary with α, and the connecting lines (from element ports to elements) are different. Table I shows three α-β lens pairs that have common lens curves and ports, all for γ = 1.1, ψ_m = 50, f_1/λ = 4, and d = λ/2. One may thus infer that the phase error over the

Fig. 5. Effect of angle ratio. α = 40, β = 0.9, psim = 50, f_1 = 4 WV, d = 0.5 WV.

aperture for each beam will be different, depending upon α. This will be shown in the next section. For any set of the other four parameters, there are probably some α-β pairs that behave similarly.

Increasing γ leaves both lens curves unchanged, but the beam ports are moved closer together, while the element ports are spread apart. A three-lens set in Fig. 5 shows this trend. Although the foci remain fixed, the ends of the curves change, so that the relative position of the foci changes. γ also can affect the relative heights of the two curves. Values of γ here are one or greater, as the cases used are all for large beam angles. When the beam cluster subtends a more modest angle, e.g., 30°, values of γ < 1 are appropriate as they allow a "fat" or curved lens.

When ψ_m is changed, only the beam port spacings change. Increasing ψ_m spreads the beam ports and extends the port curve, so that this parameter helps produce a lens with roughly equal heights of beam and element port curves. The three lens plots of Fig. 6 depict this behavior.

Element spacing is critical as it controls the appearance of the grating lobes [2]. For a maximum beam angle of ψ_m, the spacing that just admits a grating lobe is

$$ \frac{d}{\lambda} = \frac{1}{1 + \sin\psi_m}. \tag{10} $$

In general, spacings are kept below this value.
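As a quick numerical check of the spacing limit, the sketch below evaluates the standard grating-lobe condition d/λ = 1/(1 + sin ψ_m) for a few maximum beam angles. The function name is hypothetical and the angles are simply those appearing in the figure captions.

```python
import math


def grating_lobe_spacing(psi_max_deg):
    """Element spacing d/lambda at which a grating lobe just enters
    visible space when the beam is steered to psi_max_deg."""
    return 1.0 / (1.0 + math.sin(math.radians(psi_max_deg)))


if __name__ == "__main__":
    for psi in (45.0, 50.0, 55.0):
        limit = grating_lobe_spacing(psi)
        print(f"psi_m = {psi:4.1f} deg  ->  d/lambda limit = {limit:.3f}")
```

For ψ_m = 50 the limit is a little above 0.56, so the spacing d = 0.5λ used in the examples sits below it.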

Fig. 6. Effect of maximum beam angle. α = 35, β = 0.92, γ = 1.1, f_1 = 4 WV, d = 0.5 WV. (Values of psi-max shown: 55, 50, 45.)

VALUES OF d/λ SHOWN

When d is changed, only the element port spacings and the extent of the port curve change, analogous to ψ_m changing beam port spacing. Fig. 7 uses two lens plots to show this. Increasing the lens focal length (width) in general increases the separation between the end ports as well. But changing f_1/λ also changes all spacings, as the lens equations are normalized by f_1. Thus, as shown in the two lens plots of Fig. 8, changing f_1/λ also changes the element port arc and element port spacings. The minimum value of f_1 is smaller for Rotman lenses than for other types of lenses [9]. Next, phase and amplitude errors will be examined.

IV. EFFECT OF PARAMETERS ON PHASE AND AMPLITUDE ERRORS

Aperture errors depend upon α and β, and upon eccentricity, but only indirectly on the other parameters. Thus the most insight results from plotting phase and amplitude errors versus the normalized parameter ζ; see (3). Since phase errors are zero at angles corresponding to the three foci, a satisfactory approach uses one beam position midway between the central and edge foci, and a second beam position beyond the edge focus. Amplitude errors occur at all beam ports, so more cases are needed to display amplitude error behavior.
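The portion of each error curve that a given lens actually uses is set by ζ_max = (NE - 1)γd/(2 f_1) from (4). The sketch below works this out for the nine-beam, 11-element example (γ = 1.1, d = 0.5λ, f_1 = 4λ). The optional half-port extension is an assumption based on the remark that each lens curve is extended past the outermost port by half a port width; it appears to reproduce the ±0.756 range quoted in Section V, but that reading is not stated in the text.

```python
def zeta_max(num_elements, gamma, d_wavelengths, f1_wavelengths,
             include_half_port=False):
    """Largest value of zeta = y3 * gamma / f1 reached on the array.

    With include_half_port=True the outermost element position is
    extended by d/2 (hypothetical reading of the curve extension).
    """
    y_max = (num_elements - 1) * d_wavelengths / 2.0
    if include_half_port:
        y_max += d_wavelengths / 2.0
    return y_max * gamma / f1_wavelengths


if __name__ == "__main__":
    print(zeta_max(11, 1.1, 0.5, 4.0))                          # about 0.6875
    print(zeta_max(11, 1.1, 0.5, 4.0, include_half_port=True))  # about 0.756
```

Both values stay below the 0.8 ceiling suggested for ζ_max in the design procedure.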

Fig. 7. Effect of array element spacing. α = 40, β = 0.88, γ = 1.1, psim = 50, f_1 = 4 WV.

A. Phase Errors

Figs. 9-12 show phase error versus ζ for lenses with α of 30 and 40 deg., for the two beam positions. Note that to get phase error, the values from the figures are to be multiplied by f_1/λ. For the midfoci beams, the phase error is small, except for very large lenses. Phase errors for the wider angle beams are still modest, and will not be important except for large lenses, or designs with ζ > 0.75. In general, the phase errors increase as α is increased, for all beam positions. Although Rotman and Turner indicated an optimum value of β, which was 2/(2 + α²), examination of Figs. 9-12 shows that an optimum β exists only for one range of ζ and one ray angle. Best values of β are different for between-foci rays and rays outside the foci. Since the latter usually have larger phase error, the designer could optimize, but the value would vary with both ray angle and with ζ_max. This old value also gives poor lens shapes when the number of beam and element ports are roughly equal.

The results of using an elliptical beam port curve are shown in Fig. 13, where the phase errors are equalized at ζ = ±0.7 for a ray angle of 45°. Note that the elliptical beam port curve is a simple way of realizing the optimum curve of Katagi et al. [3]. This gives a 13% reduction in phase error; the phase errors for the midfoci ray at 17.5° become slightly more asymmetric, but are still well below the 45° ray errors. Note that the ellipticity of -0.3 only changes the principal radius by 5%, so amplitudes are essentially unchanged. The beam port ellipse major axis is along the lens axis for this ellipticity.

Fig. 8. Effect of focal length. α = 40, β = 0.9, γ = 1.1, psim = 50, d = 0.5 WV.

TABLE I
α-β PAIRS

Number    α     β
1         30    0.94
2         35    0.92
3         40    0.90

B. Amplitude Errors

Amplitude errors are calculated using beam port and element port horn patterns of sinc πu, where horn widths are all set to a nominal λ/2. Each port horn has its axis normal to the port curve. Fig. 14 shows amplitude error, normalized to 0 dB for the axial ray, for a lens with α = 30, β = 0.94. Curves for ray angles of 0°, 15°, and 45° are given. Similarly, Fig. 15 is for a lens with α = 40, β = 0.9, for ray angles of 0°, 20°, and 50°. These examples are two of the α-β pairs of Table I, and thus the amplitude errors are similar. As expected, for wide ray angles the near and far ends of the element port curve experience modest amplitude changes. Compared to the amplitude taper needed to produce 25 dB sidelobes, these amplitude errors are small. Actual lenses may have port widths larger than λ/2, so the amplitude tapers can be expected to increase, especially for edge beams.

The asymmetry of amplitude for the off-axis beams can be reduced by pointing each port horn at the opposite apex, instead of normal to the port curve [5]. For example, a nine-beam, 11-element lens with α = 40, β = 0.9, γ = 1.1, ψ_m = 50, d = 0.5λ and f_1 = 4λ has amplitude taper for the outside beam as shown in Table II. Also shown is the taper for apex pointed horns. Use of apex pointing produces appreciable improvement. Gain is slightly improved.

V. CALCULATION OF LENS GAIN

Element and beam port spillover, phase and amplitude errors, port impedance mismatches, and transmission line loss all contribute to reducing lens gain. Note that, as in the case of a horn feeding a reflector antenna, there is no feed horn spreading loss, due to the equal path property through the foci. The small inequality of other paths is subsumed in the path phase and amplitude errors. Gain will be calculated here based on port

spillover and on aperture errors. Since the amplitude error calculation includes both beam port and element port horn patterns, spillover is included [10]. The phase and amplitude errors at the element ports are transferred to an array of isotropic elements. Then the problem reduces to that of calculating gain of a symmetric linear array of isotropic elements with complex coefficients. This is readily done [2]:

$$ G = \frac{\left|\sum_n A_n\right|^2}{\sum_n \sum_m A_n A_m^{*}\,\operatorname{sinc}\left[(n - m)\,2\pi d/\lambda\right]}. \tag{11} $$

Fig. 9. Phase error between foci. Rotman lens, α = 30, ray angle = 15. (Abscissa: ZETA.)

Fig. 10. Phase error beyond foci. Rotman lens, α = 30, ray angle = 45. (Abscissa: ZETA; curves labeled BETA = .92, .94, .96.)

Actual gain is then that of (11) multiplied by the element gain, times the impedance mismatch factors. Variation of gain with parameters is very small. For example, for a typical small lens only 0.2 dB change occurs from the center beam to the edge beam. And the gain values are roughly independent of α, β, γ, etc. With a larger array the gain increases just as expected. Using the same nine-beam, 11-element example of the previous section, gain ranges from 10.2 to 10.4 dB; the latter value, for the center beam, is within 0.1 dB of the gain for a similar uniformly excited array. For this lens the ζ range is ±0.756.
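The gain expression (11), G = |Σ A_n|² / Σ_n Σ_m A_n A_m* sinc[(n - m)2πd/λ] with sinc x = (sin x)/x, is easy to evaluate directly. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the coefficients used are a uniform excitation rather than the lens-derived amplitudes and phases, which are not tabulated in the text.

```python
import math


def _sinc(x):
    """Unnormalized sinc, sin(x)/x, with the limit value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def isotropic_array_gain(coeffs, d_wavelengths):
    """Gain of a linear array of isotropic elements from eq. (11).

    coeffs        : sequence of (possibly complex) element coefficients A_n
    d_wavelengths : element spacing d / lambda
    """
    numerator = abs(sum(coeffs)) ** 2
    denominator = 0.0
    for n, a_n in enumerate(coeffs):
        for m, a_m in enumerate(coeffs):
            arg = (n - m) * 2.0 * math.pi * d_wavelengths
            denominator += (a_n * a_m.conjugate()) * _sinc(arg)
    # The double sum is real for any coefficient set; drop rounding residue.
    return numerator / complex(denominator).real


if __name__ == "__main__":
    g = isotropic_array_gain([1.0] * 11, 0.5)
    print(f"uniform 11-element array: {10.0 * math.log10(g):.2f} dB")
```

With half-wavelength spacing the off-diagonal sinc terms vanish, so a uniform 11-element array gives G = 11, about 10.4 dB, in line with the center-beam value quoted above.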

Effects of feed horn spillover and internal lens reflections can be reduced by either employing dummy (terminated) feed horns adjacent to the edge horns, or through the use of absorber between the ends of the beam port arc and the element port arc [5]. Port impedance mismatches are outside the scope of this paper.

Fig. 11. Phase error between foci. Rotman lens, α = 40, ray angle = 20. (Abscissa: ZETA.)

Fig. 12. Phase error between foci. Rotman lens, α = 45, ray angle = 22.5. (Abscissa: ZETA.)

VI. A DESIGN PROCEDURE

System requirements usually specify the frequency range, the number of beams and the angular coverage, and either the beamwidth or adjacent beam crossover level. From these, a suitable combination of number of elements and d/λ may be inferred. The design process starts by the selection of a center frequency, at which all dimensions are computed. Then, a starting value of f_1/λ is selected, to keep ζ_max well below 0.8. The focal length will be somewhat less than the array length. Next, using the guidelines of Sections III and IV, α, β, and γ are selected, to: locate the outer beam port a modest amount past the outer focus; produce beam port and element port arcs of comparable heights; and yield acceptable phase and amplitude errors at each port. Achieving this may require adjustment of

f_1/λ or d/λ, and of α, β, and γ. Use of an elliptical beam port arc is usually not warranted except for large lenses. When a satisfactory design is realized at the center frequency, phase and amplitude errors at each port are calculated at representative frequencies, to assess performance over the frequency range. And of course the actual beam and element port horn widths are used. At this stage, calculation of a beam rosette (a set of beam patterns) at each frequency is appropriate. Some compromise and iterative adjustment of parameters may be necessary to obtain good wide-band results, and to best accommodate mutual coupling effects.

Fig. 13. Effect of ellipticity. Rotman lens, α = 35, β = 0.92. (Abscissa: ZETA; curves labeled E = -.3 and E = 0, ray angles 45 and 17.5.)

Fig. 14. Amplitude errors. Rotman lens, α = 30, β = 0.94. (Curves labeled by ray angle.)

Although the lens width f_1 is less than the array length, the lens height is always greater than the array length. Lens dimensions are reduced by the square root of effective dielectric constant for either stripline or microstrip implementation. See [1] for examples.

VII. CONCLUSION

Guidelines have been given on how the seven Rotman lens parameters affect lens performance, and on how to select values for them, based on geometrical optics. Such a design must be tempered by mutual impedance considerations.

Fig. 15. Amplitude errors. Rotman lens, α = 40, β = 0.9. (Abscissa: ZETA; curves labeled by ray angle.)

TABLE II
BEAM ONE AMPLITUDE TAPER; f_1 = 4λ

Element Number    Axes Normal to Arc (dB)    Axes through Apex (dB)
1                 -8.49                      -2.26
2                 -7.03                      -1.84
3                 -5.50                      -1.46
4                 -4.29                      -1.31
5                 -3.33                      -1.37
6                 -2.58                      -1.63
7                 -2.04                      -2.08
8                 -1.70                      -2.73
9                 -1.57                      -3.60
10                -1.68                      -4.71
11                -2.08                      -5.95

REFERENCES

[1] D. H. Archer, "Lens-fed multiple beam arrays," Microwave J., vol. 27, pp. 171-195, Sept. 1984.
[2] R. C. Hansen, "Linear arrays," in Handbook of Antenna Design, vol. 2, A. W. Rudge et al., Eds. U.K.: Inst. Elec. Eng./Peregrinus, 1983, ch. 9.
[3] T. Katagi et al., "An improved design method of Rotman lens antennas," IEEE Trans. Antennas Propagat., vol. AP-32, pp. 524-527, May 1984.
[4] D. T. McGrath, "Constrained lenses," in Reflector and Lens Antennas, C. J. Sletten, Ed. Dedham, MA: Artech House, 1988, ch. 6.
[5] L. Musa and M. S. Smith, "Microstrip port design and sidewall absorption for printed Rotman lenses," Proc. Inst. Elec. Eng., vol. 136, pt. H, pp. 53-58, Feb. 1989.
[6] D. M. Pozar, Antenna Design Using Personal Computers. Dedham, MA: Artech House, 1985, sec. 4.6.
[7] W. Rotman and R. F. Turner, "Wide-angle microwave lens for line source applications," IEEE Trans. Antennas Propagat., vol. AP-11, pp. 623-632, Nov. 1963.
[8] J. P. Shelton, "Focusing characteristics of symmetrically configured bootlace lenses," IEEE Trans. Antennas Propagat., vol. AP-26, pp. 513-518, July 1978.
[9] M. S. Smith, "Design considerations for Ruze and Rotman lenses," Radio Electron. Eng., vol. 52, pp. 181-197, Apr. 1982.
[10] M. S. Smith and A. K. S. Fong, "Amplitude performance of Ruze and Rotman lenses," Radio Electron. Eng., vol. 53, pp. 329-336, Sept. 1983.


Optimum Networks for Simultaneous Multiple Beam Antennas

Edward C. DuFort, Fellow, IEEE

Abstract- The design of passive microwave circuits for the formation of simultaneous multiple beams with arbitrary but specified shapes is considered. The maximum possible efficiency is derived from energy conservation and is determined from a Hermitian matrix whose elements are the correlation coefficients between all beam pairs. The eigenfunctions of the correlation matrix are the basis of a synthesis procedure for a practical network that will achieve the maximum efficiency. Several practical examples are given where unavoidable losses are typically 1 dB or more.

I. INTRODUCTION

The problem of forming simultaneous multiple beams from a common aperture is of considerable practical interest. For example, it may be necessary for a radar to illuminate a large elevation sector with several beams of different widths. A narrow high gain beam is used for the horizon, and successively wider beams are used at the higher angles. This provides total coverage from one multiple beam antenna (MBA) with a small number of beams. The gain reduction at higher angles can be tolerated when viewing altitude-limited targets because they are closer in range. Other examples arise in the design of multibeam feeds for dish antennas.

Even though it is assumed that lossless transmission lines and ideal matched directional couplers of all values are available, it was shown a long time ago by Allen [1] that perfect efficiency often cannot be achieved. The beamforming network (BFN) must be dissipative in order to satisfy certain energy conservation relations. Stein [2] devised a method for calculating the maximum possible efficiency of the network knowing just the amplitude and phase of the desired radiation patterns or aperture distributions. His results are fundamental, and the loss cannot be circumvented by clever design of the linear passive BFN.

A sensitivity to the problem can be obtained by referring to White's [3] work, which preceded Stein's. He studied some special cases of low sidelobe high crossover beams derived from a feed network attached to a Butler matrix. It was necessary to insert attenuation in the BFN to obtain the desired patterns. This author [4] studied the formation of a large number of identical low sidelobe beams pointing in the characteristic Butler directions. Networks were synthesized such that the Stein limit was achieved.

In the present work, the problem is considered in general. The desired complex radiation patterns or aperture distributions are presumed to be specified but are arbitrary. The analysis which develops the maximum possible efficiency also suggests a synthesis procedure. The synthesis technique is completely developed for the case of the general linear MBA. Section II applies Stein's technique to the present situation. The correlation between beams is defined and the physical significance of the correlation matrix is developed. Eigenfunctions and eigenvalues of the matrix are discussed and the efficiency limit is derived. A general synthesis procedure is developed in Section III which is based on the eigenfunction structure of the correlation matrix. This BFN produces the maximum possible efficiency. Several practical examples are discussed in Section IV, and concluding remarks are contained in Section V.

II.

EFFICIENCY LIMIT FOR GENERAL

MBA"s

The aperture of a linear array contains N elements which are connected through a linear passive BFN to M beam terminals. Excitation of beam terminal In with a unit wave produces a scattered wave Sn m at the nth aperture terminal as portrayed in Fig. 1. Perfect operation (for unit input) of the BFN would produce specified outputs B n m where ,V

L e.; I

1

2

n=l

(1)

= 1.

Since the BFN may be lossy, the scattered waves generally are of the form

(2)

where K ~m is the efficiency of the mth beam. In matrix notation I we have S

(3)

= BK

where the B matrix is specified but arbitrary and K is an unknown diagonal matrix. If the beam terminals are excited by an arbitrary input vector A, the output vector V at the aperture is

v == SA = BKA. The output power is obtained by multiplying

Manuscript received March 22, 1989; revised August 28, 1991. The author was with the Hughes Aircraft Company, Fullerton, CA. He is now at 2121 Domingo Road. Fullerton, CA 92635-3410. IEEE Log Number 9105265.

V

by the

I Upper case letters are matrices and overlined upper case letters are single column matrices (vectors). We use the dagger symbol to indicate conjugate transpose, so Ht = H means H is Hermitian. Lower case letters are scalars.

Reprinted from IEEE Transactions on Antennas and Propagation, Vol. 40, No.1, pp. 1-7, January 1992.

623

(4)

the efficiency of the best beam K n

2

= K~lwtew = KtlH or

H = WtCW = n',

BEAM FORMINGNETWORK(BFN)

AT~

Since ~e input power is power IS m

Pout

transpose

=

( l1a)

(Bt)lnBnm =

L

n=l

B:,Bnm = BJ

AlHA

o :5 K fl

. 8m. (6b)

(7b)

III.

where K;"m is the efficiency for the mth beam. The beam with the best desired efficiency is numbered one. The others are allowed relative degraded efficiencies m: In the radar example described earlier, the desired efficiency for the horizon beam is as large as possible (K II = 1, WI I = 1), but the high angle beam gain may degrade. The elements of the real diagonal matrix Ware of the form

o<

=

Wmm -s 1,

1

m

*" 1

(12)

S

1/ Al .

(13)

SYNTHESIS OF THE OPTIMUM NETWORK

The weighted correlation matrix H = W t C W has a sequence of eigenvalues AI' I = 1,···,M, and corresponding orthogonal eigenfunctions [5] ~t, which are solutions to (l lb), These eigenfunctions may be normalized

W,;

WI.

= A.nax;

AI

fl

(7a)

= KllW

:5

The maximum possible efficiency is the reciprocal of the largest eigenvalue of the H EIatrix. Now H I 1 = 1 in all cases; therefore the choice (A)m = 0ml in (12) shows that Al ~ 1. Thus, the efficiencies are always equal to or less than unity. Since H depends only on the correlation matrix of the specified beams and the specified weighting W~ no network design can exceed the efficiency limit (13) which is a fundamental energy conservation limit. It is shown in Appendix I ~hat the efficiency of the preferred beam number 1 always Improves by applying a weighting to the other beams. In the case W = I, Allen [1] showed that it is necessary that the beams be orthogonal, C = I, for K = 1. Stein [2] first derived the important results (13) for W = I. The problem of synthesizing a network which actually achieves the Stein limit for arbitrary specified beams is based on the properties of the H matrix.

where the equality holds when / = m. The desired distributions are orthogonal when C is the identity matrix; however, this condition often is not obtained, and this leads to the efficiency limit. Often the desired efficiencies are the same for each beam; however, to allow for bias, it is assumed that

K

(lIb)

therefore, the energy conservation relation (10) requires

The element elm is the scalar product between the lth and mth distributions, and is a measure of the correlation between these distributions. From the Schwartz inequality and the unit normalization (1) of B,

K mm == KllWmm

= A~

o < A.14 :5 AlA

(6a) n~l

A is

Since H is Hermitian and positive definite, it is well known that the A are real and positive [5]. These may be ordered with AM the smallest and AI and largest. Thus, for any A:

(5)

matrices. Consider the pair BtB whose elements are N

(10)

AA

H~

$= V^\dagger V = A^\dagger K^\dagger B^\dagger B K A = A^\dagger \cdot [K^\dagger \cdot (B^\dagger B) \cdot K] \cdot A$

where we have relied on the well-known formula $(AB)^\dagger = B^\dagger A^\dagger$ and indicated a particularly useful grouping of the

=

AT

for any and all inputs X Stationary values occur when an eigenfunction of the H matrix with eigenvalue ~.

Diagram of waves transmitted through BFN.

Pout

elm

the ratio of output to input

AHA

Pin

Vt:

N

2

(9)

-=KIl~SI

M

BEAMTERMINALS

Fig. 1.

fl :

Kt . BtB . K = KtCK

N

~:

(8a) (8b)

•

~I

= o/q.

(14)

This set may be used to represent an arbitrary input vector $A$, which may be applied to the beam terminals of the BFN.

and are specified. The combination of matrices in square brackets in (5) is defined in terms of a new matrix H, and


A= at

M

Lat~,

(I5a)

= ~l· A.

( 15b)

1

The response of the BFN to a unit input at beam terminal $m$ is the specified function $S_{nm} = (BK)_{nm}$ given by (3). The response to an input eigenfunction $\Psi_l$ is

$Z_l = S\Psi_l = BK\Psi_l = K_{11} B W \Psi_l.$    (16)

Note that the product of two different $Z$ vectors excited by two different $\Psi$ is

$Z_q^\dagger Z_l = \Psi_q^\dagger K^\dagger B^\dagger B K \Psi_l = |K_{11}|^2 \Psi_q^\dagger H \Psi_l$    (17a)

where (17a) follows from the definition of the $H$ matrix (9). Since $\Psi_l$ is an eigenfunction of $H$, (14) may be used to express (17a) as follows:

$Z_q^\dagger Z_l = |K_{11}|^2 \lambda_l \delta_{lq} = \dfrac{\lambda_l}{\lambda_1} \delta_{lq}.$    (17b)

Thus the output responses to eigenfunction inputs are orthogonal; however, there is loss, $|Z_l|^2 = \lambda_l/\lambda_1 \le 1$ because $0 < \lambda_l \le \lambda_1$. Define new orthogonal functions which are proportional to $Z_l$ and orthonormal.

$E_l = \left(\dfrac{\lambda_1}{\lambda_l}\right)^{1/2} Z_l = \dfrac{B W \Psi_l}{\lambda_l^{1/2}}$    (18a)

$E_l^\dagger E_q = \delta_{lq}.$    (18b)

Equation (16) may be written in terms of the orthonormal $E$ vectors as follows:

$S\Psi_l = \left(\dfrac{\lambda_l}{\lambda_1}\right)^{1/2} E_l, \quad l = 1, \ldots, M.$    (19a)

In matrix notation these become:

$SU = \mathcal{E} T$    (19b)

where $U$ is a matrix whose columns are the orthonormal vectors $\Psi_l$, $\mathcal{E}$ is an $N \times M$ matrix whose column vectors are the orthonormal vectors $E_l$, and $T$ is a diagonal matrix with elements $(\lambda_l/\lambda_1)^{1/2}$. Note that $U^\dagger U = I$ or $U^\dagger = U^{-1}$; consequently, $UU^\dagger = I$ as well, and both the rows and columns of $U$ are orthonormal ($U$ is unitary). Equation (19b) may be solved for $S$ by postmultiplying each side by $U^\dagger$:

$SUU^\dagger = S = \mathcal{E} T U^\dagger.$    (20a)

When beam terminal $q$ is excited with unit amplitude, $(A)_m = \delta_{mq}$, the first operation in (20a) produces the result

$(U^\dagger A)_l = \sum_{m=1}^{M} (U^\dagger)_{lm} A_m = (U^\dagger)_{lq} = (\Psi_l^\dagger)_q = U_{ql}^*.$    (20b)

Since the rows of $U$ are orthonormal, or

$\sum_{l=1}^{M} U_{ql}^* U_{sl} = \delta_{qs}, \quad q, s = 1, \ldots, M$    (20c)

then output vectors of the $U^\dagger$ operation due to two different beam terminal excitations are orthogonal. Thus the operation of the $U^\dagger$ matrix can be performed by a lossless network composed of hybrid junctions where the network transmission coefficients $m$ to $l$ are $U_{ml}^*$. The operation of the $T$ matrix may be performed by attaching attenuators $(\lambda_l/\lambda_1)^{1/2}$ to the $l$th output terminal of the $U^\dagger$ network. Because the vectors $E_l$ are orthonormal, the operation of the $\mathcal{E}$ matrix also may be realized by a lossless network of hybrid junctions where the network transmission coefficients are $(E_l)_n$. A block diagram of the network is shown in Fig. 2.

The initial synthesis problem for $S$ reduces to synthesizing two lossless networks, the $\mathcal{E}$ network and the $U^\dagger$ network. The procedure may be the same for both. The insides of the $\mathcal{E}$ and $U^\dagger$ networks may be realized in a number of ways: series, parallel, or combinations of hybrid junction types. The Blass network is convenient for large arrays. These networks are described in the literature [6]-[8] from which the required synthesis can be deduced, but not without difficulty. The synthesis of the $\mathcal{E}$ network in Blass form, including limitations in coupling values, is contained in Appendix II.

Note that in the radar case, the efficiency loss must be accepted in order to transmit three independent beams; however, on receive, identical low noise amplifiers (LNA's) may be employed. If placed on either side of the $U^\dagger$ network the noise input to the receivers will be the same as the noise produced by each LNA. The signals will be degraded by the factor $W_{mm}/\lambda_1$ as before. With the LNA's placed between the attenuators and the $\mathcal{E}$ network, the signals still are degraded, but the fate of the noise is more complex. Let each LNA produce unit noise power. Since the transmission coefficients through the $U^\dagger$ network are $U_{mn}^*$, the noise $\sigma_m^2$ at the $m$th receive terminal is

(21a)

(21b)

Thus, received noise is always lowered in any channel by having the LNA's between the attenuators and the $\mathcal{E}$ network instead of at the receivers. It is not possible to say in general that the useful performance index, gain/noise or $1/(\lambda_1 \sigma_m^2)$, will recover from the loss produced by the attenuators; however, it is easily calculated in any particular case using (21a).

IV. EXAMPLES

The first example requires that two similar beams be formed with the same symmetrical amplitude distribution $D_n$, but different linear phase progressions.

$D_n = D_{N-n+1}, \quad n = 1, \ldots, N/2 \quad (N \text{ even})$

$B_{n1} = D_n e^{j(n-(N+1)/2)\phi_0} \Big/ \Big[\sum_{k=1}^{N} D_k^2\Big]^{1/2}$    (22a)

$B_{n2} = D_n e^{-j(n-(N+1)/2)\phi_0} \Big/ \Big[\sum_{k=1}^{N} D_k^2\Big]^{1/2}$    (22b)

Neither beam is preferred; therefore, $W = I$, $H = C$, and the 2 x 2 correlation matrix has only one distinct off-diagonal element $C_{12}$.

$C_{12} = \dfrac{\sum_{n=1}^{N} D_n^2 e^{-j2(n-(N+1)/2)\phi_0}}{\sum_{n=1}^{N} D_n^2} = \dfrac{2\sum_{n=1}^{N/2} D_n^2 \cos[2(n-(N+1)/2)\phi_0]}{\sum_{n=1}^{N} D_n^2} = \cos\alpha.$    (23)

The homogeneous equations (11) for $\lambda$ and $\Psi$ may be written:

$\begin{pmatrix} 1-\lambda & \cos\alpha \\ \cos\alpha & 1-\lambda \end{pmatrix} \begin{pmatrix} (\Psi)_1 \\ (\Psi)_2 \end{pmatrix} = 0.$

The determinant must vanish; consequently $\lambda = 1 \pm \cos\alpha$. Assume $\cos\alpha > 0$:

$\lambda_1 = 2\cos^2\dfrac{\alpha}{2}, \quad \Psi_1^\dagger = (1/\sqrt{2},\ 1/\sqrt{2})$    (24a)

$\lambda_2 = 2\sin^2\dfrac{\alpha}{2}, \quad \Psi_2^\dagger = (1/\sqrt{2},\ -1/\sqrt{2}).$    (24b)

If $\cos\alpha < 0$, the roles of $\lambda_1$ and $\lambda_2$ are interchanged. The maximum possible efficiency is

$|K_{11}|^2 = \dfrac{1}{\lambda_1} = \dfrac{1}{1+\cos\alpha} < 1.$    (25)

Since excitation of beam terminal 1 produces equal in-phase outputs $\Psi_1$, and excitation of terminal 2 produces equal out-of-phase outputs $\Psi_2$, the $U^\dagger$ network is simply a magic T. The transmission coefficient of the attenuator attached to the difference arm is

$\left(\dfrac{\lambda_2}{\lambda_1}\right)^{1/2} = \tan\dfrac{\alpha}{2}.$    (26)

From (19b) the $E$ vectors are

$(E_1)_n = \dfrac{B_{n1} + B_{n2}}{2\cos(\alpha/2)}$    (27a)

$(E_2)_n = \dfrac{B_{n1} - B_{n2}}{2\sin(\alpha/2)}.$    (27b)

Fig. 2. Diagram of the ideal network. (Diagram labels: beam terminals, lossless $U^\dagger$ network, lossless $\mathcal{E}$ network, aperture.)

Thus the $\mathcal{E}$ network is a monopulse-type feed that may be realized in a number of ways such as: a dual series Lopez feed [9], a Blass feed, or a parallel feed. The form of the entire network is shown in Fig. 3 for the parallel feed $\mathcal{E}$ network. These simple results may have been obvious from symmetry; however, it is reassuring that the general synthesis procedure produces the same networks.

Fig. 3. Monopulse feed. (Diagram labels: typical magic tee, sum, difference, beam 1, beam 2.)

Another example is the case where the aperture distributions are cosine amplitude with the characteristic Butler matrix phase gradients:

$B_{nm} = \left(\dfrac{2}{N}\right)^{1/2} \cos\left[(n-\bar{n})\dfrac{\pi}{N}\right] e^{j(n-\bar{n})(m-\bar{m})2\pi/N}, \quad n = 1, \ldots, N;\ m = 1, \ldots, M$    (28)

where $\bar{n} = (N+1)/2$, $\bar{m} = (M+1)/2$ and $B$ is normalized to unity square magnitude. This case was treated in [4] for large $N$ and $M$. According to that analysis the efficiency for $W = I$ should be the ratio of the average to the peak value of the amplitude distribution squared.

$|K_{11}|^2 = \dfrac{1}{N}\sum_{n=1}^{N} |B_{nm}|^2 \Big/ |B_{nm}|^2_{\max} = 1/[N |B_{nm}|^2_{\max}] = 1/2.$    (29)

This is confirmed by the present analysis as shown below. By writing $B$ in terms of two orthogonal exponential functions

$B_{nm} = \dfrac{e^{j(n-\bar{n})(m-\bar{m}+(1/2))2\pi/N}}{(2N)^{1/2}} + \dfrac{e^{j(n-\bar{n})(m-\bar{m}-(1/2))2\pi/N}}{(2N)^{1/2}}$

it is easy to verify that $B$ is properly normalized, and the correlation matrix elements are

$C_{lm} = 1, \quad m = l$    (30a)
$C_{lm} = 1/2, \quad m = l \pm 1$    (30b)
$C_{lm} = 0, \quad \text{all other } m.$    (30c)

With $W = I$, $H = C$, (11) are

$(\Psi)_{m+1} + (\Psi)_{m-1} - 2(\Psi)_m(\lambda - 1) = 0, \quad m = 1, 2, \ldots, M$    (31a)

$(\Psi)_0 = (\Psi)_{M+1} = 0.$    (31b)

A solution of the form

$(\Psi_l)_m = \left(\dfrac{2}{M+1}\right)^{1/2} \sin\left(\dfrac{ml\pi}{M+1}\right), \quad m, l = 1, \ldots, M$    (32)

satisfies the end conditions (31b) and is normalized to unity square magnitude. It also satisfies (31a) provided

$\lambda = \lambda_l = 2\cos^2\dfrac{\pi l}{2M+2}.$    (33)

Consequently, the efficiency of each of the beams is

$|K_{11}|^2 = \dfrac{1}{\lambda_1} = \dfrac{1}{2\cos^2\dfrac{\pi}{2M+2}}$    (34)

which is in agreement with (29) when $M$ is large. The simplest network which achieves the efficiency of 1/2 is obtained by placing attenuation in the aperture of a Butler matrix [4]; therefore, the network derived by the present approach is omitted. This example does show that the present approach produces the same results as those derived by an independent method; however, the optimum network is not unique.

V. SUMMARY AND CONCLUSION

The maximum efficiency of a network that forms multiple beams is determined entirely by the beam correlation matrix. Innocuous appearing beam clusters can lead to surprisingly poor efficiencies. A class of optimum networks which achieve the Stein maximum efficiency limit is described based on the eigenvectors and eigenvalues of the correlation matrix. The beamforming networks consist of two lossless networks joined by a bank of attenuators determined by the eigenvalues. The lossless network closest to the beam terminals is determined by the eigenvectors, whereas the other lossless network closest to the aperture is determined by the eigenvectors and the desired distributions. Both lossless networks alone produce orthogonal outputs; therefore they can be realized by interconnecting hybrid couplers. One standard form is the Blass network, which may be used for both lossless networks. Several examples show that the networks are relatively simple and results obtained earlier by other means are verified here.

APPENDIX I

EFFICIENCY CHANGE DUE TO THE DIAGONAL WEIGHT MATRIX W

When $W = I$, then $H = C$ and

$C\Psi_1 = \lambda_1 \Psi_1.$    (35)

When $W \ne I$, $H = W^\dagger C W$ and

$W^\dagger C W \Psi_1' = \lambda_1' \Psi_1'.$    (36)

Regardless of $W$, the largest eigenvalue is greater than unity, and from the Rayleigh quotient applied to (36) $\lambda_1'$ satisfies the following inequalities:

$\lambda_1' = \dfrac{\Psi_1'^\dagger W^\dagger C W \Psi_1'}{|\Psi_1'|^2} = \dfrac{(\Psi_1'^\dagger W^\dagger) C (W\Psi_1')}{|W\Psi_1'|^2} \cdot \dfrac{\sum_n |W_{nn}(\Psi_1')_n|^2}{\sum_n |(\Psi_1')_n|^2}.$    (37)

The first term on the right is the Rayleigh quotient applied to (35), and it is less than $\lambda_1$, or

$\lambda_1' \le \lambda_1 \dfrac{\sum_n W_{nn}^2 |(\Psi_1')_n|^2}{\sum_n |(\Psi_1')_n|^2}.$    (38)

The application of a weighting function always improves the efficiency of the first or preferred beam.

APPENDIX II

RECURSIVE DESIGN OF BLASS MATRICES

The Blass matrix can be analyzed in terms of directional cross couplers and interconnecting transmission lines. The cross coupler has the scattering properties shown in Fig. 4. There may be an upper limit $\alpha < \alpha_{mx} < \pi/2$ due to practical bandwidth or dimensional limitations of the coupler types chosen. This possibility is included in the calculations; however, efficiency will degrade beyond that due to beam correlation. The couplers usually are equally spaced, separation $L$, and are joined by transmission lines with wavenumber $k_g$. A typical network has $N$ antenna elements and $M$ input ports ($N > M$) such as the network described in Section III. The network is shown in Fig. 5. The $U$ network may be similar, but $N = M$. A typical section of network with

0

t-

j SINo(

Q

CD

----

~

COSO(

1

G)

s=

COUPLER an

Fig. 6.

0

0

COSa

-JSINa

0

0

- j SIN a

casu

a

0

casu -JSlN

Fig. 4.

-j SIN

a

cos e

0-

0

$r_n$ starting with $r_n = (E_q)_n$ at the aperture. At each iteration we have $z_1 = 0$ since only the $q$th input is excited. Thus from (40) with $z_1 = 0$, $r_1'$ can be determined, then $z_2$ from (39), and cycle back to (40) to calculate $r_2'$, etc., until $r_N'$ is calculated. Designate $r_n'$ to be a new $r_n$ associated with the next bank and repeat the cycle until the $q$th bank is reached.² The second application of these equations arises at this point, where known $r_n$ are to be generated by the $q$th bank with $r_n' = 0$. In this case (39) and (40) reduce to

c

Cross-guide directional coupler.

2

Typical section of Blass network.

'n

N

=

Zn+l

'ne-jt/>"

LOAD

I

z, cos

Qne-jkgL

(41)

= -jzn sin an.

(42)

If $z_1$ is chosen real, then from (41) and (42) the unknown phases are

(43)

-Zn- =e -j(n- I)k g L t z,

2

"

I

TYPICAL FIXED PHASE SHIFTER

(44)

As in [10], the amplitudes are determined from inspection of Fig. 6 in conjunction with energy conservation or from the amplitudes in (41) and (42).

M

z, 1

2

=

I ZN+ I 1

2

+

N

Ln 1 '/1

2

(45)

.

Inserting $z_n$ from (42) produces the expression for the coupling parameter $\alpha_n$.

Fig. 5. Blass network.

various traveling waves is shown in Fig. 6 for the transmit case. A known vector $R$ with components $(R)_n$ travels away from the coupler bank, after passing through phase shifters $\phi_n$. Incident waves $z_n$ and $r_n'$ are scattered by couplers, each of which is characterized by an angular parameter $\alpha_n$ ($\alpha$ in Fig. 4). Referring to the typical coupler in Fig. 6, the outputs $z_{n+1}$ and $r_n$ are related to inputs $r_n'$ and $z_n$ by the following equations derived with the aid of Fig. 4:

csc

2a

n

= [

I .1 + ~ I r ]/1 r, 1~ esc? ««. ZN+

2

l 1

2

2

(46)

'n

r:

or

I ZN+I 12 =

(40)

max[ I r, 1

2

2 CSC

amx

(47)

n

The following choice minimizes the wasted power

(39)

These equations are applied at each bank of couplers in one of two ways. First, assume the $\mathcal{E}$ network is to be designed and all the values of $\phi_n$ and $\alpha_n$ for coupler banks $1, 2, \ldots, q-1$ which generate vectors $E_1, \ldots, E_{q-1}$ have been determined. The $q$th bank is designed by recursively calculating $r_n'$ from

N

L 1',1 2 •

IZN+112~ IrnI2csc2a.mx-

-

~ 1',1

1 2

Z N + 1 I 2: ] .

(48)

The coupling now can be determined from (46), and the input power is determined from (45) with $n = 1$. Since the $E_q$ are normalized to unity, the efficiency is

$\text{efficiency} = |E_q|^2 / |z_1|^2 = 1/|z_1|^2.$    (49)

² Although $z_{N+1}$ is not needed, it can be calculated and should be zero. This curious result also follows from energy conservation.

Thus, starting with $q = 1$ and $r_n = (E_1)_n$, the entire $\mathcal{E}$ network can be built recursively until $q = M$. The $U$ network may be constructed similarly if desired. Note from (48) when $\alpha_{mx} = \pi/2$, $|z_{N+1}|^2 = 0$ and the networks $\mathcal{E}$ and $U^\dagger$ will be lossless. Generally, the load losses will require adjustment of the attenuators in Fig. 2. Combine the efficiencies of the $U^\dagger$ and $\mathcal{E}$ networks for the $q$th interconnecting channel to form an overall efficiency $|t_q|^2$. Then the attenuators in Fig. 2 are $|T_q|^2$ as follows:

$|T_q|^2 |t_q|^2 = \lambda_q \cdot (\text{constant}).$

Since $|T_q|^2 \le 1$,

$|T_q|^2 = \dfrac{\lambda_q / |t_q|^2}{(\lambda_q / |t_q|^2)_{\max}}$    (50)

which reduces to $\lambda_q/\lambda_1$ as before when $|t_q|^2 = 1$. The overall efficiency derived in Section II must be reduced by the load losses in both the $U^\dagger$ and $\mathcal{E}$ networks.

REFERENCES

[1] J. L. Allen, "A theoretical limitation on the formation of lossless beams in linear arrays," IRE Trans. Antennas Propagat., vol. AP-9, p. 350, July 1961.
[2] S. Stein, "On cross coupling in multiple-beam antennas," IRE Trans. Antennas Propagat., vol. AP-10, p. 548, Sept. 1962.
[3] W. B. White, "Pattern limitations in multi-beam antennas," IRE Trans. Antennas Propagat., vol. AP-10, p. 430, July 1962.
[4] E. C. DuFort, "Optimum low side lobe high crossover multiple beam antennas," IEEE Trans. Antennas Propagat., vol. AP-33, p. 946, Sept. 1985.
[5] G. Strang, Linear Algebra and its Applications, 2nd ed. New York: Academic, 1980, ch. 5 and 6.
[6] R. C. Hansen, Ed., Microwave Scanning Antennas. New York: Academic, 1966, ch. 3, p. 247.
[7] R. C. Johnson and H. Jasik, Eds., Antenna Engineering Handbook. New York: McGraw-Hill, 1984, ch. 20, p. 56.
[8] Y. T. Lo and S. W. Lee, Eds., Antenna Handbook. New York: Van Nostrand Reinhold, 1988, ch. 19, p. 10.
[9] A. R. Lopez, "Monopulse networks for series feeding in antennas," IEEE Trans. Antennas Propagat., vol. AP-16, p. 436, July 1968.
[10] W. R. Jones and E. C. DuFort, "On the design of optimum dual-series feed networks," IEEE Trans. Microwave Theory Tech., vol. MTT-19, p. 451, May 1971.

Direction Finding in Phased Arrays with a Neural Network Beamformer

Hugh L. Southall, Senior Member, IEEE, Jeffrey A. Simmers, and Teresa H. O'Donnell

Abstract: Adaptive neural network processing of phased-array antenna received signals promises to decrease antenna manufacturing and maintenance costs while increasing mission uptime and performance between repair actions. We introduce one such neural network which performs aspects of digital beamforming with imperfectly manufactured, degraded, or failed antenna components. This paper presents measured results achieved with an adaptive radial basis function (ARBF) artificial neural network architecture which learned the single source direction finding (DF) function of an eight-element X-band array having multiple, unknown failures and degradations. We compare the single source DF performance of this ARBF neural network, whose internal weights are computed using a modified gradient descent algorithm, with another radial basis function network, Linnet, whose weights are calculated using linear algebra. Both networks are compared to a traditional DF approach using monopulse.

I. INTRODUCTION

Standard antenna beamforming algorithms, such as monopulse, require calibrated antennas because they depend on nearly identical antenna element performance. These algorithms do not perform well with uncalibrated antennas or unknown degradations. As phased-array antennas become larger and more highly integrated into physical structures, this uniformity requirement generates production and maintenance costs which are increasingly prohibitive for many military and civilian applications. The requirement for nearly identical elements results from a lack of adaptive beamforming algorithms capable of managing the complexities introduced by nonidentical elements with unknown behaviors.

Traditional techniques synthesize aperture behavior as a mathematical combination of well-matched element and receiver channel responses. Neural network based beamforming (neural beamforming), in contrast, attempts to approximate overall aperture behavior from a finite number of observations of that behavior under varying circumstances. If we assume that the mapping between the received signals and the antenna's behavior is a continuous function, it is possible to model it with an artificial neural network trained at discrete samples along the function. After this learning process, the network can predict antenna behavior at points between the training points by generalizing.

In Section II we present the architecture of a neural beamformer designed for single source direction finding and discuss antenna measurement preprocessing, a radial basis function (RBF) neural network, and output postprocessing. We compare two variations of the RBF neural network, the first an adaptive network, ARBF [1], which uses gradient descent optimization training, and the second, a linear algebra based network, Linnet, which trains using a least mean squared (LMS) error solution. Comparisons between ARBF and Linnet and analysis of an error-weight surface show that the ARBF implementation converges to a near-optimal solution in only a few iterations.

Section III briefly describes a monopulse direction finding (DF) algorithm whose performance we compare with the networks' performance. This algorithm, which is calibrated to partially compensate for nonideal element behavior and array misalignment, relies on near-identical array elements to form high-quality antenna beams for accurate results. Section IV presents and compares the experimental DF performance of the monopulse algorithm, Linnet, and ARBF at locating single sources in 10 data sets, taken under various signal-to-noise and interference conditions.

Manuscript received February 1~, 1994; revised March 13, 1995. This work was supported in part by the USAF Contract #F19628-92-C-0177. H. Southall and J. Simmers are with the USAF Rome Laboratory Electromagnetics and Reliability Directorate, Hanscom AFB, MA 01731 USA. T. O'Donnell is with ARCON Corporation, Waltham, MA 02154 USA. IEEE Log Number 9415634.

II. NEURAL BEAMFORMER DIRECTION FINDING

This section briefly describes the DF neural beamformer components and structure, introduces and compares the adaptive ARBF and linear algebra based Linnet RBF networks, and explains the rapid convergence and near-optimal performance of the adaptive network.

A. Neural Beamformer Architecture

The neural beamformer architecture consists of antenna measurement input preprocessing, an artificial neural network, and output postprocessing. This section briefly summarizes the purpose and interaction of these functional elements. Network preprocessing exploits antenna expertise to simplify and enhance neural network inputs. It removes redundant or irrelevant information, eliminates artificial discontinuities in the input function space, and reduces problem inputs to a small set of relevant information. Although neural networks can learn to ignore irrelevant inputs, and discontinuities can be trained "across" if their locations are known and boundary points are available, these techniques usually create larger (and slower) networks than ones which utilize intelligent preprocessing. In the problem of single source DF, the amplitude of the received signal is not a strong indicator of the angle of arrival. The absolute phase of the received signal at each element also contains nonessential information. There is, however, a strong relationship between relative element phases and angle

Reprinted from IEEE Transactions on Antennas and Propagation, Vol. 43, No. 12, pp. 1369-1374, December 1995.


Fig. 1. The neural beamformer architecture consists of antenna measurement input preprocessing, an artificial neural network, and output postprocessing which gives an estimate, $\hat{\theta}$, of the angle of arrival, $\theta$. (Diagram labels: antenna, input nodes, Gaussian processing nodes, weights, summation nodes.)

of arrival. Therefore, we preprocess the measured phases at each element to determine the phase differences between consecutive array elements. These phase differences, however, contain artificial discontinuities caused by phase transitions (or branch cuts) in received phase measurements from -180 degrees to +180 degrees. Discontinuities make it difficult for the network to learn the mapping from a small discrete set of training points, especially since the branch cuts are quasirandom (dependent on arbitrary receiver phase references). To eliminate the branch cuts, we use the sine and cosine of the phase differences as final processed inputs. These functions have the added benefit of bounding the inputs between -1 and 1, which is not essential for the RBF network but may be useful for other network architectures. It is important to note that we preprocess raw antenna measurements; there is no calibration or traditional antenna processing to correct for element mismatches.

We chose RBF neural networks for this antenna application for a number of reasons. As mentioned earlier, the relationship between source angle and antenna measurements is generally a continuous function with small changes in angle yielding small changes in received measurements. After preprocessing, this is true for fully functional antennas and many forms of degraded ones. (Severely degraded antennas or those with intermittent errors such as phase shifter bit failures, however, may exhibit discontinuous behavior.) Therefore, we chose a neural network architecture which has proven successful at approximating continuous functions from a small set of samples and can be trained across discontinuities. Real-time processing requirements, independent of antenna size, also generated two guidelines for the network architecture. First, the network should have a constant processing delay regardless of the number of inputs. Second, the network should have a minimum number of layers (computations that must be performed sequentially). Thus as the number of inputs increases (larger antenna), the size of the network layers should grow at the same rate, with no additional layers required. The RBF network architecture reportedly satisfies all of these constraints. The mathematical basis for these networks stems from the fitting (approximation) and regression capabilities of RBF's [2], [3], combined with a feed-forward neural network architecture whose single "hidden" layer does not grow faster than the number of inputs [4], [5]. A three-layer RBF network can also theoretically model any continuous function [2].

The architecture of a three-layer RBF network, shown in Fig. 1, consists of an input layer, a hidden layer of Gaussian RBF's, and an output layer of summation nodes. The input nodes receive the preprocessed antenna data and broadcast the input vectors to each hidden layer node. Each input vector, $x$, is an element of the $n$-dimensional input space, $X$. For our experimental eight-element antenna, $n = 14$. Each input vector contains the seven cosines and seven sines of the phase differences between the elements. For these $n$-component input vectors, $x = (x_1, x_2, \ldots, x_n)$, preprocessing ensures that $x_k \in [-1, 1]$, where $k = 1, 2, \ldots, n$. The hidden layer maps $X$ into a space $\Phi$ which consists of $q$-component Gaussian RBF vectors, $\phi$, where $q$ is the number of hidden layer nodes. The components of $\phi$ are Gaussian functions, (1),

where $\phi_i$ is the output of the $i$th hidden layer node for $i = 1, 2, \ldots, q$. $\phi_i$ is calculated from the input vector, $x$, the Gaussian center, $m_i$, and the spread parameter, $\sigma_i$. Initially, $\sigma_i$ was a trainable parameter which varied for each Gaussian node. We achieved comparable results, however, by using the same value of the spread parameter for all hidden layer nodes, so now $\sigma_i = \sigma$. Each output node computes a weighted sum of the outputs generated by the hidden layer nodes

$y_j = \sum_{i=1}^{q} w_{ij}\,\phi_i$    (2)

where the $w_{ij}$ values are the output weights. Thus the output layer maps $\Phi$ into $Y$, the space of all possible angular directions to the source, where the output vector $y = (y_1, y_2, \ldots, y_r) \in Y$ for $r$ output nodes. We postprocess the energy in the $r$ output nodes to estimate the source angle of arrival. For single source DF we could train a single output node to emit a value proportional to the source angle. Since our goals include multiple source detection and DF, however, we chose multiple output nodes representing bins of energy in discrete angular regions of space. For an eight-element array, we use $r = 13$ output bins, centered at 10-degree intervals from -60 degrees to +60 degrees, inclusive. We train the output nodes to emit values between zero and one, inclusive, which represent the presence and strength of a source within each angular bin. A bin output of "1" indicates a source exactly on the bin location and "0" represents no source. Values between zero and one on consecutive output nodes represent a source located between the bin angles represented by those output nodes.
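The data flow described above (phase differences to sines and cosines, Gaussian hidden layer, weighted sums, angular bins) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the Gaussian is assumed to have the form $\exp(-|x - m_i|^2/\sigma^2)$, the bin decoding is assumed to be a linear, energy-weighted interpolation between the strongest adjacent pair of bins, and all function names are hypothetical.

```python
import math


def preprocess(phases_deg):
    """Map element phases (degrees) to cosines and sines of consecutive phase differences."""
    diffs = [math.radians(b - a) for a, b in zip(phases_deg, phases_deg[1:])]
    return [math.cos(d) for d in diffs] + [math.sin(d) for d in diffs]


def rbf_forward(x, centers, sigma, weights):
    """Gaussian hidden layer followed by the weighted sums of (2).

    Assumed Gaussian form: phi_i = exp(-|x - m_i|^2 / sigma^2).
    weights[i][j] connects hidden node i to output node j.
    """
    phi = [math.exp(-sum((xk - mk) ** 2 for xk, mk in zip(x, m)) / sigma ** 2)
           for m in centers]
    n_out = len(weights[0])
    return [sum(weights[i][j] * phi[i] for i in range(len(phi)))
            for j in range(n_out)]


# 13 bins centered at 10-degree intervals from -60 to +60 degrees
BIN_ANGLES = [-60 + 10 * j for j in range(13)]


def decode_angle(y, bin_angles=BIN_ANGLES):
    """Interpolate between the adjacent pair of bins holding the most energy (assumed rule)."""
    j = max(range(len(y) - 1), key=lambda k: y[k] + y[k + 1])
    total = y[j] + y[j + 1]
    if total <= 0:
        return float(bin_angles[j])
    return (y[j] * bin_angles[j] + y[j + 1] * bin_angles[j + 1]) / total


# Eight element phases give seven differences, hence 14 network inputs.
x = preprocess([0, 30, 60, 90, 120, 150, 180, 210])
# The same measurement with two phases wrapped across the +/-180 degree branch cut.
x_wrapped = preprocess([0, 30, 60, 90, 120, 150, -180, -150])
print(len(x), max(abs(a - b) for a, b in zip(x, x_wrapped)))
```

The wrapped and unwrapped phase sets produce the same inputs up to rounding, which is the property the sine and cosine preprocessing is meant to provide.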


Our postprocessing technique is used for both the single source DF networks presented here and for multiple source DF networks currently under development. We locate energy concentrations among the output nodes and consider each concentration to represent a source location. For each concentration, we determine the consecutive pair of nodes with the largest energy and interpolate between the bin angles to determine the angle of the source represented by that concentration. Although this technique performs well for single sources and multiple sources which are farther apart than two bin-widths, alternate output representations are necessary for multiple-source super-resolution.

B. Network Training

In the previous section we presented the forward computational architecture of the RBF network. This computation will only produce correct results if the network has been adequately trained with pairs of inputs and their corresponding outputs. We investigated two training techniques for the RBF networks: a standard adaptive RBF training method and a linear algebra technique.

Adaptive RBF training adds Gaussian RBF's to the hidden layer as needed. Initially empty, the hidden layer grows as training points are presented to the network. At each training point, either 1) a new Gaussian function is added with its center on the training point and its initial network weights $w_{ij}$ chosen to produce the correct network response for that point, or 2) existing "close" Gaussian functions have their centers moved and weights adjusted to incorporate the new training data. Network weights are adjusted using a modified gradient descent algorithm known as "backpropagation." Additional details regarding the network architecture and training can be found in two articles by Lee [6], [7]. We refer to this implementation of the neural beamformer as ARBF.

Backpropagation of network error, however, does not ensure global optimization and introduces several vague network aspects. Often, the network designer must choose parameters for which the literature contains only general guidelines. Two important parameters include the step size and the stopping criteria. The step size dictates the convergence speed of the weights and whether they will converge or oscillate. One popular approach, which decreases the step size slowly, requires choice of initial step size, decreasing function, and stopping criteria. Even if these parameters are correctly chosen for the application, backpropagation may still stop at local minima.
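The sensitivity to step size can be illustrated with a plain delta-rule update of one output node's weights. This is only a sketch: it uses ordinary gradient descent on a single hypothetical training pattern, not the modified algorithm used for ARBF, and the hidden-layer outputs and step sizes are made-up values. For one pattern the error is multiplied by $(1 - \eta |\phi|^2)$ at each update, so it decays only when $0 < \eta < 2/|\phi|^2$.

```python
def train_output_weights(phi, target, step, iterations=50):
    """Delta-rule (gradient descent) updates of one output node's weights
    for a single training pattern; returns final weights and error history."""
    w = [0.0] * len(phi)
    errors = []
    for _ in range(iterations):
        y = sum(wi * pi for wi, pi in zip(w, phi))
        err = target - y
        errors.append(abs(err))
        # Move each weight along the negative gradient of the squared error.
        w = [wi + step * err * pi for wi, pi in zip(w, phi)]
    return w, errors


phi = [1.0, 0.5]  # hypothetical hidden-layer outputs, |phi|^2 = 1.25
_, small_step_errors = train_output_weights(phi, target=1.0, step=0.5)
_, large_step_errors = train_output_weights(phi, target=1.0, step=2.0)
print("step 0.5, final error:", small_step_errors[-1])
print("step 2.0, final error:", large_step_errors[-1])
```

With the smaller step the error shrinks toward zero; with the larger step it alternates in sign and grows, which is the oscillation the text warns about.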
Unlike ARBF, which adds basis functions as needed, the Linnet network places Gaussian RBF's at all training points and solves for the network weights using linear algebra [8], [9]. Linnet finds globally optimal weight values, in the LMS error sense. The weights are only optimal with respect to the training data used to construct the corresponding matrices, not necessarily for any other data. Consider a specific set of $N$ input vectors from the input space $X$, labeled $x_l$, where $l = 1, 2, \ldots, N$. For our data sets, which measure a source at 1-degree intervals from -60 degrees to +60 degrees, $N = 121$. Each component of the $n$-dimensional input vector is $x_{lk}$, where $k = 1, 2, \ldots, n$. Similarly, consider a specific set of $N$ Gaussian RBF vectors
