RANDOM FIELDS ESTIMATION
This page intentionally left blank
RANDOM FIELDS ESTIMATION
Alexander G. Ramm Kansas State University, USA
N E W JERSEY
. LONDON
sWorld Scientific Y SINGAPORE
BElJlNG
SHANGHAI
. HONG
KONG
- TAIPEI - C H E N N A I
Published by World Scientific Publishing Co. Pte. Ltd. 5 Tob Tuck Link, Singapore596224 USA ofice: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601 UK ofice: 57 Sbelton Street, Covent Garden, London WC2H 9HE
British Library Cataloguing-in-PublicationData A catalogue record for this book is available from the British Library.
RANDOM FIELDS ESTIMATION Copyright 0 2005 by World Scientific Publishing Co. Re. Ltd. All rights reserved. This book, or parts thereoJ; may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any informationstorage and retrieval system now known or to be invented, without written permission from the Publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
ISBN 981-256-536-1
Printed in Singapore by World Scientific Printers (S) Pte Ltd
To the memory of my parents
This page intentionally left blank
Preface
This book presents analytic theory of random fields estimation optimal by the criterion of minimum of the variance of the error of the estimate. This theory is a generalization of the classical Wiener theory. Wiener’s theory has been developed for optimal estimation of stationary random processes, that is, random functions of one variable. Random fields are random functions of several variables. Wiener’s theory was based on the analytical solution of the basic integral equation of estimation theory. This equation for estimation of stationary random processes was Wiener-Hopf-type of equation, originally on a positive semiaxis. About 25 years later the theory of such equations has been developed for the case of finite intervals. The assumption of stationarity of the processes was vital for the theory. Analytical formulas for optimal estimates (filters) have been obtained under the assumption that the spectral density of the stationary process is a positive rational function. We generalize Wiener’s theory in several directions. First, estimation theory of random fields and not only random processes is developed. Secondly, the stationarity assumption is dropped. Thirdly, the assumption about rational spectral density is generalized in this book: we consider kernels of positive rational functions of arbitrary elliptic selfadjoint operators on the whole space. The domain of observation of the signal does not enter into the definition of the kernel. These kernels are correlation functions of random fields and therefore the class of such kernels defines the class of random fields for which analytical estimation theory is developed. In the appendix we consider even more general class of kernels, namely kernels R(z,y), which solve the equation QR = P6(z - y). Here P and Q are elliptic operators, and S(z - y) is the delta-function. We study singular perturbation problem for the basic integral equation of estimation theory Rh = f . The solution to this equation, which is of interest vii
Random Fields Estimation Theory
viii
in estimation theory, is a distribution, in general. The perturbed equation, Eh, Rh, = f has the unique solution in L 2 ( D ) .The singular perturbation problem consists of the study of the asymptotics of h, as E +.0. This theory is not only of mathematical interest, but also a basis for the numerical solution of the basic integral equation in distributions. We discuss the relation between estimation theory and quantum-mechanical non-relativistic scattering theory. Applications of the estimation theory are also discussed. The presentation in this book is based partly on the author’s earlier monographs [Ramm (1990)] and [Ramm (1996)], but also contains recent results [Ramm (2002)], [Ramm (2003)], [Kozhevnikov and Ramm (2005)], and [Ramm and Shifrin (2005)l. The book is intended for researchers in probability and statistics, analysis, numerical analysis, signal estimation and image processing, theoretically inclined electrical engineers, geophysicists, and graduate students in these areas. Parts of the book can be used in graduate courses in probabilty and statistics. The analytical tools that the author uses are not usual for statistics and probability. These tools include spectral theory of elliptic operators, pseudodifferential operators, and operator theory. The presentation in this book is essentially self-contained. Auxiliary material which we use is collected in Chapter 8.
+
Contents
vii
Preface
.
1 Introduction 2
.
1
Formulation of Basic Results 2.1 Statement of the problem . . . . . . . . . . . . . . . . . . . 2.2 Formulation of the results (multidimensional case) . . . . . 2.2.1 Basic results . . . . . . . . . . . . . . . . . . . . . . . 2.2.2 Generalizations . . . . . . . . . . . . . . . . . . . . . 2.3 Formulation of the results (one-dimensional case) . . . . . . 2.3.1 Basic results for the scalar equation . . . . . . . . . . 2.3.2 Vector equations . . . . . . . . . . . . . . . . . . . . 2.4 Examples of kernels of class R and solutions to the basic equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.5 Formula for the error of the optimal estimate . . . . . . . .
.
3
Numerical Solution of the Basic Integral Equation in Distributions Basic ideas . . . . . . . . . . . . . . . . . . . . . . . . . . . Theoretical approaches . . . . . . . . . . . . . . . . . . . . . Multidimensional equation . . . . . . . . . . . . . . . . . . . Numerical solution based on the approximation of the kernel Asymptotic behavior of the optimal filter as the white noise component goes to zero . . . . . . . . . . . . . . . . . . . . 3.6 A general approach . . . . . . . . . . . . . . . . . . . . . . .
3.1 3.2 3.3 3.4 3.5
ix
9
9 14 14 17 18 19 22 25 29
33 33 37 43 46 54 57
Random Fields Estimation Theory
x
4
.
Proofs 4.1 4.2 4.3 4.4
5
.
.
7
.
.
Singular Perturbation Theory for a Class of Fredholm Integral Equations Arising in Random Fields Estimation Theory 5.1 5.2 5.3 5.4 5.5 5.6
6
Proof of Theorem 2.1 . . . . . . . . . . . . . . . . . . . . . . Proof of Theorem 2.2 . . . . . . . . . . . . . . . . . . . . . . Proof of Theorems 2.4 and 2.5 . . . . . . . . . . . . . . . Another approach . . . . . . . . . . . . . . . . . . . . . . .
65
65 73 79 84
87
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 87 Auxiliary results . . . . . . . . . . . . . . . . . . . . . . . . 90 Asymptotics in the case n = 1 . . . . . . . . . . . . . . . . . 93 Examples of asymptotical solutions: case n = 1 . . . . . . . 98 Asymptotics in the case n > 1 . . . . . . . . . . . . . . . . . 103 Examples of asymptotical solutions: case n > 1 . . . . . . . 105
Estimation and Scattering Theory
111
6.1 The direct scattering problem . . . . . . . . . . . . . . . . 6.1.1 The direct scattering problem . . . . . . . . . . . . . 6.1.2 Properties of the scattering solution . . . . . . . . . 6.1.3 Properties of the scattering amplitude . . . . . . . . 6.1.4 Analyticity in k of the scattering solution . . . . . . 6.1.5 High-frequency behavior of the scattering solutions . 6.1.6 Fundamental relation between u+ and u- . . . . . . 6.1.7 Formula for det S ( k ) and state the Levinson Theorem 6.1.8 Completeness properties of the scattering solutions . 6.2 Inverse scattering problems . . . . . . . . . . . . . . . . . . 6.2.1 Inverse scattering problems . . . . . . . . . . . . . . 6.2.2 Uniqueness theorem for the inverse scattering problem 6.2.3 Necessary conditions for a function to be a scatterng amplitude . . . . . . . . . . . . . . . . . . . . . . . . 6.2.4 A Marchenko equation (M equation) . . . . . . . . . 6.2.5 Characterization of the scattering data in the 3 0 inverse scattering probtem . . . . . . . . . . . . . . . . 6.2.6 The Born inversion . . . . . . . . . . . . . . . . . . . 6.3 Estimation theory and inverse scattering in R3 . . . . . . .
111 111 114 120 121 123 127 128 131 134 134 134
Applications
159
135 136 138 141 150
Contents
7.1 What is the optimal size of the domain on which the data are to be collected? . . . . . . . . . . . . . . . . . . . . . . . 7.2 Discrimination of random fields against noisy background . 7.3 Quasioptimal estimates of derivatives of random functions . 7.3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . 7.3.2 Estimates of the derivatives . . . . . . . . . . . . . . 7.3.3 Derivatives of random functions . . . . . . . . . . . . 7.3.4 Finding critical points . . . . . . . . . . . . . . . . . 7.3.5 Derivatives of random fields . . . . . . . . . . . . . . 7.4 Stable summation of orthogonal series and integrals with randomly perturbed coefficients . . . . . . . . . . . . . . . . 7.4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . 7.4.2 Stable summation of series . . . . . . . . . . . . . . . 7.4.3 Method of multipliers . . . . . . . . . . . . . . . . . . 7.5 Resolution ability of linear systems . . . . . . . . . . . . . . 7.5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . 7.5.2 Resolution ability of linear systems . . . . . . . . . . 7.5.3 Optimization of resolution ability . . . . . . . . . . . 7.5.4 A general definition of resolution ability . . . . . . . 7.6 Ill-posed problems and estimation theory . . . . . . . . . . 7.6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . 7.6.2 Stable solution of ill-posed problems . . . . . . . . . 7.6.3 Equations with random noise . . . . . . . . . . . . . 7.7 A remark on nonlinear (polynomial) estimates . . . . . . . . 8
.
xi
159 161 169 169 170 172 180 181 182 182 184 185 185 185 187 191 196 198 198 205 216 230
Auxiliary Results
233
8.1 Sobolev spaces and distributions . . . . . . . . . . . . . . . 8.1.1 A general imbedding theorem . . . . . . . . . . . . . 8.1.2 Sobolev spaces with negative indices . . . . . . . . . 8.2 Eigenfunction expansions for elliptic selfadjoint operators . 8.2.1 Resoluion of the identity and integral representation of selfadjoint operators . . . . . . . . . . . . . . . . . 8.2.2 Differentiation of operator measures . . . . . . . . . 8.2.3 Carleman operators . . . . . . . . . . . . . . . . . . . 8.2.4 Elements of the spectral theory of elliptic operators in L 2 ( R T ) . . . . . . . . . . . . . . . . . . . . . . . . 8.3 Asymptotics of the spectrum of linear operators . . . . . . . 8.3.1 Compact operators . . . . . . . . . . . . . . . . . . . 8.3.1.1 Basic definitions . . . . . . . . . . . . . . . .
233 233 236 241 241 242 246 249 260 260 260
xii
Random Fields Estimation Theory
8.3.1.2 Minimax principles and estimates of eigenvalues and singular values . . . . . . . . . . 8.3.2 Perturbations preserving asymptotics of the spectrum of compact operators . . . . . . . . . . . . . . . . . . 8.3.2.1 Statement of the problem . . . . . . . . . . 8.3.2.2 A characterization of the class of linear compact operators . . . . . . . . . . . . . . . . . 8.3.2.3 Asymptotic equivalenceof s-values of two operators . . . . . . . . . . . . . . . . . . . . . 8.3.2.4 Estimate of the remainder . . . . . . . . . . 8.3.2.5 Unbounded operators . . . . . . . . . . . . . 8.3.2.6 Asymptotics of eigenvalues . . . . . . . . . . 8.3.2.7 Asymptotics of eigenvalues (continuation) . 8.3.2.8 Asymptotics of s-values . . . . . . . . . . . . 8.3.2.9 Asymptotics of the spectrum for quadratic forms . . . . . . . . . . . . . . . . . . . . . . 8.3.2.10 Proof of Theorem 2.3 . . . . . . . . . . . . . 8.3.3 Trace class and Hilbert-Schmidt operators . . . . . . 8.3.3.1 Trace class operators . . . . . . . . . . . . . 8.3.3.2 Hilbert-Schmidt operators . . . . . . . . . . 8.3.3.3 Determinants of operators . . . . . . . . . . 8.4 Elements of probability theory . . . . . . . . . . . . . . . . 8.4.1 The probability space and basic definitions . . . . . . 8.4.2 Hilbert space theory . . . . . . . . . . . . . . . . . . 8.4.3 Estimation in Hilbert space L2(R,U ,P) . . . . . . . 8.4.4 Homogeneous andisotropicrandomfields . . . . . . 8.4.5 Estimation of parameters . . . . . . . . . . . . . . . 8.4.6 Discrimination between hypotheses . . . . . . . . . . 8.4.7 Generalized random fields . . . . . . . . . . . . . . . 8.4.8 Kalman filters . . . . . . . . . . . . . . . . . . . . . .
Appendix A Analytical Solution of the Basic Integral Equation for a Class of One-Dimensional Problems A.l Introduction . . . . . . . . . . . . . . A.2 Proofs . . . . . . . . . . . . . . . . .
............. .............
Appendix B Integral Operators Basic in Random Fields Estimation Theory
262 265 265 266 268 270 274 275 283 284 287 293 297 297 298 299 300 300 306 310 312 315 317 319 320
325
326 329
337
Contents
...
Xlll
B.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 337 B.2 Reduction of the basic integral equation to a boundary-value problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341 B.3 Isomorphism property . . . . . . . . . . . . . . . . . . . . . 349 B.4 Auxiliary material . . . . . . . . . . . . . . . . . . . . . . . 354 Bibliographical Notes
359
Bibliography
363
Symbols
371
Index
373
This page intentionally left blank
Chapter 1
Introduction
This work deals with just one topic: analytic theory of random fields estimation within the framework of covariance theory. No assumptions about distribution laws are made: the fields are not necessarily Gaussian or Markovian. The only information used is the Covariance functions. Specifically, we assume that the random field is of the form
U ( Z )= S(Z)
+ n(z), IC E R",
(1.1)
where s(z) is the useful signal and n ( z )is noise. Without loss of generality assume that
- -
s(z) = n ( z )= 0,
(1.2)
where the bar denotes the mean value. If these mean values are not zeros then one either assumes - that thev are known and considers the fields s(z) s(z) and n ( z ) - n ( z ) with zero mean values, or one estimates the mean values and then subtracts them from the corresponding fields. We also assume that the covariance functions
U*(Z)U(Y):= R(Z,Y), are known. The star stands for complex conjugate. This information is necessary for any development within the framework of covariance theory. We will show that, under some assumptions about functions (1.3), one can develop an analytic theory of random fields estimation. If the functions (1.3) are not known then one has to estimate them from statistical data or from some theory. In many applications the exact analytical expression for the covariance functions is not very important, but rather some general features of R or f are of practical interest. These features include, for example, the correlation radius. 1
Random Fields Estimation Theory
2
The estimation problem of interest is the following one. The signal u ( x ) of the form (1.1) is observed in a domain D c RT with the boundary r. Assuming (1.2) and (1.3) one needs to linearly estimate As(x0) where A is a given operator and xo E RT is a given point. The linear estimate is to be best possible by the criterion of minimum of variance, i.e. the estimate is by the least squares method. The most general form of a linear estimate of u observed in the domain D is
LU
:=
s,
h ( x ,y)U(y)dy
(1.4)
where h ( x ,y) is a distribution. Therefore the optimal linear estimate solves the variational problem E
:= (LU - As)2 = min
(1.5)
where Lu and As are computed at the point xo. A necessary condition on h ( x , y ) for (1.5) (with A = I) to hold is (see equations (2.11) and (8.423))
Rh :=
L
R ( x ,y)h(z,y)dy = f ( ~z ), ,
X,
z EB
:= D U I?.
(1.6)
The basic topic of this work is the study of a class of equations (1.6) for which the analytical properties of the solution h can be obtained, a numerical procedure for computing h can be given, and properties of the operator R in (1.6) can be studied. Since z enters (1.6) as a parameter, one can study the basic equation of estimation theory
Rh := JD R ( x ,y)h(y)dy= f ( x ) , x E
B.
A typical one-dimensional example of the equation (1.7) in estimation theory is exp(-la: - yl)h(y)dy = f ( a : ) ,
-1 5 a: 5 1.
(1.8)
Its solution of minimal order of singularity is
h ( x ) = (- f"+f)/2+6(~+l)[-f'(-l)+f(-l)]/2+6(~-
1)[f1(1)+f(1)]/2. (1.9) One can see that the solution is a distribution with singular support at the boundary of the domain V. By sing supp h we mean the set having no open neigborhood to which the restriction of h can be identified with a locally integrable function.
Introduction
3
In the case of equation (1.8) this domain D ' = (-1,l).
Even if f E
C"(D) the solution to equations (1.7), (1.8) are, in general, not in L 2 ( D ) . The problem is: in what functional space should one look for the solution? Is the solution unique? Does the solution to (1.7)provide the solution to the estimation problem (1.5)? Does the solution depend continuously on the data, e.g. on f and o n R ( x , y ) ? How does one compute the solution analytically and numerically? What are the properties of the solution, for example, what is the order of singularity of the solution? What is the singular support of the solution? What are the properties of the operator R as an operator in L ~ ( D ) ? These questions are answered in Chapters 2-4. The answers are given for the class of random fields whose covariance functions R ( x ,y ) are kernels of positive rational functions of selfadjoint elliptic operators in L2(R'). The class R of such kernels consists of kernels (1.10) where A, dp, @(z,y , A) are respectively the spectrum , spectral measure and spectral kernel of an elliptic selfadjoint operator C in L2(R') of order s, and P(X)and Q ( X ) are positive polynomials of degrees p and q respectively. The notions of spectral measure and spectral kernel are discussed in Section 8.2. If p > q then the operator in L 2 ( D )with kernel (1.10) is an elliptic integrodifferential operator R; if p = q then R = CI K ,where c = const > 0, I is the identity operator, and K: is a compact selfadjoint operator in L 2 ( D ) ;if p < q, which is the most interesting case, then R is a compact selfadjoint operator in L 2 ( D ) .In this case the noise n ( x )is called colored. If 4(X) is a measurable function then the kernel of the operator 4(C) is defined by the formula
+
The domain of definition of the operator $(C)consists of all functions L2(R') such that
fE
(1.12) where Ex is the resolution of the identity for L It is a projection operator
4
Random Fields Estimation Theory
with the kernel
In particular, since E+, = I , one has 00
S(Z - Y) =
L,
%, Y,wo).
(1.14)
In (1.13) and (1.14) the integration is actually taken over (-00, A) n A and (-W, 00) n A respectively, since d p = 0 outside A. The kernel in (1.8) corresponds to the simple case T = 1, C = -z&, A = (-w,w), dp = dA, @(z,y, A) = (27r)-'exp{iA(z - y)}, P(A) = 1, Q(A) = (A2 + 1)/2, e-121 = (27r)-'
J-",
(*)-'
exp(iAz)dA, and formula (1.9) is a very particular
case of the general formulas given in Chapter 2. Let R(z,y) E R, a := i s ( q - p ) , H e ( D ) be the Sobolev spaces and B-'(D) be its dual with respect to H o ( D )= L 2 ( D ) .Then the answers to the questions, formulated above, are as follows. The solution to equation (1.7) solves the estimation problem if and only if h E f i - " ( D ) . The operator R : f i - * ( D ) -+ H a ( D ) is an isomorphism. The singular support of the solution h E f i - * ( D ) of equation (1.7) is I' = dD. The analytic formula for h is of the form h = Q(C)G, where G is a solution to some interface elliptic boundary value problem and the differentiation is taken in the sense of distributions. Exact description of this analytic formula is given in Chapter 2. The spectral properties of the operator R : L 2 ( D ) + L 2 ( D ) with the kernel R ( z ,y) E R are given also in Chapter 2. These properties include asymptotics as n -+ w of the eigenvalues A, of R, dependence A, on D , and asymptotics of AI(D) as D + R', that is D growing uniformly in directions. Numerical methods for solving equation (1.7) in the space k - " ( D ) of distributions are given in Chapter 3. These methods depend heavily on the analytical results given in Chapter 2. The necessary background material on Sobolev spaces and spectral theory is given in Chapter 8 so that the reader does not have to consult the literature in order to understand the contents of this work. No attempts were made by the author to present all aspects of the theory of random fields. There are several books [Adler (1981)], [Yadrenko (1983)], [Vanmarcke (1983)], [Rosanov (1982)l and [Preston (1967)], and many papers on various aspects of the theory of random fields. They have
Introduction
5
practically no intersection with this work which can be viewed as an extension of Wiener’s filtering theory. The statement of the problem is the same as in Wiener’s theory, but we study random functions of several variables, that is random fields, while Wiener (and many researchers after him) studied filtering and extrapolation of stationary random processes, that is random functions of one variable. Wiener’s basic assumptions were:
+
1) the random process u(t)= s ( t ) n(t)is stationary, 2) it is observed on the interval (-00, T ) , 3) it has a rational spectral density (this assumption can be relaxed, but for effective solution of the estimation problems it is quite useful). The first assumption means that R ( ~ , T=) R(t - T ) , where R is the covariance function (1.3). The second one means that D = ( - m , T ) . The third one means R(X) = P(X)Q-l(X), where P(X) and Q(X) are polynomials, R(X) L 0 for -00 < X < 00, R(X) := ~~mR(t)exp(-iXt)dt.The analytical theory used by Wiener is the theory of Wiener-Hopf equations. Later the Wiener theory was extended to the case D = [Tl,T]of a finite interval of observation, while assumptions 1) and 3) remained valid. A review of this theory with many references is [Kailath (1974)l. Although the literature on filtering and estimation theory is large (dozens of books and hundreds of papers are mentioned in [Kailath (1974)]), the analytic theory presented in this work and developed in the works of the author cited in the references has not been available in book form in its present form, although a good part of it appeared in [Kato (1995), Ch. 11. Most of the previously known analytical results on Wiener-Hopf equations with rational R(X) are immediate and simple consequences of our general theory. Engineers can use the theory presented here in many applications. These include signal and image processing in TV, underwater acoustics, geophysics, optics, etc. In particular, the following question of long standing is answered by the theory given here. Suppose a random field (1.1) is observed in a ball B and one wants to estimate ~(zo), where 20 is the center of B. What is the optimal size of the radius of B? If the radius is too small then the estimate is not accurate. If it is too large then the estimate is not better than the one obtained from the observations in a ball of smaller radius, so that the efforts are wasted. This problem is of practical importance in many applications. We will briefly discuss some other applications of the estimation theory, for example, discrimination of hypotheses, resolution ability of linear
Random Fields Estimation Theory
6
systems, estimation of derivatives of random functions, etc. However, the emphasis is on the theory, and the author hopes that other scientists will pursue further possible applications. Numerical solution of the basic integral equation of estimation theory was widely discussed in the literature [Kailath (1974)l in the case of random processes (T = l), mostly stationary that is when R(z,y) = R(z - y), and mostly in the case when the noise is white, so that the integral equation for the optimal filter is
(I
+ R)h := h(t)+
rT 10
R,(t
- T)h(T)dT= f ( t ) ,
0 5 t 5T,
(1.15)
where Rs is the covariance function of the useful signal s(t). Note that the integral operator R in (1.10) is selfadjoint and nonnegative in L2[0,TI. Therefore ( I T)-’ exists and is bounded, and the numerical solution of (1.15) is not difficult. Many methods are available for solving onedimensional second order Redholm integral equation (1.15) with positivedefinite operator I R. Iterative methods, projection methods, colloqation and many other methods are available for solving (1.15), convergence of these methods has been proved and effective error estimates of the numerical methods are known [Kantorovich and Akilov (1980)l. Much effort was spent on effective numerical inversion of Toeplitr matrices which one obtains if one discretizes (1.15) using equidistant colloqation points [Kailath (1974)]. However, if the noise is colored, the basic equation becomes
+
+
Jo’R(t - T)h(T)dT= f ( t ) , 0 5 t 5 T
(1.16)
This is a F’redholm equation of the first kind. A typical example is equation (1.8). As we have seen in (1.9), the solution to (1.8) is a distribution, in general. The theory for the numerical treatment of such equation was given by the author [Ramm (1985)] and is presented in Chapter 3 of this book. In particular, the following question of singular perturbation theory is of interest. Suppose that equation
+
€he Rh, = f,
E
> 0,
(1.17)
is given. This equation corresponds to the case when the intensity of the white-noise component of the noise is E . What is the behavior of h, when E .+ +O? We will answer this question in Chapter 3. This book is intended for a broad audience: for mathematicians, engineers interested in signal and image processing, geophysicists, etc. There-
Introduction
7
fore the author separated formulation of the results, their discussion and examples from proofs. In order to understand the proofs, one should be familiar with some facts and ideas of functional analysis. Since the author wants to give a relatively self-contained presentation, the necessary facts from functional analysis are presented in Chapter 8. The book presents the theory developed by the author. Many aspects of estimation theory are not discussed in this book. The book has practically no intersection with works of other authors on random fields estimation theory.
This page intentionally left blank
Chapter 2
Formulation of Basic Results
2.1
Statement of the problem
Let D c R' be a bounded domain with a sufficiently smooth boundary I?. The requirement that D is bounded could be omitted, it is imposed for simplicity. The reader will see that if D is not bounded then the general line of the arguments remains the same. The additional difficulties, which appear in the case when D is unbounded, are of technical nature: one needs to establish existence and uniqueness of the solution to a certain transmission problem with transmission conditions on r'. Also the requirement of smoothness of I? is of technical nature: the needed smoothness should guarantee existence and uniqueness of the solution to the above transmission problem. Let C be an elliptic selfadjoint in H = L2(RT)operator of order s. Let A, @(x, y, A), dp(A) be the spectrum, spectral kernel and spectral measure of C, respectively. A function F(C) is defined as an operator on H with the kernel
and domF(C) = { f : f E H ,
s-",
IF(A)(2d(EAf , f ) < m}, where
Definition 2.1 Let R denote the class of kernels of positive rational functions of C , where C runs through the set of all selfadjoint elliptic operators 9
Random Fields Estimation Theory
10
in H
= L2(R').
I n other words, R(x,y ) E R if and only if
R(x,Y) =
P(X>Q-l(X)@(x,Y,X W P ( X ) ,
where P(X) > 0 and Q(X) > 0 , VX E A, and A, @, dp correspond to a n elliptic selfadjoint operator L in H = L2(RT). Let
p
= degP(X),
q = degQ(X),
s = ordL,
(2.4)
where degP(X) stands for the degree of the polynomial P(X), and ordC stands for the order of the differential operator C. An operator given by the differential expression ~u :=
C aj(x)aju,
(2.5)
ljlls
a&ag
+
where j = ( j 1 , j z . . .j,) is a multiindex, aju = . . . a&u Ijl = jl . . . j,, j, 2 0, are integers. The expression (2.5) is called elliptic if, for any real vector t E RT,the equation
j2
+
M=y
implies that t = 0. The expression
C+U :=
c
(-l)'j'aya;(x)u)
ljlly
is called the formal adjoint with respect to L. The star in (2.6) stands for complex conjugate. One says that C is formally selfadjoint if C = C+. If C is formally selfadjoint then C is symmetric on CT(R'), that is (L$,$) = ($,C$)V4,$ E C?(RT), where ( 4 , $ ) is the inner product in H = L2(R'). Sufficient conditions on u j ( x ) can be given for a formally selfadjoint differential expression to define a selfadjoint operator in H in the following way. Define a symmetric operator LO with the domain C p ( R T ) by the formula LOU= C u for u E C F ( R r ) . Under suitable conditions on aj(x) one can prove that LO is essentially selfadjoint, that is, its closure is selfadjoint (see Chapter 8). In particular this is the case if aj = a; = const. In what follows we assume that R(x,y) E R. Some generalizations will be considered later. The kernel R(x,y ) is the covariance function (1.3) of the random field U ( x ) = s(z)+o(x)observed in a bounded domain D c RT.
Formulation of Basic Results
11
A linear estimation problem can be formulated as follows: find a linear estimate := LU :=
(2.7)
h ( x ,y ) U ( y ) d y
such that E
:= lz;l- As12 = min.
(2.8)
The kernel h(z,y) in (2.7) is a distribution, so that, by L. Schwartz’s theorem about kernels, estimate (2.7) is the most general linear estimate. The operator A in (2.8) is assumed to be known. It is an arbitrary operator, not necessarily a linear one. In the case when AU = U ,that is A = I, where I is the identity operator, the estimation problem (2.8) is called the filtering problem. From (2.8) and (2.7) one obtains E
=
s,
h(Z,y)U(y)dy
/
D
h*(Z,z ) U * ( z ) d z
-
2 R e l h ( z ,z ) U ( z ) d z ( A s ) * ( z )
-
2Re / D h * ( ~z ), f ( z , z ) d z
+
+
IAs(z)12
IAs(z)12
= min
Here (2.10) the bar stands for the mean value and the star stands for complex conjugate. By the standard procedure one finds that a necessary condition for the minimum in (2.9) is:
R ( z ,y ) h ( Z ,y)dy = f ( z ,x),
Z,z E
+
D := D U I?.
(2.11)
In order to derive (2.11) from (2.9) one takes h ar] in place of h in (2.9). Here a is a small number and 7 E Cr(D). The condition ~ ( h5) E(h ar]) implies (a=o= 0. This implies (2.11). Since h is a distribution, the left-hand side of (2.11) makes sense only if the kernel R ( z , y) belongs to the space of test functions on which the distribution h is defined. We
+
Random Fields Estimation Theory
12
will discuss this point later in detail. In (2.11) the variable x enters as a parameter. Therefore, the equation R(x,y)h(y)dy = f (x),
Rh :=
x
E
B,
(2.12)
is basic for estimation theory. We have supressed the dependence on x in (2.11) and have written x in place of I in (2.12). From this derivation it is clear that the operator A does not influence the theory in an essential way: if one changes A then f is changed but the kernel of the basic equation (2.12) remains the same. If Au = one has the problem of estimating the derivative of u. If Au = u(x xo), where xo is a given point such that x xo @ D , then one has the extrapolation problem. Analytically these problems reduce to solving equation (2.12). If no assumptions are made about R ( x ,y) except that R(x,y) is a covariance function, then one cannot develop an analytical theory for equation (2.12). Such a theory will be developed below under the basic assumption R(z,y) E
&,
+
+
R. Let us show that the class R of kernels, that is the class of random fields that we have introduced, is a natural one. To see this, recall that in the one-dimensional case, studied analytically in the literature, the covariance functions are of the form
w
R(X) :=
R ( x )exp(-iXx)dx = P(X)Q-l(X), Jw
where P(X) and Q(X) are positive polynomials [Kai]. This case is a very particular case of the kernels in the class R. Indeed, take d r = l , C = - '2-, dx
A = (--oo,oo), dp(X) = dX,
~ ( xy, ,A) = (27r)-' exp{iX(x - y)}.
Then formula (2.3) gives the above class of convolution covariance functions with rational Fourier transforms. If p = q, where p and q are defined in (2.4), then the basic equation (2.12) can be written as
+
Rl(x, y)h(y)dy = f(x), x E B, a2 > 0,
Rh := a2h(z)
D
(2.13)
Formulation of Basic Results
13
where
+
P(X)Q-l(X) = o2 PI(X)Q-l(X),
p l := degP1
< q,
(2.14)
and u2 > 0 is interpreted as the variance of the white noise component of the observed signal U ( z ) .If p < q, then the noise in U ( z )is colored, it does not contain a white noise component. Mathematically equation (2.13) is very simple. The operator R in (2.13) is of F'redholm type, selfadjoint and positive definite in H , R 2 021,where I is the identity operator, and A 2 B means (Au,u) 2 (Bu,u) Vu E H . Therefore, if p = q then equation (2.12) reduces to (2.13) and has a unique solution in H . This solution can be computed numerically without difficulties. There are many numerical methods which are applicable to equation (2.13). In particular (see Section 2.3.2)) an iterative process can be constructed for solving (2.13) which converges as a geometrical series; a projection method can be constructed for solving (2.13) which converges and is stable computationally; one can solve (2.13) by collocation methods. However the important and practically interesting question is the following one: what happens with the solution h, to (2.13) as o + O? This is a singular perturbation theory question: for u > 0 the unique solution to equation (2.13) belongs to L 2 ( D ) ,while for u = 0 the unique solution to (2.13) of minimal order of singularity is a distribution. What is the asymptotics of h, as u + O? As we will show, the answer to this question is based on analytical results concerning the solution to (2.13). The basic questions we would like to answer are: 1) In what space of functions or distributions should one look for the solution to (2.12)? 2) When does a solution to (2.12) solve the estimation problem (2.8)?
Note that (2.12) is only a necessary condition for h(z,y) to solve (2.8). We will show that there is a solution t o (2.12) which solves (2.8) and this solution to (2.12) is unique. The fact that estimation problem (2.8) has a unique solution follows from a Hilbert space interpretation of problem (2.8) as the problem of finding the distance from the element ( A s ) ( z )to the subspace spanned by the values of the random field u ( y ) , y E D. Since there exists and is unique an element in a subspace at which the distance is attained, problem (2.8) has a solution and the solution is unique. It was mentioned in the Introduction (see (1.9)) that equation (2.12) may have no solutions in L1(D) but rather its solution is a distribution. There can be
Random Fields Estimation Theo7y
14
several solutions t o (2.12) in spaces of distributions, but only one of them solves the estimation problem (2.8). This solution is characterized as the solution to (2.12) of minimal order of singularity. 3) What is the order of singularity and singular support of the solution to (2.12) which solves (2.8)? 4) Is this solution stable under small perturbations of the data, that is under small perturbations of f(z)and R(x,y)? What is the appropriate notion of smallness in this case? What are the stability estimates for h ? 5) How does one compute the solution analytically? 6) How does one compute the solution numerically? 7) What are the properties of the operator R : L 2 ( D )+ L 2 ( D ) in (2.12)? In particular, what is the asymptotics of its eigenvalues Xj(D) as j -+ +GO? What is the asymptotics of Xl(D) as D 4 R', that is, as D expands uniformly in direct ions? These questions are of interest in applications. Note that if D is finite then the operator R is selfadjoint positive compact operator in L 2 ( D ) ,its spectrum is discrete, A 1 > Xz 2 . - . > 0, the first eigenvalue is nondegenerate by Krein-Rutman's theorem. However, if D = R' then the spectrum of R may be continuous, e.g., this is the case when R ( x , y ) = R(x - y ) , R(X) = P(X)Q-l(X). Therefore it is of interest to find XI, := limXl(D) as D t R'. The quantity XI, is used in some statistical problems. 8) What is the asymptotics of the solution to (2.13)as D
2.2 2.2.1
+ O?
Formulation of the results (multidimensional case) Basic results
We assume throughout that R ( z , y ) E R,f(z)is smooth, more precisely, that f E H a , H a = Ha(D)is the Sobolev space, (u := (q - p)s/2, where s , q , p are the same as in (2.4), and the coefficients a j ( z ) of C (see (2.5)) are sufficiently smooth, say a,(x) E C ( R T )aj(x) , E Lloc T'(s-'a') if s - ~ j < l r/2, aj E Lfo, if s - Ijl > r/2, aj E L1",$,',E > 0, if s - Ijl = 7-/2 (see [Hormander (1983-85), Ch 171). If q 5 p , then the problem of finding the is simple: such a solution does exist, is unique, and belongs to Hm+21al if f E Hm. This follows from
Formulation of Basic Results
15
the usual theory of elliptic boundary value problems [Berezanskij (1968)], since the operator P(L)Q-l ( L ) is an elliptic integral-differential operator of order 214 if q 5 p . The solution satisfies the elliptic estimate:
II h
IIffm+2b
I c I1 f
llffm,
4
I P,
where c depends on L but does not depend on f . If q > p , then the problem of finding the mos solution of (2.12) is more interesting and difficult because the order of singularity of h, is in general, positive, ordh = a. The basic result we obtain is: + H a is a linear isomorphism between the spaces the mapping R : H W a and H a . The of h is aD = I?, provided that f is smooth. If hl is a solution to equation (2.12) and ordhl > Q then E = 00, where E is defined in (2.8). Therefore if hl solves (2.12) and ordhl > Q then hl does not solve the estimation problem (2.8). The unique solution to (2.8) is the unique mos solution to (2.12). We give analytical formulas for the mos solution to (2.12). This solution is stable towards small perturbations of the data. We also give a stable numerical procedure for computing this solution. In this section we formulate the basic results.
Theorem 2.1 If R ( x ,y) E R,then the operator R in (2.12) is an isomorphism between the spaces H - a and H a . The solution to (2.12) of minimal a, can be calculated by the formula: order of singularity, ordh I
h ( x ) = Q(L)G,
(2.15)
where (2.16)
g ( x ) E HS(P+Q)I2is an arbitrary fixed solution to the equation
P(L)g=f
in D ,
(2.17)
and the functions u ( x ) and v ( x ) are the unique solution to the following (2.18)-(2.20): Q(L)u=O
in R,
u(co)=O,
P(L)v=O in D,
(2.18) (2.19)
16
Random Fields Estimation Theory
on
%u=aj,(v+g)
r,
+
~ < j <4-P- 4 )
2
By u(m) = 0 we mean limu(z) = 0 as 1x1 -+ Corollary 2.1
1.
(2.20)
00.
I f f E H2P, ,8 1 a, then sing supph = r.
(2.21)
Corollary 2.2 If P(X) = 1, then the transmission problem (2.18)-(2.20) reduces to the Dirichlet problem in R: Q(.C)u=O in R, on
ajNu=aj,f
r,
u(00)= 0 ,
(2.22)
0 < j < -34 --1 2 ’
(2.23)
in D inR.
(2.24)
and (2.15) takes the form
h=Q(L)F, F =
f u
Corollary 2.1 follows immediately from formulas (2.15) and (2.16) since g(z)+ v ( z ) and u ( z ) are smooth inside D and R respectively. Corollary 2.2 follows immediately from Theorem 2.1: if P(X) = 1 then g = f, v = 0, and p = 0. Let w ( X ) L 0, w ( X ) E C ( R 1 ) w(00) , = 0, w := maxw(X), XEA
(2.25)
(2.26) X j = X j ( D ) be the eigenvalues of the operator R : L 2 ( D ) ---t L 2 ( D ) be the eigenvalues of the operator R : L 2 ( D ) + L 2 ( D ) with kernel (2.26), arranged so that XI 2
Theorem 2.2
If D
c D’ sup
xERT
A2
2 XJ 2 ... > 0.
(2.27)
then Xj 5 Xi, where X i = Xj(D’). If
1
IR(z,y)ldy := A < 00,
(2.28)
Formulation of Basic Results
17
then X1,
=w,
(2.29)
where lim X1(D) := Al,,
(2.30)
D-R'
and w is defined in (2.25). Theorem 2.3 If w(X) = IXl-"(l+o(l)) as 1x1 --+ 00, and a > 0, then the asymptotics of the eigenvalues of the operator R with kernel (2.26) is given by the formula:
Xi
,j-""/'
as
j + 00,
c = const > 0 ,
(2.31)
where c = yaSIr and (2.32)
y := (27r)-'L q ( z ) d z , with q ( z ) := meas{t : t E R',
C
aao(z)ta+fl I I).
(2.33)
lal=IPI=sP
Here the f o r m a,p(x) generates the principal part of the selfadjoint elliptic operator C:
cu =
aa(aa,(z))aPu
+ C1,
ordL1 < s.
lal=IPI=s/2 Corollary 2.3 If w ( A ) = P(X)Q-l(X) then a = q - p , where q = deg Q, p = d e g P , and A, cn-('J-P)S/r,where A, are the eigenvalues of the operator in equation (2.12) . N
This Corollary follows immediately from Theorem 2.3. Theorems 2.12.3 answer questions 1)-5) and 7) in section 2.1. Answers t o questions 6) and 8) will be given in Chapter 3. Proof of Theorem 2.3 is given in Section 8.3.2.10. 2.2.2
Genemlizations
First, let us consider a generalization of the class R of kernels for the case when there are several commuting differential operators. Let C1,. . .C, be a system of commuting selfadjoint differential operators in L2(Rr).There
18
Random Fields Estimation Theory
exists a dp(E) and a spectral kernel @(z,y, E ) , E = ( ( 1 , . . . function F(C1,. . . C,) is given by the formula
,em)such that a (2.34)
c).
where @(() is the operator with kernel @(z,y, The domain of definition of the operator F(L1,. . . L,) is the set of all functions u E L2(R‘) for which J, lF(()12(@([)u,u)dp < 00, M is the support of the spectral measure dp, and the parentheses denote the inner product in L2(R‘). For example, let m = T , Lj = -2-. ax, a Then ( = ( ( I , . . . (y), d p = d& . . . dEr14 ( ~y, ,<) = (27~)-‘ exp{if. (Z- y)}, where dot denotes the inner product in R‘. If F ( c ) = P ( ( ) Q - ’ ( t ) , where P(6) and Q(<) are positive polynomials and the operators P(C) := P ( C 1 , .. . C,) and Q ( C ) := Q(C1,. . . C,) are elliptic of orders m and n respectively, m < n, then the theorems, analogous t o Theorems 2.1-2.2, hold with sp = m and sq = n. Theorem 2.3 has also an analogue in which as = n - m in formula (2.31). Another generalization of the class R of kernels is the following one. Let Q ( z ,a) and P(z,a) be elliptic differential operators and
Q R = Pd(x - y) in R’.
(2.35)
Note that the kernels R E R satisfy equation (2.35) with Q = Q(C), P = P ( L ) . Let ordQ = n, o r d P = m, n > m. Assume that the transmission problem (2.18)-(2.20), with Q ( z , a) and P ( z ,a) in place of Q(C) and P ( C ) respectively, and p s = m, qs = n, has a unique solution in H ( ” S m ) / 2 .Then Theorem 2.1 holds with a = ( n- m)/2. The transmission problem (2.18)-(2.20) with Q ( z , a) and P ( z ,a) in place of Q ( C ) and P(C) is uniquely solvable provided that, for example, Q ( z , ~ )and P(z,a) are elliptic positive definite operators. For more details see Chapter 4 and Appendices A and B.
2.3
Formulation of the results (one-dimensional case)
In this section we formulate the results in the one-dimensional case, i.e., r = 1. Although the corresponding estimation problem is the problem for random processes (and not random fields) but since the method and the results are the same as in the multidimensional case, and because of
Formulation of Basic Results
19
the interest of the results in applications, we formulate the results in onedimensional case separately. 2.3.1
Basic results f o r the scalar equation
Let r = 1, D = (t - T , t ) ,R ( x , y ) E R. The basic equation (2.12)takes the form t
6,
R ( x ,y ) h ( y ) d y = f ( x ) , t - T
I x I t.
(2.36)
Assume that f E Ha,a = s(q - p)/2. Theorem 2.4 The solution t o equation (2.36) an fi-& exists, is unique, and can be found by the formula h = Q(W',
(2.37)
where
Here b: are constants, the functions $ F ( x ) , 1 5 j 5 sq/2, f o r m a fundamental system of solutions to the equation Q(C)$ = 0,
Qj;(-m) = 0,
$[(+m) = 0.
(2.39)
The function g(x>is defined b y the formula
(2.40) where go(x) is an arbitrary fixed solution to the equation
P(L)g = f , t - T 5 x 5 t , the functions the equation
dj,
15 j
I sp, f o r m
(2.41)
a fundamental system of solutions to
20
Random Fields Estimation Theory
and cj, 1 I j 5 sp, are constants. The constants b;, 1 I j I sq/2, and cj, 1 5 j 5 sp, are uniquely determined from the linear system:
where D = d / d x l 0 5 k 5 i s ( p + q ) - 1. The map R-' : f t h, where h is given by forrnula (2.37), is a n isomorphism of the space Ha onto the space H-a Remark 2.1 This theorem is a complete analogue of Theorem 2.1. The role of C is played now by a n ordinary daflerential selfadjoint operator C in L2(R1).An ordinary differential operator is elliptic i f and only i f the coefficient in front of its senior (that is, the highest order) derivative does not vanish: S
Lu = c a j ( x ) D j u , a,(z)# 0 ,
(2.45)
j=O
One can assume that a,(z) > 0, x E R 1 , and the condition of uniform ellipticity is assumed, that i s 0
< c1 I a s ( 4 5 cz,
(2.46)
where c1 and cp are positive constants which do not depend on x . Corollary 2.4 I f f E Ha,then sing supph = d D , where d D consists of two points t and t - T .
Corollary 2.4 is a complete analogue of Corollary 2.1. Corollary 2.5 Let Q(X) = a+(X)a-(X), where .*(A) are polynomials of degree 912, the zeros of the polynomial .+(A) lie in the upper half plane ImX > 0, while the zeros of a-(X) lie in the lower half-plane ImX < 0. Since Q(X) > 0 f o r --oo < X < 00, the zeros of .+(A) are complex conjugate of the corresponding zeros of .-(A). Assume that P(X) = 1. Then formula (2.37) can be written as
h(x) = a+(C)[B(x- t + T ) a - ( C )f (z)] - a-(C)[B(x- t ) a + ( L )f ( x ) ] , (2.47)
Formulation of Basic Results
where e(z) =
1 ifX20
21
, and the differentiation in (2.47) is understood
0 ifx
This Corollary is an analogue of Corollary 2.2.
Remark 2.2 Formula (2.47) is convenient for practical calculations. Let us give a simple example of its application. Let C = -3,r = 1, P(A) = I, &(A) = (X2+1)/2, R ( x , y ) = e x p ( - l x - y l ) , @ ( z , y , X )= ( 2 ~ ) - ' e x p { i A ( x y ) } , dp(X) = dX, t = 1, t - T = -1. Equation (2.36) becomes
.+(A) = M Jz , .-(A)
=
3. Formula (2.47) yields:
h ( z )= -(-ia - i)[O(x+ 1)(-ia 2 L
+i)f(x)]
(2.49) Here we have used the well known formula O'(x-u) = d ( 2 - a ) where 6 ( x ) is the delta-finction. Formula (2.49) is the same as formula [1.9). The term (- f" + f ) / 2 in (2.49) vanishes outside the interval [-1,1] by definition.
Remark 2.3 If t = +oo and t -T = 0, so that equation (2.36) takes the form of Wiener-Hopf equation of the first kind
Sum
R ( x ,?4)h(Y)d9= f ( X I ,
2 0,
(2.50)
h ( x )= a + ( C ) [ e ( z : ) a - ( C ) f ( x ) l .
(2.51)
2
then formula (2.47) reduces to
Random Fields Estimation Theory
22
If L = -id formula (2.51) can be obtained by the well-known factorization method. Example 2.1
Consider the equation roc)
By formula (2.51) one obtains
h(x) =
-f"+f 2
+ -f'W 2+ f(Q6(.),
(2.53)
if one uses calculations similar to the given in formula (2.49).
2.3.2
Vector equations
In both cases r = 1 and T > 1 it is of interest to consider estimation problems for vector random processes and vector random fields. For vector random processes the basic equation is (2.36) with the positive kernel R(z,y ) in the sense that (Rh,h) > 0 for h $ 0 , given by formula (2.3) in which R ( X ) = P(X)Q-'(X) is a matrix:
R ( A ) = (&(A)),
&(A)
:= Pij(X)Qijl(X), 1 5 i , j
5 d,
(2.54)
where Pij (A) and Qij (A) are relatively prime positive polynomials for each fixed pair of indices (ij), 1 5 i , j 5 d , d is the number of components of the random processes U ,s and n. Let &(A) be the polynomial of minimal degree, degQ(X) = q, for which any Qij(X), 1 5 i, j 5 d, is a divisor, Aij(X) := Rij(X)Q(X). Denote by E the unit d x d matrix and by A(C) the matrix differential operator with entries A,, (L). Assume that det lAij(X)J> 0,
VX E R1,
detBm(x) # 0,Vx E R1,
(2.55)
(2.56)
where m := smaxl~i,j
A ( L ) := E B j ( z ) t P , j=O
1
a
d=dx'
(2.57)
Formulation of Basic Results
23
Let S(z,y ) denote the matrix kernel
of the diagonal operator Q-l(C)E. The operator Q(C)E is a diagonal matrix differential operator of order n = sq. Let us write the basic equation rt
(2.59) where R(z, y ) is the d x d matrix with the spectral density (2.54), h and f are vector functions with d components, f E ?-la, a = ( n - rn)/2, ‘Ha denotes the space of vector functions (fl,.. .fd) such that 11 f I~.Hu:=
(c;4II fj IlkU)1’2. Remark 2.4 I n the vector estimation problem h and f are d x d matrices, but f o r simplicity and without loss of generality we discuss the case when h is a vector. Matrix equation (2.59) is equivalent to d vector equations. Equation (2.59) one can write as A(,C)V= f ,
v := Q-l (C)Eh =
(2.60) (2.61)
Let Q j , 1 5 j 5 rn, be a fundamental system of matrix solutions to the equation
A ( W = 0,
(2.62)
and qf,1 5 j 5 n/2 be the fundamental system of matrix solutions to the equation
Q(L)E\E = 0,
(2.63)
such that
q (+m) = 0,
*;
(-m) = 0.
(2.64)
The choice of the fundamental system of matrix solutions to (2.63) with properties (2.64) is possible if C is an elliptic ordinary differential operator,
24
Random Fields Estimation Theory
that is a,(%) # 0, z E R1 (see Remark 2.1 and [Naimark (1969), p. 1181. Let us write equations (2.60) and (2.61) as (2.65) where go( z) is an arbitrary fixed solution to the equation (2.60), and 1 5 j 5 m, are arbitrary linearly independant constant vectors.
cj,
T h e o r e m 2.5 If R E R with R given by (2.54), the assumptions (2.46), (2.55), (2.56) hold, and f E 'FI", a! = (n - m)/2, then the matrix equation (2.59) has a solution in fi-", this solution is unique and can be found by the formula (2.66)
h = Q(C)EG,
where the vector function G is given b y
G ( x )=
Here the functions *f, + j and go(x) were defined above, and the constant vectors bf, 1 5 j 5 n/2 and c j , 1 5 j 5 m, can be uniquely determined from the linear system m
go(z) e=t-T
+COj(x)cj j=1
(2.68) z=t-T
+
where 0 5 k 5 ( n m)/2 - 1. The map R-l : f + h, given by formulas (2.66)-(2.69) is an isomorphism between the spaces 'Haand 7ka, a! = (n - m)/2.
Remark 2.5 The conditions (2.68), (2.69) guarantee that the function G ( x ) , defined by formula (2.67), is maximally smooth so that the order of singularity of G and, therefore, of h [see formula (2.66)) is minimal.
Formulation of Basic Results
2.4
25
Examples of kernels of class R and solutions to the basic equation
1. If T = 1, c = -id, d = d / d z , @(z,y,A) = (27r)-l exp{iX(z - y)}, dp = dX, then R(z,y) E R if
R(x)exp{iX(z - y ) } d ~ ,
~ ( zy), = (27r)-'
(2.70)
where
R(X) = P(X)Q-l(X)
(2.71)
and P(X),Q(X) are positive polynomials. . Cc,),Cr = 2. If T > 1, C = (Ll,.. = a / d x r , @(,. 9, A) = (2n)-' exp{iX. (z - y)}, X = (XI,. . . A,), dp(X) = dX = dX1.. . dX,, then ~ ( zy), = (27r)-'
where
lRTR(X)
exp{iX . (z - y)}dX,
(2.72)
>0
(2.73)
R(X) is given by (2.71) and > 0, &(A)
P(X) = P ( X 1 , . . .A,)
= Q(X1.. .A,)
are polynomials. For the operators P(C) and Q(C) to be elliptic of orders p and q respectively, one has to assume that
0 < ~1 I P(X)(X(-' L CZ, 0 < ~3 where 1x1 = (A: + . . . 3. If T = 1, c = domain of C,then
I Q(A)lXl-q 5 ~
4 ,
VX E R'
(2.74)
+ A:)1/2
and cj, 1 5 j 5 4, are positive constants. D ( C ) = {u : u E H2(0,00), u'(0) = O}, D ( C ) =
--&
where 00
A ( z )= 7r-'
P(X)Q-l(X) ~ o s ( d i z ) X - ' / ~ d X
and P(X) > 0, Q(X) > 0 are polynomials. Indeed, one has for C
(2.76)
Random Fields Estimation Theory
26
0 i x, y
< 00. Since
1 C O S ( ~C Z )O S ( = ~ ~- [)c o s ( ~ ~-: ky) 2
+ C O S ( ~+Xky)],
k =6
one obtains (2.75) and (2.76). If one put 6 = k in (2.76) one gets
A(z) = -
Srn
n o
P(k2)&-'(k2) cos(kz)dk,
(2.77)
which is a cosine transform of a positive rational function of k. The eigenfunctions of C,normalized in L2(0,m), are (): 1'2 cos(ks) and dp = dk in the variable k. If C = is determined in L2(0,m) by the boundary condition u(0)= 0, then
--& 1
R(x,y) = p ( I . - YI) - A ( .
+
Y)ll
Z,Y 2 0,
(2.78)
where A ( x ) is given by (2.77), the eigenfunctions of C with the Dirichlet boundary condition u(0) = 0 are G s i n ( k z ) , dp = dk in the variable k, and @(x,y, k ) d p ( k ) = sin(kx) sin(ky)dk one can compare this with the formula @(x, y, k)dp(k) = cos(kz) cos(ky)dk, which holds for C determined by the Neumann boundary condition u'(0) = 0. 4. If C = (v' - ;)x-', v 2 0, x 2 0, then
3
2
-6+
so that
R ( x ,Y) =
&Irn
P(X)Q-l(X)J,(Xz)J,(Xy)XdX,
(2.80)
0
where P(X) and Q(X) are positive polynomials on the semiaxis X 2 0. 5. Let R ( x ,y) = exp(-alz - y1)(4n1x - yI)-', z,y E R3,a = const > 0. Note that (-A a 2 ) R= S(z - y) in R3. The kernel R(x,y) E R. One has C = (Cl,C2, &), C j = -Zaj, P(X) = 1, Q(X) = X2 a2, X2 = X l +A; + X i , @dp = ( 2 ~ ) exp{iX. -~ (x - y)}dX,
+
+
(2.81) 6. Let R(z,y) = R(zy). Pu t x = exp(J), y = exp(-v). Then R(sy) = R(exp(E - y)) := R1(( - y). If R1 E R with C = -3, then one can solve
27
Formulation of Basic Results
the equation
s,”
I5 I b
R(zy)h(y)dy = f(z), a
(2.82)
analytically. 7. Let Ko(alzI) be the modified Bessel function which can be defined by the formula
Ko(alzl) = (27r)-l where
L
=
z = Xlzl
(-i&,--z&),
exp(iX . z)
dX,
a > 0,
(2.83)
+ XZZZ. Then the kernel R(z,y) := Ko(alz - YI) E R, T
= 2, P(X) = 1, Q(X)
=
X2 + a 2 , @(z,y,X)dp(X) =
(27r)-l exp{iX. (z - y)}dX. 8. Consider the equation
with kernel (2.81). By formula (2.15),Theorem 2.1, one obtains the unique solution to equation (2.84) in h-’(D): (2.85) where u is the unique solution to the Dirichlet problem in the exterior domain R := R3 \ D:
(-A+a2)u=0
I’ = dD
=
in fl,
ulr = f i r ,
(2.86)
dfl is the boundary of D, and 6r is the delta function with
support r. Let us derive formula (2.85). For the kernel (2.81) one has T = 3, p = 0, = 1. Formula (2.15) P(X) = 1, &(A) = X2 + a 2 , s = 1, q = 2, a = reduces to
h ( z ) = (-A
+ a2)G,
(2.87)
with in D G={ f u in R,
(2.88)
Random Fields Estimation Theory
28
and u is the solution to (2.86). Indeed, since P(A) = 1, one has w = 0 and g = f. In order to compute h by formula (2.87)one uses the definition of the derivative in the sense of distributions. For any 4 E CF(RT)one has:
((-A
+ a2)G,4) = (GI (-A + a2)4) =
s,
+ a2)4dx + + a2)f$dx +
f( -A
=L(-A
=s,
(-A
s,
u(-A
+ a2)4dz
+ a2)f4dx + (af aN - k dN )$ds,
where the condition u = f on
(2.89)
I' was used. Formula (2.89)is equivalent to
(2.85). 9. Consider the equation
where D = {x : x E R2,1x1 5 b } , and Ko(z) is given by formula (2.83). The solution to (2.90)in k - l ( D ) can be calculated by formula (2.85)in which u(x)can be calculated explicitly
(2.91) where z = (r,4), ( T , 4) are polar coordinates in
R2,
r2n
(2.92) K,(r) is the modified Bessel function of order n which decays as r in formula (2.85): One can easily calculate
Ir
4
+co.
Formulas (2.85)and (2.93)give an explicit analytical formula for the solution to equation (2.90)in fi-'(D).
Formulation of Basic Results
2.5
29
Formula for the error of the optimal estimate
In this section we give an explicit formula for the error of the optimal estimator. This error is given by formula (1.5). We assume for simplicity that A = I in what follows. This means that we are discussing the filtering problem. In the same way the general estimation problem can be treated. 1. The error of the estimate can be computed by formula (2.9) with A = I . This yields E
= (Rh,h) - 2Re(h,f)
+
EO(Z),
where (u, w ) := uvdz, (Rh,h) = h(y) := h(z,y). The optimal estimate
~ ( 2:= ) Is(z)I2,
s, ,s
(2.94)
R ( z ,y)h(y)h*(z)dydz, and
is given by the solution to the equation (2.11):
Rh= f ,
(2.96)
and we assume that R E R. Since (Rh,h)> 0, it follows from (2.94) and (2.96) that E(Z) = E o ( 2 ) - (Rh,h).
(2.97)
It is clear that the right side of (2.97) is finite if and only if the quadratic form ( R h l h )is finite. Our goal is to show that if and only if one takes the solution to (2.96) of minimal order of singularity, the mos solution to (2.96), that is the solution h E H-al one obtains a finite value of (Rh,h). Therefore only the mos solution to (2.96) solves the estimation problem, and the error of the optimal estimate is given by formula (2.97) in which h E H - a is the unique solution to (2.96) of minimal order of singularity. 2. In order to achieve our goal, let us write the form (Rh,h) using the Parseval equality and the basic assumption R E R: (2.98) Here (2.99)
Random Fields Estimation Theory
30
where I#Jj(x, A) are the eigenfunctions of C which are used in the expansion of the spectral kernel:
c N A
@(Z,Y, 4 =
I#Jj(Z,X)I#J;*(YI
(2.100)
A),
j=1
and Nx 5 00 (see Section 8.2). One has
h E H-bS @
/ lh(X)12 +
(1 X2)-bdp(X)
< 00,
(2.101)
A
where b > 0 is an arbitrary number. By the assumption (see 2.74) 0 < c1 5 P(X)(l
+x2)-pl2
I c2,
(2.102)
0 < ~3 5 Q(X)(l
+ X2)-Q/2
5~
4 ,
(2.103)
- Q. <
(2.104)
where cj, 1 5 j 5 4, are positive constants. Thus 0 < c5 5 PQ-'(l
+ X2)(q--P)/2
From (2.98),(2.101) and (2.104) it follows that
(Rh,h) < 0O @ h E
fi-(q-p)s/2
= k-0.
(2.105)
In particular, if m ( h ) := ordh > a,then (Rh,h) = 00. Let C be an operator with constant coefficients in L2(R') so that R(z,y) = R(z - y), L = (C1,. . . C,), L, := -za/az,, and formula (2.98) takes the form
(Rh,h) = with X = (XI,.
. .A,)
/
'R
Ih(X)12dX,
(2.106)
h(z) exp(iX. z)dz.
(2.107)
P(X)Q-l(X)
and
h ( ~:=) (2x)-'I2
Lr
Then P ( L ) is elliptic of order p if and only if P(X) satisfies (2.102) and Q(C) is elliptic of order q if and only if Q(X) satisfies (2.103). The integral (2.106) is finite if and onlyif h(X)(l+X2)(p-q)/4 E L2(R'), where k(X) is the usual Fourier transform. This is equivalent to h E H d a ( R T )a, = (q-p)/2. Since we assume that supph c D , we conclude in the case described that h E I?a(D).
Formulation of Basic Results
31
3. Formula (2.97) can be written as E(2)
= E o ( 2 ) - (f,h) = E O ( 2 ) - ( f l R - 9 ) .
(2.108)
If U is a vector random field then h(z,y) is a d x d matrix and formulas (2.97) and (2.108) take the form E(X) = EO(Z) - tr(f,h) = EO(Z)
- SDtr{f(s,y)h*(?llz)}~y, (2.109)
where trA is the trace of the matrix A.
This page intentionally left blank
Chapter 3
Numerical Solution of the Basic Integral Equation in Distributions 3.1 Basic ideas
It is convenient to explain the basic ideas using the equation Rh =
L1
exp(-lz - yl)h(y)dy = f(z), -1 5 z 5 1
(3.1)
as an example, which contains all of the essential features of the general equation
Rh =
1
R(z,y)h(y)dy = f(z), z E
C
R'
with the kernel R E R. The first idea that might come to mind is that equation (3.1) is F'redholm's equation of the first kind, so that the regularization method can yield a numerical solution to (3.1). On second thought one realizes that, according to Theorem 3.1, the solution to (3.1) does not belong to L2(-1, 1) in general, and that the mapping RL1 : Ha -+ H d a , ct = 1 for equation (3.1), is an isomorphism. Therefore the problem of numerical solution of equation (3.1) is not an ill-posed but a well-posed problem. The solution to (3.1) is a distribution in general, which is clear from formula (2.49). For the solution of (3.1) to be an integrable function it is necessary and sufficient that the following boundary conditions hold f'(1)
+ f(1)= 0,
f'(-l) = f(-1).
(3.3)
This follows immediately from the formula (2.49). The problem is to develop a numerical method for solving equations (3.1) and (3.2) in the space of distributions H - a . 33
Random Fields Estimation Theory
34
Much work has been done on the effective numerical inversion of the Toeplitz matrices which one obtains by discretizing the integral equation
Eh,
+ Rh, = f,
E
>0
(3.4)
with R, for example, given by (3.1). If the nodes of discretization are equidistant, equation (3.4), after discretizing, reduces to a linear algebraic system with Toeplitz matrix t i j = ti+. Discussion of this, however, is not in the scope of our work. The question of principal interest is the question about asymptotic behavior of the solution to (3.4) as E + +O. Note that, for any E > 0, equation (3.4) is an equation with a selfadjoint positive definite operator EI R in L2(-1, 1). Therefore, for any E > 0, equation (3.4) has a solution in L2(-1, 1) for any f d 2 ( - 1 ,l),and this solution is unique. Numerical solution of equation (3.4) by the above mentioned discretization (or collocation) method becomes impossible as E -+ fO because the condition number of the matrix of the discretized problem grows quickly as E --f +O. The nature of singularity of the solution to the limiting equation (E = 0) (3.1) is not clear from the discretization method described above. Numerical solution of equation (3.1) requires therefore a new approach which we wish to describe. The basic idea is to take into account theoretical results obtained in Theorems 2.1 and 2.4 . According to the Theorem 2.4, the solution to equation (3.1) with a smooth right hand side f (x)has the following structure:
+
+ +
+
h = Ad(%- 1) B ~ ( z 1) hsm,
(3.5)
where A and B are constants and h,, is a smooth function. The order of singularity of the solution to equation (3.1) is 1 since a = 1 for this equation. Let us assume that f is smooth, f E H 2 , for example, so that
Let us look for an approximate solution to equation (3.1) of the form n
hn(x) =
C
+
- 1)
~ j 4 j ( ~a)h ( %
+ C-I~(Z + I),
(34
j=1
where c j , j = - l , O , 1,..., are constants, {&}, 1 I j < 00, is a basis in Ho= L2(-1, 1). The constants can be found, for example, from the least
Numerical Solution of the Basic Integral Equation in Distributions
35
The variational problem (3.7)
where cb := Qe-
1
,
’ c-’ = c-le-l,
The linear system for finding the
a€
-=0,
l l j l n ,
dCj
cj,
1 Ij
= R&,
&(x)
1 Ij
I n.
(3.9)
I n and cb, cY1 is
a€ a4
a€
-= 0.
(3.10)
In, Ij <
(3.11)
bi := ( f , $ i ) l
(3.12)
- = 0,
acL1
The matrix of this system is aij := ($j,$i)l,
I i,
-1
j
where $0 = exp(z), Q-1 = exp(-s), $ j for 1 00 is defined in (3.9), the system { $ j } , -1 i j h n is assumed to be linearly independent for any n, and the inner product is taken in the space H’: ( u , w ) ~:= f l ( u F u’F’)dx. Matrix aij is positive definite for any n so that the system
+
n
C
~ijc= j bi,
-1 5 i 5
72,
j=-1
is uniquely solvable for any n. Convergence of the suggested numerical method is easy to prove. One wishes to prove that
11 h n - h Since R : H-’
+
11-1-+0 as
00.
(3.13)
H’ is an isomorphism, it is sufficient t o prove that
11 Rh, - Rh ]I1+
0 as n --+
00.
(3.14)
Since Rh = f , equation (3.14) reduces to
IIRhn-fII1+O
as n + m .
(3.15)
Random Fields Estimation Theory
36
Equation (3.15) holds if the set of functions {$j}, -1 5 j < 00 is complete in H 1 . Here $j
= Rq$,
1 5 j 5 00,
11-1
= exp(--z), $0 = exp(z).
(3.16)
Therefore, if one chooses a system $j E H1, 1 I j < 00, such that the system {$j}, -1 I j < 00, forms a basis of H 1 (or just a complete system in H 1 ) then (3.15) holds. Since for practical calculations one need only know the matrix aij and the vector bi (see (3.12)), and both these quantities can be computed if the system {$j}, -1 5 j < 00, and f are known, it is not necessary to deal with the system {cjj}, 1 5 j I 00. We have proved the following Proposition 3.1 If {$j}, -1 _< j < 00, $0 = exp(z), $-I = exp(-z), i s a complete system in H' then, for any n, the system (3.12) is uniquely
solvable. Let cy' be its solution. Then the function
C7'-1~y)$j(~), where
$0 = b ( ~ I), 1 Ij I n converges in H-l to the solution h of equation (3.1):
hn = l),
$j
4-1
+
= 6 ( ~
= R-l$j,
11 h - h,
11-14
0,
n --+ 00.
There are some questions of a practical nature: 1) how does one choose the system $ j , 1 5 j < 00, so that the matrix aij in equation (3.12) is easily invertible? 2) how does one choose $j, 1 5 j < 00, so that the functions $ j are easily computable?
The first question is easy to answer: it is sufficient that the condition number of the matrix aij, -1 5 j I n for any n is bounded. This will be the case if the system {$j}, -1 5 j < 00, forms a Riesz basis. Let us recall that the system {$j} is called a Riesz basis of a Hilbert space if and only if there exists an orthonormal basis { fj} of H and a linear isomorphism B of H onto H such that B f j = $j, V j . The system {$j} forms a Riesz basis of the Hilbert space H if and only if the Gram matrix ($i, $ j ) := I'ij defines a linear isomorphism of l2 onto itself. If {$j}, 1 5 j < 00, in (3.6) is a basis of H" then { $ j } , 1 5 j < 00, $ j = R$j, is a complete set in H1. Indeed, suppose that f E H1 and ( f , R $ j ) l = 0 V j . Then 0 = (f,R$j)+ = (I-'f,R$j)O = (RI-'f,$j)o V j , where the operator I has been introduced in Section 8.1.2, and H+ = H 1 .
Numerical Solution of the Basic Integml Equation in Distributions
37
Since the system {&} is complete in HO (Ho = H" in our case) by the assumption, one concludes that R I - l f = 0. Since I-lf E H- and R ( z ,y) is positive so that ( R g , g ) o > 0 for g # 0 , g E H-, one concludes that I-lf = 0. Since I-1 is an isometry between H+ and H-, one concludes that f = 0. Therefore, by Proposition 3.1, if the system {&} forms a basis of H" then 11 h, - h 11-1-+ 0 as n -+ 00, where h, is the solution to (3.7) of the form (3.6). If { + j } is a basis of H" then the system (3.12) is uniquely solvable for all n if and only if the system {&}, -1 5 j 5 n is linearly independent in H1 for all n. Here the system qbj is defined by formula (3.16). 3.2
Theoretical approaches
1. Let us consider equation (3.2) as an equation with the operator R : -+ H+ which is a linear isomorphism between the spaces H- and H+. The general theory of the triples of spaces H+ c HO c H- is given in Section 8.1, and we will use the results proved in Section 8.1. In our case H+ = H a , Ho = H o , H- = H - a . In general, H+ c Ho and Ho are Hilbert spaces, H+ is dense in Ho, 11 u ll05llu ll+, and His the dual space to H+ with respect to Ho. It is proved in Section 8.1 that there exist linear isometries p+ : Ho -+ H+ and p - : HHo, and (u,'u)+ = (qu,qv)o,where q = pT1. The operator q*, the adjoint of q in Ho, is an isometry of HO onto H-. Let us rewrite equation (3.2) in the equivalent form
H-
-+
(3.17)
Ah0 := qRq*ho = fo, where
fo := qf,
ho := (q*)-'h,
fo E Ho,
ho E Ho.
(3.18)
The linear operator qRq* is bounded, selfadjoint, and positive definite:
(qRq*4,4)o = (Rq*4,q*4)o 2 ci II q*4 /I?= ci I1 4
112,
ci
> 0.
(3.19)
Moreover (3.20) II 4 112, c2 > 0. 11 q*4 11-=11 4 110, and the inequality
(R;4,9-*4)0 5 c2 It q*4 =:1 Here we used the isometry of q*: ci
c2
11 h 11T1 (Rh,h)o 5 c2 11 h 115,
c2
2 ci > 0.
(3.21)
Random Fields Estimation Theory
38
This inequality is proved in Section 3.4, Lemma 3.5, below. Equation (3.17) with a linear positive definite operator A on a Hilbert space Ho is uniquely solvable in Ho and its solution can be obtained by iterative or projection methods. If the solution ho to equation (3.17) is found then the desired function h = q*ho. Let us describe these methods. Let us start with an iterative method. Assume that A is a bounded positive definite operator on a Hilbert space:
0 < m IA 5 M . This means that m 1) q5
\I2<
(Aq5,q5) 5 M
(3.22)
I] q5 )I2,
Vq5 E H . Let
A u = f.
(3.23)
Consider the iterative process
a
%+1 = (1-
2 M+m’
:= -
(3.24)
where uo E H is arbitrary.
Lemma 3.1 There exists limn--rooun= u. This limit solves equation (3.23). One has
This is a well known result (see e.g. [Kantorovich and Akilov (1980)l). We give a proof for convenience of the reader.
Proof. If u, + u in H then, passing to the limit in (3.24), one concludes that the limit u solves equation (3.23). In order to prove convergence and the estimate (3.25), it is sufficient t o check that
II I - aA II<
(3.26)
Q,
where q is defined in (3.25). This follows from the spectral representation:
M-m
(3.27)
Lemma 3.1 is proved.
If A is not positive definite but only nonnegative, and f E R ( A ) ,where R ( A ) is the range of A, then consider the following iterative process un+1+ Aun+l= 1‘1,
+f
(3.28)
Numerical Solution of the Basic Integral Equation in Distributions
39
uo E H is arbitrary.
Lemma 3.2
I f A 2 0 and f E R ( A ) then there exists lim un = u,
(3.29)
n-+w
where u, is defined by (3.28) and u solves equation (3.23).
Proof. I f Un -+ u in H then, passing to the limit in (3.28) yields equation (3.23) for u.In order to prove that u, -+ u one writes equation (3.28) as un+1 = Bun
+ h,
(3.30)
where
B:= (I+A)-',
h:=Bf.
(3.31)
Since A 2 0 one has 0 5 B 5 I , where I is the identity operator in H . Under this condition (0 5 B 5 I ) one can prove [Krasnoselskii et al. 0 (1972),p. 711 that u, -+ u.Lemma 3.2 is proved.
+
Remark 3.1 If A satisfies assumptions (3.22), then 0 < ( M 1)-l 5 B 5 ( m+ l)-', and the iterative process (3.30) converges as a geometrical series with q = ( m+ 1)-'. 2. Let us consider the projection methods for solving equation (3.23) under the assumption (3.22). First, consider the least squares method which is a variant of the projection method. The least squares method can be described as follows. Take a complete linearly independent system { + j } in H . Look for a solution n
u, = c c j + j .
(3.32)
j=1
Find the constants cj from the condition
11 Aun - f I/= min.
(3.33)
This leads to the linear system for cj:
where (3.35)
Random Fields Estimation Theory
40
Since the system q5j, 1 5 j 5 n, is linearly independent for any n, and A is an isomorphism of H onto H , the system (Aq+}, 1 5 j 5 n, is linearly independent for any n. Therefore det aij # 0, 1 5 i , j 5 n, and the system (3.34) is uniquely solvable for any right hand sides and any n. Let c y ’ , 1 5 j 5 n, be the unique solution to system (3.34) and n
(3.36) j=1
Let us prove that u, + u as n m. It is sufficient to prove that the system {A4j}, 1 5 j < 00, is complete in H . Indeed, if this is so, then (1 Au, - f I[-+ 0 as n + 00, where u, is given by (3.36). Therefore ---f
(1 un - 11=11
A-l(Aun - f ) 115 7n-l
11 Aun - f I[+
0.
(3.37)
Here we used the estimate 11 A-’ 115 m-l. It is easy to check that the system {A&}, 1 I j < 00, is complete in H . Indeed, suppose ( h ,A $ j ) = 0 , 1 5 j < 00, for some h E H . Then (Ah,q+) = 0 , 1 5 j < m. Thus Ah = 0 since by the assumption the system {$j}, 1 5 j < 00, is complete in H . Since A-l exists, equation Ah = 0 implies h = 0. We have proved the following lemma.
Lemma 3.3 If A satisfies condition (3.22) and { + j } , 1 5 j < 00, is a complete linearly independent system in H , then the least squares method of solving equation (3.23) converges. Namely: a) for any n the system (3.34) is uniquely solvable and the aproximate solution un i s uniquely determined by formula (3.36), and b) 11 21, - u 1 14 0 as n + 00, where u is the unique solution to equation (3.23). The general projection method can be described as follows. Pick two complete linearly independent systems in H { $ j } and {qj}, 1 5 j < 00. Look for an approximate solution to equation (3.23) of the form (3.32). Find the coefficients c j , 1 5 j 5 n, from the condition
(Aun - f,&) =O,
15 i 5 n.
(3.38)
Geometrically this means that the vector Aun -f is orthogonal to the linear span of the vectors $i, 1 I i 5 n. Equations (3.38) can be written as n
Numerical Solution of the Basic Integral Equation in Distributions
41
where bij =
(A4jIh)l
(3.40)
fi = (f,$i).
The least squares method is the projection method with Gi = A&. In [Krasnoselskii, M. (1972)l one can find a detailed study of the general projection method. Let us give a brief argument which demonstrates convergence of the projection method. Let { $ j } be a complete linearly independent system in H , L, := span{&, . . .&}, Pn is the orthogonal projection on L, in H . An infinite system { # j } is called linearly independent in H , if, for any n, the system {&}, 1 5 j 5 n, is linearly independent in H . Take l(?i = +j and write equation (3.38) as
PnAun = Pn f l un E Ln.
(3.41)
Since u, = Pnun and the operator PnAPn is selfadjoint positive definite on the subspace L, C H , equation (3.41) is uniquely solvable for any f E H and any n. Note that PnAu, = PnAPnun and A satisfies assumes (3.22). To prove that 11 u, - u II+ 0 as n. + 00, let us subtract from (3.41) the equation
P,Ah = Pnf.
(3.42)
The result is
(3.43)
PnAPn(u, - U ) = PnA(u - Pn.).
Since { $ j } is complete, Pn -+ I, as n + 00, strongly, where I is the identity operator. This and the boundedness of A imply
11 PnA(U - P,u)
115 c 1) u - Pnu
I]+
0,
n + 00.
(3.44)
Multiply (3.43) by Pn(u - un) and use the positive definiteness of A to obtain c or
11 Pn(u-Un) 112<(1
u-pnu
IIII Pn(u-un) IIi
Random Fields Estimation Theory
42
We have proved
Lemma 3.4 If (3.22) holds and {4j} i s a complete linearly independent system in H , then the projection method (3.41) for solving equation (3.23) converges, and equation (3.41) is uniquely solvable f o r any n.
3. If one applies the projection method with $j = 4 j to equation (3.17) and chooses a Schauder basis { 4 j } of H , then the matrix . aij := (q&?*+jr+i) = (R,*+j,,*+i)
(3.47)
and the numbers f o i = (fo, 42) = (f,,*&I
(3.48)
can be computed. The parentheses here denote the inner product in H o = Ho, ( u , v ) = (u,v)o. As q*+i := wi one can take a basis of H-: since q* is an isometry between HO and H- it sends a basis {4i}of H , onto a basis {wi} of H-. Let us suggest a system {wi} for computational purposes. Let B be a ball which contains the domain D ,{vj} be the orthonormal in L 2 ( B )system of eigenfunctions of the Dirichlet Laplacian in B :
-Avj
=Xjvj,
vj = O
on d B .
(3.49)
For any --oo < /3 < 00, the system {vj}, 1 I j < 00, forms a basis of HD(B). Indeed the norm in HD(B) is equivalent to the norm
II (-A)p’2u l b ( B ) : = I l
21 11s
and
Therefore, u E H P ( B ) is a necessary and sufficient condition for the Fourier series M
j=1
to converge in H P ( B ) , so that the system {vj} is a basis of H p ( B ) for any p, --oo < p < 00. Moreover, this basis is orthogonal in H f l ( B )for any p, although it is not normalized in H @ ( B )for p # 0. In order to check the orthogonality property, note that
Numerical Solution of the Basic Integral Equation an Distributions
43
( ~ j~, i ) = p ((-A)Pwj, ~ i = A) , P (~w j , ~ i ) o = A,PS,i, where Sji is the Kronecker delta. for any The basis {vJ} can be used therefore as a basis of H- = k p p 2 0. In the case of equation (3.17) with R E R,one has H- = H - a , cy = s(q - p ) / 2 . Although the system {wj}, 1 5 j < 00, is a basis of it is not very convenient for representation of singular functions, such as Sr, for example. The situation is similar to the one arising when 6(z - y) is represented as
(3.52) where { + j } is an orthonormal basis in L 2 ( D ) . Formula (3.52) is valid in the sense that for any f E L 2 ( D )one has M
(3.53) j=1
The sequence &(x
[ [ s,
-
fSn(. - Y)dY - f (.)
y) := C;',
q$(z)c#$(y) is a delta-sequence, that is
---f
0
as n
3 0O
Vf E L2(D). (3.54)
IIW
The series (3.52) does not converge in the classical sense.
3.3
Multidimensional equation
In this section we describe the application of the basic idea presented in Section 3.1 t o the multidimensional equation of random fields estimation theory
s,
R(., y)h(y)&
= f(z), 2 E
D c R'.
(3.55)
We assume that R E R and r is smooth. Let j = (jl, . . .j r ) be a multiindex, Ijl = j , + j 2 + . . .+j,. By {b(s)&}(j)we mean the distribution with support on I' = d D which acts on a test function 4 E CT(R') by the formula ({b(z)sr}(j),4 ) = (-1)ljl
(3.56)
44
Random Fields Estimation Theory
Here b(s) is a smooth function on r. Let us look for an approximate solution to equation (3.55) in H-LYof the form
j=1
i=O m=l
where umi and cj are constants, the system ( q $ } , 1 2 j < 00, forms a basis of HO = L 2 ( D ) ,and the systems {bmi(s)}, 1 5 m < 00, form a basis of L 2 ( r ) , 0 5 (it 5 Q - 1. Here h,, stands for an approximation of ho, the ordinary part of the solution h = h, + h, to equation (3.55), and h,, stands for an approximation of the singular part, h,, of this solution. If f E H 2 q s - p s , then G ( z ) ,defined by formula (2.16), belongs to H2qs and h, the solution to (3.55) given by formula (2.15),belongs to H o = L2(D)in the interior of D ,h, = Q(L)GID,where the symbol hlo denotes the restriction of the distribution h to the interior of D. For example, if D = (t - T , t ) ,h is given by formula (2.49), then hlD = *. The term n
(3.58) j=1
in (3.57) can approximate h, with an arbitrary accuracy in H o , if n is large enough, because the system { $ j } , 1 5 j < 00, forms a basis of Ho. It would be sufficient to assume that {$j} is a complete in H o linearly independent system. The term LY-1 n
(3.59) i = O m=l
can approximate h, in with an arbitrary accuracy, if n is large enough, because the systems {bmi(s)}, 1 5 m < 00, are complete in L 2 ( r )and h, is of the form (see formula (2.16)) (3.60) i=O
where the coefficients bi(s) are the traces on J? of certain derivatives of f(z) (see formula (2.85), for example). If f(z)is sufficiently smooth then the functions bi(s) are in L2(r) and can be approximated in L2(r') with an arbitrary accuracy, if m is large enough, by linear combinations of functions bmi(s) because the systems {bmi(s)}, 1 _< m < 00, are assumed to be complete in L2(r)for any o 5 lil 5 Q - I.
Numerical Solution of the Basic Integral Equation in Distributions
45
Choose coefficients cj and b,i in (3.57) so that
11 Rh,
min.
-f
(3.61)
The variational problem (3.61) leads to the linear algebraic system for the coefficients cj and bmi. The arguments given in Section 3.1 below formula (3.7) remain valid without essential changes. Rather than formulate some general statements, consider an example. Let (3.62)
This is equation (2.84) with a = 1. Look for its approximate solution in fi-l of the form n
n
j=1
m=l
(3.63)
Here a = 1, so that the double sum in (3.59) reduces to the second sum in (3.63), cj and am are the coefficients to be determined by the least squares method (3.61). One has (3.64) m=l
j=1
where (3.65)
Therefore (3.61) yields: n
n
j=1
m= 1
(3.67)
This leads to the linear system for the 2 n coefficients cj and a,:
Random Fields Estimation Theory
46
where
Exercise. Under what assumptions can one prove that 11 Rhn - f Ill-+ 0 as n -+ 00 implies 11 h,, - h, 0 and )I hsn - hs 11-1-+ 0 a~ n -+ CO? 3.4
Numerical solution based on the approximation of the kernel
Consider the basic equation
Rh :=
R(z,y)h(y)dy= f(z),
C
2 E
Rr
(3.73)
with kernel (3.74) where h ( X ) is a positive continuous function vanishing at infinity. Let us call this function the spectral density corresponding to the kernel R(z,y). Assume that, for any E > 0, one can find polynomials Pe(A) > 0 and QE(X) > 0, such that the kernel &(A) := PE(X)Q;l(X) approximates &A) in the following sense: sup{lRe - Q(1+ X2)P}
:=[I
R,
-R
6.
(3.75)
XEA
We assume that for all sufficiently small E , 0 < E one has
< E,,
degQ,(X) - degP,(X) = 2 p > 0, where p does not depend on P + 3
E.
For example, if
A = (-CO,CO),
E,,
is a small number,
(3.76)
Numerical Solution of the Basic Integral Equation in Distributions
47
then /3 = 1. We also assume that (3.77) and inf {R(A)(l+ A2)O} := y1 XEA
:= y2 < m. (3.78)
> 0, sup{I?(A)(l+ XEA
The basic idea of this section is this: if the operator R, : H - p s 4 H p s , with the spectral density fi,(X) is for all E E (O,E,) an isomorphism and the asumptions (3.75)-(3.78) hold with constants /3, c which do not depend on E , then R, the operator with the kernel R(z,y), is also an isomorphism between H - p s and H p s . Therefore the properties of the operator R will be expressed in terms of the properties of the rational approximants of its spectral density. We will need some preparations. First, let us prove a general lemma.
Lemma 3.5 Let H+ c HO c H- be a rigged triple of Halbert spaces, where H- is the dual space to H+ with respect to Ho. Assume that R : H- 4 H+ is a linear map such that ci
(1 h 112-5( R h , h )I c2 (1 h 112,
V h E H-,
(3.79)
where 0 < c1 < c2 are constants, and (f,h) is the value of the functional h E H- o n the element f E H+. Then
II R 1 1 1c2, II R-l I I I c r l , so that R is an isomorphism of H- onto H+. Here
(3.80)
11 R 11
is the norm of
the mapping
R : HProof.
-+
H+.
One has
so that
(3.81)
(3.82)
Random Fields Estimation Theory
48
Let us prove that the map R is surjective, that is, the range of R is all of H+. If it is not, then there exists a 4 E H - , 4 # 0, such that
( R h , 4 )= 0 Qh E H-.
(3.83)
It follows from (3.83) that (h,R4) = 0 Qh E H-.
(3.84)
Therefore Rq5 = 0 and
0 = (R414) L c1
II 4 112-
(3.85)
*
Thus 4 = 0 contrary to the assumption. Therefore the map R is surjective. Let us now prove the first inequality (3.80). One has
where the premum is taken over all
such that
Remark 3.2 The surjectivity of the map R : H- --t H+ follows also from the fact that R is a coercive, monotone and continuous mapping (see, e.g., [Deimling (1g85), p . 1001). Let us now prove
Lemma 3.6 Let R, : H- -+ H+ be an isomorphism for all E E (Oleo), where E , > 0 is a small fixed number. Let R ; H - + H+ be a linear map defined on all of H-. Assume that
and
Numerical Solution of the Basic Integral Equation in Distributions
where M = const > 0 does not depend on is an isomorphism and
11 R-' 111 M(1Proof.
EM)-'
E E
(0, E
~ ) . Then
R : H-
for E M < 1.
49
-+
H+
(3.90)
One has
R = R,
+ R - RE = R,[I + RF1(R- R E ) ] ,
(3.91)
where I is the identity operator on H-. The operator R;'(R - R E )is an operator from H- into H- and
11 RT1(R- RE) I[<
EM
(3.92)
because of (3.88) and (3.89). If E M < 1, then the operator I + R;'(R- RE) is an isomorphism of H- onto H-, and
11 [I + R;'(R
-
111 (1 - EM)-'.
(3.93)
Therefore, the operator R is an isomorphism of H- onto H+,
+
R-' = [I RF1(R- RE)]-lRF1,
(3.94)
11 R-' 111 M ( 1 -
(3.95)
and
E M ) - ' , E M < 1.
Lemma 3.6 is proved.
0
Let us choose H- = H - P ' , HO = H o = L 2 ( D ) ,and H+ = HP", where C is the elliptic operator which defines the kernel R E R of equation (3.73). Note that, by Parseval's equality, one has: s = ordC, and
(3.96) and, similarly, 71 II h 11?p,<
(Rh,h),
(3.97)
where y1 and 'yz are constants from condition (3.78). From (3.96), (3.97) and Lemma 3.5, one obtains Lemma 3.7 If the spectral density R ( X ) of the kernel (3.74) satisfies conditions (3.78) with some /3 > 0, then the operator R with kernel R(x,y),
Random Fields Estimation Theory
50
defined by formula (3.74), is an isomorphism between the spaces H-Os(D) and HD8(D) and
II R I15 YZ, II R-’ 111 where y1 and
72
(3.98)
are constants from Condition (3.78).
Let us discuss briefly the approximation problem. Let R ( X ) be a continuous positive function such that conditions (3.78) hold and (3.99)
where p is a positive integer, and let X = t g ( 4 / 2 ) . Then X runs through the real axis, -00 < X < 00, if 4 runs through the unit circle, -7r 5 4 5 7r. Because of the assumption (3.99), one can identify +00 and -00, and consider the function R ( X ) as a function
R(4) := R
(tg:)
,
(3.100)
which is 27~periodic function defined on a unit circle. Since 1- X2 2x sin4 = - cos4 = 1 1+X2
+
and cos(m4) is a polynomial of degree m of C O S ~ while , polynomial of degree m of cos 4, a trigonometric polynomial
(3.101)
is also a
n
Sn(+):= a,
+ x a j cos(j4) + bj sin(j4),
(3.102)
j=1
where aj are bj are constants, can be written as (3.103)
where c, are some constants. Therefore, if one wishes to approximate a function h ( X ) on the whole axis (-00,co) by a rational function, one can approximate the function R($),defined by formula (3.100) on the unit circle, by a trigonometric polynomial Sn($)and then write this polynomial as a rational function of X as in formula (3.103). The function &(A), defined by formula (3.103), satisfies the condition
&(A)
-73~-’O
as
x -, f00
(3.104)
51
Numerical Solution of the Basic Integral Equation in Distributions
if and only if
p, 0 < p < n, is an integer and
If (3.105) holds then the constant
7 3 in formula (3.104) equals czn-2p. The theory of approximation by trigonometric polynomials is well developed (see [Akhieser (1965)l). In particular, if condition (3.99) holds, then approximation in the norm
sup (1
+ A2)qR(A) - R,(A)I :=/I R - R, I(p
(3.106)
is possible. The norm (3.106) is the norm (3.75) with A = R1. We keep the same notation for these two norms since there will be no chance to make an error by confusing these two norms: in all our arguments they are interchangeable.
Lemma 3.8 If R(X) is a continuous function defined o n R1 = (-cm,cm) which satisfies condition (3.99), then, f o r any e > 0 , there exists a rational function k,= PE(X)QF1(X) such that
II R ( 4 - fi&> llP<
(3.107)
E
and condition (3.76) holds. If R ( X ) > 0 then the polynomials P,(X) and Qe(X) can be chosen positive.
Proof.
+ is continuous on R’ (1+ A’))PR(x) = y3
The function $(A) := (1 A2)”(A) lim
X+foo
because of (3.99). The function $ ( t g ; )
,t g f
and (3.108)
= A, is a continuous function
of C#I on the interval [ - T , T ] . Therefore, there exists, for any given trigonometric polynomial such that
E
> 0, a
where n = n(e) depends on $. From (3.109) and (3.103) it follows that I
212
I
Random Fields Estimation Theory
52
Therefore (3.107) holds with 2n
Pe(X):=
C %Arn,
+
Qe(X) := (1 X2)-n-P.
(3.111)
m=O
In order to prove the last statement of Lemma 3.8, one approximates first the continuous function (1 A2)p/zk1/2(X) by a rational function T(X) in the uniform norm on R' with accuracy 5 E , where c > 0 is a given number. Then the square of this rational function approximates the function (1 X2)pR(X) with accuracy const. E, where the constant does not depend on E. Indeed, if If - TI < E , then
+
+
If2-T21 5
If
-T((maxIfl+maxITI)I~.const.
(3.112)
If T is a rational function then T2(X)= P(A)Q-'(X) where P(X) and Q(X) are positive. Lemma 3.8 is proved. 0 Let us summarize the results in the following theorem.
Theorem 3.1 Let R ( A ) be a continuous positive j h c t i o n o n R' and suppose condition (3.99) holds with a positive integer p. Then: a) for any E > 0, there exists a positive rational function R , ( X ) such that conditions (3.75) and (3.76) hold; b) for all sufficiently small E , 0 5 E 5 € 0 , the operator R,, with the kernel defined by the spectral density R E ( X ) , is an isomorphism of the space fi-os(o) := f i - p s onto Hp"(D) := H p S ; c) the operator R : H-as -+ Hbs is an isomorphism; d) there exist positive constants yo,71, and 7 2 , such that conditions (3.77) and (3.78) hold; e) the following estimates hold:
II R 1 1 17 2 , I1 R-l I 11 where
71
and
72
$7
(3.113)
are the constants in formula (3.78);
f) if y;'~ < 1, then (3.114)
and
53
Numerical Solution of the Basic Integral Equation in Distributions
Proof. The statements (a) to (e) of Theorem 3.1 follow from Lemmas 3.5-3.8. The statement (3.114) is analogous to (3.95) and can be proved similarly. The last statement (3.115) follows immediately from the identity:
R - ~- R;~ = R;~(R~ - R ) R - ~
(3.116)
and estimate (3.114), second estimate (3.113), and the estimate
I) R€- R 115
€9
(3.117)
which is a consequence of (3.75). Let us explain why (3.75) implies (3.117). One has
(3.118) Here H- =
Thus
(3.119) From (3.75) and (3.119) one obtains (3.117). Theorem 3.1 is proved.
[7
It is now easy to study the stability of the numerical solution of equation (3.73) based on the approximation of R(z,y). Consider the equation
Rhs where
f6
= fa,
fs E H+,
(3.120)
1 1 + 1 6.
(3.121)
is the noisy data:
I1 f6 - f
This means that, in place of the exact data f E H,, an approximate data fa is given, where S > 0 is the accuracy with which the given data approximates in H+ the exact data. Suppose that R(X), the spectral density of the given kernel, satisfies condition (3.99) with P > 0 an integer. Take a kernel REE R,such that the estimate (3.117) holds with E > 0 sufficiently
Random Fields Estimation Theory
54
small. This is possible by Theorem 3.1. Then estimate (3.115) holds. Consider the equation
By Theorem 3.1 the operator R, : H- + H+ is an isomorphism if E > 0 is sufficiently small, 0 < E < €0. Therefore, for such E equation (3.122) is uniquely solvable in H-. We wish to estimate the error of the approximate solution:
I1 h - h e 8 11-
=
11 R-lf
- RFlfa
11-
I II R-'(f - fs) 11- + II (R-' - Rr')fa 11I II R-l 1111 f - fa II+ + II R-l - RT' IIII fa II+
+
I YT16 EYC2(1 - EYT')-l
II fa I[+
*
(3.123)
In our case H- = fi-P", H+ = HP". Estimate (3.123) proves that the error of the approximate solution goes to zero as the accuracy of the data increases, that is b --+ 0. Indeed, one can choose E > 0 sufficiently small, so that the second term on the right side of the inequality (3.123) will be arbitrarily small, say less than 6. Then the right side of (3.123) is not more than (7;' 1)b. We have proved
+
Lemma 3.9 The error estimate of the approximate solution ha&is given by the inequality:
11 h - he6 11-1YT'~+ ~ r T ' ( 1 - EYT~)-' II f a II+
(3.124)
where y1 is the constant in condition (3.78). 3.5
Asymptotic behavior of the optimal filter as the white noise component goes to zero
Consider the equation
the
+ Rh, = f ,
E
> 0,
(3.125)
where R E R,or, more generally, R is an isomorphism between H- and H+, where H+ C Ho c H- is a rigged triple of Hilbert spaces. We wish to study the behavior as E -+ 0 of h,, the optimal filter. This question is of theoretical and practical interest as was explained in the Introduction. It
Numerical Solution of the Basic Integral Equation in Distributions
55
will be discussed in depth in Chapter 5. We assume that the estimate CI
11 h 11?L (Rh,h)L ~2.11h 1:
V h E H-
(3.126)
holds, where c1 > 0 and c 2 > 0 are constants and the parentheses denote the pairing between H- and H+. From (3.125) it follows that E
II he 1;
+(Rho h € )= (flhe),
(3.127)
where the parentheses denote the inner product in Ho, which is the pairing between H- and H+ (see Section 8.1). It follows from (3.126) and (3.127) that
II he 11-5 c II f I[+,
c=
CT1,
(3.129)
where the constant c > 0 does not depend on E . Since H- is a Hilbert space, and bounded sets are weakly compact in Hilbert spaces, inequality (3.129) implies that there is a weakly convergent subsequence of h,, which we denote again h,, so that
h,
2
h in H-
as
E
-+
0.
(3.130)
Here 3 denotes weak convergence in H- which means that for any f E H+ one has
Let q5 E H+ be arbitrary. It follows from (3.125) that
One has
56
Random Fields Estimation Theory
where we used estimate (3.129). Therefore one can pass to the limit in equation (3.132) and obtain
E t
0
or
where h E H- is the weak limit (3.130). Since H+ C H- is dense in H- in the norm of H- , one concludes from (11) that
Rh= f.
(3.136)
We have proved the following theorem.
Theorem 3.2 Let H+ C HO C H- be a triple of rigged Hilbert spaces. If R : H- t H+ is an isomorphism, and (3.126) holds, then the unique solution to equation (3.125) converges weakly in H- to the unique solution to the limit equation (3.136). Remark 3.3 The weak convergence in H- is exactly what is natural in the estimation theory. Indeed, the estimate
is the value of the functional (3.15’7) at the element f, hx = h(x,y) E H - , f E H+, x E Rr being a parameter. The error of the optimal estimate (see e.g. formulas (2.96) and (2.108) are also expressed as the values of a functional of the form (3.137)). One can prove that actually h, converges strongly to H-. Indeed, equation (3.125) implies ( h E , h E ) -= ( R h e , h e )I (Rh,h,) = (h,h,)-. Thus llhll-. Choose a weakly convergent in H- sequence h, := h,,, Ilh,ll- I lim E , = 0. Then h, 2 h in H-, llhll- 5 lim, ,wllhnll- and n+w
llhll-. Consequently, llhnll- = llhll-. This and Gn+mllhnll I the weak convergence h, 2 h in H- imply strong convergence in H-, so Ilh - h,ll- = 0.
Numerical Solution of the Basic Integral Equation in Distributions
3.6
57
A general approach
In this section we outline an approach to solving the equation Rh = f
(3.138)
which is based on the theory developed in Section 8.1. Assume that R : H -+ H is a compact positive operator in the Hilbert space H , that is (Rh, h)
> 0 Qh E H , h # 0.
(3.139)
The parentheses denote the inner product in H . The inner product
induces on H a new norm
Let H- be the Hilbert space with the inner product (3.140) which is the completion of H in the norm (3.141). By H+ we denote the dual to Hspace with respect to H = Ho (see Section 8.1). One has
H+cHcH-
(3.142)
where H+ is dense in H and H is dense in H-. The inner product in H+ is (u,v)+ = (R-lu,v)
=
(R-l12u, R-l12v),
u,v E Dom(R-l12).
(3.143)
Therefore (3.144)
One can see that H+ is the range of R112. Indeed, Ran(R1/') C H+ by definition and is closed in H+ norm. Indeed, let fn = R1I2Un and assume that )I fn - fm I ] + - + 0 as n,m -3 co. Then, by (3.144), 11 U n - U m I/--+0 as n,m 00. Therefore there exists a u E H such that )I u, - u /I-+ 0 as n -+ co. Let f := R'12u. Then f E H+ and 11 f - f n Il++ 0 as n --+ 00. Thus Ran(R1/2) is closed in H+, where Ran(A) is the range of an operator A. Since H+ is the completion -+
58
Random Fields Estimation Theory
of Ran(R'l2) in H+ norm, it follows that H+ define the norm in H+ as
= Ran(R'I2).
One can also
(3.145) If and only if the right side of (3.145) is finite one concludes that u E Dom(R-1/2) and obtains from (3.145) equation (3.144). The triple (3.142) is a triple of rigged Hilbert spaces (see Section 8.1), and R : H- + H+ is an isomorphism. Therefore, a general approach to stable solving equation (3.138) can be described as follows. Suppose an operator A : H -+ H is found such that A > 0 and the norm ( A ~ , u )is~equivalent /~ to the norm (3.141). In this case the spaces H+ and H- constructed with the help of A will consist of the same elements as the spaces H+ and Hconstructed above, and the norms of these spaces are equivalent, so that one can identify these spaces. Suppose that one can construct the mapping A-' : H+ -+ H-. This mapping is an isomorphism between H+ and H-. Then equation (3.138) can be written as
Bh := A-lRh = A-lf
:= g 1
(3.146)
where B : H- + H- is an isomorphism. Therefore equation (3.138) is reduced to the equation
Bh=g
(3.147)
which is an equation in the Hilbert space H- with a linear bounded operator B which is an isomorphism of H- onto H-. The operator B in equation (3.148) is selfadjoint and positive in H-. Indeed
(Bh,v)- = (RBh,v)= (RA-lRh,v) = (h,R A - ~ R v = ) (h,Bv)-. (3.148) Moreover
(Bh,h)- = (A-lRh, Rh) > 0 for h # 0
(3.149)
since A and R are positive. Equation (3.148) with positive, in the sense (3.149), isomorphism B from H- onto H- can be easily solved numerically by iterative or projection methods described in Section 3.2. Let us now describe some connections between the concepts of this section and the well known concept of a reproducing kernel.
59
Numerical Solution of the Basic Integral Equation in Distributions
Definition 3.1 A kernel K(x,y), x,y E D c RT, is called a reproducing kernel for a Hilbert space H+ of functions defined o n D , where D is a (not necessarily bounded) domain in Rr, i f for any u E H+ one has (K(x,y),'ZL(d)+ = 4.).
(3.150)
It is assumed that, for every x E D , K ( x ,y) E H+, and H+ consists of functions for which their values at a point are well defined. From (3.150) it follows that ( K ( z , y ) u , u ) +2 0, where K(x,y)u := ( K ( x ,Y), 4 + ,and K ( x ,x) 1 0, K ( x ,Y) = K*(y,x)
IK(x, y)I2 I K ( x ,z)K(y,y) (3.151)
as we will prove shortly. The reproducing kernel, if it exists for a Hilbert space H+, is unique. Indeed, if K1 is another reproducing kernel then (K(z,y) - Kl(z,Y),u(Y))+= 0
k L
E H+.
(3.152)
Therefore K ( z ,y) = K l ( z ,y). The reproducing kernel exists if and only if the estimate
I+)I I c II 1' 1 [I+
hE H+
(3.153)
holds with a positive constant c which does not depend on u.Indeed I4x)I 1 I(K(GY),
U(Y))+l
Thus (3.153) holds with c =I/ K ( z ,y)
1 1 1K(z,y) 11+11
I[+.
21
II+ .
(3.154)
Note that
II K ( x ,Y) I:= (K(z,y),K ( z ,Y))+
= K ( x ,).
(3.155)
because of (3.150). Conversely, if (3.153) holds then, by Riesz's theorem about linear functionals on a Hilbert space, there exists a K ( z ,y) such that (3.150) holds. Since (3.150) implies that, for any numbers t j , 1 5 j 5 n, one has n
/ n
n
\
one sees that the matrix K ( x i , x j ) is nonnegative definite and therefore (3.151) holds.
Lemma 3.10 Assume that D c RT is a bounded domain and the kernel R(x,y) of the operator R : H + H ,H = L2(D) is nonnegative definite and
Random Fields Estimation Theory
60
n.
continuous an x,y E Then the Hilbert space H+ generated by R (see formula (3.143)) is a Hilbert space with reproducing kernel R(x,y). Proof.
If u E H+ then u E Ran(R112) so that there is a v such that
u = R1I2v, v
E
H.
(3.157)
If we prove that the operator R1I2 is an integral operator:
~ ( x= ) R1'2~=
s,
T(~,y)v(y)dy
(3.158)
such that the function
(3.159) is continuous in lu(.)l
n, then (3.158) and (3.144) imply
5
II 21 II=
t ( x ) II R-l12U
II=
t ( 4 II I 4 II+
*
(3.160)
This is an estimate identical with (3.153), and we have proved that this estimate implies that H+ has the reproducing kernel K ( z ,y). To finish the proof one has to prove (3.158) and (3.159). Since D is bounded, the operator R : H -+ H with continuous kernel is in the trace class. This means that
cX j 4 j ( + q Y ) 00
R(G Y) =
(3.161)
j=1
where A 1 2 X2 2 ... > 0 are the eigenvalues of R counted according to their multiplicities, q5j are the normalized eigenfunctions
R(z,y)$j(y)dy = Aj4j(x),
(q5j,q5i)
= &j,
(3.162)
and do
TrR =
Xj =
R(x,x)dx < 00.
(3.163)
j=l
We will explain the second equality (3.163) later. The operator R'I2 has the kernel
T(GY) =
c
A:124j(z)q5;(Y),
(3.164)
Numerical Solution of the Basic Integral Equation in Distributions
61
which can be easily checked: the kernel of R is the composition: (3.165) Therefore (3.166) Therefore (3.158) and (3.159) are proved, and
t ( x ,x) = [R(z,x ) ] ~ ” .
(3.167)
Let us finally sketch the proof of the second equality (3.163). This equality is well known and the proof is sketched for convenience of the reader. It is sufficient t o use Mercer’s theorem : If R ( x , y ) is continuous and nonnegative definite, that is
s,s,
R(x,Y)h(Y)h*(x)dydx 2 0 Qh E L2(D)7
(3.168)
then the series (3.161) converges absolutely and uniformly in x D, If Mercer’s theorem is applied to the series (3.161) with z = y , then (3.169) Thus, (3.163) holds. To prove Mercer’s theorem, note that the kernel &(x,y) := R(z,y ) C;‘, A j + j ( x ) @ ( y ) is nonnegative definite for every n. Therefore R , ( z , z ) 1 0 so that n
Xjl4j(x)I2 5 R(z,x) Qn.
(3.170)
j=1
Therefore the series CE, X j l + j ( ~ > 15~ R ( x , x ) 5 c converges and c does not depend on x because R ( x ,x) is a continuous function in D.Thus, the series
c 00
R(z79) =
j=1
Xj+j(”>+j*(Y)
(3.171)
62
Random Fields Estimation Theory
converges uniformly in
2:
for each y E B.Indeed:
n
5
C X X ~ ~ ~0 ~as( Ym,n) ( ~ t
t 00.
(3.172)
m
Take y = z in (3.171) and get 00
R ( z , z )= CAjl4j(.)l2.
(3.173)
j=l
Since R(z,y ) is continuous in x D, the functions &(z) are continuous in D. By Dini’s lemma the series (3.173) converges uniformly in z E Therefore the series (3.173) can be termwise integrated which gives (3.169). Lemma 3.10 is proved. 0
n.
Exercise. Prove Dini’s lemma: i f a monotone sequence of continuous functions on a compactum D C R’ converges to a continuous function, then it converges uniformly. In Definition 3.1 we assume that the space H+ with reproducing kernel consists of functions u(x) whose values at a point are well defined. This excludes spaces L 2 ( D ) ,for example. If the definition (3.150) is understood in the sense that both sides of (3.150) are equal as elements of H+ (and not pointwise) then the spaces of the type L2 can be included. However, in general, for such kind of spaces the reproducing kernel is not necessarily an element of the space. For example, if H+ = L 2 ( D )then (3.150) implies that K ( z ,y ) is the kernel of the identity operator in L 2 ( D ) . But the identity operator in L 2 ( D ) does not have a kernel in the set of locally integrable functions. In the set of distributions, however, it has kernel S(z- y ) , where S(z) is the delta-function. As the kernel of the identity operator in L 2 ( D ) the delta-function is understood in weak sense:
//+ D
D
-Y)f3ZdY =
s,
f(z)g(z)dz, v l f , g E L 2 ( D ) .
(3.174)
Remark 3.4 W e have seen that the operator R in L 2 ( D ) , D C R’ is bounded, with continuous nonnegative definite kernel R(z,y ) , belongs to the trace class. Therefore R112 is a Hilbert-Schmidt operator. One can prove that such a n operator is a n integral operator without assuming that R1I2 2 0. A linear operator A : H t H o n a Hilbert space is called a
Numerical Solution of the Basic Integral Equation in Distributions
63
Hilbert-Schmidt operator if Cj”=, 11 A4j [I2< 00 where { r $ j } , 1 5 j < 00, is an orthonormal basis of H . Pick an arbitrary f E H , f = Cj”=,( f ,~ $ ~ ) q 5 ~ . Consider
Let H = L 2 ( D ) . Then (3.175) can be written as
Af =
s,
(3.176)
Y)f(Y)dY
with 00
~ ( zY), :=
C aji+;(Y)4i(z),
aji := (+j1
A*+i).
(3.177)
i,j=l
One can check that the series (3.177) converges in L2(D)x L2(D) and (3.178)
Indeed, by Parseval’s equality one has 0 0 0 0
00
i=l j=1
j=1
One can prove that the sum (3.179) does not depend on the choice of the orthononnal basis of H .
This page intentionally left blank
Chapter 4
Proofs
In this chapter we prove all of the theorems formulated in Chapter 2 except Theorem 2.3, the proof of which is given in Section 8.3.2.10 as a consequence of an abstract theory we develop.
4.1
Proof of Theorem 2.1
In order to make it easier for the reader to understand the basic ideas, we give first a proof of Corollary 2.2, which is a particular case of Theorem 2.1. This case corresponds to the assumption P(A) = 1, and in this case the transmission problem (2.18 - 2.20) reduces to the exterior Dirichlet boundary problem (2.22 - 2.23).
Proof of Corollary 2.2 Equation (2.12)
holds if and only if
(Rh,4) = (f,4) v4 E I?--*
(4.2)
Since smooth functions with compact support in D are dense in H--a equation (4.2) holds if and only if
(%4)
= (f,4)
v4 E H,m(D), m L -a,
(4.3)
where Hr(D)is the Sobolev space of functions defined in the domain D with compact support in D. Let us take 4 = Q(C)$, $ E Cr(D).The function 4 E H F ( D ) if the coefficients of the operator L belong to H m ( D ) . 65
Random Fields Estimation Theory
66
We will assume that these coefficients are sufficiently smooth. Sharp conditions of smoothness on the coefficients of the operator L: are formulated in the beginning of Section 2.2. Let us write (4.3) as
(Q(L:Ph+) = (Q(L:>f,+) WJE C?(D).
(4.4)
By the assumption
Q ( L ) R= 6 ( -~ Y),
(4.5)
equation (4.4) reduces to (h,$J)= (Q(L:C)f,$J)
WJE C?'(D).
(4.6)
This means that the distribution h equals to Q ( L )f in the domain D (which is an open set). If f is smooth enough, say f E H q s , then the obtained result says that sing supp h = d D = r,
(4.7)
since in D the distribution h is equal to a regular function Q ( L )f in D. In order to find h in we extend f from D to RT so that the extension F has two properties:
a,
F
is maximally smooth
(4.8)
and
Q(L:)F = O
in R.
(4.9)
Requirement (4.9) is necessary because the function h = Q ( L ) F has to have support in Requirement (4.8) is natural from two points of view. The first, purely mathematical, is: the requirement (4.8) selects the unique solution to equation (4.1) of minimal order of singularity, the mos solution to (4.1). The second point of view is of statistical nature: only the mos solution to equation (4.1) gives the (unique) solution to the estimation problem we are interested in (see formula (2.105)). Let F = u in 52. Then (4.9) says that
n.
Q(L:)u=O in 0.
(4.10)
Since F = f in D , condition (4.8) requires that (4.11)
Proofs
67
where N is the outer normal to r, and one cannot impose more than f boundary conditions on u since the Dirichlet problem in R allows one to impose not more than conditions on I?. Finally one has to impose the condition (4.12)
u(00)= 0.
Indeed, one can consider F as the left hand side of equation (4.1) with h E H - a . In this case it is clear that condition (4.12) holds since R(x,y) + 0 as 1x1 -+ 00. The Dirichlet problem (4.10)-(4.12) is uniquely solvable in H " ( a ) iff E C"(B), I' E C", and the coefficients of C are C". If I'and the coefficients a j ( z ) of C are C", but f E H " ( D ) , m 2 Q! = f ,then the solution u to the Dirichlet problem (4.10)-(4.12) belongs to Hm(D) n H"(R). We assume that a j ( z ) E C" and I' E C". This is done for simplicity and in order to avoid lengthy explanations of the results on elliptic regularity of the solutions and the connection between smoothness of r and of the coefficients of C with the smoothness of the solution u to the problem (4.10)-(4.12). The uniqueness of the solution to the Dirichlet problem (4.10)-(4.12) follows from the positivity of Q(A): the quadratic form (Q(C)u,u),y(n)= 0 if and only if u = 0 provided that u satisfies conditions (4.11) with f = 0. If u is the unique solution to the problem (4.10)-(4.12) then (4.13) Indeed, F E HP,,( R'), and F E H a (R') since u decays at infinity. Since F E H a ( R r ) and ordQ(C) = qs = 2a, one has
h ( z )= Q ( C ) F E R - " ( D ) .
(4.14)
Corollary 2.2 is proved.
Remark 4.1 Consider the operator C = -A+a2, a > 0, in L 2 ( R 3 )with the domain of definition H 2 ( R 3 ) . The operator C i s elliptic selfadjoint and positive definite in Ho = L 2 ( R 3 ) :
(cu,~ 2 u'(u,u), )
a > 0,
vu E HO.
(4.15)
Random Fields Estimation Theory
68
The Green f i n c t i o n of C, which i s the kernel of the operator C-I, i s (4.16)
I t decays exponentially as 1x1 + 00. Exponential decay as 1x1 -+ 00 holds f o r the Green function of & = -A q in HO i f C has property (4.15), in particular i f q(x) 2 go > 0.
+
Exercise. Is it true that if C is an elliptic selfadjoint operator in HO = L2(R")and Q(X) > 0 for all X E R1 is a polynomial, then the kernel of the operator [Q(&)]-l decays exponentially as 1x1 + 00? Proof of Theorem 2.1 Let us start with rewriting equation (4.1) with R E R in the form (4.17)
where S(z, Y ) :=
Q-l(X)@(x, Y , X)dp(X).
(4.18)
Equation (4.17) can be written as
s,
S(S, Y)h(!l)d!/= 9
+V(Z),
(4.19)
where g is a fixed solution to the equation
P(C)g=f
in D
(4.20)
and v is an arbitrary solution to the equation
P(L)v=O in D.
(4.21)
Equation (4.19) is of the form considered in the proof of Corollary 2.2 with g o in place of f . Applying the result proved in this corollary, one obtains the following formula:
+
where
U
in R,
(4.23)
Proofs
69
and
Q(L)u=O in
a,
u(co)=O.
(4.24)
Here g is a particular solution of (4.20) and v is an arbitrary solution to (4.21). Formula (4.22) gives the unique solution of minimal order of singularity, mos solution, to equation (4.1) if and only if G is maximally smooth. If f and are sufficiently smooth, the maximal smoothness of G is guaranteed if and only if the following transmission boundary conditions hold on r: on
(4.25)
Given the orders of the elliptic operators ord P ( L ) = p s ,
ord Q(L)= qs,
(4.26)
9
one cannot impose, in general, more than conditions of the form conditions then the (4.25). We will prove that if one imposes transmission problem (4.20), (4.21), (4.24), (4.25) is uniquely solvable and G E H"(p+Q)/'(R'). Therefore the mos solution h to equation (4.1), given by formula (4.22) in which G has maximal smoothness: G E H S ( p + q ) / 2 ( R r ) , has the minimal order of singularity:
In order to complete the proof one has to prove that the transmis sion problem (4.20), (4.21), (4.24), (4.25) has a solution and its solution is unique. This problem can be written as
P ( L ) G = f in D
(4.27)
Q(L)G=O in R
(4.28)
G(w) = 0
(4.29) (4.30)
+
where and - in (4.30) denote the limiting values on r from D and from respectively.
70
Random Fields Estimation Theory
First let us prove uniqueness of the solution to the problem (4.27)-(4.30). Suppose f = 0. The problem (4.27)-(4.30) is equivalent to finding the mos solution of equation (4.1). Indeed, we have proved that if h is the mos solution to (4.1) then h is given by formula (4.22) where G solves (4.27)-(4.30). Conversely, if G solves (4.27)-(4.30)) then h given by formula (4.22) solves equation (4.1) and has minimal order of singularity. This is checked by a straightforward calculation: for any 4 E Cr(D)one has
where we have used the formula
and the selfadjointness of P ( L ) . Formula (4.31) implies that
Rh = RQ(C)G= f in D.
(4.33)
Equations (4.22)-(4.24)imply that supph c D. It follows from (4.22) and the inclusion G E HS(p+Q)/2(R') that h E f i - a ( D ) , a = (q - p)s/2. Thus, h given by (4.22) with G given by (4.27)-(4.30)solves we have checked that equation (4.1) and belongs to &-a(D), that is h is the mos solution to (4.1).
Exercise. Prove uniqueness of the solution to problem (4.27)-(4.30) in Hs(P+Q)/2 (R T )by establishing equivalence of this problem and the problem of solving equation (4.1) in f i - a ( D ) , and then proving that equation (4.1) has at most one solution in fi-"(D). Hint: If h E H - a and Rh = 0 in D , then
so that h = 0. Let us prove the existence of the solution to the problem (4.27)-(4.30) in Hs(p+Q)/2(R'). Consider the bilinear form
(4.35) defined on the set V := HS(p+Q)12(Rr) n Hsq(R) of functions which satisfy
71
Proofs
the equation: Q(L)4=0
Since P(A)Q(A) 2 c
in 52.
(4.36)
> 0 VA E R1, one has the norm (4.37)
which is equivalent to the norm of Hs(P+q)/2(RT).Indeed (4.38) Therefore
5 d2
/
+
(1 X2)(p+q)/21$12dp(A). (4.39)
A
On the other hand, (4.40) This proves that the norm (4.37) is equivalent to the norm of the space HS(P+q)l2(RT). Consider the form
(4.41) where Parseval’s equality was used. Let W be the Hilbert space which is the completion of V in the norm (4.37). For any f E H a ( D ) ,a = s(q - p ) / 2 , the right-hand side of (4.41) is a linear bounded functional on W . Indeed, extend f to all of RT so that f E Ha(RT) and use Parseval’s equality and
72
Random Fields Estimation Theory
the equation Q(L)11,= 0 to obtain
I II f llffll 11, llwl c II 4 IIW'
(4.42)
where c =I1 f is a positive constant which does not depend on 11, E W . According to Riesz's theorem about linear functionals one concludes from (4.41) and (4.42) that
Itf
where T : H" -+ W is a bounded linear mapping. The function
G=Tf, GEW,
(4.44)
is the solution to problem (4.27)-(4.30) in HS(P+Q)I2(RT). The last statement can be proved as follows. Suppose that G E W satisfies equation (4.41) for all 11, E V . Then G E H S ( P + 9 ) / 2 ( R P so) that equations (4.30) and (4.29) hold, and G solves equation (4.28). In order to check that equation (4.27) holds, let 11, E C r ( R T )in (4.41). This is possible since Cr(RT)c V . Then
s,
{ P ( L ) G- f} q*dy = 0 Vq = Q(L)11,, $ E Cr(RT).
(4.45)
If the set of such q is complete in L 2 ( D ) ,one can conclude from equation (4.45) that equation (4.27) holds. To finish the proof of Theorem 2.1, let us prove that the set {Q(L)$}V$ E CF(R")is complete in L 2 ( D ) . But this is clear because even the smaller set of functions
{Q(L)$} V $ E C " ( D ) , #;11,=0
on I?,
O l j
9s 5-1 2
(4.46)
is dense in L 2 ( D ) . Indeed, the operator Q(L) is essentially selfadjoint in L2(D)on the set
{ + : $ c c w ( D ) , aj,$=o
on
r,
0 5 j < -qs2
l}.
(4.47)
Proofs
73
That is, the closure of Q(C) with the domain (4.47) is selfadjoint in L2 (D): it is the Dirichlet operator Q(C) in L2(D). Since Q(C) is positive definite on the set (4.47), its closure is also a positive definite selfadjoint operator in L 2 ( D ) . Therefore, the range of the closure of Q(C) is the whole space L 2 ( D ) ,and the range of the operator Q(C) with the domain of definition (4.47) is dense in L 2 (D ).This completes the proof of Theorem 2.1.
a Proof of Theorem 2.2
4.2
Let us first prove a lemma of a general nature.
Let
Lemma 4.1
r
and assume that the kernel R ( x ,y) defines a compact selfadjoint operator in H = L 2 ( D )for any bounded domain D c R'. Let Xj(D) be the eigenvalues of R : L 2 ( D )4 L 2 ( D), R#Jj = Xj(o)4j
(4.49)
and A f ( D ) be the positive eigenvalues, ordered so that X f ( D ) 2 A$ 2 ...
(4.50)
and counted according to their multiplicities. Then Af(D2) 2 Xf(D1) V j provided that Dz 3 D1. Proof.
(4.51)
By the well-known minimax principle one has
Xf(D2)=
min
dJi,...,dJj-i
Here (u,v), :=
max
(cp,+i)z=o. 16i6j-1 (b1b)2=1
(R#J,#J)2 := m$npj($).
(4.52)
so, u v * d ~ ,
m = 1,2, and
'J")
:= ( $ , + , ) 2 = 0max , 1
(R4,4)2.
(4.53)
(b1b)2=1
Let D3 := D2 \ D1. If we assume that an additional restriction is imposed on in formula (4.52), namely #J
#J=O
in D3
(4.54)
74
Random Fields Estimation Theory
then p j ( $ ) cannot increase: max
Vj($) :=
(0.@,)2=0,
15a5j-1
(4,4)2=1,4=0
(R4,4 1 2 I Pj ($1.
(4.55)
D3
in
On the other hand, .j($>
=
(O,@i)l=Ot
15isj-1
(R4,4)1.
(4.56)
(4,4)1=1
Since
*
A f( D 1 )= minvj($) 5 minpj($) 11
= Xj(D2),
one obtains (4.51). Lemma 4.1 is proved. Assume now that D expands uniformly in directions, D t o find out if the limit lim X1(D) := X1,
D-+Rr
(4.57)
0 + R'.
We wish (4.58)
exists and is finite. We assume that (4.59)
where R ( X ) > 0 is a continuous function which vanishes at infinity. This implies that the operator R : L 2 ( D )+ L 2 ( D )with kernel (4.59) is compact, as follows from
Lemma 4.2
If R(X) is a continuous function such that lim
A+*,
R(x)= 0,
(4.60)
then the operator R : L 2 ( D )+ L 2 ( D ) ,where D c R' is a bounded domain, with the kernel (4.59) is compact in H = L 2 ( D ) . Proof. Given a number 1x1 > N such that
E
> 0, find a continuous function re(X)= 0 for
Here the number N is chosen so large that
Proofs
75
Denote by R, the operator in L 2 ( D )whose kernel has the spectral density r,(X). Let P denote the orthoprojection in L2(Rr)onto L 2 ( D ) . If R, is the operator with kernel (4.59) ( considered as an operator in L2(R')), then
R
= PR,P
(4.63)
R,
= PR,,P
(4.64)
and
with the same notation for REm. One has
11 R - Re 1(511Rcm - R e ,
Here one has used the fact that the norm of the operator L2(R') with the kernel r(2, Y) =
s,
."(A)%
(4.65)
115 6.
Y,X)dP(X)
T
: L2(R")-+
(4.66)
is given by the formula
Indeed
This proves the inequality
I1 7- 1 1 1y:,
I."(X).
(4.69)
In order to establish the equality (4.67), take the point XO at which the function Ir(X)l attains its maximum. Such a point does exist since the function Ir(A)l is continuous and vanishes at infinity. Then find a +(z), 11 4 l l p ( ~ r ) =1, such that = 0 for IX-XoI > b, where b > 0 is an arbitrary small number. Then, using continuity of ."(A), one obtains
6
=m a If(A)I - q(6) XEA
(4.70)
76
Random Fields Estimation Theory
where ~ ( 6 is ) arbitrarily small if S > 0 is sufficiently small. From (4.69) and (4.70) formula (4.67) follows. From (4.67) and the obvious inequality (1 P (151 one obtains (4.65). If one can prove that the operator R, is compact in L2(D )then Lemma 4.2 is proved, because R can be approximated in the norm by compact operators R, with arbitrary accuracy according to (4.65). Let us prove that the operator R, is compact in L2 (D).One has
Let 11 f IILZ(Rr)< 1. Taking into account that equality, one obtains
La = A@ and using Parseval’s
IrdX)12 II f
= fNN X2Ir,(X)l2lfl2dp(X) I
L c(N).
II$(Rr)
(4.72)
Let us now recall the well-known elliptic estimate ([Hormander (1983-85)]):
II w IIH~(D~)Lc ( D 1 , ~ 2(I1) Lw IILZ(D~) + II w I I L ~ ( D ~ ) )
(4.73)
which holds for the elliptic operator L,ordL = s, and for arbitrary bounded domains D1 c D2 c R‘, D1 is a strictly inner subdomain of D2. Fkom (4.72) and (4.73) it follows that
II ‘w IIHa(D)<
c Vf
E
B1 := {f
:II f
IILz(Rr)<
1} 1
(4.74)
where c > 0 is a constant which does not depend on f E B1. Indeed, the estimate for 11 Lw IltZ(Rr) is given by formula (4.72), and the estimate for 11 w l l L z ( R r ) is obtained in the same way. Inequality (4.74) and the imbedding theorem (see Theorem 8.1) imply that the set { R e f }is relatively compact in L 2 ( D ) . Therefore the operator R, maps a unit ball in L2 (D) into a relatively compact set in L 2 ( D ) .This means that R, is compact in 0 L2( D ) .Lemma 4.2 is proved. In order to proceed with the study of the behavior of Xl(D) as D -+ R’ let us assume that (4.75)
Lemma 4.3 If the kernel R(x,y) defines, for any bounded domain D c R‘, a selfadjoint compact nonnegative operator in L2 (D), and condition
Proofs
77
(4.75)holds, then the limit (4.58) exists and XI,
I A.
(4.76)
Proof. From Lemma 4.1 one knows that X1(D) grows monotonically as D increases in the sense (4.51). Therefore, existence of the limit (4.58) and the estimate (4.76) will be established if one proves that
Xi(D) 5 A for all D
c R'.
Let
(4.77)
= Xl(D)$l. One has
;:irJ,. IR(z7y)IdySUP I41(Y>I.
I
(4.78)
UED
Therefore inequality (4.77) is obtained. Lemma 4.3 is proved.
0
Let us now prove Theorem 2.2.
Proof of Theorem 2.2 We need only prove formula (2.29). Take 41 in Lemma 4.3 with 11 41 l l ~ z ( ~ ) 1. = Extend 41 to all of R' by setting $1 = 0 in R. Then, using Parseval's equality, one obtains:
5 yEyfi(A) 11 41 llL2(Rr)'
yEyfi(A).
(4.79)
Choose $1 with support in a small neighborhood of the point XO at which R(A) attains its maximum. Then X(D) 2 rnaxxR(X) - E , where E > 0 is an arbitrary small number. This proves formula (2.29) and Theorem 2.2 in which w (X ) stands for R( A). 0 We now discuss some properties of the eigenvalues Xj(D). Suppose that R ( z , y ) = R(x - y), R(-z) = R(z), and the domain D is centrally symmetric with respect to the origin, that is, if z E D then -z E D. Let us recall that an eigenvalue is called simple if the corresponding eigenspace is one dimensional.
Random Fields Estimation Theory
78
Lemma 4.4 I f X is a simple eigenvalue of the operator R : L 2 ( D ) -+ L 2 ( D ) with the kernel R(z - y), R(-z) = R ( z ) , and D is centrally symmetric, then the corresponding eigenfunction $
Rc$ = A$
(4.80)
is either even or odd. Proof.
One has
=
s,
R(z - z ) 4 ( - z ) d z .
(4.81)
Here we set y = --3 in the second integral and used the assumption of central symmetry. Therefore 4( -z) is the eigenfunction corresponding t o the same eigenvalue A. Since this eigenvalue is simple, one has $(-z) = c $ ( z ) , c = const. This implies c$(z) = c ~ ( - z )so , that c2 = 1. Thus c = f l . If c = 1 then 4(z) is even. Otherwise it is odd. Lemma 4.4 is proved. 0
Remark 4.2 If X is not simple, the corresponding eigenfunction may be neither even nor odd. For example the operator (4.82)
has an eigenvalue X = 0. The corresponding eigenspace is infinite dimensional: it consists of all functions orthogonal to 1 in L 2 ( - x , x ) . In particular, the function cosy sin y is an eigenfunction which is neither even nor odd and it corresponds to the eigenvalue X = 0. Suppose one has a family of domains D t , 0 < t < 00, such that Dl = D , and Dt = {z : z = t E , < E 01). Then the eigenvalues Xj(Dt) := X j ( t ) depend on the parameter t and one can study this dependence. If one writes
+
then one sees that X j ( t ) are the eigenvalues of the operator R(t) in L 2 ( D ) with the kernel R ( J ,q, t ) := tTR(tJ, tq), where D = D1 does not depend on
Proofs
79
t. This implies immediately that A j ( t ) depend on t continuously provided that
11 R(t’) - R(t) 114 0
as t
-+
t’.
(4.83)
Indeed, maxJAj(t0- Aj(t>l 3
<]I
R(t)- R(t’) 1) .
(4.84)
Estimate (4.84) follows from the minimax principle and is derived in Section 8.3. Condition (4.83) holds, for example, if
for any bounded domain D C R‘. One can also study differentiability of the eigenvalues A j ( t )with respect to parameter t using, for example, methods given in [K]. This and the study of the eigenfunctions as functions of the parameter t would lead us astray.
Proof of Theorems 2.4 and 2.5
4.3
Proof of Theorem 2.4 This proof can be given in complete analogy to the proof of Theorem 2.1. On the other hand, there is a special feature in the one-dimensional (T = 1) theory which is the subject of Theorem 2.4. Namely the spaces of all solutions to homogeneous equations
P(C)q5= 0
(4.86)
QWlCI= 0
(4.87)
and
are finite dimensional (in contrast to the case when T > 1). The system (2.43)-(2.44), for example, is a linear algebraic system. Therefore, existence of the solution to this system follows from the uniqueness of this solution by Fredholm’s alternative.
Let us briefly describe the basic steps of the proof. Step 1. Consider first the case when P(A)= 1.
Random Fields Estimation Theory
80
Lemma 4.5
The set of all solutions of the equation
Rh :=
/t-T
R(z,y ) h ( y ) d y = f (z), t - T 5 z 5 t
(4.88)
with the kernel R(z,y) E R and P(A) = 1 in the space H - s Q := fi-"q(D>, D = (t - T ,t ) , is in one-to-one correspondence with the set of the solutions of the equation
I
00
R ( z , y ) h ( y ) d y = f (Z)l
2
E R17
(4.89)
J-00
where h =Q
h E H-'q(R1) and supph
(
W
(4.90)
7
c D = [t- T ,t ] . Here x
(4.91)
b i are arbitrary constants, and the functions {@}, 1 5 j 5 q s / 2 , form a fundamental system of solutions to equation (4.87) such that $T(+Oo)
Proof. Let h E H-4' q5 E C r ( R 1 ) one has
= 0,
$Y(-Oo) = 0.
(4.92)
be a solution to (4.88). This means that for any
(Rh,4) = ( F ,4
7
vq5 E cr(R1)7
(4.93)
where the parentheses in (4.93) denote the L 2 ( R 1 )inner product and F is given by (4.91). One can say that F is defined to be the integral Rh for x > t and x < t - T (see formula (4.88)). This integral for h E H-Q' can be considered as the value of the functional h on the test function R(z,y), since, for a fixed Z,R(x,y) E HEc outside of an arbitrarily small neighborhood of the point Z. Since Q(L)R= d(x - y), the kernel R does not belong to Hqs, however, &(z - y) can be interpreted as a kernel of the identity operator in L2, so that (S(Z - y), q5) = in the sense that (d(Z - Y M 7 $1 = ($7 $1 v47 E L2. Since h E H-q' one can consider it as an element of H - q 5 ( R 1 ) . Equation (4.93) then is equivalent to (4.89). The correspondence mentioned in Lemma 4.5 can be described as follows. If h E H-q'(R'),
+
a1
Proofs
supph E [t- T ,t], and h solves equation (4.89), then h = h solves equation (4.88). Conversely, if h E h-qs solves equation (4.88), then h = h is an element of H-Q”(R1)with supph E [t- T,t ] ,and h solves equation (4.89). The constants :b in (4.91) can be uniquely determined by the given h from the formula
Lemma 4.5 is proved. Let
[:
Example 4.1
(4.94)
e-l”-”lh(y)dy = f(z), -1 I zI 1.
Here L: = -i&, T = 1, P(X) = 1, Q(X) = *, Q(L) = a = ddz . (See Section 2.4.) Equation (4.87) has the solutions $- = exp(z),
(-a2 + 1)/2,
$+ = exp(-z).
(4.95)
Choose F by formula (4.91) with some b*, qs = 1, t = 1, t - T = -1. Then I
h = -(-a2 2 -
+
2
+
+l ) F +
- 1) [ f ( l )
- b+e]
+ S(z - 1) [f’(l) - b+e]
+ l)[b-e - f(-1)] + S(z + l)[-f’(-1)
- b-el.
(4.96)
For convenience of the reader let us give all the details of the calculation.
By definition of the distributional derivative one has b+
(Q(L:)F,4 ) = ( F ,Q ( 0 4 ) = 2
/
1
O0
exp(z)(-4”
After integrating by parts two times one obtains
+ 4Pz
Random Fields Estimation Theory
82
Formula (4.97) is equivalent to (4.96). Corollary 2.5 follows from Lemma 4.5 immediately. Indeed, the solution of minimal order of singularity one obtains from formula (4.90) if and only if the constants bj' are chosen so that F has minimal order of singularity, that is, if F is maximally smooth. This happens if and only if the following conditions hold:
[F(j)(t)= ] 0,
[FC1)(t- T ) ] = 0,
0 Ij
I9s - 1. 2
(4.98)
- 0)
(4.99)
Here, for example,
[F(3)(t)]:= F q t + 0) - F q t
is the jump of F ( j ) ( z )across point t. If conditions (4.98) hold, one can rewrite formula (4.90) as formula (2.47). Exercise. Check the last statement.
Hint: The calculations are similar to the calculations given in Example 4.1. Step 2. Assume now that P(X) $ 1. Write equation (4.88) as rt
(4.100) where (4.101) Rewrite equation (4.100) as (4.102) 1 5 j 5 ps, is the fundamental system of solutions to equation (4.86), go is a particular solution to the equation
Here
{$j},
P(C)g = f, t - T 5 z 5 t ,
(4.103)
and cj, 1 5 j 5 p s , are arbitrary constants. Apply to equation (4.102) the result of Lemma 4.5, in particular, use formula (4.90) to get the conclusion of Theorem 2.4. Equations (2.43)-(2.44)
Proofs
83
are necessary and sufficient for G, given by formula (2.38)’ to have minimal order of singularity and, therefore, for h, given by formula (2.37)’ to have minimal order of singularity. As we have already noted, the solvability of the system (2.43)-(2.44) for the coefficients c j , 1 5 j 5 p s , and b:, 0 5 j 5 f - 1, follows from the fact that the homogeneous system (2.43)(2.44) has only the trivial solution and from Fredholm’s alternative. The fact that for f ( x ) = 0 the system (2.43)-(2.44) has only the trivial solution can be established as in the proof of Theorem 2.1. Corollary 2.1 follows from formulas (2.15)-(2.20) immediately. Exercise. Give a detailed proof of the uniqueness of the solution of the homogeneous system (2.43)-(2.44). Hint: A solution to this system generates a solution h to equation (4.88) with f = 0, h E $l-a.Use Parseval’s equality to derive from
Rh=O
t-T<X
hEHFa
(4.104)
that h = 0. If h = 0 then c j = bj’ = 0 for all j .
Proof of Theorem 2.5 The proof is similar to the proof of Theorem 2.4. Equation (2.65) rt
where g(x) is the right hand side of equation (2.65), can be written as
L 00
S ( X p)h(y)dy , = G(x),
-00
< x < 00,
(4.106)
where G ( x ) is given by (2.67) and supph
C [t- T ,t ] .
(4.107)
There is a one-to-one correspondence between solutions to equation (4.105) in 7 k a and solutions to equation (4.106) in ‘FI-*(R1) with property (4.107). Equation (4.106) can be solved by the formula h = Q(L)EG,
(4.108)
and h given by formula (4.108) has property (4.107). This h has minimal order of singularity 5 (I! = The constant vectors b: and c j solve linear algebraic system (2.68-2.69). That this system is solvable follows
y.
Random Fields Estimation TheonJ
84
from Fredholm's alternative and the fact that this system has at most one solution. The fact that this system has at most one solution follows, as in the proof of Theorem 2.4, from Parsed's equality and the positivity of the kernel R(x, y). Theorem 2.5 is proved.
0 4.4
Another approach
The remark we wish to make in this section concerns the case of the positive definite kernel R(x, y) which satisfies the equation
Q(x, ~ ) R ( xY) , = P(x,W ( X - Y),
(4.109)
where Q(x, a) and P(x, a) are elliptic positive definite in L 2 ( R r )operators, not necessarily commuting, ordQ(z, a) = n, ordP(z, a) = m, m < n, and R(z,y) 4 0 as IX - y J ---f 00. As in Section 4.1, we wish to study the equation
Rh= f,
XED
(4.110)
y,
with f E H"(D), a := to prove that the operator R : h-"(D) + Ha(D) is an isomorphism, and to give analytical formulas for the solution h of minimal order of singularity. Consider equation (4.110) as an equation in L2(Rr) Rh = F,
x E R',
(4.111)
with
f inD u in52:=Rr\D,
(4.112)
where Q(z,d)u = 0 in 0.
(4.113)
Equation (4.113) follows from equation (4.109) and the assumption that supph c D.The function F has minimal order of singularity if and only if u solves (4.113) and satisfies the boundary conditions n (4.114) a"Nu=8Nf on r, O < j < - - l , u(00)=0. 2
Proofs
85
The problem (5)-(6) is the exterior Dirichlet problem which is uniquely solvable if Q is positive definite. If u solves problem (4.113)-(4.114) then F E H ( n - m ) / 2 ( R T )suppQ(z,a)F , c Dl Q F E H - ( n + m ) / 2 ( R r ) Apply . Q(z, a) to both sides of equation (4.111) and use equation (4.109) to get
P ( x ,a)h = Q ( x ,d ) F , QF E H-(n+m)/2(RT),suppQF c D. (4.115) The solution to (4.115) in the space f i - ( n - m ) / 2 ( D exists, ) is unique, and is the solution to equation (4.110) of the minimal order of singularity. More details are given in Appendix B.
This page intentionally left blank
Chapter 5
Singular Perturbation Theory for a Class of F'redholm Integral Equations Arising in Random Fields Estimation Theory A basic integral equation of random fields estimation theory by the criterion of minimum of variance of the estimation error is of the form Rh = f, where Rh = R ( z ,y)h(y)d y , and R(z,y ) is a covariance function. The D
singular perturbation problem we study consists of finding the asymptotic behavior of the solution to the equation E ~ ( z E)+Rh(x, , E ) = f(x),as E + 0, E > 0. The domain D can be an interval or a domain in R", n > 1. The class of operators R is defined by the class of their kernels R(x,y) which solve the equation Q(z,D,)R(z, y) = P(zlDz)d(z - y), where Q(z, D z ) and P(x,D,) are elliptic differential operators. The presentation in this chapter is based on [Ramm and Shifrin (2005)l.
5.1
Introduction
Consider the equation
Eh(z,E)t- Rh(x,E)= f(z), z E D c R"
,
(5.1)
where D is a bounded domain with a sufficiently smooth boundary aD, and
J
D
In this paper we study the class R of kernels R(x,y) which satisfy the equation
87
Random Fields Estimation Theory
88
and tend to zero as Iz - y( -, 00, where Q(z, D,)and P ( z ,0,) are elliptic differential operators with smooth coefficients, and b(x - y) is the deltafunction. For technical reasons below we use the kernels R(a,y) of the same class, but written in a slightly different form (see (5.5)). Specifically, we write
R(z,Y) = P ( Y , D,)G(z, Y)
(5.2)
where
aa(y)D; ,
P(Y, Dy) =
b p ( ~ ) @, P
Dz)=
Q(2,
< q,
(5.3)
18114
l 4 l P
and
Note that
&(a,Dz)R(z,Y) = P(Y, Dv)b(z - Y) .
(5-5)
In this paper all the functions are assumed to be real-valued. We assume that the coefficients a,(z), b p ( z ) and f(z)are sufficiently smooth functions in R", Q = ( ( Y I , . " ,an) and p = ( P 1 , . - . , p n ) are multiindices, la1 = n n alal 3181 . Sufficient C ~ i IPI , = C P j , Dy*= . . .ay:, ' D,P = i=l
j=1
azp . . . azk
aY:'
smoothness of the coefficientsmeans that the integrations by parts we use are justified. The following assumptions hold throughout the paper: A1) (Q(s, D,)cP,'P) 2
CI(CP, 'P)
(P(., D,)cp, cp) 2 c2(cp,cp) ,
, c 1 = const > 0 V V ( ~ E) C r ( R n )
c2 = const
7
>0
v'cp(z) E
cr(Rn),
(5.6) (5.7)
where (., .) is the L2(Rn)inner product, and L2 is the real Hilbert space. By Q*(z, D,)and P*(z,D,)the operators formally adjoint to Q(z, D,) and P ( z ,D,) are denoted. If (5.6) holds, then q > 0 is an even integer, and (5.7) implies that p is an even integer, 0 5 p < q. Define a := ( q - p ) / 2 . Let H x ( D ) be the usual Sobolev space and f i - x ( D ) be its dual with respect to L 2 ( D ) = H o ( D ) . Denote Ilcpllx = IIPIIHACD, for X > 0 and Ilcpllx = IlcpllgAcD, for X < 0. Let us denote, for the special value X = a, H a ( D ) = H+, H - a ( D ) = H-. Denote
Singular Perturbation Theory
89
by (hl,h2)- and by (-,.) the inner products in H- and, respectively, in L 2 ( D ) . As in Chapter 8, let us assume that
c3 =const > O,Vcp(z) E CF(R"). (5.8) This assumption holds, for example (see Chapter 8) if
A2)
~3(IcpIl?
I (Rcp,cp)5 ~4llcp11?,
~sllcpll(p+q)/2 I
IlQ*cpII-a 5 c~llcpll(~+~)/2 , c5 = const > o , Vcp(.)
E CoOO(R")7
(5.9)
and
C7IIVII(p+q)/2I (f'Q*CP,CP)
I C811'Pll(p+q)/2,C7
= const
>0,
Vcp(z) E C?(Rn).
(5.10)
The following result is proved in Chapter 8. Theorem 5.1 If (5.8) holds, then the operator R : H- -+ H+ is an isomorphism. If QR = P6(x - y) and (5.9) and (5.10) hold, then (5.8) holds.
Equation (5.1) and the limiting equation Rh = f are basic in random fields estimation theory, and the kernel R(z,y) in this theory is a covariance function, so R(x,y) is a non-negative definite kernel:
( R V ,cp) 2 0 , V c p ( X ) E CoOO(Rn)* If p
<
q, then the inequality
(Rcp,cp) L C(cp,cp), C = const > 0 ,
Vcp(z) E C r ( R n )does not hold. In [Ramm and Shifrin (1991); Ramm and Shifrin (1993); Ramm and Shifrin (1995)] a method was developed for finding asymptotics of the solution to equation (5.1) with kernel R(x,y) satisfying equation (5.5) with Q(x, 0%) and P ( z ,Dz)being differential operators with constant coeffi-
cients. Our purpose is to generalise this theory to the case of operators with variable coefficients. In Chapter 8 the limiting equation Rh = f is studied for the above class of kernels. In Chapter 2 the class of kernels R(z,y), which are kernels of positive rational functions of an arbitrary selfadjoint in L 2 ( R n ) elliptic operator, was studied. In Section 5.2 we prove some auxiliary results. In Section 5.3 the asymptotics of the solution to equation (5.1) is constructed in case n = 1, that is, for one-dimensional integral equations of class R defined below formula
Random Fields Estimation Theory
90
(5.1). In Section 5.4 examples of applications of the proposed asymptotical solutions are given. In Section 5.5 the asymptotics of the solution to equation (5.1) is constructed in the case n > l , and in Section 5.6 examples of applications are given. 5.2
Auxiliary results
Consider now the case n = 1:
In this case D = (c,d ) , B = [c,d ] .
Lemma 5.2
If g(y) is a smooth function in B,then d
d
C
C
(5.17)
Singular Perturbation Theory
91
where
and
Kz=O if p = o .
(5.19)
Proof. Use definition (5.16) of P ( y , Dy) in (5.17), integrate by parts, and get formulas (5.17) - (5.19). Lemma 5.3
If g ( y ) is
a
smooth function in fs, then
d
d
C
(5.20) where
Proof. Similarly t o Lemma 5.2, integrations by parts yield the desired formulas. 0 Consider the case n
Lemma 5.4 inn,then
> 1.
If P ( y , Dy)is defined in (5.3) and g ( x ) i s a smooth function
D
(5.22) where
92
Random Fields Estimation Theory
Here aD is the boundary of D, y E aD, Nk(y) is the k - th component of the unit normal N to dD at the point y, pointing into D' := R" \ D , and if a k = 0 then the summation over ^(rc should be dropped. Proof.
Apply Gauss' formula ( i.e. integrate by parts ).
Lemma 5.5 inn, then
If Q ( x ,Dz)is defined in (5.3) and g(y) is a smooth function
where n
Pk
(5.25)
Here y E d D , and if P k = 0 then the summation over dropped.
Remark 5.1
For any smooth in
^/k
should be
function g(x), one has
Q ( x ,D,)Kjg(z) = 0
2
j = 1,2,
E
(c,d )
2
E D , j = 1,2.
(5.26)
and
Q(x,D,>Mjg(x) = 0 ,
(5.27)
Formulas (5.26) and (5.27) follow from the definitions of Kj and Mj and from equation (5.4).
Singular Perturbation Theory
5.3
93
Asymptotics in the case n = 1
To construct asymptotic solutions to equation (5.1) with R(z,y) E R we reduce this equation to a differential equation with special, non - standard, boundary conditions.
Equation (5.1) is equivalent to the problem:
Theorem 5.2
E Q ( ~Dz)h(z, , E ) + P*(z,D z ) h ( z , ~=) Q ( x , Dz)f(z)
2
7
E (c, d)
(5.28)
with the following conditions EKlh(z,E) - Kzh(z,E) = K , f ( z ) . Proof.
(5.29)
If h ( z ,E ) solves (5.1) and R(z,y) satisfies (5.2), one gets d
Eh(2,E ) +
/
m y , Q)G(z, Y)1
h(Y, dY = f(z)
*
(5.30)
C
From (5.30) and (5.17) one gets:
/ d
~ h ( z€1, +
G(z, Y) [P*(Y, D,)h(y, &)I dy + K2h(z,E ) = f(z).
(5.31)
C
Applying Q ( z , D z )to (5.31) and using (5.4) and (5.26), yields (5.28). Let us check (5.29). From (5.28) and (5.31) one gets: d
~ h ( zE ),
+
/
G(z, Y) Q(Y, 4)If (Y) - E ~ ( Y &>I , dy + Kzh(z,E ) = f(z>.
C
(5.32) From (5.32) and (5.20) one obtains
+Kl(f - c h ) ( z ,&)
+ KZh(Z,&)= ).(f
'
(5.33)
From(5.33) and (5.11) one concludes:
~ h ( 2E ),
+ f ( ~ -) Eh(2,E)+ K l f ( z )- EKlh(2, + K z h ( z , ~=) f(z)
This relation yields (5.29).
E)
Random Fields Estimation Theory
94
Let us now assume (5.28) and (5.29) and prove that h(z,E ) solves (5.1). Indeed, (5.2) and (5.17) imply d
+
c~(z,E)
/
d
R(z,y ) h ( y , dy
= E ~ ( z &) ,
C
+
/ [%,
D,)G(z, Y)] h(y, E ) dy
C
C
From (5.34) and (5.28) one gets
eh(z,&)
+ Rh(z,
E)
= Eh(2,E )
d
+/
[Q*(y,D,)G(z,y)l(f(y) - & h ( y , ~ ) ) d y + K i ( f - & h ) ( 2 , &+K2h(z1&). )
C
This relation and equation (5.11) yield: ch(z, E )
+ Rh(z,E ) = E h ( Z , E ) + f(z)+ K l f ( X ) - EKlh(Z, + K2h(z,
Eh(Z, E )
E)
E) >
0
and, using (5.29), one gets (5.1). Theorem 5.2 is proved.
This theorem is used in our construction of the asymptotic solution to (5.1). Let us look for this asymptotics of the form: 03
h(Z,E) = CE'(Ul(.)
l=O
03
(5.36)
+ W ( 2 , & ) ) = - p h l ( z , E )1 1=o
where the series in (5.36) is understood in the asymptotical sense as follows: L
h(2,E) = C & ' ( U l ( Z ) 1=o
+ Wl(Z,&)>+ O ( E L + l ) as
E
0
Singular Pertarbation Theory
95
where O ( E ~ + 'is) independent of 2 and ul(2) and w ~ ( z , Eare ) some functions. Here uo(2)is an arbitrary solution to the equation (5.37)
P*(2,Dz)u0(2)= Q(z,D z ) f ( ~ ) . If uo(z) is chosen, the function w O ( Z , E ) lution to the equation: €Q(Z, D,)WO(Z,E )
is constructed as a unique so-
+ P*(2,Dz)WO(~:, E) = 0
1
(5.38)
which satisfies the conditions
Theorem 5.3 tion
The function ho(2,E ) = uo(2)
+
WO(Z, E )
solves the equa-
Random Fields Estimation Theory
96
-cho(y,E)
+ EUO(Y)l 44 + Kzho(2,E).
(5.44)
Equations (5.44) and (5.20) yield: EhO(5, E )
+ Rho(2,
E)
+Kl(f(Z) - E h O ( Z , E )
= eho(z,E )
+ E U O ( Z ) ) + K2hO(Z,E).
(5.45)
From (5.45) and (5.11) one derives: &ho(X,&)
+ Rho(z,E ) = E h O ( 5 , + f ( X ) E)
-EK1Wo(X,E)
- E h O ( X , E ) + EUO(2)
+ Kzuo(z)+ K2WO(X,E).
(5.47)
Equations (5.47) and (5.39) yield (5.40). Theorem 5.3. is proved.
0
Let us construct higher order approximations. If 1 2 1 then u l ( z ) is chosen to be an arbitray particular solution to the equation
p * ( z ,DZ)ui(z)= -Q(z, D ~ ) W - I ( Z ) . After U ~ ( Z )is fixed, the function solution to the equation
W ~ ( Z , E )is
(5.48)
constructed as the unique
EQ(~ D,z ) ~ ( zE ,) + P*(x, D z ) ~ ( xE ,) = 0 1
(5.49)
satisfying the conditions & K l W l ( X , E ) - Kzwl(z,E)= -Klul-l(z)
+
KZUl(2).
(5.50)
Singular Perturbation Theory
Theorem 5.4
The function hz(x,E) = u ~ ( s ) + w ~ ( ~solves , E ) the equation Ehl(Z,E )
Proof.
97
+ Rhl(x,
E)
+
= -uI-~(x)
EU~(Z).
(5.51)
The proof is similar to that of Theorem 5.3 and is omitted.
0
Define L HL(2,E)
Theorem 5.5
(5.52)
= C E l h l ( 2 ,E ) . z=o
The function HL(x,E ) solves the equation E H L ( 2 ,E )
+ RHL(Z, E ) = f(z)+
EL+lUL(Z).
(5.53)
Proof. From (5.52) one gets
z=o
k 0
L
=
CE'[Ehz(x, + Rhi(z, E)
E)]
.
(5.54)
1=0
Using (5.40), (5.51) and (5.54) yield (5.54). Theorem 5.5 is proved.
0
Theorem 5.6 If the function f(x) is suficiently smooth in B,then it is possible to choose a solution uo(x)to (5.37) and a solution ul(x)to (5.48) so that the following inequality holds
IIHL(z,E)- h(x,E)II- 5 CeL+' ,
(5.55)
where C = const > 0 does not depend on E, but it depends on f (x). Proof.
From (5.1) and (5.54) one obtains
E(HL(z, E ) - h(x,E ) )
+ R(HL(z,
E ) - h ( z ,E ) )
= E ~ + ~ U L ( Z ) . (5.56)
From (5.56) it follows that E P L b , E ) - h ( z ,E ) , HL(2, ). - h(x,E ) )
Random Fields Estimation Theory
98
Inequality (5.55) follows from (5.58) if the norm I l u ~ ( z ) l l +is finite. Consider L = 0. If f(z)E H 3 ( 4 - p ) I 2 ( D ) then it is possible to find a solution of (5.37) uo(z) E H(q--P)/2(D).Thus the norm Iluo(z)II+ is finite. For L = 1 suppose that f(z) E H5(9-P)/2(D). Then there exist a solution to (5.37) uo(z) E H3(q-P)/2(D)and a solution to (5.38) u1(z) E H(Q-P)/2(D) = H+ so that the norm I[ul(z)ll+is finite. ) (5.55) can If f(z)E Cm(B) then the approximation H L ( ~ , Esatisfying 0 be constructed for an arbitrary large L.
5.4
Examples of asymptotical solutions: case n = 1 Let
Example 5.1
+ / e-alz-ylT(y)h(y,&) dy = f(z), 1
eh(z,E )
(5.59)
-1
where ~ ( y 2) C2 > 0 is a given function. In this example the operators P ( y , Dy) and Q(z,Dz)act on an arbitrary, sufficiently smooth, function g(x) according to the formulas:
and
One has
Equation (5.37) yields
(5.60) and (5.38) takes the form &
-(-wg(z, 2a
E)
+ a2wo(z,&))+ r(z)wo(z,
E)
= 0.
(5.61)
Singular Perturbation Theory
99
If one looks for the main term of the asymptotics of can solve in place of (5.61) the following equation E
-- w:a(x, 2a
E)
+ T(x)wo,(z,
E) = 0,
~ ( z , E ) , then
one
(5.62)
where woa(x, E ) is the main term of the asymptotics of wo(x, E ) . We seek asymptotics of the bounded, as E -+ 0, solutions to (5.61) and (5.62). To construct the asymptotics, one may use the method, developed in [Vishik and Lusternik (1962)l. Namely, near the point x = -1 one sets x = y - 1, y 2 0, and writes (5.62) as: E
-2a vC(Y~E ) + T(Y - 1)V a ( y ,6 ) = 0, where va(y, E ) := W o a ( y - 1,E ) . P u t y = t& and denote c p a ( t , E ) := wa(t&,E). -2a
d2cpa(t7E)
dt2
+
T(t&
(5.63)
Then
- 1)cpa(t,E ) = 0
(5.64)
Neglecting the term t& in the argument of T is possible if we are looking for the main term of the asymptotics of pa,Thus, consider the equation: (5.65)
Its solution is
Discarding the unbounded, as t
-+
+oo, part of the solution, one gets
cpa(t,€1 = C l e - m t . Therefore, the main term of the asymptotics of woa(x,E ) near the point z = -1 is:
Similarly one gets near the point x = 1
From (5.66) and (5.67) one derives the main term of the asymptotics of the bounded, as E t 0, solution to equation (5.62): W O , ( ~ , E= ) Cle-J-(l+z)
+
Dle-d-(l-z)
(5.68)
Random Fields Estimation Theory
100
Now the problem is to find the constants C1 and D1 from condition (5.39). Since p = 0, formula (5.19) yields K2 = 0, and (5.39) is: E K l W O ( Z , E ) = Klf(Z)
(5.69)
.
From (5.69) and (5.21) one gets
(5.70) Note that d G b y) - -ae-alz--yl
aY
sgn(y - z), where s g n ( t ) = t/lti, so
From (5.71) and (5.70) one obtains - woa(- 1,e)ae-Q(l+X)
E{wOa(1, E ) (-u)e-a(l-z)
-f(
-qae-41+4 - f’(l)e-a(l-z)
+ f’(-l)e-
a(l+x).
(5.72)
This implies: E(aWOa(1, E )
+ wha(1, E l } = af (1) +
m
7
and
+ Wha(-1,E)}
E{-awo,(-l,&)
= -af(-l)
+ f’(-l)
Keeping the main terms in the braces, one gets: J z q p 1
= f’(1)
+af(l),
and
-d-C,
= f‘(-1) - af(-1).
(5.73)
Singular Perturbation Theory
101
Therefore
From (5.60), (5.68) and (5.74) one finds the main term of the asymptotics of the solution to (5.59):
(5.75) If r ( z ) = const, then (4.17) yields the asymptotic formula obtained in [Ramm and Shifrin (199 l)].
Example 5.2
Consider the equation d
Ebb,E ) +
/
G(z,y)h(y, ). dY = f(.)
1
(5.76)
C
where G(z, y) solves the problem - d2G(x7
8x2
and u 2 ( x ) 2 const
+ u 2 ( x ) G ( zy), = &(a: - y) ,
G(m, y) = 0 ,
(5.77)
> 0, Vx E R1.
d2 Here P ( y , Dy) = I, p = 0, Q(z, DZ) = -dx2 One can write G(z,y) as
+ a 2 ( 2 ) ,q = 2. (5.78)
where functions cpl(z)and c p z ( 2 ) are linearly independent solutions to the equation Q(x,D,)cp(z) = 0, satisfying conditions ‘p1(--m =)0, cp2(+m) = 0 and
By (5.37) one gets uo(2) = -f”(2)
+ a2(.)f(.).
(5.80)
Random Fields Estimation Theory
102
By (5.38) one obtains E(-WG(Z,
The main term tion:
e)
+ a2(z)Wo(z, +
WO,(Z, E )
E))
W&,
of the asymptotics of
--Ew&(z,e)
E)
= 0.
WO(Z, E )
solves the equa-
= 0.
+WOa(z,E)
Thus W O aE( )z = , ce-(=-c)/&
+ De-(d-4/&,
(5.81)
Condition (5.39) takes the form (5.69). Using woa(z,e) in place of in (5.69), one gets, similarly to (5.70), the relation
WO(Z,E)
Keeping the main terms, one gets (5.82)
+fI(C)(P2(4cpl(C)
.
(5.83)
Because c p l ( ~ )and cpz(z) are linearly independent, it follows from (5.83) --EW;),(d,E)(P2(4 =
f(4cpm - f W c p a ( 4
7
Singular Perturbation Theory
103
This yields the final formulas for the coefficients:
From (5.80), (5.81) and (5.85) one gets the main term of the asymptotics of the solution to (5.76):
(5.86)
5.5
Asymptotics in the case n
>1
Consider equation (5.1) with R(z,y) E R. The method for construction of the asymptotics of the solution to (5.1) in the multidimensional case is parallel to the one developed in the case n = 1. The proofs are also parallel to the ones given for the case n = 1, and are omitted by this reason. Let us state the basic results. Theorem 5.7 EQ(z,
Equation (5.1) is equivalent to the problem
+
Dz)h(z,~) P*(z,Dz)h(z,~ =)Q(z,Dz)f(z) 7
(5.87)
&MIh(Z,&) - M2h(z,E ) = M1f(z) .
(5.88)
Proof. One uses Lemmas 5.1, 5.4 and 5.5 and formula (5.27) to prove Theorem 5.7. 0
To construct the asymptotics of the solution to equation (5.1), let us look for the asymptotics of the form: M
M
where uo(z) is an arbitrary solution to the equation (5.90)
Random Fields Estimation Theory
104
and if some uo(x) is found, then solution to the problem
WO(Z,E)
is uniquely determined as the
EQ(D ~ ,~ ) w oE() ~+,P* (2,D ~ ) w oE ( ) = ~0, 7
&(Z,E)
+ Rho(x,E)= f(z)+ E ~ o ( z ) .
(5.91)
(5.93)
Let us construct higher order terms of the asymptotics. Define ul(z) (1 >_ 1) as an arbitrary solution t o the equation
P*(z,D,)w(z) = -Q(z, D z ) ~ - i ( x ) .
(5.94)
After finding ul(x),one finds w ~ ( z , Eas ) the unique solution to the problem
+
~Q(x,D,)wz(x,~) P * ( G D ~ ) W Z (=~0, ,E )
(5.95)
L HL(2,E) = C&lhl(x,E).
(5.98)
1=0
From Theorems 5.8 and 5.9 one derives
Theorem 5.10
The function H L ( x , E )solves the equation
E H L ( Z , E+) R H L ( x , E )= f(.)
+E~+'uL(z).
(5.99)
Theorem 5.11 If the junction f ( x ) i s sufficiently smooth in fi, then it i s possible to choose a solution U O ( X ) to (5.90) and a solution uz(x)to (5.94), so that the following inequality holds JJHL(~,E) - h(x,E)JJ-I C@l,
where C = const > 0 does not depend on E , but it depends on f (x).
Singular Perturbation Theory
105
Examples of asymptotical solutions: case n > 1
5.6
Example 5.3 Consider the equation (5.100) s1
where z = ( 2 1 , 2 2 ) , y = ( y l , y 2 ) , IyI =
d s i , s(ly1) is a known smooth
1 positive function, s((y() 2 C2 > 0 , G(z,y) = - Ko(a(z- yo, & ( T ) is 27r the MacDonalds function, (-Ax a 2 ) G ( z ,y) = S(z - y), S1 is a unit disk centered at the origin. In this example P(Y, D,)g(y) = s(lyl)g(y), P = 0, Q(z,Dz)= -A= +a2, q = 2. Let us construct the main term of the asymptotics of the solution to (5.100). By (5.90) one gets
+
S(~Z~)UO(Z =>
(-Az + a2)1 = a 2 .
Thus (5.101) Equation (5.91) yields:
€(-Ax + a2)wo(2,E ) + s(Izl)wo(z, E )
=0.
(5.102)
The main term woa(z, E ) of the asymptotics of wo(z, E ) solves the equation
In polar coordinates one gets
Random Fields Estimation Theory
106
The asymptotics of the solution to (5.105) we construct using the method of [Vishik and Lusternik (1962)l. Let T = 1 - e. Then
Put
e = t f i and keep the main terms, to get - d2woa(t)
dt2
+ s(l)wo,(t) = 0 ,
(5.106)
so
woa(t)= C e - m t Keeping exponentially decaying, as t
+D e m t . + +m,
solution one obtains:
woa(t>= Ce-JZ;iTit. Therefore W O a ( T ,E )
=C e - r n ( 1 - T )
.
(5.107)
To find the constant C in (5.107) we use condition (5.92). Since p = 0, one concludes M2 = 0, and (5.92) takes the form (5.108)
EM1WOa(Z,E ) = M1f(z) = M11. From (5.25) and (5.108) one gets:
=
/[
asl
-1
1- G(z, y) d l
aNY
dl,
,
where dl, is the element of the arclength of dS1. If one replaces wo(y, e) by woa(y, E ) in the above formula then one gets
Singular Perturbation Theory
107
The main term in (5.109) can be written as: (5.110)
By (5.107) for y E dS1 one gets (5.111) From (5.110) and (5.111) one obtains -
a
C
J G(x,y) dl, = J
dl, , vx E
sl.
(5.112)
asl
as1
For x = 0 and y E 85'1 one gets
U
U
= - KA(ar)L = -Kl(U). 2n =1 2n
These relations and (5.112) imply:
- ~ C K O ( U = -aKl(a). ) Therefore
aK1( a ) = -KO(U)
(5.113) '
From (5.101), (5.107) and (5.113) one finds the main term of the asymptotics of the solution to (5.100):
If ~(1x1)= 1, then (5.114) agrees with the earlier result, obtained in [Ramm and Shifrin (1995)l. Example 5.4
Consider the equation EMX,
€1 +
1
Bi
G(z, ?I)s(lyl)h(3G). du = 1
7
(5.115)
Random Fields Estimation Theory
108
where J: = ( q , 2 2 , ~
3 ) y, = (yl,y 2 , y?),
,
(-Az
s( IyI) is a smooth positive function,
-,
+
a2)G(x,Y) = d ( -~Y), SO &(x, D,) = -Az a2,q = 2, B1 is a unit ball centered at the origin. The main term of the asymptotics is constructed by the method of Section 5. By (5.90) one gets f
s(~z~)uo(z) = (-A,
+ a2)1= a'.
Thus
(5.116 ) By (5.91)
&(-A,
+ a2)wo(2, + s(~z~)wo(z, E ) = 0. E)
Keeping the main terms WO,(Z,E)of the asymptotics of gets -EA,Woa(Z,
E)
+ S(IZI)WOa(J:, E ) = 0
W O ( Z , E ) , one
'
In spherical coordinates this equation for the spherically symmetric solution becomes:
Let r = 1 - p. Then (5.117) can be written as:
Put
e = tJE
and keep the main terms in the above equation to get - d2Woa(t)
dt2
+ s(l)woa(t)= 0
The exponentially decaying, as t
.+ +oo, solution
(5.118)
to (6.19) is:
woa(t)= c e - m t . Therefore W~,(Z,E)
= Ce--(l-bl).
(5.119)
Singular Perturbation Theory
109
The constant C in (5.119) is determined from conditions (5.92), which in this example can be written as EM1WO(X, E )
= MIf(X) = M11.
(5.120)
Using formulas (5.25) and (5.120) one gets
Replacing wo(y, E ) by woa(y, E ) and keeping the main terms, one obtains
From (5.119) for y E dB1 one derives (5.122) From (5.122) and (5.121) it follows that -
m
C
J G ( x , Y ) ~ s , J dG(x, aNy Y) dSy . =
aBi
Put
2
(5.123)
aBi
= 0 in (5.123). Let us compute the corresponding integrals:
J
dS, =
aBi
s
4n e-a
aBi
dSy = e-a.
(5.124)
aBi
Note that: 1
aNy
= -- (ueWa
4~ dr
+ e-a) .
Thus
s
aBi
aG(o’y)dS - --1 e-a(a + 1) y47r
J
dSy = -e-a(u
+ 1).
(5.125)
aBi
From (5.123), (5.124) and (5.125)) one gets, setting z = 0, the relation = -eVa(a
+ 1).
Random Fields Estimation Theory
110
This yields
C=- a + l
m
(5.126)
From (5.116), (5.119) and (5.126) the main term of the asymptotics of the solution to equation (5.115) follows: (5.127) y If s(z)= 1, formula (5.127) yields a result obtained in [Ramm and Shifrin (1995)l. Let us summarize briefly our results. In this paper we constructed asymptotics of the solution to (5.1) as a + +O, and demonstrated how the L2 - solution to (5.1) tends to a distributional solution of the limiting equation Rh(z) = f(z).
Chapter 6
Estimation and Scattering Theory
In recent years a number of papers have appeared in which the threedimensional (30) inverse scattering problem is associated with the random fields estimation problem. In this Chapter we give a brief presentation of the direct and inverse scattering theory in the three-dimensional case and outline the connection between this theory and the estimation theory. This connection, however, is less natural and significant than in one-dimensional case, due to the lack of causality in the spacial variables. In Chapter 1 the direct scattering problem is studied, in Chapter 2 the inverse scattering problem is studied, in Chapter 3 the connection between the estimation theory and inverse scattering is discussed.
6.1 6.1.1
The direct scattering problem
The direct scattering problem
Consider the problem
t,u
- k 2 u := [-V2
+ q(z) - k2]u= 0
u = exp(ilce.z)+A(B',B,lc)r-lexp(ikr)+o(T-l),
in R3, k
T
=1 . 1
>0
4
00,
(6.1)
X
0' = -
(6.5 where 0,8' E S 2 , S2 is the unit sphere in R3, and o ( T - ' ) in (6.2) is uniform in 8,tY E S2. The function u is called the scattering solution, the function A(#, 8, k) is called the scattering amplitude, the function q(z) is the potential. 111
Random Fields Estimation Theory
112
Let us assume that qEQ:={q:q=ii,
lqllc(l+lxl)-a,
a>3}.
(6.3)
The bar in this chapter stands for complex conjugate (and not for mean value). By c we denote various positive constants. By Qm we denote the following class of q
Qm := { q
: q(j) E
Q , 0 5 Ijl 5 m } ,
(6.4)
so that QO= Q.
The scattering theory is developed for q which may have local singularities and are described by some integral norms, but this is not important for our presentation here. Our purpose is to give a brief outline of the theory for the problem (6.1)-(6.3) with minimum technicalities. The following questions are discussed: 1) selfadjointness of l,, 2) the nature of the spectrum of e, , 3) existence and uniqueness of the solution to (6.1)-(6.3), 4) eigenfunction expansion in scattering solutions, and 5) properties of the scattering amplitude.
The operator l , defined by the differential expression (6.1) on C r ( R 3 ) is symmetric and bounded from below. Let us denote by l , its closure in H =P(R3).
Lemma 6.1
Proof.
The operator l , is selfadjoint.
This lemma is a particular case of Lemma 8.5 in Section 8.2.4.0
Lemma 6.2 1) The negative spectrum of C , is discrete and finite. 2) The positive spectrum is absolutely continuous. 3) The point X = 0 belongs to the continuous spectrum but may not belong to the absolutely continuous spectrum.
Proof.
Let us recall Glazman’s lemma:
0
Lemma 6.3 Negative spectrum of a selfadjoint operator A is discrete and finite if and only if sup dim M
(6.5)
Estimation and Scattering Theory
113
where the supremum is taken over the set of subspaces M such that (Au,u)5 0 for u E M .
A proof is, e.g., in [Ramm (1986), p. 3301 or [Glazman (1965), 831. Therefore the first statement of Lemma 6.2 is proved if one proves that N- := s u p d i m M < 00,
M
:=
{u L3IVuI2dx < L3q-(x)1u12dx} , :
(6.6) where N- is the number of negative eigenvalues of C, counting their multiplicities and q- = max (0, -q(x)}. One has
Let us write
The ratio of the quadratic forms in (6.8) has a discrete spectrum and the corresponding eigenfunctions solve the problem q-u= -XV 2 u in
R3
which can be written as
Let q1/2 := p . Then (6.9) can be written as
A4
:= pgop4 = X4,
4 := pu.
(6.10)
The operator A , defined in (6.10), is compact, selfadjoint, and nonnegativedefinite in L 2 ( R 3 )if q E &. Therefore the number of the eigenvalues A, of the problem (6.10) which satisfy the inequality A, > 1, is finite. This number is the dimension of M defined in (6.8). Thus N- < 00, where Nis defined in (6.6). Statement 1) of Lemma 6.2 is proved. Only a little extra work is needed to give an estimate of N- from above. Namely, W
A24j = Xj”+j,
TrA2 =
EX; j=1
L
E Xj” 2 Xj>l
1 = N-. X,>1
(6.11)
Random Faelds Estimation Theory
114
Thus
(6.12) The right-hand side of (6.12) is finite if q- E Q. Note that if q E Q, then q- E Q and q+ E Q. 6.1.2
Properties of the scattering solution
Lemma 6.4 Proof.
The scattering solution exists and is unique.
The scattering solution solves the integral equation
where uo := exp(ikf3. x).
(6.14)
Conversely, the solution to (6.13) is the scattering solution with 47T
exp(-ik8’ . y)g(y)u(y,6, k)dy.
It is not difficult to check that if q
(6.15)
i Q then the operator
T(lc)u:= gqu
(6.16)
is compact in C ( R 3 ) . Therefore, the existence of the solution to (6.13) follows from the uniqueness of the solution to the homogeneous equation
u = -Tu,
u E C(R3)
(6.17)
by F’redholm’s alternative. If u solves (6.17) then u solves equation (6.1) and satisfies the radiation condition
(6.18) Since q
= ?j, the
function
solves equation (6.1) and Green’s formula yields (6.19)
115
Estimation and Scattering Theory
From (6.18) and (6.19) it follows that (6.20) Any solution to (6.1) which satisfies condition (6.20) has to vanish identically according to a theorem of Kato [Kato (1959)]. Thus u = 0 and Lemma 6.4 is proved. 0
Let f E L 2 ( R 3 )be arbitrary. Define
Lemma 6.5
1
f(<):= ( 2 ~ ) - ~ /f ~( z ) u ( z , J ) d z , f j := ( f , u j ) ,
1 5 j 5 N-,
(6.21)
where uj are the orthonormalized eigenfunctions corresponding to the discrete spectrum of e,: 1< j
equj = Xjuj,
< N-,
X j < 0,
(6.22) (6.23)
< E R3, Then
f
= (27d-3/2
= k,
s
< =kb',
b' E S2.
(6.24)
c
(6.25)
N-
m u ( . >
fjUj(4.
j=1
Formulas (6.21), (6.25) are analogous to the usual Fourier inversion formulas. They reduce to the latter if q(z) = 0. The proof of Lemma 6.5 requires some preparations. We follow the scheme used in [Ramm (1963); Ramm (1963b); Ramm (1965); Ramm (1968b); Ramm (1969d); Ramm (1970); Ramm (1971b); Ramm (1987); Ramm (1988b)l and in [Ramm (1986), p. 471. Let G(z, y, k ) be the resolvent kernel of e,:
(a, - k2)G(z,y, k ) = 6(z - y)
in R3.
(6.26)
This kernel solves the equation G ( z ,Y, k ) = dz,Y, k ) -
s
9(z,z , k)q(z)G(z, Y,k ) d z .
(6.27)
Random Fields Estimation Theory
116
This equation is similar to (6.13) and can be written as
(I
+T)G = g,
(6.28)
where T is defined in (6.16). Therefore, as in the proof of Lemma 6.4, the solution to the equation (6.27) exists and is unique in the space C,(R3) of functions of the form G = C ~ X- y1-l V ( X , y) where V ( X , y) is continuous and c = const, 11 G II:= IcI + m a x Z E ~lv(z,y)I. 3 The operator T is compact in C,(R3) if q E Q. The homogeneous equation (6.28) has only the trivial solution. The operator T = T ( k ) depends continuously on k E C+ := {k : Imk 2 0). Therefore [ I + T ( k ) ] - l is a continuous function of k in the region C + n A(k), where A(k) is a neighborhood of a point ko, A(k) := {k : Ik - kol < 6) 6 > 0, and the operator I T(ko)is invertible. Since for any k > 0 the operator I T ( k ) is invertible, it follows that G(z, y, k) is continuous in k in the region C + n A(0, m), that is in a neighborhood of the positive semiaxis in C + . The continuity holds for any X,y fixed, x # y, and also in the norm of C,. This implies that the continuous spectrum of C, in the interval (0, m) is absolutely continuous. From the equation (6.27) it follows that
+
+
+
where g ( r ) := (4xr)-' exp(ikr) and u ( y , -8, k) is the scattering solution. In fact, o(1) = 0 uniformly in y E D,where D E R3 is an arbitrary fixed bounded domain. Indeed, it follows from (6.27) that
(A)
The function (6.30) solves equation (6.1):
and satisfies the condition (6.2) since the integral term in (6.30) satisfies the radiation condition. Therefore, the scattering solution can be defined by formula (6.29). This definition was introduced and used systematically in [Ramm (1987)l.
Estimation and Scattering Theory
117
The starting formula in the proof of the eigenfunction expansion theorem is the Cauchy formula
(6.31) where
Rx := ( A - AI)-‘,
A = A* = I 4’
(6.32)
CN is a contour which consists of the circle YN := {A : 1x1 = NI}, of a finite number N- of circles rj := {A : IX + X i ( = 6 ) where A j < 0, 1 _< j 5 N - , are negative eigenvalues of I,, 6 > 0 is a small number such that rj does not intersec with ”ym for j # m, and of a loop CN which joins points N - i0 and N ZO and goes from N - i0 to 0 and from 0 to N i0. The circles rj, 1 5 j 5 N- are run clockwise and YN is run counterclockwise. The integral
+
+
(6.33) where Pj is the orthoprojection in H = L2(R3)onto the eigenspace of A corresponding to the eigenvalue A j . Note that there is no minus sign in front of the integral in (6.33) because rj is run clockwise and not counterclockwise. One has:
(6.34) where we have used the relation Rx-i0f
= Rx+iof.
(6.35)
Formula (6.35) follows from the selfadjointness of A:
and from the symmetry of the kernel of the operator Rx(x,y). Finally, for any selfadjoint A one has
(6.37)
Random Fields Estimation Theory
118
Indeed, if A is selfadjoint then 00
Rx = [,(t
- X)-ldEx,
(6.38)
where Ex is the resolution of the identity for A. Substitute (6.38) into (6.37) to get
(6.39) Here we have used the formula 1, - N < t < N , 0, t > N or t
1
< -N.
(6.40)
Using (6.31), (6.33), (6.34) and (6.37) one obtains (6.41) where
and the sum in (6.41) is the term
CPjf.
(6.43)
Let X = Ic2 in (6.41). Then
We wish to show that the term (6.44) is equal to the integral in (6.25). This can be done by expressing ImG(z,y , Ic) via the scattering solutions. Green's formula yields:
Estimation and Scattering Theory
119
Take r + 00 and use (6.29) to get
Thus
Substitute (6.47) into (6.44) to get
Here 5 = k8, d< = k2dkd8, f(E) is given by (6.21). Fkom (6.41), (6.44), and (6.48) formula (6.25) follows. Lemma 6.5 is proved.
Remark 6.1 Let us give a discussion of the passage from (6.44) and (6.47) to (6.48). First note that our argument yields Parseval's equality:
(f
7
h) =
(mw + cfJG 7
(6.49)
j
and the formula for the kernel of the operator understood in the weak sense
3, where the derivative is
To check (6.49) one writes
where we have used the orthogonality of the spectral family: E ( A ) E ( A ' )=
E ( A n A'). Furthermore, using (6.50) one obtains
Random Fields Estimation Theory
120
The last two formulas yield (6.49). The passage from (6.44) and (6.47) to (6.48) is clear if f E L2(R3)n L1(R3): in this case the integral $(() := 1f ( y ) u ( y , f ) d y converges absolutely. If f E L2(R3)then one can establish formula (6.25) by a limiting argument. Namely, let 3’be the operator of the Fourier transform, Ff = [f(f),{fj}] := [F,f,Fdf] where the brackets indicate that the Fourier transform is a set of the coeficients f j , corresponding to the discrete spectrum of C, and the function f(f) corresponding to the continuous spectrum of C., The operator 3 is isometric by (6.49), and i f it is defined originally o n a dense in L2(R3)set L2(R3) nL1(R3)it can be uniquely extended by continuity on all of L2(R3). Formula (6.21) is therefore well defined for f E L2(R3).I f formula (6.25) is proved for f E L2(R3)n L1(R3),it remains valid f o r any f E L2(R3) because the inverse of 3 is also an isometry from RanF onto L2(R3).Let us note finally that R a G c = L2(R3),where 3cf := $(<), and 3c is an isometry from L2(R3)onto L2(R3),F&Fcf = f -Eo f , FcF$f = $. Here Eo f is the projection o f f onto the linear span of the eigenfunctions of e,, Eof = C jPif . This follows from the formula N (F6) = {O}, the proof of which is the same as in [Ramm (1986), p. 511. 6.1.3
Properties of the scattering amplitude
Let us now formulate some properties of the scattering amplitude A(8’,8, k).
Lemma 6.6
If q E Q then the scattering amplitude has the properties
A(8’,8 , - k ) = A(&,8, k ) , k > 0 A(8’,8 , k) = A(-8, -8’, I c ) ,
A(#, 0, k) - A(8,8’, k) 2i
k
=
(6.51)
(reality)
(6.52)
(reciprocity)
G L *A(8‘,a, k)A(8,a, k ) d a
(unitarity). (6.53)
I n particular, if 8’ = 8 in (6.53) then one obtains the identity
k
I m A ( 8 , 8 , k) = 4r
L
IA(8,a, k)I2 d a
(optical theorem).
(6.54)
Proof. 1) Equation (6.51) follows from the real-valuedness of q(z). Indeed, u ( x ,8, - k ) and u ( x ,0, k ) , k > 0, solve the same integral equation
Estimation and Scattering Theory
121
(6.13). Since this integral equation has at most one solution, it follows that
+, e, k) = +, e, +,
Ic > 0.
(6.55)
Equation (6.51) follows from (55) immediately. 2) The proof of (6.52)-(6.54)is somewhat longer and since it can be 0 found in [Ramm (1975), pp. 54-56] we refer the reader to this book. Let us define the S-matrix
k S=I--A 27ri
(6.56)
where S : L 2 ( S 2 )-+ L 2 ( S 2 )is considered to be an operator on L 2 ( S 2 )with the kernel k (6.57) s(e/,e, k ) = q e - e l ) - -A(e', e, k ) . 27ra The unitarity of S
s*s= I
(6.58)
A-A* k = -A*A
(6.59)
implies 2i
47r
which is (6.53) in the operator notation.
6.1.4 Analyticity in k of the scattering solution Define
6 := exp(-ikB . z)u(z,8, k). Then
(6.60)
4 solves the equation
where
The operator Te(k) : C(R3)--f C(R3)is compact and continuous (in the norm of operators) in the parameter k E := { k : Imk 2 0). If q E Q, the operator Te(k) is analytic in k in C + since I z- yI - 0 - (z - y) 2 0. The operator I + Te(lC)is invertible for some k E C + , for example, for k E C +
Random Fields Estimation Theory
122
sufficiently close to the positive real semiaxis, or for k = a + ib, where a and b are real numbers and b > 0 is sufficiently large. Indeed under the last assumption the norm of the operator Te(k)is less than one. Therefore by the well-known result, the analytic Fredholm's theorem, one concludes that [I+T0(k)]-' is a meromorphic in C+ operator function on C ( R 3 )(see [Ramm (1975), p. 571). The poles of this function occur at the values kj at which the operator I To(lCj)is not invertible. These values are
+
kj
(6.63)
= i f i
where X j are the eigenvalues of the operator l,, and, possibly, the value k = 0. Indeed, if
[I + T&)]
2,
= 0,
2,
E C(R3), kj E
c+
(6.64)
then the function w := exp(ik0. x ) ~ , 2, E C (R 3 )
(6.65)
solves the equation (6.66)
w = -Tw
where T is defined in (6.16). It follows from (6.65) and (6.66) that w = O(lxl-'). This and equation (6.66) imply that ( l , - k2)w = 0
(6.67)
w E P(R3).
(6.68)
and
Equation (6.67) follows from (6.66) immediately. Equation (6.68) can be easily checked if k E C+,that is, if k = a ib, b > 0, a is real. Indeed, use (6.66) , the assumption q E Q, which implies that q E L2(R3)n L1(R3), and boundedness of w to get:
+
Since the operator t, = - A + q ( x ) is selfadjoint equations (6.67) and (6.68) imply w = 0 provided that k 2 is not real. Since k E C + , the number k2 is real if and only if k = k2 = -[XI < 0. Equations (6.67) and
im,
Estimation and Scattering Theory
123
(6.68) with k 2 = -[XI imply that X = X j is an eigenvalue of C,. Therefore, the only points at which the operator [I TO(^)]-^ has poles in C+ are the points (6.63) and, possibly, the point k = 0. One can prove [Ramm (1987)], [Rariim (1988b)l that if q E Q and C, 2 0 the number X = 0 is not an eigenvalue of C,. However, even if 2 0 the point X = 0 may be a resonance (half-bound state) for C,. This means that the equation
+
!,
Au = qu, u E C(R3) and u @ L 2 ( R 3 )
(6.70)
may have a nontrivial solution which is not in L2(R3). In this case the operator I+To(O) is not invertible. Even if q(z) E CF the operator 1, may have a resonance at X = 0. Even if C, 2 0 and q is compactly supported and locally integrable the operator t, may have a resonance at X = 0.
Example 6.1 Let B = {x : 12) 5 l , z E R 3 } . Let u = lxl-l for 1x1 2 1. Extend u inside B as a C" real-valued function such that u(x) >_ b > 0 in B. This is possible since u = 1 on dB. Define Au
(6.71)
q(z) := -,
U
Then q E CF, q = 0 for 1x1 2 1, q is real-valued, u @ L2(R3),and the desired example is constructed. This argument does not necessarily lead to a nonnegative I,. In order to get C, 2 0 one needs an extra argument given in [Ramm (1987)l. Let us give a variant of this argument. The inequality 2 0 holds if and only if (*) J q(x)l+lz] dx 2 0 for all 4 E C7(R3). It is known that JJVq5J2dz2 J(4r2)-1)+12dz for all 4 E CF(R3),r := 1x1. Therefore (*) holds if (**) (4r2)-l q >_ 0. Choose u = ry-1 (1 y - y r ) , where y > 0 is a sufficiently small number. Then q, defined by (71), satisfies (**) as one can easily check. This q is integrable and C, 2 0. The function u = r-l for r 2 1 and u = rY-I(l+ y - y r ) solves the equation C,u = 0 in R3, u L2(R3)), 2 0, q = 0 for r 2 1, and q is locally integrable.
!,
+
+
+
!,
Exercise. Prove that the numbers (6.63) are simple poles of [I+To(k)]-'. 6.1.5
High-frequency behavior of the scattering solutions
Assume now that q E written as
Q1.
Then the function
4 defined in
(6.60) can be
(6.72)
Random Fields Estimation Theory
124
If q E Qm, m > 1, more terms in the asymptotic expansion of q5 as k + 00 can be written (Skriganov (1978)l. Formula (6.72) is well known and can be derived as follows. Proof of Formula (6.72) Step 1: Note that
~sII TZ(k)II+
as k
0
(6.73)
+oo.
--+
This can be proved, as in [Ramm (1986), p. 3901, by writing the kernel Be(z,Y,k) of T i :
Introduce the coordinates s, t , $ defined by the formulas 21 = est
z1 +Y1, +2 23
z2 = eJ(s2 - 1)(1- t2)COS$
= ~ J ( s 2 - 1)(1- t 2 ) sin$
+ Y2 +2 22
+ 2 3 +2 Y3 '
(6.75)
= 2et,
3 = t3(s2-t2),
~
where
e = 1z-y1/2,
Iz-zI+Iz-yI
= 2es,
1z-yl-lz-yl
(6.76) and 3 is the Jacobian of the transformation ( ~ 1 , ~ 2 ,--+ ~ 3( S), t , $ ) ,
1 I s < 00,
-1
I t I 1,
0I$<2 ~ .
In the new coordinates one obtains
where
and q1(s, t,$) is q ( z ) in the new coordinates given by formula (6.75). One can choose a sufficiently large number N > 0 such tht
1x1 > N
or
IyI > N
implies
sup OES2,k>0
IBe(z,y, k)l < E ( N ) , (6.78)
Estimation and Scattering Theory
125
where c ( N ) + 0 as N 00. If N is fixed, then for 1x1 5 N and IyI 5 N it follows from (77)that -+
lBel + O
as k - + + m
(6.79)
since p ( s ) E L1 (I,00). This proves (6.73). Note that in this argument it is sufficient to assume E
Q. Step 2: If (6.73)holds, one can write 00
4 = 1 + x(-l)jTj(k)l,
(6.80)
j=1
where the series in (6.80)converges in the norm of C ( R 3 )if k is sufficiently large so that 11 T i I[< 1. Note that if 11 T I[< 1 then
(6.81) j=O
and the series converges in the norm of operators. If (6.81)remains valid. Indeed
11 T 2 I/<
1, formula
= (1- T2)-l - T ( I - T2)-l =
(I- T ) ( I- T
+
y = (I T)-!
(6.82)
In fact it is known that the series (6.81)converges and formula (6.81)holds if 11 T m II< 1 for some integer m 2 1. As k -+ 00, each term in (6.80)has a higher order of smallness than the previous one. Therefore it is sufficient to consider the first term in the sum (6.80)and to check that
-Te(k)l=
2ik
O0
q(z - re)&
+o
(k) ,
k + +00
(6.83)
Random Fields Estimation Theory
126
in order to prove (6.72).
Step 3: Let us check (6.83). One has
O0
where we set y = z 55:
drr2 exp(ikr)
exp(ikr8. a)q(z
+ ra)da,
(6.84)
+ z , z = ra, a E S 2 . Use formula [Ramm (1986)],p. exp(ikr8. a)f(a)da
Js2
= 2Ti
[
e x p (kr -ikr)
f(-q
- eTx fp ((i sk r)) ]
+o(;),
as k
(6.85)
which holds if f E C1(S 2 ) . From (6.84) and (6.85) one obtains
which is equivalent to (6.83). Formula (6.72) is proved. It follows from (6.72) that
provided that q E Q1. Another formula, which follows from (6.72), is
or
-
q(z) = e V,
lim ( 2 i k [$~(z, 8, k) - 11).
(6.88)
k++w
Note that the left side does not depend on 8, so that (6.88) is a compatibility condition on the function Cp(z, 8, k).
127
Estimation and Scnttering Theory
From (6.87) and (6.15)it follows that exp [ik(8- 8‘)
47T
. X]q(z)dz + 0 (6.89)
provided that q E &I. In particular, (6.90)
6.1.6 Fundamental relation between u+ and uIf one defines
u+ := u(x,8,k ) ,
-
u := u ( z ,-8, - k )
(6.91)
then one can prove that (6.92) Let us derive (6.92). We start with the equations U+
= u0 - G+quO, u0 := exp(ik8
- x),
-
u = uo - G-quo, where G+
= G,
(6.93) (6.94)
where G is defined by the equation (6.27), and
G- := E .
(6.95)
Equations (6.93) and (6.94) one can easily check by applying the operator lq - k2 to these equations. Subtract (6.94) from (6.93) and use (6.95) to get
u+ - u- = -2iImG+quo
(6.96) The last equality in (6.96) follows from the definition (6.91), properties (6.51) and (6.52) of the scattering amplitude and the formula (6.97)
Random Fields Estimation Theory
128
which is similar to (6.46) and which follows from (6.46) and (6.91). Let us derive (6.97). Note that, by formulas (6.46) and (6.91), one has ImG+(z,y,k) = ImG+(z,y, k), u*(z,8, -k)
= u*(z,O, k),
(6.98)
(6.99)
Here we used (6.98). Thus, formula (6.97) is Note that
and
From formula (6.52) it follows that the right sides of (6.100) and (6.101) are equal. This explains the last equation (6.96).
6.1.7 Formula for det S ( k ) and state the Levinson Theorem
+
If q(z)= U(z) is decaying sufficiently fast, (for example, if (1 1zI)q E Q, z E R3) then the operator A : L2(S2) + L2(S2) with kernel A(@‘,8, k), k > 0, is in the trace class and det S ( k ) = det
(I+ $A)
= exp
[--$/p(z)dx]
g,
k >0 (6.102)
where
d(k) := d;t ( I + T(lc)).
(6.103)
Estimation and Scattering Theory
129
The operator T ( k )in (6.103) is defined in (6.16) and the symbol detz(I+T) is defined in Definition 8.7 on p. 300. If k > 0 and q = g, then d ( - k ) = d ( k ) , where the bar stands for complex conjugate. Therefore (6.104)
det S ( k ) = exp [2iS(k)], where
“J
S(k) = --
47r
q(2)dz - P ( k ) ,
P ( k ) := a r g d ( k ) .
(6.105)
The Levinson Theorem says that ~ ( 0= ) 7r ( m
+f),
(6.106)
where m is the number of the bound states counting with their multiplicities, in other words m is the dimension of the subspace spanned by the eigenfunctions of C, corresponding to all of its negative eigenvalues, and v = 1 if k = 0 is a resonance and v = 0 otherwise, that is, if I T(0)is invertible. It is assumed that S(k) is normalized in such a way that
+
lim 6 ( k ) + k-m
[
“J
I
(6.107)
q(z)da: = O
or, according to (6.105), that
lim P ( k ) = 0.
(6.108)
k-+m
Formula (6.106) follows from (6.105) and the argument principle applied to d ( k ) . Formula (6.102) can be derived as follows:
d ( - k ) := det ( I + T ( - k ) ) 2
= det 2
= d;t
[I+ (I+ T(k))-’ ( T ( - k ) - T ( l c ) ) ] } [I + T ( k ) ]det [I+ ( I + T(k))-’ ( T ( - k ) T ( k ) ) ]x +
{[I T ( k ) ]
exp {-Tr [ T ( - k ) - T ( k ) ] }.
-
(6.109)
Here we have used item 12) which precedes Definition 8.8 on p. 300. The k z-yl) operator T ( - k ) - T ( k ) has the kernel - 2isin q ( y ) , so that its trace 4,&!y, is
Random Fields Estimation Theory
130
Therefore formula (6.109) can be written as
d ( - k ) exp d(k)
{ --; /q(~)dx} det =
+ ( I +T(k)>-I (T(-k) - T(k))] (6.111)
Finally one proves that det [I
+ ( I + T(k))-l (T(-k) - T ( k ) ) ]= det (I+
Formulas (6.111) and (6.112) imply (6.102). Let us prove (6.112). Let ( I T ( k ) ) - l := B. Then
+
T
:= ( I
+ T(k))-' (T(-k) - T ( k ) )= B [ g ( - k ) - g(k)]Q.
(6.113)
One has 2ik sin(k1z - yI) - --2' g ( - k ) - g(k) = -41r k l z - y l - 27r 47r
l.
exp {ikO (Z- y)} do. a
(6.114)
Furthermore, if uo := exp(ik6 . x) then
B~~ = U(Z,o, q,
(6.115)
where U(Z,8 , k) is the scattering solution (6.1)-(6.2). Therefore the righthand side of (6.113) is the operator in L2(R3)with the kernel
Note that
Trr =
s
T ( Z , Z)~Z =
A(B,o,k)do = Tr ("A) 2T
.
(6.117)
F'rom item (8) of Section 8.3.3, p. 299, it follows that (6.112) is valid provided that (6.1 18) One can check (6.118) as we checked (6.117). Thus, formula (6.102) is derived.
Estimation and Scattering Theory
6.1.8
131
Completeness properties of the scattering solutions
Theorem 6.1
Let h(8) E L2((s2)and assume that
h(e)+, 8, q d e = o v x E 1
nR:= .{
:
1 ~ >1 R ) ,
(6.119)
2
where k > 0 is fixed, x E R3. It is assumed that q E Q. Then h(8)= 0 . The same conclusion holds if one replaces u(x,0, k) by u(x,-8, -k) in (119) and if x E R', r 2 2.
Proof. The proof consists of two steps. Step 1. The conclusion of Theorem 6.1 holds if u ( x ,8, k) is replaced by U O ( X ,8, k) := exp(ik8. x) in (6.119). Indeed, if
L
h(8)exp(ik8. x)d8 = 0 V X E O R
(6.120)
and a fixed k > 0, then the Fourier transform of the distribution h(8)bs2 vanishes for all sufficiently large x. The distribution h(8)Ssz is defined by the formula (6.121) Since h(8)bsz has compact support, its Fourier transform is an entire function of x. If this entire function vanishes for all sufficiently large x E 3 , it vanishes identically. Therefore h(8) = 0. Step 2. If (6.119) holds then (6.120) holds. Therefore, by Step 1, h(8)= 0. In order t o prove that (6.119) implies (6.120) let us note that
+
uo(z,8, k) = (1 T ( k ) )u,
(6.122)
+
where T ( k ) is defined by (6.16). For every k > 0 the operator I T ( k ) is an isomorphism of C ( R 3 )onto C ( R 3 ) .Applying the operator I + T ( k ) to (6.119) and using (6.122) one obtains (6.120). Note that the operator I T ( k ) acts on u(x,8, k) which is considered as a function of x while 8 and k are parameters. Theorem 6.1 is proved. 0
+
Theorem 6.1 is used in [Rarnrn (1978e)l for a characterization of the scattering data which we give in Section 6.2.5. Another completeness property of the scattering solution can be formulated. Let ND(e,)
:= {w : w E
H ~ ( D ) , ~ ,=wo in D} ,
(6.123)
132
Random Fields Estimation Theory
where D c R3 is a bounded domain with a sufficiently smooth boundary r, for example r E C 1 , a ct, > 0 , suffices.
Theorem 6.2 Let q E Q , where Q is defined in (6.3). The closure in L 2 ( D ) (and in H 1 ( D ) ) of the linear span of the scattering solutions {u(x,8, k ) } V8 E 5’ and any fixed k > 0 contains N D ( ~-, k’). Proof. We first prove the statement concerning the L’(D) closure. Let E ND(C, - k’) and assume that
f
f u ( x , 8 ,k ) d z = 0 V8 E 5’’.
(6.124)
Define
where G is uniquely defined by equation (6.27). Use (6.29) and (6.123) to conclude that v(x) = 0 (1x1-2)
1.1
as
4
00.
(6.126)
Since (l,- k’)v = 0 in
R
:= R3 \
D
(6.127)
and (6.126) holds, one concludes applying Kato’s theorem (see [Kato (1959)l) that v = 0 in Q (see the end of the proof of Lemma 6.4 in Section 6.1.2). In particular,
V=VN=O
where V N is the normal derivative of v on
(l, - k’)w = -f
I?,
on
(6.128)
r. It follows from (6.124) that in D.
(6.129)
Since
(l, - k 2 ) f = 0 in
D
(6.130)
Estimation and Scattering Theory
133
by the assumption, one can multiply (6.128) by 7, integrate over D and get
(6.131)
Here we have used (6.128) and the real-valuedness of the potential. It follows from (6.131) that f = 0. The first statement of Theorem 6.2 is proved. In order to prove the second statement which deals with completeness in H1(D), one assumes (6.132)
and some f E ND(!, - k 2 ) . Integrate (6.131) by parts to get
k(-A f i f ) u d x i fivuds = 0 , Ve E 5’.
(6.133)
Define
Argue as above to conclude that v = 0 in R and v=v;=O
on l?,
(6.135)
where v; is the limit value of V N on r from !2. By the jump formula for the normal derivative of the single-layer potential (see, e.g., [Ramm (1986), p. 141) one has v$-Vi
(6.136)
=fN.
Since v; = 0 it follows that v$ = f N on I?. Thus V=O,
v$ = f N
on
r,
(6.137)
and
(C- k 2 ) v = Af - f
in
D.
(6.138)
134
Random Fields Estimation Theory
Multiply (6.138) by 7,integrate over D and then by parts to get
From (6.130), (6.137) and (6.139) it follows that (6.140) Thus, f = 0. Theorem 6.2 is proved.
6.2 6.2.1
Inverse scattering problems Inverse scattering problems
The inverse scattering problem consists of finding q(x) given A(6’,8, k). One should specify for which values of 8’,6 and k the scattering amplitude is given. Problem 1 A(O’,8,k) is given for all 6’, 6 E S2 and all k > 0. Find q(z). Problem 2 A(#, 6 , k ) is given for all 6’, 8 E S2 and a fixed k
> 0.
Problem 3 A(O’,6, k ) is given for a fixed 8 E S2 and all 6’ E S2 and all k > 0. Problem 1 has been studied much. We will mention some of the results relevant to estimation theory. Problem 2 has been solved recently [Ftamm (2005)] but we do not describe the results since they are not connected with the estimation theory Problem 3 is open, but a partial result is given in [Ramm (1989)l. 6.2.2
Uniqueness theorem f o r the inverse scattering problem
The uniqueness of the solution to Problem 1 follows immediately from formula (1.89). Indeed, if A ( @ ,6, k) is known for all 6 € S2 and all k > 0, then take an arbitrary E R3, an arbitrary sequence k , + +CQ, and find
<
Estimation and Scattering Theory
135
a sequence On, 8; E S2 such that
lim (8, - 8;)kn =
n+oo
c,
k,
This is clearly possible. Pass to the limit k, - 4 ~lim kn(8, k,-m
- 8;)
= gA(8;, On,
k,)
=
+ +co.
4
J
00
(6.141)
in (6.89) to get
exp(it. z)q(z)dz. (6.142)
Therefore the Fourier transform of q is uniquely determined. Thus q is uniquely determined. We have proved
Lemma 6.7 If q E &1 then the knowledge of A(#, 8, k ) on S2 x S2 x R+, R+ := (0, co), determines q(z) uniquely. In fact, our proof shows that it suffices to have the knowledge of A for an arbitrary sequence kn -+ 00 and for some 8‘ and 8 such that for any E E R3 one can choose 8; and 8, such that (6.1) holds. The reconstruction of q(z) from the scattering data via formula (6.142) requires the knowledge of the high frequency data. These data are not easy to collect in the quantum mechanics problems, and for very high energies the Schrodinger equation is no longer a good model for the physical processes. Therefore much effort was spent in order to find a solution to Problem 1 which uses all of the scattering data; to find necessary and sufficient conditions for a function A(O’,O,k) to be the scattering amplitude for a potential q from a certain class, e.g. for q E Qm,this is called a characterization problem; and to give a stable reconstruction of q given noisy data.
6.2.3
Necessary conditions for a function to be a scatterng amplitude
A number of necessary conditions for A(8’,8, k) to be the scattering amplitude corresponding to q E Q1 follow from the results of Section 6.1 of this scatteringtheory. Let us list some of these necessary conditions: 1) reality, reciprocity and unitarity: that is, formulas (6.51)-(6.54) 2) high-frequency behavior: formulas (6.89), (6.90), (6.142).
Other necessary conditions will be mentioned later (see formulas (6.158) and (6.159) below). Some necessary and sufficient conditions for A(8’,8, k) to be the scattering amplitude for a q E &I are given first in [Ramm
Random Fields Estimation Theory
136
(1987e)], see also [Ramm (2005)j. These conditions can not be checked algorithmically: they are formulated in terms of the properties of the solutions to certain integral equations whose kernel is the given function A(B’,0, k).
A Marchenko equation (M equation)
6.2.4
Define [+(z, 8, k) - 11exp(-ikcu)dk,
where 4 is defined by (6.60) and has property (6.72) as k that q E &I.
(6.143) ---t
For simplicity we assume that lq has no bound states.
+
m, provided
(6.144)
c+
Under this assumption is analytic in C+ and continuous in \ 0. Let us assume that k = 0 is not an exceptional point, that is, is continuous in E+. Start with equation (6.92) which we rewrite as
s,,
+
A(O’,8, k) exp [ik(e’ - e) .z] +(x, 4,- k ) d e (6.145)
or
+(x,e, k)
- 1 = +(z, -8,
+
g S,.
-k)
-1
A(e’,e, k) exp [ilc(e’- e) - 4[+(z, -e’, -IC) - 11do’
2L2
A(e’,0, k) exp [ik(e‘ - 0 ) . z] de’.
+
Take the Fourier transform of (6.146) and use (6.143) to get
Here
La 00
v0 :=
exp(ika)
{
l2
A(8‘,8, k) exp [ik(e’ - 8) . z] d8’
(6.146)
137
Estimation and Scattering Theory
and the integral term in (6.147) is 227r J-”, dkexp(--zka)$
Js2
[q5(x,
A ( @ ,8, k) exp [ik(8’ - 0) x] x -k) - 11 do’
where q a ,81, 8, x) :=
L. 2T
ik
00
-00
dk exp(-zka)-A(B‘, 2n
.
8, IC)exp [iqe’ - e)
(6.150) The Fourier transform in (6.150) is understood in the sense of distributions. Under the assumption (4), the analyticity of +(x,8, k) in k in the region C + and the decay of q5 as Ikl -+ 00, k E C + , which follows from (6.72) imply that
7 ( ~ , 8 , a=)0 for a < 0.
(6.151)
Therefore the right hand side of (6.149) can be written as
IrndPJ,2
dO’B(a
+ P,
8, x)r](o,-8‘,P) :=
IW
B(a
+ P)r](P)dp. (6.152)
Equation (6.147) now takes the form of the Marchenko equation (6.153) where we took into account that ~ ( 2-8, ,
-a) = 0 for a > 0
(6.154)
according to (6.151). The function r]o in (6.153) is defined in (6.148), and the integral operator in (6.153) is defined in (6.152). The kernel of the operator in (6.153) is defined by (6.150) and is known if the scattering amplitude is known. If A(#, 8, k) is the scattering amplitude corresponding to a q E Qp, where QY is the subset of Q1 which consists of the potentials with no bound states, then equation (6.153) has a solution r] with the following properties: if one defines r] for a < 0 by formula (6.151) then the function daexp(ika)r](z,8, a )
(6.155)
Random Fields Estimation Theory
138
solves the equation
vZ4 + 2ike. vz4- q(z)+= o
(6.156)
and the function u := exp(ik0 z)q5 solves the Schrodinger equation +
(6.157)
e,u = 0.
In particular, the function (02
+ k 2 ) U := q(x)
(6.158)
U
does not depend on 8 (this is a compatibility condition). Another compatibility condition gives formula (6.88). This formula can be written as =
-28. v,q(z,
e, +o).
(6.159)
Indeed, it follows from (6.155) that lim {2ik(+ - 1))
k+w
= -2q(z,
8, +O).
(6.160)
Formula (6.159) follows from (6.88) and (6.160). The compatibility condition (6.159) and the Marchenko equation (6.153) appeared in [Newton (1982)] where condition (6.159) was called the “miracle” condition since the left side of (6.159) does not depend on e). The above derivation is from [Ramm (1992)l. 6.2.5
Characterization of the scattering data i n the 3 0 inverse scattering problem
Let us write A E do if A := A(O’,e,k) is the scattering amplitude corresponding t o a potential q E Q. Assuming that q E Q, we have proved in Section 6.1 that equation (6.92), which we rewrite as w(2,8, IC) = w(z,-8,
-IC)
+ii S,.
e, ~c)w(z,
-el,
-Ic)de’ (6.161)
has a solution w for all z E R3 and all k
w
> 0, where
:= U(Z,8,k) - exp(ik8. z)
.-
21
- uo.
(6.162)
Estimation and Scattering Theory
139
This v has the following properties:
which is equation (6.2), and (6.164) which is equation (6.158). These properties are necessary for A E .AQ. It turns out that they are also sufficient for A E dQ.Let us formulate the basic result (see [Ramm (1992)l).
Theorem 6.3 For A E d g it is necessary and suficient that equation (6.161) has a solution v such that (6.164) holds and
v = A,@’, 8, q g ( r ) + +-I),
=
+ CO,
= 8’.
(6.165)
The function A, defined by (6.165) is equal to the function A(8‘,8, k) which is the given function, the kernel of equation (6.161), and it is equal to the scattering amplitude corresponding to the function q ( x ) defined by (6.164). There is at most one solution to equation (6.161) with properties (6.163), (6.164) and (6.165).
Proof. We have already proved the necessity part. Let us prove the sufficiency part. Let A(8’8,Ic) be a given function such that equation (6.161) has a solution with properties (6.164) and (6.165). First, it follows that u defined by the formula u := exp(ik8. z)
+v
(6.166)
is the scattering solution for the potential q ( 2 ) defined by formula (6.164).
Since the scattering solution is uniquely determined (see Lemma 6.4 in Section 6.1.2) one concludes that the function A,(#, 8, Ic) defined by formula (6.165) is the scattering amplitude corresponding to the potential q(z) defined by formula (6.164). Secondly, let us prove that
A,(8’, 8, Ic) = A(8’,8, Ic)
(6.167)
where A(#, 8, k ) is the given function, the kernel of equation (6.161). Note
Random Fields Estimation Theory
140
that we have proved in 6.1 that v satisfies the equation
A,(8’, 8, k) exp(ik8’ . z)d8’.
(6.168)
This is equation (6.92) written in terms of w. Subtract (6.168) from (6.161) to get
[A(O’, 8, k) - A,(O’, 8, k)]u ( z ,-8’, -k)d8’,
Vz E R3.
(6.169)
Equation (6.169) and Theorem 6.1 from Section 6.1.8, imply (6.167). The last statement of Theorem 6.3 can be proved as follows. Suppose there are two (or more) solutions vj, j = 1,2, to equation (6.161) with properties (6.164) and (6.165). Let q j ( 2 ) and Aj(O’,O,k), j = 1 , 2 be the corresponding potentials and scattering amplitudes. If q1 = q2 then w 1 = w 2 by the uniqueness of the scattering solution (Lemma 6.4, p. 114). If q1 $ q 2 then w := v1 - v2 $ 0 . The function w solves the equation
w(x,e,k) = w(x, -8,
vx E R
A(el‘,e, k)w(x, -8y - k ) d Y ,
~ .
(6.170) Note that W(Z,
8, k) = [Al(8‘, 8, k) - ~ ~ ( e8,’k)] , g(T) + o ( T - ~ ) ,
+ 00,
xT-l
= 8‘
(6.171) and W(X,
-8, -k) = [A1(8’, -8, -k) - A2(8’, -8, - k ) ] g ( r ) + o ( T - ’ ) , T--too, xT-l=ef (6.172)
where g ( T ) := T - ~exp(ikr). From (6.170), (6.171) and (6.172) it follows that
+
[A1(8’, 8, k) - ~ ~ ( e8,’k)] , g ( T ) = q e l , 8, I C ) ~ ( ~ o) ( T - ~ ) ,
T
-, 00
(6.173) , k) is not important for our argument. It where the expression for B ( @8, follows from (6.173) that
A1(8’, 8, k) = A2(0’, 8, k)
(6.174)
Estimation and Scattering Theory
141
so that B(8’,8, k ) = 0). By Lemma 6.7, it follows that 6.3 is proved.
+
Exercise. Prove that if ag(r) = @(r) o(r-’), and b do not depend on r then a = b = 0.
T
-+
q1
= q 2 . Theorem
0 00,
k
+
> 0, where a
y,
Hint: Write aexp(ikr) = bexp(-ikr) o(1); choose rn = n 00. nn+ H Derive that a = b. Then choose rk = +, n -i 00. Derive -a = b. Thus -+
a=b=0.
Another characterization of the class of scattering amplitudes is given in [Ramm (1992)l. A characterization of the class of scattering amplitudes at a fixed k > 0 is given in [Ramm (1988)l. 6.2.6
The Born inversion
The scattering amplitude in the Born approximation is defined to be exp ( i k ( 8 - 8’) . x } q ( z ) d z
AB(e‘,8, k ) :=
(6.175)
which is formula (6.15) with u(y, 8, k ) substituted by %(y, 8, k) := exp(ik8. y). The Born inversion is the inversion for q ( z ) of the equation
1
exp (ik(9 - 8’) . X } q ( z ) d x = -47rA(9’, 8, k )
(6.176)
which comes from setting
A ( @ ,9, k ) = A B ( 8 ’ , 8, k ) .
(6.177)
The first question as: does a q ( x ) E Q exist such that (6.177) holds for all 9’,9 E S2 and all k > 0 1 The answer is no, unless q ( x ) = 0 so that AB(8’, 8, k ) = A(B’,8, k ) = 0.
Theorem 6.4 Assume that q E Q . If (6.177) holds for all 8‘, 8 E all k > 0 then q ( z ) = 0. Proof.
S2 and
Since q = if, it follows from (6.175) that
A g ( 8 , 8 ,k ) - AB(8,8, k ) = 0.
(6.178)
Random Fields Estimation Theory
142
From (6.178), (6.177) and (1.54) one concludes that
s,,
(AB(8,a,k)I2da= 0 V8 E S2 and all k > 0.
Thus Ag(8, a , k) = 0 for all 8, a E S2 and k that q(z) = 0. Theorem 6.4 is proved.
(6.179)
> 0. This and (6.175) imply
Remark 6.2 If q E Q is compactly supported and (6.177) holds for all 8‘, 8 E S2 and a fixed k > 0 then q(x) = 0. This follows from the uniqueness theorem proved in [Ramm (1992)l. It follows from Theorem 6.4 that the scattering amplitude A(#, 8, k) cannot be a function of p := k(8 - 8‘) only. The Born inversion in practice reduces to choosing a p E R3, finding 8,B’ E S2 and k > 0 such that p = Ic(e - e l ) ,
(6.180)
writing equation (6.176) as q(p) :=
/
+
exp(ip. z)q(z)dz = -47r [ A @ ) 771
(6.181)
where
and 77 is defined as
v := A(8’,8, k ) l l c ( o - o + p One then wishes to neglect
r]
- A(P).
(6.183)
and compute q(z) by the formula (6.184)
However, the data are the values A(p)+r] or, if the measurements are noisy, the values
where r]l is noise and 6 > 0 is defined by formula (6.186) below. The question is: assuming that 6 > 0 is known such that
h + m I <6
(6.186)
Estimation and Scattering Theory
143
how does one compute qa(z), such that
- q(z)l I E ( 6 )
146(2)
+
0 as 6 -+ 0.
(6.187)
In other words, how does one compute a stable approximation of q(z) given noisy values of A(p) as in (6.185)? This question is answered in [Ramm (1992)].We present the answer here. Define q&)
:= -47r(47r)-3
/
& ( p ) exp(-ip. z ) d p ,
R(6) = m6-A
lPlIR(6)
where the constants Q > 0 and b > (6.191) and (6.190)). Theorem 6.5
(6.188)
4 will be specified below (see formulas
The following stability estimate holds:
provided that (6.190) The constants
Q
and c1 are given by the formulas (6.191)
25 +
c1="(L) 31T
Proof.
41T
"$'
1 3-2.1 27r2(2b - 3 ) ( 4 7 r ) ~
(6.192)
Using (6.182) and (6.190), one obtains
2 I -6R3
3lr
:= $(A,
c:, R3-:,' + -2b - 3 ~IT'
R).
(6.193)
Random Fields Estimation Theory
144
For a fixed 6
> 0 , minimize 4(6, R) in R to get (6.194)
where c1 is given by (52). Theorem 6.5 is proved.
0
The practical conclusions, which follow from Theorem 6.5, are: The Born inversion needs a regularization. One way to use a regularization is given by formula (6.188). If one would take the integral in (6.188) over all of R3, or over too large a ball, the error of the Born inversion might have been unlimited. Even if the error 77 of the Born approximation for solving the direct scattering problem is small, it does not imply that the error of the Born inversion (that is the Born approximation for solving the inverse scattering problem) is small. The second conclusion can be obtained in a different way, a more general one. Let B ( q ) = A(*), where 23 is a nonlinear map which sends a potential q E Q into a scattering amplitude A . The Born approximation is a linearization of (*). Let us write it as (6.195) The inverse of the operator 23'(qo) is unbounded on the space of functions with the sup norm. Therefore small in absolute value errors in the data may lead to large errors in the solution q - qo. The Born approximation is a linearization around qo = 0. The distorted wave Born approximation is a linearization around the reference potential 40. In both cases the basic conclusion is the same: without regularization the Born inversion may lead to large errors even if the perturbation q - qo is small (in which case Born's approximation is accurate for solving the direct problem). Let us discuss another way to recover q(z) from A(8', 8, k) given for large k. This way has a computational advantage of the following nature. One does not need to find 8', 8, and k such that (6.180) holds and one integrates over S2 x S 2 instead of R3 in order to recover q(z) stably from the given noisy data A6(8', 8, k):
IA6(8',8, k) - A(8', 8, k)l 5 6.
(6.196)
Estimation and Scattering Theory
145
We start with the following known formula ([Saito (1982)l)
To estimate the rate of convergence in (6.197) one substitutes for A(W,0, k) its expression (6.15) to get
(6.198) where
4 is defined by (6.60). A simple calculation yields
(6.200) One has 1 - cos(2kla: - yI)
s
= - 2 ~ q(y)lz - ~ l - ~ d-y2~
where
Let us assume that
I”
dr ~ 0 ~ ( 2 k r ) Q T( z) , (6.201)
Random Fields Estimation T h e o q
146
Then, integrating by parts, one obtains
C
IF, Here c = const
k>l.
(6.204)
> 0 does not depend on x:
+
JVq(x ra)lda
c = max x€R3 L ' 2
+ xmax ER3
(6.205)
T>O
(6.206) and dr(1
+
12 - TI)-"
5
1"
dr (1
+ 1151 -
L 00
TI)-"
5
dr (1 + 11x1 - T I ) - "
5 2 p ( l +r)-" 5 c2
(6.207)
so that the right-hand side of (6.205) is bounded uniformly in x E R3. Thus 3 1 =
J IZ
- 2 ~
- YI2q(Y)dY
+ 0(K2), k +
00
(6.208)
provided (6.203) holds. Note that if q E LfOc(R3)and no a priori information about its smoothness is known, then one obtains only o(1) in place of O(k-') in (6.208). From (1.72) it follows that 3 2
5 ck-1,
k > 1.
(6.209)
Thus, assuming (6.203),
3 = -27r
J
Ix - yI-Zq(y)dy
where O ( k - l ) is uniform in x.
+o(k-I),
k
-+ 00
(6.210)
147
Estimation and Scattering Theory
Therefore (6.197) can be written as
IC2
L
2
s,,
8, k) exp {iqe’ - e) . X} dede‘
= -2n
J 1%
-
yl-2q(y)dy
+ 0(lC-l).
(6.211)
The equation -2nI
12 - yI-2q(y)dy
= f(z),
x
E
(6.2 12)
R3
is solvable analytically. Take the Fourier transform of (6.212) to get 1
(6.213)
@(PI = -&lfl(P). Here we have used the formula
(6.214) thus q(z) = ~-1 Jexp(-~p.X)~p~~(p)dp.
(6.215)
Assume for a moment that the term O ( k - ’ ) in (71) is absent. Then, applying formula (6.215), taking f ( ~ t)o be the left-hand side of (6.211), and taking the Fourier transform of f l ( p ) , one would obtain
-
k3
dOdO’A(e’, 8, qle’ - 81 exp {ik(e’ - e) . X}
. (6.2 16)
This formula appeared in [Somersalo, E. et al. (1988)l. If A6 is known in place of A , and (6.196) holds, then formula (6.216), with A6 in place of A , gives q&)
:= --
dede’A6(e’, 8, k ) p ’ - e ( exp { i k ( e ’ -
e) . Z} . (6.217)
Random Fields Estimation Theory
148
Neglecting the term O(k-') in (6.211), one obtains (6.218) where we have used the formula:
L,1,
le - elidel = 4*
10 - e'ldede' = = 8n2
1"
S,, le
- elldo'
64r2 ydy = - (6.219) 3 -
J-sni
It is now possible to take into account the term O(k-') that, as follows from (6.72) and (6.200),
in (6.211). Note
One has
and
= 2T
[1+ 11x1 - .I2]
-a/2+1
+
- [ I +1.1(
2
3 -a/2+1
(6.222)
21xllrl(; - 1) Moreover
Jo"
dr
[1+ 11x1 - TI2]
-a/2+1
1.1(
- [I+ 7-
2
+TI
1-a/2+1
C
5 -,
1x1
1x1 2 1 (6.223)
where c > 0 is a constant. Therefore (6.224)
Estimation and Scattering Theory
149
This means that the L 2 ( R 3 )norm of 3 2 as a function of x is O(k-') as k 4 00. Therefore, if one takes into account the O(k-') term is (71), one obtains in place of (6.218) the following estimate
11 q - q6
11p(~3)<
c(6k3
+ k-'),
c = const
> 0.
(6.225)
Minimization of the right-hand side of (6.225) in Ic yields
11 q - q6
IIp(tz3)I
c ~ S - ' ' ~ for
k,
=
(36)-'14
(6.226)
where k, = k m ( 6 ) is the minimizer of the right-hand side of (6.225) on the interval k > 0. Therefore, if the data Ah(#, 8, k) are noisy, so that (6.196) holds, one should not take k in formula (6.217) too large. The quasioptimal k is given in (6.226), and formula (6.217) with k = k, gives a stable approximation of 4(x). Let us finally discuss (6.225). The term ck-l has already been discussed. The first term c6k3 has been discussed for the estimate in sup norm. In the case of L2 norm one has to estimate the L2(R3)norm of the function
given that la1 5
6,
]Veal + IVetal I mi
(6.228)
where Ve, Vv are the first derivatives in 8 and 8', and ml = const > 0. In order to estimate h we use the following formula [Ramm (1986), p.541
aE
s2.
(6.229)
This formula is proved under the assumption f E C'(S2). From (6.227)(6.229) one obtains (6.230) Thus
Random Fields Estimation Theory
150
By c we denote various positive constants. From (6.231) one obtains the first term in (6.225) for the case of the estimate in L 2 ( R 3 )norm. 6.3
Estimation theory and inverse scattering in R3
Consider for simplicity the filtering problem which is formulated in Chapter 2. Let
U = s + n(z),
z E R3
(6.232)
where the useful signal s(z) has the properties
-
s(z) = O,
(6.233)
s*(z)s(y) = R3(z,y) s(z)n(y) = 0
(6.234)
and the noise is white
-
n ( z )= 0,
n*(z)n(y)= 6(x - y).
(6.235)
In this section the star stands for complex conjugate and the bar stands for the mean value. The optimal linear estimate of s(z) is given by (6.236)
Here the optimality is understood in the sense of minimum of variance of the error of the estimate, as in Chapter 2. Other notations are also the same as in Chapter 2. In particular, D c R3 is a bounded domain in which U is observed. It is proved in Chapter 2 that the optimal h(z,y) solves equation 2.11 which in the present case is of the form:
or, if one changes z -+ y and y
h ( z ,y) -k
-+
z, this equation takes the form
R3(!/7z ) h ( z >z)dz = R 3 ( ! / , z),
z, y
D*
(6.238)
Note that under the assumptions (6.233)-(4) one has
R ( z ,Y) z= R3(z,Y) + 6(z - Y),
f(z,Y)
= Rs(2,Y)
(6.239)
Estimation and Scattering Theory
151
where R ( z ,y) and f ( z ,y) are defined in Chapter 2 (see formulas (2.3) and (2.10)). The basic equation (6.238) is a Fredholm’s second kind equation with positive definite in L 2 ( D )operator R, R 2 I. It is uniquely solvable in L 2 ( D )(that is, it has a solution in L2(D)and the solution is unique). There are many methods to solve (6.238) numerically. In particular an iterative method can easily be constructed for solving (6.238). This method converges as a geometrical series (see Section 3.2, Lemma 3.1). Projection methods can be easily constructed and applied to (6.238) (see Section 3.2, Lemma 3.3). In [Levy and Tsitsiklis (1985)I and [Yagle (1988)] attempts are made to use for the numerical solution of the equation (6.238) some analogues of Levinson’s recursion which was used in the onedimensional problems when Rs(z, y) = &(a: - y). In the one-dimensional problems causality plays important role. In the three-dimensional problems causality plays no role: the space is isotropic in contrast with time. Therefore, in order to use the ideas similar to Levinson’s recursion one needs to assume that the domain D is parametrized by one parameter. In [Levy and Tsitsiklis (1985)] the authors assume D to be a disc (so that the domain is determined by the radius of the disc; this radius is the parameter mentioned above). Of course, one has to impose severe restrictions on the correlation function R,(x,y). In [Levy and Tsitsiklis (1985)] it is assumed that
(6.240)
This means that s ( x ) is an isotropic random field. In [Yag]the case is considered when D c R3 is a ball and R s ( z ,y) solves the equation
where A, is the Laplacian. If R s ( z ,y) = R,(z - y) then (6.241) holds. Let us derive a differential equation for the optimal filter h(z,y). The derivation is similar to the one given in Section 8.4.8 for Kalman filters. Let us apply the operator Az - Av to (6.238) assuming (6.241) and taking
152
Random Fields Estimation Theory
where, as we will prove,
Here S2is the unit sphere in R3 and
Let us prove (6.243) and (6.244). Integrate by parts the second integral in (6.242) t o get
The first integral in (6.242) can be written as
where A* is the angular part of the Laplacian and z = pp, where p = IzI
Estimation and Scattering Theory
and
153
P E S2. One has
(6.247)
From (6.245) and (6.247) one obtains
where q is given by (6.244). In order t o derive a differential equation for h let us assume that
Random Fields Estimation Theory
154
write equation (6.238) for D = { z :
(zI
5 1x1) as
Note the restiction 1yI I 1x1 in (6.250). Multiply (6.250) by q ( r a , r p ) ,set in (6.250) x = rp, r = 1x1,integrate over S2in ,B and then multiply by r2 to get
Define
and set x = ra in (6.252). Write equation (6.242) as
or, in the operator form
where $ is the right-hand side of (6.253). Equation (6.251) is of the form
where
Since the operator I+R, 2 I is injective, it follows from (6.254) and (6.255) that 9 = y. Thus
(A,
-
Ay)'Ft(x,Y) = r2
J,,X ( r P ,
Y)Q(TQ,
rP)dP,
IYI 5 1x1 = r,
= ra,
(6.257) where a , p E S2 and q is given by (6.244) with h(ra,r p ) = K ( r a ,r p ) . Let us formulate the result:
Estimation and Scattering Theory
155
Lemma 6.8 If 7-1(xly) solves equation (6.250) and the assumptions (6.241) and (6.249) hold, then 7-1 solves equation (6.257) with q(ra,rP) given by (6.244) with h ( r a ,rP) = 'Fl(ra,TP), Q , P E S2, 1x1 = r , z = T Q . If one defines
and put
where the integral in (6.259) is taken actually over the ball IyJ5 1x1 because of (6.258)' then ~ ( zy), = (27r)-+
I
Fi(z,E ) exp(-if
.y ) d ~ .
(6.260)
Substitute (6.260) into (6.257) (or, which is the same, Fourier transform (6.257) in the variable y) to get
(A,
+ f 2 ) f i ( z ,6 ) = r 2
L
fi(rP,f ) q ( r a lrP)d/3, z = T Q .
(6.261)
Equation (6.261) is a Schrodinger equation with a non-local potential
Suppose that 'Fl(z,y) is computed for IyI 5 1x1 5 a. Given this 'Fl(z,y), how does one compute the solution h(z,y) to the equation (6.238) with D = B, = {z : 1x1 5 a } ? Write equation (6.238) for D = B, and z = p/3 as
Differentiate (6.263) in a t o get
Random Fields Estimation Theory
156
Let z = ua in (6.263). Multiply (6.263) by -a2h(z,aa,a) and integrate over S2 to get: -a2
Jszh(aa,y, a ) h ( z ,aa, a)da
+ so"Jsz W Y ,PP) { -a2 = -a2
Js2
PPl
a)h(z,
a)}
JszR,(y,aa)h(z,aa, a)da.
(6.265)
+
The operator I R, is injective. Therefore equations (6.264) and (6.265) have the same solution since their right-hand sides are the same. Set z = z and a = p in (6.265), compare (6.264) and (6.265) and get (6.266) Note that
h(z,Y) = w z , Y) for IYI I 1x1
(6.267)
according to equation (6.250). Therefore (6.266) can be written as (6.268) Equation (6.268) can be used for computing the function h (z ,y,a ) for all z, y E B,, given %(z,y) for IyI 5 1x1 5 a. The value h(z,ap, a ) can be computed from equation (6.263):
The function R,(ap, z ) is known for all z , ,6 and the function h ( z ,pd, a ) , p < a is assumed to be computed recursively, as a grows, by equation (6.268). Namely, let us assume that %(z,y) is computed for all values IyI 5 1x1 5 A , and one wants to compute h(z,y) for all z, y E BA := {z :1 . 1 5 A } . From (6.268) one has
h (z, y, (m + 1 ) ~=)h ( z ,y, m ~-)T
( ~ Tx) ~
1-l(m~P, y, m ~ ) h (m z , ~ pr n, ~ ) d p . (6.270)
Here m = 0,1,2,. . . , T > 0 is a small number, the step of the increment of a. It follows from (6.263) that (6.271)
Estimation and Scattering Theory
157
so that
(6.272) (6.273) and so on. One can assume that IyI > 1x1 because for IyI 5 1x1 one can use (6.267). A formal connection of the estimation problem with scattering theory can be outlined as follows. Let us assume that there exists a function 'Fl(~,y),'Fl(z,y) = 0 for IyI > 1x1,such that the function +(x,8, k) defined by the formula
+(z,8, k) := exp(ik8
X) -
/
exp(ik8. y)'Fl(z, y)dy
(6.274)
lYlll4
is a solution to the Schrodinger equation
[A
+ k2 - q ( ~ )+] = 0,
(6.275)
where A = V2 is the Laplacian. This assumption is not justified presently, so that our argument is formal. Taking the inverse Fourier transform of (6.274) in the variable k8, one obtains
(6.276) Compute (A, - Ay)'Fl formally taking the derivatives under the integral signs in the left-hand side of (6.276) and using (6.275). The result is
(A, - Ay)'Fl(X, Y) = q(z)'Fl(z,Y).
(6.277)
One is interested in the solution of (6.277) with the property 'Fl(z,y) = 0 for JyI > JzI. Define &(x,() by formula (6.259). Substitute (6.260) into (6.277) and differentiate in y formally under the sign of the integral to get
[A,
+ t2- ~ ( z )*(x, ] <) = 0.
(6.278)
Therefore, if one compares (6.278) and (6.261) one can conclude that the right-hand side of (6.261) reduces to q ( z ) f i ( z - , c )This . means that (6.279)
158
Random Fields Estimation Theory
where S(a - p) is the delta-function. If (6.279) holds then the non-local potential Q defined by (6.262) reduces to the local potential q(z). Equations (6.244) and (6.279) imply (6.280) Note that h(rcr,r p ) = 'H(ra,rp). From (6.280) one obtains (6.281) where we have used the equation
'FI(0,O) = &(O, 0)
(6.282)
which follows from (6.250). Let us summarize the basic points of this section: 1) the solution to equation (6.250) solves equation (6.257) provided that the assumptions (6.241) and (6.249) hold; 2) the solution to equation (6.238) with D = B, is related to the solution to equation (6.250) for 1x1 5 a by the formulas (6.267) and (6.268); 3) if the solution to equation (6.250) is found then one can compute the solution to equation (6.238) with D = B, recursively using (6.270); 4) the solution to equation (6.250) solves the differential equation (6.257), and its Fourier transform solves the Schrodinger equation (6.261) with a non-local potential.
Chapter 7
Applications
In this chapter a number of questions arising in applications are discussed. All sections of this chapter are independent and can be read separately.
7.1 What is the optimal size of the domain on which the data are to be collected?
Suppose that one observes the signal
+
U ( Z )= ~ ( z ) n ( ~ ) ,2 E D c R'
(7.1)
in a bounded domain D which contains a point 20. Assume for simplicity that D is a ball Be with radius C centered at 20.The problem is to estimate ~(20).As always in this book, s(z) is a useful signal and n(z) is noise. It is clear that if the radius C, which characterizes the size of the domain of observation, is too large then the time and effort will be wasted in collecting data which do not improve the quality of the estimate significantly. On the other hand, if C is too small then one can improve the estimate using more data. What is the optimal l? This question is of interest in geophysics and many other applications. Let us answer this question using the estimation theory developed in Chapter 2. We assume that the optimal estimate is linear and the optimization criterion is minimum of variance. We also assume that the data are the covariance functions (1.3), that condition (1.2) holds, and that R(z,y) E R. The optimal estimate is given by formula (2.15). Let us assume for simplicity that P(X) = 1, and IR(z,y)l I cexp(-alz - yI) 159
c
> 0,
Iz - yI 2 E
>0
(7.2)
Random Fields Estimation Theory
160
where the last inequality allows a growth of R(z,y) as z + y, for example, R(z,y) = (4nlz-y))-lexp(-a)z-y/). Here c and a are positive constants, is the so-called correlation radius, and the function f(z,y) defined by formula (1.3) is smooth. Concerning f(z,y) we assume the same estimate as for R(z,y): If(z,y)l
I cexp(-ala: - Yl),
c > 0,
Iz - YI
L
€0
> 0.
(7.3)
Under these assumptions the optimal filter is of the form
+
h = Q ( L ) f h, = ho
+ h,
(7.4)
where h, is the singular part of h which contains terms of the type {b(s)6r}(j)(see Section 3.3), and ho is the regular part of h. The optimal estimate is of the form i(z0)=
s,,
ho(zo,y)U(y)&
+
s,,
hs(zo,Y)U(Y)dY.
(7.5)
The optimal size C of the domain Be of observation is the size for which the second term in (7.5) is negligible compared with the first.
Example 7.1 Suppose that r = 3, zo = 0, R(z,y) = (4+ exp(-alz - Yl), and l$fl 5 M , 0 5 Ijl 5 2, where j is a multiindex. Then by formula (2.85) one has
where h ( y ) = h ( O , y ) , f(y) = f(O,y), optimal estimate is h ( y ) U ( y ) d y=
/
Bt
(-A
r
= {z : (21 =
C}. Therefore the
/
+ a 2 ) f U d y + r (af - *) U ( s ) d s ~ Y I ~ Y I
(7.7) and u is uniquely determined by f as the solution to the Dirichlet problem (2.22-2.23) which in our case is
( - A + a 2 ) u = 0 if
1x1 2 1, u = f if 1x1 = C,
u ( m ) = 0.
(7.8)
The solution to problem (7.8) can be calculated analytically. One gets
Applications
161
where 0 = (29,$) is a point on the unit sphere S2 in R3,a unit vector, {Yn(8)}is the system of spherical harmonics orthonormalized in L 2 ( S 2 ) , f n := f(C,O)Y,*(e)de,and hn(r) is the spherical Hankel function, ( r ) , where HA1)(r)is the Hankel function. .The := ($) 2'1 Hn+(1/2) (') solution (7.9) is a three dimensional analogue of the solution (2.91). The second integral on the right-hand side of (7.7) is of order of magnitude 0 (exp(-a!)), while the first integral is of order of magnitude O(1). Therefore, if we wish to be able to neglect the effects of the boundary term on the estimate with accuracy about 5 percent then we should choose e = 3/a. A practical recipe for choosing e so that the magnitude of the boundary term in (7.7) is about y percents of the magnitude of the volume term is 1 100 C = -In-. (7.10) a 7
ss2
7.2
Discrimination of random fields against noisy background
Suppose that one observes a random field U ( x ) which can be of one of the forms
U ( z )= sp(x) + n ( z ) , p = 0 , l .
(7.11)
Here sp(z), p = 0,1, are deterministic signals and n ( z )is Gaussian random field with zero mean value -
n = 0.
(7.12)
In particular, if SO = 0 then the problem is to decide if the observed signal U ( x ) contains the signal s~(z) or is it just noise. In order to formulate the discrimination problem analytically we take as the optimality criterion the principle of maximum likelihood. Other optimal decision rules such as Neyman-Pearson or Bayes rules could be considered similarly. Note that we assume in this section that noise in Gaussian. This is done because under such an assumption one can calculate the likelihood ratio analytically. Let us first develop the basic tools for solving the discrimination problem. Let
162
Random Fields Estimation Theory
Here
A1
2 A2 2 * - . > 0,
(7.15)
X j are the eigenvalues of the operator R counted according their multiplicities, and the & are the corresponding normalized in L 2 ( D )eigenfunctions. Let us define random variables nj by the formula:
From (7.12) it follows that -
nj
(7.17)
= 0.
1 i=j 0 i#j.
(7.18)
The random variables nj are called noncorrelated coordinates of the random field n(x). One has (7.19) The series in (7.19) converges in the mean. The random variables Gaussian since the random field n(z)is Gaussian. Define spj := X
y
s,
s,(z)fqz)dz,
p = 0,l.
nj
are
(7.20)
Then
Let (7.22)
A pplacataons
163
Then
u,,=
IUpj - s p j 1 2 = 1,
spj,
(7.23)
and Upj are Gaussian. Let H, denote the hypothesis that the observed random field is sp(x) n ( z ) ,p = 0,1, and f(u1,.. . , u, H,) is the probability density for the random variables U,j under the assumption that the hypothesis H, occured:
+
I
Here we used equations (7.23). Since U,j are complex valued we took ( 2 ~ ) rather - ~ than ( 2 ~ ) - " /as ~ the normalizing constant in (7.24). The likelihood ratio is defined as (7.25) Therefore
We wish to compute the limit of the function (7.26) as n --f 00. If this is done one can formulate the decision rule based on the maximum likelihood principle. Note first that the system of eigenfunctions {q$} of the operator R is complete in L 2 ( D ) since we have assumed that the selfadjoint operator R : L 2 ( D ) -+ L 2 ( D ) is positive, that is (R$I,$I)= 0 implies 4 = 0 (see (7.15)). Indeed since
L 2 ( D )= d{RanR}
@ N(R)
(7.27)
where d{RanR} is the closure of the range of R, and N ( R )is the null space of R, and since N ( R ) = (0) by assumption (7.15), one concludes that the closure of the range of R is the whole space L 2 ( D ) .Thus the closure of the linear span of the eigenfunctions {&} is L 2 ( D )as claimed.
Random Fields Estimation Theory
164
Let us assume that
Define the function V(z) as the solution of minimal order of singularity of the equation RV :=
s,
R(z,y)V(y)dy = s1(z)
- so(z), z E D.
(7.29)
Thus
V(X= ) R-'(s~ - SO).
(7.30)
Using Parseval's equality one gets W
Xy'cjbj" = j=1
s,
c(y) {R-'b(y)}* dy
(7.31)
where c(y) and b(y) are some functions for which the integral (7.31) converges,
XT'bj =
ID
R-'b(y)$j*dy.
(7.33)
Therefore using formulas (7.16), (7.19), (7.20), (7.22), (7.30) and (7.31) one obtains (7.34)
Let us assume that
165
Applications
Then, 1using selfadjointness of R-l, one gets 00
=
-
s,
soV*dx -
s;Vdz.
(7.36)
Combining (7.36)) (7.34) and (7.26) one obtains
Lemma 7.1
There exists the limit
(7.37) If the signals s p , p = 0,1, and the kernel R(x,y) are real valued then (7.36) reduces to 1nC ( U ( x ) )=
1
[u(x) - s o ( x )-k 2
S1(x)]
V(x)dx.
(7.38)
D
Suppose that the quantity on the right hand side of equation (7.37) (or (7.38)) has been calculated, so that the quantity 1nC (U(x))is known. Then we use The maximum likelihood criterion: if lnC(U(x)) 2 0 then the decision is that hypothesis H1 occured, otherwise hypothesis Ho occured. Therefore if
R e l U ( x ) V * ( x ) d x2
s ~ ( x ) V ( x ) d x (7.39)
then HI occured. Here V is given by formula (7.30). If the opposite inequality holds in (7.39) then Ho occured. If s p ( x ) ,p = 0,1, U ( x )and R ( x ,y ) are real valued then the inequality (7.39) reduces to
L L ( ( x ) V ( x ) d rL
[SO(X)
+ sl(x)I V ( x ) d x .
(7.40)
Random Fields Estimation Theory
166
The decision rule is: if (7.40) holds then HI occured, otherwise Ho occured. If one uses some other threshold criterion (such as Bayes, NeymanPearson, etc.) then one formulates the decision rule based on the inequality
U ( x ) V * ( x ) d x2 1nK
+
+
[ s o ( z ) V * ( x ) s T ( x ) V ( x ) ]d x ,
(7.41)
where )c > 0 is a constant which is determined by the threshold. (See Section 8.4 for more details.) The decision rule: Practically, the decision rule based o n the inequality (7.5’9) (or (7.41)) can be formulated as follows: 1) Given sp(x), p = 0,1, solve equation (7.29) for V ( x ) by formulas given
in Theorem 2.1. 2) If V ( x ) is found and U ( x ) is measured, then compute the integrals in formula (7.39) and check if the inequality (7.39) holds. 3) If yes, then the decision is that the observed signal is
u = s1(z) + n(z).
(7.42)
+ n(x).
(7.43)
Otherwise
U
= SO(Z)
Example 7.2 Consider the problem of detection of signals against the background of white Gaussian noise. In this case SO(Z) = 0, R(z,y ) = cr2S(x - y), we assume that the variance of the noise is cr2. The solution to equation (7.29) is therefore
v = 0-2s&).
(7.44)
The inequality (7.39) reduces to
(7.45)
If (7.45) holds then the decision is that the observed signal U ( x ) is of the form
U ( z ) = SI(Z)
+ n(x).
Otherwise one decides that
U ( z ) = n(z).
Applications
167
The problem of discrimination between two signals sl(z) and so(z) against the white Gaussian noise background is solved similarly. If
then the decision is that equation (7.42) holds. Otherwise one decides that (7.43) holds. If all the signals are real-valued then inequality (36) can be written as (7.47) The decision rule now has a geometrical meaning: if the observed signal 5’1 in L 2 ( D ) metric then the decision is that (7.42) holds. Otherwise one decides that (7.43) holds. We have chosen a very simple case in order to demonstrate the decision rule for the problem for which all the calculations can be carried through in an elementary way. But the technique is the same for the general kernels R E R.
U is closer to
Example 7.3 Consider the problem of detection of a signal with unknown amplitude. Assume that the observed signal is either of the form
U ( z )= ys(x)
+n(z)
(7.48)
or
U ( z )= n(z).
(7.49)
Parameter y is unknown, function s(z) is known, n(z)is a Gaussian noise with covariance function R(z,y) E R. Given the observed signal U ( z ) one wants to decide if the hypothesis H 1 that (7.48) holds is true, or the hypothesis H o that (7.49) holds is true. Moreover, one wants to estimate the value of y. In formula (7.37) take s 1 = ys(z),
so = 0.
(7.50)
Then, using the equation R&*V = ReUV*,write (7.51) where V ( z )solves the equation r
(7.52)
168
Random Fields Estimation Theory
One should find the estimate of y by the maximum likelihood principle from the equations
dine
-=
0,
8-f
dlne
--
dr*
- 0.
(7.53)
If, again for simplicity, one assumes that the noise is white with variance a2 = 1, so that R ( x , y ) = r5(x - y), then the solution to (7.52) is
and formula (7.51) reduces to
s,
h e = Rey L U * s d x - Id2 2
lsI2dx.
(7.55)
Therefore equations (7.53) yield (7.56)
so that the estimate 9 of y is (7.57)
Exercise. Check that the estimate
9 is unbiased, that is
-
9 = y.
(7.58)
Hint: Use the equation a = y S ( x ) . Exercise. Calculate the variance:
(9 - 7 ) 2
U2
= --,
E :=
lsI2dx.
(7.59)
Estimate (7.59) shows that the variance of the estimate of y decreases as the energy E of the signal S ( x ) grows, which is intuitively obvious. Assume that hypothesis Ho occured. Then the quantity 9 defined by formula (7.57) is Gaussian with zero mean value and its variance equals $ by formula (7.59). Therefore Prob(1.j.l
> b) = 2erf(bE1I2/a)
(7.60)
Applications
169
where
Lrn
erf(z) := (27r1-l’~
If one takes confidence level occured if
6
exp(-t2/2)dt.
(7.61)
= 0,95 and decides that hypothesis Ho
2erf(.j.~l/~/o) > E,
(7.62)
then the decision rule for detection of a known signal with an unknown amplitude against the Gaussian white noise background is as follows: 1) given the observed signal U ( z ) calculate 9 by formula (7.57), 2) calculate the left hand side in the formula (7.62); if the inequality (7.62) holds then the decision is that (7.49) holds; otherwise the decision is that (7.48) holds.
7.3 Quasioptimal estimates of derivatives of random functions
7.3.1
Introduction
Suppose that the observed signal in a domain D
+
c RT is
U ( Z )= s(z) n(z), x E D C RT,
(7.63)
where S(X)is a useful signal and n(z)is noise, 2 = 5 = 0. If one wishes to estimate djs(z0) optimally by the criterion of minimum of variance then one has a particular case of the problem studied in Chapter 2 with As = d j s (see formula (1.5) of Chapter 1). This estimation problem can be solved by the theory developed in Chapter 2. However, the basic integral equation for the optimal filter may be difficult to solve, the optimal filter may be difficult to implement, and the calculation of the optimal filter depends on the analytical details of the behavior of the spectral density of the covariance kernel R(z,y) = u*(z)u(y), R ( x , ~ )E R. That is, if one changes &(A) locally a little it ceases to be a rational function, for example. One can avoid the above difficulties by constructing a quasioptimal estimate of the derivative which is easy to calculate under a general assumption about the spectral density, which is stable towards small local perturbations of the spectral density and depends basically on the asymptotic behavior 00, and which is easy to implement practically. of this density as 1x1
170
Random Fields Estimation Theory
The notion of quasioptimality will be specified later and it will be shown that the quasioptimal estimate is nearly as good as the optimal one. The basic ideas are taken from [Ramm (1968); Ramm (1972); Ramm (1981); Ramm (1984); Ramm (1985b)l. 7.3.2 Estimates of the derivatives
+
Consider first the one-dimensional case: U ( t ) = s ( t ) n(t).Assume that
Is”(t)l
5 M.
(7.65)
Let us assume for simplicity that n(t) and s ( t ) are defined on all of R1, that s ( t ) is an unknown deterministic function which satisfies (7.65), and the noise n(t)is an arbitrary random function which satisfies (7.64). Let A denote the set of all operators T : C(R1)-+ C(R’), linear and nonlinear, where C ( R 1 )is the Banach space of continuous functions on R1 with the norm (1 f (I= maxtER1 If(t)l. Let
+
ahU := (2h)-l[U(t h) - U ( t - h ) ] ,
(7.66)
h(6) := (2b/M)1/2, E ( 6 ) := (2MS)1/2.
(7.67)
First, let us consider the following problem: given U ( t )and the numbers b > 0 and M > 0 such that (7.64) and (7.65) hold, find an estimate U of s ’ ( t ) such that
11 u - s ’ ( t ) I[+
0 as 6 --t 0
(7.68)
and such that this estimate is the best possible in the sense
11 U - s’(t) /I=
inf TEA
sup
11 TZA - 11 .
(7.69)
Is”lSA4
149
This means that among all estimates TU the estimate U is the best one for the class of the data given by inequalities (7.64), (7.65). It turns out that this optimal estimate is the estimate (7.70) in the following theorem. Theorem 7.1
The estimate
ti := ah(&)u
(7.70)
Applications
171
has the properties
and (7.72)
where ~ ( 6 and ) h(6) are defined by (7.67) and Ah(a+/ is defined by (7.66).
Proof. Onehas iA/Jf - s'I 1]Ah(U-
s)I + l&S
- S'I
6 h
Mh 2
1 - + -.
(7.73)
Indeed
iAh(ZA - s)I I and
1 =I
s(t
+ h)2;s(t
s(t)
+ +
(n(t h ) ( in(t - h ) ( 6 Ih' 2h
- h) - st/
+ s'(t)h + s"(C+)$
- s ( t ) + s'(t)h - st'([-)$
2h
Mh --
I<=
Mh2 (7.74)
2 ,
where & are the points in the remainder in the Taylor formula and estimates (7.64), (7.65) were used. For fixed 6 > 0 and M > 0, minimize the right side of (7.73) in h > 0 to get
6
Mh
6
Mh(b)
(7.75)
where h(6) are ~ ( 6 are ) defined in (7.67). This proves inequality (7.71). To prove (7.72),take ~1
M = --t[t 2
and extend it on R1 so that
- 2h(6)] 0
5 t 5 2h(6),
(7.76)
172
Random Fields Estimation Theory
Here h(6) is given by (7.67). The extension of s1 with properties (7.77) is possible since on the interval [0,2h(6)]conditions (7.77) hold. Let (7.78)
s z ( t ) = -s1(t).
One has lsFl
IM,
Isp/I 6,
P = 1,2.
(7.79)
Take U ( t )= 0, t E R’. Then
lU@)- s p ( t ) l I 6, P = 1,2.
(7.80)
Therefore one can consider U ( t ) as the observed value of both s l ( t ) and s z ( t ) . Let T E A be an arbitrary operator on C(R1). Denote (7.81)
(7.82) where ~ ( 6 is ) given by (7.67). Taking infimum in T E A of both sides of (7.82) one obtains
From (7.83) and (7.71) the desired inequality (7.72) follows. Theorem 7.1 is proved. 0 7.3.3
Derivatives of random functions
Assume now that s ( t ) is a random function, that s ( t ) and n(t) are uncorrelated, A = 0, and n = uv,where the variance of v, denoted by D[v], is 1, so that ~ [ v=] I , D [ ~=] 2.
(7.84)
The problem is t o find a linear estimate Lu such that
D[LU - s’] = min
(7.85)
Applications
173
+
given the observed signal U ( t ) = s ( t ) n(t). As was explained in section 7.3.1, we wish to find a quasioptimal linear estimate of s’ such that this estimate is easy to compute, easy to implement, and is nearly as good as the optimal estimate. Let us assume that Z)[s(m)(t)] 5
Mi,
(7.86)
where dm)(t)is the m-th derivative of s ( t ) . Let us seek the quasioptimal estimate among the estimates of the form (7.87)
If m = 2q or m = 2q + 1let us take Q = q. If one expands the expression on the right hand side of (7.87) in powers of h and requires that the order of the smallness as h --t 0 of the function ALQ’s - s‘ be maximal, one obtains the following system for the coefficients ALQ): (7.88)
where
The system (7.88) is uniquely solvable since its determinant does not vanish: it is a Vandermonde determinant. One can find by solving system (7.88) that
Af) = 0,
1 A21 = f q9 , A*2 (3) -+%, A*3 (3) - f-
20
(7.91)
174
Random Fields Estimation Theory
1 A,,(4) - -70
+-.
(7.92)
We will need
+
Lemma 7.2 Let m = 29 1. Assume that the coefficients A t ) in ('7.8'7) satisfy ('7.88) with 0 5 j 5 29, and let
Then
V [At's - s'] 5 ymh2m-2,
ym := k M , ,2
(7.94)
where V is the symbol of variance. In order to prove this lemma, one needs a simple
Lemma 7.3
Let gj be random variables and aj be constants. If V[gjIIM ,
15 j 5 n
(7.95)
then (7.96)
Proof of Lemma 7.3 Note that (7.97)
by Cauchy's inequality. Let f(z1,. , . ,z,) be the probability density of the joint distribution of the random variables g1,g2,.. . ,gn. Let us assume without loss of generality
Applications
that g k
= 0.
Denote dx
= dxl
. . . dz,,
175
JRn = J, x = ( 2 1 , .
n
n
. . ,xn). Then
n
n
(7.98) j= 1
Lemma 7.3 is proved.
0
Proof of Lemma 7.2 One has
(7.99) where t K are the points in the remainder of Taylor's formula. Apply Lemma 7.3 to equation (7.99) and take into account the assumption (7.86) to get
which is equivalent to (7.34). Lemma 7.2 is proved. Lemma 7.4
0
One has
D [AF'U
- s']
5 4(h),
(7.101)
where
Here m = 2q
+ 1, R(t - T ) := v*(t)v(.)
is the covariance function of v ( t ) , and conditions (7.84) hold.
(7.103)
Random Fields Estimation Theory
176
Proof.
By assumption s and v are uncorrelated. Therefore (7.104)
where we took into account that the coefficients A:) are real numbers. Lemma 7.4 is proved.
Definition 7.1
The estimate
LU := AF’U
(7.106)
is called quasioptimal if h minimizes the function 4 ( h ) defined by (7.102). Thus, the quasioptimal estimate minimizes a natural majorant q5( h ) of the variance of the estimate At)tr among all estimates (7.106) with different h. The majorant 4 ( h ) is natural because the equality sign can be attained in (7.101) (for example, if s = 0 then the equality sign is attained in (7.101)). The quasioptimal filter is easy to calculate: it is sufficient to find minimizer of @(h),h > 0. This filter is easy to implement: one needs only some multiplications, additions and time shift elements. We will compare the error estimates for optimal and quasioptimal filter shortly, but first consider an example.
Example 7.4
Let m = 2, q
= 1,
-
(1) Ah u-
W t + h ) - u(t - h ) 2h
(7.107)
By formulas (7.93) and (7.89) for m = 2 and q = 1 one calculates c2
1 4
= -.
(7.108)
Let us assume that the constant M i := M in the estimate (7.86) is known, the variance a2 of noise is known (see (7.84)) and the covariance function of v(t) is
R(t)= exp(-ltl).
(7.109)
Then formula (7.102) yields
M
+(h)= T h 2
02h-2 +[l - exp(-2h)] 2
Mh2
=
+ a2h-2
4 2 [R(O)- R(2h)l.
(7.110)
177
Applications
If CJ << 1 then the minimizer of the function (7.110) should be small. Assuming h << 1 and using 1 - exp(-h) x h for h << 1, one obtains Mh2 u2 +--, dJ(h)= 4 h
(7.111)
h>0.
It is easy to check that the function (7.111) attains its minimum at
(g) 113
hmin =
(7.112)
and min dJ x ,4/3M1/3
(2213 + 2-113 ) = 2 . 3 8 1 ~ ~ / 3 M 1 / ~ .
(7.113)
Note that if the minimizer is small, so that hmin << 1, then the behavior of the covariance function is important only in a neighborhood of t = 0. But this behavior is determined only by the asymptotic behavior of the spectral density A(X) as 1x1 -i00. Let us now compare briefly optimal and quasioptimal estimates. Let us define the spectral density fi,(X) of s ( t ) : (7.114) J-00
where
R,(t
-T )
(7.115)
:= S * ( t ) S ( T ) .
We assume that 0 < &(A)
5
A
I (1 + X 2 ) a ’
a
22
(7.116)
and that the spectral density of noise 0 < A(X) 5
+*
(1 X2)b’
b>l.
(7.117)
One has for the quasioptimal estimate (7.107) formula (7.110), and
(7.118)
Random Fields Estimation Theory
178
The term R(0)- R(2h) in (7.110) can be written as
R(0)- R(2h) =
2n
/
00
I?(X)[l - exp(2iXh)ldX.
(7.119)
-M
Therefore the function $(h) in (7.110) can be expressed entirely in terms of spectral densities. One has
5 const x Bh.
(7.120)
It follows from (7.110), (7.118) and (7.120) that ,$(h) 5 const (Ah2
+
y)
(7.121)
7
where const does not depend on A , B, u and h but depends on a and b. If a 2 and b > 1 one can take an absolute constant in (7.121). This shows that estimate (7.121) does not depend much on the details of the behavior of the spectral densities. One can see from (7.121) that
4
&in
5 con st^^/^
(g)
1/3
,
hmin
= const
(y)1/3 .
This estimate shows how the variance of noise and the ratio the behavior of the error as u -+ 0. Note that M
dX
1 = D[v] = R(0) =
(7.122)
$ influence
= const
x B.
Therefore B is of order of magnitude of 1, and formula (7.122) can be written as &in
5 constc~~/~A~/~.
(7.123)
This estimate holds if hmin << 1, that is if u << All2.
All these estimates are asymptotic as u
4
(7.124) 0.
179
Applications
The optimal h satisfies the equation t
1-T
[&(y - z )
+ 02R(y- z ) ] h(x, t.)dz = f(y, z),
t -T 5 y 5 t, (7.125)
where
a
f ( y , z ) := s*(y)s'(z) = -R,(y dX
- z) = -Rt(y - x).
(7.126)
The error of the optimal estimate can be computed by formula (2.108): (7.127) where ho is the solution to equation (7.125) of minimal order of singularity. For simplicity let us assume that t - T = -00 and t = 00. The error for this physically nonrealizable filter is not more than for the physically realizable filter. The reasons for considering the nonrealizable filter are: 1) this filter is easy to calculate, and 2) its error gives a lower bound for the error of realizable filter. Let z = 0 in (7.125). Since the random functions we consider are assumed to be stationary there is no loss in the assumption that x = 0. Take the Fourier transform of (7.125) with t - T = -00, t = +00, and use the theorem about the Fourier transform of convolution to get [ii,(X)
+ (r2R(A)]ho(X)= -iXR,.
(7.128)
Thus
ho = -iXR,[R,
+ &I-1.
(7.129)
Use (7.129) and apply Parseval's equality to (7.127) to get
-
2 / 2n
00
-m
X2R,(X)iZ(X)[i?,(X)
+~ ~ f i ( X ) ] - l d X .
(7.130)
It follows from (7.130), (7.116) and (7.117), if we assume that for large X the sign I in (7.116) and (7.117) becomes asymptotic equality, that
1, 00
~ ( 0I) const x a2AB
I const x
o2 as
(1
(r+
X2dX
+ X2)bA+ a2B(1+ X2))" 0.
(7.131)
Random Fields Estimation Theory
180
The estimate (7.113) gives O ( U ~ /as~ CT) -+ 0 but if one takes larger m the estimate will be 0 ( ( T ~ ( ~where )), a(m) + 2 as m grows. One can estimate a(m) using formula (7.102). For small h one has q5 5 ~ ~ h 2 m -+2 consta2h-l, hmin $/(2m-1) and &,in m g 2 (zm-z)/(zm-l), so that a(m)= 2 E . Therefore a(m)-+ 2 as m + 00.
-
7.3.4 Finding critical points Before we discuss the case T > 1 of random functions of several variables, let us outline briefly an application of the results of section 7.3.2 to the problem of finding the extremum of a random function. Assume that the observed function U ( t ) = s ( t ) + n ( t ) , where s ( t ) is a smooth function defined on the interval [0,1] which has exactly one maximum on this interval. Such functions s ( t ) are called univalent. Suppose that this maximum is attained at a point T . Assume that
(7.132) The problem is to find T given the signal U ( t ) ,0 5 t 5 1. The solution to this problem is:
1) divide the interval [0,1] by the points t k = kh, where h = h(6) is given by (7.67), k = 0 , 1 , 2 , .. . 2) CdCUlate c k := (2h)-l [ U ( t k -k h) - U ( t k - h)]= A r ) u ( t k ) 3) Compute uk&+l, k = 0 , 1 , 2 , .. . 4) if .
.
I
(7.133) and
(7.134) then tj
< T < t j + h.
(7.135)
Indeed, from (7.67) and (7.71) one concludes that c(tk)-E(6)
5 S’(tk) 5 c ( t k )
+ E(6).
(7.136)
From (7.133), (7.134) and (7.136) it follows that s ’ ( t ) changes sign on the interval ( t j , tj+l). This implies (7.135) since s ( t ) is univalent.
Applications
181
If (7.133) is not valid for some k = ko, then the maximum may be on the interval ( t k - h, t k h). In this case it may happen that for no j condition (7.134) holds. The above method may not work if the derivative of s ( t ) is very small (smaller than ~ ( 5 ) in ) a large neighborhood of T .
+
Remark 7.1 Since we used formula (7.71) we assumed that U ( t ) is defined on all of R1. If it is defined only on a bounded interval [a,b] then the expression Ar'U(t) is not defined fort < a+ h. In this case one can define
A r ' U ( t ) = h-' [U(t+ h) - U ( t ) ] , u 5 t < u + h = Ar)U(t), a + h 5 t 5 b - h = h-l [U(t)- U ( t - h)], b - h 5 t 5 6.
(7.137)
In this case
1 A f ) U ( t )- s ' ( t ) / 5
25
Mh
+2
(7.138)
so that the minimizer hmin of the right hand side of (7.138) is hmin = 2(6/M)ll2
(7.139)
and the minimum of the right hand side of (7.138) is Z(5) = 2 ( M ~ 5 ) l / ~ .
(7.140)
Note that E(5) = 2 1 / 2 ~ ( 6 )where , ~ ( 6 is ) given b y (7.67).
7.3.5
Derivatives of random fields
Let us consider the multidimensional case. There are no new ideas in this case but we give a brief outline of the results for convenience of the reader. Suppose that U ( z )= s(z) n(z), z E R". Let Vs denote the gradient of s(z) and 1) s /I= maxzERPls(z)I. Assume that
+
(7.141) and max J(d%(z)e,e)l I M
xER'
where
ve E s1:= { e : e E R", e . e = I},
(7.142)
Random Fields Estimation Theory
182
Define
U ( X+ he) - U ( Z - he) , h>0. 2h Theorem 7.2 If h ( 6 ) and ~ ( 6 are ) given by (7.67) then AhU(x) :=
~ A ~ ( ~ )-uv+) (~)
- el I +),
ve E sl.
(7.143)
(7.144)
Moreover inf sup 11 T U ( x )- Vs(x) . O
TEA
II=
~(6)
(7.145)
s,n
and the infimum is attained at T = Ah(&).Here the supremum is taken over all s(x) E C2(R'), which satisfy (7.142), and all n(x),which satisfy (7.141), and the infimum is taken over the set A of all operators T : C(R') -+ C(R') linear or nonlinear. The proof of this theorem is similar to the proof of Theorem 7.1. The role of the function s l ( t ) in formula (7.76) is played by the function
M
s~(z) = - (1x12- 2 h ( 6 ) x . 0 ) 2
in Bar
(7.146)
where BJ is the ball, centered at the point h(S)B, with radius h(S), 1xI2 = (xjI2. It is clear that s1(x) vanishes at the boundary ~ B ofJ the ball, that
xizl
I(d2sl(x)e,e)l I M
ve E s1
(7.147)
and that
Is1(x)l I d .
(7.148)
Let sz(x) = -sl(x) and argue as in the proof of Theorem 7.1 in order to obtain (7.144) and (7.145).
7.4 Stable summation of orthogonal series and integrals with randomly perturbed coefficients 7.4.1 Introduction Consider an orthogonal series (7.149)
j=1
where
and
Suppose that the data {bj:=cj+Ej},
l<j
(7.152)
are given, that is the Fourier coefficients of f are known with some errors ~ j . Assume that -
~j
= 0,
3
= u2hjm,
u = const
> 0.
(7.153)
The problem is: given the data (7.152), (7.153), estimate f (x). From the point of view of systems theory, one can interpret this problem as follows. Suppose that the system’s response to the signal +j(x) is Kjc$j(x),where Kj is a generalized transmission coefficient of the system. For example, if w is a continuous analogue of j and &(x) = exp(iwx) then K(iw) is the usual transmission coefficient of the linear system. If there is a noise a t the output of the system then one actually receives C E l ( K j c j ~j)+j(x) at the output, where ~j is the noise comlponent corresponding t o the j - t h generalized harmonic c$j (x). Let us consider two methods for solving the problem. These methods are easy t o use in practice. The first method is to define
+
(7.154) and to choose N
= N(cT)so
that
The second method is t o define
(7.156)
184
Random Fields Estimation Theory
where v > 0 is a parameter, and to choose the multipliers p j ( v ) so that IIg-f112=min.
(7.157)
The same problem can be formulated for orthogonal integrals that is for continuous analogues of orthogonal series: (7.158) and
L
d(., X>d*(.,
(7.159)
X’)d. = 6(A - At).
We assume that b(X) are given, b(X) = .(A) +€(A), E * ( X ) E ( X ’ ) = a 2 6 ( A - X t ) and the problem is to estimate f(x). This can be done in the same way as for the problem for series. 7.4.2
Stable summation of series
Let us consider the first method. Assume that
where c and A are positive constants which do not depend on x and j. Then N
M
j,j’=l
j=N+1 00
5 a 2 N+ A2
j-2a j=N+1
2
+~
N-2af1
:= y(N),
u
1 2
> -.
(7.161)
Here we used (7.150), (7.152), (7.153) and (7.160). Let us find N, which y(N) = min, a and A assumed fixed. One has
(t)
(7.162)
= constA1/aa(2a-1)/a,
(7.163)
(-)2a
2a
N(a) := N, =
for
-1
ll(2a)
lla
and ym := y(N,)
185
Applications
where const depends on a but not on A and u. We have proved
If N ( u ) is given by (7.162)
Proposition 7.1
then
11 fN(o,(~) - f ( ~ 1)12 5 ~ o n ~ t A ' l ~ c ~ ( ~ " - ' ) / " .
(7.164)
Therefore formula (7.154) with N = N ( u ) gives an estimate of f(z) such that the error of this estimate goes to zero according to (7.164) as u -+ 0.
7.4.3 Method of multipliers Let us consider the second method. Take p j ( v ) := exp(-vj).
(7.165)
These are multipliers of convergence used in Abel's summation of series. Then
II
M
M
=
C1
00
- exp(-jv)121cj12
+ u2C e x p ( - 2 j u ) j=1
j=l
I A2
112
M
c 00
I1 - exp(-jv)l2 j2a
j=1
exp (- 2u)
+ u21- exp(-2u)
(7.166) '
For fixed A and u one can find urn which minimizes the right side of (7.166). If ?;n is the minimum of the right side of (7.166)' then the error estimate of the method is
3 5%.
(7.167)
One can see that yrn 4 0 as 0 4 0. 7.5
7.5.1
Resolution ability of linear systems
Introduction
Let us briefly discuss the notions of resolution ability of a linear system. In optics Rayleigh gave an intuitive definition of resolution ability: if one has two bright points [b(z- a ) b(x a)]/2 as the input signal and if the
+
+
Random Fields Estimation Theory
186
optical system is described by the transmission function h(z,y), so that the output signal is [ h ( za, ) h(z,-u)]/2 then the two points can be resolved according to the Rayleigh criterion if [h(O,a ) + h(0,-a)]/2~0.8h(O). Note that we took the signal [6(z- u ) 6(x a)]/2 rather than S(z u ) 6(z u ) in order to compare this signal with a bright point at the origin S(z). The sum of the coefficients in front of delta-functions should be therefore equal to 1, the coefficient in front of 6(z).The factor 0.8 in the Rayleigh criterion is an empirical one. The transmission function h(z,y) is defined as follows: if Sin(z) is the input signal then the output signal of the linear system is given by
+
+
+
+
+
(7.168) The domain D in optics is usually the input pupil of the system, or its input vision domain. The optical system is called isoplanatic if h ( z ,y) = h(z-y). In describing the Rayleigh criterion we assume that h(z,y) has absolute maximum at z = y, that the distance 2a between two points is small, so that both points lie in the region near origin in which h(z,y) is positive. Suppose now that (7.169) (7.170) and one observes the signal %(X)
=
s,
h(z,y)sj(y)dy
+ n(s),
(7.171)
-
where n(x)is the output Gaussian noise, ii = 0, lnI2 = g2 < 00, and j = 0 (hypothesis Ho) or j = 1 (hypothesis H I ) . T h e problem: given the observed signal Uj(x)decide whether HO o r H I occured. (7.172) If with the probability 1 one can make a correct decision no matter how small a > 0 is, then one says that the resolution ability of the system in the sense of Rayleigh is infinite. The traditional intuition (which says that, for a fixed size of the input pupil, an optical system can resolve the distances of order of magnitude of the wavelength) is based on the calculation of diffraction of a plane wave by a circular hole in a plane. In [Ramm (197Oc)l and [Ramm (1970d)l it was proved that, in the absence of noise,
Applications
187
the transmission function of a linear system can be made as close to the 6(x - y) as one wishes, by means of apodization. This means that there is a sequence hm(x,y) of the transmission functions which is a delta sequence in the sense that
for any continuous s(y). 7.5.2
Resolution ability of linear systems
In this section we apply the theory developed in Section 7.2 in order to show that there exists a linear system, or what is the same for our purposes, the transmission function h(x,y) such that the resolution ability of this system in the sense of Rayleigh is infinite. More precisely, we will formulate a decision rule for discriminating between the hypoteses Ho and HI such that the error of this rule can be made as small as possible. The error of the rule is defined to be
that is, the probability to decide that hypothesis H1 occured when in fact Ho occured. The meaning of the parameter m, the subscript of a, will be made clear shortly. This parameter is associated with the sequence of linear systems whose resolution ability increases without limit. First let us choose hm(x,y) so that the sequences
.>J,
hna(x,
=
hm(x,Y)qY
- .)dY
:= L ( x , ).
(7.174)
are delta-sequences. Then, by formula (7.171), the observed signals became &(z) =
4n(z, a ) + L ( 5 , -a) 2
+ .(x)
:= S l ( Z )
+ .(x)
(7.175)
or
&(x) = Sm(Z, 0)
+ .(x)
:= so(x)
+ .(x).
(7.176)
Let us apply to the problem (7.172) the decision rule based on formula (7.41).
Random Fields Estimation Theory
188
First one should solve the equation (7.29): R V :=
s,
R(z,y)V(y)dy =
&rl(x, a)
+26,(x,
-a)
- &n(xC,0)
(7.177)
= SI(Z) - SO(Z) := f,
where R(z,y) := n*(z)n(y) is the covariance function of noise. We assume that R(x,y) E R,and, for simplicity, that P(A) = 1. In this case R(x,y) solves the equation
We also assume that S,(x, y) is negligibly small for points 0, a, and -a are inside of D and
1 2
- yI >
q,and that
where ~ ( xr) , is the distance between point x and I? = dD. In this case one can neglect the singular boundary term of the solution to equation (7.177) and write this solution as
Let us write inequality (7.39):
where we assume that the coefficients of Q ( L ) are real. Otherwise one would write [Q(.C)f* in place of Q ( L ) f * . Let us write the expression (7.173) for a,:
where V , s1 and
SO
are given by (7.180), (7.175) and (7.176) respectively.
A pplacataom
189
It follows from (7.182) that (s; - s2;)Q(L)(sl- s0)dz =P
{ 2Re S,n(z)V*(x)dx>S,(s;Q(L)sl+ -
s2;Q(C)so)dz
1.
(7.183)
Here we took into account that
S,
o
s j * ~ ( ~ ) s= ~ d z for
i
+j
(7.184)
because of the assumption that for m sufficiently large the functions SI(x) and So(z) have practically nonintersecting supports: each of them with derivatives of order sq is negligibly small in the region where the other is not small. One can write (7.183) as
a , = P {Re
S,
n(z)V*(z)dz2 - A , 23
}
,
(7.185)
where we denote by A , the positive quantity of the type
Am =
S,
o ) Q ( w ~ ( W, ~,
~ ( z ,
(7.186)
and one can write in place of b,(z, 0) the functions b,(z, a) or bm(z,-a). The basic property of A , is
A,-t+oo
as m - t c o .
(7.187)
Indeed, the elliptic operator Q ( L ) is positive definite on H"ql2(R'), and one can assume that 6,(z,O) E fi5q/2(B,),B, := {z : z E RT,lzl5 a}. Therefore
Here c is a positive constant which does not depend on bm(x,0) (it depends on Q ( L ) only) and the integral in (7.188) tends to infinity because b,(z,O) is a delta-sequence by construction (see formula (7.174) and the line below it). Let us apply the Chebyshev inequality to get:
Random Fields Estimation Theory
190
Here we took into account that A = 0, and D[n]stands for the variance of random quantity n. One has
=s,
3
f(~)&(L).f* ( Y ) ~ Y= S A m .
(7.190)
Here we used definition (7.177) of f , formula (7.186) and equalities of the type (7.184) for the functions hm(x,0),Sm(x,u) and S m ( Z , -u). From (7.189) and (7.190) it follows that
If X is an arbitrary random variable then
From (7.185), (7.187), (7.191) and (7.192) it follows that (7.193) Let us compute the probability to take the decision that the hypothesis HO occurred while in fact H I occurred. We have ,Om
:= P(y0
=
I H I ) = P {2Re L U 1 V * d x 5 L ( s o V * + s;V)dx nV*dx I L ( s o V * - slV*)dx
P {2Re
= P {Re = P
(Re
S, S,
nV*dz I
nV*dx 5
where we used formula (7.190).
-;
f Q ( L ) f* d x } (7.194)
Applications
If R e x < - A then
191
1x1 > A. Therefore (7.195)
P(lX1 2 A ) 2 P { R e X I -A}. From (7.191), (7.194) and (7.195) one concludes that
'm'p{l~
1
hV*dx > - A ,
3 16 <---} 29A,
-
8
3A,
1 0 as m t m . (7.196)
We have proved the following
Theorem 7.3 The problem (7.172) can be solved by the decision rule from Section 7.2 and a, --+ 0, ,Om t 0, where a, is defined an (7.182) and ,Om is defined in (7.194). 7.5.3
Optimization of resolution ability
In this section we give a theory of optimization of resolution ability of linear optical instruments. Let us consider the resolution ability for the problem of discriminating between two arbitrary signals sl(x) and sg(z) which are deterministic functions of x E RT.The observed signal in a bounded domain D c R ' is
U ( Z )= sj(x) + n ( x ) , j
=O
or j = 1,
(7.197)
n(x) is Gaussian noise,
n = 0,
q1 .
= 2,
n*(x)n(y)= R ( x - y).
(7.198)
Assume that the linear optical instrument is isoplanatic. This means that its transmission function, h(x,y), is h(x - y). In optics T = 2 and D is the input pupil, or entrance pupil, of the instrument. Consider the case of incoherent signals. For incoherent signals the transmission power function is Ih(x-y)l2. This means that if Ijn(x) := ls(x)I2is the intensity of the signal s(x) in the object plane then in the image plane one has the distribution of the intensity Io,t(x) given by
(7.199) The bar denotes averaging in phases. This will be explained soon. Let us
Random Fields Estimation Theory
192
briefly derive (7.199). One has
(7.200) Let us now explain the meaning of the average. For incoherent signals by definition we have sin(y) =
c
(7.201)
Aj(y)ei4j(Y),
j
where Aj(y)ei4j are sources which form the signal sin(y) at the point y, and 4j are their phases which assumed random, uniformly distributed in the interval [-T, T ] and statistically independent, so that = 6jjt6(y $j = 0. Under these assumptions one has sin(y)sTn(y’) = C A ~ ( Y ) A T ~ ( Y{i[$j(~) ’ ~ X P - ~~’(Y’)II j,j
=
C Aj (Y)Aj*f( Y Y j j 4 Y - Y‘) = c IAj(3)12% j,j’
- 3’)
j
= Iin(Y)G(y - I/‘)*
(7.202)
Here we took into account that
(7.203) From (7.202) and (7.200) one obtains (7.199). Denote
H ( z ) := lh(z)I22 0,
(7.204)
where h(z)is the transmission function for an isoplanatic linear instrument. Then
(7.205) for incoherent signals. Let us assume that
s
H 2 ( 2 ) d z := E
< 00.
(7.206)
One often assumes in applications that
B(X) is negligible outside A,
(7.207)
Applications
193
where A c R2 is a finite region. The assumption (7.178) means that the instrument filters out the spatial frequencies which do not belong to A. Let us finally assume that the noise is a real-valued function which is a perturbation of the observed intensity in the image plane. One can think of n ( z ) ,for example, as of the noise of the receiver of the intensity.
The problem is: given the observed signal
U ( z ) = I j ( z )+ n ( z ) , j = 1 or j = 0 , Ij(z)=
s
H ( x - y)sj(y)dy, s ~ ( x:= ) IZnj(z)
(7.208)
decide whether it is of the form (7.179) with j = 1 (hypothesis H I ) or with j = 0 (hypothesis Ho). Applying the decision rule from Section 7.2 and taking into account that the signals are real valued, one solves the equation (7.29)
s,
R ( z ,y ) V ( y ) d y = II(~) - lo(z):= I(z)
(7.209)
and then checks inequality (7.39) with real-valued signals: (7.210)
If (7.210) holds then the decision is that hypothesis H I occurred. Otherwise one concludes that hypothesis HO occurred. The error of the first kind of this rule is the probability to decide that H1 occurred when in fact HO occurred:
(7.211) where I(z)is given by (7.209). Note that (7.212)
Random Fields Estimation Theory
194
We assume here that R(z,y) and V ( z ) are real-valued. Since JD n(z)V(z)dzis Gaussian (because n ( z ) is) one concludes from (7.211), (7.212) and (7.213) that 1
exp
=-
ff
d f i
d2/2
(-$) dt
= erf(d/2),
(7.214)
where erf(z) is defined in (7.61) and we used the equation
d2 =
s,
I(z)V(z)dz
(7.215)
which can be easily checked:
because of (7.209). Therefore for those H ( z ) for which d = m a .
a = min
(7.216)
For the error of the second kind which is the probability to decide that HO occurred while in fact H I occurred one has
-
1
/
fi
--dP
exp
-m
Therefore both a and optimization problem
d2 :=
(-f)
dt = erf(d/2) = a.
(7.217)
P = a will be minimized if H ( z ) solves the following
// D
D
R ( z,y ) V ( y ) V ( z ) d y d z = max
(7.218)
Applications
195
subject to the conditions
R(z,Y>V(Y)dY= Il(.> - Io(z) := I @ )
(7.219)
and I(2)=
J
H ( z - y)s(z)d2,
s(2) := Sl(2) - so(2).
(7.220)
Let us assume for simplicity that D is the whole plane:
D =R ~ .
(7.221)
Then, taking the Fourier transform of (7.219) and (7.220) yields
il(X)V(X)= I ( X )
= H(X)S(X).
(7.222)
Here the last assumption (7.198) was used. Thus
V(X) = R(X)qA)R-yX).
(7.223)
Write (7.218) as d2 =
s,.
1 I ( Z ) V ( Z ) d 2= -
k111?(X)121Z12dX = max
(7.224)
and (7.206) as (7.225)
The instrument with the power transmission function H ( x ) which maximizes functional (7.224) under the restriction (7.225) will have the maximum resolution power for the problem of discriminating two given signals s1(z) and SO(Z). should be parallel to the The solution to (7.224) is easy to find: vector P11s12,so
1R12= R-'lsl2 . const, where the constant is uniquely determined by condition (7.225):
Formula (7.227) determines uniquely IGJ:= A(X), so that
(7.226)
Random Fields Estimation Theory
196
where $(A) is the unknown phase of f i ( X ) :
s,,
H ( z )exp(iX. z)dz = A(X) exp[iq!(X)],
H ( z ) 2 0.
(7.229)
Formula (7.224) shows that the resolution power of the optimal instrument depends on A(X) = Ifi(X)l only. The phase $(A) does not influence the resolution ability but influences the size of the region D out of which H ( z ) is negligibly small. If one writes the equation
H ( z )exp(iX. z)dz = A(X) exp[i$(X)], H ( z ) 1 0
(7.230)
and consider it as an equation for H ( z ) and $(A), given A(X) and argH(z) = 0, then one has a phase retrieval problem. In (7.230) D C R2is assumed to be a finite region with a smooth boundary. This problem has been studied in [Kl], where some uniqueness theorems are established. However, the numerical solution to this problem has not been studied sufficiently. The condition (7.206) does not seem to have physical meaning. One could assume that
1
H(z)dx = E.
(7.231)
In this case the const in (7.226) cannot be found explicitly, in contrast to the case when condition (7.206) is assumed. If (7.231) is assumed then it follows from (7.176) that
L"&) L yER* ImXJ ( In 7.5.4
)E.
(7.232)
A general definition of resolution ability
In this section a general definition of resolution ability is suggested. The classical Rayleigh definition deals with very special signals: two bright points versus one bright point. Suppose that the set M of signals, which one wishes to resolve is rather large. For example, one can assume that M consists of all functions belonging to L 2 ( D )or to Ci(D), the space of functions which have one continuous derivative in a bounded domain D c R' and are compactly supported in D.Suppose that the linear system is described by its transmission function (7.233)
Applications
197
Assume that actually one observes the signal
U ( z )= L f + n ( z ) , E = 0, D[n]= 0 2 ,
(7.234)
where n(z) is noise. Let
B,U
(7.235)
:= fu
denote a mapping which recovers f given U ( X ) . Let us assume that the operator L-l exists but is unbounded. Then in the absence of noise one can recover f exactly by the formula f = L-lU, but L-l cannot serve as B, in (7.235): first, because in the presence of noise U may not belong to the domain of L - l , secondly, because if L-lU is well defined in the presence of noise it may give a very poor estimate of f due to the fact that L-’ is unbounded. Let us define the resolution ability of the procedure B, on the class M of input signals as a maximal number T E [0,1] for which (7.236) This definition takes into account the class M of the input signals, the procedure B for estimating f , and the properties of the system (see the definition (7.234) of U ) . Therefore all the essential data are incorporated in the definition. The definition makes sense for nonlinear injective mappings L as well. Roughly speaking the idea of this definition is as follows. If the error in the input signal is O ( a ) and the identification procedure B, produces fu = BU such that, in some suitable norm, 11 f, - f [I= O ( 8 ) as a 0, then the large T is, the better the resolution ability of the procedure B, is. --f
Example 7.5
Assume that the transmission function is
h(a:- Y> =
(;)
1’2
sin(a: - y)
(7.237)
X-Y
Note that (7.238) Let (7.239)
Random Fields Estimation Theory
198
where .(a) > 0 for u > 0 will be chosen later. Let M be the set of functions f (x)E L2(-oo, 00) such that
II f lILq-l,1]5
f(4=0
m,
for
1x1 > 1,
(7.240)
where m > 0 is a constant. Assume that n(z)is an arbitrary function such that
By Parseval’s equality, one has
(7.242) Choose
.(a) = U112.
(7.243)
Then (7.242) and (7.243) yield
I1 B,U
-
f1 1 12 0 + m)g.
(7.244)
Therefore T = 1 for the procedure B, defined by formula (7.239): lim sup
u-4 j € M
BuU-f U
11 5 2 ( 1 + m ) < 00.
(7.245)
7.6 Ill-posed problems and estimation theory
7.6.1
Introduction
In this section we define the notion of ill-posed problem and give examples of ill-posed problems. Many problems of practical interest can be reduced to solving an operator equation
AIL=f,
(7.246)
Applications
199
where u E U , f E F , A : U --+ F , U and F are Banach spaces, A is an injective mapping with discontinuous inverse and domain D ( A ) c U . Such problems are called ill-posed because small perturbations of f may lead to large perturbations of the solution u due to the fact that A-' is not continuous. In many cases R ( A ) , the range of A , is not the whole F . In this case small perturbations o f f may lead to an equation which does not have a solution due to the fact that the perturbed f does not belong to
R(A)* The problem (7.246) is said to be well-posed in the sense of Hadamard if A : D ( A ) --t F is injective, surjective, that is R ( A ) = F , and A-l : F --t U is continuous. The formulation of an ill-posed problem (7.246) can often be given as follows. Assume that the data are (6, A , fa}, where 6 > 0 is a given number, fa is a &approximation of f in the sense
II f - f a I15 6.
(7.247)
The problem is: given the data (6, A , fa}, find u6 E U such that ~ ~ U ~ - - U I I + O as 6+0.
(7.248)
We use 11 . 11 f o r norms in U and F . A more general formulation of the ill-posed problem is the one when the data are { 6,7,A , , fa} where S and fa are as above, 7 > 0 is a positive number and A, is an approximation of A in a suitable sense, for example, if A is a linear bounded operator one can assume 11 A, - A 1 1 1 q. We will not discuss this more general formulation here. (See [I], for example.) Ill-posed problems can be formulated as estimation problems. For example, suppose that A is a linear operator, u solves equation (7.246), and one knows a randomly perturbed right hand side of (7.246), namely
f +n,
(7.249)
n = 0, n*(z)n(y)= a26(z- 9).
(7.250)
where n is a random variable,
The problem is to estimate u given the data (7.249), (7.250). Let us give a few examples of ill-posed problems of interest in applications. Example 7.6
Numerical diffemntiation.
Random Fields Estimation Theory
200
Let
Au :=
lz
u ( t ) d t = f (z).
(7.251)
Assume that 6 > 0 and fg are given such that 11 fg - f 115 6. Here 11 f I/= max,<,St, If(z)1, F = U = C([a,b]).The problem (7.251) is ill-posed. Indeed, the linear operator A is injective: if Au = 0 then u = 0. Its range consists of the functions f E C1[a,b] such that f ( a ) = 0. Therefore equation (7.251) with fg in place of f has no solutions in C[a,b]if fg 9 C1[a,b]or fg(0) # 0. If one takes f g = f +6 sin[w(z-a)], then the solution to equation (7.251) with fa in place o f f exists: ug = f’+6wcos[w(z-a)]. Since u = f’ one has 11 u g - u )I= 6w >> 1 if w is sufficiently large. Therefore, a small in the norm of U perturbation of f resulted in a large in the same norm perturbation of u. Therefore the formula ug = A-lfa = fi in this example does not satisfy condition (7.248). In Section 7.3 a stable solution to the problem (7.251) is given: (7.252) It is proved that
where h(6) and ~ ( 6 are ) given by formulas (7.67) (see Theorem 7.1). Formula (7.252) should be modified for z < a h ( 6 ) and z > b - h ( 6 ) (see (7.107)).
+
Remark 7.2 The notion of ill-posedness depends o n the topology of F . For example, i f one considers as F the space C1[a,b] of functions, which satisfy condition A(0) = 0, then problem (7.251) is well posed and A is an isomorphism of C[a,b]onto F .
Example 7.7 Stable summation of orthogonal series with perturbed coeficients. Let 00
u(z) =
C C ~ ~ $E(DZc)R‘., E
(7.254)
j=1
Suppose ( + j , +m) = Sj, is a basis of L 2 ( D ) ,the parentheses denote the inner product in L 2 ( D ) .Assume that u E L 2 ( D ) .This happens if and only
Applications
201
if (7.255) j=l
Suppose the perturbed coefficients are given cjg = cj
+
Ej,
l€jj(56.
(7.256)
The problem is: given the data cjg and b > 0, find ug such that (7.248) holds, the norm in (7.248) being L 2 ( D ) norm. Consider the map A : L 2 ( D )3 C" which sends a function u(x)E L 2 ( D ) into a sequence c = { c j } = (c1, .. . ,c j , . . .) E C" by formula (7.254). Since, in general, the perturbed sequence cg = { c j g } $! C2, the series cj&(z) diverges in L 2 ( D ) ,so that the perturbed sequence {cjg} may not belong to the range of A. It is easy to give examples when {cjg} E R ( A ) but the cj&((z) differs from u in L 2 ( D )norm as much as one function ug := wishes no matter how small 6 > 0 is.
CE,
cc,
Exercise. Construct such an example: Therefore A-' is unbounded from Cw into L 2 ( D ) ,and the problem is ill-posed. Note that if one changes the topology on the set of sequences {cj} from C" to C2 then the problem becomes well-posed and the operator A : L 2 ( D )3 C2 is an isomorphsm. A stable solution to the problem is given in Section 7.4.
Example 7.8 Integral equation of the first kind. Let A be an integral operator A : H --+ H , H = L 2 ( D ) ,
Au =
J, A ( z ,Y ) U ( Y ) d Y = f(z),
2
E D.
(7.257)
If A is injective, that is, Au = 0 implies u = 0, and A is compact, then A-' is not continuous in H , R ( A ) is not closed in H , so that small in H perturbations of f may result in large perturbations of u,or may lead to an equation which has no solutions in H (this happens if the perturbed f does not belong to R ( A ) ) .Therefore, the problem (7.257) is ill-posed. Example 7.9 Computation of values of unbounded operators. Suppose B : F -+ U is an unbounded linear operator densely defined in F (that is, its domain of definition D ( B ) is dense in F ) . Suppose f E D ( B ) ,
Random Fields Estimation Theory
202
B f = u. Assume that instead of f we are given a number 6 such that 11 f - fg 1 15 6. The problem is: given fa and B compute f ll+ 0 as 6 --+ 0.
ug
such that
11
> 0 and ug - u
fg
11=11
ug -B
This problem is ill-posed. If A-' is unbounded and B = A-', the problem (7.257) reduces to the above problem.
Example 7.10 Analytic continuation. Suppose f (2) is analytic in a bounded domain D of the complex plane and continuous in D.Assume that D1 c D is a strictly inner subdomain of D.
The problem is: given f(z) in D1 find f(z) in D. By Cauchy's formula one has (7.258) This is an integral equation of the first kind for the unknown function f ( t ) on aD. If f ( t )is found then f ( z ) in D is determined by Cauchy's formula. Therefore the problem of analytic continuation from the subdomain D1 to the domain D is ill-posed.
Example 7.11 Identification problems. Let sin(z)be the input signal and sout(z) be the output signal of a linear system with the transmission function h(z,y ) , that is (7.259) where D and A are bounded domains in R', and the function h in continuous in D x A.
The identificationproblem is: given s,,t(z)
and h ( z ,y ) find si,(y).
Equation (7.259) is an integral equation of the first kind. Therefore the above problem is ill-posed.
203
Applications
Example 7.12 Many inverse problems arising in physics are ill-posed. For example, the inverse scattering problem in three dimensions. The problem consists of finding the potential from the given scattering amplitude (see [Ramm (2005)l and Chapter 6). Inverse problems of geophysics are often ill-posed [Ramm (ZOOS)]. Example 7.13 Ill-posed problems in linear algebra. Consider equation (7.246) with U = R" and F = R", R" is the mdimensional Euclidean space. Let N ( A ) = {u : Au = 0) be the null-space of A , and R ( A ) be the range of A. If N ( A ) # (0) define the normal solution uo to (7.246) as the solution orthogonal to N ( A ) :
AUO= f, uo IN ( A ) .
(7.260)
This solution is unique: if 60 is another solution to (7.260) then
A(UO-60) = 0, uo - Go IN ( A ) . This implies that uo = 60. One can prove that the normal solution can be defined as the solution to (7.246) with the property that its norm is minimal:
where minimum is taken over the set of all solutions to equation (7.246), and the minimizer is unique: u = uo. Indeed, any element u E R" can be uniquely represented as:
(7.263)
If Au = f then Auo = f, and (7.263) implies (7.261). Moreover, the minimum is 11 '110 and is attained if and only if u1 = 0. The normal solution to the equation Au = f can be defined as the least squares solution:
I]
11 Au - f II=
min,
u IN ( A )
(7.264)
in the case when f @ R ( A ) . This solution exists and is unique. Existence follows from the fact that minimum in (7.264) is attained at the element u such that 11 Au - f II= dist(f, R ( A ) ) (note that R ( A ) is a closed subspace of R"). Uniqueness of the normal solution to (7.264) is proved as above.
Random Fields Estimation Theory
204
Lemma 7.5 T h e problem of finding normal solution t o (7.246) is a wellposed problem in the sense of Hadamard: for any f E R" there exists and is unique the normal solution uo t o equation (7.246), and this solution depends continuously o n f : i f
Auo
=
f,
I1 f - f6
AU06 = fa,
115 6
(7.265)
then
11 uo -u6
f = f l CE f 2 , f1
(7.266)
Existence and uniqueness are proved above. Let us prove (7.266).
Proof. Let
where
as 6 4 0.
II+O
f6 = f 6 1 a3 f 6 2 ,
E R(A),f2 IR(A),fa1 and
fa2
(7.267)
are defined similarly. One has
Auo = f l AU06 = f 6 1 .
(7.268)
The operator A : N(A)* + R(A) is an isomorphism. Therefore A-' : R(A) + N(A)* is continuous. Lemma 7.5 is proved. 0
Definition 7.2 The mapping A+ : f + uo is called pseudoinverse of A. The normal solution uo is called sometimes pseudosolution. Although we have proved that the problem of finding normal solution to equation (7.246) is well-posed in the sense of Hadamard when A : R" ---f R", we wish to demonstrate that practically this problem should be considered as ill-posed in many cases of interest. R", N ( A ) = {0}, so that A is As an example, take n = m, A : R" injective and, by F'redholm's alternative, A is an isomorphism of Rn onto R". Consider the equations ---f
AU = f, A u ~ = f6.
(7.269)
Thus
A(u - U J ) = f - f6,
u - U S = A-'(f - f6).
Therefore (7.270) Since (7.271)
Applications
205
one obtains (7.272) Define v ( A ) ,the condition number of A , to be (7.273) Then (7.272) shows that the relative error of the solution can be large even if the relative error 11 fs - f (1 / (1 f (1 of the data is small provided that v ( A ) is large. Note that the inequality (7.272) is sharp in the sense that the equality sign can be attained for some u,u g , f and fs. The point is that if v ( A ) is very large, then the problem of solving equation (7.246) with A : R" + R", with N ( A ) = {0}, is practically illposed in the sense that small relative perturbations of f may lead to large relative perturbations of u. 7.6.2
Stable solution of ill-posed problems
In this section we sketch some methods for stable solution of ill-posed problems, that is, for finding u g which satisfies (7.248). First let us prove the following known lemma. By a compactum M c U we mean a closed set such that any infinite sequence of its elements contains a convergent subsequence. Lemma 7.6 Let M c U be a compactum. Assume that A : M + N i s closed and injective, N := A M . Then A-l : N 4 M is continuous.
Remark 7.3 W e assume throughout that U and F are Banach spaces but often the results and proofs are valid f o r more general topological spaces. These details are not of prime interest for the theory developed in this work, and we do not give the results in their most general form for this reason.
Proof. Let Au, = f,, f,
EN.
Assume that
I I f n - f Il+O
n+m.
(7.274)
We wish to prove that f E N , that is, there exists a u E M such that Au = f , and
(1 u n - 21 I[+
0.
(7.275)
Random Fields Estimation Theory
206
Since u n E M and M is a compactum, there exists a convergent subsequence, which is denoted again un, with limit u,11 un - u II+ 0. Since M is a compactum, it is closed. Therefore u E M . Since u, + u,Au, + f, and A is closed, one concludes that Au = f . Lemma 7.6 is proved. 0 This lemma shows that if one assumes a priori that the set of solutions to equation (7.248) belongs to a compactum M then the operator A-l (which exists on R ( A ) since we assume A to be injective) is continuous on the set N := AM. Therefore an ill-posed problem which is considered under the condition f E N becomes conditionally well posed. This leads to the following definition.
Definition 7.3 A quasisolution of equation (7.246) on a compactum M is the solution to the problem
1) Au - f Here A : U
+F
))=min,
(7.276)
uE M.
is a linear bounded operator.
The functional ~ ( u:=I/ ) Au- f 11 is continuous and therefore it attains its minimum on the compactum M . Thus, a quasisolution exists. In order to prove its uniqueness and continuous dependence on f , one needs additional assumptions. For example, one can prove
Theorem 7.4 If A is linear, bounded, and injective, M is a convex compactum and F is strictly convex then, for any f E F , the quasisolution exists, is unique, and depends on f continuously. The proof of Theorem 7.4 requires some preparation. Recall that F is 11 v )I implies that called strictly convex if and only if 11 u v 11=(1 u 11 w = Xu for some constant A.
+
+
Exercise. Prove that X has to be positive. The spaces LP(D), lP,Hilbert spaces are strictly convex, while L 1 ( D ) ,
C(D)and L' are not strictly convex. Definition 7.4 If g E U is a vector and M c U is a set, then an element h E M is called the metric projection of g onto M if and only if 11 g - h II= infuEM 11 g - u 11. The mapping P : g + h is called the metric projection mapping, P g = h, or PMg = h. In general, P g is a set of elements. Therefore the following lemma is of interest.
207
Applications
Lemma 7.7 If U is strictly convex and M is convex then the metric projection mapping onto M is single valued.
Proof.
Suppose hl
# h2, hj
Since M is convex, (hl
E Pg, j = 1,2. Then
+ h2)/2 E M . Thus
Therefore, since U is strictly convex, one concludes that g - hl = X ( g - hz), X is a real constant. Since 11 g - hl 11=11 g - h2 11, it follows that X = fl. If X = 1 then hl = h2, contrary to the assumption. If X = -1, then 9 = (hl
+ h2)/2.
(7.278)
Since M is convex, equation (7.278) implies that g E M . This is a contradiction since g E M implies Pg = g . Lemma 7.7 is proved. 0 Lemma 7.8 If U is strictly convex and M is convex then PM : U is continuous.
--f
M
Proof. Suppose 11 gn - 9 I/-+ 0 but 11 hn - h 112 E > 0, where hn = Pgn, h = Pg. Since M is a compactum, one can assume that hn -+ h,, h, E M . Thus 11 h, - h (12E > 0. One has
and
From (7.279) and (7.280) one obtains 11 g-h, 11=11 g-h 11. This implies h, = h as in the proof of Lemma 7.7. This contradiction proves Lemma 7.8.
0
208
Random Fields Estimation Theory
Exercise. Prove that dist(g,M) := inf,,M function of g.
11
g -u
(1
is a continuous
We are ready to prove Theorem 7.4.
Proof of Theorem 7.4 Existence of the solution to (7.276) is already proved. Since M is convex and A is linear the set AM := N is convex. Since N is convex and F is strictly convex, Lemma 7.7 says that PN f exists and is unique, while Lemma 7.8 says that PN f depends on f continuously. Let Au = PN f. Since A is injective u = A-lPN f is uniquely defined and, by Lemma 7.6, depends continuously on f. Theorem 7.4 is proved. 0 It follows from Theorem 7.4 that if M c U is a convex compactum which contains the solution u to equation (7.246), if A is an injective linear bounded operator, and F is strictly convex, then the function Ug = A-~PAMfg
(7.282)
satisfies (7.248). The function ug can be found as the unique solution to optimization problem (7.276) with fg in place of f. One could assume A closed, rather than bounded, in Theorem 7.4. Uniqueness of the quasi solution is not very important for practical purposes. If there is a set { u g } of the solutions to (7.276) with fg in place of f, if A is injective and Auo = fo, andif11 fg-fo 1116,thenIIug-uo II-+Oas6-+Oforanyoftheelementsof the set {ug}. Indeed 11 Aug - fg 11<11 Auo - fg 11=11 fo - fg 1 1 16. Therefore 11 Aug - Auo 11<11 Aug - fg 11 11 fg - fo 116 26. Since M is compact and ug, uo E M the inequality 11 Aug - Auo 1 11 26 implies 11 ug - uo ll-+ 0 as 6 + 0 (see Lemma 7.6). We have finished the description of the first method, the method of quasisolutions, for finding a stable solution to problem (7.246). How does one choose M? The choice of M is made in accordance to the a priori knowledge about the solution to (7.246). For instance, in Example 7.6 one can take as M the set of functions u,which satisfy the condition
+
where Mj are constants, j = 1,2. Then
Au=lxudt= f
Applications
209
and
If”l 5 M z , If’(.)/
5 Ml, f(.)
= 0.
(7.284)
Inequality (7.283) defines a convex compactum M in L2[a,b]and (7.284) defines N = AM in L2[a,b]. Theorem 7.4 is applicable (since L2[a,b] is strictly convex) and guarantees that 11 u - u6 I I L Z [ ~ , ~ ] T ~ 0. A stable approximation of u = f’(z)in C[a,b] norm is given in Theorem 7.1. Let us now turn to the second method for constructing u g which satisfies equation (7.248). This will be a variational method which is also known as a regularization method. While in the first method one needs to solve the variational problem (7.276) with the restriction u E M , in the second method one has to solve a Variational problem without restrictions. Consider the functional F ( u ) :=I1 Au - f6
[I2
+?””(u),
11 f6 - f
115 6,
(7.285)
where 0 < y = const is a parameter, A is a linear bounded operator and ‘(u)is a positive strictly convex densely defined functional which defines a norm:
‘(ul -k u z ) < ‘(ul) ‘ ( u 2 ) if u1 # X u 2 , X = const. (7.287) 2 2 We also assume that the set of u E U, which satisfy the inequality +
4(u) I c,
(7.288)
is compact in U . In other words the closure of the set Dom4(u) in the norm
4(u)is a Banach space U+ c U which is dense in U , and the imbedding operator a : U+ -+U is compact. One often takes 4(.> =II Lu 11,
(7.289)
where L : U -+ U is a linear densely defined boundedly invertible operator with compact inverse. An operator is called boundedly invertible if its inverse is a bounded operator defined on all of U . Let us assume that U is reflexive so that from any bounded set of U one can select a weakly convergent in U subsequence. We will need a few concepts of nonlinear analysis in order to study the minimization problem F ( u ) = min.
R a n d o m Fields Estimation Theory
210
Definition 7.5 A functional F : U 4 R1 is called convex if D ( F ) := DomF is a linear set and for all u , v E D(F) one has
F
(Xu
+ (1 - X)v) 5 XF(u) + (1- X ) F ( V ) ,
Definition 7.6 from below if
0 5 X 5 1.
A functional F ( u ) is called weakly lower semicontinuous
u, 2 u + liminf F(u,) 2 F ( u ) ,
(7.291)
n+cc
where
3
(7.290)
denotes weak convergence in U .
Lemma 7.9 A weakly lower semicontinuous from below, functional F ( u ) , in a reflexive Banach space U is bounded f r o m below on any bounded weakly closed set M C DomF and attains its minimum on M at a point of M . Note that the set is weakly closed if
Un
E
M and U n 3 u implies u E M .
Proof of Lemma 7.9 Let -00
5d
:= inf F ( u ) , UEM
F(un)
-+
d,
Un E
M.
(7.292)
-
Since M is bounded and U is reflexive, there exists a weakly convergent subsequence of un which we denote un again, u, u. Since M is weakly closed one concludes that u E M . Since F is weakly lower semicontinuous from below, one concludes that
d 5 F ( u ) 5 liminf F ( u n ) = d. n+oo
Therefore d
(7.293)
> -00, and F ( u ) = d. Lemma 7.9 is proved.
Lemma 7.10 A weakly lower semicontinuous from below functional F(u) in a reflexive Banach space attains its minimum on every bounded, closed, and convex set M .
Proof. Any such set M in a reflexive Banach space is weakly closed. 0 Thus Lemma 7.10 follows from Lemma 7.9. Exercise. Prove the following lemmas. Lemma 7.11 If F ( u ) is weakly lower semicontinuous from below functional an a reflexive Banach space such that
F ( u ) + +oo
as
)I u
1 1 4 00
then F ( u ) attains its minimum on any closed convex set M C U.
(7.294)
Applications
211
Lemma 7.12 A weakly lower semicontinuous f r o m below functional in a reflexive Banach space attains its minimum o n every compactum. Lemma 7.13 A convex Gateux differentiable functional F ( u ) is weakly lower semicontinuous from below in a reflexive Banach space. Definition 7.7 only if
A functional F ( u ) is Gateux differentiable in U if and lim t - ’ [ F ( z
t++O
+ th) - F ( z ) ]= Ah
(7.295)
for any x , h E U , where A : U -+R1is a linear bounded functional on U .
Proof of Lemma 7.13 If un
-
u then convexity of F ( u ) implies
+
F(u)I F(un) F’(u)(u - U n ) -
(7.296)
Pass to the limit infimum in (7.296) to get
F ( u ) 5 liminfF(u,).
(7.297)
n+cc
Lemma 7.13 is proved.
CI
Exercise. Prove that if F ( u ) is Gateux differentiable and convex in the sense (7.290) then
F ( u ) - F ( v ) 5 F‘(u)(u - v) Vu,v
E DomF.
(7.298)
In fact, if F ( u ) is Gateux differentiable then (7.290) @ (7.298) @ (F’(u)- F’(v),u - v) 2 0.
(7.299)
The last inequality means that F’(u) is monotone. The parentheses in (7.299) denote the value of the linear functional F’(u) - F’(v) E U* at the element u - v E U . By U* the space of linear bounded functionals on U is denoted. We are now ready to prove the following theorem.
Theorem 7.5
Assume that A is a linear bounded injective operator de4(u) is a strictly convex weakly lower semicontinuous from below functional such that the set (7.288) is compact in U and u f := A-I f E D ( 4 ) . Then the minimization problem
fined o n the reflexive Banach space U,and
F ( u ) = min,
(7.300)
Random Fields Estimation Theory
212
where F ( u ) is defined an (7.285), has the unique solution ' 1 ~ 6f ,o~r any y > 0, and if one chooses y = y(6) so that y(6) -+ 0,
b2r-l(6) I m < 00
where m = const > 0, then u g
:= ~ g , ~ ( gsatisfies )
Proof. The functional F ( u ) 2 0. Let 0 be the minimizing sequence
as
6
+ 0,
(7.301)
(7.248).
I d := infuCuF(u), and let un (7.302)
F ( u n ) + d. Then
d 5 F(u,) 5 d
+ E I d2 + ~ $ ~ ( u fuf ) ,:= A - l f ,
Vn >n(E),
(7.303)
where E > 0 is a sufficiently small number, see (7.317) below. From (7.301) and (7.303) one concludes that (7.304)
so that $ 2 ( ~ nI ) c, c := m + $ 2 ( u f )Therefore . one can choose a convergent subsequence from the sequence u,. This subsequence is denoted also u,: u,-uo,u~ U
(7.305)
= 2106 := u g .
Since A is continuous one has
II Aun - f 11-+11 The lower weak semicontinuity of
$(ti)
A ~ -of
II
(7.306)
*
and (7.305) imply
liminf $(u,) 2 $(uo). n+x
(7.307)
Thus
d
I F(u0) I liminf F(u,) n+w
= d.
(7.308)
Therefore the solution to (7.300) exists and the limit (7.305) is a solution. Suppose v is another solution: d = F ( w )= F(u0).
(7.309)
Then, since F ( u ) is convex, one has
d
I F (Xu0 + (1 - A)v) 5 AF(u0)+ (1- A ) F ( v ) = d,
VA E [0, 11. (7.310)
213
Applications
Therefore (7.311) This implies that
It Auo - fs II
+
II Av - fs II
2
+
= l luoA v7 - f / /
2
(7.312)
and (7.313) From (7.287) and (7.312) one concludes that v = mo, c = const. Since $(cu) = Ic14(u), equation (7.313) implies that c 2 0. From (7.310) it follows that
F (Am0
+ (1 - A)uo)
=d
VX E [0,1].
(7.314)
Let p := X(c - 1). If c # 1, then for all real sufficiently small p one obtains from (7.314) that
II Auo - fs 112
+Yd2(U0)
=II
Auo - fs
+ pAuo /I2 + Y ( l +
p)2$2(uo).
Thus 0 = p2
II Auo [I2
+2pRe(Au0 - f s , A u ~+) 2 7 ~ 4 ~ ( u+0Y) ~
' ~ ~ ( U O(7.315) ) .
Since (7.315) is a quadratic equation it cannot be satisfied for all small p , since its coefficients are not all zeros. Therefore c = 1, and uniqueness of the solution to (7.300) is established. Let us prove the last statement of Theorem 7.5. Assume that (7.301) holds. Let Au = f u = A-lf := uf.One has (7.316)
11 Auo - fs 112
+ Y ~ ~ ( uI o 62 )
+ Y $ ~ ( u ) ,u = A-lf.
(7.317)
Thus, by (7.301), Cp(u0)
I &(u) + m := c
(7.318)
and, using (7.301) again, one obtains
11 AUo - fs
y (y-ld2
+ $2(u))5 cy
-+
0 as 6 -+ 0.
(7.319)
214
Random Fields Estimation Theory
Similarly, for sufficiently small 6, one has
I1 Aua - fa 112<
q ( 6 ) -+ 0 as 6 -+ 0.
(7.321)
From (7.321) it follows that
(1 Au6 - Au (1 5 11 Au6 - f 6 (1 + (1 f 6 - (1 5 11 A U g - f6 11 +6 5 C1/2y1/2(6)+ 6
-+
0
as 6 --+ 0.
(7.322)
Let M := {w : w E U,42(w)< c ] . Then M is a compactum, ug E M , u E M by (7.320) and (7.318). Therefore, (7.322) and Lemma 7.6 imply (7.248). 0 Theorem 7.5 is proved.
Remark 7.4 If one finds for y = $6) not the minimizer u g = itself but an approrimation to it, say w6, such that F ( w 6 ) 5 F ( u ) , then, as above, (7.323)
Thus
11 At16 - AU 115 ~ l / ~ y ” ~+( 6 ) 0. (7.324) (7.323) and (7.324) it follows that 11 w6 -u I[--+ 0 as 6 -+ 0. -+
A s above, from Therefore one can use an approximate solution t o the minimization problem (7.300) as long as the inequality F(w6) 5 F ( u ) holds. Remark 7.5 One can assume in Theorem 7.5 that A is not bounded but closed linear operator, D ( A ) 3 D ( 4 ) and the set (7.288) is compact in the space G A which is D ( A ) equipped with the graph n o m 11 u I l ~ : = l l u 11 11 Au 11. One can also assume that 4(u) is a conwex lower weakly semicontinuous functional, the set +(u)< c is not necessarily compact. The change in the proof of Theorem 7.5 under this assumption is as follows. From (7.304) it follows that u, uo an U. It follows [Ru, p . 65, Theorem 3.131 that there exists a convex combination 6, of u, such that 6, --+ UO. The sequence ii, is minimizing if un is. Therefore equations (7.306)- (7.308) hold with G , in place of u,. The proof of the uniqueness of the solution to u as 6 + 0 and (7.900) is the same as above. One can prove that 216 u as 6 .-+ 0. that there is a conwex combination 6 6 of ug such that 66 However, there is no algorithm to compute 6 s given ua.
+
-
-
-+
Applications
215
If R,,a : F --+ U is the mapping which sends f a into u , ~ ,the solution to (7.300), then (7.325)
Rsfs := R,(s),afs = us satisfies (7.248), that is
11 Raf6 - A-'f I[+
0 as 6
--f
0.
(7.326)
A construction of such a family R,J of operators, that there exists a(6) such that (7.326) holds, is used for solving an ill-posed problem (7.246). The family R,,a is called a regularizing family for problem (7.246). The error estimate of the approximate solution u,,g := R,fa can be given as follows:
II ~ a +-i U IISII Ra(f6 - f) I( + (1 R
~ - uu 115 w ( a ) 6 +Q(Q)
:= E ( E , ~ ) .
Here we assumed that R, is a linear operator and that 11 R, 115 w(a).One assumes that R, and u are such that ~ ( a+) 0 and w ( a ) .+ +m as a --+ 0. Then there exists a = a(6)such that a(6)+ 0 and E ( a ( b )6, ) := ~ ( 6 + ) 0 as 6 0. Therefore R, is a regularizing family for problem (7.246) provided that w(a) +m and Q(Q) .+ 0 as Q .+ 0. The stable approximation to the solution u of equation (7.246) is U J := Ra(6)f6,where ~ ( 6 is) chosen so that € ( a6) , 2 E (a(&), 6) := ~ ( 6 for ) any 6 > 0. One has the error estimate --+
--+
We gave two general methods for constructing such families. The theory can be generalized in several directions:
1) One can consider nonlinear A; unbounded A, for example, closed densely defined A; A given with some error, say A, is given such that 11 A- A, 11 < E.
2) One can consider special types of A, for example convolution and other special kernels; in this case one can often give a more precise error estimate for approximate solution. 3) One can study the problem of optimal choice of y and of the stabilizing functional 4(u)in (7.285). 4) One can study finite dimensional approximations for solving ill-posed problem (7.246). 5) One can study methods of solving problem (7.246) which are optimal in a suitable sense.
Random Fields Estimation Theory
216
These questions are studied in many books and papers, and we refer the reader t o [Ivanov et. al. (1978); Lavrentiev and Romanov (1986); Morozov (1984); Ramm (1968); Ramm (1973b); Ramm (1975); Ramm (1980); Ramm (1981); Ramm (1984); Ramm (198513); Ramm (198713); Ramm (1987~);Tanana (1981); Tikhonov (1977)]. In [Ramm (2003a)l and [Ramm (2005)l a new definition of the regularizing family is given. The new definition operates only with the data (6, f 6 , K}, and does not use the unknown f . The compact K: in this definition is the compact to which the unknown solution belongs. The knowledge of this compact is a n a priori information about the solutions of ill-posed problems. One calls Ra,6 a regularizing family, if l i r n s , o s ~ p { , : , , ~ , ~ ~ ~ , - ~l l~Rlb~f 6~ ~ }uII = 0, where u solves the equation Au = f,and R6 = R a ( 6 ) , 6 for some 0 < a(6)3 0 as 6 4 0.
7.6.3
Equations with random noise
In this section we look at the problem (7.246) with linear bounded injective operator from the point of view of estimation theory. Let us consider the equation (7.327)
Aw=f+n,
where A is an injective linear operator on a Hilbert space H , and n is noise. Let us assume for simplicity that noise takes values in H . In practice this is not always the case. For example, if H = L2(R")then a sample function n ( x ) may belong to L2(R') locally, but not globally if n(x) does not decay sufficiently fast as 1x1 -+00. Therefore the above assumption simplifies the theory. Assume that
72 = 0,
n*(x)n(y) = a2R(z,y),
(7.328)
where a2 > 0 is a parameter which characterizes the power of noise. Let us assume that the solution to (7.327) exists and f E RanA. One may try to suggest the following definition.
Definition 7.8
The solution to (7.327) is statistically stable if
D[w - u ]
-+0
as a
t
0,
(7.329)
where u solves the equation AU = f .
(7.330)
Applications
217
We will also use this definition with (7.329) substituted by ((W-U(((2+0
where
as
g+o,
(7.331)
(1 . I( denotes the norm in H .
This definition is very restrictive. First, the assumption that equation (7.327) is solvable means that a severe constraint is imposed on the noise. Secondly, the requirement (7.329) is rather restrictive. Let us illustrate this by examples and then consider some less restrictive definition of the stable solution to (7.327). Note that, under the above assumption,
w = A-lf
+ A-ln = u + A-ln.
(7.332)
Thus
D[w - U ] = D[A-ln].
(7.333)
Let us assume that H = L 2 ( D ) ,D c RT is a finite region, and A is a selfadjoint compact operator on H with kernel A ( z ,y),
Then 00
A-ln = ~ A ~ ' ( n , $ j ) $ j . j=1 Therefore m
(7.336)
Random Fields Estimation Theory
218
where A - l ( z , y ) is the kernel of the operator A-l in the sense of distributions and is given by the formula 00
A-%,Y) = -p;14;(44j(Y).
(7.338)
j=1
For the right hand side of (7.337) to converge to zero it is necessary and sufficient that the kernel B ( z ,y ) of the operator A-lRA-' be finite for all IC and y = z, B ( I C , I < C )00. If one requests in place of (7.329) that (7.339) where the bar denotes statistical average and
(1 w [I:=
(w, W)''2,
(7.340)
then the following condition (7.342) will imply (7.339). One has
(A-In, A-ln) =
L
dzD[A-ln]= U ~ T T ( A - ' R A - ')i 0
u +0 (7.341)
provided that A is selfadjoint, positive, and
T ~ ( A - ~ R A<- 03. ~)
(7.342)
Condition (7.342) is a severe restriction on the correlation function R(z,y) of the noise.
Example 7.14 Consider the case H = L2(R'),
i, 00
Au := Then
where
A(.
- y)u(y)dy.
(7.343)
219
Applications
one obtains (a derivation is given below) the formula (7.346) Here we have assumed that (7.347) where c(X) is a random process with orthogonal increments such that
c(x)= 0,
d<*(X)dC(X) = g2R(X)dX,
dC*(X)dC(p) = 0 for
X # p.
(7.348) (7.349)
If B is a linear integral operator with convolution kernel:
1, 00
Bn =
B(. - Y ) 4 Y ) d Y ,
and the spectral representation for n(x) is (7.347) then the spectral representation for Bn is
If
B1
and
B2
are two linear integral operators with convolution kernels,
then
-00
q, 00
=
Bl(X)B;(X)R(X)dX.
(7.351)
Equation (7.346) is a particular case of (7.351). Note that stationary random functions (7.347) have mean value zero and
1, 00
= u2
R(X)dX.
(7.352)
Random Fields Estimation Theory
220
It follows from formula (7.346) that the variance of the random process A-'n at any point x is finite if and only if the spectral density R ( X ) of the noise is such that fi(A)(A(A)l-2 E L'(-oo,oo). If A(A) tends to zero as 1x1 --f 00, the above condition imposes a severe restriction on the noise. For example, if the noise is white, that is a(X) = 1, then the condition IA(A)I-2 E L'(-oo,oo) is not satisfied for A(X) -+ 0 as 1x1 + 00. Example 7.15
Consider the Hilbert space H of 2~-periodicfunctions
f = C f mex~(imz)
(7.353)
m#o
with the inner product
C fmgk.
(f,g) :=
(7.354)
m#O
+ n(2) = 0,
Assume that f(z) n(z) is given, where n E H
n*(z)n(y)= 2 R ( X - y),
R(z
+ 27r) = R(z).
(7.355) (7.356)
Note that R(z) can be written as ~ ( z=) o2
C rmexp(-imz).
(7.357)
m#O
+
Suppose one wants to estimate f'(z) given f(z) n(z). If one uses the function ii := f'(z) + n'(z) as an estimate of f',then the variance of the error of this estimate can be calculated as follows. Let
n(z) =
C nmexp(imz),
(7.358)
m#o where nm, the Fourier coefficientsof n(z),are random variables, such that
nm = 0,
nLnj = g2rmSmj.
The numbers rm can be determined easily. Indeed
(7.359)
221
Applications
From (7.356) and (7.360) one can see that the numbers r, in (7.359) can be calculated by the formula
r, = 2.rr
Jr -r
R ( z )exp(imz)dz.
(7.361)
Since R ( x ) is a covariance function, the numbers r, are nonnegative as it should be. From (7.358) and (7.359) it follows that
C m2rm.
~[n= ’ ]g2
(7.362)
m
Therefore the numbers T, have to satisfy the condition that the right side of (7.362) is a convergent series, in order that the estimate ii be statistically stable in the sense of Definition 7.7.
Example 7.16 Let C be a selfadjoint positive operator on a Hilbert space H . Assume that the spectrum of C is discrete: as m + m .
O I X ~ I X Z I . . . , X,+ca
(7.363)
Consider the problem ut=Cu,
t>O
(7.364)
u(0)= f .
Let
(7.365)
4, be the eigenvectors of C: L4m = Xm4m
and assume that the system {&}, 1 basis of H :
Im <
(7.366) 00,
(4m74j) = L j .
forms an orthonormal
(7.367)
The formal solution to problem (7.364)-(7.365) is
where (7.369)
Random Fields Estimation Theory
222
Formula (7.368) gives a formal solution to (7.364)-(7.365) in the sense that formal differentiation in t yields m
(7.370) m=l and formal application of the operator C and formula (7.366) yield M
C.U=
C m=l
Am
exp(Xmt)fm4m,
(7.371)
so that (7.370) and (7.371) yield (7.364). Put t = 0 in (7.368) and get (7.365). Formula (7.368) gives the strong solution to (7.364)-(7.365) in H if and only if the series (7.370) converges in H , that is 00
(7.372) m=l This implies that the problem (7.364)-(7.365) is very ill-posed. This problem is an abstract heat equation with reversed time. If one takes -C in place of L in (7.364) then the problem is analogous to the usual heat equation and is well posed. Suppose that H = L 2 ( D ) ,and that the noisy data are given in (7.365), the function f(z) n(z) in place of f(z)where n(z) is noise. Let
+
00
n(z>=
C nm+m(z),
(7.373)
m=l
where nm := (n(z),+(z))
(7.374)
where -the parentheses denote the inner product in L2(D). It is clear that if n(z) = 0, then 12,= 0,
Vm.
(7.375)
Let
n*(z)n(y)= a2R(z,y). Then
(7.376)
Applications
223
The kernel R(z,y) is selfadjoint and nonnegative definite, being a covariance function. Let us assume that the matrix (7.378) is such that the series (7.377) converges in L 2 ( D )x L 2 ( D ) .If one uses the formula
for the solution of the problem (7.364)-(7.365) with the noisy initial data, then for the variance of the error of this estimate one obtains
It is clear from (7.380) that the solution .iL is not statisticaly stable since the series (7.380) may diverge although the series (7.377) converges.
Exercise. Let (7.363)-(7.368) hold, 11 u(0) ((<E and that 11 u(t) 115 d - f c f for 0 5 t 5 T .
(1 u(T) ( ( 5c.
Hint: Consider 4(t) :=I[ u(t) 112= C z = , e~p(-2X,t)lf,(~. #I > 0 and (In +)I' 2 0. Thus In is convex. Therefore
+
+
+
In4 [(l- a)o TI 5 (1 - a )ln@(O) ah@'), Let a
=
Prove
Check that
0 5 a 5 1.
$. Then the desired inequality follows.
The above examples lead to the following question: how does one find a statistically stable estimate of the solution to equation (7.327)? Let us outline a possible answer to this question. We consider linear estimates, but the approach allows one to generalize the theory and to consider nonlinear estimates as well. The approach is similar to the one outlined in Section 2.1. Let us look for a linear statistically stable estimate .iL of the solution to equation (7.327) of the form .iL = L(f
+ n).
(7.381)
Random Fields Estimation Theory
224
Assume that the injective linear operator A in (7.327) is an integral operator on H = L 2 ( D ) ,D c R‘, with the kernel A(z,y), and (7.382) Since Au = f,one has:
Ifi - u12
=
I(LA - I).
+ LnI2 = I(LA - I)uI2+m
(7.383)
where I is the identity operator and the term linear with respect to n vanishes because A = 0. Let us calculate the last term in (7.383):
= o2
// D
L*(z,y)R(y, z)L’(z,z)dydz.
(7.384)
D
Here we used second formula (7.328) and the standard notation
L’(z,z) := L ( z ,z ) .
(7.385)
Remember that star denotes complex conjugate (and not the adjoint operator). Integrating both sides of (7.383) in z over D yields (7.386) where Q is an integral operator with the kernel
and TrQ stands for the trace of the operator Q. This operator is clearly nonnegative definite in H = L 2 ( D ) :
(Q4,d)= (L*RL’4, 4) = (RL’4, L‘4) 2 0.
(7.388)
Here we used the fact that R is nonnegative definite; that (L*)t = L’, where At denotes the adjoint operator in L 2 ( D ) ;and we have assumed that the function Q(z,<) is continuous in z,< E B. The last assumption and the fact that the kernel Q(s, is nonnegative definite imply that
c)
(7.389)
A pplauations
225
One wants to choose L such that E = min
(7.390)
where E is given in (7.386). If one put L = A'l then the first term on the right side of (7.386) vanishes and the second is finite if
Tr { ( A - ' ) * I ~ ( A - ~ )<:' }00.
(7.391)
We assume that (7.391) holds. We claim that if (7.391) holds and then one can choose L so that Emin :=E(u)
t o
as
(T
+0
(7.392)
d+O.
Any such choice of L yields a statistically stable estimate of the solution u. Let us prove the claim that the choice of L which implies (7.392) is possible. For simplicity we assume that A is positive selfadjoint operator on H . Put
L = ( A + 61)-l
(7.393)
where 6 > 0 is a small number. Then the spectral theory yields: 2
(x+6 x - 1)
II(LA - 1)u1l2= IlAll
= I GGp I Im
d(Exu,u )
62
d(Exu,).
(7.394)
where Ex is the resolution of the identity of the operator A . Since IlAll
IlAll
62
d(Exu,u) I
d(Exu,u) =I] u
112<
00
(7.395)
and, as 6 + 0, the integrand in (7.394) tends to zero, one can use the Lebesgue dominant convergence theorem and conclude that
\\(LA- 1)u\\'
:= ~ ( 6 , u+ ) 0
as S -+ 0
(7.396)
where L is given by (7.393). The claim is proved. Lemma 7.14
If L is defined by (7.393) and (7.391) holds then 1imsupTr {L*RL'} < 00. 640
(7.397)
Random Fields Estimation Theory
226
Proof.
One has
+
+
[ ( A SI)-l]*R [ ( A SI)-l]’ =
+
[ ( A+ S I ) - l ] * A*(A-l)*R(A-l)’A’[ ( A SI)-l]’.
(7.398)
The operator (A-l)*R(A-l)’is in the trace class by (7.391). Moreover, if A > 0 and 6 > 0, then (7.399) Both inequalities (7.399) can be proved similarly or reduced one to the other, because A* = A’. Note that A > 0 implies
A* > 0. Indeed, for any
since A
(7.400)
4 E H , one has
> 0 by the assumption and q5*
E H . Therefore
(A*+,q5)* = (A*4,q5)> 0 Vq5
E
H.
(7.402)
The desired estimate (7.399) follows from the spectral theorem: (7.403) Lemma 7.14 is proved.
0
It is now easy to prove the following theorem.
Theorem 7.6 Let A > 0 be a bounded operator on H = L2 (D). Assume that condition (7.328) holds and T r R < 00. Then the estimate 6 = L(f +n),
(7.404)
with L given by (7.393), is statistically stable (in the sense (7.331)) estimate of the solution to equation Au = f , provided that parameter S = b(u) in formula (7.393) is chosen so that E := u2Tr(L*RL’) +r](S,u)= min.
(7.405)
227
Applications
Proof. Note that we do not assume in this theorem that condition (7.391) holds. Therefore Tr(L*RL'):= $(6) > 0 will, in general, satisfy the condition
$(S)
as 6 -+ 0.
(7.406)
+ r](S,u).
(7.407)
-+ +m
Thus E = u2$(S)
From (7.406), (7.407) and (7.396) it follows that the function E considered as a function of S for a fixed u > 0 attains its minimum at S = S(u) and 6(a) -+ 0
as u
+ 0.
(7.408)
Therefore ~ ( u=) emin = ~ ( b ( u -+ ) )0
as u
.+ 0.
Theorem 7.6 is proved.
(7.409) 0
If some estimates for $(S) and r](b,u)are found then an estimate of be obtained. This requires some a priori assumptions about the solution. ~ ( ucan )
Example 7.17 A simple estimate for $(S) is the following one.
Tr(L*RL') 5 T r R 11 L*
[I2<
TrR
62
(7.410)
*
Here we used the estimates
II L* 11=11
L' 1 1 1r1,
(7.411)
where L is given by (7.393), and the estimate
Tr(L*RL') 5 T r R 11 L
112.
(7.412)
We will prove inequality (7.412) later. Let us estimate r](S,u).To do this, assume that
where c > 0 is a constant, and IlAll
f :=
X-"dEx
f,
(7.414)
228
Random Fields Estimation T h e o q
(7.415) Since Au = f , u = A-'f it follows from (7.396) and (7.415) that
I c262b.
(7.416 )
Therefore, under the a priori assumption (7.413) about f, one has
a2TrR
€5-
62
+ ,2626
(7.417)
The right hand side in (7.417) attains its minimum in 6 ( a > 0 being fixed) at (7.418) and emin = € ( a )I consta2 bla
(7.419)
where const can be written explicitly. Let us finally prove inequaity (7.412). This inequality follows from
Lemma 7.15 If B is a linear bounded operator on H and R 2 0 is a trace class operator, then BR and RB are trace class operators and
ITr(BR)I 511 B
11 TrR,
Tr(RB)511 B
11 TrR.
(7.420)
Proof. Let us recall that a linear operator T : H + H is in the trace class if and only if M
(7.421) j=1
where s j ( T ) are the s-numbers of T . These numbers are defined by the equality s j ( ~= ) xj
{( T * T ) ' / ~ }
(7.422)
229
Applications
where A1 2 A2 2 . . . 2 0 are the eigenvalues of the nonnegative definite selfadjoint operator (TtT)lj2,and Tt is the adjoint of T in H . The minimax principle for the s-values is (7.423)
where L, runs through all j-dimensional subspaces of H , and 4 ILj means that is orthogonal to all elements of Lj. If B is a linear bounded operator on H then
Therefore if (7.421) holds then
The first part of Lemma 7.15 is proved since if R 2 0 one has TrR =I/ 111. The second part can be reduced to the first. Indeed, T and T* are simultaneously in the trace class since
R
sj(T) = sj(T*), Vj.
(7.426)
(TB)*= B*T*.
(7.427)
One has
Since 11 B 11=11 (7.425) that
B*
11,
and (TTT)= TrT*, one concludes from (7.427) and
Take T = R 2 0 then 11 T TrR and the second inequality (7.420) is obtained. Lemma 7.15 is proved. 0 Additional information about s-values one can find in Section 6.3.
Random Fields Estamataon Theory
230
7.7 A remark on nonlinear (polynomial) estimates Let U ( X )= S(X)
+n(z),
zE
D
C
R’
(7.429)
Consider the polynomial estimate (filter): m
HjUbl
AU :=
(7.430)
j=l
where
The problem is to find A such that E
:= D[AU - s] = min.
(7.433)
Here D is the symbol of variance, the assumptions about S(X) and n ( z )are the same as in Chapter 1, the optimal estimate is defined by n functions ( h l , .. . , hn), and we could consider by the same method the problem of estimating a known operator on s(z), for example a j s ( ~ ) Let . us substitute (7.430) in (7.433): m E
m
HjUblH,*U[i]*- 2Re
:=
i,j=l m
= min
HtU[i]*s(z) + ls(~)1~
i=O m
(7.434)
Here
aij := UliIU[il*= U ( [ l ) .. .U(<j)U*([i)..a*([;), (7.436)
231
Applications
(7.437) Note that (7.435) implies that
Let hj(S1,.
. . L j ) + Ej77j(J1,. .
(7.439)
.I&)
be substituted for hj in (7.437) and we surpress exibiting dependence on x since x will be fixed. Here ~j are numbers. The condition e = min at ~j = 0 implies that m
C u i j H j = bi,
1 5 a 5 m.
(7.440)
j= 1
This is a system of integral equations for the functions
dJj = bi(X,J;,.
. ., < j ) :
hj(x,tl,.
. . ,ti).
(7.441)
Consider as an example the case of polynomial estimates of degree 2. Then
s,
a21
(ti,G1Jdhl (
/J D
D
(7.442) a22(G I
t;,El, 52)h2(Cl152)d51d52 = b2(E:IG).
(7.443)
If a z z ( < : , ( h , 5 1 , < 2 ) belongs to R,or to some class of the operators which ) equation (7.443) in terms of can be inverted, one can find h 2 ( & , & ?from hl(J1) and then (7.442) becomes an equation for one function hl(J1).
232
Random Fields Estimation Theory
In the framework of correlation theory it is customary to consider only linear estimates because the data (covariance functions) consist of the moments of second order.
Chapter 8
Auxiliary Results
8.1 8.1.1
Sobolev spaces and distributions
A general imbedding theorem
Let D c R' be a bounded domain with a smooth boundary r. The LP(D), p 2 1 spaces consist of measurable functions on D such that 11 u I I L P ( D ) : = (J, I u I P d z ) l / p < 00. For p = +m one has 11 u I~L-(D):= esssup,,, Iu(z)I. If in place of the Lebesgue measure a measure p is used then we use LP(D,p ) as the symbol for the corresponding space. These are Banach spaces. If C F ( D )is the set of infinitely differentiable functions with compact support in D ,then C F ( D )is dense in L*(D), 1 5 p < 00. If one defines a mollifier, i.e. a function 0 5 p ( x ) E Cr(R'), p(z) = 0 for 1x1 2 1, Jp(z)dz = 1, J := JRP,e.g. p(z) := cexp{(1xI2 - 1)-'}
for 1x1 < l,p(x) = 0 for
1x1 2 1 (8.1)
where c is the normalizing constant chosen so that J p d z = 1, then the function u,(z) := E - n
J
p ( l z - yIE-l)u(y)dy,
E
> 0,
(8.2)
belongs to Clz and 11 u, - u IILpo,+ 0 as E + 0. By Lf,, one means a set of functions which belong to LP on any compact subset of D or R'. Convergence in Lro,(D) means convergence in LP(B) where b is an arbitrary compact subset in D. By WeiP(D),the Sobolev space, one means the Banach space of functions u(x)defined on D with the finite norm
e
11
11 D"U
!IW',p(D):= IA=O
233
IlL.P(D) .
(8.3)
Random Fields Estimation Theory
234
Here j is a multiindex, j = (jl,.. . 'j,.), Dj = D& . . . D k , Ijl = jl +. . .+j,. The space C"(D) of infinitely differentiable in D functions is dense in W e i p ( D ) By . f i e > p ( Dwe ) denote the closure of C r ( D ) in the norm (8.3). By H e ( D ) we denote Wei2(D). This is a Hilbert space with the inner product
(8.4)
(n)
Let C" denote the space of restrictions to D of functions in C-(Rr). If the boundary of D is not sufficiently smooth, then C w ( ~may ) be not dense in W e i p ( D )A . sufficient condition on I? for C"(B) to be dense in We>.(D) is that D is bounded and star shaped with respect to a point. D is called star shaped with respect to a point 0 if any ray issued from 0 intersects I? := a D at one and only one point. Another sufficient condition for C"(D) to be dense in WeJ'(D) is that every point of I? has a neighborhood U in which D n U is representable in a suitable Cartesian coordinate system as x, < f ( x 1 , . . . x v - l ) , where f is continuous. Any function u E W e + ( D ) p, 2 1, I 2 1, (possibly modified on a set of Lebesgue RT-measure zero) is absolutely continuous on almost all straight lines parallel to coordinate axes, and its distributional first derivatives coincide with the usual derivatives almost everywhere. The spaces WeiP(D) are complete. We say that a bounded domain D c Rr satisfies a uniform interior cone condition if there is a fixed cone CD such that each point of I' is the vertex of a cone C D ( X )c D congruent to CD. A strict cone property holds if I' has a locally finite covering by an open set { U j } and a corresponding collection of cones {Cj} such that Vx E Uj n I? one has x Cj E D. According to Calderon's extension theorem, there exists a bounded linear operator E : WeJ'(D) + WeiP(Rr)such that Tu = u on D for every u E We+'(D)provided that D C C0>l.The class C0>lof domains consists of bounded domains D such that each point x E I? has a neighborhood U with the property that the set D n U is represented by the inequality xr < f(xt.1,.. . , x r - l ) in a Cartesian coordinate system, and the function f is Lipschitz-continuous. The domains in Cop1have the cone property. The extension theorem holds for a wider class of domains than Coll, but we do not go into details.
+
Auxiliary Results
235
Let us formulate a general embedding theorem ([Mazja (1986)l).
Theorem 8.1 Let D c R be a bounded domain with a cone property, and let p be a measure o n D such that
and s > 0. [If s 5 n is a n integer, then p can be s-dimensional Lebesgue measure o n D n Fa, where Fa is an s-dimensional smooth manifold). Then, f o r any u E C o o ( D )n WeJ’(D),one has k
11 D’u
IILq(D,p)I
c
11
(8.5)
I/WL.p(D)
j=O
where c = const > 0 does not depend o n u. Here the parameters q, s,t?,p, k satisfy one of the following sets of restrictions: a) b) c) d) e)
p > 1, 0 < r - p(C - k ) < s 5 r , q 5 sp[r - p(C - k)]-’ p=l, O 1, r = p ( t - k ) , s 5 r , q > 0 is arbitrary i f p > 1, T < p(C- k ) OT p = 1, r 5 t - k then k
If f) p 2 1, (t? - k
-l)p
< (C - k ) p , and X := C - k - rp-’ then
.#Y
If g) (t? - k - l ) p = r then (8.7) holds for all 0 < X < 1. The imbedding operator i : WeJ’(D) + WktQ(DS nD)is compact if s > 0, r > ( C - k ) p , r - ( l - k ) p < s 5 r , and q < sp[r - (1- k ) p ] - l . If r = (C - k ) p , q 2 1 , s 5 r then the above imbedding operator i i s compact. If r < (C - k ) p then i : W e + ( D ) + C k ( D )is compact. The trace operator i : H e ( D ) -+ He-lI2(I’) is bounded i f t? >
3.
Random Fields Estimation Theory
236
8.1.2
Sobolev spaces with negative indices
We start with the brief exposition of the facts in distribution theory which ) will be used later. Let S(R') be the Schwartz's space of C M ( R T functions which decay with all their derivatives faster than any negative power of 1x1 as 1x1 00, so that --f
m
l4lm :=
z ~ ( l1 x+1 )C~ I D ~ ~ (
o 5 m < 00.
(8.8)
j=O
The set of the norms I . Jm defines the topology of S(R'). A sequence - q51m + 0 as n + 00 for 0 5 m < 00. It converges to 4 in S if is easy to see that CF(R') is dense in S in the above sense. A space S' of tempered distributions is the space of linear continuous functionals on S. Continuity of a linear functional f on S means that (f,&) + (f,4) if 4n + 4 in S. By (f, 4) the value of f at the element 4 is denoted. A wider class of distributions is used often, the class V' of linear continuous functionals over the space D = Cr(R') of test functions. Continuity of f E V' means that for any compact set K C R' there exist constants c and m such that I(f,4)I I C & ~ , S U P , ~ K lDj41, 4 E CF(K). BY the derivative of a distribution f E S' one means a distribution f' E S', such that
(f', 4) = -(f, 4') v4 E s.
(8.9)
Also
(D"f74) = (-1)IWYf,Drn4),
w E s.
Let D be an open set in R', and S ( D ) be the completion of CF(D)in the topology of S(R'). A continuous linear functional on S ( D )is a distribution in D. The space of these distributions is denoted S'(D). If ( F ,4) = (f,4) for all 4 E S ( D ) , some F E S'(R') and some f E S'(D), then f is called the restriction of F to D and we write f = p F . Since S(D) is, by definition, a closed linear subspace of S ( R T ) ,the Hahn-Banach theorem says that a linear continuous functional f E S'(D) can be extended from S ' ( 0 ) to S'(R'). Let us denote this extension by Ef = F. The space S'(RT)is a Frechet space, i.e. a complete locally convex metrizable space, so that Hahn-Banach theorem holds. If F E S'(R') then one says that F = 0 in an open set D if ( F ,4) = 0 for all 4 E S ( D ) , or which is equivalent, for all 4- E Cr(D).If D is the maximal open set on which F = 0, one says that 52 := ' R \ D is the support of F . By the closure of R is denoted. If F
Auxiliary Results
237
is locally integrable, i.e. a usual function, then supp F is the complement of the maximal open set on which F = 0 almost everywhere in the sense of Lebesgue measure in R'. Note that supp DmF supp F . The Fourier transform of f E S'(R') is defined by ($7
$*I = (2T)'(f, 4 * )
(8.10)
where the star stands for complex conjugation, the tilde stands for the Fourier transform, and
This definition of $ is based on the Parseval equality for functions: ($1, &) = 2 ~ ( 4 1$;), , and on the fact that the Fourier transform is a linear continuous bijection of S(R') onto itself. Note that (8.11)
F ( f *$) =f$, F f :=$
(8.12)
where f * $ is the convolution of f E S'(R') and $ E S ( R T ) defined , by f * $ = (f,$(x - y)). Here x E R' is fixed and (f,$(x - y)) denotes the value of f at the element 4(x - y) E S(R'). The function f * 4 is infinitely differentiable. Any f E S'(Rr) can be represented in the form (8.13) where
fj
are locally continuous functions which satisfy the inequality
Ifj(.)l
I c ( l + IxIY, I j l I m,
(8.14)
m and N are some integers, and c = const > 0. The space He(R') one can define either as the closure of CF(R') in the norm 11 . l e:=11 . ~ ( H Lor, by noting that, for $ E Cr(R'),one can define an equivalent norm by the formula: (8.15) and (8.16)
Random Fields Estimation Theory
238
In (8.15) one can assume -00 < t
< 00. Iff
E Re(R') and
4 E S(RT)then (8.17)
where
11 [I-!:= f#J
1/2
(2r)-"2
(/(1+
IX12)-'14l2dX)
.
(8.18)
It is clear from (8.17) that one can define H-'(R') as the closure of S ( R P ) in the norm (8.19) Thus H-'(R') is the dual space to He(R') with respect to the pairing given by the inner product (-,.)o. Consider the spaces f i e ( D )and fie(R), -00 < C < 00, of functions belonging to H e ( R T )with support in and R respectively, R := R3 \ D. These spaces are closed in H e ( R T )norm subspaces of He(R') which can be described as completions of CF(D)and C?(R) in the norm He(R'). If f E f i e ( D )then (f,4) = 0 V4 E Cr(R). Consider the space H e ( D ) ,where D c E H e , 1 2 0. This means that D has the property that there exists an extension operator E : H e ( D )+ He(R') with the property E f = f in D , 11 Ef I I H t ( R r ) I c 11 f I I H ~ ( D ) . A similar definition can be given for D c EWeJ'. Bounded domains D c Coy1 are in the class EH' (and in EW'sP). The property Ef = f in D for C < 0 means that p E f = f, where Ef E He(R') c S'(R'), C < 0, and p is the restriction operator p : S'(R') -+ S'(D). Thus, for any t , -00 < C < 00, we consider the linear space of restrictions of elements f E H e ( R T ) ,-00 < C < 00, to D c COll. Define a norm on this space by the formula
11 f lle:=ll f
llHqD)= '",f
It Ef
ltHq Rr) 7
< 00,
--a3 <1
(8.20)
where the infimum is taken over all extensions of f belonging to He(R'). If E f is such an extension, then E f f- is also such an extension for any
f-
E
&(a).
+
If C 2 0 and D c E H e then the norm (8.20) is equivalent to the usual norm (1.4). Let (He(D))'denote the dual space to H e ( D )with respect to L2(D)= H o ( D ) . This notion we introduce in an abstract form. Let H+ and HO be a pair of Hilbert spaces, R+ c Ho, H+ is dense in HO in HO norm and
Auxiliary Results
11 f 11+>/1
110.
239
Define the dual to H+ space H i := H- as follows. Note that i f f E HO and E H+ then f
+
I(f,+)I
1 1 1f
11011
+ l l o l l l f + II+, 11011
(8.21)
where (f,4) = (f,+)o is the inner product in Ho. Define (8.22)
It follows from (8.21) that 11 f (I-<(I f 110, and that (f,+) is a bounded linear functional on H+. By Riesz's theorem, ( f , + ) ~= (If,+)+,where I : HO 4 H+ is a linear bounded operator, 11 If 11+<11 f (10. Define H- as the completion of HO in the norm (8.22). Clearly H+ c Ho c H-. The triple {H+, Ho, H - } is called a rigged triple of Hilbert space. The space H- is called the dual space to H+ with respect to Ho. The inner product in H- can be defined by the formula
Indeed, the operator I was defined as an operator from HO into H+. Therefore, for f,g E HO the right side of (8.23) is well defined. Consider the completion of HO in the norm 11 f 11- = 11 If I[+. Then we obtain H-. Note that the right side of (8.22) can be written as 11 If II+. Therefore, the operator I is now defined as a linear isometry from H- into H+. In fact, this isometry is onto H+. Indeed, suppose (If,+)+ = 0 Vf E H-. We wish to prove that = 0. If f E Ho, then 0 = (If,+)+ = ( f , + ) ~ .Thus, = 0. If one considers H+ as the space of test functions, then H- is the corresponding space of distributions. In this sense, we treat (H'(D))' as the space of distributions corresponding to H+ = H'(D). Since l?'(D) c H'(D), one has (H'(D))' c (@(D))'. One of the results we will need is the following:
+
+
(fi'(D))' = H-'(D), (H'(D))' = A-'(D),
-00 -00
< e < 00, < t? < 00.
(8.24)
Here we assume that D c Cosl. Let us recall that fi-'(D) can be defined as the completion of the set Cr(D)in the norm of H-'(R'), while H-'(D) can be defined as the space of restrictions of elements of H-'(R') to the domain D c Coil with the norm (8.20).
Random Fields Estimation Theory
240
Let us now describe a canonical factorization of the isometric operator -+ H+ defined above. This factorization is
I : H-
I = P+P-,
(8.25)
Ho and p+ : HO -+ H+ are linear isometries onto HO and where p - : HH+ respectively. In order to prove (8.25) let us note that ( I f , g ) += ( f , g ) o for all f,g E H+. Thus I is a selfadjoint positive linear operator on H+, ( I f , f ) += ( f , f ) o 2 0 (= 0 if and only if f = 0). Let J+ be the closure of 11/2: H+ + H+ considered as an operator in Ho. Since 11 I1I2f 11+=11 f 110 the operator Ill2 is closable. If fn E H+ and fn converges in HOto f E Ho, then 1 1 / 2 f n -H+ g . Let us define J+f = I l l 2f = g. Then J+ is defined on all of Ho, it is an isometry: 11 J + f 11+=11 f 110, and its range is all of H+. Indeed, suppose (J+f,4)+ = 0 for all f E HO and some 4 E H+. Then (f,4)0 = 0 Vf E Ho. Thus 4 = 0. Since ,7+ is an isometry, its range is a closed subspace in H+. We have proved above that the orthogonal complement of the range of J+ is trivial. Therefore the range of J+ is H+. So one can take p+ = J+. Define p - := J q ' I . Then (8.25) holds and p - : H- + HO is an isometry with domain H- and range Ho, while p+ : Ho 4 H+ is an isometry with domain HO and range H+. One has I = p + p - , which is the desired factorization (8.25). If i : H+ -+ Ho is the imbedding operator and I is considered as an operator from HO into H+ then i = I*, where I* is the operator adjoint to I : Ho -+ H+, I* : H+ 4 Ho. Indeed -+
If one assumes that the imbedding operator i : H+ -+ HO is in the Hilbert-Schmidt class as an operator in Ho, then the p+ is in the HilbertSchmidt class as an operator in Ho. Indeed, p+ : HO -+ H+ is an isometry. Therefore it sends a bounded set in Ho, say the unit ball of HO :I[ u 11051, into a bounded set in H+, 1) p+u ))+=I/ u J J o l 1 if 1) u 1105 1. Since the imbedding i : H+ -+ HOis Hilbert-Schmidt, the operator p + considered as an operator on HO is in the Hilbert-Schmidt class: p+ : Ho -+ HO = (i : H+
+
Ho)(p+ : HO
-+
H+).
(8.27)
The right side of (8.27) is the product of an operator in the Hilbert-Schmidt class and a bounded operator. Therefore the product is in the HilbertSchmidt class.
Auxiliary Results
Because p + p -
=I,
24 1
one has (8.32)
q-q+ = I-l.
Note that (f,q+'ZL)o = (q-f,'ZL)o
so that q- = q; is the adjoint to q+ in To check (8.33) one writes
(f,Q+'ZL)o = (47-f, Q-Q+'zL)-
f E Ho,
H+,
(8.33)
= ( 4 - f , 2110
(8.34)
21
E
Ho.
= (q-f, I- 1
.I-
which is equation (8.33). 8.2
8.2.1
Eigenfunction expansions for elliptic selfadjoint operators
Resoluion of the identity and integral representation of selfadjoint operators
Every selfadjoint operator A on a Hilbert space H can be represented as
1, 03
A=
(8.35)
where Ex is a family of orthogonal projection operators such that
E: = Ex,
E-, = 0,
E+m = I ,
(8.36)
242
Random Fields Estimation Theory
where 0 is the zero operator, I is the identity operator, A = ( a , b ] ,-m < a < b < 00, EA := Eb - E,. The family Ex is called the resolution of the identity of A. The domain of definition of A is: Dom A = {f : f E H ,
X2d(E,f,f) < m}.
(8.38)
A function $(A)is defined as
where
J-w
The operator integrals (8.35), (8.39) can be understood as improper operator Stieltjes integrals which converge strongly. 8.2.2
Differentiation of operator measures
A family E ( A ) of selfadjoint bounded nonnegative operators from the set of Bore1 sets A E R‘ into the set of B ( H ) of bounded linear operators on a E(Aj) Hilbert space H is called an operator measure if E ( U z l Aj) = C,”=, where the limit on the right is taken in the sense of weak convergence of operators, Ai n Aj = 0 for i = j , and E(0) = 0. Assume that for A bounded one has T r E(A) < m, where T r A is the trace of A. Then p(A) := T r E(A) 2 0 is a usual (scalar) measure on R1.Let IQl denote the
(czl
1/2
Hilbert-Schmidt norm of a linear operator 9, IQl = I( Qej \I2) , where { e j } is an orthonormal basis of H . In this chapter the star will stand for the adjoint operator and the bar for complex conjugate or for the closure.
Lemma 8.1 For p-a.e. (almost everywhere) there exists a HS (HilbertSchmidt) operutor-valuedfunction @(A) 2 0 with 11 Q(X) 115 TrQ(X)5 1 such that (8.40)
The function 9 ( X ) is uniquely defined p-a.e. and can be obtained as a weak
243
Auxiliary Results
limit @(A) = w - limE(Aj)p-'(Aj)
as A j
-+
A.
(8.41)
The integral in (8.40) converges in the norm of operators f o r any bounded A. The limit in (8.41) means that X E Aj and IAj I + 0 , where 1 Aj I is the length of Aj, and A j is a suitable sequence of intervals.
Let A be a selfadjoint operator on H and E(A) be its resolution of the identity. In general, t r E ( A ) is not finite so that Lemma 8.1 is not applicable. In order to be able to use this lemma, let us take an arbitrary linear densely defined closed operator T on H with the properties: i) RanT = H , T is injective, ii) T-' E Q where u2 is the class of Hilbert-Schmidt operators. Definition 8.1 A linear compact operator K on H belongs to the class u p ,K E up, i f and only i f C,"=lsK(K) < 00, where s n ( K ) are the s-values of K , which are defined by the formula sn(K)= Xn[(K*K)1/2]and X,(B) are the eigenvalues of a compact selfadjoint nonnegative operator B ordered so that A1 2 X2 2 ... 2 0. If K E 01 it is called trace class operator; i f K E 02 it is called H S operator. More information about trace class operators is given in Section 8.3.3. Having chosen T as above, define the operator measure O(A) := T-l*E(A)T-l. If A E 02 define its H S norm by the formula IAI2 := Cj"=,1) A e j I(', where 11 A f 1) is the norm in H of the vector Af,and { e j } is the orthonormal basis of H . By [A/denote the norm of A. Note that lABl 5 IAl II B II, II A 1 1 114,P A l = IXIIAI, IA+BI I IAl PI, and that IAl does not depend on the choice of the orthonormal basis { e j } of H . Since Tr(B*AB)I ( A ((1 B (1' one has TrO(s) 511 T-' (I2 IE(A)I 511T-l (I2< 00. Therefore Lemma 8.1 is applicable to O(A), so that
+
where d p is a nonnegative measure, p((-00, 00)) < 00, *(A) is a nonnegative operator function, *(A) 2 0, and I*(X)l I Tr*(X) = 1. Let, for a fixed A, &(A) and va(X), a = 1,2,. ..NA 5 00, be respectively the orthonormal system of eigenvectorsof *(A) and the corresponding eigenvalues. Then
Random Fields Estimation Theory
244
where the bar denotes complex conjugate,
One can write
(E(A)f,9 ) =
1
(8.45)
(Tf, $a@)) (Tg,$a(W)dP(X>.
A a=l
If F(X) is a Bore1 measureable function on A, then
(F(A)f, 9 ) =
SW
F(X)( W ) T f ,Tg)dP(X),
(8.46)
--M
where f E D ( F ( A ) )n D(T),g E D(T). If one takes T = q+ and if the imbedding i : H+ -+ HO is in the Hilbert-Schmidt class, then the operator P(X) := T*Q(X)T,which appears in (8.42):
(E(A)f,9 ) = and in (8.46):
(F(A)f,9 ) =
1(P(X)f, A
(8.47)
9 ) dP(N
Srn
F ( 4 (P(X)f,9 ) d P ( 4 ,
(8.48)
-W
can be considered as an operator from H+ into H- for any fixed X E R‘. This operator is in the Hilbert-Schmidt class because it is a product of two bounded operators and a Hilbert-Schmidt operator q(X): T* = q- : HO H- is bounded, T = q+ : H+ -+ Ho is bounded, @(A) : HO -+ HO is in the Hilbert-Schmidt class. The range of the operator P(X) is a generalized eigenspace of the operator A corresponding to the point A. This eigenspace belongs to H-. Formula (8.47) can be written as -+
(8.49) where the integral (8.49) converges weakly, as follows from formula (8.47), but also it converges in the Hilbert-Schmidt norm of the operators C(H+, H - ) , where by C(H+, H - ) we denote the set of linear bounded operators from H+ into H-. Indeed
IWX>I II T IIL IW4l 5 1 where we took into account that 11 T 11=11 q+ II= 1, 11 T* 11=11 IP(X>l1 1 1T* II
(8.50) q-
[I=
1.
Azlziliary Results
245
The operator P ( A ) is an orthogonal projector in the following sense. If 4 E H+ and
(P(A)u,$), = 0 vu E H+
(8.51)
P(A)fp = 0.
(8.52)
then
Indeed 0 = (P(X)u,$10 = (Q-WNQ+'LL, d o = (@(A)Q+U,Q + d O
$(u?++),
= (Q+% = (u?q-w+?+4), = (ulP(X)(b), = 0 vu E H+.
(8.53)
Thus, equation (8.52) follows from (8.53). Therefore, if 4 is orthogonal t o the range of P(A) then the projection of q!~ onto the range of P ( A ) vanishes. Let us rewrite formula (8.43) in terms of the generalized eigenvectors. Define
T*$, = q-$,
:= q, E
H-.
(8.54)
Then (8.43) can be rewritten as
where
Formula (8.45) becomes (8.57) Since P(A) is in the Hilbert-Schmidt class, it is an integral operator with a kernel @(xly, A)l the (generalized) spectral kernel (see the Remark at the end of Section 3.6). The operator Ex is an integral operator with the kernel (8.58)
Random Fields Estimation Theory
246
The operator F ( A ) is an integral operator with the kernel
8.2.3
Carleman operators
An integral operator (8.60) is called a Carleman operator if
v ( A ) := sup xERr
J
IA(z,y)12dy < 00.
(8.61)
A selfadjoint operator C is called a Carleman operator if there exists a continuous function +(A),
$(A) E
W), 0 < I4(A)l I c
VA E A,
(8.62)
where c = const > 0, A is the spectrum of L, such that the operator A = 4(C) is an integral operator with the kernel A(z,y) which satisfies (8.61). Let HO= L2(R') and take H+ = L2 (R',p(z)), where p ( z ) 2 1 and
sp'(z)d2.
< 0O.
(8.63)
+
For example, one can take p ( z ) = (1 1z12)(rfE)/2, where E > 0 is any positive number, Then the operator A defined by the formula (8.60) is in the Hilbert-Schmidt class az(H0,H+) if the condition (8.63) holds. Indeed, if {fj}, 1 5 j < 00, in an orthonormal basis of Ho, then
(8.64) where Parseval's equality was used to get the second equality and the conditions (8.61) and (8.63) were used to get the final inequality (8.64).
Auxiliary Results
247
If A = q5(L) and A E u2(Ho,H+) one can use the triple
L2 ( R ' , p ( s ) ) c L2(R') c L2 (R',p-'(s)) for eigenfunction expansions of the operator
(8.65)
L. Here
H+ = L2 ( R ' , p ( z ) ) , Ho = L2(R'), H- = L2 (R',p-'(z)) .
(8.66)
Indeed, the basic condition, which has to be satisfied for the theory developed in Section 8.2.2 to be valid, is
Tr { (T-')*E(A)T-'} < 00,
(8.67)
provided that A is bounded. Since +(A) is continuous and (8.62) holds, one has
where &(A)
I 14(x)12c(A),
E A,
(8.68)
is the characteristic function of A: (8.69)
and .(A) = const > 0. Inequality (8.68) implies
E(A) I .(A)
I4(A)l2dEx = c(A)4*(0#0).
(8.70)
Therefore
{
Tr { (T-')*E(A)T-'} I c(A)Tr [4(C)T-']' +(L)T-l} =
c(~)14(~)~-11.
(8.71)
Thus (8.67) holds if
I(b(L)T-ll < 00.
(8.72)
If T-' = p + , then condition (8.72) becomes
I4(%+l < O0.
(8.73)
In the case of the triple (8.65)-(8.66) the operator p+ is a multiplication operator given by the formula
P+f
=P - " 2 ( 4 f ( 4 .
(8.74)
248
Random Fields Estimation Theory
Condition (8.73) holds if p(s) satisfies condition (8.63) and 4(C) = A is a Carleman operator so that condition (8.61) holds. Indeed, inequality (8.73) holds if
//
lA(w)I2P-l(Y)dYda:
< m.
(8.75)
We assume hat the function 4(A) is such that (8.61) implies sup VERP
] IA(z,y)I2dz <
00.
(8.76)
If this is the case then inequality (8.75) holds provided that (8.63) and (8.76) hold. Therefore in this case one can use the triple (8.65) for eigenfunction expansions of the operator C and the generalized eigenfunctions of C are elements of L2 ( R T , p - l ( s ) )so , that they belong to Lfo,(RT). Inequality (8.61) implies (8.76), for example, if $(A) is a real-valued function. In this case A = A*, so that (8.77) and if (8.77) holds then clearly (8.61) implies (8.76). In many applications one takes
4(A)
= (A - t ) - m l
(8.78)
where t is a complex number and m is a sufficiently large positive integer. If 4 is so chosen then (8.79) J-03
and
(8.80) where we have used the equation (8.81)
Auxiliary Results
249
which follows from the assumed selfadjointness of L. Therefore, if both kernels A(z, y; z ) and A(z, y; f ) satisfy inequality (8.61), then (8.76) holds.
Elements of the spectral theory of elliptic operators in L ~ ( R ~ )
8.2.4
Let
where z E R', j is a multiindex, a j ( z ) E Cljl(Rr), a,(z,[) :=
uj(z)[j
# 0 for (x,[) E R' x (R'\O),
(8.83)
IA=~
and assume that L is formally selfadjoint
L = c*
(8.84)
that is
(LA$)= ($,CClcI)
W,$ E G Y R ' ) .
(8.85)
The function a s ( z , [ )is called the symbol of the elliptic operator (8.82), and condition (8.83) is the ellipticity condition. If (8.83) holds and T 2 3 then s is necessarily an even number. Often one assumes that L is strongly elliptic. This means that
Reaj(z)ti # 0 for (z, [) E R' x (R'
\ 0).
(8.86)
Iil=s
The assumptions (8.84) and (8.86) imply that the operator L is bounded from below on C r (R').
( L u , ~ )2oCI
11 21 11&(Rp)
-c2
11 u 11&(~')
vu E Cr(R'),
(8.87)
where c1 and c2 are positive constants. Define the minimal operator in L2(R') generated by the formally selfadjoint differential expression (8.82) as the closure of the symmetric operator u + Cu with the domain of definition C r ( R ' ) . Any densely defined of the a Hilbert space H symmetric operator L is closable. Recall that closure of L,is defined as follows. Let U n E DomL := D ( L ) , un 4 u in H and Lu, +. f in H . Then one declares that u E D ( z ) and zu = f. This definition implies that is defined on the closure of D ( L ) in the graph norm
z,
Random Fields Estimation Theory
250
+
z
11 u 11~:=11 u 11 11 Cu 11 and the graph of is the closure in the graph norm of the graph of C, that is of the set of ordered pairs {u,Cu}, u E D(C). One says that C is closable if and only if the set { u , z u } , u E D ( E ) , is a graph. In other words, C is closable if and only if there is no pair { O , f } , f # 0, in the closure of the graph of C. This means that if u, E D ( C ) , u, - 0
and Cu,
4
f
(8.88)
then
f = 0.
(8.89)
If C is symmetric and densely defined and
4 E D ( C ) , then
(f,q5) = T Ilim (Cun,q5) = lim (un,Cq5) = 0 Vq5 E D(C). + 00 n+w
(8.90)
Since D ( C ) is dense, one concludes that f = 0. Therefore C is closable. We denote by C,. Under some assumptions on the coefficients u j ( x ) it turns out that C , is selfadjoint. If no assumptions of the growth of u j ( x ) as 1x1 -+ 00 are made then C , may be not selfadjoint (see [Ma, p 1561 for an example). Let us give some sufficient conditions for C , to be selfadjoint. Note that if C is densely defined symmetric and bounded from below it has always a (unique) selfadjoint extension with the same lower bound, the Friedrichs extension CF. This extension is characterized by the fact that its domain of definition belongs to the energy space of the operator C,that is to the Hilbert space HL defined as the closure of D ( C ) in the metric
z
[u, I. = (Lu, ).
+ 4%u),
(8.91)
+
where c > 0 is a sufficiently large constant such that C c1 is positive definite on D ( C ) . Since the closure of C is the minimal closed extension of C and since CF is a closed extension of C, one concludes that if C is bounded from below in H = L2(R")and C, is selfadjoint then Cm = CF.
(8.92)
In order to give conditions for C, to be selfadjoint, consider first the case when u j ( z ) = ~j(z) = const,
In this case C is symmetric on CF(R").
Ijl 5 s.
(8.93)
Auxiliary Results
Lemma 8.2 of the set
25 1
If (8.93) holds then Cm is selfadjoint. Its spectrum consists
(8.94) Proof.
Let
u(z) = 3 - l G = (27r)-'I2
s
exp(i<. z)G(<)d<
T ( D p u ) = tpG(<), 1 5 p 5 T ,
D p = -i-.
d
(8.96)
(8.97)
dXP
Therefore
and
cu = T-lL(<)Tu,
(8.98)
where (8.99) The operator T of the Fourier transform is unitary in L2(R'). Formula (8.98) shows that the operator C is unitarily equivalent to the operator of multiplication by the polynomial C(<)defined by formula (8.99). This multiplication operator is defined on the set T(C,O"(R'))of the Fourier transforms of the functions belonging to the set C,O"(R'),that is to the domain of definition of the operator C. Consider the closure M in L2(Ri) of the operator of multiplication by the function L(<): MG := L(<)G(<).
(8.100)
The domain of definition of M is
D ( M ) = ( 6 : L(<)G(<)E L 2 ( R ' ) } .
(8.101)
Random Fields Estimation Theory
252
The operator M is clearly selfadjoint since the function L(<)is real-valued. Therefore
C ,
=F
1M3
(8.102)
is also selfadjoint. Since the spectra of the unitarily equivalent operators are identical and the spectrum of M is the set (8.94), the spectrum of C , is the set (8.94). Lemma 8.2 is proved. 0 The following well-known simple lemmas are useful for proving, under suitable assumptions, that C , is selfadjoint.
Lemma 8.3 Let A and B be linear operators on a Hilbert space H , A is selfadjoint, D ( A ) c D ( B ) , B is symmetric, and
II Bu 1 1 1 II Au II +c II u 11, vu E D(A),
(8.103)
where 0 < E < 1 is a fixed number and c is a positive number. Then the operator A B with Dom(A B ) = DomA is self-adjoint.
+
+
Lemma 8.4 Let A be a symmetric densely defined in a Hilbert space H operator. Then 2,the closure of A, is selfadjoint if and only if one of the following conditions holds
dRan(A fiX) = H
(8.104)
N(A* fi X ) = (0).
(8.105)
or
Here X > 0 is an arbitrary fixed number, dRanA is the closure of the range of A,
N ( B ) = {U : BU = 0)
(8.106)
and A* is the adjoint of A . For convenience of the reader let us prove these lemmas. We start with Lemma 8.4.
Proof of Lemma 8.4 a) The argument is the same for any X > 0, so let us = 2.If A*u = iu take X = 1. Suppose that 2 is selfadjoint. Then A* = then
x*
(A*u,u) = i(u,u).
(8.107)
Auxiliary Results
253
Since A* is selfadjoint the quadratic form (A*u,u)is real valued. Thus, equation (8.107) implies U = 0, and (8.105) is established. To derive (8.104) one uses the formula dRan(A fi)
N(A*Ti) = H ,
(8.108)
where @ is the symbol of the orthogonal sum of the subspaces. From (8.108) and (8.105) one gets (8.104).
b) Assume now that (8.105) holds and prove that 2 is selfadjoint. If (8.105) holds then (8.104) holds as follows from (8.108). Conversely, (8.104) implies (8.105). Note that (8.109)
dRan(A f i) = Ran@ fi) because Ran@ f i) are closed subspaces. Indeed, if f then
fn
(2f i)un = f,
and
+
II ('21k+-Jnrn II+
Xunm
n,m
+
(8.110)
m,
is symmetric, one obtains from (8.11) that
where u,, := un -u,. Since
1) (Zfi ) ~ , ,
0,
)I2 + 1) u,, )I2+
0,
TI, m
--+
00.
(8.111)
Therefore u, is a Cauchy sequence. Let u, + u.Then, since 2 is closed, one obtains
(Zfi)u=f.
(8.112)
Therefore Ran@ fi) are closed subspaces. If (8.105) holds then (8.113)
Ran@ f i) = H .
This and the symmetry of 2 imply that 2 is selfadjoint. Indeed, let
((71+ i)u,v) = (u, f) vu E D(Z).
(8.114)
Using (8.113), find w such that
(Z-i)w=f.
(8.115)
This is possible because of (8.113). Use (8.114) and the symmetry of 2 to obtain
((X+ i)u,v) = (u,(X- i),
+
= ((2 i)u,w) .
(8.116)
254
Random Fields Estimation Theory
+
By (8.113) one has Ran@ i) = H . This and (8.116) imply v = w. Therefore v E D(x)and 2 is selfadjoint on D(2). Lemma 8.4 is proved. 0 Proof of Lemma 8.3 It is sufficient to prove that, for some X > 0,
Ran(A
+ B fi X ) = H .
(8.117)
One has A + B f i X = [I + B(A fiX)-'] (A fi X ) .
(8.118)
Equation (8.117) is established as soon as we prove that
1) B(A fiX)-' [I<
1.
(8.119)
Indeed, if (8.119) holds then the operator I+B(A+iX)-l is an isomorphism of H onto H and, since A is selfadjoint, Ran(A f i X ) = H , so Ran(A + B f i X ) = Ran { [I+ B(A f iX)-l] (A fi X ) } = H .
(8.120)
Use the basic assumption (8.103) to prove (8.119). Let (A + iX)-'u = f, B(A
+~X)-'U
= Bf.
(8.121)
Then
If
E
< 1 and X > 0 is large enough, then E+CX-l
< 1.
(8.123)
Thus (8.122) holds, and Lemma 8.3 is proved. In deriving (8.122) we have used the inequalities
11 (A fiX)-' I)<
X-',
X
> 0,
0 (8.124)
and
11 A(A fiX)-' 111 1,
(8.125)
A d l a a r y Results
255
both of which follow immediately from the spectral representation of a selfadjoint operator A: if (8.126) then
II + ( A ) 1 1 1m g + ( t ) l .
(8.127)
Both inequalities (8.124) and (8.125) can be derived in a simple way which does not use the result (8.126)-(8.127). This derivation is left for the reader as an exercise. We now return to the question of the selfadjointness of C , . If C , is selfadjoint then C is called essentially selfadjoint. We wish to prove that if the principal part of C is an operator with constant coefficients and the remaining part is an operator with smooth bounded coefficients then C is essentially selfadjoint. The principal part of C is the differential expression (8.128) ljl=s
Let us recall that a polynomial P ( J )is subordinate to the polynomial Q(J) if (8.129)
If (8.129) holds then we write P 44 Q. We say that
(8.130)
Q(t) is stronger than P(<)and write (8.131)
if
< C,
Q(0-
'#J E R',
(8.132)
where c > 0 is a constant, and (8.133)
Random Fields Estimation Theory
256
In the following lemma a characterization of elliptic polynomials are given. A homogeneous polynomial &(E) of degree s is called elliptic if
Q(E) # 0 for 5 E R'\O.
(8.134)
Lemma 8.5 A homogeneous polynomial &([) of degree s is elliptic i f and only i f it is stronger than every polynomial of degree 5 s. I n particular clElS I IQ(E)I,
(8.135)
where c = const > 0.
A proof of this result can be found in [Hormander (1983-85), vol. 11, p. 371.
Lemma 8.6
Assume that ~u = LOU
+ C aj(z)Pu
(8.136)
ljl<s
and
LOU=
ajDju,
(8.137)
where
,c~(<):=
C aj<j
is an elliptic polynomial
(8.138)
Iil=s
and (8.139)
Then C is essentially selfadjoint o n C F ( R T )and H" (R').
=C ,
is selfadjoint o n
Proof. By Lemma 8.2 Lo defined on CT(RT)is essentially selfadjoint. If (8.138) holds then inequality (8.135) holds. Therefore the closure of LO, the operator Corn, is selfadjoint and
DomLo, = H"(R').
(8.140)
Let us apply Lemma 8.3 with A = Lom and B = C - Lo. The basic condition to check is inequality (8.103) on N"(RT).It is sufficient to check this inequality on C F ( R r )since CF(RT)is dense in H S ( R r ) .If (8.103) established for any u E CF(Rr)then one takes u E H S ( R T and ) a sequence
Auxiliary Results
257
un E CF(R') such that 11 u, - u I I H ~ ( R ~ ) - + 0, n + 00, and passes to the limit n + 00 in (8.103). This yields inequality (8.103) for any u E H"(R'). In order to check inequality (8.103) it is sufficient to prove that
II .j(X>Dj. I I I 6 II Lou II + C ( 4 II 21 II,
k h E cr(R')
for any E > 0, however small, and any j such that Ijl and Parseval's equality one obtains:
(8.141)
< s. Using (8.139)
11 aj(x)Dju 116 c 11 Dju II= c (1 t j i i I( .
(8.142)
On the other hand, Parseval's equality, condition (8.138) and inequality (8.135) yield
II Lou ll=Il If Ijl
Co(E)C 1 1 1c
II IEl"c II .
(8.143)
< s then
lElj I ~ltl"[El > R = R ( E ) . In the region
(8.144)
I R one estimates IEjCl
I c(R)ICl,
(8.145)
where, for example, one can take c(R)= Rljl. Therefore
and
(8.147) where Parseval's equality and estimates (8.144) and (8.145) were used. From (8.142), (8.146) and (8.147) one obtains the desired inequality (8.141). Lemma 8.6 is proved.
Remark 8.1 Note that the method of the proof allows one to relax condition (8.139). For example, one could use some integral inequalities to estimate the Fourier transform. Let us now prove the following result.
Lemma 8.7 If C defined by (8.82) has smooth and uniformly bounded in R' coefficients such that the principal part of C is an elliptic operator
Random Fields Estimation Theory
258
of uniformly constant strength, then L i s essentially selfadjoint o n CF(RT) and L, is selfadjoint o n H S ( R " ) . Proof.
Let us recall that the principal part LO of C, (8.148)
is an elliptic operator of uniformly constant strength if the principal symbol
C aj(x)tj
a s ( x , t ) :=
UI=S
satisfies the ellipticity condition (8.83) and the condition of uniformly constant strength (8.149) where c does not depend on x , y and
's(x,') :=
{c
1/2
}
By Lemma 8.4 the operator
(8.150)
ID;as(X,t)l2
Iil>o
.
L, is selfadjoint on H S ( R Tif) the equations (8.151)
(L, fiA)u = f,
are solvable in H S ( R Tfor ) any f E CF(R') and for some X > 0. Indeed, in this case Ran(L, f i X ) 3 CF(R') and therefore is dense in H = L 2 ( R T ) . Since C , is symmetric Ran(L, fZX) is closed in L2(R') and, being dense in L2(R'), has to coincide with L2(Rr).This implies, by Lemma 8.4 that L, is selfadjoint. Existence of the solution to (8.151) in H S ( R T )for any f E C F ( R T ) follows from the existence of the fundamental solution E(x,y , A):
(Lm f i A ) E ( z , y, A)
= b(x - y)
in R'
(8.152)
and the estimate (CIX -Y
IW,Y,A)I L
c1x - y
I~-~,
p
+ c1log &,
if r is odd or r > s, la:-Yl L 1 if r is even and r 12
- YI I 1
5 s,
(8.153)
259
Auxiliary Results
where c and c1 are positive constants, and
where c > 0 is a constant and a(X) > 0 is a constant, a(X) t +co as X 4 +co. Also &(z, y, A) is smooth away from the diagonal z = y, and the following estimates hold
+ c11z - yls-T-ljl
+
if s # T Ijl, 12 - yI I 1 q,+c1llogla:-yll i f s = r + I j l , l z - y l 5 1. (8.156) Indeed, if there exists the fundamental solution with the properties (8.152)-(8.156) then q, IDj&(z,Y,A l l I
Irn
solves (8.151) and u E HS(RT), so that is selfadjoint. Existence of the fundamental solution with the properties (8.152)(8.156) for elliptic selfadjoint operators with constant coefficients can be established if one uses the Fourier transform [Hormander (1983-85), vol. I, p. 1701, and for the operators of uniformly constant strength it is established in [Hormander (1983-85), vol. 11, p. 1961. Thus, Lemma 8.7 is 0 proved. It is not difficult now to establish that the function (CN-iX)-' := $(L), X > 0, has a kernel which is a Carleman operator if N > &. Indeed, it follows from the estimate (8.156) that the singularity of the kernel of the operator $(L)is 0 (lz - y I N s - T ) so that this kernel is locally in L2 if N > &. On the other hand, the estimate (8.154) implies that the kernel of $(C)is in L 2 ( R T globally. ) Since the constants in the inequalities (8.154) and (8.156) do not depend on z, one concludes that $(C) is a Carleman operator. Let us formulate this result as Lemma 8.8. Lemma 8.8 Suppose that N > & and the assumptions of Lemma 8.7 hold. T h e n the operator (LN- ZX)-', X > 0, is a Carleman operator.
Random Fields Estimation Theory
260
Asymptotics of the spectrum of linear operators
8.3
In this section we develop an abstract theory of perturbations preserving asymptotics of the spectrum of linear operators. As a by-product a proof of Theorem 2.3 is obtained.
Compact operators
8.3.1
8.3.1.1 Basic definitions Let H be a separable Hilbert space and A : H + H be a linear operator. The set of all bounded operators we denote L ( H ) . The set of all linear compact operators on H we denote goo. Let us recall that an operator A is called compact if it maps bounded sets into relatively compact sets. It is well known that A is compact if and only if one of the following conditions hold 1)
fn
2) fn
3
f, g n
implies (Afn,gn)4 ( A f , g ) ,
f implies Af, + A f ,
3 ) from any bounded sequence of elements of H one can select a subsequence f n such that
(Afnm,f n m ) converges as n, rn + 00 where
-
By we denote weak convergence and norm of H (strong convergence).
--+
stands for convergence in the
If A is compact and B is a bounded linear operator then AB and B A are compact. A linear combination of compact operators is compact. If A, E cooand 11 A, - A 0 as n + 00, then A E coo.The operator A is compact if and only if A* is compact. I n this section we denote the adjoint operator by A*. I f H is a separable Hilbert space and A is compact then there exists a sequence A, of finite rank operators such that 11 A , - A 11- 0. An operator B is called a finite rank operator if rank A := dim RanB < 00. If A is compact then A*A 2 0 is compact and selfadjoint. The spectrum of a selfadjoint compact operator is discrete, the eigenvalues X,(A*A) are nonnegative and have at most one limit point X = 0. We define the singular
Aw5laar-y Results
261
values of a compact operator A (s-values of A) by the equation S j ( ~ )=
x;/~(A*A).
(8.158)
One has
sl(A) 2 s ~ ( A 2 )
*
a
*
2 0.
(8.159)
Note that
Sl(A) =II A
II
(8.160)
and if A = A* then S j ( 4
=
IW)I.
(8.161)
The following properties of the s-values are known
sj(A) = sj(A*), S j ( B 4 5 II B II S j ( 4 , Sj(AB) 5 It II sj(A>
(8.162) (8.163) (8.164)
for any bounded linear operator B. Obviously
sj(cA) = IcIsj(A), c = const.
(8.165)
Any bounded linear operator A can be represented as
A = UlAl
(8.166)
IAl := (A*A)'/'
(8.167)
where
and U is a partial isometry which maps Ran(A*); onto RanA. Representation (8.166) is called polar representation of A. The operator JAJis selfadjoint. If A is compact then IAl is compact. Let q5j be its eigenvectors and s j = sj(A) be its eigenvalues:
Then
Random Fields Estimation Theory
262
where the series (8.169) converges to IAI in the norm of operators:
Let $j
:= uf#Jj.
(8.171)
Then
c 00
A=
Sj(.,
4jMj.
(8.172)
j=1
Formula (8.172) is the canonical representation of a compact operator A . Note that ( $ j , $m)
= Jjm
since U is a partial isometry. It follows from (8.172) that A is a limit in the norm of operators of the finite rank operator Cj”=,sj(., 4j)$j. Moreover 00
A* = C.j(A)(.,$j)4j.
(8.173)
j=1
In the formulas (8.172) and (8.175) the summation is actually taken over all j for which s j ( A ) # 0. If rank A < 00, then s j ( A ) = 0 for j > rank A . If A is compact and normal, that is A*A = AA*, then its eigenvectors form an orthonormal basis of H
c 00
A=
x j ( A ) ( . f#JjMj, ,
&j
=M44j
(8.174)
j=l
and
8.3.1.2 Minimax principles and estimates of eigenvalues and singular values
Lemma 8.9
Let A be a selfadjoint and compact operator on H . Let A; >_A,+ >_ ...
(8.176)
Auxiliary Results
263
be its positive eigenvalues counted according to their multiplicities and are corresponding eigenvectors
4j
(8.177)
(8.178) where L, c H is an n-dimensional subspace. Maximum in (8.178) is attained on the subspace (8.179)
Lj(A) := span(41,. . . + j }
spanned by the first j eigenvectors of A corresponding to the positive eigenvalues. Remark 8.2 Maximum may be attained not only on the subspace (8.179). The sign 4 IL means that q5 is orthogonal to the subspace L. Lemma 8.10
If A E T(,
then (8.180)
Lemma 8.10 follows immediately from Lemma 8.9 and from the definition of the s-values given by formula (8.158).
Lemma 8.11
If A E
T (,
then
K where
Kj
II
(8.181)
is the set of operators of rank 5 j .
The following inequalities for eigenvalues and singular values of compact operators are known.
Lemma 8.12 that is
If A and B are selfadjoint compact operators and A 2 B ,
then (8.183)
Random Fields Estimation Theory
264
Lemma 8.13
If A and B are selfadjoint and compact operators, then
X;+,-,(A
+ At (B ) ,
+B) I
(8.184)
and
Moreover
and
p;(4
-
+ B)I 111 B 11
(8.187)
< 0 are the negative eigenvalues of a selfadwhere A,(A) I &(A) I joint compact operator A counted according to their multiplicities. Lemma 8.14 then
If A is compact and B is afinite rank operator, rank B
~ j + v ( A )I sj(A Lemma 8.15
If A, B E T,( Srn+n-l(A
Lemma 8.16
+ B ) F sj-v(A).
= Y,
(8.188)
then
+ B ) I srn(A) + Sn(B),
(8.189)
If A E coothen nj”,llxj(A)I I n&lsj(A).
(8.191)
Lemma 8.17 If A , B E u, and f(x), 0 5 x < 00, is a real-valued nondecreasing, convex, and continuous function vanishing at x = 0, then n.
for all n = 1 , 2 , . . . , m .
n
265
Auxiliary Results
In particular, i f f (x)= x, one obtains (8.193) j=l
j=1
j=l
for all n = 1,2,. . . ,co. If f ( x ) , f ( 0 ) = 0, 0 5 x < 00, is such that the function $ ( t ) := f (exp(t)) is convexJ -00 < t < 00, then n
n
j=1
j=1
(8.194)
for all n = 1,2,. . . , co. In particular, i f f ( x )= x , then n
n
j=1
j=1
(8.195)
for all n = 1,2,.
. . ,oo.
Lemma 8.18
Let A , B E cooand lim nasn(A) = c,
n+w
(8.196)
where a > 0 and c = const > 0. Assume that lim nasn(B)= 0.
new
(8.197)
Then lim nas,(A
n+oo
+ B ) = c.
Proofs of the above results can be found in [GK]. 8.3.2 8.3.2.1
Perturbations preserving asymptotics of the spectrum of compact operators
Statement of the problem
Here we are interested in the following question. Suppose that A and Q are linear compact operators on H, and B is defined by the formula
B = A(I
+ 0).
(8.198)
Random Fields Estimation Theory
266
Question 1: Under what assumptions are the singular values of B asymptotically equivalent to the singular values of A in the following sense: (8.199) Assume now that
sn(A)= cn-p
[1+ O(n-pl)],
n -+ co,
(8.200)
where p and pl are positive numbers, and c > 0 is a constant.
Question 2: Under what assumptions i s the asymptotics of the singular values of B given by the formula sn(B) = c n - p [I
+ ~ ( n - q ),] n
--+
co?
(8.201)
W h e n i s q = pl? We will answer these questions and give some applications of the results. 8.3.2.2
A characterization of the class of linear compact operators
We start with a theorem which gives a characterization of the class of linear compact operators on H . In order to formulate this theorem let us introduce the notion of limit dense sequence of subspaces. Let
Ln
c Ln+l C , . . . ,dim L,
=n
(8.202)
be a sequence of finite-dimensional subspaces of H such that
p(f,L,)--+O as n+oo
forany f € H
(8.203)
where p ( f , L ) is the distance from f to the subspace L.
Definition 8.2 A sequence of the subspaces L, is called limit dense in H if the conditions (8.202) and (8.203) hold. Theorem 8.2 A linear operator A : H -+ H is compact i f and only i f there exists a limit dense in H sequence L, of subspaces L, such that (8.204)
Auxiliary Results
267
If (8.204) holds for a limit dense in H sequence Ln then it holds for every limit dense in H sequence of subspaces. Proof. Suficiency. Assume that L, is a limit dense in H sequence of subspaces and condition (8.204) holds. We wish t o prove that A is compact. Let P, denote the orthoprojector in H onto L,. Condition (8.204) can be written as 7, := sup
)I Ah \I+
0 as n --t
(8.205)
03.
llhll=1 hLLn
Therefore
-
IlAgll I
SUP g=( I
- Pn ) h
SUP gJ-Ln
I/ Ag II=
7n
+
0.
llhllll
(8.206) Therefore A is the norm limit of the sequence of the operators AP,. The operator AP, is of finite rank 5 n. Therefore A is compact. Note that in the sufficiency part of the argument the assumption that the sequence L, is limit dense in H is not used. In fact, if condition (8.204) holds for any sequence of subspaces L, c L,+1 then A is compact as we have proved above. 0
Necessity. Assume now that A is compact and { L n } is a limit dense in H sequence of subspaces. We wish t o derive (8.204). We have SUP hlLn
11 Ah 11 =
IIh II= 1
SUP Pn h=O
11 Ah - APnh 115
llhll=1
=
1) A(I - P,)
0
SUP
IIhll= 1
as n
11 A ( I - Pn)h 11
+ 00.
(8.207)
The last conclusion follows from the well known result which is formulated as Proposition 8.1.
Proposition 8.1 If A is compact and the selfadjoint orthoprojection P, converges strongly to the identity operator I then
11 A(I - Pn) I(-+ 0
as n 4 03.
(8.208)
Note that P, I strongly if and only if the sequence L, is limit dense in H . Let us prove Proposition 8.1. Let A be compact and Bz = B, -+ 0
Random Fields Estimation Theory
268
strongly. In our case B = I - Pn. Represent A = K finite rank operator and 11 F, I[< E . Then
+ F,, where K is a
(8.209) I1 ABn 11<11 KBn I1 + (1 KBn ((I C€+ 11 KBn 11 . Here c 211 B, 1) does not depend on n and E . Choose n sufficiently large. Then
since Bn
+0
strongly and K is a finite rank operator. Indeed m
KBnh =
C sj(Bnh,
4j)$j
(8.211)
j=1
where m
K :=
C sj(.,
+ j ) ~ j , s j = const.
(8.212)
j=l
It is known that B, -+ 0 strongly does not imply B: 4 0 strongly, in general. But since we have assumed that B, = B;,we have m
11 ~
C
~ IIs n
1h 3jI
II $ j IIII h IIII Bn4j Its ~ ( nII )h II
(8.213)
j=1
where
E(n) -+ 0
as n -+
00
because
Proposition 8.1 and Theorem 8.2 are proved. Note that in the proof of the necessity part the assumption that the sequence Ln is limit dense in H plays the crucial role: it allows one to claim that Pn 4 I strongly. If the sequence L, is not limit dense then there may exist a fixed vector h # 0 such that 11 Ah (I> 0 and h I Ur=, L,. In this case condition (8.204) does not hold. This is the case, for example, if h is the first eigenvector of a selfadjoint compact operator A , and L, := span(42,. . . , #,+I} where Aq5j = Aj+j. 8.3.2.3 Asymptotic equivalence of s-values of two operators We are now ready to answer Question 1. Recall that N ( A ) := {u : Au = 0).
Auxiliary Results
269
where
(8.218) Here M , is so chosen that the condition 4 I M , is equivalent to the condition ( I Q ) 4 I &(A), where &(A) is the linear span of the first n eigenvectors of the operator (A*A)'/'
+
M,
+
:= ( I
+ Q*)L,(A).
(8.219)
+
Since N ( I Q) = ( 0 ) and Q is compact, the operator I Q is an isomorphism of H onto H and so is I Q*. Therefore the limit dense in H sequence of the subspaces .&(A) is mapped by the operator I Q* into a limit dense in H sequence of the subspaces Mn. Indeed, suppose that f I M , Vn, that is
+
(f,( I + Q*)4j)= 0
+
V.i
(8.220)
where { 4 j } is the set of all eigenvectors of the operator (A*A)1/2 including the eigenvectors corresponding to the eigenvalue X = 0 if zero is an
Random Fields Estimation Theory
270
eigenvalue of (A*A)l12.Then
Since the set of all the eigenvectors of the operator (A*A)l12is complete in H , we conclude that
(I
+ Q ) f = 0.
(8.222)
+
This implies that f = 0 since I Q is an isomorphism. The fact that E , -+ 0 follows from the compactness of Q and Theorem 8.2. Let B = A ( I + Q ) . Since ( I + Q ) - l = I-t-91 where Q1 := -Q(I+Q)-l is a compact operator, one has A = B(I Q1). Therefore one obtains as above the inequality:
+
From (8.217) and (8.223)equation (8.215) follows. The proof of (8.216) reduces to (8.215) if one uses property (8.162) of s-values. Theorem 8.3 is proved. 0 The result given in Theorem 8.3 is optimal in some sense. Namely, if Q is not compact but, for example, an operator with small norm, then the conclusion of Theorem 8.3 does not hold in general (take, for instance, Q = EI where I is the identity operator). The assumption rank A = 00 is necessary since if rank A < 00 one has only a finite number of nonzero singular values. The assumption N ( I Q ) = { 0 } is often easy to verify and it is natural. It could be dropped if the assumption about the rate of decay of sn(A) is
+
but we do not go into detail. 8.3.2.4 Estimate of the remainder Let us now answer the second question.
Theorem 8.4 Assume that A and Q are linear compact operators on H , N ( I Q ) = { 0 } , B := A(I Q ) ,
+
+
sn(A)= cn-p [1+O(neP1)]
as n
-+ 00,
(8.224)
Auxiliary Results
271
where p , p l and c are positive numbers, and
Then
where
(
q := min p l ,
-
(8.227)
lypa).
I n particular,
if
pa > p l 1+ p a
(8.228)
then q = p l
and therefore not only the main term of the asymptotics of sn(A) is preserved but the order of the remainder as well. Remark 8.3 The estimate (8.227) of the remainder in (8.226) is sharp in the sense that it is attained f o r some Q.
Proof.
Let n and m be integers. It follows from (8.180) that
Here, as in the proof of Theorem 8.3, M, is defined by formula (8.219), and C,(A) is the linear span of first m eigenvectors of the operator (A*A)1/2.This means that we have chosen Ln+, to be the direct sum of the subspaces Mn C,(A). Since the sequence L m ( A )is limit dense in H one can use Theorem 8.2 and conclude from (8.229) that
+
~ n + m + l ( BI ) sn+l(A)(1 + e m ) , and
Therefore
6,
--t
0
m
4
co
(8.230)
272
Random Fields Estimation Theory
Unfortunately our assumptions now do not allow t o use the argument similar t o the one used at the end of the proof of Theorem 8.3. The reason is that our assumptions now are no longer symmetric. with respect to A and B. For example, inequality (8.225) is not assumed with B in place of A. In applications it is often possible to establish the inequality (8.225) with B in place of A, and in this case the argument can be simplified: one can use by symmetry the estimate (8.232) in which B and A exchange places. With the assumption formulated in Theorem 8.4 we proceed as follows. Write A =B(I+
Qi),
Qi
=
-Q(r + Q)-’.
(8.233)
Choose
and use the inequalities similar to (8.229)-(8.232) to obtain
It follows from (8.232) and (8.235) that
where we took into account that O<s,(A)-+O
as
m+cc
(8.237)
so that s&(A) 5 sm(A) for all sufficiently large m. Therefore, for all sufficiently large m one has
I
Sn+m+1 (A)
[i + 0 (s&(A))]. (8.238)
Choose
m = nl-”,
o < z < 1.
(8.239)
273
Results
A&liary
It follows from (8.224) and (8.239) that
=1
m + 0 (---) + O ( n - p l )= 1 + O(n-%)+ 0(n-p1).(8.240)
From (8.240) and (8.238) one obtains
Let q := min { P I , 2,(1 - z ) p a } .
(8.242)
Then
(8.243) Since
n+m+l ~1 n
as n + m
(8.244)
it follows from (8.243) that formula (8.226) holds. Choose now 2 , 0 < 2 such that min (x,(1 - z ) p a ) = max.
<1
(8.245)
An easy calculation shows that (8.245) holds if
(8.246) Therefore q =min
(
PI,-
1Ypa).
This is formula (8.227). Theorem 8.4 is proved.
0
We leave for the reader as an exercise to check the validity of the Remark. Hint. A trivial example, which shows that for some Q the order of the remainder in (8.226) is attained, is the following one. Let A > 0 be a
Random Fields Estimation Theory
2 74
selfadjoint compact operator. In this case s j ( A ) = Aj(A). Take Q = $ ( A ) . Then
L ( B ) = An { A [I+ $ ( A ) ] = ) Xn(A) 11 + $(An11
(8.247)
by the spectral mapping theorem. If one chooses $(A) such that q5(An) = o ( n - P 1 ) .
(8.248)
Then q = p1 is the order of the remainder in the formula
An(B)= cn-p [1+O(n-P1)] . 8.3.2.5
(8.249)
Unbounded operators
Note that the results of Theorems 8.3, 8.4 can be used in the cases when the operators we are interested in are unbounded. For example, suppose that L is an elliptic selfadjoint operator in H = L 2 ( D )of order s and e is a selfadjoint in H differential operator of lower order, ordE = m. We wish to check that (8.250)
+
+
Since Xn(L)-+ +GO and An(L c I ) = X,(C) c where c is a constant, one can take C CI in place of C in (8.250) and choose c > 0 such that the operator C+cI is positive definite in H . Then the operator A := (C+cI)-l is compact in H . Moreover
+
c + c~ + e = [ I + e(c+ d ) - l ]
(L:
+d )
so that
B := (C+ C I If ordl
+ el-1
= (c
+ c ~ ) - [l I + E(L: + C 1 ) - 1 ] - l .
(8.251)
< ordC then the operator S := l ( C + cI)-'
is compact in H .
One can always choose the constant c that
(8.252)
> 0 such that N ( I + S) = {0}, so
( I + S)-' = I
+Q
(8.253)
where Q is compact in H . Then (8.251) can be written as
B =A(I+Q)
(8.254)
Auxiliary Results
275
and the assumptions of Theorem 8.3 are satisfied. In fact, since A and B are selfadjoint and X,(A) and X,(B) are positive for all sufficiently large n, one has
s n ( B )= An(B), sn(A)= X,(A),
Vn
> 720.
(8.255)
By Theorem 8.3 one has XTl ( B ) lim - 1.
X,(A)
Since A,(B-l)
= & l ( B ) , it
(8.256)
follows from (97) that (8.257)
This is equivalent to (8.250) because, as was mentioned above, lim ~
-
+ +CI) =1
X,(C c X,(C+l) +
(8.258)
~
for any constant c. 8.3.2.6 Asymptotics of eigenvalues In this section we prove some theorems about perturbations preserving asymptotics of the spectrum. In order to formulate these theorems in which unbounded operators appear, we need some definitions. Let A be a closed liner densely defined in a Hilbert space H operator, D ( A ) is its domain of definition, R ( A ) is its range, N ( A ) = {u : Au = 0) is its null-space, a ( A ) is its spectrum.
Definition 8.3 We say that the spectrum of A is discrete if it consists of isolated eigenvalues with the only possible limit point at infinity, each of the eigenvalues being of finite algebraic multiplicity and for each eigenvalue X j the whole space can be represented as a direct sum of the root subspace M j corresponding to X j , and a subspace H j which is invariant with respect to A and in which the operator A - XjI has bounded inverse. In this case Xj is called a normal eigenvalue. The root linear manifold of the operator A corresponding to the eigenvalue X is the set of vectors which solve the equation
M A := {u : ( A - XI)"u
=0
for some n } .
(8.259)
Random Fields Estimation Theory
276
The algebraic multiplicity .(A)
of the eigenvalue X is
.(A)
:= dimMx.
(8.260)
If M A is closed in H it is called root subspace. The geometric multiplicity .(A) of X is the dimension of the eigenspace corresponding to A, .(A) = dim N ( A- X I ) . If E > 0 is small enough so that there is only one eigenvalue in the disc Iz - XI < E then 1 PA := -2nz
lz-*,=.
R ( z ) d z , R ( z ) := ( A - z I ) - l
(8.261)
is the projection, that is P 2 = P . The subspace PxH is invariant for A ,
PA commutes with A , PxA = APx, the spectrum of the restriction of A onto PxH consists of only one point A, which is its eigenvalue of algebraic multiplicity .(A). An example of operators with discrete spectrum is the class of operators for which the operator ( A - XoI)-' is compact for some XO E C. Such are elliptic operators in a bounded domain. If A is an operator with discrete spectrum then
IX,(A)I
4 0 0
as .--too.
(8.262)
+ cI) = 1.
(8.263)
Thus, for any constant c, lim
n-cc
X,(A
X,(A)
Therefore it is not too restrictive to assume that A-' exists and is compact: if A-l does not exist then choose c such that ( A cl)-l exists and study the asymptotics of X,(A cI) = X,(A) c. Note that if ( A - X o I ) - l is compact for some XO, then ( A- XI)-l is compact for any X for which A - X I is invertible. This follows from the resolvent identity
+
( A- XI)-'
= ( A- X O ) - l
+
+
+ (A - Xo)(A- XI)-I(A - X o I > - l .
(8.264)
If A-l is compact we define the singular values of A by the formula (8.265)
If
A = A* 2 m > 0
(8.266)
Auxiliary Results
277
then we denote by H A the Hilbert space which is the completion of D(A) in the norm 11 u ] ] A = (Au,u)ll2. Clearly H A c H , 11 u 115 m-l (1 u ( [ A , and ( u , v ) ~:= ( A u , ~is) the inner product in H A . The inner product can be written also as ( u ,V ) A = (A1/2u, A1/2v).If B = B* 2 --m, then by H B we mean the Hilbert space which is the completion of D ( B ) in the norm 11 u IIB:= ( ( B m 1)u,u)'I2. All unbounded operators we always assume densely defined in H .
+ +
Theorem 8.5 Let A = A* 2 m > 0 be a linear closed operator with discrete spectrum, T be a linear operator, D ( A ) c D ( T ) , B := A T , D ( B ) = D ( A ) . Assume that A-lT is compact in H A , B = B* and H A c D ( T ) . Then
+
(8.267)
The conclusion (8.267) remains valid if A 2 -m and [ A+ ( m+ 1)II-l T is compact in HA. Remark 8.4 If T > 0 then A-IT is compact in H A if and only if the imbedding operator i : H A + HT i s compact. By HT we mean the Hilbert space which is the completion of D ( T ) in the n o r m (Tu,u)lj2. If T is not positive but I ( T f , f ) l 5 (Qf,f) f o r some Q > 0 and all f c D ( T ) , D ( T ) C D ( Q ) , and if the imbedding i : H A 4 HQ i s compact then A-lT is compact. The reader can prove these statements as an exercise or find a proof in [Glazman (1965), $41. To prove Theorem 8.5 we need a lemma.
Lemma 8.19 If the operator A-lT is compact in H A then H A = H B and the spectrum of B is discrete. Assuming the validity of this lemma let us prove Theorem 8.5 and then prove the Lemma.
Proof of Theorem 8.5 Let us use the symbol LI for orthogonality in H A and I for orthogonality in H . If ,&(A) is the linear span of the first n eigenvectors of A-' then f I ,&(A) is equivalent t o f LI & ( A ) . Indeed, if A-'4j = Ajq5j, X j # 0, then
Random Fields Estimation Theory
278
Note also that inf{a(l+@)}>(l-sup@)infa
if
a 2 0 and
-l
(8.268)
We will use the following statement
(8.269) which follows from Theorem 8.2 and the assumed compactness of A-lT in HA. We are now ready to prove Theorem 8.5. By the minimax principle one has
where we have used (8.268) and (8.269). By symmetry, for all sufficiently large n, one has
We left for the reader as an exercise to check that under the assumptions of Theorem 8.5 the operator ( B cI)-lT is compact if B cI is invertible. From (8.270) and (8.271) the desired conclusion (8.267) follows. If A 2 -m and [ A ( m 1)II-lT is compact in HAthen we argue as above and obtain in place of (8.270) the following inequality
+
+
+ +
Xn+1
{B
+ ( m+ 1)I)L Xn+l { A+ ( m+ 1)I)( 1 - T n ) , Tn
+
+
Since Xn(A cI) = Xn(A) c, X,(A) (8.272) implies
-+ +00
-+
0
a~
n + 00.
(8.272)
and Xn(B)+ + m yinequality
and the rest of the argument is the same as above. Theorem 8.5 is proved.
Auxiliary Results
279
+
Proof of the Lemma 8.19 Note that B = A(I S),where S := A-lT is compact in HA.Let us represent S in the form S = Q+ F where 11 Q [ [ A < 1 and F is a finite rank operator. The operator S is selfadjoint in H A . Indeed
(sf, g ) A = (Tf, !?) = (f,Tg)= (f,S g ) A where we used the symmetry of T = B - A on H . We choose Q and F to be selfadjoint in HA. The operator I Q is positive definite in H A while
+
N
(F u ,u)A = j=1
aj 1 (u, $ j ) A l2
for some orthonormal in HA set of functions $ j , for some constants a j , and some number N = rank F . Since D(A) is dense in HA,one can find wj E D(A) such that 1) $j -wj IIA< E , where E > 0 is arbitrarily small. Then N
I ( F u , u ) ~5l
xlajlI(u,$j
j=1
I C1E II u ;1
+c2
-Wj)~+(u,AWj)1~
II 1' L [I2
(8.274)
where c1 and c2 are some positive constants which do not depend on u.It follows from (8.274) that ( B u ,u)= ( A ( I
+ Q)u,u)+ (AFu,u) +
= ((I -k Q)u,u ) ~( F u ,u ) A
2 co
II 1' 1 ;1
--C1E
II 1' 1 ;1
-c2
II u [I2
(8.275)
where Q > 0. It follows from (8.275), if one chooses E so small that Q - E C ~ > 0, that B is bounded from below in H. Since, clearly, (Bu,u)5 c(u, U ) A one concludes that the metrics of HAand HB are equivalent, so that HA= HB. It remains to be proved that the spectrum of B is discrete. Since B is selfadjoint it is sufficient to prove that no point of spectrum a ( B ) of B belongs to the essential spectrum aess(B). Recall that X E a e s s ( B ) , where B = B * , if and only if dimE(A,)H < 00 for any E > 0, where A, = ( A - E , X+E) and E(A) is the resolution of the identity corresponding to the selfadjoint operator B. Assume that X E a ( B ) and dimE(A,,)H = 00 for some sequence En + 0. Then there exists an orthonormal sequence u, E H , such that 11 Bun - Xu, I -+ 0 as n + 00. Thus
11 u, + A-lTu,
- XA-lu,
/I+
0 as n + 00.
(8.276)
Random Fields Estimation Theory
280
Since I[ u, ((=1 we have
(Aunlun)+ (TUn,un)- X ( ~ n , u n+ ) 0,
n
--+
00.
(8.277)
If A-'T is compact in H A , we have proved that, for any E > 0,
I(Tu,u)I I E
11 ;1
+c(E)
11 u /I2,
u E HA.
(8.278)
It follows from (8.277) and (8.278) that
1I u n [ [ A
(8.279)
where c > 0 is a constant which does not depend on n. Since A-lT is compact in H A , inequality (8.279) implies that a subsequence of the sequence u, exists (we denote this subsequence again by u,) such that A-lTu, converges in H A and, therefore, in H . Since the set u, is orthonormal, u, converges weakly to zero in H (8.280)
n-+m
u,-O,
Therefore
)I A - ~ T ~]I-+, o
as n -+
(8.281)
00.
From (8.281) and (8.276) it follows that
1) u, - XA-Iu,
0 as n -+
00
(8.282)
where u, is an orthonormal subsequence. This means that if X # 0 then X E uess(A) which is a contradiction since, by assumption, A does not have essential spectrum. If X = 0 then (8.282) cannot hold since 11 u, ))=1. Therefore B does not have essential spectrum and its spectrum is discrete. Lemma 8.19 is proved. 0
Example 8.1 Let A be the Dirichlet Laplacian -A in a bounded domain D c R' and B = -A + q(x), where q(z) is a real-valued function. In this case Tu = q(x)u is a multiplication operator. The condition A-'T is compact in H A means that (-A)-lq is compact in ;'(D). This condition holds if and only if A-1/2T is compact in H = L 2 ( D ) , D C R', that is (-A)-lI2q is compact in L 2 ( D ) . If, for example, q E LP(D) then the operator (-A)-'/2q(x) is compact in L 2 ( D ) provided that q E L r ( D ) , y > T , and Theorem 8.5 asserts that, in this case, =I 1.) lim &%(-A+( n+w
A,(-A)
(8.283)
Auxiliary Results
28 1
In the calculation of the LP class to which q belongs we have used the known imbedding theorem which says that the imbedding i : WkJ'(D) +
L*(D) is compact for k p < r , where Wk+'(D) is the Sobolev space of functions with derivatives of order I k belonging to LP(D). If q E L r ( D ) and u E L 2 ( D )then qu E LP(D),
where a
> 1, p = 5, p p = 2, p a = y. Thus P=
and p =
2 ( a - 1 ), p = a
Y -, a
so that a = l + -Y 2
&.On the other hand, if qu E LP(D) then A-iqu E W'ip(D)c
5
3,
L*(D). If > 2, that is p > then for which (8.283) holds can be relaxed.
> r. The condition on q ( x )
In the next theorem we assume compactness of A-lT and TA-l in H rather than in H A .
Assume that A = A* 2 m > 0 is an operator with discrete spectrum, D(A) c D ( T ) , B = A T , D ( B ) = D ( A ) , B is normal, 0 6 a ( B ) ,and the operator A-lT is compact in H . Then the spectrum of B is discrete and Theorem 8.6
+
(8.284)
Proof. First we prove that the spectrum of B is discrete. Since A is selfadjoint positive definite and its spectrum is discrete it follows that A-l is compact. Let
+ T u = Xu + f, u + A-lTu = XA-lu + A - l f . (8.285) Since 0 $! a ( B )the operator I + A-lT has bounded inverse. Therefore = X ( I+ A - ~ T ) - ~ A % + ( I + A - ~ T ) - ~ A - ~ ~ . (8.286) The operator ( I + A-lT)-lA-l is compact being a product of a bounded Au
and compact operator. Therefore ( B - XI)-' = [ I - X(I
Equations (8.285) and (8.286) are equivalent.
+ A - l T ) - l A - l ] - l ( I + A-lT)-lA-l.
(8.287)
282
Random Fields Estimation Theory
It follows from (8.287) that X E c ( B ) if and only if A-' E o ( F ) , F := ( I + A-lT)-lA-l. Since F is compact, each X is an isolated eigenvalue of finite algebraic multiplicity and o ( B ) is discrete. In this part of the argument we did not use the assumption that B is normal. If B is normal then IX,(B)I = sn(B), where sn(B)are the singular values of B. Since B = A(I + A-lT) and A-lT is compact, since s,(B) = szl(B-l), and since A-l is compact, we can apply Theorem 8.2 and get (8.288) Since A > 0, we have sn(A)= X,(A). Therefore the desired result (8.284) will be proved if we prove that (8.289)
Let us prove (8.289). Let
A$j
+ T+j = X j 4 j .
(8.290)
Since B is normal we can assume that (8.291) Rewrite (8.290) as (8.292) Multiply (8.292) by q$ to get
Since A-lT is compact and q5j
-
0 as j
--f
03,
(A-1T4j,4j)-+ 0 as j
we have: 4
03.
(8.294)
Note that (A-'$x~, c&) > 0. Therefore it follows from (8.293) that
This implies (8.289). Theorem 8.6 is proved.
0
Auxiliary Results
283
8.3.2.7 Asymptotics of eigenvalues (continuation) In this section we continue to study perturbations preserving asymptotics of the spectrum of linear operators. Let us give a criterion for compactness of the resolvent ( A - XI)-' := R(X) for X # a ( A ) , where A is a closed densely defined linear operator in H .
Theorem 8.7 The operator ( A- X I ) - ' , if the operator ( I A*A)-' is compact.
+
Proof.
Suficiency. Suppose ( I
I1 gn II<
c,
X
61 a ( A ) is compact i f and only
+ A*A)-l is compact and X 61 a ( A ) . Let ( A - W-lg, = fn.
(8.296)
Then
where c denotes various positive constants. Therefore (8.298)
+ +
+
The operators ( I A*A)-l and ( I A*A)-1/2are selfadjoint positive operators. They are simultaneously compact or non-compact. Therefore if ( I A*A)-' is compact then ( I A * A ) - 1 / 2is compact and (8.298) implies that the sequence {f,} is relatively compact. Therefore the operator ( A - X I ) - ' , X @ a ( A ) , maps any bounded sequence g, into a relatively compact sequence f,. This means that ( A - XI)-l is compact.
+
Necessity. Assume that ( A - XI)-l is compact and 11 h, 115 c. Then the sequence ( A - XI)-'h, is relatively compact. We wish to prove that the sequence 4 , := ( I A*A)-'h, is relatively compact. The sequence ( I A*A)q, = h, is bounded. Thus
+
+
((1+ A*A)qn,qn)=II qn
[I2 + 1) Aqn / I 2-<
Define p, := ( A - XI)qn, qn = ( A - XI)-'p,.
II Pn 1 1 1 1 1
C.
(8.299)
We have
I1 +PI II 4n 115 c
(8.300)
where c denotes various constants. From (8.300) and compactness of ( A XI)-' it follows that the sequence qn = ( A- XI)-'p, is relatively compact. Theorem 8.7 is proved. 0
Random Fields Estimation Theory
284
Remark 8.5 0 # 4A).
Let T be a linear operator in H , D(A)
Definition 8.4
c
D(T) and let
If for any sequence f n such that
the sequence Tfn is relatively compact then T is called A-compact. In other words, T is A-compact if it is a compact operator from the space G A into H . The space G A is the closure of D(A) in the graph norm 11 f (IG, ,:=~( f 11 11 Af 11. If A is closed, which we assume, then D(A) = G A is a Banach space if it is equipped with the graph norm.
+
Proposition 8.2 The operator T is A-compact i f and only i f the operator TA-' is compact in H .
Proof. Suppose T is A-compact. Let 11 f n 115 c, and define gn = A-' f n . Then 11 gn 11 11 Ag, 115 c. Therefore the sequence Tgn is relatively compact. This means that the sequence TA-lf, is relatively compact. Therefore TA-l is compact in H . Conversely, suppose TA-' is compact in H and 11 f n 11 11 Afn 111 c. Then the sequence Tfn = TA-'Afn is relatively compact. Proposition 8.2 is proved. 0
+
+
8.3.2.8 Asymptotics of s-values In this section we prove
Theorem 8.8 Let A be a closed linear operator in H . Suppose that a(A), the spectrum of A, is discrete and 0 # a(A). Let T be a linear operator, D ( A ) C D(T), B = A T, D ( B ) = D(A). If the operator TA-' is compact then B is closed. If, in addition, A-' is compact and, f o r some number k # IJ(A), the operator B+ kI is injective, then a ( B ) is discrete and
+
lim sn(B) - 1 sn(A)
n-+m
as n-+ oo.
(8.301)
The following lemma is often useful.
Lemma 8.20 Suppose that { f n } E H is a bounded sequence which does not contain a convergent subsequence. Then there is a sequence { $ m } = {fn,+, - fn,} such that
lCIm-O
as m-too
(8.302)
Awiliary Results
285
and {qm}does not contain a convergent subsequence.
Proof.
Since { f n } is bounded we can assume that it converges weakly:
fn
f
(8.303)
(passing to a subsequence and using the well known fact that bounded sets in a Hilbert space are relatively weakly compact). Since { f n } does not contain a convergent subsequence, one can find a subsequence such that
II fn,
-fn,
112 E > 0 for all m # k.
(8.304)
If
qrn := fn,+l
-
(8.305)
fn,
then (8.303) implies (8.302), and the sequence {qm} does not contain a convergent subsequence because
[I qrn 112 E > 0,
(8.306)
and if there would be a convergent subsequence Qmj it would have to converge to zero since its weak limit is zero. Lemma 8.20 is proved. 0 This lemma can be found in [Glazman (1965), $51 where it is used in the proof of the following result: if A is a closed linear operator in H and K is a compact operator then a,(A K ) = a,(A), where a,(A) is continuous spectrum of A that is the set of points X such that there exists a bounded sequence qmE D(A) which does not contain a convergent subsequence and which has the property 11 Aqrn - Xqrn[I--+ 0 as m + m.
+
Proof of Theorem 8.8 (1) Let us first prove that B is closed. Assume that fn
and
fn C
4
f
7
B f n = Afn
+Tfn
+
9,
(8.307)
D ( B ) = D ( A ) . Suppose we have the estimate
II Afn 115 C.
(8.308)
Then the sequence T f n = TA-lAf, contains a convergent subsequence since TA-l is compact. This and the second equation (8.307) imply that %he sequence { Af n } contains a convergent subsequence which we denote
Random Fields Estimation Theory
286
again Af,.
Since A is closed by the assumption, we conclude that f E
D ( A ) = D ( B ) and Af +Tf = g where we took into account that lim T f n = lim T A - l A f , = T A - l A f = T f .
n-co
n+cc
Thus, the operator B is closed provided that (8.308) holds. Let us prove inequality (8.308). Suppose (8.309)
(8.310) Equation (8.307) implies
Agn
+Tgn
+0,
n -+ 00.
(8.311)
As above, compactness of the operator TA-' and the last equation (8.310) imply that one can assume that the subsequence Tg,,, which we denote again Tg,, converges in H . This and equation (8.311) imply that Ag, converges to an element h:
Ag, Since A is closed and gn diction:
+ 0,
+ h.
one concludes that h
(8.312) = 0.
This is a contra-
This contradiction proves estimate (8.308). We have proved that the operator B is closed. (2) Let us prove that u(B)is discrete. We have
( B- XI)-'
=(A
+T - XI)-'
=(A
+ k I ) - ' ( I + Q - pS)-l,
(8.313)
where
Q := T S , p =X
S = ( A + kI)-'
+ k,
k @ a(A).
(8.314) (8.315 )
Auxiliary Results
287
+
+
The-operators S and Q are compact. If B kI is injective then I Q is injective. Since Q is compact this implies that I Q is an isomorphism of H onto H . Therefore
(I
+ Q - pS)-'
=(I
+
+ Q)-'(I - @)-'
(8.316)
is compact.
(8.317)
where
K := S ( I + Q)-'
Therefore the set p for which the operator B - XI is not invertible is a discrete set, namely the set of the characteristic values of the compact operator K . Recall that pj is a characteristic value of K if
Thus the set { p j } has the only possible limit point at infinity. Each pj is an isolated eigenvalue of k of finite algebraic multiplicity and therefore X j = pj - k is an isolated eigenvalue of B of finite algebraic multiplicity. Finally, the corresponding to X j projection operator (8.261) is finite dimensional, so that X j is a normal eigenvalue. We have proved that a(B)is discrete. (3) Let us prove the last statement of the theorem, i.e. formula (8.301). We have
snp)=~
; l ( ~ -= l )s;'
{ A - ~ (+ I TA-~)-~}.
(8.319)
We can assume without loss of generality that k = 0. In this case the operator I TA-' is invertible and since TA-l is compact one can write ( I TA-')-' = I S, where S is a compact operator. The operator A-' is compact by the assumption. We can apply now Theorem 8.3 and obtain
+
+
+
lim
n-00
+
sn { A V 1 ( I S)} = 1. sn(A-')
This is equivalent to the desired result (8.301).
(8.320)
0
8.3.2.9 Asymptotics of the spectrum for quadratic forms In this section we study perturbations preserving spectral asymptotics for quadratic forms. As a motivation to this study let us consider the following classical problem.
288
Random Fields Estimation Theory
Let D C R' be a bounded domain with a smooth boundary I?. Consider the problems
( - A + l ) u j =Xjuj
(-A+l)uj =pjuj
in D,
UN
=O
on I'
in D, u ~ + a u = O on
where a = a ( s ) E C1(r) and N is the outer normal to The question is: how does one see that
(8.321)
r
r.
Pn = 1 lim -
(8.323)
An
n+a
(8.322)
The usual argument uses relatively complicated variational estimates. The eigenvalues An are minimums of the ratio of the quadratic forms (8.324)
while pn are minimums of the ratio
The desired conclusion (8.323) follows immediately from the abstract result we will prove and from the fact that the quadratic form J, aluI2dsis compact with respect to the quadratic form sD[/Vul2 luI2]dx. Let A[u,w]and T[u,w]be bounded from below quadratic forms in a Hilbert space, T[u,u]2 0 and A[u,u]> m 11 u 112, m > 0. Assume that D[A]c D [ T ] where , D[A]is the domain of definition of the form A , and that the form A is closed and densely defined in H . The form A is called closed if D[A]is closed in the norm
+
(8.326)
If A[u,u]is not positive definite but bounded from below: A[u,u]2 -m u [I2, Vu E D[A],then the norm 11 u is defined by
11 u IIA=
+ + I)(%
{A[%u] (m
u)}1'2.
11
(8.327)
The following proposition is well-known (see e.g. [Kato (1995)l).
Proposition 8.3 Every closed bounded from below quadratic form A[u,w] is generated by a uniquely defined selfadjoint operator A.
Auxiliary Results
289
This means that A[u,v] = ( A u , ~ ) Vu E D(A),
w E D[A]
and D(A) c D[A] c H is dense in D[A] in the norm (8.327). The spectrum of the closed bounded from below quadratic form is the spectrum of the corresponding selfadjoint operator A. Definition 8.5 A quadratic form T is called A-compact if from any sequence f n such that 11 f n 1 11 c one can select a subsequence f n k such that
T[fnk - f n , , f n ,
-fn,]--tOasm1k--t~.
Theorem 8.9 If A[u, u] is a closed positive definite quadratic form in H with discrete spectrum X,(A), and T[u,u] is a positive A-compact quadratic form, D(A) c D(T), then the form B[u,u]:= A[u,u] T[u,u],D[B]= D[A], is closed, its spectrum is discrete and
+
(8.328)
The conclusions of the theorem remain valid if T[u,u] is not positive but IT[u,u]I 5 TI[u, u] and TI is A-compact. We need a couple of lemmas for the proof. Lemma 8.21 Under the assumptions of Theorem 8.9 the quadratic form T[u,u] > 0 can be represented as
T[u,v]= [Tu,v]
(8.329)
where [u,w] is the inner product in H A := D[A] and T > 0 is a compact selfadjoint operator in H A .
Proof. Consider the quadratic form T[u,w]in the Hilbert space H A . Since T[u,u]is A-compact, it is bounded in H A . If T[u,v]is not closed in H A consider its closure and denote it again by T[u,v].By Proposition 8.2 there exists a selfadjoint in H A operator T > 0 such that (8.329) holds. Let us prove that T is compact in H A . Suppose 11 U n l l ~
Thus
n, m -+
00.
Random Fields Estimation Theory
290
Since T > 0 is selfadjoint, T112is well defined and (8.330) can be written as (8.331)
This implies that T112is compact in HA. Therefore T is compact. Lemma 8.21 is proved. 0 Lemma 8.22
Proof.
Under the assumptions of Theorem 8.9 one has H B = H A .
It is sufficient to prove that
+ C ( E ) 11 u 112
T [ u u] , 5 eA[u,u]
VE> 0.
(8.332)
If (8.332) holds then (1- E ) A [u] ~-, C ( E ) 11 u
I B[u,4 I (1+ E)A[u, I. I C 2 ( E ) A [ U ,u]
+ C ( E ) 11 [I2 21
+
so that the norm { B [ uu] , C ( E ) 11 u 112}1/2 is equivalent to the norm 11 u IIA. This means that H B = HA. The proof of (8.332) is the same as the proof of Lemma 8.19 used in the proof of Theorem 8.5. Lemma 8.22 is proved.
0 Proof of Theorem 8.9 We need only prove formula (8.328) and the fact that B has a discrete spectrum. The other conclusions of Theorem 8.9 have been proved in Lemmas 8.21 and 8.22. Since the form B[u,u]is bounded from below in H we may assume that it is positive definite. If not we choose a constant m such that B,[u,u] := B[u,u] m(u,u)is positive definite. Since X,(B,) = Xn(B) m and since Xn(A)+ +m, the equation
+
+
An (Bm) lim --1,
n-00
Xn(A)
n+00
is equivalent to (8.328). Note first that the spectrum of the form B[u,u]is discrete. Indeed, the following known proposition (Rellich's lemma) implies this. 0 Proposition 8.4 Let B[u,u]be a positive definite closed quadratic form in H . The spectrum of B is discrete i f and only i f the imbedding operator i : H B + H is compact.
For the convenience of the reader we prove Proposition 8.4 after we finish the proof of Theorem 8.9.
Auxiliary Results
291
Returning t o the proof of Theorem 8.9 we note that A has a discrete spectrum by the assumption. Therefore i : H A --+ H is compact. Since H A = H B the imbedding i : H B 3 H is compact. By Proposition 8.4 this implies that the spectrum of B is discrete. To prove formula (8.328) we use the minimax principle:
= &l+1(A)(1-"In), "In
03.
(8.333)
Here we used Theorem 8.2 and denoted by &(A) the linear span of the first n eigenvectors of the operator A generated by the quadratic form A[u,u]. Interchanging A and B we get
Xn+l(A)2 Xn+l(B)(l-L), Sn
4
0,
(8.334)
F'rom (8.333) and (8.334) formula (8.328) follows. The last statement of Theorem 8.9 follows from the following proposition 8.4. Proposition 8.5 If IT[u,u]I5 Tl[u,u]and TI is A-compact then the operator A-lT is compact in H A .
We will prove this proposition after the proof of Proposition 8.4. Proposition 8.5 granted, the proof of the last statement of Theorem 8.9 is quite similar to the given above and is left t o the reader. Theorem 8.9 is proved. Proof of Proposition 8.4 Assume that the spectrum of B [ v ,u]is discrete. Then the corresponding selfadjoint operator B has only isolated eigenvalues 0 < m 5 X,(B) +. +m. Therefore the operator B-l is compact in H . This implies that B - 1 / 2 is compact in H . Assume that 11 u, 11~5c, that ~n a is, (1 B1/2un 1 11 c. Then the sequence u, = B - 1 / 2 B 1 / 2 contains convergent in H subsequence. Thus, the imbedding i : H B +. H is compact. Conversely, suppose i : H B -+ H is compact. Then any sequence un such that 11 u, I I B = ~ ~ B1I2u, I[< c contains a convergent subsequence. This
Random Fields Estimation Theory
292
means that B-1/2 is compact. Since B-ll2 is selfadjoint it follows that B-l is compact. Since B 2 m > 0 this implies that the spectrum of B is discrete. Proposition 8.4 is proved. 0
Proof of Proposition 8.5 Denote Q := A-lT, Q1 = A-lTl. Then I[Qu,ulII [ Q ~ u , u l -
(8.335)
where the brackets denote the inner product in HA. The operator Q1 is nonnegative and compact in HA. Indeed,
[Qiu, I. = (Tiu,u)2 0 so Q1 1 0 in HA. Suppose sequence in H T ~that , is
(TIun, -
11
,un,
un IJASc. Then T1un contains a Cauchy -%k)
+
0,
m, k
00.
(8.336)
+ 00.
(8.337)
-+
Thus [QI
(un, - unk),unm- unk]+ 0, m, k
Since Q1 L 0 equation (8.337) implies that Q:'2 is compact in HA. Therefore Q1 is compact in HA. Conversely, if Q1 is compact in HA then1'2 2 0 is A-compact. Indeed, if 11 un / / A < c then T1un = AQlu, so that there is a subsequence unm such that
(TI(un, - unk ,un,
- un,) = [Ql(un, - u7Lk)
9
un,,, - unk]
-+
0
as m, k + 00, because Q 1 is compact in HA. So we have proved that
TI 2 0 i s A compact i f and only i f A-lTl
i s compact in HA. (8.338)
Let us prove that if Q1 is compact in HA. This is the conclusion of Proposition 8.5. It is sufficient to prove that if f n 3 0 and gn 2 0 in HA then
Indeed, if
Auxiliary Results
293
then {fn
I
f
and gn
A
9)
=+ [Qfn,gnI
+
[Qf,gI,
(8.341)
that is, Q is compact. To check that (8.340) implies (8.341) one writes
It follows from (8.342) that (8.340) implies (8.341). Let us check that (8.335) implies (8.340). One uses the well known polarization identity
IQf,
1
91 =
4{[(f+ 9) f + 91 - [(f- g), f - 91 - i [Q(f + is),f + ig] 7
+
i [Q(f - is),f - is]}. (8.343)
It is clear from (8.343) and (8.335) that
1
I[Qfn, gnl I 5 (,I [QI (fn + gn)
7
fn
+ grill + I[Q1 (fn - gn) ,fn - gn]I
+ l[Q1 (fn + ign) > fn + ign]l + I[Ql(fn - ign), fn - isn]/} 0, 4
n
-
-+ 00.
(8.344)
The last conclusion follows from the assumed compactness of Q1 and the fact that if fn 0 and gn 3 0 then any linear combination clfn c2gn converges weakly to zero. Proposition 8.5 is proved. 0
+
Example 8.2 It is now easy to see that (164) holds. Indeed, the imbedding i : H ' ( D ) + L2(r, 1). is compact. Therefore the quadratic form J , nlu12ds is A-compact, where A[u, u] = ,J (1VuI2 lu12)dx. From this and Theorem 8.9 the formula (8.323) follows.
+
8.3.2.10 Proof of Theorem 2.3 In this section we prove Theorem 2.3. First let us note that if (8.345)
Random Fields Estimation Theory
294
where w ( A ) E C(R1),w ( m ) = 0 then the operator R : L 2 ( D ) where D c R' is a bounded domain, and
-+
L2(D),
is compact. This is proved in Section 4.2 (cf. the argument after formula (4.71)). If w ( X ) 2 0 then R = R* 2 0. Suppose that Wl(X)
where $(A) E C ( R 1 ) 1, can be written as
= w ( A ) [I
+ $(A)
+4(4],
4(fm) = 0
(8.347)
> 0. Then the corresponding operator
Ri = R(I + Q ) ,
R1
(8.348)
where Q is a compact operator with the kernel
Srn
(8.349)
lim ___ sn(R1) -1. s,(R)
(8.350)
Q(z,Y) =
4(A)@(z,Y,A)dp(X).
-W
By Theorem 8.3 one has
~-KS
+
Note that the operator I + Q is injective since 1 4(A > 0. Since R1 2 0 and R 2 0 one has sn(R1)= An(R1),sn(R)= X,(R). Therefore (191) can be written as (8.351)
Therefore it is sufficient to prove formula (2.31) for w(X) = (1 Secondly, let us note that if one defines
+ X2)-"l2. (8.352)
then formulas A, = cnp[l
and
+ o(l)]
as n + +m,
c = const > 0,
p
> 0,
(8.353)
Auxiliary Results
295
are equivalent. This follows from the fact that the function N ( X ) is the inverse function for X(N) := AN in the sense that N(XN) = N and X(N(X)) = AN. Therefore if one knows the asymptotics of N ( X ) then one knows the asymptotics of A, and vice versa. In [Ramm (1975), p. 3391 it is proved that if o(1) in (8.353) is O(n--Pl),p l > 0, then o(1) in (8.354) is o ( x - P ~ ~ P ) . Thirdly, let us recall a well known fact that an elliptic operator L of order s with smooth coefficients and regular boundary conditions in a bounded domain D c R' with a smooth boundary (these are the regularity assumptions) has a discrete spectrum and N(X,L ) = yX'/S [l
+ o(l)] ,
(8.355)
X --+ +m,
where N(X,13) is defined by (8.352) with A, = X,(L), and y = const defined by formula (2.32). By formula (8.353) one obtains
X,(L) =y- s / r ns/r [ l + o ( l ) ] ,
n-1 +m.
> 0 is
(8.356)
The operator R is a rational function of L so that by the spectral mapping theorem one obtains X,(R)
= X,"(L)
[l
+ ~ ( l )=] yaS/'n-aS/' [l + o(l)] ,
n --+ 00.
(8.357)
This is formula (2.31). For the function w(X) = (1 X2)-"l2 and even a a proof of the formula (2.31) is given in [Ramm (1980), p. 621. This proof goes as follows. The problem
+
(8.358) is equivalent to the problem
where q5,(z) : = 0
Q(L)u, = O
in R
(8.360)
R
(8.361)
in
Random Fields Estimation Theory
296
u,(oo)= 0, 8kun =,$;#A,
on
r,
as 0I j I - - 1, 2
(8.362)
+
where &(A) := (1 X 2 ) a / 2 . The equivalence means that every solution to (8.358) generates the solution to (8.359)- (8.362) and vice versa. The problem (8.359)- (8.362) can be written as AnQ(.C)$n = X D ( X ) $ ~ ( X ) in R',
(8.363)
where XD(x) =
{
1,
X € D
0, x E R.
This problem has been studied [Tulovskii (1979)] and formula (8.357) has been established. For the general case of w(X) = (1 A2)-a/2, a > 0 one can use the results from the spectral theory of elliptic pseudo-differential operators. Under suitable regularity assumptions the following formula for the number N(X) := {#A, : A, 5 A} of eigenvalues of such an operator R in a bounded domain D C R' is valid:
+
N(X)= (27r)-' meas {(x,() E
D
x R' : r(x,E) < A} [1+o(l)] ,
X
t
+oo.
(8.364) Here meas is the Lebesgue measure, and r ( z ,() is the symbol of the pseudodifferential operator R. This means that
Rh := (27r)-'
11
exp {z(x- y) . <}~ ( zJ)h(y)dydJ, ,
1 kr =
. (8.365)
The symbol of the elliptic operator (2.5) is '&jllsaj(x)(iJ)j. Only the principal symbol, that is is Cljl=s aj(x)Sj, defines the main term of the asymptotics of N(X). Since s = ordC is even one chooses C so that CO(Z,() := C l j l = s a j ( x ) J j> 0 for 151 # 0. For example, one chooses C = -A rather than A. In this case (27r)' meas
( ( 2 , ~E)
D x R'
: ~ o ( z , c )< A} = ~'/~(27r)-' ID
qda: = y ~ ' / ~ ,
(8.366) where q is given by (2.33) for the operator LO in the selfadjoint form. The asymptotic behavior of the function N(X) has been studied extensively for wide classes of differential and pseudo-differential operators (see,
Auxiliary Results
297
e.g., [Levitan (1971); Hormander (1983-85); Safarov and Vassiliev (1997); Shubin (1986)] and references therein).
8.3.3
Trace class and Hilbert-Schmidt operators
In this section we summarize for convenience of the reader some rsults on Hilbert-Schmidt and trace-class operators. This material can be used for references. One writes A E up,1 5 p < 0;) if
j=1
8.3.3.1 Trace class operators
The operators A E ( T I are called trace class (or nuclear) operators. The operators A E 02 are called Hilbert-Schmidt ( H S ) operators. The class u, denotes the class of compact operators. If A E up then A E uq with q 2 p. We summarize some basic known results about trace class and H S operators.
Lemma 8.23 If and only if A E u1 the sum T r A := E F l ( A + j , $ j )is finite for any orthonormal basis of H and does not depend on the choice of the basis. In fact T r A = C,”it’ Aj(A), where v(A) is the sum of algebraic multiplicities of the eigenvalues of A. The following properties of the trace are useful.
Lemma 8.24
If Aj E 0 1 , j = 1,2, then
+
+
1) Tr(clA1 c2A2) = clTrAl c ~ T ~ A cj z= , const, j = 1,2, 2) TrA* = (TrA), the bar stands for complex conjugate, 3) Tr(A1A2)= T T ( A ~ A I ) ,
4 ) Tr(B-’AB)
= TrA where B is
5) Tr(A1A2)2 0 if A1
a linear isomorphism of H onto HI
2 0 and A2 2 0,
6) Tr(A1Az)lI25 TrA1:TrAZ if A1 2 0 and A2 2 0, 7) ITrAl 5 Sj(A):=I1 A 111, 8) 11 A 11i= sup Cj”=, I(Afj,h j ) ) ,where the sup is taken over all orthonorma1 bases {fj} and {hj} of H , 9) 1 I ClAl C2A2 1115 IClI II A1 111 + k 2 l II A2 I l l 1
c;,
+
Random Faelds Estimation Theory
298
10) Assume that A j E 01, 1 I j < cm, 11 Aj 1111 c, c does not depend o n j , and A j 2 A. Then A E 01, and 11 A ))I< supj (1 Aj 111, the symbol denotes weak convergence of operators, 11) i f A E 01 and B is a bounded linear operator, then
-
I1 A B Ill 5 I1 A tllll B 11) II B A 111 1 II A 11111 B II . 8.3.3.2 Hilbert-Schmidt operators The operators A E
are called Hilbert-Schmidt (HS) operators.
02
Lemma 8.25 A E 0 2 i f and only i f the s u m 11 A I/;:= Cj”=,11 Aq5j finite for any orthonormal basis (q5j) of H . If A E 02 then
(I2
is
00
I1 A Ili= If cj
= const and
Aj E
c2
1sj2(A). j=1
then
II c i A i + ~ 2 A 21 1 1lcil II A1 112 +Iczl II A2 112 If A E (
~ 2and
B is a bounded linear operator then
II A 112 = II A’ ll2l I1 A B I12 1I1 A ll2ll B II . Lemma 8.26 A E (4j)of H for which
02
(8.367)
if and only if there exists an orthonormal basis
(8.368) j=l
In particular, if (8.368) hold for a n for a n orthonormal basis of H,then it holds f o r every orthonormal basis of H . Lemma 8.27 H f o r which
AE
01
i f and only i f there exists a n orthonormal basis of
(8.369)
However i f (8.369) holds f o r an orthonormal basis of H it m a y not hold for another orthonormal basis of H .
Auxiliary Results
299
i,..
Example 8.3 Let H = &, f = c ( 1 , 21 , . . ., .) where c = const > 0 is chosen so that 11 f I[= 1. Let A be the orthogonal projection on the one-dimensional subspace spanned by f , and let {&}, $j = & j , be an orthonormal basis of &. Then A4j = f , Cj”=, 11 A4j II= Cj”=, = 00. Lemma 8.28
If and only i f A E 01 it can be represented in the f o r m A = A1A2 where A j E u2, j
= 1,2.
Lemma 8.29 The classes 01 and 02 are ideals in the algebra C ( H ) of all linear bounded operators on H . If H is a separable Hilbert space then the set 0 , of all compact operators on H is the only closed proper non-zero ideal in L ( H ) . The ideal is called proper if it is not C ( H ) itself. The closedness is understood as the closedness in the norm of linear operators o n H .
8.3.3.3 Determinants of operators Definition 8.6
If A E 01 then
n
4‘4)
d ( p ) := det(I - P A )
:=
[1 - & ( A ) ]
j=1
One has
1 ) ld(P)I I exp(IcLI II A 111). 2 ) d ( p ) = exp (T r [A(I- PA)-’] d p ) , if the operator I - XA, 0 5 X 5 p, is invertible. 3) det(I - A ) = limn--+m det [&j - ( A h ,4j)]i,j=1,..,,n where {4j} is an arbitrary orthonormal basis of H . 4) det(1- A B ) = det(I- B A ) , AB E 01, B A E 01, A E o,, B E L ( H ) . 5) det [ ( I - A ) ( I - B ) ]= det [ ( I- B ) ( I - A ) ] ,A, B E g 1 . 6) If A ( z ) E 01 is analytic operator function in a domain A of the complex plane z then d ( 1 ,z ) := det ( I - A @ ) )is analytic in A; here d ( p , z ) := det ( I - pA(z)). 7 ) -&Tr { F ( A ( z ) ) }= Tr F’ ( A ( z ) ) where F(X) is holomorphic in the domain which contains the spectrum of A ( z ) for all z E A, and F ( 0 ) = 0. 8) det(I+A) = exp { T r log(1 A ) } ,A E 01 where log(I+A) can be defined by analytic continuation of log(1+ zA). This function is well defined for I4 II A II< 1.
:s
F}
{
+
Random Fields Estimation Theory
300
If A E 02 then the series C z , IXj(A)I may diverge and the Definition 8.6 is not applicable. One gives Definition 8.7
n
4-4)
& ( p ) := d;t(I-
PA) :=
{[l- pAj(A)]exp [pXj(A)]}.
j=1
One has:
{
9) ~ d 2 ( p )L1 exp $wA*A)}. limn--roodet [&j - ( A h ,d ~ j ) lexp ~ [CT=l(A4j, ~ ~ , ~ ~h)] ~ where { + j } is an arbitrary orthonormal basis of H . 11) I ~ A , B E ~ ~ ~ ~ ~ I - C = ( I - Athen ) ( I - B )
10) &(I)
=
d$I
- C) exp [Tr(AB)]= d;t(1- A) dgt(1- B ) .
If B E 01 then det(1- C) = det(1- A ) det(1- B ) exp {Tr [(I- A ) B ] }. 12) 2 2 Definition 8.8
If A
E up then
d p ( p ) := det(1- pA) := P
[l- pAj(A)]exp j=1
Carleman’s inequality: If A E a2, X j are eigenvalues of A counted according to their multiplicities, [ A l l 2 1x21 2 ... and q5x(A) := X j A- l ) exp(X j A- ) , then
ngl(l
14)
8.4
8.4.1
Elements of probability theory The probability space and basic definitions
A probability space is a triple {O,U, P } where R is a set, U is a sigma algebra of its subsets, and P is a measure on U ,such that P ( Q )= 1, so its a measure space {QU}equuipped with a normalized countably additive
301
Auxiliary Results
measure
c
A random variable is a U-measurable function o n 0, that is, a U measurable map R + R 1 . A random vector is a U-measurable map R 4
R'.
c
<
A distribution function of is F ( z ) = P(<< z). It has properties: F ( - o = 0, F ( - t o ) = 1, F is nondecreasing, F ( z 0) - F ( z ) = P(J= z), F ( z - 0) = F ( z ) . The probability density f (z) := F ' ( z ) is defined in the classical sense if F ( z ) is absolutely continuous, so that F ( z ) = f ( t ) d t . If J is a discrete random variable, that is 4 takes values in a discrete set of points, then its distribution function is F ( z ) = PiO(z - xi),where
+
s_",
xi
xi
The probability density for this distribution is f (z) = Pid(z-zi) where 6(z) is the delta-function. The probability density has the properties: 1) f 2 0, 2) J-", f d t = 1. A random vector E = (el, . . . , J r ) has a distribution function
This function has characteristic properties
F ( + o , . . . ,+o) = 1, F ( z 1 , . . . ,z , = -00,. . . , z r )= 0 for any 1 5 m 5 r , F(z1, . . . ,z , = +oo, . . . , z,) = F(z1, . . . ,zm-1, 2,+1, . . . ,z,), F is continuous from the left, i.e. F ( z l , . ., ,z, - 0,. .. , z r ) = F(z1,.. . ,,z , . . . , z,) and nondecreasing in each of the variables z1,.
. . ,2,.
The probability density f ( 2 1 ,
. . . ,z,) is defined by the formula
Random Fields Estimation Theory
302
so that
F(X1,.. .
1
-1 I,
I1
Xr) =
dxl..
-W
dXrf(X1,.
.
,X2).
-W
This formula holds if the measure defined by the distribution function F is absolutely continuous with respect to the Lebesgue measure in R', i.e. if P(J E A) = 0 for any A c ' R such that measA = 0 where meas is the Lebesgue measure in R'.
Example 8.4 A random vector J is said to be unzformly distributed in a set A c R' if its probability density is (8.370)
Example 8.5 A random vector J is said to be Gaussian (or normally distributed, or normal) if its probability density is
f (z) = (27r)-'l2[det C]1/2exp
{'
cij(xi - rni)(zj - rnj)
ij=l
Here C = (cij)is a positive definite matrix, M [ ( ]= the matrix C-' is the covariance matrix of (:
=m=
1
.
(8.371)
(rnl,. . . ,m T ) ,
where all the quantities are real-valued, the bar denotes the mean value, defined as follows:
If g(x1,. . . ,z,) is a measurable function defined on R', g : R' --+ R1, then 77 = g ( & , . . . ,&) is a random variable with the distribution function:
F,(x) := P(q < X ) =
s
g(z1,...,I T ) < V
dF(z1,. . .,2,)
where F(x1,. . . ,z,) is the distribution function for the vector One has
. . . ,Jr).
(Jl,
Auxiliary Results
303
In particular the variance of a random variable is defined as
1
00
D[[] := (6 - <)2
=
(x- m)2dF(x)
--M
where
-
W
E =m =
xdF(x).
Let us define conditional probabilities. If A and B are random events then
(8.373)
1
where AB is the event A n B and P ( A B ) is called the conditional probability of A under the assumption that B occured. The conditional mean value (expectation) of a random variable ( = f(u)under the assumption that B occured is defined by
(8.374)
In particular, if
E=
{
1 if A occurs
0 otherwise
then (8.374) reduces to (8.373). If A = U p l E j , Ej n Ejt = 0,j
# j', then (8.375)
j=1
and
P ( E j 1 A)
= P(A
1 Ej)P(Ej)P-'(A).
(8.376)
This is the Bayes formula. The conditional distribution function of a random variable ( with respect to an event A is defined as
(8.377)
Random Fields Estimation Theory
304
One has
I
M (< A ) =
I
xdFc (x A ) .
(8.378)
--M
The characteristic function of a random variable function F ( x ) is defined by
< with the distribution
00
+(t>:= M [exp(it<)l=
exp(itx)dF(x).
(8.379)
--M
It has the properties
1) 4(0) = 1, I4(t)l I 1, --oo < t < 00, 4(-t) = 4 * ( t ) 2) 4(t) is uniformly continuous on R1 3) $(t) is positive definite in the sense n
C
4(tj - tm)zjzk
20
(8.380)
j,m=l for any complex numbers 2j. and real numbers tj, 1 5 j 5 n. Theorem 8.10 (Bochner-Khintchine ) A function +(t) is a characteristic function if and only if it has properties 1)-3).
One has
if F ( x ) is continuous at the point z. More generally
+
$(t) exp(-itx)dt = F(x 0) - F(x). If conditions 2) and 3) hold but condition 1) does not hold then one still has the formula 4(t) =
/
00
exp(itz)dF(z)
(8.381)
--M
where F(x) is a monotone nondecreasing function, but the condition F ( + m ) = 1 does not hold and is replaced by F ( + m ) < 00. Let us define the notion of random function. Let {R,U, P } be the probability space and <(t,w), w E R, be a family of random variables depending on a parameter t E D c R'. The space X to which the variable 1/2<(t, w)belongs for a fixed w E R is called the phase space of the random
Awilaary Results
305
function <(z,w ) . If r = 1 the random function is called a random process, if r > 1 it is called a random field. Usually one writes <(t)in place of E(t,w ) . If X = Rm then <(t)is a vector random field, if m = 1 then <(t)is a scalar random field. One assumes that X is a measurable space (X,23) where f3 is the Bore1 sigma-algebra generated by all open sets of X. If one takes n points x1 ,. . . ,x, E D c RP then one obtains a random vector { < ( t l , w ) , . . . , < ( t , ~ ) }Let . F(t1,.. . , t,) be the distribution function for this random vector. For various choices of the points tj one obtains various distribution functions. The collection of these functions is consistent in the following sense:
,) = 1) Fcl,...,c,, (XI 7 ~ 2 , 7 X m - 1 , x m E X,. . ., x F(xi,.. . xm-1, x,+i, . . . ,x,) 2) Fcl(...,c n ( X 1 , . . . , X n ) =Fea, c a n ( x i l l . * . , x i n ) for any permutation ( i l l . .. , in) of the set (1,.. . ,n). (...)
The following theorem gives conditions for a family F(x1,.. . ,x,) of functions to be the family of finite-dimensional distribution functions corresponding to a random function [(t).
Theorem 8.11 (Kolmogorov) If and only i f the family of functions F is compatible and each of the functions has the characteristic properties 1)-4) of a distribution function, there exists a random function for which this family is a family of finite-dimensional distribution functions.
Moment functions of a random function are defined as
mj(ti,.. . ,t j ) = M [<(ti). . .[ ( t j ) ]= [ ( t i ) . .
<(tj).
(8.382)
Especially often one uses the mean values
[<(t,w)l= m(t>=
Eo
(8.383)
and covariance function
). := [<(t) - m(t>l* I<(.) - 4.11 .
(8.384)
The characteristic property of the class of covariance functions is positive definiteness in the sense
c n
i,j=l
R(ti,tj)Zi*Zj
20
(8.385)
Random Fields Estimation Theory
306
for any choice of real ti and complex zi. The star stands for complex conjugate.
8.4.2
Hilbert space theory
Let us assume that a family of random variables with finite second moments is equipped with the inner product defined as
(8.386) and the norm is defined as
(8.387) Then the random variables belong to the space L2 = L2(R,U,P ) . Convergence in this space is defined as convergence in the norm (8.387). If <(t)is a random function then its correlation function is defined as
B ( t ,T ) := < * ( t ) E ( T ) .
(8.388)
If = 0 then B ( t ,T) = R(t,T ) , where R(t,7 ) is the covariance function (8.384). The characteristic property of the class of correlation functions is positive definiteness in the sense (8.385). A random function <(t)E L 2 ( R , U ,P ) is called continuous (in L2 sense) at a point t o if
where p ( t , t o ) is the distance between 1/2 and t o . Lemma 8.30 For (8.389) to hold it is necessary and suficient that B ( t ,7 ) be continuous at the point ( t o , t o ) .
A random function <(t)E L2 is called diflerentiable (in L2 sense) at a point to if there exists in L2 the limit
<'(to) := l.i.m.e+oc-l [<(to
+
C)
- <(to)],
(8.390)
where 1.i.m. = limit in mean stands for the limit in L2. Lemma 8.31
For (8.390) to hold it is necessary and suficient that
307
Auxiliary Results
a z ~ ~ ~ ' t exists, o )
that is
- B(t0,t o 4-€2)
If
- B(t0
+ €1, t o ) + B(t0,t o ) ].
(8.391)
exists then
(8.392) Let ((x), x E D c R' be a random function and ~ ( x be ) a finite measure. The Lebesgue integral of [(x) is defined as (8.393) where &(x)
I En+l(x), &(x)
E
L2 and En(x)
lim P ( ( ( ( x )- &(.)I
n-cc
> E)
= 0,
that is
VXE D
(8.394)
for every E > 0. If p ( D ) < 00 and
(8.395) then (8.396) Assume that <(x) E L2 and B(z,y) is continuous in D x D , where
D c R' is a finite domain. Then, by Mercer's theorem (see p. 61) one has
where
308
R a n d o m Fields Estimation Theory
(8.399)
(8.400)
Put
Then (8.402)
and
Lemma 8.32
The series (8.404)
converges in L2 for every x E D if the function B ( x ,y) is continuous in D x D. Remark 8.6
If one defines B l ( t , T ) by the formula Bi(t,T ) := ‘C(t)‘C* (T)
(8.405)
so that B1 = B*, then formulas (8.397)-(8.400) hold for B1, in formula (8.401) in place of (bn one puts 4; and in formula (8.404) one puts (bj in place of 4;. Let us define the notion of a stochastic or random measure. Consider a random function C(z) E L2, 5 E D. Let 23 be a sigma-algebra of Borel subsets of D. Suppose that to any A c B there corresponds a random variable p ( A ) with the properties 1) p(A) E L2, P ( 0 ) = 0 2 ) p ( A 1 u A,) = p ( A 1 ) p ( A 2 ) if A1 n A2 = 0 3) P ( A l ) P * ( A 2 >= m(A n A d ,
+
where m ( A )is a certain deterministic function on 8.Note that
Auxiliary Results
309
provided that A1 n A2 = 0,so that m(A) has some of the basic properties of a measure. It is called the structural function of p(A) and p(A) is called an elementary orthogonal stochastic measure. Assume that m ( A )is semiadditive in the following sense: for any A E B the inclusion A c Uj”=,Aj, A j E B,implies (8.406)
Then m(A) can be extended to a Bore1 measure on B. Let f(x) E L2 ( D ,B,m). Define the stochastic integral as the following limit
where f n ( x )is a sequence of simple functions such that
A sample function is a function of the type n
fn(x) = E c j X A j j=1
where
cj = const, X A is ~
Lemma 8.33
the characteristic function of the set Aj
(8.409)
c B.
I f f i E L 2 ( D , m ) i, = 1 , 2 , and ci = const, then
and
s,
fl(.)C(d.)
/
D
f,*(x)C*(dx)=
s,
f1(.>fi*(.)m(dz).
(8.411)
Using the notion of the stochastic integral one can construct an integral representation of random functions.
310
Random Fields Estimation Theory
Suppose that a random function [(x), x E D , has the covariance function of the form (8.412) where m(dX)is a Bore1 measure on the set A, g(x,A) E L2(A,m(dX))Vx E D , and the set of functions { g ( x ,A), x E D } is complete in L2(A,m ( d X ) ) .
Lemma 8.34 Under the above assumptions there exists an orthogonal stochastic measure C(dX) such that (8.413)
Equation (8.413) holds with probability one, and m(A) := J, m ( d A ) is the structural function corresponding to C(dX). There is an isometric isomorphism between L2(A,m(dA))and L i , where L: is the closure of the set of random variables of the form Cj”=, cjC(Aj), A j E A, in the norm (8.387). This isomorphism is established by the correspondence
[(x)
g(x,A), [(A)
xA(A).
(8.414)
If hi(A) E L2(A,m(dA)),i = 1 , 2 then (8.415)
where r
( 8.416) This theoy extends to the case of random vector-functions. 8.4.3 Estimation in Hilbert space L2(Cl,U,P ) Assume that L i is the subspace of L2(R,U,P ) , a random variable 77 E L 2 ( R ,U ,P ) and we want t o give the best estimate of 77 by an element of L i , that is to find 770 E Li such that
6 :=I1 rl - 770
11=
d
infz II rl q
-
4 II,
(8.417)
where the norm is defined by (8.387). The element 770 E Lp does exist, is unique, and is the projection of 77 onto the subspace L i in the Hilbert space L2(R,U, P ) .
Auxiliary Results
311
The error of the estimate, the quantity b defined by (8.417), can be calculated analytically in some cases. For example, if ((1, .. . , In) is a finite set of random variables then
1
(8.418)
vo = r where F = I?(&,
. . . ,In) is the Gramian of (el,. . . ,&): (8.419)
One has
(8.420) The optimal estimate qo E L: satisfies the orthogonality equation (170 -
1 7 , E k ) ) = 0,
k c ED
(8.421)
which means geometrically that 17 - 170 is orthogonal to L:. Equation (1.6) for the optimal filter is a particular case of (8.421): if, in the notation of Chapter 1,
vo(z) =
s,
h ( z ,Y)%/)dY,
17 = S(ZL
(8.422)
then (8.421) becomes
(8.423)
Random Fields Estimation Theory
312
8.4.4
Homogeneous and isotropic random fields
If ( ( x )is a random field and
[ ( x )= 01
[*(X)E(Y)
= R ( x - Y)
(8.424)
then [ ( x ) is called a (wide-sense) homogeneous random field. It is called a homogeneous random field if for any n and any x l l .. . , x,, x the distribution function of n random variables [ ( X I x),. . . ,E(z, x ) does not depend on x. Here x E R' or, if x E D c R', then one assumes that x , y E D implies x y E D. The function R ( x ) is positive definite in the sense (8.380). Therefore, by Bochner-Khintchine theorem, there exists a monotone nondecreasing function F ( x ) ,F(+co < co,such that
+
+
+
(8.425) r 5 .y
=Cxjyj.
(8.426)
j=1
One often writes dF(y) = F ( d y ) to emphasize that F determines a measure on R'. Monotonicity in the case r > 1 is understood as monotonicity in each of the variables. If T > 1 then a positive definite function R ( x )is the Fourier transform of a positive finite measure on R'. This measure is given by the function F ( x ) which satisfies characteristic properties 2)-4) of a distribution function. It follows from (8.425) that
0 < R(0) = F(R')
< 00
R ( - X ) = R*(x).
(8.427)
(8.429)
A homogeneous random field is called isotropic if
for all x , y E R' and all g E S O ( n ) ,where SO(n) is the group of rotations of Rr around the origin. Equation (8.430) for homogeneous random fields is equivalent to
R ( x ) = R(gz) Vg E SO(n).
(8.431)
Azlziliary Results
313
R ( z )= R(I.1)
(8.432)
This means that
where 1x1 = ( z f + .. .+z?)l/' is the length of the vector x. This and formula (8.425) imply that d F ( y ) = d$(lyl). If (8.433)
then d F = f ( y ) d y and f (y) is continuous: f (y) = (27r)-'
1
(8.434)
IRI2dx < 00
(8.435)
exp(-iz. y ) R ( z ) d z .
If
then d F = f ( y ) d y , f (y) E L2(R'), and formula (8.434) holds in L2-sense. It is known that
where J n ( t ) is the Bessel function and d s is the element of the surface area of the sphere 1x1 = p in R'. Using formula (8.436) one obtains Lemma 8.35 Assume that R(p) is continuous function. This function is a correlation function of a homogeneous isotropic random field in R' i f and only i f it is of the form:
where g ( X ) is a monotone nondecreasing bounded function, g(+co) < and r ( z ) is the Gamma-function.
If r
00,
= 2 formula (8.437) becomes
(8.438) for n = 3 one gets
(8.439)
Random Fields Estimation Theory
314
From formula (8.425) and Lemma 8.34 it follows that a homogeneous random field admits the spectral representation of the form (8.440) where [(dy) is an orthogonal stochastic measure on R'. If the random field is homogeneous isotropic and continuous in the L2sense then
Here c, = const, Sm,j(0)is the orthonormalized in L2(ST-') system of the spherical harmonics, S'-' is the unit sphere in R', 0 E ST-', h(m,r)=(2m+r-2)
+
( m T - 3)! (r - 2)!m! '
r 2 2 is the number of linearly independent spherical harmonics corresponding t o the fixed m. For example, if T = 3 then h(m,3) = 2m 1. The stochastic orthogonal measures Cmj (dX) have the properties
+
[mj(dX)
= 0,
(8.442)
where A, and A2 are arbitrary Bore1 sets in the interval (0, co),and m(A) is a finite measure on (0, co). If E(z) is a homogeneous random field with correlation function (8.425), then one can aply a differential operator Q(-ZO), O = (81,.. . ,Or), Oj = ax3 to I (. in) L2 sense if and only if 1
(8.444)
If condition (8.444) holds then Q(-ia)C(z) is a homogeneous random field and its correlation function is Q* (-Za)Q( -Za)R(z) and the corresponding spectral density is IQ(y)12f(y), where F ( d y ) = f(y)dy. By the spectral density of the homogeneous random field with correlation function (8.425) one means the function f ( y ) defined by F ( d y ) = f(y)dy in the case when F(dy) is absolutely continuous with respect t o Lebesgue's measure, so that f ( Y ) E L1(RT).
Auxiliary Results
8.4.5
315
Estimation of parameters
Let ( be a random variable with the distribution function F ( x 1 8 ) which depends on a parameter 8. The problem is to estimate the unknown 8 given n sample values of to which F ( z ,8) belongs. The estimated value of 8 let us denote B = 8 ( x l l . . . 2,) where x j l 1 5 j 5 nl are observed values of (. What are the properties one seeks in an estimate? If p(8, 8) is the risk function which measures the distance of the estimate 8 from the true value of the parameter 8, then the estimate is good if *
A
<
-~ P('7 ') 5 81)7
(8.445)
P('1
where
is any other estimate] and
and F ( x l8) is the distribution function of E for a fixed 8. A minimax estimate is the one for which sup p(8,B) = min
(8.447)
e
where B runs through the set 0 in which it takes values. A Bayes estimate is the one for which
k p ( B , d p ( R ) = min
(8.448)
where p(8) is an a priori given distribution function on the set 0. This means that one prescribes a priori more weight to some distribution of 8. An unbiased estimate is the one for which -
e = e.
(8.449)
An eficient estimate is the one for which
10
- 812
I 18 -
for any
il.
(8.450)
The Cramer-Rao inequality gives a lower bound for the variance of the estimate:
Random Fields Estimation Theory
316
where one assumes that (78) holds, (8.452)
and one assumes that d F has a density p ( x , 8 ) with respect to a u-finite measure v ( d z )
d F = p ( z ,8 ) v ( d z )
(8.453)
and p(x,8) is differentiable in 8. A mesure v on a set E in a measure space is called a-finite if E is a countable union of sets Ej with v(Ej)< 00. In particular, if measure v is concentrated on the discrete finite set of points y1, . . . ,yn, then (8.454)
The quantity I ( 8 ) is called the information quantity. A measure Y i s called concentrated o n a set A c E i f v ( B ) = v ( B n A ) for every B c E , that is, if v ( B ) = 0 whenever B n A = 0. Sometimes an estimate 8 is called efficient if the equality sign holds in (8.451). An estimate 6 is called sufficient if d F ( x , 8) = p ( z , 8 ) v ( d z ) and p(x1, 8 ) . . . p ( x n ,0 ) = g e ( d ) h ( z l , * *
* 7
Zn)
where ge and h are nonnegative functions, h does not depend on 8 and ge depends on 2 1 , . . . ,xn only through 8 = B(z1,. . . , xn). Suppose that the volume of the sample grows, i.e. n + 00. An estimate O(x1,.. . ,z n ) := 8, is called consistent if A
lim P
n-+m
(lin-el
> E)
L
=o
.
> 0.
for every
(8.455)
There are many methods for constructing estimates. Maximum-likelihood estimates of 8 is the estimate obtained from the equations (8.456)
,em),
Here 8 is a vector parameter c9 = ( e l , .. . the function L(8,x1,. . . ,x,) is called the likelihood function and is defined by
q e , z l , . . . , x n ) := I I ; = , ~ ( ~e),~ ,
(8.457)
317
Auxzlzary Results
where p ( x ,0 ) is the density of dF defined by dF = p ( x ,8)dx. Cramer proved that if: 1)
2)
ak lo&$z,e),
1-
k 5 3, exists for all 0 E 0 and almost all z E R1 5 g k ( z ) where g k ( z ) E L1(R1),k = 1 , 2 , and
SUPeGo .I-" ~,s ( x ) P ( ~ , < 00 3) I ( 0 ) is positive and finite for every 8 E 0 , where I ( 0 ) is defined by (8.452) with v(dx) = dx,
XI,.
then equation (8.456) has a solution . . , x,) which is a consistent, asymptotically efficient and asymptotically Gaussian estimate of 8. Here asymptotic efficiency is understood in the sense that inequality (8.451) becomes an equality asymptotically as n + 00. More precisely, define
Then the estimate
8 is asymptotically efficient if (8.459)
lim eff(8,) = 1.
n+w
The estimate
8,
is asymptotically Gaussian in the sense that
1(e)n1/2[ e c q , .
. . , x n ) - e]
N ( O ,1) as n --+ co
(8.460)
where N(0,l) is the Gaussian distribution with zero mean value and variance one, and we assumed for simplicity that B is a scalar parameter. We do not discuss other methods for constructing estimates (such as the method of moments, the minimum of x2 method, intervals of confidency, the Bayes estimates etc.). 8.4.6
Discrimination between hypotheses
One observes n values 2 1 , . . . ,x,, of a random quantity E , and assumes that there are two hypotheses Ho and H1 about E. If HO occurs then the probability density of the observed values is ! , ( X I , . . . , x , H o ) , otherwise it is fn(lcl,. . , ,z, H I ) . Given the observed values X I , . . . , x , one has to decide whether HO or H1 occured. Let us denote yi,i = 0,1, the decision that Hi occured. The decision yo is taken if ( X I , . . . ,x,) E Do, where DO is a certain domain in R". The choice of such a domain is the choice of the
I
I
Random Fields Estimation Theory
318
decision rule. If (21,. . . , 2), @ Do then the decision y1 is taken. The error the first kind is defined as
a10 of
Thus a10 is the probability to take the decision that where D1 = R" \DO. H1 occured when in fact HO occured. The error of the second kind is
The conditional probabilities to take the right decisions are
I
P(y0 Ho) = 1 - Q l O ,
I
P(y1 H1) = 1 - 001.
(8.463)
One cannot decrease both a10 and a01 without limit: if a10 decreases then D1 decreases, therefore DO increases and a01 increases. The problem is to choose an optimal in some sense decision rule. Let us describe some approaches to this problem. The Neyman-Pearson approach gives the decision rule which minimizes a01 under the condition that a10 5 a , where a is a fixed confidence level. Let us define the likelihood ratio (8.464) and the threshold c > 0 which is given by the equation
1
(8.465)
P { q z ) 2 c Ho} = a. The Neyman-Pearson decision rule is if t(x) < c then HO occured, otherwise H1 occured. If $ ( t ) is a monotone increasing function then l ( x ) q5 (!(x)) < $(c). In particular, the rule if logl(x) < logc then HO occured, otherwise
H1
(8.466)
< c if and only if
occured
(8.467)
is equivalent t o (8.466). Assume that the a priori probability po of Ho is known, so that the a priori probability of H I is 1 - PO. Then the maximum a posteriori probability decision rule is:
if l ( x ) <
__
1- P o
then Ho occured, otherwise H I occured.
(8.468)
Auziliary Results
319
If no a priori information about po is known, then one can use the m a x i m u m likelihood decision rule if [ ( x ) < 1 then Ho occured, otherwise H1 occured.
(8.469)
All these rules are threshold rules with various thresholds, and other decision rules are discussed in the litrature.
8.4.7 Generalized random fields A generalized random field J(x), x E ' R is defined as follows. Suppose that {41(x),. . . , &(x)} is a set of C r ( R ' ) functions, and to each such set there corresponds a random vector {<(&), . . . ,E(&)}, such that the distribution functions for all such vectors for all m and all choices of {+I,. . . , &} are consistent. Then one says that a generalized random field is defined. The theories of generalized random functions of one and several variN ables are similar. A linear combination Cj=lcj<j(x)of generalized random functions is defined by the formula
<
N
N
j=1
j=1
+
Similarly, if E Cm(R'),then +<(4) := E(+4), E(x+h)(+) := E (4(z - h ) ) . := m(4) is a continuous linear functional on C r , then m := is If called the mean value of 5,
m(4)=
1
zdF,
whereF(z) = P ( ( ( 4 ) < x} .
(8.470)
Recall that m(4) is called continuous in C r if &(x) -+ 4(x) implies m(&) + m($),where $n -+ 4 means that all & and 4 vanish outside of a fixed compact set E of R' and max,EE 14$) + 0 as n + co for all multiindices j. The correlation functional of a generalized random field is defined as
B ( 4 , + )= E*($)<(+). The covariance functional is defined as
Both functionals are nonnegative definite:
(8.471)
Random Fields Estimation Theory
320
If the random vectors {E(q51), . . . ,E(&)} for all m are Gaussian, then ,$ is called a generalized random Gaussian field. If B(q5,+) and m(q5) are continuous in C r (R‘) bilinear and, respectively, linear function& and R($,q5) 2 0 then there is a generalized random Gaussian field for which B ( 4 ,+) is the correlation functional and m(q5) is the mean value functional. A generalized random field is called homogeneous (stationary) if the random vectors { E (q51(z+ h ) ) . .,E (q5m(z + h ) ) }and { E (q51(z))> *. * C ($m(z))} have the same distribution function for any h E R‘. The mean value functional for a homogeneous generalized random field is 7.
1
m(4)= const
/
4da:
and its correlation functional is
w 4 , +)
=
/
i(X>?Z*(X)P(dX)
where &A) is the Fourier transform of q5(x>,and p(dX) is a positive measure on R‘ satisfies for some p 2 0 the condition
The measure p is called the spectral measurn of E. One can introduce the spectral representation of the generalized random field similar to (8.440). An important example of a Gaussian generalized random field is the Brownian motion. 8.4.8
Kalman filters
Let us start with the basic equation for the optimal Wiener filter for the filtering problem
a2h(t,T)
+
s”
h(t,+)Rs(7,T / ) ~ T=/ f(7, t ) , to< T
< t,
(8.473)
to
where U = s
+ n is the observed signal, s and n are uncorrelated, n * ( t ) n ( T ) = a26(t- T ) ,
- -
s ( t ) = n(t) = 0,
Rs(7,t ) := s * ( T ) s ( ~ ) , f(T,t):= u * ( T ) S ( t ) .
(8.474)
321
Auxiliary Results
The optimal estimate of s is (8.475) The error of this estimate
q t ) := s ( t ) - q t )
(8.476)
and t
D [s(t)]= R s ( t ,t ) -
Lo
h*(t,T ) ~ ( Tt ,) d ~ .
(8.477)
Let us assume that s ( t ) satisfies the following differential equation S(t) = A(t)s
+W ,
(8.478)
where for simplicity we assume all functions to be scalar functions, w to be white noise, and
w = 0, w * ( ~ ) w ( T=)Qb(t - T ) , Q = const > 0.
(8.479)
One could assume that
U = H s ( t ) + n,
(8.480)
where n is white noise, and H is a linear operator, but the argument will be essentially the same, and, for simplicity, we assume (8.479). Note that f(7,
t )=u*(T)S(t) = [S*(T)
+
n*(T)]S ( t ) = & ( T , t )
(8.481)
assuming that the noise n ( ~and ) the signal s ( t ) are uncorrelated n*(T)S(t)
= 0.
(8.482)
Also
+
(8.483)
A = 0.
(8.484)
R(T,t ) := u*('T)u(t) = Rs(7,t)026(t - T )
provided that (8.482) holds and n*(T)n(t)= 0 2 6 ( t - T ) ,
322
Random Fields Estimation Theory
To derive a differential equation for the optimal impulse function, differentiate (8.473) in t using (8.481): (8.485) For r
< t equation (8.483) becomes R(T,t) = Rs(T,t),
7
< t.
(8.486)
This and equation (8.473) yield after multiplication by h(t,t ) : t
h(t,t)R(T,t ) =
lo
h(t,t)h(t,T’)R(T,7’)dT’.
(8.487)
From (8.478) one obtains
a
-R,(T,t)
at
(8.488)
< t.
(8.489)
= A&(T,t),T
where one used the equation
s * ( r ) w ( t )= O for
T
To derive (8.489) note that equation (8.478) implies
s ( t ) = $(t,t o M t o )
+
t
lo
$(t,T ) W ( T ) d T
where $(t,r ) is the transition function for the operator
w*(t)w(T) = 0 for
T
< t,
(8.490)
$ - A . Since (8.491)
it follows from (8.490) that (8.489) holds. From (8.488), (8.481) and (8.473) one has (8.492) From (8.485), (8.492) and (8.487) one obtains
for T < t. Since R is positive definite (see (8.483)), equation (8.493) has only the trivial solution, one gets (8.494)
Auxiliary Results
323
This is a differential equation for the optimal filter h(t,7). Let us find a differential equation for the optimal estimate i ( t ) defined by (8.475):
+
b = h(t,t ) U ( t )
where
I, t
%U(T)dT at
(8.495)
f := g. From (8.494) and (8.495) one gets
s = h ( t ,t ) U ( t )+ =
t
lo +
[Ah&T ) - h(t,t ) h ( t ,T ) ]U ( T ) ~ T
h(t,t ) U ( t ) AB(t)- h(t,t ) i ( t )
= Ad(t)
+ h(t,t ) [ U ( t )
-
(8.496)
i(t)] .
This is the differential equation for i(t). The initial condition is b(to) = 0 according t o (8.475). Let us express h(t,t ) in terms of the variance of the error (8.373). If this is done then (8.496) can be used for computations. From (8.473) and (8.481) one obtains:
Pu t get
T
=
t in (8.497) assume that h ( t , ~is) real-valued and use (8.477) to h(t,t ) = K 2 D [ S ( t ) ].
(8.498)
Let us finally derive a differential equation for h(t,t ) . From (8.476), (8.478) and (8.496) it follows that = As
+ w - Ab
-
+ w - h(t,t ) n ( t ) .
= [ A- h(t,t ) ]s(t)
The solution t o (8.499) is rt
+
h(t,t ) [s(t) n(t) - B ( t ) ] (8.499)
Random Fields Estimation Theory
324
One obtains, using (8.499)’ that
+ S*(t)$(t)]
iL(t,t) = K 2[S*(t)i(t)
= 0-~2Re{ [A*(t)- h(t,t ) ]02h(t,t )
+ w*(t)i(t) - h ( t , t ) h * ( t ) i ( t ) } = [A*(t+ ) A ( t ) ]h ( t ,t ) - 2h2(t,t ) + Qc-’ + h2(t,t ) = [ A * ( t )+ A(t)]h ( t , t )- h2(t,t)+ Q o - ~ , (8.501) where we assumed that w, n and s ( t o ) are uncorrelated, took into account that h(t,t ) > 0 (see (8.498)) and used formula (8.500) to get 1
Q
w*(t)S(t)= -Q$(t,t)= 2 2
(8.502)
and (8.503) Note that $(t,t ) = 1 and the f factor in (8.502) and (8.503) appeared since we used the formula b(t - ~ ) f = d if(t). ~ Equation (8.501) is the Riccati equation for h(t,t). Equations (8.494), (8.496) and (8.501) define Kalman’s filter. This filter consists in computing the optimal estimate (8.475) by solving the differential equation (8.496) in which h(t,t ) is obtained by solving the Riccati equation (8.501). The initial data for equation (8.501) is
st”,
h(t0,to) = a 2 v [Z(to)]= a-’v [ s ( t ) ] = O - 2 R s ( t 0 , t o ) .
(8.504)
Here we used equation (8.476) and took into account that $(to) = 0. The ideas of the derivation of Kalman’s filter are the same for random vector-functions. In this case A ( t ) is a matrix. For random fields there is no similar theory due to the fact that there is no causality in the space variables in contrast with the time variable.
Appendix A
Analytical Solution of the Basic Integral Equation for a Class of 0ne-Dirnensional P roblerns In this Section we develop the theory for a class of random processes, because this theory is analogous to the estimation theory for random fields, developed in the next Section. Let
rL
where the kernel R ( z , y ) satisfies the equation QR = Pd(x - y ) . Here Q and P are formal differential operators of order n and m < n, respectively, n and m are nonnegative even integers, n > 0, m L 0, Qu := qn(s)u(n) xyIt q j ( z ) u ( j ) ,Ph := h(m) C > . l p j ( z ) h ( j ) ,qn(z) 2 c > 0, the coefficients q j ( x ) and pj(x) are smooth functions defined on R,6(z) is the delta-function, f E H*(O, L ) , a := H a is the Sobolev space. An algorithm for finding analytically the unique solution h E fi-*(O, L ) to (*) of minimal order of singularity is given. Here f i - * ( O , L ) is the dual space to H"(0, L ) with respect to the inner product of L2(0,L ) . Under suitable assumptions it is proved that R : f i - * ( O , L ) -+ H*(O, L ) is an isomorphism. Equation (*) is the basic equation of random processes estimation theory. Some of the results are generalized to the case of multidimensional equation (*), in which case this is the basic equation of random fields estimation theory. The presentation in Appendix A follows the paper [Ramm (2003)l.
+ y,
+
325
Random Fields Estimatiori Theory
326
A.l
Introduction
In Chapter 2 estimation theory for random fields and processes is constructed. The estimation problem for a random process is as follows. Let u(z)= s(z) n(z) be a random process observed on the interval (OIL), ~ ( zis) a useful and n ( z )is noise. Without .loss of generality we as-signal sume that s(z) = n(z)= 0, where the overbar stands for the mean value, u*(z)u(y) := R(z,y), R(z,y) = R(y,z), u*(z)s(y) := j(z,y), and the star here stands for complex conjugate. The covariance functions .R(z,y) and f ( z , y ) are assumed known. One wants to estimate s(z) optimally in the sense of minimum of the variance of the estimation error. More precisely, one seeks a linear estimate
+
rL
such that
This is a filtering problem. Similarly one can formulate the problem of optimal estimation of ( A s ) ( z ) ,where A is a known operator acting on s(z). If A = I , where I is the identity operator, then one has the filtering problem, if A is the differentiation operator, then one has the problem of optimal estimation of the derivative of s, if As = s(z+so), then one has an extrapolation problem, etc. The kernel h ( z ,y) is, in general, a distribution. As in Chapter 1, one derives a necessary condition for h to satisfy (A.2): rL
Since z enters as a parameter in (A.3), the basic equation of estimation theory is:
The operator in L2(0,L ) defined by (A.4) is symmetric. In Chapter 1 it is assumed that the kernel
(-4.5)
Analytical Solution of the Basic Integral Equation
327
where P(X) and Q(X) are positive polynomials, @(z,y , A) and dp(X) are spectral kernel and, respectively, spectral measure of a selfadjoint ordinary differential operator I in L2(R), degQ(X) = q, degP(X) = p < n,p 2 0, ordI := cr > 0, z,y E RT,T 2 1, and I is a selfadjoint elliptic operator in L2(RT). It is proved in Chapter 4 that the operator R : f i - " ( O , L ) -+ H"(0, L ) , a := F a is an isomorphism. By H"(0, L ) the Sobolev space W"i2(0,L ) is denoted, and fi-"(O,L) is the dual space to H " ( 0 , L ) with respect to L2(0,L) := Ho(O,L) inner product. Namely, fi-"(O,L) is the space of distributions h which are linear bounded functionals on H"(0, L ) . The norm of h E f i - " ( O , L ) is given by the formula
where (h,g) is the L2(0,L ) inner product if h and g belong to L2(0,L ) . One can also define f i - " ( O , L ) as the subset of the elements of H-"(R) with support in [0, L ] . W e generalize the class of kernels R ( x ,y) defined in (A.5): we do not use the spectral theory, do not assume I2 t o be selfadjoint, and do not assume that the operators Q and P commute. We assume that
where Q and P are formal differential operators of orders n and m respectively, n > m 2 0, n and m are even integers, b(z) is the delta-function,
j=O
j=O
R. We also assume that the equation Qu = 0 has linearly independent solutions u; E L2(-m,0) and 5 linearly independent solutions uj' E L2(0,m).In particular, this implies that if Qh = 0, h E H"(R), a > 0, then h = 0, and the same conclusion holds for h E H p ( R ) for any fixed real number p, including negative p, because any solution to the equation Qh = 0 is smooth: it is a linear combination of n linearly independent solution to this equation, each of which is smooth and none belongs to L2(R). q j and p j are smooth functions defined on
R a n d o m Fields Estimation Theory
328
Let us assume that R ( z ,y) is a selfadjoint kernel such that
c1ll'pIlc I ( R V , ~I)c~ll'p11?, c1 = const > 0,
V'p E C r ( R ) ,
(A.9)
where (.,.) is the L2(R) inner product, ll'pll- := ll'pllH-..(R, := [ l ' p [ l - a , := 71--m, Il'pJ1p := I l ' p l l H ~ ( R ) ,and we use below the notation llpll+ := Q
Il'pll~..(o,~, := ll'pII~+. The spaces Ha(O,L) and fi-.(O,L) are dual of each other with respect to the L 2 ( 0 , L )inner product, as was mentioned above. If 'p E H-a(O, L ) , then 'p E H-"(IR), and the inequality (A.9) holds for such 'p. By this reason we also use (for example, in the proof of Theorem A . l below) the notation H - for the space f i P a ( O , L ) . Assumption (A.9) holds, for example, for the equation Rh =
exp(-lz - yl)h(y)dy= f (z),
I z I 1.
-1
Its solution of minimal order of singularity is
h ( z ) = (-f "+f)/2+6(z+ 1) [-
f'( -1)
+f (-1)]/2 +J(z
-
+f (1)]/2.
1) [f'( 1)
One can see that the solution is a distribution with support at the boundary of the domain D if the following inequalities (A.lO) and ( A . l l ) hold:
I
~311~II-a+n llQ*~lI-aI ~ 4 l l ~ l l - a + nc3? ,
c4 = const
> 0, VP
E Cr(R)?
(A.lO)
(A.ll) where Q* is a formally adjoint to Q differential expression, and c5 and q are positive constants independent of 'p E Cr(IR). The right inequality ( A . l l ) is obvious because ordPQ* = n + m , and the right inequality (A.lO) is obvious because ordQ* = n. Let us formulate our basic results.
Theorem A . l If (A.9) holds, then the operator R, defined in (A.5), is an isomorphism of H - a ( O , L ) onto Ha(O,L ) , Q = n--m 2 ' Theorem A.2 If (A.7), (A.10) and ( A . l l ) hold, then (A.9) holds and R : H - a ( O , L ) -+ H"(0, L ) is an isomorphism. Theorem A.3 If (A.7), (A.10) and ( A . l l ) hold, and f E H"(O,L), then the solution to (A.4) in H - . ( O , L ) does ex&, is unique, and can be
Analytical Solution of the Basic Integral Equation
329
calculated analytically b y the following formula: h=
lz
G ( z ,y ) Q f d y
+
n-CX-1
+
[aT(-l)jGf)(z,O) a ~ ( - l ) j G f ) ( L x ,) ] , j=O
(A.12) where af are some constants and G ( x , y ) is the unique solution to the problem
PG = d(x - y),
G(x,y ) = 0 for IC < y .
(A.13)
The constants a: are uniquely determined from the condition h ( x ) = 0 for x > L. Remark A . l The solution h E H-.(O,L) (A.4) of minimal order of singularity.
is the solution to equation
Remark A.2 If P = 1 in (A.7) then the solution h to (A.4) of minimal order of singularity, h E H - s ( O , L ) , can be calculated by the formula h = QF, where F is given by (A.22) (see below) and u+ and u- are the unique solutions of the problems Qu+ = 0 if x > L, u y ) ( L )= f ( j ) ( L ) 0, F j 5 n (j) 5 - 1, u+(m) = 0 , and Qu- = 0 if x < 0, u- (0) = f ( j ) ( O ) , 0 I j I - 1, u-(-m) = 0. A.2
Proofs
Proof of Theorem A . l . The set Cr(0, L ) is dense in k-a(O, L ) (in the norm of H - " ( R ) ) . Using the right inequality (A.9),one gets:
(A.14) by the symmetry of R in L2(0,L). This implies IIRIIH-+H+5 c2. Using the left inequality (A.9),one gets: cl((h((2 5 / R h ( / + / ~ hso ~J-,
(A.15)
1
IIR-lIIH++H- I --
(A.16)
C1
Consequently, the range Ran(R) of R is a closed subspace of H+. In fact, Ran(R) = H+. Indeed, if Ran(R) # H + , then there exists a g E H - such
R a n d o m Fields Estimation Theory
330
that 0 = (R$,g) V$J E H - . Taking $J = g and using the left inequality (A.9) one gets (Ig(1- = 0, so g = 0. Thus Ran(R) = H+. Theorem A.l is proved. 0
Proof of Theorem A.2. From (A.7) and (A.8) it follows that the kernel R(z,y) defines a pseudodifferential operator of order -2a = m - n. In particular, this implies the right inequality (A.9). In this argument inequalities (A.lO) and ( A . l l ) were not used. Let us prove that (A.lO) and ( A . l l ) imply the left inequality (A.9). One has
because ordQ* = n. Inequality (A.lO) reads:
If (A.18) holds, then Q* : c4 are positive constants. H-"(R) is an isomorphism of H-"+"(R) onto H-"(R) provided that N(Q) := {w : Qw = 0, w E Ha(&!)}= (0). Indeed, if the range of Q* is not all of H-"(R), then there exists an w # 0, w E H"(R) such that (Q*cp,w)= 0 V p E C r ( R ) , so Qw = 0. If Qw = 0 and w E H"(R), then, as was mentioned below formula (A.8), it follows that w = 0. This proves that Ran(Q*) = H-"(R). Inequality ( A . l l ) is necessary for the left inequality (A.9) to hold. Indeed, let $ = Q*cp, cp E C r ( R ) , then (A.9) implies
where
c3
and -+
c~II(PII'L+~ I ~ l l Q * ' ~ lI l L(RQ*v,Q*v)= (QRQ*v, 'P) = (P&*CP, 'P),
(A.19) where c > 0 here (and elsewhere in this paper) stands for various estimation constants. Because -a n= inequality (A.19) is the left inequality (A.11). The right inequality ( A . l l ) is obvious because the order of the operator PQ* equals to n m. Let us prove now that inequalities ( A . l l ) and (A.lO) are sufficient for the left inequality (A.9) to hold. Using the right inequality (A.lO) and the left inequality (A.ll), one gets:
+
T,
+
Analytical Solution of the Basic Integral Equation
331
Let us prove that the set {$I} = { Q * [ P } V ~ ~ Cis~ dense ( W ) in fi-a(OIL). Assume the contrary. Then there is an h E H-"(O, L ) , h # 0 , such that (Q*cp,h ) = 0 for all cp E C r ( R ) . Thus, (cp, Qh) = 0 for all cp E Cr(IR). Therefore Q h = 0, and, by the argument given below formula (A.8), it follows that h = 0. This contradiction proves that the set { Q * ( P } V ~ ~ C ~ ( W ) is dense in f i P a ( O , L ) . Consequently, (A.20) implies the left inequality (A.9). The right inequality (A.9) is an immediate consequence of the observation we made earlier: (A.7) and (A.8) imply that R is a pseudodifferential operator of order -2a = -(n m). Theorem A.2 is proved.
+
Proof of Theorem A.3. Equations (A.4) and (A.7) imply
P h = g := QF.
(A.21)
Here
f,
x < 0, O<x
U+,
x > L,
21-,
(A.22)
where
Qu- = 0 , x < 0 ,
(A.23)
x > L,
(A.24)
Qu+ = 0 ,
and u- and u+ are chosen so that F E H a @ ) . This choice is equivalent to the conditions: U'j'(0) = j(j)(O), 0
5 j 5 a - 1,
@ ( L ) = f ( j ) ( L ) , 0 5 j 5 (31 - 1. If F E H a ( R ) , then g := QF E Ha-n(R) one gets:
g
= ~j
+
c
n--a-1
j=O
=
(A.25) (A.26)
H - v ( B ) , and, by (A.22),
[a;cW(x) + aj+d(j)(z- L ) ] ,
where aj' are some constants. There are n - a = the same number of constants a;.
(A.27)
9constants aj'
and
Random Fields Estimation Theory
332
Let G ( x ,y ) be the fundamental solution of the equation
which vanishes for x
PG = b(x - y ) in IR,
(A.28)
G ( x ,y ) = 0 for z < y .
(A.29)
< y:
Claim. Such G ( x , y ) exists and is unique. It solves the following Cauchy problem:
PG = 0 ,
x >y,
Gji’(x,y)l
0
= bj,m-l,
Ij 5 m -
1, (A.30)
Z=y+O
satisfies condition (A.29), and can be written as m
G ( z , y )= & ( Y ) % ( X ) ,
z
> Yl
(A.31)
j=l
where cpj(x), 1 I j 5 m, is a linearly independent system of solutions to the equation: P p = 0.
(A.32)
Proof of the claim. The coefficients cj(y) are defined by conditions (A.30) : m
C c j ( y ) ~( k.)(jY ) = bk,m-l,
o I k I m - 1.
(A.33)
j=1
The determinant of linear system (A.33) is the Wronskian W(cp1,.. . ,cpm) # 0, so that c j ( y ) are uniquely determined from (A.33). The fact that the solution to (A.30), which satisfies (A.29), equals to the solution t o (A.28) - (A.29) follows from the uniqueness of the solution to (A.28) - (A.29) and (A.30) - (A.29), and from the observation that the solution t o (A.28) - (A.29) solves (A.30) - (A.29). The uniqueness of the solution to (A.30) - (A.29) is a well-known result.
Analytical Solution of the Basic Integral Equation
333
Let us prove uniqueness of the solution to (A.28) - (A.29). If there were two solutions, GI and Gz, to (A.28) - (A.29), then their difference G := GI - G2, would solve the problem:
P G = 0 in R,
G = 0 for z < y.
(A.34)
By the uniqueness of the solution to the Cauchy problem, it follows that G == 0. Note that this conclusion holds in the space of distributions as well, because equation (A.34) has only the classical solutions, as follows from the 0 ellipticity of P. Thus the claim is proved.
From (A.21) and (A.27) - (A.29) one gets:
(A.35) It follows from (A.35) that h E H - a ( R ) and h = O for z
< 0,
(A.36)
that is, (h,cp) = 0 Vcp E Cp(R) such that suppcp c (-m,O). In order to guarantee that h E H d a ( 0 ,L ) one has to satisfy the condition
h = 0 for z > L.
(A.37)
Conditions (A.36) and (A.37) together are equivalent to supph c [0,L]. Note that although Qf E Ik-* (0, L ) , so that Qf is a distribution, the integral J : G(z, y)Qf dy = J-", G(z, y)Qf dy is well defined as the unique solution to the problem Pw = Qf, w = 0 for z < 0. Let us prove that conditions (A.36) and (A.37) determine the constants af, o 5 j 5 - 1, uniquely. If this is proved, then Theorem A.3 is proved, and formula (A.35) gives an analytical solution to equation (A.4) in Ik-a(O,L) provided that an algorithm for finding uf is given. Indeed, an algorithm for finding G(z,y) consists of solving (A.29) - (A.30). Solving (A.29) - (A.30) is accomplished analytically by solving the linear algebraic system (A.33) and then using
9
334
Random Fields Estimation Theory
formula (A.31). We assume that m linearly independent solutions cpj(z) to (-4.32) are known. Let us derive an algorithm for calculation of the constants u f , 0 5 j 5 - 1, from conditions (A.36) - (A.37). Because of (A.29), condition (A.36) is satisfied automatically by h defined in (A.35). To satisfy (A.37) it is necessary and sufficient to have
9
l o G(z,
y)&f dy
+ H ( z ) = 0 for z > L.
(A.38)
By (A.31), and because the system { c p j } l ~ j ~ is m linearly independent, equation (A.38) is equivalent to the following set of equations:
(A.39) Let us check that there are exactly m independent constants uf and that all the constants uf are uniquely determined by linear system (A.39). If there are m independent constants uf and other constants can be linearly represented through these, then linear algebraic system (A.39) is uniquely solvable for these constants provided that the corresponding homogeneous system has only the trivial solution. If f = 0, then h = 0, as follows from Theorem 1.1, and g = 0 in (A.27). Therefore a: = 0 V j , and system (A.39) determines the constants a: V j uniquely. Finally, let us prove that there are exactly m independent constants u;. Indeed, in formula (A.21) there are linearly independent solutions u; E L2(-Co,O), so
5
(A.40) j=1
and, similarly, u+ in (A.21) is of the form
(A.41)
Analytical Solution of the Basic In.tegra1 Equation
335
where uj' E L2(0,m). Condition F E H a @ ) implies
and
= 7 indeEquations (A.42) and (A.43) imply that there are pendent constants by and 7 independent constants b: , and the remaining n-m constants b; and b; can be represented through these m constants by consolving linear systems (A.42) and (A.43) with respect to, say, first stants, for example, for system (A.42), for the constants b y , 1 5 j 5 This can be done uniquely because the matrices of the linear systems (A.42) and (A.43) are nonsingular: they are Wronskians of linearly independent j solutions {ui}1 G n-m 5 7 and { u + The constants a f can be expressed in terms of b: and f by linear relations. Thus, there are exactly m independent constants af . This completes 0 the proof of Theorem A.3.
2
y.
Remark A.3 In Chapter 5 a theory of singular perturbations for the equations of the form
&+Rh, =f
(A.44)
is developed for a class of integral operators with a convolution kernels R ( x , y ) = R ( x - y ) . This theory can be generalized to the class of kernels R ( x ,y ) studied in the present paper. The basic interesting problem as: f o r any E > 0 equation (A.44) has a unique solution h, E L2(0,L ) ; how can one find the asymptotic behavior of h, as E --+ 0 2 The limit h of h, as E -+ 0 should solve equation Rh = f and, in general, h is a distribution, h E H P a ( O , L ) . The theory presented in Chapter 5 allows one to solve the above problem for the class of kernels studied here.
Remark A.4 Theorems A.1 and A.2 and their proofs remain valid in the case when equation (A.4) i s replaced by the R ( x , y ) h ( y ) d y= f ,
x E D.
(A.45)
336
Random Fields Estimation Theory
Here D c R', r > 1, is a bounded domain with a smooth boundary S, is the closure of D , R ( x ,y ) solves (A.7),where P and Q are uniformly elliptic differential operators with smooth coeficients, o r d P = m 2 0 , ordQ = n > m, equation Qh = 0 has only the trivial solution in H p ( R ' ) f o r any fixed real number p. Under the above assumptions, one can prove that the operator defined by the kernel R ( x ,y ) is a pseudodifferential elliptic operator W e do not assume that P and/or Q are of order -20, where Q := selfadjoint or that P and Q commute. An analog of Remark 2.1 holds f o r the multidimensional equation (A.44)as well. Equation (A.45)is the basic integral equation of random fields estimation theory.
y.
Appendix B
Integral Operators Basic in Random Fields Estimation Theory
B.l
Introduction
Integral equations theory is well developed starting from the beginning of the last century. Of special interest are the classes of integral equations which can be solved in.closed form or reduced to some boundary-value problems for differential equations. .There are relatively few such classes of integral equations. They include equations with convolution kernels with domain of integration which is the whole space. These equations can be solved by applying the Fourier transform. The other class of integral equations solvable in closed form is the Wiener-Hopf equations. Yet another class consists of one-dimensional equations with special kernels (singular integral equations which are reducible to Ftiemann-Hilbert problems for analytic functions, equations with logarithmic kernels, etc). (See e.g. [Zabreiko et al. (1968)], [Gakhov (1966)j.) In Chapter 5 a new class of multidimensional integral equations is introduced. Equations of this class are solvable in closed form or reducible to a boundary-value problem for elliptic equations. This class consists of equations (B.3) (see below), whose kernels R(z,y) are kernels of positive rational functions of an arbitrary selfadjoint elliptic operator in L2(Rn),where n 2 1. In Appendix A this theory is generalized to the class of kernels R(z,y) which solve problem QR = P6(z- y), where 6(z) is the delta-function, Q and P are elliptic differential operators, and z E R1.Ellipticity in this case means that the coefficient in front of the senior derivative does not vanish. In Appendix A integral equations (B.3) with the kernels of the above class are solved in closed form by reducing them to a boundary-value problem for ODE. Our aim is to generalize the approach proposed in Appendix A to the multidimensional equations (B.3) whose kernel solves equation QR = Pd(x - y) in R",where n > 1. This is
337
338
Random Fields Estimation Theory
not only of theoretical interest, but also of great practical interest, because, as shown in Chapter 1, equations (B.3) are basic equations of random fields estimation theory. Thus, solving such equations with larger class of kernels amounts to solving estimation problems for larger class of random fields. The kernel R(z,y) is the covariance function of a random field. The class of kernels R, which solve equation QR = PS(x - y) in R", contains the class of kernels introduced and studied in Chapters 1-4. Our theory is not only basic in random fields estimation theory, but can be considered as a contribution to the general theory of integral equations. Any new class of integral equations, which can be solved analytically or reduced to some boundary-value problems is certainly of interest, and potentially can be used in many applied areas. For convenience of the reader, the notations and auxiliary material are put in Section B.4. This appendix follows closely the paper [Kozhevnikov and Ramm (2005)l. Let P be a differential operator in Rn of order p1
P := P ( 2 ,D ) :=
c
a,
( 2 )D a ,
l45P
where a, (x) E C" (R") The polynomials
P ( 2 ,C) :=
.
C a,
E"
(2)
PO ( 2 ,6) :=
and
149
C a, (x)E" lffl=P
are called respectively symbol and principal symbol of P. Suppose that the symbol p(z,<) belongs to the class SG(Pyo)(Rn)consisting of all C" functions p ( 2 ,<) on R"x R", such that for any multiindices a ,p there exists a constant Ca,p such that
lD,aD;P(z,C)I
I c,,p
(<)P-'pl
(x)- I 4
1
x , l E R",
(0 :=(I + IEl2 11/2
(B.1) It is known (cf. [Wloka et al. (1995), Prop. 7.21) that the map P ( x , D ) : S ( R n ) 4 S(Rn)is continuous, where S(Rn) is the Schwartz space of smooth rapidly decaying functions. Let HS(R") (s E R) be the usual Sobolev space, It is known that the operator P ( x ,D ) acts naturally on the Sobolev spaces, that is, the operator P ( x ,D ) is (cf. [Wloka et al. (1995), Sec. 7.61) a bounded operator: H S ( R n )-+Hs-P(Rn) for all s E R. The operator P ( x , D ) is called elliptic, if p o ( z , < ) # 0 for any z E R", E E Rn \ (0).
Integral Operators Basic in Random Fields Estimation Theory
339
Let P (2,D) and Q (z, D) be both elliptic differential operators of even orders p and v respectively, 0 I p < v, with symbols satisfying (B.l) (for Q (5, 0 ) we replace p and p in (B.l) respectively by q and v). The case p 2 v is a simpler case which leads to an elliptic operator perturbed by a compact integral operator in a bounded domain. W e assume also that P (x,D ) and Q (x,D ) are invertible operators, that is, there exist the inverse bounded operators P-' ( z , D ) f H"-P (R") ---f H S(R") and Q-l(z, D): Hs-" (R") .+ H S (Rn)for all s E R. Let R := Q-l (z, 0 )P (2,D) . The invertibility of P (z, D) and Q (x,D) imply that R is an invertible pseudodifferential operator of negative order p - v acting from H" (R") onto H"+"-P (R") (s E R) . Since P and Q are elliptic, their orders p and v are even for n > 2. If n = 2, we assume that p and v are even numbers. Therefore, the number a := (v - p ) /2 > O is an integer. Let 52 denote a bounded connected open set in R" with a smooth boundary dR (Cm-class surface) and its closure in L2(R), = R U d o . The smoothness restriction on the domain can be weakened, but we do not go into detail. The restriction Rn of the operator R to the domain R c R" is defined as
a
n
(B.2)
Rn := rnRen-,
where en- is the extension by zero to R- := R n \ n and rn is the restriction t o R. It is known (cf. [Grubb (1990), Th. 3.11, p. 3121) that the operator Rn defines a continuous mapping
Rn : H" (52) + H"+"-'
(52)
(S >
-1/2),
where H S (52) is the space of restrictions of elements of H" (R") to 52 with the usual infimum norm (see Section B.4). The pseudodifferential operator R of negative order p - v and its restriction Rn can be represented as integral operators with kernel R (z,y ) :
Rh=/R(s,y)h(y)dy, Rnh=/n(z,Y)h(Y)dY
(.€a),
n
W"
where R (z, y) E C" (W"x R" \ Diag) , Diag is the diagonal in R" x R", Moreover, R (2,y) has a weak singularity:
IR (2,!Ill I c 1%
- Yl-'
n+p-v
I u < n.
Random Fields Estimation Theory
340
+
For n p - v < 0 , R (x,y) is continuous. Let y := n + p - v and rZy := 1x - yI 4 0. Then R(z,y) = O(r;;) if n is odd or if n is even and v < n, and R ( x ,y) = O(r;J logr,,) if n is even and v > n. In Chapter 1, the equation
is derived as a necessary and sufficient condition for the optimal estimate of random fields by the criterion of minimum of variance of the error of the estimate. The kernel R(x,y) is a known covariance function, and h(x,y) is the distributional kernel of the operator of optimal filter. The kernel h(z,y) should be of minimal order of singularity, because only in this caSe this kernel solves the estimation problem: the variance of the error of the estimate is infinite for the solutions to equation (B.3), which do not have minimal order of singularity. In Chapters 1-4, equation (B.3) was studied under the assumption that P and Q are polynomial functions of a selfadjoint elliptic operator defined in the whole space. In Appendix A some generalizations of this theory are given. In particular, the operators P and Q are not necessarily selfadjoint and commuting. In this appendix an extension to multidimensional integral equations of some results from Appendix A is given. We want to prove that, under some natural assumptions, the operator Rn is an isomorphism of the space H C a ( R ) onto H a ( R ) , where a = (v - p ) /2 > 0, and H," (R) , s E R, denotes the subspace of H 3 (Rn) that consists of the elements supported in To prove the isomorphism property, we reduce the integral equation (B.3) t o an equivalent elliptic exterior boundary-value problem. Since we look for a solution u belonging to the space H a (R-) = H ( v - p ) / 2 (R-) , and the differential operator Q is of order v, then Qu should belong to some Sobolev space of negative order. This means that we need results on the solvability of equation (B.3) in Sobolev spaces of negative order. Such spaces as well as solvability in them of elliptic differential boundary value problems in bounded domains have been investigated in b i t b e r g (Iggfj)] and later in [Kozlov et al. (1997)]. The case of Pseudodifferential boundary value problems has been studied in [Kozhevnikov (2001)). In [Erkip and Schrohe (1992)l and in [Schrohe (1999)] the solvability of elliptic differential and pseudodifferential boundary value problems for unbounded manifolds, and in particular for exterior domains, has been established.
a.
Integral Operators Basic in Random Fields Estimation Theory
341
These solvability results have been obtained in weighted Sobolev spaces of positive order s. To obtain the isomorphism property, we need similar solvability results for exterior domain in the weighted Sobolev spaces of negative order. One can find in Section B.4 the definition of these spaces (cf. [Roitberg (1996)l). B.2
Reduction of the basic integral equation to a boundaryvalue problem
In Theorem B.l the differentiation along the normal to the boundary 03, is used. This operator is defined in Section B.4.
Theorem B . l Integral equation (B.3) is equivalent to the following system (B.4), (B.5), (B.6):
Qu=O in RD i u = D i f ondR, Ph = QF,
h
E
O<j_
H;" (0),
(B.4) 03.5)
where u E Ha(0-) is an extension o f f : F E H a (Rn),
F
:=
f E Ha(R) inR, u E H a (R-) in R-
Proof. Let h E HCa (0)solve equation (B.3), Rnh = f E H a (R). Let us define F := Q-IPh. Since h E H i a (a),it follows that Ph E H-"-" (Rn) and F = Q-lPh E H-a+v-P(Rn) = H a ( R " ) . We have f = Rnh = rnQ-'Ph = rnF, so F is an extension o f f . Therefore, F can be represented in the form (B.6). Furthermore, since F = Q-IPh, then Ph = QF, that is, h solves (B.5). Since h E H;"(R), then QF = Ph E HFa-" ( R ) . It follows, that Qu = 0 in R-. Since F E H a (R") , we get D i u = Dhf on Nl-, 0 5 j 5 a - 1. This means that u E H a (R-) solves the boundaryvalue problem (B.4). Thus, it is proved that any solution to (B.3) solves problem (B.4), (B.5). Conversely, let a pair (u,h) E H a (R-) x HGa (R) solve system (B.4), (B.5), (B.6). Since Ph = QF, then Rh = Q-'Ph = F. It follows from (B.6) that Rnh = Rhl, = F J , = f , i.e. h solves (B.3). 0 Remark B . l If p > 0, the boundary value problem (B.4) is underdetermined because Q is an elliptic operator of order u which needs u / 2 boundary
Random Fields Estimation Theory
342
conditions, but we have only a ( a < v/2) conditions in (B.4). Therefore, the next step is a transformation of equation (B.5) into 1.112 extra boundary conditions to the boundary value problem (B.4). This will be done in Theorem B.2. Let us define
K.
([',A)
:= (1
+ 1t'I2+ A2)"2.
Choose a function
E-
p ( 7 ) E S ( R ) with suppF-'p c and p ( 0 ) = 1. Let a > 2sup (8,p (T)1. Let denote a family (A E R+, t E Z) of order-reducing
pseudodifferential operators (K
(c', A) p (*))
:=
F 1 x + ([,
+ i[n)t are their
symbols.
A) 3, where
x+ (t,A)
It has been proved in
(-9 onto itself and has the following isomorphism
[Grubb (1996), Sec. 2.51 that the operator Z:"+,x maps the space So R
{u E S (R") : supp u c -7 R,
:=
:=
properties for s E R:
E"+,x: HG (R?)
1 : H,SFt (RY).
03.8)
It is known ([Grubb (1996)], [Schrohe (1999)l) that using Z:"+,x and an appropriate partition of unity one can obtain, for sufficiently large A, the operator A$ which is an isomorphism:
A: : H" (Rn)= H S - t (R") , Vs E R, and
A;
: H,S
(52) 1: H,S-t (0), Vs E R.
(B.9)
Lemma B . l Let P (2, D ) be an invertible differential operator of order p, that is, there exists the inverse operator P-' ( x , D ) which is bounded: P-' (z, D ) : HS-W (R") --+ H s (R") for all s E R. Then a solution h to the equation
P (z, D ) h = g ,
gE
Hra-' (0)
belongs to the space HF" (R) i f and only if g satisfies the following p/2 boundary conditions: ranDiA,a-p'2P-1 (z, D ) g = 0
( j = 0, ..., p/2 - 1 ) .
Integral Operators Basic in Random Fields Estimation Theory
343
proof. Necessity. Let h = P-’ (x,0 )g, h E El;” (O), solve the q u a tion P ( z , D )h = g, g € Hra-’ (0). By (B.9), we have Ala-’/’h € H f 2 (R) . Therefore, ranD~A~”-’/’h = 0 ( j = 0 , . . . , p /2 - 1). Sufficiency. Assume that the equalities ranDjn A-”-’“/h + = 0 ( j = 0, ...,p / 2 - 1) hold. Since g E Hta-’ (0)C H-”-’ ( R n ) , we have h = P-l (z, D ) g E H-” (&in). Therefore, 9 := A+”-”/’h E HP/’ (IFF). Since raaDi9 = 0 ( j = 0, ...,p/2 - l), we have 8 = 9+ 8-, where Q+ := ea-r@ E H f ’ (52) and 9- := earn-9 E Hf’ (52-) . Since A”: : H f ’ (52) E H;“ (Q) , it follows that E HGa (0).Moreover, is a differential operator with respect t o the variable zn, hence supp9- c implies ~ u p p A ” + ~ 9cSince P is a differential operator,
+
a-
n-.
9- c suppA”,”8-
supp (PA:”)
c a-.
On the other hand, we have
+
@ := (PA”,”) 9 = (PA”,”)
For any cp E C r
(9+ 9-)= (PA:”)
Thus, s ~ p p ( P A f ; / ~9)
(
+ ( (PA”,/’)
E
c
n. It follows that
A:”9-
PA:”
c 80. Therefore, = 0 for 9- E CF (0-). Since C r
one gets
=
9-.
>
=0
( (PA”,”)
9-,’p)
su~p(PAf;’~)9- c 80.
0 *.0(R-)
C F ( 0 - ) , we have
supp PA”/2 \k+
+ (PA.;/’)
(R-)one has:
O = (@,cp) = ( ( P A f ; ” )
For any 8-
9+
9-
C”(Rn) and
€
= 0. Since P is invertible,
is dense in H f ’
(a2_),
for 9- E H f 2 (a2_). It follows that
h = A”,”@ = A’;/’++
+ A +v/’ 9- = II”,’~Q+ E H;” ( Q ) .
Lemma B . l is proved. 0 Let F E C“ (52) n S Assume that F has finite jumps Fk of the normal derivative of order k (k = 0,1, ...) on 80. For x’ € 852, we will use the following notation:
p-).
Fo
:= [F],,
:=
.:yo
( F (x’+En) - F
- En)),
Random Fields Estimation Theory
344
(n-)
(9
Let f E C" and u E S , and define Y k f (2') := ranDkf (x'), T k u (x'):= ranDkU (Z') . Let ban denote the Dirac measure supported on ds2, that is, a distribution acting as
It is known that for any differential operator Q of order u there exists a v
C Q j D:,
representation Q =
where Qj is a tangential differential operator
j=O
of order v - j (cf Section B.4). We denote by { D a F (x)} the classical derivative a t the points where it exists. The following Lemma B.2 is essentially known but for convenience of the reader a short proof of this lemma is given.
Lemma B.2
The following equality holds f o r the distribution Q F :
(B.10)
Proof, Let cos (nxj) denote cosine of the angle between the exterior unit normal vector n t o the boundary as2 of s2 and the xj-axis. We use the known formulas
1
1
n
an
e d x=
u (x)cos (nq)do,
u (x)E C"
,
j = 1, ..., n,
w ( x ) ~ C r p - ) j = 1 , . , . , n. n-
an
where do is the surface measure on dfl. Applying these formulas to the products u (x)cp (x) and w (x)p (x) where p (z) E CP (IF') u, (z)E C" , w (x)E CF (K) , we get
(9
1
*cp
n
9X.j
(x)dx = -
1
u (x)*dx
n
dXj
+
1
u (x)cp (x)cos (nxj) d a
an
j = 1,...,n, (B.ll)
345
Integral Operators Basic in Random Fields Estimation Theory
1zcp
n-
w (x)cp (x) cos (nxj) da
(x) dx = n-
an j = 1,..,,n. (B.12)
By (B.ll), (B.12), we have
This means,
It follows, D,F = {D,F} -iF&n. F'urthermore, using the last formulawe have D:F = D, { D n F } - iD, (FoSan) = { D B F } - iF1Gan - iD, (FoJan) and so on. By induction one gets: j- 1
D ~ = F { D ~ F-} i
C D; (Fj-l-kJan)
( j = 1,2, ...I.
k=O Y
Substituting this formula for D i F into the representation Q =
C &jD3,, j=O
we get (B.lO). Lemma B.2 is proved.
0
Denoting in the sequel the extensions by zero to R" of functions f (x) E , as fo and uo, and using Lemma B.2, we obtain
(9
(a_)
Coo , u (x)E S the following formulas: Y
i-I
(B.13)
Random Fields Estimation Theory
346
j-1
w
+
(Qu)O= Q (uo) i
C C Dk ( (Di-'-ku) Ian Sari) Qj
(u E S @-))
,
k=O
j=1
(B.14) where ( D i f ) := r8nDi.f. Using these formulas one can define the action (Q) and fjs,w(Q-) of the operator Q upon the elements of the spaces (s E R) (defined in Section B.4) as follows (cf. [Kozlov et al. (1997), Sec. 3.2.1, [Roitberg (1996), Sec. 2.41):
Ian
fjsiw
w
j-1
-iCQjC@($~j-kaan) j=1 k=O
(Q(f7$J)O:=Q(f0)
((f7$J
E$""(R)),
(B.15)
(B.16) It is known ([Roitberg (1996)], [Kozlov et al. (1997)], [Kozhevnikov (2001)l) that Q , defined respectively in (B.15) and (B.16), is a bounded mapping Q : fjs9w (R)
--f
?is-" (0) and Q : fjsJ' (0-) -+ ?is-"(0-).
Moreover, Q is respectively the closure of the mapping f -+ Q ( z , D )f (f E C" or u -+ Q (z, 0 )u (uE S between the corresponding spaces. Let Wme ( m = 1, ..., p / 2 , f2 = a 1,...,v) be the operator acting as follows:
(9)
p-))
+
Y
-a-p/2p-1
Wmt (4) := iym-lA+
W
C j=e C QjDi-'
(daan),
4 E C" (an),
ka+l
(B.17) where yk is the restriction to dQ of the Dk (cf. Section B.4). The mapping Wme is a pseudodifferential operator of order m - p v/2 - 1- l . Therefore, for any real s, this mapping is a bounded operator:
+
wme: HS(
8 ~+) Hs-m+p-wP+l+e
(an).
For (f,$J E fja9" (R) , one has g := Q (f,$J E Hg-w(a), and we set
wa+m := -7m-1
~+ - a - ~ " / ~ - i( ~ mo=
1,..., p / 2 ) 7
(B.18)
Integral Operators Basic in Random Fields Estimation Theory
347
where the operator ym-lA+a-"/2P-1 (x,D ) is a trace operator of order m - 1 - a - 3p/2. It follows that wu+, E HP/2-m+1/2( d o ) . Theorem B.2
Integral equation (B.3) Rah = f E H a (a), h E H;" (R)
is equivalent to the following boundary-value problem: in R-,
(Qu=O
where the functions u, f and h are related by the formulas h = P-lQF,
F E H a ( W ) , F :=
f
E
Hu(R) inR,
u E H a (0-) in R-.
Proof. Our starting point is Theorem B . l . Consider the equation P h = QF, h E H;"(R). Since F E H'(Rn) and Qu = 0 in R- by (B.4), then QF E Ht-" (0) = HGa-" (0).By Lemma B . l , a solution h to the equation P h = QF E Hca-@(0)belongs to the space H t a (0)if and only if QF satisfies the following p/2 boundary conditions:
ranD,~ - ~ + A - ~ - " / ~ P -=~0, Q Fm = 1,...,p/2.
(B.20)
+
+
Since F = f o uo, one has QF = Q ( j o ) Q (uo). Substituting the last expression into (B.20) we have + P-lQ (u') = -7m-1 A-.-P/2 + P-lQ (f') ym-1 A-a-P/2
From (B.15) and (B.16), one gets: v
i-1
j=1
k=O
j=1
k=O
m = 1,...,PI 2.
Random Fields Estimation Theory
348
Since
F
:=
f
E H a ( R ) inR, u E H a (R-) in 0-
and F E H a (Rn),
it follows that y j - 1 ~= y j - l f , j = 1, ...,a. Therefore, $ j = yj-lu = y j - l f = $j, j = 1I...,a. We identify the space H a (0) with the subspace of (R) of all = ( f , $ 1 , ...,$), such that = ... = &, = 0. Let (f,$) belong to this subspace and (u, 4) = ( u ,$ 1 , ...,$ u ) E fia~(.) (R-).Then we can rewrite (B.21) as fial(v)
(f,d
V
i
j=1
e=a+l
Changing the order of the summation
we get u
i~m-lA+a-”2P-1
v
C C QjDi-e ($[ban)
ka+l j=t
- -7m-1 A-a-p/2 + p-l (Q (f,*))O,
(B.23)
where m = 1, ...,p/2. In view of (B.17) and (B.18), formula (B.23) can be rewritten as p /2 equations
Since 4j = y j - 1 ~for u E
s (E), we get
U
C
Wme (ye-lu) = wa+m on 80, m = 1, ...,p/2.
e=a+l These equations define boundary-value problem
p/2 extra boundary conditions for the
in R-, D i u = D i f onaR, O < j < a - l . Qu=O
Integral Operators Basic in Random Fields Estimation Theory
349
Theorem B.2 is proved.
B.3
cl
Isomorphism property
We look for a solution u E H a (R-) to the boundary-value problem (B.19). Let us consider the following non-homogeneous boundary-value problem associated with (B.19): in RondR,
u=w 7oB.j~:= 7oDA-l~= ~j
{Q
1I j I a
Y
T d a + m U :=
C
e=a+l
Wme(ue-1) = wa+m on aR,
1I m
I p/2,
(B.24) where w , wj j = 1,..., v/2, are arbitrary elements of the corresponding Sobolev spaces (see below Theorems B.3 and B.4). For the formulation of the Shapiro-Lopatinskii condition we need some notation. Let E > 0 be a sufficiently small number. Denote by U (&-conicneighborhood) the union of all balls B ( 2 ,E ( x ) ), centered at x E 80 with radius E ( x ). Let y = ( y ’ , y n ) = ( y l , ...,yn-l, y n ) be normal coordinates in an Econic neighborhood U of dR, that is, dR may be identified with { y n = 0) , yn is the normal coordinate, and the normal derivative D, is D,, near dR. Each differential operator on Rn with SG-symbol can be written in U as a differential operator with respect to D,, and Dyn:
j=O
( y , D,!)are differential operators with symbols belonging to SG(”lo)( W ) .Let
where
Qj
I,
q ( Y ,E ) = 4 ( Y ,
C qj (Y,E’)
be the symbol of Q, where E‘ and En are cotangent variables associated with y‘ and yn. Assumption 1. W e assume that the operator Q is md-properly elliptic (cf. [Erkip and Schrohe (1992), Assumption 1, p. do]), that is, for all large IyI IE’I the polynomial q (y, E’, z ) with respect to the complex variable z has exactly u/2 zeros with positive imaginary parts T I (y’, , ..., T ~ (y’, / ~ .
+
Random Fields Estimation Theory
350
We conclude from Assumption 1 that the polynomial q (y, z ) has no real zeros and has exactly v/2 zeros with negative imaginary part for all large IYI + In particular, the Laplacian A in the space R” (n 2 2) is elliptic in the usual sense but not md- properly elliptic, while the operator I - A is md-properly elliptic. Let
where g = ( g i j ) is a Riemannian metric on 130.We denote
n
u/2
4+ (9’1 “1
:=
( z - x (Y’lE’)-l rj (Y’,E‘))
.
j=1
Consider the operators Bm (m = 1,..., v / 2 ) from (B.24). Each of them is of the form V-1 - Bm = Bmj (Y’, ~ y) Dt $ ~
C
j=O
in the normal coordinates y = (y’,y,) = (yl, ...,yn-l,yn) in an &-conic neighborhood of I30. Here B m j (y’, Dyt) is a pseudodifferential operator of order Pm - j ( p m E N) acting on 80. Let bmj (y’, E’) denote the principal symbol of Bmj (y’, Dyj). The operators B , in the boundary-value problem (B.24) are operators of this type. We set u-1
bm ( ~ ’ 7E’,
:=
C bmj ( ~ ’ 1E’) x ( ~ ’ 1E
I
-p,+j
zj .
j=O
Define the following polynomials with respect to z:
j=1
as the residues of bm (y’, E’, z ) modulo qf (y’, t’,z ) , i.e. we get rmj (y‘, t’) representing b, (y‘, z ) in the form u/2
bm (Y’, E’, Z ) = qm ( z ) q+ (Y’,E‘)
+ C rmj (Y’, E’) zj-l, j=1
Integral Operators Basic an Random Fields Estimation Theory
351
where qm ( z ) is a polynomial in t. Assumption 2. (Shapiro-Lopatinskii condition) The determinant det(r,j (y’, S’)) is bounded and bounded away from zero, that is, there exist two positive constants c and C such that
0 < c 5 det (rmj
5 C.
Remark B.2 The following Theorem B.3 has been proved in [Erkip and Schrohe (1992), Th. 3.11 in the more general case of the SG-manifold. The latter includes the exterior of bounded domains which is a particular case of the SG-manifolds. This particular case was chosen f o r simplicity of the exposition. Moreover, the results in [Erkip and Schrohe (1992), Th. 3.11, [Schrohe (1999)] have been obtained for operators acting in weighted Sobolev spaces. The usual Sobolev spaces in Theorem B.3 are particular cases of the weighted Sobolev spaces with zero order of the weight. Theorem B.3 (cf [Erlcip and Schrohe (1992), Th. 3.11, [Schrohe (1999)l). If the differential operator Q of even order v satisfies Assumptions 1 and 2, that is Q is md-properly elliptic and the Shapiro-Lopatinskii condition holds for the operator (Q,yoBl,...,yoBV/2),then the mapping (Q,yoB1,...,yoB,/2) : H” (R-)
-+
Hs-” (Cl-)~n HS-pJ-1/2(an),s 2 v, j=1
is a fiedholm operator. Assumption 3. The fiedholm operator ( Q ,yoB1, ...,yoB,/z) has the trivial kernel and colcernel. For example, if the kernel R(z,y) has the property (Rh,h ) 2 cllhll”,,-. for all h E H t a , where c = const > 0 does not depend on h, then the operator in Assumption 3 is invertible (see Chapter 1).
Corollary B.1 Under the assumptions of Theorem B.3 and in addition under Assumption 3, for any s ∈ R there exists a bounded (Poisson) operator

K: Π_{j=0}^{ν/2-1} H^{s+ν-j-1/2}(∂Ω) → H^{s+ν}(Ω_−)   (B.25)

which gives a unique solution u = Kχ to the boundary-value problem

Qu = 0 in Ω_−,  γ_0 B_1 u = χ_1, ..., γ_0 B_{ν/2} u = χ_{ν/2}   (B.26)
with

χ = (χ_1, ..., χ_{ν/2}) ∈ Π_{j=0}^{ν/2-1} H^{s+ν-j-1/2}(∂Ω).

More precisely, the operator u = Kχ solves the problem with s < 0 in the sense that u = Kχ is the limit in the space H^{s+ν}(Ω_−) of a sequence u_n ∈ H^{ν}(Ω_−) with

Qu_n = 0,  lim_{n→∞} γ_0 B_j u_n = χ_j  (j = 1, ..., ν/2),

the convergence being in Π_{j=0}^{ν/2-1} H^{s+ν-j-1/2}(∂Ω).

Proof. The statement of the Corollary is an immediate consequence of Theorem B.3 due to the fact that the solution operator for the boundary-value problem (B.26) with the homogeneous equation Qu = 0 in Ω_− is a Poisson operator. The latter acts in the full scale of Sobolev spaces [Schrohe (1999)], that is, (B.25) holds for all s ∈ R. □

Theorem B.4 Under the assumptions of Theorem B.3 and in addition under Assumption 3, the mapping R_Ω, defined in the Introduction, is an isomorphism: H_0^{-α}(Ω) → H^{α}(Ω).
Proof. Let us consider the operator (Q, γ_0 B_1, ..., γ_0 B_{ν/2}) generated by the boundary-value problem (B.24). Taking into account that

β_j = ord B_j = j − 1  for j = 1, ..., α,
β_j = ord B_j = j − p + ν/2 − 2  for j = α + 1, ..., ν/2,

one concludes by Theorem B.3 that the mapping

(u, φ) ↦ (Q(u, φ), γ_0 B_1(u, φ), ..., γ_0 B_{ν/2}(u, φ)) = (w, w_1, ..., w_{ν/2})

is a Fredholm operator. It maps the space H^s(Ω_−) to the space

H^{s-ν}(Ω_−) × Π_{j=1}^{α} H^{s-j+1/2}(∂Ω) × Π_{j=α+1}^{ν/2} H^{s-j+p-ν/2+3/2}(∂Ω)  (s ≥ ν).
Assumption 3 implies that this mapping is an isomorphism. By Corollary B.1, the operator K, solving the boundary-value problem

Qu = 0,  γ_0 B_j u = χ_j  (j = 1, ..., m),

is a Poisson operator

K: Π_{j=0}^{m-1} H^{s+2m-j-1/2}(∂Ω) → H^{s+2m}(Ω_−)  (s ∈ R),

where 2m = ν.
Choosing s = α and using Theorem B.2, we conclude that for any f ∈ H^{α}(Ω) the function u is a unique solution to the boundary-value problem (B.19). Therefore, again by Theorem B.2, the operator R_Ω is an isomorphism of the space H_0^{-α}(Ω) onto H^{α}(Ω). Theorem B.4 is proved. □
Example B.1 Let P = I be the identity operator (its order p = 0) and Q = I − Δ (ν = ord Q = 2). Then, by Theorem B.4, the corresponding operator R_Ω is an isomorphism: H_0^{-1}(Ω) → H^{1}(Ω).

Under the assumptions of Theorem B.4 there exists a unique solution to the integral equation (B.3). Let us find this solution. Examples of analytical formulas for the solution to the integral equation (B.3) can be found in [Ramm (1990)]. Analytical formulas for the solution, in the cases when the corresponding boundary-value problems are solvable analytically, can be obtained only for domains Ω of special shape, for example, when Ω is a ball, and for special operators Q and P, for example, for operators with constant coefficients. We give such a formula for the solution of equation (B.3) assuming P = I and Q = −Δ + a²I. Consider the equation

Rh := ∫_Ω R(x, y) h(y) dy = f(x),  x ∈ Ω ⊂ R³,   (B.27)

with the kernel R(x, y) := exp(−a|x − y|)/(4π|x − y|), P = I, and Q = −Δ + a²I. By formula (2.24), one obtains a unique solution to the equation (B.27) in H_0^{-1}(Ω):
h(x) = (−Δ + a²) f(x) + (∂_n f − ∂_n u) δ_{∂Ω},   (B.28)

where δ_{∂Ω} is the delta-function concentrated on ∂Ω, so that the second term is a single layer on the boundary, and u is the unique solution to the exterior Dirichlet boundary-value problem

(−Δ + a²) u = 0 in Ω_−,  u|_{∂Ω} = f|_{∂Ω},  u(x) → 0 as |x| → ∞.

For any φ ∈ C_0^∞(R³) one has

((−Δ + a²) Rh, φ) = (Rh, (−Δ + a²) φ)   (B.29)

= ∫_Ω (−Δ + a²) f · φ dx + ∫_{∂Ω} (∂_n f − ∂_n u) φ ds,

where the condition u = f on ∂Ω was used; the right-hand side of (B.29) is exactly (h, φ) with h given by (B.28). Thus, we have checked that formula (B.28) gives the unique in H_0^{-1}(Ω) solution to equation (B.27). This solution has minimal order of singularity.
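The relation QR(x, y) = δ(x − y) for Q = −Δ + a² and the kernel R(x, y) = exp(−a|x − y|)/(4π|x − y|), which underlies formula (B.28), can also be illustrated numerically. The following minimal sketch (in Python; the grid parameters, the value a = 1 and the Gaussian test function g are illustrative choices, and the quadrature and finite-difference errors decrease as the grid is refined) samples the kernel on a cube, computes Rg by FFT-based convolution, applies Q by the seven-point difference formula and compares the result with g.

import numpy as np

a, n, L = 1.0, 64, 20.0              # decay rate, points per axis, box size (illustrative)
h = L / n
x = (np.arange(n) - n // 2) * h
X, Y, Z = np.meshgrid(x, x, x, indexing="ij")
r = np.sqrt(X**2 + Y**2 + Z**2)

# sampled kernel; the singular cell at r = 0 is replaced by its approximate cell average
R = np.zeros_like(r)
mask = r > 0
R[mask] = np.exp(-a * r[mask]) / (4.0 * np.pi * r[mask])
r_c = (3.0 / (4.0 * np.pi)) ** (1.0 / 3.0) * h     # radius of the ball with volume h^3
R[~mask] = (r_c**2 / 2.0) / h**3                   # ~ (1/h^3) * int_{|y|<r_c} dy / (4*pi*|y|)

g = np.exp(-0.5 * r**2)                            # smooth, effectively compactly supported

# u(x) = int R(x - y) g(y) dy via periodic FFT convolution (wrap-around is negligible here)
u = np.real(np.fft.ifftn(np.fft.fftn(np.fft.ifftshift(R)) * np.fft.fftn(g))) * h**3

# apply Q = -Delta + a^2 with the 7-point finite-difference Laplacian
lap = (np.roll(u, 1, 0) + np.roll(u, -1, 0) + np.roll(u, 1, 1) + np.roll(u, -1, 1)
       + np.roll(u, 1, 2) + np.roll(u, -1, 2) - 6.0 * u) / h**2
print(np.max(np.abs(-lap + a**2 * u - g)) / np.max(np.abs(g)))   # small, and decreasing as h -> 0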
B.4 Auxiliary material
We denote by R the set of real numbers and by C the set of complex numbers. Let Z := {0, ±1, ±2, ...}, N := {0, 1, ...}, N_+ := {1, 2, ...}, R^n := {x = (x_1, ..., x_n): x_i ∈ R, i = 1, ..., n}. Let α be a multi-index, α := (α_1, ..., α_n), α_j ∈ N, |α| := α_1 + ... + α_n; i := (−1)^{1/2}; D_j := i^{-1} ∂/∂x_j; D^α := D_1^{α_1} D_2^{α_2} ... D_n^{α_n}. Let C^∞(Ω̄) be the space of functions in Ω infinitely differentiable up to the boundary. A normal vector field n(x) = (n_1(x), ..., n_n(x)) is defined in a neighborhood of the boundary ∂Ω as follows: for x_0 ∈ ∂Ω, n(x_0) is the unit normal to ∂Ω pointing into the exterior of Ω. We set

n(x) := n(x_0)  for x of the form x = x_0 + s n(x_0) =: ζ(x_0, s),

where x_0 ∈ ∂Ω, s ∈ (−δ, δ). Here δ > 0 is taken so small that the representation of x in terms of x_0 ∈ ∂Ω and s ∈ (−δ, δ) is unique and smooth, that is, ζ is bijective and C^∞, with C^∞ inverse, from ∂Ω × (−δ, δ) to the set ζ(∂Ω × (−δ, δ)) ⊂ R^n. We call differential operators tangential when, for x ∈ ζ(∂Ω × (−δ, δ)), they are either first-order operators a(x)·D := Σ_{j=1}^{n} a_j(x) D_j whose coefficient vector is orthogonal to the normal, a(x)·n(x) = 0,
or they are products of such operators. The derivative along n is denoted ∂_n:

∂_n f := Σ_{j=1}^{n} n_j(x) ∂f(x)/∂x_j

for x ∈ ζ(∂Ω × (−δ, δ)). Let D_n := i^{-1} ∂_n. Let Ω_− := R^n \ Ω̄ denote the exterior of the domain Ω, and let r_{∂Ω}, r_Ω be respectively the restriction operators to ∂Ω and Ω: r_{∂Ω} f := f|_{∂Ω}, r_Ω f := f|_Ω.
Let S(R^n) be the space of rapidly decreasing functions, that is, the space of all u ∈ C^∞(R^n) such that sup_{x ∈ R^n} |x^β D^α u(x)| < ∞ for all multi-indices α and β. Let S(Ω̄_−) be the space of restrictions of the elements u ∈ S(R^n) to Ω_− (this space is equipped with the factor topology). Let u ∈ C^∞(Ω̄) and v ∈ S(Ω̄_−); then we set

γ_k u := r_{∂Ω} D_n^k u = (D_n^k u)|_{∂Ω},  γ_k v := r_{∂Ω} D_n^k v = (D_n^k v)|_{∂Ω}.

Let H^s(R^n) (s ∈ R) be the usual Sobolev space:

H^s(R^n) := {f ∈ S': F^{-1}(1 + |ξ|²)^{s/2} F f ∈ L²(R^n)},

where F denotes the Fourier transform f ↦ F f(ξ) = ∫_{R^n} e^{-ix·ξ} f(x) dx, F^{-1} its inverse, and S' = S'(R^n) denotes the space of tempered distributions, which is dual to the space S(R^n). Let H^s(Ω) and H^s(Ω_−) (0 ≤ s ∈ R) be respectively the spaces of restrictions of elements of H^s(R^n) to Ω and Ω_−. The norms in the spaces H^s(Ω) and H^s(Ω_−) are defined by the relations

‖f‖_{H^s(Ω)} := inf ‖g‖_{H^s(R^n)}  (s ≥ 0),  ‖f‖_{H^s(Ω_−)} := inf ‖g‖_{H^s(R^n)}  (s ≥ 0),

where the infimum is taken over all elements g ∈ H^s(R^n) which are equal to f in Ω, respectively in Ω_−.
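The definition of H^s(R^n) above suggests the norm ‖f‖²_{H^s} = (2π)^{-n} ∫_{R^n} (1 + |ξ|²)^s |F f(ξ)|² dξ, the factor (2π)^{-n} being the Parseval constant for the normalization of F chosen above. A minimal one-dimensional sketch of its numerical evaluation (the grid and the Gaussian test function are arbitrary choices; for s = 0 the result should agree with the directly computed L² norm):

import numpy as np

def h_s_norm(f, h, s):
    # ||f||_{H^s} computed from samples f on a uniform grid with spacing h (1D sketch)
    n = f.size
    F = np.fft.fft(f) * h                        # |F f(xi_k)| up to a harmless phase factor
    xi = 2.0 * np.pi * np.fft.fftfreq(n, d=h)    # angular frequencies
    dxi = 2.0 * np.pi / (n * h)
    return np.sqrt(np.sum((1.0 + xi**2) ** s * np.abs(F) ** 2) * dxi / (2.0 * np.pi))

x = np.linspace(-20.0, 20.0, 4096, endpoint=False)
h = x[1] - x[0]
f = np.exp(-0.5 * x**2)                          # Gaussian test function
print(h_s_norm(f, h, 0.0), np.sqrt(np.sum(f**2) * h))   # H^0 norm vs direct L^2 norm
print(h_s_norm(f, h, 1.0), h_s_norm(f, h, -1.0))        # H^1 and H^{-1} norms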
By H_0^s(Ω) (s ∈ R) and H_0^s(Ω_−) we denote the closed subspaces of the space H^s(R^n) which consist of the elements with supports in Ω̄, respectively in Ω̄_−, that is,

H_0^s(Ω) := {f ∈ H^s(R^n): supp f ⊂ Ω̄} ⊂ H^s(R^n),  s ∈ R,
H_0^s(Ω_−) := {f ∈ H^s(R^n): supp f ⊂ Ω̄_−} ⊂ H^s(R^n),  s ∈ R.

We define the spaces

ℋ^s(Ω_−) := H^s(Ω_−) for s > 0,  ℋ^s(Ω_−) := H_0^s(Ω_−) for s ≤ 0,

and similarly with Ω_− replaced by Ω.
For s ≠ k + 1/2 (k = 0, 1, ..., ℓ − 1), we define the spaces H̃^{s,ℓ}(Ω) and H̃^{s,ℓ}(Ω_−) respectively as the sets of all

(u, φ) = (u, φ_1, ..., φ_ℓ)  and  (w, ψ) = (w, ψ_1, ..., ψ_ℓ),

where u ∈ ℋ^s(Ω), w ∈ ℋ^s(Ω_−), and φ = (φ_1, ..., φ_ℓ), ψ = (ψ_1, ..., ψ_ℓ) are vectors in

Π_{j=1}^{ℓ} H^{s-j+1/2}(∂Ω)

satisfying the conditions

φ_j = D_n^{j-1} u|_{∂Ω},  ψ_j = D_n^{j-1} w|_{∂Ω}  for j < min(s, ℓ).

The norms in H̃^{s,ℓ}(Ω) and H̃^{s,ℓ}(Ω_−) can be defined as

‖(u, φ)‖²_{H̃^{s,ℓ}(Ω)} = ‖u‖²_{ℋ^s(Ω)} + Σ_{j=1}^{ℓ} ‖φ_j‖²_{H^{s-j+1/2}(∂Ω)},

and similarly for H̃^{s,ℓ}(Ω_−). Since only the components φ_j and ψ_j with index j < s are determined by u and w, while the remaining components can be chosen independently, we can identify H̃^{s,ℓ}(Ω) and H̃^{s,ℓ}(Ω_−) with the following spaces.
For s ≠ k + 1/2 (k = 0, 1, ..., ℓ − 1),

H̃^{s,ℓ}(Ω) =
  ℋ^s(Ω),  ℓ = 0,
  ℋ^s(Ω),  1 ≤ ℓ < s + 1/2,
  ℋ^s(Ω) × Π_{j=[s+1/2]+1}^{ℓ} H^{s-j+1/2}(∂Ω),  0 < [s + 1/2] < ℓ,
  ℋ^s(Ω) × Π_{j=1}^{ℓ} H^{s-j+1/2}(∂Ω),  s < 1/2,

H̃^{s,ℓ}(Ω_−) =
  ℋ^s(Ω_−),  ℓ = 0,
  ℋ^s(Ω_−),  1 ≤ ℓ < s + 1/2,
  ℋ^s(Ω_−) × Π_{j=[s+1/2]+1}^{ℓ} H^{s-j+1/2}(∂Ω),  0 < [s + 1/2] < ℓ,
  ℋ^s(Ω_−) × Π_{j=1}^{ℓ} H^{s-j+1/2}(∂Ω),  s < 1/2.
Finally, for s = k + 1/2 (k = 0, 1, ..., ℓ − 1), we define the spaces H̃^{s,ℓ}(Ω), H̃^{s,ℓ}(Ω_−) by the method of complex interpolation. Let us note that for s ≠ k + 1/2 (k = 0, 1, ..., ℓ − 1) the spaces H̃^{s,ℓ}(Ω), H̃^{s,ℓ}(Ω_−) are the completions of C^∞(Ω̄), S(Ω̄_−) respectively in the norms

‖(u, γ_0 u, ..., γ_{ℓ-1} u)‖²_{H̃^{s,ℓ}(Ω)} = ‖u‖²_{H^s(Ω)} + Σ_{j=0}^{ℓ-1} ‖γ_j u‖²_{H^{s-j-1/2}(∂Ω)},

and similarly for H̃^{s,ℓ}(Ω_−).
Bibliographical Notes
The estimation theory optimal by the criterion of minimum of the error variance has been created by N. Wiener (1942) for stationary random processes and for an infinite interval of the time of observation. A large bibliography can be found in [Kailath (1974)]. The theory has been essentially finished by the mid-sixties for a finite interval of the time of observation and for stationary random processes with rational spectral density. Many attempts were made in the engineering literature to construct a generalization of the Wiener theory for the case of random fields. The reason is that such a theory is needed in many applications, e.g., TV and optical signal processing, geophysics, underwater acoustics, radiophysics, etc. The attempts to give an analytical estimation theory in the engineering literature (see [Ekstrom (1982)] and references therein) were based on some type of scanning, and the problem has not been solved as an optimization problem for random fields. The first analytical theory of random fields estimation and filtering, which is a generalization of Wiener's theory, has been developed in the series of papers [Ramm (1969); Ramm (1969b); Ramm (1969c); Ramm (1970b); Ramm (1970c); Ramm (1970d); Ramm (1971); Ramm (1971c); Ramm (1973c); Ramm (1975); Ramm (1976); Ramm (1978); Ramm (1978b); Ramm (1978c); Ramm (1978d); Ramm (1979); Ramm (1980b); Ramm (1980c); Ramm (1984b); Ramm (1985); Ramm (1987d); Ramm (2002); Ramm (2003); Kozhevnikov and Ramm (2005)] and in [Ramm (1980), Chapter 1]. This theory is presented in Chapters 2-4. Its applications are given in Chapter 7, and its generalizations to a wider class of random fields are given in the Appendices A and B. The material in Chapter 3 is based on the paper [Ramm (1985)]. The material in Section 7.7 is taken from [Ramm (1980)], where a reference to the paper Katznelson, J. and Gould, L., Construction of nonlinear filters and control systems, Information and Control,
5, (1962), 108-143, can be found together with a critical remark concerning this paper. In Section 7.2 the paper [Ramm (1973b)] is used; in Section 7.3.4 the papers [Ramm (1968); Ramm (1973b); Ramm (1978); Ramm (1981); Ramm (1985b); Ramm (1984); Ramm (1987b); Ramm (1987c)] are used; the stable differentiation formulas (7.66), (7.67) and (7.71) were first given in [Ramm (1968)]; in Section 7.5 the papers [Ramm (1969b); Ramm (1970b); Ramm (1970c); Ramm (1970d)] are used. There is a large literature in which various aspects of the theory presented in Section 7.6 are discussed, see [Fedotov (1982)], [Ivanov et al. (1978)], [Lattes and Lions (1967)], [Lavrentiev and Romanov (1986)], [Morozov (1984)], [Payne (1975)], [Tanana (1981)], [Tikhonov (1977)] and references therein. The presentation in Section 7.6 is self-contained and partly based on [Ramm (1981)]. The class R of random fields has been introduced by the author in 1969 [Ramm (1969b); Ramm (1970b); Ramm (1970c); Ramm (1970d)]. It was found (see [Molchan (1975)], [Molchan (1974)]) that Gaussian random fields have the Markov property if and only if they are in the class R and P(λ) = 1 (see formula (1.10)). Chapter 5 contains a singular perturbation theory for the class of integral equations basic in estimation theory. This chapter is based on [Ramm and Shifrin (2005)] (see also [Ramm and Shifrin (1991)], [Ramm and Shifrin (1993)], [Ramm and Shifrin (1995)]). Random fields have been studied extensively [Adler (1981)], [Gelfand and Vilenkin (1968)], [Koroljuk (1978)], [Pitt (1971)], [Rosanov (1982)], [Vanmarcke (1983)], [Wong (1986)], [Yadrenko (1983)], but there is no intersection between the theory given in this book and the material presented in that literature. In the presentation of the material in Section 8.1.2 the author used the book [Berezanskij (1968)]. Theorem 8.1 in Section 8.1.1 is taken from [Mazja (1986), p. 60], and the method of obtaining the eigenfunction expansion theorem in Section 8.2 is taken from [Berezanskij (1968)]. There is a large literature on the material presented in Section 8.2.4. Of course, it is not possible in this book to cover this material in depth (and it was not our goal). Only some facts, useful for a better understanding of the theory presented in this book, are given. For second-order elliptic operators L a number of stronger conditions sufficient for L to be selfadjoint or essentially selfadjoint are known (see [Kato (1981)]). The assumption that L is selfadjoint is basic for the eigenfunction expansion theory developed in [Berezanskij (1968)]. In some cases an eigenfunction expansion theory sufficient for our purposes can be developed
for certain non-selfadjoint operators (see [Ramm (1981b); Ramm (1981c); Ramm (1982); Ramm (1983)]). We have discussed the case when D, the domain of observation, is finite. For the Schrodinger operator the spectral and scattering theory in some domains with infinite boundaries is developed in [Ramm (1963b); Ramm (1963); Ramm (1965); Ramm (1968b); Ramm (1969d); Ramm (1970); Ramm (1971b); Ramm (1987); Ramm (1988b)]. The material in Section 8.3.1 is well known. The proofs of the results about s-values can be found in [Gohberg and Krein (1969)]. The material in Section 8.3.2 belongs to the author [Ramm (1980); Ramm (1981b); Ramm (1981c)], and the proofs of all of the results are given in detail. The material in Section 8.3.3 is known and proofs of the results can be found in [Gohberg and Krein (1969)], [Konig (1986)], and [Pietsch (1987)]. In Section 8.4 some reference material in probability theory and statistics is given. One can find much more material in this area in [Koroljuk (1978)]. The purpose of Chapter 6 is to explain some connections between estimation and scattering theory. In the presentation of the scattering theory in Section 6.1 the papers [Ramm (1963b); Ramm (1963); Ramm (1965); Ramm (1968b); Ramm (1969d)] are used, where the scattering theory has been developed for the first time in some domains with infinite boundaries. Most of the results in Section 6.1 are well known, except Theorems 6.1 and 6.2, which are taken from [Ramm (1987e); Ramm (1987f); Ramm (1988); Ramm (1988c); Ramm (1989)]. It is not possible here to give a bibliography on scattering theory. Povsner (1953-1955) and then Ikebe (1960) studied the scattering problem in R³. Much work was done since then (see [Hormander (1983-85)], vol. II, IV, and references therein). A short and self-contained presentation of the scattering theory given in Section 6.1 may be useful for many readers who would like to get quick access to basic results and do not worry about extra assumptions on the rate of decay of the potential. Lemma 6.1 in Section 6.2 is well known, equation (6.13) is derived in [Newton (1982)], our presentation follows partly [Ramm (1987e); Ramm and Weaver (1987)], and Theorem 6.1 is taken from [Ramm (1987e)]. A connection between estimation and scattering theory for one-dimensional problems has been known for quite a while. In [Levy and Tsitsiklis (1985)] and [Yagle (1988)] some multidimensional problems of
estimation theory were discussed. In Section 6.3 some of the ideas from [Yagle (1988)] are used. Our arguments are given in more detail than in [Yagle (1988)]. The estimation problems discussed in [Levy and Tsitsiklis (1985)] and [Yagle (1988)] are the problems in which the noise has a white component and the covariance function has a special structure. In [Levy and Tsitsiklis (1985)] it is assumed that r = 2 and R(x, y) = R(|x − y|), that is, the random field is isotropic, and in [Yagle (1988)] r ≥ 2 and Δ_x R(x, y) = Δ_y R(x, y). The objective in these papers is to develop a generalization of the Levinson recursion scheme for estimation in the one-dimensional case. The arguments in [Levy and Tsitsiklis (1985)] and [Yagle (1988)] are not applicable in the case when the noise is colored, that is, when there is no white component in the noise. There was much work done on efficient inversion of Toeplitz matrices [Friedlander et al. (1979)]. These matrices arise when one discretizes equation (3.4). However, as ε → 0, one cannot invert the corresponding Toeplitz matrix since its condition number grows quickly as ε → 0 (a numerical illustration is given at the end of these notes). If ε = 1 in (3.4), then there are many efficient ways to solve equation (3.4). It would be of interest to compare the numerical efficiency of various methods. It may be that an iterative method, or a projection method, will be more efficient than the discretization method with equidistant nodes used together with the efficient method of inverting the resulting Toeplitz matrix. The results in Appendix A are taken from [Ramm (2003)] and the results in Appendix B are taken from [Kozhevnikov and Ramm (2005)]. The author has tried to make the material in this book accessible to a large audience. The material from the theory of elliptic pseudodifferential equations (see, e.g., [Hormander (1983-85)]) was not used. The class R of kernels is a subset of the set of pseudodifferential operators. The results obtained in this book concerning equations in the class R are final in the sense that an exact description of the range of the operators with kernels in class R is given and analytical formulas for the solutions of these equations are obtained. The general theory of pseudodifferential operators does not provide analytical formulas for the solutions. It was possible to derive such formulas in this book because of the special structure of the kernels in the class R. In [Eskin (1981), §27] an asymptotic solution is obtained for a class of pseudo-differential equations with a small parameter.
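The growth of the condition number mentioned above can be seen in a few lines. The following minimal sketch (the kernel exp(−|t − s|), the interval [0, 1], the trapezoidal rule and the grid size are illustrative choices and are not taken from the references) discretizes an equation of the form ε h(t) + ∫_0^1 exp(−|t − s|) h(s) ds = f(t) on equidistant nodes and prints the condition number of the resulting matrix:

import numpy as np
from scipy.linalg import toeplitz

n = 200
t = np.linspace(0.0, 1.0, n)
w = np.full(n, t[1] - t[0])                  # trapezoidal quadrature weights
w[0] *= 0.5; w[-1] *= 0.5
R = toeplitz(np.exp(-t)) * w                 # R[i, j] = exp(-|t_i - t_j|) * w_j

for eps in [1.0, 1e-1, 1e-2, 1e-3, 1e-4]:
    A = eps * np.eye(n) + R                  # discretization of eps*I + R
    print(eps, np.linalg.cond(A))            # grows roughly like 1/eps before saturating at cond(R)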
Bibliography
Adler, R. (1981). The geometry of random fields, J. Wiley, New York.
Agmon, S. (1982). Lectures on exponential decay of solutions of second-order elliptic equations, Princeton Univ. Press, Princeton.
Akhieser, N. (1965). Lectures on approximation theory, Nauka, Moscow.
Aronszajn, N. (1950). Theory of reproducing kernels, Trans. Am. Math. Soc. 68, pp. 337-404.
Berezanskij, Yu. (1968). Expansions in eigenfunctions of selfadjoint operators, Amer. Math. Soc., Providence, RI.
Deimling, K. (1985). Nonlinear functional analysis, Springer Verlag, New York.
Ekstrom, M. (1982). Realizable Wiener filtering in two dimensions, IEEE Trans. on acoustics, speech and signal processing, 30, pp. 31-40; Ekstrom, M. and Woods, J. (1976). Two-dimensional spectral factorization with applications in recursive digital filtering, ibid, 2, pp. 115-128.
Erkip, A. and Schrohe, E. (1992). Normal solvability of elliptic boundary-value problems on asymptotically flat manifolds, J. of Functional Analysis 109, pp. 22-51.
Eskin, G. (1981). Boundary value problems for elliptic pseudodifferential equations, Amer. Math. Soc., Providence, RI.
Fedotov, A. (1982). Linear ill-posed problems with random errors in the data, Nauka, Novosibirsk.
Friedlander, B., Morf, M., Kailath, T. and Ljung, L. (1979). New inversion formulas for matrices classified in terms of their distance from Toeplitz matrices, Linear algebra and its applications, 27, pp. 31-60.
Gakhov, F. (1966). Boundary-value problems, Pergamon Press, Oxford.
Glazman, I. (1965). Direct methods of qualitative spectral analysis of singular differential operators, Davey, New York.
Gohberg, I. and Krein, M. (1969). Introduction to the theory of linear nonselfadjoint operators, AMS, Providence.
Gilbarg, D. and Trudinger, N. (1977). Elliptic partial differential equations of second order, Springer Verlag, New York.
Gelfand, I. and Vilenkin, N. (1968). Generalized functions, vol. 4, Acad. Press, New York.
Grubb, G. (1990). Pseudo-differential problems in L_p spaces, Commun. in Partial Differ. Eq., 15 (3), pp. 289-340.
Grubb, G. (1996). Functional calculus of pseudodifferential boundary problems, Birkhauser, Boston.
Hormander, L. (1983-85). The analysis of linear partial differential operators, vol. I-IV, Springer Verlag, New York.
Ivanov, V., Vasin, V. and Tanana, V. (1978). Theory of linear ill-posed problems and applications, Nauka, Moscow.
Kato, T. (1995). Perturbation theory for linear operators, Springer Verlag, New York.
Kato, T. (1981). Spectral theory of differential operators, North Holland, Amsterdam (Ed. Knowles, I. and Lewis, R.), pp. 253-266.
Kato, T. (1959). Growth properties of solutions of the reduced wave equation, Comm. Pure Appl. Math., 12, pp. 403-425.
Kailath, T. (1974). A view of three decades of linear filtering theory, IEEE Trans. on inform. theory, IT-20, pp. 145-181.
Kantorovich, L. and Akilov, G. (1980). Functional analysis, Pergamon Press, New York.
Klibanov, M. (1985). On uniqueness of the determination of a compactly supported function from the modulus of its Fourier transform, Dokl. Acad. Sci. USSR, 32, pp. 668-670.
Konig, H. (1986). Eigenvalue distribution of compact operators, Birkhauser, Stuttgart.
Koroljuk, V. ed. (1978). Reference book in probability and statistics, Naukova Dumka, Kiev.
Kozhevnikov, A. (2001). Complete scale of isomorphisms for elliptic pseudodifferential boundary-value problems, J. London Math. Soc. (2) 64, pp. 409-422.
Kozhevnikov, A. and Ramm, A.G. (2005). Integral operators basic in random fields estimation theory, Intern. J. Pure and Appl. Math., 20, N3, pp. 405-427.
Kozlov, V., Maz'ya, V. and Rossmann, J. (1997). Elliptic boundary-value problems in domains with point singularities, AMS, Providence, 1997.
Krasnoselskii, M., et al. (1972). Approximate solution of operator equations, Walters-Noordhoff, Groningen.
Lattes, R. and Lions, J. (1967). Méthode de quasi-réversibilité et applications, Dunod, Paris.
Lavrentiev, M., Romanov, V. and Shishatskii, S. (1986). Ill-posed problems of mathematical physics and analysis, Amer. Math. Soc., Providence, RI.
Levitan, B. (1971). Asymptotic behavior of spectral function of elliptic equation, Russ. Math. Survey, 6, pp. 151-212.
Levy, B. and Tsitsiklis, J. (1985). A fast algorithm for linear estimation of two-dimensional random fields, IEEE Trans. Inform. Theory, IT-31, pp. 635-644.
Mazja, V. (1986). Sobolev spaces, Springer Verlag, New York.
Molchan, G. (1975). Characterization of Gaussian fields with Markov property, Sov. Math. Doklady, 12, pp. 563-567.
Molchan, G. (1974). L-Markov Gaussian fields, ibid. 15, pp. 657-662.
Morozov, V. (1984). Methods for solving incorrectly posed problems, Springer Verlag, New York.
Naimark, M. (1969). Linear differential operators, Nauka, Moscow.
Newton, R. (1982). Scattering of waves and particles, Springer Verlag, New York.
Payne, L.E. (1975). Improperly posed problems, Part. Dif. Equ., Regional Conf. Appl. Math., Vol. 22, SIAM, Philadelphia.
Pietsch, A. (1987). Eigenvalues and s-numbers, Cambridge Univ. Press, Cambridge.
Piterbarg, L. (1981). Investigation of a class of integral equations, Diff. Uravnenija, 17, pp. 2278-2279.
Pitt, L. (1971). A Markov property for Gaussian processes with a multidimensional parameter, Arch. Rat. Mech. Anal., 43, pp. 367-391.
Preston, C. (1967). Random fields, Lect. notes in math. N34, Springer Verlag, New York.
Ramm, A.G. (1963). Spectral properties of the Schrodinger operator in some domains with infinite boundaries, Doklady Acad. Sci. USSR, 152, pp. 282-285.
Ramm, A.G. (1963b). Investigation of the scattering problem in some domains with infinite boundaries I, II, Vestnik 7, pp. 45-66; 19, pp. 67-76.
Ramm, A.G. (1965). Spectral properties of the Schrodinger operator in some infinite domains, Mat. Sbor. 66, pp. 321-343.
Ramm, A.G. (1968). On numerical differentiation, Math., Izvestija vuzov, 11, 1968, pp. 131-135. 40 #5130.
Ramm, A.G. (1968b). Some theorems on analytic continuation of the Schrodinger operator resolvent kernel in the spectral parameter, Izv. Ac. Nauk Arm. SSR, Mathematics, 3, pp. 443-464.
Ramm, A.G. (1969). Filtering of nonstationary random fields in optical systems, Opt. and Spectroscopy, 26, pp. 808-812.
Ramm, A.G. (1969b). Apodization theory, Optics and Spectroscopy, 27, (1969), pp. 508-514.
Ramm, A.G. (1969c). Filtering of nonhomogeneous random fields, ibid. 27, pp. 881-887.
Ramm, A.G. (1969d). Green's function study for differential equation of the second order in domains with infinite boundaries, Diff. eq. 5, pp. 1509-1516.
Ramm, A.G. (1970). Eigenfunction expansion for nonselfadjoint Schrodinger operator, Doklady, 191, pp. 50-53.
Ramm, A.G. (1970b). Apodization theory II, Opt. and Spectroscopy, 29, pp. 390-394.
Ramm, A.G. (1970c). Increasing of the resolution ability of the optical instruments by means of apodization, ibid. 29, pp. 594-599.
Ramm, A.G. (1970d). On resolution ability of optical systems, ibid., 29, pp. 794-798.
Ramm, A.G. (1971). Filtering and extrapolation of some nonstationary random processes, Radiotech. i Electron. 16, pp. 80-87.
Ramm, A.G. (1971b). Eigenfunction expansions for exterior boundary problems, ibid. 7, pp. 737-742.
Ramm, A.G. (1971c). On multidimensional integral equations with the translation kernel, Diff. eq. 7, pp. 2234-2239.
Ramm, A.G. (1972). Simplified optimal differentiators, Radiotech. i Electron. 17, pp. 1325-1328.
Ramm, A.G. (1973). On some class of integral equations, ibid., 9, pp. 931-941.
Ramm, A.G. (1973b). Optimal harmonic synthesis of generalized Fourier series and integrals with randomly perturbed coefficients, Radiotechnika, 28, pp. 44-49.
Ramm, A.G. (1973c). Discrimination of random fields in noises, Probl. peredaci informacii, 9, pp. 22-35. 48 #13439.
Ramm, A.G. (1975). Approximate solution of some integral equations of the first kind, Diff. eq. 11, pp. 582-586.
Ramm, A.G. (1976). Investigation of a class of integral equations, Doklady Acad. Sci. USSR, 230, pp. 283-286.
Ramm, A.G. (1978). A new class of nonstationary processes and fields and its applications, Proc. 10 all-union sympos. "Methods of representation and analysis of random processes and fields", Leningrad, 3, pp. 40-43.
Ramm, A.G. (1978b). On eigenvalues of some integral equations, Diff. Equations, 15, pp. 932-934.
Ramm, A.G. (1978c). Investigation of a class of systems of integral equations, Proc. Intern. Congr. on appl. math., Weimar, DDR, (1978), pp. 345-351.
Ramm, A.G. (1978d). Investigation of some classes of integral equations and their application, in collection "Abel inversion and its generalizations", edited by N. Preobrazhensky, Siberian Dep. of Acad. Sci. USSR, Novosibirsk, pp. 120-179.
Ramm, A.G. (1979). Linear filtering of some vectorial nonstationary random processes, Math. Nachrichten, 91, pp. 269-280.
Ramm, A.G. (1980). Theory and applications of some new classes of integral equations, Springer Verlag, New York.
Ramm, A.G. (1980b). Investigation of a class of systems of integral equations, Journ. Math. Anal. Appl., 76, pp. 303-308.
Ramm, A.G. (1980c). Analytical results in random fields filtering theory, Zeitschr. Angew. Math. Mech., 60, T361-T363.
Ramm, A.G. (1981). Stable solutions of some ill-posed problems, Math. Meth. in Appl. Sci. 3, pp. 336-363.
Ramm, A.G. (1981b). Spectral properties of some nonselfadjoint operators, Bull. Am. Math. Soc., 5, N3, pp. 313-315.
Ramm, A.G. (1981c). Spectral properties of some nonselfadjoint operators and some applications, in "Spectral theory of differential operators", Math. Studies, North Holland, Amsterdam, ed. I. Knowles and R. Lewis, pp. 349-354.
Ramm, A.G. (1982). Perturbations preserving asymptotics of spectrum with a remainder, Proc. A.M.S. 85, N2, pp. 209-212.
Ramm, A.G. (1983). Eigenfunction expansions for some nonselfadjoint operators and the transport equation, J. Math. Anal. Appl. 92, pp. 564-580.
Ramm, A.G. (1984). Estimates of the derivatives of random functions, J. Math. Anal. Appl. 102, pp. 244-250.
Ramm, A.G. (1984b). Analytic theory of random fields estimation and filtering, Proc. of the intern. sympos. on Mathematics in systems theory (Beer Sheva, 1983), Lecture notes in control and inform. sci. N58, Springer Verlag, pp. 764-773.
Ramm, A.G. (1985). Numerical solution of integral equations in a space of distributions, J. Math. Anal. Appl. 110, pp. 384-390.
Ramm, A.G. (1985b). Estimates of the derivatives of random functions II (with T. Miller), J. Math. Anal. Appl. 110, pp. 429-435.
Ramm, A.G. (1986). Scattering by obstacles, Reidel, Dordrecht.
Ramm, A.G. (1987). Sufficient conditions for zero not to be an eigenvalue of the Schrodinger operator, J. Math. Phys., 28, pp. 1341-1343.
Ramm, A.G. (1987b). Optimal estimation from limited noisy data, Journ. Math. Anal. Appl., 125, pp. 258-266.
Ramm, A.G. (1987c). Signal estimation from incomplete data, Journ. Math. Anal. Appl., 125, pp. 267-271.
Ramm, A.G. (1987d). Analytic and numerical results in random fields estimation theory, Math. Reports of the Acad. of Sci., Canada, 9, pp. 69-74.
Ramm, A.G. (1987e). Characterization of the scattering data in multidimensional inverse scattering problem, in the book: Inverse Problems: An Interdisciplinary Study, Acad. Press, New York, pp. 153-167 (Ed. P. Sabatier).
Ramm, A.G. (1987f). Completeness of the products of solutions to PDE and uniqueness theorems in inverse scattering, Inverse Problems, 3, L77-L82.
Ramm, A.G. (1988). Multidimensional inverse problems and completeness of the products of solutions to PDE, J. Math. Anal. Appl. 134, 1, pp. 211-253.
Ramm, A.G. (1988b). Conditions for zero not to be an eigenvalue of the Schrodinger operator, J. Math. Phys. 29, pp. 1431-1432.
Ramm, A.G. (1988c). Recovery of potential from the fixed energy scattering data, Inverse Problems, 4, pp. 877-886.
Ramm, A.G. (1989). Multidimensional inverse scattering problems and completeness of the products of solutions to homogeneous PDE, Zeitschr. f. angew. Math. u. Mech., T305, N4-5, T13-T22.
Ramm, A.G. (1990). Random fields estimation theory, Longman Scientific and Wiley, New York, pp. 1-273.
Ramm, A.G. (1990b). Stability of the numerical method for solving the 3D inverse scattering problem with fixed energy data, Inverse Problems, 6, L7-L12; J. reine angew. Math., 414, (1991), pp. 1-21.
Ramm, A.G. (1990b). Is the Born approximation good for solving the inverse problem when the potential is small? J. Math. Anal. Appl. 147, pp. 480-485.
Ramm, A.G. (1991). Symmetry properties of scattering amplitudes and applications to inverse problems, J. Math. Anal. Appl. 156, pp. 333-340.
Ramm, A.G. (1992). Multidimensional inverse scattering problems, Longman, New York (Russian edition, Mir, Moscow, 1993).
Ramm, A.G. (1996). Random fields estimation theory, MIR, Moscow, pp. 1-352.
Ramm, A.G. (2002). Estimation of Random Fields, Theory of Probability and Math. Statistics, 66, pp. 95-108.
Ramm, A.G. (2003). Analytical solution of a new class of integral equations, Differential and Integral Equations, 16, N2, pp. 231-240.
Ramm, A.G. (2003a). On a new notion of regularizer, J. Phys. A, 36, pp. 2191-2195.
Ramm, A.G. (2004). One-dimensional inverse scattering and spectral problems, Cubo Math. Journ., 6, N1, pp. 313-426.
Ramm, A.G. (2005). Inverse problems, Springer, New York.
Ramm, A.G. and Shifrin, E.I. (1991). Asymptotics of the solution to a singularly perturbed integral equation, Appl. Math. Lett. 4, pp. 67-70.
Ramm, A.G. and Shifrin, E.I. (1993). Asymptotics of the solutions to singularly perturbed integral equations, Journal of Mathematical Analysis and Applications, 178, No 2, pp. 322-343.
Ramm, A.G. and Shifrin, E.I. (1995). Asymptotics of the solutions to singularly perturbed multidimensional integral equations, Journal of Mathematical Analysis and Applications, 190, No 3, pp. 667-677.
Ramm, A.G. and Shifrin, E.I. (2005). Singular perturbation theory for a class of Fredholm integral equations arising in random fields estimation theory, Journal of Integral Equations and Operator Theory.
Ramm, A.G. and Weaver, O. (1987). A characterization of the scattering data in 3D inverse scattering problem, Inverse Problems, 3, L49-L52.
Ramm, A.G. and Weaver, O. (1989). Necessary and sufficient condition on the fixed energy data for the potential to be spherically symmetric, Inverse Problems, 5, pp. 445-447.
Roitberg, Ya. A. (1996). Elliptic boundary-value problems in the spaces of distributions, Kluwer, Dordrecht.
Rosanov, Yu. (1982). Markov random fields, Springer Verlag, New York.
Rudin, W. (1973). Functional Analysis, McGraw Hill, New York.
Safarov, Yu. and Vassiliev, D. (1997). The asymptotic distribution of eigenvalues of partial differential operators, American Mathematical Society, Providence, RI, 1997.
Saito, Y. (1982). Some properties of the scattering amplitude and the inverse scattering problem, Osaka J. Math., 19, pp. 527-547.
Schrohe, E. (1987). Spaces of weighted symbols and weighted Sobolev spaces on manifolds, in: Pseudo-Differential Operators, Cordes, H.O., Gramsch, B., and Widom, H. (eds.), Springer LN Math. 1256, pp. 360-377, Springer-Verlag, Berlin.
Schrohe, E. (1999). Fréchet algebra techniques for boundary value problems on noncompact manifolds, Math. Nachr. 199, pp. 145-185.
Shubin, M. (1986). Pseudodifferential operators and spectral theory, Springer Verlag, New York.
Skriganov, M. (1978). High-frequency asymptotics of the scattering amplitude, Sov. Physics, Doklady, 241, pp. 326-329.
Somersalo, E. et al. (1988). Inverse scattering problem for the Schrodinger equation in three dimensions, IMA preprint 449, pp. 1-7.
Tanana, V. (1981). Methods for solving operator equations, Nauka, Moscow.
Tikhonov, A. and Arsenin, V. (1977). Solutions of ill-posed problems, Winston, Washington.
Tulovskii, V. (1979). Asymptotic distribution of eigenvalues of differential equations, Matem. Sborn., 89, pp. 191-206.
Vanmarcke, E. (1983). Random fields: analysis and synthesis, MIT Press, Cambridge.
Vishik, M.I. and Lusternik, L. (1962). Regular degeneration and boundary layer for linear differential equations with a small parameter, Amer. Math. Soc. Transl. 20, pp. 239-264.
Wloka, J.T. (1987). Partial differential equations, Cambridge University Press.
Wloka, J.T., Rowley, B. and Lawruk, B. (1995). Boundary value problems for elliptic systems, Cambridge University Press.
Wong, E. (1986). In search of multiparameter Markov processes, in the book: Communications and Networks, ed. I.F. Blake and H.V. Poor, Springer Verlag, New York, pp. 230-243.
Yadrenko, M. (1983). Spectral theory of random fields, Optimization Software, New York.
Yagle, A. (1988). Connections between 3D inverse scattering and linear least-squares estimation of random fields, Acta Appl. Math. 13, N3, pp. 267-289.
Yagle, A. (1988). Generalized split Levinson, Schur, and lattice algorithms for 3D random fields estimation problem (preprint).
Zabreiko, P., et al. (1968). Integral equations, Reference Text, Nauka, Moscow.
Symbols
Spaces
H_+ ⊂ H_0 ⊂ H_−, 239
C^ℓ(D), 236
L^p(D), L^p(D, μ), 233
W^{ℓ,p}(D), 233
H^ℓ(D), 238
D = C_0^∞, 236
D', 236
S, 236
S', 236
H̊^ℓ(D), 238
H^{-ℓ}(D), Ḣ^ℓ(D), 239
V, 70
W, 71
Ḣ^ℓ(D) = H̊^ℓ(D) ∩ H^ℓ(D)
Classes of domains
C^{0,1}
with the cone property, 238
Classes of operators
σ_p, 243
σ_1 trace class, 243
σ_2 Hilbert-Schmidt class, 243
σ_2(H_1, H_2)
L elliptic operator, 255
Special functions
J_ν(x), 26
K_0(x), 27
h_ℓ(r) spherical Hankel functions, 160
Y_ℓ(θ) spherical harmonics, 160
δ(x) delta function, 4
Symbols used in the definition of class R
dρ spectral measure of L, 3
P(λ), Q(λ) polynomials, 3
p = deg P(λ), 3
q = deg Q(λ), 3
s = ord L, 3
R class of kernels, 3
a(x, y, λ) spectral kernel of L, 3
Λ spectrum of L, 3
Various symbols
R^r Euclidean r-dimensional space, 1
S² the unit sphere in R³, 161
s(x) useful signal, 1
n(x) noise, 1
U(x) observed signal, 1
h(x, y) optimal filter, 2
H_0, H_1 hypotheses, 163
ℓ(u_1, ..., u_n) the likelihood ratio, 163
N(A) null-space of A, 275
Ran A range of A, 275
D(A) domain of A, 275
σ(A) spectrum of A, 275
→ strong convergence, 39
⇀ weak convergence, 260
∫ := ∫_{R^r}, 19
Index
Approximation of the kernel, 46
Asymptotic efficiency, 317
Bochner-Khintchine theorem, 304
Characteristic function, 304
Characterization of the scattering data, 138
Completeness property of the scattering solution, 131
Conditional mean value, 303
Correlation function, 306
Covariance function, 305
Direct scattering problem, 111
Distributions, 236
Elliptic estimate, 76
Estimate, Bayes', 315
Estimate, efficient, 315
Estimate, maximum likelihood, 316
Estimate, minimax, 315
Estimate, unbiased, 315
Estimation in Hilbert space, 310
Integral representation of random functions, 309
Inverse scattering problem, 134
Iterative method, 38
Mean value, 305
Mercer's theorem, 61
Moment functions, 305
Order of singularity, 3
Projection methods, 39
Random function, 305
Reproducing kernel, 59
Rigged triple of Hilbert spaces, 239
Singular support, 15
Sobolev spaces, 233
Solution of (2.12) of minimal order of singularity (mos solution), 14
Spectral density, 314
Spectral measure, 18
Stochastic integral, 309
Transmission problem, 15
Variance, 303
Weakly lower semicontinuous, 210