PSEUDOSOLUTION OF LINEAR FUNCTIONAL EQUATIONS Parameters Estimation of Linear Functional Relationships
Mathematics and Its Applications
Managing Editor: M. Hazewinkel, Centre for Mathematics and Computer Science, Amsterdam, The Netherlands
ALEXANDER S. MECHENOV Moscow State University, Russia
Springer
Library of Congress Cataloging-in-Publication Data A C.I.P. record for this book is available from the Library of Congress.
ISBN 0-387-24505-7
e-ISBN 0-387-24506-5
Printed on acid-free paper.
© 2005 Springer Science+Business Media, Inc.
All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, Inc., 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights. Printed in the United States of America. 9 8 7 6 5 4 3 2 1
SPIN 11383666
Contents
General Preface
Labels and Abbreviations
Chapter 1 SYSTEMS OF LINEAR ALGEBRAIC EQUATIONS
1. ANALYSIS OF PASSIVE EXPERIMENTS
1.1 Multiple Linear Regression Analysis
1.2 Linear Model Subject to the Linear Constraints
1.3 Estimation of the Normal Parameter Vector
1.4 Confluent Analysis of Passive Experiment
1.5 Stable Parameter Estimation of the Degenerated Confluent Model
1.6 Confluent-Regression Analysis of Passive Experiment
1.7 Stable Estimation of Normal Parameters of the Degenerated Confluent-Regression Model
Chapter 2 SYSTEMS OF LINEAR ALGEBRAIC EQUATIONS
2. ANALYSIS OF ACTIVE AND COMPLICATED EXPERIMENTS
2.1 Analysis of Active Experiment
2.2 Analysis of Active-Regression Experiment
2.3 Analysis of Passive-Active Experiment
2.4 Analysis of Passive-Active-Regression Experiment
Chapter 3 LINEAR INTEGRAL EQUATIONS
3. ANALYSIS OF PASSIVE AND OF ACTIVE EXPERIMENTS
3.1 Variational Problems for the Construction of Pseudosolutions of Linear Integral Equations
3.2 About the Appurtenance of Gauss-Markov Processes to Probability Sobolev Spaces
3.3 Fredholm Linear Integral Equations with the Random Right-Hand Side Errors
3.4 Linear Integral Equation with the Measured or Realized Core
3.5 Technique of Computations
References
Application
Index
Glossary of symbols
GENERAL PREFACE
This book introduces models and methods for constructing pseudosolutions of well-posed and ill-posed linear functional equations that describe models of passive, active and complicated experiments. Two types of functional equations are considered: systems of linear algebraic equations and linear integral equations. Methods for constructing pseudosolutions are developed in the presence of passive right-hand side errors for two types of operator errors: passive measurement errors and active representation errors of the operator, and all their combinations. For the deterministic and stochastic models of passive experiments the least-distances method of constructing pseudosolutions is created; the maximum-likelihood method of constructing pseudosolutions is applied to active experiments; and then methods for combinations of models of regression, passive and active experiments are created. Regularized variants of these methods are constructed for systems of linear algebraic equations with degenerate matrices and for linear integral equations of the first kind. In pure mathematics, solution techniques for functional equations with exact input data are studied more often. In applied mathematics, the problem consists in constructing pseudosolutions, that is, solutions of functional equations with perturbed input data. Such a problem is in many cases incomparably more complicated. The book is devoted to the problem of constructing a pseudosolution (the problem of parameter estimation) in the following fundamental sections of applied mathematics: confluent models of passive, active and all possible mixed experiments.
These are models described by systems of linear algebraic equations with an inaccurately measured matrix or with an assigned but inaccurately realized matrix, and models that reduce to linear integral equations of the second and first kind with an inaccurately measured or prescribed kernel (core) and an inaccurately measured right-hand side. The need for this work is dictated by the demand for solution techniques for these problems, which arise in many applied (and sometimes theoretical) problems: for example, the compensation of rounding errors when systems of linear algebraic equations are solved on computers, the conduct of scientific research, and data processing. Problems of processing the results of scientific experiments that carry measurement errors or errors in the implementation of assigned data arise constantly both in theoretical and in experimental research, but especially in the solution of practical applied problems where one has to deal with experimental data. The purpose of the book is to develop various types of experimental data models for linear functional equations, to pose problems of pseudosolution construction, to create solution techniques for these problems, to bring them to numerical algorithms, and to write programs for processing some experiments. Stochastic models of experimental data and similar models for representing a priori information about the required solution (parameters) are considered, together with the statistical approach to solving problems of confluent-regression analysis. The main results are obtained for the least-squares-distances method and for the maximum-likelihood method. Only one mode of deriving estimates is treated in the book: point estimation. Interval estimation and hypothesis testing remain outside the scope of the work because the distributions of the necessary quantities have not been studied. This will serve as a stimulus for further research, and the author will try to fill this gap in a following book. The least-squares estimation method goes back to Gauss and Legendre, the least-distances estimation method to Pearson, and the estimation method for active experiments to Fedorov. The author has proposed the least-squares-distances estimation method and completed the development of this area of point estimation. In this method, minimization is first carried out over the auxiliary group of unknown regressors and the unknown right-hand side. These are thereby eliminated from consideration, after which minimization over the required parameters becomes possible. This method has made it possible to solve a broad range of problems. Regularized methods go back to Tikhonov and Ivanov.
In this area, too, the author has obtained generalizing results by using the same method of preliminary minimization with respect to the unknown kernel (core) and the unknown right-hand side. The majority of known problems of regression analysis are in fact problems of confluent-regression analysis of passive and (or) active experiments, for which solution techniques had been developed only for the simplest cases and therefore were practically not in use. The suggested estimation methods are especially important for tasks demanding high accuracy of calculations. The author expresses gratitude to the scientists of the Laboratory of Statistical Simulation of the Faculty of Computational Mathematics and Cybernetics of the M.V. Lomonosov Moscow State University for fruitful discussion of the obtained results.
LABELS AND ABBREVIATIONS
Uniform notation is used throughout the work (compare the proposal in [Lloyd and Lederman 1990, Vol. 2]). Semiboldface Greek letters designate nonrandom vector and matrix values: vectors by lower-case and matrices by capital letters. That is, Greek letters designate both unknown parameters and various exact values. Semiboldface Latin letters designate vectors and matrices composed of random variables: vectors by lower-case and matrices by capital letters. Italic Greek and Latin letters with indices are the elements of the corresponding vectors and matrices. Italic Latin letters designate functions, and the names of standard functions and operations, of continuous argument values on intervals. Here, as before, Greek letters designate nonrandom arguments and Latin letters designate random arguments. Special capitals designate the various sets. A tilde above a Latin letter designates a realization of a random variable (a sample). Estimates are also regarded as random variables and are consequently designated by Latin letters close in shape to the corresponding Greek letters, supplemented by a cap or a crescent to emphasize their similarity to realizations.
Abbreviations
SLAE is a system of linear algebraic equations;
SNAE is a system of nonlinear algebraic equations;
MLM is a maximum-likelihood method;
RMLM is a regularized maximum-likelihood method;
LSM is a least-squares method;
RLSM is a regularized least-squares method;
LDM is a least-distances method;
RLDM is a regularized least-distances method;
LSDM is a least-squares-distances method;
RLSDM is a regularized least-squares-distances method;
RSS is a residual sum of squares.
Chapter 1 SYSTEMS OF LINEAR ALGEBRAIC EQUATIONS
Abstract
In Chapter 1 the basic problem of confluent, confluent-variance and confluent-regression analysis of passive experiments, the problem of estimating the unknown parameters, is solved algebraically. The problem of robust estimation of the normal parameters of incomplete-rank confluent and confluent-regression models is also solved.
1. ANALYSIS OF PASSIVE EXPERIMENTS
The statistical theory of linear regression analysis [Borovkov 1984, 1984a, Cox & Hinkley 1974, Draper & Smith 1981, Demidenko 1981] offers the most widespread method of parameter estimation. It is therefore natural to compare the results of our own research with the results obtained by means of the classical theory. For this reason the first paragraphs of this chapter are devoted to a summary of the basic part of this theory, in order to stress its merits and demerits and also to be able to apply some specially obtained results later in the exposition. For brevity, the material of the chapter is presented in the language of matrix theory. Throughout, we introduce the concept of a linear functional relationship [Kendall & Stuart 1968, Lindley 1947] as an identity of the following form: a vector is identical to a matrix multiplied by a vector. The basic interest in the study of a linear functional relationship consists in detecting the functional connection between a response variable $\varphi$ and another variable or a group of variables $\xi_1,\dots,\xi_m,\phi_1,\dots,\phi_p,\eta_1,\dots,\eta_k$, known as explanatory variables. We introduce, uniformly for the first two chapters of the work, a linear functional relationship or linear functional (algebraic) equation of the first kind [Mathematical encyclopedia 1985].
Assumption 1. Let there be a linear functional equation of the first kind
$$\varphi = A\,(\beta_1,\dots,\beta_m,\theta_1,\dots,\theta_p,\delta_1,\dots,\delta_k)^{T}, \tag{1.0.0}$$
where $\varphi=(\varphi_1,\dots,\varphi_n)^{T}$ is a response vector (right-hand side), $A=[\xi_1,\dots,\xi_m,\phi_1,\dots,\phi_p,\eta_1,\dots,\eta_k]$ is a matrix explaining the response, and $(\beta_1,\dots,\beta_m,\theta_1,\dots,\theta_p,\delta_1,\dots,\delta_k)^{T}$ is a vector of parameters. All values entering Eq. (1.0.0) are nonrandom. We assume that such a relation exists irrespective of our ability to observe or create it. The relationship of cause and effect goes from the matrix multiplied by the parameter vector to the response vector. In the first two chapters of the work we study separate parts of Eq. (1.0.0), distorted in various ways by errors, and at the end of the second chapter we consider the relationship (1.0.0) as a whole.
Remark 1.0. In algebra the concept of a system of linear algebraic equations (SLAE) is usually used, but this concept immediately presupposes that the right-hand side and the matrix are known theoretically (exactly) and that only a solution has to be found. For the linear algebraic equation of the first kind we also pose problems that differ essentially from this classical statement.
1.1 Multiple Linear Regression Analysis
We introduce the basic model.
Assumption 1.0. A linear functional equation of the first kind
$$H\delta=\varphi \tag{1.0.1}$$
is given, where $\varphi=(\varphi_1,\dots,\varphi_n)^{T}$ is the unknown response vector, $H=[\eta_1,\dots,\eta_k]$ is a known matrix, and $\delta=(\delta_1,\dots,\delta_k)^{T}$ are unknown parameters (in Eq. (1.0.0) the matrices $\Xi=0$ and $\Phi=0$). In this paragraph the linear functional equation can be regarded as an overdetermined and necessarily consistent SLAE $H\delta=\varphi$. Inconsistency of the SLAE arises exclusively from inaccuracies in the measurement of the right-hand side, since in this paragraph we consider the case in which only the right-hand side is subject to measurement errors. It is, naturally, improbable that the matrix $H$ would always be measured or prescribed in practice without errors; nevertheless, all of regression analysis (whose name goes back to Galton [Galton 1885]) is built on this assumption. The linear functional equation (1.0.1) is represented schematically in Figure 1.1-1. A simple square frame represents theoretically known values, a double square frame represents unknowns, and a frame with rounded corners represents values that will be measured.
Figure 1.1-1. Scheme and description of a linear functional equation.
Suppose that we are able to measure the values $\varphi_i$, $i=1,\dots,n$, distorted by an additive random error $e_i$, $i=1,\dots,n$ (see its stricter representation in item 3.2), so that $y_i=\varphi_i+e_i$, $i=1,\dots,n$, knowing that each value $\varphi_i$ corresponds by definition to the values $\eta_{i1},\dots,\eta_{ik}$ (neither prescribed nor measured by us, but known theoretically, irrespective of the measurements of $\varphi$) at the $n$ points $i=1,\dots,n$. In practical applications this situation is, certainly, far-fetched, but it is realizable in the following cases:
1. Under the assumption $\eta\delta=\varphi$, $\eta=(1,\dots,1)^{T}$, that is, for the linear functional equation from which the mathematical expectation is estimated;
2. Under the assumption $I\delta=\varphi$ ($I$ is an identity matrix), that is, for the linear functional equation from which the mathematical expectations of the elements are estimated;
3. In many cases it also holds for the relations characteristic of an analysis of variance.
Therefore the set of all such matrices $H$ looks as follows
Another area of application is the approximation of one function by others, where it is also admissible to regard the values of $H$ as exact [Berezin & Zhidkov 1966, Bakhvalov 1973]. We study this situation in a little more detail [Metchenov 1981, Mechenov 1988].
Assumption 1.1. The vector $y=(y_1,\dots,y_n)^{T}$ is a random vector of measurements over a Euclidean sample space $R^{n}$, and $L[H]$ is a linear manifold of dimension $k$ in the space $R^{n}$, $n\ge k$. We assume that there exists a set of representations of the form
$$y=\delta_1\eta_1+\dots+\delta_k\eta_k+e, \tag{1.1.0}$$
called the linear stochastic model, where $\varphi=\delta_1\eta_1+\dots+\delta_k\eta_k$ is a vector in the set $L[H]$, and $e=(e_1,\dots,e_n)^{T}$ is an additive random vector of observational errors having mathematical expectation zero ($Ee=0$) and a known positive-definite covariance matrix that is independent of $\delta$:
$$\operatorname{cov}(e,e)=E(e-Ee)(e-Ee)^{T}=Eee^{T}=\Sigma.$$
The set of vectors $\varphi=\delta_1\eta_1+\dots+\delta_k\eta_k$ forms a linear manifold of dimension $k$ in the $n$-dimensional space $R^{n}$ provided that the vectors $\eta_1,\dots,\eta_k$ are not collinear. Writing expression (1.1.0) component-wise, we have
$$y_i=\delta_1\eta_{i1}+\dots+\delta_k\eta_{ik}+e_i,\quad i=1,\dots,n,$$
or in matrix form
$$y=H\delta+e.$$
The matrix $H$ is frequently called a variance matrix or a design matrix (matrix of the plan), and the vector $y$ is called a response. In regression analysis the relationships are constructed with an arbitrary nonrandom matrix; notions such as the two-dimensional distribution of a pair of random variables [Lloyd & Lederman 1989] are not meaningful in this case, since models with a random matrix demand other approaches.
The model of variance and regression analysis is presented in Figure 1.1-2. In the left part of the figure (see [Vuchkov et al. 1985]) arrows show the relationships of cause and effect. In the second part a simple square frame represents theoretically known values, a double square frame represents unknowns, and a shaded frame with rounded corners represents random variables.
Figure 1.1-2. Scheme and description of measurements in regression and variance model.
1.1.1 Point Estimation of Required Parameters by the Least Squares Method
Let the errors obey the normal law. We write out the likelihood function:
$$L(\delta)=(2\pi)^{-n/2}(\det\Sigma)^{-1/2}\exp\left\{-\tfrac{1}{2}(\tilde y-H\delta)^{T}\Sigma^{-1}(\tilde y-H\delta)\right\}.$$
Problem 1.1.1. Knowing one observation $\tilde y=H\delta+\tilde e$ of the random vector $y$, its covariance matrix $\Sigma$, and the matrix $H$ of full rank $k$, estimate the true values of the parameters $\delta$ of the model (1.1.0) so that the value of the likelihood function is maximized (maximum-likelihood method (MLM)).
Remark 1.1.1. As $\det(Eee^{T})=\det(\Sigma)$ does not depend on the required parameters, Problem 1.1.1 can be reformulated thus: estimate the true values of the parameters $\delta$ of model (1.1.0) so that the weighted quadratic form
$$(\tilde y-H\delta)^{T}\Sigma^{-1}(\tilde y-H\delta)$$
is minimized (method of least squares (LSM)). So the MLM, which generally yields efficient, consistent and asymptotically unbiased estimates, in this case coincides with the LSM. The method of least squares was proposed by Legendre [Legendre 1806] and Gauss [Gauss 1809, 1855] and was generalized in [Aitken 1935, Mahalanobis 1936] to the case of an arbitrary covariance matrix. Markov [Markov 1924] in turn suggested constructing the best linear unbiased estimate, which again leads to the same result as the LSM (explained in its modern form in [Draper & Smith 1981] and by other authors). Malenvaud [Malenvaud 1970] and Pyt'ev [Pyt'ev 1983] proposed entirely different approaches leading to the same results as the LSM.
Theorem 1.1.1. The estimate $\hat d$ of the parameters $\delta$ can be computed from the SLAE
$$H^{T}\Sigma^{-1}H\hat d=H^{T}\Sigma^{-1}\tilde y, \tag{1.1.2}$$
which is called a normal equation or system of normal equations.
Proof. We calculate the derivatives of the minimized quadratic form with respect to the parameters and equate these derivatives to zero, which is a necessary condition for a minimum. As [Ermakov & Zhiglavsky 1987]
$$\frac{\partial M}{\partial\delta}=\left(\frac{\partial M}{\partial\delta_1},\dots,\frac{\partial M}{\partial\delta_k}\right)^{T},$$
where $M$ is any functional, then
$$\frac{\partial}{\partial\delta}(\tilde y-H\delta)^{T}\Sigma^{-1}(\tilde y-H\delta)=-2H^{T}\Sigma^{-1}(\tilde y-H\delta)=0,$$
whence Eq. (1.1.2) follows. Taking into account that $\det(H^{T}\Sigma^{-1}H)\ne 0$, we have for the estimate the formula
$$\hat d=(H^{T}\Sigma^{-1}H)^{-1}H^{T}\Sigma^{-1}\tilde y.$$
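The normal equation of Theorem 1.1.1 can be illustrated numerically. The following is a minimal sketch in Python with NumPy, assuming the weighted least-squares form $H^{T}\Sigma^{-1}Hd=H^{T}\Sigma^{-1}y$ of the normal equation; the function name `gls_estimate`, the dimensions and the simulated data are illustrative assumptions and do not come from the book.

```python
import numpy as np

def gls_estimate(H, Sigma, y):
    """Solve the normal equation H^T Sigma^{-1} H d = H^T Sigma^{-1} y for d."""
    # Solve linear systems with Sigma instead of forming Sigma^{-1} explicitly.
    Si_H = np.linalg.solve(Sigma, H)      # Sigma^{-1} H, shape (n, k)
    Si_y = np.linalg.solve(Sigma, y)      # Sigma^{-1} y, shape (n,)
    A = H.T @ Si_H                        # H^T Sigma^{-1} H, shape (k, k)
    b = H.T @ Si_y                        # H^T Sigma^{-1} y, shape (k,)
    return np.linalg.solve(A, b)

# Hypothetical data: n measurements, k parameters, heteroscedastic errors.
rng = np.random.default_rng(0)
n, k = 50, 3
H = rng.normal(size=(n, k))                       # exactly known matrix
delta = np.array([1.0, -2.0, 0.5])                # "true" parameters
Sigma = np.diag(rng.uniform(0.5, 2.0, size=n))    # known covariance matrix
e = rng.multivariate_normal(np.zeros(n), Sigma)   # one error realization
y = H @ delta + e                                 # one observation

d_hat = gls_estimate(H, Sigma, y)

# Gradient of the weighted quadratic form at d_hat; it should vanish.
grad = -2.0 * H.T @ np.linalg.solve(Sigma, y - H @ d_hat)
print("estimate:", d_hat)
print("max |gradient|:", np.abs(grad).max())
```

With an error-free right-hand side (`y = H @ delta`) the same function returns the true parameters up to rounding, which is a quick sanity check of the implementation.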
1.1.2 Best Linear Unbiased Estimates. Gauss-Markov Theorem
The estimate $\hat d$ of the parameter vector $\delta$ is linear and unbiased if and only if $\hat d=\Gamma y$, where $\Gamma$ is a matrix such that $E\hat d=E\Gamma y=\Gamma H\delta=\delta$, that is, $\Gamma H=I$. For example, $\Gamma=(H^{T}\Sigma^{-1}H)^{-1}H^{T}\Sigma^{-1}$.
We consider the approach suggested by Markov [Markov 1924]. Namely:
Problem 1.1.2. Knowing one observation $\tilde y=H\delta+\tilde e$ of the random vector $y$, its covariance matrix $\Sigma$ and the matrix $H$ of full rank $k$, estimate the true values of the parameters $\delta$ of model (1.1.0) by constructing the unbiased linear estimate of the unknown parameters with the least variance.
Lemma 1.1.2. Let $\Gamma$ be some matrix with $n$ columns. Then the mathematical expectation of the vector $\Gamma y$ is equal to
$$E\Gamma y=\Gamma H\delta$$
and its covariance matrix is equal to
$$\operatorname{cov}(\Gamma y,\Gamma y)=\Gamma\Sigma\Gamma^{T}.$$
Theorem 1.1.2 (Gauss-Markov). The estimate $\hat d$ of the parameter vector $\delta$ is linear and unbiased and has a covariance smaller than or equal to the covariance of any other unbiased linear estimate of the parameter vector.
Proof. We consider an arbitrary linear estimate $\Gamma y$ of the vector $\delta$, where $\Gamma$ is a required matrix of dimension $k\times n$. As
$$E\Gamma y=\Gamma H\delta,$$
for the estimate to be unbiased it is necessary that $\Gamma H=I$. However, this gives only $k\times k$ equations for the elements of $\Gamma$; therefore we consider the following problem: determine $\Gamma$ so that the variance of the linear form $z^{T}\Gamma y$ is minimal for any predetermined vector $z$. By virtue of Lemma 1.1.2,
$$\operatorname{var}(z^{T}\Gamma y)=z^{T}\Gamma\Sigma\Gamma^{T}z=\operatorname{tr}(\Gamma\Sigma\Gamma^{T}zz^{T})=\operatorname{tr}(\Sigma\Gamma^{T}T\Gamma),\quad T=zz^{T}.$$
To solve the problem of minimizing the variance of the linear form under linear constraints we apply the method of undetermined Lagrange multipliers, introducing $k\times k$ such multipliers as the matrix $\Lambda$. Equating to zero the derivatives with respect to the matrix $\Gamma^{T}$ of the expression
$$\operatorname{tr}(\Sigma\Gamma^{T}T\Gamma)-2\operatorname{tr}\left(\Lambda^{T}(H^{T}\Gamma^{T}-I)\right),$$
we obtain the equation
$$\Sigma\Gamma^{T}T=H\Lambda.$$
Left-multiplying it by $H^{T}\Sigma^{-1}$ and taking into account that $H^{T}\Gamma^{T}=I$, we find
$$T=H^{T}\Sigma^{-1}H\Lambda,\quad\text{that is,}\quad\Lambda=(H^{T}\Sigma^{-1}H)^{-1}T.$$
Then
$$\left[\Sigma\Gamma^{T}-H(H^{T}\Sigma^{-1}H)^{-1}\right]T=0.$$
As the matrix $T$ can have rank 1, there is a set of matrices $\Gamma$ satisfying this relation, but only one matrix of this set does not depend on the matrix $T$: the one that makes the expression in square brackets vanish, that is,
$$\Gamma=(H^{T}\Sigma^{-1}H)^{-1}H^{T}\Sigma^{-1}.$$
So in this case the same formula (1.1.2) gives both the LSM-estimate $\hat d$ of Problem 1.1.1 and the best (least-variance) linear unbiased estimate of the parameter vector. Because of these properties, the LSM-estimate has proved extremely attractive.
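The statement of the Gauss-Markov theorem can be checked on a small numerical example. The sketch below, in Python with NumPy, compares the estimator matrix $(H^{T}\Sigma^{-1}H)^{-1}H^{T}\Sigma^{-1}$ with one other unbiased linear estimator, the unweighted matrix $(H^{T}H)^{-1}H^{T}$; the choice of competitor, the dimensions and the random data are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 8, 2
H = rng.normal(size=(n, k))                 # hypothetical exactly known matrix
B = rng.normal(size=(n, n))
Sigma = B @ B.T + n * np.eye(n)             # positive-definite covariance matrix

Si = np.linalg.inv(Sigma)
C = np.linalg.inv(H.T @ Si @ H)             # (H^T Sigma^{-1} H)^{-1}
Gamma_blue = C @ H.T @ Si                   # weighted least-squares estimator matrix
Gamma_ols = np.linalg.inv(H.T @ H) @ H.T    # another unbiased linear estimator

# Covariance matrices of the two estimates, Gamma Sigma Gamma^T (Lemma 1.1.2).
cov_blue = Gamma_blue @ Sigma @ Gamma_blue.T
cov_ols = Gamma_ols @ Sigma @ Gamma_ols.T

# The difference should be positive semidefinite: no eigenvalue below zero.
diff = cov_ols - cov_blue
gap_eigs = np.linalg.eigvalsh((diff + diff.T) / 2.0)

print("unbiasedness, Gamma H = I:", np.allclose(Gamma_blue @ H, np.eye(k)))
print("eigenvalues of cov_ols - cov_blue:", gap_eigs)
```

Both matrices satisfy the unbiasedness condition $\Gamma H=I$, so the comparison isolates the variance; any other matrix with $\Gamma H=I$ could be substituted for `Gamma_ols`.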
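Corollary 1.1.2. The estimate $\hat d$ of the vector of parameters $\delta$ has the covariance matrix
$$\operatorname{cov}(\hat d,\hat d)=(H^{T}\Sigma^{-1}H)^{-1}.$$
Proof. By Lemma 1.1.2, the covariance matrix is
$$\operatorname{cov}(\hat d,\hat d)=\Gamma\Sigma\Gamma^{T}=(H^{T}\Sigma^{-1}H)^{-1}H^{T}\Sigma^{-1}\,\Sigma\,\Sigma^{-1}H(H^{T}\Sigma^{-1}H)^{-1}=(H^{T}\Sigma^{-1}H)^{-1}.$$
Definition 1.1.2. We call the residual vector $\hat e$ the value
$$\hat e=\tilde y-H\hat d=\tilde y-\hat y.$$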
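Corollary 1.1.2 lends itself to a simple simulation check. The following sketch, in Python with NumPy, draws many realizations of the error vector, computes the estimate for each, and compares the empirical covariance of the estimates with $(H^{T}\Sigma^{-1}H)^{-1}$. A diagonal covariance matrix, the sample sizes and all variable names are assumptions chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(6)
n, k, trials = 15, 2, 50000
H = rng.normal(size=(n, k))                  # hypothetical exactly known matrix
w = rng.uniform(0.5, 2.0, size=n)            # diagonal of Sigma (assumed diagonal)
delta = np.array([2.0, -1.0])                # "true" parameters

Si_H = H / w[:, None]                        # Sigma^{-1} H for diagonal Sigma
C = np.linalg.inv(H.T @ Si_H)                # theoretical covariance of the estimate
Gamma = C @ Si_H.T                           # estimator matrix: d_hat = Gamma y

E = rng.normal(size=(trials, n)) * np.sqrt(w)   # each row is one error realization
Y = H @ delta + E                               # each row is one observation
D = Y @ Gamma.T                                 # each row is one estimate

emp_mean = D.mean(axis=0)
emp_cov = np.cov(D, rowvar=False)

print("empirical mean of estimates:", emp_mean)
print("theoretical covariance:\n", C)
print("empirical covariance:\n", emp_cov)
```

The agreement is only statistical; it improves as `trials` grows.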
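1.1.3 Mathematical Expectation of the Residual Sum of Squares
Theorem 1.1.3. The mathematical expectation of the weighted residual sum of squares is equal to $n-k$:
$$E\hat s^{2}=E\hat e^{T}\Sigma^{-1}\hat e=n-k.$$
Proof. Indeed,
$$E\hat s^{2}=E\hat e^{T}\Sigma^{-1}\hat e=E(y-H\hat d)^{T}\Sigma^{-1}(y-H\hat d)=E\,e^{T}\left[\Sigma^{-1}-\Sigma^{-1}H(H^{T}\Sigma^{-1}H)^{-1}H^{T}\Sigma^{-1}\right]e$$
$$=\operatorname{tr}(I_n)-\operatorname{tr}\left((H^{T}\Sigma^{-1}H)^{-1}H^{T}\Sigma^{-1}H\right)=n-k.$$
We use the relation $e^{T}\Sigma^{-1}e=\operatorname{tr}(\Sigma^{-1}ee^{T})$.
1.1.4 LSM-Estimation of Homoscedastic Model
Assumption 1.1.4 (homoscedasticity). The components of the random vector of measurement errors $e$ of the linear stochastic model (1.1.0) have the same constant variance $\sigma^{2}$. Hence the covariance matrix
$$\Sigma=\operatorname{cov}(e,e)=E(e-Ee)(e-Ee)^{T}=Eee^{T}=\sigma^{2}I$$
is a scalar matrix. The model (1.1.0) has the form
$$y=H\delta+e,\quad Ee=0,\quad\operatorname{cov}(e,e)=\sigma^{2}I.$$
The quadratic form to be minimized in Problem 1.1.1 is written as follows:
$$\frac{1}{\sigma^{2}}(\tilde y-H\delta)^{T}(\tilde y-H\delta)=\frac{1}{\sigma^{2}}\|\tilde y-H\delta\|^{2}.$$
Theorem 1.1.4. The normal equation will be
$$H^{T}H\hat d=H^{T}\tilde y.$$
Taking into account that $\det(H^{T}H)\ne 0$, we have for the estimate the formula
$$\hat d=(H^{T}H)^{-1}H^{T}\tilde y,$$
where, by Corollary 1.1.2, $\operatorname{cov}(\hat d,\hat d)=\sigma^{2}(H^{T}H)^{-1}$.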
1.1.4.1 Graphic Interpretation
Corollary 1.1.4a.
$$H^{T}\hat e=0.$$
Proof. As $\hat e=\tilde y-\hat y$, then
$$H^{T}\hat e=H^{T}(\tilde y-H\hat d)=H^{T}\tilde y-H^{T}H\hat d=0.$$
Thus, the residual vector is orthogonal to the linear manifold $L[H]$. In the case when there is a unit vector (a vector of ones) among the vectors of the model, the sum of the residuals is equal to zero:
$$\sum_{i=1}^{n}\hat e_i=0.$$
We remark, following [Kolmogorov 1946], that the solution of Problem 1.1.1 under Assumption 1.1.4 is characterized by the vector $\hat y=H\hat d$ from $L[H]$, which is the projection of the vector of measurements onto this subspace, as shown in Figure 1.1-3.
Figure 1.1-3. Projection of measurements on the plane of regressors.
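The geometric picture of the homoscedastic case can be reproduced numerically. The sketch below, in Python with NumPy, computes the least-squares fit, checks that the residual vector is orthogonal to the columns of $H$, that the residuals sum to zero when a column of ones is present, and that the fitted vector is the orthogonal projection of the measurements. The data are random and purely illustrative; `numpy.linalg.lstsq` is used as one convenient solver, not as the book's algorithm.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20
# Hypothetical matrix H whose first column is the vector of ones.
H = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = rng.normal(size=n)                          # one vector of measurements

d_hat, *_ = np.linalg.lstsq(H, y, rcond=None)   # least-squares estimate
y_hat = H @ d_hat                               # fitted vector in L[H]
e_hat = y - y_hat                               # residual vector

# Orthogonal projector onto the column space of H.
P = H @ np.linalg.solve(H.T @ H, H.T)

print("H^T e_hat:", H.T @ e_hat)                # close to the zero vector
print("sum of residuals:", e_hat.sum())         # close to zero
print("projection matches fit:", np.allclose(P @ y, y_hat))
```

The projector `P` is idempotent and symmetric, which is what makes `y_hat` the closest point of the subspace to `y` in the Euclidean norm.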
1.1.4.2 Estimation of the Variance of Errors
Theorem 1.1.4b. The unbiased estimator of the error variance $\sigma^{2}$ is the experimental variance
$$\frac{\hat e^{T}\hat e}{n-k}=\frac{\|\tilde y-H\hat d\|^{2}}{n-k}.$$
Proof. This is an obvious corollary of Theorem 1.1.3.
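The unbiasedness of the experimental variance can be illustrated by simulation. The following sketch, in Python with NumPy, averages the residual sum of squares divided by $n-k$ over many simulated experiments and compares it with the true $\sigma^{2}$; it also shows that dividing by $n$ instead gives a biased value. All dimensions, the value of `sigma` and the number of trials are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k, sigma = 30, 4, 1.5
H = rng.normal(size=(n, k))                  # hypothetical exactly known matrix
delta = rng.normal(size=k)                   # "true" parameters
trials = 20000

E = sigma * rng.normal(size=(trials, n))     # each row: one error realization
Y = H @ delta + E                            # each row: one observation

# Solve all least-squares problems at once (one column per experiment).
D = np.linalg.lstsq(H, Y.T, rcond=None)[0]   # shape (k, trials)
R = Y.T - H @ D                              # residual vectors, shape (n, trials)
rss = (R ** 2).sum(axis=0)                   # residual sums of squares

mean_unbiased = (rss / (n - k)).mean()       # experimental variance, averaged
mean_naive = (rss / n).mean()                # division by n underestimates sigma^2

print("true sigma^2:", sigma ** 2)
print("mean of RSS/(n-k):", mean_unbiased)
print("mean of RSS/n:", mean_naive)
```

The averages fluctuate from seed to seed, so the comparison should be read statistically rather than as an exact equality.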
1.1.4.3 Multivariate Multiple Model
The investigated model (1.1.0) can be extended to a multivariate multiple model, that is, to a model of the form

where we arrange the matrix $E$ by columns into the column $e$. Malenvaud considered such models, and the passage to them is obvious enough [Malenvaud 1970].
Remark 1.1.4. In principle, these results should be applied in practice with great caution, since only in theory can the matrix $H$ really be known exactly. All of the above is applied in practice, without much reflection, to those problems of regression analysis in which the columns of the matrix $H$ are either measured or assigned values, or functions of measured or assigned values, that is, we stress, not free from errors of measurement or of representation of the information. Nevertheless, at present these results are used most intensively in all computations and in software packages: LOTUS 123, Borland QUATTRO, Microsoft EXCEL, BMDP [BMDP 1979], SAS [Barr 1976], AMANCE [Bachacou et al. 1981], SPAD [Lebarte & Morineau 1982] and many others (since methods for solving problems with measured and (or) prescribed matrices have been developed only to a very small extent).
1.2 Linear Model Subject to the Linear Constraints
The results of this paragraph will be used essentially later on, since the confluent analysis of passive experiments fits the scheme of a linear model with linear constraints.
Assumption 1.2. We consider the linear functional equation (1.1.1) subject to linear constraints of the form
$$\Gamma\delta=u, \tag{1.2.0}$$
where $\Gamma$ is a known exact full-rank matrix of dimension $l\times k$ and $u$ is a known exact vector of dimension $l$. That is, the unknown parameters $\delta$ satisfy both the model (1.1.0) and the system of linear algebraic equations $\Gamma\delta=u$.
1.2.1 LSM-Estimator Subject to the Linear Constraints
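Following [Aitchison & Silvey 1958], we consider the following problem [Metchenov 1981, Mechenov 1988].
Problem 1.2.1. Knowing one observation $\tilde y=H\delta+\tilde e$ of the random vector $y$, its covariance matrix $\Sigma$, the matrix $H$ of full rank $k$, the matrix $\Gamma$ of rank $l$ and the vector $u$, estimate the true values of the parameters $\delta$ of model (1.2.0) so that the quadratic form
$$(\tilde y-H\delta)^{T}\Sigma^{-1}(\tilde y-H\delta) \tag{1.2.1}$$
is minimized subject to the linear constraints $\Gamma\delta=u$.
Theorem 1.2.1. The estimator $\breve d$ of the parameter vector $\delta$ is calculated from the system of $k+l$ linear algebraic equations
$$\begin{pmatrix}H^{T}\Sigma^{-1}H&\Gamma^{T}\\ \Gamma&0\end{pmatrix}\begin{pmatrix}\breve d\\ \lambda\end{pmatrix}=\begin{pmatrix}H^{T}\Sigma^{-1}\tilde y\\ u\end{pmatrix}, \tag{1.2.2}$$
where $\lambda\in R^{l}$ is a vector of Lagrange undetermined multipliers. The SLAE (1.2.2) is called an expanded normal equation or expanded system of normal equations.
Proof. We use the Lagrange method. To obtain the result it is enough to introduce a vector $2\lambda=2(\lambda_1,\dots,\lambda_l)^{T}$ of Lagrange undetermined multipliers, to left-multiply the system of constraints by this vector,
$$2\lambda^{T}(\Gamma\delta-u)=0,$$
and to add the product to the minimized quadratic form:
$$(\tilde y-H\delta)^{T}\Sigma^{-1}(\tilde y-H\delta)+2\lambda^{T}(\Gamma\delta-u).$$
Differentiation with respect to the vectors $\delta$, $\lambda$ leads to the expanded SLAE (1.2.2). It remains to show that the vector thus calculated is unique. We consider any vector $\delta$ such that $\Gamma\delta=u$, and write $e_\delta=\tilde y-H\delta$ and $\breve e=\tilde y-H\breve d$. Then, by (1.2.2),
$$\breve e^{T}\Sigma^{-1}(e_\delta-\breve e)=\breve e^{T}\Sigma^{-1}H(\breve d-\delta)=\lambda^{T}\Gamma(\breve d-\delta)=0,$$
and from the nonnegativity of the quadratic form $(e_\delta-\breve e)^{T}\Sigma^{-1}(e_\delta-\breve e)\ge 0$, in view of the previous equality, it follows that $e_\delta^{T}\Sigma^{-1}(e_\delta-\breve e)\ge 0$. From the previous equality and from this inequality we have
$$e_\delta^{T}\Sigma^{-1}e_\delta\ge e_\delta^{T}\Sigma^{-1}\breve e=\breve e^{T}\Sigma^{-1}\breve e.$$
Thus only the vector $\breve d$ minimizes the quadratic form (1.2.1) under the linear constraints $\Gamma\delta=u$. This estimator is, obviously, also the best linear unbiased estimator.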
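The Lagrange construction in the proof can be sketched in code. The example below, in Python with NumPy, assembles a bordered linear system of size $k+l$ whose blocks are $H^{T}\Sigma^{-1}H$, the constraint matrix and its transpose, and solves it for the constrained estimate and the multipliers. This block layout is the standard one for equality-constrained least squares and is assumed here to be what the expanded normal equation expresses; the function names `constrained_lsm` and `weighted_rss`, the matrix `G` (standing for the constraint matrix) and the data are hypothetical.

```python
import numpy as np

def constrained_lsm(H, Sigma, y, G, u):
    """Minimize (y - H d)^T Sigma^{-1} (y - H d) subject to G d = u."""
    k, l = H.shape[1], G.shape[0]
    A = H.T @ np.linalg.solve(Sigma, H)           # H^T Sigma^{-1} H
    b = H.T @ np.linalg.solve(Sigma, y)           # H^T Sigma^{-1} y
    # Bordered (expanded) system in the unknowns (d, lambda).
    K = np.block([[A, G.T], [G, np.zeros((l, l))]])
    sol = np.linalg.solve(K, np.concatenate([b, u]))
    return sol[:k], sol[k:]                       # estimate, multipliers

def weighted_rss(H, Sigma, y, d):
    """Value of the weighted quadratic form at the point d."""
    r = y - H @ d
    return float(r @ np.linalg.solve(Sigma, r))

rng = np.random.default_rng(7)
n, k = 40, 3
H = rng.normal(size=(n, k))
Sigma = np.diag(rng.uniform(0.5, 2.0, size=n))
G = np.array([[1.0, 1.0, 1.0]])                   # one constraint: parameters sum to u
u = np.array([1.0])
y = rng.normal(size=n)

d_con, lam = constrained_lsm(H, Sigma, y, G, u)
q_min = weighted_rss(H, Sigma, y, d_con)

# Moving along a direction v with G v = 0 keeps the constraint and cannot lower the form.
v = np.array([1.0, -1.0, 0.0])
q_shifted = [weighted_rss(H, Sigma, y, d_con + t * v) for t in (-1.0, -0.1, 0.1, 1.0)]

print("constraint residual:", G @ d_con - u)
print("minimum:", q_min, "shifted values:", q_shifted)
```

For large or ill-conditioned problems a null-space or QR-based method is usually preferred to solving the bordered system directly; the direct solve is kept here because it mirrors the derivation.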
1.2.2 Graphic Illustration
We also construct a graphic illustration for this case. We remark that the solution of Problem 1.2.1 is characterized by the vector $\breve y=H\breve d$ from $L[H]$, which is the projection of $\tilde y$ onto the intersection of $L[H]$ with the set defined by the constraints $\Gamma\delta=u$, as shown in Figure 1.2-1.
Figure 1.2-1. Projection of measurements on a regressor plane at linear constraints.
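1.2.3 Expectation of the Residual Sum of Squares
Theorem 1.2.2. For model (1.2.0) the expectation of the weighted residual sum of squares is
$$E\breve s^{2}=E\breve e^{T}\Sigma^{-1}\breve e=n-k+l.$$
Proof. The weighted residual sum of squares is
$$\breve s^{2}=(\tilde y-H\breve d)^{T}\Sigma^{-1}(\tilde y-H\breve d)=\hat e^{T}\Sigma^{-1}\hat e+(\breve d-\hat d)^{T}H^{T}\Sigma^{-1}H(\breve d-\hat d),$$
where $\hat d$ and $\hat e$ are the LSM-estimate and the residual vector of model (1.1.0); the cross term vanishes because $H^{T}\Sigma^{-1}\hat e=0$. The second term is represented in the form
$$(\breve d-\hat d)^{T}H^{T}\Sigma^{-1}H(\breve d-\hat d)=(\Gamma\hat d-u)^{T}\left[\Gamma(H^{T}\Sigma^{-1}H)^{-1}\Gamma^{T}\right]^{-1}(\Gamma\hat d-u).$$
Indeed, $E(\Gamma\hat d-u)=0$ and
$$\operatorname{cov}(\Gamma\hat d-u,\Gamma\hat d-u)=\Gamma(H^{T}\Sigma^{-1}H)^{-1}\Gamma^{T},$$
whence the expectation of the part of the weighted sum of squares added by the constraints is equal to
$$E(\Gamma\hat d-u)^{T}\left[\Gamma(H^{T}\Sigma^{-1}H)^{-1}\Gamma^{T}\right]^{-1}(\Gamma\hat d-u)=\operatorname{tr}(I_l)=l.$$
It is known from Theorem 1.1.3 that $E\hat s^{2}=n-k$. As a result, we have
$$E\breve s^{2}=(n-k)+l=n-k+l.$$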
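The expectation of the weighted residual sum of squares under linear constraints can be checked by simulation. The sketch below, in Python with NumPy, assumes the standard result that $l$ independent constraints raise the expectation from $n-k$ to $n-k+l$, and that the constrained estimate solves a bordered system built from $H^{T}\Sigma^{-1}H$ and the constraint matrix. The diagonal covariance, the constraint matrix `G`, the dimensions and the number of trials are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
n, k, l = 25, 4, 2
H = rng.normal(size=(n, k))                  # hypothetical exactly known matrix
w = rng.uniform(0.5, 2.0, size=n)            # diagonal of Sigma (assumed diagonal)
G = rng.normal(size=(l, k))                  # hypothetical constraint matrix
delta = rng.normal(size=k)                   # "true" parameters
u = G @ delta                                # constraints hold for the true parameters

A = H.T @ (H / w[:, None])                   # H^T Sigma^{-1} H
K = np.block([[A, G.T], [G, np.zeros((l, l))]])   # bordered system matrix

trials = 20000
E = rng.normal(size=(trials, n)) * np.sqrt(w)     # each row: one error realization
Y = H @ delta + E                                 # each row: one observation

B = (Y / w) @ H                                   # rows are (H^T Sigma^{-1} y)^T
RHS = np.vstack([B.T, np.tile(u[:, None], (1, trials))])   # shape (k + l, trials)
D = np.linalg.solve(K, RHS)[:k]                   # constrained estimates, (k, trials)

R = Y.T - H @ D                                   # residual vectors, (n, trials)
rss = (R ** 2 / w[:, None]).sum(axis=0)           # weighted residual sums of squares
mean_rss = rss.mean()

print("n - k + l:", n - k + l)
print("mean weighted RSS:", mean_rss)
```

The simulated mean agrees with $n-k+l$ only up to sampling error, which shrinks as the number of trials increases.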
1.2.4 Relation Between the LSM-Estimate in Linear Constraints and the Standard LSM-Estimate
It will be useful later to express the LSM-estimate $\breve d$ of model (1.2.0) through the LSM-estimate $\hat d$ of model (1.1.0):
$$\breve d=\hat d-(H^{T}\Sigma^{-1}H)^{-1}\Gamma^{T}\left[\Gamma(H^{T}\Sigma^{-1}H)^{-1}\Gamma^{T}\right]^{-1}(\Gamma\hat d-u). \tag{1.2.3}$$
For the proof it is enough to substitute the value of $\lambda$ into the expanded normal equation.
Theorem 1.2.3. The estimate $\breve d$ is unbiased and has the covariance matrix
$$K=(H^{T}\Sigma^{-1}H)^{-1}-(H^{T}\Sigma^{-1}H)^{-1}\Gamma^{T}\left[\Gamma(H^{T}\Sigma^{-1}H)^{-1}\Gamma^{T}\right]^{-1}\Gamma(H^{T}\Sigma^{-1}H)^{-1}.$$
Proof. Taking into account that $E\hat d=\delta$ and $E(\Gamma\hat d-u)=0$, we at once have from Eq. (1.2.3) that $E\breve d=\delta$. Further, as $\breve d=\delta+KH^{T}\Sigma^{-1}e$ by Eq. (1.2.3), we obtain
$$\breve d-\delta=KH^{T}\Sigma^{-1}e,$$
then
$$\operatorname{cov}(\breve d,\breve d)=KH^{T}\Sigma^{-1}\,Eee^{T}\,\Sigma^{-1}HK=KH^{T}\Sigma^{-1}HK=K.$$
Remark 1.2.3. Although it is rather seldom presented in textbooks, the model with linear constraints is the basis for constructing estimates in the confluent model of passive experiments investigated in item 1.4 and further. Since Chapter 3 investigates linear integral equations that reduce to ill-posed problems, we consider separately the case of regression models with incomplete-rank matrices. The reader who is interested only in well-posed problems of parameter estimation can proceed at once to item 1.4.
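The relation between the constrained and unconstrained estimates can be verified numerically. The sketch below, in Python with NumPy, assumes the standard closed form: with $C=(H^{T}\Sigma^{-1}H)^{-1}$ and constraint matrix $G$, the constrained estimate equals the unconstrained one minus $CG^{T}(GCG^{T})^{-1}(G\hat d-u)$, and its covariance is $C-CG^{T}(GCG^{T})^{-1}GC$. These formulas are taken here as an assumption about what relation (1.2.3) and Theorem 1.2.3 state; the data and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
n, k, l = 30, 3, 1
H = rng.normal(size=(n, k))
Sigma = np.diag(rng.uniform(0.5, 2.0, size=n))
G = np.array([[1.0, 1.0, 1.0]])              # hypothetical constraint matrix
u = np.array([1.0])
y = rng.normal(size=n)

Si = np.linalg.inv(Sigma)
A = H.T @ Si @ H
C = np.linalg.inv(A)                         # covariance of the unconstrained estimate
d_hat = C @ H.T @ Si @ y                     # unconstrained estimate

M = np.linalg.inv(G @ C @ G.T)
d_con = d_hat - C @ G.T @ M @ (G @ d_hat - u)    # closed-form constrained estimate

# Reference solution from the bordered (Lagrange) system.
Kmat = np.block([[A, G.T], [G, np.zeros((l, l))]])
sol = np.linalg.solve(Kmat, np.concatenate([H.T @ Si @ y, u]))

K_cov = C - C @ G.T @ M @ G @ C              # covariance of the constrained estimate
loss_eigs = np.linalg.eigvalsh(C - K_cov)    # constraints cannot increase the variance

print("closed form equals bordered solution:", np.allclose(d_con, sol[:k]))
print("eigenvalues of C - K_cov:", loss_eigs)
```

Since `K_cov @ G.T` is the zero matrix, the constrained estimate has no variance along the constrained directions, which is the expected effect of exact constraints.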
1.3 Estimation of the Normal Parameter Vector
Assumption 1.3. We consider a variant of the linear regression model (1.1.0)
$$y=H\delta+e \tag{1.3.0}$$
when the matrix $H$ is singular, of incomplete rank $r<k$. Such a problem is ill-posed [Tikhonov 1965]. What is an ill-posed problem in practice? When one is encountered in solving concrete classical regression problems, it is frequently proposed to change the structure of the matrix $H$ or the dimension of the vector $\delta$, that is, to improve the initial model. Such an approach is a modeling problem in a concrete subject domain, and it is nontrivial, especially when the phenomenon described by the linear stochastic model is not characterized by a precisely determined set of parameters $\delta$, that is, when the researcher is in the situation of a "black box". As noted in item 1.1.1, the parameters $\delta$ are not identifiable when there is a linear relation between the regressors $\eta_j$, $j=1,\dots,k$, that is, when these variables are collinear. Following [Malenvaud 1970], one can conditionally distinguish two types of collinearity.
1. The regressors are really not independent. It does happen that some relations connect the various regressors with each other, as, for example, in an analysis of variance. Then the model might gain in clarity if it were given an equivalent statement involving fewer variables that are independent of each other.
2. The regressors $\eta_j$, $j=1,\dots,k$, are really independent, but it so happens that they are collinear for the given measurements, without being connected by a definite relation valid under all circumstances; that is, the equality
$$\sum_{j=1}^{k}\lambda_j\eta_{ji}=0,$$
with coefficients $\lambda_j$ not all zero, takes place only for the given set of measurements $\eta_{ji}$, $j=1,\dots,k$; $i=1,\dots,n$. In that case, since measurements are involved, this is no longer regression analysis but confluent analysis of a passive experiment. First, it is necessary to apply the corresponding method of parameter estimation, which is explained in item 1.4 of this chapter. Second, the probability of such an event is insignificant, and nothing forces us to assume that this relation will also hold for repeated measurements.
3. It frequently happens that the regressors are almost collinear. To avoid a zero determinant $\det(H^{T}H)=0$ in numerical computations, it is enough to increase the number of digits in the binary representation of the mantissa (if the parameters cannot be calculated exactly). If this is not possible, it is necessary to take rounding errors into account beforehand as so-called errors of equivalent perturbations [Wilkinson 1963, 1965, 1966], [Voevodin 1969a], that is, to consider a model of these errors, as is done in Chapter 2, item 2.1.4 [Mechenov 1988, 1994].
We cite the remark of [Gauss 1809] concerning SLAEs with matrices of incomplete rank: "a more detailed study of this problem, which is distinguished by theoretical subtlety rather than by practical usefulness, we must postpone to another time". Gauss did not return to this problem. Here it is possible either to pass to estimable functions [Albert 1972] or to calculate an approximation to a normal solution [Tikhonov 1965, 1965a]. We consider the latter because it is applied in constructing regularized pseudosolutions of linear integral equations of the first kind.
Definition 1.3.0. Among all solutions of the incomplete-rank functional relation (1.0.1), the vector of parameters $\delta_0$ with the minimum weighted sum of squares
$$\delta_0 = \arg\min_{\delta\,\in\,\operatorname{Arg\,min}\,(H\delta-\varphi)^T\Sigma^{-1}(H\delta-\varphi)} (\delta-\bar\delta)^T N^{-1}(\delta-\bar\delta),$$
is called normal (where N is a positive definite matrix).
Definition 1.3.1. Among all estimates of the incomplete-rank functional relation (1.3.0), the estimate $d_0$ with the minimum weighted sum of squares relative to $\bar\delta$,
$$d_0 = \arg\min_{d\,\in\,\operatorname{Arg\,min}\,(H\delta-y)^T\Sigma^{-1}(H\delta-y)} (d-\bar\delta)^T N^{-1}(d-\bar\delta),$$
is called normal. In general $\mathrm{E}d_0 \ne \delta$; that is, the expectation of the normal estimate is in general not equal to the true parameter $\delta$. However, $d_0$ allows one to construct some other estimates [Albert 1972]. In what follows we put $\bar\delta = 0$ and N = I unless otherwise required.
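As an illustration of the normal estimate in the simplest setting ($\Sigma = I$, N = I, $\bar\delta = 0$), where it reduces to the minimum-norm least-squares solution, one may consider the following sketch. It is not part of the original text: the matrix H, the vector delta and the helper name normal_estimate are hypothetical, and NumPy is assumed.

```python
import numpy as np

def normal_estimate(H, y):
    """Minimum-norm least-squares solution of H d = y (hypothetical helper)."""
    # For a rank-deficient H, lstsq returns the least-squares solution
    # that has the smallest Euclidean norm.
    d0, _, rank, _ = np.linalg.lstsq(H, y, rcond=1e-10)
    return d0, rank

# Hypothetical design: the third column is the sum of the first two,
# so rank H = 2 < k = 3 and the parameters are not identifiable.
H = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 1.0, 3.0]])
delta = np.array([1.0, 2.0, 0.0])      # one of infinitely many parameter vectors
y = H @ delta                          # error-free response

d0, rank = normal_estimate(H, y)
null_vec = np.array([1.0, 1.0, -1.0])  # H @ null_vec = 0
other = d0 + 0.5 * null_vec            # fits y equally well, but has a larger norm

print("rank:", rank)
print("normal estimate:", d0)
print("norms:", np.linalg.norm(d0), np.linalg.norm(other), np.linalg.norm(delta))
```

In this example the vector that generated the data is not itself the normal vector: the normal estimate is its projection onto the row space of H, which is why its expectation need not coincide with the true parameter.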
1.3.1 Robust Estimation of the Vector of Normal Parameters
We pose the problem in general form [Tikhonov 1965]: to obtain a regularized estimate of the vector of normal parameters that is robust against errors in the input data.
Assumption 1.3.1. Let the set of admissible pseudosolutions (estimates) be
$$\{\delta : (H\delta-\tilde y)^T\Sigma^{-1}(H\delta-\tilde y) = n\}.$$
The regularization algorithm is then naturally written as the following conditional extremum problem.
Problem 1.3.1. Given one observation $\tilde y = H\delta + \tilde e$ of the random vector y, its covariance matrix $\Sigma$, and the matrix H of incomplete rank r, estimate the normal vector $\delta_0$ of the ill-posed model (1.3.0) so that the squared norm of the estimate is minimal on the set of admissible estimates:
$$d_\lambda = \arg\min_{\delta:\ (H\delta-\tilde y)^T\Sigma^{-1}(H\delta-\tilde y) = n} |\delta|^2.$$
Theorem 1.3.1. The solution of Problem 1.3.1 exists, is unique, satisfies the SNAE
$$(H^T\Sigma^{-1}H + \alpha I)\,d_\alpha = H^T\Sigma^{-1}\tilde y, \quad (Hd_\alpha-\tilde y)^T\Sigma^{-1}(Hd_\alpha-\tilde y) = n, \qquad (1.3.1)$$
where $\alpha = 1/\lambda$, and is robust to changes in the input data.
Proof. As is known [Tikhonov & Arsenin 1979], the given statement is equivalent to minimizing the smoothing functional
$$M^\lambda[\delta] = |\delta|^2 + \lambda\left((H\delta-\tilde y)^T\Sigma^{-1}(H\delta-\tilde y) - n\right).$$
The necessary condition of minimum,
$$\delta + \lambda H^T\Sigma^{-1}(H\delta-\tilde y) = 0,$$
leads to the normal Euler equation (1.3.1). Hence the regularized estimate $d_\lambda$ has the form
$$d_\lambda = (H^T\Sigma^{-1}H + \lambda^{-1}I)^{-1}H^T\Sigma^{-1}\tilde y.$$
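The choice of the regularization parameter from the residual condition can be sketched numerically as follows. This is only an illustration under stated assumptions, not the author's algorithm: $\Sigma = I$ is assumed, the data H and y are hypothetical, the helper names are invented for the example, and a simple bisection in $\log\alpha$ is used because the residual grows monotonically with $\alpha$.

```python
import numpy as np

def regularized_estimate(H, y, alpha):
    """Solve (H^T H + alpha I) d = H^T y (Sigma = I assumed)."""
    k = H.shape[1]
    return np.linalg.solve(H.T @ H + alpha * np.eye(k), H.T @ y)

def residual(H, y, alpha):
    """Residual sum of squares |H d_alpha - y|^2."""
    r = H @ regularized_estimate(H, y, alpha) - y
    return float(r @ r)

def alpha_by_residual(H, y, target, lo=1e-10, hi=1e10, iters=200):
    """Bisection in log(alpha); assumes residual(lo) < target < residual(hi)."""
    for _ in range(iters):
        mid = np.sqrt(lo * hi)          # geometric midpoint
        if residual(H, y, mid) < target:
            lo = mid                    # residual too small: increase alpha
        else:
            hi = mid                    # residual too large: decrease alpha
    return np.sqrt(lo * hi)

# Hypothetical rank-deficient design (third column = first + second)
H = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 1.0, 3.0]])
y = np.array([1.5, 1.5, 3.5, 3.5])      # hypothetical perturbed response
n = H.shape[0]

alpha = alpha_by_residual(H, y, target=n)
d_alpha = regularized_estimate(H, y, alpha)
print("alpha:", alpha, "estimate:", d_alpha, "residual:", residual(H, y, alpha))
```

Replacing the target n by n - r in the call gives the variant in which the residual is matched to the expectation of the RSS of an LSM-estimate.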
General results of the theory of regularizing algorithms imply the convergence in probability of $d_\lambda$ to the normal vector $\delta_0$ as $\Sigma \to 0$. If $\Sigma = \mathrm{const}$, then for $\lambda \to \infty$ we have the convergence $d_\lambda \to d_0$.
In practice, $\Sigma$ is usually unknown in regression problems. We therefore need to consider alternative approaches to the construction of regularized estimates. One of them constructs a quasisolution [Ivanov 1962, Ivanov, etc. 1978].
Definition 1.3.1. A quasiestimate in Eq. (1.3.0) is the estimate $d_\gamma$ that minimizes the residual sum of squares on the compact set $|\delta|^2 \le \gamma^2$, where $\gamma$ is a given constant (for example, $\gamma = |\delta_0|$).
Similarly to the previous procedure, the quasiestimate is obtained by minimizing the corresponding Lagrange functional, which leads to an Euler equation with a choice of the Lagrange multiplier $\alpha$; this is again possible when $\gamma$ is given. However, information about $\gamma$ is as rare as information about $\Sigma$. Moreover [Galkin & Mechenov 2001], if $|\delta_0|^2 > \gamma^2$, then $d_\gamma$ is a biased estimate.
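Intuitive considerations [Hoerl 1962, Hoerl & Kennard 1970] have led to the so-called ridge estimate
$$d_\kappa = (H^TH + \kappa I)^{-1}H^T y,$$
where $\kappa > 0$ is sought from
$$\min_\kappa \mathrm{E}L^2(\kappa) = \min_\kappa \mathrm{E}(d_\kappa - \delta)^T(d_\kappa - \delta).$$
The following theorem was proved in [Hoerl & Kennard 1970].
Theorem (existence). For complete-rank models there always exists $\kappa > 0$ such that
$$\mathrm{E}L^2(\kappa) < \mathrm{E}L^2(0).$$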
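The existence statement can be checked on a small example with the standard decomposition of the mean square error of a ridge estimate into variance and squared bias. The sketch below is an illustration only: the homoscedastic model $y = H\delta + e$ with $\operatorname{cov} e = \sigma^2 I$ is assumed, and the design H, the vector delta and the function name ridge_mse are hypothetical.

```python
import numpy as np

def ridge_mse(H, delta, sigma2, kappa):
    """E (d_kappa - delta)^T (d_kappa - delta) for d_kappa = (H^T H + kappa I)^{-1} H^T y."""
    s, V = np.linalg.eigh(H.T @ H)        # eigenvalues s_i and eigenvectors of H^T H
    c = V.T @ delta                       # true parameters in the eigenbasis
    variance = sigma2 * np.sum(s / (s + kappa) ** 2)
    bias2 = kappa ** 2 * np.sum(c ** 2 / (s + kappa) ** 2)
    return variance + bias2

# Hypothetical full-rank but badly conditioned design: eigenvalues of H^T H are 4 and 0.01
H = np.array([[2.0, 0.0],
              [0.0, 0.1],
              [0.0, 0.0]])
delta = np.array([1.0, 1.0])
sigma2 = 1.0

mse_lsm = ridge_mse(H, delta, sigma2, 0.0)        # LSM-estimate (kappa = 0)
mse_ridge = ridge_mse(H, delta, sigma2, 0.01)     # one particular kappa > 0
grid = np.logspace(-4, 2, 61)
mse_grid = np.array([ridge_mse(H, delta, sigma2, k) for k in grid])
print("LSM:", mse_lsm, "ridge(0.01):", mse_ridge, "best on grid:", grid[mse_grid.argmin()])
```

The computation requires the true $\delta$ and $\sigma^2$, which are unknown in practice; this is exactly the difficulty with choosing $\kappa$ that is discussed in the text.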
The theorem shows that regularized estimates may be preferable to normal estimates. We return to the regression model of Eq. (1.1.0). A priori information is frequently introduced into the estimation method of Eq. (1.1.1) in the hope of obtaining results that are in some sense better than those allowed by unbiased linear estimates. Ridge estimates [Hoerl & Kennard 1970] are an example. In some cases biased estimates have certain advantages over unbiased ones. Thus ridge estimation [Hoerl & Kennard 1970] (which has been subjected to fair criticism [Coniffe & Stone 1973]) has the advantage that at some value $0 < \kappa^* < \infty$,
$$\kappa^* = \operatorname{Arg\,min}_{\kappa\in(0,\infty)} \mathrm{E}(d_\kappa - \delta)^T(d_\kappa - \delta),$$
its mean square deviation from $\delta$ is smaller than that of the LSM-estimate, though $\mathrm{E}d_{\kappa^*} \ne \delta$. But such an estimate has the drawback that the value $\kappa^*$ cannot be calculated beforehand (there is only an estimate of it), so it is not excluded that one misses it and obtains, at some $\kappa > \kappa^*$, an estimate of worse quality than the LSM-estimate, that is, with
$$\mathrm{E}L^2(\kappa) > \mathrm{E}L^2(0).$$
In such cases it is usual to refer to Magquart [Magquart 1970]: for such estimates the matrix $H^TH + \kappa I$, $\kappa > 0$, is better conditioned than $H^TH$, and consequently this estimate is preferable. As shown in [Mechenov 1988, 1995] and in item 2.1.3, the same conditioning-improving effect should also arise from a correct account of the equivalent perturbations of rounding errors in the numerical solution of the SLAE. Earlier, [Stein 1960] studied a special case of ridge estimates, the multiplicatively reduced estimates, which can be obtained from the ridge estimate by a special choice of the stabilizing matrix in place of I. This approach was studied in detail in [Zaikin & Mechenov 1973a], [Mechenov 1977, 1978a]. Its quality also depends substantially on the skill in choosing the value $\kappa$.
In general, any estimate d is required to minimize the mean square deviation from the true value $\delta$ [Kendall & Stuart 1967]:
$$\min \mathrm{E}L^2 = \min \mathrm{E}(d-\delta)^T(d-\delta) = \min\left\{\operatorname{var} d^Td + (\mathrm{E}d-\delta)^T(\mathrm{E}d-\delta)\right\}.$$
Here the second term is the squared bias of d in the estimation of $\delta$. It is zero if d is an unbiased estimate of $\delta$. In particular, this is so when $\delta = \delta_0$ and $d = d_0$, i.e., $d_0$ is a normal estimate. Thus some estimate d closer to $\delta$ than $d_0$ may be preferable to $d_0$, but $\operatorname{var} d^Td$ is always greater than $\operatorname{var} d_0^Td_0$. However, the construction of this d requires additional a priori information about $\delta$. The statistical properties of this estimate d depend essentially on the a priori information and should be studied in each particular case individually. Then, under the assumption $\delta = \delta_0$, the statistical properties of the normal estimate $d_0$ are
largely similar to the properties of the LSM-estimate, which explains why the apparatus of pseudoinverse matrices is widely used for solving Problem 1.3.1. This regularized estimate of a normal vector was also studied in [Elden 1977], [Zaikin & Mechenov 1971], [Metchenov 1981, 1988] and is referred to as the estimate by the regularized method of least squares (RLSM-estimate). It is an estimate of the normal vector that is robust against changes in the errors of the input data [Tikhonov 1965], but naturally it is biased, that is, $\mathrm{E}d_\alpha \ne \delta_0$.
Corollary 1.3.1. The unbiased estimate $d_0$ of the normal vector of parameters $\delta_0$ is obtained if we replace n by n-r in the last equation of Eq. (1.3.1),
$$(Hd_\alpha-\tilde y)^T\Sigma^{-1}(Hd_\alpha-\tilde y) = n - r, \qquad (1.3.1a)$$
where in this case the solution is calculated as the limit for $\alpha \to +0$.
Proof. Indeed, the RSS of any LSM-estimate has (similarly to item 1.1.3) expectation equal to n-r. The regularized solution is computed at a residual value equal to n-r; that is, the residual is not biased. It follows that the minimal-norm estimate is unbiased with respect to $\delta_0$, because the unbiased RSS is attained on this estimate. But its expectation, as before, has nothing in common with the true vector of parameters $\delta$. Nor is it a point parameter estimate of the model in the statistical sense, since it is constructed from the solution of a variational problem with quadratic constraints. Therefore we construct a purely regression estimate, using a priori information.
Remark 1.3.1. Note that when $\sigma^2$ is unknown in the homoscedastic model, its unbiased estimate is given by
$$\hat\sigma^2 = \frac{|Hd_0 - \tilde y|^2}{n - r},$$
whose calculation requires knowledge of the rank r of the matrix H.
1.3.2 Expansion of the Regularized Estimate in the Functional Series
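We consider a relation between the LSM-estimate and regularized estimates [Mechenov 1973]. To define the regularized solution of equation (1.3.0), we construct the smoothing functional [Tikhonov 1965]
$$M^\alpha[\delta, y] = |H\delta - y|^2 + \alpha|\delta|^2.$$
The Euler equation for this functional has the form
$$(H^TH + \alpha I)\,\delta = H^T y.$$
Hence the regularized solution is equal to
$$d_\alpha = P_\alpha H^T y,$$
where $P_\alpha = (H^TH + \alpha I)^{-1}$. We denote [Mechenov 1973, 1973a] by $r_\alpha = Hd_\alpha - y$ the residual of the given equation. We consider a vector $\delta^{(1)}$, which we seek as a solution of the equation
$$r_\alpha = H\delta^{(1)} + e_1,$$
and to define it we construct a secondary smoothing functional
$$M^\alpha[\delta^{(1)}, r_\alpha] = |H\delta^{(1)} - r_\alpha|^2 + \alpha|\delta^{(1)}|^2,$$
where $\alpha$ is the same as in the previous case. The Euler equation for this functional has the form
$$(H^TH + \alpha I)\,d_\alpha^{(1)} = H^T r_\alpha,$$
which can also be obtained by differentiating with respect to $\alpha$ the Euler equation
$$(H^TH + \alpha I)\,d_\alpha = H^T y.$$
From here we obtain the formula for computing $d_\alpha^{(1)} = \alpha\,\partial d_\alpha/\partial\alpha$ directly from the right-hand side y of the input equation:
$$d_\alpha^{(1)} = -\alpha P_\alpha^2 H^T y.$$
We transform the Euler equation to the form
$$(H^TH + \alpha I)(d_\alpha - d_\alpha^{(1)}) = H^T y + \alpha d_\alpha.$$
The solution of this equation minimizes the functional
$$|H\delta - y|^2 + \alpha|\delta - d_\alpha|^2.$$
Similarly, for the vector $\delta^{(n)}$ satisfying the equation
$$r_\alpha^{(n-1)} = H\delta^{(n)} + e_n$$
(with $r_\alpha^{(0)} = r_\alpha$ and $r_\alpha^{(n)} = Hd_\alpha^{(n)} - r_\alpha^{(n-1)}$), we construct the smoothing functional
$$M^\alpha[\delta^{(n)}, r_\alpha^{(n-1)}] = |H\delta^{(n)} - r_\alpha^{(n-1)}|^2 + \alpha|\delta^{(n)}|^2$$
and, accordingly, the Euler equation
$$(H^TH + \alpha I)\,d_\alpha^{(n)} = H^T r_\alpha^{(n-1)},$$
whose solution can also be written in the form $d_\alpha^{(n)} = (-1)^n\alpha^n P_\alpha^{n+1}H^T y$ and which is easily transformed into the Euler equation
$$(H^TH + \alpha I)\sum_{i=0}^{n}(-1)^i d_\alpha^{(i)} = H^T y + \alpha\sum_{i=0}^{n-1}(-1)^i d_\alpha^{(i)}, \quad d_\alpha^{(0)} = d_\alpha,$$
for the smoothing functional
$$|H\delta_n - y|^2 + \alpha|\delta_n - \delta_{n-1}|^2, \quad \text{where} \quad \delta_{n-1} = \sum_{i=0}^{n-1}(-1)^i d_\alpha^{(i)}.$$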
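We deduce the recurrence formula for $d_\alpha^{(n)}$:
$$d_\alpha^{(n+1)} = \frac{\alpha}{n+1}\,\frac{\partial d_\alpha^{(n)}}{\partial\alpha} - \frac{n}{n+1}\,d_\alpha^{(n)}.$$
Supposing that $d_\alpha^{(n)} = \dfrac{\alpha^n}{n!}\dfrac{\partial^n d_\alpha}{\partial\alpha^n}$, we compute $d_\alpha^{(n+1)}$ and obtain
$$d_\alpha^{(n+1)} = \frac{\alpha^{n+1}}{(n+1)!}\,\frac{\partial^{n+1} d_\alpha}{\partial\alpha^{n+1}}.$$
The norm $|d_\alpha^{(n)}| = |\alpha^n P_\alpha^{n+1}H^T y| \le |P_\alpha H^T y|$ is bounded irrespective of n and $\alpha$, since $|\alpha P_\alpha| \le 1$; that is, $\{|d_\alpha^{(n)}|\}$ is a nonincreasing sequence.
We formally represent the regularized solution of the Euler equation at the point $\alpha'$ as a Taylor series about the point $\alpha$,
$$d_{\alpha'} = \sum_{i=0}^{\infty}\frac{(\alpha'-\alpha)^i}{i!}\,\frac{\partial^i d_\alpha}{\partial\alpha^i},$$
and calculate the sum of this series:
$$d_{\alpha'} = \sum_{i=0}^{\infty}(\alpha-\alpha')^i P_\alpha^{i+1}H^T y.$$
The last series converges for $|(\alpha'-\alpha)P_\alpha| < 1$, that is, for $\alpha' \in (0, 2\alpha)$, and its sum is equal to
$$d_{\alpha'} = (H^TH + \alpha' I)^{-1}H^T y.$$
Thus we have proved:
Theorem 1.3.2. For every $\alpha' \in (0, 2\alpha)$, the regularized solution of the equation $y = H\delta + e$ is represented as the Taylor series.
Corollary 1.3.2. Let $\alpha' \to 0$. Passing to the limit, we can represent the LSM-estimate (or the normal estimate, if H is degenerate) as a series about an arbitrary point $\alpha$:
$$d_0 = \sum_{i=0}^{\infty}\alpha^i P_\alpha^{i+1}H^T y.$$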
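The series representation of Theorem 1.3.2 and Corollary 1.3.2 can be checked numerically. The sketch below is an illustration under assumptions, not part of the original: it takes the regularized solution in the form $d_\alpha = (H^TH + \alpha I)^{-1}H^T y$, sums the series $\sum_i (\alpha - \alpha')^i P_\alpha^{i+1} H^T y$ term by term, and uses hypothetical data H, y and a hypothetical function name series_estimate.

```python
import numpy as np

def direct_estimate(H, y, alpha):
    """Regularized solution (H^T H + alpha I)^{-1} H^T y."""
    k = H.shape[1]
    return np.linalg.solve(H.T @ H + alpha * np.eye(k), H.T @ y)

def series_estimate(H, y, alpha, alpha_new, terms):
    """Partial sum of sum_i (alpha - alpha_new)^i P_alpha^{i+1} H^T y."""
    k = H.shape[1]
    P = np.linalg.inv(H.T @ H + alpha * np.eye(k))
    term = P @ (H.T @ y)                 # i = 0 term, equal to d_alpha
    total = term.copy()
    for _ in range(terms):
        term = (alpha - alpha_new) * (P @ term)   # next term of the series
        total += term
    return total

# Hypothetical rank-deficient design (third column = first + second)
H = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 1.0, 3.0]])
y = np.array([1.5, 1.5, 3.5, 3.5])

alpha = 0.5
d_shifted = series_estimate(H, y, alpha, alpha_new=0.25, terms=60)  # alpha' inside (0, 2*alpha)
d_normal = series_estimate(H, y, alpha, alpha_new=0.0, terms=80)    # limit alpha' -> 0
print(d_shifted, direct_estimate(H, y, 0.25))
print(d_normal, np.linalg.pinv(H, rcond=1e-10) @ y)
```

The first term of each sum is the regularized estimate $d_\alpha$ itself; the remaining terms move it toward the solution at the other value of the parameter.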
Remark 1.3.2. As shown in [Mechenov 1973],
$$d_0 = \sum_{i=0}^{\infty}(-1)^i d_\alpha^{(i)}, \quad d_\alpha^{(0)} = d_\alpha.$$
That is, every regularized estimate $d_\alpha$ is the first term in the expansion of the normal estimate $d_0$ in a functional series in the regularization parameter $\alpha$ (with the value of $\alpha$ obtained from the residual, quasisolution, or ridge principle). Thus, by adding further terms of the expansion to the regularized solution, we approach the exact solution on the perturbed data, i.e., the normal estimate.
1.3.3 Point Parameter Estimate of the Mixed Model by the LSM
We consider the problem of accounting for a priori information in the parameter estimation of linear regression models, that is, the relationship with the Bayesian procedure. The form of the a priori information used to describe additional knowledge about the unknown parameters of a linear stochastic model depends on the decision function used at the previous stages of research. The same can be said when the a priori information consists of expert estimates: these estimates also result from the operation of some decision function.
Suppose the decision function had the form of a point estimate, so that the a priori information is described by the pair $\{\tilde d, K\}$, where $\tilde d$ is a realization of the random vector d and the matrix K is known exactly (or by the pair $\{\tilde d, \nu^2K\}$ with K known exactly). The idea in this case is natural: to treat the a priori information, alongside the input data, as an estimate of the parameter expectation together with its covariance matrix. We give a general definition of a priori information.
Definition 1.3.3. The a priori information on the required parameters of regression models is a random vector d with normal distribution whose expectation is equal to the required parameter vector $\delta$ and whose covariance matrix is known.
This form of representation of a priori information is studied in this paragraph. In what follows we regard any a priori information as a measured value that naturally contains a measurement error, that is, a passive error. Thus, for the vector d of a priori information we can construct a model of additive form, $d = \delta + s$ with $\mathrm{E}s = 0$ and covariance matrix K, where the dimension of the vector d may differ from k in Eq. (1.3.0); when it is less than k, we assume that the additional elements are necessarily located at the positions of the collinear columns of the matrix H (supplement condition).
We shall not speak here of the Bayesian approach as considered in [Zhukovskij 1972], [Zhukovskij & Morozov 1972], [Murav'eva 1973], [Strand & Westwater 1968], which postulates that $\delta$ is a normally distributed random vector. In the author's opinion, that approach is complicated, for the following reasons. The parameters become random, but this contradicts their nature: they cannot be simultaneously nonrandom in the model and random in the a priori information. In an active experiment it leads to an unusual distribution of the response. The present paragraph gives a simple and effective scheme for accounting for a priori information in well-posed and ill-posed models. This approach goes under various names: mixed model [Ermakov & Zhiglavskij 1987], regularized model [Tikhonov & Arsenin 1979], [Mechenov 1988], model with supplement condition [Morozov 1974, 1984]. Some problems connected with the estimation of an unknown variance remained unresolved; they are solved in [Mechenov 1986, 1988].
Assumption 1.3.3. There is a linear stochastic model of the form
called the mixed linear stochastic model. We assume that the supplement condition [Morozov 1984, 1987] holds, i.e., the matrix of model (1.3.3) has full rank. We assume that the unknown parameter vector $\delta$ is the moment characteristic of the random vector d. We consider the following problem.
Problem 1.3.3. Given one observation $\tilde y$, $\tilde d$ of the random vectors y, d, their covariance matrices $\Sigma$, K, and the full-rank matrix of model (1.3.3), estimate the parameter vector $\delta$ by the MLM.
We call this method of point estimation for the mixed model the method of least squares with a priori information.
Theorem 1.3.3. The estimate $\hat d$ of the vector of parameters $\delta$ is calculated from the equation
$$(H^T\Sigma^{-1}H + I^TK^{-1}I)\,\hat d = H^T\Sigma^{-1}\tilde y + I^TK^{-1}\tilde d.$$
We call this equation the mixed normal equation, or the mixed system of normal equations.
Proof. Writing expression (1.3.3) in matrix form, we have
$$\begin{pmatrix}\tilde y\\ \tilde d\end{pmatrix} = \begin{pmatrix}H\\ I\end{pmatrix}\delta + \begin{pmatrix}\tilde e\\ \tilde s\end{pmatrix},$$
where the matrix I has units only in those places where the values of d are known. From Theorem 1.1.1 of item 1.1.1, the normal equation has the form
$$\begin{pmatrix}H\\ I\end{pmatrix}^T\begin{pmatrix}\Sigma & 0\\ 0 & K\end{pmatrix}^{-1}\begin{pmatrix}H\\ I\end{pmatrix}\hat d = \begin{pmatrix}H\\ I\end{pmatrix}^T\begin{pmatrix}\Sigma & 0\\ 0 & K\end{pmatrix}^{-1}\begin{pmatrix}\tilde y\\ \tilde d\end{pmatrix}.$$
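The construction used in the proof, stacking the observations and the a priori values into one weighted regression, can be sketched as follows. The sketch makes several assumptions that are not taken from the text: the errors of y and of the a priori vector are uncorrelated with each other, the selection matrix J (rows of the identity matrix at the positions where a priori values are known) plays the role of the matrix I of the proof, and all data and function names are hypothetical.

```python
import numpy as np

def mixed_estimate_normal(H, y, Sigma, J, d_prior, K):
    """Solve (H^T Sigma^-1 H + J^T K^-1 J) d = H^T Sigma^-1 y + J^T K^-1 d_prior."""
    Si = np.linalg.inv(Sigma)
    Ki = np.linalg.inv(K)
    A = H.T @ Si @ H + J.T @ Ki @ J
    b = H.T @ Si @ y + J.T @ Ki @ d_prior
    return np.linalg.solve(A, b)

def mixed_estimate_stacked(H, y, Sigma, J, d_prior, K):
    """Same estimate via whitening (Sigma = L L^T) and ordinary least squares on stacked data."""
    Ls = np.linalg.cholesky(Sigma)
    Lk = np.linalg.cholesky(K)
    top = np.linalg.solve(Ls, np.column_stack([H, y]))        # whitened [H | y]
    bottom = np.linalg.solve(Lk, np.column_stack([J, d_prior]))  # whitened [J | d_prior]
    stacked = np.vstack([top, bottom])
    d, *_ = np.linalg.lstsq(stacked[:, :-1], stacked[:, -1], rcond=None)
    return d

# Hypothetical rank-deficient design: the third column is collinear with the first two
H = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 1.0, 3.0]])
y = np.array([1.5, 1.5, 3.5, 3.5])
Sigma = np.diag([1.0, 1.0, 4.0, 4.0])
J = np.array([[0.0, 0.0, 1.0]])       # a priori value known for the third parameter only
d_prior = np.array([1.0])
K = np.array([[0.25]])

d1 = mixed_estimate_normal(H, y, Sigma, J, d_prior, K)
d2 = mixed_estimate_stacked(H, y, Sigma, J, d_prior, K)
print(d1, d2)
```

One a priori value placed at the collinear column is enough here to make the stacked matrix of full rank, which is the role of the supplement condition.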
Remark 1.3.3. Applying the results of paragraph 1.1.3, that is, weighted regression, we pass from the model with correlated errors to a model with uncorrelated errors. So, let $\Sigma^{-1} = U^TU$. Then the mixed linear stochastic model is written as
It is then easy to obtain a homoscedastic model to which the results of paragraph 1.1.4 are applicable.
1.3.3.1 Expectation of the Residual Sum of Squares
Theorem 1.3.3a. The mathematical expectation of the weighted RSS is equal to
That is, it is unbiased when exactly as many rows are added as necessary, and in the necessary places, namely, at the collinear columns. The proof is similar to item 1.1.
Corollary 1.3.3. The unbiased estimate of a covariance matrix is equal
1.3.4 Passage to Regularized Model
The problem of computing the regularized normal parameter estimate of item 1.3.1 can also be restated within the framework of the mixed model. We assume that the supplement condition [Morozov 1984, 1987] holds, that is, the matrix of the mixed model has full rank.
Problem 1.3.4. Given one observation $\tilde y$, $\tilde d$ of the random vectors y, d, their covariance matrices $\Sigma$, K, and the full-rank matrix of model (1.3.3), compute the minimum of the sum of squares of the parameter estimates so that the weighted RSS is less than or equal to the RSS expectation
The unknown variance $\nu^2$ is calculated from this relation, which we call the chi-square principle of variance estimation and which in fact coincides with the functional principle for computing the regularization parameter [Morozov 1974, 1984, 1987].
Theorem 1.3.4. The estimate of the parameter vector $\delta_0$ is calculated from the equation
and $\nu$ is calculated from the relation
We consider the function $\hat s^2(\nu)$ depending on the unknown value $\nu$.
Theorem 1.3.4a. The function $\hat s^2(\nu)$ is a continuous, downward convex function for $\nu \in (0, \infty)$ with range
The proof is similar to [Gordonova & Morozov 1972]. To compute from the chi-square principle the value $\nu$, which by Theorem 1.3.4a exists and is unique, we use the well-known secant method [Kiuru & Mechenov 1971] or the Newton method [Gordonova & Morozov 1972].
1.3.5 Pseudoinverse Matrix
Definition 1.3.5. A generalized inverse of an arbitrary $n \times k$ matrix H of rank r is a $k \times n$ matrix $H^-$ such that
$$HH^-H = H.$$
This matrix always exists, but it is not unique.
Definition 1.3.5a. The matrix $H^-$ that in addition satisfies the three conditions
$$H^-HH^- = H^-, \quad (HH^-)^T = HH^-, \quad (H^-H)^T = H^-H,$$
is called the Moore-Penrose pseudoinverse [Albert 1972] and is denoted by $H^+$ (it is unique).
A definition more convenient for our purposes is introduced in [Mechenov 1988] on the basis of the solution of a variational problem. We start with the incompatible SLAE
$$HA = I,$$
where I is an identity matrix and A is the required inverse. We note that $HH^+ \ne I$ when H is degenerate. To establish a relationship between regularization and pseudoinversion, take the following variational problem.
Problem 1.3.5. Knowing a matrix H and its rank r, calculate
$$A_0 = \arg\min_{A\in R^{k\times n}:\ \|HA - I\|^2 = n - r}\|A\|^2,$$
where the matrix and vector norms are consistent.
Theorem 1.3.5. The solution of Problem 1.3.5 exists, is unique, satisfies the SNAE
$$(H^TH + \alpha I)A_\alpha = H^T, \quad \|HA_\alpha - I\|^2 = n - r,$$
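and it is the pseudoinverse.
Proof. Using the Lagrange method with a multiplier $\lambda$, we obtain the variational problem
$$\hat s^2 = \min_{A\in R^{k\times n}}\left\{\|A\|^2 + \lambda\left(\|HA - I\|^2 - (n - r)\right)\right\}.$$
The Euler equation for this functional is
$$(H^TH + \alpha I)A_\alpha = H^T, \quad \alpha = 1/\lambda, \quad \|HA_\alpha - I\|^2 = n - r,$$
where the last relation is fulfilled as $\lambda \to \infty$. For this reason, the pseudoinverse is defined [Albert 1972], [Zhukovskij & Liptser 1975] as the limit
$$H^+ = \lim_{\alpha\to+0}A_\alpha.$$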
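The limit relation and the defining conditions of the pseudoinverse can be verified numerically on a small example. The sketch below is an illustration only, assuming the Frobenius norm for $\|HA - I\|$ and the regularized inverse in the form $A_\alpha = (H^TH + \alpha I)^{-1}H^T$; the matrix H and the function name regularized_inverse are hypothetical.

```python
import numpy as np

def regularized_inverse(H, alpha):
    """A_alpha = (H^T H + alpha I)^{-1} H^T."""
    k = H.shape[1]
    return np.linalg.solve(H.T @ H + alpha * np.eye(k), H.T)

# Hypothetical 4 x 3 matrix of rank 2 (third column = first + second)
H = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 1.0, 3.0]])
n, k = H.shape
r = np.linalg.matrix_rank(H, tol=1e-10)

H_plus = np.linalg.pinv(H, rcond=1e-10)   # Moore-Penrose pseudoinverse
A_small = regularized_inverse(H, 1e-6)    # regularized inverse for a small alpha

# Squared Frobenius distance between H H^+ and the identity; it equals n - r
gap = np.linalg.norm(H @ H_plus - np.eye(n), "fro") ** 2
print("rank:", r, "||H H^+ - I||^2:", gap)
print("max |A_alpha - H^+|:", np.abs(A_small - H_plus).max())
```

For a small but nonzero $\alpha$ the regularized inverse already agrees with $H^+$ to several digits, while remaining computable by an ordinary linear solve.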
Remark 1.3.5. For the nondegenerate homoscedastic model (1.1.0), the parameter estimate has the form $d = H^+\tilde y$. The same expression for the incomplete-rank model (1.3.0) gives the unbiased estimate of the normal parameter vector. Moreover, every regularized estimate has the form
$$d_\alpha = H_\alpha^+\tilde y, \quad H_\alpha^+ = (H^TH + \alpha I)^{-1}H^T,$$
where $\alpha(\sigma)$ is calculated from the second equation of Eq. (1.3.1). Similarly to Eq. (1.3.4), we have a functional series expansion in the regularization parameter.
The pseudoinverse notation provides a convenient form for estimators such as the LSM and has therefore been studied in [Golub & Kahan 1965], [Peters & Wilkinson 1970]. In particular, a ridge estimate has the form $d_\kappa = H_\kappa^+\tilde y$.
From the practical point of view these results can have only limited application, since the matrices H, $\Sigma$ and K are very seldom known exactly. But in physical research, when a limited number of parameters is sought from a large number of measurements, estimates in the model of an active experiment may differ little from LSM-estimates. The parameter estimation problem for the model of an active experiment also fits into the same scheme. Therefore we pass to the study of the mixed models of confluent analysis, first of passive and then of active experiments.
Remark 1.3. The statements of the problems investigated so far share one prominent feature: they poorly reflect the real state of affairs. They solve the problem of computing parameters for an exactly specified initial matrix, which is possible in theoretical research but is met extremely seldom in practice. Of course, if the error of the right-hand side considerably exceeds the error of the matrix (actually, as shown in 1.4.3f, the relationship is more complicated), it is possible to use the LSM; but this fact must be proved and checked. For example, Legendre [Legendre 1806] calculated the declination angle of a comet as a function of the position of observation points on the Earth's surface. He assumed that the distances between the points were measured exactly and the declination angle with an error, and applied the LSM to obtain estimates of the coefficients of the relationship. For a point prediction of where the comet will be after some time, with the purpose of observing it in a telescope, such accuracy of parameter estimation may well be sufficient.
But for sending a rocket to meet a comet this accuracy may no longer suffice, as "the inaccuracy of position influences radius of a perigee, defined at an aiming" [Winkler 1969]. On the other hand, methods that take into account all types of errors are naturally more complicated, have not been developed for the majority of cases, and should, in principle, be more exact. The present work is devoted to the development of such methods.
1.4 Confluent Analysis of Passive Experiment
Assumption 1.0.4. We assume that the experiment studies a linear functional equation not subject to our influence (a relationship [Kendall & Stuart 1969]) of the form
$$\varphi = \Xi\beta, \qquad (1.0.4)$$
where $\varphi = (\varphi_1, \dots, \varphi_n)^T$ is an unknown response vector, $\Xi = [\xi_1, \dots, \xi_m]$ is an unknown matrix, and $\beta = (\beta_1, \dots, \beta_m)^T$ are unknown parameters (the second part of the linear functional equation (1.0) with H = 0 and $\Phi$ = 0 is taken).
Thus all initial values remain nonrandom, and all perturbations in them are due to outside errors (not related to the values themselves), which we consider, without loss of generality, to be additive measurement errors superimposed only during measurement, both on the right-hand side $\varphi$ and on the matrix $\Xi$.
In practice [Kendall & Stuart 1969], and especially in econometrics, biometrics and other descriptive research [Frisch 1934, Koopmans 1937, Berkson 1950, Linnik 1962, Malenvaud 1970, Seber 1984, Borovkov 1984, Ajvazjan, etc. 1985, Vuchkov, etc. 1985], one often encounters the problem of estimating the unknown parameters of a linear functional equation of the first kind with inaccurate measurements of both the regressor matrix and the right-hand side, the so-called linear confluent model of a passive experiment [Frisch 1934, Ajvazjan, etc. 1985, Mathematical encyclopedia 1985]. In this paragraph the author (see also [Mechenov 1988, 1991, 1994, 1997, 1998]) offers a constructive solution of this problem by the least distances method (LDM). We consider the basic model.
Assumption 1.4. We start with the linear confluent stochastic model (1.4.0) of a passive experiment, constructed for the functional relation (1.0.4), in which the error matrix C is arranged by rows into a column vector and the covariance matrix M ($nm \times nm$) of that vector consists of $m \times m$ blocks $M_{ih}$, $i, h = 1, \dots, n$.
Such a way of writing the model was apparently first proposed in [Petrov 1974]. His remark, "While to the author are not known any solutions of this problem though practically it is very important. This also concerns and to rounding-off errors that very hardly give in to research. Therefore any result in this area is essential.", induced the author to study these problems.
It follows from model (1.4.0) that the experiment is conducted passively, that is, the events occur at some unknown values $\Xi$ and $\beta$, while the researchers have at their disposal the random response vector $y = \varphi + e = \Xi\beta + e$ and the random regressor matrix $X = \Xi + C$, with error e and error C respectively. The confluent model of a passive experiment is shown schematically in Figure 1.4-1 by analogy with [Vuchkov, etc. 1987].
Fig. 1.4-1. Scheme and description of model of a passive experiment.
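The passive scheme, in which both $y = \Xi\beta + e$ and $X = \Xi + C$ are observed with errors, can be illustrated by the classical orthogonal regression, computed through the singular value decomposition of the augmented matrix [X, y]. This is a sketch under a strong assumption, namely uncorrelated errors of equal variance in all elements of X and y; it is the special case associated with [Pearson 1901], not the general method of this chapter. All data and the function name orthogonal_regression are hypothetical.

```python
import numpy as np

def orthogonal_regression(X, y):
    """Orthogonal (total least squares) estimate of beta in y ~ X beta.

    Assumes equal-variance, uncorrelated errors in every element of X and y.
    """
    m = X.shape[1]
    Z = np.column_stack([X, y])
    _, _, Vt = np.linalg.svd(Z)     # singular values are returned in descending order
    v = Vt[-1]                      # right singular vector of the smallest singular value
    return -v[:m] / v[m]            # normalize so that the coefficient of y equals -1

# Hypothetical exact values Xi and beta of the functional relation phi = Xi beta
Xi = np.array([[1.0, 2.0],
               [2.0, 1.0],
               [3.0, 4.0],
               [4.0, 3.0],
               [5.0, 7.0]])
beta = np.array([2.0, -1.0])
phi = Xi @ beta

# Without errors the relation is recovered exactly
beta_exact = orthogonal_regression(Xi, phi)

# With hypothetical small errors in both the matrix and the right-hand side
rng = np.random.default_rng(0)
X = Xi + 0.05 * rng.standard_normal(Xi.shape)
y = phi + 0.05 * rng.standard_normal(phi.shape)
beta_orth = orthogonal_regression(X, y)
beta_lsm, *_ = np.linalg.lstsq(X, y, rcond=None)   # ordinary LSM ignores the errors in X
print(beta_exact, beta_orth, beta_lsm)
```

The ordinary LSM-estimate is printed only for comparison; it treats X as exact, which is precisely the assumption that the confluent model drops.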
We consider the problem of estimating the required parameters. The author [Mechenov 1988] has obtained a constructive algebraic solution of this problem by the LDM. We first state the problem.
1.4.1 Least Distances Method of the Parameter Estimation of Linear Confluent Model
Problem 1.4.1. Given one realization $\tilde y$ and $\tilde X$ (rank $\tilde X$ = m) of the random variables y and X and the corresponding covariance matrices $\Sigma$, T, M, estimate the unknowns $\varphi$, $\Xi$ and the parameters $\beta$ of the linear confluent stochastic model (1.4.0) by the MLM.
Remark 1.4.1. 1) Galton [Galton 1885] studied the well-known regression problem relating the body height of children $\varphi$ to the average body height of parents $\xi$, which gave its name to regression analysis. But this is in fact a confluent problem. Suppose, for example, that a functional relationship between body heights of the form $\varphi = \beta\xi$ is sought (a constant term may also be included). One must take account of the errors (e; c), which have different random causes but are of the same type for different pairs (y; x) = ($\varphi$; $\xi$) + (e; c). This leads to a model whose parameter estimation we consider in item 1.6.4.
2) The model of confluent analysis was first considered in [Pearson 1901] for the case of homoscedastic errors e and c. Pearson regarded the measurements $(\tilde x_i; \tilde y_i)$ as a set of points in the plane that should be projected onto the line $\varphi = \beta\xi$. The estimation method proposed in [Pearson 1901] uses the centroid of the points in the space $R^n$ (as in factor analysis; the so-called least distance method). That is, only a mechanistic approach was applied to the solution of this problem.
3) An exhaustive result based on this approach was obtained in [Sprent 1966]. He studied a model with a covariance matrix of a special form that admits the proof by orthogonal projection (which, incidentally, is not given in [Sprent 1966]). The estimate is obtained from the eigenvector corresponding to the minimum eigenvalue of a matrix built from the observations and an assumed matrix of weights. The result of the proof of this orthogonal projection method by the geometrical approach is also present in [Linnik 1962, Demidenko 1981, Krianev 1986].
4) Many researchers have studied how to account for errors in the matrix [Berkson 1950, Madansky 1959, Petrov 1974, Petrov & Khovanskij 1974, Wahba 1977, Zhukovskij 1977a, 1977b, Krianev 1986, Zhdanov 1989]. However, either the solution was obtained only for special cases [Pearson 1901, Sprent 1966, Malenvaud 1970, Krianev 1986] or dead-end approaches were offered [Hodges & Moore 1972, Davies & Hutton 1975, Wahba 1977, Zhukovskij 1977a, 1977b]. The full algebraic solution of the problem is constructed in [Mechenov 1988, 1991, 1997].
Theorem 1.4.1. The estimator of the parameters $\beta$ of the linear confluent model (1.4.0) from Problem 1.4.1 minimizes the quadratic form
where the matrix $\Psi$ has the form
($M_{ih}$ is the $m \times m$ block with number ih of the covariance matrix M ($mn \times mn$), and $T_{ih}$ are the m elements of the i-th row of the matrix T ($n \times nm$) in position h). Equivalently, it is computed from the following SNAE
where $(A)_j$ designates the j-th row of the matrix A and the matrix
This SNAE is called the confluent normal equation of passive experiment [Mechenov 1988].
Proof. Instead of the matrix X and the vector y, we consider the (mn+n)-dimensional vector $z = (\bar x^T, -y^T)^T$. To this end, we stretch the matrix X by rows into the vector $\bar x$ and augment it with the row $-y^T$; the same transformation is applied to obtain the vectors $\zeta = (\bar\xi^T, -\varphi^T)^T$, $\tilde z = (\tilde{\bar x}^T, -\tilde y^T)^T$ and $w = (\bar c^T, -e^T)^T$. Then the model (1.4.0) is rewritten as a linear regression model with the linear constraints $\Gamma\zeta = 0$:
$$z = \zeta + w, \quad \Gamma\zeta = 0,$$
where the constraint matrix $\Gamma$ ($n \times (nm+n)$) has the following form
$$\Gamma = \begin{pmatrix}
\beta^T & 0 & \cdots & 0 & 1 & 0 & \cdots & 0\\
0 & \beta^T & \cdots & 0 & 0 & 1 & \cdots & 0\\
\vdots & & \ddots & \vdots & \vdots & & \ddots & \vdots\\
0 & 0 & \cdots & \beta^T & 0 & 0 & \cdots & 1
\end{pmatrix}, \quad \beta^T = (\beta_1, \beta_2, \dots, \beta_m).$$
In this case the use of the regression model is completely justified, since the identity matrix I is constructed theoretically and hence is exact. This is the second theoretically justified variant, alongside the background constant: we construct an expectation that is not uniform for all elements of the vector z, but separate for each component of z. Problem 1.4.1 thus reduces to the following two-stage minimization problem:
1) Estimate the true value of the parameter $\zeta$ by the MLM. Such a problem was considered in item 1.2, where it was shown to be equivalent to the minimization of the quadratic form
$$(\tilde z - \zeta)^T\Omega^{-1}(\tilde z - \zeta),$$
where $\Omega$ is the covariance matrix of z, subject to the constraints $\Gamma\zeta = 0$.
2) Find the minimum of this quadratic form over all possible values of the parameters $\beta$ of the matrix $\Gamma$
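$$\hat s^2 = \min_{\beta}\ \min_{\zeta:\ \Gamma\zeta = 0}(\tilde z - \zeta)^T\Omega^{-1}(\tilde z - \zeta).$$
Consider the first stage. As can be seen from item 1.2, this is the problem of estimating $\varphi$ and $\Xi$, in the form $\zeta = (\bar\xi^T, -\varphi^T)^T$, by the weighted LSM under linear constraints [Aitchison & Silvey 1958, Vuchkov, etc. 1987]. Here this approach is completely justified, since the matrix I is known exactly. We use the method of undetermined Lagrange multipliers. For this purpose we multiply the constraints $\Gamma\zeta = 0$ by the vector $2\lambda = 2(\lambda_1, \lambda_2, \dots, \lambda_n)^T$, add the product to the quadratic form being minimized, and pass to the minimization of the Lagrangian
$$\hat s^2 = \min_{\zeta,\lambda}\left[(\tilde z - \zeta)^T\Omega^{-1}(\tilde z - \zeta) + 2\lambda^T\Gamma\zeta\right].$$
A necessary condition for a minimum is that the derivatives with respect to $\zeta$ and $\lambda$ vanish:
$$-\Omega^{-1}(\tilde z - \hat\zeta) + \Gamma^T\hat\lambda = 0, \quad \Gamma\hat\zeta = 0.$$
To construct the estimator of the vector of undetermined Lagrange multipliers, we left-multiply the first equation of this expanded normal equation by the matrix $\Gamma\Omega$; since $\Gamma\hat\zeta = 0$, the estimate $\hat\lambda$ of the vector of undetermined multipliers is equal to
$$\hat\lambda = (\Gamma\Omega\Gamma^T)^{-1}\Gamma\tilde z,$$
and then the estimator $\hat\zeta$ of the vector $\zeta$ (similarly to Eq. (1.2.3)) is calculated from the relationship
$$\hat\zeta = \tilde z - \Omega\Gamma^T(\Gamma\Omega\Gamma^T)^{-1}\Gamma\tilde z, \qquad (*)$$
and its covariance matrix is calculated from the relationship
$$\operatorname{cov}\hat\zeta = \Omega - \Omega\Gamma^T(\Gamma\Omega\Gamma^T)^{-1}\Gamma\Omega.$$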
Then the variational problem (1.4.1) is rewritten in a form independent of the unknowns $\Xi$ and $\varphi$:
$$\hat s^2 = \min_{\beta}\ \hat u^T\Psi^{-1}\hat u, \quad \Psi = \Gamma\Omega\Gamma^T, \qquad (**)$$
where $\hat u = \tilde y - \tilde X\beta$. In the second stage we have to find the estimator b that minimizes (**). Differentiating (**) with respect to the parameter $\beta$, we obtain the SNAE (1.4.1). Since the original matrices $\tilde X$ and $\tilde Z = [\tilde X, -\tilde y]$ are of full rank (degeneracy of the latter is possible only when the vector $\tilde y$ is collinear with the column vectors of $\tilde X$), the estimator b is the unique solution to Problem 1.4.1 (the proof of uniqueness is similar to item 1.2.1).
Remark 1.4.1. At n = m we have $\hat u = \tilde y - \tilde Xb = 0$, and this SNAE turns into the SLAE $\tilde Xb = \tilde y$. Therefore, when the measurements of the matrix and of the response are used and the number of unknowns is equal to the number of right-hand-side measurements, the standard Cramer rule is used for estimation.
1.4.1.1 Estimation of Expectation of the Residual Sum of Squares
Now, using (*), we easily obtain the estimator $\hat X$ of the regressor matrix $\Xi$ and the estimator $\hat y$ of the response vector $\varphi$ respectively
Theorem 1.4.1a. Let the RSS (**) be evaluated at $\beta$, where $\beta$ is the true value of the parameter. Then the mean of the weighted RSS is equal to $n$.
Proof. This result follows from Theorem 1.2.2. The result is obvious, as there are nm+n equations plus n constraints, minus nm+n estimated parameters.
Remark 1.4.1a. The mean of the RSS $\hat s^2$ has an asymptotically (as $n \to \infty$) unbiased estimator equal to n. For nonlinear regression analysis the mean of the RSS is predicted to be equal to n-m in [Kendall & Stuart 1968]. We assume that the expectation of $\hat s^2$ also has a value close to n-m. In view of the nonlinear dependence of the covariance matrix $\Psi$ on the parameters, this value is most likely only approximately equal to n-m. In any case, at m=n the RSS is equal to zero and its mean is also equal to zero, and at m=0 the mean of the RSS is equal to n. So at these two values of m the mean values are calculated precisely. Between zero and n, however, the mean of the RSS need not depend linearly on m; a deviation from linearity is possible (though it most likely does not exceed o(1)). One can note that the minimum of the RSS for correlated observations is the same as in the MLM. For the estimation of m parameters, it is shown in [Kendall & Stuart 1968] that the RSS is reduced by the same value. However, in this case the minimization is applied twice, and the procedure of estimator computation is carried out twice.
As corollaries we consider the two most interesting special cases (one of which is not solved within the framework of [Sprent 1960]): the first, when the covariance is identical in all column vectors of the matrix and in the right-hand side; and the second, when the variances are identical in the matrix measurements and in the response measurements (homoscedasticity), solved geometrically in [Krianev 1986].
1.4.2 Model with the Unit-Type Correlated Observations
Assumption 1.4.2. We use the following linear confluent stochastic model of passive experiment with the full-rank matrix $\Xi$ of the functional relation (1.0.4)
That is, the observations have identical correlations in y and in the columns of the matrix X, which holds, for example, for the autoregression model [Malenvaud 1970]
This model has the following appearance in the confluent form
where the covariance matrix can be identical for each of the vectors $y_{t-i}$, $i = 1, \ldots, m$, shifted by one time position. Then the statement of Problem 1.4.1 has the following form.
Problem 1.4.2. Knowing one realization $\tilde y$ and $\tilde X$ (rank $\tilde X$ = m) of the random variables y and X, and the corresponding covariance matrices $\Sigma$, $M = \operatorname{diag}(\Sigma, \ldots, \Sigma)$, estimate the unknowns $\varphi$, $\Xi$ and the parameter $\beta$ of the linear confluent stochastic model (1.4.2) by the MLM, that is, so that the weighted RSS
is minimized subject to the constraints $\varphi = \Xi\beta$.
Corollary 1.4.2. The estimator $\hat\beta$ of the parameters $\beta$ of the linear confluent model (1.4.2) from Problem 1.4.2 minimizes the quadratic form
or, what is the same, is calculated from the following SLAE
Proof. Following the proof of Theorem 1.4.1, we construct the vectors z, w and the matrix $\Gamma$ in a slightly different way
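$$\Gamma = \begin{pmatrix}
\beta_1 & 0 & \cdots & 0 & \beta_2 & 0 & \cdots & 0 & \cdots & \beta_m & 0 & \cdots & 0 & 1 & 0 & \cdots & 0 \\
0 & \beta_1 & \cdots & 0 & 0 & \beta_2 & \cdots & 0 & \cdots & 0 & \beta_m & \cdots & 0 & 0 & 1 & \cdots & 0 \\
\vdots & & \ddots & & \vdots & & \ddots & & & \vdots & & \ddots & & \vdots & & \ddots & \\
0 & 0 & \cdots & \beta_1 & 0 & 0 & \cdots & \beta_2 & \cdots & 0 & 0 & \cdots & \beta_m & 0 & 0 & \cdots & 1
\end{pmatrix} = [\beta_1 I, \beta_2 I, \ldots, \beta_m I, I],$$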
and we obtain from (**)
Differentiating and equating to zero, we have
Thus we obtain the required equation.
1.4.3 Homoscedasticity of Errors of the Matrix and the Right-Hand Side
We consider the most popular case, when all errors of the regressor matrix have identical variances, as do the errors of the response vector.
Assumption 1.4.3. We use the following linear confluent stochastic model of passive experiment with the full-rank matrix $\Xi$ of the linear functional equation (1.0.4)
Then Problem 1.4.1 has the following form.
Problem 1.4.3. Assume that we have one realization $\tilde y$ and one realization $\tilde X$ (rank $\tilde X$ = m) of the random variables y and X. We also have their variances $\sigma^2$, $\mu^2$. It is required to estimate the unknowns $\varphi$, $\Xi$ and the parameters $\beta$ of the linear confluent stochastic model (1.4.3) by the MLM, that is, so that the RSS
is minimized subject to the constraints $\varphi = \Xi\beta$.
Corollary 1.4.3. The estimator $\hat\beta$ of the parameter $\beta$ of the linear confluent model (1.4.3) from Problem 1.4.3 minimizes the quadratic form
or, what is the same, is computed from the following SLAE
Proof. Following the proof of Theorem 1.4.1, we obtain from (*) that
where the matrix
$$\Omega = \operatorname{diag}(\mu^2, \ldots, \mu^2, \sigma^2, \ldots, \sigma^2)$$
and, accordingly, the matrix
$$\Psi = (\sigma^2 + \mu^2|\beta|^2)\, I,$$
that is,
$$S^2 = \frac{|\tilde X\beta - \tilde y|^2}{\sigma^2 + \mu^2|\beta|^2}.$$
Differentiating this weighted quadratic form and equating the derivative to zero, we obtain
Whence the system of normal equations has the form
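The differentiation step can be written out as a short worked calculation. This is a sketch that assumes the homoscedastic weighted RSS has the form $S^2(\beta) = |\tilde X\beta - \tilde y|^2 / (\sigma^2 + \mu^2|\beta|^2)$; the shorthand $s^2$ is introduced here only for illustration and is not necessarily the notation of the printed equations.

$$\frac{\partial S^2}{\partial\beta} = \frac{2\tilde X^T(\tilde X\beta - \tilde y)}{\sigma^2 + \mu^2|\beta|^2} - \frac{2\mu^2\beta\,|\tilde X\beta - \tilde y|^2}{(\sigma^2 + \mu^2|\beta|^2)^2} = 0.$$

Multiplying by $(\sigma^2 + \mu^2|\beta|^2)/2$ and writing $s^2 = |\tilde X\beta - \tilde y|^2 / (\sigma^2 + \mu^2|\beta|^2)$ gives a system that is nonlinear in $\beta$ because $s^2$ itself depends on $\beta$:

$$(\tilde X^T\tilde X - \mu^2 s^2 I)\,\beta = \tilde X^T\tilde y.$$

At $\mu = 0$ the shift vanishes and the system reduces to the usual normal equations of the LSM.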
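$= \lambda_{\min}(\tilde Z^T\tilde Z)$, the minimal eigenvalue of the matrix $\tilde Z^T\tilde Z$. Then
$$\min_{\beta\in R^m} \mu^2 S^2 = \min_{\beta\in R^m} \mu^2\,\frac{(\tilde X\beta - \tilde y)^T(\tilde X\beta - \tilde y)}{\sigma^2 + \mu^2|\beta|^2} = \min_{\gamma \ne 0} \frac{|\tilde Z\gamma|^2}{|\gamma|^2},$$
as the vector $\gamma = (\beta^T, 1)^T \ne 0$. From here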
From the Sturm separation theorem [Bellman 1960, Gantmacher 1966] we have $0 < \lambda_{\min}(\tilde Z^T\tilde Z) \le \lambda_{\min}(\tilde X^T\tilde X)$. If $\det(\tilde X^T\tilde X - \lambda_{\min}(\tilde Z^T\tilde Z) I) \ne 0$ (which is fulfilled when $0 < \lambda_{\min}(\tilde Z^T\tilde Z) < \lambda_{\min}(\tilde X^T\tilde X)$), the resulting confluent pseudoinverse depends not only on the values of the matrix and its variance, but also on the response and the response variance. Therefore such a confluent pseudoinverse cannot be calculated separately from $\tilde X$ and its variance alone, and it cannot be used for parameter estimation of the model with a new realization of the right-hand side. It follows from Eq. (1.4.3.1) that using a pseudoinverse to compute estimators from various right-hand sides is meaningful only in regression analysis. Thus in confluent analysis the estimate $\hat\beta$ is not written as $\hat\beta = \tilde X^+\tilde y$, where $\tilde X^+$ is either the ordinary pseudoinverse or any other pseudoinverse constructed from the matrix $\tilde X$ and its error norm, for example the minimal pseudoinverse [Leonov 1985, 1987, 1990, 1991]. Thus, using such a pseudoinverse to construct parameter estimates of the confluent model leads to results that differ from the LDM.
1.4.3.1 Function of the Residual Sum of Squares
We study the behavior of the function
$$s^2(\alpha) = \frac{|\tilde X\beta_\alpha - \tilde y|^2}{\sigma^2 + \mu^2|\beta_\alpha|^2},$$
where $\alpha \in R$ and $\beta_\alpha$ is a solution of the Euler equation:
Its graph for model (1.4.0a) is represented in Figure 1.4-2.
Lemma 1.4.3.1. The function $s^2(\alpha)$ is bounded and continuous for $\alpha \ne -\lambda_i^2$.
Proof. For the proof, we consider the representation of the function $s^2(\alpha)$ using the eigenvalues of the matrix $\tilde X = K\Lambda M$:
$s^2(\alpha) =$
where
When the denominator under the summation sign tends to zero ($\alpha \to -\lambda_i^2$), then $s^2(\alpha) \to \lambda_i^2/\mu^2$. At all remaining values of $\alpha$, the denominator and the numerator of the function are bounded, whence the statement of the lemma follows.
The necessary minimum condition is:
It allows us to calculate the value $\alpha = -\mu^2 s^2(\alpha)$. Because $\mathrm{E}\,\mu^2 s^2 \approx \mu^2(n - m)$, the value $-\mu^2(n - m)$ can serve as an initial approximation for calculating the minimizing value by iterative methods.
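The iteration suggested by the necessary condition can be sketched as follows. This is an illustration under stated assumptions, not the author's algorithm: the Euler equation is taken to be $(\tilde X^T\tilde X + \alpha I)\beta_\alpha = \tilde X^T\tilde y$, the fixed point $\alpha = -\mu^2 s^2(\alpha)$ is sought by simple iteration from $\alpha_0 = -\mu^2(n - m)$, and the function names and the four-point data are hypothetical. The sketch does not guard against a singular matrix $\tilde X^T\tilde X + \alpha I$.

```python
import numpy as np

def s2(alpha, X, y, sigma2, mu2):
    """Return (s^2(alpha), beta_alpha) for the assumed Euler equation
    (X^T X + alpha I) beta = X^T y."""
    m = X.shape[1]
    beta = np.linalg.solve(X.T @ X + alpha * np.eye(m), X.T @ y)
    r = X @ beta - y
    return float(r @ r) / (sigma2 + mu2 * float(beta @ beta)), beta

def ldm_fixed_point(X, y, sigma2, mu2, tol=1e-12, max_iter=200):
    """Simple iteration for alpha = -mu^2 s^2(alpha) (hypothetical helper)."""
    n, m = X.shape
    alpha = -mu2 * (n - m)              # initial approximation from the text
    for _ in range(max_iter):
        value, _ = s2(alpha, X, y, sigma2, mu2)
        new_alpha = -mu2 * value        # necessary minimum condition
        converged = abs(new_alpha - alpha) < tol
        alpha = new_alpha
        if converged:
            break
    return alpha, s2(alpha, X, y, sigma2, mu2)[1]

# Illustrative data: the four vertices of a square, unit variances (assumed).
X = np.array([[1.0], [1.0], [2.0], [2.0]])
y = np.array([1.0, 2.0, 1.0, 2.0])
alpha_hat, beta_hat = ldm_fixed_point(X, y, sigma2=1.0, mu2=1.0)
```

For these data the iteration settles at $\alpha = -1$ with slope $\beta = 1$. Convergence is fast near the minimum because the derivative of $s^2$ with respect to $\beta$ vanishes there.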
Figure 1.4-2. Function of the RSS of one-dimensional model.
1.4.3.2 Estimates of Elements of the Matrix and of the Response
Let the vectors $\xi_i$, $i = 1, \ldots, n$, be constructed from the rows of the matrix $\Xi$. The estimator of the matrix $\Xi$ and the estimator of the vector $\varphi$ are, accordingly (compare [Tikhonov 1985]),
They are calculated from (*).
1.4.3.3 Geometrical Illustration
We illustrate the method of least distances. The algebraic solution of the geometrical approach [Bertier & Bourouche 1981, Krianev 1986] is constructed in [Mechenov 1988]. As
that the vector
epi represents a projection of the vector $' on the vector
and from zATp i g , = 0,i = 1,- ..,n; the orthogonality i orthogonality
i p i l t ,,i = 1,-
to linear manifold z
i
i = 1 n ; and the
- , n follow. Thus, the vectors i p i , i = I,.--,n; belong
.
.i = I,.. ,n;d i m z = m, that is orthogonal to the vector
t~form the linear manifold s and the vectors y ,form the linear manifold dy m. i, . In space R ~ " ,the vectors
p ] , d i m= ~
Figure 1.4-3. An example of the construction of one projection of the vector $\hat z_{pi}$.
Therefore, the relation $\tilde z_i = \hat z_{pi} + \hat e_{pi}$, $i = 1, \ldots, n$, represents an orthogonal expansion of $\tilde z_i$, and $\hat z_{pi}$ is the projection of $\tilde z_i$ on the linear manifold, as represented in Figure 1.4-3.
1.4.3.4 Numerical Example
Let us consider an example. To compare the results of parameter estimation more easily, we artificially place four points of initial data at the vertices of a square {(1;1), (1;2), (2;1), (2;2)}. This square has two axes of symmetry (the diagonal and the middle line), and we suppose that these values were measured with an identical error. From these four observations we construct estimators of the parameter of different functional relations. In Figure 1.4-4a the model (1.0.1) $y = \beta\eta + e$ was formed for the functional relation (1.0.0) of the form $\varphi = \beta\eta$, and the four measurements were approximated by the LSM, giving as a result the line $\hat y = \hat\beta\eta$, whose slope is less than 1 (regression). Thus, "regression" turns out even for such completely symmetric data.
Figure 1.4-4. Comparison of the LSM (a) and of the LDM (b).
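The four-point example of this item can be checked numerically. The sketch below assumes a one-parameter line through the origin and equal error variances in x and y; the LDM slope is obtained by a total least squares route through the SVD, which is one standard way to minimize $|x\beta - y|^2 / (\sigma^2 + \mu^2\beta^2)$ and is not necessarily the computation used by the author. The function names are hypothetical.

```python
import numpy as np

def lsm_slope(x, y):
    """Least squares slope of y = beta * x (errors only in y)."""
    return float(x @ y) / float(x @ x)

def ldm_slope(x, y, sigma=1.0, mu=1.0):
    """Least distances slope for y = beta * xi + e, x = xi + c (assumed SVD route)."""
    # Scale both variables to unit error variance, then use classical total least squares.
    A = np.column_stack([x / mu, y / sigma])
    _, _, vt = np.linalg.svd(A)
    v = vt[-1]                   # right singular vector of the smallest singular value
    beta_scaled = -v[0] / v[1]   # (beta, -1) is proportional to v
    return float(beta_scaled * sigma / mu)

# The four vertices of the square from the example.
x = np.array([1.0, 1.0, 2.0, 2.0])
y = np.array([1.0, 2.0, 1.0, 2.0])
slope_lsm = lsm_slope(x, y)
slope_ldm = ldm_slope(x, y)
```

Here `slope_lsm` is 9/10, which is below 1, while `slope_ldm` equals 1 up to rounding, that is, the diagonal axis of symmetry of the square.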
The parameter of a functional relation (1.4.0) of the form $\varphi = \xi\beta$ is estimated in Figure 1.4-4b for the same realization. According to the model (1.4.1) of the form $y = \beta\xi + e$, $x = \xi + c$, when the four measurements are approximated by the LDM, this leads to the line $\hat\varphi = \hat\beta\xi$, which has slope 1 (an axis of symmetry), that is, there is no regression with such an estimate.
1.4.3.5 Expectation of LDM-Estimate
The LDM-estimate is most likely biased. It is natural to compare this bias with the bias of the LSM-estimate for the same data. So, we calculate the bias of the LDM-estimate calculated from the equation
$$\mathrm{E}\tilde X^T\tilde X - \Xi^T\Xi = \mathrm{E}(\tilde X - \Xi)^T(\tilde X - \Xi) = m\mu^2 E,$$
where the matrix $E$ consists entirely of ones. Then the expectation of the matrix of the normal equations is $\Xi^T\Xi + m\mu^2 E - (n - m)\mu^2 I$, and
That is, in brackets the matrix $m\mu^2 E - (n - m)\mu^2 I$ is added to the matrix $\Xi^T\Xi$; its negative diagonal bias partly compensates the positive biases of all the remaining elements of the matrix. At the same time the LSM-estimate $\hat b = (\tilde X^T\tilde X)^{-1}\tilde X^T\tilde y$ of these data is even more biased than the LDM-estimate, as its bias has the form $\mathrm{E}\hat b - \beta = ((\Xi^T\Xi + m\mu^2 E)^{-1}\Xi^T\Xi - I)\beta$, that is, in brackets the matrix $m\mu^2 E$ is added to the matrix $\Xi^T\Xi$.
Remark 1.4.3e. "In what way is the LSM-estimate of the parameters of the confluent model worse than the LDM-estimate of these parameters?" The answers to this question are as follows.
First, the LSM-estimate of the confluent model does not follow from any problem statement; it is simply the Gauss transformation applied head-on, which is valid only at n=m.
Second, the LSM-estimate of the confluent model is even more biased than the LDM-estimate.
Third, the LSM-estimate is simply a special case of the LDM-estimate; so, if in an experiment $\mu^2|\beta|^2 \ll \sigma^2$, then applying the LSM to these data can give a good enough approximation (an example of the opposite situation is given in item 1.6.4).
Fourth, if the variance of the matrix is equal to zero, then the LSM formula arises from the LDM formula.
1.4.3.6 Estimation of the Variance
In the case when the variance ratio $\kappa = \sigma^2/\mu^2$ is known, an approximate estimate of the variance $\sigma^2$ can be calculated, in view of Remark 1.4.1a, from the formula
$$\hat\sigma^2 \approx \frac{1}{n - m}\cdot\frac{\kappa\,|\tilde X\hat\beta - \tilde y|^2}{\kappa + |\hat\beta|^2}.$$
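The variance estimate can be sketched in a few lines. The formula used below is an assumption: it follows from setting the weighted RSS $|\tilde X\hat\beta - \tilde y|^2 / (\sigma^2 + \mu^2|\hat\beta|^2)$ equal to its approximate mean $n - m$ (Remark 1.4.1a) and substituting $\mu^2 = \sigma^2/\kappa$. The function name and the data are hypothetical.

```python
import numpy as np

def variance_estimate(X, y, beta, kappa):
    """Approximate sigma^2 from s^2 ~ n - m with kappa = sigma^2 / mu^2 (assumed formula)."""
    n, m = X.shape
    r = X @ beta - y
    return kappa * float(r @ r) / ((n - m) * (kappa + float(beta @ beta)))

# Illustrative data: the four-point example with the LDM slope beta = 1.
X = np.array([[1.0], [1.0], [2.0], [2.0]])
y = np.array([1.0, 2.0, 1.0, 2.0])
kappa = 1.0
sigma2_hat = variance_estimate(X, y, np.array([1.0]), kappa)
mu2_hat = sigma2_hat / kappa   # the matrix error variance follows from the known ratio
```

With $n - m = 3$, residual norm squared 2 and $|\hat\beta|^2 = 1$, the sketch returns 1/3 for both variances.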
1.4.3.7 Estimation of the Covariance Matrix
The covariance matrix of the parameter estimates can be calculated approximately from the following formula
1.4.3.8 Distribution of the Residual Sum of Squares
The RSS $S^2$ has a chi-square distribution with n degrees of freedom. The distribution of $\hat S^2$ has yet to be constructed.
1.4.4 Multiple Multivariate Confluent Analysis
Assumption 1.4.4. We use the following confluent stochastic model of passive experiment constructed for a multiple multivariate functional relation
We consider the following problem.
Problem 1.4.4. Knowing one realization $\tilde X$ (rank $\tilde X$ = m) and one realization $\tilde Y$ of the random variables Y and X, and the covariance matrices $\Sigma$, $T$, $M$, estimate the unknowns $\varphi$, $\Xi$ and the parameter B of the linear confluent stochastic model (1.4.4) by the MLM.
Theorem 1.4.4. The estimates $\hat B$ of the parameters B of the linear confluent model (1.4.4) from Problem 1.4.4 minimize the quadratic form
where Y = {oih+
- (T; + T; )B) .
1.4.5 A Nonlinear Confluent Analysis
Assumption 1.4.5. We assume that the experiment studies a nonlinear functional equation of the first kind, not subject to our influence, of the form
where $f(\Xi, \beta)$ is a known function. We construct the model.
Assumption 1.4.5a. We use the confluent stochastic model of passive experiment, linear and additive in the errors, constructed for the nonlinear functional equation (1.4.5)
$$y = f(\Xi, \beta) + e,\quad y \in R^n,\ \beta \in R^m,\ \mathrm{E}e = 0,\ \mathrm{E}ee^T = \Sigma,\ n \ge m,$$
$$X = \Xi + C,\quad \mathrm{E}C = 0,\ \mathrm{E}eC^T = T,\ \mathrm{E}CC^T = M. \qquad (1.4.6)$$
If the function $f(\Xi, \beta)$ admits linearization for the increment of the successive approximation method, it is possible to set a problem of type 1.4.1 and to obtain its solution. Passing again to the nonlinear problem, we obtain that the parameters minimize the quadratic form
In the remaining cases the following problem is postulated.
Problem 1.4.5b. Knowing one realization of the random variables y and X and the covariance matrices $\Sigma$, $T$, $M$, estimate the unknowns $\varphi$, $\Xi$ and the parameters $\beta$ of the linear confluent stochastic model (1.4.6) so that the following quadratic form is minimized:
Such a problem should be postulated by analogy with the linear case, since such a statement does not follow directly from the MLM (as it does in variance and regression analysis). Now we pass to the ill-posed problems.
1.5 Stable Parameter Estimation of the Degenerated Confluent Model
We also consider the problem of estimating the normal parameter vector stably against input data errors [Tikhonov 1965a] for cases when the matrix $\Xi$ has incomplete rank. Attempts to solve this problem were undertaken earlier [Tikhonov 1980, 1980a, 1980b, 1985], before the solution of the well-posed problem had been constructed algebraically [Mechenov 1988]. Transferring the obtained results to operator equations of the first kind (following [Tikhonov 1985]) usually presents no difficulties, and it is there that they find their basic application.
Assumption 1.5. Let the linear functional equation (1.0.4) have the form
where $\varphi = (\varphi_1, \ldots, \varphi_n)^T$ is a response vector, $\Xi = [\xi_1, \ldots, \xi_m]$ is an unknown matrix, and $\beta = (\beta_1, \ldots, \beta_m)^T$ are unknown parameters (the second part of the linear functional equation (1.0) with $H \equiv 0$ and $\Phi = 0$ is taken).
Definition 1.5.0. The vector $\beta_0$ is a normal solution of the SLAE (1.0.5) with the degenerate matrix $\Xi$ if
$$\beta_0 = \operatorname{Arg}\min_{\beta \in R^m:\ \Xi\beta = \varphi} |\beta - \bar\beta|^2$$
with a known matrix $\Xi$ and known vectors $\varphi$ and $\bar\beta$. Further, without loss of generality, we suppose $\bar\beta = 0$.
1.5.1 Regularized Distances Method
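As a small numerical illustration of Definition 1.5.0, the normal solution of a consistent system with a degenerate matrix can be computed with the Moore-Penrose pseudoinverse. This is a sketch with exactly known $\Xi$ and $\varphi$, which is the setting of the definition and not of the confluent problems that follow; the function name, the reference vector argument and the data are hypothetical.

```python
import numpy as np

def normal_solution(Xi, phi, beta_ref=None):
    """Solution of Xi @ beta = phi closest to beta_ref (default: the zero vector)."""
    m = Xi.shape[1]
    if beta_ref is None:
        beta_ref = np.zeros(m)
    # Shift by beta_ref so the problem becomes a minimum-norm one, then apply pinv.
    return beta_ref + np.linalg.pinv(Xi) @ (phi - Xi @ beta_ref)

Xi = np.array([[1.0, 1.0], [1.0, 1.0]])   # degenerate (rank 1) matrix
phi = np.array([2.0, 2.0])                # consistent right-hand side
beta0 = normal_solution(Xi, phi)
beta0_shifted = normal_solution(Xi, phi, np.array([1.0, 0.0]))
```

All solutions lie on the line $\beta_1 + \beta_2 = 2$; the one of minimum norm is (1, 1), and the one closest to the reference vector (1, 0) is (1.5, 0.5).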
Assumption 1.5.1. We shall base our discussion on the linear confluent stochastic model of passive experiment of incomplete rank
$$y = \Xi\beta + e,\quad y \in R^n,\ \beta \in R^m,\ \mathrm{E}e = 0,\ \mathrm{E}ee^T = \Sigma,\ \Sigma > 0,\ n > m,$$
$$X = \Xi + C,\quad X \in R^{nm},\ \mathrm{E}C = 0,\ \mathrm{E}eC^T = T,\ \mathrm{E}CC^T = M,\ M > 0, \qquad (1.5.0)$$
constructed for the linear functional relation (1.0.5).
We change the form of the initial model. Instead of the matrix X and the vector y we consider the vector $z = (X, -y^T)^T$ of dimension mn+n. To do this we arrange the matrix X by rows into a row and adjoin to it the row $-y^T$, and we do the same for the vectors $\zeta = (\Xi, -\varphi^T)^T$, $\tilde z = (\tilde X, -\tilde y^T)^T$ and $w = (C, -e^T)^T$. Then the model (1.5.0) reduces to the non-degenerate linear regression model with the linear constraints $\Gamma\zeta = 0$:
$$z = I\zeta + w,\quad w \in R^{nm+n},\ \mathrm{E}w = 0,\ \mathrm{E}ww^T = \Omega, \qquad (1.5.1a)$$
where the constraint matrix $\Gamma$ (of size $n \times (nm + n)$) has the following form
In the given model, the matrix $I$ is not degenerate and the constraint matrix has full rank, independently of the rank of the matrix of the initial model. Therefore the first stage, the estimation of the vector $\zeta$, is always well posed.
Problem 1.5.1a. Knowing the approximate vector $\tilde z = (\tilde X, -\tilde y^T)^T$, its covariance matrix $\Omega$ and the constraint matrix $\Gamma$, estimate the parameters $\zeta$ of model (1.5.1a) by the MLM.
Theorem 1.5.1a. The solution of Problem 1.5.1a exists, is unique and minimizes the quadratic form
The proof practically coincides with item 1.4.1. By Theorem 1.4.1.1 the expectation of this quadratic form is equal to n. Since the estimates become stable when the RSS reaches the value of its expectation, we set the problem in the appropriate form.
Problem 1.5.1b. Knowing the measured vector $\tilde y$ and matrix $\tilde X$, the exact matrices $\Sigma$, $T$, $M$, and the fact that the matrix $\Xi$ of the SLAE (1.0.5) has incomplete rank, compute the regularized estimate of the parameters $\beta$
$$\hat\beta_\alpha = \arg\min\left\{|\beta|^2 :\ \beta \in R^m,\ (\tilde X\beta - \tilde y)^T \Psi^{-1} (\tilde X\beta - \tilde y) \le n\right\}. \qquad (1.5.1b)$$
Theorem 1.5.1b. The solution of Problem 1.5.1b exists, is unique and satisfies the relation
Proof. It is enough to find the minimum of the squared norm of the parameters over all possible values of the parameters $\beta$ subject to the constraint $\hat S^2 \le n$. It is known [Tikhonov & Arsenin 1979, Shikin 1985] that the minimum is attained on the boundary of the set, that is, when $\hat S^2 = n$. Using the relation (*) from item 1.4.1, it is easy to calculate the estimate of the expectation of the regressor matrix $\Xi$ and the estimate of the expectation of the response vector $\varphi$.
Such an estimate of the normal vector of parameters was also studied in [Mechenov 1988, 1991] and is referred to as the estimate of the regularized method of least distances, or the RLDM-estimate. It is an estimate of the normal vector that is stable against changes in the input data errors [Tikhonov 1985], but it is naturally biased, that is, $\mathrm{E}\hat\beta_\alpha \ne \beta_0$.
Corollary 1.5.1. The unbiased estimator $\hat\beta_0$ of the normal vector is obtained if n is replaced by n-r in the inequality (1.5.1b), where in this case the estimate is calculated as a limit as $\alpha \to 0+$.
Proof. Indeed, in this case the expectation of the RSS of any LDM-estimate is equal to n-r (similarly to item 1.1.3). The estimate is calculated at the value of the RSS equal to n-r, that is, the RSS is not biased. It follows that the minimum-norm estimate (on which this unbiased RSS is attained) is not biased relative to $\beta_0$. But its expectation, as before, has nothing in common with the true parameter vector.
In practice the homoscedastic model occurs most frequently, therefore we consider this case separately.
1.5.2 Homoscedasticity of Errors of the Matrix and of the Right-Hand Side
Assumption 1.5.2. We use the following linear confluent stochastic model of passive experiment with the matrix $\Xi$ of the linear functional equation (1.0.5) of incomplete rank
Problem 1.5.2a. Knowing the measured vector $\tilde z = (\tilde X, -\tilde y^T)^T$, its covariance matrix $\Omega$ and the constraint matrix $\Gamma$ of the model (1.5.2a), estimate the parameters $\zeta$ by the MLM.
Theorem 1.5.2a. The solution of Problem 1.5.2a exists, is unique and minimizes the quadratic form
$$S^2 = \min_{\zeta:\ \Gamma\zeta = 0} (\tilde z - \zeta)^T \Omega^{-1} (\tilde z - \zeta).$$
The proof practically coincides with item 1.4.3. Then Problem 1.5.1b has the following form.
Problem 1.5.2b. Knowing the measured vector $\tilde y$ and matrix $\tilde X$, the variances $\sigma^2$, $\mu^2$, and the fact that the matrix $\Xi$ of the SLAE (1.0.5) has incomplete rank, calculate the estimate of the parameter vector $\beta$ of the linear confluent stochastic model (1.5.2) that is stable against errors of the input data
$$\hat\beta_\alpha = \arg\min\left\{|\beta|^2 :\ \frac{|\tilde X\beta - \tilde y|^2}{\sigma^2 + \mu^2|\beta|^2} \le n\right\}.$$
Theorem 1.5.2b. The solution of Problem 1.5.2b exists, is unique and satisfies the relation
and the SLAE
where the value of the regularization parameter $\alpha$ is positive.
Proof. Noting that the minimum is attained on the boundary of the set, the relation (1.5.2b) is obvious. To construct the estimate we apply the method of undetermined Lagrange multipliers. For this purpose we multiply the constraint
$$\frac{|\tilde X\beta - \tilde y|^2}{\sigma^2 + \mu^2|\beta|^2} - n = 0$$
by the multiplier $\lambda$, add the product to the minimized quadratic form, and as a result come to the minimization of the Lagrangian
$$(\hat\beta_\alpha, \hat\lambda) = \arg\min_{\beta \in R^m,\ \lambda}\left\{|\beta|^2 + \lambda\left[\frac{|\tilde X\beta - \tilde y|^2}{\sigma^2 + \mu^2|\beta|^2} - n\right]\right\}.$$
Differentiating this expression with respect to the unknowns and substituting $\lambda = 1/\alpha$, we obtain the SLAE (1.5.2c), and the estimate calculated from it solves Problem 1.5.2.
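The differentiation can be written out explicitly. The following is a worked calculation under the assumption that the Lagrangian is $|\beta|^2 + \lambda\,[\,|\tilde X\beta - \tilde y|^2 / (\sigma^2 + \mu^2|\beta|^2) - n\,]$; the resulting system is a direct consequence of this assumption and is not necessarily written in the same form as the printed SLAE (1.5.2c).

$$\frac{\partial}{\partial\beta}:\quad 2\beta + \lambda\left[\frac{2\tilde X^T(\tilde X\beta - \tilde y)}{\sigma^2 + \mu^2|\beta|^2} - \frac{2\mu^2\beta\,|\tilde X\beta - \tilde y|^2}{(\sigma^2 + \mu^2|\beta|^2)^2}\right] = 0.$$

On the boundary of the admissible set the ratio $|\tilde X\beta - \tilde y|^2 / (\sigma^2 + \mu^2|\beta|^2)$ equals $n$. Substituting this, multiplying by $(\sigma^2 + \mu^2|\beta|^2)/(2\lambda)$ and setting $\lambda = 1/\alpha$ gives

$$\left(\tilde X^T\tilde X + \left[\alpha(\sigma^2 + \mu^2|\beta|^2) - n\mu^2\right] I\right)\beta = \tilde X^T\tilde y,$$

which is to be solved together with the boundary condition $|\tilde X\beta - \tilde y|^2 = n(\sigma^2 + \mu^2|\beta|^2)$. At $\mu = 0$ the system takes the familiar regularized least squares form with the shift $\alpha\sigma^2$.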
Theorem 1.5.2.1. The vector $\hat\zeta = (\hat\Xi, -\hat\varphi^T)^T$ realizing the identity $\hat\varphi = \hat\Xi\hat\beta_\alpha$ is determined through $\hat\beta_\alpha$, which is a stable estimate of the normal vector, from the formula (*) of item 1.4.3.
Remark 1.5.2. Since this method of stable estimation of the normal vector is derived from the LDM, we call it the regularized method of least distances (RLDM). Tikhonov [Tikhonov 1985] called its analog a regularized method of least squares, which, however, is illogical. The given result coincides with the assumption predicted in the foreword of [Tikhonov & Ufimtsev 1988].
1.5.3 Stability of Pseudosolution of Systems of Linear Algebraic Equations
We present the material only algebraically [Mechenov 1991], i.e., we consider the SLAE (1.0.5) in the presence of deterministic errors in the matrix and in the right-hand side. It is interesting to compare the results with those of other authors and to study the pseudoinverse problem.
Assumption 1.5.3. We use the following linear model of incomplete rank for the SLAE (1.0.5) with nonrandom error
Problem 1.5.3. Let B =
[P
ER
~ :min 1 1 - 51 {5'<=0
2}
5x
is the set of
admissible pseudosolutions. Knowing the measured vector $\tilde z = (\tilde X, -\tilde y^T)^T$ containing the full-rank matrix $\tilde X$, the error norm, and the fact that the matrix $\Xi$ of the SLAE (1.0.5) is degenerate, calculate the regularized pseudosolution so that the squared norm of this pseudosolution is minimal on the set of admissible pseudosolutions
Corollary 1.5.3. The solution of Problem 1.5.3 exists, is unique, and satisfies the relation
ix =arg
min
PI'
or the SNAE
where the regularization parameter is positive. The proof is similar to item 1.5.1.
Theorem 1.5.3. The obtained regularized pseudosolution is a stable approximation to the normal solution. That is, for any value $\varepsilon > 0$ there is a value of the input data error (depending on $\varepsilon$) such that the regularized pseudosolution of the SLAE (1.0.5) deviates from the normal solution by no more than $\varepsilon$
The proof is similar to that given in [Tikhonov 1985]. We compare this result with the known result [Tikhonov 1985] for the measured matrix of a SLAE, where the regularized solution is proposed to be calculated from
where $|\tilde y - \varphi| = \sigma$ is the norm of the error of $\tilde y$ relative to $\varphi$, and $|\tilde X - \Xi| = \mu$ is the norm of the error of $\tilde X$ relative to $\Xi$. Asymptotically they give identical estimates, but the result in [Tikhonov 1985] does not follow from a variational statement. We set Problem 1.5.3 a little differently
,.
PC = arg
min
the solution of which leads to
where, for the computation of the regularized pseudosolution, it is necessary to know the inconsistency measure and the norm of the exact solution. Using an inequality from [Morozov 1967], we can proceed to the following problem
fio.,, 1-81 = arg
min
pa.:{
j
I l+pr Y -J~+PIFI) ~l+ler I$
7
where, for the computation of the regularized pseudosolution, it is necessary to know the norm of the exact solution and the error norms.
Remark 1.5.3.
1) In the introduction of [Tikhonov & Ufimtsev 1988] it is supposed that the statistical form of the regularized algorithm will have the form
however, in [Tikhonov 1985] a different form is stated.
2) The problem presented in this paragraph is of undoubted interest, especially in connection with applications to integral equations (see Chapter 3).
3) Some words about the pseudoinverse in this case, that is, when $\Xi$ is degenerate. Leonov [Leonov 1985, 1987, 1990, 1991] has introduced a minimal pseudoinverse from the condition
$$\tilde X^+_\mu = \arg\min_{X:\ \|\tilde X - X\| \le \mu} \|X^+\|.$$
In the nonlinear system
the parameter depends both on the right-hand side and on the norm of its error. Therefore any pseudoinverse estimate [Leonov 1985, 1987, 1990, 1991] in its pure form, when applied to the ill-posed problems of confluent analysis, does not give a vector that coincides with the regularized estimate, since it is impossible to avoid completely the dependence on the right-hand side and on the norm of its error. Therefore all methods based on computing the pseudosolution of the given models only through estimating the pseudoinverse have little in common with the RLDM, except, perhaps, for common asymptotic properties. Thus, although the application of a pseudoinverse estimate is possible, it is only one more method of computing the pseudosolution of ill-posed problems that is asymptotically (as $\mu \to 0$, but not as $\sigma \to 0$) consistent. The results can easily be transferred to nonlinear models and to multivariate analysis.
1.5.4 Mixed Linear Confluent Model of Passive Experiment
Frequently there arises the problem of estimating the unknown parameters of a linear confluent model with an inaccurately measured regressor matrix and with a priori information of the kind assumed in item 1.3. In this paragraph the author (see also [Mechenov 1988, 1991]) offers a constructive solution of this problem by the LDM.
Assumption 1.5.4. We use the linear confluent stochastic model (1.4.1) of passive experiment of incomplete rank with the a priori information
$$y = \Xi\beta + e,\quad y \in R^n,\ \beta \in R^m,\ \mathrm{E}e = 0,\ \mathrm{E}ee^T = \Sigma,\ \Sigma > 0,$$
$$X = \Xi + C,\quad X \in R^{nm},\ \mathrm{E}C = 0,\ \mathrm{E}eC^T = T,\ \mathrm{E}CC^T = M,\ M > 0, \qquad (1.5.4)$$
$$b = I\beta + w,\quad b \in R^h,\ \mathrm{E}w = 0,\ \mathrm{E}ww^T = \nu^2 K,\ K > 0,\ n + h > m,$$
such that the supplement condition is satisfied, that is, the matrix composed of the matrices $\Xi$ and $I$ is not degenerate.
We assume that the experiment is conducted passively, that is, the events occur at some unknown values of $\Xi$ and $\beta$. The researchers have at their disposal the random response vector, the random regressor matrix, the random a priori vector b for the vector $\beta$, and their covariance matrices. The structure of the model is shown in Figure 1.5-1.
Figure 1.5-1. The scheme of regularized experiment.
Problem 1.5.4. Knowing one realization $\tilde y$, $\tilde X$ (rank $\tilde X$ = m) and $\tilde b$ of the random variables y, X, b and the corresponding covariance matrices $\Sigma$, M, T, K, estimate the unknowns $\varphi$, $\Xi$ and the parameters $\beta$ of the linear confluent stochastic model (1.5.4) by the MLM.
Theorem 1.5.4. The estimates of the parameters $\beta$ of the linear confluent model (1.5.4) from Problem 1.5.4 minimize the following quadratic form
The proof differs little from item 1.4.1.
Theorem 1.5.4.1. The expectation of the weighted RSS at the true values of the parameters is equal to
Proof. We use Theorem 1.2.2 of Chapter 1. The result is obvious, as there are nm+n+m equations and nm constraints.
1.5.5 A Method of the Least Distances for Homoscedastic Model
We consider the most popular case, when all errors of the regressor matrix have the same known variance, as do the errors of the response vector and the errors of the a priori information.
Assumption 1.5.5. We use the linear confluent stochastic model (1.4.1) of passive experiment of incomplete rank with the a priori information
$$y = \Xi\beta + e,\quad y \in R^n,\ \beta \in R^m,\ \mathrm{E}e = 0,\ \mathrm{E}ee^T = \sigma^2 I,$$
$$X = \Xi + C,\quad X \in R^{nm},\ \mathrm{E}C = 0,\ \mathrm{E}eC^T = 0,\ \mathrm{E}CC^T = \mu^2 I, \qquad (1.5.5)$$
$$b = I\beta + w,\quad b \in R^h,\ \mathrm{E}w = 0,\ \mathrm{E}ww^T = \nu^2 I,\ n + h \ge m,$$
such that the supplement condition is satisfied, that is, the matrix composed of the matrices $\Xi$ and $I$ is not degenerate.
In the model (1.5.5) the observations have identical variances in y, separately in the columns of the matrix X, and in the a priori information. Then Problem 1.5.4 has the following form.
Problem 1.5.5. Knowing one realization $\tilde y$, $\tilde X$ (rank $\tilde X$ = m) and $\tilde b$ of the random variables y, X, b and the corresponding variances $\sigma^2$, $\mu^2$ and $\nu^2$, estimate the unknowns $\varphi$, $\Xi$ and the parameters $\beta$ of the linear confluent stochastic model (1.5.5) by the MLM.
Theorem 1.5.5. The estimates of the parameters $\beta$ of the linear confluent model (1.5.5) for Problem 1.5.5 minimize the following quadratic form
The proof differs little from item 1.4.3.
Remark 1.5.5. As is apparent from the above, the problem in this form is easy enough to state, but there is no longer a uniform computing process as in the regression model. Such an approach does not lead to the use of an already available program package, as happens in regression analysis, where the same standard program can be used for computing both the LSM- and the RLSM-estimates by changing only the input data. Here, a specific program must be written to obtain the RLDM-estimates. Nevertheless, this approach is interesting in that it allows one to count the degrees of freedom and, accordingly, to use them to calculate the unknown variance of the a priori information. It also allows one to estimate the interval of solution (parameter) errors.
1.6 Confluent-Regression Analysis of Passive Experiment
In item 1.4 we solved the parameter estimation problem for the linear functional model of the first kind in the presence of random measurement errors in the matrix and in the response vector (with correlated observations), the so-called confluent model of passive experiment. In this paragraph we offer a constructive solution, by the least squares-distances method (LSDM), of the parameter estimation problem for the linear regression model and the linear confluent model of passive experiment acting jointly in an experiment, in other words, for the linear confluent-regression model of passive experiment.
Assumption 1.0.6. Given the linear functional relationship (an overdetermined simultaneous SLAE of full rank, not subject to our influence)
where $\varphi = (\varphi_1, \ldots, \varphi_n)^T$ is an unknown response vector, $\Xi = [\xi_1, \ldots, \xi_m]$ is an unknown measured matrix, $H = [\eta_1, \ldots, \eta_k]$ is a known theoretical matrix, and $\beta = (\beta_1, \ldots, \beta_m)^T$, $\delta = (\delta_1, \ldots, \delta_k)^T$ are vectors of unknown parameters (the first and the third terms of the linear functional equation (1.0) are taken, with the remaining term equal to zero).
Assumption 1.6. We consider the following well-posed linear confluent-regression stochastic model of a passive experiment
constructed for the linear functional equation (1.0.6).
Figure 1.6-1. Scheme and description of a passive experiment.
We assume that the experiment is passive, that is, events occur when E, H, p and 6 take some nonrandom values, and the researcher observes the random response vector y=cp+e=Ep+HG+e with the covariance matrix T, of the errors e and the random regressor matrix X=E+C with the covariance matrix M of the errors C. We suppose that the matrix H is known precisely (as it is supposed in the analysis of variance, for example, it can be a background vector). The structure of confluent-regression model or confluent-variance model of the passive experiment is shown in Figure 1.6-1. Remark 1.6.1. A linear confluent model (1.6.1) was first considered in [Lindley 19471 for the case of a linear functional relationship with only two parameters p and 6:
and was estimated the parameters p and S by the MLM. Parameters have been estimated with the help of derivation on these two parameters and then of solution of the SNAE. The numerical example of the estimation is shown in [Lindley 19471. However, the generalizations have not followed in view of complexity of a solution of the obtaining high order SNAE. The general algebraic formulation of the model was solved in [Mechenov 19911 and the general stochastic formulation in [Mechenov 1997, 19981. 1.6.1 Least Squares-Distances Method of the Parameter Estimate of Model of Passive Experiment Problem 1.6.1. Given are the values of the matrix H , and one realization and one realization % (rank % = m) of the random variables y and X:
We also have the covariance matrices $\Sigma$, $T$, and $M$. It is required to estimate the unknowns $\varphi$, $\Xi$ and the parameters $\beta$, $\delta$ of the linear confluent model (1.6.0) of passive experiment by the MLM. Theorem 1.6.1. The estimators $\hat\beta$, $\hat\delta$ of the parameters $\beta$, $\delta$ of the linear confluent model (1.6.0) from Problem 1.6.1 minimize the weighted quadratic form
or they are calculated from the SNAE
This system of equations is called the confluent-regression normal equation. Proof. Instead of the matrix $X$ and the vector $y$, we consider the $(mn+n)$-dimensional vector $z = (X, -y^T)^T$. To this end, we stretch the matrix $X$ by rows and augment it with the row $-y^T$; the same transformation is applied to the vectors $\zeta = (\Xi, -\varphi^T)^T$, $\tilde z = (\tilde X, -\tilde y^T)^T$ and $w = (C, -e^T)^T$. The original linear confluent-regression model (1.6.0) is rewritten as a linear regression model with linear constraints $\Gamma\zeta = -H\delta$:
where the constraint matrix $\Gamma$ is the same as in item 1.4.1. Using the MLM, we obtain, as in item 1.4.1, that Problem 1.6.1 reduces to the following two-stage minimization problem: estimate the true value of the vector $\zeta$ so that the quadratic form
is minimized subject to the constraints $\Gamma\zeta + H\delta = 0$, and then find the minimum of this quadratic form over all possible values of the parameters $\beta$ of the matrix $\Gamma$ and of the parameters $\delta$.
Consider the first stage. We use the method of undetermined Lagrange multipliers and obtain, as in item 1.4.1, that the estimator of the vector $\zeta$ is calculated from the relationship
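The first-stage computation can be sketched with the standard Lagrange-multiplier argument. The symbols $\Omega$ (covariance matrix of $\tilde z$), $\Psi = \Gamma\Omega\Gamma^T$ and the multiplier vector $\lambda$ are assumptions of this sketch; the result is offered as a plausible form of the relationship referred to here, not as a quotation of it.

```latex
\begin{aligned}
L(\zeta,\lambda) &= (\zeta-\tilde z)^T\Omega^{-1}(\zeta-\tilde z)
                    + 2\lambda^T(\Gamma\zeta + H\delta),\\
\partial L/\partial\zeta = 0 \;&\Rightarrow\; \zeta = \tilde z - \Omega\Gamma^T\lambda,\\
\Gamma\zeta + H\delta = 0 \;&\Rightarrow\; \lambda = \Psi^{-1}(\Gamma\tilde z + H\delta),
   \qquad \Psi = \Gamma\Omega\Gamma^T,\\
\hat\zeta &= \tilde z - \Omega\Gamma^T\Psi^{-1}(\Gamma\tilde z + H\delta),\\
(\hat\zeta-\tilde z)^T\Omega^{-1}(\hat\zeta-\tilde z)
  &= (\Gamma\tilde z + H\delta)^T\Psi^{-1}(\Gamma\tilde z + H\delta).
\end{aligned}
```

Because the constraint $\Gamma\zeta = -H\delta$ encodes $\varphi = \Xi\beta + H\delta$, we have $\Gamma\tilde z = \tilde X\beta - \tilde y$, so the last line depends only on $\beta$ and $\delta$; this is what makes the second stage possible.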
For any value of the parameter $\beta$, the matrix $\Gamma$ has full rank because it contains the submatrix $I$. Hence the problem of minimizing the weighted quadratic form subject to these linear constraints always has a unique solution. The variational problem can then be rewritten in a form independent of $\Xi$ and $\varphi$:
$$\hat s^2 = (\hat\zeta - \tilde z)^T \Omega^{-1} (\hat\zeta - \tilde z) = (\Gamma\tilde z + H\delta)^T (\Gamma\Omega\Gamma^T)^{-1} (\Gamma\tilde z + H\delta) = (\tilde X\beta + H\delta - \tilde y)^T \Psi^{-1} (\tilde X\beta + H\delta - \tilde y) = \tilde e^T \Psi^{-1} \tilde e, \qquad (**)$$
where $\tilde e = \tilde y - \tilde X\beta - H\delta$. In the second stage, we have to find $\hat\beta$, $\hat\delta$ that minimize (**). Differentiating (**) with respect to the parameters $\beta$, $\delta$, we obtain a complicated SNAE, which in general cannot be solved in explicit form but is solvable in simple cases. The estimators of the parameters $\beta$, $\delta$, provided that the matrix $A = [\Xi, H]$ has full rank, supply a unique solution to Problem 1.6.1.
1.6.1.1 Estimate of Mean of the Residual Sum of Squares
Now, using (*), we easily obtain the estimators $\hat\zeta = (\hat\Xi, -\hat\varphi^T)^T$ of the regressor matrix $\Xi$ and the response vector $\varphi$, respectively
Theorem 1.6.1.1. Let $\beta$, $\delta$ be the true values of the parameters. The mean of the residual sum of squares (*) is given by
The result follows from equation (**), taking into account equation (*), as in item 1.2.3.
Remark 1.6.1. The mean of the RSS $\hat s^2$ is assumed (as in 1.4.1a) to be equal to $n-m-k$ [Kendall & Stuart 1968].
The obtained result fully satisfies the correspondence principle. Indeed, when the matrix $\Xi$ is absent in Eq. (1.6.1), this estimation method reduces to the LSM [Gauss 1809, 1855, Aitken 1935], and when the matrix $H$ is absent in Eq. (1.6.1), it reduces to the LDM [Pearson 1901, Mechenov 1988]. Therefore we call this method the least squares-distances method (LSDM) [Mechenov 1991] or, when there is only one free parameter $\delta$, the least distances method.
1.6.2 Identical Correlations in Columns of Matrix and in the Vector of the Right-Hand Side
Assumption 1.6.2. We assume the following linear confluent-regression stochastic model of passive experiment
in which $[\Xi, H]$ is the full-rank matrix of the linear functional relation (1.0.6), $C_i$ is the column $i$ of the error matrix $C$, and $\delta_{ij}$ is the Kronecker delta. Thus, the normally distributed passive observations in model (1.6.2) have an identical covariance matrix $\Sigma$ in $y$ and in each column $i$ of the matrix $X$, which occurs often enough when both the response and the regressor matrix are measured by the same devices. Then the parameter estimation problem takes the following form. Problem 1.6.2. Given a single realization $\tilde y$ and $\tilde X$ (rank $\tilde X = m$) of the random variables $y$ and $X$,
the exact theoretical matrix $H$, and the covariance matrices $\Sigma$, $M = \mathrm{diag}(\Sigma,\dots,\Sigma)$, estimate the unknown elements $\varphi$, $\Xi$ and the unknown parameters $\beta$, $\delta$ of the linear confluent-regression stochastic model (1.6.2) by the MLM, that is, so that the weighted sum of squares
would be minimal under the constraint $\varphi = \Xi\beta + H\delta$. Corollary 1.6.2. The estimates $\hat\beta$, $\hat\delta$ of the parameters $\beta$, $\delta$ of the linear confluent-regression model (1.6.2) from Problem 1.6.2 minimize the quadratic form
or they are calculated from the SLAE
where $\hat s^2$ is the minimum value of the RSS $s^2$. The proof is similar to item 1.4.2.
1.6.3 Equally Weighed Measurements of the Matrix and of the Vector
We consider the most popular case, when the elements of the matrix $X$ and of the response vector $y$ are measured with the same variance. Assumption 1.6.3. We assume the following full-rank linear confluent-regression stochastic model of passive experiment, which uses the functional relation (1.0.6)
Thus, the normally distributed passive observations in model (1.6.3) have an equal variance $\sigma^2$ in $y$ and an equal variance $\mu^2$ in $X$. Then the parameter estimation problem takes the following form. Problem 1.6.3. Given a single realization $\tilde y$ and $\tilde X$ (rank $\tilde X = m$) of the random variables $y$ and $X$,
the exact theoretical matrix $H$, and the variances $\sigma^2$, $\mu^2$, estimate the unknown elements $\varphi$, $\Xi$ and the unknown parameters $\beta$, $\delta$ of the linear confluent-regression stochastic model (1.6.3) by the MLM, that is, by minimizing the RSS
subject to the linear constraints $\varphi = \Xi\beta + H\delta$. Corollary 1.6.3. The estimates $\hat\beta$, $\hat\delta$ of the parameters $\beta$, $\delta$ of the linear confluent-regression model (1.6.3) from Problem 1.6.3 minimize the quadratic form
or they are calculated from the SLAE
where $\hat s^2$ is the minimum value of the RSS $s^2$. The proof repeats the proof of Corollary 1.4.3.
1.6.3.1 Estimate of the Variance
Knowing the ratio of variances $\kappa = \sigma^2/\mu^2$, the approximate estimate of the variance $\sigma^2$ can be computed (as in 1.4.1.1 and 1.4.3.6) from the formula
n-m-k
n-m-k
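For a single regressor and a free term, the equally weighted case can be illustrated with a closed-form fit. The sketch below assumes that the quadratic form of Corollary 1.6.3 reduces in this scalar case to $\sum_i(\beta x_i + \delta - y_i)^2/(\sigma^2 + \mu^2\beta^2)$; the function name `ldm_line` is hypothetical, and the slope formula is the well-known errors-in-variables (Deming) solution for a given variance ratio, not a formula taken from this text.

```python
import math

def ldm_line(x, y, sigma2, mu2):
    """Fit y ~ beta*x + delta when x has error variance mu2 and y has sigma2.

    Minimizes sum((beta*x_i + delta - y_i)**2) / (sigma2 + mu2*beta**2).
    Returns (beta, delta, s2), where s2 is the minimized quadratic form.
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    if sxy == 0:
        raise ValueError("sxy = 0: the slope is not determined by this formula")
    kappa = sigma2 / mu2  # variance ratio sigma^2 / mu^2
    a = syy - kappa * sxx
    # Root of sxy*b^2 + (kappa*sxx - syy)*b - kappa*sxy = 0 with the sign of sxy.
    beta = (a + math.sqrt(a * a + 4.0 * kappa * sxy * sxy)) / (2.0 * sxy)
    delta = ybar - beta * xbar  # the free term follows from the centroid
    s2 = (beta * beta * sxx - 2.0 * beta * sxy + syy) / (sigma2 + mu2 * beta * beta)
    return beta, delta, s2

# Points lying exactly on y = 2x + 1 are recovered for any variance ratio.
print(ldm_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0], 1.0, 1.0))
```

As the ratio $\sigma^2/\mu^2$ grows, the slope tends to the ordinary least-squares slope, which mirrors the correspondence principle stated above.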
1.6.3.2 Example of Construction of the Estimate
Let us process the results of the example using the confluent model with free background. As an example, in Figure 1.6-2 we construct the line of simple regression $\varphi = \delta_1 + \delta_2\eta$ and the line of simple confluence $\varphi = \beta\xi + \delta$. On the left, the four measurements are approximated by the LSM, which yields the line $\hat y = 1.5$. On the right, the same four measurements are approximated by the LDM, which yields the line $\hat y = \hat x$ with slope 1 (the RSS equals 1 in both cases). In both cases, we have a symmetric estimation of the input data.
Figure 1.6-2. Comparison of the LSM-estimate (a) and the LDM-estimate (b).
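The two residual sums quoted for this figure can be checked on a small data set. The four points below (the corners of a unit square) are an assumption chosen to be consistent with the lines $\hat y = 1.5$ and $\hat y = \hat x$; the figure's actual coordinates are not listed in the text, and equal error variances in both variables are assumed so that the LDM criterion is the orthogonal distance.

```python
# Hypothetical data: the four corners of a unit square (an assumption; the
# figure's actual points are not given in the text).
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0)]

def rss_vertical(points, slope, intercept):
    """LSM criterion: sum of squared vertical residuals."""
    return sum((slope * x + intercept - y) ** 2 for x, y in points)

def rss_orthogonal(points, slope, intercept):
    """LDM criterion with equal variances: sum of squared orthogonal distances."""
    return rss_vertical(points, slope, intercept) / (1.0 + slope ** 2)

rss_lsm = rss_vertical(points, 0.0, 1.5)    # horizontal line y = 1.5
rss_ldm = rss_orthogonal(points, 1.0, 0.0)  # diagonal line y = x
print(rss_lsm, rss_ldm)  # prints: 1.0 1.0
```

For this particular symmetric configuration the orthogonal criterion equals 1 for every line through the centroid, so the diagonal is singled out here by the symmetry of the data rather than by the criterion alone.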
1.6.4 Progress Toward the Ideal in Inherited Stature
Consider the statistically based approach to solving the problem of estimating the linear dependence between the height of children and the average height of their parents. As we know from the mass media and see from our own observations, our children accelerate: on average, they are taller and cleverer than we are. However, from the results of observations, Galton [Galton 1885, 1889] concluded that there is no acceleration, but rather a regression of the height of children on the height of parents. In particular, this gave its name to an entire area of statistics: regression analysis. The initial data are shown in Table 1.6.4 (Galton took these data from the Record of Family Faculties [Forrest 1974], and the author has taken them from [Forrest 1974]).
Table 1.6.4. The relation between the height of children and the mean height of their parents. The left-hand column gives the mean height (designated according to the functional relation as $x$ or $\xi$, in inches) of the parents; the rows give the number of children with the height shown at the top ($y$, in inches).
From the LSM we have the following estimate of the parameters of the linear regression model:
Galton did not use the LSM, but inferred the regression of the children's height on the mean height of the parents from the averaged maximum values of the columns. Since the measurements of human height contain random errors, pure regression analysis is not applicable to these data. Accordingly, this is a problem of processing the results of a passive experiment. Here it is preferable to seek a functional relation between the heights of the form $\varphi = \xi\beta + \delta$, and in describing the model one must take account of the errors $(e_i; c_i)$, $i = 1,\dots,n$, which have different random causes but are of the same homoscedastic type for all pairs $(y_i; x_i) = (\xi_i\beta + \delta; \xi_i) + (e_i; c_i)$, $i = 1,\dots,n$, leading to the following model:
in which we assume that the variance of the height of people is the same and equal to $\sigma^2$.
However, the data presuppose that the height of the parents is taken as the half-sum of the heights of the father and the mother, moreover rounded to the inch, and hence contains a measurement error. Assuming that the variances of the heights of the father, mother and children are identical and equal to $\sigma^2$, it is easy to calculate the variance $\mu^2$ of the half-sum of the heights of the father and the mother [Borovkov 1986]: for independent heights it is $(\sigma^2 + \sigma^2)/4$, that is, half of $\sigma^2$, $\mu^2 = \sigma^2/2$ (the estimate of the ratio of variances from the input data gives the value 2.2446). The excess over two can be explained by the increase in the height of children and in the scatter of their heights. Estimating by the LSDM, that is, applying the confluent-regression analysis of passive experiment, we have the following result
The input data (at the intersection of row 69 and column 70) probably contain either a misprint or a gross error, i.e., instead of the value 2 one should read 12. This has practically no influence on either the LSM or the LDM estimates. The results of the calculations are presented in Figures 1.6-3 and 1.6-4.
Figure 1.6-3. The graph of the residual sum of squares for confluent-regression model.
From the observational results Galton concluded that there was no progression, but rather a regression of the height of children to the height of the parents, and this, in particular, gave a name to this entire area of statistics: regression analysis. Galton argued (see [Galton 1869, 1883, 1889, 1907]; [Galton 1885] is entitled "Regression towards mediocrity...") that tall people degenerate in the third generation with a coefficient of about 1/3 (see the least-squares estimate).
Figure 1.6-4. The data of Table 1.6.4 (data), the LSM (reg) and the LDM (r*c) estimates of the heights of children as a function of the mean height of parents. The line y=x is shown for comparison.
But where, in fact, do they (the great in both height and mind) come from? There is also a bias, that is, tall people must come from somewhere, and by the minimum-distance estimate taller people occur 1.6 times as often among tall people; conversely, short people tend to get even shorter (the bias depends on that fact). Therefore, by analogy and as opposed to regression, this analysis of passive experiment may be called "progression" analysis.
1.6.5 Homoscedastic Measurements of the Matrix and of the Response Vector
Let us consider the main Assumption 1.6.1 as applied to the typical case where all the errors in the columns of the confluent matrix $X$ and in the response vector $y$ are homoscedastic. Assumption 1.6.5. We assume the following linear stochastic model of passive experiment
in which $[\Xi, H]$ is the full-rank matrix of the linear functional equation (1.0.6) and $\delta_{ij}$ is the Kronecker delta. Thus, the normally distributed passive observations have an equal variance $\sigma^2$ in $y$ and equal variances $\mu_i^2$ in each column $i$ of the matrix $X$. Then the parameter estimation Problem 1.6.1 takes the following form. Problem 1.6.5. Given a single realization $\tilde y$ and $\tilde X$ (rank $\tilde X = m$) of the random variables $y$ and $X$,
the exact theoretical matrix $H$, and the variances $\sigma^2$, $\mu_i^2$, $i = 1,\dots,m$, estimate the unknown elements $\varphi$, $\Xi$ and the unknown parameters $\beta$, $\delta$ of the linear stochastic model (1.6.5) by the MLM.
Given that the observation errors are independent of the parameters, the parameters should be estimated so as to minimize the squared distance of the observations from the sought values
subject to the linear constraints $\varphi = \Xi\beta + H\delta$. Corollary 1.6.5. The estimates $\hat\beta$, $\hat\delta$ of the parameters $\beta$, $\delta$ of the linear model (1.6.5) from Problem 1.6.5 minimize the quadratic form
or are calculated from the following SLAE
where $\hat s^2$ is the minimum value of the RSS $s^2$. The corollary is a particular case of Theorem 1.6.1, which was proved for a more general covariance matrix.
1.6.5.1 Example of Construction of the Estimate
As a corollary, we consider estimation for the two-parameter linear stochastic confluent model with variances that are equal for each variable and homoscedastic within each variable:
The model includes one free (regression) parameter. It follows from Corollary 1.6.5 that the estimates $\hat\beta_1$, $\hat\beta_2$, $\hat\delta$ of the parameters $\beta_1$, $\beta_2$, $\delta$ of the linear model (1.6.5a) from Problem 1.6.5 minimize the quadratic form
$$s^2 = \frac{|\tilde x_1\beta_1 + \tilde x_2\beta_2 + \delta - \tilde y|^2}{\sigma^2 + \mu_1^2\beta_1^2 + \mu_2^2\beta_2^2}$$
and are obtained as the solution of the following SLAE:
where $\hat s^2$ is the minimum value of the RSS $s^2$. This SLAE obviously has to be solved by an iterative process.
Consider a regression example from [Weiss 1995], where the price of a second-hand Nissan car is estimated as a function of the model year ("age") $\xi_1$ and the distance traveled $\xi_2$. The input data are given in Table 1.6.5.
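The iterative process mentioned for this SLAE can be sketched for one measured regressor and one free term. The normal equations used below, $(\tilde x^T\tilde x - \hat s^2\mu^2)\beta + (\tilde x^T 1)\delta = \tilde x^T\tilde y$ and $(1^T\tilde x)\beta + n\delta = 1^T\tilde y$, are an assumption obtained by differentiating the scalar quadratic form $\sum_i(\beta x_i + \delta - y_i)^2/(\sigma^2 + \mu^2\beta^2)$; the function names are hypothetical and no convergence guarantee is claimed.

```python
def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det

def ldm_iterate(x, y, sigma2, mu2, iters=50):
    """Fixed-point iteration: solve the linear system for a frozen s2, then update s2."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    s2 = 0.0  # s2 = 0 makes the first step the ordinary least-squares solution
    beta, delta = 0.0, 0.0
    for _ in range(iters):
        # Normal equations with the current value of s2 held fixed.
        beta, delta = solve_2x2(sxx - s2 * mu2, sx, sx, float(n), sxy, sy)
        rss = sum((beta * xi + delta - yi) ** 2 for xi, yi in zip(x, y))
        s2 = rss / (sigma2 + mu2 * beta * beta)  # updated minimum-RSS estimate
    return beta, delta, s2

# Hypothetical data, used only to exercise the iteration.
print(ldm_iterate([0.0, 1.0, 2.0, 3.0], [0.1, 0.9, 2.2, 2.8], 1.0, 1.0))
```

Starting from $\hat s^2 = 0$ keeps the first iterate equal to the LSM estimate, so the later iterates show how far the error in the regressor moves the slope.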
First we run a simple regression on age [Mechenov 20001. For a simple linear regression model we obtain the LSM-estimate
To compute the LDM-estimate, we assume that the car age is uniformly distributed throughout the year. We thus obtain for its variance [Borovkov 19841
We further assume that for 11 observations the uniform distribution is adequately approximated by the normal distribution. We use the variance of the LSM price estimate as the starting value for the iterative calculation of the variance of the price $y$. Iterating until the RSS reaches the expected number of degrees of freedom, $11-2=9$, we easily choose the variance ratio for the input data. We thus obtain the LDM-estimate
The LDM produces a more accurate estimate of the value of a two-year-old car than the rough LSM estimation (whose deficiencies are described in [Weiss 1995]). The prices calculated using the LDM estimates perfectly match the observations. It follows from the model that the price of a new car is $20,000 and its useful life is about 10 years. Figure 1.6-5 plots the input data and both estimates.
Figure 1.6-5. One-parameter model (panel title: regression & confluence; horizontal axis: years). Squares show the input data, disks the LSM-estimates, diamonds the LDM-estimates.
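The selection of the variance ratio by matching the RSS to its expected number of degrees of freedom can be sketched as a one-dimensional search. The sketch assumes the scalar quadratic form $\sum_i(\beta x_i + \delta - y_i)^2/(\sigma^2 + \mu^2\beta^2)$ with $\mu^2$ known and $\sigma^2$ unknown; the helper names and the data are hypothetical (they are not the data of Table 1.6.5), and bisection is used here only as one simple way to carry out the iteration.

```python
import math

def ldm_fit(x, y, sigma2, mu2):
    """Closed-form minimizer of sum((b*x+d-y)^2)/(sigma2+mu2*b^2); requires sxy != 0."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    kappa = sigma2 / mu2
    a = syy - kappa * sxx
    beta = (a + math.sqrt(a * a + 4.0 * kappa * sxy * sxy)) / (2.0 * sxy)
    delta = ybar - beta * xbar
    s2 = (beta * beta * sxx - 2.0 * beta * sxy + syy) / (sigma2 + mu2 * beta * beta)
    return beta, delta, s2

def match_dof(x, y, mu2, dof, lo=1e-9, hi=1e9, iters=200):
    """Bisect on a log scale for sigma2 such that the minimized RSS equals dof."""
    if not (ldm_fit(x, y, lo, mu2)[2] > dof > ldm_fit(x, y, hi, mu2)[2]):
        raise ValueError("dof is not bracketed: no suitable sigma2 in [lo, hi]")
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if ldm_fit(x, y, mid, mu2)[2] > dof:
            lo = mid  # RSS too large: the response variance must be larger
        else:
            hi = mid
    return math.sqrt(lo * hi)

# Hypothetical ages (years) and prices (thousand dollars), for illustration only.
ages = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
prices = [16.0, 12.0, 12.0, 8.0, 8.0, 4.0]
mu2 = 1.0 / 12.0  # variance of a uniform distribution over one year
sigma2 = match_dof(ages, prices, mu2, dof=len(ages) - 2)
beta, delta, s2 = ldm_fit(ages, prices, sigma2, mu2)
print(sigma2, beta, delta, s2)
```

The search relies on the minimized RSS decreasing as $\sigma^2$ grows; when the scatter of the data is too small to be explained even with $\sigma^2 \to 0$, no root exists and the function reports it rather than returning a value.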
Let us now consider the regression on the distance traveled. For a simple linear model we have the LSM-estimate
and the LDM-estimate (the observation error is also assumed uniformly distributed)
Here the LDM estimation procedure is virtually identical with the LSM calculations, because the theoretical variance of the distance traveled is very small, which is not entirely consistent with observations. Indeed, the odometer readings are directly related to the wheel diameter, and a change of 1 cm in the wheel radius due to tire wear leads on average to an error of 4%-5% in distance measurement. Hence on average $\mu_2^2 \approx 5$. Let us recalculate the estimate with this variance. Here also the variance ratio of the observations is easily calculated from the degrees of freedom:
This change of variance does not have a strong impact on the estimate, because in fact the variance should be much smaller for large odometer readings, and this approach requires calculations with heteroscedastic variance, which fall outside our scope. Figure 1.6-6 plots the observations and both estimates.
Figure 1.6-6. One-parameter model. Dark squares are the input data, diamonds the LDM-estimates, disks the LSM-estimates.
Finally, let us consider regression on both variables. For the two-parameter linear model we obtain the LSM-estimate
and the LDM-estimate
Both the LSM and the LDM estimates fit the observations. Figure 1.6-7 plots the observations and the two estimates. We see that the six middle points (which provide the most reliable description of the observations) are closer to the LDM estimates. Since age and mileage are strongly correlated (in principle, they cannot be efficiently used together in the same model), the rank of the matrix is sensitive to the specification of the variance. The LDM-estimate rapidly loses its stability when the variances of the independent variables become equal. The variance ratio of the observations is easily calculated from the degrees of freedom. The model is in fact nonlinear in these parameters (because the price does not drop to zero after 9-10 years), and a more detailed analysis is required.
Figure 1.6-7. Two-parameter linear model projected onto the price-distance plane. Squares show the input data, diamonds the LDM-estimates, disks the LSM-estimates.
1.6.5.2 Example of Extrapolation of the Input Data
Let us consider an example. In Figure 1.6-8, the black squares denote input data whose errors are identical in both variables. The outline of the cloud of input data resembles the barrel of a "gun" firing at an angle of 45 degrees. The estimates of the input data (which are symmetric about the line y=x) are computed by the LSM (the disks connected by a line) and by the LDM (the diamonds connected by a line). The estimates obtained are written at the top of Figure 1.6-8. Thus, the LDM-estimate preserves the symmetry present in the input data, and the prediction follows the axis of symmetry. In this case, the LSM estimate does not lead to a similarly symmetric result. It follows that for the value x=8 the LDM-extrapolation (the "shot from the gun") is equal to 8, while the LSM-extrapolation is equal to 6.96 and falls outside the limits of the band $y \in [x-1, x+1]$.
Figure 1.6-8. The estimates of the input data: regression y=0.4+0.82x and confluence y=x. Squares show the input data, diamonds the LDM-estimates, disks the LSM-estimates.
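The extrapolation claim can be checked by hand from the two fitted lines quoted with the figure, y=0.4+0.82x for the regression and y=x for the confluence:

```latex
\begin{aligned}
\text{LSM: } & \hat y(8) = 0.4 + 0.82\cdot 8 = 6.96 < 7 = x - 1,\\
\text{LDM: } & \hat y(8) = 8 \in [x-1,\ x+1] = [7,\ 9].
\end{aligned}
```

So at x=8 the regression line has already left the band of width 1 around the axis of symmetry, while the confluence line stays on that axis.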
Remark 1.6.5. One can further consider the confluent-regression analysis of passive experiment subject to linear constraints, the multivariate confluent-regression analysis of passive experiment, and nonlinear models. Since in practice the estimates of nonlinear parameters are obtained by means of linearization, only technical difficulties arise. We now consider the confluent-regression incomplete-rank models of passive experiment.
1.7 STABLE ESTIMATION OF NORMAL PARAMETERS OF THE DEGENERATED CONFLUENT-REGRESSION MODEL
We consider the problem of estimating the normal vector stably with respect to input data errors in cases when the matrix $[\Xi, H]$ has incomplete rank. Assumption 1.0.7. We assume the following linear functional equation of the first kind of incomplete rank
Definition 1.7.0. The vector $(\beta_0, \delta_0)$ is called a normal solution of the SLAE (1.0.7) with the degenerate matrix $[\Xi, H]$ when
where $H$, $\Xi$ are the known exact matrices and $\varphi$ is the known exact vector.
1.7.1 Regularized Least Squares-Distances Method
Assumption 1.7.1. We use the linear confluent-regression stochastic models of passive experiment
for the linear functional equation (1.0.7) of incomplete rank. Instead of the matrix $X$ and the vector $y$ we consider the $(mn+n)$-dimensional vector $z = (X, -y^T)^T$. For this purpose, we stretch the matrix $X$ by rows and augment it with the row $-y^T$. The same operation is applied to the vectors $\zeta = (\Xi, -\varphi^T)^T$, $\tilde z = (\tilde X, -\tilde y^T)^T$ and $w = (C, -e^T)^T$. Then the equation (1.7.0) is rewritten as a nondegenerate linear regression model with linear constraints:
where the matrix of constraints $\Gamma$ ($n \times (nm+n)$) has the same form as in Eq. (1.5.1a). In the given model the matrix $I$ is nondegenerate, and the constraint matrix has full rank regardless of the rank of the initial model matrix. Therefore the first stage, the estimation of the vector $\zeta$, is always well posed. Problem 1.7.1a. Assume given the approximate vector $\tilde z$, its covariance matrix $\Omega$ and the constraint matrix $\Gamma$. It is required to estimate the vector of parameters $\zeta$ of the model (1.7.1) by the MLM. Theorem 1.7.1a. The solution of Problem 1.7.1a exists, is unique and minimizes the quadratic form
$$\hat s^2 = \min_{\zeta:\ \Gamma\zeta + H\delta = 0} |\tilde z - \zeta|^2 = (\tilde X\beta + H\delta - \tilde y)^T \Psi^{-1} (\tilde X\beta + H\delta - \tilde y) = s_\Psi^2.$$
The proof practically coincides with 1.6.1. By Theorem 1.6.1.1, the mean of this quadratic form is equal to $n$. Now it is possible to construct the set of admissible pseudosolutions. Definition 1.7.1. Let
be the set of admissible pseudosolutions. Problem 1.7.1b. Assume given the measured vector $\tilde y$, the measured full-rank matrix $\tilde X$, the exact matrices $\Sigma$, $T$, $M$ and the fact that the matrix $[\Xi, H]$ of the SLAE (1.0.7) has incomplete rank. It is required to find the estimate of the normal vector that is stable against errors of the input data:
$$(b_\Psi, d_\Psi) = \arg\min_{(b,d)\in B} \left(|b|^2 + |d|^2\right).$$
Theorem 1.7.1b. The solution of Problem 1.7.1b exists, is unique and satisfies the relation
$$(b_\Psi, d_\Psi) = \arg\min_{\beta\in R^m,\ \delta\in R^k:\ (\tilde X\beta + H\delta - \tilde y)^T\Psi^{-1}(\tilde X\beta + H\delta - \tilde y) = n} \left(|\beta|^2 + |\delta|^2\right).$$
Proof. It is enough to find the minimum of the squared norm of the parameters over all possible values of the parameters $\beta$, $\delta$ under the constraint $s_\Psi^2 \le n$. It is known [Tikhonov & Arsenin 1979, Shikin 1985] that the minimum is attained on the boundary of the set, i.e., when $s_\Psi^2 = n$. Using the relation (*) of item 1.6.1, it is easy to compute the estimates of the expectation of the regressor matrix $\Xi$ and of the response vector $\varphi$, respectively. Remark 1.7.1. When the matrix $\Xi$ is absent in the SLAE (1.0.7), this method of stable estimation of the normal vector reduces to the regularized method of least squares, and when there is no matrix $H$, it reduces to the regularized method of least distances [Mechenov 1988]. We call it the regularized least squares-distances method (RLSDM).
1.7.2 Degenerated Model with Homoscedastic Errors of the Matrix and the Response
Assumption 1.7.2. We use the following linear confluent stochastic model of passive experiment with an incomplete-rank matrix of the functional relation (1.0.7)
Problem 1.7.2a. Knowing the approximate vector $\tilde z$, its covariance matrix $\Omega$ and the constraint matrix $\Gamma$ of the model (1.7.2a)
estimate the vector of parameters $\zeta$ by the MLM. Theorem 1.7.2a. The solution of Problem 1.7.2a exists, is unique and minimizes the quadratic form
$$\hat s^2 = \min_{\zeta:\ \Gamma\zeta + H\delta = 0} |\tilde z - \zeta|^2$$
The proof practically coincides with 1.6.3. Then Problem 1.7.1b takes the following form. Problem 1.7.2b. Assume given the measured vector $\tilde y$, the measured full-rank matrix $\tilde X$, the variances $\sigma^2$, $\mu^2$, and the fact that the matrix $[\Xi, H]$ of the SLAE (1.0.7) has incomplete rank. It is required to calculate the estimate, stable against errors of the input data, of the normal vector of the linear confluent stochastic model (1.7.2)
Theorem 1.7.2b. The solution of Problem 1.7.2b exists, is unique and satisfies the relation
and the SLAE
where the regularization parameter is positive. Proof. Since the minimum is attained on the boundary of the set $B$, the relation (1.7.2b) is obvious. To construct the estimation method we apply the method of undetermined Lagrange multipliers. For this purpose we multiply the constraint $\frac{|\tilde X\beta + H\delta - \tilde y|^2}{\sigma^2 + \mu^2|\beta|^2} - n = 0$ by a multiplier $\lambda$, add the product to the minimized quadratic form, and as a result arrive at the Lagrangian minimization
Taking the variation of this functional with respect to the unknowns and the Lagrange multiplier, we obtain the SNAE (1.7.2c). Using the relation (*) from item 1.6.3, it is easy to calculate the estimates of the expectation of the regressor matrix $\Xi$ and of the response vector $\varphi$, respectively. We treat the case of deterministic errors separately, as it is of independent interest.
1.7.3 Normal Solution of the SLAE Measured with Nonrandom Errors
We show the calculations for those cases when the matrix and the right-hand side have deterministic errors. In this case, the results differ slightly from the case of random errors, since the norms of the RSS and of the errors of the right-hand side and of the matrix are different. Assumption 1.7.3. We start with the following linear model of incomplete rank for the linear functional equation (1.0.7), or otherwise a SLAE (1.0.7) with response errors
Let
be the set of admissible pseudosolutions. Problem 1.7.3. Assume given the approximate vector $\tilde z = (\tilde X, -\tilde y^T)^T$ containing the full-rank matrix $\tilde X$, the bound on the norm of the measurement errors $|\tilde z - \zeta|$, the exact matrix $H$ and the fact that the matrix $[\Xi, H]$ of the SLAE (1.7.3) is degenerate. It is required to calculate the unknown values of the approximation vector to the normal vector so that the squared norm of this approximation is minimal on the set of admissible pseudosolutions $B$
Corollary 1.7.3. The solution of Problem 1.7.3 exists, is unique, satisfies the relation
and is calculated from the SNAE
$$\left(\tilde X^T\tilde X - \frac{|\tilde X b_\alpha + H d_\alpha - \tilde y|^2}{1 + |b_\alpha|^2}\, I + \alpha I\right) b_\alpha + \tilde X^T H d_\alpha = \tilde X^T \tilde y,$$
where the positive value of the regularization parameter is calculated from the relation
or the solution is calculated as the limit as $\alpha \to 0$ when this relation is not satisfied. Theorem 1.7.3. The obtained approximation is stable with respect to the normal solution, that is, for any $\varepsilon > 0$ there is a level of the input data error, depending on $\varepsilon$, such that the approximation vector to the normal vector of the SLAE (1.0.7) deviates from the normal solution by no more than $\varepsilon$
The proof is similar to [Tikhonov & Ufimtsev 19881.
1.7.4 Quasisolutions
In the same way we generalize the quasisolution concept [Ivanov 1962]. Definition 1.7.4. The vector $(\beta_M^T, \delta_M^T)^T$ is called a quasisolution of the equation (1.0.7) on the set $M$ if
$$(\beta_M^T, \delta_M^T)^T = \arg\min_{(\beta^T,\delta^T)^T \in M} |\Xi\beta + H\delta - \varphi|^2.$$
We consider as the set $M$ the full sphere
Problem 1.7.4. Given the approximate vector $\tilde z$, the exact matrix $H$, and the value of $g > 0$, calculate the quasisolution vector $(b_M^T, d_M^T)^T$ as the argument minimizing the RSS on the set $M$ of admissible quasisolutions
$$\min_{\beta,\delta:\ (\beta^T,\delta^T)^T \in M}\ \min_{\zeta:\ \Gamma\zeta + H\delta = 0} |\tilde z - \zeta|^2.$$
The solution of this problem is similar to the above-described procedure and only involves a change of the equation for the Lagrange multiplier. Theorem 1.7.4. The solution of Problem 1.7.4 exists, is unique, is the argument of the minimum of the quadratic form
and satisfies the SNAE
(the quasisolution is calculated as the limit as $\lambda \to 0$ when the last relation is not satisfied).
Remark 1.7.4. 1) A completely algebraic approach to all these problems is explained in [Mechenov 1991]. 2) A few words about the technique of computing the estimates. Clearly, direct minimization of the sum of squares is preferable in all cases of heteroscedasticity. For homoscedastic data, however, repeated solution of the SLAE is preferable, especially in view of the application of the regularization method [Voevodin 1969] (see item 3.5.3). 3) For nonlinear problems, in the most general case, the methods of deriving the estimates carry over only with difficulty. It is necessary to postulate the formulas for parameter estimation.
1.7.5 Inference of Chapter 1
The answer to the question "How to calculate a solution (the parameters) in the case of a partly or completely inaccurately measured matrix and a measured right-hand side?" is the following. For square matrices in which some of the columns are measured with an error, the classical Cramer rule of solution estimation remains valid and coincides with the rule of pseudosolution estimation. For overdetermined matrices in which some (or all) of the columns are measured with an error, it is necessary to apply the methods of pseudosolution estimation explained in this chapter, since the use of the traditional Gauss transformation is wrong! The basic result of this chapter is the solution of the basic problem of the confluent, confluent-dispersion and confluent-regression analysis of passive experiment, i.e., the problem of unknown parameter estimation. The problem of stable estimation of the normal parameters of the degenerate confluent and confluent-regression models is also solved.
The following chapter is devoted to the case when the matrix of a functional relation is prescribed (or, more precisely, controlled) and to various more complicated cases.
Chapter 2 SYSTEMS OF LINEAR ALGEBRAIC EQUATIONS
Abstract In the second chapter, the models of the passive-active-regression experiment are constructed. This completes the description of experimental investigations within the framework of confluent-influent-regression models. It allows the researcher to understand the structure of the investigation better and to estimate the parameters correctly. A method for the effective correction of rounding errors is also constructed for the numerical solution of the SLAE and for the numerical computation of the parameter estimates. Regularized estimation methods for the case of incomplete-rank matrices are developed.
2. ANALYSIS OF ACTIVE AND COMPLICATED EXPERIMENTS
When experiments are carried out in practice, especially in physical and engineering investigations, the problem most often encountered is that of estimating the unknown parameters of a linear model with an imprecisely controlled matrix (the so-called predictor matrix). The researchers prescribe the exact predictor matrix based on the equipment, but random errors accumulate inside the equipment. We describe this situation as follows. Assumption 2.0. Assume the following linear functional relationship (the functional equation of the first kind)
where $\Phi = [\phi_1,\dots,\phi_p]$ is a known matrix, $\theta = (\theta_1,\dots,\theta_p)^T$ are unknown parameters, and $\varphi = (\varphi_1,\dots,\varphi_n)^T$ is an unknown response vector. We assume the structural relationship corresponding to Eq. (2.0.0)
where $F = [f_1,\dots,f_p]$ is a random matrix, a realization of $\Phi$, $J = [j_1,\dots,j_p]$ is its error and $\iota = (\iota_1,\dots,\iota_n)^T$ is an unknown response vector. That is, the structural relationship is a functional relationship with the matrix containing an additive random error. In this chapter the author offers a constructive solution of the more complicated models, that is, those containing both an actively controlled part of the matrix and passively measured and theoretically known parts. To begin with, we consider the simplest case of the structural relationship (2.0.1).
2.1 Analysis of Active Experiment
Assumption 2.1. We shall use the linear stochastic model of active experiment of full rank, which is based on the structural relation (2.0.1)
Such an experiment is conducted actively, that is, the researcher prescribes the exact predictor matrix $\Phi$ based on the equipment, but random errors $J$ accumulate inside the equipment, and the events being studied occur with certain random (unknown) values $F = \Phi + J$ and unknown parameters $\theta$. Thus, errors are introduced first of all by an inaccuracy in the realization of the controlled predictor values. The researchers have at their disposal a random response vector somewhat different from that of regression analysis
that is further corrupted by measurement errors $e$, in which the complete error vector $u = J\theta + e$ also depends on the unknown parameters (and the exact predictor matrix $\Phi$). Such a model is also referred to as the model generated by structural relations of the form $\iota = F\theta$. In [Kendall & Stuart 1967-69], this is called a structural relationship rather than a functional relationship, since $F$ is a random matrix on which the random response measurement error $e$ is imposed. We shall assume that there are no correlations between the controlled and measured quantities (hence $\mathrm{E}\,eJ = 0$). Such an assumption can be justified by the fact that their errors arise at different times (the error of the realization of the matrix $\Phi$ precedes the error of the registration of the right-hand side $\iota$). They also arise in different places (they are attached to objects of different types), but it is probably impossible to exclude these correlations completely. The scheme of such an experiment is shown in Figure 2.1-1.
Figure 2.1-1. Scheme and description of the model of an active experiment (prescribed predictors, measurements). Here all initial predictors $\phi_{ij}$ are set nonrandom, and all randomness in the model appears due to the errors $j_{ij}$ of the instrument realization of the controlled predictor matrix and due to the additional errors $e$ of the response measurements, which are superimposed on the right-hand side $\zeta = \Phi\theta + J\theta$.
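The error structure of the active experiment can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes independent normal realization errors with variance `mu2` and measurement errors with variance `sigma2` (the homoscedastic special case), and the function name and parameter values are hypothetical. It estimates by simulation the variance of one component of the total error $u = J\theta + e$ and compares it with $\sigma^2 + \mu^2|\theta|^2$, which is why the total error depends on the unknown parameters.

```python
import random

def simulate_total_error_variance(theta, sigma2, mu2, trials=20000, seed=1):
    """Monte Carlo estimate of Var(u) for one row, where u = j^T theta + e."""
    rng = random.Random(seed)
    sigma, mu = sigma2 ** 0.5, mu2 ** 0.5
    samples = []
    for _ in range(trials):
        # realization errors of the prescribed predictors in one row of J
        j_row = [rng.gauss(0.0, mu) for _ in theta]
        # measurement error of the response
        e = rng.gauss(0.0, sigma)
        samples.append(sum(j * t for j, t in zip(j_row, theta)) + e)
    mean = sum(samples) / trials
    return sum((s - mean) ** 2 for s in samples) / (trials - 1)

# Hypothetical values chosen only for illustration.
theta = [2.0, -1.0]
sigma2, mu2 = 0.04, 0.01
empirical = simulate_total_error_variance(theta, sigma2, mu2)
theoretical = sigma2 + mu2 * sum(t * t for t in theta)
print(empirical, theoretical)
```

The two printed numbers should agree up to sampling noise; changing `theta` changes the variance of the total error, which is the reason ordinary least squares is not appropriate here.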
Thus, the experiment is realized under conditions that differ slightly from those theoretically desired and, accordingly, prescribed. Hence the response that is recorded is not the traditional $y = \Phi\theta + e$ assumed in regression analysis. The full error $u = J\theta + e$ depends on the unknown parameter $\theta$ and on the errors of the matrix realization; that is, the response is correlated with the matrix errors. Neither the LSM nor the LDM can be applied to estimate the unknown parameter $\theta$. We shall call such a model, more correctly, influent (that is, acting, influencing): a model that studies the results of direct action rather than the results of passive contemplation. In such a model, in contrast to the confluent model of passive experiment, no column of the matrix can be interchanged with the right-hand side.
The problem of various linear models of multiple analysis of passive and active experiments with homoscedastic errors $e$ and $J$ was first considered in [Berkson 1950], where the incorrect conclusion is drawn that the parameters of such a model can be estimated by the LSM. In [Durbin 1954] the behavior of the estimation errors was studied; Fuller [Fuller 1980] studied the properties of these estimators. Fedorov studied the model of multiple analysis of active experiment in the linear [Fedorov 1968, 1971, 1978] and nonlinear [Ajvazjan, etc. 1985] cases with homoscedastic measurement errors, and the case of heteroscedastic errors is treated in [Mechenov 1988]. Problems of a similar type have been posed by many authors. Vuchkov has named such an experiment planned (see [Vuchkov, etc. 1987] for the literature). Frequently the problem of passive experiment is replaced by the problem of active experiment [Islamov 1988, 1989]. Algebraically (in contrast to the confluent analysis of passive experiment) such a problem cannot be solved because of the correlations between the complete right-hand side errors and the realization errors of the predictor matrix. Therefore it is necessary to apply statistical methods of parameter estimation.
2.1.1 Maximum Likelihood Method of the Parameter Estimation of Linear Model of Active Experiment
In what follows, we shall assume that the errors are normally distributed and we shall study the estimation of the unknown parameters by the MLM. We write out the likelihood function:
$$L(\theta) = (2\pi)^{-n/2} (\det Euu^T)^{-1/2} \exp\left(-\tfrac{1}{2}(\eta - \Phi\theta)^T (Euu^T)^{-1} (\eta - \Phi\theta)\right). \tag{2.1.1}$$
In calculations we use its negative double logarithm, which is more convenient for the further reasoning,
$$-2\ln L(\theta) = (\eta - \Phi\theta)^T (Euu^T)^{-1} (\eta - \Phi\theta) + \ln\det(Euu^T) + n\ln 2\pi, \tag{2.1.1'}$$
in which we sometimes also drop the constant $n\ln 2\pi$.
Problem 2.1.1. Given a single realization $\tilde\eta = \Phi\theta + \tilde J\theta + \tilde e$ of the random vector $\eta$, the matrix $\Phi$ (rank $\Phi = p$) and the corresponding covariance matrices $\Sigma$, $P$, estimate the unknown parameter $\theta$ of the linear influent stochastic model (2.1.0) of the active experiment by the MLM.
Theorem 2.1.1. The estimates of the parameters $\theta$ of the linear influent model (2.1.0) from Problem 2.1.1 can be computed by minimizing the functional
$$R^2(\theta) = (\Phi\theta - \tilde\eta)^T \Psi^{-1} (\Phi\theta - \tilde\eta) + \ln\det\Psi,$$
where $\Psi = \{\sigma_{ik} + \theta^T P_{ik}\theta\}$, and $P_{ik}$ is the cell with label $ik$ and dimension $p \times p$ in the covariance matrix $P$ (of dimension $np \times np$); or can be computed from the SNAE
where $\hat u = \Phi\hat\theta - \tilde\eta$ and
We shall call this system of nonlinear algebraic equations the normal equation of active experiment.
Proof. Since the functional to be minimized has been written out, it is enough to show that $Euu^T = \Psi$, to discard the insignificant constant and to differentiate the resulting expression. The calculation of $Euu^T$ is carried out as in item 1.4.1, and the differentiation of matrix expressions is considered explicitly in [Ermakov & Zhiglavskij 1987].
2.1.1.1 Nonlinear Model.
Assumption 2.0a. We assume the following nonlinear functional equation of the first kind
where $\Phi$ is a known controlled matrix, $\theta$ are the unknown parameters, $\varphi$ is an unknown response vector, and the following structural relationship
corresponding to the relationship (2.0.0a). That is, the structural relationship is a nonlinear functional relationship with a matrix containing an additive random error.
Assumption 2.1a. We shall use the influent stochastic model of full rank (linear in the measurement errors of the response vector), which is based on the structural relation (2.0.1a) of active experiment:
If the linearization procedure is possible, the results of the previous paragraph are applicable; if it is not, the main problem becomes the construction of the mathematical expectation of the full errors of the model.
2.1.2 Homoscedastic Errors in the Matrix and in the Response
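We consider the most popular case, in which all elements of the controlled matrix are realized with the same variance $\mu^2$ and the response is measured with the same variance $\sigma^2$.
Assumption 2.1.2. We shall use the linear influent stochastic model of active experiment of full rank, which is based on the structural relation (2.0.1):
$$\eta = \Phi\theta + J\theta + e, \quad \Sigma = \sigma^2 I, \quad P = \mu^2 I. \tag{2.1.2}$$
Thus, the perturbations have a scalar covariance matrix both in the right-hand side and in the matrix of the predictor realizations. Then Problem 2.1.1 takes the following form.
Problem 2.1.2. Given a single realization $\tilde\eta = \Phi\theta + \tilde J\theta + \tilde e$ of the random variable $\eta$, the matrix $\Phi$ (rank $\Phi = p$) and the corresponding values $\sigma^2$ and $\mu^2$, estimate the unknown parameter $\theta$ of the linear influent stochastic model (2.1.2) of the active experiment by the MLM.
Corollary 2.1.2. The estimator of the parameter $\theta$ of the linear influent model (2.1.2) from Problem 2.1.2 can be computed by minimizing the functional
$$R^2(\theta) = \frac{|\Phi\theta - \tilde\eta|^2}{\sigma^2 + \mu^2|\theta|^2} + n\ln(\sigma^2 + \mu^2|\theta|^2)$$
or can be computed from the SLAE
$$\left[\Phi^T\Phi + \mu^2\left(n - \frac{|\Phi\hat\theta - \tilde\eta|^2}{\sigma^2 + \mu^2|\hat\theta|^2}\right) I\right]\hat\theta = \Phi^T\tilde\eta.$$
Proof. The value $R^2$ is calculated in the obvious manner. Differentiating it with respect to the parameter vector and equating the derivative to zero, we find
$$\frac{2}{\sigma^2 + \mu^2|\theta|^2}\left[\Phi^T(\Phi\theta - \tilde\eta) + \mu^2\left(n - \frac{|\Phi\theta - \tilde\eta|^2}{\sigma^2 + \mu^2|\theta|^2}\right)\theta\right] = 0.$$
Since the first factor does not vanish, equating the second factor to zero gives the normal equation
$$\left[\Phi^T\Phi + \mu^2\left(n - \frac{|\Phi\hat\theta - \tilde\eta|^2}{\sigma^2 + \mu^2|\hat\theta|^2}\right) I\right]\hat\theta = \Phi^T\tilde\eta.$$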
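The differentiation step in the proof can be verified numerically. The sketch below is an illustration under stated assumptions, not the author's code: it takes the homoscedastic functional in the form $|\Phi\theta - \tilde\eta|^2/(\sigma^2 + \mu^2|\theta|^2) + n\ln(\sigma^2 + \mu^2|\theta|^2)$, uses hypothetical data and function names, and compares the analytic gradient (first factor times second factor) with a central finite-difference gradient.

```python
import math

def residual(Phi, eta, theta):
    """Residual vector Phi*theta - eta."""
    return [sum(a * t for a, t in zip(row, theta)) - y for row, y in zip(Phi, eta)]

def functional(Phi, eta, theta, sigma2, mu2):
    """Negative double log-likelihood without the constant n*ln(2*pi)."""
    rss = sum(v * v for v in residual(Phi, eta, theta))
    den = sigma2 + mu2 * sum(t * t for t in theta)
    return rss / den + len(eta) * math.log(den)

def analytic_gradient(Phi, eta, theta, sigma2, mu2):
    """(2/den) * [Phi^T r + mu2 * (n - rss/den) * theta]."""
    n, p = len(eta), len(theta)
    r = residual(Phi, eta, theta)
    rss = sum(v * v for v in r)
    den = sigma2 + mu2 * sum(t * t for t in theta)
    phi_t_r = [sum(Phi[i][j] * r[i] for i in range(n)) for j in range(p)]
    shift = mu2 * (n - rss / den)
    return [2.0 / den * (g + shift * t) for g, t in zip(phi_t_r, theta)]

def numeric_gradient(Phi, eta, theta, sigma2, mu2, h=1e-6):
    """Central finite differences, one coordinate at a time."""
    grad = []
    for j in range(len(theta)):
        up = list(theta); up[j] += h
        dn = list(theta); dn[j] -= h
        grad.append((functional(Phi, eta, up, sigma2, mu2)
                     - functional(Phi, eta, dn, sigma2, mu2)) / (2 * h))
    return grad

# Hypothetical data for illustration only.
Phi = [[1.0, 1.0], [2.0, 1.0], [3.0, 2.0], [4.0, 3.0]]
eta = [1.6, 2.4, 4.1, 5.9]
theta = [1.0, 0.5]
analytic = analytic_gradient(Phi, eta, theta, 0.04, 0.01)
numeric = numeric_gradient(Phi, eta, theta, 0.04, 0.01)
print(analytic, numeric)
```

Agreement of the two gradients supports the factored form used in the proof; setting the bracketed factor to zero reproduces the normal equation.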
2.1.2.1 Variance Estimation
Calculating the derivative with respect to $\sigma^2$ of the negative double logarithm of the likelihood function and equating it to zero, we obtain the equation
In the case when the ratio of the variances is equal to $\kappa = \sigma^2/\mu^2$, the asymptotically unbiased estimator of the variance $\sigma^2$ can be computed from the formula
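A sketch of how an estimator of this kind can be derived, assuming the homoscedastic functional $R^2(\theta) = |\Phi\theta - \tilde\eta|^2/(\sigma^2 + \mu^2|\theta|^2) + n\ln(\sigma^2 + \mu^2|\theta|^2)$ and a known ratio $\kappa = \sigma^2/\mu^2$. It yields the plain maximum-likelihood estimator under these assumptions and is not necessarily the exact form intended in the text.

```latex
% Assumption: \mu^2 = \sigma^2/\kappa with \kappa known, so that
% \sigma^2 + \mu^2|\theta|^2 = \sigma^2\,(1 + |\theta|^2/\kappa).
\begin{align*}
R^2(\theta,\sigma^2)
  &= \frac{|\Phi\theta-\tilde\eta|^2}{\sigma^2\,(1+|\theta|^2/\kappa)}
     + n\ln\sigma^2 + n\ln\bigl(1+|\theta|^2/\kappa\bigr), \\
\frac{\partial R^2}{\partial\sigma^2}
  &= -\frac{|\Phi\theta-\tilde\eta|^2}{\sigma^4\,(1+|\theta|^2/\kappa)}
     + \frac{n}{\sigma^2} = 0, \\
\hat\sigma^2
  &= \frac{|\Phi\hat\theta-\tilde\eta|^2}{n\,(1+|\hat\theta|^2/\kappa)}
   = \frac{\kappa\,|\Phi\hat\theta-\tilde\eta|^2}{n\,(\kappa+|\hat\theta|^2)} .
\end{align*}
```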
Remark 2.1.2. The problem of deriving an unbiased estimator remains open. Is unbiasedness achieved by replacing $n$ with $n - p$? In fact, even for $p = n$ the RSS is not equal to zero. Thus the mean of the expression in brackets in equation (*)
is positive. In equation (*), the diagonal elements of the matrix $\Phi^T\Phi$ receive a positive shift (we note the similarity with the regularization method, where the diagonal elements of the transformed matrix are also shifted positively). As is well known, simplifications are frequently deceptive. Thus, in [Ajvazjan, etc. 1985], in the chapter devoted to the nonlinear analysis of active experiment, the term with the logarithm was dropped in constructing the recurrent sequence (probably the author considered it insignificant). But this term ensures the positiveness of the component on the diagonal of the matrix (without it the estimation of parameters actually coincides with the estimation for a passive experiment). If in the experiment $\mu^2|\hat\theta|^2 \ll \sigma^2$, applying the LSM to these input data can give a sufficiently good estimate.
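2.1.2.2 Log-Likelihood Function
We study the behavior of the function
$$R^2(\theta_\alpha) = \frac{|\Phi\theta_\alpha - \tilde\eta|^2}{\sigma^2 + \mu^2|\theta_\alpha|^2} + n\ln(\sigma^2 + \mu^2|\theta_\alpha|^2),$$
where $\theta_\alpha$ is a solution of the Euler equation
$$[\Phi^T\Phi + \alpha I]\theta_\alpha = \Phi^T\tilde\eta.$$
Since the function $S^2(\alpha) = |\Phi\theta_\alpha - \tilde\eta|^2 / (\sigma^2 + \mu^2|\theta_\alpha|^2)$ is non-decreasing for $\alpha \ge 0$ (being the ratio of the non-decreasing residue function $r^2(\alpha) = |\Phi\theta_\alpha - \tilde\eta|^2$ and the decreasing function $\sigma^2 + \mu^2|\theta_\alpha|^2$, which is convex above (see [Morozov 1987])), and the function $T^2(\alpha) = n\ln(\sigma^2 + \mu^2|\theta_\alpha|^2)$ is non-increasing, their sum $R^2(\theta_\alpha)$ has at least one point of minimum for $\alpha \ge 0$. From the necessary minimum condition we have
$$\alpha = -\mu^2\,\frac{|\Phi\theta_\alpha - \tilde\eta|^2}{\sigma^2 + \mu^2|\theta_\alpha|^2} + \mu^2 n.$$
The iterative process for computing the minimum can be started from the initial approximation $\alpha_0 = -\mu^2(n - p/2) + \mu^2 n = \mu^2 p/2$. The behavior of the log-likelihood function is shown in Figure 2.1-2.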
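One simple way to organize such an iterative process is a fixed-point iteration on $\alpha$. The sketch below is only an illustration under assumptions: it uses the update $\alpha \leftarrow \mu^2\bigl(n - |\Phi\theta_\alpha - \tilde\eta|^2/(\sigma^2 + \mu^2|\theta_\alpha|^2)\bigr)$, starts from $\alpha_0 = \mu^2 p/2$, is restricted to $p = 2$ so that the Euler equation can be solved by Cramer's rule, and runs on hypothetical data. The function names are not from the text, and convergence of this particular iteration is not guaranteed in general.

```python
def solve_euler(G, b, alpha):
    """Solve (G + alpha*I) theta = b for a symmetric 2x2 matrix G (Cramer's rule)."""
    a11, a12, a22 = G[0][0] + alpha, G[0][1], G[1][1] + alpha
    det = a11 * a22 - a12 * a12
    return [(b[0] * a22 - a12 * b[1]) / det, (a11 * b[1] - a12 * b[0]) / det]

def alpha_of(Phi, eta, theta, sigma2, mu2):
    """Right-hand side of the fixed-point relation for alpha."""
    rss = sum((sum(a * t for a, t in zip(row, theta)) - y) ** 2
              for row, y in zip(Phi, eta))
    den = sigma2 + mu2 * sum(t * t for t in theta)
    return mu2 * (len(eta) - rss / den)

def estimate(Phi, eta, sigma2, mu2, tol=1e-12, max_iter=200):
    n, p = len(eta), len(Phi[0])
    G = [[sum(Phi[i][r] * Phi[i][c] for i in range(n)) for c in range(p)]
         for r in range(p)]
    b = [sum(Phi[i][r] * eta[i] for i in range(n)) for r in range(p)]
    alpha = mu2 * p / 2.0  # initial approximation quoted in the text
    for _ in range(max_iter):
        theta = solve_euler(G, b, alpha)
        new_alpha = alpha_of(Phi, eta, theta, sigma2, mu2)
        converged = abs(new_alpha - alpha) < tol
        alpha = new_alpha
        if converged:
            break
    return solve_euler(G, b, alpha), alpha

# Hypothetical data for illustration only.
Phi = [[1.0, 1.0], [2.0, 1.0], [3.0, 2.0], [4.0, 3.0]]
eta = [1.6, 2.4, 4.1, 5.9]
sigma2, mu2 = 0.04, 0.01
theta_hat, alpha_hat = estimate(Phi, eta, sigma2, mu2)
print(theta_hat, alpha_hat)
```

At the returned pair the Euler equation and the relation for $\alpha$ hold simultaneously, so `theta_hat` satisfies the normal equation of the homoscedastic active experiment for these data.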
Figure 2.1-2. Log-likelihood function.
2.1.2.3 Example of the Estimate Computation
Let us consider the same example as in item 1.4.3. From the same four observations (see Figure 1.4-4) we form the functional relation $\varphi = \phi\theta$.
Figure 2.1-3. Comparison of the LSM for a regression model (a) and the MLM for an active model (b) (in models without background).
These measurements were first approximated by the LSM, which leads to a line with a slope of about 0.9 (on the left, Figure 2.1-3a). In Figure 2.1-3b, the errors of realization are indicated by arrows pointing into the points, to reflect the active conduct of the experiment, whereas the arrows pointing out of the points represent the measurements. The measurements are then approximated by the MLM under the assumption that the experiment is active, which leads to a line with a slope of about 0.7 (on the right, Figure 2.1-3b).
2.1.3 Effect of Rounding Errors for Systems of Linear Algebraic Equations
The model of influent analysis of an active experiment can be used to describe the rounding errors in the computer solution of SLAE. Estimating the parameters of this model (that is, solving the SLAE) by the MLM has the effect of shifting the diagonal elements of the Gauss transformation matrix in the positive direction. Rounding errors in the computer solution of SLAE can play a rather significant role. It has long been known from practical experience with the computer solution of SLAE that shifting the spectrum in the positive direction makes the solution process more stable: there are no abnormal terminations of the computation, and the shift can even be chosen so that the solution improves slightly. However, the origin of this phenomenon was not known. The author constructs the solution estimate taking the influence of rounding errors into account, and the resulting computational method leads precisely to a positive shift of the spectrum, which justifies the practical recommendations suggested and used earlier. This item continues the ideas of [Mechenov 1988], where a procedure for compensating rounding errors in the computer solution of the normal equations of regression analysis was analyzed [Mechenov 1995]. Suppose that the full-rank system of simultaneous, accurately given linear algebraic equations (2.0.0) is to be solved by computer. We will compute a solution by any direct method [Faddeev & Faddeeva 1963]. To describe the rounding errors, we use the concept of equivalent perturbations [Wilkinson 1965, Voevodin 1969], that is, we assume that the initial values are perturbed by the rounding errors and that the subsequent computations are exact. As we know [Voevodin 1969], the equivalent perturbations $J$ of the matrix $\Phi$ and the equivalent perturbations $e$ of the right-hand side $\varphi$ are practically random variables that are additive, unbiased and uncorrelated with one another.
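The stabilizing effect of a positive spectral shift can be illustrated on a small example. The sketch below is an illustration only: the nearly singular matrix and the shift value `alpha` are hypothetical and are not derived from any rounding-error estimate. It computes the spectral condition number of the Gauss transformation matrix $\Phi^T\Phi$ for a $2 \times 2$ matrix $\Phi$, before and after adding $\alpha I$.

```python
import math

def sym2_eigs(a, b, d):
    """Eigenvalues (largest, smallest) of the symmetric matrix [[a, b], [b, d]]."""
    half_trace = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    lam_max = half_trace + radius
    # The product of the eigenvalues equals the determinant; this avoids
    # the cancellation in half_trace - radius for nearly singular matrices.
    lam_min = (a * d - b * b) / lam_max
    return lam_max, lam_min

# Hypothetical nearly singular matrix.
Phi = [[1.0, 1.0], [1.0, 1.0001]]
a = Phi[0][0] ** 2 + Phi[1][0] ** 2
b = Phi[0][0] * Phi[0][1] + Phi[1][0] * Phi[1][1]
d = Phi[0][1] ** 2 + Phi[1][1] ** 2

lam_max, lam_min = sym2_eigs(a, b, d)
alpha = 1e-4  # hypothetical positive shift of the diagonal
cond_plain = lam_max / lam_min
cond_shifted = (lam_max + alpha) / (lam_min + alpha)
print(cond_plain, cond_shifted)
```

Because every eigenvalue of $\Phi^T\Phi + \alpha I$ is the corresponding eigenvalue of $\Phi^T\Phi$ plus $\alpha$, the ratio of the largest to the smallest eigenvalue can only decrease for $\alpha > 0$.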
Then to account for the influence of rounding errors in the approximate computer solution of the SLAE, the model (2.0.0) is converted to the model [Voevodin 1969]
where the matrix $J$ is expanded into a column vector along the rows, and $\Sigma = \{\sigma_{ij}\}$ and $P = \{p_{ij}\}$ are the covariance matrices of the errors. In turn, the direct methods of [Voevodin 1969, Faddeev & Faddeeva 1963] consist in multiplying the extended matrix $[\Phi, \varphi]$ on the left by matrices which reduce $\Phi$ to a simple form (triangular or bidiagonal), and then computing the solution of the SLAE. This procedure finally results in all rounding errors being accumulated on the right-hand side. Thus, model (2.1.3) can be transformed to the stochastic model
where $u = J\theta + e$ is the total error of the response $\eta$. Given the matrix $\Phi$ (rank $\Phi = n$) and the covariance matrices $\Sigma$ and $P$, it is required to estimate the unknown parameters $\theta$ (to estimate the solution of the SLAE) of the linear stochastic model (2.1.3') so that the likelihood function $L(\theta)$ in Eq. (2.1.1) is a maximum. By Theorem 2.1.1, the estimators of the parameters (of the solution of the SLAE) of model (2.1.3') minimize the negative double logarithm of the likelihood function (2.1.1'). According to [Voevodin 1969], "independence of rounding errors in aggregate cannot be assumed without proof", and Theorem 2.1.3 permits a full analysis of their influence on the solution of the SLAE (2.0.0). However, [Voevodin 1969] "all errors arising when a matrix is decomposed into factors by elimination methods are asymptotically independent of one another". We will thus assume that the covariance matrices $\Sigma$ and $P$ are diagonal. The covariance matrices of rounding errors for iterative methods have been written out in [Kim 1972], where the rounding errors have been shown to be homoscedastic (that is, to have equal variance). The same will be assumed of the direct methods. There are majorant estimates [Voevodin 1969] of the variances of the errors of the matrix, $\mu^2$, and of the right-hand side, $\sigma^2$, for the direct methods which, in floating-point computations, satisfy the relations
where $t$ is the number of digits in the binary representation of the mantissa, and $f_\Phi(n)$, $f_\varphi(n)$ are functions which depend on the method used to compute the solution of the SLAE and, in the worst case, do not increase with order greater than $n^2$ (see [Voevodin 1969, Kim 1972, Voevodin 1969a]). However, when these functions are used to compute the inverse matrix, their order of increase will not exceed $n^3$, and the norm of the matrix can be replaced by its condition number [Voevodin 1969a]. Thus Eq. (2.1.1') can be written in the form
$$R^2(\theta) = \frac{|\Phi\theta - \tilde\eta|^2}{\sigma^2 + \mu^2|\theta|^2} + n\ln(\sigma^2 + \mu^2|\theta|^2). \tag{2.1.2'}$$
Let
Differentiating Eq. (2.1.2') with respect to $\theta$, to compute the estimator $\hat\theta$ of the solution of the SLAE we obtain the Euler equation
$$[\Phi^T\Phi + \alpha I]\hat\theta = \Phi^T\tilde\eta,$$
where
$$\alpha = \mu^2\left(n - \frac{|\Phi\hat\theta - \tilde\eta|^2}{\sigma^2 + \mu^2|\hat\theta|^2}\right) \tag{*}$$
has a non-linear dependence on the required estimator $\hat\theta$. We will consider the Euler equation as an equation in $\theta_\alpha$ for arbitrary $\alpha \ge 0$. Since the function $S^2(\alpha) = |\Phi\theta_\alpha - \tilde\eta|^2 / (\sigma^2 + \mu^2|\theta_\alpha|^2)$ is non-decreasing for $\alpha \ge 0$ (being the ratio of the non-decreasing function $r^2(\alpha) = |\Phi\theta_\alpha - \tilde\eta|^2$ and the decreasing, convex above function $\sigma^2 + \mu^2|\theta_\alpha|^2$ [Morozov 1987]), and the function $\psi(\alpha) = \ln[\sigma^2 + \mu^2|\theta_\alpha|^2]$ is non-increasing, their sum $R^2(\theta_\alpha)$ has at least one point of minimum for $\alpha \ge 0$. We thus have the following corollary.
Corollary 2.1.3. For model (2.1.3') with asymptotically independent homoscedastic errors, the parameter estimator (the estimate of the solution of the SLAE (2.0.0)) gives a minimum of the functional (2.1.2') or else can be computed from the Euler equation with (*).
The value $\alpha$ that gives a minimum of the function $R^2(\theta_\alpha)$ is computed with an iterative process constructed in one of the ways described in item 3.5.5 [Mechenov 1988, Mechenov 1977]. It is natural to use the algorithm of [Voevodin 1969a] for the computer solution of the Euler equation. The estimate given above for the errors of a solution is applicable here too. Thus, computing the MLM estimate for such a model entails increasing the diagonal elements of the Gauss transformation matrix $\Phi^T\Phi$ in the Euler equation by a positive amount, which improves its conditioning in advance. This effect is especially important for SLAE with an ill-conditioned matrix. Thus the SLAE (2.0.0) can be solved (even when it has an ill-conditioned matrix) without the need for additional information on the initial exact problem. So, a method has been constructed for taking into account the equivalent perturbations of the input data caused by rounding errors in the computer solution of the SLAE (2.0.0).
Remark 2.1.3. 1) Since the estimates of $\mu^2$ and $\sigma^2$ are not error-free, the following method can be used in practice to monitor the solution. Since the residue functional is not always monotone with respect to $\alpha$ [Mechenov 1977] when there are rounding errors, its local minimum, which in many cases corresponds to the most accurate solution, can be computed. It is natural then first to compute the solution, then its residue and the norm of the residue, rather than the residue as a solution of the transformed relation $r_\alpha = -\alpha q$; the norm of the latter is less subject to rounding errors and more often has a local minimum for a different value of $\alpha$ (or else does not have one at all) [Mechenov 1977]. The non-monotonicity, caused by rounding errors, of the function $q(\alpha) = |\Phi\theta_\alpha - \tilde\eta|^2 + \alpha|\theta_\alpha|^2$ could be used in the same way [Mechenov 1977]. 2) The improved stability that is obtained by using a regularization algorithm [Tikhonov 1965] in the computer solution of ill-conditioned SLAE, which is basically intended to compute a stable approximation to the normal solution of an incomplete-rank SLAE, also results from computing the solution from an equation of the Euler form, in which the value of $\alpha$ is sought from the residue principle [Morozov 1987]. 3) It is quite obvious from the above what the effect of rounding errors will be when overdetermined SLAE are solved by the LSM, SLAE with an inaccurately measured matrix and right-hand side by the LDM or by the LSDM [Mechenov 1991], or SLAE with an inaccurately realized matrix by the MLM [Fedorov 1968]. Since the general scheme of constructing the solution for these models is still similar to the Gauss transformation or to the solution of the Euler equation, the given approach is suitable for those problems also.
2.1.4 Incomplete-Rank Model of Active Experiment.
The incomplete-rank model is obviously "nonsense" as far as the representation of the predictor matrix $\Phi$ is concerned, that is, when the experiment is planned beforehand and all difficulties can be foreseen; from the theoretical point of view, however, it is interesting. Moreover, it deserves study in view of possible applications to linear integral equations of the first kind and to operator equations of the first kind.
Assumption 2.1.4. We use the linear influent stochastic model (2.1.0) of active experiment of incomplete rank.
Definition 2.1.4. The vector $\theta_0$ is the normal solution of the SLAE (2.0.0) with the incomplete-rank matrix $\Phi$ if
$$\theta_0 = \operatorname{Arg}\min_{\theta:\ \Phi\theta = \varphi} |\theta|^2,$$
where the matrix $\Phi$ and the vector $\varphi$ are known exactly.
Definition 2.1.5. Let
be the set of admissible pseudosolutions, where
Problem 2.1.4. Given a single realization $\tilde\eta = \Phi\theta + \tilde J\theta + \tilde e$ of the random vector $\eta$, the exact matrices $\Phi$, $\Sigma$, $P$ and the fact that the matrix $\Phi$ of the SLAE (2.0.0) has incomplete rank, estimate the unknown values of the approximation vector $\hat\theta_m$ to the normal vector $\theta_0$ so that the sum of squares of this approximation is minimal on the set of admissible pseudosolutions
Theorem 2.1.4. The solution of Problem 2.1.4 exists, is unique, and satisfies the relation
$$\hat\theta_m = \operatorname{Arg}\min_{\theta:\ (\Phi\theta - \tilde\eta)^T\Psi^{-1}(\Phi\theta - \tilde\eta) + \ln\det\Psi + n\ln 2\pi = m^2} |\theta|^2.$$
The proof is similar to item 2.1.1 and to item 1.3.
2.1.5 Regularization in the Case of an Error Homoscedasticity
We consider the most popular case, in which the matrix errors and the response errors each have a known constant variance.
Assumption 2.1.5. We use the linear influent stochastic model (2.1.2) of active experiment of incomplete rank.
Thus, the perturbations have a scalar covariance matrix both in the right-hand side error and in the error matrix $J$. Then Problem 2.1.4 takes the following form.
Problem 2.1.5. Given a single realization $\tilde\eta = \Phi\theta + \tilde J\theta + \tilde e$ of the random vector $\eta$, the exact matrix $\Phi$ of the SLAE (2.0.0) of incomplete rank and the values $\sigma$ and $\mu$, estimate the unknown values of an approximation vector $\hat\theta_m$ to the normal vector $\theta_0$ so that the sum of squares of this approximation is minimal on the set of admissible pseudosolutions
Corollary 2.1.5. The stable approximation to the normal vector of parameters exists, is unique, and satisfies the relation
$$\hat\theta_m = \operatorname{Arg}\min_{\theta:\ \frac{|\Phi\theta - \tilde\eta|^2}{\sigma^2 + \mu^2|\theta|^2} + n\ln(\sigma^2 + \mu^2|\theta|^2) + n\ln 2\pi = m^2} |\theta|^2$$
and the SLAE
where $\alpha = 1/\lambda \ge 0$ is the numerical Lagrange multiplier.
Proof. For the proof it is enough to use items 2.1.2 and 2.1.4.
Remark 2.1.5. The result will not vary asymptotically if, in the last equation, we proceed to the relation
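The notion of a normal solution and the role of a positive diagonal shift can be illustrated as follows. This is a simplified sketch, not the method of the corollary: the shift `alpha` is simply taken small and fixed rather than determined from the admissible set, the data are hypothetical and exact (no errors), and the system is restricted to $p = 2$. It shows that the solution of $(\Phi^T\Phi + \alpha I)\theta = \Phi^T\varphi$ approaches the minimum-norm solution of a rank-deficient system as $\alpha \to 0^+$.

```python
def solve_shifted(G, b, alpha):
    """Solve (G + alpha*I) theta = b for a symmetric 2x2 matrix G (Cramer's rule)."""
    a11, a12, a22 = G[0][0] + alpha, G[0][1], G[1][1] + alpha
    det = a11 * a22 - a12 * a12
    return [(b[0] * a22 - a12 * b[1]) / det, (a11 * b[1] - a12 * b[0]) / det]

# Hypothetical rank-1 system: every theta with theta1 + theta2 = 2 solves it,
# and the normal (minimum-norm) solution is (1, 1).
Phi = [[1.0, 1.0], [2.0, 2.0]]
phi = [2.0, 4.0]
n, p = len(phi), len(Phi[0])
G = [[sum(Phi[i][r] * Phi[i][c] for i in range(n)) for c in range(p)] for r in range(p)]
b = [sum(Phi[i][r] * phi[i] for i in range(n)) for r in range(p)]

solutions = {}
for alpha in (1e-2, 1e-4, 1e-6):
    solutions[alpha] = solve_shifted(G, b, alpha)
    print(alpha, solutions[alpha])

theta_small = solutions[1e-6]
```

Without the shift the matrix $\Phi^T\Phi$ is singular and the system has infinitely many solutions; the shifted system has a unique solution for every $\alpha > 0$, which is what makes the approximation stable.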
2.1.6 Mixed Models of Active Experiments
When experiments are carried out in practice, especially in physical investigations, the problem most often encountered is that of estimating the unknown parameters of a linear model with an imprecisely controlled predictor matrix and with a priori information on the unknown parameters. To begin with, we consider the simple case.
Assumption 2.1.6. Given the functional relation (2.1.0) with the a priori information
We shall consider that the supplementary condition is satisfied for the mixed model (2.1.6), that is, the complete matrix of this functional relationship has full rank. We assume further that all errors obey the normal law. We consider the estimation of the required parameters by the MLM. We write out the likelihood function
$$L(\theta) = (2\pi)^{-(n+p)/2} (\det Euu^T)^{-1/2} (\det Ecc^T)^{-1/2} \exp\left(-\tfrac{1}{2}(\eta - \Phi\theta)^T (Euu^T)^{-1} (\eta - \Phi\theta) - \tfrac{1}{2}(t - \theta)^T (Ecc^T)^{-1} (t - \theta)\right)$$
and we calculate its negative double logarithm, which is more convenient for the further reasoning,
$$-2\ln L(\theta) = (\eta - \Phi\theta)^T (Euu^T)^{-1} (\eta - \Phi\theta) + (t - \theta)^T (Ecc^T)^{-1} (t - \theta) + \ln\det(Euu^T) + \ln\det(Ecc^T) + (n + p)\ln 2\pi.$$
Problem 2.1.6. Given the realizations $\tilde\eta = \Phi\theta + \tilde J\theta + \tilde e$ and $\tilde t$ of the random vectors $\eta$ and $t$, the matrix $\Phi$ (rank $\Phi = r$) and the corresponding covariance matrices $\Sigma$, $P$, $K$, estimate the unknown parameter $\theta$ of the mixed linear influent stochastic model (2.1.6) by the MLM.
Theorem 2.1.6. The estimates of the parameters $\theta$ of the mixed linear influent model (2.1.6) from Problem 2.1.6 minimize the functional
$$R^2(\theta) = (\Phi\theta - \tilde\eta)^T \Psi^{-1} (\Phi\theta - \tilde\eta) + \ln\det\Psi + (\theta - \tilde t)^T K^{-1} (\theta - \tilde t).$$
Proof. Since the functional to be minimized has been written out, it is enough to show that $Euu^T = \Psi$, which has already been done earlier.
2.1.6.1 Homoscedastic Errors in the Matrix and in the Response. We consider the most popular case, in which all errors of the predictor matrix are realized with the same variance $\mu^2$ and the response is measured with the same variance $\sigma^2$.
Assumption 2.1.6a. We use the following mixed linear influent stochastic model of active experiment with the a priori information of a functional relation (2.1.0) of incomplete rank
Thus, the perturbations have a scalar covariance matrix in the right-hand side errors, in the error matrix $J$ and in the a priori information. Then Problem 2.1.6 takes the following form.
Problem 2.1.6a. Given a single realization $\tilde\eta = \Phi\theta + \tilde J\theta + \tilde e$ and $\tilde t$ of the random variables $\eta$ and $t$, the matrix $\Phi$ (rank
Remark 2.1.6. 1) One of the variances can be estimated by making use of the mean of the RSS. 2) The first approach applied to account for a priori information was the Bayesian one [Zhukovskij & Morozov 1972], [Zhukovskij 1977], [Murav'eva 1979]. But applying the Bayesian approach to the active experiment model leads to complicated distributions for the response, which is most likely not realistic. Indeed, if we replace the parameter vector $\theta$ by the random vector $t$ in the active experiment model 2.1.1, the problem of the distribution of the vector $Jt$ arises at once. If $J$ and $t$ are, say, normal, the error $Jt + e$ has a complicated distribution, whose correspondence to the distribution of the real errors is rather difficult to check. Besides, the application of the MLM becomes rather complicated. Therefore the approach in which the parameter vector is random leads to complicated theoretical research. We now consider models of active experiments that also contain regression parameters.
2.2 Analysis of Active-Regression Experiment
In engineering and especially in physical research, one frequently meets the problem of estimating the unknown parameters of a linear model of active experiment with an imprecisely controlled predictor matrix and a regressor matrix that is known purely theoretically. For example, this part of the matrix may have the form of a background vector, which reflects the assumption of a nonzero constant expectation of the response errors.
Assumption 2.2. We use the following linear functional equation (relationship)
$$\Phi\theta + H\delta = \varphi, \tag{2.0.2}$$
where $\Phi = [\phi_1, \ldots, \phi_p]$ is a known, precisely prescribed matrix, $H = [\eta_1, \ldots, \eta_k]$ is a precisely known theoretical matrix, $\theta = (\theta_1, \ldots, \theta_p)^T$ and $\delta = (\delta_1, \ldots, \delta_k)^T$ are unknown parameters, and $\varphi = (\varphi_1, \ldots, \varphi_n)^T$ is an unknown response vector. We also introduce the structural relationship corresponding to Eq. (2.0.2)
$$\zeta = F\theta + H\delta, \quad F = \Phi + J, \tag{2.2.0}$$
where $\zeta = (\zeta_1, \ldots, \zeta_n)^T$ is an unknown response vector, $F = [f_1, \ldots, f_p]$ is a random matrix realizing $\Phi$, and $J$ is its error, which obeys a normal law. That is, the structural relationship is a functional relationship in which a part of the initial matrix contains an additive random error.
2.2.1 Maximum-Likelihood Method of the Parameter Estimation of Linear Model of Active-Regression Experiment
Assumption 2.2.1. We shall use the following linear influent-regression stochastic model of an active experiment of full rank, which uses Eq. (2.0.2) and Eq. (2.2.0)
in which the errors are subject to a normal law. We consider that the experiment is conducted actively, that is, the researchers prescribe the exact predictor matrix $\Phi$ based on the equipment, but random errors $J$ accumulate inside the equipment, and the events being studied occur with certain random (unknown) values $F = \Phi + J$ and unknown parameters $\theta$. In addition, there are the theoretically known regressors $H$, whose influence $\delta$ must also be estimated. Thus, errors are introduced beforehand by an inaccurate realization of the prescribed values of the predictors. The researchers have at their disposal a random response vector
$$\eta = \Phi\theta + H\delta + J\theta + e,$$
that is further corrupted by measurement errors $e$, in which the complete error vector $u = J\theta + e$ also depends on the unknown parameters. Such a model can also be regarded as the model generated by the structural relationship (2.2.0), $\zeta = F\theta + H\delta$, on which the measurement error $e$ of the response $\zeta$ is imposed. The scheme of such an experiment is shown in Figure 2.2-1.
Figure 2.2-1. Scheme and description of an active-regression experiment.
Remark 2.2.1. Vuchkov and Boyadjieva attempted to analyze the simplified model of the form $\eta = F\theta + \delta + e$, $F = \Phi + J$, in [Vuchkov & Boyadjieva 1981].
Since the error $u = J\theta + e$ depends on the unknown parameters $\theta$, neither the LSM nor the LDM can be applied to estimate the unknown parameters. In what follows we shall assume that the errors are normally distributed and study the estimation of the unknown parameters by the MLM. We write out the likelihood function
$$L(\theta, \delta) = (2\pi)^{-n/2} (\det Euu^T)^{-1/2} \exp\left(-\tfrac{1}{2}(\eta - \Phi\theta - H\delta)^T (Euu^T)^{-1} (\eta - \Phi\theta - H\delta)\right).$$
In our computation we shall use the logarithm of this function multiplied by $-2$.
Problem 2.2.1. Knowing one realization $\tilde\eta = \Phi\theta + H\delta + \tilde J\theta + \tilde e$ of the random variable $\eta$, the matrices $\Phi$ (rank $\Phi = p$) and $H$ (rank $H = k$) and the corresponding covariance matrices $\Sigma$, $P$, estimate the unknown parameters $\theta$, $\delta$ of the linear influent-regression stochastic model (2.2.1) by the MLM.
Theorem 2.2.1. The estimates $\hat\theta$, $\hat\delta$ of the parameters $\theta$, $\delta$ of the linear influent-regression model (2.2.1) from Problem 2.2.1 can be computed by minimizing the functional
$$R^2(\theta, \delta) = (\Phi\theta + H\delta - \tilde\eta)^T \Psi^{-1} (\Phi\theta + H\delta - \tilde\eta) + \ln\det\Psi,$$
where $\Psi = \{\sigma_{ik} + \theta^T P_{ik}\theta\}$, and $P_{ik}$ is the cell with label $ik$ and dimension $p \times p$ in the covariance matrix $P$ (of dimension $np \times np$).
The proof of this result is similar to that of item 2.1.1.
2.2.2 Homoscedastic Model of Active-Regression Experiment
We consider in more detail the most popular case, in which the prescribed matrix is realized with errors of the same type and the same variance, and the response is measured with errors of the same variance.
Assumption 2.2.2. We use the following linear influent-regression stochastic model of active experiment of full rank, which uses the structural relationship (2.2.0)
Then Problem 2.2.1 takes the following form.
Problem 2.2.2. Knowing one realization $\tilde\eta = \Phi\theta + H\delta + \tilde J\theta + \tilde e$ of the random variable $\eta$, the matrices $\Phi$ (rank $\Phi = p$) and $H$ (rank $H = k$) and the corresponding values $\sigma^2$ and $\mu^2$, estimate the unknown parameters $\theta$, $\delta$ of the linear influent-regression stochastic model (2.2.2) by the MLM.
Corollary 2.2.2. The estimates $\hat\theta$, $\hat\delta$ of the parameters $\theta$, $\delta$ of the linear influent-regression model (2.2.2) from Problem 2.2.2 can be computed by minimizing the functional
$$R^2(\theta, \delta) = \frac{|\Phi\theta + H\delta - \tilde\eta|^2}{\sigma^2 + \mu^2|\theta|^2} + n\ln(\sigma^2 + \mu^2|\theta|^2)$$
or can be computed from the SLAE
We call this system of nonlinear algebraic equations the normal equation of active-regression experiment.
2.2.2.1 Numerical Example
Let us consider the same example. From the same four observations (see Figure 1.1-3a) we form the functional relationship
Figure 2.2-2. Comparison of the LSM for simple regression model with a background (a) and the MLM for model of simple active-regression experiment (b).
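A computation of this kind can be sketched for a simple line with a background, $\eta = \phi\theta + \delta + j\theta + e$. The sketch rests on assumptions that are not taken from the text: the functional is taken as $|\phi\theta + \delta - \tilde\eta|^2/(\sigma^2 + \mu^2\theta^2) + n\ln(\sigma^2 + \mu^2\theta^2)$, the four data points and the variances are hypothetical (the book's observations are not reproduced here), and the function name is illustrative. Since the denominator does not depend on $\delta$, the background is eliminated by centering, and the slope satisfies $\theta\,(S_{xx} + \alpha) = S_{xy}$ with $\alpha = \mu^2(n - S^2)$.

```python
def fit_active_line_with_background(x, y, sigma2, mu2, tol=1e-12, max_iter=200):
    """Fixed-point iteration for the slope theta and background delta."""
    n = len(x)
    xm, ym = sum(x) / n, sum(y) / n
    sxx = sum((a - xm) ** 2 for a in x)
    sxy = sum((a - xm) * (b - ym) for a, b in zip(x, y))
    theta = sxy / sxx  # ordinary least-squares slope as the starting point
    for _ in range(max_iter):
        delta = ym - theta * xm  # stationarity in delta: residuals sum to zero
        rss = sum((theta * a + delta - b) ** 2 for a, b in zip(x, y))
        alpha = mu2 * (n - rss / (sigma2 + mu2 * theta * theta))
        new_theta = sxy / (sxx + alpha)
        converged = abs(new_theta - theta) < tol
        theta = new_theta
        if converged:
            break
    return theta, ym - theta * xm

# Hypothetical observations and variances, for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [1.6, 2.4, 3.7, 4.4]
sigma2, mu2 = 0.04, 0.01

xm, ym = sum(x) / len(x), sum(y) / len(y)
ls_slope = (sum((a - xm) * (b - ym) for a, b in zip(x, y))
            / sum((a - xm) ** 2 for a in x))
theta_hat, delta_hat = fit_active_line_with_background(x, y, sigma2, mu2)
print(ls_slope, theta_hat, delta_hat)
```

For these data the positive shift $\alpha$ makes the active-experiment slope slightly smaller in magnitude than the least-squares slope, in line with the qualitative comparison in the figure.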
2.2.3 Degenerated Model of Active-Regression Experiment
The degenerated model is obviously "nonsense" as far as the representation of the predictor matrix and of the theoretical matrix $H$ is concerned, that is, when the experiment is planned beforehand and all difficulties can be foreseen. But for applications concerning linear integral equations of the first kind such a supposition is normal, and therefore it deserves study.
Assumption 2.2.3. We use the linear influent-regression stochastic model (2.2.1) of active experiment of incomplete rank, which uses Eq. (2.2.0).
Definition 2.2.3. The vector $(\theta_0^T, \delta_0^T)^T$ is called a normal solution (normal vector) of the SLAE (2.0.2) with the incomplete-rank matrix $[\Phi, H]$ if
$$(\theta_0^T, \delta_0^T)^T = \operatorname{Arg}\min_{\theta, \delta:\ \Phi\theta + H\delta = \varphi} \left(|\theta|^2 + |\delta|^2\right),$$
where the matrix $[\Phi, H]$ and the vector $\varphi$ are known exactly.
Definition 2.2.4. Let
$$\left\{\theta, \delta:\ (\Phi\theta + H\delta - \tilde\eta)^T (Euu^T)^{-1} (\Phi\theta + H\delta - \tilde\eta) + \ln\det Euu^T \le m^2\right\}$$
be the set of admissible estimates (pseudosolutions), where
$$m^2 = E\left[(\Phi\theta + H\delta - \eta)^T (Euu^T)^{-1} (\Phi\theta + H\delta - \eta)\right] + \ln\det Euu^T.$$
Problem 2.2.3. Knowing the approximate vector $\tilde\eta = \Phi\theta + H\delta + \tilde J\theta + \tilde e$ as a realization of $\eta$, the exact matrices $\Phi$, $\Sigma$, $P$, and the fact that the matrix $[\Phi, H]$ of the SLAE (2.0.2) has incomplete rank, estimate the unknown values of an approximation vector to the normal vector so that the sum of squares of this approximation is minimal on the set of admissible estimates
$$(\hat\theta^T, \hat\delta^T)^T_m = \operatorname{Arg}\min_{\theta, \delta\ \text{admissible}} \left(|\theta|^2 + |\delta|^2\right).$$
Theorem 2.2.3. The solution of Problem 2.2.3 exists, is unique, and satisfies the relation
$$(\hat\theta^T, \hat\delta^T)^T_m = \operatorname{Arg}\min_{\theta, \delta:\ (\Phi\theta + H\delta - \tilde\eta)^T \Psi^{-1} (\Phi\theta + H\delta - \tilde\eta) + \ln(\det\Psi) + n\ln 2\pi = m^2} \left(|\theta|^2 + |\delta|^2\right).$$
The proof of this theorem is similar to item 2.1.4.
2.2.4 Regularization in Case of the Homoscedasticity of Errors
We consider the most popular case, in which the errors of the matrix and of the response each have a constant variance.
Assumption 2.2.4. We use the linear influent-regression stochastic model (2.2.2) of incomplete rank.
Thus, the perturbations have a scalar covariance matrix both in the errors $e$ of the right-hand side and in the matrix $J$. Then Problem 2.2.3 takes the following form.
Problem 2.2.4. Knowing the vector $\tilde\eta = \Phi\theta + H\delta + \tilde J\theta + \tilde e$ as a realization of $\eta$, the exact incomplete-rank matrix $A = [\Phi, H]$ of the SLAE (2.2.0) and the values $\sigma$, $\mu$, estimate the approximate vector to the normal vector so that the sum of squares of this approximation is minimal on the set of admissible pseudosolutions
$$(\hat\theta^T, \hat\delta^T)^T_m = \operatorname{Arg}\min_{\theta, \delta:\ (\Phi\theta + H\delta - \tilde\eta)^T (Euu^T)^{-1} (\Phi\theta + H\delta - \tilde\eta) + \ln(\det Euu^T) + n\ln 2\pi \le m^2} \left(|\theta|^2 + |\delta|^2\right).$$
Corollary 2.2.4. The stable approximation to the normal vector exists, is unique, and satisfies the relation
and the SLAE
where $\alpha = 1/\lambda \ge 0$ is a numerical Lagrange multiplier.
Proof. For the proof it is enough to use items 2.2.2 and 2.2.3.
Remark 2.2.4. The given problem can be simplified not only as in item 2.1.5, by simplifying the residue principle, but also by using the fact that only one of matrices
2.3. Analysis of Passive-Active Experiment
Assumption 2.3. We are given a linear functional relation
$$\Xi\beta + \Phi\theta = \varphi, \qquad (2.0.3)$$
where $\Xi = [\xi_1, \ldots, \xi_m]$ is an unknown measured matrix, $\Phi = [\phi_1, \ldots, \phi_p]$ is a known, precisely prescribed matrix, with full rank $\operatorname{rank} A = \operatorname{rank}[\Xi, \Phi] = m + p$, $\beta = (\beta_1, \ldots, \beta_m)^T$ and $\theta = (\theta_1, \ldots, \theta_p)^T$ are unknown parameters, and $\varphi = (\varphi_1, \ldots, \varphi_n)^T$ is an unknown response vector. We also construct for this linear functional relation a structural relation
where $\varsigma = (\varsigma_1, \ldots, \varsigma_n)^T$ is an unknown measured response vector, $F = [f_1, \ldots, f_p]$ is a random matrix, and $J$ is an error matrix. We briefly consider this still more universal description of an experiment.
2.3.1 Maximum-Likelihood Method of the Parameter Estimation of Linear Model of Passive-Active Experiment
Assumption 2.3.1. We shall assume the linear stochastic model of a passive-active experiment of full rank, which uses the functional relation (2.0.3) and the structural relation (2.3.0):
and we shall assume that the errors are normally distributed.
We assume that the experiment is both passive and active, that is, the researchers prescribe the exact predictor matrix $\Phi$, measure the matrix $\Xi$, and estimate the parameters $\theta$ and $\beta$. The researchers have at their disposal a random response vector
that is further corrupted by measurement errors $e$, the random regressor matrix $X$, and the exact predictor matrix $\Phi$. That is, the model contains structural relations of the form of Eq. (2.3.0), on which the measurement errors $e$ of the response are imposed. Because of the complexity of their parameter estimation, models of this type have not been considered before, although an experiment in which the researcher prescribes some quantities and passively measures others seems natural in many respects. The scheme of such an experiment is shown in Figure 2.3-1.
Figure 2.3-1. Scheme of passive-active experiment.
We shall study the estimation of the unknown parameters by the MLM.
Problem 2.3.1. Knowing one realization $\tilde\eta = \Xi\beta + \Phi\theta + \tilde J\theta + \tilde e$ of the random variable $\eta$, one realization $\tilde X = \Xi + \tilde C$ ($\operatorname{rank}\tilde X = m$) of the random matrix $X$, the nonrandom matrix $\Phi$ ($\operatorname{rank}\Phi = p$), and the covariance matrices $M$, $T$, $\Sigma$, $P$, estimate the unknown parameters $\beta$, $\theta$ of the linear confluent-influent stochastic model (2.3.1) by the MLM.
Theorem 2.3.1. The estimates $\hat b$, $\hat t$ of the parameters $\beta$, $\theta$ of the linear confluent-influent model (2.3.1) of the passive-active experiment from Problem 2.3.1 can be computed by minimizing the functional
where $\Psi = \{\sigma_{ik} + \beta^T M_{ik}\beta - (T_{ik} + T_{ki})\beta + \theta^T P_{ik}\theta\}$, $\Sigma_P = \{\sigma_{ik} + \theta^T P_{ik}\theta\}$.
Proof. We change the model and the statement of the problem. Instead of the matrix $X$ and the vector $\eta$ we consider the vector $z = (X, -\eta^T)^T$ of dimension $mn + n$. To do this we arrange the matrix $X$ by rows into a row and adjoin to it the row $-\eta^T$, and we do the same for the vectors $\zeta = (\Xi, -\varphi^T)^T$ and $w = (C, -u^T)^T$. Then the original linear confluent-influent model reduces to a linear regression model with the linear constraints $\Gamma\zeta = -\Phi\theta$
where the constraint matrix $\Gamma$ is the same as in item 1.4.1 and the covariance matrix has the form $\Sigma_P = \{\sigma_{ik} + \theta^T P_{ik}\theta\}$ (its computation is similar to item 2.1.1). Problem 2.3.1 thus reduces to the following two-stage minimization problem: estimate the true values of the parameters so that the negative double log-likelihood function
is minimized subject to the constraints $\Gamma\zeta = -\Phi\theta$, and then find the minimum of this form over all possible values of the parameters $\beta$, $\theta$:
Consider the first stage. We use the method of undetermined Lagrange multipliers, i.e., we multiply the constraints $\Gamma\zeta + \Phi\theta = 0$ by the vector $2\lambda = 2(\lambda_1, \lambda_2, \ldots, \lambda_n)^T$, add the product to the negative double log-likelihood, and come to the Lagrangian minimization:
$$\min_{\zeta \in R^{nm+n},\ \lambda \in R^{n}} (\tilde z - \zeta)^T \Omega^{-1} (\tilde z - \zeta) + 2\lambda^T(\Gamma\zeta + \Phi\theta) + \ln\det(\Omega) + (nm + n)\ln 2\pi.$$
A necessary condition for a minimum is that the derivatives with respect to $\zeta$ and $\lambda$ vanish:
To construct the estimator for the vector of undetermined Lagrange multipliers, we left-multiply the first equation of the augmented normal equation by the matrix $\Gamma\Omega$. Since
the estimator of the vector $\zeta$ is calculated from the relationship
and its covariance matrix from the relationship
For any values of the parameters $\beta$, the matrix $\Gamma$ has full rank because of the presence of the submatrix $I$. Therefore the minimization problem for the negative double log-likelihood under linear constraints always has a unique solution. Then the original variational problem is rewritten in a form independent of $\Xi$ and $\varphi$:
$$(\tilde z - \hat\zeta)^T \Omega^{-1}\,\Omega\,\Omega^{-1} (\tilde z - \hat\zeta) + \ln\det(\Omega) + (nm + n)\ln 2\pi$$
$$= (\tilde\eta - \tilde X\beta - \Phi\theta)^T \Psi^{-1} (\tilde\eta - \tilde X\beta - \Phi\theta) + \ln\det(\Omega) + (nm + n)\ln 2\pi,$$
where $\Psi = \{\sigma_{ik} + \beta^T M_{ik}\beta - (T_{ik} + T_{ki})\beta + \theta^T P_{ik}\theta\}$, $M_{ik}$ are the elements of the cell with number $ik$ and dimension $m \times m$ of the covariance matrix $M$ ($mn \times mn$), $T_{ik}$ is the segment $k$ of dimension $m$ of the row $i$ of the matrix $T$ ($n \times nm$), and $P_{ik}$ are the elements of the cell with number $ik$ and dimension $p \times p$ of the covariance matrix $P$ ($pn \times pn$). Differentiating this expression with respect to the parameters $\beta$, $\theta$, we obtain the SNAE
$$2(\tilde X^T \Psi^{-1} \tilde X)_j\,\beta - \tilde u^T \Psi^{-1} \frac{\partial\Psi}{\partial\beta_j} \Psi^{-1} \tilde u + 2(\tilde X^T \Psi^{-1} \Phi)_j\,\theta = 2(\tilde X^T \Psi^{-1} \tilde\eta)_j, \quad j = 1, \ldots, m;$$
where $\tilde u = \tilde\eta - \tilde X\beta - \Phi\theta$ and $\partial\Psi/\partial\theta_j = \{((P_{ik} + P_{ik}^T)\theta)_j\}$. Since the original matrix is nonsingular, the estimator $\hat b$, $\hat t$ is the unique solution of Problem 2.3.1. Now, using (*), we easily obtain the estimators of the regressor matrix $\Xi$ and the response vector $\varphi$, respectively
Theorem 2.3.1.1. The mean of the RSS is given by
$$\mathrm{E}\,\tilde u^T \Psi^{-1} \tilde u = n.$$
The obtained result fully satisfies the correspondence principle. Indeed, when the matrix $\Xi$ is absent in Eq. (2.3.1), this estimation method reduces to the MLM [Fedorov 1968, Mechenov 1988], and when the matrix $\Phi$ is absent, it reduces to the LDM [Mechenov 1988].
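As a reading aid, the first stage of the proof of Theorem 2.3.1 (the minimization over $\zeta$ under the linear constraints) can be worked out explicitly. The sketch below assumes only that $\Omega$ is symmetric positive definite and that $\Gamma$ has full row rank. The identifications $\Gamma\tilde z = \tilde X\beta - \tilde\eta$ and $\Psi = \Gamma\Omega\Gamma^T$ in the last line are assumptions about the form of $\Gamma$ in item 1.4.1, which is not reproduced here.

```latex
\begin{align*}
% Lagrangian of the first stage (terms independent of \zeta omitted)
L(\zeta,\lambda) &= (\tilde z-\zeta)^T\Omega^{-1}(\tilde z-\zeta)
                   + 2\lambda^T(\Gamma\zeta+\Phi\theta), \\
% stationarity in \zeta and in \lambda
\partial L/\partial\zeta = 0 &:\quad \Omega^{-1}(\tilde z-\zeta) = \Gamma^T\lambda, \\
\partial L/\partial\lambda = 0 &:\quad \Gamma\zeta+\Phi\theta = 0, \\
% left-multiply the first equation by \Gamma\Omega and use the constraint
\Gamma\tilde z+\Phi\theta &= \Gamma\Omega\Gamma^T\lambda
  \;\Longrightarrow\;
  \hat\lambda = (\Gamma\Omega\Gamma^T)^{-1}(\Gamma\tilde z+\Phi\theta), \\
\hat\zeta &= \tilde z-\Omega\Gamma^T\hat\lambda, \\
% value of the quadratic form at the constrained minimum
(\tilde z-\hat\zeta)^T\Omega^{-1}(\tilde z-\hat\zeta)
  &= (\Gamma\tilde z+\Phi\theta)^T(\Gamma\Omega\Gamma^T)^{-1}(\Gamma\tilde z+\Phi\theta) \\
  &= (\tilde\eta-\tilde X\beta-\Phi\theta)^T\Psi^{-1}(\tilde\eta-\tilde X\beta-\Phi\theta).
\end{align*}
```

Under these assumptions the last line agrees with the quadratic form that the proof differentiates with respect to $\beta$ and $\theta$ in the second stage.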
2.3.2 Homoscedasticity in Experiment
Assumption 2.3.2. We assume the following well-posed linear stochastic confluent-influent model of passive-active experiments (the functional relation (2.0.3), the structural relation (2.3.0))
We assume that the experiment is both passive and active. Also, each random matrix has its own error variance. We consider the estimation of the required parameters by the MLM.
Problem 2.3.2. Assume that we have one realization $\tilde\eta$ of the random variable $\eta$, one realization $\tilde X$ of the random matrix $X$, and the predictor matrix $\Phi$. We also have their variances $\sigma^2$, $\mu^2$, $\rho^2$. It is required to estimate the parameters $\beta$, $\theta$ of the linear stochastic model (2.3.2).
Theorem 2.3.2. The estimates of the parameters $\beta$, $\theta$ of the linear model (2.3.2) from Problem 2.3.2 minimize
and are calculated from the SNAE
We call this SNAE the normal equation of the passive-active experiment.
Proof. We apply the same proof scheme as in item 2.3.1. Following it, we consider the place where the estimate of the vector $\zeta$ is calculated from the equation (*). Using the estimate of $\zeta$ from (*), we rewrite the original variational problem in a form independent of $\Xi$ and $\varphi$:
Differentiating this relation with respect to the parameters $\beta$, $\theta$, we obtain the SNAE (**), whence we can calculate the required estimate, which supplies the solution to Problem 2.3.2.
2.3.3 Singular Model
The degenerate model is obviously meaningless with respect to the predictor matrix $\Phi$, but for the matrix $\Xi$ such a supposition can occur in practice, and therefore it deserves study.
Assumption 2.3.3. We use the following ill-posed linear stochastic model of passive-active experiments (the functional relation (2.0.3), the structural relation (2.3.0))
Definition 2.3.3. The vector $(\beta^T; \theta^T)^T_0$ is called the normal solution of the SLAE (2.0.3) with the degenerate matrix $[\Xi, \Phi]$ if
$$(\beta^T; \theta^T)^T_0 = \operatorname{Arg}\min_{\beta,\theta:\ \Xi\beta + \Phi\theta = \varphi} |\beta|^2 + |\theta|^2$$
for the known exact matrices $\Xi$, $\Phi$ and the vector $\varphi$.
Before stating the problem, we alter the model. As in item 2.3.1, we reduce the original linear confluent-influent model to a linear regression model with the linear constraints $\Gamma\zeta = -\Phi\theta$
where the matrix $\Gamma$ ($n \times (mn + n)$) has the same form as in item 1.4.1 and the matrix $\Sigma_P = \{\sigma_{ik} + \theta^T P_{ik}\theta\}$ (its computation is similar to item 2.1.1).
Definition 2.3.4. Let
$$\Big\{b, t:\ \min_{\zeta:\ \Gamma\zeta + \Phi t = 0} |\tilde z - \zeta|^2_{\Omega^{-1}} + \ln\det\Omega + (nm + n)\ln 2\pi \le \omega^2\Big\}$$
be the set of admissible pseudosolutions (estimates), where
$$\omega^2 = \min_{\zeta:\ \Gamma\zeta + \Phi\theta = 0} |\tilde z - \zeta|^2_{\Omega^{-1}} + \ln\det\Omega + (nm + n)\ln 2\pi.$$
Problem 2.3.3. Knowing the realization vector $\tilde z$ containing the full-rank matrix $\tilde X$, the exact matrices $\Phi$, $\Sigma$, $T$, $M$, $P$, the value $\omega^2$, and the fact that the matrix $[\Xi, \Phi]$ of the SLAE (2.0.3) has incomplete rank, calculate the approximation vector to the normal vector so that the square of this approximation is minimal on the set of admissible pseudosolutions
Theorem 2.3.3. The solution of Problem 2.3.3 exists, is unique, and satisfies the relation
$$(\hat b^T; \hat t^T)^T_\omega = \operatorname{Arg}\min_{b,t:\ |\tilde X b + \Phi t - \tilde\eta|^2_{\Psi^{-1}} + \ln\det\Omega + (mn + n)\ln 2\pi = \omega^2} |b|^2 + |t|^2.$$
Proof. We take advantage of Theorem 2.3.1 and rewrite the original variational problem in the form
Since the minimum is attained on the boundary of the set [Tikhonov & Arsenin 1979], the inequality can be replaced with an equality:
$$(\hat b^T; \hat t^T)^T_\omega = \operatorname{Arg}\min_{b,t:\ |\tilde X b + \Phi t - \tilde\eta|^2_{\Psi^{-1}} + \ln\det\Omega + (mn + n)\ln 2\pi = \omega^2} |b|^2 + |t|^2.$$
We use the method of undetermined Lagrange multipliers. For this purpose we multiply the constraint by the multiplier $\lambda$, add the product to the minimized quadratic form, and come to the Lagrangian minimization
The estimates calculated from this relation supply the solution of Problem 2.3.3.
2.3.4 Degeneracy and Homoscedasticity
Assumption 2.3.4. We use the following incomplete-rank linear stochastic model of passive-active experiments (the functional relation (2.0.3), the structural relation (2.3.0))
We reduce the original linear confluent-influent model (2.3.4) to a regression model, linear with respect to $\zeta$, with the linear constraints $\Gamma\zeta = -\Phi\theta$ of the form:
Problem 2.3.4. Given the realization vector $\tilde z$, the exact matrix
$$(\hat b^T; \hat t^T)^T_\omega = \operatorname{Arg}\min_{(b,t)\ \text{admissible}} |b|^2 + |t|^2.$$
Theorem 2.3.4. The solution of Problem 2.3.4 exists, is unique, and satisfies the relation
$$(\hat b^T, \hat t^T)^T_\omega = \operatorname{Arg}\min_{b,t:\ \frac{|\tilde X b + \Phi t - \tilde\eta|^2}{\sigma^2 + \mu^2|b|^2 + \rho^2|t|^2} + nm\ln\mu^2 + n\ln(\sigma^2 + \rho^2|t|^2) + (nm + n)\ln 2\pi = \omega^2} |b|^2 + |t|^2.$$
Proof. We take advantage of Theorem 2.3.3 and rewrite the original variational problem in the form:
$$(\hat b^T, \hat t^T)^T_\omega = \operatorname{Arg}\min_{b,t:\ \frac{|\tilde X b + \Phi t - \tilde\eta|^2}{\sigma^2 + \mu^2|b|^2 + \rho^2|t|^2} + nm\ln\mu^2 + n\ln(\sigma^2 + \rho^2|t|^2) + (nm + n)\ln 2\pi = \omega^2} |b|^2 + |t|^2.$$
To compute the minimum, we apply the method of undetermined Lagrange multipliers. For this purpose we multiply the constraint
by the multiplier $\lambda$, add the product to the quadratic form, and come to the minimization of the quadratic form
Differentiating this expression with respect to $b$, $t$, and $\lambda$, we obtain the SNAE
where $\alpha = 1/\lambda > 0$ is the numerical Lagrange multiplier. The obtained estimate supplies the solution of Problem 2.3.4.
This section has solved the problem of parameter estimation for models of well-posed and ill-posed passive-active experiments. To complete the study, it remains only to supplement the model investigated above with free (regression) parameters, as is done in the following section.
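The role of the multiplier $\alpha = 1/\lambda$ can be illustrated on the simplest special case, which is not the full passive-active model: the matrix is exact and singular, only the right-hand side is noisy, and the admissible set reduces to a bound on the squared residual. Minimizing $|x|^2$ subject to $|Ax - y|^2 = \omega^2$ then leads to the system $(A^T A + \alpha I)x = A^T y$, with $\alpha$ chosen so that the residual constraint holds. The following Python sketch assumes a system with two unknowns; the function names and the bisection search for $\alpha$ are illustrative choices, not the author's algorithm.

```python
def solve2(m, r):
    """Solve a 2x2 linear system m x = r by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(r[0] * m[1][1] - m[0][1] * r[1]) / det,
            (m[0][0] * r[1] - r[0] * m[1][0]) / det]


def regularized(a, y, alpha):
    """Solve (A^T A + alpha I) x = A^T y for a matrix A with two columns."""
    rows = range(len(a))
    ata = [[sum(a[k][i] * a[k][j] for k in rows) + (alpha if i == j else 0.0)
            for j in range(2)] for i in range(2)]
    aty = [sum(a[k][i] * y[k] for k in rows) for i in range(2)]
    return solve2(ata, aty)


def residual(a, y, x):
    """Squared residual |A x - y|^2."""
    return sum((row[0] * x[0] + row[1] * x[1] - yk) ** 2
               for row, yk in zip(a, y))


def discrepancy_alpha(a, y, omega2, lo=1e-12, hi=1e6, iterations=200):
    """Find alpha with |A x_alpha - y|^2 = omega2 by bisection on a log scale.

    Relies on the residual being a nondecreasing function of alpha.
    """
    for _ in range(iterations):
        mid = (lo * hi) ** 0.5
        if residual(a, y, regularized(a, y, mid)) < omega2:
            lo = mid
        else:
            hi = mid
    return (lo * hi) ** 0.5


# Hypothetical data: a rank-one matrix and a slightly inconsistent right-hand side.
A = [[1.0, 1.0], [1.0, 1.0]]
y = [2.1, 1.9]
alpha = discrepancy_alpha(A, y, omega2=0.05)
x = regularized(A, y, alpha)
```

For this rank-one example the unregularized normal equations are singular, while every $\alpha > 0$ gives a unique solution with equal components; increasing the admissible residual level shrinks the solution toward zero.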
2.4. Analysis of Passive-Active-Regression Experiment
Having connected all the models, it is easier to see the real differences between them and to understand their interaction.
Assumption 2.4. We consider the following linear functional relation
where $\Xi = [\xi_1, \ldots, \xi_m]$ is an unknown measured matrix, $\Phi = [\phi_1, \ldots, \phi_p]$ is a known prescribed matrix, $H = [\eta_1, \ldots, \eta_k]$ is a known theoretical matrix, with common rank $\operatorname{rank} A = \operatorname{rank}[\Xi, \Phi, H] = m + p + k$, $\beta = (\beta_1, \ldots, \beta_m)^T$, $\theta = (\theta_1, \ldots, \theta_p)^T$, and $\delta = (\delta_1, \ldots, \delta_k)^T$ are unknown parameters, and $\varphi = (\varphi_1, \ldots, \varphi_n)^T$ is an unknown response vector. We also introduce the structural relation
corresponding to Eq. (2.0.4), where $\varsigma = (\varsigma_1, \ldots, \varsigma_n)^T$ is an unknown measured response vector, $F = [f_1, \ldots, f_p]$ is a random matrix, and $J$ is its error.
That is, the structural relation is a functional relation in which one of the three matrices contains an additive random error.
2.4.1 Maximum Likelihood Method of the Parameter Estimation of Linear Model of Passive-Active-Regression Experiment
Assumption 2.4.1. We shall assume the full-rank linear confluent-influent-regression stochastic model of the passive-active-regression experiment
based on the functional relation (2.0.4) and the structural relation (2.4.0). We assume that the experiment is both passive and active, that is, the events occur when $\Xi$, $\Phi$, $H$, $\beta$, $\theta$, $\delta$ have certain values. The researchers have at their disposal the predictor matrix $\Phi$, the matrix of theoretical values $H$, the random response vector
with the error vector $u = J\theta + e$, and the random confluent matrix $X$ with the error matrix $C$. This model is generated by structural relations of the form $\varsigma = \Xi\beta + F\theta + H\delta$, on which the measurement errors $e$ of the response $\varsigma$ are imposed. Such a model takes all the nuances of carrying out the experiment into account: it contains confluent, influent, and regression parts. The scheme of such an experiment is shown in Figure 2.4-1.
In what follows we shall assume that the errors are normally distributed and study the estimation of the unknown parameters by the MLM.
Problem 2.4.1. Suppose we know one realization $\tilde\eta = \Xi\beta + \Phi\theta + H\delta + \tilde J\theta + \tilde e$ and $\tilde X = \Xi + \tilde C$ ($\operatorname{rank}\tilde X = m$), respectively, of the random variables $\eta$ and $X$, the values of the matrices
where $\Psi = \{\sigma_{ik} + \beta^T M_{ik}\beta - (T_{ik} + T_{ki})\beta + \theta^T P_{ik}\theta\}$, $\Sigma_P = \{\sigma_{ik} + \theta^T P_{ik}\theta\}$.
Proof. We alter the model and the statement of the problem. Instead of the matrix $X$ and the vector $\eta$ we consider the vector $z = (X, -\eta^T)^T$ of dimension $mn + n$. To do this we arrange the matrix $X$ by rows into a row and adjoin to it the row $-\eta^T$, and we do the same for the vectors $\zeta = (\Xi, -\varphi^T)^T$ and $w = (C, -u^T)^T$. Then Eq. (2.4.1) reduces to a regression model, linear with respect to $\zeta$, with the linear constraints $\Gamma\zeta = -\Phi\theta - H\delta$
where the matrix $\Gamma$ is the same as in item 1.4.1 and the matrix $\Sigma_P = \{\sigma_{ik} + \theta^T P_{ik}\theta\}$ (its evaluation is similar to 2.1.1). Problem 2.4.1 thus reduces to the following two-stage minimization problem: estimate the unknown values of the parameters so that the negative double logarithm of the likelihood function is minimized subject to the constraints $\Gamma\zeta = -\Phi\theta - H\delta$, and then minimize this negative double logarithm of the likelihood function over all possible values of the parameters $\beta$, $\theta$, $\delta$.
Let us consider the first stage. To solve it we use the method of undetermined Lagrange multipliers. For this purpose we multiply the constraints $\Gamma\zeta + \Phi\theta + H\delta = 0$ by the vector of multipliers $2\lambda$ and add the product to the negative double logarithm of the likelihood function:
$$\min_{\zeta,\ \lambda} (\tilde z - \zeta)^T \Omega^{-1} (\tilde z - \zeta) + 2\lambda^T(\Gamma\zeta + \Phi\theta + H\delta) + \ln\det(\Omega) + (nm + n)\ln 2\pi.$$
A necessary condition for a minimum is that the derivatives with respect to the vector $\zeta$ and the vector $\lambda$ vanish:
To construct the estimator for the vector of undetermined Lagrange multipliers, we left-multiply the first equation of the expanded normal equation by the matrix $\Gamma\Omega$. Since
the estimator $\hat\zeta$ of the vector $\zeta$ is calculated from the relationship
and its covariance matrix is equal to
Since the matrix $\Gamma$ has full rank by virtue of the presence of the unit submatrix $I$, the estimator $\hat\zeta$ is the unique solution of the first stage of Problem 2.4.1. Using the estimator $\hat\zeta$ (*), we rewrite the initial variational problem in a form independent of the unknowns $\Xi$ and $\varphi$:
where $\Psi = \{\sigma_{ik} + \beta^T M_{ik}\beta - (T_{ik} + T_{ki})\beta + \theta^T P_{ik}\theta\}$, $M_{ik}$ are the elements of the cell with number $ik$ and dimension $m \times m$ of the covariance matrix $M$ ($mn \times mn$), $T_{ik}$ is the segment $k$ of dimension $m$ of the row $i$ of the matrix $T$ ($n \times nm$), and $P_{ik}$ are the elements of the cell with number $ik$ and dimension $p \times p$ of the covariance matrix $P$. Differentiating this expression with respect to the parameters $\beta$, $\theta$, and $\delta$, we obtain the SNAE
where $\tilde u = \tilde\eta - \tilde X\beta - \Phi\theta - H\delta$, $\partial\Psi/\partial\beta_j = \{((M_{ik} + M_{ik}^T)\beta - T_{ik}^T - T_{ki}^T)_j\}$, and $\partial\Psi/\partial\theta_j = \{((P_{ik} + P_{ik}^T)\theta)_j\}$. Since the original matrices $\tilde X$, $\Phi$, $H$ are of full rank, the estimators $\hat b$, $\hat t$, $\hat d$ are the unique solution of Problem 2.4.1. Using relation (*), it is easy to calculate the estimates of the regressor matrix $\Xi$ and of the response vector $\varphi$, respectively
Theorem 2.4.1.1. The mean of the RSS is equal to
$$\mathrm{E}\,\tilde u^T \Psi^{-1} \tilde u = n.$$
2.4.2 Homoscedastic Experiment
Assumption 2.4.2. Given the functional relationship (2.0.4) and the functional-structural relationship (2.4.0), we consider the following homoscedastic well-posed linear confluent-influent-regression stochastic model of the passive-active experiment
The experiment is homoscedastic, i.e., events occur when $e$, $C$, $J$ have respectively constant variances. We consider the estimation of the required parameters by the MLM.
Problem 2.4.2. Assume that we are given one realization of the random variable $\eta$ and one realization of the random matrix $X$:
and the matrices $\Phi$, $H$. We also know the corresponding variances $\sigma^2$, $\mu^2$, $\rho^2$. It is required to estimate the unknowns $\varphi$, $\Xi$ and the parameters $\beta$, $\theta$, $\delta$ of the linear stochastic model (2.4.2) by the MLM.
Theorem 2.4.2. The estimators of the parameters $\beta$, $\theta$, $\delta$ of the linear model (2.4.2) for Problem 2.4.2 minimize
They are calculated from the SNAE
Proof. We apply the same proof scheme as in item 2.4.1. Following it, we consider the place where the estimator of the vector $\zeta$ is calculated from the relationship
where
Then the initial variational problem is rewritten in a form independent of $\Xi$ and $\varphi$:
Differentiating this expression with respect to the parameters $\beta$, $\theta$, $\delta$, we obtain the above-stated SNAE.
Remark 2.4.2. It is easy to construct an iterative process that generates the sought estimators. The mean weighted RSS is used as the zeroth approximation. Substituting this value in the SNAE, we solve the SLAE, recalculate the weighted RSS, and so on.
2.4.3 Singular Model
The singular model is obviously meaningless with respect to the predictor matrix $\Phi$ or the theoretical matrix $H$, but for the matrix $\Xi$ such a supposition is normal, and therefore it deserves study.
Assumption 2.4.3. We use the following linear stochastic model of incomplete rank for the passive-active-regression experiment (the linear functional equation (2.0.4), the structural relation (2.4.0))
Definition 2.4.3. The vector $(\beta^T; \theta^T; \delta^T)^T_0$ is referred to as a normal solution of the SLAE (2.0.4) with the degenerate matrix $[\Xi, \Phi, H]$ if
$$(\beta^T; \theta^T; \delta^T)^T_0 = \operatorname{Arg}\min_{\beta,\theta,\delta:\ \Xi\beta + \Phi\theta + H\delta = \varphi} |\beta|^2 + |\theta|^2 + |\delta|^2$$
for the exactly known matrices $\Xi$, $\Phi$, $H$ and the vector $\varphi$.
Before stating the problem, we alter the model to the form
(its evaluation is similar to 2.4.1).
Definition 2.4.4. Let the set of admissible estimators be
$$\Big\{b, t, d:\ \min_{\zeta:\ \Gamma\zeta + \Phi t + H d = 0} |\tilde z - \zeta|^2_{\Omega^{-1}} + \ln\det\Omega + (nm + n)\ln 2\pi \le \omega^2\Big\},$$
where
Problem 2.4.3. Given the approximate vector $\tilde z$, and also the exact matrices
$$(\hat b^T; \hat t^T; \hat d^T)^T_\omega = \operatorname{Arg}\min_{(b,t,d)\ \text{admissible}} |b|^2 + |t|^2 + |d|^2.$$
Theorem 2.4.3. The solution of Problem 2.4.3 exists, is unique, and satisfies the relation:
$$(\hat b^T, \hat t^T, \hat d^T)^T_\omega = \operatorname{Arg}\min_{b,t,d:\ |\tilde X b + \Phi t + H d - \tilde\eta|^2_{\Psi^{-1}} + \ln\det\Omega + (mn + n)\ln 2\pi = \omega^2} |b|^2 + |t|^2 + |d|^2.$$
Proof. We take advantage of Theorem 2.4.1 and rewrite the initial variational problem as
$$(\hat b^T; \hat t^T; \hat d^T)^T_\omega = \operatorname{Arg}\min_{b,t,d:\ |\tilde X b + \Phi t + H d - \tilde\eta|^2_{\Psi^{-1}} + \ln\det\Omega + (mn + n)\ln 2\pi \le \omega^2} |b|^2 + |t|^2 + |d|^2.$$
It is known [Tikhonov & Arsenin 1979] that the minimum is attained on the boundary of the set, that is
To evaluate it, we apply the method of undetermined Lagrange multipliers. For this purpose we multiply the constraint by the multiplier $\lambda$, add the product to the quadratic form, and as a result minimize the Lagrangian
Differentiating this expression with respect to the parameters and $\lambda$, we obtain the SNAE, and the estimate calculated from this SNAE is the unique solution of Problem 2.4.3.
Remark 2.4.3. In this case, however, the value $\omega^2$ is difficult to estimate, since $\Omega$ depends on the unknown parameters. Therefore the variant containing only the residue seems preferable, especially since asymptotically both variants lead to the same outcome. As the mean of the RSS is known from Theorem 2.4.1.1 and is equal to $n$, the last equation of this SNAE can be replaced by
2.4.4 Singularity and Homoscedasticity
Assumption 2.4.4. We use the following incomplete-rank linear stochastic model of the passive-active-regression experiment (the linear functional equation (2.0.4), the structural relation (2.4.0))
We reduce Eq. (2.4.4) to a regression model, linear with respect to $\zeta$, with the linear constraints $\Gamma\zeta = -\Phi\theta - H\delta$
where the matrix $\Gamma$ is the same as in item 1.4.1.
Problem 2.4.4. Given the approximate vector $\tilde z$, and also the exact matrices $\Phi$, $H$, the corresponding variances $\sigma^2$, $\mu^2$, $\rho^2$, the value $\omega^2$, and the fact that the matrix $[\Xi, \Phi, H]$ of the SLAE (2.0.4) has incomplete rank, calculate the unknown values of the estimator vector of the normal vector $(\beta^T; \theta^T; \delta^T)^T_0$ so that the square of this estimator is minimal on the set of admissible estimators:
$$(\hat b^T; \hat t^T; \hat d^T)^T_\omega = \operatorname{Arg}\min_{(b,t,d)\ \text{admissible}} |b|^2 + |t|^2 + |d|^2.$$
Theorem 2.4.4. The solution of Problem 2.4.4 exists, is unique, and satisfies the relation:
$$(\hat b^T; \hat t^T; \hat d^T)^T_\omega = \operatorname{Arg}\min_{b,t,d:\ \frac{|\tilde X b + \Phi t + H d - \tilde\eta|^2}{\sigma^2 + \mu^2|b|^2 + \rho^2|t|^2} + nm\ln\mu^2 + n\ln(\sigma^2 + \rho^2|t|^2) + (mn + n)\ln 2\pi = \omega^2} |b|^2 + |t|^2 + |d|^2$$
and the SNAE
where $\alpha = 1/\lambda > 0$ is a numerical Lagrange multiplier.
Proof. We take advantage of Theorem 2.4.3 and rewrite the initial variational problem in the following form:
To evaluate the minimum, we apply the method of undetermined Lagrange multipliers. For this purpose we multiply the constraint
by the multiplier $\lambda$, add the product to the quadratic form, and as a result minimize the Lagrangian
Differentiating this expression with respect to the parameters and the numerical Lagrange multiplier (having replaced $\alpha = 1/\lambda > 0$), we obtain the required SNAE, and the estimator is the unique solution of Problem 2.4.4.
Remark 2.4.4. As the mean of the weighted RSS is equal to $n$:
then, having replaced the last line of the SNAE and the corresponding values accordingly, we obtain the SNAE
whose solutions are asymptotically identical to the previous ones, but which is somewhat easier to compute. We remark that for exact matrices (the influent and regression parts) there is no negative shift of the diagonal elements, whereas for the matrix of the passive part there is a negative shift. So the formulas for computing the regularized solution do not coincide.
2.4.5 Summary Table of Experiments
All results are collected in small tables that give an overall characterization of the experimental material and of the methods of parameter estimation. By completely simplifying the description of all the errors occurring in the model (or in the system of linear algebraic equations), that is, assuming they are homoscedastic and have variance 1, one can construct a summary table of the models, quadratic forms, and equations for estimating the parameters (for computing the pseudosolutions) of all possible combinations of measurements and prescriptions of the initial data. Table 2.4.1 contains the different models in the presence of homoscedastic errors. In the Appendix, the tables of models in the presence of homoscedastic errors (Table 1) and in the presence of heteroscedastic errors (Table 2) are shown.
Table 2.4.1. Models of errors for systems of linear algebraic equations and methods of computing their pseudosolutions. The input data are in column D and the unknown values in column V.
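Models ($A = [\Xi, \Phi, H]$); the equations and quadratic forms; the authors

[Gauss 1809, Legendre 1806]

$(\tilde X^T \tilde X - s^2 I)\beta = \tilde X^T \tilde\eta$, $\quad s^2 = \min \dfrac{|\tilde X\beta - \tilde\eta|^2}{1 + |\beta|^2} = \lambda_{\min}[\tilde X, -\tilde\eta]$ [Pearson 1901, Mechenov 1988]

[Mechenov 1991]

$(\Phi^T \Phi + nI - s^2 I)\theta = \Phi^T \tilde\eta$ [Fedorov 1968]

$(\Phi^T \Phi + nI - s^2 I)\theta + \Phi^T H\delta = \Phi^T \tilde\eta$
$H^T \Phi\theta + H^T H\delta = H^T \tilde\eta$ [Mechenov 1996]

$(\tilde X^T \tilde X - s^2 I)\beta + \tilde X^T \Phi\theta = \tilde X^T \tilde\eta$
$\Phi^T \tilde X\beta + (\Phi^T \Phi + nI - s^2 I)\theta = \Phi^T \tilde\eta$ [Mechenov 1996]

$(\tilde X^T \tilde X - s^2 I)\beta + \tilde X^T \Phi\theta + \tilde X^T H\delta = \tilde X^T \tilde\eta$
$\Phi^T \tilde X\beta + (\Phi^T \Phi + nI - s^2 I)\theta + \Phi^T H\delta = \Phi^T \tilde\eta$
$H^T \tilde X\beta + H^T \Phi\theta + H^T H\delta = H^T \tilde\eta$ [Mechenov 1996]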
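The iterative process mentioned in Remark 2.4.2 can be illustrated on the smallest passive (confluent) case: one regressor, no intercept, and equal error variances in the regressor and in the response. In that case the equation $(\tilde X^T \tilde X - s^2 I)\beta = \tilde X^T \tilde\eta$ with $s^2 = |\tilde X\beta - \tilde\eta|^2 / (1 + |\beta|^2)$ becomes scalar and can be solved by fixed-point iteration. The Python sketch below is an illustration under these assumptions. The function name, the sample data, and the choice of the ordinary least-squares slope as the starting value are not taken from the book.

```python
def orthogonal_slope(x, y, iterations=50):
    """Fixed-point iteration for (Sxx - s^2) b = Sxy with a single regressor."""
    sxx = sum(u * u for u in x)
    sxy = sum(u * v for u, v in zip(x, y))
    b = sxy / sxx  # starting value: ordinary least-squares slope
    for _ in range(iterations):
        rss = sum((u * b - v) ** 2 for u, v in zip(x, y))
        s2 = rss / (1.0 + b * b)   # weighted RSS for equal error variances
        b = sxy / (sxx - s2)       # solve the shifted normal equation
    return b


# Hypothetical measurements of a line through the origin.
x_data = [1.0, 2.0, 3.0, 4.0]
y_data = [1.1, 1.9, 3.2, 3.8]
slope = orthogonal_slope(x_data, y_data)
```

Because $s^2 > 0$ is subtracted from the diagonal, the resulting slope is larger in magnitude than the least-squares slope, which illustrates the negative shift of the diagonal elements for the passive part of the matrix.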
2.4.6 Inference of Chapter 2
The basic result of the second chapter is the construction of the model of the passive-active-regression experiment. This completes the description of experimental research within the framework of confluent-influent-regression models. Such a gradation of experiments allows the researcher to understand the picture of the research better and to carry out the parameter estimation correctly. Indeed, when estimated with the usual Gauss transformation, the parameters turn out to be "underestimated" in the case of a passive experiment and "overestimated" in the case of an active one. A method for the effective correction of rounding errors in solving the SLAE and in estimating the parameters on a computer is also constructed. Regularized estimation methods are developed for the case of singular matrices. The given approach has ample opportunities for development in multivariate confluent analysis, in nonlinear models, in models with linear constraints, in models with non-Gaussian errors, etc.
Tikhonov applied the regularization term to compute solutions of integral equations of the first kind and of singular SLAEs, that is, for problems with strict singularity known beforehand. In such problems, since the number of solutions is infinite, an additional (a priori) condition for selecting a unique solution is necessary. This does not apply to ill-conditioned (but nevertheless conditioned) SLAEs or to taking a priori information about the solution into account, although in the end these methods can lead to similar equations for evaluating the solution. Problems of solving integral equations are considered in Chapter 3.
Chapter 3 LINEAR INTEGRAL EQUATIONS
Abstract In this chapter we allow for deterministic and random errors in variational methods that construct the pseudosolution of linear integral equations of the second kind and the regularized pseudosolution and quasisolution of the linear integral equations of the first kind. We consider both passive errors (i.e., errors during observation or measurement) in the right-hand side and passive or active errors (i.e., errors during specification) in the core. We consider the representation methods of the a priori information on a sought pseudosolution using the mixed models and the statistical regularization methods. We construct the numerical realization of these methods.
3. ANALYSIS OF PASSIVE AND OF ACTIVE EXPERIMENTS
In Chapters 1 and 2 [Mechenov 1991, 1997, 1998] we considered the estimation of the pseudosolution (the parameters) of linear functional equations of the first kind, such as systems of linear algebraic equations, in the presence of various deterministic and random errors in the input data. In this chapter we consider a more complicated form of linear functional equations [Mathematical encyclopedia 1985], namely linear integral equations. Solving a linear integral equation of the first kind is an ill-posed problem [Tikhonov 1963], whereas solving a linear integral equation of the second kind is a well-posed problem [Zabreyko, etc. 1968].
3.1 Variational Problems for the Construction of Pseudosolutions of Linear Integral Equations
In this section we compute the pseudosolutions of linear integral equations of the second kind in the presence of nonrandom passive or active errors in the core and nonrandom passive errors in the right-hand side. Similarly to [Mechenov 1997], we start with the well-posed problem.
3.1.1 Pseudosolutions of Linear Integral Equations of the Second Kind
Assumption 3.1.1. Take the Fredholm linear integral equation of the second kind [Zabreyko, etc. 1968]
with a sufficiently smooth nondegenerate core without singularities. We assume that this relationship is maintained exactly. We further assume that the constant $\lambda$ is exactly known and is not contained in the spectrum of the integral operator with the core $K(x, s)$.
3.1.1.1 Pseudosolution in the Presence of Deterministic Passive Errors
Consider the equation (3.1.1) in the presence of nonrandom passive measurement errors in the right-hand side and in the core.
Assumption 3.1.2. We use the following linear model of passive observations for the Fredholm linear integral equation of the second kind (3.1.1) in the presence of an $L_2$ nonrandom passive measurement error in the core and in the right-hand side
If we have a single observation with an error, $\tilde y$, only for the right-hand side of Eq. (3.1.2), the pseudosolution is computed from the standard variational problem (an analog of the least-squares method)
We first consider the construction of the pseudosolution for the case when both the core and the right-hand side are measured with an error.
Problem 3.1.1. Given one realization of the approximate right-hand side $\tilde y$ and the approximate core of model (3.1.2), compute the pseudosolution that minimizes the sum of squares of the deviation norms of the approximate input data from the sought data for the equation (3.1.1)
Theorem 3.1.1. Assume that the conditions of Problem 3.1.1 are satisfied. Then the variational problem (3.1.3) to construct the pseudosolution takes the form
and the pseudosolution (3.1.3) of equation (3.1.2) is computed from the equation
Proof. Passing from the integral to Darboux sums, that is, making an approximation (without loss of generality, on uniform grids), we replace the equation (3.1.1) with the SLAE
where $K_{th}$, $I_t$ are matrices such that [Mechenov 1977]
and the x- and s-partitions of the interval $[a, b]$ are identical, $\varphi_t$ is a vector such that
$\zeta_t = \zeta_h$ are the solution vectors on the grids $t$ and $h$, respectively. Assuming that the approximation errors are much smaller than the measurement errors [Mechenov 1977] and therefore can be incorporated in the input data errors, we consider the variational problem of computing the pseudosolution with a known approximate right-hand side $\tilde y_t$ and matrix $\tilde K_{th}$ for the SLAE (approximated as in Eq. (3.1.1'))
This system has been solved in [Mechenov 1991] (the proof is similar to Corollary 1.4.3), and its solution minimizes the functional
Differentiating, we find that to estimate the solution we need to solve the equation
Passing to the limit as $t \to 0$, $h \to 0$, that is, passing from Darboux sums back to integrals, we write the expression in functional form, that is, as the solution of the variational problem (3.1.4). This method can be used to construct numerically the pseudosolution of linear integral equations of the second kind in the presence of measurement errors in the core and in the right-hand side.
3.1.1.2 Pseudosolution in the Presence of Deterministic Active Errors
Let us consider the methods for computing the pseudosolution in the presence of nonrandom active errors associated with the realization of an exactly specified core and passive errors associated with the measurement of the right-hand side.
Assumption 3.1.3. We use the following linear model of active observations for the linear integral equation of the second kind (3.1.1):
Pseudosolution of Linear Functional Equations
145
where $J(x,s) \in L_2([a,b]\times[a,b])$ are the realization errors of the exactly specified core $K(x,s) \in L_2([a,b]\times[a,b])$. Exactly specified cores that are realized with an error (an active experiment [Mechenov 1997]) are considered here for the first time in the literature. We provide a formal statement of the construction of the pseudosolution.

Problem 3.1.2. Given the approximate right-hand side $\tilde{\eta}$, the error norm $\|e\| = \sigma$, the exactly specified core $K(x,s)$ and the norm of its realization error $\|J\| = \rho$ for model (3.1.5), compute the pseudosolution.
The total error of the right-hand side, $\tilde{u} = \tilde{\eta} - \zeta + \lambda K\zeta = -\lambda J\zeta + e$, depends on the sought solution, and therefore the minimization problem cannot be solved by standard methods, i.e., as
Its solution requires an equivalent of the statistical approach [Mechenov 1997, 2001]. Using the result of [Mechenov 1997], we construct a minimizing functional analogous to that of the maximum likelihood method [Kendall & Stuart 1968]. For simplicity, we consider the analog of the negative doubled logarithm of the likelihood function, omitting insignificant constants [Mechenov 1997].
Theorem 3.1.2. Assume that the conditions of Problem 3.1.2 are satisfied. Then the solution of the variational problem
to construct the pseudosolution exists, the pseudosolution is unique and it is computed from the equation
The pseudosolution computed in this way takes into account the active errors in the core. Let us now consider linear integral equations of the first kind, which must be solved using the regularization method.

3.1.2 Regularized Pseudosolution of the Linear Integral Equation of the First Kind in the Presence of Measurement Errors in the Right-Hand Side

Assumption 3.1.4. We use the following Fredholm linear integral equation of the first kind
with a sufficiently smooth nondegenerate core without singularities, so that the solution is unique. We assume that this relationship holds exactly. The Volterra integral equation of the first kind
can be regarded as a Fredholm equation if the core is extended by zero in the region $s > x$. The problem of computing the solution of this equation is ill-posed in the sense of Hadamard. The main feature of the various approaches to the solution of this problem with a known error in the right-hand side is that the approximate solution cannot, in principle, be computed without information about the norm of the right-hand side error [Tikhonov 1963] or the norm of the solution [Ivanov 1963]. We describe these approaches for the case when the right-hand side is measured with an error and then compare them with the previous results for a core with observation or specification errors. We thus assume that only the right-hand side of the equation (3.1.6) contains nonrandom observation errors.
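The two remarks above can be illustrated on a grid. The following is only a minimal sketch under stated assumptions, not the method of this book: it assumes the simplest Volterra core K(x, s) = 1 on [0, 1], a rectangle-rule discretization, and an alternating perturbation of size eps in the right-hand side; all function names are illustrative. Extending the core by zero for s > x gives a lower-triangular "Fredholm" matrix, and solving the system exactly amplifies the perturbation by a factor that grows with the number of nodes.

```python
import math

def volterra_as_fredholm(n):
    """Rectangle-rule matrix of the core K(x, s) = 1 for s <= x, 0 for s > x on [0, 1]."""
    h = 1.0 / n
    return [[h if j <= i else 0.0 for j in range(n)] for i in range(n)], h

def solve_lower(a, y):
    """Forward substitution for a lower-triangular system a z = y."""
    z = []
    for i, row in enumerate(a):
        acc = sum(row[j] * z[j] for j in range(i))
        z.append((y[i] - acc) / row[i])
    return z

def max_solution_error(n, eps):
    """Largest error in the solution caused by a perturbation of size eps in y."""
    a, h = volterra_as_fredholm(n)
    grid = [(j + 0.5) * h for j in range(n)]
    z_exact = [math.cos(s) for s in grid]
    y_exact = [sum(a[i][j] * z_exact[j] for j in range(n)) for i in range(n)]
    # Alternating perturbation: every component of y is shifted by +eps or -eps.
    y_noisy = [v + eps * (-1) ** i for i, v in enumerate(y_exact)]
    z_noisy = solve_lower(a, y_noisy)
    return max(abs(p - q) for p, q in zip(z_noisy, z_exact))

eps = 1e-3
errors = {n: max_solution_error(n, eps) for n in (10, 40, 160)}
for n, err in errors.items():
    print(n, err)
```

For this particular core the exact inverse is a difference quotient, so the error in the solution is about 2 * eps * n: refining the grid makes the unregularized solution worse, which is why additional information (the error norm or the solution norm) is needed.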
Assumption 3.1.5. We use the following linear model of passive observations for the equation (3.1.6) in the presence of a nonrandom passive measurement error in the right-hand side:
where $\tilde{y}(x) \in L_2[c,d]$ is a known function measured with an error. We start by reviewing the essentials of the regularization method [Tikhonov 1963] and the statement of the corresponding variational problems [Morozov 1967, 1987, Ivanov 1963].

3.1.2.1 Regularized Pseudosolution

Let us consider a method that uses the known norm of the right-hand side errors to compute the pseudosolution (the residue principle [Morozov 1967]).

Assumption 3.1.6. The set of admissible pseudosolutions of the equation (3.1.7) is defined by the inequality
where the error norm $\sigma$ is known in advance. The problem of constructing the normal pseudosolution as the argument minimizing the functional
on the set of admissible pseudosolutions kv has the form
However, this pseudosolution converges only weakly to the exact solution as $\sigma \to 0$. The distance of the approximation $z(s)$ from the unknown function $\zeta(s)$ is therefore measured in the space $W_2^{(1)}[a,b]$: $\|z\|^2_{W_2^{(1)}} = \int_a^b \left[z^2(s) + z'^2(s)\right]ds$ [Tikhonov 1963]. The selection of regularized pseudosolutions (that is, pseudosolutions that converge strongly to the exact solution as $\sigma \to 0$) relies on the choice of a set of sufficiently smooth pseudosolutions [Tikhonov 1963] (a set in the space $W_2^{(1)}[a,b]$ [Sobolev 1950]).

Problem 3.1.3. Given the approximate right-hand side $\tilde{y}$, $\|\tilde{y} - \eta\| = \sigma$, an exactly specified nondegenerate core $K(x,s)$ of the equation (3.1.7), and the error norm $\sigma$, find the regularized pseudosolution as the argument that minimizes the stabilizing functional [Tikhonov 1963]
on the set of admissible pseudosolutions (3.1.8); the problem has the form
This approach has been systematically developed in [Tikhonov & Arsenin 1979], [Morozov 1987]. It is proved in [Tikhonov & Arsenin 1979] that the regularized pseudosolution obtained in this way converges stably to the exact solution as $\sigma \to 0$. We write out the solution method used in what follows and the required Euler equation. Since the infimum is attained on the boundary of the set [Tikhonov & Arsenin 1979], the inequality in Eq. (3.1.9) can be replaced with an equality. Applying the method of Lagrange undetermined multipliers, we pass to the problem
A necessary condition for a minimum of the functional $M^{\alpha}[\zeta]$ is that its first variation with respect to $\zeta$ and $\lambda$ vanish: $\Delta M^{\alpha}[\zeta] = 0$. This leads to the Euler equation
where $L$ is the Sturm-Liouville operator with boundary conditions that specify that the sought solution and (or) its derivatives are equal to zero:
If inf r(il)=r(co)
where $\gamma$ is an a priori known value of the norm of the exact solution.

Problem 3.1.4. Given one realization of the approximate right-hand side $\tilde{y}$, $\|\tilde{y} - \eta\| = \sigma$, an exactly specified nondegenerate core $K$ of the equation (3.1.7), and the value $\gamma$, compute the regularized quasisolution as the argument minimizing the residue functional $\|\tilde{y} - K\zeta\|^2_{L_2}$ on the set $\|\zeta\|^2_{W_2^{(1)}} = \gamma^2$.
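For orientation, the residue-principle construction of Section 3.1.2.1 can be sketched in discrete form. This is a hedged illustration only, not the author's implementation: the test core K(x, s) = 1 for s <= x, the grid size, the alternating perturbation, the natural boundary conditions in the stabilizer, and the bisection in log10(alpha) are all assumptions, and the names `solve`, `regularized_solution` and `residual_norm` are illustrative. The sketch minimizes the grid analog of ||Kz - y||^2 + alpha * (||z||^2 + ||z'||^2) and selects alpha so that the residual norm equals sigma.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

n = 30
h = 1.0 / n
grid = [(j + 0.5) * h for j in range(n)]
# Assumed test core: K(x, s) = 1 for s <= x and 0 otherwise (rectangle rule).
K = [[h if j <= i else 0.0 for j in range(n)] for i in range(n)]
z_exact = [math.sin(math.pi * s) for s in grid]
sigma = 0.05
# Deterministic perturbation whose grid L2 norm sqrt(h * sum e_i^2) equals sigma.
y = [sum(K[i][j] * z_exact[j] for j in range(n)) + sigma * (-1) ** i for i in range(n)]

KtK = [[sum(K[k][i] * K[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
Kty = [sum(K[k][i] * y[k] for k in range(n)) for i in range(n)]
# Stabilizer matrix I + D^T D: grid analog of z - z'' with natural boundary conditions.
S = [[0.0] * n for _ in range(n)]
for i in range(n):
    S[i][i] = 1.0 + (1.0 if i in (0, n - 1) else 2.0) / h ** 2
    if i + 1 < n:
        S[i][i + 1] = S[i + 1][i] = -1.0 / h ** 2

def regularized_solution(alpha):
    """Solve the discrete Euler equation (K^T K + alpha * S) z = K^T y."""
    lhs = [[KtK[i][j] + alpha * S[i][j] for j in range(n)] for i in range(n)]
    return solve(lhs, Kty)

def residual_norm(z):
    """Grid L2 norm of K z - y."""
    return math.sqrt(h * sum((sum(K[i][j] * z[j] for j in range(n)) - y[i]) ** 2
                             for i in range(n)))

# The residual grows with alpha, so bisection in log10(alpha) finds residual = sigma.
lo, hi = -10.0, 2.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if residual_norm(regularized_solution(10.0 ** mid)) < sigma:
        lo = mid
    else:
        hi = mid
alpha = 10.0 ** (0.5 * (lo + hi))
z_alpha = regularized_solution(alpha)
print(alpha, residual_norm(z_alpha))
```

The quasisolution differs only in the condition that fixes the multiplier: instead of matching the residual to sigma, the multiplier is chosen so that the norm of the solution matches the prescribed value.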
The solution of this problem is similar to the procedure described above and only involves a modification of the equation for the Lagrange multiplier. In principle, however, this quasisolution is not associated with the norm $\sigma$ of the right-hand side error. In [Mechenov 1988, 1991] we constructed the normal pseudosolution of the SLAE that also allows for errors in the matrix. We can accordingly extend this result to the problem of constructing the regularized pseudosolution and the regularized quasisolution of linear integral equations of the first kind with observation errors in the core and in the right-hand side.

3.1.3 Regularized Pseudosolution and Quasisolution in the Presence of Nonrandom Passive Errors in the Core and in the Right-Hand Side
In practice, in addition to measurement errors in the right-hand side we also have to deal with passive measurement errors in the core or errors associated with the realization of the core. We have to distinguish clearly between these two problems. We start with a model of purely passive nonrandom observations (that is, we
assume that the core and the right-hand side are observed with errors). Much attention has been given to this problem [Tikhonov, etc. 1971], [Tikhonov & Arsenin 1979], [Morozov 1987].

Assumption 3.1.8. We use the following linear model of passive observations [Petrov 1974] for the linear integral equation of the first kind (3.1.7) with $L_2$ nonrandom measurement errors in the core and in the right-hand side
3.1.3.1 Regularized Pseudosolution

We develop a method that uses the known norms of the nonrandom passive errors in the right-hand side and in the core to compute the pseudosolution (the distance method [Mechenov 1991]). To this end, we again define the set of admissible pseudosolutions.

Assumption 3.1.9. The set of admissible pseudosolutions of the equation (3.1.11) is defined by the inequality

$\left\{\zeta:\ \inf_{\eta,\,K:\ K\zeta=\eta}\left[\|\tilde{y}-\eta\|^2_{L_2} + \|\tilde{K}-K\|^2_{L_2}\right] \le \sigma^2+\rho^2\right\}$, (3.1.12)

where the error norms $\sigma$, $\rho$ are known in advance.

Problem 3.1.5. Given one realization of the approximate right-hand side $\tilde{y}$, $\|\tilde{y}-\eta\| = \sigma$, and of the approximate nondegenerate core $\tilde{K}$, $\|\tilde{K}-K\| = \rho$, where $\sigma$, $\rho$ are known values, find the regularized pseudosolution of model (3.1.11)
Theorem 3.1.3. Assume that the conditions of Problem 3.1.5 are satisfied. Then the variational problem to construct the regularized pseudosolution takes the form

$z_\rho = \arg\inf_{\zeta:\ \frac{\|\tilde{K}\zeta-\tilde{y}\|^2_{L_2}}{1+\|\zeta\|^2_{L_2}} = \sigma^2+\rho^2} \|\zeta\|^2_{W_2^{(1)}}$

and the regularized pseudosolution of the equation (3.1.11) satisfies the Euler equation
Proof. Passing from the integral to Darboux sums, as in item 3.1.1 (that is, making an approximation), we rewrite the equation (3.1.6) in the form
Ignoring the approximation errors, which by assumption are much smaller than the measurement errors [Mechenov 1978], we consider the problem
Its solution was obtained in [Mechenov 1991], [Mechenov 1988]. Noting that the sought value of the pseudosolution is attained on the boundary of the set, we can write it in the form
Passing to the limit as $t \to 0$, $h \to 0$, that is, passing from the Darboux sums to integrals, we can write the expression in integral form. Solving this variational problem by the method of Lagrange undetermined multipliers, we obtain
Differentiating, we find that computing the regularized pseudosolution requires solving the Euler equation (3.1.14) simultaneously with the equation for the Lagrange undetermined multiplier $\lambda$. If the second equation in Eq. (3.1.14) is not satisfied for any $\lambda > 0$, then the pseudosolution is computed for $\lambda \to \infty$ [Tanana 1981].

Theorem 3.1.3a. Assume that the conditions of Problem 3.1.5 are satisfied. Then the regularized pseudosolution is a stable approximation to the exact solution of the equation (3.1.7).

Proof. Define the sequence $\delta_k = \{\sigma_k, \rho_k\}$ such that $\|\tilde{y}_k - \eta\| = \sigma_k$, $\|\tilde{K}_k - K\| = \rho_k$ [Morozov & Grebennikov 1992]. Let $\zeta$ be the exact solution. Using the definition of the element $z_{\delta_k}$, we obtain
Using the relationship
we have
Hence we obtain that the sequence $z_{\delta_k}$, $k = 1, 2, \ldots$, is bounded from above
It follows that we can extract a weakly convergent subsequence from the sequence $z_{\delta_k}$. Without loss of generality, we assume that $z_{\delta_k} \rightharpoonup \hat{z}$. Passing to the limit $k \to \infty$ in the inequality
we obtain that $\lim_{\delta_k \to 0}\|Kz_{\delta_k} - \eta\|_{L_2} = \|K\hat{z} - \eta\|_{L_2} = 0$ and $\hat{z}$ is the solution of the equation (3.1.7). Then it follows from Eq. (3.1.15) that $\lim_{k\to\infty}\|z_{\delta_k}\|_{W_2^{(1)}} = \|\hat{z}\|_{W_2^{(1)}}$.
In the Hilbert space $W_2^{(1)}$, weak convergence together with convergence of the norms implies strong convergence (Arzelà theorem [Kolmogorov & Fomin 1972]), so that $\lim_{k\to\infty} z_{\delta_k} = \hat{z}$ and, by uniqueness, $\hat{z} = \zeta$.
Remark 3.1.2.
1. The Euler equation for this problem differs from the Euler equation (3.1.10) primarily in the following sense. With respect to the Sturm-Liouville operator $Lz = z - z''$, in the stabilizing operator of the equation (3.1.14)
the smoothing part becomes dominant, and the bias is suppressed. This is consistent with common sense: since the core contains an error, that is, is less smooth than the exact core, a better (smoother) solution is obtained by increasing the influence of the smoothing part.
2. The common approach to constructing the regularized pseudosolution is based on sets of admissible pseudosolutions of the form
that is, the functional $\Omega$ of a pseudosolution is less than or equal to the functional $\Omega$ of the exact solution. In principle, we can compute other regularized pseudosolutions on other sets of admissible pseudosolutions. For example, the set
which, after carrying out the same computations as in Theorem 3.1.3, has the form
allows calculating the best pseudosolution, but demands knowledge of $\|\tilde{K}\zeta - \tilde{y}\|^2_{L_2}$ and $\|\zeta\|^2_{L_2}$. If these values are available, we can use the formula (3.1.12''). If only $\|\zeta\|^2_{L_2}$ is available, we can use the formula
Finally, since the norm of the exact solution is very seldom known beforehand, we use the Cauchy-Bunyakovskii inequality
and as a result obtain the formula (3.1.12) of the set of admissible pseudosolutions. The following iterative process for computing the pseudosolution can improve the quality of the last one: use the pseudosolution from Eq. (3.1.13) to compute the pseudosolution from Eq. (3.1.12'), and then the pseudosolution from Eq. (3.1.12') to compute the pseudosolution from Eq. (3.1.12'').
3. The Euler equations (3.1.14), (3.1.12') and (3.1.12'') for the regularization parameter (the Lagrange multiplier) also differ from previously proposed equations [Tikhonov, etc. 1971], [Tikhonov & Arsenin 1979], [Morozov 1987], [Morozov & Grebennikov 1992]. The previous procedures [Tikhonov, etc. 1971], [Tikhonov & Arsenin 1979], [Morozov 1987] and many other methods were applied to the problem with passive errors in the core and were strictly heuristic. They were not the result of solving a variational problem of minimizing quadratic functionals on compact sets. In particular, when discussing such a problem with A.G. Jagola back in 1969, the author proposed, by clear analogy with the inequality [Morozov 1967]
a formula for computing the regularization parameter of the Euler equation (3.1.10) from the condition
Of course, this condition does not follow from any variational problem, but it has been tested in practice and, as was shown subsequently [Tikhonov, etc. 1971], some modifications of this condition lead to the necessary asymptotic expression, which has been useful for solving a number of physical problems. At that point, however, the variational problem (producing the regularized pseudosolution) in the presence of observation errors in the core and in the right-hand side had not been developed. Its algebraic solution for the SLAE was only obtained in [Mechenov 1991]. Let us compare the result with that proposed in the formula of the generalized residue principle [Tikhonov & Arsenin 1979], [Jagola 1979, 1979a, 1980, 1980a], which requires solving the following Euler equation
where $c$ is the measure of inconsistency of the equation (3.1.11). Here the same goal (increasing the smoothness of the solution) is achieved by increasing the value of the regularization parameter $\alpha$: in addition to $\rho\|z_\alpha\|$, we also introduce the inconsistency measure in the equation (3.1.10) for the regularization parameter. The inconsistency measure is not needed for computing the regularized pseudosolution, because the following Cauchy-Bunyakovskii inequality holds:
and so
where equality is attained only for $\rho = \sigma\|\zeta\|_{L_2}$.
4. The error norms are the weakest link in the regularization theory, and summation of two squared error norms reduces the probability of an error in estimating the norm. We know [Tikhonov, etc. 1973] that the residue principle produces a parameter value that is somewhat greater than optimal.
5. In principle, we can easily consider the case when the equation (3.1.7) additionally contains a pure regression part (for instance, in the form of an unknown constant), as has been done in [Mechenov 1991] and in Tables 3 and 4 of the Application.
6. This approach is applicable to nonlinear integral equations of the first kind and to other forms of operator equations.
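The comparison in item 3 of Remark 3.1.2 can be checked numerically. The sketch below rests on an assumption about the two levels being compared: that the generalized residue principle bounds the squared residual by (sigma + rho * ||z||)^2 (without the inconsistency measure), while the distance method uses (sigma^2 + rho^2) * (1 + ||z||^2). The function names and the sample values are illustrative only.

```python
def residue_level(sigma, rho, z_norm):
    """Assumed squared level of the generalized residue principle."""
    return (sigma + rho * z_norm) ** 2

def distance_level(sigma, rho, z_norm):
    """Assumed squared level of the distance method."""
    return (sigma ** 2 + rho ** 2) * (1.0 + z_norm ** 2)

sigma, z_norm = 0.1, 2.0
# The difference of the two levels equals (sigma * ||z|| - rho)^2 and is never negative.
gaps = {rho: distance_level(sigma, rho, z_norm) - residue_level(sigma, rho, z_norm)
        for rho in (0.05, 0.2, 0.5)}
for rho, gap in gaps.items():
    print(rho, gap, (sigma * z_norm - rho) ** 2)
```

Under this assumption the distance level always dominates the residue level, and the two coincide exactly when rho = sigma * ||z||, which is the equality case stated in the remark.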
3.1.3.2 Regularized Quasisolution

Let us construct the solution method that uses the known solution norm in the presence of nonrandom passive measurement errors in the right-hand side and in the core (a generalization of the regularized quasisolution method).

Problem 3.1.6. Given one realization of the approximate right-hand side $\tilde{y}$ and of the approximate core $\tilde{K}$ of the model (3.1.11), and given the value $\gamma$ bounding the set of admissible quasisolutions, compute the regularized quasisolution

Theorem 3.1.4. Assume that the conditions of Problem 3.1.6 are satisfied. Then the variational problem to construct the regularized quasisolution takes the form
and the regularized quasisolution is a stable approximation of the exact solution of Eq. (3.1.7).

Proof. Again passing from the integral to Darboux sums, we repeat the calculations of Theorem 3.1.2. Passing from the discrete problem to the continuous problem, we obtain the equation (3.1.18). Applying the Lagrange method with the multiplier $\alpha$, we obtain the functional
Differentiating this functional, we obtain the corresponding Euler equation. The proof of stability of the regularized quasisolution is similar to Theorem 3.1.4.

3.1.4 Allowing for Nonrandom Active Errors in the Core and Passive Errors of the Right-Hand Side
Let us construct methods that compute the solution in the presence of nonrandom active errors arising during the realization of the given core, and passive errors associated with the measurement of the right-hand side.

Assumption 3.1.10. We use the following linear model of active observations for a linear integral equation of the first kind (3.1.6)
where $J(x,s)$ is the realization error of an exactly specified core in the experiment. To the best of our knowledge, exactly specified cores that are realized with an error (an active experiment [Mechenov 1997, 1988]) have not been considered in the literature. Let us formulate the construction of the residue-based regularized pseudosolution and the regularized quasisolution.

3.1.4.1 Regularized Pseudosolution

We construct a method to find the regularized pseudosolution given the norm of the nonrandom active errors that arise during the realization of the specified core and the norm of the passive measurement errors in the right-hand side (in what follows, the norm of the exact solution may also be required).

Problem 3.1.7. Given the approximate right-hand side $\tilde{\eta}$, $\|\tilde{\eta} - K\zeta - J\zeta\| = \sigma$, the exactly specified nondegenerate core, the core realization error norm $\|J\| = \rho$ for model (3.1.19), and the values $\sigma$, $\rho$, find the regularized pseudosolution. Since the total right-hand side error
depends on the sought solution, the set of admissible pseudosolutions
will lead to a minimization problem with a nonquadratic functional, that is, to

$z_1 = \arg\inf_{\zeta:\ \|K\zeta-\tilde{\eta}\|_{L_2} \le \rho\|\zeta\|_{L_2}+\sigma} \|\zeta\|^2_{W_2^{(1)}}$

Even if we use the set of admissible pseudosolutions of the form
then, in both cases, we construct a method similar to the method for passive errors, which contradicts the sense of the problem. Besides, these methods do not coincide with the statistical approach, which essentially uses the fact that the total error depends on the sought solution. To solve this problem we apply an equivalent of the statistical approach [Mechenov 1997]. Using the result of [Mechenov 1997], we construct a functional equivalent to the MLM [Kendall & Stuart 1969] and use it to find the set of admissible pseudosolutions:
where
This functional differs from the functional following from the MLM by some constants. In particular, in the finite-dimensional case, after computing the logarithm of the determinant of the expectation of the error correlation matrix, the logarithm is preceded by the value $n$, which is greater than or equal to the expectation of the RSS. For the set of admissible pseudosolutions T this value is equal to 1. In order to compare the results with the previous ones, we write the variational problem for the regularized pseudosolution in the following form.

Problem 3.1.7'. Find the regularized pseudosolution as the argument of the infimum of the functional
where the value $m$ is equal to
Theorem 3.1.6. Assume that the conditions of Problem 3.1.7 are satisfied. Then the variational problem for the construction of a regularized pseudosolution has the form

$z_\rho = \arg\inf_{\zeta:\ \frac{\|K\zeta-\tilde{\eta}\|^2_{L_2}}{1+\|\zeta\|^2_{L_2}} + (\sigma^2+\rho^2)\ln\left(1+\|\zeta\|^2_{L_2}\right) = m^2} \|\zeta\|^2_{W_2^{(1)}}$ (3.1.22)

and the regularized pseudosolution of the equation (3.1.19) satisfies the Euler equation
and it is unique in the region

Proof. The proof is similar to that of Theorem 3.1.2.

Theorem 3.1.7. Assume that the conditions of Problem 3.1.6 are satisfied. Then the regularized pseudosolution is a stable approximation to the exact solution of equation (3.1.6).

Proof. Define the sequence $\delta_k = \{\sigma_k, \rho_k\}$ such that [Morozov & Grebennikov 1992]
Let $\zeta$ be the exact solution. Using the definition of the element $z_{\delta_k}$, we obtain
Hence we have
that is, the sequence $z_{\delta_k}$, $k = 1, 2, \ldots$, is bounded from above
It follows that we can extract a weakly convergent subsequence from the sequence $z_{\delta_k}$. Without loss of generality, we assume that
Passing to the limit $k \to \infty$ in the inequality
we obtain that $\lim_{\delta_k \to 0}\|Kz_{\delta_k} - \eta\|_{L_2} = \|K\hat{z} - \eta\|_{L_2} = 0$ and $\hat{z}$ is the solution of the equation (3.1.7). Then it follows from Eq. (3.1.24) that $\lim_{k\to\infty}\|z_{\delta_k}\|_{W_2^{(1)}} = \|\hat{z}\|_{W_2^{(1)}}$.
In the Hilbert space $W_2^{(1)}$, weak convergence together with convergence of the norms implies strong convergence (Arzelà theorem [Kolmogorov & Fomin 1972]), so that $\lim_{k\to\infty} z_{\delta_k} = \hat{z}$ and, by uniqueness, $\hat{z} = \zeta$.
Remark 3.1.3.
1. The Euler equation (3.1.23) for this problem differs from the Euler equation (3.1.10) in the following sense. With respect to the Sturm-Liouville operator $Lz = z - z''$, in the stabilizing operator of the equation (3.1.23)
the smoothing and biasing parts are multiplied by a constant, and the biasing part increases.
2. Since the value $m$ is difficult to compute (it depends on the norm of the exact solution), we propose to compute the regularized pseudosolution from the Euler equation (see Eq. (3.1.19))
This changed mode of parameter choice can be preferable. In this case, the parameter choice does not depend on the norm of the exact solution.

3.1.4.2 Regularized Quasisolution

Let us construct a method that uses the known solution norm in the presence of nonrandom passive measurement errors in the right-hand side and of nonrandom active errors in the prescribed core (a generalization of the regularized quasisolution method).

Problem 3.1.8. Given the approximate right-hand side $\tilde{\eta}$, $\|\tilde{\eta} - K\zeta - J\zeta\| = \sigma$, the exactly specified nondegenerate core, the core realization error norm $\rho$ in model (3.1.19) and the values $\sigma$, $\gamma$, find the regularized quasisolution. Since the total right-hand side error $\tilde{u} = \tilde{\eta} - K\zeta = J\zeta + e$ depends on the sought solution, the $L_2$ residue norm inadequately represents the situation and the minimization problem cannot be solved by the standard method, that is, in the form
Therefore, a correct description of the error norm requires applying the equivalent of the statistical approach [Mechenov 1997]. Using Eq. (3.1.20), we construct the problem of computing the regularized quasisolution

Theorem 3.1.8. Assume that the conditions of Problem 3.1.8 are satisfied. Then the variational problem to construct the regularized quasisolution takes the form
and the regularized quasisolution of the equation (3.1.6) satisfies the Euler equation
and it is unique in the interval

Proof. The proof is similar to that of Theorem 3.1.2.

3.1.5 Sturm-Liouville Differential Operator

In practice, the use of Sobolev spaces of order higher than the first is desirable. We study the connection between functions from the Sobolev space and the solutions of the Sturm-Liouville differential equation.

3.1.5.1 Sturm-Liouville Differential Operator of the First Order

We consider an ordinary linear differential operator
Let the function $\kappa(s) \ge 0$ be not identically zero, let $\delta(s) > 0$, and consider the set of twice continuously differentiable functions $\zeta(s)$ such that $\zeta(a) = \zeta(b) = 0$. Obviously, the only function that satisfies the equation $L\zeta = 0$ with $\kappa(s) = \mathrm{const}$, $\delta(s) = \mathrm{const}$ and these boundary conditions is identically zero. It is known that the Sturm-Liouville operator is unbounded. We consider the following functional [Tikhonov 1963, Tikhonov & Arsenin 1979]: the norm in the Sobolev space $W_2^{(1)}[a,b]$ [Sobolev 1950]
Integrating by parts with $\zeta(a) = \zeta(b) = 0$, we have

3.1.5.2 Sturm-Liouville Differential Operator of the Order n

Similarly, we consider the norm of the function $\zeta(s)$ in the Sobolev space $W_2^{(n)}[a,b]$
with the conditions $\kappa_i(s) > 0$, $i = 0, \ldots, n$, and a function such that $\zeta^{(i)}(a) = \zeta^{(i)}(b) = 0$, $i = 0, 1, \ldots, n-1$. Integrating by parts, we have
where $L^{(n)}$ is the Sturm-Liouville differential operator of the order $n$.
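The link between the Sobolev norm and the Sturm-Liouville operator can be checked on a grid. The following sketch assumes the simplest first-order case with unit coefficients, that is, the operator $Lz = z - z''$ used in the remarks above, zero boundary values, and a uniform grid on [0, 1]; the test function and the names are illustrative. Summation by parts, the discrete counterpart of the integration by parts, makes the discrete norm equal to the discrete inner product (z, Lz) up to rounding.

```python
import math

def sobolev_norm_sq(z, h):
    """Grid analog of the W_2^(1) norm squared: sum of z^2 plus sum of squared difference quotients."""
    l2_part = h * sum(v * v for v in z)
    diff_part = h * sum(((z[i + 1] - z[i]) / h) ** 2 for i in range(len(z) - 1))
    return l2_part + diff_part

def apply_sturm_liouville(z, h):
    """Grid analog of L z = z - z'' at interior nodes; z[0] = z[-1] = 0 is assumed."""
    out = [0.0] * len(z)
    for i in range(1, len(z) - 1):
        out[i] = z[i] - (z[i - 1] - 2.0 * z[i] + z[i + 1]) / (h * h)
    return out

n = 200
h = 1.0 / n
# Illustrative smooth function vanishing at both ends of [0, 1].
z = [math.sin(math.pi * i * h) * (1.0 + i * h) for i in range(n + 1)]
z[0] = z[-1] = 0.0

norm_sq = sobolev_norm_sq(z, h)
inner = h * sum(a * b for a, b in zip(z, apply_sturm_liouville(z, h)))
print(norm_sq, inner)
```

The boundary conditions matter: without them the summation by parts leaves boundary terms, just as the integration by parts does in the continuous case.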
3.2 About the Appurtenance of Gauss-Markov Processes to Probability Sobolev Spaces

3.2.1 Random Processes

This section recalls some facts from the theory of random processes that are used below. Let $\{\Omega, \mathcal{F}, P\}$ be a probability space, where $\Omega = \{\omega\}$ is the space of elementary events, $\mathcal{F}$ is the σ-algebra of its subsets, whose elements are called random events, and $P$ is a probability (a measure on $\{\Omega, \mathcal{F}\}$ such that $P\{\Omega\} = 1$). There is also a measurable space $(W, \mathcal{B})$ in which all singletons are measurable and whose elements are called states [Ventsel 1975, Malinvaud 1970]. A random variable $w$ is a measurable map of the space $\{\Omega, \mathcal{F}, P\}$ into $(W, \mathcal{B})$. By the σ-algebra generated by a system of random variables $w_\alpha$ with values in $(W_\alpha, \mathcal{B}_\alpha)$ we mean the σ-algebra generated by the events of the form $\{w_\alpha \in G\}$, $G \in \mathcal{B}_\alpha$: $\sigma\{w_\alpha\} = \sigma\{\{w_\alpha \in G\},\ G \in \mathcal{B}_\alpha\}$. The distribution of a random variable $w$ is the measure $F_w$ on $(W, \mathcal{B})$ defined by the relation
A joint distribution of the random variables $x, y, \ldots, z$, taking values in the spaces $(X, \mathcal{B}_X), (Y, \mathcal{B}_Y), \ldots, (Z, \mathcal{B}_Z)$, is the distribution of the random vector $(x, y, \ldots, z)$
A random function defined on the set $T$ is a function $w_t(\omega)$ that is measurable in $\omega$ for every $t \in T$. A random process with discrete time is a random function for which $T$ is either the set of all integers, $t \in \mathbb{Z}$, or the set of positive integers, $t \in \mathbb{N}$. For a finite number of values of $t$, $M = \{t_1, t_2, \ldots, t_m\} \subset T$, the distribution $F$ of the random variables $w_{t_1}(\omega), w_{t_2}(\omega), \ldots, w_{t_m}(\omega)$ (that is, of the vector $(w_1, w_2, \ldots, w_m)^T$) is by definition the probability $P\{w \in A\}$ for a measurable set $A$
These distributions, for all possible $t_1, t_2, \ldots, t_m \in T$, are called the finite-dimensional distributions of the random function. A random process $w_t(\omega)$ is called stationary if
identically in $u$, $t$ and for any $\tau$. A purely random process is, by definition, formed by a sequence of independent random variables with the same distribution. The distributions of this process are the following
It is obvious that this process is stationary.
The moments of the first and second order of a random process are
The expectation is calculated using the cumulative distribution function $\Phi_t$, and the covariance is calculated using $\Phi_{t_1 t_2}$. If the process is stationary, the condition $\Phi_{t+\tau} = \Phi_t$ implies that $Ew(t)$ is equal to a constant that does not depend on $t$, and $K(t_1, t_2) = K(t_1 - t_2) = K(u)$ depends only on the difference of the arguments and is called the autocovariance. We put $\kappa_0 = \sigma^2$, $\kappa_i = \sigma^2\rho_i$, $i = 1, \ldots, \infty$. Thus, the characteristics of the first and second order of a stationary random process are the expectation, the variance $\sigma^2$, and the correlogram $\rho_i$, $i = 0, \ldots, \infty$. All autocorrelation coefficients of the correlogram of a purely random process are equal to zero except for the first. Let the random process $e_t$ be a stationary random process without correlations. A process of the form $w_t = \gamma_0 e_t + \gamma_1 e_{t-1} + \cdots + \gamma_k e_{t-k}$, where $\gamma_0, \gamma_1, \ldots, \gamma_k$ are fixed numbers, is called a moving average process. It is easy to calculate its correlogram
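The correlogram of a moving average process can be computed directly from its coefficients. The sketch below uses the standard formula for uncorrelated $e_t$ with a common variance, in which the autocorrelation at lag j is the sum of products gamma_i * gamma_(i+j) divided by the sum of squares of the coefficients; the function name and the sample coefficients are illustrative.

```python
def moving_average_correlogram(gamma, max_lag):
    """Theoretical autocorrelations of w_t = sum_i gamma[i] * e_(t-i) for uncorrelated e_t."""
    k = len(gamma) - 1
    variance_factor = sum(g * g for g in gamma)
    rho = []
    for j in range(max_lag + 1):
        if j <= k:
            cov = sum(gamma[i] * gamma[i + j] for i in range(k - j + 1))
        else:
            cov = 0.0  # no common terms e_t remain beyond lag k
        rho.append(cov / variance_factor)
    return rho

gamma = [1.0, 0.5, 0.25]  # k = 2
rho = moving_average_correlogram(gamma, 5)
print(rho)
```

For lags larger than k the two sums share no common random terms, so the autocorrelation vanishes identically.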
Hence the correlogram is equal to zero starting from the value $j = k+1$.

3.2.2 Norms of Autoregression Processes
We consider the equation
where $\beta_0, \beta_1, \ldots, \beta_k$ are fixed numbers and $w_t$ is a stationary random process. We study the solutions of this classical linear finite-difference equation with constant coefficients. We replace $w_t$ by $z^t$, where $z$ is a complex number
where $z_j$, $j = 1, \ldots, k$, are the roots of this equation. Three types of roots are distinguished: 1. $|z_j| > 1$, $j = 1, \ldots, k$: no solution exists as $t$ increases. 2. $|z_j| < 1$
A stationary random process $w_t$ that is a solution of the equation
where $e_t$ is a stationary random process with zero expectation and without correlations, is called an autoregression process. In case 2, when the moduli of all roots of the equation (3.2.1) are less than one, it is possible to calculate the following expansion in a series
and then
That is, the process $w_t$ is written as a moving average of the process $e_t$, $t \in \mathbb{Z}$. It follows immediately from this formula that
This result allows us to consider the problem of evaluating the relation between the norms of the processes.

3.2.2.1 Norm of Autoregression Random Process of the First Order

We consider in more detail the autoregression random process of the first order
where the autoregression coefficient $\rho$ must be less than one: $0 < \rho < 1$. The process $w_t = \rho w_{t-1} + e_t$ can be written in the form
whence any element of the covariance matrix is written in the form
Thus for autoregression processes of the first order we have the model
where
Lemma 3.2.2. The sum of squares of the elements (the "Euclidean norm") of the stationary random process $e_t$ with zero expectation and without correlations of the model (3.2.2) is equal to
that is, to the "Sobolev norm" [Sobolev 1950] of the stationary autoregression random process $w_t$, where the norm in $W_2^{(1)}$ is written in difference form (the norms are so called because the expectation operation is not written out).

Proof. For convenience of calculation, we write the above model (3.2.2) at a finite number of points $t = 1, \ldots, k+1$
or in the following matrix form
We calculate $e^Te = w^TA_\rho^TA_\rho w$. As
then, applying the notation of finite differences, we have [Tikhonov 1963]
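The algebra behind the lemma can be checked numerically. The sketch below does not reproduce the formula of the lemma; it verifies the elementary identity $(w_t - \rho w_{t-1})^2 = \rho(w_t - w_{t-1})^2 + (1-\rho)w_t^2 - \rho(1-\rho)w_{t-1}^2$ summed over a sample path, which expresses the sum of squares of $e_t$ through squared values and squared first differences of $w_t$. The sample length, the seed, the value of rho and the function names are illustrative assumptions.

```python
import random

def residual_sum_of_squares(w, rho):
    """Sum of e_t^2 with e_t = w_t - rho * w_(t-1)."""
    return sum((w[t] - rho * w[t - 1]) ** 2 for t in range(1, len(w)))

def difference_form(w, rho):
    """The same quantity written through squared first differences and squared values of w."""
    n = len(w)
    diff_part = rho * sum((w[t] - w[t - 1]) ** 2 for t in range(1, n))
    level_part = (1.0 - rho) * sum(w[t] ** 2 for t in range(1, n))
    shifted_part = rho * (1.0 - rho) * sum(w[t] ** 2 for t in range(0, n - 1))
    return diff_part + level_part - shifted_part

random.seed(0)
rho = 0.7
w = [0.0]
for _ in range(500):
    w.append(rho * w[-1] + random.gauss(0.0, 1.0))

lhs = residual_sum_of_squares(w, rho)
rhs = difference_form(w, rho)
print(lhs, rhs)
```

At interior points the two level sums combine into $(1-\rho)^2 w_t^2$, so, apart from boundary terms, the sum of squares of $e_t$ is a weighted sum of squared values and squared first differences of $w_t$, that is, a first-order Sobolev-type norm in difference form.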
For stationary random processes the factors $1-\rho$ and $\rho$ are positive.

3.2.3 Construction of Gauss-Markov Process

3.2.3.1 Gauss Process

We consider the construction of a random process from a known, naturally consistent set of finite-dimensional distributions. Let $\{R_t, d_t, t \in T\}$ be metric spaces. As is known, the direct product of a finite number of metric spaces $\{R_t, d_t, t \in M \subset T$, M
$P_M(A \times R_{M \setminus N}) = P_N(A)$, $N \subset M$, $A \in \mathcal{B}_N$.
The measure $P_N$ is called the projection of the measure $P_M$ on $(R_N, \mathcal{B}_N)$. We also define the measurable space $(R_T, \mathcal{B}_T)$ with $R_T = \prod_{t \in T} R_t$. For a finite set $M \subset T$ and $A \in \mathcal{B}_M$, the set $A \times R_{T \setminus M}$ is called the cylinder with base $A$, and $\mathcal{B}_T$ is the smallest σ-algebra containing all cylinders.

Theorem 3.2.3a ([Kolmogorov 1956], [Ventsel 1975], [Klimov & Kuz'min 1983]). Let $R_t$ for every $t \in T$ be a complete separable metric space and let the set of probability measures $P_M$ on $(R_M, \mathcal{B}_M)$ over the finite sets $M \subset T$ satisfy the consistency condition. Then there exists a unique probability measure $P$ on $(R_T, \mathcal{B}_T)$ whose projection on $(R_M, \mathcal{B}_M)$ coincides with $P_M$ for any finite set $M \subset T$, that is, $P(A \times R_{T \setminus M}) = P_M(A)$ for every $A \in \mathcal{B}_M$.

Corollary 3.2.3. Assume that the conditions of Theorem 3.2.3a are satisfied. Then there is a random process $w = \{w_t, t \in T\}$ such that:
1. The random variable $w_t$ for every $t \in T$ takes values in $R_t$.
2. For any finite set $M = \{t_1, \ldots, t_m\} \subset T$
for all $B \in \mathcal{B}_1 \times \cdots \times \mathcal{B}_m$, where $\mathcal{B}_k = \mathcal{B}_{t_k}$, that is, the finite-dimensional distributions of the process $w$ coincide with the corresponding probability measures of the consistent set of measures.

Proof. Indeed, it is enough to put $w_t(\omega) = \omega(t)$ for $t \in T$, $\omega \in R_T$. Since $w = \{\omega(t): t \in T\} \in R_T$, then

3.2.3.2 Gauss-Markov Process

We consider a random process with discrete time. We assume that $w_k$ takes values in the set $R_k$. Let the distribution of $w_{k+1}$ depend on the value $w_k$ and not on the preceding values $w_{k-1}, w_{k-2}, \ldots$. That is, $w_{k+1} = h_k(w_k, e_k)$, where the Borel functions $h_k$ are given and the values $e_k$, $k = 1, 2, \ldots, m, \ldots$, are such that $Ee_je_k = 0$ ($j \ne k$), $Ew_ke_k = 0$. Then this sequence is called a Markov process.
Let some vector $\zeta = (\zeta_1, \ldots, \zeta_m)^T$ and a symmetric positive definite matrix $\Omega = \{\omega_{kj};\ k, j \in M\}$ be given. We show that there exist a probability space $\{\Omega, \mathcal{F}, P\}$ and a random process $z_t$ such that
1) the finite-dimensional distributions of the process are normal;
2) $Ez_k = \zeta_k$, $\operatorname{cov}(z_k, z_j) = \omega_{kj}$, $k = 1, 2, \ldots, m$; $j = 1, 2, \ldots, m$.
Such a process is called normal (Gaussian). We put $R_k = \mathbb{R}^1$ for $k = 1, 2, \ldots, m$, so that $R_M = \prod_{t_k \in M} R_k = \mathbb{R}^m$. For any set $M = \{t_1, t_2, \ldots, t_m\}$ we put $\Omega_M = \{\omega_{kj};\ k, j \in M\}$, $\zeta_M = (\zeta_1, \zeta_2, \ldots, \zeta_m)$, and we define the probability measure $P_M$ on $(R_M, \mathcal{B}_M) = (\mathbb{R}^m, \mathcal{B}^m)$ by setting $P_M = N(\zeta_M, \Omega_M)$. The consistency condition of the set of measures $P_M$, $M \subset [0,T]$, at $|M|$
It is enough to consider the case $|L| = m-1$, that is, $L = \{t_1, \ldots, t_{m-1}\}$: if the random vector $z = (z_1, \ldots, z_m)^T$ has the normal distribution $P_M = N(\zeta_M, \Omega_M)$, then the random vector $Tz = (z_1, \ldots, z_{m-1})^T$ has the normal distribution $N_{m-1}(T\zeta_M, T\Omega_MT^T) = N_{m-1}(\zeta_L, \Omega_L) = P_L$, where the matrix $T = \{\delta_{ij},\ i = 1, \ldots, m-1;\ j = 1, \ldots, m\}$. Therefore $P_M(B \times \mathbb{R}^1) = P\{z \in B \times \mathbb{R}^1\} = P\{Tz \in B\} = P_L(B)$. Now, according to Corollary 3.2.3, there exists a random process $z(t)$, $t \in [0,T]$, with values in $\mathbb{R}^1$ such that for any finite set $M = \{t_1, \ldots, t_m\} \subset [0,T]$ the distribution of the random vector $z = (z(t_1), \ldots, z(t_m))$ is the normal distribution $P_M = N(\zeta_M, \Omega_M)$. In particular, as $\operatorname{cov}(z(t_1), z(t_2))$

Theorem 3.2.3b. A Gaussian process for which the covariance matrix is equal to
is also a Markov process.

Proof [Klimov & Kuz'min 1985]. By the preceding, it is enough to check that for any $m$ numbers $t_1$
is positive definite. This matrix can be written as

The matrix $A$ is positive definite by the Sylvester criterion [Shikin 1987], because all matrices

have $\det(A_k) = a_1 (a_2 - a_1) \cdots (a_k - a_{k-1}) > 0$. To evaluate the determinant, we subtract the penultimate row from the last row, and so on. In exactly the same way, all matrices

have $\det(B_k) = (b_1 - b_2)(b_2 - b_3) \cdots b_k > 0$ (to evaluate the determinant, the second row is subtracted from the first row, and so on). We show that the matrix $W = \sigma^2 A \otimes B$, where $A > 0$, $B > 0$, is also positive definite for the given $A$ and $B$. Indeed, we present the symmetric matrix $A$ in the form
$A = Q\Lambda Q^T$, where $Q$ is an orthogonal matrix and $\Lambda = \{\lambda_k \delta_{ki}\}$ is a diagonal matrix. Then

as all $b_k > 0$.

3.2.3.3 Canonical Expansion

Definition 3.2.3. Any representation of the process $w(t)$ as a sum of elementary random functions, that is, in the form
$w(t) = \mu(t) + \sum_i e_i \psi_i(t)$,
where $\mu(t)$, $\psi_i(t)$, $i = 0, \dots, m$, are nonrandom functions (called coordinate functions) and $e_i$ are random variables (called random factors and satisfying the conditions $\mathrm{E}e_i = 0$, $\mathrm{E}e_i e_i = D_i > 0$, $\mathrm{E}e_i e_j = 0$ for $i \ne j$), is called a canonical expansion of the random process $w_t$. This implies a canonical expansion of the correlation function
$K(t,s) = \sum_i D_i \psi_i(t) \psi_i(s)$
and of the variance $K(t,t) = \sum_i D_i \psi_i^2(t)$.
The convergence of this positive functional series at each point of, for example, an interval $[0,T]$ is necessary and sufficient for the existence of the canonical expansion on this interval, that is, for the mean-square convergence of a sum of elementary random functions to the centered section $w_t$, $t \in [0,T]$, of the given process.
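The relations between a canonical expansion, its correlation function and its variance can be sketched numerically. The coordinate functions $\cos(it)$, the variances $D_i = 1/(1+i)^2$ and the names `correlation` and `realization` are hypothetical choices made only for illustration; the symbol $\psi_i$ for the coordinate functions is likewise an assumed notation.

```python
import math


def correlation(t, s, variances, coord_funcs):
    """K(t, s) = sum_i D_i * psi_i(t) * psi_i(s) for uncorrelated random factors."""
    return sum(D * psi(t) * psi(s) for D, psi in zip(variances, coord_funcs))


def realization(t, mean, factors, coord_funcs):
    """One value w(t) = mu(t) + sum_i e_i * psi_i(t) for given factor values e_i."""
    return mean(t) + sum(e * psi(t) for e, psi in zip(factors, coord_funcs))


# Hypothetical coordinate functions psi_i(t) = cos(i * t) and variances D_i.
coord_funcs = [lambda t, i=i: math.cos(i * t) for i in range(4)]
variances = [1.0 / (1 + i) ** 2 for i in range(4)]

# The variance at a point is the diagonal of the correlation function.
variance_at_zero = correlation(0.0, 0.0, variances, coord_funcs)
```

At $t = 0$ every $\cos(it)$ equals one, so the variance there is simply the sum of the $D_i$.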
To construct an approximate canonical expansion of the process $w_t$ on an interval $[0,T]$ with known correlation function $K(t,s)$, the interval $[0,T]$ is divided into parts by the points $t_1, \dots, t_m, \dots$. For the random factors $e_i$ of the expansion we take linear combinations of the centered sections of the process $w_t$ at the moments $t_1, \dots, t_m, \dots$ with constant factors $\psi_{if}$. To evaluate these factors, the following SLAE is solved

applying the operations of term-by-term multiplication and of taking the expectation and the variance.

Theorem 3.2.3c. A Gauss-Markov process is an autoregression random process of the first order.
Proof. We consider a Gauss-Markov process with zero expectation and correlation function $\operatorname{cov}(w_t, w_s) = D e^{-\alpha|t-s|}$. We find the approximate canonical expansion of this process, taking for the random factors linear combinations of sections of this process at the moments $t = (k-1)h$, $k = 1, \dots, m$, and solving the system with unknowns $\psi_{if}$

Obviously, $\mathrm{E}w_1 = \mathrm{E}e_1 = 0$, $\operatorname{D}e_1 = D$. Further, multiplying the first and the second equations termwise and applying the expectation operation, we find

As $e_1$ and $e_2$ should be uncorrelated, we have $\mathrm{E}e_1 e_2 = 0$, whence we find $\psi_{21} = e^{-\alpha|h|}$. Further, we have $\mathrm{E}w_2^2 = \psi_{21}^2 D + \mathrm{E}e_2^2$, whence $\operatorname{D}e_2 = D\left(1 - e^{-2\alpha|h|}\right)$. Multiplying now the third equation in turn by the first and the second, and applying the expectation operation, we obtain

and so on. Substituting all these values $\psi_{if}$ into the system of equations and solving it with respect to $e_i$, we obtain
That is, it is shown that a Gauss-Markov process is at the same time an autoregression process of the first order.

3.2.3.4 Ergodicity

Definition 3.2.3. The random process $w(t)$ is called ergodic with respect to the expectation if
1) its expectation is constant: $\mathrm{E}w(t) = \mu = \mathrm{const}$;
2) the limiting relation holds: $\lim_{T \to \infty} \frac{1}{T} \int_0^T w(t)\,dt = \mu$.

The condition $\lim_{T \to \infty} \frac{1}{T} \int_0^T \left(1 - \frac{t}{T}\right) k(t)\,dt = 0$ is necessary and sufficient for ergodicity with respect to the expectation of a stationary process. The condition $k(t) \to 0$ as $t \to \infty$ is sufficient.

Theorem 3.2.3d. A Markov random process is ergodic.
The proof is obvious, as the correlation function of this process has the form $k(t) = e^{-\alpha|t|}$ and tends to 0 as $t \to \infty$.
From the ergodic Birkhoff-Khinchin theorem it follows that, to determine the expectation, the variance and the autocorrelation of an ergodic process, with probability one it suffices to use a single realization.

3.2.4 Random Process of Errors of the Right-Hand Side
We denote by $p_N$ the probability density of the normal distribution with zero expectation and variance $\sigma^2$. Let the random function $e_t$, $t \in T$, be given on the probability space $\{\Omega, \mathcal{F}, p_N\}$. We introduce the $\sigma$-algebra $\mathcal{F}_T$ generated by this random function. With the $\sigma$-algebra $\mathcal{F}_T$ it is possible to connect various spaces of random variables generated by the random function $e_t$, $t \in T$. Let $L_{2,T}$ be the Hilbert space of square integrable random variables generated by $e_t$, $t \in T$, such that $\int_T e_t^2\, p_N\, dt < \infty$ converges.
In this Hilbert space of square integrable random variables $L_2\{\Omega, \mathcal{F}, p_N\}$, the scalar product is defined through $(e, c)_{L_2} = \mathrm{E}\,ec$.
If the random function $e_t$, $t \in T$, is square integrable, then the scalar product is defined

so that the expectation and the correlation function define a random function up to a one-to-one correspondence with respect to isometric linear transformations of the space $L_2(\Omega, \mathcal{F}, p_N)$ that leave the vector 1 invariant. If $T$ is a finite set $M$, then $L_{2,T} = \mathbb{R}^M$ and the Hilbert norm becomes Euclidean.

3.2.5 A Priori Distribution of the Required Pseudosolution
The regularization was constructed initially in $W_2^{(1)}[a,b]$ [Tikhonov 1963]. Various authors have tried, in different ways, to solve the problem of a probabilistic representation of the a priori information that the solution of the integral equation belongs to this space. In [Turchin 1967, 1968, 1969], [Turchin & Nozik 1969], [Turchin et al. 1970] "the statistical ensemble of smooth functions" (we underline that a statistic is a function of a sample) was introduced as a "statistical" description of the Sobolev space $W_2^{(1)}[a,b]$ without engaging a probability analog of this space. In [Korobochkin & Pergament 1983] a parametric description of the solution is proposed. We denote by $p_{NM}$ the probability density of the Gauss-Markov distribution with zero expectation, variance $\sigma^2$ and covariance $\rho = e^{-\alpha|t-s|}$. Let the random function $w_t$, $t \in T$, be given on the probability space $\{\Omega, \mathcal{F}, p_{NM}\}$. We introduce the $\sigma$-algebra $\mathcal{F}_T$ generated by this random function. With the $\sigma$-algebra $\mathcal{F}_T$ it is possible to connect the various spaces of random variables generated by the random function
$w_t$. Let $W_{2,T}^{(1)}$ be the Hilbert space of square integrable random variables with square integrable derivatives, generated by $w_t$, $t \in T$, such that $\int_T w_t^2\, p_{NM}\, dt < \infty$ converges.
In this Hilbert space of square integrable random variables with square integrable derivatives $W_2^{(1)}\{\Omega, \mathcal{F}, p_{NM}\}$, the scalar product is determined through

If the random function $w_t$, for any $t \in T$, is square integrable together with its derivative, then the scalar product, which is defined according to Lemma 3.2.2, is equal to

As the functions also belong to the Lebesgue space, the expectation and the correlation function of the random function $w_t$ and of its derivative determine a random function up to a one-to-one correspondence with respect to isometric linear transformations of the Lebesgue space that leave the vector 1 invariant. If $T$ is a finite set $M$, then $L_{2,T} = \mathbb{R}^M$ and the Hilbert norm becomes a Euclidean power norm. To construct the distribution of these processes we consider the following theorem.

Theorem 3.2.5. The functions $w(t)$ with bounded norm in the space $W_2^{(1)}$ with constant factors $\kappa_1$ and $\kappa_0$

such that $w'(a) = w'(b) = 0$ or $w(a) = w(b) = 0$, under a difference approximation of their norm by conservative difference schemes of the first and second types on uniform
grids, generate a system of finite-dimensional distributions of Gauss-Markov random processes.
Proof. Under the boundary conditions $w'(a) = w'(b) = 0$ or $w(a) = w(b) = 0$ this norm can be written in the form [Tikhonov 1963]

We consider the approximation of this norm on a uniform grid by, without loss of generality, the difference scheme of the first type. Then, to preserve symmetry according to Theorem 3.5.1, the integral must be approximated by the trapezoid rule, and we obtain the following difference analogue

where the matrix $\Omega^{-1}$ has the form

with the boundary conditions $w'(a) = w'(b) = 0$, or it is represented in the form

for the boundary condition $w(a) = w(b) = 0$. From here

where
We present $\rho$ in the form $\rho = e^{-\alpha h}$, where $\alpha = -\frac{1}{h}\ln\rho$. Then, applying Theorem 3.2.3b and supposing that the expectation vector is equal to $w = (0, 0, \dots, 0)^T$ and its covariance matrix is equal to

we obtain the set of required Gauss-Markov distributions. To complete the proof it is enough to use Lemma 3.2.2.

3.2.5.1 Gauss-Markov Distributions of the Order n

Let the random variable $z_s$ be given on the probability space $\{\Omega, \mathcal{F}, P_M\}$. We introduce the $\sigma$-algebra $\mathcal{F}_S$ generated by this random function. With the $\sigma$-algebra $\mathcal{F}_S$ we connect the space of random variables with square integrable derivatives up to the order $n$, generated by $z_s$, $s \in S$,
For the representation of the scalar product we consider the following theorem. Let $w(s) = z(s) - \zeta(s)$.

Theorem 3.2.5a. Functions $w(s) \in W_2^{(n)}[a,b]$ with bounded norm in the Sobolev space with constant factors $\kappa_i$, $i = 0, \dots, n$,
$\sum_{i=0}^{n} \kappa_i \int_a^b \left(\frac{d^i w}{ds^i}\right)^2 ds < \infty$,
such that $w^{(i)}(a) = w^{(i)}(b) = 0$, $i = 1, 2, \dots, n-1$, under a difference approximation of their norm by conservative difference schemes of the first and second types on uniform grids, generate a system of finite-dimensional distributions of Gauss-Markov random processes of the order $n$ (in other words, normal autoregression processes of the order $n$).
The proof is similar to that of Theorem 3.2.5 and uses mathematical induction. So, the scalar product is defined.
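For the first-order case (Theorem 3.2.5), the link between a Gauss-Markov covariance and a tridiagonal difference operator can be checked numerically. The sketch below assumes the standard stationary first-order covariance $\sigma^2\rho^{|i-j|}$ and the well-known tridiagonal inverse of that matrix; it does not reproduce the book's own matrix $\Omega^{-1}$, and all function names are illustrative.

```python
import math


def gauss_markov_cov(n, sigma2, rho):
    """Covariance matrix with entries sigma2 * rho**|i - j|."""
    return [[sigma2 * rho ** abs(i - j) for j in range(n)] for i in range(n)]


def gauss_markov_precision(n, sigma2, rho):
    """Tridiagonal inverse of gauss_markov_cov(n, sigma2, rho), for n >= 2."""
    scale = 1.0 / (sigma2 * (1.0 - rho * rho))
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # End points carry weight 1, interior points 1 + rho^2.
        Q[i][i] = scale * (1.0 if i in (0, n - 1) else 1.0 + rho * rho)
        if i + 1 < n:
            Q[i][i + 1] = Q[i + 1][i] = -scale * rho
    return Q


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


# Hypothetical grid step h and decay rate alpha, so that rho = exp(-alpha * h).
rho = math.exp(-0.3)
W = gauss_markov_cov(5, 2.0, rho)
Q = gauss_markov_precision(5, 2.0, rho)
product = matmul(Q, W)  # should be close to the identity matrix
```

The quadratic form $w^T Q w$ equals $\left[(1-\rho^2) w_1^2 + \sum_i (w_{i+1} - \rho w_i)^2\right] / \left(\sigma^2(1-\rho^2)\right)$, a weighted sum of squared values and squared first differences, which is the kind of difference analogue of a $W_2^{(1)}$ norm that the theorem refers to.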
3.3 Fredholm Linear Integral Equations with the Random Right-Hand Side Errors

We consider the pseudosolution estimation for linear integral equations of the second kind, and also the estimation of the regularized pseudosolution and quasisolution for linear integral equations of the first kind, in the presence of random passive measurement errors in the right-hand side. Let the random processes of the measured right-hand side belong to the Hilbert space of square integrable ergodic random functions. This supposition allows us to use the ergodic Birkhoff-Khinchin theorem, which states that, to estimate the mathematical expectation, the variance and the autocorrelation of an ergodic process, with probability one it suffices to use a single realization.

3.3.1 Solution Estimator of Fredholm Linear Integral Equations of the Second Kind

Assumption 3.3.1. According to item 3.2, we use the following linear model of passive observation for the Fredholm linear integral equation of the second kind (3.1.1) in the presence of $L_2$ random errors in the right-hand side:
$\varphi = \zeta - \lambda K\zeta$, that is,
$\varphi(t) = \zeta(t) - \lambda \int_a^b K(t,s)\,\zeta(s)\,ds, \quad t \in [a,b]$. (3.3.1)
According to item 3.5.1, the approximation of the linear model (3.3.1) leads to the linear regression model

The procedure of statistical estimation of the pseudosolution, with a corresponding extrapolation in the case $s_i \ne t_i$, is reduced to the LSM.

3.3.2 Regularized Estimators of Fredholm Linear Integral Equations of the First Kind
Assumption 3.3.2. According to item 3.2, we use the following linear model of passive observation for the Fredholm linear integral equation of the first kind (3.1.6) in the presence of $L_2$ random errors in the right-hand side:

According to item 3.5.1, the approximation of this linear integral equation leads to the linear regression model

We construct the statistical analog of the variational problems considered in item 3.1 for computing the estimator of the regularized pseudosolution and of the regularized quasisolution of equation (3.3.2). To compute the error norm of the right-hand side we use item 3.2.4, where the scalar product is defined. According to this definition, the weighted norm of errors is equal to

3.3.2.1 Estimator of the Regularized Pseudosolution from the Residue Principle

Assumption 3.3.2a. The set of admissible pseudosolutions $U$ of equation (3.3.1) is defined similarly to Eq. (3.1.8) by the inequality

where $\sigma^2$ is the a priori known variance of errors.
Problem 3.3.2a. Given one realization of the approximate random right-hand side $\tilde y$, its variance $\sigma^2$, an exactly specified nondegenerate core $K(x,s)$ and the a priori function $\hat z$, compute the regularized pseudosolution of model
(3.3.2) as the argument that minimizes the stabilizing functional on the set of admissible pseudosolutions $U$:
$z = \operatorname{Arg}\min_{\zeta \in U} \|\zeta - \hat z\|^2_{W_2^{(1)}[a,b]}$.
Theorem 3.3.2a. Assume that the conditions of Problem 3.3.2a are satisfied. Then the variational problem to construct the regularized pseudosolution takes the form

Proof. It is similar to item 3.1.2 [Metchenov 1977, 1982].

3.3.2.2 Estimator of Regularized Quasisolution on the Solution Error Variance

Assumption 3.3.2b. The set of admissible pseudosolutions $V$ is defined similarly to (3.1.7) by the inequality

where the value $\hat z$ is the realization of the random variable $z$ with mathematical expectation $\zeta$, and the variance $\nu^2$ is known in advance.
Problem 3.3.2b. Given one realization of the approximate random right-hand side $\tilde y$, the a priori function $\hat z$, a variance $\nu^2$ and an exactly specified nondegenerate core $K(x,s)$, compute the regularized quasisolution of model (3.3.2) as the argument minimizing the residue functional on the set $V$

Theorem 3.3.2b. Assume that the conditions of Problem 3.3.2b are satisfied. Then the variational problem to construct the regularized quasisolution takes the form
and the quasisolution of equation (3.3.2) regularized relative to $\hat z$ satisfies the Euler equation

Proof. It is similar to item 3.1.2.
Thus one could consider a statistical analog of the regularization method to have been constructed. But, first, it is a variational-statistical rather than a purely statistical approach; second, in practice one more often has to use ready-made program packages for linear and curvilinear regression, and they do not provide a solution of such variational problems. Therefore it is desirable to construct the statistical regularization method on the basis of information about the errors of random processes and to express this information as a superposition of regression models, to which the standard methods can be applied, which considerably simplifies the computation of estimators.

3.3.3 Statistical Regularization Method for the Fredholm Linear Integral Equations of the First Kind

Definition 3.3.2. As the mathematical (probabilistic) expression of the concept of a priori information about the unobservable nonrandom function $\zeta(s)$ we take the a priori distribution $P$ of a random function $z(s)$ having $\zeta(s)$ as its mathematical expectation.
As the a priori distribution of the required pseudosolution of a Fredholm linear integral equation of the first kind we take the Gauss-Markov distribution of the first order, which can be written in the form

whose error norm belongs to the Sobolev space. Using item 3.2.5, we find that the error norm of the solution is described by the following relation
Combining this with the linear model (3.3.2) of the linear integral equation of the first kind in the presence of measured random errors in the right-hand side, we rewrite the model (3.3.2) in the form:

Passing to finite differences in the two equations of the model (3.3.3), we obtain the following mixed regression-autoregression model:

and, applying the LSM (regression analysis) to this model, we obtain a stable estimator of the regularized pseudosolution. When the estimator of the regularized pseudosolution is computed, the values $\sigma$, $\nu$ or their ratio are found by iterative methods either from the residue principle, or from the quasisolution principle, or from a functional principle (i.e., so that the RSS equals $n+m+1$). On this basis the method of statistical regularization is constructed. This approach yields a simpler form of mixed regression model than the Bayesian approach [Zhukovskij & Morozov 1972].
Remark 3.3.3. Since obtaining normal smoothness requires autoregression processes of order 4-5 [Gor'kov, Mechenov et al. 1988], a convenient way to represent them is to compute their coefficients by multiplying out the root factors of the corresponding algebraic equation (3.2.1).
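The idea of treating regularization as an ordinary least-squares fit of a stacked regression-autoregression model can be sketched as follows. This is a minimal illustration under assumptions: the data rows are weighted by $1/\sigma$, the prior rows are the whitened first-order autoregression of $z - \hat z$ with variance $\nu^2$ and correlation $\rho$, and the function names and the tiny test matrices are hypothetical; it is not the book's own discretization.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def least_squares(X, y):
    """Ordinary LSM through the normal equations X^T X b = X^T y."""
    m = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(m)] for i in range(m)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(m)]
    return solve(XtX, Xty)


def statistical_regularization(K, y, z_prior, sigma, nu, rho):
    """Stack data rows and whitened AR(1) prior rows, then apply the LSM."""
    m = len(z_prior)
    rows = [[k / sigma for k in row] for row in K]
    rhs = [yi / sigma for yi in y]
    # First prior row: z_1 - zhat_1 has variance nu^2.
    first = [0.0] * m
    first[0] = 1.0 / nu
    rows.append(first)
    rhs.append(z_prior[0] / nu)
    # Remaining prior rows: innovations with variance nu^2 * (1 - rho^2).
    s = nu * math.sqrt(1.0 - rho * rho)
    for i in range(1, m):
        r = [0.0] * m
        r[i], r[i - 1] = 1.0 / s, -rho / s
        rows.append(r)
        rhs.append((z_prior[i] - rho * z_prior[i - 1]) / s)
    return least_squares(rows, rhs)


# A rank-deficient data matrix: the prior rows make the fit well defined.
z = statistical_regularization([[1.0, 1.0], [1.0, 1.0]], [2.0, 2.0],
                               [0.0, 0.0], sigma=0.1, nu=1.0, rho=0.5)
```

In this toy case the data alone determine only $z_1 + z_2$; the autoregression rows select the smooth solution with $z_1 = z_2$, slightly shrunk toward the prior. Choosing $\sigma$, $\nu$ or their ratio iteratively, as described above, is left outside the sketch.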
3.4 Linear Integral Equation with the Measured or Realized Core

Let, for linear integral equations, the random fields of errors of the measured or realized core and the random processes of the measured right-hand side belong to the Hilbert space of square integrable ergodic random functions. This supposition allows us to use the ergodic Birkhoff-Khinchin theorem, which states that, to determine the mathematical expectation, the variance and the autocorrelation of an ergodic process, with probability one it suffices to use a single realization. In this item we develop the variational statements of problems for the statistical approach in the regularization method for solving linear integral equations of the first kind with random measurements of the right-hand side and with passive errors in a random measurement of the core or with active errors in a random realization of the core.

3.4.1 Estimate of the Pseudosolution of Linear Integral Equations of the Second Kind at Passive Measurements in the Core and in the Right-Hand Side

We consider the most popular case, when the errors in the core and in the right-hand side of the Fredholm linear integral equation of the second kind (3.1.2) are homoscedastic. As this problem is well posed, we construct the standard estimation similarly to Chapters 1 and 2.

3.4.1.1 Passive Random Errors in the Core

Assumption 3.4.1a. We use the following linear confluent model of passive experiment

where all errors have normal distributions. Similarly to item 3.1 we pose the following problem.
Problem 3.4.1a. Given one realization of the approximate right-hand side $\tilde y$ and the approximate core $\tilde K(x,s)$ of model (3.4.1a), compute the pseudosolution minimizing the sum of squared deviations of the approximate input data from the sought data for Eq. (3.1.1) $\zeta - \lambda K\zeta = \varphi$:
$z_p = \operatorname{Arg}\inf_{\zeta}\ \inf_{K,\varphi:\ \zeta - \lambda K\zeta = \varphi} \left\{ \|\tilde K - K\|^2 + \|\tilde y - \varphi\|^2 \right\}$.
Theorem 3.4.1a. Assume that the conditions of Problem 3.4.1a are satisfied. Then the variational Problem 3.4.1a to construct the pseudosolution takes the form

(3.4.1a)
and the pseudosolution $z_p$ is computed from the equation

Proof. Passing from the integral to Darboux sums, that is, making an approximation (without loss of generality, on uniform grids), we replace equation (3.1.1) with the SLAE

where $K_{th}$, $I_{th}$ are the matrices such that [Mechenov 1978]

and the x- and s-partitions of the interval $[a,b]$ are identical, $\varphi_t$ is a vector such that

and $\zeta_t$, $\zeta_h$ are the solution vectors on the grids $t$ and $h$ respectively. Assuming that the approximation errors are much smaller than the measurement errors [Mechenov 1977], and can therefore be incorporated in the input data errors, we consider the variational problem of computing the pseudosolution with a known approximate right-hand side $\tilde y_t$ and matrix $\tilde K_{th}$ for the SLAE (approximated as in Eq. (3.1.1'))
$z_{p,h} = \arg\min_{\zeta_h}\ \min_{K_{th},\varphi_t:\ \zeta_t - \lambda K_{th}\zeta_h = \varphi_t} \left\{ \|\tilde K_{th} - K_{th}\|^2 + \|\tilde y_t - \varphi_t\|^2 \right\}$.
This system has been solved in item 1.4, and its solution exists, is unique (the proof is similar to Corollary 1.4.1) and minimizes the functional

Differentiating, we find that to estimate the solution we need to solve the equation

Passing to the limit as $t \to 0$, $h \to 0$, i.e., passing from Darboux sums back to integrals, we write the expression in functional form, i.e., as the solution of the variational problem. This method can be used to construct numerically the pseudosolution of linear integral equations of the second kind with random measurement errors in the core and in the right-hand side.

3.4.1.2 Random Active Errors in the Core

Let us consider the methods for computing the pseudosolution estimate in the presence of random active errors associated with the realization of an exactly specified core and of random passive errors associated with the observation of the right-hand side.
Assumption 3.4.1b. We use the following linear model of active observations for the Fredholm linear integral equation of the second kind (3.1.1)

where $j(x,s)$ are the $L_2$ random realization errors of the exactly specified core.
Exactly specified cores that are realized with an error (an active experiment [Mechenov 1996]) are considered here for the first time in the literature. We provide a formal statement of the construction of the pseudosolution.
Problem 3.4.1b. Given the approximate right-hand side $\tilde\varphi$, its error variance $\sigma^2$, the exactly specified core $K(x,s)$ and the variance $\mu^2$ of its realization error for model (3.4.1b), compute the pseudosolution.
Since the total error of the right-hand side $\tilde u = \tilde\varphi - \zeta + \lambda K\zeta = -\lambda J\zeta + e$ depends on the sought solution, its variance has the form $\sigma^2 + \mu^2\|\zeta\|^2$ (Corollary 2.1.2). We construct a minimizing functional by the MLM (for simplicity we minimize the negative double logarithm of the likelihood function, without the inessential constant)
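A discretized version of such a functional can be sketched in a few lines. This is an illustration under assumptions rather than the book's formula: the matrix `A` stands for a hypothetical discretization of $I - \lambda K$, the total error variance is taken as $\sigma^2 + \mu^2\|z\|^2$ as stated above (any factor of $\lambda$ is assumed absorbed into $\mu^2$), and the errors are treated as independent and normal.

```python
import math


def neg2_loglik(z, phi_obs, A, sigma2, mu2):
    """-2 log-likelihood, constant dropped, with variance sigma2 + mu2 * |z|^2."""
    n = len(phi_obs)
    # The error variance depends on the candidate solution z itself.
    var = sigma2 + mu2 * sum(v * v for v in z)
    resid = [p - sum(a * v for a, v in zip(row, z)) for p, row in zip(phi_obs, A)]
    rss = sum(r * r for r in resid)
    return rss / var + n * math.log(var)


# Hypothetical 2x2 example: A plays the role of the discretized I - lambda*K.
A = [[1.0, 0.0], [0.0, 1.0]]
value = neg2_loglik([1.0, 2.0], [1.0, 2.0], A, sigma2=1.0, mu2=1.0)
```

Unlike an ordinary residual sum of squares, this functional is not quadratic in $z$: the logarithmic term penalizes solutions with a large norm, because they inflate the variance of the realized core.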
Theorem 3.4.1b. Let the conditions of Problem 3.4.1b be satisfied. The solution of the variational problem
$z_p = \arg$

for the estimation of the pseudosolution exists, is unique and is computed from the equation

Proof. We use the approximation as in 3.4.1.1 and the results of item 2.1.2.
Let us now consider linear integral equations of the first kind, which must be solved using the regularization method.

3.4.2 Variational Problems of Estimation of Regularized Pseudosolution and Quasisolution in the Presence of Random Passive Errors in the Core and in the Right-Hand Side

In practice, in addition to measurement errors in the right-hand side, we have to deal with passive measurement errors in the core or with errors associated with the realization of the core. We have to distinguish clearly between these two problems. We start with a model of purely passive random observations (i.e., we assume that the core and the right-hand side are observed with random errors).
Assumption 3.4.2. We use the following linear model of passive observations for the linear integral equation of the first kind (3.1.7) with $L_2$ random normally distributed measurement errors in the core and in the right-hand side:
3.4.2.1 Regularized Pseudosolution
We develop a method that uses the known variances of random passive errors in the right-hand side and in the core to estimate the pseudosolution (the regularized method of distances [Mechenov 1991]). To this end, we define again the set of admissible pseudosolutions.
Assumption 3.4.2a. The set of admissible pseudosolutions $U$ of Equation (3.4.2) is defined by the inequality
K-K pB:KC=p
where the error variances $\sigma^2$, $\mu^2$ are known in advance.
Problem 3.4.2a. Given one realization of the approximate right-hand side $\tilde y$ and of the approximate nondegenerate core $\tilde K$, where $\sigma$ and $\mu$ are known values, estimate the regularized pseudosolution of model (3.4.2):
$z_p = \arg\inf_{\zeta \in U} \|\zeta\|^2_{W_2^{(1)}}$,
where the difference analogue of the Sobolev norm is taken.
Theorem 3.4.2a. Assume that the conditions of Problem 3.4.2a are satisfied. Then the variational problem to estimate the regularized pseudosolution takes the form
and the regularized pseudosolution of Equation (3.4.2) satisfies the Euler equation

Proof. Passing from the integral to Darboux sums as in item 3.4.1 (i.e., making an approximation), we rewrite Equation (3.1.6) in the form

Ignoring the approximation errors, which by assumption are much smaller than the measurement errors [Mechenov 1978], we consider the corresponding finite-dimensional minimization problem for $z_{hp}$.
Its solution has been obtained in [Mechenov 1991], [Mechenov 1988]. Noting that the mathematical expectation for the number of degrees of freedom of the RSS is equal to $n$ (Theorem 1.4.1a), and that the sought value of the pseudosolution is attained on the boundary of the set, we can write the problem in the form

Passing to norms in difference spaces, we can write the expression in integral form. Solving this variational problem by the Lagrange method, we obtain

Differentiating, we find that the computation of the regularized pseudosolution requires solving the Euler equation simultaneously with the equation for the multiplier $\lambda$. If the second equation in Eq. (3.4.2a) is not satisfied for any $\lambda > 0$, then the pseudosolution is computed for $\lambda \to \infty$ [Tanana 1981].

3.4.2.2 Regularized Quasisolutions

Let us construct the solution method that uses the known solution norm (or the known norm of a priori solution errors) in the presence of random passive observation errors in the right-hand side and in the core (a regularized quasisolution).
Problem 3.4.2b. Given one realization of the approximate right-hand side $\tilde y$ and the approximate core $\tilde K$ of model (3.4.2), a given value bounding the set of admissible quasisolutions, and the error variances $\sigma^2$ and $\mu^2$, compute the regularized quasisolution
Theorem 3.4.2b. Assume that the conditions of Problem 3.4.2b are satisfied. Then the variational problem to construct the regularized quasisolution takes the form

Proof. Again passing from the integral to Darboux sums, we repeat the computations of Theorem 3.4.2a and obtain the variational problem 3.4.2b. Applying the Lagrange method with the multiplier $\alpha$, we obtain the functional
Computing its variation, we obtain the corresponding Euler equation. The proof of stability of the regularized quasisolution is similar to Theorem 3.1.4.

3.4.3 Statistical Regularization Method of the Solution Estimation of Linear Integral Equations of the First Kind in the Presence of Passive Measurements of the Core and of the Right-Hand Side

We consider the most popular case, when the Fredholm linear integral equation of the first kind (3.1) has homoscedastic errors in the core and in the right-hand side. Regularized solutions were initially sought in Sobolev spaces [Tikhonov 1963]. Various authors have tried to solve the problem of a probabilistic representation of the a priori information that the solutions belong to the Sobolev spaces. In [Turchin 1967, 1968, 1969], [Turchin & Nozik 1969], [Turchin et al. 1970] "the statistical ensemble of smooth functions" was introduced as a description of the Sobolev space without constructing a probability analog of this space. According to Theorem 3.3.2, the Gauss-Markov random functions (or, in the discrete setting, autoregression processes) belong to probability Sobolev spaces [Sobolev 1950]. Once the probability analog of Sobolev spaces is available, a statistical version of the regularizing algorithm and, further, its discrete analog are constructed in a fairly obvious way, using, for example, a Bayes strategy.
Assumption 3.4.3a. We use the following linear confluent integral model of passive experiment with the following a priori model for the sought solution
where all measurement errors have normal distributions and the errors of the a priori distribution have the Gauss-Markov distribution. We introduce the a priori distribution of the sought estimate of the solution as follows:
1. This random function has the sought exact solution (an unobservable deterministic function) of the linear integral equation of the first kind as its mathematical expectation;
2. Its covariance function sets the desired smoothness of the regularized pseudosolution estimate.
On the basis of Theorem 3.2.5, we can take the Gauss-Markov distribution as the a priori distribution. To see how to construct the statistical estimate, we use the problem of regularized pseudosolution estimation (item 3.4.2a) and its solution method.
Problem 3.4.3a. Given the approximate functions $\tilde y$, $\tilde K$ as realizations of $y$, $K$, the realization $\hat z$ of the function $z$ having the Gauss-Markov distribution in model (3.4.3a), and the values $\sigma$, $\mu$, estimate the unknown values of the pseudosolution by the MLM.
Theorem 3.4.3a. Assume that the conditions of Problem 3.4.3a are satisfied. Then the confluent problem for constructing an estimate of the regularized solution takes the form

and the normal equation of confluence analysis has the form

where one of the unknown variances can be computed from the equation

Proof. It is enough to write out the likelihood function for the difference analogue of the integral equation and for the a priori model of the sought solution
and to carry out the standard procedure for computing the estimate.
Remark 3.4.3. 1. We note that constructing the finite-dimensional analog amounts simply to passing to autoregression processes. It can be realized with the help of the standard program FUMILI [Sokolov & Silin 1961].
2. For passive measurements in the core it is possible to consider the most popular case of constructing a regularized estimate, when the core and the right-hand side are measured and a background is present
3.4.4 Estimation of the Autocorrelation Function from the Autoregression Model
Maximum-likelihood estimates of the parameters of autoregression models representing a temporal relationship are proposed. We consider a typical prognostic model for the representation of a random process $y_t$, $t = 1, \dots, n$, that is, an autoregression model with a number of explanatory variables, which in many applications has the form [Malenvaud 1970]

The parameters of this model are usually estimated by the LSM, and it has been demonstrated that this estimate is relatively poor: it is biased and asymmetric [Malenvaud 1970]. However, the values $y_t$ are random, whereas according to the established concepts of classical regression the values $y$ in the matrix $Y = [y_{-1}, y_{-2}, \dots, y_{-m}]$ should be nonrandom, which is not so in this case. To correct the situation, we pass to a functional relation
Ey=cp=EP+H6, -
where 8= [cp-l,cp-2 ,...,cp-m] is a matrix constructed from vectors cpVi ,i = 1,m; having shifted on i of time units of value q j = Ey and H = [l,l,...,1lT is a column-vector constructed from units, and, accordingly, to other entry of initial model. Usually used for the prognosis the Eq. (3.4.3) from regression turns to model of a confluent analysis of passive experiment
Parameters of such model should to be estimated, accordingly, by the LDM from a minimum of the following quadratic functional [Mechenov 19911
where $y$ is a realization of the random process $y$. Since the error variances are assumed identical (in practice they can differ slightly), the formula for computing the estimate becomes simpler. The value $\sigma^2$ can also be estimated. Frequently the values $y$ are correlated, and in this case the parameter estimation problem becomes more complicated (its solution is explained in item 1.4.2). We note that the autoregression model is most often a discretization of continuous phenomena, which in that case are described by an integral equation of convolution type
$$y_t = \int_{-\infty}^{\infty} \varphi(t-s)\,\beta(s)\,ds + \delta + e_t, \quad t \in [-\infty, \infty],$$
$$y_t = \varphi_t + e_t, \quad \mathrm{E}e_t = 0, \quad \mathrm{E}e_{t_k}e_{t_i} = \delta_{ik}\sigma^2,$$
where $\beta(s)$ is an autocorrelation function (most often it is at least continuous). Therefore, to derive stable estimates of $\beta(s)$, it is necessary to take into account that it belongs to the space $W_2^1$ (that is, to compute a regularized pseudosolution). Applying statistical regularization, which seems the most suitable in this case, we come to the parameter estimation of the model
where $b$ are the a priori values for the sought vector of parameters $\beta$ (it can be, in particular, a zero vector).
3.4.5 Random Active Errors in the core of the Linear Integral Equation
Let us construct methods that compute the pseudosolution in the presence of random active errors arising during the realization of the given core and passive errors associated with the observation of the right-hand side. We consider the most popular case, when the errors in the realization of the core and in the measurement of the right-hand side are homoscedastic. Assumption 3.4.5. We use the following linear influent model of active experiment for the linear integral equation of the first kind
$$\iota(t) = \int_a^b \kappa(t,s)\,\zeta(s)\,ds + \int_a^b j(t,s)\,\zeta(s)\,ds,$$
$$q_t = \iota_t + e_t, \quad t \in [a,b], \quad \mathrm{E}e_t = 0, \quad \mathrm{E}e_{t_k}e_{t_i} = \delta_{ik}\sigma^2, \qquad (3.4.5)$$
$$k(t,s) = \kappa(t,s) + j(t,s), \quad \mathrm{E}j(t,s) = 0, \quad \mathrm{E}j(t_i,s_l)\,j(t_m,s_k) = [\delta_{im}\delta_{lk}]\rho^2,$$
where the realization errors $j(t,s)$ in the core and the measurement errors $e(t)$ in the right-hand side have the normal distribution. Let us formulate the construction of the residue-based regularized pseudosolution and of the regularized quasisolution.
3.4.5.1 Regularized Pseudosolution
Assumption 3.4.5.1. Let the set of admissible pseudosolutions be
Problem 3.4.5a. Given the approximate right-hand side $\tilde q$, the exactly specified nondegenerate core $\kappa(x,s)$ for model (3.4.5), the values $\sigma$, $\rho$, and the a priori function, estimate the regularized pseudosolution that minimizes the square of the deviation of this pseudosolution from the a priori function in $W_2^1$ on the set of admissible pseudosolutions
Theorem 3.4.5a. Assume that the conditions of Problem 3.4.5a are satisfied. Then the variational problem for constructing the regularized pseudosolution takes the form
and the pseudosolution of Eq. (3.4.5), regularized relative to the a priori function, is computed from the Euler equation
and it is unique in the interval
3.4.5.2 Estimation of Regularized Quasisolutions
Problem 3.4.5b. Given the approximate right-hand side $q$, the exactly specified nondegenerate core, the variances $\sigma^2$, $\rho^2$ of model (3.4.5), the value $\omega^2$ and the a priori realization, compute the sought quasisolution minimizing the minus double logarithm of the likelihood of the deviation of the approximate input data from the sought data under the restriction that the square of the weighted deviation of the pseudosolution from the a priori realization is less than or equal to $\omega^2$
Theorem 3.4.5b. Assume that the conditions of Problem 3.4.5b are satisfied. Then the variational problem for constructing the regularized quasisolution takes the form
$$\hat\zeta_\omega = \operatorname{Arg}\min_{\zeta}\left[\frac{|K\zeta - q|^2}{\sigma^2 + \rho^2|\zeta|^2} + n\ln\left(\sigma^2 + \rho^2|\zeta|^2\right)\right],$$
with the minimum taken over the $\zeta$ that satisfy the restriction of Problem 3.4.5b;
and the quasisolution of Eq. (3.4.5), regularized relative to the a priori realization, is computed from the Euler equation
and it is unique in the interval
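The minus double logarithm of the likelihood that is minimized in Problem 3.4.5b can be evaluated for a discretized model as in the following sketch. It assumes that, for independent normal errors, each component of the right-hand side has variance $\sigma^2 + \rho^2|\zeta|^2$, and it drops the additive constant; the function name `neg2_log_likelihood` and the small test data are hypothetical.

```python
import math


def neg2_log_likelihood(K, zeta, q, sigma2, rho2):
    """Minus double log-likelihood (up to a constant) for the active-error model.

    Assumed form: |K zeta - q|^2 / (sigma^2 + rho^2 |zeta|^2)
                  + n * ln(sigma^2 + rho^2 |zeta|^2),
    where n is the number of observations in q.
    """
    n = len(q)
    resid2 = sum(
        (sum(kij * zj for kij, zj in zip(row, zeta)) - qi) ** 2
        for row, qi in zip(K, q)
    )
    variance = sigma2 + rho2 * sum(z * z for z in zeta)
    return resid2 / variance + n * math.log(variance)


# Hypothetical 2x2 example.
K = [[1.0, 0.0], [0.0, 1.0]]
zeta = [1.0, 2.0]
q = [1.5, 1.5]

passive_only = neg2_log_likelihood(K, zeta, q, sigma2=0.25, rho2=0.0)
with_active = neg2_log_likelihood(K, zeta, q, sigma2=0.25, rho2=0.05)
```

With $\rho^2 = 0$ the expression reduces to the usual weighted residual plus a constant, which is why the active errors change the estimate only through the dependence of the variance on $|\zeta|^2$.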
Complicated experiments can be considered similarly to Chapter 2. Remark 3.4.5. The Appendix gives summary tables of the elementary cases of the representation of information for the models considered above: deterministic errors (Table 3) and random homoscedastic errors (Table 4). Table 3 shows the models of linear integral equations of the first kind with deterministic errors and the regularization methods for computing the pseudosolution. Table 4 shows the models of linear integral equations of the first kind with random errors and the regularization methods for computing the estimate.
3.5 Technique of Computations
In this section, an approximation of the Fredholm and Volterra linear integral equations is constructed with a view to computing their solutions by the regularization method in a way that keeps all the mathematical properties of this method. That is, the corresponding difference model for the regularization method of solving the integral equation is created, and numerical algorithms for this model are developed.
3.5.1 Difference analogue of the Regularization Method
3.5.1.1 Approximation of the Smoothing Functional
We consider non-uniform grids
Let us denote [Samarskij 1977]
We consider the approximation
of the smoothing functional $M^\alpha[K,\zeta,y]$ from Eq. (3.1.9') such that
where the values of the functions $K(x,s)$, $\zeta(s)$, $y(x)$ are taken at the grid points, $K_{\tau h} = \{K(x_i,s_j)\}$, $\zeta_h = \{\zeta(s_j)\}$, $y_\tau = \{y(x_i)\}$,
and $L_{hh}$ is some difference scheme of the Sturm-Liouville operator; $C_h$, $C_\tau$ are the coefficients of the quadrature integration formulas ($m$- and $n$-dimensional diagonal matrices). We consider [Zaikin & Mechenov 1971, 1973b], [Mechenov 1978] the problem of choosing the quadrature coefficients and the approximating coefficients of the difference scheme so that the numerical analog of the regularization method is identical to the regularization method in finite-dimensional spaces. Definition 3.5.1. We call a conservative difference scheme on a non-uniform grid symmetrical with factor $C_h$ ($C_h$-symmetrical) if multiplying $L_{hh}$ by the matrix $C_h$ yields a symmetric matrix of coefficients $W_{hh} = C_hL_{hh}$ of the difference scheme. Theorem 3.5.1. For the approximation of the smoothing functional for a linear integral equation of the first kind to lead to a functional for a SLAE such that
it is necessary and sufficient that the quadrature factors be equal, $C_h = \bar C_h = \tilde C_h$, that the quadrature factors $C_\tau = \bar C_\tau = \tilde C_\tau$ be equal and positive, and that the conservative difference scheme $L_{hh}$ approximating the operator $L$ be $C_h$-symmetrical. Proof. Necessity. Let the approximation of the functional $M^\alpha[K,\zeta,y]$ be such that $M^\alpha_h[K_{\tau h},\zeta_h,y_\tau] = M^\alpha[K,\zeta,y]$. To prove the necessity of the equality of the quadrature coefficients, we construct the Euler equation for the functional in Eq. (3.5.1)
and compare it with the algebraic Euler equation for $M^\alpha[K,\zeta,y]$:
For the symmetry condition to hold, the matrix of this Euler equation should be representable as $K^TK$, which is fulfilled for $K = C_\tau^{1/2}K_{\tau h}C_h$, $y = C_\tau^{1/2}y_\tau$, $C_h = \bar C_h = \tilde C_h$, $C_\tau = \bar C_\tau = \tilde C_\tau$ and positive quadrature coefficients $C_\tau$. Further, for the matrix $W = C_hL_{hh}$ to be symmetric, the conservative difference scheme $L_{hh}$ should be $C_h$-symmetrical.
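The notion of a $C_h$-symmetrical scheme can be checked numerically. The sketch below assumes the operator $Lz = -(p z')' + q z$ with zero boundary values, the usual three-point conservative scheme at the interior nodes, and averaged steps as the factors $C_h$; the function `conservative_scheme`, the coefficients $p = q = 1$ and the grid are assumptions chosen for illustration and are not taken from the text. On a non-uniform grid the scheme matrix itself is not symmetric, while its product with the diagonal factor is.

```python
def conservative_scheme(s, p=lambda t: 1.0, q=lambda t: 1.0):
    """Three-point conservative scheme for -(p z')' + q z at interior nodes.

    Returns (hbar, L): hbar[j] is the averaged step at interior node j and
    L is the scheme matrix (zero boundary values assumed).
    """
    m = len(s) - 2                                   # number of interior nodes
    h = [s[j + 1] - s[j] for j in range(len(s) - 1)]
    hbar = [(h[j] + h[j + 1]) / 2 for j in range(m)]
    L = [[0.0] * m for _ in range(m)]
    for j in range(m):
        a_left = p((s[j] + s[j + 1]) / 2) / h[j]          # flux coefficient to the left
        a_right = p((s[j + 1] + s[j + 2]) / 2) / h[j + 1]  # flux coefficient to the right
        L[j][j] = (a_left + a_right) / hbar[j] + q(s[j + 1])
        if j > 0:
            L[j][j - 1] = -a_left / hbar[j]
        if j < m - 1:
            L[j][j + 1] = -a_right / hbar[j]
    return hbar, L


grid = [0.0, 0.1, 0.3, 0.6, 1.0]                     # non-uniform grid
hbar, L = conservative_scheme(grid)
m = len(hbar)

# W = C_h L with C_h = diag(hbar)
W = [[hbar[i] * L[i][j] for j in range(m)] for i in range(m)]

asymmetry_L = max(abs(L[i][j] - L[j][i]) for i in range(m) for j in range(m))
asymmetry_W = max(abs(W[i][j] - W[j][i]) for i in range(m) for j in range(m))
```

Here `asymmetry_W` vanishes up to rounding while `asymmetry_L` does not, which is the property that Definition 3.5.1 isolates.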
Sufficiency. We consider the functional $M^\alpha[K,\zeta,y]$ for the SLAE $K\zeta = y$
Let us show that if we take
where $L_{hh}$ is a $C_h$-symmetrical conservative difference scheme of order $l$, then the functional with these expressions for $K$, $y$, $W$ approximates the smoothing functional $M^\alpha[K,\zeta,y]$ for a linear integral equation of the first kind. For this purpose we check that the condition of approximation of order $(k,l)$ holds, for example, for the first term of the functional $M^\alpha[K,\zeta,y]$:
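For the first term, the identity between the quadrature form of the residual and its algebraic form can be verified directly. The following sketch assumes trapezoid weights on both non-uniform grids and the scaling $K = C_\tau^{1/2}K_{\tau h}C_h$, $y = C_\tau^{1/2}y_\tau$; the smooth core `kernel`, the grids and the test functions are hypothetical.

```python
import math


def trapezoid_weights(nodes):
    """Trapezoid-rule weights on a (possibly non-uniform) grid."""
    n = len(nodes)
    weights = []
    for i in range(n):
        left = nodes[i] - nodes[i - 1] if i > 0 else 0.0
        right = nodes[i + 1] - nodes[i] if i < n - 1 else 0.0
        weights.append((left + right) / 2)
    return weights


def kernel(x, s):
    """Hypothetical smooth core K(x, s)."""
    return math.exp(-(x - s) ** 2)


x = [0.0, 0.2, 0.5, 0.7, 1.0]          # grid in x
s = [0.0, 0.1, 0.4, 1.0]               # grid in s
tau = trapezoid_weights(x)             # diagonal of C_tau
h = trapezoid_weights(s)               # diagonal of C_h
zeta = [math.sin(sj) for sj in s]
y = [xi ** 2 for xi in x]

# Quadrature form: sum_i tau_i * (sum_j K(x_i, s_j) h_j zeta_j - y(x_i))^2
quadrature_value = sum(
    tau[i] * (sum(kernel(x[i], s[j]) * h[j] * zeta[j] for j in range(len(s))) - y[i]) ** 2
    for i in range(len(x))
)

# Algebraic form |K zeta - y|^2 with the scaled matrix and right-hand side.
K = [[math.sqrt(tau[i]) * kernel(x[i], s[j]) * h[j] for j in range(len(s))]
     for i in range(len(x))]
y_scaled = [math.sqrt(tau[i]) * y[i] for i in range(len(x))]
algebraic_value = sum(
    (sum(K[i][j] * zeta[j] for j in range(len(s))) - y_scaled[i]) ** 2
    for i in range(len(x))
)
```

The square roots require positive weights $\tau_i$, which is the positiveness condition on $C_\tau$ in the theorem.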
The remaining terms of the functional $M^\alpha[K,\zeta,y]$ are checked similarly. For the conditions of $k$-th order accuracy on non-uniform grids to hold, it is necessary that the functions $p(s)$, $q(s)$ in $\|z\|^2_{W_2^1} = \int_a^b \left(p(s)z'^2(s) + q(s)z^2(s)\right)ds$ belong to the class $C[a,b]$, and that the core $K(x,s) \in C[a,b]$ for every fixed $x$ and be square-summable in $x$ [Samarskij 1977], [Nikolsky 1979]. We consider the two most practically applicable special cases for the basic types of integral equations: Fredholm and Volterra.
3.5.1.2 Trapezoid rules and Difference Scheme of the First Type
Let $C_h = \mathrm{diag}(h_1,h_2,\dots,h_m)$, $C_\tau = \mathrm{diag}(\tau_1,\tau_2,\dots,\tau_n)$ be the quadrature coefficients of the trapezoid integration rule. Then the matrix $K$ has the form $K = C_\tau^{1/2}K_{\tau h}C_h = \left(\sqrt{\tau_i}\,K(x_i,s_j)\,h_j\right)$, and the vector $y$ has the form $y = C_\tau^{1/2}y_\tau = \left(\sqrt{\tau_i}\,y(x_i)\right)$. To these quadrature coefficients there corresponds the standard $C_h$-symmetrical conservative difference scheme of the first type, where
and, if the boundary conditions $\zeta(a) = \zeta(b) = 0$ are approximated as follows
then symmetry of the matrix of the difference scheme is reached. Such an approximation satisfies the conditions of Theorem 3.5.1. The trapezoid rule is the best on the class of continuous functions, as is the conservative difference scheme. Here an approximation of order $O(h)$ on a non-uniform grid is reached, and the accuracy of the solution is $O(h^2)$. This formula is better suited to a Fredholm linear integral equation of the first kind, since it builds the solution at the grid points themselves.
3.5.1.3 Formulas of Median Rectangles and Difference Schema of the Second Type
Let $C_h = \mathrm{diag}(h_2,\dots,h_m)$, $C_\tau = \mathrm{diag}(\tau_2,\dots,\tau_n)$ be the quadrature coefficients of the integration formula of median rectangles. Hence, the matrix $K$ is equal to $K = C_\tau^{1/2}K_{\tau h}C_h = \left(\sqrt{\tau_i}\,K(x_{i-1/2},s_{j-1/2})\,h_j\right)$ and the right-hand side vector $y$ is equal to $y = C_\tau^{1/2}y_\tau = \left(\sqrt{\tau_i}\,y(x_{i-1/2})\right)$. To these quadrature coefficients there corresponds the approximation of the Sturm-Liouville differential operator by the standard symmetrical conservative difference scheme of the second type [Tikhonov & Samarskij 1962, 1963]
with the following approximation of the boundary conditions (keeping the symmetry of the matrix)
Thus symmetry of the difference scheme matrix is reached. This approximation mode also satisfies the conditions of the theorem: an approximation of order $O(h)$ on a non-uniform grid is again reached here, and the accuracy of the solution is $O(h^2)$. This approximation is better suited to the Volterra integral equation of the first kind, since it builds the solution midway between the grid points. To see this, we consider the problem of derivative computation.
3.5.2 Numerical Derivation by the Regularization Method
We consider the problem of computing the derivative of some function $y(x)$, $x \in [a,b]$, known at the $n$ points $x_1, x_2, \dots, x_n$ of a non-uniform grid $a = x_0 < x_1 < \dots < x_n = b$ and belonging to the space $L_2[a,b]$ (random or deterministic functions). The derivative $\zeta(s)$, $s \in [a,b]$, of the exactly given function $\varphi(x)$ is a solution of the Volterra linear integral equation of the first kind
or of a Fredholm linear integral equation of the first kind
$$\varphi = K\zeta \equiv \int_a^b K(x-s)\,\zeta(s)\,ds = \varphi(x), \quad x \in [a,b],$$
where $K(x-s)$ is the Heaviside function
It is known that this is an ill-posed problem; that is, a regularized algorithm must be applied to solve it. On the other hand, it is a typical problem of passive experiment of regression type. Two questions arise here: how to construct the approximation correctly and how to apply the regularization method [Mechenov 1988]. Theorem 3.5.2. Approximating Eq. (3.5.4) by the rectangle formula on a non-uniform grid, with the Heaviside core approximated by a lower triangular matrix $K_h$ of the form
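leads to a result equal to numerical differentiation. Proof. Indeed, assuming the values $\varphi_1, \varphi_2, \dots, \varphi_n$ to be given ($\varphi_1$ nonzero) and setting $\varphi_0 = 0$, we obtain
$$\frac{h_1}{2}\zeta_{1/2} = \varphi_{1/2}, \quad \text{where } \varphi_{1/2} = \frac{\varphi_1 + \varphi_0}{2}, \quad \text{that is,} \quad \zeta_{1/2} = \frac{\varphi_1 - \varphi_0}{h_1},$$
and so on. In the same way,
$$h_1\zeta_{1/2} + \dots + \frac{h_n}{2}\zeta_{n-1/2} = \varphi_{n-1/2}, \quad \text{where } \varphi_{n-1/2} = \frac{\varphi_n + \varphi_{n-1}}{2}, \quad \text{that is,} \quad \zeta_{n-1/2} = \frac{\varphi_n - \varphi_{n-1}}{h_n}.$$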
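The statement of the theorem can be reproduced numerically. The sketch below assumes a lower triangular matrix with the full steps $h_j$ below the diagonal and the half steps $h_i/2$ on the diagonal, and right-hand sides taken as midpoint averages of the grid values; it subtracts $\varphi_0$ so that the case $\varphi_0 \neq 0$ is also covered, which is a small generalization of the proof. The function name and the test grid are hypothetical.

```python
def differentiate_by_volterra(x, phi):
    """Solve the midpoint-rule discretization of the Volterra equation
    int_a^x zeta(s) ds = phi(x) - phi(a) by forward substitution.

    Row i of the assumed lower triangular matrix is (h_1, ..., h_{i-1}, h_i / 2).
    Returns the values of zeta at the interval midpoints.
    """
    n = len(x) - 1
    h = [x[i + 1] - x[i] for i in range(n)]
    rhs = [(phi[i + 1] + phi[i]) / 2 - phi[0] for i in range(n)]
    zeta = []
    for i in range(n):
        accumulated = sum(h[j] * zeta[j] for j in range(i))
        zeta.append((rhs[i] - accumulated) / (h[i] / 2))
    return zeta


x = [0.0, 0.1, 0.35, 0.5, 1.0]                     # non-uniform grid
phi = [xi ** 2 for xi in x]                         # exactly given function

zeta = differentiate_by_volterra(x, phi)
divided_differences = [
    (phi[i + 1] - phi[i]) / (x[i + 1] - x[i]) for i in range(len(x) - 1)
]
```

The two lists agree up to rounding, so without regularization the scheme inherits the instability of divided differences when the data contain errors.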
Thus the matrix $C_hL_{hh}$ approximating the differential operator in the regularization method for this problem will be the same as in item 3.5.1.3; that is, it will correspond to the rectangle formula. Remark 3.5.2. The numerical solution of a Volterra linear integral equation of the first kind with a nonsingular core can be considered similarly to the problem of differentiation. The obtained result is easily carried over to this case.
3.5.3 Approximation of the Sturm-Liouville Operator of the n Order
We consider the construction of a difference analogue of the smoothing functional with a stabilizing operator of order $n$ [Tikhonov & Samarskij 1962]. As practice shows, to compute a sufficiently smooth solution it is necessary to use the regularization method of the 4th-5th orders [Mechenov 1977]. Corollary 3.5.3. For the approximation of the smoothing functional of order $n$ for a linear integral equation of the first kind to satisfy the conditions of Theorem 3.5.1, it is necessary and sufficient that the conservative difference scheme $L_{hh}$ approximating the operator $L^{(n)}$ be $C_h$-symmetrical. The proof is similar to that of item 3.5.1. This corollary also holds for the conservative difference schema of the first and second types. The use of difference schema of high-order accuracy [Tikhonov & Samarskij 1962, 1963] in these cases is not justified [Zaikin & Mechenov 1973, 1973b, Mechenov 1977].
3.5.4 About the Method of the Solution of the Algebraic Equation of Euler
The approximation methods explained in the previous items, as well as the methods of estimate construction explained earlier, are frequently based on solving the algebraic Euler equation
where $L$ is a symmetric positive definite matrix. We also need the condition for computing the Lagrange multiplier $\alpha$ from a principle that depends quadratically on $\zeta$, from which the Lagrange multiplier $\alpha$ is calculated by iterative methods. The iterative process requires the numerical solution of the Euler equation. An effective method for solving it is proposed in [Voevodin 1969] and consists of the following: 1) We decompose the matrix $L$ into the product $N^TN$, for example, using the square root method. We denote $d = N\zeta$, $A = HN^{-1}$; then the Euler equation is written as
and the following relations are fulfilled
2) Next, factoring $A = QDR$ by the rotation method (where $Q\{n \times n\}$ is left orthogonal, $R\{n \times k\}$ is right orthogonal, and $D\{n \times k\}$ is upper bidiagonal), we obtain (denoting $f = Rd$ and $u = Qy$, where $|f_\alpha| = |Rd_\alpha| = |d_\alpha|$, $|u| = |y|$) the Euler equation
and, for $r_\alpha = Df_\alpha - u$, the Euler equation
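Step 1 of this method can be illustrated on a small example. The sketch assumes that the algebraic Euler equation has the form $(H^TH + \alpha L)\zeta = H^Ty$, so that after the substitution $d = N\zeta$, $A = HN^{-1}$ it becomes $(A^TA + \alpha I)d = A^Ty$; this form, the helper names and the test data are assumptions for illustration. Step 2 (reduction to bidiagonal form by rotations) is not reproduced; a plain Gaussian elimination is swapped in to solve both systems and compare the results.

```python
import math


def solve(a, b):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def square_root_factor(L):
    """Upper triangular N with L = N^T N (square root method)."""
    k = len(L)
    N = [[0.0] * k for _ in range(k)]
    for i in range(k):
        N[i][i] = math.sqrt(L[i][i] - sum(N[r][i] ** 2 for r in range(i)))
        for j in range(i + 1, k):
            N[i][j] = (L[i][j] - sum(N[r][i] * N[r][j] for r in range(i))) / N[i][i]
    return N


def regularized_gram(A, alpha, W):
    """A^T A + alpha * W."""
    k = len(A[0])
    return [[sum(row[i] * row[j] for row in A) + alpha * W[i][j] for j in range(k)]
            for i in range(k)]


def transpose_times(A, y):
    """A^T y."""
    return [sum(row[i] * yi for row, yi in zip(A, y)) for i in range(len(A[0]))]


H = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
y = [1.0, 0.0, 1.0]
L = [[2.0, -1.0], [-1.0, 2.0]]          # symmetric positive definite stabilizer
alpha = 0.1
k = len(L)

# Direct solution of (H^T H + alpha L) zeta = H^T y.
zeta_direct = solve(regularized_gram(H, alpha, L), transpose_times(H, y))

# Transformed problem: A = H N^{-1}, computed row by row from a N = h_row.
N = square_root_factor(L)
A = []
for h_row in H:
    a = []
    for j in range(k):
        a.append((h_row[j] - sum(a[r] * N[r][j] for r in range(j))) / N[j][j])
    A.append(a)
identity = [[float(i == j) for j in range(k)] for i in range(k)]
d = solve(regularized_gram(A, alpha, identity), transpose_times(A, y))
zeta_transformed = solve(N, d)          # back to zeta from N zeta = d
```

The practical gain of the transformation is that $N$ and $A$ are computed once, after which only the scalar $\alpha$ changes between iterations.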
Remark 3.5.4. This method of numerically solving the algebraic Euler equation is effective for practically all problems, except those in which the matrix $L$ changes completely for each value of the Lagrange multiplier (as, for example, in the LDM and in the other generalizations of the LSM). However, even in these cases it is desirable to apply this method, since it ensures sufficiently well that the theoretical properties of these methods are realized. It is shown in [Mechenov 1977] that the singular value decomposition [Wilkinson 1965] is preferable from the point of view of preserving the properties of the regularization method. Naturally, however, it does not have the same practical advantages, since it takes much longer to run.
3.5.5 About Iterative Methods of the Lagrange Multiplier Computation
The idea of these methods apparently belongs to [Reinsch 1967, 1971], [Kiuru & Mechenov 1971] and [Zaikin & Mechenov 1971]; the most rigorous approach is explained in [Gordonova & Morozov 1972]. We consider the functions depending on the unknown value $\alpha$
where the unknown value $\alpha$ is calculated from one of the conditions
As an example, it is enough to prove the statement for one of them. We consider the proof for the third case, as the most common. Theorem 3.5.4. The function
is a continuous, convex-downward function for $\alpha \in (0,\infty)$, with a range bounded by its limiting values as $\alpha \to 0$ and $\alpha \to \infty$.
Proof. We consider the regularized normal equation
Using the singular value decomposition $H = K\Lambda M$, where $K$ is a left orthogonal and $M$ a right orthogonal matrix and $\Lambda$ is a diagonal matrix, we obtain the equation
Since none of these transformations changes the magnitude of the quadratic forms,
From here, differentiating with respect to the multiplier $\alpha$, we obtain the statement of the theorem. To compute the value of $\alpha$ from the chosen principle, which, according to Theorem 3.5.4, exists and is unique, we use the well-known Newton method [Gordonova & Morozov 1972], which is most effective when the equation is written in reciprocal form.
For the equation of the principle, the Newton method for computing the unknown root has the form
The starting value of $\alpha$ can be calculated by the formula [Zaikin & Mechenov 1971]
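A Newton iteration of this kind can be sketched for the residual principle in the simplest setting. The sketch assumes the stabilizer $L = I$ and data already rotated by the singular value decomposition, so that the residual is $\rho(\alpha) = \sum_i \left(\alpha u_i/(\lambda_i^2 + \alpha)\right)^2$, and it applies Newton's method in the variable $\mu = 1/\alpha$, in which $\rho$ is decreasing and convex, starting from $\mu = 0$. This is a standard variant swapped in for illustration; it is not the formula of the text, and the names, the singular values `lam`, the data `u` and the level `delta2` are hypothetical.

```python
def residual(mu, lam, u):
    """Residual as a function of mu = 1/alpha: sum u_i^2 / (1 + mu*lam_i^2)^2."""
    return sum(ui ** 2 / (1.0 + mu * li ** 2) ** 2 for li, ui in zip(lam, u))


def residual_derivative(mu, lam, u):
    """Derivative of the residual with respect to mu (always negative)."""
    return sum(-2.0 * ui ** 2 * li ** 2 / (1.0 + mu * li ** 2) ** 3
               for li, ui in zip(lam, u))


def alpha_by_residual_principle(lam, u, delta2, tol=1e-14, max_iter=200):
    """Newton iterations in mu for residual(mu) = delta2; returns alpha = 1/mu.

    Requires 0 < delta2 < sum(u_i^2) and nonzero singular values.
    """
    mu = 0.0
    for _ in range(max_iter):
        step = (residual(mu, lam, u) - delta2) / residual_derivative(mu, lam, u)
        mu -= step
        if abs(step) <= tol * (1.0 + abs(mu)):
            break
    return 1.0 / mu


lam = [2.0, 1.0, 0.1]        # hypothetical singular values
u = [1.0, 0.5, 0.2]          # hypothetical rotated right-hand side
delta2 = 0.05                # squared error level

alpha = alpha_by_residual_principle(lam, u, delta2)
```

Because the function is convex and decreasing in $\mu$, the iterates started at $\mu = 0$ approach the root from one side, so no safeguarding against negative values of $\alpha$ is needed in this variant.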
If the derivative at $\alpha_{k-1}$ is replaced by an approximation expressed through $\alpha_{k-1}$ and $\zeta_{\alpha_{k-1}}$, this leads to the simplified formula
which requires computing only $\zeta_{\alpha_{k-1}}$. The same linearized formula was used for the residue principle in the standard program [Kiuru & Mechenov 1971]. In effectiveness, it coincides with the secant method (regula falsi) [Mechenov 1977], which, as is known [Ostrowski 1973], is in this case more effective than the Newton method.
3.5.6 An inference of Chapter 3
The given methods were applied to a large class of Mössbauer spectroscopy problems [Andrianov et al. 1988], [Gor'kov & Mechenov 1988], [Gor'kov & Mechenov 1990], [Rejman et al. 1987]; to the estimation of the electron energy distribution function from various characteristics of a gas discharge [Volkova et al. 1975], [Volkova et al. 1976], [Volkova et al. 1976a], [Volkova et al. 1976b], [Volkova et al. 1978], [Volkova et al. 1978a]; to the calculation of cross-sections of photonuclear reactions from experimental information [Tikhonov et al. 1973]; to the estimation of the principal universal characteristic of a water wheel; and to many others. The basic results of this chapter are: development of the variational approach to constructing pseudosolutions of linear integral equations; development of the statistical variational approach in the regularization method for constructing a pseudosolution of integral equations of the first kind with an approximately known or inaccurately realized operator; determination of the stochastic analog of functions from Sobolev spaces, which are the widely known Gauss-Markov (also called autoregression) processes; construction of a statistical form of the regularization method for solving stochastic integral equations of the first kind; and application of the obtained results to the estimation of derivatives and autocorrelation functions.
Thus the following have been developed: variational methods for constructing a regularized pseudosolution and quasisolution of linear integral equations of the first kind, and a pseudosolution of linear integral equations of the second kind, with deterministic or random passive and active errors of the core and passive errors of the right-hand side; methods for representing and taking into account the a priori information on the required pseudosolution by constructing a mixed model; statistical regularization methods based on taking the a priori information into account; and a numerical realization of these methods.
J. Aitchison, S.D. Silvey, "Maximum-likelihood estimation of parameters subject to restraints," Ann. Math. Stat., 29, 813-828 (1958)
A.C. Aitken, "On least squares and linear combination of observations," Proceedings of the Royal Society of Edinburgh, 55, 42 (1935)
S.A. Ajvazjan, I.S. Enjukov, L.D. Meshalkin, Applied statistics: Research of relationships. A reference edition. Under ed. S.A. Ajvazjan [in Russian]. Moscow: Finance and statistics (1985)
A. Albert, Regression and the Moore-Penrose Pseudoinverse. New York-London: Academic Press (1972)
V.A. Andrianov, M.G. Kozin, A.J. Pentin, V.S. Shpinel, A.S. Mechenov, V.P. Gor'kov, "Distribution of molecular fields and ferromagnetic clusters in diluted alloys PdFe: Mössbauer researches of the paramagnetic phase," Physics of the solid states, 30, No. 11, 3243-3252 (1988)
J. Bachacou, C. Millier, J.P. Masson, Manuel de la programmatheque statistique AMANCE 81. Versailles: INRIA (1981)
N.S. Bakhvalov, Numerical methods. [in Russian] Moscow: Science (1973)
A.J. Barr et al., A User's Guide to SAS-76. SAS Institute Inc. (1976)
R. Bellman, Introduction to matrix analysis. New York: McGraw-Hill Book Company (1960)
I.S. Berezin, N.P. Zhidkov, Computational Methods. [in Russian] Moscow: Science (1966)
J. Berkson, "Are there two regressions?," J. Amer. Statis. Assoc., 45, 164 (1950)
P. Bertier, J.-M. Bouroche, Analyse des donnees multidimensionnelles. 3-e edition. Paris: Presses Universitaires de France (1981)
BMDP Biomedical Computer Programs. Ed. W.J. Dixon. Univ. of California Press (1979)
A.A. Borovkov, Mathematical statistics. [in Russian] Moscow: Science (1984)
A.A. Borovkov, Mathematical statistics. Additional chapters. [in Russian] Moscow: Science (1984a)
A.A. Borovkov, Probability theory. [in Russian] Moscow: Science (1986)
Probability and mathematical statistics. Encyclopedia. [in Russian] Moscow: NIBRE (1999)
D. Conniffe, J. Stone, "A critical view of ridge regression," Statistician, 22, 181-187 (1973)
D.R. Cox, D.V. Hinkley, Theoretical statistics. London: Chapman and Hall (1974)
R. Davies, B. Hutton, "The effect of errors in the independent variables in linear regression," Biometrika, 62, 383 (1975)
E.Z. Demidenko, Linear and curvilinear regression. Moscow: Finance and statistics (1981)
N.R. Draper, H. Smith, Applied regression analysis. 2 ed. New York: J. Wiley (1981)
J. Durbin, "Errors in variables," Rev. Int. Stat. Inst., 22, 23 (1954)
L. Elden, "Algorithms for the regularization of ill-conditioned least squares problems," BIT, 17, 134-145 (1977)
S.M. Ermakov, A.A. Zhiglavskij, Mathematical theory of optimum experiment. [in Russian] Moscow: Science (1987)
D.K. Faddeev, V.N. Faddeeva, Computing methods in linear algebra. [in Russian] Moscow-Leningrad: Fizmatgiz (1963)
V.V. Fedorov, Analysis of experiments at presence of errors in definition of inspected variables. Preprint 2. [in Russian] Moscow: Izd. MGU (1968)
V.V. Fedorov, Theory of optimum experiment (scheduling of regression experiments). [in Russian] Moscow: Science (1971)
V.V. Fedorov, "Regression analysis at presence of errors in definition of a predictor," Problems of cybernetics, Ed. 47, 69 (1978)
D.W. Forrest, Francis Galton. The life and work of a Victorian Genius. London: Paul Elek (1974)
R. Frisch, Statistical confluence analysis by means of complete regression systems. Oslo (1934)
W. Fuller, "Properties of some estimators for the errors-in-variables model," Ann. Stat., 8, 407 (1980)
V.Ya. Galkin, A.S. Mechenov, "Regularization of Linear regression problems," Computational Mathematics and Modeling, 13, No. 2, 186-200 (2002)
F. Galton, Hereditary genius. An inquiry into its laws and consequences. London: Macmillan (1869)
F. Galton, Inquiries into human Faculty and its development. London: Macmillan. XXII (1883)
F. Galton, "Regression towards mediocrity in hereditary stature," Journal of the Royal Anthropological Institute, 15, 246-249 (1885)
F. Galton, Natural inheritance. London: Macmillan (1889)
F. Galton, "Probability, the foundation of eugenics," The Herbert Spencer lecture delivered on June 5. Oxford: Clarendon Press (1907)
F.R. Gantmacher, Theorie des matrices. V. 1: Theorie generale. Paris: Dunod, V. 2: Questions speciales et applications. Paris: Dunod (1966)
K.F. Gauss, Theoria motus corporum coelestium in sectionibus conicis solem ambientium, Hamburgi (1809)
K.F. Gauss, Theoria combinationis observationum erroribus minimis, Traduction francaise par J. Bertrand, Paris (1855)
G. Golub, W. Kahan, "Calculating the singular values and pseudoinverse of a matrix," J. SIAM Numer. Anal., Ser. B, 2, 205-224 (1965)
V.I. Gordonova, V.A. Morozov, Analysis of numerical algorithms of the choice of parameter in the method of regularization. Scientific report 161-TZ(507). [in Russian] Moscow: Izd. MGU (1972)
V.P. Gor'kov, A.S. Mechenov, "Numerical methods for solution of inverse problems of Mössbauer spectroscopy," Numerical methods for solution of inverse problems of mathematical physics. Moscow: Izd. MGU, 69-74 (1988)
V.P. Gor'kov, A.S. Mechenov, "Estimation of performances of hyperfine interactions of passage 3Q-1 Q from experimental Mössbauer spectra," Materials III All-Union meetings on nuclear spectroscopic researches of hyperfine interactions. Moscow: Izd. MGU, 173-179 (1990)
A.E. Hoerl, "Application of Ridge analysis to regression problems," Chem. Eng. Progr., No. 58, 54-59 (1962)
A.E. Hoerl, R.W. Kennard, "Ridge regression: biased estimation for non-orthogonal problems," Technometrics, 12, No. 1, 55-67 (1970)
S. Hodges, P. Moore, "Data uncertainties and least squares regressions," Appl. Stat., 21, 185 (1972)
I.M. Islamov, "Asymptotic regularization of an ill-posed problem of identification," Zh. Vychisl. Matem. i Mat. Fiziki, 28, No. 6, 815-824 (1988)
I.M. Islamov, "Asymptotically optimum method of a solution of an ill-posed problem of classification," Zh. Vychisl. Matem. i Mat. Fiziki, 29, No. 6, 26-38 (1989)
V.K. Ivanov, "About Linear ill-posed problems," Dokl. Akad. Nauk SSSR, 145, No. 2, 270-272 (1962)
V.K. Ivanov, "About ill-posed problems," Matem. sbornik, 61, No. 2, 187-199 (1963)
A.G. Jagola, "About a choice of parameter of a regularization on the generalized principle of the residual," Dokl. Akad. Nauk USSR, 245, No. 1, 37-39 (1979)
A.G. Jagola, "Generalized principle of the residual in reflexive spaces," Dokl. Akad. Nauk SSSR, 249, No. 1, 71-73 (1979a)
A.G. Jagola, "About choice of regularization parameter by solution of ill-posed problems in reflexive spaces," Zh. Vychisl. Matem. i Mat. Fiziki, 20, No. 3, 586-596 (1980)
A.G. Jagola, "About solution non-linear ill-posed problems with help of generalized residue method," Dokl. Akad. Nauk SSSR, 252, No. 4, 810-813 (1980a)
M.G. Kendall, A. Stuart, Advanced theory of statistics. Vols. 1-3. London: Griffin (1967-1969)
G. Kim, "About statistical models of research of rounding errors in problems of Linear algebra," Computing methods and programming, Ed. XVIII. Moscow: Izd. MGU, 173-187 (1972)
E.M. Kiuru, A.S. Mechenov, Standard program of the solution of integral equations of Fredholm of the first kind by the regularization method. Ed. 45. [in Russian] Moscow: Izd. MGU (1971)
G.P. Klimov, A.D. Kuz'min, Probability, processes, statistics. Problems with solutions. [in Russian] Moscow: Izd. MGU (1985)
A.N. Kolmogorov, Success of Mathematical Sciences, 1, Ed. 1, 57-70 (1946)
A.N. Kolmogorov, Foundations of the Theory of Probabilities. Bronx, New York: Chelsea (1956)
A.N. Kolmogorov, S.V. Fomin, Elements of Function Theory and Functional Analysis. Moscow: Nauka (1972)
T.C. Koopmans, Linear regression analysis of economic time series, Haarlem (1937)
A.E. Korobochkin, A.Kh. Pergament, "A priori information and an exactitude of a solution of Fredholm integral equations of the first kind," Theory and methods of a solution of ill-posed problems and their application. Novosibirsk: VC AN USSR, 122 (1983)
A.V. Krianev, "Statistical form of a regularized method of least squares of A.N. Tikhonov," Dokl. Akad. Nauk USSR, 291, No. 4, 780-785 (1986)
L. Lebart, A. Morineau, SPAD - Systeme portable pour l'analyse des donnees. Paris: Cesia (1982)
A.L. Legendre, Nouvelle methode pour la determination des orbites de cometes. Paris: Courcier (1806)
A.S. Leonov, "Method of a minimum pseudoinverse matrix for a solution of ill-posed problems of linear algebra," Dokl. Akad. Nauk USSR, 285, No. 1, 36-40 (1985)
A.S. Leonov, "Method of a minimum pseudoinverse matrix," Zh. Vychisl. Matem. i Mat. Fiziki, 27, No. 8, 1127-1138 (1987)
A.S. Leonov, "To the theory of a method of a minimum pseudoinverse matrix," Dokl. Akad. Nauk USSR, 314, No. 1, 89-93 (1990)
A.S. Leonov, "Method of a minimum pseudoinverse matrix: the theory and numerical realization," Zh. Vychisl. Matem. i Mat. Fiziki, 31, No. 10, 1427-1443 (1991)
Ju.V. Linnik, Method of least squares and bases of the mathematical-statistical theory of handling of observations. [in Russian] Moscow: Fizmatgiz (1962)
D.V. Lindley, "Regression lines and the linear functional relationship," J. Roy. Stat. Soc., 9, Supplement, No. 1 and No. 2, 218 (1947)
E. Lloyd, W. Ledermann, Handbook of applicable mathematics. Volume VI: Statistics, Part A, B. Chichester: John Wiley and Sons (1987)
A. Madansky, "The fitting of straight lines when both variables are subject to error," J. Amer. Stat. Assoc., 54, 173 (1959)
D.W. Marquardt, "An algorithm for least squares estimation of nonlinear parameters," J. Soc. Ind. Appl. Math., 11, No. 2, 431 (1963)
P.C. Mahalanobis, "On the generalized distance in statistics," Proceedings of the National Institute of Science, India, 12, 49-55 (1936)
E. Malenvaud, Methodes statistiques de l'econometrie. Paris: Dunod (1970)
A.A. Markov, Calculus of probabilities. 4 ed. Moscow: State publishing house (1924)
Mathematical encyclopedia. Head ed. I.M. Vinogradov. [in Russian] Moscow: Publishing house Soviet encyclopedia (1985)
A.S. Mechenov, "About expansion of a solution of equation of the first kind in a series of functions," Computing methods and programming. Ed. XXI. Moscow: Izd. MGU, 165-169 (1973)
A.S. Mechenov, Some problems of a numerical solution of linear integral equations of Fredholm of the first kind by the regularization method. Scientific report No. 200-TZ(591). [in Russian] Moscow: Izd. MGU, 1-134 (1973a)
A.S. Mechenov, Numerical solution of linear integral equations of the first kind by a regularization method. [in Russian] Diss. ... Cand. Physical and Mathematical sciences. Moscow: VINITI (1977)
A.S. Mechenov, "About a difference approximation of a regularization method," Handling and interpretation of physical experiments, Ed. 6. Moscow: Izd. MGU, 105-112 (1978)
A.S. Mechenov, "About a stabilizing operator in a regularization method," Republican symposium on methods of a solution of the nonlinear equations and optimization problems. Parnu, June, 5-10 1978. Reports. Tallinn, IK AN ESSR, 67-68 (1978a)
A.S. Mechenov, "Method of a regularization and problem of a linear regression," Methods of mathematical modelling, automation of handling of observations and their applications. Moscow: Izd. MGU, 88-92 (1986)
A.S. Mechenov, Regularized least square method. [in Russian] Moscow: Izd. MGU (1988)
A.S. Mechenov, "About in part approached systems of the linear algebraic equations," Zh. Vychisl. Matem. i Mat. Fiziki, 31, No. 6, 790-799 (1991)
A.S. Mechenov, "A maximum likelihood approach to linear functional relationship," International Congress of Mathematicians ICM-94 Zurich 3-11 August 1994. Abstracts. Short Communications. 153 (1994)
A.S. Mechenov, "The effect of rounding errors for systems of linear algebraic equations," Comp. Maths Math. Phys, 34, No. 10, 1313-1316 (1995)
A.S. Mechenov, "Maximum likelihood approach to parameter estimation for linear functional relationships," Computational Mathematics and Modeling, 8, No. 2, 187-193 (1997)
A.S. Mechenov, "On the confluent approach in regression analysis," Computational Mathematics and Modeling, 9, No. 3, 203-213 (1998)
A.S. Mechenov, "Least Distance Method for a confluent model with homoskedasticity in the matrix columns and in the right-hand side," Computational Mathematics and Modeling, 11, No. 3, 299-304 (2000)
A.S. Mechenov, "Variational problems for the construction of pseudosolutions of linear integral equations," Computational Mathematics and Modeling, 12, No. 3, 279-292 (2001)
A.S. Metchenov, Econometrie et informatique appliquees. [in French] Antananarivo, Universite de Madagascar (1981)
V.A. Morozov, "About a choice of regularization parameter at the solution of the functional equations by regularization method," Dokl. Akad. Nauk SSSR, 175, No. 6, 1225-1228 (1967)
98. V.A. Morozov, "About residue principle at the solution of the operational equations by the regularization method," Zh. Vychisl. Matem. i Mat. Fiziki, 8, No. 2, 295-309 (1968)
99. V.A. Morozov, Regular methods of a solution of ill-posed problems. [in Russian] Moscow: Izd. MGU (1974)
100. V.A. Morozov, Methods of Solving incorrectly posed problems. New York: Springer-Verlag (1984)
101. V.A. Morozov, Regularization method of unstable problems. [in Russian] Moscow: Izd. MGU (1987)
102. V.A. Morozov, A.I. Grebennikov, Method of a solution of ill-posed problems: algorithmic aspect. [in Russian] Moscow: Izd. MGU (1992)
103. M.V. Murav'eva, "About an optimality and limiting properties Bayes solutions of systems of the linear algebraic equations," Zh. Vychisl. Matem. i Mat. Fiziki, 13, No. 4, 819-828 (1973)
104. S.M. Nikolsky, Quadrature formula. [in Russian] Moscow: Science (1979)
105. A.M. Ostrowskij, Solution of equations and system of equations [in Russian] Moscow: Science (1963)
106. K. Pearson, "On lines and planes of closest fit to systems of points in space," Philosophical Magazine, series 6, No. 2, 559-572 (1901)
107. G. Peters, J.H. Wilkinson, "The least squares problem and pseudoinversion," Comput. J., No. 13, 309-316 (1970)
108. A.P. Petrov, "About the statistical approach to ill-posed problems of mathematical physics," Transactions of all-Union school of young scientists "Methods of a solution of ill-posed problems and their applications" (October, 9-18, 1973). Moscow: Izd. MGU, 177-181 (1974)
109. A.P. Petrov, A.V. Khovansky, "Estimation of an error of solution of linear problems at presence of errors in operators and right-hand sides of the equations," Zh. Vychisl. Matem. i Mat. Fiziki, 14, No. 2, 157-163 (1974)
110. Ju.P. Pyt'ev, "Problem of reduction in experimental researches," Matem. sbornik, 20, No. 2, 240-272 (1983)
111. C.H. Reinsch, "Smoothing by spline functions," Numer. Math., 10, No. 3, 177-183 (1967)
112. C.H. Reinsch, "Smoothing by spline functions," Numer. Math., 16, No. 5, 451-454 (1971)
113. S.I. Rejman, V.P. Gor'kov, A.S. Mechenov, G.K. Rjasnyj, A.A. Sorokin, "Influence of heat treatment of an amorphous powder of an alloy (Fex Nil, ) P B A1 on the nearest environment of atoms of iron," Physic-chemistry amorphous (glass) metal materials. Moscow: Science, 24-37 (1987)
114. A.A. Samarskij, Theory of difference schema. [in Russian] Moscow: Science (1977)
115. G.A.F. Seber, Multivariate observations, Wiley (1984)
116. E.V. Shikin, Vector space and maps. [in Russian] Moscow: Izd. MGU (1987)
117. S.L. Sobolev, Some applications of a functional analysis in mathematical physics. [in Russian] Leningrad (1950)
118. S.N. Sokolov, I.N. Silin, Determination of a minimum of functionals by the linearization method. Preprint OIIaI, D-810, [in Russian] Dubna (1961)
119. P. Sprent, "A Generalized Least Squares Approach to Linear Functional Relationship," J. Roy. Stat. Soc., 28, 278 (1966)
120. C. Stein, "Multiple regression," Contrib. probab. statist. Stanford Univ. Press, 424-443 (1960)
121. O.N. Strand, E.R. Westwater, "Statistical estimation of the numerical solution of a Fredholm integral equation of the first kind," J. of the Assoc. for Computing Machinery, 15, No. 1 (1968)
122. V.P. Tanana, Method of a solution of the operational equations. Moscow: Science (1981)
123. A.N. Tikhonov, "About a solution of ill-posed problems and a regularization method," Dokl. Akad. Nauk SSSR, 151, No. 3, 501-504 (1963)
218
Alexander S. Mechenov
124. A.N. Tikhonov, "About ill-posed problems of linear algebra and steady methods of their solution," Dokl. Akad. Nauk USSR, 163, No.3, 591-594 (1965)
125. A.N. Tikhonov, "About a stability of algorithms for a solution of the degenerated systems of the linear algebraic equations," Zh. Vychisl.Matem i Mat. Fiziki, 5, No.4,763-765 (1965a)
126. A.N. Tikhonov, "About one principle of reciprocity," Dokl. Akad. Nauk USSR, 253, No. 2, 302-308 (1980)
127. A.N. Tikhonov, "About normal solutions of the approached systems of the linear algebraic equations," Dokl. Akad. Nauk USSR, 254, No.3,549-552 (1980a)
128. A.N. Tikhonov, "About the approached systems of the linear algebraic equations," Zh. Vychid Matem i Mat. Fiziki, 20, No.6, 1373-1383 (1980b)
129. A.N. Tikhonov, "About problems with inaccurately set initial information," Dokl. Akad. Nauk USSR, 280, No. 3,559-562 (1985)
130. A.N. Tikhonov, V.Ja. Arsenin, Method of a solution of ill-posed problems. [in Russian] Moscow: Science (1979)
13 1. A.N. Tikhonov, A.V. Goncharskij, V.S. Stepanov, A.G. Jagola, Regularized algorithms and the a priori information. [in Russian] Moscow: Izd. MGU (1971) 132. A.N. Tikhonov, A.A. Samarskij, "Homogeneous difference schema of the high order of an exactitude on non-uniform grids," Zh. Vychisl. Matem i Mat. Fiziki, 1, No.3, 425-440 (1961) 133. A.N. Tikhonov, A.A. Samarskij, "Homogeneous difference schema on non-unifonn grids," Zh. Vychisl. Matem i Mat. Fiziki, 2, No. 5,812-832 (1962) 134. A.N. Tikhonov, A.A. Samarskij, "About homogeneous difference schema of the high order of an exactitude on non-uniform grids," Zh. Vychisl. Matem i Mat. Fiziki, 3, No. 1, 99-108 (1963) 135. A.N. Tikhonov, M.V. Ufimtsev, Statistical handling of outcomes of experiments. [in Russian] Moscow: M. MGU (1988) 136. A.N. Tikhonov, V.G. Shevchenko, P.N. Zaikin, B.S. Ishhanov, A.S. Mechenov, "About calculation of cut of photonuclear responses under the experimental information," Bulletin of the Moscow university, seriesjizika i astronomiia, 14, No.3, 317-325 (1973) 137. V.F. Turchin, "Solution of the Fredholm equation of the first kind in statistical ensemble of smooth functions," Zh. Vychisl. Matem i Mat. Fiziki, 7 ,No.6, 1270-1284 (1967) 138. V.F. Turchin, "Choice of ensemble of smooth functions at a solution of an inverse problenl," Zh. Vychisl. Matem i Mat. Fiziki, 8, No. 1,230-238 (1968) 139. V.F. Turchin, "Solution of the Fredholm equation of the first kind in statistical ensemble of smooth functions," Zh. Vychisl. Matem i Mat. Fiziki, 9, No. 6, 1276-1284 (1969) 140. V.F. Turchin, V.P. Kozlov, MS. Malkevich, "Use of statistics methods for a solution of illposed problems," Successes Phys,. Sciences, 102, No. 3,345-386 (1970) 141. V.F. Turchin, V.Z. Nozik, "Statistical regularization of a solution of ill-posed problems," Physics of an atmosphere and ocean, 5, No.l,27-38 (1969) 142. A.D. Ventsel, Rate of the theoly of random processes. [in Russian] Moscow: Science (1975) 143. V.V. 
Voevodin, Rounding errors and a stability in direct methods of linear algebra. [in Russian] Moscow: M. MGU (1969) 144. V.V. Voevodin, "On a regularization method," Zh. Vychisl. Matem i Mat. Fiziki, 9, N3,673679 (1969a) 145. L.M. Volkova, A.M. Devjatov, A.S. Mechenov, N.N. Sedov, M.A. Sheriff, "Function of electron distribution on energies by regularization method from probe measurements," Bulletin of the Moscow university, seriesjizika i astronomiia, 16, N3,371-374 (1975) 146. L.M. Volkova, A.M. Devjatov, E.A. Kralkina, A.S. Mechenov, "Definition of a cumulative distribution electron function on energies on intensive spectral lines by a regularization method," Ill-posed inverse problems of atomic physics. Novosibirsk: Institute of exact and applied mechanics, 73-9 1 (1976)
Pseudosolution of Linear Functional Equations
219
147. L.M. Volkova, A.M. Devjatov, E.A. Kralkina, A.S. Mechenov, "About a possibility of detection of structure of a cumulative distribution electron function on energies from intensity a continuous spectrum, "Handling and intelpretation of physical experiments, Ed. 5, Moscow: Izd.MGU, 104-108 (1976a) 148. L.M. Volkova, A.M. Devjatov, E.A. Kralkina, A.S. Mechenov, V.F. Fazlaev, M. Sheriff, "Computation of a cumulative function of electrons' distribution on energies in gas discharge by the regularization method, "Handling and interpretation of physical experiments, Ed. 4, Izd.MGU, 88-96 (1976b) MOSCOW: 149. L.M. Volkova, A.M. Devjatov, E.A. Kralkina, A.S. Mechenov, "About a possibility of detection of structure of a cumulative distribution of electron function on energies," Informations of the Siberian division of the Academy of sciences of the USSR, a series of engineering science, No. 3, Ed. 1, 56-61 (1978) 150. L.M. Volkova, A.M. Devjatov, E.A. Kralkina, A.S. Mechenov, "Application of Abel inversion for definition of some performances of discharge plasma, " Abel's Inversion and itseer) generalizations. Novosibirsk: Institute of exact and applied mechanics, 200-21 1 (1978a) 151. I. Vuchkov, L. Boyadjieva, E. Solakov, Prilozhen linear regression analysis. [in Bolgarien] Sofia (1985) 152. L. Vuchkov, L. Boyadjieva, "Error in the factor levels and estimation of regression model parameters," J. Stat. Comp. Simul., 19, 1-12 (1981) 153. G. Wahba, "Practical approximate solutions to linear operator equations when the data are noisy," SIAM J. Numer. Anal., 14, No.4,651-667 (1977) 154. N.A. Weiss, Introductory statistics, New York, (1995) 155. J.H. Wilkinson, Rounding errors in algebraicprocesses, Prentice Ha11 (1963) 156. J.H. Wilkinson, The algebraic eigenvalueproblem, Oxford, Clarendon Press (1965) 157. J.H. Wilkinson, "A priori error analysis of algebraic processes," ICM-66. Abstracts of reports. Moscow. 119-129 (1966) 158. J.H. 
Wilkinson, "The classical error analysis for solution of linear systems, " J. Inst. Math. Appl. 10. 175-180 (1974) 159. M.K. Winkler, "ALAA Paper No. 69-881," A M Guidance, Control and Flight Mechanics Con$, August 18-20,1969 (1969) 160. P.P. Zabreyko, A.I. Koshelev, M.A. Krasnoselskij, S.G. Mikhlin, L.S. Rakovtshik, V.Ja. Stecenko, Integral equation. [in Russian] Moscow: Science (1968) 161. P.N. Zaikin, A.S. Mechenov, Some problems of a numerical solution of integral equations of the first kind by the regularization method. Scientific report 144-TZ(468). [in Russian] Moscow: Izd.MGU (1971) 162. P.N. Zaikin, A.S. Mechenov, "Some problems of numerical realization of regularized algorithm for linear integral equations of the first kind, " Computing methods and programming, Ed. XXI. MOSCOW: Izd.MGU, 155-164 (1973) 163. P.N. Zaikin, AS. Mechenov, "About a choice of a regularized operator at a solution of the operational equations of the first kind," Some problems of the automated handling and interpretation ofphysical experiments, Ed. 1. Moscow: Izd.MGU, 202-208 (1973a) 164. P.N. Zaikin, A.S. Mechenov, "About numerical realization of a regularization method for solution of linear integral equations of the first kind," Some problems of the automated handling and interpretation of physical experiments, Ed. 2. Moscow: Izd.MGU, 124-140 (1973b) 165. A.I. Zhdanov, "Solution of the ill-posed stochastic linear algebraic equations a regularized method of a maximum likelihood," Zh. Vychisl. Matem i Mat. Fiziki, 28, No. 9, 1420-1424 (1989) 166. .L. Zhukovskij, "Stochastic regularization of systems of algebraic equations," Zh. Vychisl. Matem i Mat. Fiziki, 12, No. 1, 185-191 (1972)
Alexander S. Mechenov 167. E.L. Zhukovskij, "About the generalized solution of systems of linear algebraic equations," Dokl. Akad. Nauk USSR, 232, No. 2,269-270 (1977a) 168. E.L. Zhukovskij, "Method for the degenerated and badly stipulated systems of linear algebraic equations," Zh. Vychisl.Matem i Mat. Fiziki, 17, No.4,814-817 (1977b) 169. E.L. Zhukovskij, V.A. Liptser, "On pseudoinverse Bayes regularization of system of algebraic equations," Zh. Vychisl.Matem i Mat. Fiziki, 12, No. 12,464-465 (1975) 170. E.L. Zhukovskij, V.A. Morozov, "About sequential Bayes regularization of system of algebraic equations," Zh. Vychisl.Matem i Mat. Fiziki, 12, No. 12,464-465 (1972)
APPLICATION
Tables of the Models, Quadratic Forms, and Equations

All results are collected in compact tables that characterize each type of experimental material and the corresponding method of parameter estimation.

For systems of linear algebraic equations, the summary tables cover the models considered above. If the description of all errors occurring in the model is simplified as far as possible, that is, if the errors are assumed to be homoscedastic with equal variances, one can construct a summary table of the models, quadratic forms, and equations for estimating the parameters (for computing the pseudosolutions) for all possible combinations of measured and prescribed input data. Table 1 lists the models in the presence of homoscedastic errors. Table 2 shows the models and the quadratic forms in the presence of heteroscedastic errors.

For Fredholm linear integral equations of the first kind, the summary tables cover the elementary cases of the models considered above. Table 3 shows the models of linear integral equations of the first kind with deterministic errors and the regularization methods for computing the pseudosolution. Table 4 shows the models of linear integral equations of the first kind with random errors and the regularization methods for computing the estimate of the unknown solution.
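To make the role of the quadratic form concrete, the following minimal sketch contrasts the two simplest homoscedastic cases for a one-parameter line through the origin. It is an illustration under stated assumptions, not a reproduction of any table entry: the function names are hypothetical, and the confluent case assumes equal error variances in the regressor and the response (variance ratio equal to one). In the regression model only the right-hand side is observed with error, so the least squares method minimizes the sum of (y_i - b x_i)^2; in the confluent model the regressor is observed with error as well, so the least distances method minimizes the same sum divided by 1 + b^2.

```python
import math


def least_squares_slope(x, y):
    """Regression model: errors only in y; minimize sum (y_i - b*x_i)^2."""
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    return sxy / sxx


def least_distances_slope(x, y):
    """Confluent model (assumed equal variances in x and y):
    minimize sum (y_i - b*x_i)^2 / (1 + b^2)."""
    sxx = sum(xi * xi for xi in x)
    syy = sum(yi * yi for yi in y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    if sxy == 0.0:
        raise ValueError("slope is not determined when sum(x*y) is zero")
    d = syy - sxx
    # Stationarity gives sxy*b^2 - d*b - sxy = 0; the root whose sign
    # matches sxy is the minimizer.
    return (d + math.hypot(d, 2.0 * sxy)) / (2.0 * sxy)


x = [1.0, 2.0, 3.0]
y = [1.0, 3.0, 2.0]
b_ls = least_squares_slope(x, y)
b_ld = least_distances_slope(x, y)
```

For exact data both functions return the same slope; for noisy data the least distances slope is never smaller in absolute value than the least squares slope, which is one way to see that the two quadratic forms encode different assumptions about where the errors enter.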
[Table 1. Models in the presence of homoscedastic errors. Table body not recoverable from the source.]
[Table 2. Models and quadratic forms in the presence of heteroscedastic errors. Table body not recoverable from the source.]
[Table 3. Models of linear integral equations of the first kind with deterministic errors and the regularization methods for computing the pseudosolution. Table body not recoverable from the source.]
[Table 4. Models of linear integral equations of the first kind with random errors and the regularization methods for computing the estimate of the unknown solution. Table body not recoverable from the source.]
INDEX
A priori distribution 178
A priori information 29, 30, 110
Abbreviations ix
Active experiment 93
Active-regression experiment 111, 114
Analysis of variance 3
Approximation 144, 151, 180, 210
Arzelà theorem 153, 162
Autocorrelation function 196
Autocovariance 167
Autoregression model 196
Autoregression model of the first order 169
Autoregression process 168, 169, 176, 186
Bayesian approach 30, 110, 186
Best linear unbiased estimate 7, 9
Birkhoff-Khinchin theorem 187
Bivariate model 78
Bivariate regression 81
Borel field 171
Canonical expansion 175
Cauchy-Bunyakovskii inequality 154, 156
Chi-square distribution 54
Confluent analysis 36
Confluent equation 40
Confluent model 37, 39, 64
Confluent-regression analysis 67
Consistent estimate 6
Correlogram 167
Correlation function 175
Covariance matrix 54, 116, 173
Cramer rule 42
Darboux sum 143, 151, 153, 157, 188, 192
Degenerated model 56, 84
Degrees of freedom 54
Diagonal matrix 48, 209
Dimension 4
Differential operator 164
Efficient estimate 6
Eigenvalue 48
Ergodic theorem 187
Ergodicity 177
Errors of observation 36
Estimate 6
Estimation 53
Estimator 182
Euclidean norm 170
Euclidean power norm 179
Euclidean sample space 4
Euler equation 34, 104, 148, 151
Fredholm linear integral equation of the first kind 146, 182, 185
Fredholm linear integral equation of the second kind 142, 182, 187
Functional equation of the first kind 2
Functional relationship 2
Galton 3, 38, 74, 76
Gauss 6
Gauss process 171
Gauss-Markov distribution 178, 181, 195
Gauss-Markov process 171, 172
Gauss-Markov theorem 7
Gaussian process 173
Heteroscedastic 138
Heaviside core 205, 206
Hilbert norm 178
Hilbert space 177, 178, 179
Homoscedastic errors 86, 98, 109
Homoscedastic experiment 131
Homoscedastic model 10, 65, 113
Homoscedasticity 45, 107, 115, 121, 125, 135
Ill-posed problem 18, 56, 84, 182
Influent analysis 93
Influent equation 95
Influent model 94
Influent-regression analysis 103
Kolmogorov theorem 172
Labels ix, 233
Lagrange method 34, 157
Lagrange multipliers 41, 116, 120, 208
Lagrangian 41, 60, 124
Least distances method 37
Least squares method 5
Least squares-distances method 68
Lebesgue space 179
Legendre 6, 35
Likelihood function 112, 129
Log-likelihood function 100
Mahalanobis distance 6
Markov 7
Markov process 172, 177
Matrix formulation for linear model 4
Maximum likelihood estimates 5, 96
Maximum likelihood method 5, 96, 117, 127
Mixed model 29, 64, 108
Mechenov two-stage minimization method 40, 119, 129
Multiple linear regression 2
Multivariate normal distribution 176
Newton method 210
Nonlinear analysis 55
Nonlinear model 97
Norm 168
Normal distribution 3, 37, 96
Normal equation 6
Normal process 173
Normal solution 88
Normal vector 115
Orthogonal matrix 209
Orthogonal projection 39
Parameter estimation 3
Parameters 3
Passive-active experiment 117
Passive-active-regression experiment 127
Pearson, Karl 38
Point estimation 5
Positive definite matrix 174
Forecasting using autoregression model 196
Prognosis 197
Program 196
Pseudoinverse 48, 63
Pseudoinverse matrix 33
Pseudosolution 61, 62, 142, 150
Quadratic form 60
Quadratic functional 155
Quasiestimate 21
Quasisolution 21, 90, 149, 184
Random function 166
Random process 166
Random variable 122, 166
Random vector 173
Regression 3
Regression analysis 2
Regression equation 3
Regression line 76
Regression models 2
Regression models passing through origin 52
Regression parameters 3
Regressor 2
Regularization method 115, 147, 201
Regularized distances method 56
Regularized pseudosolution 146, 158, 178, 194, 198
Regularized quasisolution 157, 163, 193, 199
Regularized solution 25
Regularized squares method 26
Residual sum of squares 9, 15, 31, 42, 48, 54, 70
Residue principle 183
Rounding errors 102
Schmidt spectral expansion 156
Simple linear regression 74
Singular models 123, 133, 135
Sobolev norm 170
Sobolev space 164, 165, 166, 194
Stationary process 166
Stochastic model 37, 68, 95, 112, 117, 123
Sturm separation theorem 48
Sturm-Liouville equation 164
Sturm-Liouville operator 153, 162, 165, 206
Sylvester criterion 174
Taylor series 28
Tikhonov 18, 56, 84, 142, 183
Two-stage minimization problem 40, 119, 129
Unbiased estimate 6
Unbiasedness
Uniform distribution 79
Unit vector 178, 179
Variance 12, 53
Variance ratio 53
Volterra linear integral equation 205
Weighted quadratic form 47, 70
Well-posed problem 3, 36, 67, 94, 111, 117, 127, 142
GLOSSARY OF SYMBOLS
$\sim$ is distributed under the law;
$X^T$, $y^T$ is a transposition of the matrix $X$ and of the vector $y$;
$\bar{X}$ is a column vector composed from the rows of the matrix $X$;
$\det F$ is a determinant of the matrix $F$;
$\mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$ is a diagonal matrix with elements $\lambda_1, \lambda_2, \ldots, \lambda_n$ on the principal diagonal;
$I_k$ is an identity matrix of size $k \times k$;
$x_0 = \arg\min f(x)$ is a value of $x$ supplying a minimum of $f(x)$ (similarly $\arg\inf f(x)$);
$\lim$ is a common label of a limit;
$\|x\|^2 = \mathrm{E}\sum_{i=1}^{n} x_i^2$ is a square of the Euclidean norm of random variables;
$\|\xi\|^2 = \sum_{i=1}^{n} \xi_i^2$ is a square of the Euclidean norm of nonrandom variables;
$\|x\|_N^2 = \mathrm{E}\, x^T N^{-1} x$ is a square of the power Euclidean norm of random variables;
$\|\xi\|_N^2 = \xi^T N^{-1} \xi$ is a square of the power Euclidean norm of nonrandom variables;
$\|X\|^2 = \mathrm{E}\sum_{t=1}^{n}\sum_{h=1}^{m} x_{th}^2$ is the sum of squares of matrix elements for random matrices;
$\mathrm{tr}\, F$ is a trace of the matrix $F$;
$\mathrm{E}x$ is the expectation of a random variable $x$;
$\mathrm{D}x$ is a variance of a random variable $x$;
$\mathrm{cov}(x, h)$ is a covariance of the random variables $x$ and $h$;
$N(a, R)$ is a multivariate normal distribution with expectation vector $a$ and covariance matrix $R$;
$\chi_k^2$ is a chi-square distribution with $k$ degrees of freedom;
$\dim V$ is a dimension of the space $V$;
$\mathrm{Proj}_B\, b$ is a projection of a vector $b$ on the set $B$.
Common Labels of Chapter 1
$A = [\Xi, \Phi, H]$ is an explaining matrix of functional relations (capital alpha);
$\alpha = [\beta, \theta, \delta]$ is a full vector of required parameters;
$m + p + k$ is a dimension of the full parameter vector;
$\varphi$ is a vector of the response (right-hand side);
$n$ is a dimension of the right-hand-side vector.

Labels of the Regression Analysis
$H$ is a matrix of theoretical values of a linear functional relation;
$\delta$ is a vector of required parameters;
$k$ is a dimension of the parameter vector;
$\varphi = H\delta$ is a vector of the right-hand side;
$y$ is a random vector of observations of the right-hand side;
$e$ is the errors of observations;
$\Sigma$ is a covariance matrix of the observation errors;
$L[H]$ is a linear manifold;
$L(\delta)$ is a likelihood function;
$\tilde{y}$ is a realization of the random vector $y$;
$\tilde{e}$ is the difference vector $\tilde{y} - H\delta$, the realization of $e$;
$\hat{\delta}$ is the LSM-estimate of the vector of required parameters $\delta$;
$\hat{\varphi} = H\hat{\delta}$ is the LSM-estimate of the right-hand-side vector;
$\hat{e} = \tilde{y} - H\hat{\delta}$ is a residue vector;
$t$ is a vector;
$G$ is a matrix;
$T$ is a matrix;
$A$ is a matrix;
$\sigma^2$ is a variance of the homoscedastic errors of right-hand-side observations;
$\hat{\sigma}^2$ is an estimate of the variance $\sigma^2$;
$M$ is a functional;
$\Delta$ is a matrix of required parameters;
$\Phi = H\Delta$ is a matrix of right-hand sides;
$k \times r$ is a dimension of the matrix of required parameters $\Delta$;
$Y$ is a random matrix of observations of the right-hand side;
$n \times r$ is a dimension of the random matrix of observations;
$E$ is the errors of observations of the matrix of right-hand sides;
$\Gamma$ is a matrix of linear constraints;
$u$ is a vector of the right-hand side of the linear constraints;
$l$ is a number of linear constraints;
$\lambda$ is a vector of undetermined Lagrange multipliers;
$K$ is an inverse matrix;
[symbol illegible] is the LSM-estimate of the vector of required parameters under linear constraints;
$L$ is a matrix of passage from the LSM-estimate under linear constraints to the LSM-estimate;
[symbol illegible] is a residual sum of squares;
$\delta_0$ is a vector of normal parameters;
$d$ is a vector of testing parameters relative to which the normal vector of parameters is sought;
$N$ is a matrix of the power norm;
$r$ is a rank of the degenerated matrix $H$.
Labels For the Analysis of the Mixed Experiments
$d$ is a vector of random required parameters (the a priori information);
$k'$ is a dimension of the vector of random required parameters;
$w$ is a random vector of errors of observations of the a priori information;
[symbol illegible] is a variance of the random observation error vector of the a priori information;
$K$ is a known covariance matrix;
$H^-$ is the generalized pseudoinverse matrix;
$H^+$ is a pseudoinverse matrix;
$\lambda$, $\alpha = 1/\lambda$ is a numerical Lagrange multiplier.

Labels of the Confluence Analysis of Passive Experiment
$\Xi$ is a matrix of a linear functional relation $\varphi = \Xi\beta$;
$\beta$ is a vector of required parameters of the functional relation;
$m$ is a dimension of the vector of required parameters;
$X$ is a random matrix of observations of $\Xi$;
$\tilde{X}$ is a realization of the random matrix of observations $X$;
$C$ is an error of the random matrix of observations $X$;
$M$ is a covariance matrix of the random matrix of observation errors $C$;
$T$ is a covariance matrix between $C$ and $e$;
$(A)_j$ designates row $j$ of the matrix $A$;
$\zeta = (\xi^T, -\varphi^T)^T$ is a vector of unknowns;
$Z = [\Xi, -\varphi]$ is a matrix of unknowns;
$z = (x^T, -y^T)^T$ is a random vector;
$Z = [X, -y]$ is a random matrix;
$w = (c^T, -e^T)^T$ is a vector of random errors;
$\tilde{z} = (\tilde{x}^T, -\tilde{y}^T)^T$ is a vector of realization;
$\tilde{Z} = [\tilde{X}, -\tilde{y}]$ is a matrix of realization of $Z = [X, -y]$;
$R$ is a covariance matrix of the vector $w$;
$Y$ is a covariance matrix of the least distances method;
$\Gamma$ is the matrix of linear constraints constructed from elements of the parameter vector $\beta$, zeros and units;
$\hat{\beta}$ is the LDM-estimate of the parameter vector;
$\hat{e} = \tilde{y} - \tilde{X}\hat{\beta}$ is a residue vector;
$\mu^2$ is a variance of the homoscedastic errors of the observable matrix of the confluent model;
[symbol illegible] is a RSS function;
$\tilde{X}^+$ is a pseudoinverse in the confluence analysis;
$\lambda$ is a minimum eigenvalue of a matrix;
$E$ is a matrix consisting entirely of units;
$K\Lambda M$ is a singular expansion;
$K$ is an orthogonal matrix;
$\Lambda$ is a rectangular diagonal matrix;
$M$ is an orthogonal matrix;
$G[g]$ is a linear manifold;
$Z[z]$ is a linear manifold;
$B$ is a linear manifold;
$\kappa$ is the ratio of variances;
[symbol illegible] is an expectation of the sum of squares of errors of the vector $z$;
$B$ is a matrix of required parameters of the functional relation $\Phi = \Xi B$;
$m \times r$ is a dimension of the matrix of required parameters;
$b$ is a vector of a priori random parameters;
$m'$ is a dimension of the vector of the a priori information on parameters.
Labels of Active And Complicated Experiments of Chapter 2
$\Phi$ is an assigned matrix of inspected predictors;
$F$ is a random matrix of realization of $\Phi$;
$J$ is an error of the random matrix $F$;
$P$ is a covariance matrix of the random error $J$;
$\tilde{F}$ is a realization of the random matrix $F$;
$\tilde{J}$ is a realization of the random matrix of errors $J$;
$\theta$ is a vector of predictor parameters;
$p$ is a dimension of the vector of predictor parameters;
[symbol illegible] $= F\theta$ is a random vector of the response of the structural relationship;
$\eta = F\theta + e$ is a random observation vector of the response vector;
$u = J\theta + e$ is a full random error of the right-hand side $\eta$;
$\tilde{\eta}$ is one realization of the random vector of observations of the response $\eta$;
$\tilde{u} = \tilde{\eta} - \Phi\theta$ is one realization of the full random error of the right-hand side;
$R^2$ is the double negative logarithm of the likelihood function;
$\rho^2$ is a variance of the homoscedastic errors of a row of $J$;
$\hat{\theta}$ is the MLM-estimate of the vector of predictor parameters;
$R^2(\alpha)$ is a log-likelihood function;
$\alpha$ is a parameter;
$t$ is a number of digits in the binary representation of a mantissa;
$f_m(n)$, $f_v(n)$ are functions depending on the computation method of the SLAE solution;
$m^2$ is an expectation of the minus double logarithm of the likelihood;
$\lambda$ is an undetermined Lagrange multiplier;
$\theta_0$, $\delta_0$ is a normal vector;
$t$ is a vector of a priori random predictor parameters;
$p'$ is a dimension of the a priori information vector about predictor parameters.

Labels For Chapter 3
$K = K(x, s)$ is an exact core of a linear integral equation;
$\varphi(x)$ is a right-hand side of a linear integral equation;
$L_2[c, d]$ is a Lebesgue space of square summable functions;
[symbol illegible]$(s)$, $s \in [a, b]$, is an unknown required function;
$W_2^{(1)}[a, b]$ is a Sobolev space of functions that are square summable together with their derivatives;
$K^* = K^*(x, s)$ is a core conjugate to the core $K = K(x, s)$;
$L$ is a linear differential Sturm-Liouville operator of the first order;
$\lambda$ is a numerical parameter;
$\Omega[z]$ is a stabilizing functional, or otherwise a norm of functions in the Sobolev space $W_2^{(1)}[a, b]$;
$\Omega_n[z]$ is a stabilizing functional, or otherwise a norm of functions in the Sobolev space $W_2^{(n)}[a, b]$;
$L_n$ is a differential Sturm-Liouville operator of the order $n$;
$\sigma^2$ is a square of the norm of the right-hand-side errors;
$z(s)$ is an a priori known function;
$M[z]$ is a smoothing functional;
$\gamma^2$ is a square of the norm of the exact solution;
$\mu^2$ is a square of the norm of the error of the representation of the operator;
$K_{h\tau}$ is a matrix approximating a core;
$\tau$, $h$ are the approximation grids;
$(\Omega, \mathcal{F}, P)$ is a probability space;
$\Omega = \{\omega\}$ is a space of simple events;
$\mathcal{F}$ is a $\sigma$-algebra of its subsets;
$P$ is a probability;
$(X, B)$ is a measurable space;
$x$ is a random variable;
$F$ is a measure on $(X, B)$, the distribution of a random variable $x$;
$F_{x, h, \ldots, z}(A)$ is a joint distribution of the random variables $x, h, \ldots, z$;
$x_t(\omega)$ is a random function that is measurable in $\omega$ for every $t \in T$;
$\mu(t) = \mathrm{E}\, x(t)$ is the moment of the first order of a random process;
$K(t, s) = \mathrm{E}(x(t) - \mu(t))(x(s) - \mu(s))$ is the moment of the second order of a random process;
[symbol illegible] is an expectation;
$\rho_q$ is an autocorrelation coefficient;
$\rho_q$, $q = 0, 1, 2, \ldots$, is a correlogram;
$T$ is a set;
$\{R_t, d, t \in T\}$ is a metric space;
$R^T$ is a measurable space;
$L_2(\Omega, \mathcal{F}, P)$ is a Hilbert space of square integrable random variables;
$W_2^{(1)}(\Omega, \mathcal{F}, P)$ is a Sobolev space of square integrable random variables with square integrable derivatives;
$\|\cdot\|_{L_2}$ is a formal norm (without the expectation operation) for random variables;
$\|\cdot\|_{l_2}$ is a formal difference analogue of the norm (without the expectation operation) for random variables;
$\|\cdot\|_{W_2^{(n)}}$ is a formal Sobolev norm (without the expectation operation) for random variables;
$\|\cdot\|_{w_2^{(n)}}$ is a formal difference analogue of the Sobolev norm (without the expectation operation) for random variables;
$W$ is a set;
$Q$ is an orthogonal matrix;
$D$ is a rectangular two-diagonal matrix;
$R$ is an orthogonal matrix;
$K$ is an orthogonal matrix;
$L$ is a rectangular diagonal matrix;
$M$ is an orthogonal matrix;
$\alpha$ is a regularization parameter.