Lecture Notes in Control and Information Sciences Edited by M. Thoma and A. Wyner
160 P.V. Kokotovic (Ed.)
Foundations of Adaptive Control
Springer-Verlag Berlin Heidelberg New York London Paris Tokyo Hong Kong Barcelona Budapest
Series Editors M. Thoma • A. Wyner Advisory Board L. D. Davisson • A. G. J. MacFarlane • H. Kwakernaak • J. L. Massey • Ya. Z. Tsypkin • A. J. Viterbi Editor Prof. Petar V. Kokotović ECE University of California Santa Barbara CA 93106 USA
ISBN 3-540-54020-2 Springer-Verlag Berlin Heidelberg New York ISBN 0-387-54020-2 Springer-Verlag New York Berlin Heidelberg This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in other ways, and storage in data banks. Duplication of this publication or parts thereof is only permitted under the provisions of the German Copyright Law of September 9, 1965, in its current version, and a copyright fee must always be paid. Violations fall under the prosecution act of the German Copyright Law. © Springer-Verlag Berlin, Heidelberg 1991 Printed in Germany The use of registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. Typesetting: Camera ready by author Printing: Mercedes-Druck, Berlin Binding: B. Helm, Berlin 61/3020-543210 Printed on acid-free paper.
Acknowledgements While many people contributed to the organization of the 1990 Grainger Lectures and the publication of this volume, these tasks would not have been accomplished without the tireless work of Peter Sauer and Ioannis Kanellakopoulos. Peter Sauer, Grainger Associate, was in charge of administration, finances, publicity and social programs for the Lectures. Ioannis Kanellakopoulos, Grainger Fellow, performed all the editorial work for the volume, including proofreading, correcting and formatting the texts. Financial support for the Lectures was provided by the Grainger Foundation. Personal attention of David Grainger, President, was extremely helpful.
Petar V. Kokotović Grainger Professor
Preface
In the fourth decade of its development, adaptive control has reached a level at which a critical reexamination of its foundations is useful for both theoretical and practical purposes. A series of sixteen lectures, The 1990 Grainger Lectures, was dedicated to this task. The lectures were delivered at the University of Illinois, Urbana-Champaign, September 28-October 1, 1990. In the creative atmosphere of intense discussions among the lecturers, the scope and content of the lectures were expanded by the addition of new topics and most recent developments. The results of this activity are the revised and enlarged texts in this volume. As becomes clear even after a cursory review of its contents, the volume has gone far beyond its original scope of reexamining the foundations of the field. New solutions are presented for some long-standing open problems, earlier successful approaches are unified, new problems are formulated and some of them solved. Indeed, the title of the volume could have been changed to "Recent Advances in Adaptive Control". However, as all of the new results address fundamental issues, the title Foundations of Adaptive Control is still appropriate. The two parts of the volume deal with adaptive control of linear and nonlinear systems, respectively. Part I contains unifications, reappraisals and new results in adaptive linear control. Most nonlinear problem formulations in Part II are new, and some of their solutions are likely to be further extended and generalized. Part I opens with a view of the history of the field by one of the pioneers of Model Reference Adaptive Control, Bob Narendra. He starts with a wise caveat that even for a professional historian it is hard or impossible to be objective, let alone for a deeply involved participant of many exciting events.
In The Maturing of Adaptive Control, the reader will find the wealth of information and freshness of views that only an eyewitness report can convey. Adaptive control has always been characterized by a great diversity of concepts and algorithms and a continuous search for their unification. As such, it has been a fertile ground for both scientists and inventors who at times speak different languages. What both groups have been lacking is a common set of fundamental concepts in which to communicate the properties of their inventions and theories. A Conceptual Framework for Parameter Adaptive Control by Steve Morse is a major step in this direction. In the near future, his concept of tunability may become as common as the concept of controllability. Just as the proofs of stability and convergence dominated the 1970s, the issues of robustness were central throughout the 1980s. As the next three papers in this volume show, the robustness issues will continue to be among the important
research topics in the 1990s. Petros Ioannou, one of the leading authors on this subject, and Aniruddha Datta have given us a self-contained detailed presentation of the state of the art in Robust Adaptive Control: Design, Analysis and Robustness Bounds. What in the 1980s appeared as a collection of unrelated modifications of adaptive algorithms is now a unified methodology for preventing destabilizing effects of disturbances and unmodeled dynamics. The robustness proofs are at the same time simpler and more powerful. The next two papers present new robustness results for continuous- and discrete-time systems. The main message of both Robust Continuous-Time Adaptive Control by Parameter Projection by Naik, Kumar, and Ydstie, and Stability of the Direct Self-Tuning Regulator by Ydstie, is that to achieve boundedness of all the signals in the presence of disturbances and unmodeled dynamics it is sufficient to introduce a parameter projection modification into the usual gradient update law. Although the continuous-time result by Naik, Kumar and Ydstie was inspired by the discrete-time result of Ydstie, their order is reversed in order to preserve the connection with the preceding continuous-time papers. The contribution of the Soviet Academician Tsypkin, a pioneer of both control theory and stochastic adaptive control, is at the same time historical and innovative. In Adaptive-Invariant Discrete Control Systems, he first reminds us that what today we call disturbance-rejection conditions appeared in the 1940s as selective-invariance conditions. This type of invariance is achievable for disturbances generated from known models and with complete knowledge of the plant.
Tsypkin then introduces the notion of adaptive-invariant systems, which have an analogous invariance property when the parameters of the plant and disturbance generator are unknown. He restricts his presentation to the so-called interval plants, whose parameters are known to belong to given intervals. One of the leading authors in the field of stochastic adaptive control, P. R. Kumar, was invited to summarize the field and reexamine its foundations. In Stochastic Adaptive System Theory: Recent Advances and a Reappraisal, he and Wei Ren have not only accomplished this task, but have also completed the solution of several long-standing open problems. They first provide a unified treatment of identification, prediction and control based on the ARMAX model and prove self-tuning and convergence properties for several adaptive schemes. They then give new general results for parallel model adaptation problems, including output-error identification and adaptive noise and echo cancelling. It is expected that these recent results will give a new impetus to both simpler and more fundamental developments in stochastic adaptive control. Part II of this volume consists of five papers dealing with adaptive control of nonlinear systems. All the authors responded to the invitation to address truly new nonlinear problems without any linear constraints imposed on the nonlinearities. (Since the unknown parameters appear linearly, imposing linear growth constraints or global Lipschitz conditions would make the problems tractable by the methods of adaptive linear control.) The first three papers assume that the full state vector is available for feedback. More realistic situations, where only an output is available for measurement, are considered in the last two papers.
In Adaptive Feedback Linearization of Nonlinear Systems, Kokotović, Kanellakopoulos, and Morse first survey some recent results and then design a new adaptive scheme, applicable to a much larger class of systems than before. The difficulties of the global adaptive stabilization and tracking problems are shown to increase due to two factors: level of uncertainty and nonlinear complexity. The new systematic design procedure is applicable to the highest level of uncertainty, but it limits the nonlinear complexity by the so-called pure-feedback condition. Whenever this assumption is globally satisfied, the results on adaptive stabilization and tracking are global, that is, there is no loss of globality caused by adaptation. A conceptually broader approach to adaptive control of nonlinear systems is presented in Adaptive Stabilization of Nonlinear Systems by Praly, Bastin, Pomet, and Jiang. They unify and generalize most earlier results and classify them according to additional assumptions such as matching conditions (i.e., uncertainty levels) and linear growth constraints. The unification is achieved by a novel Lyapunov approach to the design of direct schemes and by generalizations of equation error filtering and regressor filtering for indirect schemes. The key assumption in this approach is that a Lyapunov-like function exists and depends on the unknown parameters in a particular way. Depending on the properties of this function, various designs are possible, including feedback linearization designs when this function is quadratic in the transformed coordinates. Possibilities of either global or only regional adaptive stabilization are also examined. Processes involving transfer of electromagnetic into mechanical energy require essentially nonlinear models.
One of the most practical representatives of these processes is considered in Nonlinear Adaptive Control of Induction Motors via Extended Matching by Marino, Peresada, and Valigi. Their adaptive design encompasses, as special cases, some earlier designs and patents. However, the assumption of full state measurement is a practical disadvantage which motivates the development of adaptive schemes using only output measurements. Adaptive output-feedback control of nonlinear systems is a new problem of major theoretical and practical interest. In Global Adaptive Observers and Output-Feedback Stabilization for a Class of Nonlinear Systems by Marino and Tomei, this problem is addressed in two stages. First, sufficient conditions are given for the construction of global adaptive observers for single-output systems. The construction makes use of novel filtered transformations, that is, nonlinear changes of coordinates driven by auxiliary filters. At the second stage, an observer-based adaptive output-feedback control is designed for a somewhat narrower class of systems. A different approach to adaptive output-feedback control of nonlinear systems is presented in Adaptive Output-Feedback Control of Systems with Output Nonlinearities by Kanellakopoulos, Kokotović and Morse. The main result of this paper is a nonlinear extension of a 1978 paper by Feuer and Morse. Because of its complexity, this early paper is less well known than the papers mentioned in the historical survey at the beginning of this volume. However, in
contrast to some other results of adaptive linear control, its nonlinear generalization does not impose linear constraints on system nonlinearities. This brief preview suffices to show that the papers in this volume herald a decade of new breakthroughs in adaptive control research in the 1990s. With the already achieved unification of robust adaptive control, more research will undoubtedly be focused on performance. A particularly challenging task will be to develop methods to improve transients and eliminate bursting phenomena. In stochastic adaptive control, the most recent proofs of stability and convergence make it possible to address the robustness issue in a new way. In nonlinear systems, adaptive control has just made its first encouraging steps. They are likely to stimulate research leading to results applicable to larger classes of systems. Rigorous analytical methods of adaptive control may also become a theoretical basis for some developments in neural networks. The intellectual depth and scientific content of adaptive control are growing at an impressive pace. Its mature vitality is as exciting as was its youthful impetuousness.
Petar V. Kokotović Urbana, Illinois February 1991
Contents

Part I: Adaptive Linear Control . . . . . . . . . . . . . . . . . . . . . . 1

The Maturing of Adaptive Control
by K. S. Narendra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

A Conceptual Framework for Parameter Adaptive Control
by A. S. Morse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

Robust Adaptive Control: Design, Analysis and Robustness Bounds
by P. Ioannou and A. Datta . . . . . . . . . . . . . . . . . . . . . . . . 71

Robust Continuous-Time Adaptive Control by Parameter Projection
by S. M. Naik, P. R. Kumar, and B. E. Ydstie . . . . . . . . . . . . . . 153

Stability of the Direct Self-Tuning Regulator
by B. E. Ydstie . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201

Adaptive-Invariant Discrete Control Systems
by Ya. Z. Tsypkin . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239

Stochastic Adaptive System Theory: Recent Advances and a Reappraisal
by W. Ren and P. R. Kumar . . . . . . . . . . . . . . . . . . . . . . . . 269

Part II: Adaptive Nonlinear Control . . . . . . . . . . . . . . . . . . . 309

Adaptive Feedback Linearization of Nonlinear Systems
by P. V. Kokotović, I. Kanellakopoulos, and A. S. Morse . . . . . . . . . 311

Adaptive Stabilization of Nonlinear Systems
by L. Praly, G. Bastin, J.-B. Pomet, and Z. P. Jiang . . . . . . . . . . 347

Adaptive Nonlinear Control of Induction Motors via Extended Matching
by R. Marino, S. Peresada, and P. Valigi . . . . . . . . . . . . . . . . 435

Global Adaptive Observers and Output-Feedback Stabilization for a Class of Nonlinear Systems
by R. Marino and P. Tomei . . . . . . . . . . . . . . . . . . . . . . . . 455

Adaptive Output-Feedback Control of Systems with Output Nonlinearities
by I. Kanellakopoulos, P. V. Kokotović, and A. S. Morse . . . . . . . . . 495
Part I Adaptive Linear Control
The Maturing of Adaptive Control K. S. Narendra Department of Electrical Engineering Center for Systems Science Yale University New Haven, CT 06520, USA.
Prologue History is neither purely subjective nor purely objective. According to Edward Hallett Carr [1], the historian is engaged in a continuous process of molding his facts to his interpretation and his interpretation to his facts. The following paper is a brief history of adaptive control theory during the period 1955-1990 as interpreted by the author, who was also a participant.
1 Historical Background
Control theory is concerned with modifying the behavior of dynamical systems so as to achieve desired goals. These goals include maintaining relevant outputs of a system around desired constant values, assuring that the outputs follow specified trajectories, or more generally ensuring that the overall system optimizes a specified performance criterion. The goal is achieved by computing a suitable control input based on the observed outputs of the system. The fundamental processes involved in controlling a dynamical system consequently include the mathematical modeling of the system, identification of the system based on experimental data, processing the outputs of the system into mathematically convenient forms, using them in turn to synthesize control inputs, and applying the latter to the system to achieve the desired behavior. In the early stages of its development, control theory was primarily concerned with linear time-invariant systems with known parameters. Around the 1940s, electrical, mechanical and chemical engineers were designing automatic control devices in their own fields, using very similar methods but disguised under different terminologies - a situation which is prevalent even today in many of the newer areas of systems theory. Gradually it became apparent that the concepts had a common basis and towards the end of World War II a frequency-domain theory of control that was mathematically elegant and universal in its scope came into existence. During the following decade, even as this theory was being used successfully for the systematic design of innumerable industrial feedback control
systems, it became clear that a new methodology would be needed when substantial amounts of uncertainty are present in such systems. It was at this stage in the history of control theory, in the 1950s, when frequency-domain methods were well established and time-domain methods based on state representation of dynamical systems had not yet been introduced, that the field of adaptive control was born. In the early 1950s there was considerable interest in the design of autopilots for high-performance aircraft. Such aircraft operated over a wide range of speeds and altitudes (according to the standards prevailing at that time). The existing feedback theory was adequate to design an efficient controller at any one operating condition, but could not cope with problems that arose when there were rapid changes in operating conditions. One obvious solution was to store the controller parameter values for the different operating regions and switch between them as the conditions changed. But this process, referred to as gain scheduling, became infeasible when the number of possible operating regimes became very large, and it was generally acknowledged that a more sophisticated approach would be needed to cope with the problem. It also became apparent in the following years that the difficulties encountered in aircraft systems were generic in nature and that very similar problems were encountered over the entire spectrum of systems applications. As in aircraft systems, parameters also tend to vary in most practical systems - ships, cement and paper mills, distillation columns, engines, chemical processes and power systems, to name but a few. The frequency-domain methods available at this time [2,3], even when used efficiently, were inadequate to achieve satisfactory performance in the entire range over which the characteristics of the system varied. 
The demands of a rapidly growing technology for faster and more accurate controllers have always had a strong influence on the progress of automatic control theory and when it became apparent that the existing theory was inadequate to successfully tackle the problems arising in industry, interest gradually shifted to adaptive systems which would adjust their performance with changes in the environments. The term "adaptation" is defined in the dictionary as the modification of an organism or its parts that fits it better for the conditions of its environment. Inspired by this, in 1957, Drenick and Shahbender [4] introduced the term "adaptive system" into control theory to represent control systems that monitor their own performance and adjust their control mechanisms in the direction of improved performance. The following years witnessed considerable debate on the differences between conventional linear control systems and adaptive systems. Further, numerous feedback systems were designed which their proponents considered to possess properties peculiar to adaptive systems. Many of these were collected and presented in survey papers by Aseltine et al. [5] and Stromer [6] indicating that research on adaptive systems was flourishing even in the 1950s. An attempt was made in [5] to categorize adaptive systems into the following five classes depending upon the manner in which adaptive behavior was achieved: (i) Passive Adaptation, (ii) Input-Signal Adaptation, (iii) System Variable Adaptation, (iv) System Characteristic Adaptation, and (v) Extremum Adaptation. The first class contains those systems in which the clever design of
time-invariant controllers results in satisfactory performance over wide variations in the environment. High-gain controllers designed to cope with plant parameter variations belong to this class which, according to the present terminology, would be described as robust rather than adaptive. Class (ii) includes systems which adjust their parameters in accordance with input signal characteristics while classes (iii) and (iv) contain systems in which the control input is based on system outputs, errors or their derivatives or alternately the observed impulse response or frequency response. It is worth remembering that system representation at this time was mainly in terms of transfer functions, and distinctions were made only between inputs, outputs and parameters. Even when the parameters were adjusted on-line, the overall system was analyzed as a linear time-varying system. Adaptive systems belonging to class (v) received more attention than the others during the 1960s and attempted to seek the extremum of a performance criterion (cf. Section 2). Even though the classification of adaptive systems given above is no longer in vogue, the concepts that led to such a classification have endured for over three decades and have been rediscovered by successive generations of systems theorists in their attempts to deal with uncertainty. In the early 1960s, many sessions in the major conferences were devoted to defining adaptive control and numerous definitions were proposed [7,8,9,10]. Yet, thirty years later a universal definition of an adaptive system is still elusive. This is because adaptation is multifaceted in nature and cannot be compressed into a simple statement without vital loss of content. Hence, the many definitions proposed merely reflect the variety of personal visions encountered in the field.
The seeming divergence between the different viewpoints may be partly attributed to the fact that, in the ultimate analysis, an adaptive system is merely a complex feedback system. A designer who sets out to control a system in the presence of a certain class of uncertainty and produces a controller which achieves its objective may conclude that he has designed an adaptive system. Yet, an observer unfamiliar with the process prior to the design of the controller merely observes a nonlinear control system. Hence, among the various definitions proposed, the one due to Truxal [10], that an adaptive system is one which is designed from an adaptive viewpoint, captures much of the difficulty encountered in defining adaptive systems. The fact that most adaptive systems are nonlinear systems came to be realized in the 1960s, when attempts were made to describe them using state equations. For example, if a plant is described by the vector differential equation

ẋ = A(p, θ)x + B(p, θ)r,    y = C(p, θ)x    (1)

with r(t) ∈ ℝ, x(t) ∈ ℝ^n, y(t) ∈ ℝ^m, p ∈ ℝ^{m_1} and θ(t) ∈ ℝ^{m_2}, where r, x, y are respectively the input, state and output, p is an unknown constant vector, and θ(t) a time-varying control parameter vector which is adjusted using the measured signals of the system as

θ̇ = g(y, θ, t),    (2)
it follows that the components θ_i(t) of θ(t) (i = 1, 2, ..., m_2) are no longer "parameters", but are state variables of the overall system described by the equations (1) and (2). Hence, any adaptive system, no matter how complex, is merely a nonlinear feedback system involving estimation and control, and the design of adaptive systems consequently corresponds to the design of special classes of nonlinear systems [11]. During the past three decades great advances have been made in adaptive control theory and adaptive controllers have been used in numerous industrial problems. While we are still very far from designing autonomous systems which will take care of themselves, a considerable amount of insight has been gained regarding the nature of the new problems and the concepts and theoretical questions associated with them. During the early years some of the research efforts led to dead ends and new starts had to be made. There were also periods when the field appeared to be in the doldrums, and these were invariably followed by breakthroughs which led to a resurgence of interest in the field. The main objective of this paper is to trace some of these major developments, particularly those related to continuous-time adaptive systems, and to provide the reader with a sense of the excitement and frustration that have alternately marked the growth of the field.
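The nonlinear character of the closed loop in equations (1) and (2) can be seen in a minimal sketch. The scalar plant, feedback law, and numerical values below are illustrative assumptions, not taken from the text: a first-order plant with unknown constant p is stabilized by u = -theta*x, and the gain is adjusted by theta' = gamma*x^2, so theta is literally a state variable of a nonlinear system.

```python
# Hedged sketch: a scalar instance of equations (1)-(2). The plant
# x' = p*x + u (p unknown), feedback u = -theta*x, and update
# theta' = gamma*x*x are illustrative choices, not the paper's design.
# The products theta*x and x*x couple the "parameter" theta to the
# state x, so the overall (x, theta) system is genuinely nonlinear.
p, gamma, dt = 1.0, 2.0, 1e-3   # true parameter, adaptation gain, step
x, theta = 1.0, 0.0             # initial plant state and gain

for _ in range(20000):          # forward-Euler integration over 20 s
    x_dot = (p - theta) * x     # closed-loop form of equation (1)
    theta_dot = gamma * x * x   # adjustment law, equation (2)
    x += dt * x_dot
    theta += dt * theta_dot

# theta rises until it exceeds p, after which x decays to zero
print(abs(x) < 1e-3, theta > p)
```

Note that analyzing x(t) alone as a "linear time-varying system", as was done in the 1950s, hides exactly this coupling.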
2 Developments in the early 1960s
The early days of adaptive control coincided with one of the most active periods in the history of control theory. The state representation of dynamical systems was introduced at this time and the relationship between frequency- and time-domain methods was a subject of frequent discussion. The maximum principle of Pontryagin [12] which had been introduced in the 1950s and which had opened the way for a systematic study of optimal trajectories in the presence of state and control constraints was gradually being absorbed into the mainstream of control theory. The contributions of Lurie, Popov, Kalman, Yakubovich and others stimulated a great deal of interest in the absolute stability problem and more generally in Lyapunov's methods for establishing the stability or instability of nonlinear systems. This was also the period when advances in stochastic estimation were accelerated by the introduction of the Kalman-Bucy filter, when the linear optimal regulator problem with a quadratic performance index was resolved and when Bellman's dynamic programming method was being increasingly used for sequential planning and optimal resource allocation. All these had a profound impact on researchers in the area of adaptive control and on their understanding of adaptive control processes. Even at this early stage, following the work of Feldbaum [13], it was generally realized that two philosophically different approaches exist for the control of plants with unknown parameters. In the first approach [Fig. 1a], referred to as indirect control (or explicit identification), the unknown plant parameters are estimated on-line and the controller parameters are adjusted based on these estimates. In the second approach, referred to as direct control [Fig. 1b] (or
The Maturing of Adaptive Control
f
7
1P
ec
I
1
/
(a)
•
I
e~
r
I
ik
Ident.~
e¢
i (b)
i
Fig. 1. (a) Direct adaptive control, (b) Indirect adaptive control.
implicit identification) the plant parameters are not estimated but the control parameters are directly adjusted to improve a performance index. A paper by Kalman in 1958 [14] combining recursive estimation with controller design was to a large extent responsible for indirect control being used almost exclusively in the late 1950s. This is reflected in the first book on adaptive control by Mishkin and Braun [15] published in the United States in 1961. Soon after this, however, direct methods also became popular for adaptive control.

2.1 MRAS and STR

As mentioned in Section 1, the aim of control is to keep the relevant outputs of a given dynamical plant within prescribed limits. If y_p denotes the output of the plant and y_m the desired output, the objective is to keep the error e_1 = y_p - y_m within desired limits. If y_m is a constant, the problem is one of regulation, and when y_m is a function of time the problem is referred to as tracking. When the
characteristics of the plant are unknown, both regulation and tracking can be viewed as adaptive control problems. In practice, adaptive control problems can arise in a variety of ways. For instance, in a dynamical plant, the input and output may be the only accessible variables and a parameter θ in the plant may have to be adjusted to minimize some functional of the error e_1 = y_p - y_m. Alternately, a feedforward and/or feedback controller may be used to control the unknown plant and the parameters of the controller may have to be adjusted based on the information obtained as the system is in operation. For mathematical tractability, the plant is generally assumed to be linear and time-invariant with unknown parameters. In such a case whether indirect or direct control is used depends on whether or not the parameters of the plant are estimated explicitly. Much of the research that has been carried out in adaptive control is concerned with two classes of systems called Model Reference Adaptive Systems (MRAS) and Self Tuning Regulators (STR). While the former evolved from deterministic servo problems, the latter arose in the context of stochastic regulation problems. In spite of their different origins they are closely related as is evident from the discussion that follows. If an indirect approach is to be used, the unknown plant parameters are estimated, using a model of the plant, before a control input is chosen. If the problem is one of tracking, the desired output y_m can be expressed as the output of a reference model (ref. Sections 3 and 5). Adaptive systems that make explicit use of such models for identification or control purposes are called MRAS [16]. The Self Tuning Regulator, which is based on the indirect approach, consists of a parameter estimator, a linear controller and a mechanism for computing the controller parameters from those of the plant parameter estimates.
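The direct adjustment of a controller parameter by the tracking error e_1 = y_p - y_m can be sketched with a gradient ("MIT rule") update. Everything below is an invented illustration, not a scheme from the text: a first-order plant with unknown gain k, a reference model with desired gain k0, and a single feedforward gain theta.

```python
import math

# Hedged sketch of a direct MRAS: plant y_p' = -y_p + k*theta*r with
# unknown gain k, reference model y_m' = -y_m + k0*r producing the
# desired output, controller u = theta*r, and the gradient ("MIT rule")
# update theta' = -gamma*e1*y_m. All numerical values are illustrative.
k, k0, gamma, dt = 2.0, 1.0, 1.0, 1e-3
y_p = y_m = theta = 0.0

for step in range(60000):                  # 60 s of forward-Euler steps
    r = math.sin(0.5 * step * dt)          # persistently exciting reference
    e1 = y_p - y_m                         # tracking error e1 = y_p - y_m
    y_p += dt * (-y_p + k * theta * r)     # plant driven by u = theta*r
    y_m += dt * (-y_m + k0 * r)            # reference model output y_m
    theta += dt * (-gamma * e1 * y_m)      # direct gradient adjustment

print(round(theta, 2))                     # settles near k0/k = 0.5
```

With theta = k0/k the plant and model obey the same equation, so e_1 vanishes; the update drives theta toward that matching value without ever estimating k itself, which is exactly the direct (implicit identification) idea.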
Since any estimation scheme can be combined with a suitable control scheme, various possibilities exist for designing such a controller. It is clear that STRs can also be designed based on a direct method in which the regulator parameters are directly updated. In spite of the seeming similarity of MRAS and STR, historically the former has been used mainly in the analysis of continuous-time systems, while the latter has found application in discrete-time systems. Since our interest in this paper is primarily in continuous-time adaptive systems, the emphasis in the following sections will be on MRAS rather than STR. For the sake of historical continuity, we first deal in this section with parameter perturbation methods as well as methods based on sensitivity models, which represented the two principal approaches to adaptive control in the 1960s. All subsequent sections deal with problems which arise in MRAS.

2.2 Parameter Perturbation Method
Extremum adaptation, mentioned in Section 1, was perhaps the most popular method for adaptive control in the 1960s. It had considerable appeal to researchers because of its simplicity and applicability to nonlinear plants, and since it did not require explicit identification of plant parameters. In 1951, Draper and Li [17] had demonstrated that the performance of an internal combustion engine
The Maturing of Adaptive Control
could be optimized using a parameter perturbation method involving perturbation, correlation and adjustment. Assuming that a single parameter value has to be chosen, the parameter is perturbed around a nominal value, the parameter variations are correlated with changes in the performance index, and the result is used in turn to adjust the parameter in the direction of optimum performance. Numerous papers [18,19,20,21] appeared on this subject, and detailed analyses were carried out regarding the amplitude and frequency of the perturbations, the time over which the correlation is to be carried out, and the amount by which the parameter is to be adjusted. The analysis in [21], for example, revealed that sophisticated perturbation methods involving differential equations with multiple time-scales would be needed for a precise analysis of the behavior of even a second-order system with a single adjustable parameter. The problem became significantly more complex when several parameters had to be adjusted simultaneously. To determine the gradient of a performance index in parameter space, the frequencies of perturbation of the different parameters had to be separated, and this in turn resulted in slow convergence. The inability to use the method successfully in practical situations where multiple parameters had to be adjusted, along with the difficulty encountered in determining conditions for stability, resulted in the rapid demise of parameter perturbation methods in the mid 1960s.

2.3 Sensitivity Models

The parameter perturbation method is a direct method for estimating the gradient of a performance index. An alternate gradient approach, based on significantly more prior information concerning the plant, used sensitivity models for the generation of the desired partial derivatives. Given a system with an input-output pair {u, y} and a parameter θ, the partial derivative ∂y/∂θ of the output y with respect to θ can be computed if the structure of the system is known.
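Before turning to sensitivity models, the perturbation-correlation-adjustment cycle just described can be sketched in a few lines. This is only an illustrative caricature: the performance index, probe size and gain are hypothetical, and the sinusoidal perturbation and running correlation of the original method are replaced by a two-sided probe and a finite difference.

```python
# Sketch of the perturbation-correlation cycle of Section 2.2 (all plant
# details are hypothetical): perturb a single parameter theta about its
# nominal value, correlate the perturbation with the measured change in a
# performance index J, and adjust theta against the estimated gradient.

def performance_index(theta):
    # Hypothetical stand-in for a measured index with optimum at theta = 2.0.
    return (theta - 2.0) ** 2 + 1.0

def perturb_correlate_adjust(theta, delta=0.1, gain=0.2, cycles=200):
    for _ in range(cycles):
        # Probe up and down, correlate the probe with the change in J.
        grad_est = (performance_index(theta + delta)
                    - performance_index(theta - delta)) / (2.0 * delta)
        # Adjust in the direction of improving performance.
        theta -= gain * grad_est
    return theta

theta_final = perturb_correlate_adjust(theta=0.0)  # approaches the optimum 2.0
```

With several parameters, each would need its own probe (separated in frequency in the original method), which is exactly the source of the slow convergence noted above.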
For example, if a system is described by the second-order differential equation ÿ + θ ẏ + y = u, the partial derivative ∂y/∂θ = z satisfies the differential equation

z̈ + θ ż + z = −ẏ .
(3)
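A numerical sketch of how the sensitivity model (3) is used: the system and the model are integrated forward together (the Euler integration and the step input are illustrative choices), and the model output z is compared with a finite-difference estimate of ∂y/∂θ.

```python
import numpy as np

# Numerical check of the sensitivity model (3): simulate y'' + theta*y' + y = u
# together with z'' + theta*z' + z = -y', and compare z with a
# finite-difference estimate of dy/dtheta.  Step size and input are
# illustrative choices.

def simulate(theta, T=10.0, dt=1e-3):
    n = int(T / dt)
    y = yd = z = zd = 0.0
    ys = np.empty(n)
    zs = np.empty(n)
    for k in range(n):
        u = 1.0                      # unit-step input (illustrative)
        ydd = u - theta * yd - y     # system: y'' = u - theta*y' - y
        zdd = -yd - theta * zd - z   # sensitivity model (3), driven by -y'
        y += dt * yd
        yd += dt * ydd
        z += dt * zd
        zd += dt * zdd
        ys[k] = y
        zs[k] = z
    return ys, zs

theta0, d = 1.0, 1e-4
y0, z0 = simulate(theta0)
y1, _ = simulate(theta0 + d)
fd = (y1 - y0) / d                   # finite-difference dy/dtheta
err = np.max(np.abs(fd - z0))        # small: z tracks the true sensitivity
```

Because both simulations use the same integration scheme, z matches the finite-difference sensitivity to within the perturbation size.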
By constructing a model described by equation (3), with −ẏ as its input, the desired partial derivative ∂y/∂θ can be generated. Similarly, using a model corresponding to each adjustable parameter, the gradient of y with respect to the parameter vector θ can be determined and in turn used to adjust the parameter vector to improve the performance [22,23,24]. The method using sensitivity models gained great popularity in the 1960s and was extensively used in industrial problems for the optimal adjustment of parameters. Since substantially more information is assumed in this method than in the parameter perturbation method, the convergence of the parameters was significantly faster. In many adaptive situations the structure of the plant may not be known (i.e., a differential equation of the form (3) may not be available). In such cases an identification model of the plant has to be constructed and used in the sensitivity model. The output
Narendra
of the sensitivity model then corresponds to an estimated rather than a true gradient of the system.

2.4 M.I.T. Rule

The first MRAS was proposed in 1958 by Whitaker et al. [25] and contained a reference model whose output represented the desired output of the plant. The parameters θ of the controller were adjusted based on the error e between the output of the reference model and that of the plant. The adaptive law for the adjustment of θ was based on the observation that if the parameters θ change much more slowly than the system variables, then a reasonable approach is to change θ along the negative gradient of e²:

dθ/dt = −γ e ∂e/∂θ .
(4)
However, since the plant parameters were unknown, the components of the sensitivity vector ∂e/∂θ could not be obtained using the sensitivity models described earlier. Instead, they were replaced by available signals according to a "rule" which came to be known as the M.I.T. rule.
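On the classic unknown-gain example, where the sensitivity ∂e/∂θ turns out to be proportional to the model output y_m, rule (4) can be simulated directly; all numerical values below are illustrative.

```python
import numpy as np

# Illustration of the M.I.T. rule (4) on the unknown-gain example:
# plant y_p = k_p W(s)[theta * r], model y_m = k_m W(s)[r], W(s) = 1/(s+1).
# Here de/dtheta is proportional to y_m, so rule (4) reduces to
# theta' = -gamma * e * y_m.  All numerical values are illustrative.

kp, km, gamma = 2.0, 1.0, 0.5     # kp is "unknown" to the adaptive law
dt, T = 1e-3, 100.0
yp = ym = 0.0
theta = 0.0
for k in range(int(T / dt)):
    r = np.sin(0.5 * k * dt)           # reference input
    e = yp - ym                        # output error
    theta -= dt * gamma * e * ym       # M.I.T. rule (4) with de/dtheta ~ y_m
    yp += dt * (-yp + kp * theta * r)  # plant with adjustable feedforward gain
    ym += dt * (-ym + km * r)          # reference model
# theta approaches km / kp = 0.5
```

For small adaptation gains this behaves well; as Section 3 recounts, for larger gains or other inputs such gradient schemes can become unstable.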
3 Adaptive Control of Simple Systems (late 1960s)
3.1 Stable Adaptive Systems

In the gradient methods described thus far for the study of adaptive systems, the emphasis was on their performance. The control parameters were adjusted along the gradient of a performance index. Once the design of the controller was completed, the stability of the overall system was analyzed and conditions were established for local stability. Since adaptive systems are nonlinear nonautonomous systems, the derivation of sufficient conditions for even local stability was not possible in many cases. Since stability is a fundamental requirement for the satisfactory operation of any system, it was suggested in 1963 by Grayson [26] that a reversal of the procedure adopted earlier would be more efficient. He argued that adaptive systems should be designed to be globally stable for all values of a parameter γ belonging to a set S_γ. The optimization of the system could then be carried out by choosing the optimum value γ_opt ∈ S_γ. The above idea can be mathematically stated as follows. Let the differential equation governing an adaptive system be

ẋ = f[x, p, θ, t]
(5)
y = h[x, p, θ, t] ,

where x(t) and y(t) are the state and output of the system respectively, p is a constant but unknown parameter vector, and θ(t) is a control parameter vector.
Let y_m(t) be a desired trajectory. The choice of θ(t) is at the discretion of the designer. Let an adaptive law of the form

θ̇ = g[y, y_m, γ, t]
(6)
exist, such that the nonlinear system described by (5)-(6) is globally stable for all values of the parameter γ in some set S_γ. Then, as stated earlier, the performance of the system can be improved according to some criterion by the choice of γ. The above procedure is very similar to that adopted in the design of optimal linear time-invariant control systems, where the control parameters are chosen subject to the constraint that the poles of the overall system lie in the open left half of the complex plane. The suggestion that adaptive control systems should be designed from a stability viewpoint was enthusiastically received by the adaptive control community, particularly since by this time the difficulties associated with establishing the stability of gradient-based schemes were well recognized. Numerous contributions were made to the design of stable adaptive systems, the most notable of which are due to Shackcloth and Butchart [27], Parks [28], Monopoli [29], Philipson [30], and Winsor and Roy [31]. The strict positive real condition, which subsequently became an important design tool, made its first adaptive appearance in [27,28]. It should be borne in mind that in the 1960s many of the subtle questions of stability were not yet well understood. Hence, while the contributions of the early authors were undoubtedly seminal in nature, some of them contained technical flaws which were eventually eliminated in more rigorous analyses [28,32,33]. The 1966 paper by Parks [28], which is of great historical significance in adaptive control, clearly demonstrated using a specific example that gradient methods of adjusting the parameters in an adaptive system can result in instability. At the same time it was also shown that the system could be made globally stable by using a design procedure based on Lyapunov's method.
This clear demonstration that gradient-based adaptive systems could be unstable tolled the death knell of such systems and prompted a gradual shift on the part of researchers toward the design of adaptive systems based on stability methods.

3.2 Lyapunov Design of Simple Adaptive Systems

The concepts used in the design of stable adaptive systems are best illustrated by considering plants which are described by first-order differential equations with unknown parameters. Following this, it can be readily shown that the same concepts carry over to higher-order systems when the state vectors of the plants are accessible. More general cases, where plants of order greater than one have to be controlled using only input and output information, are treated later in Section 5. For a detailed treatment of these topics the reader is referred to [11]. Let a plant with input-output pair {u(·), x_p(·)} be described by the scalar differential equation

ẋ_p = a_p x_p(t) + k_p u(t) ,
(7)
where a_p and k_p are constants and x_p, u : ℝ⁺ → ℝ. Let the control input u(t) be of the form

u(t) = θ(t) x_p(t) + k(t) r(t) ,
(8)

where θ and k are adjustable parameters. It is assumed that the plant parameters a_p and k_p are unknown, but that the sign of k_p is known. The aim of adaptive control is to adjust the control parameters θ(t) and k(t) so that the output x_p(t) follows asymptotically the output x_m(t) of a reference model described by
ẋ_m(t) = a_m x_m(t) + k_m r(t) ,   a_m < 0 ,
(9)
where r(t) is a known bounded piecewise-continuous function of time. Defining
e(t) ≜ x_p(t) − x_m(t), φ(t) ≜ θ(t) − θ*, and ψ(t) ≜ k(t) − k*, where θ* = (a_m − a_p)/k_p and k* = k_m/k_p, and substituting (8) into (7) (noting that a_p + k_p θ* = a_m and k_p k* = k_m), the differential equation describing e(t) can be expressed as

ė(t) = a_m e(t) + k_p φ(t) x_p(t) + k_p ψ(t) r(t) .
(10)
The realization that the analysis is best carried out by considering the error equations, in which only the parameter errors φ and ψ and the output error e(t) appear, was slow in coming. The importance of the error equation is that the adaptive control problem can be stated as a stability problem for the equilibrium state of a set of error differential equations. For instance, if θ(t) and k(t) are to be adjusted adaptively, the problem can be posed as follows: Determine the adaptive laws
θ̇ = φ̇ = f₁(e, x_p, r)
k̇ = ψ̇ = f₂(e, x_p, r)
(11)
so that the equilibrium state of the third-order system described by (10)-(11) is uniformly stable. Choosing as a Lyapunov function candidate

V(e, φ, ψ) = ½ [e² + |k_p| (φ² + ψ²)]

and evaluating the time derivative of V along the solutions of equations (10) and (11) leads to
V̇(e, φ, ψ) = a_m e² + k_p φ e x_p + k_p ψ e r + |k_p| [φ φ̇ + ψ ψ̇] .
(12)
Since φ and ψ are parameter errors which cannot be measured, it follows that φ̇ and ψ̇ should be chosen to cancel the terms in φ and ψ if V̇ is to be non-positive for all e, φ and ψ. This is accomplished by choosing

θ̇ = φ̇ = −sgn(k_p) e x_p
k̇ = ψ̇ = −sgn(k_p) e r ,
(13)

where sgn(k_p) = 1 if k_p > 0 and sgn(k_p) = −1 if k_p < 0. With this choice V(e, φ, ψ) is indeed a Lyapunov function of the system, and it follows that the system is stable. Since V̇ = a_m e² ≤ 0, e ∈ ℒ₂. Furthermore, since by equation (10) ė is bounded, we have lim_{t→∞} e(t) = 0.
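The first-order design (7)-(13) is easy to simulate; the plant values below are illustrative and are, of course, hidden from the adaptive laws, which use only sgn(k_p).

```python
import numpy as np

# Simulation sketch of the first-order adaptive system (7)-(13).  The plant
# is unstable (a_p = 1); a_p and k_p are treated as unknown by the adaptive
# laws, which use only sgn(k_p).  All numerical values are illustrative.

ap, kp = 1.0, 2.0           # "unknown" plant parameters (simulation only)
am, km = -2.0, 2.0          # reference model, a_m < 0
dt, T = 1e-3, 200.0
xp, xm = 1.0, 0.0
theta, kctl = 0.0, 0.0      # adjustable controller parameters
sgn_kp = 1.0                # assumed known sign of k_p
for k in range(int(T / dt)):
    r = np.sin(k * dt)                  # bounded reference input
    e = xp - xm
    u = theta * xp + kctl * r           # control law (8)
    theta -= dt * sgn_kp * e * xp       # adaptive laws (13)
    kctl  -= dt * sgn_kp * e * r
    xp += dt * (ap * xp + kp * u)       # plant (7)
    xm += dt * (am * xm + km * r)       # reference model (9)
# e -> 0; with this (persistently exciting) r, theta and kctl tend toward
# theta* = (am - ap)/kp = -1.5 and k* = km/kp = 1
```

Note that the Lyapunov argument guarantees only e → 0 and bounded parameters; the convergence of θ and k to θ* and k* requires the persistent excitation discussed in Section 4.3.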
The proof given above for a first-order system forms the essence of the proofs given for more general cases. For example, let an nth-order plant be described by the vector differential equation
ẋ_p = A_p x_p + b_m u ,
(14)
where A_p ∈ ℝⁿˣⁿ, b_m ∈ ℝⁿ, x_p : ℝ⁺ → ℝⁿ, and u : ℝ⁺ → ℝ. Let b_m be a known vector and A_p an unknown matrix. Let a reference model be described by the differential equation
ẋ_m = A_m x_m + b_m r ,
(15)
where A_m is an asymptotically stable matrix and r a specified scalar input which is bounded and piecewise-continuous. The objective, as in the scalar case, is to determine the input to the plant so that lim_{t→∞} e(t) = 0, where e(t) = x_p(t) − x_m(t). To assure the existence of a solution, we assume that a vector k* exists such that if u = k*ᵀ x_p(t) + r, both x_p(t) and x_m(t) satisfy identical differential equations. Since A_m is asymptotically stable, it follows that a symmetric positive definite matrix P exists such that
A_mᵀ P + P A_m = −Q ,   Q = Qᵀ > 0 .
(16)
Using V(e, φ) = eᵀ P e + φᵀ φ as a candidate Lyapunov function, where φ(t) = k(t) − k*, it can be shown, using the same approach as in the scalar case, that the adaptive law
k̇(t) = −[eᵀ(t) P b_m] x_p(t)
(17)
yields V̇(e, φ) = −eᵀ Q e ≤ 0, thereby assuring global stability. The procedure outlined above was also used for the identification of dynamical systems [11], and later for both indirect control and combined direct-indirect methods for the adaptive control of systems whose state variables are accessible.

4 Adaptive Observers (early 1970s)
In linear systems theory it is well known that a controllable linear time-invariant system can be stabilized by state feedback. This formed the theoretical basis for the adaptive control methods discussed in the previous section. In more realistic cases the entire state vector cannot be measured and adaptation has to be carried out using only input-output measurements. For linear systems with known parameters, the use of a Luenberger observer to estimate the state variables of the system from input-output measurements had been established by this time. This in turn motivated researchers in the adaptive control field to attempt to estimate the state variables of linear systems from similar data even when the system parameters were unknown. Since the estimation of the state variables using an observer requires knowledge of the system parameters, and estimation of the latter requires knowledge of the state variables, it soon became evident that the problem was substantially more complex in the adaptive
context. An adaptive observer was defined as one which simultaneously estimates the parameters and state variables of a linear time-invariant system from input-output measurements. Around the late 1960s, the field of adaptive control was not particularly popular in the United States. Most of the work on continuous-time adaptive observers in the country was being carried out at two universities: the University of Connecticut under the direction of Lindorff, and Yale University under the direction of Narendra. There was considerable interaction between these two groups and the University of Massachusetts, where Monopoli headed the control activity. During this period it became evident that the structure of the adaptive observer, as well as the adaptive laws, would have to be chosen judiciously if the adaptive observer was to be stable. At the same time, no one was confident that even the existence of such a stable adaptive observer could be established. In the author's opinion, the period 1972-74 constituted one of the important stages in the development of continuous-time adaptive control. The mid 1970s witnessed a renaissance in adaptive control. In Europe, Tsypkin [34] made fundamental contributions to the field and showed that many of the methods proposed for learning and adaptive control could be described in a common framework as special types of recursive equations. Identification methods were extensively investigated, and different estimation methods were made the basis of design procedures. Åström and Wittenmark introduced the self-tuning idea in a seminal paper [35], and studied the asymptotic properties of the self-tuning regulator. On the practical side, minicomputers were used to implement adaptive concepts. In the United States major strides were made in the development of adaptive observers, and a comprehensive survey on the subject by Narendra and Kudva [32,33] appeared in 1974.
In view of the impact of adaptive observers on later developments in adaptive systems, a brief description of the nature of the problems as well as the solutions proposed is included in this section. The first solution to the problem was given by Carroll and Lindorff [36]. Using the Kalman-Yakubovich Lemma, a similar adaptive observer was later proposed by Kudva and Narendra [37] (Fig. 2). Various canonical representations of unknown plants were suggested by Lüders and Narendra [38,39], which eventually led to a substantially simplified non-minimal representation of adaptive observers.

4.1 Minimal Realization of Adaptive Observers

Any single-input single-output linear time-invariant plant which is observable can be parametrized as

ẋ_p = [−a ¦ A] x_p + b u
y_p = x_p1 ,

where aᵀ = [a₁, a₂, ..., aₙ] and bᵀ = [b₁, b₂, ..., bₙ] are unknown, and A is a known matrix [11]. The objective of the adaptive observer is to estimate the vectors a and b as well as the state vector x_p(t) from the input-output data. A
more convenient parametrization for the design of the adaptive observer has the form

ẋ_p = [−ℓ ¦ A] x_p + g y_p + b u
y_p = hᵀ x_p = x_p1 ,
(18)

where ℓ is a known constant vector which makes [−ℓ ¦ A] = K a stable matrix, and g and b are unknown vectors in ℝⁿ which have to be estimated. From the earlier work on simple adaptive systems (Section 3) it was clear that 2n independent signals would be needed to estimate the 2n unknown parameters. However, since only the two signals u and y_p are accessible, the required signals had to be generated as part of the adaptive process. The observer was chosen to have the form

(d/dt) x̂_p = K x̂_p + ĝ(t) y_p + b̂(t) u + v₁(t) + v₂(t) ,
(19)

where v₁(t) and v₂(t) are vectors in ℝⁿ which are at the discretion of the designer, and x̂_p(t), ĝ(t) and b̂(t) represent the estimates of the state vector and the unknown parameter vectors respectively. If e(t) = x̂_p(t) − x_p(t) and e₁(t) = ŷ_p(t) − y_p(t), then subtracting (18) from (19), with φ(t) = ĝ(t) − g and ψ(t) = b̂(t) − b, the error equation has the form
ė = K e + φ y_p + ψ u + v₁ + v₂ .
(20)
In [37] the vectors v₁(t) and v₂(t) are generated as the state variables of two identical controllable linear time-invariant systems with u(t) and y_p(t) as inputs respectively. The adaptive laws for adjusting ĝ(t) and b̂(t) are then derived using the signals v₁(t) and v₂(t), and it is demonstrated that a Lyapunov function exists which assures the stability of the system. The structure of the adaptive observer is shown in Fig. 2.
Fig. 2. The minimal adaptive observer.
4.2 Nonminimal Realization of the Adaptive Observer

The adaptive observer in Fig. 2 is unwieldy because of the signals v₁ and v₂. The question was raised, after the papers [36] and [37] appeared, as to whether it would be possible to eliminate these signals entirely by choosing a different parametrization of the adaptive observer. Several different parametrizations [38,39] were partially successful and eventually led to a final structure based on a nonminimal representation of the model. This structure, suggested by Lüders and Narendra [39], is shown in Fig. 3. Using the input u and output y and two identical controllable systems of dimension n, the signals ω₁ and ω₂ are first generated, and a linear combination of these signals (using 2n parameters) is chosen as the input to a stable first-order system. The output of the latter is used as the estimate of the output of the plant. The output error e₁ together with ω₁ and ω₂ determine the adaptive laws for the estimation of the unknown parameters. In the late 1970s, Kreisselmeier [40] made numerous contributions to adaptive observers. He proposed different structures for such observers and derived conditions under which the parameters would converge arbitrarily fast. One of the significant improvements suggested by him led to an integral adaptive law which improved convergence rates substantially. From a theoretical standpoint, the integral laws enabled the procedures developed for continuous-time systems to be related to those generated using discrete-time systems.
Fig. 3. The non-minimal adaptive observer.
4.3 Persistent Excitation

It was realized even in the early 1970s that the output error in an adaptive system could tend to zero without the parameters converging to their true values. The convergence of the parameter estimates was known to depend upon the properties of certain signals in the system, which in turn depend on the nature of the reference input. The property of the signals which results in the parameter
errors tending to zero is referred to as persistent excitation, and is consequently a central concept in adaptive control. The concept arose in the 1960s in the context of the identification of discrete-time systems, and the term was coined to express the qualitative notion that all the modes of the plant are excited. In 1966, Åström and Bohlin [41] gave a formal statement relating persistent excitation to the convergence of the identification model. Several equivalent definitions, related in one form or another to the existence of a solution to a set of linear equations, appeared soon after. As will be shown in the following sections, persistent excitation also plays a central role in the convergence of adaptive control parameters, as well as in the robustness of such systems in the presence of bounded disturbances. In the late 1970s, numerous workers, including Sondhi and Mitra [42], Morgan and Narendra [43,44], Yuan and Wonham [45] and Anderson [46], addressed the question of persistent excitation in continuous-time systems. While all these papers contain essentially similar results, we describe here those contained in [43] and [44], which pertain to two classes of differential equations and which give rise to different definitions of persistent excitation. These are discussed briefly here since they are relevant for some of the discussions in later sections. The error equations in the adaptive observers discussed in Section 3 and many identification problems have the form
φᵀ(t) u(t) = e₁(t) ,
(21)
where φ(t), u(t) ∈ ℝⁿ, u(t) is a known input, and φ(t) is the parameter error vector. If the adaptive law

φ̇(t) = −e₁(t) u(t)
(22)

is used, we have
φ̇(t) = −u(t) uᵀ(t) φ(t) .
(23)
In [43], necessary and sufficient conditions were derived for the uniform asymptotic stability of the equilibrium state of the differential equation (23) (i.e., the exponential convergence of the parameter error vector to zero). These conditions are also termed the conditions for the persistent excitation of the input vector u. One of several equivalent conditions for the persistent excitation of u is that it satisfy the inequality
∫ₜ^{t+T₀} u(τ) uᵀ(τ) dτ ≥ α I ,   ∀ t ≥ t₀
(24)
for some positive constants t₀, T₀ and α. The second differential equation, analyzed in [44], has the form
ẋ = A x + b φᵀ u(t)
φ̇ = −u(t) bᵀ x ,
(25)
where A + Aᵀ < 0, (A, b) is controllable, and u : ℝ⁺ → ℝⁿ is a bounded piecewise-continuous vector. The necessary and sufficient condition for the
asymptotic stability of the equilibrium state of (25) is that positive constants T₀, δ₀ and ε₁ exist such that for some t₂ ∈ [t, t + T₀]

(1/T₀) | ∫_{t₂}^{t₂+δ₀} uᵀ(τ) W dτ | ≥ ε₁ ,   ∀ t ≥ t₀
(26)
for every unit vector W ∈ ℝⁿ. In the context of the differential equation (25), persistent excitation of u would be defined in terms of the inequality (26). The constant ε₁ in (26) is referred to as the degree of persistent excitation. In continuous-time adaptive systems both equations (23) and (25) find application, and persistent excitation in both cases implies the exponential convergence of the parameter errors to zero. For a long time it had been realized that if the reference input contains n distinct frequencies, the vector ω(t) ∈ ℝ²ⁿ used in the identification of an LTI plant would be persistently exciting, and hence the parameters would converge to their true values. A general statement of this condition was derived in 1983 by Boyd and Sastry [47] using the concept of spectral lines.
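Condition (24) is easy to check numerically. A sketch (window and inputs are illustrative choices): for u(t) = [sin t, cos t]ᵀ every window of length T₀ = 2π yields a Gram matrix close to πI, so the condition holds with α ≈ π, whereas a constant input produces a singular Gram matrix and fails the condition.

```python
import numpy as np

# Numerical check of the persistent-excitation condition (24): compute the
# smallest eigenvalue of the windowed Gram matrix  ∫_t^{t+T0} u u^T dτ  over
# a range of start times.  Inputs and window length are illustrative.

def pe_margin(u, t_grid, T0):
    """Smallest eigenvalue of the windowed Gram matrix over start times."""
    dt = t_grid[1] - t_grid[0]
    n_win = int(round(T0 / dt))
    worst = np.inf
    for k0 in range(0, len(t_grid) - n_win, n_win // 4):
        seg = u[:, k0:k0 + n_win]
        gram = seg @ seg.T * dt                 # ∫ u u^T dτ over the window
        worst = min(worst, np.linalg.eigvalsh(gram)[0])
    return worst

t = np.arange(0.0, 40.0, 1e-3)
u_pe = np.vstack([np.sin(t), np.cos(t)])             # persistently exciting
u_bad = np.vstack([np.ones_like(t), np.ones_like(t)])  # rank-one, not PE
alpha_pe = pe_margin(u_pe, t, T0=2 * np.pi)          # ≈ π > 0
alpha_bad = pe_margin(u_bad, t, T0=2 * np.pi)        # ≈ 0
```

The margin α returned for the sinusoidal input is precisely the constant appearing in (24), and its positivity is what guarantees the exponential convergence of (23).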
5 Breakthroughs in the late 1970s
The results described in Section 3 revealed that linear time-invariant systems whose state vectors are accessible can be controlled in a stable adaptive fashion even when the system parameters are unknown. When it was realized that stable adaptive observers could be designed, which estimate the parameters as well as the state variables of a linear time-invariant system, there was renewed interest in adaptive control in the mid 1970s. Based on well-known results in linear systems theory when the parameters of the system are known, it was felt that the estimates obtained by adaptive observers could be used to control systems with unknown parameters in a stable fashion. In fact, for a short period in 1974, it was even believed that the solution to the adaptive observer problem had also provided the solution to the control problem. It was only during the following months that it was realized that the control problem was more involved. In the adaptive observer, the assumption had been made that the plant is stable and that the input and output of the observed plant are uniformly bounded. But this is no longer valid in the control problem, where it is precisely the boundedness of these signals that has to be established. In fact, it was during one of the sessions at the IFAC World Congress held in Boston in 1975 that it became evident to the leading research workers in the field of adaptive control that the stability problem of adaptive control systems had not been resolved. The next five years witnessed tremendous research activity and culminated in the resolution of the problem in the ideal case in 1980. This result represents the most important development in adaptive systems theory in three decades.
The Adaptive Control Problem. A plant P is represented by the linear time-invariant differential equation

ẋ_p = A_p x_p + b_p u
y_p = h_pᵀ x_p ,
(27)
where u : ℝ⁺ → ℝ is the input, y_p : ℝ⁺ → ℝ is the output, and x_p : ℝ⁺ → ℝⁿ is the state of the plant. The triple {h_pᵀ, A_p, b_p} is assumed to be controllable and observable. The elements of h_p, A_p and b_p are assumed to be unknown. Using only input and output information, the objective is to determine a control function u, using a differentiator-free controller, which stabilizes the overall system. This is referred to as the adaptive regulation problem. If y_m(t) is a uniformly bounded desired output signal that is specified, the problem of tracking is defined as the determination of a control input u, using a differentiator-free controller, such that lim_{t→∞} |y_p(t) − y_m(t)| = 0. As described earlier, in model reference adaptive control (MRAC) the desired output y_m(t) is specified as the output of a reference model whose input r(t) is a uniformly bounded piecewise-continuous function of time. It is worth pointing out that in the initial stages, around 1976, even precise statements of the regulation and tracking problems were not available. In fact, as stated above, it was not clear that a solution existed for either of the two problems. It was only four years later, with perfect hindsight, that a precise problem statement along with a solution was available. The adaptive control problem can also be stated in the frequency domain as follows: A linear time-invariant plant is described by a transfer function

W_p(s) = K_p Z_p(s) / R_p(s) ,
(28)

where Z_p(s) and R_p(s) are monic polynomials in s of degrees m (≤ n − 1) and n respectively, and K_p is a constant. The coefficients of the polynomials as well as K_p are assumed to be unknown. A reference model with an input-output pair {r, y_m}, which is linear and time-invariant with an asymptotically stable transfer function W_m(s), where
W_m(s) = K_m Z_m(s) / R_m(s) ,
(29)
is specified. The objective is then to determine, as mentioned earlier, a control input u to the unknown plant such that

lim_{t→∞} [y_p(t) − y_m(t)] = 0 .
(30)
Once again, the prior information regarding the plant as well as the transfer function Wm(s) that would be sufficient to solve the problem was by no means clear in the early stages. As numerous workers attempted the solution of the problem, the requirements became more evident. It is also worth pointing out that, over a decade after the adaptive control problem for the ideal case was
solved in 1980, numerous workers in the field are concerned at present with questions of existence of solutions for other adaptive control problems. During the period 1975-1978 it was generally realized by those working on the adaptive control of continuous-time systems that the control problem for the case when the relative degree n* (number of poles minus number of zeros) of the plant is unity is substantially simpler than that for the case n* ≥ 2. For such systems, the reference model could be chosen to be strictly positive real, and the powerful results of stability theory developed in the 1960s could be used to prove global stability.

The Adaptive Control Problem for n* = 1. All the idealized adaptive control problems attempted in the 1970s can be conveniently divided into two parts. These are generally referred to as (i) the algebraic part and (ii) the analytic part. The algebraic part is concerned with the existence of a solution to the adaptive control problem. It consists in demonstrating that a control structure with constant parameters θᵢ* (i = 1, 2, ...) exists such that the plant together with the controller has a transfer function identical to that of the reference model. Once such a controller is shown to exist, the analytic part consists in determining stable adaptive laws for the adjustment of the control parameters θᵢ, i = 1, 2, ..., so that they evolve to the desired (but unknown) constant values θᵢ*. Even though work on adaptive systems had been in progress for several years, it was only in 1975 that it was realized that the algebraic and analytic parts of the problem could be treated separately. The structure of a controller for the adaptive control problem defined in this section is shown in Fig. 4. The input u to the plant as well as the output y_p are processed through identical (n − 1)-dimensional stable systems described by the controllable pair (Λ, ℓ).
The output of the former is the (n − 1)-vector ω₁(t), while the output of the latter is the (n − 1)-vector ω₂(t). Together with the reference input r(t) and the output y_p(t), these constitute the 2n signals fed back to the plant. The control input to the plant is then expressed as a linear combination of the above signals as follows:

u(t) = k(t) r(t) + θ₁ᵀ(t) ω₁(t) + θ₂ᵀ(t) ω₂(t) + θ₀(t) y_p(t)
     = [k(t), θ₁ᵀ(t), θ₂ᵀ(t), θ₀(t)] ω(t) ≜ θᵀ(t) ω(t) ,
(31)
where ωᵀ(t) = [r(t), ω₁ᵀ(t), ω₂ᵀ(t), y_p(t)].
The algebraic part consisted in showing that constant values of the parameters exist such that the plant, together with a controller having the structure given by (31) and these parameter values, has a transfer function identical to that of the reference model. Such a controller parameter vector is denoted by θ*. Expressing the parameter error vector as φ(t) = θ(t) − θ*, the output error e₁(t) = y_p(t) − y_m(t) can be expressed in terms of the parameter errors as

W_m(s) φᵀ(t) ω(t) = e₁(t) .
(32)
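The error model (32) passes the time-varying product φᵀ(t)ω(t) through the operator W_m(s). A small numerical sketch, with W_m(s) = 1/(s+1) and arbitrarily chosen signals, illustrates a point that matters below: for a constant φ the operator and the inner product commute, while for a time-varying φ(t) they do not.

```python
import numpy as np

# Numerical illustration of the error model (32): e1 = W_m(s)[phi^T omega].
# For a CONSTANT parameter error phi, filtering and the inner product
# commute: phi^T W_m(s)[omega] = W_m(s)[phi^T omega].  For a time-varying
# phi(t) they do not.  W_m(s) = 1/(s+1) and all signals are illustrative.

def wm_filter(x, dt=1e-3):
    """Forward-Euler simulation of y' = -y + x, i.e. y = [1/(s+1)] x."""
    y = np.zeros_like(x)
    for k in range(1, len(x)):
        y[k] = y[k - 1] + dt * (-y[k - 1] + x[k - 1])
    return y

t = np.arange(0.0, 20.0, 1e-3)
omega = np.vstack([np.sin(t), np.cos(2 * t)])          # regressor signals

phi_const = np.array([0.7, -0.3])                      # constant phi
lhs = phi_const @ np.vstack([wm_filter(w) for w in omega])
rhs = wm_filter(phi_const @ omega)
gap_const = np.max(np.abs(lhs - rhs))                  # ≈ 0: they commute

phi_t = np.vstack([np.sin(0.3 * t), np.cos(0.3 * t)])  # time-varying phi(t)
lhs_t = np.sum(phi_t * np.vstack([wm_filter(w) for w in omega]), axis=0)
rhs_t = wm_filter(np.sum(phi_t * omega, axis=0))
gap_var = np.max(np.abs(lhs_t - rhs_t))                # clearly nonzero
```

This non-commutation is precisely the obstacle that the augmented error, introduced shortly for the case n* ≥ 2, is designed to remove.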
From equation (32) it is clear that when the parameter error is zero, the output error will also tend to zero. The analytic part of the problem was to determine adaptive laws for adjusting θ(t) (or equivalently φ(t)) so that lim_{t→∞} e₁(t) = 0. Several research workers, working independently, showed that the following prior information concerning the plant was sufficient to assure the existence of θ* as well as to develop stable adaptive control laws. The prior information is stated as a set of four assumptions:

Assumptions (I)
(i) The sign of the high-frequency gain K_p of the plant,
(ii) an upper bound on the order of the plant, and
(iii) the relative degree of W_p(s) are known, and
(iv) Z_p(s) is a Hurwitz polynomial in s.

Assumptions (I), under which this idealized problem could be solved, became important in adaptive control and formed the starting point for many investigations in the 1980s. Returning to equation (31), it was shown by Narendra and Valavani [48] that if the transfer function W_m(s) is strictly positive real, the adaptive laws
θ̇(t) = −sgn(K_p) e₁(t) ω(t)
(33)
would result in all the signals in the adaptive system being bounded and the output error e₁(t) tending to zero asymptotically.

The Adaptive Control Problem for n* ≥ 2. Following the solution of the adaptive control problem for the case n* = 1, there was world-wide interest in solving the problem for the general case n* ≥ 2. A relative degree of the plant greater than or equal to two implies that the reference model transfer function W_m(s) cannot be positive real, since the latter must have a relative degree greater than or equal to that of the plant. This in turn implies that in the error equation (32) W_m(s) is not strictly positive real, so that the procedure used in the case n* = 1 can no longer be adopted. During this period it almost seemed impossible to solve the problem. The major breakthrough came from a suggestion made by Monopoli [49] that an augmented error (rather than the true error between plant and model outputs) should be used to generate the adaptive laws. The process involved the addition of auxiliary signals to the error equation (32) to reduce it to a mathematically tractable form. As acknowledged by Monopoli to the author, this idea was strongly influenced by the earlier work on adaptive observers in the early 1970s. Defining

φᵀ(t) W_m(s) ω(t) − W_m(s) [φᵀ(t) ω(t)] ≜ e₂(t) ,
(34)
it follows that
e1(t) + e2(t) = ε1(t) = φᵀ(t)Wm(s)w(t) = φᵀ(t)ζ(t) ,    (35)
22
Narendra
Fig. 4. The adaptive controller for n* = 1.
where e2(t) is referred to as the auxiliary error, ε1(t) is the augmented error, and ζ(t) = Wm(s)w(t). From equation (35) it follows that if e2(t) is added to the true error e1(t), then the resulting augmented error ε1(t) is related to the parameter error vector φ(t) by the equation

φᵀ(t)ζ(t) = ε1(t) ,    (36)
which has been extensively studied in the context of adaptive systems. It was therefore concluded that adaptation should be carried out using the augmented error ε1(t) and the vector signal ζ(t), rather than the true error signal e1(t) and the signal w(t) as in the case when n* = 1. This represented a significant step, since the generation of the adaptive law had finally been separated from the control process itself and the signals associated with it. Following the methods used in adaptive observers, the natural tendency was to choose the adaptive law as
θ̇(t) = φ̇(t) = −ε1(t)ζ(t) = −ζ(t)φᵀ(t)ζ(t) ,    (37)
which assures the boundedness of the parameter vector. However, it was soon realized that this would not assure the boundedness of all the signals in the system. The period 1978-1979 was one of intense activity in the adaptive control area, with different groups working in different parts of the world trying frantically to determine the modifications needed in the adaptive law (37) to assure the global stability of the adaptive system. As has happened so often in the history of science, the breakthrough was made independently by several
groups simultaneously. The first wave of solutions, by Morse [50] and Narendra et al. [51] for continuous-time systems, and by Goodwin, Ramadge and Caines [52] and Narendra and Lin [53] for discrete-time systems, appeared in the June 1980 issue of the IEEE Transactions on Automatic Control, followed by solutions by other authors in later issues. An early unified treatment of these results was given by Egardt [54]. The problem of adaptively controlling a linear time-invariant plant which satisfied the four assumptions given in (I) had been solved. The key to the solutions was the use of a normalization factor in the adaptive law, which assumed the form

θ̇(t) = φ̇(t) = −ε1(t)ζ(t) / (1 + ζᵀ(t)ζ(t)) .    (38)
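A short numerical sketch may make the normalized update concrete. It assumes the standard form φ̇ = −ε1 ζ/(1 + ζᵀζ) for the law, together with an illustrative two-component regressor ζ(t) and an Euler step; all numerical values are hypothetical:

```python
import numpy as np

def normalized_step(phi, zeta, dt=0.01):
    """One Euler step of a normalized gradient law.

    eps1 = phi^T zeta is the augmented error; dividing by
    1 + zeta^T zeta keeps the effective adaptation gain below one.
    """
    eps1 = float(phi @ zeta)
    return phi - dt * eps1 * zeta / (1.0 + float(zeta @ zeta))

phi = np.array([2.0, -1.0])      # initial parameter error (illustrative)
dt = 0.01
for k in range(20000):
    t = k * dt
    zeta = np.array([np.sin(t), np.cos(2.0 * t)])  # persistently exciting
    phi = normalized_step(phi, zeta, dt)

print(np.linalg.norm(phi))       # parameter error has decayed toward zero
```

Because the normalized gain never exceeds one, the parameter error norm is non-increasing at every step; persistent excitation of ζ then drives it to zero.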
This ensured that φ̇ ∈ L2. The boundedness of the parameter vector, together with this condition on the rate of change of the control parameter vector, was sufficient to demonstrate that the signals of the adaptive system would be bounded and that the output error e1(t) would tend to zero asymptotically. Since many of the results derived in the 1980s used arguments similar to those used in the ideal case, the various steps used in the proof of stability are outlined here. (i) Using the adaptive law (38), it follows that the control parameters are bounded. This in turn implies that the state variables of the adaptive system can grow at most exponentially. (ii) From the fact that φ̇ ∈ L2 it is shown that the augmented error ε1(t) is related to ζ(t) as ε1(t) = β(t)√(1 + ζᵀ(t)ζ(t)), where β ∈ L2. (iii) Assuming that the signals grow in an unbounded fashion, the fact that φ̇ ∈ L2 implies that ‖w2‖, ‖ζ‖, ‖w‖ and |yp| grow at the same rate, and that ‖w2‖ grows at a different rate than ‖w‖. This leads to a contradiction, so it is concluded that all signals in the feedback loop are bounded and that lim_{t→∞} e1(t) = lim_{t→∞} ε1(t) = 0. (iv) Finally, it follows that if w(t) is persistently exciting, the parameter vector θ(t) converges to the desired value θ*. For detailed stability arguments based on the growth rates of signals the reader is referred to the paper by Narendra, Annaswamy and Singh [55].

6 Diversity in the 1980s
The solution of the adaptive control problem in the idealized case described in the previous section provided a tremendous impetus to the field and attracted many researchers from different areas of systems theory. During the first half of the decade, four major directions of research emerged. The first dealt with situations where the ideal assumptions (I) are satisfied but adaptation has to be carried out in the presence of different types of perturbations. The second direction of research was towards relaxing the assumptions (I) and determining the least restrictive conditions under which regulation and tracking are possible. The extension of results derived for single-input single-output systems to multivariable
problems was a third direction in which research evolved, and a fourth was aimed at determining the conditions for assuring stochastic stability of adaptive systems when the disturbances are random in nature. Since most of the results in stochastic adaptive control apply to discrete-time systems, and this paper deals mainly with continuous-time adaptive systems, only the developments in the first three directions are discussed here. Further, during the second half of the decade the field continued to expand rapidly and many new areas of research came into existence, so that it becomes difficult to track precisely the different advances in the field. In this section we consequently attempt to provide merely a review of the principal developments during the 1980s.

6.1 Robust Adaptive Control

In the statement of the idealized adaptive control problem treated in Section 5, the plant parameters were assumed to be constant and the system was assumed to be disturbance-free. However, in practice these assumptions are rarely met: no plant is truly time-invariant, finite-dimensional, linear, and noise-free. The question therefore naturally arose as to whether the idealized adaptive laws would perform satisfactorily even when external disturbances and parameter variations are present and imperfect models of the plant are used. The realization that small external and internal perturbations can result in unbounded signals in the system eventually led many researchers to what is termed robust adaptive control, i.e., the search for techniques for achieving bounded response even in the presence of different disturbances. To understand some of the theoretical difficulties encountered in robust adaptive control problems, consider the Lyapunov function V(e, φ) used to prove uniform stability in the ideal case. The time derivative V̇(e, φ) of V along the trajectories of the system is invariably negative semi-definite.
When disturbances and/or unmodeled dynamics are present, V̇(e, φ) is indefinite and V is no longer a Lyapunov function guaranteeing stability. The objective in such cases is to modify the adaptive laws so that V̇ is negative semi-definite outside a compact region containing the origin in the state space, so that the response of the system is bounded.

Bounded Disturbances. For the case of arbitrary bounded disturbances, Egardt [54], Peterson and Narendra [56] and Samson [57] suggested the use of a dead zone in the adaptive law to assure bounded solutions. This modification implies that adaptation is stopped when the error is smaller than a value determined by the prior information concerning the disturbance. Assuming that θ*, the desired control parameter vector, lies inside a sphere S, Kreisselmeier and Narendra [58] demonstrated that the ideal adaptive law, modified only at the boundary of S, would result in all the signals in the adaptive system being bounded. This was the precursor, in the continuous-time case, of the projection algorithms that are commonly used at present in both discrete- and continuous-time systems.
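The dead-zone idea admits a very small sketch. The scalar error model e1 = (θ − θ*)w + d below is a hypothetical stand-in for the error equation of the text, and the threshold e0 plays the role of the value determined by the disturbance bound:

```python
import math

def deadzone_step(theta, e1, w, e0, gamma=1.0, dt=0.01):
    """One Euler step of a gradient law with a dead zone:
    adaptation is simply switched off while |e1| <= e0."""
    if abs(e1) <= e0:
        return theta                      # inside the dead zone: freeze
    return theta - dt * gamma * e1 * w    # ordinary gradient step

theta, theta_star = 0.0, 1.5              # unknown parameter (illustrative)
e0 = 0.2                                  # threshold >= disturbance bound
for k in range(50000):
    t = k * 0.01
    w = math.sin(t)                       # bounded regressor
    d = 0.1 * math.sin(5.0 * t)           # bounded disturbance, |d| <= 0.1
    e1 = (theta - theta_star) * w + d     # assumed scalar error model
    theta = deadzone_step(theta, e1, w, e0)

print(abs(theta - theta_star))            # error reduced and bounded
```

Without the dead zone the disturbance term keeps driving θ and can cause parameter drift; with it, adaptation stops once the error is disturbance-sized, so θ settles near θ* but is not expected to converge exactly.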
The two modifications to the adaptive law suggested above require additional information concerning the disturbance or the plant. In 1983, Ioannou and Kokotovic [59] proposed a modified law which does not require such additional information to assure robustness. In place of the adaptive law θ̇ = −e1 w used in the ideal case, the modified law has the form

θ̇ = −e1 w − σθ .    (39)
As mentioned earlier, one of the major problems in the analysis of adaptive systems using the stability approach is that the time derivative V̇ of the Lyapunov function is semi-definite even in the ideal case. The introduction of the term −σθ in (39) makes V̇ negative outside a compact region containing the origin in the state space, thereby assuring the boundedness of all the signals. In 1987, Narendra and Annaswamy [60] proposed a somewhat different adaptive law based on the same principles, while avoiding some of the shortcomings of (39). This adaptive law has the form

θ̇ = −e1 w − |e1| θ .    (40)
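The two modified laws can be compared side by side in a scalar simulation. The error model e1 = (θ − θ*)w + d and all gains below are illustrative assumptions, not taken from the text:

```python
import math

def run(law, sigma=0.1, dt=0.01, steps=100000):
    """Simulate theta' = -e1*w - sigma*theta  (sigma-modification, (39))
    or          theta' = -e1*w - |e1|*theta (|e1|-modification, (40))
    for an assumed scalar error model with a bounded disturbance."""
    theta, theta_star = 0.0, 2.0
    for k in range(steps):
        t = k * dt
        w = math.sin(t)
        d = 0.2 * math.cos(3.0 * t)              # bounded disturbance
        e1 = (theta - theta_star) * w + d
        if law == "sigma":
            theta += dt * (-e1 * w - sigma * theta)      # law (39)
        else:
            theta += dt * (-e1 * w - abs(e1) * theta)    # law (40)
    return theta

for law in ("sigma", "abs_e1"):
    print(law, run(law))    # both estimates remain bounded
```

The −σθ term in (39) pulls θ toward zero even when e1 = 0, which biases the estimate; the |e1|θ term in (40) vanishes with the error, which is one of the shortcomings of (39) it was designed to avoid.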
The σ-modification law and the |e1|-modification law given in [59] and [60] have been studied extensively and are often used in applications to assure robustness in the presence of bounded external perturbations. In contrast to the above methods, which attempt to assure robustness by modifying the adaptive laws, efforts were also made to use the concept of persistent excitation to ensure the boundedness of all the signals in the system. A significant result in this direction, given by Narendra and Annaswamy [61], states that if the reference input has a sufficiently large degree of persistent excitation relative to the magnitude of the disturbance, then all the signals in the adaptive system will be bounded.

Hybrid Adaptive Control. All the adaptive laws described so far are continuous in time. Using the same controller structure, it can be shown that the parameter vector θ(t) can also be adjusted using a hybrid algorithm while assuring global stability in the ideal case and robustness in the presence of disturbances. In such a case, θ(t) is held constant over intervals of length T and updated at the end of each interval using the information collected. Several such hybrid modifications were given by Narendra, Khalifa and Annaswamy in [62]. A typical adaptive law has the form
θ(t + T) = θ(t) − ∫_t^{t+T} γ e1(τ) w(τ) dτ ,    (41)
so that θ is updated along the average gradient over the interval. In extensive simulation studies, hybrid adaptation was found to be preferable to continuous adaptation as far as robustness is concerned.
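The hybrid law can be sketched as follows. The scalar disturbance-free error model, the interval length T, and the gain γ are illustrative assumptions, and the integral in (41) is approximated by a Riemann sum:

```python
import math

def hybrid_run(T=0.5, gamma=1.0, dt=0.01, intervals=200):
    """theta is frozen on each interval [t, t+T]; at the end of the
    interval it is decremented by the accumulated gradient, as in (41)."""
    theta, theta_star = 0.0, 1.0
    t = 0.0
    for _ in range(intervals):
        acc = 0.0
        for _ in range(int(T / dt)):           # integrate with theta frozen
            w = math.sin(t)
            e1 = (theta - theta_star) * w      # assumed scalar error model
            acc += gamma * e1 * w * dt
            t += dt
        theta -= acc                           # update at t = T, 2T, ...
    return theta

print(hybrid_run())    # theta converges toward theta_star
```

Since θ is frozen while the gradient is accumulated, each update contracts the parameter error by a factor 1 − γ∫w² dτ over the interval, which is what makes the averaged step well behaved.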
Time-Varying Systems. One of the compelling reasons for considering adaptive control is to compensate for time variations in plant parameters. While much of the research reported thus far has dealt with plants with constant parameters, interest in adaptive control in the presence of plant parameter variations has existed from the very beginning. In the early stages, most researchers made rather restrictive assumptions on the nature of the parameter variations. Anderson and Johnstone [63] demonstrated that if the reference input is persistently exciting and the parameter variations are sufficiently small, the signals in the system will be bounded provided the initial errors are small. Later papers by Chen and Caines [64], Hill et al. [65], and Martin-Sanchez [66] considered exponential decay of parameters to constant values, a finite number of jump variations, small perturbations around constant values, or a combination of these. Since time variations of plant parameters may violate the assumptions (I) instantaneously, further assumptions were made by many authors to achieve robustness. In [67,68,69,70], globally bounded solutions are established assuming that the plants vary sufficiently slowly. Towards the end of the 1980s, several workers became interested in adaptive systems in which the parameters vary rapidly [71,72]. Tsakalis and Ioannou [72] suggested a modification of the controller structure by which, at least in theory, the control objective can be met exactly. This important development has enabled the authors to study the behavior of time-varying systems with arbitrarily varying parameters.

Unmodeled Plant Dynamics. From the results presented in Section 4 it is clear that 2n controller parameters are needed to control an nth-order plant. This poses a major practical limitation since, even for plants of moderate complexity, it requires a large number of adjustable parameters.
Further, from a theoretical point of view, the assumption that the plant order is known is rarely met. Hence, in the opinion of many researchers, adaptively controlling an unknown plant using a reduced-order controller was the most important problem in adaptive control in the mid 1980s. The output η of the unmodeled part of the plant can be considered as a state-dependent disturbance on the plant. Hence it cannot be assumed to be bounded a priori, and a modified adaptive law is needed which does not require this a priori information. Ioannou and Kokotovic [59] showed that such an update law is the σ-modification (39). The restrictive case when the true plant together with a constant controller is SPR was analyzed by Narendra and Annaswamy [73]. They showed that the degree of persistent excitation of the modeled part should be larger than the magnitude of the output of the unmodeled part to assure the boundedness of all the signals in the system. Important contributions to the problem were made by Praly [74], who employed a normalization technique, and by Kreisselmeier and Anderson [75], who used a variable dead zone to implement the adaptive law. Along with the development of modified adaptive laws, the instability mechanisms of ideal adaptive laws were analyzed. Åström [76] used averaging, previously employed in the ODE method of Ljung [77], to give an explanation of the instability examples of Rohrs et al. [78]. Applying the method of averaging, Kokotovic et al. [79,80], Fu et al. [81], and Kosut et al. [82] analyzed adaptive control when unmodeled dynamics are present. Kokotovic and co-workers showed that the frequency content of the excitation of certain signals determines a sharp stability-instability boundary for slow adaptation. Sufficient conditions for the boundedness of signals were also derived in [82], based on the frequency range of the dominant excitation.
6.2 Relaxation of Assumptions

Many practical adaptive controllers were found to perform satisfactorily even when the ideal Assumptions (I) given in Section 5 were not valid. This eventually led to a search for the least restrictive assumptions under which a solution can be obtained for the adaptive control problem. A major step in this direction was the result of Nussbaum [83] that, even when the sign of λ is unknown, the differential equations

ẋ1 = a x1 + λ f(x1, x2) ,    a > 0 ,
ẋ2 = g(x1, x2) ,    (42)
can be made to have solutions for which lim_{t→∞} x1(t) = 0 with x2(t) bounded, by the proper choice of the functions f and g. This had an immediate impact. Mudgett and Morse [84] demonstrated soon after that the sign of the high-frequency gain is not needed to stabilize the adaptive system. When Willems and Byrnes [85] considered an nth-order plant with relative degree one and zeros in the left half plane, and demonstrated that it could be stabilized by adaptive high-gain feedback, it became clear that at least in special cases the bound on the order of the system may not be needed. The search for weaker assumptions was on. Numerous contributions were made during the following years and many partial results were obtained. For example, in [86] Morse showed that a 3-dimensional high-gain adaptive controller can stabilize any minimum-phase process of any finite dimension provided that the relative degree is less than three, and later showed [87] that, if in addition the high-frequency gain is known a priori, a 3-dimensional controller could stabilize a process of relative degree ≤ 3 without using an augmented error. Indirect control methods rely on the estimation of the plant parameters to adjust the control parameters. Researchers in this area are interested in an approach that would eliminate the requirement that the zeros of the plant lie in the open left half of the complex plane. An important contribution to relaxing the assumptions for stable adaptive control was made by Kreisselmeier and Smith [88], who showed that a plant with unknown parameters can be regulated in a stable fashion assuming that only the order of the plant is known. A major advance in the search for weaker assumptions on the plant to ensure stable adaptation was made in 1985 by Martensson, who showed in [89] that knowledge of the order ℓ of an LTI stabilizing compensator is sufficient to design
a stabilizing controller. While the practicality of such a controller is dubious, its importance lies in the demonstration that stable controllers can be designed, in theory, with very little information concerning the plant. Martensson's contribution has encouraged a number of researchers to investigate adaptive controllers which, while based on weak assumptions, are also practically feasible. Active research based on switching functions is currently in progress in this area. While a complete answer to the question of least restrictive assumptions is not available at present, it appears that stable adaptive control may be possible even when all the assumptions in (I) are violated.

6.3 Multivariable Adaptive Control

Attempts to extend stable adaptive control laws derived for SISO systems to multi-input multi-output (MIMO) systems were also initiated in the early 1980s. Since large uncertainty is generally associated with complex multivariable systems, adaptive control methodologies are even more relevant in such cases. While the adaptive control of MIMO systems leads to questions which are similar to those of the single variable case, the solutions to these questions are substantially more involved. In particular, the parametrization of the adaptive controller becomes an important problem. The four assumptions corresponding to those in (I), which are sufficient to ensure the existence of a stable adaptive controller in the MIMO case, are:

Assumptions (II)

(i) The high-frequency gain matrix Kp of the plant must satisfy the matrix equation

Γ Kp + Kpᵀ Γ = Q0 ,    Γ = Γᵀ > 0 ,    (43)

where Q0 is a symmetric sign-definite matrix for some positive definite symmetric matrix Γ, (ii) the right Hermite normal form Hp(s) of the plant transfer matrix Wp(s) is known, (iii) an upper bound ν on the observability index of Wp(s) is known, and (iv) the zeros of Wp(s) lie in ℂ⁻.

As in the scalar case, assumptions (i) and (iv) are used in the analytic part of the adaptive control problem, while (ii) and (iii) are needed for the algebraic part. Several adaptive control algorithms have been suggested in the literature which assure the stability of the overall system when Assumptions (II) are satisfied [90,91,92]. In [92] it is shown that assumption (ii) concerning the Hermite normal form of Wp(s) has the practical implication that only those multivariable systems Wp(s) whose Hermite normal forms (or whose Hermite normal forms after compensation, i.e., of Wp(s)Wc(s)) are diagonal can be controlled in an adaptive fashion. The same questions concerning robustness in the presence of different types of perturbations that arise for single variable systems also arise in the multivariable case.
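The condition on the high-frequency gain matrix is easy to check numerically for a given pair (Kp, Γ). The matrices below are illustrative values chosen for the check, not taken from the text:

```python
import numpy as np

Kp = np.array([[2.0, 1.0],
               [0.0, 3.0]])          # sample high-frequency gain matrix
Gamma = np.eye(2)                     # a symmetric positive definite Gamma

Q0 = Gamma @ Kp + Kp.T @ Gamma        # left-hand side of condition (43)
assert np.allclose(Q0, Q0.T)          # Q0 is symmetric by construction

eigs = np.linalg.eigvalsh(Q0)
print(eigs)                           # eigenvalues all of one sign:
                                      # Q0 is sign definite, so (i) holds
```

For this sample Kp the choice Γ = I already works; in general one searches for some Γ = Γᵀ > 0 making Q0 sign definite, which in particular always succeeds when Kp is itself symmetric and definite.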
All the methods discussed in the earlier sections can be extended to multivariable systems with minor modifications.

6.4 Recent Developments

While the main thrust of the research in adaptive systems during the greater part of the 1980s was in the three directions described earlier in this section, many new areas of research emerged during the latter part of the decade. For the sake of completeness we merely list some of these below. Prompted by developments in nonlinear control theory dealing with feedback linearizable systems [93], the adaptive control of special classes of nonlinear systems with unknown parameters has been attempted recently [94,95]. The techniques used in these cases are naturally strongly motivated by the results for linear systems discussed in this paper. Another recent trend in adaptive control circles is towards the use of concepts based on variable structure switching controllers [96,97,98]. The proofs of stability using these methods follow closely along the lines indicated in Section 4 for continuous adaptive laws. While extensive simulation results have indicated that these methods result in substantially improved performance, the variable structure approach raises theoretical questions which need further investigation. Direct and indirect methods were briefly mentioned in Section 2 as the two principal methods used in adaptive control. Both have been used widely in robust adaptive systems. Recently it has been demonstrated that the two methods can be combined [99] to benefit from their individual advantages. Even as the research effort in the new areas gains momentum, work continues with increased vigor in the more established areas.
Attempts to determine the least stringent conditions under which stable adaptive control is possible, the search for improved methods for controlling time-varying systems, the study of phenomena such as bursting which are peculiar to robust adaptive systems, as well as the application of adaptive control principles to practical systems, are all as active as ever. However, no single area dominates the field. The author is confident that the confluence of the different ideas that are being currently explored will lead to significant developments in the area of adaptive control in the 1990s.
7 Conclusion

Having spent its fledgling years in the 1960s in relative peace and quiet among gradient methods, and its youth in the 1970s exploring the mysteries of stable nonlinear systems, a mature adaptive control theory entered the turbulent 1980s with unbridled confidence. Ten years later, on the threshold of a new decade, a more experienced and wiser field views the prospects for the future with cautious optimism. The introduction of new concepts and tools has significantly widened the field and radically changed its structure and boundaries. Towards the end of the last
decade the field has also witnessed the proliferation of research areas, with the addition of adaptive nonlinear control, variable structure adaptive control, adaptive control using integrated methods, and switching adaptive systems. This, in turn, has made communication between experts in different areas no longer as easy as it was in the past. With this ever-increasing diversity in the field, the need for conceptual frameworks for viewing adaptive control problems and formulating the right questions is assuming greater importance. The burgeoning of new areas, while making the field more attractive to the researcher, has not been an unmixed blessing. Many important questions in the older areas have been left behind, only partially answered. While robustness has been studied extensively, we do not currently have a convenient method of quantifying it. The importance of persistent excitation for the convergence of parameters, as well as for the robustness of the adaptive system, is well realized, but few prescriptive methods for choosing the reference input or the filters in the controller currently exist. This implies that the choice of parameters in the adaptive system is a matter of trial and error without theoretical support. Since all the stability proofs are asymptotic in character, very little is known at present about the transient behavior of adaptive systems. In the analysis of systems with time-varying parameters, few elegant results exist which are comparable to those derived in the 1960s for linear time-invariant systems. Finally, it is good to remember that while the trend away from gradient methods to stability-based methods was strongly motivated by the desire for faster response, all the proofs of stability derived in the last two decades are based on the time derivative θ̇(t) of the control parameters being sufficiently small.
In light of this, the methods discarded in the 1960s no longer appear unattractive and may even be worthy of further examination. The revolutionary changes in computer technology in the last decade have made the practical implementation of adaptive algorithms economically feasible, and there is a growing demand for industrial adaptive controllers which meet increased speed, accuracy and robustness requirements. In fact, in many areas, the scope of the application is to a large extent limited by the existing theory. During the 1970s and the 1980s, the emphasis of theoretical research was on ensuring the boundedness of all the signals in adaptive systems rather than on performance. Hence, many of the results obtained are not directly applicable to problems in technology. The trend that started in the 1970s, which shifted the emphasis from performance to stability in adaptive systems, appears to have reached the zenith. With our current knowledge of the stability of adaptive systems, it is perhaps time that the trend be reversed to some extent and greater emphasis placed in the future on performance. Thirty-five years ago, the biological implications of the term "adaptive" lent the field of adaptive control an aura which was responsible for the great interest evinced in it. The same reasons are also responsible for the enormous attraction the field enjoys even today. During the intervening years a variety of terms borrowed from biology and psychology, such as learning, pattern recognition, self-organization, artificial intelligence, and intelligent control have found their way into the systems literature. These terms were motivated by the desire to
incorporate, in artificial systems, the ability of living systems to cope with large uncertainties in their environments. The various terms have developed into major independent disciplines with wide followings. Although the sources of the problems and the terminologies in these areas are different from those of adaptive control, many of the concepts used and the theoretical difficulties encountered are quite similar. It is therefore safe to say that advances in adaptive control theory in the 1990s will have a significant impact on all the methodologies dealing with systems containing large uncertainties.
Acknowledgments The author would like to thank Prof. A. M. Annaswamy and Dr. R. M. Wheeler, Jr., for interesting discussions. Some of the sections in this paper are summaries of chapters in the book "Stable Adaptive Systems" by K. S. Narendra and A. M. Annaswamy, published by Prentice-Hall in 1989.
References

1. E. H. Carr, What is History?, Penguin Books, 1981.
2. J. G. Truxal, Automatic Feedback System Synthesis, McGraw-Hill, New York, NY, 1955.
3. B. C. Kuo, Automatic Control Systems, Prentice-Hall, Englewood Cliffs, NJ, 1987.
4. R. F. Drenick and R. A. Shahbender, "Adaptive servomechanisms," AIEE Transactions, vol. 76, pp. 286-292, Nov. 1957.
5. J. A. Aseltine, A. R. Mancini, and G. W. Sarture, "A survey of adaptive control systems," IRE Trans. Aut. Control, vol. 3, pp. 102-108, Dec. 1958.
6. P. R. Stromer, "Adaptive or self-optimizing control systems - a bibliography," IRE Trans. Aut. Control, vol. 4, pp. 65-68, May 1959.
7. M. Margolis and C. T. Leondes, "A parameter tracking servo for adaptive control systems," IRE Trans. Aut. Control, vol. 4, pp. 100-111, Nov. 1959.
8. L. A. Zadeh, "On the definition of adaptivity," Proceedings of the IEEE, vol. 51, pp. 569-570, March 1963.
9. R. Bellman, Adaptive Control Processes - A Guided Tour, Princeton University Press, Princeton, NJ, 1961.
10. J. G. Truxal, "Adaptive Control," Proc. 2nd IFAC World Congress, Basle, Switzerland, 1963.
11. K. S. Narendra and A. M. Annaswamy, Stable Adaptive Systems, Prentice-Hall, Englewood Cliffs, NJ, 1989.
12. L. S. Pontryagin et al., The Mathematical Theory of Optimal Processes, Interscience, New York, NY, 1962.
13. A. A. Feldbaum, Optimal Control Systems, Academic Press, New York, NY, 1965.
14. R. E. Kalman, "Design of a self-optimizing control system," Transactions of the ASME, vol. 80, pp. 468-478, 1958.
15. E. Mishkin and L. Braun, Adaptive Control Systems, McGraw-Hill, New York, NY, 1961.
16. I. D. Landau, Adaptive Control: The Model Reference Approach, Marcel Dekker, New York, NY, 1979.
17. C. S. Draper and Y. T. Li, "Principles of optimalizing control systems and an application to the internal combustion engine," ASME Publication, Sep. 1951.
18. V. W. Eveleigh, Adaptive Control and Optimization Techniques, McGraw-Hill, New York, NY, 1967.
19. R. J. McGrath, V. Rajaraman, and V. C. Rideout, "A parameter perturbation adaptive control system," IRE Trans. Aut. Control, vol. 6, pp. 154-161, May 1961.
20. J. L. Douce, K. C. Ng, and M. M. Gupta, "Dynamics of the parameter perturbation processes," Proceedings of the IEE, vol. 113, no. 6, pp. 1077-1083, June 1966.
21. R. E. Kronauer and P. G. Drew, "Design of the feedback loop in parameter-perturbation adaptive controls," in Theory of Self-Adaptive Control Systems, Proc. 2nd IFAC Symp. (Teddington, UK, 1965), P. H. Hammond, ed., Plenum Press, New York, NY, 1966.
22. K. S. Narendra and L. E. McBride, Jr., "Multiparameter self-optimizing systems using correlation techniques," IRE Trans. Aut. Control, vol. 9, pp. 31-38, Jan. 1964.
23. P. V. Kokotovic, "Method of sensitivity points in the investigation and optimization of linear control systems," Automation and Remote Control, vol. 25, pp. 1512-1518, 1964.
24. J. B. Cruz, Jr., ed., Systems Sensitivity Analysis, Dowden, Hutchinson and Ross, Stroudsburg, PA, 1973.
25. H. P. Whitaker, J. Yamron, and A. Kezer, "Design of model-reference adaptive control systems for aircraft," Report R-164, Instrumentation Laboratory, MIT, Cambridge, MA, 1958.
26. L. P. Grayson, "Design via Lyapunov's second method," Proc. 4th Joint Amer. Control Conf., 1963.
27. B. Shackcloth and R. L. Butchart, "Synthesis of model reference adaptive systems by Lyapunov's second method," in Theory of Self-Adaptive Control Systems, Proc. 2nd IFAC Symp. (Teddington, UK, 1965), P. H. Hammond, ed., Plenum Press, New York, NY, 1966.
28. P. C. Parks, "Lyapunov redesign of model reference adaptive control systems," IEEE Trans. Aut. Control, vol. AC-11, pp. 362-367, 1966.
29. R. V. Monopoli, "Lyapunov's method for adaptive control system design," IEEE Trans. Aut. Control, vol. AC-12, pp. 334-335, 1967.
30. P. H. Phillipson, "Design methods for model reference adaptive systems," Proc. Inst. Mech. Engineers, vol. 183, pp. 695-700, 1969.
31. C. A. Winsor and R. J. Roy, "Design of model reference adaptive control systems by Lyapunov's second method," IEEE Trans. Aut. Control, vol. AC-13, p. 204, April 1968.
32. K. S. Narendra and P. Kudva, "Stable adaptive schemes for system identification and control - part I," IEEE Trans. Systems, Man and Cybernetics, vol. SMC-4, pp. 541-551, Nov. 1974.
33. K. S. Narendra and P. Kudva, "Stable adaptive schemes for system identification and control - part II," IEEE Trans. Systems, Man and Cybernetics, vol. SMC-4, pp. 552-560, Nov. 1974.
34. Ya. Z. Tsypkin, Adaptation and Learning in Automatic Systems, Academic Press, New York, NY, 1971.
35. K. J. Åström and B. Wittenmark, "On self-tuning regulators," Automatica, vol. 9, pp. 185-199, 1973.
36. R. Carroll and D. P. Lindorff, "An adaptive observer for single-input single-output linear systems," IEEE Trans. Aut. Control, vol. AC-18, pp. 428-435, Oct. 1973.
37. P. Kudva and K. S. Narendra, "Synthesis of an adaptive observer using Lyapunov's direct method," Int. J. Control, vol. 18, pp. 1201-1210, 1973.
38. G. Luders and K. S. Narendra, "An adaptive observer and identifier for a linear system," IEEE Trans. Aut. Control, vol. AC-18, pp. 496-499, Oct. 1973.
39. G. Luders and K. S. Narendra, "A new canonical form for an adaptive observer," IEEE Trans. Aut. Control, vol. AC-19, pp. 117-119, April 1974.
40. G. Kreisselmeier, "Adaptive observers with exponential rate of convergence," IEEE Trans. Aut. Control, vol. AC-22, pp. 2-8, Jan. 1977.
41. K. J. Åström and T. Bohlin, "Numerical identification of linear dynamic systems from normal operating records," in Theory of Self-Adaptive Control Systems, Proc. 2nd IFAC Symp. (Teddington, UK, 1965), P. H. Hammond, ed., Plenum Press, New York, NY, 1966.
42. M. M. Sondhi and D. Mitra, "New results on the performance of a well-known class of adaptive filters," Proceedings of the IEEE, vol. 64, pp. 1583-1597, Nov. 1976.
43. A. P. Morgan and K. S. Narendra, "On the uniform asymptotic stability of certain linear non-autonomous differential equations," SIAM J. Control Optimiz., vol. 15, pp. 5-24, Jan. 1977.
44. A. P. Morgan and K. S. Narendra, "On the stability of nonautonomous differential equations ẋ = [A + B(t)]x with skew-symmetric matrix B(t)," SIAM J. Control Optimiz., vol. 15, pp. 163-176, Jan. 1977.
45. J. S. C. Yuan and W. M. Wonham, "Probing signals for model reference identification," IEEE Trans. Aut. Control, vol. AC-22, pp. 530-538, Aug. 1977.
46. B. D. O. Anderson, "Exponential stability of linear equations arising in adaptive identification," IEEE Trans. Aut. Control, vol. AC-22, pp. 83-88, Jan. 1977.
47. S. P. Boyd and S. S. Sastry, "On parameter convergence in adaptive control," Syst. Control Lett., vol. 2, pp. 311-319, 1983.
48. K. S. Narendra and L. S. Valavani, "Stable adaptive controller design - direct control," IEEE Trans. Aut. Control, vol. AC-23, pp. 570-583, Aug. 1978.
49. R. V. Monopoli, "Model reference adaptive control with an augmented error signal," IEEE Trans. Aut. Control, vol. AC-19, pp. 474-484, 1974.
50. A. S. Morse, "Global stability of parameter adaptive control systems," IEEE Trans. Aut. Control, vol. AC-25, pp. 433-439, June 1980.
51. K. S. Narendra, Y. H. Lin, and L. S. Valavani, "Stable adaptive controller design - part II: proof of stability," IEEE Trans. Aut. Control, vol. AC-25, pp. 440-448, June 1980.
52. G. C. Goodwin, P. J. Ramadge, and P. E. Caines, "Discrete time multivariable adaptive control," IEEE Trans. Aut. Control, vol. AC-25, pp. 449-456, June 1980.
53. K. S. Narendra and Y. H. Lin, "Stable discrete adaptive control," IEEE Trans. Aut. Control, vol. AC-25, pp. 456-461, June 1980.
54. B. Egardt, Stability of Adaptive Controllers, Springer-Verlag, Berlin, 1979.
55. K. S. Narendra, A. M. Annaswamy, and R. P. Singh, "A general approach to the stability analysis of adaptive systems," Int. J. Control, vol. 41, pp. 193-216, 1985.
56. B. B. Peterson and K. S. Narendra, "Bounded error adaptive control," IEEE Trans. Aut. Control, vol. AC-27, pp. 1161-1168, Dec. 1982.
34
Nazendra
57. C. Samson, "Stability analysis of adaptively controlled systems subject to bounded disturbances," Automatiea, vol. 19, pp. 81-86, 1983. 58. G. Kreisselmeier and K. S, Nitrendra, =Stable model reference adaptive control in the presence of bounded disturbances," ]EEE Trans. Aut. Control, vol. AC-27, pp. 1169-1175, Dec. 1982. 59. P. A. Ioannou and P.V. Kokotovic, Adaptive Systems with Reduced Models, Springer-Verlag, New York, NY, 1983. 60. K. S. Narendra and A. M. Annaswamy, "A new adaptive law for robust adaptive control without persistent excitation," IEEE Trans. Aut. Control, vol. AC-32, pp. 134-145, Feb. 1987. 61. K. S. Narendra and A. M. Annaswamy, "Robust adaptive control in the presence of bounded disturbances," IEEE Trans. Aut. Control, vol. AC-31, pp. 306-315, April 1986. 62. K. S. Narendra, I. H. Khalifa, and A. M. Annaswamy, "Error models for stable hybrid adaptive systems-part II," I E E E Trans. Aut. Control, vol. AC-30, pp. 339-347, April 1985. 63. B. D. O. Anderson and R. M. Johnstone, "Adaptive systems and time-varying plants," Int. J. Control,vol. 37, pp. 367-377, Feb. 1983. 64. H. F. Chen and P. E. Calnes, "On the adaptive control of stochastic systems with random parameters," Proc. $3rd 1 E E E Conf. Dec. Control, 1984. 65. D. J. Hill, G. C. Goodwin, and X. Xianya, "Stochastic adaptive control for exponentially convergent time-varying systems," Proc. ~3rd I E E E Conf. Dec. Control, 1984. 66. J. M. Martin-Sanchez, "Adaptive control of time-variant processes," Proe. 1985 Amer. Control Conf. 67. G. Kreisseimeier, "Adaptive control of a class of slowly time-varying plants," Syst. Control Lett., yah 8, Dec. 1986. 68. T. H. Lee and K. S. Narendra, "Stable direct adaptive control of time-varying discrete-time systems," Technical Report 8720, Center for Systems Science, Yale University, New Haven, CT, 1987. 69. K. S. Tsakalis and P.A. Ioannou, "Adaptive control of linear time-varying plants," Automatics, vol. 23, pp. 459-468, July 1987. 70. R. H. 
Middleton and G. C. Goodwin, "Adaptive control of time-varying linear systems," IEEE Trans. Ant. Control, vol. AC-33, pp. 150-155, Feb. 1988. 71. A. M. Annaswamy and K. S. Narendra, "Adaptive control of a first-order plant with a time-varying parameter," Proc. 1990 Amer. Control Conj. 72. K. S. Tsakalis and P. A. Ioannou, "Adaptive control of linear time-varying plants: a new model reference controller structure," IEEE Trans. Aut. Control, vol. 34, pp. 1038-1046, Oct. 1989. 73. K. S. Narendra and A. M. Annaswamy, in Adaptive and Learning Systems, K. S. Narendra, ed., Plenum Press, New York, NY, 1985. 74. L. Praly, "Robustness of model reference adaptive control," Proc. 3rd Yale Workshop on Applications of Adaptive Systems Theory, Yale University, New Haven, CT, 1983. 75. G. Kreisselmeier and B. D. O. Anderson, "Robust model reference adaptive control," 1EEE Trans. Aut. Control, vol. AC-31, pp. 127-133, 1986. 76. K. J. )kstrSm, "Analysis of Rohrs counterexamples to adaptive control," Proe. ~2nd IEEE Conf. Dee. Control, 1983. 77. L. Ljung, "Analysis of recursive stochastic algorithms," 1BEE Trans. Ant. Control, vol. AC-22, pp. 551-575, Aug. 1977.
The Maturing of Adaptive Control
35
78. C. E. Rohrs, L. VMavani, M. Athans, and G. Stein, "Robustness of adaptive control algorithms in the presence of unmodeled dynamics," Proc. 21st IEEE Conf. Dec. Control, 1982. 79. B. Riedle and P. V. Kokotovic, "A stability-instability boundary for disturbancefree slow adaptation and unmodeled dynamics," Proc. g3rd IEEE Conf. Dec. Control, 1984. 80. P. V. Kokotovic, B. Riedle, and L. Praly, "On a stability criterion for continuous slow adaptation," Syst. Control Left., vol. 6, pp. 7-14, June 1985. 81. L. C. Fu, M. Bodson, and S. S. Sastry, "New stability theorems for averaging and their applications to the convergence anMysis of adaptive identification and control schemes," Proc. 2~th IEEE Conf. Dec. Control, 1985. 82. R. L. Kosut, B. D. O. Anderson, and I. M. Y. Mareels, "Stability theory of adaptive systems: methods of averaging and persistent excitation," Proe. ~Jth IEEE Conf. Dec. Control, 1985. 83. R. D. Nussbaum, "Some remarks on a conjecture in parameter adaptive control," Syst. Control Lett., vol. 3, pp. 243-246, 1983. 84. D. R. Mudgett and A. S. Morse, "Adaptive stabilization of linear systems with unknown high frequency gains," IEEE Trans. Aut. Control, vol. AC-30, pp. 549554, June 1985. 85. J. C. Willems and C. J. Byrnes, "Global adaptive stabilization in the absence of information on the sign of the high frequency gain," Proc. INRIA Conf. on Analysis and Optimization of Systems, Lecture Notes in Control and Information Sciences, vol. 62, Springer-Verlag, Berlin, 1984, pp. 49-57. 86. A. S. Morse, "A three dimensional universal controller for the adaptive stabilization of any strictly proper minimum-phase system with relative degree not exceeding two," IEEE Trans. Aut. Control, vol. AC-30, pp. 1188-1191, 1985. 87. A. S. Morse, "High gain feedback algorithms for adaptive stabilization," Proc. 5th Yale Workshop on Applications of Adaptive Systems Theory, Yale University, New Haven, CT, 1987. 88. G. Kreisselmeier and M. C. 
Smith, "Stable adaptive regulation of arbitrary nthorder plants," IEEE Trans. Aut. Control, vol. AC-31, pp. 299-305, 1986. 89. B. Martensson, "The order of any stabilizing regulator is sufficient for adaptive stabilization," Syst. Control Lett., vol. 6, pp. 87-91, July 1985. 90. G. C. Goodwin and R. S. Long, "Generalization of results on multivariable adaptive control," IEEE Trans. Aut. Control, vol. AC-25, pp. 449-456, 1980. 91. H. Elliott and W. A. Wolovich, "Parameter adaptive control of linear multivariable systems," IEEE Trans. Aut. Control, vol. AC-27, pp. 340-352, 1982. 92. R. P. Singh and K. S. Naxendra, "Prior information in the design of multivariable adaptive controllers," IEEE Trans. Aut. Control, vol. AC-29, pp. 1108-1111, Dec. 1984. 93. A. Isidorl, Nonlinear Control Systems, 2nd ed., Springer-Verlag, Berlin, 1989. 94. J.-B. Pomet and L. Praly, "Adaptive nonlinear regulation: equation error from the Lyapunov equation," Proc. ~8th IEEE Conf. Dec. Control, 1989. 95. I. Kanellakopoulos, P.V. Kokotovic, and R. H. Middleton, "Observer-based adaptive control of nonlinear systems under matching conditions," Proc. 1990 Amer. Control Conf. 96. L. Hsu, "Variable structure model-reference adaptive control (VS-MRAC) using only input and output measurements: part II," Proc. ~7th IEEE Conf. Dec. Control, 1988.
36
Narendra
97. L. C. Fu, "A new robust model refeience a~laptive control using variable structure adaptation for plants with relative degree two," Proc. 1990 Amer. Control Con]. 98. K. S. Narendra and J. D. Boskovic, "A combined direct, indirect and variable structure method for robust control," IEEE Trans. Aut. Control, to appeax. 99. M. A. Duarte and K. S. Narendra, ~A new approach to model reference adaptive control," Int. J. Adopt. Control Sig. Process., vol. 3, pp. 53-73, 1989.
A Conceptual Framework for Parameter Adaptive Control*

A. S. Morse
Department of Electrical Engineering
Yale University
P. O. Box 1968
New Haven, CT 06520-1968, USA
Abstract. A conceptual framework is described in which a parameter adaptive control system is taken to be the feedback interconnection of a process Σ_P and a parameterized controller Σ_C(k) whose parameter vector k is adjusted by a tuner Σ_T. The framework is general enough to encompass almost all parameterized controllers proposed in the literature for stabilizing linear process models. Emphasis is placed on the importance to adaptation of one of Σ_C's outputs, called a tuning error e_T, which is the main signal driving Σ_T. For the closed-loop parameterized system Σ(k) consisting of Σ_P and Σ_C(k), definitions and characterizations are given of the concepts of weak tunability and tunability of Σ(k) on a subset E of the parameter space P in which k takes values. It proves to be necessary to know a subset E on which Σ(k) is weakly tunable in order to be able to construct a tuner Σ_T which adaptively stabilizes Σ(k). For a large class of linear multivariable process models, a properly designed certainty equivalence controller results in a tunable closed-loop parameterized system. The importance of this result to both the analysis and synthesis of parameter adaptive control systems is discussed. It is demonstrated by means of examples how the connection between certainty equivalence and tunability, together with the concept of tunability itself, can be used to markedly simplify the stability analysis of adaptive control systems. A new family of indirect parameterized controllers is described whose capabilities are comparable to those of the well-known direct parameterized controllers which have long served as basic building blocks for adaptive control systems of all types. The concept of implicit tuning is formalized and its potential importance to adaptive control is briefly discussed.
Introduction

In a series of recent papers [1,2], a conceptual framework is introduced for defining, discussing and analyzing parameter adaptive control systems of all types. The idea is to think of an adaptive control system as the feedback interconnection of a process Σ_P and a "parameterized controller" Σ_C(k) whose parameter vector k is adjusted by a "tuner" Σ_T. A parameterized controller is a parameter-dependent dynamical system whose outputs include not only a control signal u_C

* This research was supported by the National Science Foundation under grant ECS-9012551.
which in closed loop serves as the feedback control to the process, but also a suitably defined "tuning error" e_T which during adaptation drives the tuner Σ_T. Depending on the application, e_T might be an identification error, an augmented error, a process output, or something similar. In this context, a tuner is any algorithm driven by e_T which adjusts k. No matter how e_T and Σ_T are defined, the purpose of Σ_T is always the same: to tune k to make e_T "small". The aim of this paper, which is an abbreviated version of [1] and [2], is to discuss some of the implications of thinking of an adaptive control system from this point of view.
Section 1 briefly summarizes the salient features of a parameter adaptive control system viewed in the aforementioned way. In Section 2 attention is focused on properties of the subsystem Σ(k) consisting of Σ_P in feedback with Σ_C(k). The concept of "tunability" is defined and discussed. Under mild conditions, tunability of Σ on a known subset E of the space in which k takes values proves to be necessary for adaptive stability. The problem of choosing Σ_C so that Σ has this property is addressed in Section 3 for the case when Σ_C is "identifier-based". By an identifier-based parameterized controller Σ_C is meant a system consisting of an "identifier" Σ_I and an "internal regulator" or "certainty equivalence controller" Σ_R. Σ_R is chosen off-line to endow a suitably defined "design model" Σ_D with desirable features (e.g., internal stability) for each fixed value of the design model's parameter vector p_I. The design model is presumed to have a transfer matrix which is "close" to that of the process model for some unknown value of p_I. The coefficient matrices defining Σ_D are related to those of Σ_I by an appropriately defined, parameter-dependent, linear transformation. By making use of this transformation, it is possible to prove that if the internal regulator stabilizes Σ_D for some set E of possible design model parameter values, then the closed-loop system Σ consisting of Σ_P, Σ_I and Σ_R is tunable on E; this is the Certainty Equivalence Stabilization Theorem. For the case when Σ_P is minimum phase, to achieve tunability on E it turns out to be enough to require Σ_R only to "output-stabilize" Σ_D on E. This result, called the Certainty Equivalence Output Stabilization Theorem, is the key technical link needed to explain how to achieve with indirect control those results achieved previously with direct adaptive control.

The preceding theorems, which can each be proved using very elementary arguments, have many useful consequences. Since in their proofs there is no presumption of process model matching, the theorems can be applied without having to talk about "true process model parameter values", an idea which loses much of its meaning as soon as the process under consideration is assumed to have unmodeled dynamics. Since the theorems are parameterization independent, it is possible to use them to analyze directly and indirectly parameterized adaptive controllers in essentially the same way. Since the theorems enable one to sidestep much of the extraneous, detailed structure of error models in studying system behavior, it is possible to dramatically simplify the stability analyses of many adaptive control systems. An example which demonstrates this is given at the end of Section 2. To make things explicit, examples are given in Section 3 which show how
to apply the preceding ideas to SISO parameterized design models of both the direct and indirect control types. To output-stabilize the latter it is necessary to restrict its parametric structure; what results is a parameterized design model which, together with its output-stabilizing internal regulator, provides a new family of indirect parameterized controllers with capabilities comparable to those of the well-known direct parameterized controllers which for a long time have served as basic building blocks for adaptive control systems of all types [3-8]. Section 4 formalizes the concept of implicit tuning and explains how to analyze implicitly tuned systems using the ideas developed in Section 3. By implicit tuning is meant the adjustment of an identification parameter vector k_I and an implicitly defined internal regulator parameter vector k_R to make both a tuning error (e.g., an identification error) and a suitably defined "design" error simultaneously small. Implicit tuning algorithms can be either of the "one level" or "two level" type. We present examples of each and briefly discuss some of their possible generalizations.

In the sequel, prime denotes transpose, ℝ^n is the real linear space of n-component column vectors x, and ||x|| is the norm √(x'x).
1 Framework

Classically [3-8], parameter adaptive control systems have been defined and discussed in terms of error models in which parameters typically enter linearly. It turns out that for many adaptive algorithms (e.g., [9-13]) error models play no essential role and parameters enter nonlinearly. Hence, a new conceptual framework is needed to describe both the classical and more recent adaptive structures. To construct such a framework it is useful to view a parameter adaptive control system as the feedback interconnection of a process Σ_P and a parameterized controller Σ_C(k) whose parameter vector k is adjusted by a tuner Σ_T. The process is a dynamical system with control input u ∈ ℝ^{n_u}, disturbance input w ∈ ℝ^{n_w} and measured output y ∈ ℝ^{n_y}. The parameterized controller is a dynamical system Σ_C(k) depending on a control parameter k which takes values in some parameter space P ⊂ ℝ^{n_k}. The inputs to Σ_C are the process open-loop control u, the process output y, and possibly a reference input r. The outputs generated by Σ_C are a tuning error e_T which drives Σ_T, a control signal u_C which becomes the feedback control to the process when u is set equal to u_C, and a vector of supplementary tuning data d consisting of known functions of r, y and the parameterized controller's state. The tuner is an algorithm Σ_T(k_0) initialized by k_0 (i.e., k(0) = k_0) with inputs d and e_T and output k, k(t) being the "tuned value" of k at time t; Σ_T(k_0) is presumed to be well-defined for each k_0 ∈ P. The function of Σ_T is to adjust k to make e_T "small" in some sense.

Although the specifics of Σ_T may vary greatly from algorithm to algorithm, in most instances tuning is carried out in one of two fundamentally different ways, depending on whether P is countable or not. For the countable case (e.g., see
Fig. 1. Parameter adaptive control system.
[10,13]), tuning is achieved by sequentially stepping k through P along a predetermined path, using on-line (i.e., real-time) data to decide only when to switch k from one value along the path to the next. In contrast, for the uncountable case (e.g., see [3-8]) the path in P along which k is adjusted is not determined off-line but instead is computed in real time from "gradient-like" data. The main advantage of countable search algorithms over gradient-like procedures appears to be their broader applicability. On the other hand, when applicable, gradient-like algorithms are likely to exhibit far superior performance, but so far this has not been clearly demonstrated.

Since a tuner Σ_T is an algorithm driven by e_T, Σ_T will typically possess certain "rest" or "equilibrium" values of k at which tuning ceases if e_T = 0. To be more precise, let us agree to say that in open loop, a tuner Σ_T is at equilibrium value p_0 ∈ P at time t_0 ≥ 0, if k(t_0) = p_0 and if, for the input e_T(t) = 0, t ≥ t_0, k(t) remains fixed at p_0 for t ≥ t_0. In the sequel we assume that Σ_T(k_0) is stationary to the extent that its possible equilibrium values at t_0 are independent of t_0 and k_0 ∈ P; and we define the tuner's equilibrium set E_T to be the set of all such values in P. One fairly general algorithm for a tuner might be a dynamical system of the form

    k̇ = π_1(k) + π_2(k, d) e_T ,    k(0) = k_0 ,        (1)

where the π_i(·) are nonlinear functions and k_0 ∈ ℝ^{n_k}. In this case P can be
taken as ℝ^{n_k} and E_T = {p : π_1(p) = 0}. Alternatively, Σ_T might be a switching algorithm of the form [13]
    k(t) = { k_0 ,   t ∈ [t_0, t_1)
           { h(i) ,  t ∈ [t_i, t_{i+1}) ,  i ≥ 1,

where h is a function from the positive integers to ℝ^{n_k}, k_0 ∈ image{h}, t_0 = 0, and for i > 0

    t_i = { min_{t > t_{i-1}} { t : ∫_0^t ||e_T|| dτ = i! } ,  if this set is nonempty,
          { ∞ ,  otherwise.

In this case P is the image of h (which is countable), and E_T = P. The ideas which follow apply to both types of algorithms. Thus, unless otherwise stated, no special assumptions will be made about Σ_T, other than that it be stationary and possess a nonempty equilibrium set.

In this paper we assume w to be zero, since the presence or absence of disturbances is not important for the points we want to make. We take the process model Σ_P = (C_P, A_P, B_P) to be a member of some known class C_P of stabilizable, detectable, time-invariant, continuous-time linear systems; i.e.,

    ẋ_P = A_P x_P + B_P u
    y = C_P x_P .        (2)
As a parameterized controller Σ_C(k), we shall consider a system of the form

    ẋ_C = A_C(k) x_C + B_C(k) y + B_r(k) r + B_u(k) u
    u_C = F_C(k) x_C + G_C(k) y + G_r(k) r        (3)
    e_T = C_C(k) x_C + D_C(k) y ,

where A_C(·), B_C(·), ..., D_C(·) are matrix-valued functions on P. An equation for d is not needed for the discussion which follows.

2 Tunability

The closed-loop parameterized system Σ(k) determined by (2), (3) and the feedback connection

    u = u_C        (4)

can be concisely denoted by the equations

    ẋ = A(k) x + B(k) r
    e_T = C(k) x ,        (5)

where x = [x_P', x_C']'. Here A(·), B(·), and C(·) are defined in the obvious way using (2)-(4). In this setting, the following question arises: What must be true of Σ in order for there to exist a tuning algorithm Σ_T for which the closed-loop
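In concrete terms, the maps A(·), B(·), C(·) of (5) can be assembled from the coefficient matrices of (2) and (3) by eliminating u = u_C. A minimal numerical sketch of this assembly (the dimensions and random matrices below are illustrative assumptions, not from the text):

```python
import numpy as np

def closed_loop(AP, BP, CP, AC, BC, Br, Bu, FC, GC, Gr, CC, DC):
    """Assemble A(k), B(k), C(k) of (5) from (2)-(3) under u = u_C (4)."""
    GCC = GC @ CP                          # G_C(k) y rewritten as G_C C_P x_P
    A = np.block([[AP + BP @ GCC,      BP @ FC],
                  [BC @ CP + Bu @ GCC, AC + Bu @ FC]])
    B = np.vstack([BP @ Gr, Br + Bu @ Gr])
    C = np.hstack([DC @ CP, CC])           # e_T = C_C x_C + D_C y
    return A, B, C

# Illustrative dimensions: n_P = 3, n_C = 2, scalar u, y, r, e_T
rng = np.random.default_rng(0)
AP = rng.standard_normal((3, 3))
BP = rng.standard_normal((3, 1))
CP = rng.standard_normal((1, 3))
AC = rng.standard_normal((2, 2))
BC, Br, Bu = (rng.standard_normal((2, 1)) for _ in range(3))
FC, CC = (rng.standard_normal((1, 2)) for _ in range(2))
GC, Gr, DC = (rng.standard_normal((1, 1)) for _ in range(3))
A, B, C = closed_loop(AP, BP, CP, AC, BC, Br, Bu, FC, GC, Gr, CC, DC)
```

The construction can be confirmed by comparing A x + B r and C x against the right-hand sides of (2) and (3) at an arbitrary state.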
adaptive system, consisting of Σ_T and Σ, is "stable"? In the sequel we provide some answers to this question.

With E any fixed, nonempty subset of Σ's parameter space P, let us agree to call (5) weakly tunable on E if, for each fixed p ∈ E and each bounded, piecewise-continuous exogenous input r : [0,∞) → ℝ^{n_r}, every possible system trajectory for which k(t) = p and e_T(t) = 0, t ∈ [0,∞), is bounded on [0,∞). Call Σ tunable on E if for each p ∈ E, x goes to zero as t → ∞ along each trajectory on which k(t) = p and both e_T and r equal zero, t ∈ [0,∞).

Remark 1: It is easy to verify that Σ is weakly tunable on E just in case, for each p ∈ E, the matrix pair (C(p), A(p)) is weakly detectable² and the matrix pair obtained by restricting C(p) and A(p) to the controllable space of (A(p), B(p)) is detectable. Similarly, Σ is tunable on E if and only if (C(p), A(p)) is detectable for each p ∈ E. Thus, tunability of Σ on E implies weak tunability of Σ on E and is equivalent to weak tunability of Σ on E whenever Σ is controllable on E. □

Our aim here is to briefly explain why weak tunability is necessary for adaptive stabilization. To be specific, call Σ_T(·) an unbiased stabilizer of Σ if for each initialization k_0 ∈ P, each bounded, piecewise-continuous input r : [0,∞) → ℝ^{n_r} and each initial state x(0), the state response x of Σ(k), with k tuned by Σ_T(k_0), is bounded on [0,∞). Suppose Σ_T is a candidate tuner for Σ(k). The definition of weak tunability implies that if Σ(k) is not weakly tunable on Σ_T's equilibrium set E_T, then for some exogenous input r, initial state x(0), and parameter value p_0 ∈ E_T, the untuned system Σ(p_0) will admit an unbounded state response x(t) along which e_T(t) = 0. If the same input r is applied to Σ(k), with k tuned by Σ_T(p_0), then clearly k(t) = p_0, t ≥ 0, and e_T(t) = 0, t ≥ 0. Therefore Σ(k) will have exactly the same unbounded response to r that Σ(p_0) has. We are led to the following

Theorem 1: A necessary condition for the tuner Σ_T(·) to be an unbiased stabilizer of the tuned system Σ(k) is that Σ(k) be weakly tunable on the equilibrium set E_T of Σ_T(·).

Clearly, weak tunability on E_T is a fundamental property that any parameter adaptive control system of the aforementioned general type must have if stability is to be assured.³ An interesting problem, then, is to determine what's required of

² A matrix pair (C, A) is weakly detectable if for each vector x for which C e^{At} x is bounded on [0,∞), it follows that e^{At} x is bounded on [0,∞) as well. (C, A) is detectable if for each eigenvalue-eigenvector pair (λ, x), λ has a negative real part whenever Cx = 0. Detectability implies weak detectability but the converse is not necessarily true.
³ For algorithms utilizing persistently exciting probing signals, weak tunability on E_T may well be more than is required for stability.
a process model Σ_P and its parameterized controller Σ_C for the resulting closed-loop system Σ to be weakly tunable or tunable on some given subset E ⊂ P. This problem is discussed further in Sect. 3. In the sequel we give some examples of tunable and untunable systems.

Example 1: Suppose for Σ_P we take the one-dimensional system

    ẏ = a y + g u ,        (6)
with a and g unknown constants satisfying a > 0 and g ≠ 0. To stabilize this system, consider using a control law of the form
    u = f y ,        (7)
where, if we had our druthers, we'd choose f so that a + g f = −1, since this would stabilize (6); but since a and g are unknown, we might instead try to choose f in accordance with the "certainty equivalence principle" (cf. Sect. 3) so that

    â + ĝ f = −1 ,        (8)

where â and ĝ are estimates of a and g respectively. However, since standard identification algorithms may cause ĝ to pass through zero, to avoid the possibility of "division by zero", in place of (8) consider tuning f with the "gradient" adjustment law

    ḟ = −(1/2) ∂e_D²/∂f = −ĝ e_D ,        (9)

where e_D is the "design error"

    e_D = â + ĝ f + 1        (10)
(cf. Sect. 4). Finally, to construct estimates â and ĝ, observe from (6) that

    y = (a + 1) ȳ + g ū + ε ,

where ε = e^{−t} (y(0) − (a + 1) ȳ(0)), and

    dȳ/dt + ȳ = y ,    dū/dt + ū = u .        (11)

Thus, to generate â and ĝ, it makes sense to use an algorithm driven by the "identification error"

    e_I = (â + 1) ȳ + ĝ ū − y ,        (12)

since this results in the familiar error equation

    e_I = ȳ (â − a) + ū (ĝ − g) − ε .        (13)
If a standard identification algorithm is used, identification ceases when e_I = 0, in which case â and ĝ become constant. Viewing this algorithm together with
(9) and (10) as a tuner Σ_T with tuning error input e_T = e_I and tuned parameter k = [ĝ, â, f]', Σ_T's equilibrium set will be

    E_T = { [p_1, p_2, p_3]' ∈ ℝ^3 : p_1 (p_2 + p_1 p_3 + 1) = 0 } .

The overall parameterized controller Σ_C is thus

    dȳ/dt = −ȳ + y
    dū/dt = −ū + u        (14)
    e_T = (â + 1) ȳ + ĝ ū − y
    u_C = f y ,
and Σ(k) is the closed-loop parameterized system described by (6), (14) and the feedback connection

    u = u_C .        (15)

Observe that the point [0, a, 0]' ∈ E_T. It is easy to verify that for k fixed at this value, Σ admits the unbounded solution y = e^{at}, ȳ = e^{at}/(1 + a), ū = 0, e_T = 0, so Σ is not weakly tunable on E_T. □

Example 2: In Example 1, Σ is not tunable on E_T because E_T contains points [ĝ, â, f]' at which â + ĝ f > 0. It is possible to eliminate this problem and to achieve stability, if sign(g) is assumed known, by using in place of (9) the adjustment law ḟ = −(sign(g)) e_D, together with the tuning equations⁴

    dâ/dt = −e_D − ȳ e_T
    dĝ/dt = −f e_D − ū e_T .
In this case Σ_T's equilibrium set E_T is precisely the set of points [ĝ, â, f]' ∈ ℝ^3 for which (8) holds. It is straightforward to check that for k fixed at any point [p_1, p_2, p_3]' ∈ E_T, y(t) = p_1 (ū(0) − p_3 ȳ(0)) e^{−t} along any solution [y(t), ȳ(t), ū(t)]' to (6), (14), (15) for which e_T(t) ≡ 0. Since this, (14), and (15) imply that any such solution is bounded, Σ is now tunable on E_T. □

Example 3: Take P = ℝ and let Σ be any closed-loop parameterized system with C(k) = [1, 0] and
    A(k) = [ 1    k ]
           [ 0   −1 ] .
Since (C(·), A(·)) is detectable on P, by Remark 1 Σ must be tunable on each nonempty subset E ⊂ P. In spite of this, observe that no matter how k is tuned, Σ(k) can have an unbounded state response (e.g., with r = 0 and x(0) = [1, 0]', x(t) = e^t [1, 0]'), so adaptive stabilization is impossible. □

⁴ Motivation for these tuning equations stems from (13) and the observation that e_D can be written as e_D = (â − a) + f (ĝ − g) − g (f* − f), where f* = −(1 + a)/g is the ideal gain satisfying a + g f* = −1 [14]; this is an example of "single level implicit tuning" (cf. Sect. 4).
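The loss of weak tunability in Example 1 can also be verified numerically: freezing k at the equilibrium [ĝ, â, f]' = [0, a, 0]' and writing the closed-loop state as z = [y, ȳ, ū]', the unstable eigenvalue λ = a of the frozen closed-loop matrix has an eigenvector invisible to e_T, and simulating from a matched initial condition shows y growing while e_T stays at zero. A sketch (the value a = 1, the Euler discretization and the horizon are illustrative choices, not from the text):

```python
import numpy as np

a = 1.0                                # a > 0, unknown to the tuner
g_hat, a_hat, f = 0.0, a, 0.0          # k frozen at the equilibrium [0, a, 0]'

# Frozen closed-loop dynamics of z = [y, ybar, ubar]' from (6), (14), (15):
# u = f*y = 0, so ydot = a*y; the filters (14) run as usual.
A = np.array([[a,    0.0,  0.0],
              [1.0, -1.0,  0.0],
              [0.0,  0.0, -1.0]])
C = np.array([-1.0, a_hat + 1.0, g_hat])   # e_T = (a_hat+1) ybar + g_hat ubar - y

# PBH-style check: the eigenvector of the unstable eigenvalue a satisfies C x = 0,
# i.e., (C, A) is not detectable at this equilibrium value of k.
lam, V = np.linalg.eig(A)
i = int(np.argmax(lam.real))
x = V[:, i].real

# Euler simulation from the matched initial condition y(0) = 1, ybar(0) = 1/(1+a):
z, dt = np.array([1.0, 1.0 / (1.0 + a), 0.0]), 1e-4
for _ in range(int(5.0 / dt)):
    z = z + dt * (A @ z)               # y grows like e^{a t} while e_T = C z stays 0
```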
Example 3 shows that tunability of Σ on a known subset E ⊂ ℝ^{n_k} is not sufficient to insure that there is a tuner Σ_T which can adaptively stabilize Σ(k). However, using the ideas from [10], it can be shown that if Σ is tunable on E and, in addition, E contains a countable subset E* containing a parameter value p_0 for which Σ(p_0) is internally stable, then without knowing p_0 it is possible to construct a switching algorithm Σ_T, with E_T = E*, which is an unbiased stabilizer of Σ(k). Thus, for the case when P = ℝ^{n_k}, to achieve adaptive stability with some tuner Σ_T, it's enough to design Σ_C so that (C(·), A(·)) is detectable on a known countable subset E* ⊂ ℝ^{n_k} containing a point p_0 which stabilizes A(p_0). The following example shows that this is very easy to do without assuming very much about Σ_P.

Example 4: For a fixed integer n > 0, define the parameter vector p = [p_1, p_2, ..., p_{2n+1}]' and the parameterized polynomials

    β(p, s) = s^n + p_n s^{n−1} + ... + p_2 s + p_1 ,
    α(p, s) = p_{2n+1} s^n + p_{2n} s^{n−1} + ... + p_{n+2} s + p_{n+1} .

Choose γ(s) to be any monic stable polynomial of degree n and let Σ_1(p) = (A_1(p), b_1(p), c_1(p), d_1(p)) and Σ_2(p) = (A_2(p), b_2(p), c_2(p), d_2(p)) be n-dimensional realizations of α(p, s)/γ(s) and γ(s)/β(p, s) respectively. Define Σ_C(k) to be the cascade interconnection of Σ_2(k) with Σ_1(k) as shown in Fig. 2.
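The parameterized transfer functions of Example 4 admit simple explicit realizations; controllable canonical form is one convenient choice. Since the text only asks for some n-dimensional realizations, the companion-form construction and the sample values n = 2, γ(s) = (s + 1)², and p below are illustrative assumptions:

```python
import numpy as np

def realize(num, den):
    """Controllable-canonical realization (A, b, c, d) of num(s)/den(s).
    num, den: ascending coefficient lists of equal length n+1, den monic."""
    n = len(den) - 1
    d = float(num[n])                     # feedthrough: ratio of the s^n coefficients
    rem = np.array(num[:n], float) - d * np.array(den[:n], float)
    A = np.eye(n, k=1)                    # companion form for den(s)
    A[-1, :] = -np.array(den[:n], float)
    b = np.zeros((n, 1)); b[-1, 0] = 1.0
    c = rem.reshape(1, n)                 # numerator of the strictly proper part
    return A, b, c, d

def tf(A, b, c, d, s):
    """Evaluate c (sI - A)^{-1} b + d at a frequency s."""
    return (c @ np.linalg.solve(s * np.eye(A.shape[0]) - A, b))[0, 0] + d

# Example 4 with n = 2 and an arbitrary parameter vector p = [p1, ..., p5]:
p = [0.5, -1.0, 2.0, 0.3, 1.5]
beta  = [p[0], p[1], 1.0]        # beta(p,s)  = s^2 + p2 s + p1
alpha = [p[2], p[3], p[4]]       # alpha(p,s) = p5 s^2 + p4 s + p3
gamma = [1.0, 2.0, 1.0]          # gamma(s) = (s + 1)^2, monic and stable

S1 = realize(alpha, gamma)       # Sigma_1(p): alpha(p,s)/gamma(s)
S2 = realize(gamma, beta)        # Sigma_2(p): gamma(s)/beta(p,s)
```

Both realizations are n-dimensional because α/γ and γ/β are proper ratios of degree-n polynomials with monic denominators.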
Fig. 2. Parameterized system Σ(k).
Fix k at any value p ∈ ℝ^{2n+1}. Observe that along any loop response for which e_T is identically zero, both u and y must go to zero, since Σ_1(p) is stable and Σ_2(p) has a stable proper inverse. From this and the standing assumption
that Σ_P is stabilizable and detectable, it follows that for each p ∈ ℝ^{2n+1}, Σ(p) must be detectable through e_T;⁵ in other words, Σ is tunable on P = ℝ^{2n+1}. Let E* be a countable, dense subset of P; clearly, Σ is also tunable on E*. Moreover, if in the linear space ℝ^{1×n_P} ⊕ ℝ^{n_P×n_P} ⊕ ℝ^{n_P×1}, Σ_P is sufficiently close to a system which is stabilizable, detectable and of McMillan degree not exceeding n, then there must exist a vector p_0 ∈ E* for which A(p_0) is stable. Thus E* and Σ will have what's required for adaptive stabilization as long as Σ_P is close enough to a stabilizable, detectable system with McMillan degree no greater than n. □

Before concluding this section, we illustrate by means of an example how the concept of tunability can be used in the analysis of an overall adaptive control system. Although in the example which follows the tuner is of a very simple form, the reasoning can be extended, without major modification, to handle the most general gradient-like tuning algorithms which have been considered so far in the literature [1].

Example 5: Consider the widely studied situation in which the tuning error e_T in (5) is a scalar-valued identification error which can be reexpressed in the form

    e_T = x_C' (k − p_0) + ε ,        (16)

where p_0 is a fixed but unknown vector in P = ℝ^{n_k} and ε is a linear combination of decaying exponentials (cf. Remark 2, Sect. 3). Assume that A, B, and C are continuously differentiable on P, that Σ(k) is tunable on P, and that Σ_T is a "gradient-like" tuner of the form

    k̇ = −x_C ē_T ,        (17)

where ē_T is the "normalized" tuning error

    ē_T = e_T / (1 + ||x_C||²) = e_T − ||x_C||² ē_T .        (18)
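Before turning to the analysis, the tuner (17)-(18) is easy to exercise numerically. In the sketch below, the controller state x_C is replaced by a hypothetical bounded regressor and ε by a decaying exponential (all specific values are illustrative assumptions); the run accumulates the quantities appearing in the bound derived in the sequel:

```python
import numpy as np

p0 = np.array([1.0, -2.0])           # "true" parameter vector p_0 (hypothetical)
k = np.zeros(2)                      # initial estimate k(0)
dt, T = 1e-3, 20.0
J = 0.0                              # accumulates integral of (1 + 2||x_C||^2) ebar^2
c0 = float(np.dot(k - p0, k - p0))   # c_0 = ||k(0) - p_0||^2 + integral of eps^2

for step in range(int(T / dt)):
    t = step * dt
    xC = np.array([np.sin(t), np.cos(2.0 * t)])   # hypothetical bounded regressor
    eps = np.exp(-t)                              # decaying exponential in (16)
    eT = float(xC @ (k - p0)) + eps               # identification error (16)
    ebar = eT / (1.0 + float(xC @ xC))            # normalized tuning error (18)
    k = k - dt * xC * ebar                        # gradient tuner (17), Euler step
    J += dt * (1.0 + 2.0 * float(xC @ xC)) * ebar**2
    c0 += dt * eps**2
```

In exact arithmetic, ||k − p_0||² plus the accumulated integral never exceeds c_0; the discretized run satisfies the inequality to within the Euler error.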
Because the overall parameter adaptive control system consisting of (5), (17) and (18) is a dynamical system with state (x, k) and locally Lipschitz vector field, for each initial state (x_0, k_0) there must exist an interval [0, T) of maximal length on which a unique solution (x, k) exists.⁶ Along such a solution, (16)-(18) can be used to derive the expressions

    d(||k − p_0||²)/dt = −2((1 + ||x_C||²) ē_T² − ε ē_T)
                       ≤ −(1 + 2||x_C||²) ē_T² + ε² .

⁵ It is interesting to note that this will no longer be true if Σ_1 and Σ_2 are interchanged, unless Σ_P is restricted to be minimum phase.
⁶ Since there is no global existence theory for differential equations whose vector fields are only locally Lipschitz, it cannot logically be concluded that T = ∞, unless (for example) it can first be established that (x, k) is bounded wherever it exists (cf. [15]).
Rearranging terms and integrating, one obtains

    ||k − p_0||² + ∫_0^t (1 + 2||x_C||²) ē_T² dτ ≤ c_0 ,

where c_0 = ||k(0) − p_0||² + ∫_0^∞ ε² dτ. Since c_0 < ∞, it follows that on [0, T), k is bounded and ē_T and k̇ are square integrable; moreover, the boundedness of k together with (16) and (18) imply that ē_T is bounded as well. Up to here the analysis is straightforward and completely standard. To get beyond this point and show that x is bounded has in the past been a challenging matter requiring intricate, lengthy arguments. It is exactly here where the concept of tunability proves its worth. To show that x is bounded, we need two technical facts:
1. Because of tunability, there exists a matrix H(p) which stabilizes A(p) + H(p)C(p) for each fixed p E P ; moreover, because C and A are continuously differentiable, H can be constructed (using a Riccati equation) to be continuously differentiable as well (cf. [1]). 2. The matrix A.(t) = A(k(t)) + H(k(t))(C(t) + [0, k(t)')]) is exponentially stable on [0,T); i.e., the state transmission matrix ~(t, r) of A.(t) satisfies Ilk(t, )11 ___cl 0< < t < T , where $ and cl are positive constants depending on H and co. This result, which depends only on A + HC being a continuously differentiable stability matrix, k being bounded and k being square integrable, is well-known and can easily be proved using a standard argument along the lines of [16,17,1]. Armed with these technical facts, a proof that z is bounded is at hand: All one has to do is to use the expression ~W = (C(k) + [0, k'])x, which is a consequence of (5), (17) and (18), to rewrite (5) as ~ = A.(t)x - H(k(t))'~w. Since A, is exponentially stable and H and ~T are bounded, x can be viewed as the output of an exponentially stable linear system with bounded input -HEw. Therefore x must be bounded wherever it exists. From this, the boundedness of k and the hypothesis that [0,T) is the existence interval of maximal length it follows that T = ¢x~ and thus that both x and k must exist and be bounded on [0, c¢). The reader may wish to compare the simplicity of this reasoning, with that used previously in [4,6,7,8] and elsewhere.O 3 Identifier-Based
Parameterized
Controllers
For the purposes of this paper, a parameterized controller Σ_C(k) of the "identifier-based" type consists of two subsystems: one an "identifier" Σ_I(k_I) depending on an identification parameter vector k_I, the other an "internal regulator" Σ_R(k) depending on a control parameter vector k. Typically k = k_I; in some cases, however, k = [k_I′, k_R′]′, where k_R is a vector of additional internal regulator parameters (cf. Sect. 4). Depending on how Σ_I and Σ_R are defined, Σ_C may be called either a "direct" or an "indirect" controller.
48
Morse
Although not always explicitly stated, the starting point for the development of an identifier-based parameterized controller is typically the selection of an identifier parameter space 𝒫_I ⊂ ℝ^{n_{P_I}} and a design model Σ_D(p_I) upon which the design of Σ_C is to be based. Σ_D(p_I) is chosen to have the same control input and measured output value spaces as the unknown process model Σ_P to be controlled, and also so that the process model transfer matrix from u to y can be presumed to be "close" in some sense to the transfer matrix T_D(s, p_I) of Σ_D(p_I) for some unknown p_I ∈ 𝒫_I.⁷ The reduced transfer matrices of Σ_D(p_I), as p_I ranges over 𝒫_I, are the transfer matrices which Σ_D(p_I) matches. It is of considerable practical importance to develop adaptive stabilizers for design models whose classes of matchable transfer matrices are as "large" as possible.
Fig. 3. Closed-loop system Σ using an identifier-based controller Σ_C.
In most applications, Σ_D can be chosen to be of the form (C_D(p_I), A_D + D_D(p_I)C_D(p_I), B_D(p_I)), where B_D, …, D_D are matrix-valued functions on 𝒫_I and A_D is a constant stability matrix. The stability of A_D implies that Σ_D is detectable on 𝒫_I. On the other hand, stabilizability of Σ_D on 𝒫_I generally cannot be guaranteed unless special constraints are placed on 𝒫_I [16,17] or Σ_D is nonlinearly parameterized [18]. This has important implications which are well known [3,5,6,7,8] and which will be discussed briefly in the sequel.

⁷ By exploiting the concept of a design model, it is possible to discuss adaptive control algorithms without being too specific about Σ_P; this is of course consistent with the idea that a process model can never be known exactly.
Because of the particular structure of Σ_D(p_I), it is a routine matter to construct a matrix-valued function E_I(p_I) and constant matrices A_I, B_I, and D_I, with A_I stable, so that the estimation equations

    E_I(p_I)A_I = A_D E_I(p_I),  E_I(p_I)B_I = B_D(p_I),  E_I(p_I)D_I = D_D(p_I),   p_I ∈ 𝒫_I    (19)

hold,⁸ and set

    C_I(p_I) = C_D(p_I)E_I(p_I),   p_I ∈ 𝒫_I.    (20)

These matrices determine an identifier Σ_I(k_I) of the form

    ẋ_I = A_I x_I + D_I y + B_I u    (21)
    e_I = C_I(k_I)x_I − y    (22)
    x̄_D = E_I(k_I)x_I,    (23)
where x_I is the identifier's state, e_I is an identification error, and x̄_D is a "surrogate" design model state which is often used in constructing the feedback control to the process.⁹ The estimation equations imply that if k_I were held fixed at p_I, and u and y were replaced by the design system input u_D and output y_D respectively, then Σ_I would be a state estimator for the design system

    ẋ_D = (A_D + D_D(p_I)C_D(p_I))x_D + B_D(p_I)u_D
    y_D = C_D(p_I)x_D,    (24)

in that x_D would be given by x̄_D(t) + e^{A_D t}(x_D(0) − x̄_D(0)).

Example 6: Indirect Control SISO Design Model — Pick n > 0 and choose (c, A, b) to be any n-dimensional, controllable, observable SISO system with A stable. For each pair p₁, p₂ ∈ ℝⁿ, define p_I = [p₁′, p₂′]′, A_D = A, B_D(p_I) = p₂, C_D(p_I) = c and D_D(p_I) = p₁. Then with 𝒫_I = ℝ^{2n}, Σ_D is the n-dimensional design model described by the equations

    ẋ_D = (A + p₁c)x_D + p₂u_D
    y_D = cx_D.    (25)

⁸ These matrices can always be chosen so that E_I depends linearly on the columns of B_D and D_D and also so that each eigenvalue of A_I is an eigenvalue of A_D. One method for doing this can easily be obtained by generalizing the construction in Example 6.
⁹ Note that each identifier (21)–(23) uniquely determines a parameterized design model (C_I(p_I), A_I + D_I C_I(p_I), B_I), called Σ_I's natural design model, for which (19) and (20) hold with E_I = I. Although this design model's class of matchable transfer matrices is exactly the same as that of the design model which originally determined Σ_I, the former can often have a state-space dimension which is significantly larger than that of the latter.
This is one form of the parameterized SISO design model commonly used for indirect control. As defined, Σ_D is observable on 𝒫_I and stabilizable at those values of p_I ∈ 𝒫_I for which the transfer function of Σ_D has no unstable pole-zero cancellations. In addition, Σ_D can match any transfer function with McMillan degree not exceeding n. Define identifier matrices A_I = block diag{A, A}, B_I = [0′, b′]′, D_I = [b′, 0′]′. It is easy to check that the estimation equations (19) hold with

    E_I(p_I) = [S(p₁), S(p₂)],    (26)

where S(p) = [M₁Rp, M₂Rp, …, M_n Rp]′, R is the inverse of the controllability matrix of (A, b), M_i is the transpose of the observability matrix of (e_i′, A) and e_i is the i-th unit vector in ℝⁿ. A simple calculation shows that

    cS(p) = p′Q′,    (27)

where Q is the parameter transformation matrix Q = N′R and N is the observability matrix of (c, A). Thus, for this design model, C_I(p_I) = [p₁′Q′, p₂′Q′], which is linear in p_I. ◇

Example 7: Direct Control SISO Design Model — Let n and (c, A, b) be as in Example 6. Pick m > 0 and choose (A*, b*) to be any single-input, m-dimensional, controllable pair with A* stable. For each p₁, p₂ ∈ ℝⁿ and p₃ ∈ ℝᵐ, define p_I = [p₁′, p₂′, p₃′]′, B_D(p_I) = [p₂′, b*′]′, C_D(p_I) = [0_{1×n}, p₃′], D_D(p_I) = [p₁′, 0_{1×m}]′, and

    A_D = [ A     0
            b*c   A* ].

With 𝒫_I = ℝ^{2n+m}, Σ_D(p_I) is the (n + m)-dimensional parameterized design model

    ẋ_{D1} = Ax_{D1} + p₁p₃′x_{D2} + p₂u_D
    ẋ_{D2} = b*cx_{D1} + A*x_{D2} + b*u_D    (28)
    y_D = p₃′x_{D2},

which, as will be shown in the sequel, is appropriate for direct control. As defined, Σ_D is detectable on 𝒫_I and stabilizable at those values of p_I ∈ 𝒫_I for which the transfer function of Σ_D has no unstable pole-zero cancellations. The class 𝒯 of transfer functions which Σ_D can match can easily be shown to contain all transfer functions of relative degree n* and McMillan degree not exceeding n, provided n* ≤ n and the characteristic polynomial of A* has a real factor of degree n*. Thus, by picking m ≤ n and choosing A* to have only real eigenvalues, 𝒯 will contain all strictly proper transfer functions with relative degree not exceeding m and McMillan degree not exceeding n. If m ≤ n + 1, 𝒯 can also be shown to contain all strictly proper transfer functions of McMillan degree equal to m.
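Both of the design models above rely on the matrices S(·) and Q constructed in Example 6, so a quick numerical sanity check is possible. In the sketch below the dimensions, the random seed, and the reading Q = N′R of the parameter transformation matrix are assumptions (chosen to be consistent with identity (27)); the code verifies cS(p) = p′Q′ and the property S(p)b = p used in the estimation equations (19).

```python
import numpy as np

# Sanity check of the Example 6 construction: S(p) = [M_1 R p, ..., M_n R p]',
# R = (controllability matrix of (A, b))^{-1}, M_i = (observability matrix of
# (e_i', A))', N = observability matrix of (c, A), and Q = N'R (the form of Q
# consistent with identity (27)). Random data are generically controllable
# and observable, and are purely illustrative.
rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))
b = rng.standard_normal((n, 1))
c = rng.standard_normal((1, n))

def obsv(C, A):
    # observability matrix [C; CA; ...; CA^{n-1}]
    return np.vstack([C @ np.linalg.matrix_power(A, j) for j in range(n)])

R = np.linalg.inv(np.hstack([np.linalg.matrix_power(A, j) @ b for j in range(n)]))
N = obsv(c, A)
Q = N.T @ R

def S(p):
    # i-th row of S(p) is (M_i R p)'
    return np.vstack([(obsv(np.eye(n)[i:i + 1], A).T @ R @ p).T for i in range(n)])

p = rng.standard_normal((n, 1))
print(np.allclose(c @ S(p), p.T @ Q.T))   # identity (27): c S(p) = p'Q'
print(np.allclose(S(p) @ b, p))           # S(p) b = p, as required by (19)
```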
It is not difficult to verify that, as an identifier Σ_I(k_I) for this design model, one can use a system described by the equations

    ż₁ = Az₁ + by
    ż₂ = Az₂ + bu
    ż* = A*z* + b*u
    Ḣ₁ = A*H₁ + b*z₁′    (29)
    Ḣ₂ = A*H₂ + b*z₂′
    x̄_{D2} = z* + H₁Qk₁ + H₂Qk₂
    e_I = k₃′(z* + H₁Qk₁ + H₂Qk₂) − y,

where S(·) and Q are as in Example 6, and k_I = [k₁′, k₂′, k₃′]′. For a more motivated development of this type of identifier, see [3]. Identifier equations (29) can of course be written in the form of (21)–(23), with x_I a column vector, using the definition x_I = [z₁′, z₂′, z*′, h₁′, h₂′, …, h₂ₙ′]′, where h_j is the j-th column of H = [H₁, H₂]; calculation of matrices A_I, B_I, D_I and E_I(p_I) for which (19) holds is a straightforward but tedious task. For example, E_I(p_I) is of the form E_I(p_I) = block diag{[S(p₁), S(p₂)], T(p_I)}, where T(p_I) = [I, −e₁′Qp₁I, …, −e_n′Qp₁I, −e₁′Qp₂I, …, −e_n′Qp₂I]_{m×(2n+1)m} and e_i is the i-th unit vector in ℝⁿ. The linear dependence of E_I(p_I) on [p₁′, p₂′]′ and of C_D(p_I) on p₃ implies that C_I(p_I), as defined in (20), is a bilinear function of [p₁′, p₂′]′ and p₃. ◇

Remark 2: Although not central to the purpose of this paper, at this point it nevertheless seems logical to digress briefly and give an alternative formula for e_I, which is of particular importance to the analysis of all identifier-based parameter adaptive control systems. For this, fix p₀ ∈ 𝒫_I and let Σ₀ = (C₀, A₀, B₀) be an n₀-dimensional, observable, stabilizable realization of the reduced transfer matrix of Σ_D(p₀). Let Ω₀ be an open connected set of observable systems in ℝ^{n_y×n₀} ⊕ ℝ^{n₀×n₀} ⊕ ℝ^{n₀×n_u} which contains Σ₀. The expression for e_I given by the following theorem is the key relationship upon which the analyses of most tuning algorithms are based. ◇

Identification Error Theorem: There exist an integer n̄ > 0 and analytic functions A: Ω₀ → ℝ^{n̄×n̄}, B: Ω₀ → ℝ^{n̄×n_u}, C: Ω₀ → ℝ^{n_y×n̄}, and D: Ω₀ → ℝ^{n̄×n_y} with the following properties:
(i) spec(A(Σ)) ⊂ spec(A_I) ∪ {uncontrollable spectrum of Σ₀}, ∀Σ ∈ Ω₀.
(ii) C(Σ₀)e^{A(Σ₀)t}[B(Σ₀), D(Σ₀)] = 0, t ≥ 0.
(iii) If Σ_P ∈ Ω₀, then for each x_P(0), each x_I(0), and each piecewise-continuous input u: [0, ∞) → ℝ^{n_u}, e_I can be written as

    e_I = e_MM + ε + δ,    (30)
where

    e_MM = (C_I(k_I) − C_I(p₀))x_I
    ε = C(Σ_P)e^{A(Σ_P)t}(M_P x_P(0) + M_I x_I(0))
    δ = −∫₀ᵗ C(Σ_P)e^{A(Σ_P)(t−τ)}(B(Σ_P)u(τ) + D(Σ_P)y(τ)) dτ,

and M_P and M_I are constant matrices not depending on p₀.
A number of expressions for e_I similar to (30) have been derived previously [19,20,21] and are by now reasonably well known. Equation (30) shows that if Σ_P ∈ Ω₀,¹⁰ e_I can be written as the sum of three terms: the first (e_MM), called the "mismatch error", is due to parameter discrepancy; the second (ε) accounts for nonzero initial process model and identifier states; and the third (δ) appears because of unmodeled dynamics. In view of property (i), ε must decay to zero exponentially fast so long as Σ_P ∈ Ω₀. Assuming a known upper bound on the real part of the uncontrollable spectrum of Σ₀, property (i) and the expression for δ clearly make it possible to generate a "bounding" signal β (cf. [19]), using u and y, for which ||δ(t)|| ≤ c₁(Σ_P) + c₂(Σ_P)||β(t)||, ∀Σ_P ∈ Ω₀; here c₁(·) and c₂(·) are positive-valued analytic functions on Ω₀ which, because of property (ii), satisfy cᵢ(Σ₀) = 0, i = 1, 2. Such bounding signals have been used to construct "dynamically normalized" tuning errors in an effort to develop tuning algorithms capable of handling unmodeled dynamics [20,21]. The implications of (30) have not yet been fully exploited. ◇

The other subsystem of Σ_C(k), called an internal regulator Σ_R(k), is often simply a state feedback law of the form

    u_C = F_R(k_I)x̄_D + G_R(k_I)r,
where F_R(p_I) and G_R(p_I) are matrix-valued functions designed to endow the closed-loop design model (C_D(p_I), A_D + D_D(p_I)C_D(p_I) + B_D(p_I)F_R(p_I), B_D(p_I)G_R(p_I)) with some desired properties (e.g., internal stability) for each p_I ∈ 𝒫_I. More generally, Σ_R(k) might be an n_R-dimensional dynamic compensator of the form

    u_C = C_R(k)x_R + G_R(k)r + F_R(k)x̄_D + M_R(k)y
    ẋ_R = A_R(k)x_R + B_R(k)r + D_R(k)x̄_D + N_R(k)y,    (31)

where k = [k_I′, k_R′]′ and k_R is a vector of internal regulator parameters which takes values in some parameter space 𝒫_R ⊂ ℝ^{n_{P_R}}. In this case, internal regulator

¹⁰ Since Σ_P will typically lie in many such subsets, each with a different value of p₀, there is nothing especially unique about the particular value of p₀ appearing in (30) in relationship to Σ_P.
matrices C_R, G_R, …, N_R are chosen to give the closed-loop design model Σ_Dcl(p), consisting of the design model (24) in feedback with

    u_D = C_R(p)x_R + G_R(p)r + F_R(p)x_D + M_R(p)y_D
    ẋ_R = A_R(p)x_R + B_R(p)r + D_R(p)x_D + N_R(p)y_D,    (32)

desired properties for each p ∈ 𝒫 = 𝒫_I × 𝒫_R. In the sequel it is assumed, without loss of generality, that (C_R(p), A_R(p)) is a detectable pair for each p ∈ 𝒫.

In many situations, the tuning error e_T is chosen to be the same as the identification error e_I. There are nevertheless cases (to be discussed briefly later) in which it is useful to define e_T in an alternative way. For the present, however, we assume

    e_T = e_I    (33)

and view an identifier-based parameterized controller Σ_C(k) to be as depicted in Fig. 3. The heuristic idea of designing an internal regulator to control a design model, and then using the regulator in conjunction with an identifier to control the process, is sometimes called the "certainty equivalence principle." The resulting internal regulator Σ_R is accordingly called a certainty equivalence control. Thus, an internal regulator is the same as a certainty equivalence controller; in the sequel the two terms are used interchangeably.¹¹

Our aim is to make clear, in algorithmically independent terms, one of the fundamental implications of certainty equivalence control. For this, set x_Dcl = [x_D′, x_R′]′ and define the closed-loop design model Σ_Dcl(p) = (C_Dcl(p), A_Dcl(p), B_Dcl(p)) so that the equations

    ẋ_Dcl = A_Dcl(p)x_Dcl + B_Dcl(p)r
    y_D = C_Dcl(p)x_Dcl    (34)

model the feedback interconnection of design model (24) with (32). Σ_R is said to stabilize the design model Σ_D at p if Σ_Dcl is internally stable at p; i.e., if A_Dcl(p) is a stability matrix. Let ℰ be a nonempty subset of 𝒫. A certainty equivalence controller Σ_R(k) is called a stabilizing controller on ℰ if for each p = [p_I′, p_R′]′ ∈ ℰ, Σ_R(p) stabilizes the design model Σ_D at p. We can now state the

Certainty Equivalence Stabilization Theorem: If Σ_R(k) is a certainty equivalence, stabilizing controller on ℰ, then Σ(k) is tunable on ℰ.

¹¹ The definition of Σ_R in (31) might appear more in accordance with the intuitive idea of a certainty equivalence controller if the terms involving x̄_D were not present. We've elected to include these terms since they actually do appear in the defining equations for many certainty equivalence controllers.
Fig. 4. Closed-loop design model Σ_Dcl.
The theorem makes clear one of the central implications of certainty equivalence control: if Σ_R is chosen to stabilize design model Σ_D on ℰ (cf. Fig. 4), and is then used as a component subsystem of Σ_C, the overall parameterized system Σ (cf. Fig. 3), consisting of Σ_C in feedback with Σ_P, is tunable on ℰ. For this to be true it is not necessary for the process model Σ_P to have a transfer matrix in the class of transfer matrices matchable by Σ_D. In this context the notion of "true process model parameters" is neither necessary nor relevant.

It is usually not possible to make use of the preceding theorem to achieve tunability on all of 𝒫_I, since 𝒫_I typically contains points at which Σ_D is not stabilizable. Various ideas proposed in the literature might be used to deal with this problem. One is to use a tuning algorithm which restricts k_I to a subset 𝒫̄_I of 𝒫_I on which loss of stabilizability does not occur [16,17]; since this in effect reduces Σ_D's parameter space to 𝒫̄_I, Σ_D's class of matchable transfer matrices is correspondingly reduced, typically by a large amount. Another idea [8,22] is to only stabilize Σ_D at points outside of some region 𝒯̄ ⊂ 𝒫_I which contains the points at which Σ_D loses stabilizability; since tunability cannot usually be guaranteed at points inside of 𝒯̄, in order to avoid instability some form of persistent excitation must be used to make certain that k_I does not get stuck in 𝒯̄. A third approach might be to use a specially parameterized design model which does not lose stabilizability at any point in its parameter space [18]; this unfortunately leads to difficult tuner design problems for which there are at present no satisfactory solutions. A fourth approach, for the most part unexplored in the literature, is to make use of implicit tuning. This will be discussed briefly in Sect. 4. For the case when Σ_P is "minimum phase", the problem we've been discussing can often be avoided altogether.
To understand why, we need a few more ideas. Recall that a constant, linear multivariable system (C, A, B) is minimum phase if its transfer matrix is left invertible and its transmission zeros¹² [23] are all stable; as defined, minimum phase systems are necessarily detectable. Call a matrix pair (C, A) (or a system (C, A, B)) output stabilized if Ce^{At} → 0 as t → ∞.¹³ Σ_R is said to output stabilize the design model Σ_D at p if Σ_Dcl is output stabilized at p. Let ℰ be a nonempty subset of 𝒫. A certainty equivalence controller Σ_R(k) is called an output stabilizing controller on ℰ if for each p = [p_I′, p_R′]′ ∈ ℰ, Σ_R(p) output stabilizes the design model Σ_D at p. The following theorem provides the key technical link needed to explain how to achieve with indirect control what has previously been achieved in [4,6,7,8] and elsewhere with direct adaptive control.

Certainty Equivalence Output Stabilization Theorem: If Σ_P is minimum phase and Σ_R(k) is a certainty equivalence, output stabilizing controller on ℰ, then Σ(k) is tunable on ℰ.

¹² For a system (C, A, B) with left-invertible transfer matrix T(s) = C(sI − A)⁻¹B, a complex number λ is a transmission zero if there exists a nonzero vector [x′, u′]′ such that Cx = 0 and (λI − A)x + Bu = 0.
Remark 3: The certainty equivalence output stabilization theorem does not require Σ_P to have a transfer matrix which can be matched by Σ_D(p_I) for some value of p_I ∈ 𝒫_I. It is interesting to note, however, that in the case when exact matching is possible, to achieve tunability it is not necessary for the design model to be stabilizable, even at the "true design model parameter value" at which matching takes place! For example, the process model ẏ = −4y + u is matched by the direct control design model ẋ_{D1} = −2x_{D1} + p₁p₃x_{D2} + p₂u_D, ẋ_{D2} = −x_{D1} − x_{D2} + u_D, y_D = p₃x_{D2}, at p = [p₁, p₂, p₃]′ = [−6, 3, 1]′, and for this value of p the design model has an uncontrollable eigenvalue at s = 1. Since for obvious reasons a process model Σ_P must always be assumed to be at least stabilizable and detectable, this example shows that a design model cannot necessarily be viewed merely as a process model with variable parameters. ◇
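The numbers in Remark 3 can be verified directly. The sketch below checks that at p = [−6, 3, 1]′ the design model has an unstable eigenvalue at s = 1 which is uncontrollable, while its reduced transfer function is 1/(s + 4), i.e., exactly the process model ẏ = −4y + u.

```python
import numpy as np

# Verification of the Remark 3 example at p = [-6, 3, 1]'.
p1, p2, p3 = -6.0, 3.0, 1.0
A = np.array([[-2.0, p1 * p3],
              [-1.0, -1.0]])            # design model state matrix
b = np.array([[p2], [1.0]])
c = np.array([[0.0, p3]])

eigs = np.linalg.eigvals(A)             # {1, -4}: one unstable eigenvalue
ctrb = np.hstack([b, A @ b])            # controllability matrix [b, Ab]
rank = np.linalg.matrix_rank(ctrb)      # rank 1: the mode at s = 1 is uncontrollable

# reduced transfer function c(sI - A)^{-1}b equals 1/(s + 4) at a test point
s = 2.0
T = float(c @ np.linalg.solve(s * np.eye(2) - A, b))
print(sorted(eigs.real), rank, T, 1.0 / (s + 4.0))
```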
R e m a r k 4: Both of the preceding theorems apply to the case when eT = ei. In order to facilitate the tuning of k it is sometimes helpful to use an alternative definition of ew. One possibility is ew ----[e l, e~t]' where e a = Cia(k)zR and C m ( k ) is some matrix-valued function of k; since detectability through el clearly implies detectability through eT, the preceding theorems apply to this case as well. Yet another possible definition of eT, useful in the case when AR is a constant stabihty matrix, DR = 0, and eT and eR are of the same size, is eT -- ex+eR; the preceding theorems are also applicable in this case since the stability of AR and the assumption DR = 0 insure that detectability through ei implies detectability through ew. This definition of ew finds application in model reference adaptive control as the following example illustrates.<> 13 Note that (C, A) is output stabilized if and only if each of its unstable modes is unobservable. On the other hand, (C, A) is detectable if and only if each of its unobservable modes is stable. It follows that (C, A) is both output stabilized and detectable if and only if A has no unstable modes; i.e., if and only if A is a stability matrix.
Example 7 (cont'd): Observe that the control u_D = −cx_{D1} causes the output y_D of design model (28) to satisfy y_D = p₃′e^{A*t}x_{D2}(0). Thus, the internal regulator matrix F_R = [−c, 0] output stabilizes Σ_D on ℝ^{2n+m}. Therefore, by the Certainty Equivalence Output Stabilization Theorem, if Σ_P is minimum phase, the closed-loop parameterized system consisting of Σ_P, identifier (29), internal regulator u = −cx̄_{D1} = −k₁′Q′z₁ − k₂′Q′z₂ (cf. (27)) and tuning error e_T = e_I is tunable on ℝ^{2n+m}. The simple dependence of this control law on k_I is the principal virtue of direct control. The challenge in coming up with an overall adaptive control system based on this direct control design model lies in the synthesis of an appropriate tuner, since C_I(k_I) (i.e., the equation for e_T) is nonlinear in k_I. The tuning problem simplifies significantly if, for fixed n* ≤ m, the characteristic polynomial of A* is chosen to have a real monic factor α* of degree n*, and p₃ is constrained to be of the form p₃ = p₀c*′, p₀ ∈ ℝ, where c* is defined by the formula c*(sI − A*)⁻¹b* = 1/α*. In this case Σ_D([p₁′, p₂′, p₀c*]′) has 2n + 1 parameters and a class of matchable transfer functions containing only those of relative degree n* with McMillan degree not exceeding n. Moreover,

    e_T = k₀c*(z* + H₁Qk₁ + H₂Qk₂) − y,

which is a form for which by now appropriate tuning algorithms are well known (e.g., see [17,19]).¹⁴

To convert the preceding into a model reference algorithm, we need a reference model Σ_r, which we define to be

    ẋ_R = A*x_R + b*r,    (35)

and a modified control law, which we take to be u_D = −cx_{D1} + p_R r. This control law output stabilizes Σ_D on ℝ^{2n+2} and results in the closed-loop transfer function p₀p_R/α* from r to y_D. Thus as before, the closed-loop parameterized system consisting of the minimum phase process model (2), the internal regulator u = −cx̄_{D1} + k_R r = −k₁′Q′z₁ − k₂′Q′z₂ + k_R r (cf. (27)), the identifier (29) with modified identification error

    e_I = k₀c*(z* + H₁Qk₁ + H₂Qk₂) − y    (36)

and tuning error e_T = e_I, is tunable on ℝ^{2n+2}. To achieve model following it is necessary to change the definition of e_T. To motivate the change, note from (35) that the tracking error y − y_r can be written as

    y − y_r = −(e_I + e_R) + k₀c*x̃,    (37)

where e_R is the reference parameter error e_R = (1 − k₀k_R)y_r and x̃ = z* + H₁Qk₁ + H₂Qk₂ − x_R k_R; moreover, if k is differentiable, then by a simple calculation,

    x̃̇ = A*x̃ + H₁Qk̇₁ + H₂Qk̇₂ − x_R k̇_R.    (38)

¹⁴ Things can be simplified somewhat by redefining k₁ and k₂ to be Qk₁ and Qk₂ respectively; for then u = −k₁′z₁ − k₂′z₂ and e_T = k₀c*(z* + H₁k₁ + H₂k₂) − y.
Equations (37) and (38) clearly show that if e_T is defined to be e_T = [e_I, e_R]′, then a zero tracking error will be attained asymptotically along any system trajectory on which k becomes constant and e_T goes to zero. Equations (37) and (38) also imply that this will still be true if e_T is alternatively defined to be simply e_I + e_R. In this case, e_T proves to be precisely the same as the "augmented error" originally introduced by Monopoli [24]. As noted in Remark 4, neither of these definitions of e_T affects the tunability of the overall parameterized system Σ(k). ◇

Example 6 (cont'd): Simple examples show that it is impossible to output stabilize on ℝ^{2n} the indirect control design model Σ_D of Example 6. One way to circumvent this problem is to modify the parameterization of Σ_D. The price paid for this, however, is a significant decrease in the size of the class of transfer functions which the modified design model can match. The modified parameterization is constructed as follows: with c and A as before, choose a nonnegative integer m < n and a monic, stable polynomial α(s) of degree m; define vectors b₁ and b₂ so that c(sI − A)⁻¹b₁ = 1/β(s) and c(sI − A)⁻¹b₂ = α(s)/β(s) respectively, where β(s) is the characteristic polynomial of A. In place of (25), define the "multiplicatively parameterized" indirect control design model Σ_D(p_I) by the equations

    ẋ_D = (A + p₀p₁c)x_D + (b₂ + Tp₂)u_D    (39)
    y_D = p₀cx_D,

where

    T = [b₁, Ab₁, …, A^{m−1}b₁],    (40)

p₁ ∈ ℝⁿ, p₂ ∈ ℝᵐ and p_I = [p₀, p₁′, p₂′]′. As defined, Σ_D is detectable on 𝒫_I = ℝ^{n+m+1}. However, unlike the previous indirect control design model, which can match any transfer function of McMillan degree not exceeding n, the modified design model (39) can only match those transfer functions of McMillan degree not exceeding n whose relative degree is exactly n − m. This can be deduced using the identities

    cA^{j−1}bᵢ = 0,  j < n − m,  i ∈ {1, 2};    cA^{n−m−1}b₂ = 1,    (41)
which hold because the transfer functions 1/β(s) and α(s)/β(s), used to define b₁ and b₂ respectively, both have relative degree no smaller than n − m. What's especially important about the design model (39) is that it can be output stabilized on 𝒫_I by a continuously differentiable feedback function.

Proposition 1: Let γ(s) be any monic, stable polynomial of degree n − m. With

    f(p_I) = −cγ(A + p₀p₁c),    (42)

the control law u_D = f(p_I)x_D output stabilizes the indirect control design model (39) on 𝒫_I.
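Proposition 1 can be checked numerically on a small instance. In the sketch below all numbers are illustrative assumptions: n = 3, m = 1, β(s) = (s+1)(s+2)(s+3), α(s) = s + 5 and γ(s) = (s+4)(s+6), with p₂ chosen so that the numerator polynomial of the design model is s − 2, i.e., unstable. The closed loop then retains an unstable eigenvalue, yet u_D = f(p_I)x_D still output stabilizes it, because that eigenvalue is unobservable through y_D = p₀cx_D.

```python
import numpy as np

# Numerical check of Proposition 1 (illustrative numbers, not from the text).
# Observer-canonical (c, A) with beta(s) = (s+1)(s+2)(s+3) = s^3+6s^2+11s+6;
# b1, b2 satisfy c(sI-A)^{-1}b1 = 1/beta(s), c(sI-A)^{-1}b2 = (s+5)/beta(s).
A = np.array([[-6.0, 1.0, 0.0],
              [-11.0, 0.0, 1.0],
              [-6.0, 0.0, 0.0]])
c = np.array([[1.0, 0.0, 0.0]])
b1 = np.array([[0.0], [0.0], [1.0]])
b2 = np.array([[0.0], [1.0], [5.0]])
T = b1                                   # T = [b1, ..., A^{m-1}b1] with m = 1

p0 = 2.0
p1 = np.array([[0.3], [-1.1], [0.7]])    # arbitrary
p2 = -7.0                                # numerator becomes s + 5 + p2 = s - 2 (unstable)
Abar = A + p0 * (p1 @ c)
bbar = b2 + T * p2

# f(p_I) = -c gamma(A + p0 p1 c), with gamma(s) = (s+4)(s+6) = s^2 + 10s + 24
f = -c @ (Abar @ Abar + 10.0 * Abar + 24.0 * np.eye(3))
Acl = Abar + bbar @ f                    # closed-loop design model state matrix

eigs, V = np.linalg.eig(Acl)             # char poly = (s-2)(s+4)(s+6), cf. (43) below
i = int(np.argmax(eigs.real))            # the unstable mode at s = 2
print(sorted(eigs.real))
print(abs((c @ V[:, i:i + 1]).item()))   # ~0: unstable mode unobservable through c
```

Since the unstable closed-loop mode is annihilated by c, y_D = p₀c e^{A_cl t}x_D(0) → 0 for every initial state, which is precisely output stabilization in the presence of an unstable "canceled" zero.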
Proposition 1 gives an explicit formula for a feedback function f which output stabilizes the multiplicatively parameterized design model (39) for each p_I ∈ 𝒫_I. The n components of f are each polynomials in the n + 1 components of [p₀, p₁′]′. It can easily be shown that if f is defined by (42), then the closed-loop design model characteristic polynomial satisfies

    det(sI − A − p₀p₁c − (b₂ + Tp₂)f(p_I)) = α_D(s, p_I)γ(s),   p_I ∈ 𝒫_I,    (43)

where α_D(s, p_I) is the m-th degree, monic numerator polynomial of the transfer function c(sI − A)⁻¹(b₂ + Tp₂). It should be noted, however, that merely decreeing that f should satisfy (43) does not guarantee that f will satisfy (42) or that the closed-loop design model Σ_Dcl will be output stabilized. In other words, the specification that f(p_I) should assign to A + p₀p₁c + (b₂ + Tp₂)f(p_I) the closed-loop characteristic polynomial α_D(s, p_I)γ(s) does not pin down f enough for the purposes of indirect adaptive control. Proposition 1 implies that to achieve tunability on 𝒫_I, it's enough to choose f on 𝒫_I in accordance with (42) so that (43) holds, even though for some p_I ∈ 𝒫_I, α_D(s, p_I) may be an unstable polynomial. By recognizing that p₀α_D(s, p_I) is the numerator of the transfer function of the open-loop indirect control design model (39), it can be seen that what's being done here is a kind of "adaptive pole-zero cancellation" similar to what's often implicitly done with algorithms of the direct control type.

To construct an indirect parameterized controller for adaptive model following, one can proceed from this point in essentially the same way as in the direct control case of Example 7. First, define an identifier for the multiplicatively parameterized design model (39) by modifying the identifier for the design model (25) in the obvious way; that is, by replacing c and k₂ in the identifier for (25) by k₀c and b₂ + Tk₂ respectively, where k₂ is an m-vector. What result are the equations

    ż₁ = Az₁ + by
    ż₂ = Az₂ + bu
    e_I = k₀(k₁′Q′z₁ + (b₂ + Tk₂)′Q′z₂) − y    (44)
    x̄_D = S(k₁)z₁ + S(b₂ + Tk₂)z₂,

where b, Q, and S(·) are as defined previously. Next, set γ = α* and

    u = f(k_I)x̄_D + k_R r,    (45)

where f is as defined in (42), k_I = [k₀, k₁′, k₂′]′ ∈ ℝ^{n+m+1}, k_R is an additional internal regulator parameter to be tuned, and r is the input to the reference model (35). In view of the Certainty Equivalence Output Stabilization Theorem, the closed-loop parameterized system consisting of the SISO minimum phase process model Σ_P, the parameterized controller (44), (45) and the tuning error e_T = e_I,
is tunable on ℝ^{n+m+2}. To develop tracking error equations analogous to (37) and (38), first use (35) and (44) to write

    y − y_r = −(e_I + e_R) + k₀(cx̄_D − k_R c*x_R),    (46)

where, as before, e_R = (1 − k₀k_R)y_r. Assume that k = [k_I′, k_R]′ is differentiable. Using the estimation equations (19) together with the definition of u in (45), one gets

    x̄̇_D = (A + k₀k₁c + (b₂ + Tk₂)f(k_I))x̄_D + (b₂ + Tk₂)k_R r + S(k̇₁)z₁ + S(Tk̇₂)z₂ − k₁e_I.    (47)
To proceed, we need the following result.

Lemma 1: Let γ(s) be as in Proposition 1 and write (c̄, Ā, b̄) for the unique controllable, observable realization of 1/γ(s) with (Ā, b̄) in control canonical form. If f(p_I) satisfies (42) and M(p_I) denotes the (n − m) × n matrix whose rows are

    c, c(A + p₀p₁c), …, c(A + p₀p₁c)^{n−m−1},

then for all p_I ∈ ℝ^{n+m+1},

    c = c̄M(p_I)    (48)
    M(p_I)(b₂ + Tp₂) = b̄    (49)
    M(p_I)(A + p₀p₁c + (b₂ + Tp₂)f(p_I)) = ĀM(p_I).    (50)

Now set γ(s) = α*(s), α*(s) being the characteristic polynomial of A*, and suppose, without loss of generality, that (c*, A*, b*) is in control canonical form; thus (c̄, Ā, b̄) = (c*, A*, b*). Hence from (46) and (48),

    y − y_r = −(e_I + e_R) + k₀c*x̃,    (51)

where x̃ = M(k_I)x̄_D − k_R x_R. Moreover, from (35), (47), (49) and (50),

    x̃̇ = A*x̃ + Ṁ(k_I)x̄_D + M(k_I)(S(k̇₁)z₁ + S(Tk̇₂)z₂ − k₁e_I) − k̇_R x_R.    (52)

The implication of (51) and (52) is clear: if k is tuned in such a way that k̇, e_I and e_R each tend to zero as t → ∞, then the tracking error y − y_r will tend to zero as well. Thus, an appropriate definition for e_T would be e_T = [e_I, e_R]′, just as in Example 7. It is interesting to note, however, that unlike Example 7, zeroing the scalar tuning signal e_T = e_I + e_R will not necessarily produce a zero tracking error, because of the presence of e_I in (52). ◇
4 Implicit Tuning

As noted in Sect. 3, one of the things to be done in constructing a certainty equivalence controller for a given design model Σ_D is to select an internal regulator Σ_R which causes the closed-loop design model Σ_Dcl to have certain desirable properties. This task, which might be accomplished using established synthesis techniques from linear system theory (e.g., pole placement, model following, decoupling theory, linear quadratic optimal control, etc.), can often be reduced to finding a solution p_R to some design equation of the form

    Δ(p_I, p_R) = 0,    (53)

where p_I ∈ 𝒫_I ⊂ ℝ^{n_{P_I}} is a vector of design model parameters, p_R ∈ 𝒫_R ⊂ ℝ^{n_{P_R}} is a vector whose elements are elements of the coefficient matrices defining Σ_R, and Δ is a "design" function mapping ℝ^{n_{P_I}} × ℝ^{n_{P_R}} into ℝ^{n_{P_R}}; (53) might arise from a Riccati equation, a pole placement equation or something similar. If (53) is solved explicitly for p_R as a function of p_I, then the resulting identifier-based controller's parameter space 𝒫 is simply 𝒫_I and the only parameters to be tuned are the elements of k_I. If, on the other hand, p_R is taken to be a vector of additional regulator parameters, then the corresponding controller's parameter space 𝒫 is 𝒫_I × 𝒫_R and the parameters to be tuned are both the elements of k_I and those of k_R; in this case the role of Σ_T is to adjust the parameter vector k = [k_I′, k_R′]′ to make both the tuning error e_T and the design error

    e_D = Δ(k_I, k_R)    (54)

simultaneously small. An identifier-based controller whose parameters are defined and adjusted in this manner is said to be implicitly tuned.¹⁵

Implicit tuning algorithms can be classified as being of either the "one level" or "two level" type, depending on which errors k_I is being adjusted to make small. In the two level case, k_I is adjusted to make just the identification error e_I small, whereas in the one level case k_I is tuned to make both the identification error e_I and the design error e_D small. Two level tuning is thus more in line with the traditional idea of identifier-based adaptive control than is one level tuning, and thus is perhaps more intuitively appealing. In addition, two level tuning allows one to use conventional identification algorithms to adjust k_I, whereas single level tuning does not.¹⁶ In spite of this, there are compelling reasons for also considering the concept of single level tuning. At the end of this section we provide an example in support of this view.

The general idea of implicit tuning has been around in one form or another for a long time. The need for dynamically tuned regulator parameters is alluded to in [3]. Some of the reparameterization techniques developed in [26] can be

¹⁵ A more general concept of implicit tuning would allow the design function Δ(·) in (54) to also be a function of the controller state x_C [21].
¹⁶ The design of a single level implicit tuner is a nonstandard identification problem for which there is apparently almost no literature, even for the interesting, highly structured case in which C is linear in k_I and Δ is bilinear in k_I and k_R.
A Conceptual Framework
61
construed as a form of implicit tuning. Two level implicit tuning is proposed in [27] in order to avoid some of the parametric nonlinearities which arise in connection with the multiplicative indirect control design model of Example 6. Single level implicit tuning is studied in [14] in a model reference context in an effort to achieve improved system performance. While there are no doubt many other examples of implicit tuning in the literature, the potential value of the concept does not appear to be widely appreciated. The aim of this section is to explain and illustrate, by means of informal discussion and examples, how the idea of implicit tuning fits into the framework of this paper and why it is likely to prove of importance to parameter adaptive control. In the sequel we briefly sketch how the concept of tunability can be used to analyze an implicitly tuned adaptive system. For this, assume that one has chosen a design model Σ_D, a design function Δ and an internal regulator Σ_R so that for each value of the parameter p = [p_I', p_R']' in the set

E = { [p_I', p_R']' ∈ P_I × P_R : Δ(p_I, p_R) = 0 } ,    (55)
the closed-loop design model Σ_DR depicted in Fig. 4 is at least internally stable. Then, because of the Certainty Equivalence Stabilization Theorem, the closed-loop parameterized system Σ(k) shown in Fig. 3 and described by

ẋ = A(k)x + B(k)r    (56)
e_T = C(k)x    (57)
must be tunable on E. For purposes of analysis define the "extended tuning error"

e_E = [ C(k) ; C_E(k) ] x ,    (58)

where C_E(k)x is a column representation of the elements of the outer product Δ(k_I, k_R)x'; in view of (54),

||C_E(k)|| = √(n_{p_R}) ||e_D|| ,    (59)

where n_{p_R} is the size of p_R. It is known (cf. the Tunability Lemma of [1]) that tunability of Σ on E implies tunability on P = P_I × P_R of the system described by (56)-(57) together with the extended tuning error (58). Because of this and Proposition 1 of [1], we can state the following:

Observation 1: There exists a matrix [H(p), H_E(p)], depending on p as smoothly as C, A and Δ do, for which A(p) + H(p)C(p) + H_E(p)C_E(p) is a stability matrix for each p ∈ P.

In the light of the preceding it is not difficult to see how to modify the analysis at the end of Sect. 1 in order to establish boundedness of x for an implicitly tuned system in which both e_D and a normalized tuning error ē_T are simultaneously small. The following examples, which are of interest in their own right, serve to illustrate how this can be done.
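Observation 1 is, for each frozen p, an output-injection statement: an H can be found so that A + HC (plus the H_E C_E term) has all eigenvalues in the open left half plane. The sketch below is our own illustration, not part of the original text: the 2×2 pair (A, C) is made up, the H_E term is omitted, and H is computed by matching a desired characteristic polynomial.

```python
import numpy as np

# Frozen-parameter output injection: choose H so that A + H C is a stability
# matrix. The observable pair (A, C) below is hypothetical; observability lets
# the eigenvalues of A + H C be assigned freely (dual of state-feedback
# pole placement).
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
C = np.array([[1.0, 0.0]])

# Desired characteristic polynomial (s + 4)(s + 5) = s^2 + 9 s + 20.
# With H = [h1, h2]', A + H C = [[h1, 1], [-2 + h2, -3]], so
#   trace = h1 - 3         = -9  ->  h1 = -6
#   det   = -3 h1 + 2 - h2 = 20  ->  h2 = 0
H = np.array([[-6.0], [0.0]])

eigs = np.linalg.eigvals(A + H @ C)
assert np.allclose(sorted(eigs.real), [-5.0, -4.0])
print("eigenvalues of A + HC:", sorted(eigs.real))
```

In the implicitly tuned system the same construction is applied pointwise in p, with smoothness of H inherited from C, A and Δ.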
Morse
62
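Before turning to the examples, the one level / two level distinction can be made concrete with a toy discrete-time sketch. This is entirely our own construction: the maps e_I and Delta below are hypothetical stand-ins, not the identifier or design function of this paper. A two level tuner drives k_I with the identification error alone and adjusts k_R separately toward solving the design equation, while a one level tuner mixes the design-error gradient into k_I's update.

```python
import numpy as np

# Hypothetical gradient tuners illustrating the one-level / two-level split.
def e_I(kI, w, y):            # identification error, linear in kI (stand-in)
    return w @ kI - y

def Delta(kI, kR):            # design-equation residual e_D = Delta(kI, kR)
    return kR - np.array([kI[0] * kI[1], kI[0] + kI[1]])

def two_level_step(kI, kR, w, y, gam=0.1):
    # two level: kI driven by e_I only; kR driven toward solving Delta = 0
    kI = kI - gam * w * e_I(kI, w, y)
    kR = kR - gam * Delta(kI, kR)
    return kI, kR

def one_level_step(kI, kR, w, y, gam=0.1):
    # one level: kI driven by BOTH e_I and e_D (chain rule on ||e_D||^2 / 2)
    eD = Delta(kI, kR)
    dDelta_dkI = -np.array([[kI[1], kI[0]], [1.0, 1.0]])  # Jacobian w.r.t. kI
    kI = kI - gam * (w * e_I(kI, w, y) + dDelta_dkI.T @ eD)
    kR = kR - gam * eD
    return kI, kR

kI, kR = np.zeros(2), np.zeros(2)
w, q = np.array([1.0, 2.0]), np.array([0.5, 1.0])
y = float(w @ q)
for _ in range(300):
    kI, kR = two_level_step(kI, kR, w, y)
print("two level: e_I =", e_I(kI, w, y), " ||e_D|| =", np.linalg.norm(Delta(kI, kR)))
```

The two level update mirrors a conventional identification algorithm for k_I; the one level update is the kind of coupled, nonstandard tuner the examples below analyze.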
Example 8: Two level implicit tuning - Assume that P = ℝ^{n_I} ⊕ ℝ^{n_R}, that the closed-loop parameterized system (56)-(57) is tunable on E, that C and Δ are continuously differentiable, and that e_T can be re-expressed in the form

e_T = w'(k_I − q) + ε ,    (60)

where w = Lx, L is a constant matrix, q is a constant vector, and ε is a linear combination of decaying exponentials (cf. Remark 2, Sect. 3). Assume, in addition, that Δ(p_I, p_R) has continuously differentiable first partials Δ_{p_I} = ∂Δ/∂p_I and Δ_{p_R} = ∂Δ/∂p_R, and that for each p = [p_I', p_R']' ∈ P the smallest eigenvalue of the matrix Δ'_{p_R} Δ_{p_R} is no smaller than some positive number ρ; i.e.,

Δ'_{p_R} Δ_{p_R} ≥ ρ I ,  ∀ p ∈ P .    (61)

This last assumption is of course restrictive. Consider the simple two level implicit tuning algorithm defined by the equations

k̇_I = −w ē_T ,
k̇_R = −Δ'_{p_R}(k_I, k_R) e_D ,    (62)

where ē_T is the normalized tuning error

ē_T = e_T / (1 + ||w||² + ||Δ_{p_I}(k_I, k_R) w||²) .    (63)
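The behavior of the k_I portion of (62) can be sketched numerically. The following is our own simplified stand-in: a discrete-time Euler version with ε = 0 and the Δ_{p_I} term in the normalization dropped, so ē_T = e_T/(1 + ||w||²); the regressor samples are random, hence persistently exciting, which is why k_I converges to q rather than merely driving ē_T to zero.

```python
import numpy as np

rng = np.random.default_rng(0)
q = np.array([1.5, -0.7])          # unknown "true" parameter (illustrative)
kI = np.zeros(2)
dt = 0.05                          # Euler step for kI' = -w * eT/(1 + ||w||^2)

for _ in range(4000):
    w = rng.standard_normal(2)     # regressor sample (persistently exciting)
    eT = w @ (kI - q)              # tuning error (60) with eps = 0
    eT_bar = eT / (1.0 + w @ w)    # normalized tuning error, cf. (63)
    kI = kI - dt * w * eT_bar      # normalized gradient update, cf. (62)

assert np.linalg.norm(kI - q) < 1e-2
print("kI ->", kI)
```

The normalization keeps each step a strict contraction of the parameter error along w, which is the discrete analog of the L2 properties derived next.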
Let [0, T) denote the maximal interval of existence of the dynamical system defined by (54), (55), (56), (62) and (63). Using these equations it is easy to see that

d/dt [ (1/ρ)||k_I − q||² + ||e_D||² ]
  = −(2/ρ)((1 + ||w||² + ||Δ_{p_I}w||²) ē_T² − ε ē_T) − 2(e_D' Δ_{p_I} w ē_T + ||Δ'_{p_R} e_D||²)
  ≤ −(1/ρ)((2 + 2||w||² + ||Δ_{p_I}w||²) ē_T² − 2ε ē_T) − ρ||e_D||²
  ≤ −(1/ρ)((1 + 2||w||² + ||Δ_{p_I}w||²) ē_T² − ε²) − ρ||e_D||² .

Thus, by rearranging terms and integrating,

(1/ρ)||k_I(t) − q||² + ||e_D(t)||² + ∫₀ᵗ ( (1/ρ)(1 + 2||w||² + ||Δ_{p_I}w||²) ē_T² + ρ||e_D||² ) dτ ≤ c₀ ,

where c₀ = (1/ρ)||k_I(0) − q||² + ||e_D(0)||² + (1/ρ)∫₀^∞ ε² dτ. This shows that k_I and e_D are bounded on [0, T) and that ē_T, w ē_T, k̇_I, Δ_{p_I}w ē_T and e_D are in L2(T), the space of square integrable functions on [0, T). Moreover, since the determinant of Δ_{p_R} is a Jacobian for (53), assumption (61) and the implicit function theorem make it possible to express k_R as a continuous function of k_I and e_D; since the latter are
bounded on [0, T), the former must also be. Note in addition that e_T must be bounded on [0, T) because of (60) and (63). The utility of Observation 1 becomes apparent as we turn to the problem of establishing the boundedness of x on [0, T). As in Sect. 2, what we need to do first is to show that the matrix A_*(t) = A(k(t)) + H(k(t))(C(k(t)) + φ'(t)L) is exponentially stable on [0, T), where [H, H_E] is as described in Observation 1 and φ is the L2(T) function φ = −(I + Δ'_{p_I} Δ_{p_I}) w ē_T. This can be done by first noting that A_* can be rewritten as

A_* = A_{**} − H_E C_E ,    (64)
where A_{**} = A + HC + H_E C_E + Hφ'L. Because of the pointwise stability of A + HC + H_E C_E noted in Observation 1, and the fact that φ and k̇ are in L2(T), it is easy to see that A_{**} is exponentially stable on [0, T) (cf. [1,16,17]). Since (59) and e_D ∈ L2(T) imply that H_E C_E ∈ L2(T), the same reasoning can be used again to conclude that A_* as expressed in (64) is exponentially stable on [0, T). From this point on, the logic is exactly as in Sect. 2: that is, using the expression ē_T = (C(k) + φ'L)x, which is a consequence of (62) and (63), the differential equation for x in (56) can be rewritten as ẋ = A_* x + Br − H ē_T. From this and the boundedness of r and ē_T it follows that x is bounded on [0, T); and this together with the boundedness of k and the definition of T imply that T = ∞ and thus that (x, k) is bounded on [0, ∞). ◇

Example 9: Single level implicit tuning - Let Σ_D denote the two parameter indirect control design model

ż_D1 = −z_D2 + p1 u_D
ż_D2 = z_D1 − (p1 + p2) z_D2 + u_D    (65)
y_D = z_D2
defined on the parameter space P_I = ℝ². Note that Σ_D's transfer function is (s + p1)/(s² + (p1 + p2)s + 1) and that a pole-zero cancellation occurs whenever p1p2 − 1 = 0. Define p_I = [p1, p2]' and f(p_I) = (p1 + p2 − 1)[−p1, p1(p1 + p2) − 1]. A straightforward calculation shows that the state feedback law u_D = (1/(p1p2 − 1)) f(p_I) z_D assigns to (65) the closed-loop characteristic polynomial s² + s + 1 for every value of p_I ∈ ℝ² for which p1p2 − 1 does not vanish. This implies that with u_D redefined to be

u_D = p3 f(p_I) z_D ,    (66)
the closed-loop characteristic polynomial of (65) will be s² + s + 1 along the algebraic surface in ℝ³ on which p3(p1p2 − 1) − 1 = 0. While it is tempting to define Δ to be p3(p1p2 − 1) − 1, this leads to a difficult tuner design problem. In order to avoid this difficulty, we shall "lift" the problem to ℝ⁴ by defining

Δ(p_I, p_R) = [ 1  −p2 ; −p1  1 ] [ p3 ; p4 ] + [ 1 ; 0 ] ,    (67)
where p_R = [p3, p4]'. Since p3 = 1/(p1p2 − 1) for every value of p = [p_I', p_R']' in the set

E = { [p_I', p_R']' : Δ(p_I, p_R) = 0 } ,    (68)

the feedback law (66) stabilizes design model (65) at each point p ∈ E. Let Σ_I(k_I) be an identifier for the design model (65), as described in Example 6. Then e_I can be written as

e_I = w2(1 − k1 − k2) + w3 k1 − y ,    (69)

where [w1, w2, w3, w4]' = (block diag [Q, Q])' x_I, x_I is the identifier's state, and Q is as in Example 6. In accordance with certainty equivalence, define the internal regulator state feedback law

u = k3 f(k_I) x̄_D ,    (70)
where x̄_D is the identifier's surrogate state defined by (23). Let

ẋ = A(k)x ,  e_T = C(k)x    (71)

denote the closed-loop parameterized system Σ(k) consisting of any fixed SISO process model Σ_P, the identifier Σ_I, the tuning error e_T = e_I and the certainty equivalence controller (70). Since the closed-loop design model Σ_DR(p) described by (65)-(66) is stable for each p ∈ E, by the Certainty Equivalence Stabilization Theorem Σ(k) is tunable on E. Now assume that Σ_P has a transfer function of the form (s + q1)/(s² + (q1 + q2)s + 1), where q = [q1, q2]' is in the set R− = {p_I : p1p2 − 1 > 0, p1 < 0, p2 < 0}. Note that Σ_P is nonminimum phase and that R− is an open set bounded by the algebraic surface on which the design model (65) loses controllability. In view of (69), the definition of e_T and Remark 2 in Sect. 3, this assumption enables us to write e_T in the familiar form

e_T = w̄'(k_I − q) + ε ,    (72)
where w̄ = [w3 − w2, −w2]' and ε is a linear combination of decaying exponentials. The assumed structure of Σ_P's transfer function also allows the design error

e_D = Δ(k_I, k_R)    (73)

to be rewritten in terms of the parameter differences k2 − q2 and k4 − q4 ,    (74)

where [q3, q4] = [0, −1](Q_P⁻¹)'. Note that Q_P is positive definite for each q ∈ R−. This proves to be a consequence of R− being convex, a requirement which at present appears to preclude the generalization of these particular ideas.
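The pole placement calculation at the heart of this example is easy to check numerically. The sketch below is our own: it builds the state-space realization of (65), applies u_D = (1/(p1p2 − 1)) f(p_I) z_D, and confirms the closed-loop characteristic polynomial s² + s + 1 for several sample values of p_I away from the cancellation surface p1p2 = 1.

```python
import numpy as np

def closed_loop_charpoly(p1, p2):
    # Realization of the design model (65):
    #   zdot = A z + B u, y = [0 1] z, transfer function (s+p1)/(s^2+(p1+p2)s+1)
    A = np.array([[0.0, -1.0], [1.0, -(p1 + p2)]])
    B = np.array([[p1], [1.0]])
    # Feedback gain f(p_I)/(p1 p2 - 1) from the pole placement calculation
    f = (p1 + p2 - 1.0) * np.array([[-p1, p1 * (p1 + p2) - 1.0]])
    K = f / (p1 * p2 - 1.0)
    Acl = A + B @ K
    return np.poly(Acl)          # coefficients of det(sI - Acl), monic

for p1, p2 in [(2.0, 1.0), (-3.0, -0.5), (0.5, 4.0)]:
    coeffs = closed_loop_charpoly(p1, p2)
    assert np.allclose(coeffs, [1.0, 1.0, 1.0]), (p1, p2, coeffs)
print("closed-loop characteristic polynomial is s^2 + s + 1 in all cases")
```

The second test point lies in R−, the parameter region assumed for the process model in this example.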
The structure of e_T in (72) and e_D in (74) provides plenty of motivation for using a single level implicit tuner of the form

k̇_I = −w̄ ē_T − k_R ⊙ e_D ,
k̇_R = −e_D ,

where w̄ = [w3 − w2, −w2]', k_R ⊙ e_D ≜ [k3 e_D1, k4 e_D2]' and ē_T is the normalized tuning error

ē_T = e_T / (1 + ||w̄||²) .
Note that, unlike Example 8, k_I is being adjusted here to make both e_T and e_D small. We leave it to the interested reader to verify that on the overall system's maximal interval of existence [0, T), k, e_D and ē_T are bounded and e_D, ē_T and k̇ are in L2(T); moreover, just as in Example 8, the equation for x in (71) can be rewritten as ẋ = A_*(t)x − H ē_T, where H is as in Observation 1, A_* is the exponentially stable matrix A_*(t) = A(k(t)) + H(k(t))(C(k(t)) + φ'L), φ = k̇_I + k_R ⊙ e_D, and L is the constant matrix for which Lx = w̄. All this of course implies, just as before, that the entire system state (x, k) exists and is bounded on [0, ∞). As noted previously, the successful development of a stabilizing tuner for this example relies on q being in a convex subset, namely R−, whose boundary is determined by the algebraic surface on which the underlying design model loses controllability. Examination of this surface reveals that there is another subset, R+ = {p_I : p1p2 − 1 > 0, p1 > 0, p2 > 0}, with the same properties. It is easy to show that, by simply changing the signs of the diagonal terms in the square matrix in (67) which defines Δ, one can obtain a stabilizing adaptive control for the case when q ∈ R+. And to handle the case when q might be in either R+ or R−, one can use hysteresis switching, as is demonstrated in [28]. From the preceding it is clear that it is the detailed geometry of the algebraic surface on which an indirect control design model loses controllability which is of importance to adaptive control.
This example thus shows that if we can learn how to implicitly tune on each of the open subsets of the parameter space bounded by this surface, whether or not they are convex, then we will finally have a reasonably complete solution, not relying on persistent excitation, to the long-standing adaptive pole placement problem for SISO linear systems. ◇

Remark 5: In the preceding examples, the problem of making both e_T and e_D simultaneously small is approached by exploiting the specific algebraic forms of both e_T and e_D viewed as functions of k_I and k_R. An alternative and highly innovative approach to this problem, which does not make use of the algebraic structure of e_D, has been proposed in [29] for discrete-time systems. The idea can be explained as follows. Assume that Σ_D is the discrete-time version of the indirect control SISO design model of Example 6 and that Σ_I is its associated discrete-time identifier. Using an interesting idea due to Elliot [30], it is possible to construct a discrete-time internal regulator Σ_R of the same form as (31), a design function Δ(p_I, p_R) and a row vector c_C(p_I, p_R) depending linearly on p_R, with the following properties:
1. For any SISO process model Σ_P, the closed-loop parameterized system consisting of Σ_P, Σ_I, Σ_R and the tuning error e_T = e_I is tunable on E = { [p_I', p_R']' : Δ(p_I, p_R) = 0 }.
2. With u_D defined by the discrete time analog of (32) and T(s, p_I, p_R) the closed-loop design model transfer function from r to c_C(p_I, p_R)[z_D', z_R']',

T(s, p_I, p_R) ≡ 0  ⟺  [p_I', p_R']' ∈ E .

Assume that standard gradient-like tuning algorithms, driven by normalized versions of the tuning error e_T and the "control error" e_C = c_C(k_I, k_R)[x̄_D', x̄_R']', are used to adjust k_I and k_R respectively. Since e_T and e_C depend linearly on k_I and k_R respectively, in the absence of unmodeled dynamics these algorithms will cause e_T and e_C to tend to zero. Property 2 above insures that as this occurs, [k_I', k_R']' will approach E (or, equivalently, e_D will approach zero), provided [x̄_D', x̄_R']' is "sufficiently rich". The central contribution of [29] has been to show that [x̄_D', x̄_R']' can be made to have this property by setting r equal to a suitably defined persistently exciting probing signal which is internally generated by an algorithm driven by the norm of e_D! With the algorithm of [29] viewed from this perspective, it is not difficult to think of alternative algorithms of the same type for solving the problem under consideration. For example, since with the Elliot parameterization Δ also turns out to be linear in p_R, one might consider adjusting k_R with a gradient-like tuning algorithm driven by e_D rather than e_C. To carry out a stability analysis in this case, it would have to be shown that r is persistently exciting enough to insure that k_I is eventually driven into a region of the parameter space in which the Jacobian matrix Δ_{p_R}(k_I, k_R) of Δ remains nonsingular. If this can be accomplished, and it is likely that it can be, then it would be easy to show that such an overall tuner would cause both e_T and e_D to go to zero, which, because of property 1 above, is all that is required for internal stability. ◇

Concluding Remarks
The purpose of this paper has been to discuss some of the reasons why it is especially useful to think of a parameter adaptive control system as the feedback interconnection of a process Σ_P and a parameterized controller Σ(k) whose parameter vector k is adjusted by a tuner Σ_T. The configuration is general enough to describe many different kinds of adaptive systems, including those of the model reference, self-tuning and high-gain feedback types. While error models are not used in this setting, special emphasis is placed on the importance of a tuning error in the characterization of an adaptive system. This leads naturally to the concept of tunability - an idea of particular utility to adaptive control, as explained in Section 2. The central conclusion to be drawn from the discussion in Section 3 is that, subject to mild conditions, certainty equivalence implies tunability. Although the section's stated results hold for a large class of continuous and discrete-time
multivariable adaptive control systems, as presented they are not as general as they might be. For example, there are a large number of alternative identifier structures which could be substituted for (21)-(23) without changing the validity of the section's two theorems. Among these are the identifier implicitly used in [3], identifiers in which e_I is the output of a stationary linear system with a positive real transfer matrix whose input is of the form C_I(k_I)x_I − y, and identifiers of the "equation error" [31] type in which x_I is generated by an equation of the form ẋ_I = (A_I + D_I C_I(k_I))x_I + B_I u. In addition, there is good reason to believe that these results can be extended to slowly time-varying linear systems and even to some restricted classes of nonlinear systems, using, for example, ideas along the lines of [32]. In other words, the observation that certainty equivalence implies tunability is probably as universal as the validity of the idea of certainty equivalence itself. The aims of Section 4 are to call attention to the concept of implicit tuning and to explain how the ideas of Section 3 might be used in the analysis of an implicitly tuned system. The section's main message is that many meaningful adaptive control problems can be reduced to nonstandard, but highly structured, identification/tuner design problems in which the concepts of implicit tuning, certainty equivalence and tunability play prominent roles. Examples 8 and 9 describe two different implicit tuning methodologies. Each appears to have potential for dealing with the loss of controllability/stabilizability problem mentioned earlier, without having to resort to ad hoc methods. Clearly what is needed are new tuning or identification algorithms capable of handling the somewhat unusual combination of error equations which arise from implicit parameterizations.
In most cases the tuning error equation will be linear in k_I; to achieve this may sometimes require a reparameterization along the lines of [26]. Usually Δ will be a continuous algebraic or polynomial function with the property that, for almost every value of p_I, the design equation Δ(p_I, p_R) = 0 has a unique solution p_R; moreover, Δ will typically be of a form for which there is at least one well-known recursive algorithm for computing p_R whenever it exists. With this much known, it would seem that significant advances in this direction in adaptive control are inevitable.
References
1. A. S. Morse, "Towards a unified theory of parameter adaptive control - tunability," IEEE Trans. Aut. Control, vol. AC-35, pp. 1002-1012, Sept. 1990.
2. A. S. Morse, "Towards a unified theory of parameter adaptive control - Part 2: certainty equivalence and implicit tuning," IEEE Trans. Aut. Control, to appear.
3. A. Feuer and A. S. Morse, "Adaptive control of single-input single-output linear systems," IEEE Trans. Aut. Control, vol. AC-23, pp. 557-569, Aug. 1978.
4. A. S. Morse, "Global stability of parameter-adaptive control systems," IEEE Trans. Aut. Control, vol. AC-25, pp. 433-439, June 1980.
5. G. C. Goodwin and K. S. Sin, Adaptive Filtering Prediction and Control, Prentice-Hall, Inc., Englewood Cliffs, NJ, 1984.
6. K. J. Astrom and B. Wittenmark, Adaptive Control, Addison-Wesley Publishing Co., 1989.
7. K. S. Narendra and A. M. Annaswamy, Stable Adaptive Systems, Prentice-Hall, Inc., Englewood Cliffs, NJ, 1989.
8. S. S. Sastry and M. Bodson, Adaptive Control: Stability, Convergence and Robustness, Prentice-Hall, Inc., Englewood Cliffs, NJ, 1989.
9. R. D. Nussbaum, "Some remarks on a conjecture in parameter adaptive control," Syst. Control Lett., vol. 3, pp. 243-246, 1983.
10. B. Martensson, "The order of any stabilizing regulator is sufficient a priori information for adaptive stabilization," Syst. Control Lett., vol. 6, pp. 87-91, 1985.
11. A. S. Morse, "A three dimensional universal controller for the adaptive stabilization of any strictly proper minimum-phase system with relative degree not exceeding two," IEEE Trans. Aut. Control, vol. AC-30, pp. 1188-1191, Dec. 1985.
12. J. C. Willems and C. I. Byrnes, "Global adaptive stabilization in the absence of information on the sign of the high-frequency gain," Lecture Notes in Control and Information Sciences, Proc. 6th Int. Conf. on Analysis and Optimization of Systems, Nice, June 1984, pp. 49-57.
13. D. E. Miller, Adaptive Control of Uncertain Systems, Doctoral Thesis, University of Toronto, Oct. 1989.
14. M. A. Duarte and K. S. Narendra, "Combined direct and indirect approach to adaptive control," Center for Systems Science, Yale University, Report No. 8711, Sept. 1987.
15. J. K. Hale, Ordinary Differential Equations, Wiley-Interscience, 1969.
16. G. Kreisselmeier, "An approach to stable indirect adaptive control," Automatica, vol. 21, pp. 425-431, July 1985.
17. R. H. Middleton, G. C. Goodwin, D. J. Hill and D. Q. Mayne, "Design issues in adaptive control," IEEE Trans. Aut. Control, vol. 33, pp. 50-58, Jan. 1988.
18. F. Pait and A. S. Morse, "A smoothly parameterized family of stabilizable, observable linear systems containing realizations of all transfer functions of McMillan degree not exceeding n," IEEE Trans. Aut. Control, to appear.
19. P. Ioannou and J.
Sun, "Theory and design of robust direct and indirect adaptive control schemes," Int. J. Control, vol. 47, pp. 775-813, 1988.
20. B. Egardt, Stability of Adaptive Controllers, Springer-Verlag, Lecture Notes in Control and Information Sciences, vol. 20, 1979.
21. L. Praly, "Global stability of direct adaptive control schemes with respect to a group topology," Adaptive and Learning Systems, Plenum Press, 1986, pp. 1009-1014.
22. B. D. O. Anderson and R. M. Johnstone, "Global adaptive pole positioning," IEEE Trans. Aut. Control, vol. AC-30, pp. 11-22, Jan. 1985.
23. A. S. Morse, "Structural invariants of linear multivariable systems," SIAM J. Control, vol. 11, pp. 446-465, Aug. 1973.
24. R. V. Monopoli, "Model reference adaptive control with an augmented error signal," IEEE Trans. Aut. Control, vol. AC-19, pp. 474-484, Oct. 1974.
25. L. Praly, "Towards a globally stable direct adaptive control scheme for not necessarily minimum phase systems," IEEE Trans. Aut. Control, vol. AC-29, pp. 946-949, Oct. 1984.
26. S. Dasgupta and B. D. O. Anderson, "Physically based parameterizations for the design of adaptive controllers," Dept. of Systems Eng. Report, Australian National University, Canberra, 1986.
27. A. S. Morse, "Indirect adaptive control of processes satisfying the classical assumptions of direct adaptive control," Proc. 1988 Amer. Control Conf., Atlanta, GA.
28. A. S. Morse, D. Q. Mayne and G. C. Goodwin, "Identifier-based, switched-parameter algorithms for the adaptive stabilization of linear systems," Proc. Sixth Yale Workshop on Adaptive and Learning Systems, New Haven, Aug. 1990.
29. G. Kreisselmeier and M. C. Smith, "Stable adaptive regulation of arbitrary nth-order plants," IEEE Trans. Aut. Control, vol. AC-31, pp. 299-305, April 1986.
30. H. Elliot, "Direct adaptive pole placement with application to nonminimum phase systems," IEEE Trans. Aut. Control, vol. AC-27, pp. 720-722, June 1982.
31. C. R. Johnson, Jr., Lectures on Adaptive Parameter Estimation, Prentice-Hall, Inc., 1987.
32. G. Bastin and M. R. Gevers, "Stable adaptive observers for nonlinear time-varying systems," IEEE Trans. Aut. Control, vol. AC-33, pp. 650-658, July 1988.
33. A. S. Morse and W. M. Wonham, "Status of noninteracting control," IEEE Trans. Aut. Control, vol. AC-16, pp. 568-581, Dec. 1971.
Robust Adaptive Control: Design, Analysis and Robustness Bounds* Petros Ioannou and Aniruddha Datta Department of Electrical Engineering-Systems University of Southern California Los Angeles, CA 90089-0781, USA.
Abstract. Despite the publication of numerous papers and several books dealing with the design, analysis and robustness properties of adaptive control, the theory of adaptive control may appear too technical and sometimes confusing due to the many different approaches, modifications and stability tools employed by various researchers in the field. The aim of this paper is to alleviate some of this confusion by presenting a procedure for designing and analyzing adaptive schemes. This procedure not only unifies most of the modifications used for stability and robustness but also clarifies why different approaches lead to the same result. The general framework that we develop can be used not only to analyze and unify most of the adaptive control schemes proposed in the literature but also to generate new ones. In addition, the analytical tools used allow the calculation of robustness margins and error bounds in a straightforward manner.
1 Introduction
The problem of designing stable adaptive controllers for a linear time-invariant system with no modeling errors was resolved as far back as 1980 [1-5]. Subsequently it was demonstrated [5-7] that in the presence of modeling errors and disturbances, an adaptive controller designed for the ideal situation, i.e., no modeling errors or disturbances, could exhibit instability. Since then, for almost a decade, a considerable amount of effort has been directed towards the development of so-called "robust adaptive control schemes", i.e., adaptive control schemes which can retain certain stability properties in the presence of a wide class of modeling errors [8-19]. Due to these efforts, the 1980s have witnessed the development of a large number of robustness results in adaptive control. These results employ different robustness modifications and different methods of analysis, and appear to be substantially different. However, in [13] it was shown that there does exist an underlying theory which could be used to unify within a single framework some of the robust adaptive control schemes presented in the literature. Furthermore, the results in [13] revealed that the key to a unified analysis of robust adaptive

* This work is supported in part by the National Science Foundation under Grant DMC-8452002 and in part by the General Motors Foundation.
72
Ioannou and Datta
control schemes lies in first analyzing the properties of the robust adaptive law independent of the controller structure to be used, and then combining the two together using the Certainty Equivalence Approach [20]. As far as the analysis of the adaptive laws is concerned, the ones derived in [13] are all based on gradient error minimization techniques. Indeed, such adaptive laws have the useful property that the stability analysis in the ideal case, i.e., in the absence of unmodeled dynamics and disturbances, extends, after some modifications, to the case where unmodeled dynamics and disturbances are present. Thus, it is not surprising that most of the robustness results to be found in the adaptive control literature use gradient or Newton's techniques for the derivation of the adaptive law. Nevertheless, it is well known that some of the first stability results in continuous-time adaptive control [1,2,3] were established using a positivity and Lyapunov-type synthesis approach which led to the emergence of the celebrated Strictly Positive Real (SPR) condition in adaptive control. The SPR condition was further exploited in [15] for the local analysis of adaptive systems using averaging techniques. In this paper we show that robust adaptive laws can be generated using not only gradient and Newton's techniques, but also the positivity and Lyapunov-type synthesis approach of [2], in a unified manner. One principal ingredient of this unification is the definition of the estimation error (known as the prediction error in discrete-time algorithms) and the normalized estimation error, which clarifies the role of the artificial "augmented error" and "auxiliary" signals used in [2]. The robust adaptive laws developed in this paper can be combined, using the Certainty Equivalence Approach, with robust controller structures to yield robust adaptive control schemes.
In particular, we consider the design and analysis of adaptive control schemes obtained from model reference, pole placement and linear quadratic (LQ) control structures, which are some of the most popular control structures in the adaptive control literature. The paper is organized as follows: Section 2 contains some mathematical preliminaries. In Section 3 we develop the theory for the design and analysis of adaptive laws. Section 4 contains a discussion of commonly used adaptive controller structures and their associated robustness properties. In particular, we consider and analyze a model reference, a pole placement and a linear quadratic controller structure. In Section 5 we design and analyze robust model reference, pole placement and linear quadratic adaptive controllers by combining the appropriate controller structures of the previous section with any of the robust adaptive laws of Section 3. We illustrate the design procedure and analysis using simple examples. In Section 6, we summarize our conclusions and outline the directions for future research.
2 Mathematical Preliminaries
In this section, we give some definitions and lemmas which will be used in the subsequent sections.
Robust Adaptive Control
73
Definition 2.1 For any signal x : [0, ∞) → ℝⁿ, x_t denotes the truncation of x to the interval [0, t] and is defined as

x_t(τ) = x(τ) if τ ≤ t, and x_t(τ) = 0 otherwise.    (2.1)

Definition 2.2 For any signal x : [0, ∞) → ℝⁿ, and for any δ > 0, t ≥ 0, ||x_t||₂^δ is defined as

||x_t||₂^δ = ( ∫₀ᵗ e^{−δ(t−τ)} x'(τ)x(τ) dτ )^{1/2} ,    (2.2)

provided that the integral in (2.2) exists.
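Numerically, the exponentially weighted norm of Definition 2.2 can be approximated by quadrature. The sketch below is our own; it checks the midpoint-rule approximation against the closed form ((1 − e^{−δt})/δ)^{1/2} obtained for the constant scalar signal x ≡ 1.

```python
import numpy as np

def weighted_l2_norm(x, t, delta, n=200000):
    # midpoint-rule approximation of
    #   ||x_t||_2^delta = ( int_0^t e^{-delta (t - tau)} x(tau)' x(tau) dtau )^(1/2)
    dtau = t / n
    tau = (np.arange(n) + 0.5) * dtau
    vals = np.exp(-delta * (t - tau)) * x(tau) ** 2
    return np.sqrt(np.sum(vals) * dtau)

t, delta = 3.0, 0.5
num = weighted_l2_norm(lambda s: np.ones_like(s), t, delta)
exact = np.sqrt((1.0 - np.exp(-delta * t)) / delta)
assert abs(num - exact) < 1e-6
print(round(num, 6), round(exact, 6))
```

The exponential weight discounts the distant past, which is what makes this norm useful for the exponentially stable operators appearing in the lemmas below.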
Remark 2.1. Clearly, ||(·)_t||₂^δ represents the exponentially weighted L2 norm of the signal truncated to [0, t]. When δ = 0 and t = ∞, ||(·)_t||₂^δ becomes the usual L2 norm and will be denoted by ||·||₂. It can be shown that ||·||₂^δ satisfies the usual properties of a vector norm. □

Definition 2.3 Let H(s) be a stable m × n proper transfer function matrix. Then,

||H(s)||_∞ ≜ ess sup_ω σ̄[H(jω)] ,    (2.3)

where σ̄ denotes the largest singular value.
||·||_∞ is the so-called H∞-norm widely used in robust control [22].

Definition 2.4 Let H(s) be a proper transfer function matrix analytic in Re[s] ≥ −δ/2. Then,

||H(s)||_∞^δ ≜ ||H(s − δ/2)||_∞ .    (2.4)
Definition 2.5 Consider the signals x : [0, ∞) → ℝⁿ, y : [0, ∞) → ℝ⁺ and the set

S(y) = { x : [0, ∞) → ℝⁿ | ∫_t^{t+T} x'(τ)x(τ) dτ ≤ c₀ ∫_t^{t+T} y(τ) dτ + c₁ for some c₀, c₁ ≥ 0 and ∀ t, T ≥ 0 } .

We say that x is y-small in the mean if x ∈ S(y).

The following lemmas are key lemmas in the stability and robustness analysis of the adaptive control schemes to be presented in Sect. 5.

Lemma 2.1 Consider the following homogeneous linear time-varying system:

ẋ = A(t)x .    (2.5)
Let the elements of A(t) in (2.5) be differentiable bounded functions of time and assume that

[A1]  Re{λ_i(A(t))} ≤ −σ_s ,  ∀ t ≥ 0,  i = 1, 2, …, n,

where σ_s > 0 is some constant. If any one of the following conditions

(i)  ∫_t^{t+T} ||Ȧ(τ)|| dτ ≤ μT + α₀ ,
(ii) ∫_t^{t+T} ||Ȧ(τ)||² dτ ≤ μ²T + α₀ ,
(iii) ||Ȧ(t)|| ≤ μ ,

where α₀ ∈ ℝ⁺ and μ > 0, is satisfied ∀ t ≥ 0 and some T > 0, then there exists a μ* > 0 such that ∀ μ ∈ [0, μ*) the equilibrium state x_e = 0 of (2.5) is u.a.s. in the large.

Proof. From [A1] it follows that the Lyapunov equation
A'(t)P(t) + P(t)A(t) = −I    (2.6)

has a unique bounded solution P(t) for each t. We consider the following Lyapunov function: V(t, x) = x'P(t)x. Then, along the solutions of (2.5), we have

V̇ = −||x(t)||² + x'(t)Ṗ(t)x(t) .    (2.7)

From (2.6), Ṗ satisfies

A'(t)Ṗ(t) + Ṗ(t)A(t) = −Q(t) ,  ∀ t ≥ 0,    (2.8)

where Q(t) = Ȧ'(t)P(t) + P(t)Ȧ(t). Due to [A1], it can be verified [23] that

Ṗ(t) = ∫₀^∞ e^{A'(t)τ} Q(t) e^{A(t)τ} dτ

satisfies (2.8) for each t ≥ 0. Therefore,

||Ṗ(t)|| ≤ ∫₀^∞ ||e^{A'(t)τ}|| ||Q(t)|| ||e^{A(t)τ}|| dτ .

Since [A1] implies that ||e^{A(t)τ}|| ≤ α₁ e^{−σ₀τ} for some α₁, σ₀ > 0, it follows that ||Ṗ(t)|| ≤ c ||Q(t)|| for some c ≥ 0. Then ||Q(t)|| ≤ 2||P(t)|| ||Ȧ(t)|| together with P ∈ L∞ imply that

||Ṗ(t)|| ≤ β ||Ȧ(t)||    (2.9)

for some constant β ≥ 0. Using (2.9) in (2.7) and noting that P satisfies 0 < β₁ ≤ λ_min(P) ≤ λ_max(P) ≤ β₂ for some β₁, β₂ > 0, we have

V̇ ≤ −β₂⁻¹ V + β β₁⁻¹ ||Ȧ(t)|| V .    (2.10)
Therefore,

V(t) ≤ V(t₀) exp{ −β₂⁻¹(t − t₀) + β β₁⁻¹ ∫_{t₀}^t ||Ȧ(τ)|| dτ } .

Using condition (i) in (2.10), we have

V(t) ≤ V(t₀) exp{ −(β₂⁻¹ − β β₁⁻¹ μ)(t − t₀) } exp{ β β₁⁻¹ α₀ } .

Therefore, for μ* = β₁/(β β₂) and ∀ μ ∈ [0, μ*), V(t) → 0 exponentially fast as t → ∞, which implies that x_e = 0 is u.a.s. in the large. In order to use condition (ii), we bound the integral in the exponent using the Schwartz inequality:

∫_{t₀}^t ||Ȧ(τ)|| dτ ≤ [ μ²(t − t₀)² + α₀(t − t₀) ]^{1/2} ≤ μ(t − t₀) + √α₀ √(t − t₀) .

Therefore,

V(t) ≤ V(t₀) exp{ −α(t − t₀) } y(t) ,  where α = ½ β₂⁻¹ − β β₁⁻¹ μ and

y(t) = exp[ −½ β₂⁻¹ (t − t₀) + β β₁⁻¹ √α₀ √(t − t₀) ] .

It can be shown that

y(t) ≤ exp[ β² β₂ α₀ / (2 β₁²) ] ≜ c ,  ∀ t ≥ t₀ .

Hence,

V(t) ≤ c V(t₀) exp{ −α(t − t₀) } .

Choosing μ* = β₁/(2 β β₂), we have ∀ μ ∈ [0, μ*) that α > 0 and, therefore, V(t) → 0 exponentially fast as t → ∞, which implies that x_e = 0 is u.a.s. in the large. Since condition (iii) implies condition (i), the proof of part (iii) follows directly from that of part (i). □

The proof of Lemma 2.1 can also be found in [13]; it is a byproduct of results on slowly time-varying systems given in [24].
Ioannou and Datta
Lemma 2.2 Let $z = H(s)[u] = h * u$, where $H(s)$ is a proper transfer matrix and $h(t)$ is the corresponding impulse response. If $H(s)$ is analytic in $\mathrm{Re}[s] \ge -\delta/2$ for some $\delta \ge 0$ and $u \in L_{2e}$, then
$$\|z_t\|_2^\delta \le \|H(s)\|_\infty^\delta\,\|u_t\|_2^\delta. \qquad (2.11)$$
If, in addition, $H(s)$ is strictly proper, then
$$\|z(t)\| \le \frac{1}{\sqrt{p}}\,\|(s+p)H(s)\|_\infty^\delta\,\|u_t\|_2^\delta, \qquad (2.12)$$
where $p \ge \delta$ is an arbitrary constant. The proof of Lemma 2.2 follows from the results in [17,18] and is given below.
Proof. The transfer function $H(s)$ can be expressed as $H(s) = d + H_a(s)$, with
$$h(t) = \begin{cases} 0, & t < 0 \\ d\,\delta(t) + h_a(t), & t \ge 0. \end{cases}$$
If we define
$$h_\delta(t) = \begin{cases} 0, & t < 0 \\ d\,\delta(t) + e^{\frac{\delta}{2}t}h_a(t), & t \ge 0 \end{cases}$$
and $u_\delta(t) = e^{\frac{\delta}{2}t}u(t)$, then
$$z_\delta(t) = \int_0^t e^{\frac{\delta}{2}(t-\tau)}h(t-\tau)\,e^{\frac{\delta}{2}\tau}u(\tau)\,d\tau = h_\delta * u_\delta,$$
where $z_\delta(t) \triangleq e^{\frac{\delta}{2}t}z(t)$. Now $u \in L_{2e} \Rightarrow u_\delta \in L_{2e}$. Therefore, for the truncated signals $z_{\delta t}$, $u_{\delta t}$ at time $t$, we have
$$\|z_{\delta t}\|_2 \le \|H(s-\delta/2)\|_\infty\,\|u_{\delta t}\|_2. \qquad (2.13)$$
Since $e^{-\frac{\delta}{2}t}\|z_{\delta t}\|_2 = \|z_t\|_2^\delta$ and $e^{-\frac{\delta}{2}t}\|u_{\delta t}\|_2 = \|u_t\|_2^\delta$, (2.11) follows directly from (2.13).
If $H(s)$ is strictly proper, it can be written as
$$H(s) = \frac{1}{s+p}\,H_p(s)$$
for some $p > 0$, where $H_p(s) \triangleq (s+p)H(s)$ is a proper transfer function. Defining $\bar u(t) = H_p(s)[u(t)]$, we have $z(t) = \frac{1}{s+p}[\bar u(t)]$. Then,
$$\|z(t)\| \le \int_0^t e^{-p(t-\tau)}\|\bar u(\tau)\|\,d\tau \le \left(\int_0^t e^{-p(t-\tau)}d\tau\right)^{\frac{1}{2}}\left(\int_0^t e^{-p(t-\tau)}\|\bar u(\tau)\|^2\,d\tau\right)^{\frac{1}{2}},$$
where the last inequality is obtained by using the Schwarz inequality. Since $(1-e^{-pt})/p \le 1/p$ and $p \ge \delta$, we have
$$\|z(t)\| \le \frac{1}{\sqrt p}\left(\int_0^t e^{-\delta(t-\tau)}\|\bar u(\tau)\|^2\,d\tau\right)^{\frac{1}{2}} = \frac{1}{\sqrt p}\,\|\bar u_t\|_2^\delta,$$
which, together with
$$\|\bar u_t\|_2^\delta \le \|H_p(s-\delta/2)\|_\infty\,\|u_t\|_2^\delta = \|(s+p-\delta/2)H(s-\delta/2)\|_\infty\,\|u_t\|_2^\delta,$$
implies (2.12). □
Lemma 2.3 Consider the linear time-varying system given by
$$\dot x = A(t)x + B(t)u, \quad x(0) = x_0, \qquad (2.14)$$
where $x \in \mathbb{R}^n$, $u \in \mathbb{R}^m$, the matrices $A, B \in L_\infty$, and their elements are continuous functions of time. If the state transition matrix $\Phi(t,\tau)$ of (2.14) satisfies
$$\|\Phi(t,\tau)\| \le \lambda_0 e^{-\delta_0(t-\tau)} \qquad (2.15)$$
for some $\lambda_0, \delta_0 > 0$, and $u(t)$ is continuous, then for any $\delta \in [0,\delta_0)$ we have
(i) $\|x(t)\| \le \frac{\lambda}{\sqrt{\delta_0}}\,\|u_t\|_2^\delta + \epsilon_t$, and
(ii) $\|x_t\|_2^\delta \le \frac{\lambda}{\sqrt{\delta_0(\delta_0-\delta)}}\,\|u_t\|_2^\delta + \epsilon_t$,
where $\epsilon_t$ is an exponentially decaying to zero term due to $x_0$, and $\lambda = \lambda_0 c$, where $c$ is the $L_\infty$ bound for $B(t)$.
Proof. The solution $x(t)$ of (2.14) can be expressed as
$$x(t) = \Phi(t,0)x_0 + \int_0^t \Phi(t,\tau)B(\tau)u(\tau)\,d\tau.$$
Therefore,
$$\|x(t)\| \le \|\Phi(t,0)\|\,\|x_0\| + \int_0^t \|\Phi(t,\tau)\|\,\|B(\tau)u(\tau)\|\,d\tau.$$
Using (2.15), we have
$$\|x(t)\| \le \epsilon_t + \lambda\int_0^t e^{-\delta_0(t-\tau)}\|u(\tau)\|\,d\tau, \qquad (2.16)$$
where $c$ and $\lambda$ are as defined in the statement of the lemma. Applying the Schwarz inequality and using the fact that $0 \le \delta < \delta_0$, we have
$$\|x(t)\| \le \epsilon_t + \lambda\left(\int_0^t e^{-\delta_0(t-\tau)}d\tau\right)^{\frac{1}{2}}\left(\int_0^t e^{-\delta(t-\tau)}\|u(\tau)\|^2\,d\tau\right)^{\frac{1}{2}} \le \epsilon_t + \frac{\lambda}{\sqrt{\delta_0}}\|u_t\|_2^\delta,$$
which completes the proof of (i). Using the triangle inequality for the $\|\cdot\|_2^\delta$-norm, it follows from (2.16) that
$$\|x_t\|_2^\delta \le \|\epsilon_t\|_2^\delta + \lambda\,\|\bar u_t\|_2^\delta, \qquad (2.17)$$
where
$$\|\bar u_t\|_2^\delta = \left(\int_0^t e^{-\delta(t-s)}\left(\int_0^s e^{-\delta_0(s-\tau)}\|u(\tau)\|\,d\tau\right)^2 ds\right)^{\frac{1}{2}}.$$
Using the Schwarz inequality, we have
$$\|\bar u_t\|_2^\delta \le \left(\frac{1}{\delta_0}\int_0^t e^{-\delta(t-s)}\int_0^s e^{-\delta_0(s-\tau)}\|u(\tau)\|^2\,d\tau\,ds\right)^{\frac{1}{2}},$$
where the last inequality is obtained from the use of $\int_0^s e^{-\delta_0(s-\tau)}d\tau \le 1/\delta_0$. We now change the sequence of integration and obtain
$$\|\bar u_t\|_2^\delta \le \left(\frac{1}{\delta_0}\int_0^t \|u(\tau)\|^2\,e^{-\delta(t-\tau)}\int_\tau^t e^{-(\delta_0-\delta)(s-\tau)}ds\,d\tau\right)^{\frac{1}{2}} \le \frac{1}{\sqrt{\delta_0(\delta_0-\delta)}}\,\|u_t\|_2^\delta,$$
which, together with (2.17) and the fact that $\|\epsilon_t\|_2^\delta$ is also exponentially decaying to zero, completes the proof. □
Lemma 2.4 (Gronwall Lemma). Let $\lambda(t)$, $g(t)$, $k(t)$ be nonnegative piecewise continuous functions of time. If the function $y(t)$ satisfies the inequality
$$y(t) \le \lambda(t) + g(t)\int_{t_0}^t k(s)y(s)\,ds, \quad \forall t \ge t_0 \ge 0,$$
then
$$y(t) \le \lambda(t) + g(t)\int_{t_0}^t \lambda(s)k(s)\exp\left(\int_s^t k(\tau)g(\tau)\,d\tau\right)ds, \quad \forall t \ge t_0 \ge 0.$$
In particular, if $\lambda(t) = \lambda$ is a constant and $g(t) \equiv 1$, then
$$y(t) \le \lambda\exp\left(\int_{t_0}^t k(s)\,ds\right), \quad \forall t \ge t_0 \ge 0.$$
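As a quick numerical sanity check (not part of the original text), the special case $\lambda(t) = \lambda$ constant and $g \equiv 1$ can be verified in a few lines: for constant $k$, the function $y(t) = \lambda e^{kt}$ satisfies the integral inequality with equality, so the Gronwall bound is tight there. The function name below is, of course, illustrative.

```python
import math

def gronwall_bound(lam, k, t, t0=0.0):
    # Special case of the Gronwall Lemma: lambda(t) = lam constant,
    # g(t) = 1, k(s) = k constant  =>  y(t) <= lam * exp(k * (t - t0)).
    return lam * math.exp(k * (t - t0))

# y(t) = lam * exp(k*t) solves y(t) = lam + k * int_0^t y(s) ds exactly,
# so it must sit on (not above) the Gronwall bound.
lam, k = 2.0, 0.5
for t in [0.0, 1.0, 3.0]:
    y = lam * math.exp(k * t)
    assert y <= gronwall_bound(lam, k, t) + 1e-12
```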
The proof can be found in [24].
Lemma 2.5 (Swapping Lemma). Let $\phi, \omega : \mathbb{R}^+ \to \mathbb{R}^n$ and let $\phi$ be differentiable. Let $W(s)$ be a proper stable rational transfer function of order $m$ with a minimal realization $(A, b, c, d)$, i.e.,
$$W(s) = c^T(sI - A)^{-1}b + d.$$
Then,
$$W(s)\left[\phi^T\omega\right] = \phi^T W(s)[\omega] + W_c(s)\left[W_b(s)\left[\omega^T\right]\dot\phi\right],$$
where
$$W_c(s) = -c^T(sI - A)^{-1}, \quad W_b(s) = (sI - A)^{-1}b.$$
Proof. We have
$$W(s)\left[\phi^T\omega\right] = d\,\phi^T\omega + c^T\int_0^t e^{A(t-\tau)}b\,\omega^T(\tau)\phi(\tau)\,d\tau.$$
Integrating by parts, we obtain
$$W(s)\left[\phi^T\omega\right] = d\,\phi^T\omega + c^T e^{At}\left(\int_0^\tau e^{-A\sigma}b\,\omega^T(\sigma)\,d\sigma\,\phi(\tau)\Big|_{\tau=0}^{\tau=t} - \int_0^t\int_0^\tau e^{-A\sigma}b\,\omega^T(\sigma)\,d\sigma\,\dot\phi(\tau)\,d\tau\right)$$
$$= \phi^T(t)\left[d\,\omega + c^T\int_0^t e^{A(t-\tau)}b\,\omega(\tau)\,d\tau\right] - c^T\int_0^t e^{A(t-\tau)}\int_0^\tau e^{A(\tau-\sigma)}b\,\omega^T(\sigma)\,d\sigma\,\dot\phi(\tau)\,d\tau$$
$$= \phi^T(t)\,W(s)[\omega] - c^T(sI - A)^{-1}\left[(sI - A)^{-1}b\left[\omega^T\right]\dot\phi\right]. \quad\square$$
Various other proofs of Lemma 2.5 can be found in [16,17].
Lemma 2.6 [18] Let $\phi, \omega : \mathbb{R}^+ \to \mathbb{R}^n$ and let $\phi$, $\omega$ be differentiable. Let $W(s)$ be a rational transfer function with relative degree $n^*$ and with stable poles and zeros. Then,
$$\phi^T\omega = \Delta_1(s)\left[\dot\phi^T\omega + \phi^T\dot\omega\right] + \Lambda(s)W^{-1}(s)\left[\phi^T W(s)[\omega] + W_c(s)\left[W_b(s)\left[\omega^T\right]\dot\phi\right]\right], \qquad (2.18)$$
where $s\Delta_1(s) = 1 - \Lambda(s,\alpha)$, $\Lambda(s,\alpha) = \alpha^k/(s+\alpha)^k$, and $k \ge n^*$, for any arbitrary constant $\alpha > 0$. Furthermore, for large $\alpha$,
$$\|\Delta_1(s,\alpha)\|_\infty^\delta \le \frac{c}{\alpha}$$
for some $c \in \mathbb{R}^+$.
Proof. We have
$$\phi^T\omega = \frac{(s+\alpha)^k - \alpha^k}{(s+\alpha)^k}\left[\phi^T\omega\right] + \frac{\alpha^k}{(s+\alpha)^k}W^{-1}(s)W(s)\left[\phi^T\omega\right].$$
Since
$$\frac{(s+\alpha)^k - \alpha^k}{(s+\alpha)^k} = s\Delta_1(s),$$
we have
$$\phi^T\omega = \Delta_1(s)\left[\dot\phi^T\omega + \phi^T\dot\omega\right] + \Lambda(s)W^{-1}(s)W(s)\left[\phi^T\omega\right]. \qquad (2.19)$$
Using Lemma 2.5 to substitute for $W(s)\left[\phi^T\omega\right]$, the equality (2.18) follows directly from (2.19). Now
$$\Delta_1(s) = \frac{1}{s}\,\frac{(s+\alpha)^k - \alpha^k}{(s+\alpha)^k} = \frac{1}{(s+\alpha)^k}\left[s^{k-1} + \binom{k}{1}\alpha s^{k-2} + \binom{k}{2}\alpha^2 s^{k-3} + \cdots + \binom{k}{k-1}\alpha^{k-1}\right],$$
where
$$\binom{k}{i} = \frac{k!}{(k-i)!\,i!}.$$
Hence, for large $\alpha$,
$$\|\Delta_1(s)\|_\infty^\delta \le \frac{c}{\alpha}$$
for some $c \in \mathbb{R}^+$. □
Lemma 2.7 Consider the system
$$y = H(s)[u], \qquad (2.20)$$
where $H(s)$ is a strictly proper rational transfer function analytic in $\mathrm{Re}[s] \ge 0$. If $u \in S(\mu^2)$ for some constant $\mu \ge 0$, then $y \in L_\infty$.
Proof. It follows from (2.20) and the properties of $H(s)$ that
$$|y(t)| \le \alpha_1\int_0^t e^{-\alpha_0(t-\tau)}\|u(\tau)\|\,d\tau \le \alpha_1\left(\int_0^t e^{-\alpha_0(t-\tau)}d\tau\right)^{\frac{1}{2}}\left(\int_0^t e^{-\alpha_0(t-\tau)}\|u(\tau)\|^2\,d\tau\right)^{\frac{1}{2}}$$
for some $\alpha_1, \alpha_0 > 0$. Now
$$\Lambda(t,0) \triangleq \int_0^t e^{-\alpha_0(t-\tau)}\|u(\tau)\|^2\,d\tau \le e^{-\alpha_0 t}\sum_{i=0}^n \int_i^{i+1} e^{\alpha_0\tau}\|u(\tau)\|^2\,d\tau \le e^{-\alpha_0 t}\sum_{i=0}^n e^{\alpha_0(i+1)}\int_i^{i+1}\|u(\tau)\|^2\,d\tau,$$
where $n$ is an integer satisfying $n \le t < n+1$. Since $u \in S(\mu^2)$, it follows that
$$\Lambda(t,0) \le e^{-\alpha_0 t}\left(\mu^2 + c\right)\sum_{i=0}^n e^{\alpha_0(i+1)} \le \left(\mu^2 + c\right)\frac{e^{\alpha_0}}{1 - e^{-\alpha_0}} < \infty. \quad\square$$
3 On-line Parameter Estimation
In this section we consider the problem of estimating on line the constant parameter vector $\theta^*$ of a certain class of dynamic systems described by
$$z(t) = f(\theta^*, t, \tau), \quad t \ge \tau \ge 0, \qquad (3.1)$$
where at each time $t$ the response $z(\tau)$ with $\tau \le t$ can be observed, and $f$ is some function or functional whose form may be known. If we let $\theta(t)$ be the estimate of $\theta^*$ at time $t$, then the estimate $\hat z(t) \triangleq \hat z(t,\theta)$ of $z(t)$ can be constructed as
$$\hat z(t,\theta) = \hat f(\theta, t, \tau), \qquad (3.2)$$
for some function or functional $\hat f$ which may or may not have the same form as $f$. The estimation problem is to design the adjustment law for $\theta$ so that $\hat z(t,\theta)$ is as close as possible to $z(t)$ in some sense. The form and the properties of the adjustment law, which we refer to as the adaptive law, will depend very much on the form of the functions $f$ and $\hat f$, as well as on the criterion chosen to compare $\hat z(t,\theta)$ with $z(t)$. Typical choices of criteria for checking the quality of the estimation are:
(i) $\|\hat z(t,\theta) - z(t)\| \in L_2 \cap L_\infty$
(ii) $\|\hat z(t,\theta) - z(t)\| \to 0$ as $t \to \infty$
(iii) $\|\theta(t) - \theta^*\| \to 0$ as $t \to \infty$,
or any combination of (i)-(iii). For example, in parameter identification for stable plants criterion (iii) is crucial, whereas in adaptive control (i) and (ii) may be important but not (iii).
An important class of parametric models of the form (3.1) is the one where $\theta^*$ appears linearly, i.e.,
$$z = W(s)\left[\theta^{*T}\omega + \eta_0\right], \qquad (3.3)$$
where
- $W(s)$ is a known proper transfer function with stable poles,
- $z \in \mathbb{R}$, $\omega \in \mathbb{R}^n$ are piecewise continuous signals which can be measured at each time $t$,
- $\theta^*$ is the unknown constant vector to be estimated, and
- $\eta_0 \in \mathbb{R}$ is an unknown signal due to modeling errors such as unmodeled dynamics, disturbances, noise, etc.
We will refer to (3.3) as the linear parametric model for $\theta^*$. Another interesting class of (3.1) is the one where the unknown parameters appear in the special bilinear form
$$z = W(s)\left[\rho^*\left(\theta^{*T}\omega + z_0\right) + \eta_0\right], \qquad (3.4)$$
where $z_0 \in \mathbb{R}$ is a piecewise continuous signal which can be measured at each time $t$ and $\rho^* \in \mathbb{R}$ is an unknown constant whose sign is known. We will refer
to (3.4) as the bilinear parametric model for $\theta^*$, $\rho^*$. The general case where $\theta^*$ appears nonlinearly is a difficult one for estimation. Approximation techniques using Taylor series expansions around some known nominal value of $\theta^*$ may be used to approximate (3.1) with (3.3) or (3.4). The following examples illustrate that (3.3) and (3.4) apply to a large class of dynamical systems.
Example 3.1. Let us consider the LTI system
$$y = G_0(s)\left(1 + \Delta_m(s)\right)[u], \qquad (3.5)$$
where $G_0(s) = Z_0(s)/R_0(s)$ and $\Delta_m(s)$ represent the nominal plant and the multiplicative uncertainty, respectively, and
$$Z_0(s) = b_m s^m + b_{m-1}s^{m-1} + \cdots + b_0, \quad R_0(s) = s^n + a_{n-1}s^{n-1} + \cdots + a_0.$$
Let $a_i$, $b_j$, $i = 0,1,\ldots,n-1$, $j = 0,1,\ldots,m$, be the unknown parameters to be estimated. Then, (3.5) can be expressed as
$$\left(s^n + a_{n-1}s^{n-1} + \cdots + a_0\right)[y] = \left(b_m s^m + b_{m-1}s^{m-1} + \cdots + b_0\right)[u] + Z_0(s)\Delta_m(s)[u]. \qquad (3.6)$$
If the derivatives of $y$, $u$ are available, then (3.6) can easily be expressed in the form of (3.3) by defining
$$\theta^* = [a_{n-1}, \ldots, a_0, b_m, \ldots, b_0]^T, \quad \omega = \left[-y^{(n-1)}, \ldots, -y, u^{(m)}, \ldots, u\right]^T,$$
$$W(s) = 1, \quad \eta_0 = Z_0(s)\Delta_m(s)[u],$$
$z = y^{(n)}$. In the case where the derivatives of $y$ and $u$ are not available, the filtered values of $y$ and $u$ can be used as follows. We filter each side of (3.6) with the filter $1/\Lambda(s)$, where $\Lambda(s)$ is an $n$th-order Hurwitz polynomial, to obtain
$$\left(s^n + a_{n-1}s^{n-1} + \cdots + a_0\right)\frac{1}{\Lambda(s)}[y] = \left(b_m s^m + \cdots + b_0\right)\frac{1}{\Lambda(s)}[u] + \frac{Z_0(s)\Delta_m(s)}{\Lambda(s)}[u], \qquad (3.7)$$
or
$$\frac{s^n}{\Lambda(s)}[y] = \theta^{*T}\zeta + \frac{Z_0(s)\Delta_m(s)}{\Lambda(s)}[u], \qquad (3.8)$$
where
$$\zeta = \left[-\frac{s^{n-1}}{\Lambda(s)}[y], \ldots, -\frac{1}{\Lambda(s)}[y], \frac{s^m}{\Lambda(s)}[u], \ldots, \frac{1}{\Lambda(s)}[u]\right]^T,$$
i.e., (3.8) is in the form of (3.3) with $W(s) = 1$, $z = \frac{s^n}{\Lambda(s)}[y]$, $\eta_0 = \frac{Z_0(s)\Delta_m(s)}{\Lambda(s)}[u]$ and $\omega = \zeta$. An alternative way is to write (3.7) as
$$y = \frac{\Lambda(s) - R_0(s)}{\Lambda(s)}[y] + \frac{Z_0(s)}{\Lambda(s)}[u] + \frac{Z_0(s)\Delta_m(s)}{\Lambda(s)}[u], \qquad (3.9)$$
by adding $y$ on both sides of (3.7). Then, for $\Lambda(s) = s^n + \lambda_{n-1}s^{n-1} + \cdots + \lambda_0$, (3.9) may be expressed in the form (3.3) by letting
$$\theta^* = [a_{n-1} - \lambda_{n-1}, \ldots, a_0 - \lambda_0, b_m, \ldots, b_0]^T, \quad z = y, \quad W(s) = 1,$$
and $\omega = \zeta$ as defined in (3.8). □
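The filtered parametrization of Example 3.1 is easy to check numerically. The sketch below (not from the text) takes the first-order case $n = 1$, $m = 0$, $\Delta_m = 0$: the plant $\dot y = -a_0 y + b_0 u$ filtered with $\Lambda(s) = s + \lambda_0$ gives $y = \theta^{*T}\zeta$ with $\theta^* = [a_0 - \lambda_0,\, b_0]^T$ and $\zeta = [-\frac{1}{s+\lambda_0}[y],\, \frac{1}{s+\lambda_0}[u]]^T$. All numerical values (gains, input, step size) are illustrative choices.

```python
import numpy as np

# Plant dy/dt = -a0*y + b0*u, filter Lambda(s) = s + lam0; then
# y = (a0 - lam0)*zeta1 + b0*zeta2 with zeta1 = -(1/(s+lam0))[y],
# zeta2 = (1/(s+lam0))[u]  (model (3.9) with n = 1, m = 0).
a0, b0, lam0, dt = 2.0, 3.0, 1.0, 1e-3
y, zeta = 0.0, np.zeros(2)
rows, rhs = [], []
for i in range(20000):
    t = i * dt
    u = np.sin(t) + np.cos(3.0 * t)      # persistently exciting input
    rows.append(zeta.copy()); rhs.append(y)
    y_dot = -a0 * y + b0 * u             # plant
    z1_dot = -lam0 * zeta[0] - y         # state-variable filter of -y
    z2_dot = -lam0 * zeta[1] + u         # state-variable filter of  u
    y += dt * y_dot                      # forward-Euler integration
    zeta += dt * np.array([z1_dot, z2_dot])
theta_hat, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
print(theta_hat)  # close to [a0 - lam0, b0] = [1, 3] up to Euler error
```

A plain least-squares fit recovers $\theta^*$ here because the relation $y = \theta^{*T}\zeta$ holds exactly (all initial conditions are zero), up to the $O(dt)$ discretization error.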
Example 3.2. Let us consider the nonlinear system
$$c^*\dot x = a^* g(x) + b^* f(x), \qquad (3.10)$$
where $a^*$, $b^*$, $c^*$ are unknown scalars to be estimated, $\mathrm{sgn}(c^*) = 1$, and $g(x)$, $f(x)$ are nonlinear functions which can be measured at each time $t$. We can write (3.10) as
$$c^*(s+1)[x] = a^* g(x) + b^* f(x) + c^* x.$$
Filtering each side by $\frac{1}{s+1}$, we obtain
$$x = \frac{1}{s+1}\left[\rho^*\left(a^* g(x) + b^* f(x) + c^* x\right)\right],$$
where $\rho^* = 1/c^*$, which is in the form of (3.4) with $z = x$, $z_0 = 0$, $\eta_0 = 0$, $\theta^* = [a^*, b^*, c^*]^T$, $\omega = [g(x), f(x), x]^T$ and $W(s) = \frac{1}{s+1}$. □
We should note that (3.4) can be converted to the form of the linear parametric model (3.3) by rewriting (3.4) as
$$z = W(s)\left[\bar\theta^{*T}\bar\omega + \eta_0\right],$$
where $\bar\theta^* = \left[\rho^*\theta^{*T}, \rho^*\right]^T$, $\bar\omega = \left[\omega^T, z_0\right]^T$. In this case the estimate of $\theta^*$ is calculated by dividing the estimate of $\rho^*\theta^*$ by that of $\rho^*$. Such calculations may involve division by small numbers and can be avoided by modifying the adaptive law for $\rho^*$ so that the estimate $p(t)$ of $\rho^*$ satisfies $|p(t)| \ge p_0 > 0$ $\forall t \ge 0$ and some constant $p_0 \le |\rho^*|$. This modification requires the knowledge of $p_0$ and the sign of $\rho^*$ [5,10,16,26].
In the next subsections we show how $\theta^*$ in the parametric models (3.3) and (3.4) can be estimated on-line using various techniques based on Lyapunov, gradient and least-squares methods.
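The recovery of $\theta$ from the linear reparametrization, with the lower bound $p_0$ guarding the division, can be sketched as follows. This is an illustrative helper (the name `split_estimate` is hypothetical, not from the text); it simply projects the estimate of $\rho^*$ away from zero using the known sign and the known bound $p_0$.

```python
import numpy as np

def split_estimate(bar_theta, p0=0.1, sign_rho=+1.0):
    # bar_theta = [estimate of rho*theta ; estimate of rho], as in the
    # linear reparametrization of (3.4). Enforce |p| >= p0 > 0 with
    # known sign before dividing, as described in the text.
    rho_theta, rho = bar_theta[:-1], bar_theta[-1]
    if sign_rho * rho < p0:
        rho = sign_rho * p0          # projection: keep |p| >= p0
    return rho_theta / rho, rho

# With rho-estimate 0.25 >= p0, no projection is needed:
theta, rho = split_estimate(np.array([0.5, -1.0, 0.25]), p0=0.1)
```

Without the projection, a transient near-zero estimate of $\rho^*$ would make the division blow up even though the combined estimate $\bar\theta$ is well behaved.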
3.1 Lyapunov Method
With this method, a dynamic equation is first obtained in terms of error signals which include the estimation and parameter errors. A certain Lyapunov-like function $V$ is then considered, whose time derivative $\dot V$ along the trajectories of the dynamic equation is made nonpositive for $V \ge V_0$ and some constant $V_0 > 0$ by properly designing the adaptive law. The properties of $V$ and $\dot V$ are then used to establish the stability properties of the on-line estimation scheme. The procedure for developing the dynamic error equation and choosing $V$ is given below.
Since $\theta^*$ is a constant vector, the linear parametric model (3.3) may be rewritten in the form
$$z = W(s)L(s)\left[\theta^{*T}\zeta + \eta\right], \qquad (3.11)$$
where $L(s)$, $L^{-1}(s)$ are proper transfer functions with stable poles, $\zeta = L^{-1}(s)[\omega]$, $\eta = L^{-1}(s)[\eta_0]$, and $L(s)$ is chosen so that $W(s)L(s)$ is a strictly proper SPR transfer function. For some $W(s)$ it is possible that no $L(s)$ can be found such that $W(s)L(s)$ is strictly proper and SPR. In such cases (3.3) can still be manipulated and put in the form (3.11); e.g., for $W(s) = \frac{s-1}{(s+2)(s+3)}$ we can write (3.3) as $z = \frac{s+1}{(s+2)(s+3)}\left[\theta^{*T}\zeta + \eta\right]$, where $\zeta = \frac{s-1}{s+1}[\omega]$ and $\eta = \frac{s-1}{s+1}[\eta_0]$.
Let $\theta(t)$ be the estimate of $\theta^*$ at time $t$. Then the estimate $\hat z$ of $z$ at time $t$ can be constructed as $\hat z = W(s)L(s)\left[\theta^T\zeta\right]$. The estimation error $\epsilon_1$ is then given by
$$\epsilon_1 = \hat z - z = W(s)L(s)\left[\phi^T\zeta - \eta\right], \qquad (3.12)$$
where $\phi \triangleq \theta - \theta^*$ is the parameter error; i.e., $\epsilon_1$ is a measure of the discrepancy between $\hat z$ and $z$ due to $\theta(t) \neq \theta^*$ and the modeling error term $\eta \neq 0$. We can also construct the normalized estimation error $\epsilon$ as
$$\epsilon = \epsilon_1 - W(s)L(s)\left[\epsilon n_s^2\right] = W(s)L(s)\left[\phi^T\zeta - \epsilon n_s^2 - \eta\right], \qquad (3.13)$$
where $n_s$ is what is called the normalizing signal [11,9,13] and is designed so that
(A1) $\frac{\zeta}{m}, \frac{\eta}{m} \in L_\infty$,
where $m^2 = \alpha + \beta n_s^2$ for some $\alpha, \beta > 0$. The normalizing effect of $n_s$ is more transparent when we express (3.13) in a state-space form and solve for the quasi-steady state of $\epsilon$, i.e.,
$$\dot e = A_c e + b_c\left(\phi^T\zeta - \epsilon n_s^2 - \eta\right), \quad \epsilon = h_c e, \qquad (3.14)$$
where $(h_c, A_c, b_c)$ is a minimal state representation of $W(s)L(s) = h_c(sI - A_c)^{-1}b_c$. Setting $\dot e = 0$ and solving for the quasi-steady-state response $\epsilon_{ss}$ of $\epsilon$, we obtain
$$\epsilon_{ss} = \frac{-h_c A_c^{-1}b_c\left(\phi^T\zeta - \eta\right)}{1 - h_c A_c^{-1}b_c\, n_s^2}.$$
Since $W(s)L(s)$ is SPR, $h_c A_c^{-1}b_c < 0$, and therefore $\epsilon_{ss}$ is equal to the normalized quasi-steady-state response $\epsilon_{1ss}$ of the estimation error $\epsilon_1$. Due to $n_s$, $\epsilon_{ss}$ can only be driven unbounded by the parameter error $\phi$. In [2] $\epsilon_1$ is called the "augmented error" and $\epsilon n_s^2$ the "auxiliary signal", and they are developed using different considerations. Treating $\epsilon_1$, $\epsilon$ as the estimation and normalized estimation errors helps us to understand the separation of the identification and control parts, and allows us to unify the various approaches based on other techniques such as gradient and least-squares methods. It should be noted that $\phi = \theta - \theta^*$ in (3.12)-(3.14) is an unknown signal vector, and therefore the second equalities in (3.12) and (3.13) cannot be used to generate $\epsilon_1$, $\epsilon$. The signals $\epsilon_1$, $\epsilon$ are generated from the first equalities in (3.12) and (3.13), whereas the second equalities are used for analysis only.
The adaptive law for generating the parameter estimates $\theta(t)$ is developed by considering the Lyapunov-like function
$$V(\phi, e) = \frac{e^T P_c e}{2} + \frac{\phi^T\Gamma^{-1}\phi}{2},$$
where $\Gamma = \Gamma^T > 0$ and $P_c = P_c^T > 0$. The matrix $P_c$ is chosen as the one which satisfies the algebraic equations
$$P_c A_c + A_c^T P_c = -qq^T - \nu L, \quad P_c b_c = h_c^T, \qquad (3.15)$$
for some vector $q$, $L = L^T > 0$ and a small constant $\nu > 0$, whose existence is guaranteed by the SPR property of $W(s)L(s) = h_c(sI - A_c)^{-1}b_c$ [2,10]. Using (3.15), the time derivative $\dot V$ of $V$ along the solution of (3.14) is given by
$$\dot V = -\frac{e^T qq^T e}{2} - \frac{\nu e^T L e}{2} - \epsilon^2 n_s^2 - \epsilon\eta + \epsilon\phi^T\zeta + \phi^T\Gamma^{-1}\dot\phi.$$
Choosing
$$\dot\phi = \dot\theta = -\Gamma\epsilon\zeta - \Gamma w(t)\theta, \qquad (3.16)$$
where $w(t) \ge 0$ is a signal to be designed, referred to as the leakage, we have
$$\dot V(\phi, e) \le -\frac{\nu e^T L e}{2} - \epsilon^2 n_s^2 - \epsilon\eta - w(t)\phi^T\theta.$$
Using $-\frac{\nu}{2}e^T L e \le -\bar\lambda\|e\|^2$, where $\bar\lambda = 2\lambda_0/\|h_c\|^2$ and $\lambda_0 = \nu\lambda_{\min}(L)/4 > 0$, together with $\epsilon^2 \le \|h_c\|^2\|e\|^2$, we have, for some constant $\beta > 0$,
$$\dot V(\phi, e) \le -\lambda_0\|e\|^2 - \frac{\epsilon^2}{2}\left(\beta + n_s^2\right) + |\epsilon||\eta| - w(t)\phi^T\theta.$$
Furthermore, using
$$-\frac{\epsilon^2}{2}\left(\beta + n_s^2\right) + |\epsilon||\eta| \le -\frac{\epsilon^2}{4}\left(\beta + n_s^2\right) + \frac{|\eta|^2}{\beta + n_s^2},$$
where
$$m^2 = 2\left(\beta + n_s^2\right), \qquad (3.17)$$
we obtain
$$\dot V(\phi, e) \le -\lambda_0\|e\|^2 - \frac{\epsilon^2 m^2}{8} - w(t)\phi^T\theta + \frac{2\eta^2}{m^2},$$
where, due to the property of $n_s$, $\frac{\eta^2}{m^2} \le c_0^2$ for some $c_0 > 0$. The leakage term $w(t)$ is to be chosen so that for $V \ge V_0$ and some constant $V_0 > 0$, $\dot V \le 0$. This property of $V$ will imply that $V \in L_\infty$ and therefore $e, \epsilon, \phi \in L_\infty$. In a similar manner, an adaptive law for estimating $\rho^*$, $\theta^*$ in (3.4) can be developed as follows. We rewrite (3.4) as
$$z = W(s)L(s)\left[\rho^*\left(\theta^{*T}\zeta + z_1\right) + \eta\right],$$
where $L(s)$, $\zeta$, $\eta$ are as defined in (3.11) and $z_1 = L^{-1}(s)[z_0]$. Then the estimate $\hat z$ of $z$ at time $t$ can be constructed as
$$\hat z = W(s)L(s)\left[p\left(\theta^T\zeta + z_1\right)\right],$$
where $p$, $\theta$ are the estimates of $\rho^*$, $\theta^*$, respectively, at time $t$, and the estimation error can be defined as $\epsilon_1 = \hat z - z$. The normalized estimation error $\epsilon$ is generated as
$$\epsilon = \hat z - z - W(s)L(s)\left[\epsilon n_s^2\right], \qquad (3.18)$$
where $n_s$ is chosen so that
(A1') $\frac{\zeta}{m}, \frac{\eta}{m}, \frac{\xi}{m} \in L_\infty$,
where $\xi = \theta^T\zeta + z_1$ and $m$ is given by (3.17). For analysis and design purposes, (3.18) can be expressed in terms of the parameter errors $\psi$, $\phi$ defined as
$$\psi \triangleq p - \rho^*, \quad \phi \triangleq \theta - \theta^*.$$
From (3.18), we have
$$\epsilon = W(s)L(s)\left[p\left(\theta^T\zeta + z_1\right) - \rho^*\left(\theta^{*T}\zeta + z_1\right) - \eta - \epsilon n_s^2\right].$$
Since $p\,\theta^T\zeta - \rho^*\theta^{*T}\zeta + \rho^*\theta^T\zeta - \rho^*\theta^T\zeta = \psi\,\theta^T\zeta + \rho^*\phi^T\zeta$, we have
$$\epsilon = W(s)L(s)\left[\psi\xi + \rho^*\phi^T\zeta - \epsilon n_s^2 - \eta\right], \qquad (3.19)$$
or
$$\dot e = A_c e + b_c\left(\psi\xi + \rho^*\phi^T\zeta - \epsilon n_s^2 - \eta\right), \quad \epsilon = h_c e.$$
Choosing
$$V(\psi, \phi, e) = \frac{e^T P_c e}{2} + |\rho^*|\frac{\phi^T\Gamma^{-1}\phi}{2} + \frac{\psi^2}{2\gamma}, \qquad (3.20)$$
where $\gamma > 0$, it can be shown that
$$\dot V(\psi, \phi, e) \le -\lambda_0\|e\|^2 - \frac{\epsilon^2 m^2}{8} + \epsilon\psi\xi + \epsilon\rho^*\phi^T\zeta + \frac{2|\eta|^2}{m^2} + \frac{\psi\dot\psi}{\gamma} + |\rho^*|\phi^T\Gamma^{-1}\dot\phi,$$
where $m^2$ is as defined in (3.17). For
$$\dot\phi = \dot\theta = -\Gamma\epsilon\zeta\,\mathrm{sgn}(\rho^*) - \Gamma w(t)\theta, \quad \dot\psi = \dot p = -\gamma\epsilon\xi - \gamma w_1(t)p, \qquad (3.21)$$
we have
$$\dot V \le -\lambda_0\|e\|^2 - \frac{\epsilon^2 m^2}{8} - w_1\psi p - w\,\phi^T\theta\,|\rho^*| + \frac{2|\eta|^2}{m^2}. \qquad (3.22)$$
The leakage terms $w_1, w \ge 0$ are to be chosen so that for $V \ge V_0$ and some constant $V_0 > 0$, $\dot V \le 0$, which guarantees that $e, \phi, \psi \in L_\infty$.
A straightforward choice for $w(t)$ is $w(t) = \sigma$, where $\sigma > 0$ is a constant, referred to as the $\sigma$-modification, first used in [8] to improve robustness of adaptive schemes with respect to bounded disturbances and unmodeled dynamics. As indicated in [8], the $\sigma$-modification achieves robustness at the expense of destroying the ideal properties of the schemes; i.e., in the absence of plant uncertainties $\sigma > 0$ acts as a disturbance in the adaptive law, which leads to non-zero estimation errors at steady state. This drawback of the $\sigma$-modification motivated two new choices for $w(t)$: the switching-$\sigma$ [9] and the $\epsilon$-modification [10]. With the switching-$\sigma$, $w(t) = \sigma_s$, where $\sigma_s > 0$ when $\|\theta\|$ is larger than some constant $M_0 \ge \|\theta^*\|$, and $\sigma_s = 0$ when $\|\theta\| \le M_0$. As shown in [9], with this choice of $w$ the ideal properties of the adaptive laws are maintained at the expense of knowing an upper bound $M_0$ for the unknown $\|\theta^*\|$. The $\epsilon$-modification is given by $w(t) = |\epsilon m|\nu_0$ and was first introduced in [10] with the rationale that, in the absence of modeling errors, $w(t)$ will go to zero with the estimation error, and therefore the ideal properties of the adaptive law will be guaranteed. This turns out not to be the case unless some persistence-of-excitation conditions are imposed, as shown in [10]. Simulations, however, indicate that the $\epsilon$-modification improves the convergence properties of the adaptive law [10]. The following theorem describes the stability properties of the adaptive law (3.21) for different choices of the leakage terms $w(t)$ and $w_1(t)$.
Theorem 3.1 The adaptive law (3.21) guarantees the following properties:
(a) Fixed-$\sigma$ [8]: For $w(t) = \sigma$, $w_1(t) = \sigma_1$, where $\sigma, \sigma_1 > 0$ are constants, we have
(i) $\epsilon, \theta, p \in L_\infty$; (ii) $\epsilon, \epsilon n_s, \dot\theta, \dot p \in S\left(\sigma + \sigma_1 + \frac{\eta^2}{m^2}\right)$.
(b) Switching-$\sigma$ [9]: For $w(t) = \sigma_s$, $w_1(t) = \sigma_{1s}$, where
$$\sigma_s = \begin{cases} 0 & \text{if } \|\theta\| \le M_0 \\ \sigma_0\left(\frac{\|\theta\|}{M_0} - 1\right)^m & \text{if } M_0 < \|\theta\| \le 2M_0 \\ \sigma_0 & \text{if } \|\theta\| > 2M_0 \end{cases} \qquad \sigma_{1s} = \begin{cases} 0 & \text{if } |p| \le m_1 \\ \sigma_0'\left(\frac{|p|}{m_1} - 1\right)^m & \text{if } m_1 < |p| \le 2m_1 \\ \sigma_0' & \text{if } |p| > 2m_1, \end{cases}$$
with the design constants $\sigma_0, \sigma_0' > 0$, $M_0 \ge \|\theta^*\|$, $m_1 \ge |\rho^*|$, and $m$ any finite positive integer, we have
(i) $\epsilon, \theta, p \in L_\infty$; (ii) $\epsilon, \epsilon n_s, \dot\theta, \dot p \in S\left(\frac{\eta^2}{m^2}\right)$.
(c) $\epsilon$-modification [10]: For $w(t) = |\epsilon m|\nu_0$, $w_1(t) = |\epsilon m|\nu_1$, where $\nu_0, \nu_1 > 0$ are design constants, we have
(i) $\epsilon, \theta, p \in L_\infty$; (ii) $\epsilon, \epsilon n_s, \dot\theta, \dot p \in S\left(\nu_0 + \nu_1 + \frac{\eta^2}{m^2}\right)$.
(d) Ideal Case: For $\eta = 0$ and $w = w_1 = 0$, or $w = \sigma_s$, $w_1 = \sigma_{1s}$, we have
(i) $\epsilon, \theta, p \in L_\infty$; (ii) $\epsilon, \epsilon n_s, \dot\theta, \dot p \in L_2$.
The properties of the adaptive law (3.16) follow directly from those of Theorem 3.1 by setting $p = \rho^* = 1$ and $w_1 = 0$.
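The three leakage choices compared in Theorem 3.1 are simple to state in code. The following is an illustrative sketch (not the authors' implementation); `M0`, `sigma0` and `nu0` stand for the design constants $M_0$, $\sigma_0$, $\nu_0$ of the text, and the switching-$\sigma$ exponent is taken as $m = 1$.

```python
import numpy as np

def leakage(theta, eps_m, M0=10.0, sigma0=0.1, nu0=0.05, mode="switching"):
    # Computes w(t) in theta_dot = -Gamma*eps*zeta - Gamma*w(t)*theta (3.16).
    n = np.linalg.norm(theta)
    if mode == "fixed":        # fixed sigma-modification [8]
        return sigma0
    if mode == "switching":    # switching-sigma [9], exponent m = 1
        if n <= M0:
            return 0.0
        if n <= 2.0 * M0:
            return sigma0 * (n / M0 - 1.0)   # ramps from 0 up to sigma0
        return sigma0
    if mode == "eps":          # e-modification [10]: w = |eps*m| * nu0
        return nu0 * abs(eps_m)
    raise ValueError(mode)
```

Note how the switching-$\sigma$ returns zero whenever $\|\theta\| \le M_0$, which is what preserves the ideal ($\eta = 0$) properties in Theorem 3.1(d), while the fixed $\sigma$ acts as a constant disturbance.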
Proof. (a) For $w = \sigma$, $w_1 = \sigma_1$, we have
$$-\sigma_1\psi p - \sigma\phi^T\theta\,|\rho^*| \le -\sigma_1\left(\psi^2 - |\psi||\rho^*|\right) - \sigma|\rho^*|\left(\phi^T\phi - \|\phi\|\|\theta^*\|\right) \le -\frac{\sigma_1}{2}\psi^2 + \frac{\sigma_1}{2}|\rho^*|^2 - \frac{\sigma|\rho^*|}{2}\|\phi\|^2 + \frac{\sigma|\rho^*|}{2}\|\theta^*\|^2.$$
Hence, (3.22) becomes
$$\dot V \le -\lambda_0\|e\|^2 - \frac{\epsilon^2 m^2}{8} - \frac{\sigma_1}{2}\psi^2 - \frac{\sigma|\rho^*|}{2}\|\phi\|^2 + \frac{\sigma_1}{2}|\rho^*|^2 + \frac{\sigma|\rho^*|}{2}\|\theta^*\|^2 + \frac{2\eta^2}{m^2}, \qquad (3.23)$$
or
$$\dot V \le -\bar\alpha V + \frac{\sigma_1}{2}|\rho^*|^2 + \frac{\sigma|\rho^*|}{2}\|\theta^*\|^2 + \frac{2\eta^2}{m^2}, \qquad (3.24)$$
where
$$\bar\alpha = \min\left\{\frac{2\lambda_0}{\lambda_{\max}(P_c)},\; \gamma\sigma_1,\; \frac{\sigma}{\lambda_{\max}(\Gamma^{-1})}\right\}.$$
Therefore, for $V \ge V_0 \triangleq \frac{1}{\bar\alpha}\left(\frac{\sigma_1}{2}|\rho^*|^2 + \frac{\sigma|\rho^*|}{2}\|\theta^*\|^2 + c_0^2\right)$, where $c_0^2$ is the upper bound for $\frac{2\eta^2}{m^2}$, we have $\dot V \le 0$, which implies that $V \in L_\infty$. Hence $\epsilon, \phi, \psi \in L_\infty$, which implies that $\epsilon, \theta, p \in L_\infty$. Let us now express (3.22) in the form
$$\dot V \le -\frac{\epsilon^2 m^2}{8} - \frac{\sigma_1}{2}\psi^2 - \frac{\sigma|\rho^*|}{2}\|\phi\|^2 + \frac{\sigma_1}{2}|\rho^*|^2 + \frac{\sigma|\rho^*|}{2}\|\theta^*\|^2 + \frac{2\eta^2}{m^2}, \qquad (3.25)$$
obtained by using $-\lambda_0\|e\|^2 \le 0$ and (3.17). Integrating both sides of (3.25), we have
$$\int_t^{t+T}\left(\frac{\epsilon^2 m^2}{8} + \frac{\sigma_1}{2}\psi^2 + \frac{\sigma|\rho^*|}{2}\|\phi\|^2\right)d\tau \le V(t) - V(t+T) + \int_t^{t+T}\left(\frac{\sigma_1}{2}|\rho^*|^2 + \frac{\sigma|\rho^*|}{2}\|\theta^*\|^2 + \frac{2\eta^2}{m^2}\right)d\tau$$
$\forall t \ge 0$ and any $T > 0$, which together with the boundedness of $V$ implies that $\epsilon$, $\epsilon n_s$, $\sqrt{\sigma_1}|\psi|$, $\sqrt{\sigma}\|\phi\| \in S\left(\sigma_1 + \sigma + \frac{\eta^2}{m^2}\right)$. Using the property of $m$, which guarantees that $\frac{\zeta}{m}, \frac{\xi}{m} \in L_\infty$, it follows from (3.21) that
$$|\dot p|^2 \le c\left(|\epsilon m|^2 + \sigma_1^2|p|^2\right), \quad \|\dot\theta\|^2 \le c\left(|\epsilon m|^2 + \sigma^2\|\theta\|^2\right)$$
for some $c \in \mathbb{R}^+$. Therefore, $\dot p, \dot\theta \in S\left(\sigma_1 + \sigma + \frac{\eta^2}{m^2}\right)$.
(b) For $w(t) = \sigma_s$, $w_1(t) = \sigma_{1s}$, we have
$$\sigma_s\phi^T\theta = \sigma_s\left(\|\theta\|^2 - \theta^{*T}\theta\right) \ge \sigma_s\|\theta\|\left(\|\theta\| - \|\theta^*\|\right) = \sigma_s\|\theta\|\left(\|\theta\| - M_0 + M_0 - \|\theta^*\|\right).$$
Since $\sigma_s\left(\|\theta\| - M_0\right) \ge 0$ and $M_0 \ge \|\theta^*\|$, it follows that
$$\sigma_s\phi^T\theta \ge \sigma_s\|\theta\|\left(M_0 - \|\theta^*\|\right) \ge 0,$$
i.e.,
$$\sigma_s\|\theta\| \le \frac{\sigma_s\phi^T\theta}{M_0 - \|\theta^*\|}.$$
Similarly, $\sigma_{1s}\psi p \ge 0$ and $\sigma_{1s}|p| \le \frac{\sigma_{1s}\psi p}{m_1 - |\rho^*|}$. The inequality (3.22) for $\dot V$ can be written as
$$\dot V \le -\lambda_0\|e\|^2 - \frac{\epsilon^2 m^2}{8} - \sigma_{1s}\psi p - \sigma_s\phi^T\theta\,|\rho^*| + \frac{2|\eta|^2}{m^2}. \qquad (3.26)$$
For $\|\theta\| = \|\phi + \theta^*\| > 2M_0$, we have $-\sigma_s\phi^T\theta \le -\frac{\sigma_0}{2}\|\phi\|^2 + \frac{\sigma_0}{2}\|\theta^*\|^2$, and for $|p| = |\psi + \rho^*| > 2m_1$, we have $-\sigma_{1s}\psi p \le -\frac{\sigma_0'}{2}\psi^2 + \frac{\sigma_0'}{2}|\rho^*|^2$. Hence, following the same approach as in the proof of (a), we can show that for $V \ge V_0$ and some $V_0 > 0$, $\dot V \le 0$, i.e., $V \in L_\infty$, which implies that $\epsilon, \theta, p \in L_\infty$. Furthermore, (3.26) also implies that $\epsilon$, $\epsilon n_s$, $\sqrt{\sigma_{1s}\psi p}$, $\sqrt{\sigma_s\phi^T\theta|\rho^*|} \in S\left(\frac{\eta^2}{m^2}\right)$.
Since $\sigma_{1s}^2|p|^2 \le c_1\sigma_{1s}\psi p$ and $\sigma_s^2\|\theta\|^2 \le c_2\sigma_s\phi^T\theta$, where $c_1, c_2 \in \mathbb{R}^+$ depend on the bounds for $\sigma_{1s}|p|$ and $\sigma_s\|\theta\|$, respectively, we have
$$|\dot p|^2 \le c\left(|\epsilon m|^2 + \sigma_{1s}\psi p\right), \quad \|\dot\theta\|^2 \le c\left(|\epsilon m|^2 + \sigma_s\phi^T\theta\right),$$
which imply that $\dot p, \dot\theta \in S\left(\frac{\eta^2}{m^2}\right)$.
(c) With $w(t) = |\epsilon m|\nu_0$, $w_1(t) = |\epsilon m|\nu_1$, from (3.20) we obtain
$$\dot V \le -\frac{e^T qq^T e}{2} - \frac{\nu e^T L e}{2} - \epsilon^2 n_s^2 - \epsilon\eta - w_1\psi p - |\rho^*|w\,\phi^T\theta$$
$$\le -\frac{e^T qq^T e}{2} - \frac{\nu e^T L e}{2} - \epsilon^2 n_s^2 + |\epsilon m|\left[\frac{|\eta|}{m} - \nu_1\psi p - |\rho^*|\nu_0\phi^T\theta\right].$$
Since
$$-\phi^T\theta \le -\frac{\|\phi\|^2}{2} + \frac{\|\theta^*\|^2}{2}, \quad -\psi p \le -\frac{\psi^2}{2} + \frac{|\rho^*|^2}{2},$$
we have
$$\dot V \le -\frac{e^T qq^T e}{2} - \frac{\nu e^T L e}{2} - \epsilon^2 n_s^2 + |\epsilon m|\left[\frac{|\eta|}{m} + \frac{\nu_1}{2}|\rho^*|^2 + \frac{\nu_0}{2}|\rho^*|\,\|\theta^*\|^2 - \frac{\nu_1}{2}\psi^2 - \frac{\nu_0}{2}|\rho^*|\,\|\phi\|^2\right]. \qquad (3.27)$$
Using similar arguments as before, we can establish that for $V \ge V_0$ and some $V_0 > 0$, $\dot V \le 0$, and therefore $\epsilon, \theta, p \in L_\infty$. From (3.27) we can show that $\epsilon, \epsilon n_s \in S\left(\nu_0 + \nu_1 + \frac{\eta^2}{m^2}\right)$, which together with
$$|\dot p|^2 \le c\,\epsilon^2 m^2, \quad \|\dot\theta\|^2 \le c\,\epsilon^2 m^2, \quad m^2 = 2\left(\beta + n_s^2\right),$$
imply that $\dot p, \dot\theta \in S\left(\nu_0 + \nu_1 + \frac{\eta^2}{m^2}\right)$.
(d) For $\eta = 0$, (3.22) becomes
$$\dot V \le -\lambda_0\|e\|^2 - \frac{\epsilon^2 n_s^2}{2} - w\,\phi^T\theta\,|\rho^*| - w_1\psi p. \qquad (3.28)$$
For $w = \sigma_s$, $w_1 = \sigma_{1s}$, or for $w = w_1 = 0$, we have $w\phi^T\theta \ge 0$ and $w_1\psi p \ge 0$. Hence, $\dot V \le 0$ and $V, \epsilon, \theta, p \in L_\infty$. Since $V$ is a nonincreasing function of time which is bounded from below, $\lim_{t\to\infty}V(t) = V_\infty$ exists. Therefore, from (3.28) we have
$$\int_0^\infty\left(\lambda_0\|e\|^2 + \frac{\epsilon^2 n_s^2}{2} + w\,\phi^T\theta\,|\rho^*| + w_1\psi p\right)d\tau \le V(0) - V_\infty < \infty,$$
i.e., $e$, $\epsilon n_s$, $\epsilon$, $\sqrt{w\phi^T\theta}$, $\sqrt{w_1\psi p} \in L_2$. Since for $w = w_1 = 0$
$$|\dot p|^2 \le c\,\epsilon^2 m^2, \quad \|\dot\theta\|^2 \le c\,\epsilon^2 m^2, \quad m^2 = 2\left(\beta + n_s^2\right),$$
it follows that $\dot p, \dot\theta \in L_2$. For $w = \sigma_s$, $w_1 = \sigma_{1s}$, we have from (b) that
$$|\dot p|^2 \le c\left(|\epsilon m|^2 + \sigma_{1s}\psi p\right), \quad \|\dot\theta\|^2 \le c\left(|\epsilon m|^2 + \sigma_s\phi^T\theta\right),$$
which implies that $\dot p, \dot\theta \in L_2$. □
The use of the parametric model (3.11) and the SPR condition to derive robust adaptive laws whose properties are given by Theorem 3.1 has not been done before; this constitutes one of the new results given in the paper. In Sect. 5, we show how such adaptive laws can be combined with control laws to form robust adaptive control schemes.
3.2 Gradient Method
This method is based on the development of an algebraic error equation and the minimization of a certain cost function $J(\theta, t)$ w.r.t. the estimated parameter $\theta$ for each time $t$, using the steepest-descent method, as shown below. Since $\theta^*$ is constant, we can rewrite (3.3) in the form
$$z = \theta^{*T}\zeta + \eta,$$
where $\zeta = W(s)[\omega]$, $\eta = W(s)[\eta_0]$. The estimate $\hat z$ of $z$ at time $t$ is given by $\hat z = \theta^T\zeta$, and the estimation error by $\epsilon_1 = \hat z - z = \theta^T\zeta - z$. The normalized estimation error $\epsilon$ is constructed as
$$\epsilon = \frac{\epsilon_1}{m^2} = \frac{\theta^T\zeta - z}{m^2},$$
where $m^2 = 1 + n_s^2$ and $n_s$ is the normalizing signal designed so that
(A1) $\frac{\zeta}{m}, \frac{\eta}{m} \in L_\infty$.
The adaptive law for updating $\theta$ is derived by minimizing various cost functions of $\epsilon$ w.r.t. $\theta$. We start by considering the instantaneous cost
$$J(\theta, t) = \frac{\epsilon^2 m^2}{2} + \frac{w}{2}\theta^T\theta = \frac{\left(\theta^T\zeta - z\right)^2}{2m^2} + \frac{w}{2}\theta^T\theta, \qquad (3.29)$$
where $w(t) \ge 0$ is a design function which acts as a weighting. Using the gradient method, we obtain $\dot\theta = -\Gamma\nabla J(\theta)$, where $\Gamma = \Gamma^T > 0$ is a scaling matrix and $\nabla J(\theta)$ is the gradient vector of $J(\theta,t)$ w.r.t. $\theta$, i.e., $\nabla J(\theta) = \epsilon\zeta + w\theta$, and
$$\dot\theta = -\Gamma\epsilon\zeta - \Gamma w\theta, \quad \theta(0) = \theta_0. \qquad (3.30)$$
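A minimal numerical sketch of (3.30) is given below (not from the text). It takes the ideal case $w = 0$, $\eta = 0$ with a static regressor ($W(s) = 1$, so $\zeta = \omega$ is directly measurable), uses $n_s^2 = \zeta^T\zeta$ as the normalizing signal, and integrates the law with forward Euler; $\Gamma$, the step size, and the regressor are illustrative choices.

```python
import numpy as np

# Normalized gradient law (3.30) with w(t) = 0 for the static model
# z = theta*^T zeta. Signal names follow the text.
theta_star = np.array([1.5, -0.7, 2.0])
theta = np.zeros(3)
Gamma, dt = 5.0 * np.eye(3), 0.01
for i in range(40000):
    t = i * dt
    zeta = np.array([np.sin(t), np.cos(2.0 * t), 1.0])  # persistently exciting
    z = theta_star @ zeta
    m2 = 1.0 + zeta @ zeta               # m^2 = 1 + n_s^2, n_s^2 = zeta^T zeta
    eps = (theta @ zeta - z) / m2        # normalized estimation error
    theta = theta + dt * (-Gamma @ (eps * zeta))   # Euler step of (3.30)
print(theta)  # converges toward theta_star
```

Because the regressor contains three distinct frequency components, it is persistently exciting and the parameter error converges to zero, consistent with criterion (iii) of Sect. 3.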
The adaptive law (3.30) has the same form as (3.16), except that $\epsilon$ and $n_s$ are defined in a different way. Similarly, we can derive an adaptive law for the bilinear model (3.4) by rewriting it in the form
$$z = \rho^*\left(\theta^{*T}\zeta + z_1\right) + \eta,$$
where $z_1 = W(s)[z_0]$.
Then, the estimate $\hat z$ of $z$, the estimation error $\epsilon_1$, and the normalized estimation error $\epsilon$ can be constructed as follows:
$$\hat z = p\,\theta^T\zeta + p z_1,$$
$$\epsilon_1 = \hat z - z = \rho^*\theta^T\zeta + \left(p - \rho^*\right)\xi + \rho^* z_1 - z,$$
$$\epsilon = \frac{\hat z - z}{m^2} = \frac{\rho^*\theta^T\zeta + \left(p - \rho^*\right)\xi + \rho^* z_1 - z}{m^2},$$
where $\xi = \theta^T\zeta + z_1$, $m^2 = 1 + n_s^2$, and $n_s$ is chosen so that
(A1') $\frac{\zeta}{m}, \frac{\xi}{m}, \frac{\eta}{m} \in L_\infty$,
and $p$, $\theta$ are as defined before. The cost function to be minimized w.r.t. $p$, $\theta$ is given by
$$J(p, \theta, t) = \frac{\epsilon^2 m^2}{2} + \frac{w_1}{2}p^2 + \frac{w}{2}|\rho^*|\,\theta^T\theta = \frac{\left[\rho^*\theta^T\zeta + \left(p - \rho^*\right)\xi + \rho^* z_1 - z\right]^2}{2m^2} + \frac{w_1}{2}p^2 + \frac{w}{2}\theta^T\theta\,|\rho^*|,$$
where $w, w_1 \ge 0$ are design functions which act as weightings. Using the gradient method, we have
$$\dot p = -\gamma\nabla J(p) = -\gamma\epsilon\xi - \gamma w_1 p, \quad \gamma > 0,$$
$$\dot\theta = -\Gamma_1\nabla J(\theta) = -\Gamma_1\rho^*\epsilon\zeta - |\rho^*|\Gamma_1 w\theta, \quad \Gamma_1 = \Gamma_1^T > 0.$$
Since $\Gamma_1$ is arbitrary, we can assume that $\Gamma_1 = \frac{\Gamma}{|\rho^*|}$ for some other arbitrary matrix $\Gamma = \Gamma^T > 0$ and use it to remove the unknown parameter $\rho^*$ from the adaptive law for $\theta$, i.e.,
$$\dot\theta = -\Gamma\epsilon\zeta\,\mathrm{sgn}(\rho^*) - \Gamma w\theta, \quad \dot p = -\gamma\epsilon\xi - \gamma w_1 p. \qquad (3.31)$$
The stability properties of (3.31) for various choices of $w$, $w_1$ are given by the following theorem.
Theorem 3.2 The adaptive law (3.31) guarantees the following properties:
(a) Fixed-$\sigma$: For $w(t) = \sigma$, $w_1(t) = \sigma_1$, where $\sigma_1, \sigma > 0$ are constants, we have
(i) $\epsilon, \epsilon n_s, \theta, p, \dot\theta, \dot p \in L_\infty$; (ii) $\epsilon, \epsilon n_s, \dot\theta, \dot p \in S\left(\sigma + \sigma_1 + \frac{\eta^2}{m^2}\right)$.
(b) Switching-$\sigma$: For $w(t) = \sigma_s$, $w_1(t) = \sigma_{1s}$, where $\sigma_s$, $\sigma_{1s}$ are as defined in Theorem 3.1(b), we have
(i) $\epsilon, \epsilon n_s, \theta, p, \dot\theta, \dot p \in L_\infty$; (ii) $\epsilon, \epsilon n_s, \dot\theta, \dot p \in S\left(\frac{\eta^2}{m^2}\right)$.
(c) $\epsilon$-modification: For $w(t) = |\epsilon m|\nu_0$, $w_1(t) = |\epsilon m|\nu_1$, and for some constants $\nu_0, \nu_1 > 0$, we have
(i) $\epsilon, \epsilon n_s, \theta, p, \dot\theta, \dot p \in L_\infty$; (ii) $\epsilon, \epsilon n_s, \dot\theta, \dot p \in S\left(\nu_0 + \nu_1 + \frac{\eta^2}{m^2}\right)$.
(d) Ideal Case: For $\eta = 0$ and $w = w_1 = 0$, or $w(t) = \sigma_s$ and $w_1 = \sigma_{1s}$, we have
(i) $\epsilon, \epsilon n_s, \theta, p, \dot\theta, \dot p \in L_\infty$; (ii) $\epsilon, \epsilon n_s, \dot\theta, \dot p \in L_2$.
If, in addition, $\frac{\zeta}{m}, \frac{\dot\zeta}{m}, \frac{\dot m}{m} \in L_\infty$, then $\epsilon, \epsilon n_s, \dot\theta, \dot p \to 0$ as $t \to \infty$.
Proof. We consider the positive definite function
$$V(\psi, \phi) = \frac{\psi^2}{2\gamma} + |\rho^*|\frac{\phi^T\Gamma^{-1}\phi}{2}, \qquad (3.32)$$
which has the same form as that given by (3.20) with $P_c = 0$. The time derivative of $V$ along the solution of (3.31) is given by
$$\dot V = -\phi^T\zeta\epsilon|\rho^*|\,\mathrm{sgn}(\rho^*) - \psi\epsilon\xi - |\rho^*|w\,\phi^T\theta - w_1\psi p.$$
But
$$\frac{\rho^*\theta^T\zeta + \psi\xi + \rho^* z_1 - z}{m^2} = \frac{\rho^*\theta^T\zeta + \psi\xi - \rho^*\theta^{*T}\zeta - \eta}{m^2} = \frac{\rho^*\phi^T\zeta + \psi\xi - \eta}{m^2} = \epsilon. \qquad (3.33)$$
Since $|\rho^*|\,\mathrm{sgn}(\rho^*) = \rho^*$, it follows that $\rho^*\phi^T\zeta + \psi\xi = \epsilon m^2 + \eta$, and therefore
$$-\epsilon\left(\rho^*\phi^T\zeta + \psi\xi\right) = -\epsilon^2 m^2 - \epsilon\eta \le -\frac{\epsilon^2 m^2}{2} + \frac{\eta^2}{2m^2},$$
so that
$$\dot V \le -\frac{\epsilon^2 m^2}{2} - |\rho^*|w\,\phi^T\theta - w_1\psi p + \frac{\eta^2}{2m^2}. \qquad (3.34)$$
Equation (3.34) has exactly the same form as (3.22), except for the first terms: $-\lambda_0\|e\|^2$ in (3.22) and $-\frac{\epsilon^2 m^2}{2}$ in (3.34). Since the first term does not affect the stability arguments and results, we can use Theorem 3.1 to establish that $\epsilon, \theta, p \in L_\infty$ and that property (ii) in (a), (b), (c), (d) holds for the various choices of $w$, $w_1$. In order to complete the proof, we need to establish that in (a), (b), (c), (d) we also have $\epsilon n_s, \dot\theta, \dot p \in L_\infty$, and in (d) that for $\frac{\zeta}{m}, \frac{\dot\zeta}{m}, \frac{\dot m}{m} \in L_\infty$ we have $\epsilon, \epsilon n_s, \dot\theta, \dot p \to 0$ as $t \to \infty$. From (3.33) we have that $\theta, p, \frac{\zeta}{m}, \frac{\xi}{m}, \frac{\eta}{m} \in L_\infty$ imply that $\epsilon m \in L_\infty$, which in turn implies that $\epsilon, \epsilon n_s \in L_\infty$. Since for all choices of $w$, $w_1$ in (a), (b), (c) and (d), $w\theta, w_1 p \in L_\infty$, it follows from (3.31) that $\dot\theta, \dot p \in L_\infty$. Using (3.33) with $\eta = 0$ and $\dot\theta, \dot p, \frac{\dot\zeta}{m}, \frac{\dot\xi}{m}, \frac{\dot m}{m} \in L_\infty$, it follows that $\frac{d}{dt}(\epsilon m) \in L_\infty$, which together with $\epsilon m \in L_2$ implies that $\epsilon m, \epsilon, \epsilon n_s \to 0$, and therefore $\dot\theta, \dot p \to 0$ as $t \to \infty$. □
The stability properties of (3.30) for the linear parametric model follow directly from Theorem 3.2 by setting $p = \rho^* = 1$, $w_1 = 0$ and $\psi = 0$.
Instead of the instantaneous cost $J(\theta,t)$, $J(\theta,p,t)$, one can choose various different types of cost functions, which will lead to different adaptive laws. An interesting cost function to consider is the integral cost [21]
$$J(\theta, t) = \frac{1}{2}\int_0^t e^{-\beta(t-\tau)}\epsilon^2(t,\tau)m^2(\tau)\,d\tau + \frac{1}{2}w(t)\theta^T\theta, \qquad (3.35)$$
where $m^2 = 1 + n_s^2$ satisfies (A1), $\beta > 0$ is a design constant, $w(t) \ge 0$, $w \in L_\infty$ is a weighting coefficient,
$$\epsilon(t,\tau) = \frac{\hat z(t,\tau) - z(\tau)}{m^2(\tau)}, \quad \tau \le t, \qquad (3.36)$$
and $\hat z(t,\tau) = \theta^T(t)\zeta(\tau)$ is the estimate of $z(\tau)$ at time $\tau$ formed by using the estimate $\theta(t)$ of $\theta^*$ at time $t \ge \tau$; i.e., $\epsilon(t,\tau)$ is the normalized estimation error at time $\tau$ based on the estimate of $\theta^*$ at time $t \ge \tau$. The design constant $\beta > 0$ acts as a forgetting factor; i.e., as $t$ increases, the effect of the old data at time $\tau < t$ is discarded exponentially. Using (3.36), we can express (3.35) in the form
$$J(\theta, t) = \frac{1}{2}\int_0^t e^{-\beta(t-\tau)}\frac{\left(\theta^T(t)\zeta(\tau) - z(\tau)\right)^2}{m^2(\tau)}\,d\tau + \frac{1}{2}w(t)\theta^T\theta. \qquad (3.37)$$
Applying the gradient method to (3.37), we obtain
$$\dot\theta = -\Gamma\nabla J(\theta) = -\Gamma\int_0^t e^{-\beta(t-\tau)}\frac{\left(\theta^T(t)\zeta(\tau) - z(\tau)\right)\zeta(\tau)}{m^2(\tau)}\,d\tau - \Gamma w\theta, \qquad (3.38)$$
where $\Gamma = \Gamma^T > 0$, by assuming that $\zeta(\tau)$, $z(\tau)$, $m(\tau)$, $w(t)$ are independent of $\theta(t)$ at time $t$. The adaptive law (3.38) can be implemented as [21,19]
$$\dot\theta = -\Gamma\left(R(t)\theta + r(t)\right) - \Gamma w\theta,$$
$$\dot R = -\beta R + \frac{\zeta\zeta^T}{m^2}, \quad R(0) = 0, \qquad (3.39)$$
$$\dot r = -\beta r - \frac{z\zeta}{m^2}, \quad r(0) = 0,$$
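The implementable form (3.39) can be sketched numerically as follows (illustrative, not from the text). With $w = 0$ and the static model $z = \theta^{*T}\zeta$, the pair $(R, r)$ accumulates exponentially discounted data, and $R\theta + r = R(\theta - \theta^*)$, so $\theta$ converges to $\theta^*$ whenever $R$ becomes positive definite; gains, $\beta$, and the regressor below are arbitrary choices.

```python
import numpy as np

# Integral adaptive law (3.39) with leakage w = 0 for z = theta*^T zeta.
theta_star = np.array([2.0, -1.0])
theta, R, r = np.zeros(2), np.zeros((2, 2)), np.zeros(2)
Gamma, beta, dt = 2.0 * np.eye(2), 0.5, 0.01
for i in range(60000):
    t = i * dt
    zeta = np.array([np.sin(t), 1.0])    # persistently exciting regressor
    z = theta_star @ zeta
    m2 = 1.0 + zeta @ zeta               # m^2 = 1 + n_s^2, n_s^2 = zeta^T zeta
    theta = theta + dt * (-Gamma @ (R @ theta + r))        # theta update
    R = R + dt * (-beta * R + np.outer(zeta, zeta) / m2)   # R_dot
    r = r + dt * (-beta * r - z * zeta / m2)               # r_dot
print(theta)  # converges toward theta_star = [2, -1]
```

Note that, exactly as the text remarks, the signal $\epsilon(t,\tau)$ never has to be constructed: only the filtered quantities $R$ and $r$ are propagated.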
where $R \in \mathbb{R}^{n\times n}$, $r \in \mathbb{R}^{n\times 1}$. It is clear from (3.39) that the signal $\epsilon(t,\tau)$ does not have to be constructed; it is used only for design and analysis purposes. Similarly, for the bilinear model we can develop the adaptive law
$$\dot\theta = -\mathrm{sgn}(\rho^*)\,\Gamma\left(R\,p\,\theta + \bar r\right) - \Gamma w\theta, \qquad \dot p = -\gamma\left(R_1 p + r_1\right) - \gamma w_1 p,$$
$$\dot R = -\beta R + \frac{\zeta\zeta^T}{m^2}, \quad R(0) = 0, \qquad \dot{\bar r} = -\beta\bar r - \frac{z\zeta}{m^2}, \quad \bar r(0) = 0, \qquad (3.40)$$
$$\dot R_1 = -\beta R_1 + \frac{\xi^2}{m^2}, \quad R_1(0) = 0, \qquad \dot r_1 = -\beta r_1 - \frac{z\xi}{m^2}, \quad r_1(0) = 0,$$
where $R_1 \in \mathbb{R}$, $r_1 \in \mathbb{R}$, by minimizing the cost function
p) =
1 [' [p'OT(t)¢(r) + (p(t) -- p*)¢(v) -t- p ' z l ( r ) -- z(r)] 2 ~ t e-~('-~) aT ~o ~(~)
1 2 +~W(t)oTo + ~Wl(t)p w.r.t. 0, p under the a~sumption that ((r), ((v), z(r), w(t), lOl ( t ) are independent of 0(t), p(t) at time t. The stability properties of (3.39) and (3.40) for different choices of the leakage w, w~ can be developed in a slightly different way than those of (3.30) and (3.31). For clarity of presentation we confine ourselves only to the analysis of (3.39). We consider the positive definite function V = CwT-l~b whose time derive2 tire along the solution of (3.39) is given by = --¢T[R0 + r] - wCWO
= _¢TR¢ + CT(t)
//
(3.41)
~2(~) d r - wCTO e-a('-,)((~)"(~)
[£'
(¢T(t)((r)) 2- ]~
_
(3.42) _
r/~
½
--w~TO (where we made use of the Schwarz Inequality)
= -¢TR¢ + (,TR¢)½[fote-#('-r)2~((;))drl ½- wdpTo,
(3.43)
or
1)" < _
ffT_Rff
1 + 3( I\.(~1~ ~_~tl]tl2/ 8 2 _w~To
(3.44)
where we carried out a simple completion of squares. The leakage term w > 0 is to be chosen so that for V > V0 and some constant V0 > 0, V < 0, which guarantees that ~b• Loo. The following theorem describes the stability properties of the adaptive law (3.39) for different choices of the leakage term w(t). T h e o r e m 3.3 The adaptive law (3.39) guarantees the following properties: (a) Fixed-a: For w(t) = a, where a > 0 is a constant, we have (i) ,, ,n,, O, 0 • L ~ (i 0 e, 'ns, 0 • S(a + ~ ) . (b) Switching-a: For w(t) = as, where as is as defined in Theorem 3.1(b), we have ( i ) , , , n s , O, O e L ~ (i/) ,, ,ns, 0 • S ( ~ ) . (e) e-modification: For w(t) = II(e(t,r)m)~ll~vo, where vo > 0 is a design constant, we have (i) e, ens, O, 0 • Loo (ii) e, ens, 0 • S(Vo + m~ ).
(d) Ideal Case: For ~ = O and w = 0 or w = ~,, we have (i) e, ens, O, O e L ~ (ii) e, en,, O e L=. If, in addition, --' ~ m ¢n' - E Loo, then e, en,, 0 --+ 0 as t -+ oo. m m Proof. (a) For w = a, we have
-\sigma\phi^T\theta \le -\sigma\left(\phi^T\phi - \|\phi\|\,\|\theta^*\|\right) \le -\frac{\sigma}{2}\|\phi\|^2 + \frac{\sigma}{2}\|\theta^*\|^2 .

Hence, (3.44) becomes

\dot{V} \le -\frac{\phi^T R\phi}{2} - \frac{\sigma}{2}\|\phi\|^2 + \frac{1}{2}\int_0^t e^{-\beta(t-\tau)} \frac{\eta^2(\tau)}{m^2(\tau)}\, d\tau + \frac{\sigma}{2}\|\theta^*\|^2 .   (3.45)

Thus, as in the proof of Theorem 3.1(a), we can show that there exists V₀ > 0 such that for V ≥ V₀, V̇ ≤ 0, which implies that V and hence ϕ, θ, ε, εns ∈ L∞. Since R ∈ L∞, from (3.39) it follows that θ̇ ∈ L∞. Now integrating on both sides of (3.45), we have ∀t, T ≥ 0:

\frac{1}{2}\int_t^{t+T} \phi^T R\phi\, ds + \frac{\sigma}{2}\int_t^{t+T} \|\phi\|^2\, ds \le V(t) - V(t+T) + \frac{1}{2}\int_t^{t+T} \left[\int_0^s e^{-\beta(s-\tau)} \frac{\eta^2(\tau)}{m^2(\tau)}\, d\tau\right] ds + \frac{\sigma}{2}\|\theta^*\|^2\, T,

where we used the fact that η/m ∈ L∞. Interchanging the order of integration in the second integral on the right-hand side and using the boundedness of V, we obtain \sqrt{\phi^T R\phi},\ \sqrt{\sigma}\|\phi\| \in S(\sigma + \eta^2/m^2). Now from (3.39) we have
\frac{d}{dt}\left[\phi^T R\phi\right] = 2\phi^T R\dot\phi + \phi^T \dot{R}\phi = -2\phi^T R\Gamma\left[R\theta + r + \sigma\theta\right] + \phi^T\left[-\beta R + \frac{\zeta\zeta^T}{m^2}\right]\phi

= -2\phi^T R\Gamma\left[R\phi - \int_0^t e^{-\beta(t-\tau)} \frac{\eta(\tau)\zeta(\tau)}{m^2(\tau)}\, d\tau + \sigma\theta\right] - \beta\phi^T R\phi + \frac{(\phi^T\zeta)^2}{m^2}.   (3.46)
Also,

\phi^T\zeta = \varepsilon m^2 + \eta \;\Longrightarrow\; \varepsilon m = \frac{\phi^T\zeta}{m} - \frac{\eta}{m} \;\Longrightarrow\; \varepsilon^2 m^2 \le \frac{2(\phi^T\zeta)^2}{m^2} + \frac{2\eta^2}{m^2},   (3.47)

where we used the identity (a + b)² ≤ 2a² + 2b². Using (3.47) in (3.46), we obtain

\frac{d}{dt}\left[\phi^T R\phi\right] \ge -2\phi^T R\Gamma R\phi + 2\phi^T R\Gamma \int_0^t e^{-\beta(t-\tau)} \frac{\eta(\tau)\zeta(\tau)}{m^2(\tau)}\, d\tau - 2\sigma\phi^T R\Gamma\theta - \beta\phi^T R\phi + \frac{\varepsilon^2 m^2}{2} - \frac{\eta^2}{m^2}.
Integrating both sides of the above inequality from t to t + T and making use of the fact that ϕᵀRϕ|ₜ ≥ 0, we obtain an inequality involving ϕᵀRϕ|ₜ₊T. Using this inequality in (3.42) we obtain
\dot{V}(t+T) \le 2\int_t^{t+T} \phi^T R\Gamma R\phi\, ds - 2\int_t^{t+T} \phi^T R\Gamma \left[\int_0^s e^{-\beta(s-\tau)} \frac{\eta(\tau)\zeta(\tau)}{m^2(\tau)}\, d\tau\right] ds + 2\sigma\int_t^{t+T} \phi^T R\Gamma\theta\, ds + \beta\int_t^{t+T} \phi^T R\phi\, ds - \int_t^{t+T} \frac{\varepsilon^2 m^2}{2}\, ds + \int_t^{t+T} \frac{\eta^2}{m^2}\, ds + \int_0^{t+T} e^{-\beta(t+T-\tau)} \frac{|\eta(\tau)|}{m^2(\tau)} \|\phi(t+T)\|\, \|\zeta(\tau)\|\, d\tau - \sigma\phi^T\theta\big|_{t+T}.   (3.48)
Now

\left|2\phi^T R\Gamma \int_0^s e^{-\beta(s-\tau)} \frac{\eta(\tau)\zeta(\tau)}{m^2(\tau)}\, d\tau\right| \le 2\|\phi^T R\Gamma\| \int_0^s e^{-\beta(s-\tau)} \frac{|\eta(\tau)|\,\|\zeta(\tau)\|}{m^2(\tau)}\, d\tau \le \|\phi^T R\Gamma\|^2 + \left[\int_0^s e^{-\beta(s-\tau)} \frac{|\eta(\tau)|\,\|\zeta(\tau)\|}{m^2(\tau)}\, d\tau\right]^2,

where we used the identity 2ab ≤ a² + b² and the Schwarz Inequality. Integrating both sides of the above inequality and making use of R, Γ ∈ L∞ and (ϕᵀRϕ)^{1/2} ∈ S(η²/m² + σ), we can show that the left-hand side, averaged over intervals of length T, is also of the class S(η²/m² + σ).   (3.49)

Also, √σ‖ϕ‖ ∈ S(η²/m² + σ), together with R, θ ∈ L∞, imply that

\left|\sigma\phi^T R\Gamma\theta\right|^{1/2} \in S\!\left(\frac{\eta^2}{m^2} + \sigma\right).   (3.50)
Now using (3.49) and (3.50) in (3.48), and noting that V̇ ∈ L∞ and ϕᵀRϕ ∈ S(σ + η²/m²), it follows that εm and hence ε, εns ∈ S(σ + η²/m²). From (3.39) we can show that

\|\dot\theta\|^2 \le c_1\, \phi^T R\phi + c_2 \left[\int_0^t e^{-\beta(t-\tau)} \frac{|\eta(\tau)|\,\|\zeta(\tau)\|}{m^2(\tau)}\, d\tau\right]^2 + c_3\, \sigma^2\|\theta\|^2

for some c₁, c₂, c₃ > 0. Since R, θ ∈ L∞ and (ϕᵀRϕ)^{1/2} ∈ S(σ + η²/m²), it follows that θ̇ ∈ S(σ + η²/m²). This completes the proof of (a).

(b) For w(t) = σs, we have
\sigma_s\phi^T\theta = \sigma_s\left(\|\theta\|^2 - \theta^{*T}\theta\right) \ge \sigma_s\|\theta\|\left(\|\theta\| - M_0 + M_0 - \|\theta^*\|\right).

Since σs(‖θ‖ − M₀) ≥ 0 and M₀ > ‖θ*‖, it follows that σsϕᵀθ ≥ σs‖θ‖(M₀ − ‖θ*‖) ≥ 0, i.e.,

\sigma_s\|\theta\| \le \frac{\sigma_s\phi^T\theta}{M_0 - \|\theta^*\|}.   (3.51)
The inequality (3.44) can be written as

\dot{V} \le -\frac{\phi^T R\phi}{2} + \frac{1}{2}\int_0^t e^{-\beta(t-\tau)} \frac{\eta^2(\tau)}{m^2(\tau)}\, d\tau - \sigma_s\phi^T\theta.   (3.52)

For ‖θ‖ = ‖ϕ + θ*‖ > 2M₀ we have

-\sigma_s\phi^T\theta \le -\frac{\sigma_0}{2}\|\phi\|^2 + \frac{\sigma_0}{2}\|\theta^*\|^2 .

Hence, following the same approach as in the proof of (a), we can show that V̇ ≤ 0 for V ≥ V₀ and some V₀ > 0, i.e., V ∈ L∞, which implies that ε, θ, εns ∈ L∞. Integrating on both sides of (3.52), we obtain \sqrt{\phi^T R\phi},\ \sqrt{\sigma_s\phi^T\theta} \in S(\eta^2/m^2). Proceeding as in the proof of (a) and making use of (3.51), we can show that ε, εns ∈ S(η²/m²). Since σs²‖θ‖² ≤ c₁σsϕᵀθ, where c₁ ∈ ℝ₊ depends on the bound for σ₀‖θ‖, we obtain, as in (a), a bound on ‖θ̇‖² which implies that θ̇ ∈ S(η²/m²).
(c) With w(t) = ‖(ε(t,τ)m)t‖β ν₀, where ‖(x)t‖β denotes the exponentially weighted L₂ norm \left(\int_0^t e^{-\beta(t-\tau)} x^2(\tau)\, d\tau\right)^{1/2}, we obtain from (3.41)

\dot{V} = -\phi^T[R\theta + r] - w\phi^T\theta
= -\phi^T\left[\int_0^t e^{-\beta(t-\tau)} \frac{\zeta\zeta^T\theta}{m^2}\, d\tau - \int_0^t e^{-\beta(t-\tau)} \frac{\zeta z}{m^2}\, d\tau\right] - w\phi^T\theta
= -\phi^T \int_0^t e^{-\beta(t-\tau)} \zeta(\tau)\varepsilon(t,\tau)\, d\tau - \phi^T\theta\, \|(\varepsilon(t,\tau)m)_t\|_\beta\, \nu_0
= -\int_0^t e^{-\beta(t-\tau)} \varepsilon(t,\tau)\left[\varepsilon(t,\tau)m^2(\tau) + \eta(\tau)\right] d\tau - \phi^T\theta\, \|(\varepsilon(t,\tau)m)_t\|_\beta\, \nu_0
\le -\|(\varepsilon(t,\tau)m)_t\|_\beta \left[\|(\varepsilon(t,\tau)m)_t\|_\beta - \|(\eta/m)_t\|_\beta + \nu_0\phi^T\theta\right].

But

-\phi^T\theta \le -\frac{\|\phi\|^2}{2} + \frac{\|\theta^*\|^2}{2} .
Using the same arguments as before, we can establish that V̇ ≤ 0 for V ≥ V₀ and some V₀ > 0, and therefore ε, θ, εns ∈ L∞. From (3.44) we have

\dot{V} \le -\frac{\phi^T R\phi}{2} + \frac{1}{2}\|(\eta/m)_t\|_\beta^2 - \nu_0\, \|(\varepsilon(t,\tau)m)_t\|_\beta\, \phi^T\theta.

Integrating on both sides of the above inequality and using the boundedness of V, ϕ and θ, we obtain \sqrt{\phi^T R\phi} \in S(\nu_0 + \eta^2/m^2). Following the same steps as in (a), we can conclude that ε, εns, θ̇ ∈ S(ν₀ + η²/m²), and the proof of (c) is complete.

(d) For η = 0, (3.44) becomes

\dot{V} \le -\frac{\phi^T R\phi}{2} - w\phi^T\theta.   (3.53)
For w = σs or for w = 0, wϕᵀθ ≥ 0. Hence, V̇ ≤ 0 and V, ε, θ ∈ L∞. Since V is a nonincreasing function of time which is bounded from below, lim_{t→∞} V(t) = V∞ exists. Therefore, from (3.53) we have \sqrt{\phi^T R\phi},\ \sqrt{w\phi^T\theta} \in L_2. Now using the same arguments as in (a) above, we can show that ε, εns, θ̇ ∈ L₂. The rest of the proof can be completed using exactly the same arguments as in the proof of Theorem 3.2(d). □
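The leakage choices compared in Theorem 3.3 differ only in how w(t) is computed from the current estimate and data. The sketch below is a minimal numerical illustration of the three choices driving the gradient law θ̇ = −Γ(εζ + wθ); the gains σ₀, M₀, ν₀, the ramp used for σs, and the regressor in the usage below are illustrative assumptions, not values from the text:

```python
import numpy as np

def fixed_sigma(theta, sigma0=0.1, **_):
    # fixed-sigma: constant leakage, active even when the estimate is small
    return sigma0

def switching_sigma(theta, sigma0=0.1, M0=5.0, **_):
    # switching-sigma: zero inside the a priori ball M0 >= ||theta*|| (no bias
    # in the ideal case), ramping up to sigma0 once ||theta|| exceeds M0
    n = np.linalg.norm(theta)
    if n <= M0:
        return 0.0
    if n <= 2.0 * M0:
        return sigma0 * (n / M0 - 1.0)
    return sigma0

def eps_modification(theta, eps=0.0, m=1.0, nu0=0.05, **_):
    # epsilon-modification: leakage proportional to the normalized error |eps*m|,
    # so the leakage disappears as the estimation error goes to zero
    return nu0 * abs(eps * m)

def step(theta, zeta, z, w_fn, Gamma=1.0, dt=0.01):
    # one Euler step of theta_dot = -Gamma*(eps*zeta + w*theta),
    # with eps = (theta'zeta - z)/m^2 and m^2 = 1 + zeta'zeta
    m2 = 1.0 + zeta @ zeta
    eps = (theta @ zeta - z) / m2
    w = w_fn(theta, eps=eps, m=np.sqrt(m2))
    return theta - dt * Gamma * (eps * zeta + w * theta)
```

With w = 0 the law reduces to the unmodified gradient algorithm; the term −Γwθ is what keeps θ bounded when η ≠ 0, at the price (for fixed-σ) of a parameter bias of order σ even in the ideal case.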
3.3 Least Squares
A type of integral cost function of the form (3.35) whose discrete-time version attracted considerable interest in parameter identification of discrete-time systems is the exponentially weighted function
J(\theta,t) = \frac{1}{2}e^{-\beta t}(\theta-\theta_0)^T Q_0(\theta-\theta_0) + \frac{1}{2}\int_0^t e^{-\beta(t-\tau)}\left[\varepsilon^2(t,\tau)m^2(\tau) + w(\tau)\theta^T(t)\theta(\tau)\right] d\tau,   (3.54)

where Q₀ = Q₀ᵀ > 0, β ≥ 0, θ₀ = θ(0) and all other variables are as defined in Sect. 3.2. Using (3.36) we can rewrite (3.54) in the form

J(\theta,t) = \frac{1}{2}e^{-\beta t}(\theta-\theta_0)^T Q_0(\theta-\theta_0) + \frac{1}{2}\int_0^t e^{-\beta(t-\tau)}\left[\frac{\left(\theta^T(t)\zeta(\tau) - z(\tau)\right)^2}{m^2(\tau)} + w(\tau)\theta^T(t)\theta(\tau)\right] d\tau.   (3.55)
The cost (3.54) or (3.55) penalizes the square of the parameter change from t = 0 to t, weighted by e^{−βt}Q₀, an exponentially weighted average of the square of past estimation errors based on the current parameter estimate, and an exponentially weighted average of the past and present parameter estimates. The exponential weighting coefficient β acts as a forgetting factor, i.e., as time t increases, the effect of old data at times τ < t and of the initial parameter θ₀ is discounted exponentially. Since ζ/m, z/m ∈ L∞, J(θ) is a convex function of θ(t) over ℝⁿ at each time t. Hence, any local minimum is also global and satisfies
∇J(θ) = 0, ∀t ≥ 0, i.e.,

\nabla J(\theta(t)) = e^{-\beta t}Q_0(\theta(t) - \theta_0) + \int_0^t e^{-\beta(t-\tau)}\left\{\frac{\left(\theta^T(t)\zeta(\tau) - z(\tau)\right)\zeta(\tau)}{m^2(\tau)} + w(\tau)\theta(\tau)\right\} d\tau = 0.   (3.56)
The form of J(θ) is such that it allows us to solve for the solution θ(t) of (3.56) explicitly, i.e.,

\theta(t) = P(t)\left[e^{-\beta t}Q_0\theta_0 + \int_0^t e^{-\beta(t-\tau)}\left(\frac{\zeta(\tau)z(\tau)}{m^2(\tau)} - w(\tau)\theta(\tau)\right) d\tau\right],   (3.57)

where

P(t) = \left[e^{-\beta t}Q_0 + \int_0^t e^{-\beta(t-\tau)} \frac{\zeta(\tau)\zeta^T(\tau)}{m^2(\tau)}\, d\tau\right]^{-1}, \qquad P(0) = Q_0^{-1} = P_0 .
Since Q₀ = Q₀ᵀ > 0 and ζζᵀ/m² is positive semidefinite, P(t) exists at each time t. Equation (3.57) represents the so-called non-recursive least-squares algorithm for continuous-time systems [25,26]. A recursive version of (3.57) is developed by differentiating (3.57) w.r.t. time and using the identity

\frac{d}{dt}\left[P P^{-1}\right] = 0 = \dot{P} P^{-1} + P \frac{d}{dt}P^{-1}

to obtain

\dot\theta = -P(\varepsilon\zeta + w\theta),   (3.58)

\dot{P} = \beta P - \frac{P\zeta\zeta^T P}{m^2}, \qquad P(0) = P_0 = Q_0^{-1},   (3.59)

where

\varepsilon = \frac{\theta^T\zeta - z}{m^2}.
We refer to (3.58)-(3.59) as the continuous-time recursive least-squares algorithm. We should note that P(t) = [∇²J(θ)]⁻¹, where ∇²J(θ) is the Hessian of J(θ,t) at θ, i.e., the least-squares solution can be obtained by applying Newton's method to the cost (3.54). For the bilinear model we cannot develop an equivalent adaptive law using the least-squares method which can be implemented. This is due to the fact that such an adaptive law would depend on ρ* in the same way as in the gradient method. In the gradient method, however, we were able to get rid of the unknown ρ* by scaling the adaptive gain appropriately. In the least-squares case such a scaling is not possible because the adaptive gain P(t) is no longer arbitrary but is generated from the differential equation (3.59), whose state P(t) cannot be scaled by an unknown scalar. In the identification literature, (3.58)-(3.59) with β = 0 and w = 0 is referred to as the "pure" least-squares algorithm and has a form very similar to that of the Kalman filter. The matrix P is usually called the covariance matrix. Setting β = 0, (3.59) becomes
\dot{P} = -\frac{P\zeta\zeta^T P}{m^2}, \qquad\text{or}\qquad \frac{dP^{-1}}{dt} = \frac{\zeta\zeta^T}{m^2}.   (3.60)
Since dP⁻¹/dt ≥ 0, P⁻¹ may grow without bound; therefore, P may become arbitrarily small and slow down adaptation in some directions. This so-called covariance wind-up problem can be prevented by using various modifications which prevent P(t) from going to zero. One such modification is the so-called covariance resetting, defined as follows:
\dot{P} = -\frac{P\zeta\zeta^T P}{m^2}, \qquad P(t_r^+) = P_0 = \rho_0 I,   (3.61)
where t_r is the time for which λ_min(P(t)) ≤ ρ₁, and ρ₀ ≥ ρ₁ > 0 are some design scalars. Due to (3.61), P(t) ≥ ρ₁I ∀t ≥ 0.
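A minimal simulation of the pure least-squares law θ̇ = −Pεζ, Ṗ = −PζζᵀP/m² with the covariance-resetting modification (3.61). The unknown parameter, the regressor, the design scalars ρ₀, ρ₁, and the step size below are all illustrative choices for the sketch, not values from the text:

```python
import numpy as np

theta_star = np.array([2.0, -1.0])       # unknown parameter (simulation only)
theta = np.zeros(2)                      # estimate theta(0)
rho0, rho1 = 1.0, 0.05                   # resetting scalars, rho0 >= rho1 > 0
P = rho0 * np.eye(2)                     # covariance, P(0) = rho0 * I
dt, T = 0.01, 200.0

for k in range(int(T / dt)):
    t = k * dt
    zeta = np.array([np.sin(t), np.cos(t)])   # persistently exciting regressor
    z = theta_star @ zeta                     # measurement (eta = 0 here)
    m2 = 1.0 + zeta @ zeta                    # normalization m^2
    eps = (theta @ zeta - z) / m2             # normalized estimation error
    theta = theta - dt * P @ (eps * zeta)     # theta_dot = -P * eps * zeta
    P = P - dt * (P @ np.outer(zeta, zeta) @ P) / m2
    if np.linalg.eigvalsh(P)[0] <= rho1:      # covariance resetting (3.61)
        P = rho0 * np.eye(2)
```

Without the reset, P(t) → 0 along the excited directions (covariance wind-up) and adaptation stalls; the reset keeps P(t) ≥ ρ₁I while preserving convergence of θ toward θ* under persistent excitation.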
When β > 0 in (3.59), the problem of P(t) becoming arbitrarily small in some directions no longer exists, and (3.59) is usually referred to as least squares with forgetting factor. In this case, however, P(t) may grow without bound, due to the term βP > 0 and to the fact that PζζᵀP/m² is only positive semidefinite. One way to avoid this complication is to modify (3.59) as

\dot{P} = \begin{cases} \beta P - \dfrac{P\zeta\zeta^T P}{m^2} & \text{if } \|P(t)\| \le R_0 \\ 0 & \text{otherwise}, \end{cases}   (3.62)

where P(0) = P₀ = P₀ᵀ > 0, ‖P₀‖ ≤ R₀, and R₀ is a constant that serves as an upper bound for ‖P‖. This modification guarantees that P ∈ L∞, and the adaptive law (3.58) with P given by (3.62) is referred to as the modified least squares with forgetting factor. With either modification, the adaptive law (3.58) with P given by (3.61) or (3.62) behaves as a gradient algorithm with a time-varying gain, and is no longer the least-squares algorithm that we developed by setting ∇J(θ) = 0. Despite this deficiency, the pure least-squares algorithm (3.58)-(3.59) has the unique property of guaranteeing parameter convergence to constant values in the case of η = 0 and w = 0, as described by the following theorem.

Theorem 3.4. The adaptive law (3.58) with η = 0, w = 0, and P given by (3.59) guarantees that
(i) ε, εns, θ, θ̇, P ∈ L∞; (ii) ε, εns, θ̇ ∈ L₂; and (iii) lim_{t→∞} θ(t) = θ̄, where θ̄ is a constant vector.
Proof. From (3.59) we have Ṗ ≤ 0, i.e., P(t) ≤ P₀. Since P(t) is nonincreasing and bounded from below (i.e., P = Pᵀ ≥ 0 ∀t ≥ 0), it has a limit, i.e.,

\lim_{t\to\infty} P(t) = \bar{P},

where P̄ = P̄ᵀ ≥ 0 is a constant matrix. Let us now consider

\frac{d}{dt}\left(P^{-1}\phi\right) = \frac{dP^{-1}}{dt}\phi + P^{-1}\dot\phi = \frac{\zeta\zeta^T}{m^2}\phi - \varepsilon\zeta.

Since for η = 0 we have ε = ϕᵀζ/m², it follows that

\frac{d}{dt}\left(P^{-1}\phi\right) = 0,

i.e., P⁻¹(t)ϕ(t) = P₀⁻¹ϕ(0) and ϕ(t) = P(t)P₀⁻¹ϕ(0). Therefore, lim_{t→∞} ϕ(t) = P̄P₀⁻¹ϕ(0) and lim_{t→∞} θ(t) = P̄P₀⁻¹ϕ(0) + θ* = θ̄. Since P ∈ L∞, it follows that ϕ, θ ∈ L∞, which together with ζ/m ∈ L∞ imply that ε, εm, εns ∈ L∞. Let us now choose the function

V(\phi) = \frac{\phi^T P^{-1}\phi}{2} = \frac{\phi^T(t) P_0^{-1}\phi(0)}{2}.
Then, the time derivative V̇ of V along the solution of (3.58) is given by

\dot{V} = -\varepsilon\phi^T\zeta = -\varepsilon^2 m^2 \le 0,   (3.63)

which implies that V ∈ L∞. Since

\lim_{t\to\infty} V = \frac{\phi^T(0) P_0^{-1}\bar{P} P_0^{-1}\phi(0)}{2}

exists, it follows from (3.63) that εm ∈ L₂, which implies that ε, εns ∈ L₂. From the adaptive law

\dot\theta = -P\varepsilon\zeta

and the fact that P, ζ/m ∈ L∞ and εm ∈ L₂, it follows that θ̇ ∈ L₂, and the proof is complete. □

When w(t) ≠ 0, the properties of Theorem 3.4 cannot be guaranteed even when η = 0. Let us now consider the adaptive law (3.58) with P given by (3.61) or (3.62). The following theorem establishes the stability properties of (3.58) for different choices of the weighting coefficient or leakage w(t).

Theorem 3.5. The adaptive law (3.58) with P given by (3.61) or (3.62) has the following properties:
(a) Fixed-σ: For w = σ, where σ > 0 is a fixed constant, we have (i) ε, εns, θ, θ̇ ∈ L∞; (ii) ε, εns, θ̇ ∈ S(σ + η²/m²).
(b) Switching-σ: For w(t) = σs, where σs is as defined in Theorem 3.1(b), we have (i) ε, εns, θ, θ̇ ∈ L∞; (ii) ε, εns, θ̇ ∈ S(η²/m²).
(c) ε-modification: For w(t) = |εm|ν₀, we have (i) ε, εns, θ, θ̇ ∈ L∞; (ii) ε, εns, θ̇ ∈ S(ν₀ + η²/m²).
(d) Ideal Case: For η = 0 and w = 0 or w = σs, we have (i) ε, εns, θ, θ̇ ∈ L∞; (ii) ε, εns, θ̇ ∈ L₂. If, in addition, ζ/m, ζ̇/m, ṁ/m ∈ L∞, then ε, εns, θ̇ → 0 as t → ∞.
Proof. Let us first examine the properties of the covariance matrix P(t) given as the solution of (3.61) or (3.62). In (3.61), P(t) is a discontinuous function of time whose value between discontinuities is given by the differential equation (3.61). At the discontinuity or resetting point t_r, P(t_r⁺) = P₀ = ρ₀I, and therefore P⁻¹(t_r⁺) = ρ₀⁻¹I. From (3.61) we have that between discontinuities dP⁻¹(t)/dt ≥ 0, i.e., P⁻¹(t₂) − P⁻¹(t₁) ≥ 0 for t₂ ≥ t₁ ≥ 0 and t_r ∉ [t₁, t₂], which implies that P⁻¹(t) ≥ ρ₀⁻¹I ∀t ≥ 0. Due to the resetting, P(t) ≥ ρ₁I ∀t ≥ 0. Therefore, (3.61) guarantees that

\rho_0 I \ge P(t) \ge \rho_1 I, \qquad \rho_1^{-1} I \ge P^{-1}(t) \ge \rho_0^{-1} I.   (3.64)
In (3.62), ‖P(t)‖ ≤ R₀ ∀t ≥ 0. Using the identity \frac{d}{dt}P^{-1} = -P^{-1}\dot{P}P^{-1}, we have

\frac{dP^{-1}}{dt} = \begin{cases} -\beta P^{-1} + \dfrac{\zeta\zeta^T}{m^2} & \text{if } \|P(t)\| \le R_0 \\ 0 & \text{otherwise}, \end{cases}   (3.65)

where P⁻¹(0) = P₀⁻¹. Since ζ/m ∈ L∞, (3.65) with β > 0 implies that P⁻¹ ∈ L∞.
Therefore, (3.61)-(3.62) guarantee that P and P⁻¹ are bounded positive definite symmetric matrices. Let us now consider the function

V(\phi,t) = \frac{\phi^T P^{-1}\phi}{2},

where P is given by either (3.61) or (3.62). Since P⁻¹ is a bounded positive definite symmetric matrix, it follows that V is decrescent and radially unbounded in the space of ϕ. Furthermore, along the solution of (3.58),

\dot{V} = \phi^T P^{-1}\dot\phi + \frac{1}{2}\phi^T \frac{dP^{-1}}{dt}\phi = -\phi^T(\varepsilon\zeta + w\theta) + \frac{1}{2}\phi^T \frac{dP^{-1}}{dt}\phi,   (3.66)

or, using ϕᵀζ = εm² + η,

\dot{V} = -\varepsilon^2 m^2 - \varepsilon\eta - w\phi^T\theta + \frac{1}{2}\phi^T \frac{dP^{-1}}{dt}\phi.   (3.67)

If P is given by (3.61), then between resettings dP⁻¹/dt = ζζᵀ/m² and

\frac{1}{2}\phi^T \frac{dP^{-1}}{dt}\phi = \frac{(\phi^T\zeta)^2}{2m^2} = \frac{(\varepsilon m^2 + \eta)^2}{2m^2} = \frac{\varepsilon^2 m^2}{2} + \varepsilon\eta + \frac{\eta^2}{2m^2}.

Therefore, using (3.67) we obtain

\dot{V} = -\frac{\varepsilon^2 m^2}{2} - w\phi^T\theta + \frac{\eta^2}{2m^2}.   (3.68)
If P is given by (3.62), then dP⁻¹/dt = −βP⁻¹ + ζζᵀ/m² if ‖P‖ ≤ R₀ and dP⁻¹/dt = 0 otherwise. Hence, it follows from (3.65) and (3.67) that

\dot{V} = \begin{cases} -\dfrac{\varepsilon^2 m^2}{2} - w\phi^T\theta + \dfrac{\eta^2}{2m^2} - \dfrac{\beta}{2}\phi^T P^{-1}\phi & \text{if } \|P\| \le R_0 \\ -\varepsilon^2 m^2 - \varepsilon\eta - w\phi^T\theta & \text{otherwise}. \end{cases}

Therefore, for P given by either (3.61) or (3.62), we have

\dot{V} \le -\frac{\varepsilon^2 m^2}{2} - w\phi^T\theta + \frac{\eta^2}{2m^2}.   (3.69)
The rest of the proof follows directly from the proof of Theorem 3.2 with ρ = ρ* = 1, ξ = 0, w₁ = 0. □
3.4 Dead-Zones and Projections

In Sects. 3.1-3.3 we developed adaptive laws which guarantee parameter boundedness, in addition to other properties, in the presence of the modeling error η. This is achieved by using a special normalizing signal and a leakage term in the adaptive law. In this section we introduce some additional modifications, which can be used either alone or combined with leakage to form adaptive laws that guarantee properties similar to those of Sects. 3.1-3.3 in the presence of modeling errors.

Dead-Zone. Let us consider the normalized estimation error

\varepsilon = \frac{\theta^T\zeta - z}{m^2} = \frac{\phi^T\zeta - \eta}{m^2},

as defined in the gradient and least-squares methods for the linear parametric model, which is used to "drive" the adaptive law. The signal ε is a measure of the parameter error ϕ, which is present in the signal ϕᵀζ, and of the modeling error η. When η = 0 and ϕ = 0, we have ε = 0 and no adaptation takes place. Due to ζ/m, η/m ∈ L∞, a large εm implies that ϕᵀζ/m is large, which in turn implies that ‖ϕ‖ is large. In this case the effect of the modeling error η is small, and the parameter estimates driven by ε move in a direction which reduces the parameter error. When εm is small, however, the effect of η may be more dominant than that of the signal ϕᵀζ, and the parameter estimates may be driven in a direction that increases the parameter error ϕ or ϕᵀζ. The principal idea behind the dead-zone is to monitor the size of the estimation error and adapt only when the estimation error is large relative to the modeling error η, as shown below. We first consider the gradient algorithm for the linear parametric model

z = \theta^{*T}\zeta + \eta .
Instead of the cost function (3.29), we consider

J(\theta) = \frac{\varepsilon^2 m^2}{2},

and write

\dot\theta = \begin{cases} -\Gamma\nabla J(\theta) & \text{if } |\varepsilon m| > g_0 \\ 0 & \text{otherwise}. \end{cases}   (3.70)

In other words, we move in the direction of steepest descent only when the estimation error is large relative to the modeling error, i.e., |εm| > g₀, and switch adaptation off when εm is small, i.e., |εm| ≤ g₀, where g₀ is a known upper bound for |η|/m. In view of (3.70), we have

\dot\theta = -\Gamma\zeta(\varepsilon + g), \qquad g = \begin{cases} 0 & \text{if } |\varepsilon m| > g_0 \\ -\varepsilon & \text{if } |\varepsilon m| \le g_0. \end{cases}   (3.71)

This discontinuous dead-zone function f(ε, g) = ε + g is shown in Fig. 1.
Fig. 1. Discontinuous dead-zone.
In order to avoid any implementation problems [27] which may arise due to the discontinuity in (3.71), the dead-zone function is made continuous as follows:

\dot\theta = -\Gamma\zeta(\varepsilon + g), \qquad g = \begin{cases} \dfrac{g_0}{m} & \text{if } \varepsilon m < -g_0 \\ -\dfrac{g_0}{m} & \text{if } \varepsilon m > g_0 \\ -\varepsilon & \text{if } |\varepsilon m| \le g_0. \end{cases}   (3.72)

This continuous dead-zone function f(ε, g) = ε + g is shown in Fig. 2.
Fig. 2. Continuous dead-zone.
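The continuous dead-zone (3.72) is a simple scalar function of εm; a direct transcription as code (the function returns ε + g, which multiplies −Γζ in the adaptive law; variable names are illustrative):

```python
def dead_zone(eps, m, g0):
    """Return eps + g for the continuous dead-zone (3.72):
    output 0 while |eps*m| <= g0 (adaptation shut off), and eps
    shifted by -/+ g0/m outside the dead-zone, so the map is continuous."""
    s = eps * m
    if s > g0:
        return eps - g0 / m      # g = -g0/m
    if s < -g0:
        return eps + g0 / m      # g = +g0/m
    return 0.0                   # g = -eps
```

The adaptive law then reads θ̇ = −Γζ·dead_zone(ε, m, g₀); note the output is continuous across εm = ±g₀, which is the point of the modification relative to (3.71).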
For the bilinear model (3.4), the adaptive law (3.31) with the dead-zone becomes

\dot\theta = -\Gamma'\zeta(\varepsilon + g)\,\mathrm{sgn}(\rho^*) = -\Gamma\zeta(\varepsilon + g),   (3.73)
where g is as defined in Fig. 2 and (3.72), but with

\varepsilon = \frac{\rho(\theta^T\zeta + z_1) - z}{m^2} = \frac{\rho^*\phi^T\zeta + \psi\xi - \eta}{m^2}.

The least-squares adaptive law (3.58) with the dead-zone becomes

\dot\theta = -P\zeta(\varepsilon + g),   (3.74)
where ε, g are as defined in (3.72) and P is given by either (3.61) or (3.62). The dead-zone modification can also be incorporated in the adaptive laws (3.39), (3.40) derived from the integral cost (3.35). For clarity of presentation, we consider the adaptive law (3.39). The principal idea behind the dead-zone remains the same as before, i.e., to shut off adaptation when the normalized estimation error is small relative to the modeling error. However, this time the "shut-off" is no longer based on a pointwise-in-time comparison of the normalized estimation error and the a priori bound on the normalized modeling error. Instead, the decision whether to shut off adaptation is based on the comparison of exponentially weighted L₂ norms of certain signals, as shown below. Instead of the cost function (3.37), we consider

J(\theta,t) = \frac{1}{2}\int_0^t e^{-\beta(t-\tau)} \varepsilon^2(t,\tau) m^2(\tau)\, d\tau,   (3.75)
and write

\dot\theta = \begin{cases} -\Gamma\nabla J(\theta) & \text{if } \|(\varepsilon(t,\tau)m)_t\|_\beta > g_0(t) \ge \|(\eta/m)_t\|_\beta + \delta,\ \ \delta > 0 \text{ being arbitrary} \\ 0 & \text{otherwise}, \end{cases}   (3.76)

where \|(x)_t\|_\beta denotes the exponentially weighted L₂ norm \left(\int_0^t e^{-\beta(t-\tau)} x^2(\tau)\, d\tau\right)^{1/2}. In view of (3.76), we have

\dot\theta = -\Gamma\left[\int_0^t e^{-\beta(t-\tau)}\varepsilon(t,\tau)\zeta(\tau)\, d\tau + g\right],   (3.77)

where

g = \begin{cases} 0 & \text{if } \|(\varepsilon(t,\tau)m)_t\|_\beta > g_0(t) \\ -\displaystyle\int_0^t e^{-\beta(t-\tau)}\varepsilon(t,\tau)\zeta(\tau)\, d\tau & \text{otherwise}. \end{cases}   (3.78)
In order to avoid any implementation problems [27] which may arise due to the discontinuity in (3.77)-(3.78), the dead-zone function is made continuous as follows:

\dot\theta = -\Gamma\left[\int_0^t e^{-\beta(t-\tau)}\varepsilon(t,\tau)\zeta(\tau)\, d\tau + g\right],   (3.79)

g = \begin{cases} 0 & \text{if } \|(\varepsilon(t,\tau)m)_t\|_\beta > 2g_0(t) \\ \left(\dfrac{\|(\varepsilon(t,\tau)m)_t\|_\beta}{g_0(t)} - 2\right)\displaystyle\int_0^t e^{-\beta(t-\tau)}\varepsilon(t,\tau)\zeta(\tau)\, d\tau & \text{if } g_0(t) < \|(\varepsilon(t,\tau)m)_t\|_\beta \le 2g_0(t) \\ -\displaystyle\int_0^t e^{-\beta(t-\tau)}\varepsilon(t,\tau)\zeta(\tau)\, d\tau & \text{if } \|(\varepsilon(t,\tau)m)_t\|_\beta \le g_0(t). \end{cases}   (3.80)
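The gating in (3.79)-(3.80) needs only the exponentially weighted norm of εm, which can be propagated by a first-order filter rather than recomputed from the full integral. A sketch, with the gate written as the multiplier form used later in (3.85) (all names and step sizes are illustrative):

```python
def weighted_norm_sq_update(n2, eps_m, beta, dt):
    # n2 approximates ||(eps(t,tau)*m)_t||^2 = int_0^t e^{-beta(t-tau)} (eps*m)^2 dtau,
    # since d/dt n2 = -beta*n2 + (eps*m)^2
    return n2 + dt * (-beta * n2 + eps_m ** 2)

def sigma_e(norm_val, g0):
    # continuous gating factor: 0 inside the dead-zone (norm <= g0),
    # a linear ramp on (g0, 2*g0], and 1 above 2*g0
    if norm_val <= g0:
        return 0.0
    if norm_val <= 2.0 * g0:
        return norm_val / g0 - 1.0
    return 1.0
```

The adaptive law then multiplies the integral term in (3.79) by sigma_e, which is equivalent to the choice of g in (3.80).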
The following theorem gives the stability properties of the adaptive laws (3.72), (3.73), (3.79), (3.80).

Theorem 3.6. (a) The adaptive laws (3.72) and (3.79)-(3.80) for the linear parametric model guarantee:
(i) ε, εns, θ ∈ L∞;
(ii) ε, εns ∈ S(g₀² + η²/m²);
(iii) θ̇ ∈ L₂ ∩ L₁; and
(iv) lim_{t→∞} θ(t) = θ̄.
(b) The adaptive law (3.73) for the bilinear parametric model guarantees:
(i) ε, εns, θ, ρ ∈ L∞;
(ii) ε, εns ∈ S(g₀² + η²/m²);
(iii) ρ̇, θ̇ ∈ L₂ ∩ L₁; and
(iv) lim_{t→∞} θ(t) = θ̄, lim_{t→∞} ρ(t) = ρ̄.
Here, θ̄ and ρ̄ denote a constant vector and scalar respectively.

Proof. We start with (3.72). We consider the function V = ϕᵀΓ⁻¹ϕ/2, whose time derivative V̇ along the solution of (3.72) is given by
\dot{V} = -\phi^T\zeta(\varepsilon + g) = -(\varepsilon m^2 + \eta)(\varepsilon + g).   (3.81)

Now

(\varepsilon m^2 + \eta)(\varepsilon + g) = \begin{cases} \left(\varepsilon m + \dfrac{\eta}{m}\right)(\varepsilon m + g_0) \ge 0 & \text{if } \varepsilon m < -g_0 \\ \left(\varepsilon m + \dfrac{\eta}{m}\right)(\varepsilon m - g_0) \ge 0 & \text{if } \varepsilon m > g_0 \\ 0 & \text{if } |\varepsilon m| \le g_0. \end{cases}
(3.82)

Hence, (εm² + η)(ε + g) ≥ 0 ∀t and V̇ ≤ 0, which implies that V, θ ∈ L∞ and \sqrt{(\varepsilon m^2 + \eta)(\varepsilon + g)} \in L_2. Furthermore, θ ∈ L∞ implies that ϕ, ε, εns ∈ L∞. From the adaptive law (3.72) we have

\dot\theta^T\dot\theta = \frac{\zeta^T\Gamma^2\zeta}{m^2}\,(\varepsilon + g)^2 m^2.   (3.83)
But
(\varepsilon + g)^2 m^2 = (\varepsilon m + g m)^2 = \begin{cases} (\varepsilon m + g_0)^2 & \text{if } \varepsilon m < -g_0 \\ (\varepsilon m - g_0)^2 & \text{if } \varepsilon m > g_0 \\ 0 & \text{if } |\varepsilon m| \le g_0, \end{cases}

which, together with (3.82), implies that

0 \le (\varepsilon + g)^2 m^2 \le (\varepsilon m^2 + \eta)(\varepsilon + g) \;\Longrightarrow\; (\varepsilon + g)m \in L_2.
Hence, from (3.83) and ζ/m ∈ L∞, we have θ̇ ∈ L₂. Equation (3.81) can also be written as

\dot{V} \le -\varepsilon^2 m^2 + |\varepsilon m|\frac{|\eta|}{m} + |\varepsilon m|\, g_0 + \frac{|\eta|}{m} g_0,

by using |g| ≤ g₀/m. Then, by completing the squares, we have

\dot{V} \le -\frac{\varepsilon^2 m^2}{2} + \frac{3}{2}\frac{\eta^2}{m^2} + \frac{3}{2} g_0^2,
i.e., εm ∈ S(g₀² + η²/m²). Since m² = 1 + ns², we have ε, εns ∈ S(g₀² + η²/m²). To show that lim_{t→∞} θ(t) = θ̄, we use (3.81) and the fact that (εm² + η)(ε + g) ≥ 0 to obtain

\dot{V} = -(\varepsilon m^2 + \eta)(\varepsilon + g) = -\left|\varepsilon m + \frac{\eta}{m}\right| |\varepsilon + g|\, m \le -\left(g_0 - \frac{|\eta|}{m}\right) |\varepsilon + g|\, m

(since ε + g = 0 if |εm| ≤ g₀). Since η/m ∈ L∞ and g₀ > |η|/m, we integrate both sides of the above inequality and use the fact that V ∈ L∞ to obtain (ε + g)m ∈ L₁. Then, from the adaptive law (3.72), using ζ/m ∈ L∞, it follows that ‖θ̇‖ ∈ L₁, which in turn implies that ∫₀^∞ θ̇ dt exists and therefore θ converges to a constant vector θ̄.
+ glm
(since I¢ + 91 = 0 if Icml ___go). Since ~ E Loo and go > l.~m,we integrate both sides of the above inequality and use the fact that V E Loo to obtain that (¢+g)rn E L1. Then, from the adaptive law (3.72), using the fact that ~ E Loo, it follows that Ildll e L~, which in turn implies that f o Odt exists and therefore 0 converges to 0. We next analyze the adaptive law (3.79), (3.80). This adaptive law may be rewritten as
where 1
if ll(e(t,r)m),ll~ > 2go(t) - 1 if go(t) < ]](e(t,r)m)tl[~ _< 2go(t)
go 0
if If(dr, r)m),ll~ <
go(t).
(3.85)
110
Ioannou and Datta
Once again, we consider the positive definite function V = ~ time derivative along the solution of (3.84) is given by
P -- --O'eeT(t)
= -~
Z'
fte-P(t-r),(t,
"/')~(T)dr
whose
(3.86)
~-~('-'~(t,,-)[~(t, r)m'(, ) + ,7(,-)]dr
(using the Schwarz Inequality). From the definition of ae it follows that V < 0, which implies that V, 0, e 6 Loo and limt-.oo V(t) = Voo exists. Also, 0 6 Loo implies that enB 6 L~o. From (3.87) we obtain _< -~oll(,(t, r ) " ) , l l ~ ~. Now integrating on both sides of the above inequality and using the fact that Voo < oo, it follows that
~.ll(,(t, r)m),ll~ e LI.
(3.88)
From (3.84), using the fact that ~ 6 noo, it follows that II0ll • n l . Moreover, from (3.84) we also have 8 • Loo, which together with 8 • Lz imply that 0 • L2. Furthermore, using the same arguments as in the proof for the adaptive law we
u.e ,,o
II01
to conclude that lim,.~oo 8 = 0 for
I l l
some constant vector #. We also need to show that e, First we can rewrite (3.86) as
ens 6 S(m-'~ + go). This can be done as follows.
V = --f T (t) [fot e-ZO-r)e(t, r)((v)dv ] -- eT(t)g
= -¢TR¢ + eT(t)/0' ~-P¢'-'~ ¢(r)~(~)
~- .T., m2(r ) - T - 9 t~)g.
This equation being of the same form as (3.42), we can duplicate the steps in the proof of Theorem 3.3(a) to conclude that ε, εns ∈ S(η²/m² + g₀²), and the proof is complete. □

Parameter Projection Method. In some parameter estimation problems it is known a priori that the unknown parameter vector θ* lies in some convex region

C = \{\theta \in \mathbb{R}^n \mid g(\theta) \le 0\},

where g: ℝⁿ → ℝ, and C°, δ(C) denote the interior and boundary of C respectively. The adaptive laws developed in Sects. 3.1-3.3 can be modified so that the parameter estimate θ(t) ∈ C ∀t ≥ 0. The principal idea behind such modifications is to
choose θ(0) = θ₀ ∈ C and project the direction of adaptation (i.e., θ̇), whenever θ(t) ∈ δ(C) has the tendency to move away from C, so that θ(t) does not leave C. Even though there are various ways to implement such projections and guarantee that θ(t) ∈ C ∀t ≥ 0, we are only interested in those projection algorithms which, in addition to θ(t) ∈ C ∀t ≥ 0, also guarantee that the stability properties of the adaptive laws without projection, established in Sects. 3.1-3.3, will not change. In this section we present some projection algorithms based on the gradient projection method which have such properties. We start by considering the following constrained minimization problem:

minimize J(θ)
(3.89)
subject to g(θ) = 0. We start with a point θ₀ satisfying the constraint, i.e., g(θ₀) = 0. To obtain an improved vector θ₁, we project the negative gradient of J at θ₀, i.e., −∇J(θ₀), onto the tangent plane M(θ₀) = {θ ∈ ℝⁿ | ∇gᵀ(θ₀)θ = 0}, obtaining the direction vector Pr(θ₀). Then θ₁ = θ₀ + λ Pr(θ₀), where λ is a fixed step length or is chosen to minimize J(θ₁). The general form of this iteration is given by
\theta_{k+1} = \theta_k + \lambda\,\mathrm{Pr}(\theta_k).
(3.90)
The explicit expression for Pr(θ_k) can be obtained as follows. The vector −∇J(θ_k) can be expressed as a linear combination of the vector Pr(θ_k) and the normal vector ∇g(θ_k) to the tangent plane M(θ_k), i.e.,

-\nabla J(\theta_k) = \alpha\,\nabla g(\theta_k) + \mathrm{Pr}(\theta_k)   (3.91)

for some constant α. Since Pr(θ_k) lies on the tangent plane M(θ_k), we have ∇gᵀ(θ_k)Pr(θ_k) = 0, which together with (3.91) implies that α = −(∇gᵀ∇g)⁻¹∇gᵀ∇J. Hence,

\mathrm{Pr}(\theta_k) = -\left(I - \frac{\nabla g\,\nabla g^T}{\nabla g^T\nabla g}\right)\nabla J.   (3.92)
We refer to Pr(θ_k) as the projected direction onto the tangent plane M(θ_k). The gradient projection method is illustrated in Fig. 3. It is clear from Fig. 3 that the new vector θ_{k+1} given by (3.90) may not satisfy the constraint. There are a number of techniques that can be employed to move θ_{k+1} from M(θ_k) back to the constraint surface g(θ) = 0 [28]. In our case we let λ → 0, which gives us

\dot\theta = \mathrm{Pr}(\theta) = -\left(I - \frac{\nabla g\,\nabla g^T}{\nabla g^T\nabla g}\right)\nabla J, \qquad \theta(0) = \theta_0,   (3.93)
where g(θ₀) = 0. Since ∇gᵀPr(θ) = 0, i.e., θ̇(t) lies on the tangent plane M(θ(t)) ∀t ≥ 0, the trajectory θ(t), if it exists, will satisfy g(θ(t)) = 0 ∀t ≥ 0. A scaled version of (3.93) can be obtained by using the change of coordinates Γ^{1/2}θ̄ = θ,
Fig. 3. Gradient projection.
where Γ^{1/2} is a positive definite matrix that satisfies Γ = Γ^{1/2}Γ^{T/2}, and Γ = Γᵀ > 0 is a constant scaling matrix, i.e.,

\dot\theta = \mathrm{Pr}(\theta) = -\Gamma^{1/2}\left[I - \frac{\Gamma^{T/2}\nabla g\,\nabla g^T\Gamma^{1/2}}{\nabla g^T\Gamma\nabla g}\right]\Gamma^{T/2}\nabla J.   (3.94)
The minimization problem (3.89) can now be extended to

minimize J(θ) subject to g(θ) ≤ 0.   (3.95)

The solution of (3.95) is
given as follows:

\mathrm{Pr}(\theta) = \begin{cases} -\nabla J & \text{if } \theta \in C^0, \text{ or if } \theta \in \delta(C) \text{ and } -\nabla J^T\nabla g \le 0 \\ -\nabla J + \dfrac{\nabla g\,\nabla g^T}{\nabla g^T\nabla g}\nabla J & \text{otherwise}, \end{cases}   (3.96)

where θ(0) = θ₀ ∈ C°, or, with the scaling matrix Γ,

\mathrm{Pr}(\theta) = \begin{cases} -\Gamma\nabla J & \text{if } \theta \in C^0, \text{ or if } \theta \in \delta(C) \text{ and } -(\Gamma\nabla J)^T\nabla g \le 0 \\ -\Gamma\nabla J + \Gamma\dfrac{\nabla g\,\nabla g^T}{\nabla g^T\Gamma\nabla g}\Gamma\nabla J & \text{otherwise}. \end{cases}   (3.97)
T h e o r e m 3.7 The adaptive laws of Sects. 3.1-3.8, and the dead.zone adaptive law with the projection modification given in (3.96) or (3. 97) retain all their stability properties that are established in the absence of projection, and in addition 0 E C Vt > O, provided 0(0) = Oo G C.
Robust Adaptive Control
113
Proof. It follows from (3.96) or (3.97) that whenever 0 E 6(C), 0 T v g _~ 0, which implies that O(t) E C Vt > 0 provided 0(0) = 00 E C. The adaptive laws of Sects. 3.1-3.3 and the dead-zone adaptive law are of the form
=-FVJ,
Vt>_O.
(3.98)
Thus, projection introduces the additional t e r m / ' ~ / ~ V J
when 0 E and --(I"VJ)TVg > 0. If we use the same positive definite function to ana. lyze the adaptive law with projection the time derivative ~r of V will have the additional additive term q(¢) =
~bTVgVgT F V J VgwFvg
(3.99)
when 0 e ~i(C and --(-PvJ)TVg > 0. We rewrite q(¢) as q(¢) = --¢WVg ( - ( r V J ) T V g )
VgWrVg
•
Since - ( / ' v J ) T V g
> 0 and 0* E C, 0 E 6(C) implies that cTVg ---- ( 0 0*)TVg ~ 0: we have q(¢) _< 0. Therefore, the term q(~) due to projection can only make V more negative and will not affect the results developed from the properties of V and V. In order to avoid repetition, we complete the proof of Theorem 3.7 only for the adaptive law (3.30) with w(t) = a~, which with the projection becomes - F ( e ~ + ~rsO)
if 0 E C° , or if 0 e 6(C) and -[r(,¢ + ,o)]Tvg _< 0
!- VgVg T - r ( e ¢ + ~ 0 ) + ~ vg--e-FffrVgr ( e ¢ + a,0) otherwise. (3.100) We choose cTr-14 V_ - 2
Then, --¢2m -- ~r] -- ~rs~T0
=
if 0 E CO, or if 0 E ~(C)and
+ _e2m2 ~T
~-
_ eO -
,o)]Tvg < 0
ascT0
~V olV atT ar~+ ., + ~r~0) otherwise.
Since CTVg > 0 and VgT_F'(¢(+ o's0) < 0 when 0 G 6(C) and --[F(e(+O'sO)]TVg > 0, it follows that _< - d i n 2 _ e0 - ~,¢T0.
114
Ioannou and Datta
Proceeding as in the proof of Theorem 3.2, it can be shown that e, ens, O, 0 E Loo 2 It remains to establish that 0 e $ (m---~).From and e, on,, ~ e S (m-~). (3.100) we have that for 0 e 6(C) and -[F(e¢ + ass)]WVg > 0
oTo _. V j T F r , Vj_t_ 2VjTF.r'vg
(-vgTrvs) vgTFVg
VgTI"FVg (vJTFvg) 2 -I" V g T F V g VgTFVg
for some c E IR+, where V J = ~ + ~s0. Hence,
II0112 < cllVJII 2, vt > 0, and the proof proceeds as in Theorem 3.2 to show that 0 E 8(~2~).
[]
The projection algorithm can also be used with the modified least-squares algorithms (3.61) and (3.62) as follows:
-PVJ
ifOEC O, or i f 0 E 6 C and -- (PVJ)TVg <_0
= nVgVg wPVJ /St(0) ---- - P V J + ~'VgTVg p = ~
0
[ as in (3.61) or (3.62)
otherwise
if 0 e 6(C) otherwise,
i.e., on the boundary 5(C) the least-squares algorithm becomes a gradient one with constant gain since P = 0 for 0 E 6(C). Further details on projection algorithms can be found in [14,16,20,26,36].
3.5 Normalizing Signals The adaptive laws in Sects. 3.1-3.4 were developed under the assumption that a normalizing signal ns can be constructed which can bound from above the unknown modeling error term y(t) and the known information vector signal ~(t). In the ideal case when y = 0 the choice of ns is straightforward, i.e., it can be chosen as ns = 7~Tp(t)~ for any constant 7 > 0 where P(t) = p T ( t ) > 0 with P(t) E £oo etc. When y # 0, the design of ns to bound the modeling error is possible provided some a priori information about y is available. For LTI plants may represent the effect of bounded disturbances and/or unmodeled dynamics. In this case, y is the output of a transfer matrix A(s) whose input is the plant input u and output y, i.e.,
Robust Adaptive Control
115
where d is a bounded disturbance. For such a class of modeling errors, Lemma 2.2 can be used to design the normalizing signal: Consider the normalizing signal given by ¢na = -Sore, + u 2 + y2, m,(O) = O, (3.101) i.e.,
Using Lemma 2.2, we have
llllm <
+ Jldll
II(s +
m
and
II 'll ° < II (,)ll m
+ IId, m
where m = x/i" + n~. Various other forms of normalizing signals can be found in [5,9,11,12,14]. The idea of normalization was inherent in most discrete-time identification algorithms [5,20] even before the robustness issues in adaptive control were raised. The normalizing signal used in the ideal case was proportional to the Euclidean norm of the information vector ~ and it was essential in establishing global stability. The normalizing signal used in the continuous-time case was referred to as an "auxiliary signal" [2] and had a similar form as in the discrete-time case. The use of normalizing signals of the form (3.101) was introduced in [5] in order to improve robustness in the presence of disturbances. This idea was later on used in [11] to achieve global boundedness results in the presence of unmodeled dynamics. After [11], a series of robustness results appeared, making use of normalizing signals of the form (3.101) together with leakage [6,9,10], dead-zones [5,12,13,14] and projections [11,14]. The normalizing signal slows down adaptation relative to the speed of the information vector ¢ and modeling error 7. A finite speed of adaptation which decreases with time in some fashion is essential in establishing stability of a closed-loop adaptive control scheme. 3.6 P e r s i s t e n c e o f E x c i t a t i o n a n d D o m i n a n t R i c h n e s s The adaptive laws developed in the previous subsections guarantee parameter boundedness and some smallness condition for the estimation error and speed of adaptation. In the absence of modeling errors these adaptive laws without modifications can also guarantee parameter convergence, i.e., 8(t) ---, 0* as t ---, cx~ provided ~(t) is persistently exciting (PE), i.e., provided ~ satisfies
1 f+T j,
~(T)~T(7")dT ~ OC0/
Vt >_ 0, some T > 0 and s0 > 0 referred to as the level of excitation. The PE condition above and related ones are discussed extensively in [10,15,16,20].
116
Ioannou and Datta
Such conditions attracted considerable interest in adaptive control and parameter identification since they guarantee exponential convergence of all error signals including the parameter error to zero. In the presence of modeling errors, it is shown in [15] that the PE property of ( guarantees local stability of the adaptive control scheme which employs an adaptive law without modifications. When plant uncertainties are present, however, the level of excitation a0 has to be larger than the modeling error in order to establish stability. This led to the definition of dominantly rich (DR) signals in [6] and [37]. In [37] DI~ signals are used together with normalization and a switching-a modification in order to establish global stability for an adaptive control scheme in the presence of unmodeled dynamics as well as exponential convergence for the tracking and parameter errors to residual sets whose sizes are of the order of the modeling error. DR signals achieve their level of excitation in the lower frequency range thus avoiding excess excitation of the unmodeled dynamics. DR signals are also used in [10] to establish boundedness in the presence of bounded disturbances for an adaptive control scheme without any modifications. In this case, the level of excitation is required to be higher than the level of the disturbance.
4 Controller Structures

An adaptive control scheme consists of two essential parts: (i) the control law, which can be calculated and used to meet the control objective when the plant parameters are known, and (ii) the parameter estimator or adaptive law for estimating the unknown parameters on line. The stability properties of the adaptive control scheme depend very much on the properties of these two parts, i.e., the control part and the estimation part. In Sect. 3, we have already designed and analyzed a wide class of adaptive laws for estimating unknown parameters on line. In this section, we study the design, analysis and robustness of several control laws which can later be combined with appropriate adaptive laws to form adaptive control schemes. The analysis of the stability and robustness properties of these control laws in the case of known plant parameters is particularly important, since it gives us a feel for what is achievable with a particular control law given perfect parameter information. If, even in the case of known plant parameters, a given control law cannot guarantee robustness or meet the desired control objective, then it would be unreasonable to expect such a control law to perform any better when, in the face of parametric uncertainty, it is used along with an adaptive law.

4.1 A General Feedback System

Consider the feedback system shown in Fig. 4, where P0(s) represents the plant transfer function matrix and C1(s), C2(s) the cascade and feedback compensators respectively; u denotes the control input and e0, e1, e2 the inputs to the compensators C1, C2 and the plant respectively; r, d1, d2 denote the reference signal, the output disturbance and the input disturbance or noise respectively; y, y0
Robust Adaptive Control
117
denote the plant output and the output of the compensator C2 respectively. Due to the presence of sensor noise and output disturbances (d1 in Fig. 4), the measured plant output e1 is different from y.
Fig. 4. Feedback system.
The matrices P0(s), C1(s), C2(s) are assumed to be proper, rational and of compatible dimensions, so that the interconnections in Fig. 4 make sense. The feedback system is described by the equations

[e0]   [r ]   [ 0  I  0] [u ]
[e1] = [d1] - [ 0  0 -I] [y0]   (4.1)
[e2]   [d2]   [-I  0  0] [y ]

[u ]   [C1  0   0 ] [e0]
[y0] = [ 0  C2  0 ] [e1]   (4.2)
[y ]   [ 0  0   P0] [e2]

which can be written in the compact form

e = U - FY,   Y = G(s)[e],   (4.3)

where

e = [e0, e1, e2]^T,   U = [r, d1, d2]^T,   Y = [u, y0, y]^T,

    [ 0  I  0]        [C1  0   0 ]
F = [ 0  0 -I],   G = [ 0  C2  0 ].
    [-I  0  0]        [ 0  0   P0]

Equation (4.3) may be written as

[I + FG][e] = U.   (4.4)
The system (4.1), (4.2) is well-posed if det(I + FG(s)) is not identically zero in s. This condition guarantees that for every input signal U, (4.1), (4.2) or (4.4) has a unique solution for e, given by

e = (I + FG)^{-1}[U] ≜ H[U].
(4.5)
Similarly,

Y = G(I + FG)^{-1}[U] = GH[U].   (4.6)

We can obtain several equivalent expressions for H(s) by using the well-known matrix identities [24]
(I + PC)^{-1} = I - P(I + CP)^{-1}C
C(I + PC)^{-1} = (I + CP)^{-1}C
(I + P)^{-1} = I - P(I + P)^{-1}.
(4.7)
It can be verified using (4.7) that

H(s) = [H1(s)  H2(s)  H3(s)] ,
(4.8)
where

        [ I - C2P0(I + C1C2P0)^{-1}C1 ]
H1(s) = [ P0(I + C1C2P0)^{-1}C1      ]
        [ (I + C1C2P0)^{-1}C1        ]

        [ -C2 + C2P0(I + C1C2P0)^{-1}C1C2 ]
H2(s) = [ I - P0(I + C1C2P0)^{-1}C1C2     ]
        [ -(I + C1C2P0)^{-1}C1C2          ]

        [ -C2P0(I + C1C2P0)^{-1} ]
H3(s) = [ P0(I + C1C2P0)^{-1}    ]
        [ (I + C1C2P0)^{-1}      ] .
In view of (4.8), it follows that the system is well-posed, i.e., the solution exists and is unique, if det(I + C1C2P0) is not identically zero in s.

Definition 4.1 The feedback system (4.4) is Lp-stable with p ∈ [1, ∞] if (i) it is well-posed, (ii) U ∈ Lp implies e, Y ∈ Lp, and (iii) ||e||_p ≤ k1 ||U||_p, where k1 < ∞ is independent of U.

Using Definition 4.1, we can establish the following result.

Lemma 4.1 Assume that the feedback system of Fig. 4 is well-posed. Then this system is Lp-stable ∀p ∈ [1, ∞] iff H(s) is a stable proper transfer matrix.

Proof. We have e = H(s)[U],
Y = GH[U].
If we establish that H stable implies GH stable, then the proof follows from well-known results [29]. We have H = (I + FG)^{-1} = I - FG(I + FG)^{-1}, i.e., H = I - FGH. Since F is a nonsingular matrix independent of s, we have GH = F^{-1} - F^{-1}H. Hence, GH is stable iff H is stable. □
The stability of H(s) is equivalent to requiring that each of its nine entries be a stable transfer function matrix. It should be noted that this concept of stability is more complete than the usual stability concept of undergraduate textbooks where system stability is checked by examining the roots of the characteristic equation. To further illustrate this fact, we give the following scalar example.
Example 4.1.1. Let C1(s) = (-s + 1)/(s + 1), P0(s) = 1/(-s + 1), C2(s) = 1. Then the characteristic equation

1 + P0(s)C1(s)C2(s) = 1 + 1/(s + 1) = 0   (4.9)
has a single root at s = -2, which would indicate stability. On the other hand,

P0(s)[1 + C1(s)C2(s)P0(s)]^{-1} = (s + 1)/[(-s + 1)(s + 2)]

is an unstable transfer function, so that a bounded disturbance d2 could produce an unbounded response at e1. Thus, blindly analyzing the roots of the characteristic equation is no guarantee of closed-loop stability. If, however, P0(s), C1(s), C2(s) are modeled as ratios of coprime polynomials, i.e.,

P0(s) = n0(s)/d0(s),   C1(s) = nc1(s)/dc1(s),   C2(s) = nc2(s)/dc2(s)

for some coprime polynomial pairs (n0(s), d0(s)), (nc1(s), dc1(s)), (nc2(s), dc2(s)), then a necessary and sufficient condition for stability is that the equation

d0(s)dc1(s)dc2(s) + n0(s)nc1(s)nc2(s) = 0   (4.10)

have all its roots in the open left half plane. For the example just considered, n0(s) = 1, d0(s) = (-s + 1), nc1(s) = (-s + 1), dc1(s) = (s + 1), nc2(s) = 1, dc2(s) = 1. Thus, (4.10) becomes

(-s + 1)(s + 2) = 0,

which clearly has one root in the right half plane, and so the closed-loop system is unstable. □

The above example illustrates that if we have a SISO system, then to determine closed-loop stability one can either check the stability of all the entries of H(s) or, alternatively, analyze the roots of (4.10). If, however, C1 and C2 are stable compensators, the stability of H(s) is implied by the stability of only one of its entries, namely the transfer function between the input d2 and the output y, i.e., P0(I + C1C2P0)^{-1}, which is the case often emphasized in most books on control systems.
4.2 Uncertainty Characterizations

The first task of a control engineer in designing a control system is to obtain a simplified mathematical model which describes the actual plant to be controlled with a reasonable degree of accuracy over the operating range of interest. While a simple model leads to a simpler control design, such a design must possess a sufficient degree of robustness, or insensitivity, with respect to the unmodeled plant characteristics. In order to study and improve the robustness properties of the control design, we need to come up with some characterization of the types of plant uncertainties that are likely to be encountered in practice. In the following definitions, we give various descriptions of the plant uncertainty that will be used later for robustness analysis and design.

Definition 4.2 Additive Perturbations: Suppose that the true plant P(s) and
the nominal plant P0(s) are related by

P(s) = P0(s) + Δa(s),
(4.11)
where Δa(s) is stable. Then Δa(s) is called an additive perturbation of P0(s).

Definition 4.3 Multiplicative Perturbations: Suppose that the true plant P(s)
and the nominal plant Po(s) are related by
P(s) = (I + Δm(s))P0(s),
(4.12)
where Δm(s) is stable. Then Δm(s) is called a multiplicative perturbation of P0(s).

Definition 4.4 Stable Factor Perturbations: Suppose that the true plant P(s)
and the nominal plant P0(s) have the following right coprime factorizations [30]

P(s) = [N0(s) + Δ1(s)][D0(s) + Δ2(s)]^{-1}   (4.13)
P0(s) = N0(s)D0^{-1}(s),

where N0(s), D0(s) are proper stable rational transfer function matrices that are coprime² and Δ1(s), Δ2(s) are stable. Then Δ1(s), Δ2(s) are called stable factor perturbations of P0(s).

² Two stable rational transfer function matrices are right/left coprime iff every right/left divisor is unimodular in the ring of stable rational transfer function matrices. Two proper rational stable transfer functions are coprime iff they have no finite common zeros in the closed right half s-plane and at least one of them has relative degree zero [30].
Singular Perturbations. Another important description of a wide class of plant dynamic uncertainties is given by singular perturbation models. For LTI plants, the following singular perturbation model in the state variable form
ẋ = A11 x + A12 z + B1 u,   x ∈ ℝ^n, u ∈ ℝ^r   (4.14)
μ ż = A21 x + A22 z + B2 u,   z ∈ ℝ^m   (4.15)
y = C1 x + C2 z,   y ∈ ℝ^q   (4.16)
can be used to describe the slow (or dominant) and fast (or parasitic) phenomena of the plant. The scalar μ > 0 represents all the small parameters, such as small time constants, small masses etc., to be neglected. In most applications, the representation (4.14)-(4.16) with a single parameter μ can be achieved by proper scaling, as shown in [31]. All the matrices in (4.14)-(4.16) are assumed to be constant and independent of μ. As explained in [31], this assumption is for convenience only and leads to a minor loss of generality. Assuming that A22 is a stable matrix, an approximation of (4.14)-(4.16) is obtained by setting μ = 0, solving for z in (4.15) and substituting its value in (4.14) and (4.16), i.e.,

ẋ0 = A0 x0 + B0 u,   x0 ∈ ℝ^n   (4.17)
y0 = C0 x0 + D0 u,

where A0 = A11 - A12 A22^{-1} A21, B0 = B1 - A12 A22^{-1} B2, C0 = C1 - C2 A22^{-1} A21 and D0 = -C2 A22^{-1} B2. With μ set to zero, the dimension of the state space of (4.14)-(4.15) reduces from n + m to n, because the differential equation (4.15) degenerates into the algebraic equation

0 = A21 x0 + A22 z0 + B2 u,   i.e.,   z0 = -A22^{-1}(A21 x0 + B2 u),

where the subscript 0 indicates that the variables belong to the system with μ = 0. The transfer function matrix
P0(s) = C0(sI - A0)^{-1} B0 + D0   (4.18)

represents the nominal or slow or dominant part of the plant. We should emphasize that even though the transfer function matrix P(s) from u to y of the full-order plant (4.14)-(4.16) is strictly proper, the nominal P0(s) may be proper but not strictly proper, since D0 = -C2 A22^{-1} B2 may not be equal to zero. The singular perturbation model (4.14)-(4.16) has received considerable attention in the control literature [31], where it is used for control synthesis as well as for robustness analysis. The state-variable model (4.14)-(4.16) can also be expressed in the s-domain as follows. As in [31], we use the similarity transformation
[x]   [ I_n      μH     ] [x_s]
[z] = [ -L   I_m - μLH  ] [z_f] ,   (4.19)

where L, H satisfy the algebraic equations

μ(A11 - A12 L)H - H(A22 + μLA12) + A12 = 0
A21 - A22 L + μLA11 - μLA12 L = 0,   (4.20)
to transform (4.14)-(4.16) into the form

ẋ_s = A_s x_s + B_s u   (4.21)
μ ż_f = A_f z_f + B_f u   (4.22)
y = C_s x_s + C_f z_f,   (4.23)

where

A_s = A11 - A12 L,   B_s = B1 - μHB2 - μHLB1,
A_f = A22 + μLA12,   B_f = B2 + LB1,
C_s = C1 - C2 L,   C_f = C2 + μ(C1 - C2 L)H.

The solution of (4.20) which is of interest can be shown to be of the form [31]

H = A12 A22^{-1} + O(μ),   L = A22^{-1} A21 + O(μ).   (4.24)
Omitting initial conditions, the transfer function matrix P(s, μs, μ) from the input u(s) to the output y(s) is given by

P(s, μs, μ) = P_s(s, μ) + P_f(μs, μ),   (4.25)

where P_s(s, μ) = C_s(sI_n - A_s)^{-1}B_s is called the slow transfer function matrix, while

P_f(μs, μ) = C_f(μs I_m - A_f)^{-1}B_f

is called the fast transfer function matrix. It is worth noting that P_s(s, μ) has the low-pass property, i.e., P_s(s, μ) = O(1/s) for s ∈ [jω0, j∞) and some fixed ω0 > 0. On the other hand, using the identities (4.7) we can express P_f(μs, μ) as

P_f(μs, μ) = P_f(0, μ) + Δ(μs),   (4.26)

where P_f(0, μ) = -C_f A_f^{-1} B_f and

Δ(μs) = μs C_f A_f^{-1}(μs I_m - A_f)^{-1} B_f.   (4.27)

Hence, for low frequencies, i.e., for s ∈ [0, jω1] and some fixed ω1 > 0, we have

P_f(μs, μ) = P_f(0, μ) + O(μ),   ∀s ∈ [0, jω1].
In view of (4.26) and the above analysis, we can express P(s, μs, μ) as

P(s, μs, μ) = P0(s, μ) + Δ(μs),   (4.28)

where P0(s, μ) = P_s(s, μ) + P_f(0, μ) is a low-frequency approximation of P(s, μs, μ) and can be taken to represent the nominal plant. Therefore, Δ(μs) can be treated as an additive perturbation according to Definition 4.2. Using (4.24), it can be shown that P0(s) = P(s, 0, 0) (given by (4.18)) is also a low-frequency approximation of P(s, μs, μ), since P_s(s, μ) = P0(s) + O(μ) and P(s, μs, μ) = P_s(s, μ) + O(μ), ∀s ∈ [0, jω1]. Therefore, P0(s) can also be taken to be the nominal model. In this case, the corresponding additive perturbation is Δa(s) =
Δ(μs) + μΔ1(s, μ), where Δ1(s, μ) can be shown to be a proper stable transfer function matrix. We have shown that the singular perturbation model (4.14)-(4.16) can be treated in the complex domain as a nominal plant P0 with an additive perturbation Δa. We have to emphasize, however, that even though the overall transfer function matrix P(s, μs, μ) of the plant (4.14)-(4.16) is strictly proper, its nominal part P0(s, μ) or P0(s) may not be so. For example, in (4.28) P0(s, μ) = C_s(sI_n - A_s)^{-1}B_s - C_f A_f^{-1} B_f, which is strictly proper only if C_f A_f^{-1} B_f = 0. The fast dynamics or parasitics in (4.14)-(4.16), which create a throughput ||C_f A_f^{-1} B_f|| > O(μ) in P0(s, μ), are referred to as strongly controllable and strongly observable parasitics. As discussed in [31,8], if they are not taken into account in the control design, the results may be catastrophic. One way to eliminate the effect of strongly controllable and strongly observable parasitics is to augment (4.14)-(4.16) or (4.21)-(4.23) with low-pass filters [8] as follows: we pass y through the filter f1/(s + f0) for some fixed f0, f1 > 0, i.e.,

dŷ/dt = -f0 ŷ + f1 y = -f0 ŷ + f1 C1 x + f1 C2 z,   (4.29)

and augment (4.14)-(4.16) with (4.29) to obtain the (n + q + m)-dimensional system

ξ̇ = Ā11 ξ + Ā12 z + B̄1 u   (4.30)
μ ż = Ā21 ξ + A22 z + B2 u   (4.31)
ŷ = C̄1 ξ,   (4.32)

where ξ = [ŷ, x^T]^T and Ā11, Ā12, Ā21, B̄1, C̄1 are appropriately defined. Since z does not appear in the filtered output ŷ, i.e., C̄2 = 0, the parasitics are made weakly observable. In the complex domain, the transfer function matrix P̄(s, μs, μ) from u to ŷ is given by

P̄(s, μs, μ) = P̄0(s, μ) + Δ̄(μs, s),

where P̄0(s, μ) = P0(s, μ) f1/(s + f0) is now strictly proper and Δ̄(μs, s) = Δ(μs) f1/(s + f0) = [f1/(s + f0)] μs C_f A_f^{-1}(μs I_m - A_f)^{-1} B_f. Due to the filter and the stability of A_f, Δ̄(μs, s) = O(μ) for s = jω and all ω, whereas without the filter Δ(μs) can be of O(1) at high frequencies. We should note that in our definition the plant transfer function perturbations are assumed to have stable poles. This is not essential in robust nonadaptive control [30], where the plant perturbations are allowed to have a certain number of unstable poles. Usually the unstable poles in the plant perturbations arise due to parametric uncertainty. Since in adaptive control the parametric uncertainty is not considered as a plant perturbation, the assumption of stable plant perturbations is not restrictive. We illustrate this point with the following example. Consider the plant transfer function

G(s) = 1/(s - 1 + μ),
where μ is a small unknown constant. In nonadaptive control one would express
G(s) = G0(s)[1 + Δm(s)],   Δm(s) = -μ/(s - 1 + μ),
and model it as G0(s) = 1/(s - 1), i.e., Δm(s) has one unstable pole for μ < 1. In adaptive control we model G(s) as

G(s) = 1/(s + a)

for some a which is constant but unknown, i.e., Δm(s) ≡ 0. We illustrate the various types of plant uncertainties by the following examples.

Example 4.2.1. Consider the following equations describing the dynamics of a DC motor

J (dω/dt) = k i
L (di/dt) = -k ω - R i + v,

where i, v, R and L are the armature current, voltage, resistance and inductance respectively, J is the moment of inertia, ω is the angular speed, and ki and kω are the torque and back e.m.f. respectively. Defining x = ω, z = i, we have

ẋ = a1 z
μ ż = -a2 x - a3 z + v   (4.33)
y = x,
where a1 = k/J, a2 = k, a3 = R and μ = L, which is in the form of the singular perturbation model (4.14)-(4.16). The transfer function between the input v and the output y is given by

y(s)/v(s) = a1/(μs^2 + a3 s + a0) = P(s, μs, μ),   (4.34)

where a0 = a1 a2 = k^2/J. In most DC motors, the inductance L = μ is small and can be neglected, leading to the reduced-order or nominal plant transfer function

P0(s) = P(s, 0, 0) = a1/(a3 s + a0).   (4.35)

Using P0(s) as the nominal transfer function, we can express P(s, μs, μ) as

P(s, μs, μ) = P0(s) + Δa(s),   Δa(s) = -μ s^2 a1 / [(μs^2 + a3 s + a0)(a3 s + a0)],

where Δa(s) is strictly proper and stable since μ, a3, a0 > 0, or as

P(s, μs, μ) = P0(s)(1 + Δm(s)),
where Δm(s) = -μs^2/(μs^2 + a3 s + a0) is proper and stable. Let us now use Definition 4.4 and try to express P(s, μs, μ) in the form of (4.13). We can write

P(s, μs, μ) = [a1(μs^2 + a3 s + a0) - μs^2 a1] / [(μs^2 + a3 s + a0)(a3 s + a0)]
= [ a1/(s + λ) - μs^2 a1 / ((s + λ)(μs^2 + a3 s + a0)) ] · [ (a3 s + a0)/(s + λ) + 0 ]^{-1}

for some λ > 0, i.e., N0(s) = a1/(s + λ), D0(s) = (a3 s + a0)/(s + λ), Δ1(s) = -μs^2 a1/[(s + λ)(μs^2 + a3 s + a0)] and Δ2(s) = 0. □
Example 4.2.2. Consider the system shown in Fig. 5. Here T > 0 is a small time delay in the feedback. The transfer function is

y(s)/u(s) = P(s) = 1/(s + e^{-Ts} - 1).

For low frequencies, e^{-Ts} - 1 is O(T). Therefore, for small T the nominal plant can be taken to be

P0(s) = 1/s.   (4.36)

With P0(s) = 1/s, however, P(s) cannot be expressed in the form (4.11) with a stable additive perturbation Δa(s), or in the form (4.12) with a stable multiplicative perturbation Δm(s). We can express P(s), however, in the form (4.13) as follows:

P(s) = 1/(s + e^{-Ts} - 1) = [1/(s + λ)] / [s/(s + λ) + (e^{-Ts} - 1)/(s + λ)] = N0(s)/[D0(s) + Δ2(s)],

where N0(s) = 1/(s + λ), D0(s) = s/(s + λ) with λ > 0 is a coprime factorization of P0(s) = 1/s, and Δ1(s) = 0, Δ2(s) = (e^{-Ts} - 1)/(s + λ) are proper stable factor perturbations. □
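The stable-factor representation above can be verified numerically; in the following sketch the values of T, λ and the test frequency are illustrative assumptions:

```python
import cmath

T, lam = 0.01, 1.0   # illustrative values

def P(s):
    """P(s) = 1/(s + e^{-Ts} - 1)."""
    return 1.0 / (s + cmath.exp(-T * s) - 1.0)

def factor_form(s):
    """Stable-factor form N0/(D0 + Delta2) with Delta1 = 0."""
    N0 = 1.0 / (s + lam)
    D0 = s / (s + lam)
    Delta2 = (cmath.exp(-T * s) - 1.0) / (s + lam)
    return N0 / (D0 + Delta2)

s = 2.0j
err = abs(P(s) - factor_form(s))
# Size of the term neglected in the nominal model P0(s) = 1/s:
neglected = abs(cmath.exp(-T * s) - 1.0)   # = 2*|sin(w*T/2)| <= w*T
```

The factorization is exact at every frequency, while the neglected delay term is indeed O(T) at low frequencies.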
Fig. 5. System for Example 4.2.2.
4.3 Robustness Analysis: Known Parameters

In this subsection, we consider the general feedback system of Fig. 4 subject to the plant perturbations discussed in the last subsection. Our goal is to derive conditions which, if satisfied by the perturbations, guarantee closed-loop stability. The launching point for any robustness analysis is the stability of the nominal (i.e., unperturbed) plant-controller pair. Accordingly, we assume that in the absence of any perturbations each of the nine entries of H(s) is stable. Using this assumption, we can obtain the following robustness results.

Theorem 4.1 Let the nominal plant P0(s) be subjected to an additive perturbation Δa(s). Then, the closed-loop system of Fig. 4 with P0(s) replaced by P0(s) + Δa(s) is Lp-stable provided that

||[I + C1C2P0]^{-1}C1C2Δa||_ip < 1.³
(4.37)
Proof. The perturbed plant-controller configuration for this case is shown in Fig. 6. We first note that in the absence of Δa we have a stable nominal closed loop consisting of only rational transfer function matrices. As a result, it follows [24,29] that the nominal feedback system is Lp-stable ∀p ∈ [1, ∞]. Thus, the stability of the perturbed closed-loop system will be established once it is shown that r, d1, d2 ∈ Lp implies that e2 ∈ Lp. From Fig. 6, using (4.8), it follows that

e2 = [I + C1C2P0]^{-1}C1 r - [I + C1C2P0]^{-1}C1C2 d1 + [I + C1C2P0]^{-1} d2 - [I + C1C2P0]^{-1}C1C2 Δa[e2].

³ In this paper, ||·||_ip denotes the operator norm induced by the Lp norms on the input and output signals.
Fig. 6. Perturbed plant-controller configuration (additive perturbation).
Truncating the signals, taking Lp norms on both sides and making use of Minkowski's inequality [24], it follows that

||e2t||_p ≤ ||([I + C1C2P0]^{-1}C1 r)_t||_p + ||([I + C1C2P0]^{-1}C1C2 d1)_t||_p + ||([I + C1C2P0]^{-1} d2)_t||_p + ||([I + C1C2P0]^{-1}C1C2 Δa[e2])_t||_p.

Now, from the Lp-stability of the nominal H(s), it follows that each of the operators [I + C1C2P0]^{-1}C1, [I + C1C2P0]^{-1}C1C2, [I + C1C2P0]^{-1} has a finite induced Lp norm. Now, assuming that Δa is causal, we obtain

||e2t||_p ≤ ||[I + C1C2P0]^{-1}C1||_ip ||r_t||_p + ||[I + C1C2P0]^{-1}C1C2||_ip ||d1t||_p + ||[I + C1C2P0]^{-1}||_ip ||d2t||_p + ||[I + C1C2P0]^{-1}C1C2Δa||_ip ||e2t||_p.

It now follows that if (4.37) holds, then

||e2t||_p ≤ k1 ||r_t||_p + k2 ||d1t||_p + k3 ||d2t||_p

for some k1, k2, k3 > 0. Thus, r, d1, d2 ∈ Lp ⇒ e2 ∈ Lp. This completes the proof. □
A few remarks concerning Theorem 4.1 are now in order:
Remark 4.1. It is well known that the ||·||_A norm [29] of any linear time-invariant operator provides an upper bound on its induced Lp norm ∀p ∈ [1, ∞]. Thus, if (4.37) holds with ||·||_ip replaced by ||·||_A, then the closed loop is guaranteed to be stable ∀p ∈ [1, ∞]. However, ||·||_A is difficult to compute, and the condition (4.37), with ||·||_ip replaced by ||·||_A, cannot be easily verified in the frequency domain. □
Remark 4.2. If we choose p = 2 in Theorem 4.1, then we obtain a condition which can be easily verified in the frequency domain, since (4.37) now becomes

||[I + C1C2P0]^{-1}C1C2Δa(s)||_∞ < 1.

This is precisely the popular "loop-gain-less-than-one" stability result in robust control [32]. □

Remark 4.3. In an effort to obtain a stability condition which can be readily checked in the frequency domain, we are now restricted to concluding only L2 stability. However, if (4.37) holds with ||·||_ip replaced by the operator norm induced by the exponentially weighted L2 norm ||(·)_t||_{2δ} for some δ > 0, then it can be shown as in [24] that the perturbed system is also L∞-stable, i.e., L2 stability with exponential weighting implies L∞ stability.

Theorem 4.2 Let the nominal plant P0(s) be subjected to a multiplicative perturbation Δm(s). Then, the closed-loop system in Fig. 4 with P0(s) replaced by [I + Δm(s)]P0(s) is Lp-stable provided that

||[I + P0C1C2]^{-1}P0C1C2Δm||_ip < 1.
(4.38)
Proof. The proof is very similar to that of Theorem 4.1 and is therefore omitted. □

Remark 4.4. In many controller structures C2 = I. Then S ≜ (I + P0C1)^{-1} is called the sensitivity, while T ≜ (I + P0C1)^{-1}P0C1 is called the complementary sensitivity. Clearly, S and T are related by

S + T = I.
(4.39)
Now from (4.38), it is clear that the smaller T is,⁴ the larger the robustness margin. In other words, a small T enhances robustness. On the other hand, from the (1,1) entry of H(s), it is clear that for good tracking performance S must be made small. Moreover, a small S will lead to better output disturbance rejection. Thus, a smaller S enhances performance. Since S and T are constrained by the relationship (4.39), it follows that in any control design there is a fundamental tradeoff between performance and robustness. H2 and H∞ optimal control [33] have emerged as powerful tools for achieving such a tradeoff. Now, tracking high-frequency signals is not very meaningful. Thus, for tracking purposes, S should be made small only in the low-frequency range. On the other hand, at high frequencies the plant is very poorly known, so that it is desirable to have T small at high frequencies. These simple observations give the control engineer a qualitative idea of how to achieve such a tradeoff. Of course, the actual design may involve H2/H∞ controller synthesis techniques. □

⁴ Here the size of T is measured by its H∞ norm.
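The identity (4.39) and the low/high-frequency roles of S and T can be seen on a hypothetical scalar loop. In the sketch below, P0 = 1/(s+1) and C1 = 10/s (with C2 = I) are assumed example choices, not taken from the text:

```python
def L(s):
    """Loop gain P0(s)*C1(s) for the assumed example loop."""
    return (1.0 / (s + 1.0)) * (10.0 / s)

def sens(s):
    """Sensitivity S = (1 + P0*C1)^{-1}."""
    return 1.0 / (1.0 + L(s))

def comp(s):
    """Complementary sensitivity T = (1 + P0*C1)^{-1}*P0*C1."""
    return L(s) / (1.0 + L(s))

# S + T = 1 at every frequency; S is small at low frequency (tracking),
# T is small at high frequency (robustness).
identity_err = max(abs(sens(1j * w) + comp(1j * w) - 1.0)
                   for w in (0.01, 1.0, 100.0))
S_low = abs(sens(0.01j))
T_high = abs(comp(100.0j))
```

The integrator in C1 forces S to roll off at low frequency, while the plant roll-off forces T down at high frequency, illustrating how the tradeoff is distributed over frequency.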
Theorem 4.3 Let the nominal plant P0(s) have a right coprime factorization (N0(s), D0(s)) and let (Δ1(s), Δ2(s)) be a stable factor perturbation of P0(s), so that

P(s) = [N0(s) + Δ1(s)][D0(s) + Δ2(s)]^{-1}   (4.40)

represents the perturbed plant. Then, the feedback system of Fig. 4 with P0(s) replaced by P(s) of (4.40) is Lp-stable provided that

||[D0 + C1C2N0]^{-1}(C1C2Δ1 + Δ2)||_ip < 1.   (4.41)
Proof. The proof is essentially the same as that of Theorem 4.1 and is therefore omitted.
[]
Remarks 4.1, 4.2 and 4.3 apply to Theorems 4.2 and 4.3 as well.

4.4 Control Structures for Adaptive Control

In this subsection, we describe three control structures which have become very popular in the adaptive control literature. They are the model reference control structure, the pole placement control structure and the linear quadratic control structure. The controllers themselves are designed on the assumption that there are no disturbances (i.e., d1 = d2 = 0) or unmodeled dynamics. However, each of these control structures is shown to be a special case of the general feedback system of Fig. 4, so that the robustness results of the last subsection apply to each one of them. We begin with the model reference control structure.

Model Reference Control Structure. In model reference control, the objective is to stabilize the plant and to force its output y to track the output ym(t) of a reference model given by
ym(s) = Wm(s) r(s) = km [Zm(s)/Rm(s)] r(s)   (4.42)

for any bounded, piecewise continuous reference signal r(t), where km > 0 is a constant and Zm(s), Rm(s) are monic Hurwitz polynomials of degree mr, nr respectively. In order to achieve such an objective, the following assumptions [2] are made about the modeled part of the plant transfer function

P0(s) = kp Z0(s)/R0(s):   (4.43)

(M1) Z0(s) is a monic Hurwitz polynomial of degree m ≤ n - 1,
(M2) R0(s) is a monic polynomial of degree n,
(M3) the relative degree of Wm(s) is nr - mr = n - m, and nr ≤ n.
The MRC law is given by

u = θ1*^T [α(s)/Λ(s)] u + θ2*^T [α(s)/Λ(s)] y + θ3* y + c0* r,   (4.44)

where α(s) = [s^{n-2}, s^{n-3}, …, 1]^T, Λ(s) = Λ0(s)Zm(s), Λ0(s) is a monic Hurwitz polynomial of degree n - mr - 1, and θi*, i = 1, 2, 3, and c0* are the constant controller parameters to be determined so that the control objective is achieved for the modeled part of the plant P0(s). It can be shown [2] that there exist θ1*, θ2*, θ3*, c0* such that the control objective is achieved for the nominal plant P0(s). Now, with

C1(s) = c0*Λ(s)/[Λ(s) - θ1*^T α(s)],   C2(s) = [θ2*^T α(s) + θ3*Λ(s)]/[c0*Λ(s)],

it can be easily shown that the general feedback system of Fig. 4 reduces to the model reference control structure of (4.44). We use the following simple example to illustrate the design, analysis and robustness of a model reference control scheme.

Example 4.4.1. Consider the plant

y = [1/(s + a)](1 + Δm(s))u,   (4.45)
where Δm(s) is a multiplicative plant uncertainty which is stable, and a is the parameter of the modeled part of the plant. The control objective is to choose u such that all the signals in the closed-loop plant are bounded and y is as close as possible to the output ym of the reference model

ym = [bm/(s + am)][r],   (4.46)

where bm, am > 0 are known, for any bounded reference input signal r(t). If a is known, the MRC law (4.44) yields

u = θ3* y + bm r
(4.47)
with θ3* = a - am, and this choice of u guarantees that the control objective is met for Δm(s) = 0. When Δm(s) ≠ 0, we have

[s + am - θ3*Δm(s)]y = bm(1 + Δm(s))r,

so that the closed-loop characteristic equation is

(s + am) - θ3*Δm(s) = 0,   or   1 - θ3* Δm(s)/(s + am) = 0.
Since θ3*/(s + am) is a stable transfer function, it follows from the small gain theorem that a sufficient condition for the closed-loop system to be stable is that Δm(s) satisfy

||θ3* Δm(s)/(s + am)||_∞ < 1,   (4.48)
which agrees with the robustness condition (4.38). Furthermore, the tracking error er ≜ y - ym satisfies

er = [bm(s + am + θ3*)Δm(s)] / [(s + am)(s + am - θ3*Δm(s))] [r].
□

The same example can be analyzed in the presence of additive and stable factor perturbations without any significant changes. Condition (4.48) for stability also involves the transfer function of the reference model. One can choose am so that a - am is as small as possible, depending on the flexibility of the control objective.
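As a numerical illustration of the small-gain condition (4.48), the sketch below assumes a = 3, am = 1 (so θ3* = a - am = 2) and a hypothetical stable multiplicative uncertainty Δm(s) = -μs/(μs + 1); both are our example choices, not taken from the text:

```python
mu, a, am = 0.1, 3.0, 1.0      # illustrative values
theta3 = a - am                # theta3* = a - am

def loop_gain(w):
    """|theta3* * Dm(jw) / (jw + am)| for the assumed Dm(s) = -mu*s/(mu*s+1)."""
    s = 1j * w
    Dm = -mu * s / (mu * s + 1.0)
    return abs(theta3 * Dm / (s + am))

# Frequency sweep of the quantity bounded in (4.48)
gains = [loop_gain(10.0 ** (k / 10.0)) for k in range(-30, 31)]
peak = max(gains)              # crude estimate of the H-infinity norm
```

For this μ the estimated peak stays well below one, so (4.48) certifies closed-loop stability despite the unmodeled dynamics.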
Pole Placement Control Structure. In pole placement control, the objective is to stabilize the plant P0(s) by assigning the poles of the closed-loop plant to the roots of some desired polynomial. This stabilization objective can be extended to tackle the tracking problem for a special class of reference input signals by using the so-called internal model principle. Suppose the signal ym(t) to be tracked satisfies

Q(p)ym(t) = 0,   (4.49)

where Q(p) is a fixed, known monic polynomial of the differential operator p = d/dt(·) of degree q0 with non-repeated roots on the jω axis. In pole placement control, we seek to synthesize the control input u(t) which places the closed-loop poles of the system at the roots of a given Hurwitz polynomial A*(s) of degree 2n + q0 - 1 and guarantees that the output y(t) tracks ym(t). Q(s) is called the internal model of ym(t). In order to achieve the pole placement control objective, we make the following assumptions about the modeled part of the plant transfer function P0(s) = Z0(s)/R0(s):

(P1) R0(s) = s^n + R̄0(s, θ1*) is a monic polynomial of degree n,
(P2) Z0(s) = Z̄0(s, θ2*) is a polynomial of degree ≤ n - 1, and
(P3) Q(s)R0(s) and Z0(s) are coprime.
Here θ1*, θ2* are vectors whose elements are the coefficients of R̄0(s), Z̄0(s) respectively. The pole placement control law is given by [10,20]

u = {[Λ1(s) - P(s)Q(s)]/Λ1(s)}[u] + {L(s)/Λ1(s)}[ym - y],   (4.50)
where L(s), P(s) are polynomials of degree q0 + n - 1 and n - 1 respectively, obtained by solving the Bezout equation

P(s)Q(s)R0(s) + L(s)Z0(s) = A*(s),   (4.51)

where q0 is the degree of Q(s), Λ1(s) is a Hurwitz polynomial of degree n + q0 - 1, and A*(s) is the desired closed-loop stable polynomial of degree 2n + q0 - 1. The coprimeness of Q(s)R0(s) and Z0(s) guarantees that (4.51) has a unique solution [30]. With C1(s) = L(s)/[P(s)Q(s)], C2(s) = 1, it can be easily shown that the general feedback system of Fig. 4 reduces to the pole placement control structure. From (4.50),

P(s)Q(s)u = L(s)(ym - y)
P(s)Q(s)Z0(s)u = L(s)Z0(s)(ym - y).

Using Z0(s)u = R0(s)y, we obtain

[P(s)Q(s)R0(s) + L(s)Z0(s)]y = L(s)Z0(s)ym.

Using (4.51),

y = [L(s)Z0(s)/A*(s)] ym = {[A*(s) - P(s)Q(s)R0(s)]/A*(s)} ym,

i.e.,

y - ym = -[P(s)R0(s)/A*(s)][Q(s)ym].

Since Q(s)ym = 0, it follows that the tracking error er = y - ym will converge to zero exponentially with time. Since no cancellation of unstable polynomials took place, the closed-loop plant is also internally stable. In order to implement (4.50)-(4.51), however, we need to know R0(s) and Z0(s), i.e., the parameters θ1*, θ2* of the plant. Let us now use a simple example to illustrate the design, analysis and robustness of a pole placement control scheme.

Example 4.4.2. Consider the plant
y = [b/(s + a)](1 + Δm(s))u,
(4.52)
where Δm(s) is a multiplicative plant uncertainty which is stable, and a, b are the parameters of the modeled part of the plant. The control objective is to choose u such that the poles of the closed-loop system are placed at the roots of a given Hurwitz polynomial, say A*(s) = (s +
1)^2, and the output y(t) tracks a unit step input u_s(t). Thus, here ym(t) = u_s(t) and Q(s) = s. The control law

u = [λ1/(s + λ1)][u] + [(l1 s + l2)/(s + λ1)][ym - y],

where l1, l2 satisfy

s(s + a) + (l1 s + l2)b = (s + 1)^2,

i.e.,

l1 = (2 - a)/b,   l2 = 1/b,

and λ1 > 0 is a design constant, can be shown to meet the control objective when Δm(s) = 0. When Δm(s) ≠ 0, the closed-loop plant is given by

y = {b(1 + Δm(s))(l1 s + l2) / [(s + 1)^2 + bΔm(s)(l1 s + l2)]} [ym].

Therefore, for

||bΔm(s)(l1 s + l2)/(s + 1)^2||_∞ < 1,   (4.53)

the closed-loop plant is stable. Furthermore, the tracking error er = y - ym is given by

er = -{(s + a) / [(s + 1)^2 + bΔm(s)(l1 s + l2)]} [s ym] = -{(s + a) / [(s + 1)^2 + bΔm(s)(l1 s + l2)]} [0],

i.e., if (4.53) is satisfied, er(t) converges to zero exponentially fast.
□
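The design computation of Example 4.4.2 can be sketched as follows; the plant parameter values a, b are illustrative assumptions:

```python
a, b = 0.5, 2.0      # illustrative plant parameters

# Bezout solution of s(s+a) + (l1*s + l2)*b = (s+1)^2
l1 = (2.0 - a) / b
l2 = 1.0 / b

# Closed-loop polynomial coefficients (highest power first):
# s^2 + (a + l1*b)*s + l2*b
closed_loop = [1.0, a + l1 * b, l2 * b]
desired = [1.0, 2.0, 1.0]     # (s+1)^2
```

Whatever the (known) values of a and b ≠ 0, the two controller gains place both closed-loop poles at s = -1 exactly.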
The same result can be established in the presence of additive and stable factor perturbations.

Linear Quadratic Control Structure. In this case the objective is to minimize a particular cost criterion while stabilizing the plant and forcing its output y(t) to track the reference signal ym(t), which satisfies Q(p)ym(t) = 0 (where p ≜ d/dt(·)) for some monic polynomial Q(p) of degree q0 with no repeated roots on the jω axis. To obtain a solution to this problem, we make the following assumptions about the modeled part of the plant transfer function P0(s) = Z0(s)/R0(s):

(L1) R0(s) = s^n + R̄0(s, θ1*) and Z0(s) = Z̄0(s, θ2*) are coprime polynomials of degree n and m respectively, with m < n, and
(L2) Z0(s) and Q(s) do not have any common zeros on the jω axis.

Here, θ1* and θ2* are vectors with the coefficients of R̄0(s), Z̄0(s) respectively. The tracking problem is first transformed into a regulator problem as follows. In Fig. 4, let C2(s) = 1. Then -e0 = y - ym, which can be rewritten as

-e0 = [Z0(s)/R0(s)]u - ym  ⇒  -R0(s)e0 = Z0(s)u - R0(s)ym.
Ioannou and Datta
Filtering each side of the above equation by Q(s)/Q_1(s), where Q_1(s) is an arbitrary monic Hurwitz polynomial of degree q_0, and using the fact that Q(p)y_m(t) = 0, we obtain

e_0 = -\bar P_0(s)\,\bar u, \qquad (4.54)

where

\bar P_0(s) = \frac{Z_0(s)}{R_0(s)}\,\frac{Q_1(s)}{Q(s)}, \qquad \bar u = \frac{Q(s)}{Q_1(s)}\,[u].

The LQ controller is designed by minimizing the cost criterion

J(\bar u) = \int_0^\infty \left[ e_0^2(t) + \lambda\,\bar u^2(t) \right] dt, \qquad (4.55)

where \lambda > 0 is the control weighting. Let A, b, C be a minimal state-space realization of \bar P_0(s) in the observable canonical form. Then,
A = \begin{bmatrix} -\theta_1^* & \begin{matrix} I_{n+q_0-1} \\ 0 \;\cdots\; 0 \end{matrix} \end{bmatrix}, \qquad b = \theta_2^*, \qquad (4.56)

where \theta_1^*, \theta_2^* are the coefficient vectors of [s^n + \bar R_0(s,\theta_a^*)]\,Q(s) - s^{n+q_0} and \bar Z_0(s,\theta_b^*)\,Q_1(s) respectively, and

C = [1\; 0\; \cdots\; 0]. \qquad (4.57)
From standard LQR theory for the infinite interval problem [34], it follows that the optimal control \bar u is given by

\bar u = -G e, \qquad G = \frac{1}{\lambda}\,b^T P, \qquad (4.58)

where e is the state vector corresponding to the realization (A, b, C) and P satisfies the Algebraic Riccati Equation

A^T P + P A + C^T C - \frac{1}{\lambda}\,P b b^T P = 0. \qquad (4.59)
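Numerically, the stabilizing solution P of (4.59) can be obtained from the stable invariant subspace of the associated Hamiltonian matrix. The sketch below implements this for an illustrative double-integrator realization (not the example in the text); in practice a library routine such as scipy.linalg.solve_continuous_are would be used instead.

```python
import numpy as np

def solve_are(A, b, C, lam):
    """Stabilizing solution of A'P + PA + C'C - (1/lam) P b b' P = 0,
    computed from the stable invariant subspace of the Hamiltonian matrix."""
    n = A.shape[0]
    H = np.block([[A, -np.outer(b, b) / lam],
                  [-C.T @ C, -A.T]])
    w, V = np.linalg.eig(H)
    S = V[:, w.real < 0]                 # eigenvectors of stable eigenvalues
    return np.real(S[n:] @ np.linalg.inv(S[:n]))

# illustrative data: a double integrator with lam = 1
A = np.array([[0.0, 1.0], [0.0, 0.0]])
b = np.array([0.0, 1.0])
C = np.array([[1.0, 0.0]])
lam = 1.0
P = solve_are(A, b, C, lam)
G = b @ P / lam                          # optimal gain, u_bar = -G e
print(np.round(P, 4))                    # approx [[1.4142, 1.], [1., 1.4142]]
print(np.round(G, 4))                    # approx [1., 1.4142]
```

For this classical example the exact solution is P = [[√2, 1], [1, √2]], so the computed gain is G = [1, √2].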
When the state vector e is not available, it is replaced in (4.58) by its estimate obtained from the following state observer:

\dot{\hat e} = A \hat e + b \bar u + k\,(C \hat e + e_0), \qquad (4.60)

where k is chosen so that A + kC is a stable matrix. Using standard LQG theory [34], it is possible to obtain an optimal choice for k which involves the solution of a dual Riccati equation. However, any k that makes A + kC a stable matrix is an acceptable choice. Thus, k is chosen to satisfy

A + kC = A^*, \qquad (4.61)
where A^* is a stable matrix. Without loss of generality, we assume that A^* has the following form

A^* = \begin{bmatrix} -a^* & \begin{matrix} I_{n+q_0-1} \\ 0 \;\cdots\; 0 \end{matrix} \end{bmatrix}, \qquad (4.62)

so that (4.61) is guaranteed to have a solution for k. The control u(t) is thus given by

u = \frac{Q_1(s)}{Q(s)}\,[\bar u], \qquad \bar u = -G \hat e, \qquad G = \frac{1}{\lambda}\,b^T P, \qquad (4.63)
where \hat e is obtained from (4.60) and P is the unique nonnegative definite solution of (4.59). With

C_1(s) = G\,(sI - A - kC + bG)^{-1}\,k, \qquad C_2(s) = 1,

it can be easily shown that the general feedback system of Fig. 4 reduces to the linear quadratic control structure. Let us now use the same example as in the pole placement case to illustrate the design, analysis and robustness of an LQ control scheme.

Example 4.4.3. Consider the plant
y = \frac{b_1}{s+a}\,(1+\Delta_m(s))\,u, \qquad (4.64)
where \Delta_m(s) is a multiplicative plant uncertainty which is stable, and a, b_1 are the parameters of the modeled part of the plant. The control objective is to choose u to stabilize the plant in a way so as to minimize a particular performance index while making the output y(t) track a unit step input u_s(t). Thus, here y_m(t) = u_s(t) and Q(s) = s. Choosing Q_1(s) = s + q, with q > 0 arbitrary, we can convert the tracking problem for (4.64) into an equivalent regulation problem for
e_0 = -\frac{(s+q)\,b_1}{s(s+a)}\,[1+\Delta_m(s)]\,\bar u, \qquad (4.65)

where

e_0 \triangleq y_m - y, \qquad \bar u = \frac{s}{s+q}\,[u].
Equation (4.65) can be realized in the state-space form as

\dot e = A e + b\,(\bar u + \Delta_m(s)\,\bar u), \qquad e_0 = -C e, \qquad (4.66)
where

A = \begin{bmatrix} -a & 1 \\ 0 & 0 \end{bmatrix}, \qquad b = \begin{bmatrix} b_1 \\ b_1 q \end{bmatrix}, \qquad C = [1\; 0].

The LQ control law

\bar u = -G e, \qquad G = \frac{1}{\lambda}\,b^T P, \qquad (4.67)
where P is the unique nonnegative definite solution of the Algebraic Riccati Equation (ARE)

A^T P + P A + C^T C - \frac{1}{\lambda}\,P b b^T P = 0, \qquad (4.68)

can be shown to meet the control objective of minimizing the cost criterion \int_0^\infty [e_0^2(t) + \lambda\,\bar u^2(t)]\,dt in the ideal case, i.e., for \Delta_m(s) = 0. Since the state vector e is not available, it is replaced in (4.67) by its estimate obtained from the state observer

\dot{\hat e} = A \hat e + b \bar u + k\,(C \hat e + e_0), \qquad (4.69)

where k = [a-a_1,\; -a_2]^T and a_1, a_2 > 0 are chosen such that s^2 + a_1 s + a_2 is a Hurwitz polynomial. To summarize, the control input u for (4.64) is given by
u = \frac{s+q}{s}\,[\bar u], \qquad \bar u = -G \hat e, \qquad G = \frac{1}{\lambda}\,b^T P, \qquad (4.70)
where \hat e is obtained from (4.69) and P is the unique nonnegative definite solution of (4.68). Equations (4.70), (4.68) and (4.69) guarantee that the LQ control objective is met when \Delta_m(s) = 0. When \Delta_m(s) \neq 0, we analyze the closed-loop system as follows. From (4.69), we have

\hat e = [sI - (A+kC)]^{-1}\,[b \bar u + k e_0].

Substituting this expression in (4.70) we obtain \bar u = -G\,[sI-(A+kC)]^{-1}\,[b\bar u + k e_0], or

\left[1 + G\,[sI-(A+kC)]^{-1} b\right] \bar u = -G\,[sI-(A+kC)]^{-1} k\, e_0.

Combining the above expression with (4.65), we obtain

\left[1 + G(sI-(A+kC))^{-1}b\right] e_0 = \frac{(s+q)\,b_1}{s(s+a)}\,[1+\Delta_m(s)]\,G\,[sI-(A+kC)]^{-1}k\,e_0,

or

\left\{ s(s+a)\left[1 + G(sI-(A+kC))^{-1}b\right] - (s+q)\,b_1\,G[sI-(A+kC)]^{-1}k - (s+q)\,b_1\,\Delta_m(s)\,G[sI-(A+kC)]^{-1}k \right\} e_0 = 0. \qquad (4.71)
Using the small gain theorem as in the earlier examples, we conclude that the closed-loop system is stable provided that

\left\| \frac{(s+q)\,b_1\,\Delta_m(s)\,G[sI-(A+kC)]^{-1}k}{s(s+a)\left[1+G(sI-(A+kC))^{-1}b\right] - (s+q)\,b_1\,G[sI-(A+kC)]^{-1}k} \right\|_\infty < 1, \qquad (4.72)

which agrees with the robustness condition (4.38). Also, from (4.71) and the stability of the closed loop it is clear that the tracking error e_r \triangleq y - y_m = -e_0 \to 0 exponentially fast. \Box

The same result can be established in the presence of additive and stable factor perturbations.
5 Robust Adaptive Control Schemes

In Sect. 4.4, we have considered the design of three particular robust controller structures. However, when the parameters of the modeled part of the plant are unknown, the control input u cannot be calculated and therefore none of these controller structures can be implemented. A natural approach to follow in the unknown parameter case is to use the same control law as in the known parameter case, replacing the unknown controller parameters with their estimates obtained by using the identification techniques of Sect. 3. This approach is called Certainty Equivalence and has been widely used in the design of adaptive control schemes. The way the controller parameters are estimated led to two different classes of adaptive control schemes, called direct adaptive control and indirect adaptive control, which are treated in the following subsections.

5.1 Direct Robust Adaptive Control

The appropriate control input u that will force the unknown plant to meet the control objective is obtained by using the following steps:
Step 1: We derive the control law that can meet the control objective when the plant parameters are known. This step demonstrates that there is sufficient structural flexibility in the closed-loop plant which allows us to meet the control objective.
Step 2: We use the same control law as in Step 1 but with the controller parameters replaced by their estimates generated by an adaptive law. The adaptive law is designed by first parametrizing the plant in terms of the unknown desired controller parameters and then using any of the methods given in Sect. 3 to derive the adaptive law.
Step 3: We analyze the adaptive control scheme formed in Step 2 and show that it meets the control objective.
The adaptive control scheme developed as in Step 2 is referred to as direct adaptive control for the simple reason that the controller parameters are estimated directly, without any explicit information about the plant parameters. The derivation of the adaptive law in Step 2 assumes that the plant equation can be parametrized in terms of the unknown controller parameter vector \theta^*, i.e., the unknown plant parameters are expressed in terms of \theta^* and no longer appear in the plant equation. In order to eliminate the unknown plant parameters and express the plant equation in terms of \theta^* and in the form of the parametric models considered in Sect. 3, we have to cancel the unknown zeros of the modeled part of the plant by filtering. This cancellation restricts the modeled part of the plant to be minimum-phase, i.e., to have stable zeros. As a result, the only class of direct adaptive control for which a complete stability analysis exists is the one based on the model reference approach [2,9,10,11,13,16,20], which requires the modeled part of the plant to have stable zeros. We illustrate the design and analysis of model reference adaptive control (MRAC) using a simple example.

Example 5.1.1. Consider the plant

y = \frac{1}{s+a}\,(1+\Delta_m(s))\,u, \qquad (5.1)
where a is an unknown constant and \Delta_m(s) is a multiplicative plant uncertainty which is assumed to be analytic in Re[s] \ge 0. The control objective is to choose the input u(t) such that all the signals in the closed loop are bounded and the output y tracks the output of the reference model

y_m = \frac{b_m}{s+a_m}\,[r], \qquad (5.2)
where a_m > 0, b_m are known parameters, as closely as possible for any bounded reference input signal r(t). We design the MRAC scheme by following Steps 1-3.

Step 1: As we have shown in Sect. 4.4, the control law

u = \theta_0^*\,y + b_m\,r, \qquad \theta_0^* = a - a_m, \qquad (5.3)

meets the control objective exactly when \Delta_m(s) = 0. Furthermore, when \Delta_m(s) \neq 0, the control law (5.3) guarantees that for all \Delta_m(s) that satisfy

\left\| \frac{\theta_0^*\,\Delta_m(s)}{s+a_m} \right\|_\infty = \left\| \frac{(a-a_m)\,\Delta_m(s)}{s+a_m} \right\|_\infty < 1, \qquad (5.4)

we have signal boundedness, and the tracking error e_r = y - y_m satisfies

e_r = \frac{(s+a)\,b_m\,\Delta_m(s)}{(s+a_m)\left(s+a_m - \theta_0^*\,\Delta_m(s)\right)}\,[r].
Step 2: When a is unknown, (5.3) cannot be implemented. Therefore, instead of (5.3) we propose the control law

u = \theta_0(t)\,y + b_m\,r, \qquad (5.5)
where \theta_0(t) is the estimate of \theta_0^* at time t. The adaptive law for generating \theta_0 is developed as follows: we substitute a = \theta_0^* + a_m in the plant equation to obtain (s + a_m + \theta_0^*)[y] = (1+\Delta_m(s))[u]. Filtering by 1/(s+a_m), we have

z = \frac{1}{s+a_m}\left[\theta_0^*\,w + \eta_0\right], \qquad (5.6)
where z = -y + \frac{1}{s+a_m}[u], w = y, \eta_0 = -\Delta_m(s)[u], which is in the form of the linear parametric model (3.3). The adaptive law for estimating \theta_0^* can be developed by using any one of the methods given in Sect. 3. Let us consider the Lyapunov method and the gradient method.

(i) Lyapunov Method. Since \frac{1}{s+a_m} is SPR, we choose L(s) = 1 and use the results of Sect. 3.1 to obtain the adaptive law

\dot\theta_0 = -\gamma\,\epsilon\,\zeta - \gamma\,\sigma\,\theta_0, \qquad \zeta = w = y,
\epsilon \triangleq \hat z - z = \frac{1}{s+a_m}\left[\phi_0\,\zeta - \epsilon\,n_s^2 - \eta\right], \qquad (5.7)

where \phi_0 = \theta_0 - \theta_0^* and \eta = \eta_0 = -\Delta_m(s)[u].

(ii) Gradient Method. Using the results of Sect. 3.2, we have

\dot\theta_0 = -\gamma\,\epsilon\,\zeta - \gamma\,\sigma\,\theta_0, \qquad \zeta = \frac{1}{s+a_m}[y],
\epsilon = \frac{\theta_0\,\zeta - z}{m^2} = \frac{\phi_0\,\zeta - \bar\eta}{m^2}, \qquad \bar\eta = -\frac{1}{s+a_m}\,\Delta_m(s)\,[u], \qquad (5.8)

where m^2 = \alpha + \beta\,n_s^2 for some \alpha, \beta > 0. The normalizing signal n_s is generated as n_s^2 = m_s, where

\dot m_s = -\delta_0\,m_s + |u|^2 + |y|^2, \qquad m_s(0) = 0, \qquad (5.9)
for some a_m > \delta_0 > 0 chosen such that \Delta_m(s - \delta_0/2) has stable poles. Hence, the first assumption that \Delta_m(s) has to satisfy in the adaptive case is

(A1) \Delta_m(s) is analytic in Re[s] \ge -\delta_0/2 for some known \delta_0 > 0.

Let us now check whether n_s generated from (5.9) has the property \zeta/m, \eta/m \in L_\infty in (5.7), (5.8). We first consider (5.7), where \zeta = y and \eta = -\Delta_m(s)[u]. We have \dot y = -a_m y + (1+\Delta_m(s))u - (a-a_m)y. Using Lemma 2.2, we have

|y(t)| \le c\,\left\|\frac{1+\Delta_m(s)}{s+p}\right\|_\infty^{\delta_0}\,\|u_t\|_2^{\delta_0} + c\,\|y_t\|_2^{\delta_0}

for some p > \delta_0 and c \in \mathbb{R}_+. Therefore, if \Delta_m(s) is proper, i.e., the H_\infty-norms above are finite, we have

|y(t)| \le c\,n_s(t) + c

for some c \in \mathbb{R}_+, which implies that \zeta/m \in L_\infty. Similarly,

|\eta(t)| \le c\,\|(s+p)\,\Delta_m(s)\|_\infty^{\delta_0}\,\|u_t\|_2^{\delta_0},

which implies that \eta/m \in L_\infty, provided \Delta_m(s) is strictly proper. The requirement that \Delta_m(s) be strictly proper can be relaxed by filtering both sides of (5.6) as shown in [17] before deriving the adaptive law (5.7). Let us now consider (5.8). We have
|\zeta(t)| \le c\,\left\|\frac{s+p}{s+a_m}\right\|_\infty^{\delta_0}\,\|y_t\|_2^{\delta_0}, \qquad |\bar\eta(t)| \le c\,\left\|\frac{(s+p)\,\Delta_m(s)}{s+a_m}\right\|_\infty^{\delta_0}\,\|u_t\|_2^{\delta_0},

which imply that \zeta/m, \bar\eta/m \in L_\infty provided \Delta_m(s) is proper. As we showed in Sect. 3, the adaptive laws (5.7), (5.8) guarantee that

(i) \theta_0, \epsilon \in L_\infty,
(ii) \epsilon, \epsilon n_s, \dot\theta_0 \in \mathcal{S}(\eta^2/m^2),

independent of the boundedness of all the other signals in the closed-loop plant.

Step 3: In this step we use the properties (i) and (ii) of the adaptive law to establish stability and robustness of the overall closed-loop plant, which can be represented by the following equations:

u = \theta_0^*\,y + b_m\,r + \phi_0\,y,
y = y_m + \frac{1}{s+a_m}\,[\phi_0 y] + \frac{\Delta_m(s)}{s+a_m}\,[u]. \qquad (5.10)
Using the properties of the L_{2\delta} norm and Lemma 2.2, we have

\|u_t\|_2^{\delta_0} \le c\,\|y_t\|_2^{\delta_0} + c\,\|(\phi_0 y)_t\|_2^{\delta_0} + c,
\|y_t\|_2^{\delta_0} \le c\,\|(\phi_0 y)_t\|_2^{\delta_0} + c\,\Delta_1\,\|u_t\|_2^{\delta_0} + c,

where

\Delta_1 = \left\|\frac{\Delta_m(s)}{s+a_m}\right\|_\infty^{\delta_0},

and c \in \mathbb{R}_+ is a general bound for the various finite constants which are independent of \Delta_m(s). Hence,

m \le c + c\,\left(\|u_t\|_2^{\delta_0} + \|y_t\|_2^{\delta_0}\right) \le c\,\Delta_1\,\|u_t\|_2^{\delta_0} + c\,\|(\phi_0 y)_t\|_2^{\delta_0} + c \qquad (5.11)
for some c E ll~+. We now need to manipulate the term CoY in (5.11) using the properties (i) and (ii) of the adaptive law used to generate O0. Let us first use (5.7). We have
¢0( = CoY = (8 -~- am)[e ] -~- ena2 -~ 7Using Lemma 2.6, we have ¢0Y =
1
s + a'
[ ] c~ ~ = ~ +1 ~ ¢0Y + CoY + ~--4--~.( ~ + am)[e] + ~
[end] + ~ - ~ [ ~ 1
for some \alpha > \delta_0 > 0 to be chosen, which implies that

\|(\phi_0 y)_t\|_2^{\delta_0} \le \frac{c}{\alpha}\left(\|(\dot\phi_0 y)_t\|_2^{\delta_0} + \|(\phi_0 \dot y)_t\|_2^{\delta_0}\right) + c\,\|(\epsilon n_s^2)_t\|_2^{\delta_0} + c\,\Delta_2\,\|u_t\|_2^{\delta_0} + c,

where

\Delta_2 = \|\Delta_m(s)\|_\infty^{\delta_0}. \qquad (5.12)

Furthermore, \|(\phi_0 \dot y)_t\|_2^{\delta_0} \le c\,\|\dot y_t\|_2^{\delta_0}, since \phi_0 \in L_\infty, and

\|\dot y_t\|_2^{\delta_0} \le c\,\|(\phi_0 y)_t\|_2^{\delta_0} + c\,\Delta_3\,\|u_t\|_2^{\delta_0} + c\,m + c,

where \Delta_3 = \|(s+p)\,\Delta_m(s)\|_\infty^{\delta_0}. Choosing \alpha > 2c, we have

\|(\phi_0 y)_t\|_2^{\delta_0} \le c\left(\Delta_2 + \frac{\Delta_3}{\alpha}\right) m + c\left(\int_0^t e^{-\delta_0(t-\tau)}\,\gamma^2(\tau)\,m^2(\tau)\,d\tau\right)^{1/2} + c, \qquad (5.13)

where \gamma^2(\tau) = \frac{2c}{\alpha^2}\,|\dot\theta_0(\tau)|^2 + 2c\,\epsilon^2 n_s^2, i.e., \gamma \in \mathcal{S}(\eta^2/m^2), which by (5.12) implies that \gamma \in \mathcal{S}(\Delta_2^2). Therefore, from (5.11) we have
m \le c\,\Delta_1\,m + c\left(\Delta_2 + \frac{\Delta_3}{\alpha}\right) m + c\left(\int_0^t e^{-\delta_0(t-\tau)}\,\gamma^2(\tau)\,m^2(\tau)\,d\tau\right)^{1/2} + c.

Hence, for

c\left(\Delta_1 + \Delta_2 + \frac{\Delta_3}{\alpha}\right) < 1, \qquad (5.14)

we have

m \le c\left(\int_0^t e^{-\delta_0(t-\tau)}\,\gamma^2(\tau)\,m^2(\tau)\,d\tau\right)^{1/2} + c,

or

m^2 \le c\int_0^t e^{-\delta_0(t-\tau)}\,\gamma^2(\tau)\,m^2(\tau)\,d\tau + c.

Using Lemma 2.4, we obtain

m^2 \le c\int_0^t e^{-\delta_0(t-\tau)}\,\gamma^2(\tau)\,e^{c\int_\tau^t \gamma^2(s)\,ds}\,d\tau + c.
Now \int_\tau^t \gamma^2(s)\,ds \le c\,\Delta_2^2\,(t-\tau) + c. Therefore,

m^2 \le c\int_0^t e^{-\beta_0(t-\tau)}\,\gamma^2(\tau)\,d\tau + c,

where

\beta_0 = \delta_0 - c\,\Delta_2^2. \qquad (5.15)

For c\,\Delta_2^2 < \delta_0 we have \beta_0 > 0. Since \gamma \in \mathcal{S}(\Delta_2^2), it follows from Lemma 2.7 that m \in L_\infty. Therefore, y and \dot y are bounded, which implies that u and therefore all the signals in the closed loop are bounded. The tracking error e_r = y - y_m is given by

e_r = \frac{1}{s+a_m}\,[\phi_0 y] + \frac{\Delta_m(s)}{s+a_m}\,[u],

which, together with \phi_0 y = (s+a_m)[\epsilon] + \epsilon n_s^2 + \eta, implies that

|e_r(t)| \le |\epsilon(t)| + |\eta(t)| + c\,\|(\epsilon n_s)_t\|_2 + c\,\|\eta_t\|_2.

Hence,

\int_0^t |e_r(\tau)|^2\,d\tau \le c\,\Delta_2^2\,t + c. \qquad (5.16)
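In the ideal case \Delta_m(s) = 0, the MRAC scheme (5.5) with the Lyapunov-type update can be simulated directly; the normalization and leakage terms, which are only needed for robustness, are dropped here. All numbers (a = 2, a_m = b_m = 1, γ = 2) are illustrative:

```python
import numpy as np

a, am, bm, gamma = 2.0, 1.0, 1.0, 2.0     # plant parameter a unknown to the controller
dt, T = 1e-3, 30.0
y = ym = 0.0
theta0 = 0.0                               # estimate of theta0* = a - am = 1
r = 1.0                                    # step reference

for _ in range(int(T / dt)):
    e = y - ym                             # tracking error
    u = theta0 * y + bm * r                # certainty-equivalence law (5.5)
    theta0 += dt * (-gamma * e * y)        # Lyapunov-based update, ideal case
    y += dt * (-a * y + u)                 # plant y' = -a y + u  (Delta_m = 0)
    ym += dt * (-am * ym + bm * r)         # reference model (5.2)

print(round(y - ym, 6), round(theta0, 3))  # tracking error -> 0, theta0 -> 1
```

With the constant reference the regressor y settles to a nonzero value, which in this scalar case is persistently exciting, so \theta_0 converges to \theta_0^* = 1 and the tracking error converges to zero.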
The conditions for stability that \Delta_m(s) has to satisfy when (5.7) is used as the adaptive law are summarized below:

(C1) \Delta_m(s) is analytic in Re[s] \ge -\delta_0/2 for some known \delta_0 > 0.
(C2) \Delta_1 = \left\|\frac{\Delta_m(s)}{s+a_m}\right\|_\infty^{\delta_0}, \Delta_2 = \|\Delta_m(s)\|_\infty^{\delta_0}, \Delta_3 = \|(s+p)\,\Delta_m(s)\|_\infty^{\delta_0} and \left\|\frac{s\,\Delta_m(s)}{s+a_m}\right\|_\infty^{\delta_0} are finite.
(C3) c\left(\Delta_1 + \Delta_2 + \frac{\Delta_3}{\alpha}\right) < 1, where c \in \mathbb{R}_+ and \alpha can be arbitrarily large.
(C4) \Delta_2^2 < \frac{\delta_0}{c} for some c \in \mathbb{R}_+.

Condition (C2) will be satisfied if \Delta_m(s) is assumed to be strictly proper. The arbitrary constants denoted by c in (C1) to (C4) can be calculated by following the steps given above in a similar way as in [18], where such explicit calculations are performed for a gradient algorithm. In a similar way we can proceed with the analysis of the MRAC scheme where (5.8) is used as the adaptive law. In this case, it can be shown that the conditions on \Delta_m(s) for stability are given by

(C1') \Delta_m(s) is analytic in Re[s] \ge -\delta_0/2 for some known \delta_0 > 0.
(C2') \Delta_1 = \left\|\frac{\Delta_m(s)}{s+a_m}\right\|_\infty^{\delta_0} < c for some c > 0.
(C3') \Delta_2^2 < c\,\delta_0 for some c > 0.
It is clear that (C1')-(C3') allow \Delta_m(s) to be proper, in contrast to (C1)-(C4) for (5.7), where \Delta_m(s) is required to be strictly proper. \Box

Comparing the stability condition (5.4) in the known parameter case (i.e., the stability condition for robust nonadaptive control) with conditions (C1')-(C3') or (C1)-(C4), one may conclude that the conditions in the adaptive case are more restrictive. While this conclusion may be true in some cases, it is not true in others. In particular, it is not true in the case where in robust nonadaptive control \Delta_m(s) is largely due to parametric uncertainty, as shown by the following example.

Example 5.1.2. Consider the plant

y = \frac{1}{s+1-\epsilon}\,[u] \qquad (5.17)

for some constant \epsilon. In robust nonadaptive control we express (5.17) as

y = \frac{1}{s+1}\,[1+\Delta_m(s)]\,[u], \qquad \Delta_m(s) = \frac{\epsilon}{s+1-\epsilon},

i.e., in (5.1) a = 1. Then, condition (5.4) becomes

\left\| \frac{(1-a_m)\,\epsilon}{(s+1-\epsilon)(s+a_m)} \right\|_\infty < 1,

which requires \epsilon to be small, i.e., for a_m = 0.1 and 1-\epsilon > 0 we require \epsilon < 0.1. In adaptive control, however, (5.17) is modeled as

y = \frac{1}{s+a}\,[u],

i.e., \Delta_m(s) = 0, which satisfies (C1')-(C3') for all \epsilon. \Box
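A numerical evaluation of condition (5.4) for this example (with a_m = 0.1, as above) confirms the ε < 0.1 threshold:

```python
import numpy as np

def cond_54(eps, am=0.1, wgrid=np.logspace(-4, 3, 4000)):
    # || (1-am) eps / ((s+1-eps)(s+am)) ||_inf evaluated on a frequency grid;
    # for this transfer function the peak occurs at omega = 0
    s = 1j * wgrid
    return np.abs((1.0 - am) * eps / ((s + 1.0 - eps) * (s + am))).max()

for eps in (0.05, 0.09, 0.2):
    print(eps, cond_54(eps) < 1.0)   # True, True, False
```

At DC the norm equals 9ε/(1-ε), which crosses 1 exactly at ε = 0.1.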
The above example, together with our analysis for Example 5.1.1, demonstrates that robust adaptive control can handle a larger class of parametric uncertainties than robust nonadaptive control, but a smaller class of non-parametric uncertainties.

5.2 Indirect Robust Adaptive Control

In Sect. 5.1, the control of the unknown plant was carried out by directly estimating the desired controller parameters. An alternative method is to estimate the plant parameters on-line and use them to calculate the controller parameters at each time t. The scheme derived with this method is commonly known as indirect adaptive control because the evaluation of the controller parameters is achieved indirectly by using the estimated plant model. The main steps used in the design and analysis of indirect adaptive control are the following:
Step 1: We derive the control law that can be used to meet the control objective as if the plant parameters were known.
Step 2: We propose the same control law as in Step 1 but with the controller parameters calculated at each time t from the estimated plant parameters generated by an adaptive law.
Step 3: We analyze the adaptive control scheme formed in Step 2 and show that it meets the control objective.

In the adaptive control literature, the most frequently encountered indirect schemes are of two types: (i) the adaptive pole placement control (APPC) scheme [10,12,13,14,16,20], for which the control objective in Step 1 is pole placement, and (ii) the adaptive LQ control (ALQC) scheme [38,39,40,41], for which the control law in Step 1 is obtained by minimizing the LQ cost as discussed in Sect. 4.4. We consider these schemes one by one.

Adaptive Pole Placement Control. We illustrate the design and analysis of an adaptive pole placement control scheme using a simple example.

Example 5.2.1. Consider the plant

y = \frac{b}{s+a}\,(1+\Delta_m(s))\,u, \qquad (5.18)

where a, b are unknown constants. The control objective is to choose u such that the poles of the closed-loop system are placed at the roots of a given Hurwitz polynomial, say A^*(s) = (s+1)^2, and the output y(t) tracks a unit step input u_s(t). Thus, here y_m(t) = u_s(t) and Q(s) = s.

Step 1: As we have already shown in Sect. 4.4, the control law
u = \frac{\lambda_1}{s+\lambda_1}\,[u] + \frac{l_1 s + l_2}{s+\lambda_1}\,[y_m - y], \qquad (5.19)

where

l_1 = \frac{2-a}{b}, \qquad l_2 = \frac{1}{b}, \qquad (5.20)

and \lambda_1 > 0 is a design constant, can be used to meet the control objective exactly when \Delta_m(s) = 0. Furthermore, if \Delta_m(s) \neq 0, then for all \Delta_m(s) satisfying

\left\| \frac{b\,\Delta_m(s)\,(l_1 s + l_2)}{(s+1)^2} \right\|_\infty < 1, \qquad (5.21)

the closed-loop plant is stable and the tracking error e_r = y - y_m is given by

e_r = \frac{s+a}{(s+1)^2 + b\,\Delta_m(s)\,(l_1 s + l_2)}\,s\,[y_m] = \frac{s+a}{(s+1)^2 + b\,\Delta_m(s)\,(l_1 s + l_2)}\,[0],

i.e., if (5.21) is satisfied, e_r(t) converges to zero exponentially fast.

Step 2: Since a, b are unknown, (5.20) cannot be used to calculate the controller parameters l_1, l_2. Instead of (5.19)-(5.20) we use
u = \frac{\lambda_1}{s+\lambda_1}\,[u] + \frac{\hat l_1(t)\,s + \hat l_2(t)}{s+\lambda_1}\,[y_m - y], \qquad (5.22)

where \hat l_1, \hat l_2 are calculated as

\hat l_1 = \frac{2-\hat a}{\hat b}, \qquad \hat l_2 = \frac{1}{\hat b}, \qquad (5.23)
where \hat a, \hat b are the estimates of a, b generated by an adaptive law as follows. From (5.18) we have

(s+a)\,y = b\,u + b\,\Delta_m(s)\,[u]. \qquad (5.24)

Filtering both sides of (5.24) by 1/(s+\Lambda), with \Lambda > 0, and rearranging terms, we obtain

z = \theta^{*T}\,w + \eta, \qquad (5.25)

where

z = \frac{s}{s+\Lambda}\,[y], \qquad \theta^* = \begin{bmatrix} a \\ b \end{bmatrix}, \qquad w = \frac{1}{s+\Lambda}\begin{bmatrix} -y \\ u \end{bmatrix}, \qquad \eta = \frac{b}{s+\Lambda}\,\Delta_m(s)\,[u],

which is in the form of the linear parametric model used with the gradient and least-squares methods. The adaptive law for estimating \theta^* is then given by

\dot\theta = -\Gamma\,\epsilon\,w - \Gamma\,\sigma\,\theta, \qquad \epsilon = \frac{\theta^T w - z}{m^2} = \frac{\phi^T w - \eta}{m^2}, \qquad (5.26)

where \phi = \theta - \theta^*, \Gamma = \Gamma^T > 0, \sigma > 0 is a leakage design constant, m^2 = 1 + n_s^2, and n_s^2 = m_s is generated from

\dot m_s = -\delta_0\,m_s + u^2 + y^2, \qquad m_s(0) = 0.
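The parametric model (5.25) and the normalized gradient law (5.26) can be sketched as follows for the ideal case \Delta_m(s) = 0, so the leakage term is dropped; the plant values a = 2, b = 3, the two-sinusoid input (used to guarantee sufficient excitation), and all gains are illustrative:

```python
import numpy as np

a_true, b_true = 2.0, 3.0        # true plant y' = -a y + b u, unknown to the estimator
Lam, delta0, gamma = 1.0, 0.5, 10.0
dt, T = 1e-3, 150.0

theta = np.array([0.0, 0.0])     # estimates [a_hat, b_hat]
y = yf = 0.0                     # yf = 1/(s+Lam)[y], so z = y - Lam*yf = s/(s+Lam)[y]
w = np.array([0.0, 0.0])         # w = 1/(s+Lam) [-y, u]^T
ms = 0.0                         # normalization state, (5.9)-style

for i in range(int(T / dt)):
    t = i * dt
    u = np.sin(t) + 0.5 * np.sin(2.7 * t)   # sufficiently rich input
    z = y - Lam * yf
    eps = (theta @ w - z) / (1.0 + ms)      # normalized estimation error
    theta = theta - dt * gamma * eps * w    # gradient update (no leakage: Delta_m = 0)
    # filter, normalization and plant updates (forward Euler)
    yf += dt * (-Lam * yf + y)
    w += dt * (-Lam * w + np.array([-y, u]))
    ms += dt * (-delta0 * ms + u * u + y * y)
    y += dt * (-a_true * y + b_true * u)

print(np.round(theta, 2))        # estimates approach [a, b] = [2, 3]
```

Since the input is persistently exciting of sufficient order for the two unknown parameters and there is no unmodeled term, the estimates converge to the true (a, b).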
As shown in Sect. 3.2, the adaptive law (5.26) has the following properties:

(i) \epsilon, \theta, \dot\theta \in L_\infty,
(ii) \epsilon, \epsilon n_s, \dot\theta \in \mathcal{S}(\eta^2/m^2).

These properties, however, do not guarantee that \hat l_1, \hat l_2 are bounded, since (5.26) may generate estimates \hat b which are arbitrarily close to zero. If the sign of b and a lower bound b_0 > 0 for |b| are known, then (5.26) can be modified so that |\hat b(t)| \ge b_0 \;\forall t \ge 0 without altering the properties (i) and (ii) [14,11].

Step 3: Let \bar x \triangleq \frac{1}{s+\lambda_1}\,[x]. Since \Lambda and \lambda_1 are both design constants, we can choose \Lambda = \lambda_1. Then, using the estimation error equation \epsilon m^2 = \theta^T w - z and the control law (5.22), we can write

\dot{\bar y} = -\hat a\,\bar y + \hat b\,\bar u - \epsilon m^2,
\dot{\bar u} = -(\hat l_2 - \hat l_1 \hat a)\,\bar y - \hat l_1 \hat b\,\bar u + \hat l_1\,\epsilon m^2 + \bar\eta_m,

where \bar\eta_m = \frac{\hat l_1 s + \hat l_2}{s+\lambda_1}\,[y_m] is bounded due to \hat l_1, \hat l_2, y_m \in L_\infty. Defining x_1 = \bar y, x_2 = \bar u, we have

\dot x = A(t)\,x + b_1(t)\,\epsilon m^2 + b_2\,\bar\eta_m,

where x = [x_1, x_2]^T and

A(t) = \begin{bmatrix} -\hat a & \hat b \\ \hat l_1 \hat a - \hat l_2 & -\hat l_1 \hat b \end{bmatrix}, \qquad b_1(t) = \begin{bmatrix} -1 \\ \hat l_1 \end{bmatrix}, \qquad b_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}.

For each fixed t, \det(sI - A(t)) = s^2 + 2s + 1, and
provided \Lambda \ge \delta_0, using Lemma 2.2 we can show that

\frac{|\eta(t)|}{m} \le \frac{1}{\sqrt{\delta_0}}\,\|b\,\Delta_m(s)\|_\infty^{\delta_0} \triangleq \Delta_0,

i.e., \eta/m \in L_\infty. Since A(t) is a stable matrix for each fixed t and \|\dot A(t)\| \le c\,(|\epsilon m| + |\dot\theta|) with \epsilon m, \dot\theta \in \mathcal{S}(\Delta_0^2), it follows from Lemma 2.1 that for \Delta_0 < c and some c > 0, A(t) is exponentially stable. Hence, from Lemma 2.3 we have

\|x_t\|_2^{\delta} \le c\,\|(\epsilon m^2)_t\|_2^{\delta} + c

for some \delta > 0. Now u and y can be expressed as
u = (s+\lambda_1)\,[\bar u], \qquad y = (s+\lambda_1)\,[\bar y].

Since \bar\eta_m, \hat a, \hat b, \hat l_1, \hat l_2 \in L_\infty, we have

\left(\|y_t\|_2^{\delta}\right)^2 + \left(\|u_t\|_2^{\delta}\right)^2 \le c\,\left(\|x_t\|_2^{\delta}\right)^2 + c.

Therefore,

m^2 = 1 + n_s^2 \le c\,\left(\|(\epsilon m^2)_t\|_2^{\delta}\right)^2 + c,

i.e.,

m^2 \le c\int_0^t e^{-\delta(t-\tau)}\,(\epsilon m)^2\,m^2\,d\tau + c,

where \epsilon m \in \mathcal{S}(\Delta_0^2). Hence, by applying Lemma 2.4, it follows that for \Delta_0 < c and some c > 0, m \in L_\infty, which implies that all signals in the closed loop are bounded. The following expression for the tracking error
\int_0^t |e_r(\tau)|^2\,d\tau \le c\,\Delta_0^2\,t + c

can also be obtained in a similar manner as in the direct case. The condition for stability is that

\Delta_0 = \frac{1}{\sqrt{\delta_0}}\,\|b\,\Delta_m(s)\|_\infty^{\delta_0} < c

for some c > 0, which can be calculated if an estimate of the rate of exponential convergence of the transition matrix of A(t) can be obtained. If, instead of leakage, a dead-zone modification is used, the bound for \Delta_0 does not depend on the transition matrix of A(t) and is more convenient to calculate [14,18]. \Box
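Putting the pieces together, the following sketch simulates the adaptive pole placement scheme of Example 5.2.1 in the ideal case \Delta_m(s) = 0: the normalized gradient law supplies (\hat a, \hat b), the certainty-equivalence map (5.23) gives the controller gains (with a simple projection keeping the \hat b used by the controller away from zero), and (5.22) generates u. All numerical values are illustrative.

```python
import numpy as np

# plant y' = -a y + b u with a, b unknown; objective: closed-loop poles at the
# roots of (s+1)^2 and y -> ym = 1 (unit step), with Delta_m = 0
a, b = 3.0, 2.0
Lam = lam1 = 1.0                 # choose Lambda = lambda_1 (both design constants)
gamma, delta0, b0 = 5.0, 0.5, 0.2
dt, T = 1e-3, 60.0

theta = np.array([2.0, 1.0])     # initial estimates [a_hat, b_hat] (hypothetical)
y = yf = uf = ef = 0.0
w = np.array([0.0, 0.0])
ms = 0.0
ym = 1.0
ymax = 0.0

for _ in range(int(T / dt)):
    a_hat, b_hat = theta[0], max(theta[1], b0)   # projection: b_hat >= b0 > 0
    l1, l2 = (2.0 - a_hat) / b_hat, 1.0 / b_hat  # certainty equivalence (5.23)
    e = ym - y
    ef_dot = e - lam1 * ef                       # s/(s+lam1)[ym - y]
    u = lam1 * uf + l1 * ef_dot + l2 * ef        # control law (5.22)
    # normalized gradient adaptive law, ideal case (no leakage)
    z = y - Lam * yf
    eps = (theta @ w - z) / (1.0 + ms)
    theta = theta - dt * gamma * eps * w
    # filter and plant updates (forward Euler)
    ef += dt * ef_dot
    uf += dt * (-lam1 * uf + u)
    yf += dt * (-Lam * yf + y)
    w += dt * (-Lam * w + np.array([-y, u]))
    ms += dt * (-delta0 * ms + u * u + y * y)
    y += dt * (-a * y + b * u)
    ymax = max(ymax, abs(y))

print(round(y, 3), round(ymax, 2))   # y settles at the step reference ym = 1
```

Note that the control law contains an integrator (u = (l̂_1 s + l̂_2)/s applied to y_m - y), so the step is tracked with zero steady-state error even though, with a non-rich step reference, the parameter estimates need not converge to the true (a, b).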
Adaptive Linear Quadratic Control. We illustrate the design and analysis of an adaptive LQ control scheme using a simple example.

Example 5.2.2. Consider the plant

y = \frac{b_1}{s+a}\,(1+\Delta_m(s))\,u. \qquad (5.27)
The control objective is to choose u to stabilize the plant in a way so as to minimize an LQ cost, while making the output y(t) track a unit step input u_s(t). Thus, here y_m(t) = u_s(t) and Q(s) = s.

Step 1: As we have already shown in Sect. 4.4, the LQ control law

u = \frac{s+q}{s}\,[\bar u], \qquad (5.28)
\bar u = -G\hat e, \qquad (5.29)
G = \frac{1}{\lambda}\,b^T P, \qquad (5.30)
\dot{\hat e} = A\hat e + b\bar u + k\,(C\hat e + e_0), \qquad (5.31)
A^T P + P A + C^T C - \frac{1}{\lambda}\,P b b^T P = 0, \qquad (5.32)

where

k = [a-a_1,\; -a_2]^T, \qquad A = \begin{bmatrix} -a & 1 \\ 0 & 0 \end{bmatrix}, \qquad b = \begin{bmatrix} b_1 \\ b_1 q \end{bmatrix}, \qquad C = [1\; 0],

can be used to meet the control objective exactly when \Delta_m(s) = 0. Furthermore, if \Delta_m(s) \neq 0, then for all \Delta_m(s) satisfying

\left\| \frac{(s+q)\,b_1\,\Delta_m(s)\,G[sI-(A+kC)]^{-1}k}{s(s+a)\left[1+G(sI-(A+kC))^{-1}b\right] - (s+q)\,b_1\,G[sI-(A+kC)]^{-1}k} \right\|_\infty < 1, \qquad (5.33)

we have signal boundedness and the tracking error e_r = y - y_m \to 0 exponentially fast.

Step 2: Since a, b_1 are unknown, the control law in (5.28)-(5.32) cannot be implemented. Instead, we replace a, b_1 by their estimates \hat a, \hat b_1 in the matrices A, b to obtain the matrices \hat A, \hat b, and then use the latter instead of A, b in (5.28)-(5.32) to calculate the control input u. Since the design of the adaptive law (5.26) in Example 5.2.1 did not depend on the particular control structure, the same adaptive law can be used here also, with the only modification that the parameter estimate \theta(t) now consists of \hat a and \hat b_1. As before, the adaptive law guarantees

(i) \epsilon, \theta, \dot\theta \in L_\infty,
(ii) \epsilon, \epsilon n_s, \dot\theta \in \mathcal{S}(\eta^2/m^2).

The adaptive law (5.26) does not, however, guarantee that the pair (\hat A(t), \hat b(t)) is uniformly controllable for all t \ge 0. Such a condition is needed to ensure that the solution P(t) of the ARE (5.32), with A, b replaced by \hat A(t), \hat b(t), exists and is uniformly bounded. If the sign of b_1 and a lower bound b_0 > 0 for |b_1| are
known, then (5.26) can be modified to guarantee that |\hat b_1(t)| \ge b_0 \;\forall t \ge 0 and that \hat a(t) is strictly bounded away from q \;\forall t \ge 0, without altering the properties (i) and (ii) [14,11]. This will guarantee that (\hat A, \hat b) is uniformly controllable \forall t \ge 0.

Step 3: The adaptively controlled closed-loop plant can be written as:
\dot{\hat e} = A_c(t)\,\hat e + k\,\frac{(s+\Lambda)(s+\lambda_1)}{s^2+\bar a_1 s+\bar a_2}\left[-\phi_1\,\frac{s}{(s+\Lambda)(s+\lambda_1)}\,[y] + \phi_2\,\frac{s(s+q)}{(s+\Lambda)(s+\lambda_1)}\,[\bar u]\right] - k\,\frac{b_1 s\,\Delta_m(s)}{s^2+\bar a_1 s+\bar a_2}\,[u], \qquad (5.34)

e_0 = -C\hat e + \frac{(s+\Lambda)(s+\lambda_1)}{s^2+\bar a_1 s+\bar a_2}\left[-\phi_1\,\frac{s}{(s+\Lambda)(s+\lambda_1)}\,[y] + \phi_2\,\frac{s(s+q)}{(s+\Lambda)(s+\lambda_1)}\,[\bar u]\right] - \frac{b_1 s\,\Delta_m(s)}{s^2+\bar a_1 s+\bar a_2}\,[u], \qquad (5.35)

u = \frac{(s+a)(s+q)\,M^*(s)}{s^3+\beta_1 s^2+\beta_2 s+\beta_3}\,[\bar u] + \frac{(s+a)(s+q)\,F^*(s)}{s^3+\beta_1 s^2+\beta_2 s+\beta_3}\,[y] - \frac{b_1(s+q)\,F^*(s)\,\Delta_m(s)}{s^3+\beta_1 s^2+\beta_2 s+\beta_3}\,[u], \qquad (5.36)

\bar u = -G(t)\,\hat e, \qquad (5.37)

y = y_m - e_0, \qquad (5.38)

where

- A_c(t) \triangleq \hat A(t) - \hat b(t)\,G(t),
- C = [1\; 0], \quad k = [\hat a - \bar a_1,\; -\bar a_2]^T,
- \phi_1 = \hat a - a, \phi_2 = \hat b_1 - b_1 are the parameter errors,
- s^2 + \bar a_1 s + \bar a_2 is an arbitrary monic polynomial of degree 2 with all its roots in Re[s] < -\frac{\delta_0}{2},
- (s+\lambda_1) is a Hurwitz polynomial with \lambda_1 > \delta_0,
- W_{b_i}(s), W_{c_i}(s), i = 1, 2, are stable strictly proper transfer function matrices obtained using Lemma 2.5,
- M^*(s) and F^*(s) solve the Diophantine equation

M^*(s)\,s(s+a) + F^*(s)\,b_1(s+q) = s^3+\beta_1 s^2+\beta_2 s+\beta_3,

where s^3+\beta_1 s^2+\beta_2 s+\beta_3 is an arbitrary monic polynomial of degree 3 having all roots in Re[s] < -\frac{\delta_0}{2}. For each fixed t, it follows from standard LQR theory [34] that the matrix A_c(t) is stable. Furthermore,
provided \Lambda \ge \delta_0, using Lemma 2.2, we can show that

\frac{\eta^2}{m^2} \le \frac{1}{\delta_0}\left(\|b_1\,\Delta_m(s)\|_\infty^{\delta_0}\right)^2 = \Delta_0^2,

i.e., \eta/m \in L_\infty. Furthermore, since the eigenvalues of A_c(t) are continuous functions of the parameter estimates \theta(t) and \theta(t) \in L_\infty (i.e., \theta \in K, K \subset \mathbb{R}^2 compact), it follows that \exists\,\sigma_c > 0 such that Re\{\lambda_i(A_c(t))\} \le -\sigma_c \;\forall t \ge 0. Hence, applying Lemma 2.1, it follows that for \Delta_0 < c and some c > 0, A_c(t) is exponentially stable. Using Lemma 2.5 and the relationship (s+q)\,\bar u = s\,[u], the \phi_1-, \phi_2-terms in (5.34)-(5.35) can be rewritten in terms of the estimation error: since \phi^T w = \epsilon m^2 + \eta, each of these terms is the output of a stable, strictly proper filter (the W_{b_i}(s), W_{c_i}(s) of Lemma 2.5) driven by \epsilon m^2, \eta and the bounded reference term \bar\eta_m. Using these expressions in (5.34)-(5.35) and making use of (5.36)-(5.38), it is clear that the stability analysis for this example can be completed by essentially duplicating the steps involved in the analysis of Example 5.2.1. The condition for stability is that

\Delta_0 \triangleq \frac{1}{\sqrt{\delta_0}}\,\|b_1\,\Delta_m(s)\|_\infty^{\delta_0} < c

for some c > 0, which can be calculated if an estimate of the rate of exponential convergence of the transition matrix of A_c(t) can be obtained. As in Example 5.2.1, the use of a dead-zone instead of leakage will make the bound for \Delta_0 independent of the transition matrix of A_c(t) and hence easier to calculate. The following expression for the tracking error
\int_0^t |e_r(\tau)|^2\,d\tau \le c\,\Delta_0^2\,t + c

can also be obtained in a similar manner as in Example 5.1.1.
\Box
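The certainty-equivalence computation in Step 2 — rebuild (\hat A, \hat b) from the current estimates, guard controllability (|\hat b_1| \ge b_0 and \hat a bounded away from q), and solve the ARE for G(t) — can be sketched as follows; the guard thresholds and test values are illustrative:

```python
import numpy as np

def solve_are(A, b, C, lam):
    # stabilizing ARE solution via the stable Hamiltonian subspace
    n = A.shape[0]
    H = np.block([[A, -np.outer(b, b) / lam], [-C.T @ C, -A.T]])
    w, V = np.linalg.eig(H)
    S = V[:, w.real < 0]
    return np.real(S[n:] @ np.linalg.inv(S[:n]))

def lq_gain(a_hat, b1_hat, q, lam, b0=0.1, eps=0.1):
    # controllability guard: |b1_hat| >= b0 and a_hat bounded away from q
    # (the pair (A_hat, b_hat) below loses controllability when a_hat = q or b1_hat = 0)
    if abs(b1_hat) < b0:
        b1_hat = b0 if b1_hat >= 0 else -b0
    if abs(a_hat - q) < eps:
        a_hat = q + (eps if a_hat >= q else -eps)
    A = np.array([[-a_hat, 1.0], [0.0, 0.0]])
    b = np.array([b1_hat, b1_hat * q])
    C = np.array([[1.0, 0.0]])
    P = solve_are(A, b, C, lam)
    return b @ P / lam, A, b

G, A, b = lq_gain(a_hat=2.0, b1_hat=1.0, q=1.0, lam=1.0)   # illustrative estimates
eigs = np.linalg.eigvals(A - np.outer(b, G))
print(np.round(G, 3), eigs.real.max() < 0)   # certainty-equivalence gain; closed loop stable
```

In the adaptive scheme this computation is repeated as the estimates evolve; the guard plays the role of the parameter projection discussed in Step 2, without which P(t) may fail to exist or blow up.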
An important drawback of indirect adaptive control is what is called the "stabilizability problem": the control law is designed based on the estimated plant, which has to be controllable and observable at each time t. The adaptive laws alone cannot guarantee such a property without additional modifications. One
possible modification is to use projection and constrain the estimated parameters to be in a convex set which contains the unknown parameters and is such that for every member of the set the stabilizability condition required for the calculation of the control law is satisfied [14,42]. While such a modification seems feasible in theory, in practice the development of such convex sets for higher order plants is complicated and awkward. Other approaches based on persistence of excitation are used in [43,44] to handle the stabilizability problem.

6 Conclusions

In this paper we have given a unified treatment of the design, analysis and robustness of most of the continuous-time robust adaptive control schemes that are currently available in the adaptive control literature. The methodology presented can also be used to synthesize new robust adaptive control schemes by simply combining an adaptive law, developed using the procedure of Sect. 3, with a robust controller structure. Thus, this paper is part of an ongoing effort to develop a systematic and unified theory for the design and analysis of robust adaptive control systems, so that the latter cease to be looked upon as a mere collection of assorted tricks and algorithms. The theory of robust adaptive control is still far from complete. A satisfactory theory for the quantitative analysis and transient performance of robust adaptive control systems is yet to be developed. The results presented in this paper are applicable to LTI plants. Their extension to linear time-varying plants is attempted with considerable success in [45,46,47,48,49].

References

1. A. S. Morse, "Global stability of parameter adaptive control systems," IEEE Trans. Aut. Control, vol. AC-25, pp. 433-439, June 1980.
2. K. S. Narendra, Y. H. Lin, and L. S. Valavani, "Stable adaptive controller design - part II: proof of stability," IEEE Trans. Aut. Control, vol. AC-25, pp. 440-448, June 1980.
3. I. D.
Landau, Adaptive Control: The Model Reference Approach, Marcel Dekker, New York, NY, 1979.
4. G. C. Goodwin, P. J. Ramadge, and P. E. Caines, "Discrete time multivariable adaptive control," IEEE Trans. Aut. Control, vol. AC-25, pp. 449-456, June 1980.
5. B. Egardt, Stability of Adaptive Controllers, Springer-Verlag, Berlin, 1979.
6. P. A. Ioannou and P. V. Kokotovic, Adaptive Systems with Reduced Models, Springer-Verlag, New York, NY, 1983.
7. C. E. Rohrs, L. Valavani, M. Athans, and G. Stein, "Robustness of adaptive control algorithms in the presence of unmodeled dynamics," Proc. 21st IEEE Conf. Dec. Control, 1982.
8. P. A. Ioannou and P. V. Kokotovic, "Instability analysis and improvement of robustness of adaptive control," Automatica, vol. 20, pp. 583-594, Sept. 1984.
9. P. A. Ioannou and K. S. Tsakalis, "A robust direct adaptive controller," IEEE Trans. Aut. Control, vol. AC-31, pp. 1033-1043, Nov. 1986.
10. K. S. Narendra and A. M. Annaswamy, Stable Adaptive Systems, Prentice-Hall, Englewood Cliffs, NJ, 1989.
11. L. Praly, "Robust model reference adaptive controllers - part 1: stability analysis," Proc. 23rd IEEE Conf. Dec. Control, 1984.
12. G. Kreisselmeier, "A robust indirect adaptive control approach," Int. J. Control, vol. 43, pp. 161-175, 1986.
13. P. Ioannou and J. Sun, "Theory and design of robust direct and indirect adaptive control schemes," Int. J. Control, vol. 47, pp. 775-813, 1988.
14. R. H. Middleton, G. C. Goodwin, D. J. Hill, and D. Q. Mayne, "Design issues in adaptive control," IEEE Trans. Aut. Control, vol. AC-33, pp. 50-58, Jan. 1988.
15. B. D. O. Anderson, R. R. Bitmead, C. R. Johnson, Jr., P. V. Kokotovic, R. L. Kosut, I. M. Y. Mareels, L. Praly, and B. D. Riedle, Stability of Adaptive Systems: Passivity and Averaging Analysis, MIT Press, Cambridge, MA, 1986.
16. S. S. Sastry and M. Bodson, Adaptive Control: Stability, Convergence and Robustness, Prentice-Hall, Englewood Cliffs, NJ, 1989.
17. P. A. Ioannou and K. S. Tsakalis, "Time and frequency domain bounds in robust adaptive control," Proc. 1988 Amer. Control Conf.
18. K. S. Tsakalis, "Robustness of model reference adaptive controllers: input-output properties," Department of Electrical and Computer Engineering, Arizona State University, Report no. 89-03-01, 1989 (to appear in IEEE Trans. Aut. Control).
19. J. M. Krause, P. P. Khargonekar, and G. Stein, "Robust adaptive control: stability and asymptotic performance," Proc. 28th IEEE Conf. Dec. Control, 1989.
20. G. C. Goodwin and K. S. Sin, Adaptive Filtering, Prediction and Control, Prentice-Hall, Englewood Cliffs, NJ, 1984.
21. G. Kreisselmeier and D. Joos, "Rate of convergence in model reference adaptive control," IEEE Trans. Aut. Control, vol. AC-27, pp. 710-713, June 1982.
22. B. A. Francis, A Course in H_\infty Control Theory, Springer-Verlag, Berlin, 1987.
23. T. Kailath, Linear Systems, Prentice-Hall, Englewood Cliffs, NJ, 1980.
24. C. A.
Desoer and M. Vidyasagar, Feedback Systems: Input.Output Properties, Academic Press, New York, NY, 1975. 25. P. J. Gawthrop, Continuous-Time Sell-Tuning Control, Research Studies Press, Wiley, New York, NY, 1987. 26. R. H. Middleton and G. C. Goodwin, Digital Control and Estimation: A Unified Approach, Prentice-Hall, Englewood Cliffs, N J, 1990. 27. M. Polycarpou and P. Ioannou, "On the existence and uniqueness of solutions in adaptive control systems," Department of EE-Systems, University o] Southern Cali]ornia, Technical Report no. 90-05-01, 1990. 28. D. G. Luenberger, Optimization by Vector Space Methods, Wiley, New York, NY, 1969. 29. M. Vidyasagar, Nonlinear Systems Analysis, Prentice-Hall, Englewood Cliffs, N J, 1978. 30. M. Vidyasagar, Control System Synthesis: A Factorization Approach, MIT Press, Cambridge, MA, 1985. 31. P. V. Kokotovic, H. Khalil, and J. O'Reiily, Singular Perturbation Methods in Control: Analysis and Design, Academic Press, New York, NY, 1986. 32. M. G. Safonov, A. J. Laub, and G. L. Hartmann, "Feedback properties of multivariable systems: the role and use of the return difference matrix," IEEE Trans. Aut. Control, vol. AC-26, pp. 47-65, Feb. 1981.
152
Ioannou and Datt~
33. J. C. Doyle, K. Glover, P. P. Khargonekar, and B. A. Francis, "State-space solutions to standard //2 and Hoo control problems," IEEE Trans. Aut. Control, vol. 34, pp. 831-847, Aug. 1989. 34. H. Kw~kernaak and R. Sivan, Linear Optimal Control Systems, Wiley, New York, NY, 1972. 35. K. J. Astrom and B. Wittenmark, Adaptive Control, Prentice-Hall, Englewood Cliffs, N J, 1989. 36. G. C. Goodwin and D. Q. Mayne, "A parameter estimation perspective of continuous time model reference adaptive control," Automatica, vol. 23, pp. 57-70, Jan. 1987. 37. P. A. Ioannou and G. Tao, "Dominant richness and improvement of performance of robust adaptive control," Automatica, vol. 25, pp. 287-291, March 1989. 38. C. Samson, "An adaptive LQ control for nonminimum phase systems, ~ Int. J. Control, vol. 35, 1982. 39. D. W. Clarke, P. P. Kanji]al, and C Mohtadi, "A generalized LQG approach to self-tuning control, part I: aspects of design," Int. J. Control, vol. 41, 1985. 40. D. W. Clarke, P. P. Kanjilal, and C Mohtadi, "A generalized LQG approach to self-tuning control, part II: implementation and simulations," Int. J. Control, vol. 41, 1985. 41. J. Sun and P. Ioannou, "Robust adaptive LQ control schemes," Proc. 1989 Amer. Control Conf. 42. Ph. de Larminat, "On the stabilizability condition in indirect adaptive control," Automatica, vol. 20, Nov. 1984. 43. G. Kreisselmeier, "An indirect adaptive controller with a self-excitation capability," IEEE Trans. Aut. Control, vol. 34, pp. 524-528, May 1989. 44. R. Cristi, "Internal persistency of excitation in indirect adaptive control," IEEE Trans. Aut. Control, vol. AC-32, pp. 1101-1103, Dec. 1987. 45. G. Kreisselmeier, "Adaptive control of a class of slowly time-varying plants," Syst. Control Lett., vol. 8, Dec. 1986. 46. K. S. Tsakalis and P.A. Ioannou, "Adaptive control of linear time-varying plants," Automatica, vol. 23, pp. 459-468, July 1987. 47. R. H. Middleton and G. C. 
Goodwin, "Adaptive control of time-varying linear systems," IEEE Trans. Aut. Control, vol. AC-33, pp. 150-155, Feb. 1988. 48. K. S. Tsakalis and P. A. Ioannou, "Adaptive control of linear time-vaxying plants: a new model reference controller structure," IEEE Trans. Aut. Control, vol. 34, pp. 1038-1046, Oct. 1989. 49. K. S. Tsakalis and P. A. Ioannou, "A new indirect adaptive control scheme for time-varying plants," IEEE Trans. Aut. Control, vol. 35, pp. 697-705, June 1990.
Robust Continuous-Time Adaptive Control by Parameter Projection*
Sanjeev M. Naik, 1 P. R. Kumar, 1 and B. Erik Ydstie 2 1 Coordinated Science Laboratory and Department of Electrical and Computer Engineering University of Illinois, Urbana, IL 61801, USA. 2 Department of Chemical Engineering, Goessmann Laboratory University of Massachusetts, Amherst, MA 01003, USA.
Abstract. We consider the problem of adaptive control of a continuous-time plant of arbitrary relative degree, in the presence of bounded disturbances as well as unmodeled dynamics. The adaptation law we consider is the usual gradient update law with parameter projection, the latter being the only robustness enhancement modification employed. We show that if the unmodeled dynamics, which consists of multiplicative as well as additive system uncertainty, is small enough, then all the signals in the closed-loop system are bounded. This shows that extra modifications such as, for example, normalization or relative dead zones, are not necessary for robustness with respect to bounded disturbances and small unmodeled dynamics. In the nominal case, where unmodeled dynamics and disturbances are absent, the asymptotic error in tracking a given reference signal is zero. Moreover, the performance of the adaptive controller is also robust, in that the mean-square tracking error is quadratic in the magnitude of the unmodeled dynamics and bounded disturbances, when both are present.
1 Introduction

Recently, there have been many attempts to study the adaptive control of plants with bounded disturbances and unmodeled dynamics. In his pioneering work, Egardt [5] showed that even small bounded disturbances can cause instability in adaptively controlled plants. He further demonstrated that modification of the adaptation law by projecting the parameter estimates, at each time instant, onto a compact, convex set known to contain the true parameter vector, provides stability with respect to bounded disturbances. Proving the stability of an adaptive control system in the presence of unmodeled dynamics is however more difficult, and researchers have proposed various additional modifications to the adaptation law to analyze and bound the effects of unmodeled dynamics. For example, in [8-10], Praly introduced the device of using a normalizing signal in the parameter estimator, while another notable modification is the idea of a normalized dead-zone [11]. Other modifications used in conjunction with normalization include projection [5] or some form

* The research reported here has been supported in part by U.S.A.R.O. under Contract No. DAAL 03-88-K-0046, the JSEP under Contract No. N00014-90-J-1270, and by an International Paper Fellowship for the first author.
of leakage, the concept of which was first introduced in [7], such as the switching-σ modification [12,14]. An excellent unification of these various modifications and results can be found in Tao and Ioannou [18]. In this paper, we show that Egardt's original simple modification of parameter projection is in fact enough to provide stability with respect to small unmodeled dynamics as well as bounded disturbances. Our main results are the following: (i) A certainty equivalent adaptive controller, using a gradient based parameter estimator with projection, ensures that all closed-loop signals are bounded, when applied to a nominally minimum-phase continuous-time system with bounded disturbances and small unmodeled dynamics (Theorem 9.1). (ii) In the absence of unmodeled dynamics and disturbances, i.e., in the nominal case, the error in tracking a reference trajectory converges to zero (Theorem 10.1). When unmodeled dynamics as well as bounded disturbances are present, the mean-squared tracking error is quadratic in the magnitude of the unmodeled dynamics and bounded disturbances (Theorem 11.1). Thus, the adaptive controller provides robust performance in addition to robust boundedness. While our work thus shows that a simple modification is sufficient to ensure robust boundedness and robust performance, and some early modifications may have been proposed due to the limitations of the proof techniques used, we feel that it is nevertheless important for future work to compare the various modifications on the basis of the amount of robustness provided, the resulting performance as a function of unmodeled effects, transient response, etc., as well as the complexities of the modifications themselves.
The key stimulus for our work here is the recent paper of Ydstie [19], which showed that parameter projection in a gradient update law is sufficient for ensuring the boundedness of closed-loop signals for a nominally minimum-phase, unit delay, discrete-time plant with some types of unmodeled dynamics as well as bounded disturbances. The continuous-time systems studied here give rise to several additional issues such as filtering of signals, parametrization of systems, differentiability considerations of signals, augmented errors, etc., which motivate various changes, and allow us to establish stability for nominal plants with arbitrary positive relative degree, as well as for a class of unmodeled dynamics which is larger than those considered earlier, for example in [20], [14] or [19]. For instance, unlike Ydstie [19], we do not require the true plant to be stably invertible; only the nominal plant is assumed to be minimum-phase. Additionally, in contrast to [14], we also allow the unmodeled dynamics to be nonlinear or time-varying, and do not require differentiability of either the bounded disturbance, which is lumped together with the unmodeled dynamics in our treatment, or the reference input. The rest of the paper is organized as follows: Section 2 introduces the system and reference models. In Section 3, we reparametrize these models, and describe, in Section 4, the adaptive control law. Our analysis starts in Section 5, where we show that all closed-loop signals in the system are bounded by a particular signal m(t). In Section 6, we introduce a signal z(t) defined through a "switched system" which overbounds m(t). In Section 7, we show that the filtered signals
are comparable to z over certain bounded intervals of time. To apply these results to the stability analysis of the closed-loop system, in Section 8 we present a nonminimal representation of the closed-loop system, which is then used in Section 9 to complete the boundedness analysis by showing that a certain positive definite function of the signal z(t) and the non-minimal system state error e(t) is bounded. In Section 10, we show that asymptotic tracking is achieved in the nominal case, and in Section 11 we establish a mean-square robust performance result. In Section 12 we present simulation examples to illustrate the results. Finally, Section 13 presents some concluding remarks. Some necessary technical results are collected in Appendices A, B, and C. A list of constants appears in Table D. A preliminary version of the results presented here is contained in [20].
2 System and Reference Models
Consider the single-input, single-output system

    y(t) = [B(s)/A(s)] (1 + μ_m Δ_m(s)) u(t) + v(t),    (1)

where B(s)/A(s) is the transfer function of the modeled part of the plant, Δ_m(s) represents the multiplicative uncertainty in the plant, and v(t) represents the effect of additional additive unmodeled dynamics as well as bounded disturbances. We will make the following assumptions on the nominal model of the plant:

(A1) A(s) = s^n + Σ_{i=0}^{n-1} a_i s^i and B(s) = Σ_{i=0}^{m} b_i s^i, where 0 ≤ m < n, and A(s), B(s) are coprime.

(A2) B(s − p_0) is Hurwitz for some p_0 > 0, and b_m ≥ b_min > 0.

The relative degree of the nominal plant will be denoted by n_r := n − m. We will make the following assumptions on the unmodeled dynamics and bounded disturbance of the plant:

(A3) The multiplicative uncertainty, Δ_m(s), is a transfer function with relative degree greater than or equal to (1 − n_r) such that Δ_m(s − p_0) is stable.

(A4) The additive unmodeled dynamics and disturbances give rise to a signal v(t) which satisfies |v(t)| ≤ K_v m(t) + k_v + k_{v0} exp[−ρt], where m(·) is defined by

    dm(t)/dt = −d_0 m(t) + d_1 (|u(t)| + |y(t)| + 1),    m(0) ≥ d_1/d_0.    (2)

All the constants are positive. Furthermore, 0 < ρ ≤ d_0 and d_0 + d_2 < p_0 < 2d_0 + d_2, for some d_2 > 0. The term k_{v0} exp[−ρt] above allows for the effect of initial conditions.
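The bounding signal m(·) of (A4) is just a stable first-order filter driven by |u| + |y| + 1. As a quick numerical sketch (with hypothetical values for d_0, d_1 and stand-in signals u, y; none of these numbers come from the paper), one can check two properties used repeatedly later: m(t) never falls below d_1/d_0, and m stays bounded whenever u and y are bounded.

```python
import numpy as np

# Hypothetical numbers for illustration only.
d0, d1 = 1.0, 0.5
h, T = 1e-3, 5.0
t = np.arange(0.0, T, h)
u = np.sin(3.0 * t)          # stand-in input signal
y = 0.7 * np.cos(t)          # stand-in output signal

m = np.empty_like(t)
m[0] = d1 / d0               # m(0) >= d1/d0, as required by (A4)
for k in range(len(t) - 1):
    # forward-Euler step of  m' = -d0*m + d1*(|u| + |y| + 1)
    m[k + 1] = m[k] + h * (-d0 * m[k] + d1 * (abs(u[k]) + abs(y[k]) + 1.0))

# m(t) can never fall below d1/d0 ...
assert m.min() >= d1 / d0 - 1e-9
# ... and is bounded whenever u and y are bounded
B = d1 * (1.0 + np.max(np.abs(u)) + np.max(np.abs(y))) / d0
assert m.max() <= max(m[0], B) + 1e-9
```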
Example 2.1. The above assumptions allow for the class of linear systems

    y(t) = [B(s)/A(s)] (1 + μ_m N_m(s)/D_m(s)) u(t) + v(t),

where deg N_m(s) ≤ deg D_m(s) + n_r − 1 (which could be greater than the degree of D_m(s) if n_r ≥ 2),

    v(t) = [N_1(s)/D_1(s)] y(t) + [N_2(s)/D_2(s)] u(t) + w(t) + k_{v0} exp[−ρt],

where N_1(s)/D_1(s), N_2(s)/D_2(s) are strictly proper, w(t) is a bounded disturbance, and the polynomials D_m(s − p_0), D_1(s − p_0) and D_2(s − p_0) are Hurwitz. □

Example 2.2. A class of nonlinear unmodeled dynamics satisfying (A4) is

    v(t) = f(t, g_1(t) · H_1(s) y(t), g_2(t) · H_2(s) u(t)),

where f is any nonlinear function such that |f(t, x_1(t), x_2(t))| ≤ k_1 |x_1(t)| + k_2 |x_2(t)| + k_3, g_1(t), g_2(t) are bounded time-varying signals, and H_1(s), H_2(s) are strictly proper transfer functions such that H_1(s − p_0) and H_2(s − p_0) are stable. □
The goal of adaptation is to follow the output of a reference model given by

    y_m(t) = W_m(s) r(t),    (3)

where W_m(s) is a stable transfer function with relative degree n_r, and r(t) is a reference input. We will suppose that |r(t)| ≤ k_{r1}, ∀t ≥ 0, and |y_m(t)| ≤ k_{ym}, ∀t ≥ 0.

3 Parametrization of System and Reference Models

We now reparametrize the system and reference models so that they are in a form more suitable for the development of an adaptive control law. To do this, we need to filter the input and output signals. Let a be a positive number satisfying a > d_2 + 2d_0. We define the "regression" vector^3

    φ := ( [1/(s+a)^{2n−m−1}] y, ..., [1/(s+a)^{n−m}] y, [1/(s+a)^{2n−m−1}] u, ..., [1/(s+a)^{n−m}] u )^T.

^3 All of the results of this paper continue to hold if, instead of the filter 1/(s+a)^{2n−m−1}, we use the filter 1/[(s+a)^{n−m} λ̄_1(s)], where λ̄_1(s) = (s+a) λ_1(s), with λ_1(s) being monic, of degree (n − 2), and with all roots having real parts less than or equal to −(2d_0 + d_2).
We will presently show that there exists a "parameter vector" θ = (θ_1, ..., θ_{2n})^T such that the system (1) can be represented as

    y(t) = φ^T(t) θ + v̄_f(t) + ε_{2d_0}(t),    θ_{2n} = b_m,    (4)

where φ^T(t)θ represents the nominal part of the system, ε_{2d_0}(t) represents the effect of initial conditions arising from the filtering operations,^4 and v̄_f(t) represents the effect of unmodeled dynamics and bounded disturbances. We will also reparametrize the reference model (3) as

    y_m(t) = [1/(s+a)^{n_r}] r'(t) + ε_{2d_0}(t),    (5)

where r'(t) := (s+a)^{n_r} W_m(s) r(t). Note that r'(t) is well defined since the relative degree of W_m(s) is n_r. Further, r'(t) is bounded since r(t) is bounded. We shall directly suppose from now onwards that |r'(t)| ≤ k_r, ∀t ≥ 0.

To see the existence of a θ for which (4) holds, let F(s) be a monic polynomial of degree (n_r − 1) and G(s) a polynomial of degree less than or equal to (n − 1), such that A(s)F(s) + G(s) = Λ(s), where Λ(s) := (s+a)^{2n−m−1}. Then, from (1), taking into consideration the effect of the initial conditions introduced by the filtering operation 1/Λ(s), we have

    y(t) = F(s) [A(s)/Λ(s)] y(t) + [G(s)/Λ(s)] y(t) + ε_{2d_0}(t)
         = F(s) [ (B(s)/Λ(s)) u(t) + μ_m (B(s)Δ_m(s)/Λ(s)) u(t) + (A(s)/Λ(s)) v(t) ] + [G(s)/Λ(s)] y(t) + ε_{2d_0}(t).

Thus, if θ = (θ_1, ..., θ_{2n})^T is defined from the coefficients of G(s) and B(s)F(s) by

    θ_n (s+a)^{n−1} + ... + θ_2 (s+a) + θ_1 = G(s),
    θ_{2n} (s+a)^{n−1} + ... + θ_{n+2} (s+a) + θ_{n+1} = B(s)F(s),

and v̄_f(t) = [F(s)A(s)/Λ(s)] v(t) + μ_m [F(s)B(s)/Λ(s)] Δ_m(s) u(t), then (4) is satisfied. We note that since F is monic, θ_{2n} = b_m.
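The construction of θ amounts to one polynomial division, Λ(s) = A(s)F(s) + G(s), followed by re-expanding G and BF in powers of (s+a). A sketch for a hypothetical second-order plant (all numbers illustrative, not from the paper):

```python
import numpy as np

# Hypothetical nominal plant: n = 2, m = 1 (relative degree 1), filter pole a.
a = 2.0
A = np.poly1d([1.0, 3.0, 2.0])        # A(s) = s^2 + 3s + 2
B = np.poly1d([1.0, 1.0])             # B(s) = s + 1, so b_m = 1
n, m = 2, 1
Lam = np.poly1d([1.0, a]) ** (2 * n - m - 1)      # Lambda(s) = (s+a)^{2n-m-1}

# Divide: Lam = A*F + G, with F monic of degree n_r - 1 and deg G <= n - 1.
Fc, Gc = np.polydiv(Lam.coeffs, A.coeffs)
F, G = np.poly1d(Fc), np.poly1d(Gc)
assert np.allclose((A * F + G).coeffs, Lam.coeffs)

def shifted_coeffs(p, a, size):
    """Coefficients c_i with p(s) = sum_i c_i (s+a)^i (ascending order)."""
    q = p(np.poly1d([1.0, -a]))       # composition: q(w) = p(w - a)
    c = q.coeffs[::-1]                # ascending powers of w = s + a
    return np.concatenate([c, np.zeros(size - len(c))])

theta = np.concatenate([shifted_coeffs(G, a, n), shifted_coeffs(B * F, a, n)])
# F monic implies theta_{2n} equals the high-frequency gain b_m
assert np.isclose(theta[-1], B.coeffs[0])
```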
4 The Adaptive Control Law

We will use the control law

    θ̂^T(t) ψ(t) = r'(t),    (6)

to implicitly define the input u(t), where ψ(t) := ( [1/(s+a)^{n−1}] y, ..., [1/(s+a)] y, y, [1/(s+a)^{n−1}] u, ..., [1/(s+a)] u, u )^T, and θ̂(t) is an estimate of θ that we shall presently specify. Note that the "regression" vector φ(t) defined earlier satisfies φ(t) = [1/(s+a)^{n−m}] ψ(t).

^4 Here and throughout, ε_q(t) will denote a signal which satisfies the following properties: (i) |ε_q^{(i)}(t)| ≤ c_0 exp[−qt], i = 0, 1, where ε_q^{(i)}(t) denotes the i-th derivative of ε_q(t). (ii) ε_q(t) ≡ 0 when initial conditions are zero. When the value of q is unimportant, we will sometimes drop the subscript on ε.
The adaptive control law (6) is a "certainty equivalent" control law, since if θ̂(t) ≡ θ in (6), then the nominal part of the system (4) tracks y_m(t), since

    y(t) = φ^T(t) θ + ε_{2d_0}(t) = [1/(s+a)^{n−m}] [ψ^T(t) θ] + ε_{2d_0}(t)
         = [1/(s+a)^{n−m}] r'(t) + ε_{2d_0}(t) = y_m(t) + ε_{2d_0}(t).

Let us now turn to the parameter estimator. We shall use the gradient estimator with projection,

    dθ̂(t)/dt = Proj[ θ̂(t), a φ(t) e_a(t) / n(t) ],    ‖θ̂(0)‖ ≤ M,  θ̂_{2n}(0) ≥ b_min,    (7)

where a > 0, e_a(t) is the "augmented" error,

    e_a(t) = y(t) − y_m(t) + [1/(s+a)^{n−m}] [ψ^T(t) θ̂(t)] − φ^T(t) θ̂(t),

    n(t) := ν + Σ_{i=1}^{2n−m−1} [ ( [1/(s+a)^i] y(t) )^2 + ( [1/(s+a)^i] u(t) )^2 ],    ν > 0,

is a normalizer of the gradient, and Proj[·, ·] is a projection whose i-th component is defined for i ≠ 2n by

    Proj[p, x] |_i = x_i,                          if ‖p‖ < M or p^T x ≤ 0,
                   = x_i − (p^T x / ‖p‖^2) p_i,    otherwise,

and for i = 2n by

    Proj[p, x] |_{2n} = x_{2n},                            if (‖p‖ < M or p^T x ≤ 0) and (p_{2n} > b_min or x_{2n} ≥ 0),
                      = x_{2n} − (p^T x / ‖p‖^2) p_{2n},   if (‖p‖ ≥ M and p^T x > 0) and (p_{2n} > b_min or x_{2n} ≥ 0),
                      = 0,                                 otherwise.

There are several features worth commenting upon. First, the augmented error e_a(t) consists of the tracking error y(t) − y_m(t), as well as the familiar "swapping" term

    [1/(s+a)^{n−m}] [ψ^T(t) θ̂(t)] − [ (1/(s+a)^{n−m}) ψ^T(t) ] θ̂(t),

which would be zero if θ̂(t) were a constant. Second, note that n(t) is slightly different from the usual ν + ‖φ(t)‖^2, since it additionally contains the lower-order filtered terms [1/(s+a)^i] y(t) and [1/(s+a)^i] u(t) for 1 ≤ i ≤ n − m − 1, which are absent in φ(t).

Turning to the "projection" mechanism, it has two features. Without any projection, a gradient scheme would simply consist of

    dθ̂(t)/dt = a φ(t) e_a(t) / n(t).

However, to keep the estimates inside a sphere of radius M centered at the origin, one would project them according to

    dθ̂(t)/dt = Proj'[ θ̂(t), a φ(t) e_a(t) / n(t) ],

where Proj'[p, x] = x if ‖p‖ < M or p^T x ≤ 0, i.e., when the estimate is inside the sphere or not about to leave it, and otherwise

    Proj'[p, x] = x − (p^T x / ‖p‖^2) p,

i.e., one projects the drift term so that it evolves tangentially to the sphere. Our projection Proj[p, x], however, involves an additional feature to ensure that b̂_m(t), the 2n-th component of θ̂(t), is larger than or equal to b_min. So we have

    Proj[ θ̂(t), a φ(t) e_a(t)/n(t) ] |_{2n} = Proj'[ θ̂(t), a φ(t) e_a(t)/n(t) ] |_{2n},
        if θ̂_{2n}(t) > b_min or Proj'[ θ̂(t), a φ(t) e_a(t)/n(t) ] |_{2n} ≥ 0,

i.e., if b̂_m(t) is not about to become less than b_min, and

    Proj[ θ̂(t), a φ(t) e_a(t)/n(t) ] |_{2n} = 0    otherwise.

The following easily verified consequences of projection are important.

(P0) θ̂_{2n}(t) ≥ b_min, ‖θ̂(t)‖ ≤ M.

(P1) ‖dθ̂(t)/dt‖ ≤ ‖ a φ(t) e_a(t) / n(t) ‖.

(P2) If ‖θ‖ ≤ M and θ_{2n} = b_m ≥ b_min, i.e., θ lies in the region to which we confine the parameter estimates, then

    θ̃^T(t) dθ̂(t)/dt ≤ θ̃^T(t) ( a φ(t) e_a(t) / n(t) ),

where θ̃(t) := θ̂(t) − θ.
Finally, since the existence of a solution to the above parameter estimator may not be assured due to the discontinuous nature of the projection, we can replace the above projection by the "smooth" projection due to Pomet and Praly [15], for which existence is assured. The key properties (P1) and (P2) continue to hold for the "smooth" projection, and all the results of this paper rigorously hold for this "smooth" projection. In what follows, we will throughout suppose that the nominal plant described by θ satisfies the assumptions of (P2).
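The discontinuous projection itself is only a few lines of code. The sketch below implements the two-feature Proj of (7) for a hypothetical 4-dimensional estimate (M and b_min illustrative) and spot-checks the three branches: pass-through in the interior, tangential projection on the sphere, and zeroing of the last component when b̂_m is about to drop below b_min.

```python
import numpy as np

def proj(p, x, M=2.0, bmin=0.1):
    """Proj[p, x] of (7): sphere of radius M, last entry kept >= bmin."""
    z = x.astype(float).copy()
    if np.linalg.norm(p) >= M and p @ x > 0.0:
        z = z - (p @ x) / (p @ p) * p        # tangential projection
    if not (p[-1] > bmin or z[-1] >= 0.0):   # about to drop below bmin
        z[-1] = 0.0
    return z

# interior point: the drift passes through unchanged
p = np.array([0.5, 0.2, 0.3, 0.5]); x = np.array([1.0, -1.0, 0.0, 2.0])
assert np.allclose(proj(p, x), x)

# on the sphere, drift pointing outward: projected to the tangent plane
p = np.array([0.0, 0.0, 0.0, 2.0]); x = np.array([0.0, 0.0, 0.0, 1.0])
z = proj(p, x)
assert np.allclose(z, 0.0) and abs(p @ z) < 1e-12

# b_m-hat at bmin with a negative drift in the last slot: that slot is zeroed
p = np.array([0.0, 0.0, 0.0, 0.1]); x = np.array([0.3, 0.0, 0.0, -1.0])
z = proj(p, x)
assert z[-1] == 0.0 and np.allclose(z[:-1], x[:-1])
```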
5 Bounding Signals by m

In this section we will show that all the signals in the system can be bounded in terms of m(t) given in (2).^5 For brevity of notation in the proof, we define

    Λ_1(s) = (s+a)^{n−1},    Λ_2(s) = (s+a)^{n−m},    Λ(s) = Λ_1(s) Λ_2(s),

    ζ(t) = [1/Λ_2(s)] [ψ^T(t) θ̂(t)] − [ (1/Λ_2(s)) ψ^T(t) ] θ̂(t),

    φ'(t) = ( [1/(s+a)^{2n−m−1}] y(t), ..., [1/(s+a)] y(t), [1/(s+a)^{2n−m−1}] u(t), ..., [1/(s+a)] u(t) )^T,

and e_1(t) := y(t) − y_m(t).

Lemma 5.1.

    |u(t)| ≤ K_{uφ'} ‖φ'(t)‖ + K_u |v̄_f(t)| + k_u + c_0 exp[−2d_0 t],  ∀t ≥ 0.    (8)

Proof. Let ψ_1 := ( [1/(s+a)^{n−1}] y, ..., [1/(s+a)] y, y, [1/(s+a)^{n−1}] u, ..., [1/(s+a)] u )^T be the sub-vector of ψ obtained by deleting its last component u, and θ̂_1 the corresponding sub-vector of θ̂. Then, the control law (6) gives u = (r' − ψ_1^T θ̂_1)/b̂_m. From (4), we then obtain |y(t)| ≤ M ‖φ'‖ + |v̄_f| + c_0 exp[−2d_0 t]. Hence ‖ψ_1‖ ≤ |y| + ‖φ'‖ ≤ (1 + M) ‖φ'‖ + |v̄_f| + c_0 exp[−2d_0 t]. Since b̂_m ≥ b_min, and ‖θ̂_1‖ ≤ ‖θ̂‖ ≤ M, we thus conclude that

    |u| ≤ (1/b_min) ( k_r + M(1+M) ‖φ'‖ + M |v̄_f| ) + c_0 M exp[−2d_0 t].  □
Theorem
5.1 For all t > O,
(i) Iv! (t)l _< Kvfm(t) + kvf + Coexp[-(do + dz/2)t] + kvfo exp[-pt],
(ii) Iv~(t)l < K'fm(t) + k~t + co exp[-(do + d~/2)t] + kofo exp[-pt], (iil) I[¢'(t)ll -< K~,..m(t) + co exp[-2dot], (iv) le.(t)l _< Ktmm(t) + k~f + co exp[-(do + d2/2)t] + kvfo exp[-p/] +co exp[-2dot], (v) ly(t)l < Kv,~m(t) + k.f + eo exp[-(do + d~/2)t] + k~fo exp[-pt] (vi)
+co exp [-2dot], In(t)l < K.mm(t) + k... + coexp[-(do + d2/2)t] + K~k~foexp[-pt] +co exp[-2d0t].
Here and throughout, the values of useful constants used in the bounds are specified in Table D in the Appendix. Certain constants whose exact value is unimportant and which do not depend on K~, #m,k~ will be denoted generically by G. Any constant whose exact value is unimportant but which depends on K~, tim, k~, and such that its value decreases as K~,#m,k~ decrease, will be denoted generically by c. Any positive constant which depends only on initial conditions of some filter, and whose exact value is unimportant will be denoted by co. Finally, all constants throughout are positive, unless otherwise noted.
The proof of this theorem is based on the following lemma.

Lemma 5.2. Let H(s) be a strictly proper, stable transfer function, whose poles {p_j} satisfy Re(p_j) ≤ −(d_0 + d_2) < 0 for all j.

(i) If w_in(t) is the input to a system with transfer function H(s), and satisfies the bound |w_in(t)| ≤ k_u |u(t)| + k_y |y(t)| + k_m m(t) + k' + k'' exp[−ρ't], ∀t ≥ 0, for 0 < ρ' ≤ d_0, then the output w_out(t) is bounded by

    |w_out(t)| ≤ k_1 ‖x(0)‖ exp[−(d_0 + d_2/2)t] + k_3 m(t) + k_4 + k_5 exp[−ρ't],

where x(0) is the initial state corresponding to a minimal state representation of H(s), and k_1, k_3, k_4 and k_5 are related positive constants specified below.^6

(ii) Let H'(s) = k_p + H(s). If w_in(t) is the input to a system with transfer function H'(s), and satisfies the bound |w_in(t)| ≤ k_m m(t) + k' + k'' exp[−ρ't], ∀t ≥ 0, for 0 < ρ' ≤ d_0, then the output w_out(t) is bounded by

    |w_out(t)| ≤ k_1 ‖x(0)‖ exp[−(d_0 + d_2/2)t] + k_6 m(t) + k_7 + k_8 exp[−ρ't],

where x(0) is the initial state corresponding to a minimal state representation of H'(s), and k_1, k_6, k_7 and k_8 are related positive constants specified below.^6

Proof. (i) Let ẋ = Ax + b w_in, w_out = c^T x be a minimal state representation of H(s). Then there exist constants k_1, k_2 such that ‖c^T exp[At]‖ ≤ k_1 exp[−(d_0 + d_2/2)t], ∀t ≥ 0, and |h(t)| ≤ k_2 exp[−(d_0 + d_2/2)t], ∀t ≥ 0, where h(·) is the impulse response of H(s). Thus,

    |w_out(t)| ≤ ‖c^T exp[At]‖ ‖x(0)‖ + ∫_0^t |h(τ)| |w_in(t−τ)| dτ
               ≤ k_1 ‖x(0)‖ exp[−(d_0 + d_2/2)t] + k_2 k_m ∫_0^t exp[−(d_0 + d_2/2)τ] m(t−τ) dτ
                 + k_2 ∫_0^t exp[−(d_0 + d_2/2)τ] ( k_u |u(t−τ)| + k_y |y(t−τ)| + k' + k'' exp[−ρ'(t−τ)] ) dτ.

Using m(t−τ) ≤ exp[d_0 τ] m(t) gives

    ∫_0^t exp[−(d_0 + d_2/2)τ] m(t−τ) dτ ≤ m(t) ∫_0^t exp[−d_2 τ/2] dτ ≤ 2 m(t)/d_2.

Also, from 0 < ρ' ≤ d_0,

    ∫_0^t exp[−(d_0 + d_2/2)τ] exp[−ρ'(t−τ)] dτ = exp[−ρ't] ∫_0^t exp[−(d_0 + d_2/2 − ρ')τ] dτ
        ≤ exp[−ρ't] / (d_0 + d_2/2 − ρ') ≤ (2/d_2) exp[−ρ't].

The terms in |u(t−τ)| and |y(t−τ)| are handled similarly, using (A4) to bound d_1(|u| + |y| + 1) in terms of ṁ + d_0 m. Thus,

    |w_out(t)| ≤ k_1 ‖x(0)‖ exp[−(d_0 + d_2/2)t] + k_3 m(t) + k_4 + k_5 exp[−ρ't],

where k_4 = 2k_2 k'/(2d_0 + d_2), k_5 = 2k_2 k''/d_2, and k_3 collects the coefficients of the m(t) terms above.

(ii) This follows from part (i), since the feedthrough term k_p w_in obeys the same type of bound as w_in itself. □

^6 k_3, k_4, k_6 and k_7 are independent of k'', which will mean that initial conditions have no influence on the magnitude of the bounds or the size of allowable unmodeled dynamics.
Proof of Theorem 5.1. The results (i) and (ii) follow from Lemma 5.2 and (A4) by noting that v_f = [F(s)A(s)/Λ(s)] v and v̄_f = v_f + μ_m [F(s)B(s)/Λ(s)] Δ_m(s) u. The result (iii) follows by applying Lemma 5.2 to each component of φ', noting that each component is either of the form [1/(s+a)^i] y or [1/(s+a)^i] u. Since e_a = −θ̃^T φ + v̄_f + ε_{2d_0}, the bound for e_a in (iv) follows from the bounds for ‖φ‖ ≤ ‖φ'‖ and v̄_f. The proof of (v) is similar to (iv), since y = φ^T θ + v̄_f + ε_{2d_0}. Finally, (vi) follows from Lemma 5.1 and (ii), (iii) above. □
6 Bounding m by a Switched System

Let us introduce the "switched" system

    dz(t)/dt = I(t) ( −d_0 z(t) + d_1 (|u(t)| + |e_a(t)| + 1) ) + (1 − I(t)) ( −K_2 z(t) + K̄_2 ),    (9)

with z(0) ≥ (1 + C) m(0), where

    I(t) = 1, if ( −d_0 z(t) + d_1 (|u(t)| + |e_a(t)| + 1) ) ≥ ( −K_2 z(t) + K̄_2 ),
         = 0, otherwise.    (10)
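The switched system (9)-(10) is easy to simulate: I(t) simply selects whichever of the two right-hand sides is larger. A sketch with illustrative constants and stand-in signals u, e_a (none from the paper), confirming that z stays positive and bounded in this scenario:

```python
import numpy as np

# Euler simulation of the switched comparison system (9)-(10),
# with illustrative constants and stand-in signals u, e_a.
d0, d1, K2, Kb2 = 1.0, 0.5, 0.8, 0.4    # Kb2 plays the role of K2-bar
h, T = 1e-3, 10.0
t = np.arange(0.0, T, h)
u  = np.sin(2.0 * t)
ea = 0.3 * np.cos(5.0 * t)

z = np.empty_like(t); z[0] = 1.0
for k in range(len(t) - 1):
    rhs1 = -d0 * z[k] + d1 * (abs(u[k]) + abs(ea[k]) + 1.0)   # I(t) = 1 branch
    rhs2 = -K2 * z[k] + Kb2                                   # I(t) = 0 branch
    z[k + 1] = z[k] + h * max(rhs1, rhs2)    # I(t) selects the larger one

# z neither escapes in finite time nor collapses to zero here:
C = 1.0 + np.max(np.abs(u)) + np.max(np.abs(ea))
assert z.min() >= min(z[0], Kb2 / K2) - 1e-9
assert z.max() <= max(z[0], d1 * C / d0, Kb2 / K2) + 1e-9
```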
In Theorem 6.1(ii) below we will show that m itself is bounded in terms of z, thus showing from Theorem 5.1 that all other signals can also be bounded in terms of z.

Theorem 6.1.

(i) dm(t)/dt ≤ K_m m(t) + k_m + c_0 exp[−(d_0 + d_2/2)t] + c_0 exp[−ρt] + c_0 exp[−2d_0 t], ∀t ≥ 0.    (11)
(ii) m(t) ≤ K_{mz} z(t) + k_{mz}, ∀t ≥ 0.    (12)
(iii) dz(t)/dt ≤ K_z z(t) + k_z + c_0 exp[−(d_0 + d_2/2)t] + c_0 exp[−ρt] + c_0 exp[−2d_0 t], ∀t ≥ 0.    (13)
Proof. (i) This follows from ṁ = −d_0 m + d_1 (|u| + |y| + 1), by making use of the bounds for |u| and |y| in terms of m given in Theorem 5.1(v),(vi).

(ii) By the definition of the augmented error e_a, we have y = (e_a + y_m) − ζ, which implies |y| ≤ |e_a| + |y_m| + |ζ|. By the Swapping Lemma (Morse [6]; Goodwin and Mayne [13]),

    ζ(t) = [h^T exp(Q_1 t)] * (H̄(t) dθ̂(t)/dt),

where Q_1 is an (n_r × n_r) stable matrix such that det(sI − Q_1) = Λ_2(s), H̄(t) is the matrix with rows φ^T(t), s φ^T(t), ..., s^{n−m−1} φ^T(t), h^T is a constant row vector of dimension n_r, and "*" denotes convolution. Now,

    |ζ(t)| ≤ ∫_0^t ‖h^T exp[Q_1 (t−τ)]‖ ‖H̄(τ)‖ ‖dθ̂(τ)/dτ‖ dτ ≤ ∫_0^t ‖h^T exp[Q_1 (t−τ)]‖ a C |e_a(τ)| dτ,

where the last inequality follows from Property (P1) of the parameter estimator, and since ‖H̄‖ ≤ C ‖φ'‖. Since Λ_2(s) = (s+a)^{n_r}, we have ‖h^T exp(Q_1 t)‖ ≤ C exp(−at/2). Therefore |ζ(t)| ≤ ∫_0^t aC exp[−a(t−τ)/2] |e_a(τ)| dτ, and so

    ∫_0^t |ζ(τ)| exp[−d_0 (t−τ)] dτ ≤ Ca ∫_0^t exp[−d_0 (t−τ)] ∫_0^τ exp[−a(τ−t')/2] |e_a(t')| dt' dτ
        = Ca ∫_0^t exp[−d_0 t] exp[at'/2] ∫_{t'}^t exp[−(a/2 − d_0)τ] dτ |e_a(t')| dt'
        ≤ [2Ca/(a − 2d_0)] ∫_0^t exp[−d_0 (t−t')] |e_a(t')| dt'.

This implies

    ∫_0^t |y(τ)| exp[−d_0 (t−τ)] dτ ≤ ∫_0^t exp[−d_0 (t−t')] ( 1 + 2Ca/(a − 2d_0) ) |e_a(t')| dt' + k_{ym}/d_0.

Now, add ∫_0^t (|u(τ)| + 1) exp[−d_0 (t−τ)] dτ to both sides and multiply by d_1. Finally, adding exp(−d_0 t) m(0) to both sides yields the desired result.

(iii) The result follows from dz(t)/dt ≤ −min(d_0, K_2) z(t) + d_1 (|u| + |e_a| + 1) + K̄_2 and the bounds on |e_a| and |u| in Theorem 5.1(iv),(vi), by using the bound for m in (ii). □
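The swapping identity used in part (ii) can be checked numerically in its simplest, first-order instance (n_r = 1, scalar regressor; all numbers illustrative): with ψ_f = [1/(s+a)] ψ, the swapping term obeys dζ/dt = −a ζ − ψ_f dθ̂/dt, so it is driven purely by the parameter rate and vanishes for a frozen estimate.

```python
import numpy as np

# First-order instance of the swapping identity: the term
#   zeta = [1/(s+a)](psi*thetahat) - ([1/(s+a)]psi)*thetahat
# obeys  zeta' = -a*zeta - psi_f*thetahat'  with psi_f = [1/(s+a)]psi.
a, h = 2.0, 1e-3
t = np.arange(0.0, 5.0, h)
psi = np.sin(3.0 * t)
th  = np.sin(0.5 * t)          # slowly varying "estimate"
thd = 0.5 * np.cos(0.5 * t)    # its derivative

w1 = np.zeros_like(t); w2 = np.zeros_like(t); zo = np.zeros_like(t)
for k in range(len(t) - 1):
    w1[k + 1] = w1[k] + h * (-a * w1[k] + psi[k] * th[k])   # filt(psi*th)
    w2[k + 1] = w2[k] + h * (-a * w2[k] + psi[k])           # filt(psi)
    zo[k + 1] = zo[k] + h * (-a * zo[k] - w2[k] * thd[k])   # claimed ODE

zeta = w1 - w2 * th            # the swapping term, by definition
assert np.max(np.abs(zeta - zo)) < 5e-3    # the two agree to O(h)
assert np.max(np.abs(zeta)) > 1e-3         # and are not trivially zero
```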
From Theorem 6.1 it follows that z can grow at most exponentially fast, and therefore does not have a finite escape time. Since z bounds all signals, it follows that no signal in the system has a finite escape time. In what follows, we choose a large time T_l, and restrict attention to t ≥ T_l, for which the following bounds hold:

(i) |v(t)| ≤ K_v m(t) + k_v + k(T_l)    (14)
        ≤ K_v K_{mz} z(t) + K_v k_{mz} + k_v + k(T_l),    (15)
(ii) |v̄_f(t)| ≤ K̄_{vf} K_{mz} z(t) + K̄_{vf} k_{mz} + k̄_{vf} + k(T_l),    (16)
(iii) ‖φ'(t)‖ ≤ K_{φ'm} K_{mz} z(t) + K_{φ'm} k_{mz} + k(T_l),    (17)
(iv) |y(t)| ≤ M ‖φ'(t)‖ + K̄_{vf} K_{mz} z(t) + K̄_{vf} k_{mz} + k̄_{vf} + k(T_l),    (18)
(iv)' |y(t)| ≤ K_{yz} z(t) + k_{yz} + k(T_l),    (19)
(v) |u(t)| ≤ K_{uφ'} ‖φ'(t)‖ + K_u K̄_{vf} K_{mz} z(t) + k_u + k(T_l),    (20)
(v)' |u(t)| ≤ K_{uz} z(t) + k_{uz} + k(T_l),    (21)
(vi) dz(t)/dt ≤ K_z z(t) + k_z + k(T_l),    (22)
(vii) η̄_m(t) := [Δ_m(s)/q(s)] u(t) satisfies |η̄_m(t)| ≤ c z(t) + c + k(T_l),    (23)

if q(s − 2d_0 − d_2) is a Hurwitz polynomial such that Δ_m(s)/q(s) is proper. (The result (vii) follows from Theorem 5.1(vi), Lemma 5.2(ii) and Theorem 6.1(ii).) In all the above inequalities and in what follows, k(T_l) denotes a generic constant which decreases with increasing T_l at a rate faster than exp[−ρT_l].
7 Comparing φ' and z

In this section we compare φ' and z.

Theorem 7.1.

(i) If I(t) = 1, then n(t)/z(t) ≥ K_n z(t).

(ii) Consider T' > 0. Let t_1 ≥ T_l + T' be any instant such that I(t_1) = 1. If z(t) ≥ L, ∀t ∈ (t_1 − T', t_1], where L is a large positive constant (specified in Table D), then there exist a K_{vmax} > 0 and a μ_max > 0 such that for all K_v ∈ [0, K_{vmax}] and for all μ_m ∈ [0, μ_max], we have

    n(t)/z^2(t) ≥ δ(T'),  ∀t ∈ (t_1 − T', t_1].    (24)
Proof. (i) Let us fix t such that I(t) = 1. By (10), we have |u| + |e_a| ≥ [(d_0 − K_2)z + K̄_2]/d_1 − 1. Using the upper bound (20) on u gives K_{uφ'} ‖φ'‖ + |e_a| ≥ d̃_0 z + K̃_2 =: RHS, where d̃_0 and K̃_2 collect the constants above; we note that we can take d̃_0 > 0 and K̃_2 > 0, by the choices of K_2, T_l, K_{vmax}, and μ_max given in Table D. If ‖φ'‖ ≥ RHS/(2K_{uφ'}), then the result clearly holds, so we consider ‖φ'‖ < RHS/(2K_{uφ'}). Then clearly |e_a| ≥ RHS/2. Therefore, using the upper bound (16) for v̄_f, we obtain by the definition of the augmented error,

    2M ‖φ‖ ≥ |θ̃^T φ| ≥ |e_a| − |v̄_f| − c_0 exp[−2d_0 T_l]
            ≥ (d̃_0 z + K̃_2)/2 − ( K̄_{vf} K_{mz} z(t) + K̄_{vf} k_{mz} + k̄_{vf} + k(T_l) + c_0 exp[−ρT_l] ).

Note that d̃_0 ≥ 4 K̄_{vf} K_{mz} and K̃_2 ≥ 2 ( K̄_{vf} k_{mz} + k̄_{vf} + k(T_l) + c_0 exp[−ρT_l] ), again from the choices of K_2, T_l, K_{vmax}, and μ_max given in Table D. This implies ‖φ'‖ ≥ ‖φ‖ ≥ [d̃_0/(8M)] z. Hence we obtain n(t)/z(t) ≥ ‖φ'(t)‖^2/z(t) ≥ K_n z(t).

(ii) First, we will bound the growth rate of n(t)/z^2(t). We do not require z to be large for this part of the proof. Note that

    d/dt (n/z^2) = ṅ/z^2 − 2 (n/z^2)(ż/z) ≤ ṅ/z^2 + 2K_2 (n/z^2).

Also, dφ'/dt = φ̄' − aφ', which implies ṅ = 2 φ'^T (φ̄' − aφ') ≤ (1 − 2a) ‖φ'‖^2 + ‖φ̄'‖^2, where

    φ̄' := ( [1/(s+a)^{2n−m−2}] y, ..., [1/(s+a)] y, y, [1/(s+a)^{2n−m−2}] u, ..., [1/(s+a)] u, u )^T,

i.e., φ̄' := (s+a) φ'. Now, ‖φ̄'‖^2 ≤ ‖φ'‖^2 + y^2 + u^2. Using this, we get ṅ ≤ 2(1−a) ‖φ'‖^2 + y^2 + u^2. Next, using the upper bounds for u and y from (21), (19), and recalling that n(t) = ν + ‖φ'(t)‖^2, we get

    ṅ/z^2 ≤ K_a (n/z^2) + K_b + K_c/z^2.

Next, using the fact that z(t) ≥ L, ∀t ∈ (t_1 − T', t_1], we get d/dt (n/z^2) ≤ K_a (n/z^2) + K_d, where K_d := K_b + K_c/L^2. This gives

    n(t_1)/z^2(t_1) ≤ exp[K_a (t_1 − t')] n(t')/z^2(t') + ∫_{t'}^{t_1} exp[K_a (t_1 − τ)] K_d dτ
                    ≤ exp[K_a T'] [ n(t')/z^2(t') + (K_d/K_a)(1 − exp[−K_a T']) ],  ∀t' ∈ [t_1 − T', t_1].

Thus,

    n(t')/z^2(t') ≥ exp[−K_a T'] [ n(t_1)/z^2(t_1) + K_d/K_a ] − K_d/K_a
                  ≥ exp[−K_a T'] [ K_n + K_d/K_a ] − K_d/K_a =: δ(T'),  ∀t' ∈ [t_1 − T', t_1].

The fact that there exist K_{vmax} > 0, μ_max > 0 such that δ(T') > 0 for all K_v ∈ [0, K_{vmax}], μ_m ∈ [0, μ_max] follows from the fact that K_d/K_a can be made as small as we wish by making L sufficiently large, and then choosing appropriately small K_{vmax} and μ_max. □
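The comparison-lemma step in part (ii) — integrating d/dt(n/z^2) ≤ K_a (n/z^2) + K_d forward and rearranging for a lower bound — can be checked in closed form (illustrative constants, not from Table D):

```python
import numpy as np

# Closed-form check of the comparison bound: if x' <= Ka*x + Kd on [t', t1],
# then x(t1) <= exp(Ka*T')*(x(t') + Kd/Ka) - Kd/Ka, equivalently
# x(t') >= exp(-Ka*T')*(x(t1) + Kd/Ka) - Kd/Ka, with T' = t1 - t'.
Ka, Kd, Tp = 0.7, 0.3, 2.0
x0 = 0.05                                          # value at time t'
x1 = (x0 + Kd / Ka) * np.exp(Ka * Tp) - Kd / Ka    # worst case: equality ODE
lower = np.exp(-Ka * Tp) * (x1 + Kd / Ka) - Kd / Ka

assert abs(lower - x0) < 1e-12      # the bound is tight for the equality case
assert lower <= x0 + 1e-12
```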
8 A Nonminimal System Representation
Recalling 9 = 0T~b "~" I)~-I-~2do, ~b(t) "-- ~2--~¢(t) , c T ( t ) 0 ( $ ) "-- rJ(t), and Vm = ~-EZlmU, we note that 1 y ~ ~ ( r
t -- b'T¢) Jl- ~f "J- ~rnl)m "~ ~2do.
(25)
The system (1) can be represented as ' ~p = Apxp -I- bpu + pmbp~m , y = hTxp + ~ + ~o
(28)
166
Naik, Kumar, and Ydstie
where (Ap, bp, h T) is a minimal state representation of the nominal plant transfer function, rhn := ~ u ,
with q(s) chosen as some polynomial of degree nr
1
such that q(s - 2d0 - dz) is Hurwitz, thus giving rise to the ~2do term in (26)). In (26) above, h T ( s I - A p ) - l b p B ( s ) and h T ( s I - Ap) -1 b,~ - B(s)q(s) A(s) Note =
"
Am(S)
that ~
is a proper, stable transfer function. The control law (6) is equivalent
t o o T ¢ de. 0"*r5 :
r s.
Hence, b m u
Defining ¢y =
=
--oT ou
--
Ony oT~y --
--
~W¢ "4"r'.
(27)
1 1 )T and cu likewise, we note that (s + a) n-1 Y ' ' ' " " ( ~ ' ~ Y
¢(t) = (¢T(t), y(t), cT(t), u(t)) T, with ~y(t) = Q f y ( t ) + qy(t), Cu(t) = QCu(t) + qu(t),
(2s)
where Q is a stable matrix such that det(sI − Q) = (s + a)^{n−1}, and (Q, q) is a controllable pair. Define X_cᵀ := [x_pᵀ, φ_yᵀ, φ_uᵀ]. Then, using (26), (27), and (28), we get

Ẋ_c = A_c X_c + b_c(r′ − θ̃ᵀφ) + b_{cv} v + μ_m b_{cη} η_m ,  y = h_cᵀ X_c + v + ε_{2d0} ,

where A_c is a stable matrix (as we will prove shortly), b_cᵀ := [b_pᵀ/b_m, 0, qᵀ/b_m], b_{cv}ᵀ := [−θ_n b_pᵀ/b_m, qᵀ, −θ_n qᵀ/b_m], b_{cη}ᵀ := [b_ηᵀ, 0, 0] and h_cᵀ := [h_pᵀ, 0, 0]; see [3]. For θ̃ = 0, v ≡ 0 and μ_m = 0, we have Ẋ_c = A_c X_c + b_c r′, y = h_cᵀ X_c + ε_{2d0}. This implies y = h_cᵀ(sI − A_c)⁻¹ b_c r′ + ε_{2d0}. However, we already know (from (25)) that y = (1/A_m(s)) r′ + ε_{2d0} in such a case. This means that (after cancellations)

h_cᵀ(sI − A_c)⁻¹ b_c = 1/A_m(s) .  (29)
It can easily be verified (e.g., see [17, p. 136]) that we have only stable pole-zero cancellations, because the nominal plant is minimum-phase and Q is a stable matrix. This proves that (A_c, h_c) is detectable. Since the overall transfer function is stable (by (29)), we conclude that A_c is a stable matrix. Thus we can write the following nonminimal representation of 1/A_m(s):

Ẋ_m = A_c X_m + b_c r′ ,  y_m = h_cᵀ X_m ,

where X_mᵀ := [x_{pm}ᵀ, φ_{ym}ᵀ, φ_{um}ᵀ]. A_c being stable and r′ being bounded implies that ‖X_m(t)‖ ≤ K_{Xm}, ∀t ≥ 0, for some positive constant K_{Xm}. Define the state error e as e := X_c − X_m. This gives

ė = A_c e − b_c(θ̃ᵀφ) + b_{cv} v + μ_m b_{cη} η_m ,  e₁ = h_cᵀ e + v ,  (30)

where we recall that e₁ = y − y_m is the tracking error.
Robust Continuous-Time Adaptive Control
9 Robust Ultimate Boundedness

Define W, which bounds all signals, by
W = k_e eᵀPe + ½ z² ,
(31)
where P = Pᵀ > 0 satisfies PA_c + A_cᵀP = −I. Such a P exists since A_c is a stable matrix. Our main result on robust ultimate boundedness of the overall system is given by the following theorem. It states that eventually W(t) (which bounds all signals) enters a compact set, the size of which is independent of the initial conditions. Furthermore, the size of the allowable unmodeled dynamics for which this is guaranteed is independent of the initial conditions. It should be noted, however, that the time T_large that it takes for W(t) to enter this compact set can depend on the initial conditions. Choose constants 0 < γ < 1, ε₀ > 0 small, and ε_z > 0. Let T, ε, T_l, K_{v max}, and μ_max be as in Table D.

Theorem 9.1 (Robust Ultimate Boundedness Theorem). There exist a T_large ≥ T_l and positive K_{v max}, μ_max, such that for all K_v ∈ [0, K_{v max}], and for all μ_m ∈ [0, μ_max],
W(t) ≤ 8K_{Wz} L² exp[4(K_z + ε_z)T] , ∀t ≥ T_large .
(32)
Furthermore, K_{v max}, μ_max, T, and L are independent of the initial conditions.
Proof. The idea of the proof, based on the following lemmas, is to show that whenever W(t) ≥ K_{Wz}L² throughout an interval of length 2T, then at the end of the interval its value is smaller than at the beginning of the interval.

Lemma 9.1
W(t) ≤ K_{Wz} z²(t) + k_{Wz} , ∀t ≥ T_l .
(33)
Proof. Since W = k_e eᵀPe + ½z², e = X_c − X_m, and ‖X_m‖ ≤ K_{Xm}, ∀t ≥ 0, and from Theorem 5.1(v), |y′(t)| ≤ c m(t) + c + c₀ exp[−pt], ∀t ≥ 0. Noting that B′(s) := B(s)q(s) is such that B′(s − p) is Hurwitz, and applying Lemma A.1, we get ‖x_p(t)‖ ≤ c m(t) + c + c₀ exp[−pt], ∀t ≥ 0. Finally, applying Theorem 5.2(ii), we get the desired bound on ‖x_p‖. □
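Since A_c is stable, the Lyapunov matrix P in (31) always exists and can be computed numerically. A minimal sketch — the matrix below is an arbitrary stable stand-in for A_c, not the A_c of the text; `scipy.linalg.solve_continuous_lyapunov(A, Q)` solves A X + X Aᴴ = Q:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Arbitrary stable stand-in for A_c (eigenvalues -1 and -2).
Ac = np.array([[-1.0, 1.0],
               [0.0, -2.0]])

# We want P Ac + Ac^T P = -I.  Since solve_continuous_lyapunov(A, Q) solves
# A X + X A^H = Q, pass Ac^T to obtain Ac^T P + P Ac = -I.
P = solve_continuous_lyapunov(Ac.T, -np.eye(2))

# P is symmetric positive definite, as required for W in (31).
assert np.allclose(P, P.T)
assert np.all(np.linalg.eigvalsh(P) > 0)
print(np.max(np.abs(Ac.T @ P + P @ Ac + np.eye(2))))  # residual, ~0
```

For an unstable A_c the computed P would fail the positive-definiteness check, which is one quick numerical test of stability.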
Lemma 9.2 Consider an interval [a, b]. Then

W(b)/W(a) ≤ exp[∫_a^b g(τ)dτ] (1 + β₄ exp[β₂(b − a)]) ,  (34)

where g(t) := −β₁ + β₃ |θ̃ᵀ(t)φ(t)|/z(t) + β₅ |θ̃ᵀ(t)φ(t)|/n(t), with β₃, β₄ and β₅ being specified in Appendix C.
Proof. See Appendix C.
Lemma 9.3 Consider a time interval [a, b] such that (i) b > a ≥ T_l, (ii) W(t) ≥ K_{Wz}L², ∀t ∈ [a, b]. Then,

(i) ½ z²(t) ≤ W(t) ≤ 4K_{Wz} z²(t), ∀t ∈ [a, b].

(ii) If I(t) = 0, ∀t ∈ [a, b], then

W(b) ≤ 8K_{Wz} ( exp[−g₂(b − a)] + 2K₂/(g₂L) )² W(a) .

(iii) If (b − a) ≥ 1 and for each t ∈ [a, b] such that I(t) = 0 there exists a t′ ∈ [0, T] such that I(t + t′) = 1, then

W(b) ≤ k̄ exp[−(β/4)(b − a)] (1 + K_{Wz}L² exp[β₂(b − a)]) W(a) .

(iv) W(b) ≤ 8K_{Wz} exp[2(K_z + ε_z)(b − a)] W(a).

Proof. The result (i) follows by Lemma 9.1, since W ≥ K_{Wz}L² implies z ≥ L/2. The result (ii) follows from (i) because ż ≤ −g₂z + K₂, and since W ≥ K_{Wz}L² implies z ≥ L/2. To prove (iii), note from Lemmas 9.2, B.4, the discussion preceding Lemma B.4, and by our choices of ε, μ_max, K_{v max} and L, that

∫_a^b g(τ)dτ ≤ −β(b − a) + (β₃ν₃ + β₅ν₇)/ε + [ β₃√μ ν₄(T)/ε + ν₅(T)/√ε + ν₆(T)ε^p + 4a^{n−m}k(μ, L) ](b − a) ≤ −(β/2)(b − a) + (β₃ν₃ + β₅ν₇)/ε ,

which gives us the desired result after defining k̄ := exp[(β₃ν₃ + β₅ν₇)/ε]. Finally, (iv) follows from the bounded growth rate of z(t) shown in (22), and (i). □
Lemma 9.4 (Contraction Property). If t₀ ≥ T_l, and W(t) ≥ K_{Wz}L², ∀t ∈ [t₀, t₀ + 2T], then W(t₀ + 2T) ≤ γW(t₀).
Proof. There are four possibilities:
(1) Suppose I(t) = 0, ∀t ∈ [t₀, t₀ + 2T]. Then from Lemma 9.3(ii), we have W(t₀ + 2T) ≤ 8K_{Wz}(exp[−2g₂T] + 2K₂/(g₂L))² W(t₀) ≤ γW(t₀), where the last inequality follows by the definitions of T and L given in Table D.

Let t₁ := min{t ∈ [0, 2T] : I(t₀ + t) = 1}, and t₂ := max{t ∈ [0, 2T] : I(t₀ + t) = 1}.

(2) Suppose 0 ≤ t₁ ≤ t₂ ≤ T. First, by Lemma 9.3(ii),

W(t₀ + 2T) ≤ 8K_{Wz} ( exp[−g₂(2T − t₂)] + 2K₂/(g₂L) )² W(t₀ + t₂) .

(2A) Suppose t₂ ≤ 1. Then the above inequality implies W(t₀ + 2T) ≤ 8K_{Wz}(exp[−g₂(2T − 1)] + 2K₂/(g₂L))² W(t₀ + t₂). Using Lemma 9.3(iv), we have W(t₀ + t₂) ≤ 8K_{Wz} exp[2(K_z + ε_z)] W(t₀), which implies that

W(t₀ + 2T) ≤ 64K²_{Wz} exp[2(K_z + ε_z)] ( exp[−g₂(2T − 1)] + 2K₂/(g₂L) )² W(t₀) ,

so that W(t₀ + 2T) ≤ γW(t₀), the last inequality following from the definitions of T and L.

(2B) Suppose t₂ > 1. In this case, we have by Lemma 9.3(iii) and the definitions of T and L, W(t₀ + t₂) ≤ exp[−βt₂/4](k̄ + ε₀)W(t₀), so that

W(t₀ + 2T) ≤ 8 exp[−β/4](k̄ + ε₀) K_{Wz} ( exp[−g₂T] + 2K₂/(g₂L) )² W(t₀) ≤ γW(t₀) .

Since the rest of the proof is similar, we abbreviate it.

(3) Suppose 0 ≤ t₁ ≤ T ≤ t₂ ≤ 2T. We apply the definitions of T and L to the following cases just as in cases 1 and 2 above, and in each case get W(t₀ + 2T) ≤ γW(t₀).

(3A) Suppose I(t₀ + T) = 1. Then we separately consider the case where I(t₀ + t) = 1 for some t ∈ [t₂, t₂ + T], and where I(t₀ + t) = 0, ∀t ∈ [t₂, t₂ + T].

(3B) Suppose I(t₀ + T) = 0. Define t₃ := max{t ∈ [0, T] : I(t₀ + t) = 1}.

(1) (t₃ ≤ 1)
 (a) (t₂ − T ≤ 1) (i) (I(t₀ + t) = 1 for some t ∈ [t₂, t₂ + T]) (ii) (I(t₀ + t) = 0, ∀t ∈ [t₂, t₂ + T])
 (b) (t₂ − T > 1) (i) (I(t₀ + t) = 1 for some t ∈ [t₂, t₂ + T]) (ii) (I(t₀ + t) = 0, ∀t ∈ [t₂, t₂ + T]).
(2) (t₃ > 1)
 (a) (t₂ − T ≤ 1) (i) (I(t₀ + t) = 1 for some t ∈ [t₂, t₂ + T]) (ii) (I(t₀ + t) = 0, ∀t ∈ [t₂, t₂ + T])
 (b) (t₂ − T ≥ 1) (i) (I(t₀ + t) = 1 for some t ∈ [t₂, t₂ + T]) (ii) (I(t₀ + t) = 0, ∀t ∈ [t₂, t₂ + T]).
Here one needs to consider the four cases (t₃ ≤ T/2)/(t₃ > T/2) and (t₂ − T ≤ T/2)/(t₂ − T > T/2). (Because of symmetry, these reduce to just two.)

(4) (T ≤ t₁ ≤ t₂ ≤ 2T). Again, we apply the definitions of T and L as in cases 1 and 2 to get W(t₀ + 2T) ≤ γW(t₀) in each case.
 (a) (I(t₀ + t) = 0, ∀t ∈ [t₂, t₂ + T]) (i) (t₂ − T ≤ 1) (ii) (t₂ − T > 1)
 (b) (I(t₀ + t) = 1 for some t ∈ [t₂, t₂ + T]). □
Proof of Theorem 9.1. Lemma 9.4 and the fact that z has a bounded growth rate give us the desired result, since W(t + 2T) ≤ 8K_{Wz} exp[4(K_z + ε_z)T] W(t) whenever W(τ) ≥ K_{Wz}L², ∀τ ∈ [t, t + 2T]. Both K_{Wz} and L are independent of initial conditions. However, T_large can depend on initial conditions. This concludes the proof. □
Theorem 9.2 (i) ‖φ‖ ∈ L∞, (ii) v, v_f, v_m ∈ L∞, (iii) y ∈ L∞, (iv) u ∈ L∞, (v) e_a ∈ L∞, (vi) ζ ∈ L∞, and (vii) θ̂̇ ∈ L∞.

Proof. The result (i) follows from Theorem 5.1(iii). The result (ii) follows from assumption (A4) and Theorem 5.1(i),(ii). The result (iii) follows since y = φᵀθ + v_f + ε_{2d0}, and ‖θ‖ ≤ M. Lemma 5.1 implies that (iv) is true, and since e_a = −θ̃ᵀφ + v_f + ε_{2d0}, and ‖θ̃‖ ≤ M, (v) holds true. The result (vi) follows since ζ = h₁ᵀ exp[Q₁t] ∗ (Hᵀθ̃), and ‖Hᵀθ̃‖ ≤ ᾱC|e_a|. Finally, result (vii) follows from property (P1) of the parameter estimator. □
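Properties (P1)–(P2) invoked here are standard for the radial projection onto the ball ‖θ̂‖ ≤ M. The sketch below is an assumption about the estimator's form (the paper defines its estimator earlier, outside this section); it checks numerically that the projected update never increases V = ‖θ̃‖² faster than the unprojected one, provided the true parameter lies in the ball:

```python
import numpy as np

def proj(theta_hat, v, M):
    """Radial projection: if theta_hat sits on the boundary of ||.|| <= M and
    the update v points outward, remove the outward radial component of v."""
    if theta_hat @ theta_hat < M**2 or theta_hat @ v <= 0:
        return v
    return v - (theta_hat @ v) / (theta_hat @ theta_hat) * theta_hat

rng = np.random.default_rng(0)
M = 2.0
for _ in range(1000):
    theta = rng.normal(size=3)
    theta *= rng.uniform(0, M) / np.linalg.norm(theta)   # true parameter in the ball
    th = rng.normal(size=3)
    th *= M / np.linalg.norm(th)                          # estimate on the boundary
    v = rng.normal(size=3)
    tilde = th - theta
    # (P2)-type inequality: tilde' Proj(th, v) <= tilde' v
    assert tilde @ proj(th, v, M) <= tilde @ v + 1e-12
```

The inequality holds because on the boundary θ̃ᵀθ̂ = M² − θᵀθ̂ ≥ 0, so the subtracted radial term is nonnegative.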
10 Performance for a Nominal System

We now show that if the system has no unmodeled dynamics and disturbances, then exact asymptotic tracking is achieved.

Theorem 10.1 If K_v = 0, k_v = 0, and μ_m = 0, then |y(t) − y_m(t)| → 0 as t → ∞.
Proof. It suffices to show e_a(t) → 0 and ζ(t) → 0 as t → ∞, since e₁(t) = e_a(t) − ζ(t). Note that since K_v = 0, k_v = 0, μ_m = 0, we have v_f(t) ≡ 0. So, by Lemma A.2, V̇ ≤ −α e_a²/n + c exp[−2d₀t]. This implies, since ‖θ̃‖ ≤ M, that

∫₀^∞ ( e_a²(τ)/n(τ) ) dτ ≤ M²/α + k/(2d₀) .  (35)
Boundedness of n(·) then implies e_a ∈ L₂. Recall that e_a ∈ L∞. Now e_a = −θ̃ᵀφ + ε_{2d0} implies |ė_a| ≤ ‖θ̂̇‖‖φ‖ + ‖θ̃‖‖φ̇‖ + c₀ exp[−2d₀t], which implies ė_a ∈ L∞. Therefore, by Barbalat's lemma (Popov [1, p. 211], or Sastry and Bodson [17, p. 19]), e_a(t) → 0 as t → ∞. Now, recall that ζ(t) = h₁ᵀ exp[Q₁t] ∗ (H(t)θ̃(t)), and ‖Hθ̃‖ ≤ ᾱC|e_a|, which implies

|ζ(t)| ≤ Cᾱ ∫₀ᵗ exp[−a(t − τ)/2] |e_a(τ)| dτ .

Using the just established fact that e_a → 0, we will show that ζ → 0. Let e_am := sup{|e_a(t)| : t ≥ 0}. Pick a δ > 0. Then there exists a T₁ < ∞ such that (i) Cᾱ|e_a(t)| < aδ/4, ∀t ≥ T₁, and (ii) 2Cᾱ e_am exp[−aT₁/2]/a < δ/2. Hence, ∀t ≥ 2T₁, we have

|ζ(t)| ≤ Cᾱ e_am ∫₀^{t/2} exp[−a(t − τ)/2] dτ + (aδ/4) ∫_{t/2}^{t} exp[−a(t − τ)/2] dτ ≤ 2Cᾱ e_am exp[−at/4]/a + δ/2 ≤ 2Cᾱ e_am exp[−aT₁/2]/a + δ/2 < δ ,

thus proving that ζ(t) → 0 as t → ∞. □
11 Robust Performance

We now show that the performance of the adaptive tracker, as measured by the mean-square tracking error, is robust, in that it is quadratic (hence also continuous) in the magnitude of the unmodeled dynamics and bounded disturbances.

Theorem 11.1

lim sup_{T₂→∞} (1/T₂) ∫₀^{T₂} (y(t) − y_m(t))² dt ≤ c(K_v² + k_v² + μ_m²) .
Proof. From Lemma A.2, we have V̇ ≤ −α(e_a² − v_f²)/n + c₀ exp[−2d₀t], which implies

(1/T₂) ∫_{T_l}^{T₂} (e_a² − v_f²)/n dt ≤ V(T_l)/(αT₂) + k exp[−2d₀T_l]/T₂ , ∀T₂ ≥ T_l .  (36)

Hence,

lim sup_{T₂→∞} (1/T₂) ∫_{T_l}^{T₂} (e_a² − v_f²)/n ≤ 0 ⟹ lim sup_{T₂→∞} (1/T₂) ∫_{T_l}^{T₂} e_a²/n ≤ lim sup_{T₂→∞} (1/T₂) ∫_{T_l}^{T₂} v_f²/n .

Noting that z(t) ≥ min{z(0), K₂/g₂} =: z_min, and defining z_max := sup_{t≥T_l} z(t), by (16) we get

v_f²/z² ≤ 2(K_vf K_mz)² + 2(K_vf k_mz + k_vf + k(T_l))²/z_min² ,
which then gives

(1/T₂) ∫_{T_l}^{T₂} v_f²/n ≤ 2(K_vf K_mz)² + 2(K_vf k_mz + k_vf + k(T_l))²/z_min² , ∀T₂ ≥ T_l .

This implies

lim sup_{T₂→∞} (1/T₂) ∫_{T_l}^{T₂} e_a²/n ≤ c(K_v² + k_v² + μ_m² + k²(T_l)) ,

which then gives

lim sup_{T₂→∞} (1/T₂) ∫_{T_l}^{T₂} e_a² ≤ c(K_v² + k_v² + μ_m² + k²(T_l)) ,

where we recall that k(T_l) depends only on initial conditions and decays exponentially at a rate faster than exp[−2d₀T_l] as T_l increases. Finally, using the fact that e_a is bounded and the fact that the expression above is true for all T_l > 0, we get lim sup_{T₂→∞} (1/T₂) ∫₀^{T₂} e_a² ≤ c(K_v² + k_v² + μ_m²). The proportionality constant c is basically z²_max n_max and is therefore O(z²_max). From the Robust Ultimate Boundedness theorem, we know that it decreases as the unmodeled dynamics and bounded disturbance decrease. This means that the right-hand side in the expression above indeed goes to zero as the unmodeled dynamics and bounded disturbances go to zero, thereby giving us robust performance.

Now consider the swapping term. Recall from the proof of Theorem 6.1 that
∫₀ᵗ ζ²(t′) dt′ ≤ (ᾱC)² ∫₀ᵗ ( ∫₀^{t′} exp[−a(t′ − τ)/2] |e_a(τ)| dτ )² dt′ ≤ (8/a²)(ᾱC)² ∫₀ᵗ e_a²(τ) dτ ,

where the second inequality follows from Cauchy–Schwarz. This implies that

lim sup_{T₂→∞} (1/T₂) ∫₀^{T₂} ζ²(t′) dt′ ≤ (8/a²)(ᾱC)² (K_v² + k_v² + μ_m²) c .

Finally, recalling that e₁(t) = e_a(t) − ζ(t) yields the desired result. □
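The convolution estimate above can also be checked numerically. By Young's inequality, the exponential kernel exp[−at/2] has L₁ norm 2/a, so the L₂ norm of the convolution is bounded by (2/a)‖e_a‖₂ — comfortably inside the looser constant used in the proof. A discretized sketch, with an arbitrary stand-in signal for e_a:

```python
import numpy as np

a = 4.0
dt = 1e-3
t = np.arange(0.0, 20.0, dt)
e = np.exp(-0.1 * t) * np.sin(3.0 * t)   # arbitrary stand-in for e_a

# conv(t') = integral_0^{t'} exp[-a(t' - tau)/2] |e(tau)| d tau
kernel = np.exp(-a * t / 2.0)
conv = np.convolve(kernel, np.abs(e))[: len(t)] * dt

lhs = np.sum(conv ** 2) * dt                 # integral of the squared convolution
rhs = (2.0 / a) ** 2 * np.sum(e ** 2) * dt   # Young's bound: (4/a^2) * integral e^2
assert lhs <= rhs
print(lhs, rhs)
```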
12 Simulation Examples
We now present two simulation examples to illustrate the results. The first example is the same as the one considered in [12], except that we do not assume knowledge of the (nominal) plant gain, and that we also add a bounded disturbance to the output. The actual plant is unstable and has a fast nonminimum-phase zero. This plant is modeled as a nominal plant which is minimum-phase and unstable, with a multiplicative uncertainty.

Example 12.1 The true system is given by

y(t) = (1/(s² + a₁s + a₀)) (1 + μ Δ_m(s)) u(t) + w(t) ,  (37)

where 0 < μ < 1 and w(t) is a bounded disturbance. The reference model to be matched is

y_m(t) = (1/((s + 1)(s + 2))) r(t) ,  (38)

where r(t) = 10 sin(0.5t). We consider a nominal model of the form y(t) = (1/(s² + a₁s + a₀)) u(t), which is parametrized using λ₁(s) = (s + 1), λ₂(s) = (s + 1)² (so that Λ(s) = (s + 1)³), which results in F(s) = (s + 4), G(s) = (7s + 1), and θ = (−6, 7, 3, 1)ᵀ. The simulation results with θ̂(0) = (−4, 3, 6, 4)ᵀ, μ = 0.02, w(t) a unit square wave with period 10, parameter estimator constants α = 4.0, γ = 1 and b_min = 0.5, and constants d₀ = 0.7, d₁ = 1.0, m(0) = 2.0 for the overbounding signal m(t), are given in Figs. 1–3. Despite the presence of unmodeled dynamics and a bounded output disturbance, the plant output approximately tracks the given reference model output over the time interval considered. □
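The qualitative behavior — bounded parameter estimates and approximate tracking — can be reproduced with a much simpler loop. The sketch below is not the algorithm of this paper (it omits the normalization, the overbounding signal m, and the filters): it is plain gradient model-reference adaptation for a hypothetical first-order unstable plant ẏ = a_p y + b_p u against the model ẏ_m = a_m(r − y_m), with a radial projection keeping ‖θ̂‖ ≤ M; all numerical values are illustrative assumptions.

```python
import numpy as np

def simulate(T=100.0, dt=1e-3, gamma=1.0, M=10.0):
    a_p, b_p = 1.0, 1.0       # hypothetical unstable plant: dy/dt = a_p*y + b_p*u
    a_m = 2.0                 # reference model: dym/dt = a_m*(r - ym)
    y = ym = 0.0
    theta = np.zeros(2)       # adapts toward (a_m/b_p, -(a_p + a_m)/b_p) = (2, -3)
    errs = []
    n = int(T / dt)
    for k in range(n):
        r = 10.0 * np.sin(0.5 * k * dt)
        u = theta[0] * r + theta[1] * y
        e = y - ym
        errs.append(e)
        dtheta = -gamma * e * np.array([r, y])          # Lyapunov-rule update
        if theta @ theta >= M**2 and theta @ dtheta > 0:
            # radial projection: keep the estimate inside the ball ||theta|| <= M
            dtheta -= (theta @ dtheta) / (theta @ theta) * theta
        theta = theta + dt * dtheta
        y += dt * (a_p * y + b_p * u)
        ym += dt * a_m * (r - ym)
    errs = np.array(errs)
    return np.sqrt(np.mean(errs[3 * n // 4:] ** 2)), theta

tail_rms, theta = simulate()
assert np.linalg.norm(theta) <= 10.0 + 1e-6   # estimates stay bounded
assert tail_rms < 0.5                          # approximate tracking
print(tail_rms, theta)
```

With this update law the function V = e²/2 + θ̃ᵀθ̃/(2γ) is nonincreasing, so ∫e² is finite and the tail RMS of the tracking error is small.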
Fig. 1. Simulation results for Example 1: reference model output y_m and plant output y; plant output y (including initial transient).
Fig. 2. Simulation results for Example 1: the overbounding signal m; bounded disturbance; filtered unmodeled dynamics; filtered bounded disturbance.
Fig. 3. Simulation results for Example 1: plant input u; filtered unmodeled dynamics with bounded disturbance v_f; parameter error (components 1 & 2); parameter error (components 3 & 4).
The next simulation example illustrates a wider class of unmodeled dynamics allowed in our algorithm. Specifically, the true system is a nonlinear, time-varying plant which has a linear time-invariant (LTI) part that is strictly proper and unstable. This LTI part is considered to be the nominal plant, and the nonlinear, time-varying term is treated as unmodeled dynamics. It also includes a constant disturbance term and an exponentially decaying term to allow for initial conditions.

Example 12.2 The true system is given by
y(t) = (1/(s − 1)) [ u(t) + μ ( k_y(t)(s + 2) y(t) + k_u(t)(1/(s + 3)) u(t) ) ] + 2 + 3 exp[−t/2] ,  (39)

where k_y(t) = sin(t/20), k_u(t) = a unit square wave of period 10, and μ = 0.05. The reference model to be matched is

y_m(t) = (1/(s + 4)) r(t) ,  (40)

where r(t) = 10 sin(0.5t). The nominal model considered is of the form y(t) = (b/(s + a)) u(t), which is parametrized using λ₁(s) = 1, λ₂(s) = (s + 1) (so that Λ(s) = (s + 1)), which results in F(s) = 1, G(s) = 2, and θ = (2, 1)ᵀ. The simulation results with θ̂(0) = (−1, 2)ᵀ and the parameter estimator constants α = 4.0, γ = 1 and b_min = 0.2 are given in Figs. 4 and 5. For the overbounding signal m(t), d₀ = 0.7, d₁ = 1.0 and m(0) = 2.0. Again, approximate tracking is achieved. The main cause of the lack of tracking accuracy appears to be the constant disturbance of magnitude 2, as is clear by looking at Fig. 6, which displays simulation results with the same v(t) as above but with μ = 0. Finally, if both the constant disturbance and unmodeled dynamics are set to zero and v(t) = 3 exp[−t/2], the adaptive control algorithm does indeed result in asymptotically perfect tracking. This is clear by looking at Fig. 7. □
Fig. 4. Simulation results for Example 2: reference model output y_m and plant output y; unmodeled dynamics v; plant input u; tracking error e₁.
Fig. 5. Simulation results for Example 2: the overbounding signal m; filtered unmodeled dynamics v_f; parameter error (components 1 and 2).
Fig. 6. Simulation results for Example 2 with v(t) = 2 + 3 exp[−t/2]: reference model output y_m and plant output y; tracking error e₁.
Fig. 7. Simulation results for Example 2 with v(t) = 3 exp[−t/2]: reference model output y_m and plant output y; tracking error e₁.
13 Concluding Remarks

In this paper, we have obtained boundedness and performance for continuous-time plants of arbitrary relative degree, with a somewhat wider class of unmodeled dynamics than in [12], but without any extra modifications except projection. Unlike [14], we allow non-differentiable bounded disturbances and non-differentiable reference inputs. We also allow some time-varying and nonlinear uncertainties. The nominal plant is, however, restricted to be minimum-phase. We have shown that eventually all the signals enter a compact set, the size of which is independent of initial conditions. Also, the upper bounds on the size of allowable unmodeled dynamics are independent of initial conditions. Our results thus show that the projection mechanism alone is sufficient to guarantee robust boundedness and robust performance, at least with respect to small unmodeled dynamics and bounded disturbances. It is important to study the dependence of the bounds on the parameters of the nominal plant, the constants defining the unmodeled dynamics, initial conditions, etc. It is also important to reevaluate the various robustness modifications which have been proposed earlier, to examine the amount of robustness they provide and the performance guaranteed in the presence of unmodeled dynamics and disturbances, and thus to determine whether they actually provide some improvement over employing just the projection mechanism.

Acknowledgements. We thank Petros Ioannou, Jean-Baptiste Pomet, and Laurent Praly for helpful discussions.
References
1. V. M. Popov, Hyperstability of Control Systems, Springer-Verlag, Berlin, 1973.
2. C. A. Desoer and M. Vidyasagar, Feedback Systems: Input-Output Properties, Academic Press, New York, 1975.
3. K. S. Narendra and L. S. Valavani, "Stable adaptive controller design - Direct control", IEEE Trans. Aut. Control, vol. 23, pp. 570-583, 1978.
4. A. Feuer and A. S. Morse, "Adaptive control of SISO linear systems", IEEE Trans. Aut. Control, vol. 23, pp. 557-569, 1978.
5. B. Egardt, Stability of Adaptive Controllers, Lecture Notes in Control and Info. Sciences, vol. 20, Springer-Verlag, Berlin, 1979.
6. A. S. Morse, "Global stability of parameter-adaptive control systems", IEEE Trans. Aut. Control, vol. 25, pp. 433-439, 1980.
7. G. Kreisselmeier and K. S. Narendra, "Stable MRAC in the presence of bounded disturbances", IEEE Trans. Aut. Control, vol. 27, pp. 1169-1175, 1982.
8. L. Praly, "Robustness of model reference adaptive control", K. S. Narendra, ed., Proc. 3rd Yale Workshop on Adaptive Systems, pp. 224-226, 1983.
9. L. Praly, "Robust model reference adaptive controllers, part I: stability analysis", Proc. 23rd IEEE Conf. Dec. Control, Dec. 1984.
10. L. Praly, "Global stability of a direct adaptive control scheme which is robust w.r.t. a graph topology", Adaptive and Learning Systems: Theory and Applications, K. S. Narendra, ed., Plenum Press, New York, 1986.
11. G. Kreisselmeier and B. D. O. Anderson, "Robust model reference adaptive control", IEEE Trans. Aut. Control, vol. 31, pp. 127-133, 1986.
12. P. A. Ioannou and K. S. Tsakalis, "A robust direct adaptive controller", IEEE Trans. Aut. Control, vol. 31, pp. 1033-1043, 1986.
13. G. C. Goodwin and D. Q. Mayne, "A parameter estimation perspective of continuous MRAC", Automatica, vol. 23, pp. 57-70, 1987.
14. P. A. Ioannou and J. Sun, "Theory and design of robust direct and indirect adaptive-control schemes", Int. J. Control, vol. 47, pp. 775-813, 1988.
15. J.-B. Pomet and L. Praly, "Adaptive nonlinear regulation: equation error from the Lyapunov equation", Proc. 28th IEEE Conf. Dec. Control, pp. 1008-1013, Tampa, FL, 1989.
16. L. Praly, S.-F. Lin and P. R. Kumar, "A robust adaptive minimum variance controller", SIAM J. Control Optimiz., vol. 27, pp. 235-266, 1989.
17. S. Sastry and M. Bodson, Adaptive Control: Stability, Convergence, and Robustness, Prentice-Hall, Englewood Cliffs, NJ, 1989.
18. G. Tao and P. A. Ioannou, "Robust stability and performance improvement of discrete-time multivariable adaptive control systems", Int. J. Control, vol. 50, pp. 1835-1855, 1989.
19. B. E. Ydstie, "Stability of discrete MRAC - revisited", Syst. Control Lett., vol. 13, pp. 429-438, 1989.
20. S. M. Naik and P. R. Kumar, "A robust adaptive controller for continuous-time systems", submitted to the 1991 Amer. Control Conf., 1991.
Appendix A

Lemma A.1 Consider the system ẋ = A_w x + b_w w_in, w_out = h_wᵀ x, with zero initial conditions, where (A_w, b_w, h_wᵀ) is a minimal representation of H₀(s), H₀(s) is strictly proper and of relative degree one, and B′(s − p₀) is Hurwitz. If |w_in| ≤ K_i m + k_i + c₀ exp[−pt] and |w_out| ≤ K_o m + k_o + c₀ exp[−pt], then ‖x‖ ≤ K_x m + k_x + c₀ exp[−pt] for some positive constants K_x and k_x.

Proof. Without loss of generality, suppose

H₀(s) = B′(s) / ( Π_{i=1}^{k} (s² + a_{i1}s + a_{i2}) Π_{j=1}^{n−2k} (s + a_j) ) .

Since H₀(s) is minimal, the corresponding states are the states corresponding to w_in/(s² + a_{i1}s + a_{i2}), i = 1, …, k, and w_in/(s + a_j), j = 1, …, n − 2k. Now,

(1/(s + a_l)) w_in(t) = ( Π_{i=1}^{k} (s² + a_{i1}s + a_{i2}) Π_{j≠l} (s + a_j) / B′(s) ) w_out(t) .

Using Lemma 5.2(ii) (since B′(s − p₀) is Hurwitz), we get |(1/(s + a_l)) w_in(t)| ≤ c m(t) + c + c₀ exp[−pt], l = 1, …, n − 2k. Define

w_l(t) := (1/(s² + a_{l1}s + a_{l2})) w_in(t) =: H_l(s) w_in(t) .

Then, since

w_l(t) = ( Π_{i≠l} (s² + a_{i1}s + a_{i2}) Π_{j=1}^{n−2k} (s + a_j) / B′(s) ) w_out(t) =: H_{l,out}(s) w_out(t) ,

using Lemma 5.2(i) (since B′(s − p₀) is Hurwitz), we get |w_l(t)| ≤ c m(t) + c + c₀ exp[−pt], l = 1, …, k. Further, since H_{l,out}(s) is strictly proper, Lemma 5.2(ii) gives |ẇ_l(t)| = |s H_{l,out}(s) w_out(t)| ≤ c m(t) + c + c₀ exp[−pt]. Since w_l(t) and ẇ_l(t) are the states corresponding to H_l(s) w_in(t), we are done. □
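The recurring condition that B′(s − p₀) be Hurwitz amounts to requiring every root r of B′ to satisfy Re r < −p₀ (substitute s = r + p₀). A quick numerical check, with an arbitrary example polynomial:

```python
import numpy as np

def shifted_hurwitz(coeffs, p0):
    """True iff B'(s - p0) is Hurwitz, i.e. every root r of B' has Re(r) < -p0."""
    return bool(np.all(np.real(np.roots(coeffs)) < -p0))

# Example: B'(s) = (s + 1)(s + 3) = s^2 + 4s + 3.
assert shifted_hurwitz([1, 4, 3], 0.5)        # roots -1, -3: margin 0.5 holds
assert not shifted_hurwitz([1, 4, 3], 2.0)    # root -1 violates margin 2
```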
Lemma A.2 Define V(t) := ‖θ̃(t)‖² for the parameter estimator with projection. Then we have

V̇(t) ≤ −α (θ̃ᵀ(t)φ(t))²/n(t) + α v_f²(t)/n(t) + α c₀ M exp[−2d₀t] .  (41)

Furthermore, (41) also holds when the term θ̃ᵀ(t)φ(t) on the right-hand side is replaced by e_a(t).

Proof. For notational simplicity, let c₂(t) := α φ(t) e_a(t)/n(t). First, consider the parameter estimator when the projection is not used, i.e. θ̂̇(t) = c₂(t). Then, recalling that e_a(t) = −θ̃ᵀ(t)φ(t) + v_f(t) + ε_{2d0}(t), we obtain

V̇(t) = 2 θ̃ᵀ(t) α φ(t) e_a(t)/n(t) = (α/n(t)) [ −2(θ̃ᵀ(t)φ(t))² + 2(θ̃ᵀ(t)φ(t)) v_f(t) + 2(θ̃ᵀ(t)φ(t)) ε_{2d0}(t) ] ≤ −(α/n(t)) [ (θ̃ᵀ(t)φ(t))² − v_f²(t) ] + α c₀ M exp[−2d₀t] .

Now, consider the parameter estimator with projection. Using property (P2) of the projection, we get

V̇ = 2 θ̃ᵀ θ̂̇ = 2 θ̃ᵀ Proj(θ̂, c₂) ≤ 2 θ̃ᵀ c₂ ,

and so the bound for the estimator without projection still holds. □

Appendix B
In this section we will assume that (i) t ≥ T_l, and (ii) W(·) ≥ K_{Wz}L², which implies that z(·) ≥ L/2.

Lemma B.1 (i) ‖φ‖/z, ‖φ_y‖/z, ‖φ_u‖/z, ‖ψ‖/z ≤ c.
(ii) ‖φ^{(i)}(t)‖/z ≤ c aⁱ, i = 1, …, n − m.

(iii) |d/dt (y − v_f)| ≤ k_y z + k(T_l), for some k_y > 0, where k(T_l) is a positive constant which decreases exponentially with increasing T_l.

(iv) |d/dt (u − r′/b_m + (θ_n/b_m) v_f)| ≤ k_u z + k_{vz} + k(T_l), for some k_u > 0, where k(T_l) is a positive constant which decreases exponentially with increasing T_l, and k_{vz} is a positive constant which is a weighted linear combination of K_v, k_v and μ_m.

Proof. The result (i) follows from Theorems 5.1(iii),(v),(vi) and 6.1(ii), while (ii) follows since

ψᵀ = ( (1/(s + a)^{2n−m−1}) y, …, (1/(s + a)^{n−m}) y, (1/(s + a)^{2n−m−1}) u, …, (1/(s + a)^{n−m}) u ) .

For (iii), note that y − v_f = ψᵀθ + ε_{d0}, so that by (ii) above |d/dt (y − v_f)| ≤ c M z + k(T_l). For (iv), recall that the control law (6) expresses u as a combination of r′, φ_u, φ_y and y − v_f weighted by the parameter estimates, divided by the gain estimate, which is bounded below by b_min > 0; denote the numerator by RHE, so that u = (RHE)/b_m. Differentiating u − r′/b_m + (θ_n/b_m)v_f and using the update law produces terms of the form θ̂̇ᵀφ, θ̂ᵀφ̇, and θ̂_n (d/dt)(y − v_f). Now,

‖θ̂̇_uᵀφ_u‖ ≤ ‖θ̂̇‖ ‖φ_u‖ ≤ α ‖φ‖² |e_a|/n ≤ c |e_a| ≤ c z + k_{vz} + k(T_l) .

Similar analysis, using ‖θ̂‖ ≤ M and b_m ≥ b_min > 0, bounds the remaining terms by c z + k_{vz} + k(T_l). Finally, using these results and (i), (ii), (iii), we get the desired result. □
For future use, define k_{nz}(T) := 1/δ̄(T) if I(t) = 1, and k_{nz}(T) := 1/δ(T) if I(t) = 0 and I(t + t′) = 1 for some t′ ∈ (0, T].

Lemma B.2 For instants t such that I(t + t′) = 1 for some t′ ∈ [0, T],

| d/dt ( θ̃ᵀ(t)φ^{(i)}(t) / z(t) ) | ≤ k_i(T) , i = 0, …, n − m − 1 .  (42)
Proof. Note that

| d/dt ( θ̃ᵀφ^{(i)} / z ) | ≤ ‖θ̂̇‖ ‖φ^{(i)}‖/z + ‖θ̃‖ ‖φ^{(i+1)}‖/z + ‖θ̃‖ ‖φ^{(i)}‖ |ż| / z² .

Now, the parameter-update law yields

‖θ̂̇‖ ≤ α (‖φ‖/z) ( (|θ̃ᵀφ| + |v_f|)/z ) ( z²/n ) , and z²(t)/n(t) ≤ k_{nz}(T) .

This implies that ‖θ̂̇‖ ≤ c̄(T). Finally, using (16), (22) and Lemma B.1(ii), we get the desired result. □
Lemma B.3 For instants t such that I(t + t′) = 1 for some t′ ∈ [0, T],

| d/dt ( θ̃ᵀφ^{(n−m)} / z ) | ≤ k_{n−m}(T) .  (43)

Proof. Since ψ = (s + a)^{n−m} φ, we have φ^{(n−m)} = ψ − Σ_{i=0}^{n−m−1} c_i φ^{(i)}, where the c_i are the appropriate constants which depend on a. Using the control law (6), we get

θ̃ᵀψ = θ̂ᵀψ − θᵀψ = r′ − θᵀψ = (1 − θ_{2n}/b_m) r′ + (θ_n − θ_{2n}θ_n/b_m) v_f − (θ_uᵀφ_u + θ_yᵀφ_y) − θ_n(y − v_f) − θ_{2n}(u − r′/b_m + (θ_n/b_m) v_f) .

Applying Lemmas B.1 and B.2 to the above equation yields the desired result, using (22). □
Let T′ ≤ T. Assume that I(t + t′) = 1 for some t′ ∈ [T′, T]. Then, from Lemma A.2, we have

(1/T′) ∫_t^{t+T′} (θ̃ᵀφ)²/z² dτ ≤ sup (n/z²) [ (V(t) − V(t + T′))/(αT′) + (1/T′) ∫_t^{t+T′} (v_f²/n) dτ + (c₀M/(2d₀T′)) exp[−2d₀t] ] ≤ (1/T′) [ 4M²/α + c₀M/(2d₀) ] k_{nz}(T) + ( K_vf K_mz + (K_vf k_mz + k_vf + k(T_l))/(L/2) )² k_{nz}(T) ,

where the supremum above is taken over the interval [t, t + T′]. Defining

μ := K_vf K_mz + ( K_vf k_mz + k_vf + k(T_l) ) / (L/2) ,

the above inequality becomes

(1/T′) ∫_t^{t+T′} (θ̃ᵀφ)²/z² dτ ≤ ν₃/T′ + μ² k_{nz}(T) .

Subsequently we will apply Lemma B.4 to this inequality. The important point to note is that μ can be made as small (but positive) as we please by making K_vf small enough and L large enough, and K_vf can in turn be made as small as desired by making K_v and μ_m appropriately small.

Remark. The necessity of the constraint 1 ≤ T′ in Lemma B.4 will become clear in the proofs which follow. □
Lemma B.4 (i) If 1 ≤ T′ ≤ T, and

c₁ ≤ ν₃/T′ + μ² c₂(T) ,  (44)

then

∫_t^{t+T′} |θ̃ᵀφ|/z dτ ≤ ν₃/ε + [ √μ 2ν₄(T)/ε + ν₅(T)/√ε + ν₆(T)ε^p + 4a^{n−m} k(μ, L) ] T′ ,

where ε ∈ (0, 1], p = 2^{−(n−m+1)}, and

k(μ, L) := (2k_z/L)(1 + |θ_{2n}| M / b_min) + μ ( |θ_n| + |θ_{2n}|/b_min ) .  (45)

(ii) If 1 ≤ T′ ≤ T, and

c₁ ≤ ν₃/T′ + μ² c₂(T) ,  (46)

then

∫_t^{t+T′} |θ̃ᵀφ|/z dτ ≤ ν₃/ε + ν₅(T) ε^p T′ ,

where ε ∈ (0, 1]. Proof. The result (i) is an extension of the proof of Theorem C1 of [12], which uses Lemmas B.1, B.2, B.3, and B.5, B.6, B.7 below. The result (ii) is a special case of the proof of (i). The complete proof of (i) is presented following the proof of Lemma B.7. □
Lemma B.5 Consider t such that I(t + t′) = 1 for some t′ ∈ [0, T]. If |θ̃ᵀφ|/z ≤ √ε for any ε ∈ [0, 1], then

‖θ̂̇‖ ≤ k_T ( √ε + μ ) ,

where k_T := α √k_{nz}(T).

Proof.

‖θ̂̇‖ ≤ α ‖φ‖ |e_a| / n ≤ α √k_{nz}(T) [ |θ̃ᵀφ|/z + K_vf K_mz + ( K_vf k_mz + k_vf + k(T_l) ) / (L/2) ] ,

which (letting μ be as in Table D) is equivalent to

‖θ̂̇‖ ≤ α √k_{nz}(T) ( √ε + μ ) . □
Lemma B.6 If |θ̃ᵀφ|/z ≤ √ε, ∀t ∈ [t_j, t_j + Δt], ∀ε ∈ (0, 1], where t_j ≥ T_l and Δt ≥ 1, then there exist constants λ_i(T) > 0 and continuous functions f_i(ε, T), with 0 < f_i(ε, T) = O(1/ε), such that

|θ̃ᵀφ^{(i)}(t)| / z(t) ≤ λ_i(T) ε^{a_i} + μ f_i(ε, T) , ∀t ∈ [t_j, t_j + Δt] ,  (47)

where a_i = 2^{−(i+1)}, i = 1, 2, …, n − m − 1, and

|θ̃ᵀφ^{(n−m)}| / z ≤ λ_{n−m}(T) ε^{a_{n−m}} + μ f_{n−m}(ε, T) + k(μ, L) .  (48)
Proof (by induction). Consider an instant t ∈ [t_j, t_j + Δt], and δ > 0 such that t + δ ∈ [t_j, t_j + Δt] (see Fig. 8). Then,

∫_t^{t+δ} (θ̃ᵀφ^{(i)}/z) dτ = [ θ̃ᵀφ^{(i−1)}/z ]_t^{t+δ} − ∫_t^{t+δ} (d/dτ)(θ̃ᵀ/z) φ^{(i−1)} dτ , i = 1, …, n − m .  (49)

Assume (47) holds for (i − 1). Next note that z(·) ≥ L/2 implies −2K_z z ≤ −g₂z + K₂ ≤ ż ≤ g_z z + k_z + k(T_l) ≤ 2K_z z, which further implies that |ż/z| ≤ 2K_z. Using this, the fact that ‖θ̂̇‖ ≤ k_T(√ε + μ) (where k_T := α√k_{nz}(T)), and Lemma B.1, we have from (49)

| ∫_t^{t+δ} (θ̃ᵀφ^{(i)}/z) dτ | ≤ (2 + 2K_z δ)( λ_{i−1}(T) ε^{a_{i−1}} + μ f_{i−1}(ε, T) ) + δ k_T ( √ε + μ ) ,  (50)

for i = 1, …, n − m, with f₀(ε, T) = 0, λ₀ = 1, and a₀ = 1.

Fig. 8. Interval considered for analysis.

Using the mean value theorem and the continuity of θ̃ᵀφ^{(i)}/z, i = 1, …, n − m − 1 (Lemma B.2), we get

∫_t^{t+δ} (θ̃ᵀφ^{(i)}/z) dτ = δ ( θ̃ᵀ(τ)φ^{(i)}(τ)/z(τ) ) |_{τ = t+δ*} , for some δ* ∈ [0, δ] .  (51)

Substituting (51) into (50) and taking δ = ε^{a_i}, we have

| θ̃ᵀ(τ)φ^{(i)}(τ) | / z(τ) |_{τ=t+δ*} ≤ 2(1 + K_z δ)( λ_{i−1}(T) ε^{a_i} + μ ε^{−a_i} f_{i−1}(ε, T) ) + k_T ( √ε + μ ) , i = 1, …, n − m − 1 .  (52)
This follows because a_i = ½ a_{i−1}, so that ε^{a_{i−1}} = ε^{a_i} ε^{a_{i−1} − a_i}. Since √ε ≤ ε^{a_i} ≤ 1, i = 1, …, n − m − 1, (52) may be written as

| θ̃ᵀ(τ)φ^{(i)}(τ) | / z(τ) |_{τ=t+δ*} ≤ λ̄_i(T) ε^{a_i} + μ f_i(ε, T) ,  (53)

where

λ̄_i(T) = 2(1 + K_z) λ_{i−1}(T) + k_T , f_i(ε, T) = k_T + 2(1 + K_z δ) ε^{−a_i} f_{i−1}(ε, T) , i = 1, …, n − m − 1 , with λ₀ = 1 , f₀(ε, T) = 0 .  (54)

From Lemma B.2, θ̃ᵀφ^{(i)}/z is a uniformly continuous function of time; therefore, its value in the interval [t, t + δ] will not differ from its value at t + δ* by more than k_i(T)δ = k_i(T)ε^{a_i}. Setting λ_i(T) = λ̄_i(T) + k_i(T), it follows from (53) that

| θ̃ᵀ(τ)φ^{(i)}(τ) | / z(τ) ≤ λ_i(T) ε^{a_i} + μ f_i(ε, T) , ∀τ ∈ [t, t + δ] .  (55)

Since ε ∈ (0, 1], it follows from (54) that f_i(ε, T) ≤ O(1/ε), i = 1, 2, …, n − m − 1. Consider now a set of intervals [t_k, t_k + δ], k = 0, 1, …, l, where l ≥ Δt/δ, such that t₀ = t_j, t_k ≤ t_{k+1} ≤ t_k + δ, and t_l + δ = t_j + Δt. The fact that ∪_{k=0}^{l} [t_k, t_k + δ] = [t_j, t_j + Δt], together with (55), implies (47), and therefore the proof is complete for i = 1, …, n − m − 1.

Now, let i = n − m. From (50) we have

| ∫_t^{t+δ} (θ̃ᵀφ^{(n−m)}/z) dτ | ≤ (2 + 2K_z δ)( λ_{n−m−1}(T) ε^{a_{n−m−1}} + μ f_{n−m−1}(ε, T) ) + δ k_T ( √ε + μ ) .

By Lemma B.3, θ̃ᵀφ^{(n−m)}/z is continuous.
Now, splitting ∫_t^{t+δ} (θ̃ᵀφ^{(n−m)}/z) dτ into the θ̃ᵀψ/z part and the remaining terms of Lemma B.3, each of which satisfies |·/z| ≤ k(μ, L), gives

| ∫_t^{t+δ} (θ̃ᵀφ^{(n−m)}/z) dτ | ≥ δ | θ̃ᵀ(t + δ*)ψ(t + δ*) | / z(t + δ*) − 2δ k(μ, L) .

So we have

δ | θ̃ᵀ(t + δ*)ψ(t + δ*) | / z(t + δ*) ≤ 2(1 + K_z δ)( λ_{n−m−1}(T) ε^{a_{n−m−1}} + μ f_{n−m−1}(ε, T) ) + δ k_T ( √ε + μ ) + 2δ k(μ, L) .

Letting δ = ε^{a_{n−m}} and dividing both sides by δ, we get

| θ̃ᵀ(τ)ψ(τ) | / z(τ) |_{τ=t+δ*} ≤ 2(1 + K_z δ)( λ_{n−m−1}(T) ε^{a_{n−m}} + μ ε^{−a_{n−m}} f_{n−m−1}(ε, T) ) + k_T ( √ε + μ ) + 2k(μ, L) ≤ λ̄_{n−m} ε^{a_{n−m}} + μ f_{n−m}(ε, T) + 2k(μ, L) ,  (57)
where

λ̄_{n−m} = 2(1 + K_z) λ_{n−m−1}(T) + k_T , f_{n−m}(ε, T) = k_T + 2(1 + K_z) ε^{−a_{n−m}} f_{n−m−1}(ε, T) .

Next, (57) implies

| θ̃ᵀ(t + δ*)ψ(t + δ*) | / z(t + δ*) ≤ λ̄_{n−m} ε^{a_{n−m}} + μ f_{n−m}(ε, T) + 3k(μ, L) .  (58)

By Lemma B.3, θ̃ᵀψ/z is uniformly continuous. Therefore its value in [t, t + δ] will not differ from its value at t + δ* by more than k_{n−m}(T)δ = k_{n−m}(T)ε^{a_{n−m}}. Setting λ_{n−m} = λ̄_{n−m} + k_{n−m}(T), (58) implies

| θ̃ᵀ(τ)ψ(τ) | / z(τ) ≤ λ_{n−m} ε^{a_{n−m}} + μ f_{n−m}(ε, T) + 3k(μ, L) , ∀τ ∈ [t, t + δ] .

This finally gives

| θ̃ᵀ(τ)φ^{(n−m)}(τ) | / z(τ) ≤ λ_{n−m} ε^{a_{n−m}} + μ f_{n−m}(ε, T) + 4k(μ, L) ,  (59)

thus concluding the proof. □
L e m m a B.7 l f t j >_Tt, At > 1, and
I~r¢l _<.,/7, vt e [tj, tj + ,at], Z then Igw¢] z <_ho(T)e p + # -hi(T) - - ~ +4a"-mk(l~'L)' where p = 2 - ( n - m - O , e E (0, 1] and ho(T), ha(T) are some positive constants which depend on T. n--IT~
Proof. ψ = φ + Σ_{i=1}^{n−m} c_iφ^{(i)} for some constant coefficients c_i (c_{n−m} = a_{n−m}). Therefore, by Lemma B.6 we have
$$\frac{|\tilde\theta^T\psi|}{z} \le \sum_{i=1}^{n-m-1}|c_i|\frac{|\tilde\theta^T\phi^{(i)}|}{z} + |c_{n-m}|\frac{|\tilde\theta^T\phi^{(n-m)}|}{z} + \frac{|\tilde\theta^T\phi|}{z} \le \sqrt\epsilon + \sum_{i=1}^{n-m}|c_i|\bigl(\lambda_i(T)\epsilon^{a_i} + \mu f_i(\epsilon,T)\bigr) + 4|c_{n-m}|k(\mu,L).$$
Setting h_0(T) = Σ_{i=1}^{n−m}|c_i|λ_i(T) + 1 and h_1(T) = Σ_{i=1}^{n−m}|c_i|h̄_i(T), where h̄_i(T) satisfies f_i(ε,T) ≤ h̄_i(T)/ε, and noting that 0 < ε^{a_i} ≤ ε^p, i = 1, 2, …, n − m, we get
$$\frac{|\tilde\theta^T\psi|}{z} \le h_0(T)\epsilon^p + \mu\frac{h_1(T)}{\epsilon} + 4a_{n-m}k(\mu,L). \qquad\square$$
Proof of Lemma B.4(i).

Consider [t_1, t_1 + T']. Partition this interval into ⌊T'⌋ − 1 intervals of length 1 and one interval of length ≥ 1. Let N := ⌊T'⌋. So we have ∪_{i=1}^{N}[t_i, t_{i+1}] = [t_1, t_1 + T'], such that t_{i+1} − t_i = 1, i = 1, …, N − 1, and t_{N+1} − t_N ≥ 1.

Consider the following sets:
$$\Omega_1 = \Bigl\{[t_i,t_{i+1}] : \frac{|\tilde\theta^T\phi|}{z} \le \sqrt\epsilon,\ \forall t\in[t_i,t_{i+1}]\Bigr\}, \qquad \Omega_2 = [t_1,t_1+T']\setminus\Omega_1.$$
From Lemma B.7 we have that for all t ∈ Ω_1,
$$\frac{|\tilde\theta^T\psi|}{z} \le h_0(T)\epsilon^p + \mu\frac{h_1(T)}{\epsilon} + 4a_{n-m}k(\mu,L).$$
Now, let s(t) := (θ̃^T(t)φ(t)/z(t))². Then
$$|s(t_b) - s(t_a)| = \Bigl|\int_{t_a}^{t_b}\frac{d}{dt}\Bigl(\frac{\tilde\theta^T(t)\phi(t)}{z(t)}\Bigr)^2dt\Bigr| \le 2(t_b-t_a)k_0(T)Mc. \qquad(60)$$
If [t_i, t_{i+1}] ∈ Ω_2, then s(t*) ≥ ε for some t* ∈ [t_i, t_{i+1}]. Due to (60) and the fact that k_s := max{1, 2cMk_0(T)} ≥ 1, we have (since t_{i+1} − t_i ≥ 1)
$$\int_{t_i}^{t_{i+1}}s(t)\,dt \ge \frac{\epsilon^2}{2k_s}, \qquad \forall[t_i,t_{i+1}]\in\Omega_2. \qquad(61)$$
Now let N_1, N_2 be the number of subintervals contained in Ω_1, Ω_2, respectively. Then, from (44) and (61),
$$N_2 \le \frac{2k_s}{\epsilon^2}\bigl(c_1 + \mu^2c_2(T)T'\bigr), \qquad N_1 = \lfloor T'\rfloor - N_2 \le T'.$$
Hence,
$$\int_{\Omega_1} + \int_{\Omega_2} \le N_1\Bigl(h_0(T)\epsilon^p + \mu\frac{h_1(T)}{\epsilon} + 4a_{n-m}k(\mu,L)\Bigr) + N_2MC \le T'\Bigl(h_0(T)\epsilon^p + \mu\frac{h_1(T)}{\epsilon} + 4a_{n-m}k(\mu,L)\Bigr) + \frac{2k_sMC}{\epsilon^2}\bigl(c_1 + \mu^2c_2(T)T'\bigr),$$
thereby concluding the proof of Lemma B.4(i), with
$$\nu_3 = 2k_sc_1MC, \quad \nu_4(T) = 2k_sc_2(T)MC, \quad \nu_5(T) = h_1(T), \quad \nu_6(T) = h_0(T). \qquad\square$$
Appendix C
Proof of Lemma 9.2. The proof follows along the lines of Ioannou and Tsakalis [12]. Recalling that
$$W = k_ee^TPe + \tfrac12z^2, \qquad(62)$$
and that (9) implies ż ≤ −(d_0 + g_2)z + d_1(|u| + |e_a| + 1) + K_2, we have
$$\dot W = k_e(\dot e^TPe + e^TP\dot e) + z\dot z \le k_e(\dot e^TPe + e^TP\dot e) + \bigl(-(d_0+g_2)z + d_1(|u|+|e_a|+1) + K_2\bigr)z. \qquad(63)$$
Now, (i) e_a = −θ̃^Tφ̄ + v_1 + ε_{2d_0}, which implies
$$|e_a| \le |\tilde\theta^T\bar\phi| + |v_1| + c_0\exp[-2d_0T_l]. \qquad(64)$$
(ii) y = h^Tx_o + v + ε_{2d_0}, so that
$$|y| \le C\|x_o\| + |v| + c_0\exp[-2d_0T_l] \;\Rightarrow\; |y| \le C\|e\| + c + |v| + c_0\exp[-2d_0T_l],$$
and
$$|u| \le \frac{1}{\beta_{\min}}\bigl(|v| + |\theta_y^T\phi_y| + |\theta_u^T\phi_u| + |e_a| + |y|\bigr).$$
Therefore by (30), i.e., the definition of e,
$$|u| \le C\|e\| + c|v| + c + c_0\exp[-2d_0T_l]. \qquad(65)$$
Assume T_l large enough to get
$$|u| + |e_a| \le |\tilde\theta^T\bar\phi| + C\|e\| + c|v| + |v_1| + 2c. \qquad(66)$$
Using (66) and (63) we get
$$\dot W \le -k_e\|e\|^2 - (d_0+g_2)z^2 + 2k_e\|Pb_c\|\,|\tilde\theta^T\phi|\,\|e\| + 2k_e\|Pb_{c_v}\|\,|v|\,\|e\| + 2k_e\mu_m|\bar v|\,\|Pb_{c_u}\|\,\|e\| + d_1\bigl(|\tilde\theta^T\bar\phi| + C\|e\| + c|v| + |v_1| + 2c + K_2/d_1\bigr)z.$$
Now recall that
$$|v| \le K_vK_{mz}z + K_vk_{mz} + k_v + k(T_l), \qquad |v_1| \le K_{vf}K_{mz}z + K_{vf}k_{mz} + k_{vf} + k(T_l),$$
and |v̄| ≤ Cz + c + k(T_l), and let
$$\gamma_1 = 2\|Pb_c\|, \qquad \gamma_2 = 2\|Pb_{c_v}\|, \qquad \gamma_3 = 2\|Pb_{c_u}\|.$$
Substituting these bounds and letting
$$\beta_0 = d_1\bigl[CK_vk_{mz} + Ck_v + Ck(T_l) + K_{vf}k_{mz} + k_{vf} + k(T_l) + 2c + K_2/d_1\bigr],$$
$$\beta_1 = d_1C, \qquad \mu_1 = K_v\gamma_2K_{mz} + \mu_m\gamma_3C, \qquad \mu_2 = (K_vC + K_{vf})d_1K_{mz},$$
$$\bar\beta_1 = \gamma_2\bigl(K_vk_{mz} + k_v + k(T_l)\bigr) + \mu_m\gamma_3\bigl(c + k(T_l)\bigr),$$
we have
$$\dot W \le -k_e\|e\|^2 - (d_0+g_2)z^2 + k_e\gamma_1|\tilde\theta^T\phi|\,\|e\| + d_1|\tilde\theta^T\bar\phi|\,z + \mu_1\|e\|z + \mu_2z^2 + k_e\bar\beta_1\|e\| + \beta_0z.$$
Note that μ_1 and μ_2 can be made as small as we desire by making K_{vmax} and μ_max small enough. Next, we will choose k_e, K_{vmax} and μ_max (and hence μ_1, μ_2) appropriately and complete squares.
Using
$$k_e\gamma_1|\tilde\theta^T\phi|\,\|e\| = k_e\gamma_1\frac{|\tilde\theta^T\phi|}{z}\,z\|e\| \le \beta_3\sqrt{k_e}\,\frac{|\tilde\theta^T\phi|}{z}\,W, \qquad \beta_3 = \gamma_1\sqrt{\frac{2}{\lambda_{\min}(P)}},$$
and
$$d_1|\tilde\theta^T\bar\phi|\,z = d_1\frac{|\tilde\theta^T\bar\phi|}{z}\,z^2 \le 2d_1\frac{|\tilde\theta^T\bar\phi|}{z}\,W,$$
and completing the squares in the remaining terms, we get
$$\dot W \le -\frac{k_e}{2}\|e\|^2 - \frac{d_0+g_2}{2}z^2 + \beta_3\sqrt{k_e}\,\frac{|\tilde\theta^T\phi|}{z}W + 2d_1\frac{|\tilde\theta^T\bar\phi|}{z}W + \Bigl(\mu_2 + \frac{\mu_1^2}{k_e}\Bigr)z^2 + k_e\bar\beta_1^2 + \frac{\beta_0^2}{d_0+g_2}. \qquad(67)$$
Choose k_e, and K_{vmax}, μ_max, such that
$$\mu_2 + \frac{\mu_1^2}{k_e} \le \frac{d_0+g_2}{4}.$$
Then, adding and subtracting δ̄W on the right-hand side of (67), where
$$\bar\delta = \min\Bigl\{\frac{k_e}{2\lambda_{\max}(P)},\; d_0+g_2\Bigr\},$$
and letting
$$\beta_4 = k_e\bar\beta_1^2 + \frac{\beta_0^2}{d_0+g_2}, \qquad \beta_5 = 2d_1,$$
we get for each μ ∈ [0, μ_max], K_v ∈ [0, K_{vmax}],
$$\dot W \le -\bar\delta W + \beta_3\sqrt{k_e}\,\frac{|\tilde\theta^T\phi|}{z}W + \beta_5\frac{|\tilde\theta^T\bar\phi|}{z}W + \beta_4.$$
Finally, defining
$$g(t) := -\bar\delta + \beta_3\sqrt{k_e}\,\frac{|\tilde\theta^T(t)\phi(t)|}{z(t)} + \beta_5\frac{|\tilde\theta^T(t)\bar\phi(t)|}{z(t)},$$
we get
$$W(b) \le \exp\Bigl[\int_a^bg(\tau)\,d\tau\Bigr]\,W(a)\Bigl(1 + \frac{\beta_4}{\bar\delta W(a)}\Bigr),$$
since −g(τ) ≤ δ̄. This concludes the proof of Lemma 9.2. □
Table D

(Definitions of constants.) Table D collects, for reference, the definitions of the constants appearing in the statements and proofs above, among them: k_{mz}, K_{mz}, K_{uφ}, K_{uv}, k_{uu}, k_{vf}, K_{if}, K_{uf}, K_{ym}, k_{um}, K_m, K_s, k̄ = k_v + k(T_l), k_u, k_m, g_2, d_0, R_2, K_a, K_c, K_d, δ(T) = exp[−K_aT][K_{mz} + K_d/K_a] − K_d/K_a, K_{wz}, K_{yz}, K_{uz}, k_{uz}, k_{vz}, δ̄ = min{k_e/(2λ_max(P)), d_0 + g_2}, β_1 = d_1C, the horizon T, the constant ε, and the constant L (a maximum of terms involving K_2, K_{wz}, k̄, g_2 and exponentials of K_z + ε_z).
Increase L (if necessary) and choose T_l, K_2 large enough, and K_{vmax}, μ_max small enough, so that
$$\beta_2 \ge 2\bigl(K_{vf}k_{mz} + k_{vf} + k(T_l)\bigr), \qquad d_0 - \bar\delta \ge 4K_{vf}K_{mz},$$
δ(T) ≥ k̄ for some positive scalar k̄, and so that k(μ_max, L), μ_1 and μ_2 satisfy the smallness conditions required in the proofs, with μ_max also small relative to the constants β_3, ν_4(T), ν_5(T) and ν_6(T) of Lemma B.4, where
$$\mu_1 := K_v\gamma_2K_{mz} + \mu_m\gamma_3C, \qquad \mu_2 := (K_vC + K_{vf})d_1K_{mz},$$
and k(μ, L) is the constant used in Lemmas B.6 and B.7.
Stability of the Direct Self-Tuning Regulator* B. Erik Ydstie Department of Chemical Engineering, Goessmann Laboratory University of Massachusetts, Amherst, MA 01003, USA.
Abstract. The direct self-tuning regulator (STR) with parameter projection is finite gain stable when it is applied to a system where small modeling errors and bounded output perturbations are present. The modeled part of the plant is assumed to be stably invertible; the estimated parameters are projected into a compact convex set which contains a stabilizing controller and excludes the possibility of dividing with small numbers. The convexity assumption implies that we fix the sign and the lower bound for the high frequency gain of the estimated model. The extended plant state converges exponentially fast to an attractor which may be chaotic. The rate of convergence and size of ultimate bounds are estimated. The true plant with the unmodeled dynamics may have unstable zeros and there may be errors in the estimate of the delay.
1 Introduction

The Self-Tuning Regulator (STR) was proposed in [1] as a method to adaptively stabilize a linear system with unknown parameters. The algorithm converges in the ideal case [4]. However, this result is not robust in a classical sense, and unmodeled dynamics and disturbances may cause the parameter estimator to become unstable via a mechanism known as parameter drift. In the case that this drift is observable from the parameter estimator it creates a local instability which may cause small amplitude chaotic bursting. An infinite parameter drift, which is not observable, causes a global instability of the estimator and gives numerical problems. Parameter projection or similar techniques can be applied to prevent the latter type of instability [2]. The aim of the paper is to demonstrate that the first type of instability does not necessarily lead to poor performance. Indeed, we show that when parameter projection is applied to arrest the infinite drift, then the STR is robust with respect to unmodeled dynamics and bounded perturbations in the sense that we stabilize a compact set which in general contains one or more small amplitude chaotic attractors. It is claimed that the compact set containing these attractors is globally exponentially stable and that the size of the overbounding set scales with design variables, the disturbances and the unmodeled dynamics. In particular, it is shown that the

* This research was supported by the National Science Foundation under contract NSF#CTS-8903160.
algorithm in closed loop with the plant defines a nonlinear system which is finite gain stable. This property gives the algorithm a measure of robustness with respect to nonlinearities and explains why adaptive controllers work well in practical applications. The results presented in the paper extend quite readily to continuous time adaptive algorithms; algorithms based on indirect pole assignment and the extended horizon approaches can also be analyzed.
2 Setting up the Problem

We consider control of linear stationary plants
y(t) = P ( q - 1 ) u ( t - 1) + b(t) ,
(1)
where u(t), y(t), b(t) are the input, output and disturbance signals respectively. P is a linear shift-invariant operator. The objective is to stabilize (1) so that a bounded setpoint y*(t) is followed, using a certainty equivalent direct adaptive controller. To introduce the problem we will first discuss the robust performance problem when we have available a fixed reduced-order design model
$$A(q^{-1})y(t) = B(q^{-1})u(t-d) + v(t),$$
(2)
where
$$A(q^{-1}) = 1 + a_1q^{-1} + \cdots + a_nq^{-n}, \qquad B(q^{-1}) = b_0 + \cdots + b_mq^{-m}.$$
The model orders n, m, the relative degree d and the parameter vector φ = (a_1, …, a_n, b_0, …, b_m) are design parameters. The signal
$$v(t) = A(q^{-1})\bigl(\Delta(q^{-1})u(t-1) + b(t)\bigr),$$
with
$$\Delta(q^{-1}) = P(q^{-1}) - P_m(q^{-1})q^{-d+1}, \qquad P_m(q^{-1}) = \frac{B(q^{-1})}{A(q^{-1})},$$
captures the effect of truncation and parameter errors and bounded noise. The control will be computed based on the nominal model so that
$$u(t) = \frac{1}{\beta(q^{-1})}\bigl(y^*(t+d) - \alpha(q^{-1})y(t)\bigr), \qquad(3)$$
where α(q^{-1}) and β(q^{-1}) are polynomials of order n − 1 and m + d − 1 respectively, determined so that
$$1 = F(q^{-1})A(q^{-1}) + q^{-d}\alpha(q^{-1})$$
and β(q^{-1}) ≡ F(q^{-1})B(q^{-1}). This gives the closed-loop system shown in Fig. 1. Here the parameters n, m, d are fixed by design, while φ will be chosen to minimize the sensitivity of the loop. To ensure internal stability we need to perform the minimization subject to the condition that B(q^{-1}) is a Schur-Cohn polynomial, that is: B(z^{-1}) ≠ 0 for |z| ≥ σ, where 0 < σ < 1 is a real number. This restriction can be interpreted as a stable invertibility condition for the nominal plant. Note that the plant itself need not be stably invertible. Elements which have an unstable inverse may be "hidden" in the unmodeled term. This is especially important since unmodeled high-frequency dynamics invariably are present in practical applications. Also, note that in the adaptive design we will not explicitly enforce the stability property to hold for the estimated parameters. It suffices that there exist parameters such that the nominal plant is stably invertible and that the closed-loop system that results from applying a controller derived from such a model does not have "too high sensitivity". These notions will be made more explicit below.
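The Schur-Cohn condition is easy to test numerically: B(z^{-1}) ≠ 0 for |z| ≥ σ is equivalent to all zeros of the polynomial b_0z^m + ⋯ + b_m lying strictly inside the circle of radius σ. A minimal sketch (the example coefficients are illustrative, not from the text):

```python
import numpy as np

def stably_invertible(b, sigma):
    """Check B(z^-1) = b[0] + b[1] z^-1 + ... + b[m] z^-m != 0 for |z| >= sigma.

    Equivalent to: all roots of b[0]*z^m + ... + b[m] lie strictly inside
    the circle of radius sigma (0 < sigma < 1).
    """
    roots = np.roots(b)  # roots of the polynomial b[0]*z^m + ... + b[m]
    return bool(np.all(np.abs(roots) < sigma))

# B(z^-1) = 1 - 0.5 z^-1 has its zero at z = 0.5, inside radius 0.9
print(stably_invertible([1.0, -0.5], 0.9))  # True
# B(z^-1) = 1 - 1.2 z^-1 has its zero at z = 1.2, outside the unit circle
print(stably_invertible([1.0, -1.2], 0.9))  # False
```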
Fig. 1. Block diagram of nominal feedback system.
Using the definitions above we may rewrite (2) as
$$y(t) = \alpha(q^{-1})y(t-d) + \beta(q^{-1})u(t-d) + F(q^{-1})v(t),$$
and the closed-loop expression for the tracking error becomes
$$y(t) - y^*(t) = F(q^{-1})v(t). \qquad(4)$$
From the above we then have
$$u(t) = \frac{1}{\beta(q^{-1})}\Bigl(y^*(t+d) - \alpha(q^{-1})\bigl(y^*(t) + F(q^{-1})v(t)\bigr)\Bigr),$$
which we may rewrite as
$$u(t) = \frac{1}{\beta(q^{-1})}\bigl(A(q^{-1})y^*(t+d) - \alpha(q^{-1})v(t)\bigr). \qquad(5)$$
It is now clear that the truncation error satisfies
$$v(t) = \frac{A(q^{-1})\Delta(q^{-1})}{B(q^{-1})}q^{-d}\bigl(A(q^{-1})y^*(t+d) - \alpha(q^{-1})v(t)\bigr) + A(q^{-1})b(t),$$
which we write as
$$v(t) = H_0(\phi,q^{-1})y^*(t+d) + H_1(\phi,q^{-1})b(t) + \epsilon(t), \qquad(6)$$
where ε(t) is introduced to take care of non-zero initial conditions (Fig. 2),
$$H_0(\phi,q^{-1}) = \frac{\Delta(q^{-1})A(q^{-1})q^{-1}}{P_m(q^{-1}) + \Delta(q^{-1})\alpha(q^{-1})q^{-1}}$$
is the transfer function from the reference to the disturbance of the closed-loop system, and
$$H_1(\phi,q^{-1}) = \frac{B(q^{-1})}{P_m(q^{-1}) + \Delta(q^{-1})\alpha(q^{-1})q^{-1}}$$
is the closed-loop transfer function from the bounded disturbance input to the truncation error. Note that ε(t) is bounded and decays exponentially provided that the transfer function H_0(φ, q^{-1}) is exponentially stable. From the expressions above it follows that the tracking error satisfies
$$y(t) - y^*(t) = F(q^{-1})\bigl(H_0(\phi,q^{-1})y^*(t+d) + H_1(\phi,q^{-1})b(t) + \epsilon(t)\bigr),$$
and it is clear that the tracking and disturbance rejection problems can be formulated as optimization problems. In particular, by defining a set Φ̄ so that B(q^{-1})^{-1} is stable for all φ ∈ Φ̄, we solve the tracking problem by minimizing the transfer function F(q^{-1})H_0(φ, q^{-1}) over φ ∈ Φ̄ under a suitable norm. In the multivariable case it is natural to formulate this as an H_∞-type optimization problem. In this paper we work with a SISO system and we will work with an l_2-type objective instead. We are now in a position to discuss the "optimal centering problem" and we will use the solution to this to define the concept of a best nominal model. Define the "compressed" l_2-gain of a stable linear operator [10]²
Fig. 2. Block diagram of nominal feedback system.
and the exponentially weighted (truncated) σ-norm
$$\|x(t)\|_{2,\sigma}^2 = \sum_{i=1}^{t}\sigma^{t-i}x(i)^2.$$
Note that in the case σ = 1 this gives the (truncated) l_2-norm of a function. From (4), (5) and (6) we can define the optimal set of model parameters with respect to the σ-norm as
$$\phi^* = \arg\min_{\phi\in\bar\Phi}\|H_0(\phi,q^{-1})\|_{2,\sigma}.$$
For future reference let H_0^*(q^{-1}) = H_0(φ*, q^{-1}). The corresponding vector of optimal control parameters becomes
$$\theta^* = (\alpha_1^*,\ldots,\alpha_n^*,\beta_0^*,\ldots,\beta_{m+d-1}^*).$$

² The argument that follows can equally well be formulated using the "compressed" l_∞ norm
$$|H(q^{-1})|_{\infty,\sigma} = \sup_{|z|=\sqrt\sigma}\bigl(|H(z^{-1})|\bigr)$$
defined relative to the norm |x|_∞ = sup_{t≥0}|x(t)|.
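For a finite signal record, the exponentially weighted truncated norm above is a direct computation (a sketch; the indexing convention i = 1, …, t is an assumption about the intended layout):

```python
import numpy as np

def sigma_norm_sq(x, sigma):
    """Exponentially weighted (truncated) sigma-norm squared:
    ||x(t)||_{2,sigma}^2 = sum_{i=1}^{t} sigma^(t-i) * x(i)^2,
    with x given as a list x[0..t-1], where x[i-1] holds sample i."""
    x = np.asarray(x, dtype=float)
    t = len(x)
    weights = sigma ** np.arange(t - 1, -1, -1)  # sigma^(t-1), ..., sigma^0
    return float(np.sum(weights * x**2))

x = [1.0, 2.0]                # two samples
# sigma = 0.5: 0.5^1 * 1^2 + 0.5^0 * 2^2 = 0.5 + 4.0
print(sigma_norm_sq(x, 0.5))  # 4.5
# sigma = 1 recovers the truncated l2-norm squared: 1 + 4
print(sigma_norm_sq(x, 1.0))  # 5.0
```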
Application of this control gives the performance
$$\|y - y^*\|_2 \le \epsilon_0 + |F(q^{-1})H_0^*(q^{-1})|_\infty\|y^*\|_2 + |F(q^{-1})H_1^*(q^{-1})|_\infty\|b\|_2. \qquad(7)$$
where ε_0 is a constant introduced to take care of the effect of initial conditions. The parameters φ* are called "centered" since they have been tuned to give optimal performance in the sense that they minimize the sensitivity with respect to the truncation errors. These parameters, of course, are not directly available since they depend on P which is not assumed to be known exactly. Note that if ||Δ||_{2,σ} ≪ 1 then
$$\|H_0(\phi,q^{-1})\|_{2,\sigma} \approx \Bigl\|\frac{q^{-1}A(q^{-1})\bigl(P(q^{-1}) - P_m(q^{-1})q^{-d+1}\bigr)}{P_m(q^{-1})}\Bigr\|_{2,\sigma},$$
or
$$\|H_0(\phi,q^{-1})\|_{2,\sigma} \approx \|q^{-1}A(q^{-1})\Delta(q^{-1})P_m(q^{-1})^{-1}\|_{2,\sigma}.$$
The first expression indicates that the optimal centering problem is closely related, but not equivalent, to a model matching problem subject to a stability constraint. The second expression indicates that the centering problem can be interpreted as that of finding the stable inverse P_m^{-1} that, multiplied with the unmodeled dynamics, has small gain. In the certainty equivalent adaptive control strategies, like the one proposed in [1], a linear least-squares-based algorithm "estimates" the parameters of the design model and then the obtained estimates are used to compute the control. In the STR the gains were allowed to go to zero. Here we are interested in studying the dynamics of the feedback system when the gains are finite, but non-zero. To proceed with this design we assume that the disturbance b(t) and the reference y*(t) are l_∞. That is, there exist constants k_b, k_{y*} so that b(t)² ≤ k_b and y*(t)² ≤ k_{y*} for all t. We will also assume that we have available upper and lower bounds for the control parameters. To be more specific: We know Θ*, a compact convex set, so that θ* ∈ Θ* and |β_0| ≥ β_min > 0 for all vectors θ ∈ Θ*. The parameter estimates will be projected so that they belong to Θ* at all times. The convexity of Θ* implies that we fix the sign of the high frequency gain of the modeled part of the plant. The lower bound β_min is introduced to avoid division with small numbers in the control equation, and the bounds on θ are introduced to prevent infinite parameter drift. By noting that we can write the optimized model in the predictive form (2) as
$$y(t) = \phi(t-d)'\theta^* + \gamma(t), \qquad \gamma(t) = F(q^{-1})v(t),$$
the following strategy for direct adaptive control becomes obvious:

Step 0: Initialize θ(0) and φ(−i), i = 0, 1, …, d − 1, and set t = 1.
Step 1:
$$\theta(t) = P_{\Theta^*}\Bigl\{\theta(t-1) + \frac{P(t-1)\phi(t-d)\,e(t)}{r + \phi(t-d)'P(t-1)\phi(t-d)}\Bigr\}.$$
Step 2: Solve for u(t) from
$$y^*(t+d) - \phi(t)'\theta(t) = 0.$$
Step 3: Set t = t + 1 and go to Step 1.
This strategy is called a "certainty equivalent" strategy since the estimated parameters are used directly in the control law. We notice that the control equation (Step 2) is a re-statement of equation (3) using the estimated parameters. In Step 1, e(t) = y(t) − φ(t−d)'θ(t−1) is the prediction error, θ(t) = (α_1(t), …, α_n(t), β_0(t), …, β_{m+d−1}(t))' is a vector of estimated parameters, and P_{Θ*}{·} is an orthogonal projection which ensures θ(t) ∈ Θ* for all t. The parameter r ≥ 0 controls the (local) speed of adaptation. A modification can be introduced to avoid singularity when r = 0, since ||φ(t−d)|| = 0 may give division with zero.³ In [1] it was suggested to set r = λ, where 0 < λ ≤ 1 is the forgetting factor, and use recursive least squares to update P(t) so that
$$P(t) = \frac1\lambda\Bigl(P(t-1) - \frac{P(t-1)\phi(t-d)\phi(t-d)'P(t-1)}{\lambda + \phi(t-d)'P(t-1)\phi(t-d)}\Bigr).$$
The forgetting factor can be updated using the algorithm proposed in [3]. The analysis presented here works for this approach provided P(t−1) is "regularized" so that 0 < p_min I ≤ P(t−1) ≤ p_max I, where p_min ≤ p_max are positive numbers. That is, P(t−1) should be bounded and have bounded condition number. However, to simplify the algebra we will take P(t−1) = P, a fixed positive definite matrix. The algorithm described above is known as the Direct Self-Tuning Regulator with Parameter Projection. The objective of the paper is to demonstrate that this algorithm is finite gain stable provided that there exist σ < 1 and a centering so that ||H_0(φ*, q^{-1})||_{2,σ} ≤ K_v, where K_v is a constant determined in Sect. 4. In other words: The STR with parameter projection solves the optimal centering
problem.
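The certainty equivalent strategy with a fixed gain matrix P can be sketched for a first-order example. The plant coefficients, projection box and setpoint below are illustrative assumptions, not values from the paper:

```python
import numpy as np

# Minimal sketch of the projected certainty equivalent controller for a
# first-order predictor model y(t) = alpha1*y(t-1) + beta0*u(t-1),
# i.e. theta = (alpha1, beta0) and d = 1. All numbers are illustrative.
A1_TRUE, B0_TRUE = -0.8, 1.0            # "true" predictor parameters theta*
THETA_BOX = [(-2.0, 2.0), (0.2, 3.0)]   # compact convex set; beta0 >= 0.2 > 0

theta = np.array([0.0, 1.0])            # Step 0: initialize the estimate
P = np.eye(2)                           # fixed positive definite gain matrix
r = 0.1                                 # local speed of adaptation
y_star = 1.0                            # constant setpoint
y, u = 0.0, 0.0
for t in range(200):
    phi_old = np.array([y, u])          # regressor phi(t-d)
    y = A1_TRUE * y + B0_TRUE * u       # plant response
    e = y - phi_old @ theta             # prediction error
    # Step 1: gradient/least-squares update followed by projection
    theta = theta + P @ phi_old * e / (r + phi_old @ P @ phi_old)
    theta = np.array([min(max(th, lo), hi)
                      for th, (lo, hi) in zip(theta, THETA_BOX)])
    # Step 2: solve y*(t+d) - phi(t)' theta(t) = 0 for u(t)
    u = (y_star - theta[0] * y) / theta[1]

print(abs(y - y_star) < 0.1)
```

The projection onto the box keeps the estimated high-frequency gain bounded away from zero, so the control equation never divides by a small number.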
3 Technical Results

The problem with unmodeled dynamics in adaptive control has attracted considerable research attention. It is an interesting problem, mostly because neglected dynamics give rise to disturbance terms whose magnitude grows at the same rate as the process inputs, their effect persists if they have long memory, and quantitative statements about their magnitude cannot be made until the adaptive system has been shown to be stable. The standard analysis, which relies on decoupling and the use of a smallness-in-the-mean property of the squared prediction error which is independent of stability, then fails. Until recently it was thought

³ In practice a dead-zone is included and we set θ(t) = θ(t−1) whenever ||φ(t)|| ≤ Δ, where Δ is an arbitrarily small number.
that the algorithm needed to be modified to recover stability and performance. Several approaches have been suggested, including data normalization [9] and relative dead-zones [12]. Both of these methods are aimed at modifying the adaptive algorithm, and in this manner the smallness-in-the-mean property can be restored despite the presence of the unmodeled terms. A Bellman-Gronwall argument can then be applied to demonstrate stability of the closed-loop system. A recent survey of these methods is provided in [7]. In this paper we only use parameter projection, and the decoupling property does not play such an explicit role. Instead, we exploit an exponentially stabilizing property of the ideal closed-loop system and the bounded rate of growth that results from the parameter projection, and then apply the "switched system" approach presented in [14]. Sets of events where the controller destabilizes the closed loop are isolated, and bounded-rate-of-growth arguments are applied to show that on these and a certain number of preceding events the regressor overbounds the unmodeled dynamics. To motivate the development, consider an example problem which illustrates, in part, why it is difficult to analyze adaptive control systems.

Example 3.1. (Ydstie and Golden [15]). Consider adaptive control of the system
$$y(t) + a_1y(t-1) = u(t-1) + v,$$
where v is a constant. Furthermore, if the setpoint is a constant different from zero, then the adaptive system can be represented by a mapping in the plane
$$y' \to -\theta_1'y' + \gamma, \qquad \theta_1' \to \theta_1' + \frac{y'\bigl(-\theta_1'y' + \gamma - 1\bigr)}{c + y'^2}, \qquad(8)$$
where θ'_1 = θ(t) + a_1, y' = y/y*, c = r/y*², and γ = 1 + v/y*. A typical result from a simulation is shown in Fig. 3. What we notice here is the persistent oscillatory phenomenon known as "parameter drift and bursting". While the output appears to converge towards the setpoint, the parameter estimate drifts. Eventually the closed-loop system becomes unstable since a closed-loop pole migrates outside the unit circle. The output diverges, this causes excitation and the parameter estimate then retunes. This pattern continues indefinitely, and the phase plane plot in Fig. 3 indicates that there is aperiodicity and sensitivity with respect to initial conditions. □

The main issues here are the following:
- It is not possible to give a complete characterization of the dynamic behavior of even very simple adaptive control systems.
- Despite complex dynamics we have a measure of robustness and rapid decay of initial conditions.
- The average tracking error is of the order of the external noise.
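The planar map (8) is easy to iterate directly. The values of c and γ below are illustrative assumptions, chosen only to exhibit the drift-and-burst behavior described in the text; they are not taken from the paper:

```python
# Iterating the planar map (8) from Example 3.1 with illustrative parameters.
c, gamma = 0.1, 1.05      # c = r / y*^2, gamma = 1 + v / y*
y, theta = 0.5, 0.0       # normalized output y' and parameter theta_1'
ys, thetas = [], []
for _ in range(5000):
    y_next = -theta * y + gamma                      # output update
    theta = theta + y * (y_next - 1.0) / (c + y * y)  # estimator update
    y = y_next
    ys.append(y)
    thetas.append(theta)

# Drift-and-burst signature: the output spends long stretches near the
# normalized setpoint y' = 1, with intermittent large excursions.
print("max |y'| =", max(abs(v) for v in ys),
      "  theta range:", min(thetas), max(thetas))
```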
Fig. 3. Simulation result demonstrating persistent parameter drift and bursting. The output y (horizontal axis) converges towards zero while the parameter estimate θ (vertical axis) slowly drifts. Eventually the loop becomes unstable and we observe a burst.
Structural perturbations and noise do not alter the picture described above significantly. The critical property to notice, which we will exploit below, is the following exchange of stability: When the estimator is stable the output diverges, and vice versa. For the purpose of analysis, express the adaptive controller in the perturbation form
$$u(t) = \frac{1}{\beta^*(q^{-1})}\bigl(y^*(t+d) - \alpha^*(q^{-1})y(t) + \phi(t)'\tilde\theta(t)\bigr),$$
where θ̃(t) = θ* − θ(t) is the parameter error. This result is a direct consequence of (3) and Step 2 of the algorithm. It follows that the effect of parameter errors enters the loop at the same point as the reference. We can now redefine the expression for v(t) to take into account the effect of adaptation, and we get
$$v(t) = H_0^*(q^{-1})\bigl(y^*(t+d) + \phi(t)'\tilde\theta(t)\bigr) + H_1^*(q^{-1})b(t) + \epsilon(t).$$
A block diagram for the closed-loop adaptive system is shown in Fig. 4 and the analysis proceeds using the machinery set up in the previous section. Let
$$m(t+1) = \sigma m(t) + \bigl(\phi(t)'\tilde\theta(t)\bigr)^2, \qquad m(0) = m_0,$$
where m_0 is a constant chosen to overbound the effect of the initial conditions. The existence of this constant is guaranteed by stability of H_0^*(q^{-1}). It follows
Fig. 4. Block diagram of adaptive feedback system.
that m(t) overbounds the regressor and the perturbation v(t), in the sense that there exist positive constants {K_v, K_φ, k_v, k_φ} so that
$$v(t)^2 \le K_vm(t+1) + k_v, \qquad(9)$$
$$\|\phi(t)\|^2 \le K_\phi m(t+1) + k_\phi. \qquad(10)$$
In particular, we have
$$K_v = 2\|H_0^*(q^{-1})\|_{2,\sigma}^2 \quad\text{and}\quad k_v = \frac{2}{1-\sigma}\|H_1^*(q^{-1})\|_\infty^2k_b.$$
We also have
$$\gamma(t)^2 \le K_\gamma m(t+1) + k_\gamma, \qquad(11)$$
with K_γ = K_v|F(q^{-1})|_∞² and k_γ = |F(q^{-1})|_∞²k_v, where |F(q^{-1})|_∞ = sup_{|q|=1}|F(q^{-1})|. Another critical property, which follows from the projection P_{Θ*}{·}, is that ||θ̃(t)||² ≤ M_θ < ∞, so that m(t) has bounded rate of growth. That is,
$$m(t+1) \le g_1m(t) + k_1, \qquad \forall t\ge 0, \qquad(12)$$
where g_1 and k_1 are positive constants that depend on Θ*. Note that k_1 = k_v = k_γ = 0 when k_b = k_{y*} = 0, and that K_v = K_γ = 0 when there is no mismatch present.

The main idea consists of constructing a comparison signal a(t) which measures all signals in the adaptive loop and in addition captures the drift and burst
phenomenon described above. The signal we use is "switched" and defined in the following fashion:
$$a(t+1) = \Lambda(t)\bigl(\sigma a(t) + (\phi(t)'\tilde\theta(t))^2\bigr) + (1-\Lambda(t))\bigl(g_2a(t) + k_2\bigr), \qquad(13)$$
where Λ(t) = I_{\{σa(t) + (φ(t)'θ̃(t))² ≥ g_2a(t) + k_2\}} is a zero-one indicator function, a(0) = m(0), and σ < g_2 < 1 and k_2 are positive constants to be determined. Note that, unlike previous studies, the comparison signal is not used in the algorithm itself; it is only introduced as an instrument to be used in the analysis. It follows immediately that a(t) overbounds m(t), so that if we can show boundedness of a(t) then boundedness of the adaptive system itself can be inferred from inequality (10). Moreover, a(t) has the bounded rate of growth property
$$a(t+1) \le g_1a(t) + k_1, \qquad \forall t\ge 0, \qquad(14)$$
inherited from (12). (Note that, in general, k_1 > k_2 and g_1 > 1 > g_2.)
Fig. 5. Block diagram for the "switched" system. The top part has a loop gain larger than one and the lower part is stable. The idea is that the closed-loop system will be stable if the lower path is used sufficiently often.
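The two-path structure of Fig. 5 can be simulated directly. The gains and the random switching sequence below are illustrative assumptions: the unstable branch fires with a frequency well below the critical fraction derived in the Switching Lemma, and the trajectory stays bounded:

```python
import math
import random

# Switched comparison system of Fig. 5 with illustrative constants.
g1, k1 = 1.2, 1.0      # unstable path (gain > 1)
g2, k2 = 0.7, 0.5      # stable path (gain < 1)
R = 2 * k2 / (1 - g2)  # satisfies R > k2/(1 - g2)
u1 = math.log(g1 + k1 / R)
u2 = -math.log(g2 + k2 / R)
U_star = u2 / (u1 + u2)   # critical switching fraction from the lemma

random.seed(1)
a = 100.0                 # large initial condition
history = []
for t in range(2000):
    lam = 1 if random.random() < 0.5 * U_star else 0  # duty well below U*
    a = lam * (g1 * a + k1) + (1 - lam) * (g2 * a + k2)
    history.append(a)

print("U* =", round(U_star, 3),
      " sup over last half =", round(max(history[1000:]), 3))
```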
Equation (13), it turns out, can be interpreted as a nonlinear dynamic system of the form shown in Fig. 5. That is: We have a dynamic feedback system with one exponentially stable component, one exponentially unstable component, and a switching function which determines which path is used. In this case the switching function has been defined so that Λ(t) = 1 corresponds (roughly) to the unstable
burst phase of the adaptive algorithm and Λ(t) = 0 corresponds to the phase where the closed loop is stable and the parameter estimate may experience drift. It appears reasonable to expect that if the stable path is used sufficiently often then this system may be stable, if not convergent. Indeed this turns out to be the case.

Lemma 3.1 (The Switching Lemma). Let Λ(t) be a (0, 1) indicator function. Consider the non-negative variable
$$a(t+1) = \Lambda(t)\bigl(g_1a(t)+k_1\bigr) + (1-\Lambda(t))\bigl(g_2a(t)+k_2\bigr),$$
where k_1 and k_2 are non-negative numbers and 0 < g_2 < 1 < g_1 < ∞. Define u_1 = ln(g_1 + k_1/R) and u_2 = −ln(g_2 + k_2/R), and let R > k_2/(1 − g_2); it follows that u_2 > 0. If for each t, a(t−i) ≥ R for i ∈ [0, N] implies
$$\frac1N\sum_{i=t-N}^{t}\Lambda(i) = U^* \le \frac{u_2}{u_1+u_2}$$
for some integer N,⁴ then for every t > 0:
(i) sup_{t≥1} a(t) ≤ max{R_m, K_0a(0) + k_0}, where
$$R_m = K_0R + k_0, \qquad K_0 = g_1^{NU^*}, \qquad k_0 = \sum_{i=0}^{NU^*-1}g_1^ik_1;$$
(ii) a(t+1) ≤ max{R'_m, e^{−δt}(K'_0a(0) + k'_0)}, where
$$R_m' = g_1^NR + k_0, \qquad K_0' = \Bigl(\frac{g_1}{g_2}\Bigr)^{NU^*}, \qquad k_0' = \sum_{i=0}^{NU^*-1}\Bigl(\frac{g_1}{g_2}\Bigr)^ik_1,$$
and δ = u_2 − U*(u_1 + u_2) > 0;
(iii) lim inf a(t) ≥ R, lim sup a(t) ≤ R_m.
Proof. Define two variables G_1(t) = g_1 + k_1/a(t), G_2(t) = g_2 + k_2/a(t). Then
$$a(t+1) = \Lambda(t)G_1(t)a(t) + (1-\Lambda(t))G_2(t)a(t);$$

⁴ Here and elsewhere N is an integer.
hence for N = 0, 1, 2, 3, … and t ≥ N
$$a(t+1) = \prod_{i=t-N}^{t}\bigl[\Lambda(i)G_1(i) + (1-\Lambda(i))G_2(i)\bigr]\,a(t-N).$$
Suppose a(t−i) ≥ R for i = 0, …, N and t ≥ N. Then
$$\ln\Bigl(\max_{t-N\le i\le t}G_1(i)\Bigr) \le u_1, \qquad \ln\Bigl(\max_{t-N\le i\le t}G_2(i)\Bigr) \le -u_2,$$
whence
$$\ln a(t+1) \le \sum_{i=t-N}^{t}\bigl(\Lambda(i)u_1 - (1-\Lambda(i))u_2\bigr) + \ln a(t-N).$$
However, according to the assumptions,
$$\sum_{i=t-N}^{t}\Lambda(i)u_1 - (1-\Lambda(i))u_2 \le -Nu_2 + NU^*(u_1+u_2) = -N\delta,$$
where δ = u_2 − U*(u_1 + u_2) > 0. This implies ln a(t+1) ≤ ln a(t−N) − Nδ. Hence,
$$a(t+1) \le \exp(-N\delta)a(t-N).$$
On the other hand, if there exists an integer 0 ≤ k ≤ N so that a(t−k) < R, then due to the bounded rate of growth
$$a(t+1-i) \le g_1^{NU^*}R + \sum_{i=0}^{NU^*-1}g_1^ik_1, \qquad i\in[0,k].$$
Hence, for all t ≥ N
$$a(t+1) \le \max\{R_m, e^{-N\delta}a(t-N)\}, \qquad\text{where } R_m = g_1^{NU^*}R + \sum_{i=0}^{NU^*-1}g_1^ik_1.$$
Hence, for all t
$$\sup_{t>0}a(t+1) \le \max\Bigl\{R_m,\; g_1^{NU^*}a(0) + \sum_{i=0}^{NU^*-1}g_1^ik_1\Bigr\}.$$
Having established this, we have, since a(t) ≥ g_2^Na(t−N), by recursive application of the above
$$a(t+1) \le \max\Bigl\{R_m',\; \Bigl(\frac{g_1}{g_2}\Bigr)^{NU^*}a(0) + \sum_{i=0}^{NU^*-1}\Bigl(\frac{g_1}{g_2}\Bigr)^ik_1\Bigr\}.$$
This establishes the second item and the third follows by taking the limits. □
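As a numerical sanity check of item (i), one can iterate the switched recursion with a periodic switching pattern whose duty fraction stays below u_2/(u_1 + u_2), so the lemma's hypothesis holds in every window, and compare sup a(t) against the bound. The constants below are illustrative assumptions:

```python
import math

# Numerical check of item (i) of the Switching Lemma for one concrete
# switching pattern; all constants are illustrative.
g1, k1, g2, k2 = 1.3, 0.5, 0.6, 0.2
R = 2 * k2 / (1 - g2)            # R > k2/(1 - g2)
u1 = math.log(g1 + k1 / R)
u2 = -math.log(g2 + k2 / R)
U_star = u2 / (u1 + u2)
N = 20
# One unstable step in every N+1 steps: every length-(N+1) window contains
# exactly one firing, so the windowed average 1/N <= U_star as required.
assert 1.0 / N <= U_star

a0 = 5.0
a, sup_a = a0, a0
for t in range(5000):
    lam = 1 if t % (N + 1) == 0 else 0
    a = lam * (g1 * a + k1) + (1 - lam) * (g2 * a + k2)
    sup_a = max(sup_a, a)

NU = math.ceil(N * U_star)
K0 = g1 ** NU
k0 = sum(g1 ** i * k1 for i in range(NU))
Rm = K0 * R + k0
print(sup_a <= max(Rm, K0 * a0 + k0))  # True
```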
The statement of the problem sets an effective lower bound on R. Specifically, R > k_2/(1 − g_2). This then implies that for initial conditions a(0) > Rg_1^N the interval [0, R] is exponentially attracting with an average exponent δ, i.e., the system decays on the average like a first order system, so that
$$x(t+1) = e^{-\delta t}x(0), \qquad \delta = u_2 - U^*(u_1+u_2) > 0.$$
The program we follow is to show that, under suitable conditions, Λ(i), defined by (13), satisfies the inequality
$$\frac1N\sum_{i=t-N}^{t}\Lambda(i) \le U^*,$$
where U* is sufficiently small for the Switching Lemma to apply. The result from applying this type of analysis is that we demonstrate that there exists a compact subset of state space which is globally exponentially attracting. We will not be able to use this theory to say much about the finer structure of the solutions unless R = 0 is allowed. Then we get exponential convergence to zero.

Lemma 3.2 (Switching Frequency for STR with Parameter Projection). Suppose a(i) ≥ R for i ∈ [t − N + 1, t + 1]. Then
$$\frac1N\sum_{i=t-N}^{t}\Lambda(i) \le U^*(R,N,\sigma,g_2),$$
where U*(R, N, σ, g_2) is an explicit expression in the constants c_1, K_γ, k_γ, K_φ, k_φ, M_θ, p_max, p_min, d, |F(q^{-1})|_∞, K_v, k_v, and the design variables R, N, σ and g_2.

Proof. See Appendix 1. □
Corollary 3.1. The adaptive control system described in Sect. 2 is stable in the sense that there exist finite positive numbers δ, R_m, K_0, k_0, such that
$$\|\phi(t)\|^2 \le \max\Bigl\{R_m,\; K_0e^{-\delta t}\bigl(\max_{0\le i\le d}\|\phi(-i)\|^2\bigr) + k_0\Bigr\},$$
provided there exists σ < 1 such that the gain of H_0^*(q^{-1}) is sufficiently small.

This result, which is a (significant) generalization of the result presented in [14], follows immediately by inspection. Explicit results are developed in the next section. One obvious conclusion is that the adaptive controller with parameter projection cannot be destabilized by the addition of bounded disturbances.

4 Stability, Transients and Performance
We focus on establishing four properties of the adaptive system, namely:^5
- Maximum transient ($\sup_{t>0}\|\phi(t)\|^2$).
- Exponential stability of a compact set with radius $R_m$ ($\|\phi(t)\|^2 \le \max\{R_m,\ K_0e^{-\delta t}\|\phi(0)\|^2 + k_0\}$).
- Ultimate bounds ($\limsup_{t\to\infty}\|\phi(t)\|^2$).
- Performance ($\lim_{t\to\infty}(y(t)-y^*(t)) = 0$) or finite gain stability ($\|y-y^*\|_2 \le \mu_0 + \mu_1\|b\|_2 + \mu_2\|y^*\|_2$).
The relevant constants are estimated. In addition to slow local adaptation ($r > 0$) we discuss "fast adaptation" ($r = 0$). In Lemma 3.2, $N$, $R$ and $g_2$ are free variables to be optimized. In order to obtain feasible, if not optimal, bounds it is convenient to define the following $\sigma$-dependent parameters:

$g_2^* = \arg\min_{\sigma < g_2 < 1}\ \frac{g_2}{g_2-\sigma}\left(1 - \frac{\ln g_1}{\ln g_2}\right)$

(a good approximation is $g_2^* = \sqrt{g_1}$), and

$B^* = \frac{-\ln g_2^*}{\ln g_1 - \ln g_2^*}$.

In addition, we define $N^*$ to be the smallest integer satisfying the inequality

$N^* \ge \frac{m}{B^*}$, where $m = 2^{d+1}M_\theta K_\phi\,\frac{p_{\max}}{p_{\min}}\,\frac{g_2^*}{g_2^*-\sigma}$.

Note that we have settled for expedience rather than optimality in some of these definitions.
^5 $\|z\| = \big(\sum_i z(i)^2\big)^{1/2}$ is the Euclidean norm.
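These definitions are easy to evaluate numerically. The sketch below minimizes the expression for $g_2^*$ on a grid and forms $B^*$ and $N^*$; the values of $g_1$, $\sigma$ and the constant $m$ are hypothetical, chosen only to illustrate the computation.

```python
import math

def derived_constants(g1, sigma, m_const, grid=100000):
    """Grid-minimize f(g2) = g2/(g2 - sigma) * (1 - ln g1 / ln g2)
    over sigma < g2 < 1, then form B* and the smallest integer
    N* >= m_const / B*.  Assumes g1 > 1 and 0 < sigma < 1, so that
    f blows up at both endpoints and an interior minimum exists."""
    best_g2, best_f = None, float("inf")
    for j in range(1, grid):
        g2 = sigma + (1.0 - sigma) * j / grid
        f = g2 / (g2 - sigma) * (1.0 - math.log(g1) / math.log(g2))
        if f < best_f:
            best_g2, best_f = g2, f
    B_star = -math.log(best_g2) / (math.log(g1) - math.log(best_g2))
    N_star = math.ceil(m_const / B_star)
    return best_g2, B_star, N_star

# Hypothetical values: g1 = 1.2 (growth bound), sigma = 0.5, m = 3.0.
g2s, Bs, Ns = derived_constants(1.2, 0.5, 3.0)
```

Minimizing $f(g_2)$ minimizes the horizon $N$ required by the stability condition $U^* < B^*$, which is why this particular objective appears.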
Ydstie
Proposition 4.1 (Ideal case, fast adaptation). Consider direct adaptive control of (1) with no model mismatch, $b(t) = y^*(t) = 0$ for all $t$ (ideal case) and $r = 0$ (fast adaptation). Let $\sigma^*$ be the smallest real number such that $\|H_3(q^{-1})\|_{2,\sigma^*} = 0$ and $B(z^{-1}) \ne 0$ for $|z| \ge \sigma^*$. If $\sigma^* < 1$ then the adaptive system has a bounded transient, so that

$\sup_{t>0}\|\phi(t)\|^2 \le g_1^{2m}\max_{0\le i\le d}\|\phi(-i)\|^2$,

where $m = 2^{d+1}M_\theta K_\phi\,\frac{p_{\max}}{p_{\min}}\,\frac{g_2^*}{g_2^*-\sigma^*}$. Moreover, we have exponential decay in the sense that

$\|\phi(t)\|^2 \le g_1^{2m}e^{-\delta t}\max_{0\le i\le d}\|\phi(-i)\|^2$ for $t > 0$,

where $\delta = -0.5\ln g_2^* > 0$. This implies $\lim y(t) = 0$ and $\lim u(t) = 0$ exponentially fast.

Proof. $r = k_\gamma = k_{y^*} = K_7 = 0$ implies $k_3 = k_\phi = 0$. From Lemma 3.2,

$U^*(0,N,\sigma,g_2) = \frac{g_2}{g_2-\sigma}\,2^{d+1}K_\phi\,\frac{p_{\max}}{p_{\min}}\,\frac{M_\theta}{N}$.

It is convenient for now to set $\sigma = \sigma^*$, $g_2 = g_2^*$ and $N = N^*$. This gives

$U^*(0,N,\sigma,g_2) \le \frac{-\ln g_2^*}{\ln g_1 - \ln g_2^*} = B^*$,

and from Lemma 3.1 we conclude boundedness and convergence. In particular, by choosing $a(0) = K_\phi^{-1}\max_{0\le i\le d}\|\phi(-i)\|^2$ we get

$\sup_{t>0} a(t) \le g_1^{NU^*}a(0)$,

where, according to the definitions above,

$NU^* \le 2^{d+1}M_\theta K_\phi\,\frac{p_{\max}}{p_{\min}}\,\frac{g_2^*}{g_2^*-\sigma^*} = m$.

The first part of the result then follows from Lemma 3.1 by application of (10). We also conclude that

$a(t+1) \le e^{-\delta t}g_1^{m}a(0)$,

with

$\delta = -\ln g_2 - (\ln g_1 - \ln g_2)\,2^{d+1}K_\phi M_\theta\,\frac{p_{\max}}{p_{\min}}\,\frac{g_2}{g_2-\sigma}\,\frac{1}{N}$.

Choosing $N = 2N^*$ is convenient in this case and gives the conservative bound

$\delta = -\ln g_2^*(1 - 1/2)$,

which establishes the result. □
The estimate for the rate of convergence has not been optimized, and is slower than what we would expect if the parameters of the plant were known exactly. (When the parameters are known we have $\delta = -\ln\sigma^*$; in the adaptive case the bound above only ensures $\delta = -0.5\ln g_2^* \approx -0.5\ln\sqrt{g_1}$.) We note here that exponential stability of the signals does not necessarily imply convergence of the parameter estimates. It is not even clear whether or not the estimates converge. With the deadzone we get $\lim\theta(t) = \theta_\infty$, where $\theta_\infty$ can be viewed as a random variable with $\sup_t\|\tilde\theta(t)\| \le M_\theta$.

Indeed, this point has caused some confusion in the field of adaptive control. Several earlier attempts to demonstrate exponential stability of the closed-loop system depended on exponential stability of the parameter estimator and convergence of the parameters. Such an approach relies heavily on the use of the concept of persistent excitation. In one form, persistent excitation implies the existence of positive constants $c_1$, $c_2$, $n$ such that

$c_1I \le \sum_{i=t-n}^{t}\phi(i-d)\phi(i-d)' \le c_2I$.
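The persistent-excitation condition above can be checked numerically by bounding the eigenvalues of the windowed sum of outer products. The sketch below does this for two-dimensional regressors; the signals are hypothetical and serve only to contrast an exciting regressor with a rank-deficient one.

```python
import math

def excitation_bounds(phis):
    """Given a list of 2-vectors phi(i), return the smallest and largest
    eigenvalues of S = sum_i phi(i) phi(i)'.  Persistent excitation
    requires c1 I <= S <= c2 I with c1 > 0, i.e. the smallest
    eigenvalue must be bounded away from zero."""
    s11 = s12 = s22 = 0.0
    for a, b in phis:
        s11 += a * a
        s12 += a * b
        s22 += b * b
    # Eigenvalues of the symmetric 2x2 matrix [[s11, s12], [s12, s22]].
    tr, det = s11 + s22, s11 * s22 - s12 * s12
    disc = math.sqrt(max(tr * tr / 4.0 - det, 0.0))
    return tr / 2.0 - disc, tr / 2.0 + disc

# Exciting: phi(i) = (sin i, cos i) keeps both directions active.
lam_min, lam_max = excitation_bounds([(math.sin(i), math.cos(i)) for i in range(50)])
# Not exciting: a constant regressor is rank one, so the lower bound fails.
mu_min, mu_max = excitation_bounds([(1.0, 1.0)] * 50)
```

The lower bound is the one that fails in practice, exactly as noted in the text: a constant or decaying regressor gives a near-singular sum.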
The lower bound is difficult to enforce when there are unmodeled dynamics present. The upper bound can normally be guaranteed through a separate stability analysis. One contribution of this paper is that we have divorced the question of exponential convergence of the signals from the question of exponential convergence of the parameter estimates, and in this way we have obtained the finite gain stability result without using the notion of persistent excitation. We do want to stress that what we show is exponential stability of a set in all cases but the result above and its counterpart, Proposition 4.4.

Proposition 4.2 (Ideal case, slow adaptation). Consider adaptive control of (1) with $r > 0$, $b(t) \equiv 0$ and $\sup_{t>0}y^*(t)^2 \le k_{y^*}$. Let $\sigma^*$ be the smallest real number such that $\|H_3(q^{-1})\|_{2,\sigma^*} = 0$ and $B(z^{-1}) \ne 0$ for $|z| \ge \sigma^*$.^6 If $\sigma^* < 1$ we have
$\sup_{t>0}\|\phi(t)\|^2 \le \max\Big\{R_m,\ g_1^{2m}\max_{0\le i\le d}\|\phi(-i)\|^2 + \sum_{i=0}^{2m-1}g_1^i k_1\Big\} + k_\phi$,

where

$R_m = K_0R^* + k_0$, with $K_0 = g_1^{2m}$ and $k_0 = \sum_{i=0}^{2m-1}g_1^i k_1$.

The exponential rate of decay to a compact set is bounded so that

$\|\phi(t)\|^2 \le \max\Big\{R_m,\ K_\phi e^{-\delta t}\big(K_0\max_{0\le i\le d}\|\phi(-i)\|^2 + k_0\big)\Big\} + k_\phi$ for $t > 1$,

^6 This condition is stronger than that in [4], where we only need $\sigma^\circ < 1$. We recover the stronger result by treating zeros on the unit circle as unmodeled dynamics, as in Proposition 4.5.
where the constants are defined below and $\delta = -\frac{1}{6}\ln g_2^*$. In addition,

$\lim_{t\to\infty}|y(t) - y^*(t)| = 0$.
Proof. In this case, $k_\gamma = K_7 = 0$, which implies $k_3 = 0$. From Lemma 3.2,

$U^*(R,N,\sigma,g_2) \le \frac{g_2}{g_2-\sigma}\,2^{d+1}\,\frac{K_\phi + r/p_{\max} + k_\phi}{R}\,\frac{d\,p_{\max}}{p_{\min}}\,\frac{M_\theta}{N}$,
and stability follows by application of Lemma 3.1, provided $R$, $N$ and $g_2$ can be found so that

$U^*(R,N,\sigma,g_2) < \frac{-\ln g_2}{\ln g_1 - \ln g_2}$.

There is a trade-off between $R$ and $N$. To obtain stability and convergence to a small set, it is necessary to compensate by using large $N$. This slows down the estimated rate of convergence. In the following, set $N = kN^*$ with $k \ge 2$, $\sigma = \sigma^*$ and $g_2 = g_2^*$, so that

$\frac{-\ln g_2}{\ln g_1 - \ln g_2} = B^*$.

As in the previous example, the expression for $U^*$ simplifies and we get the condition for stability:
$U^*(R,N,\sigma,g_2) \le B^*\left(\frac{1}{k} + \frac{r/p_{\max} + k_\phi}{K_\phi Rk}\right)$,

and we guarantee convergence of $a(t)$ to a bounded limit set by choosing

$R = R^* = \frac{1}{k-1}\,\frac{r/p_{\max} + k_\phi}{K_\phi}$.

From Lemma 3.1 we now get, by setting $a(0) = K_\phi^{-1}\max_{0\le i\le d}\|\phi(-i)\|^2$,

$\sup_{t>0} a(t) \le \max\Big\{R_m,\ g_1^{NU^*}a(0) + \sum_{i=0}^{NU^*-1}g_1^i k_1\Big\}$,

where

$NU^* = km$,

and hence

$R_m = g_1^{km}R^* + \sum_{i=0}^{km-1}g_1^i k_1$;

$k = 2$ gives a reasonable bound. The first part of the result follows. The bound for the transient increases when $r/p_{\max}$ is large. This is not surprising, since a wrong set of initial conditions for the parameter estimates may give a large transient if the rate of adaptation is low. It then takes more time to correct modes leading to instability, and the transient performance deteriorates.
We now turn our attention to the exponential decay property. Immediately,

$a(t+1) \le \max\{R'_m,\ e^{-\delta t}K'_0a(0) + k'_0\}$,

where the constants need to be computed. First we have

$\delta = -\ln g_2 - U^*(N,R,\sigma,g_2)(\ln g_1 - \ln g_2)$.

With $R = nR^*$, $n > 1$,

$U^*(N,R,\sigma,g_2) = B^*\left(\frac{1}{k} + \frac{k-1}{n}\right)$,

so that, according to Lemma 3.1,

$\delta = -\ln g_2^*\left(1 - \frac{1}{k} - \frac{k-1}{n}\right)$.

Clearly, to achieve $\delta > 0$ we need

$\frac{1}{k} + \frac{k-1}{n} < 1$,

or $n > k$. Somewhat arbitrarily, we set $k = 2$ and $n = 3$, which gives

$\delta = -\frac{1}{6}\ln g_2^*$.

To finalize, we need to determine $NU^*$. From the above,

$NU^* = \frac{5}{3}N^*B^* \le \frac{5}{3}m$.

Hence

$R'_m = 3K'_0R^* + k'_0$, where $K'_0 = \left(\frac{g_1}{g_2}\right)^{NU^*}$ and $k'_0 = \sum_{i=0}^{NU^*-1}\left(\frac{g_1}{g_2}\right)^i k_1$,

and the second part of the result follows. Finally, we show convergence to zero error. First,
$V(t) \le V(t-1) - \frac{e(t)^2}{n(t)}$,

where $V(t) = \tilde\theta(t)'P^{-1}\tilde\theta(t)$. It follows that $\lim V(t) - V(t-1) = 0$. Now, from the above,

$e(t)^2 \le n(t)\left(V(t-1) - V(t)\right)$.

Since $n(t)$ is uniformly bounded, the swapping terms $\phi(t-d)'(\theta(t-1) - \theta(t-d))$ converge to zero, and after some simple algebra we have $\lim|y(t) - y^*(t)| = 0$, since

$y(t) - y^*(t) = e(t) + \phi(t-d)'(\theta(t-1) - \theta(t-d))$. □
The gist of this is that (at least using this theory) we cannot guarantee exponential convergence to zero error in the absence of persistent excitation when we use slow adaptation and non-zero reference inputs. Instead, we have exponential convergence to an overbounding set; the estimated rate of convergence then slows down as we attempt to decrease the size of this set. We now move into the part of the paper which concerns robustness with respect to bounded disturbances. First we have a result on performance when there are $l_\infty$ output perturbations. The main idea is, as in the previous example, to establish ultimate boundedness first and then bring the local analysis to bear on the problem to establish the performance result.

Proposition 4.3 (Bounded disturbances). Suppose $\sup_{t>0}b(t)^2 \le k_b$ and
$\sup_{t>0}y^*(t)^2 \le k_{y^*}$. Let $\sigma^*$ be the smallest real number such that $\|H_3(q^{-1})\|_{2,\sigma^*} = 0$ and $B(z^{-1}) \ne 0$ for $|z| \ge \sigma^*$. If $\sigma^* < 1$, then

$\sup_{t>0}\|\phi(t)\|^2 \le \max\Big\{R_m,\ K_0\max_{0\le i\le d}\|\phi(-i)\|^2 + k_0\Big\} + k_\phi$,

where $R_m$, $K_0$ and $k_0$ are defined below. Exponential convergence to a compact set is achieved in the sense that

$\|\phi(t)\|^2 \le \max\Big\{R'_m,\ e^{-\delta t}K'_0\max_{0\le i\le d}\|\phi(-i)\|^2 + k'_0\Big\} + k_\phi$,

where $\delta$, $R'_m$, $K'_0$ and $k'_0$ are positive constants defined below. In addition, we have^7

$\|y - y^*\|_2 \le \mu_0 + \mu|F(q^{-1})A(q^{-1})|_\infty\|b\|_2$,

where $\mu_i$, $i = 0,1$, are constants independent of the initial conditions.

Proof. From Lemma 3.2 with $K_7 = 0$,

$U^*(R,N,\sigma,g_2) = \frac{g_2}{g_2-\sigma}\left[\frac{2k_3 + k_\gamma}{R} + 2^{d+1}\Big(K_\phi + k_\phi + \frac{r}{p_{\max}}\Big)\frac{d\,p_{\max}}{K_\phi\,p_{\min}}\,\frac{N+d+1}{N} + \frac{|F(q^{-1})|_\infty(K_\gamma + k_\gamma/R)}{g_2-\sigma}\right]$.
The adaptive system is stable if we can find $N$, $R$, $\sigma$ and $g_2$ such that

$U^*(R,N,\sigma,g_2) < \frac{-\ln g_2}{\ln g_1 - \ln g_2}$.

^7 The $l_2$ norm of a function $x$ is defined so that

$\|x\|_{2,T} = \sqrt{\sum_{t\in T}x(t)^2}$,

where $T = \{0, 1, 2, \ldots\}$, i.e., $\|x\|_2 = \|x\|_{2,T}$. We also note that $|FA|_\infty$ can be improved to $|F|_\infty$ by using algorithms developed for stochastic adaptive control.
Some insight can be gleaned from here by exploring limits. For example, by setting $\sigma = \sigma^*$, $g_2 = g_2^*$ and letting $R \to \infty$ (large signals) we obtain

$U^*(\infty,N,\sigma,g_2) = \frac{g_2}{g_2-\sigma}\,2^{d+1}K_\phi\,\frac{p_{\max}}{p_{\min}}\,\frac{M_\theta}{N}$,   (16)

which coincides with the ideal case using fast adaptation. It follows that bounded disturbances cannot lead to global instability. Moreover, when the signals are large relative to the external disturbances, the adaptive system behaves roughly like the ideal adaptive controller with no external disturbances or references and fast adaptation, giving similar convergence rate and performance. Setting $N = kN^*$ and $R = nR^*$ ($R^*$ was defined in the proof of Proposition 4.2) is convenient and gives

$U^*(R,N,\sigma,g_2) = B^*\left(\frac{1}{k} + \frac{k-1}{n}\right) + \frac{g_2}{g_2-\sigma}\,\frac{k_\gamma}{nR^*k}\left[2 + 2^{d+3}d\,\frac{n+k-1}{n}\,\frac{p_{\max}}{p_{\min}}\,\frac{kN^*+d+1}{kN^*}\right]$.

By somewhat arbitrarily choosing $k = 2$, $n \ge 4$, and using the fact that $k_\phi \ge k_{y^*} + k_\gamma$, we get

$U^*(R,N,\sigma,g_2) \le B^*\left(\frac{1}{2} + \frac{1}{n}\right) + \frac{g_2}{g_2-\sigma}\,\frac{k_\gamma}{nR^*}\left[2 + 2^{d+3}d\,\frac{p_{\max}}{p_{\min}}\,\frac{2N^*+d+1}{2N^*}\right]$,

and we can rewrite the condition for stability as

$\frac{a}{n}\left(1 + b + \frac{c}{n}\right) \le B^*$,

where the constants $a$, $b$, $c$ and $e$ collect the factors above, with $a$ proportional to $2k_\gamma$. After some algebra we get a sufficient condition for stability. In particular, with $R = \max(4,n)R^*$, where $n$ is the appropriate root of the resulting quadratic inequality (an expression that scales with $k_\gamma$),
we get $U^* \le \frac{7}{8}B^*$ and Lemma 3.1 applies. The most important point to notice here is that $n$, and hence $R$, scales with $k_\gamma$. We are now in a position to compute the bounds. Substituting into the expressions above gives
$NU^* \le 2m$,

so that, with $a(0) = K_\phi^{-1}\max_{0\le i\le d}\|\phi(-i)\|^2$,

$\sup_{t>0}a(t) \le \max\{R_m,\ K_0a(0) + k_0\}$,

where

$R_m = K_0R + k_0$, with $K_0 = g_1^{2m}$ and $k_0 = \sum_{i=0}^{2m-1}g_1^i k_1$.

$R$ is as defined above, and the first result follows. We also have

$a(t) \le \max\{R'_m,\ e^{-\delta t}K'_0a(0) + k'_0\}$,

where the constants are defined so that

$R'_m = K'_0R + k'_0$, with $K'_0 = \left(\frac{g_1}{g_2}\right)^{2m}$ and $k'_0 = \sum_{i=0}^{2m-1}\left(\frac{g_1}{g_2}\right)^i k_1$,

and

$\delta = -\ln g_2 - U^*(\ln g_1 - \ln g_2) \ge -\frac{1}{8}\ln g_2^*$,
and exponential stability of the compact set follows. The expressions above give horrible bounds. This is largely due to the fact that we allow $r = 0$. Sudden changes in the disturbance then cause large excursions in the estimated parameters when $\|\phi(t)\|$ is small. Consider for example the case $y^* = 0$ and $b(t) = 0$ for $t < t_c$ and $b(t) = 1$ for $t \ge t_c$. With $r = 0$ we then get convergence towards zero. The sudden change in the disturbance at $t = t_c$ then causes all the parameters to jump to their bounds, and we experience a large (but bounded) transient. One obvious way to remedy this problem is to slow down the (local) speed of adaptation by setting $r > 0$ and averaging over more samples. This idea is pursued below. By letting $V(t) = \tilde\theta(t)'P^{-1}\tilde\theta(t)$ we have
$V(t) \le V(t-1) - \frac{e(t)^2}{n(t)} + \frac{2\gamma(t)e(t)}{n(t)} = V(t-1) - \frac{(\phi(t-d)'\tilde\theta(t-1))^2}{n(t)} + \frac{\gamma(t)^2}{n(t)}$.

Hence,

$\sum_{i=1}^{t}\frac{e(i)^2}{n(i)} \le 2\sum_{i=1}^{t}\frac{\gamma(i)^2}{n(i)} + 2V(0)$   (17)
and

$\sum_{i=1}^{t}\frac{(\phi(i-d)'\tilde\theta(i-1))^2}{n(i)} \le \sum_{i=1}^{t}\frac{\gamma(i)^2}{n(i)} + V(0)$.

The latter expression is not needed here, but will be used in the proof of Proposition 4.5. Now the tracking error satisfies

$y(t) - y^*(t) = e(t) + \phi(t-d)'(\theta(t-d) - \theta(t))$.

The last term is called the swapping term. From the algorithm we have

$\|\theta(t-d) - \theta(t)\| \le \sum_{i=0}^{d-1}\frac{\|P\phi(t-d-i)e(t-i)\|}{n(t-i)}$,

and it follows that

$\big(\phi(t-d)'(\theta(t-d) - \theta(t))\big)^2 \le \left(\sum_{i=0}^{d-1}\phi(t-d)'P\phi(t-d-i)\frac{e(t-i)}{n(t-i)}\right)^2 \le 2^{d-1}\|\phi(t-d)\|^2 p_{\max}\sum_{i=0}^{d-1}\frac{e(t-i)^2}{n(t-i)}$.

Since, by (17), the normalized errors are dominated by the normalized disturbance terms, we get, using the expression for the tracking error,^8

$\|y(t) - y^*(t)\|_T \le \sqrt{2\|\gamma(t)\|_T^2 + rV(0)} + \|\phi(t-d)'(\theta(t-d) - \theta(t))\|_T$.

Since $\sup_{t\ge d}\|\phi(t)\|^2 \le \bar{P}_m$, where

$\bar{P}_m = K_\phi\max\Big\{K_0R + k_0,\ g_1^{2m}\max_{0\le i\le d}\|\phi(-i)\|^2\Big\} + k_\phi$,

this gives

$\|y(t) - y^*(t)\|_T \le \left(1 + \sqrt{2^{d-1}d\,\bar{P}_m\frac{p_{\max}}{r}}\right)\|\gamma(t)\|_T + \sqrt{rV(0)}$.

^8 $\|\cdot\|_T$ denotes the truncated $l_2$-norm of a function.
For $r > 0$ (slow adaptation) there exists a constant $c_1$ so that $\bar{P}_m p_{\max}/r \le c_1$, and we can rewrite this so that the result follows by taking limits, since $\gamma(t) = F(q^{-1})A(q^{-1})b(t)$. The definitions of the gains are obvious. □
Note that it is not critical to use slow adaptation to achieve the finite gain result. We still get a finite gain result for $r = 0$; however, the expressions for the gains become even more cumbersome to develop. More explicit expressions are developed in Proposition 4.5. The size of the gains depends on the speed of adaptation, and the performance of the adaptive system improves as the (local) rate of adaptation is decreased. More explicit formulas can be developed to highlight this point, and a more detailed analysis will show that it is advantageous to use time-varying gains. The gains should be high during transients, and then they should decrease to zero during steady-state operation. This analysis, when carried out in detail, shows why algorithms based on recursive least squares and fixed dead-zones work well in practical applications. Here the algorithms are usually re-initialized with high gains during a transient, using a reset and/or a variable forgetting factor mechanism. Then the gains are allowed to decrease naturally through the "covariance matrix" update, and eventually, when the algorithm reaches steady state, the estimator is simply turned off. It turns out not to be so important how the details of this general approach are implemented. We now consider an example problem which explains what kind of dynamics can be observed when the gains are non-vanishing.
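The reset-and-forgetting mechanism described above can be sketched with a scalar recursive-least-squares update: the gain starts high, decreases through the covariance update, and a forgetting factor $\lambda < 1$ keeps it from vanishing. The plant, signals and tuning constants below are hypothetical; this is a schematic of the general approach, not the specific algorithm analyzed in this paper.

```python
def rls_scalar(data, lam=0.95, p0=100.0):
    """Scalar recursive least squares with forgetting factor lam.
    data is a list of (phi, y) pairs with y = theta*phi + noise.
    A high initial gain p0 gives fast transient adaptation; the gain
    p then decreases as information accumulates, and with lam < 1 it
    can grow again when old data is discounted."""
    theta, p = 0.0, p0
    for phi, y in data:
        e = y - theta * phi               # prediction error
        k = p * phi / (lam + p * phi * phi)
        theta += k * e                    # estimate update
        p = (p - k * phi * p) / lam       # "covariance" update
    return theta

# Hypothetical plant: y = 2*phi with phi alternating +/-1 (exciting input).
data = [((-1.0) ** i, 2.0 * (-1.0) ** i) for i in range(200)]
theta_hat = rls_scalar(data)
```

With an exciting regressor the estimate settles near the true value while the gain decays toward its forgetting-factor floor, mirroring the high-gain-transient, low-gain-steady-state behavior discussed above.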
Example 4.1 (Golden and Ydstie [6]). Consider adaptive control of

$y(t) = Ku(t-1) + v$,   (18)

where $K$ is an unknown gain and $v$ is a constant but unknown perturbation. In this case the adaptive system can be re-parameterized in terms of a scalar state $x(t)$ and two non-negative parameters $\alpha$ and $\beta$ (the projection has been omitted). The two design parameters $p$ and $r$ enter into $\alpha$ and $\beta$ respectively. Note that $\alpha$ represents the noise-to-signal ratio multiplied with the adaptive gain and that $\beta$ is non-negative. The closed-loop adaptive system is now described by the mapping $F: \mathbb{R} \to \mathbb{R}$,

$x \mapsto x + \frac{\alpha(1-x)}{1 + \beta x^2}$,   (19)
and the system converges so that $\lim_{t\to\infty}x(t) = 1$ if and only if $0 < \alpha/(\beta+1) < 2$. We note that this imposes a strict bound on the noise-to-signal ratio and the adaptive gain. If this condition is violated then it can be shown that the map period-doubles and the dynamics may become complicated. A fairly elaborate theory applies and can be used to trace the evolution of the attractors to a fair degree of accuracy. In particular, it can be shown that for an infinite family of parameters the Schwarzian derivative

$SF(y) = \frac{D^3F(y)}{DF(y)} - \frac{3}{2}\left(\frac{D^2F(y)}{DF(y)}\right)^2$

of a transformed mapping equivalent to (19) is negative. We conclude that all points (with respect to Lebesgue measure) have the same asymptotic behavior; there is exactly one absolutely continuous invariant measure and the adaptive system is stationary and ergodic. Moreover, there exists a continuous sub-family where (19) has no stable attractor of finite period. It follows that the adaptive system acts somewhat like a Markov chain with a countable (infinite) number of states, the attractor is a closed set which contains no interior or isolated points (a Cantor set) and the dynamics are chaotic [5]. A simulation showing the behavior of the process output is given in Fig. 6. The parameter estimate drifts towards the fixed point, "overshoots" and gives oscillatory behavior, reminiscent of the drift and burst phenomenon. The amplitude of the oscillation is small as seen at the process output, and it is clear that aperiodicity and lack of convergence do not lead to poor performance. By averaging over the sample paths we have, due to ergodicity,
$E\{(y - y^*)^2\} = \lim_{t\to\infty}\frac{1}{t}\sum_{i=1}^{t}(y(i) - y^*)^2$.

The simulation indicates that $E\{(y - y^*)^2\} \le v^2$ when the projection is included. □
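A map with fixed point $x = 1$ and multiplier $1 - \alpha/(1+\beta)$ there, such as $F(x) = x + \alpha(1-x)/(1+\beta x^2)$, reproduces the convergence condition $0 < \alpha/(\beta+1) < 2$ of Example 4.1. The sketch below iterates such a map; the parameter values are hypothetical.

```python
def F(x, alpha, beta):
    """One step of a closed-loop map of the form of (19):
    x -> x + alpha*(1-x)/(1+beta*x**2).  Linearizing at the fixed
    point x = 1 gives the multiplier 1 - alpha/(1+beta), so x = 1
    attracts exactly when 0 < alpha/(beta+1) < 2."""
    return x + alpha * (1.0 - x) / (1.0 + beta * x * x)

def orbit(x0, alpha, beta, n):
    xs = [x0]
    for _ in range(n):
        xs.append(F(xs[-1], alpha, beta))
    return xs

# Hypothetical parameters. alpha/(beta+1) = 0.5 < 2: convergence to x = 1.
xs_stable = orbit(0.3, 1.0, 1.0, 500)
# alpha/(beta+1) = 2.5 > 2: the fixed point repels; the orbit stays
# bounded but oscillates instead of converging.
xs_unstable = orbit(0.3, 5.0, 1.0, 500)
```

Sweeping $\alpha$ through the threshold exhibits the period-doubling route described in the text.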
Example 4.2 (Praly and Pomet [11]). Consider the system

$y(t) = ay(t-1) + u(t-1) + v$.

This is the same example treated in the beginning of Sect. 3; however, the set point is equal to zero. In this case the parameter drift and bursting is transient, and the adaptive system converges either to a stable (elliptic) period-two attractor, which is optimal, or to a period-ten attractor which yields slightly higher variance. Further discussions of this problem are given in [6] and [11]. □

Finally, we introduce the model-order mismatch. We consider first the case with fast adaptation, no output disturbance and zero setpoint. Define the $\sigma$-dependent parameter
$K_\gamma^* = \frac{1}{2}\left(1 + bc + ae^{2N^*+d+1} - \sqrt{\big(1 + bc + ae^{2N^*+d+1}\big)^2 - 4ac\,e^{2N^*+d+1}}\right)$,
Fig. 6. Simulation showing parameter drift and bursting in a simple adaptive feedback system.
where

$a = \frac{2N^*\,c_1p_{\min}\,g_1^{d+1}(g_2-\sigma)}{(2N^*+d+1)\,2^{d+3}\,p_{\max}M_\theta K_\phi}$,   $b = \frac{c_1e^2p_{\min}}{(2N^*+d+1)\,K_\phi\,p_{\max}\,2^{d+2}}$,

$e = 2c_1B^*\left(1 - \frac{\sigma}{g_2}\right)\Big/|F(q^{-1})|_\infty$,

and $c$ is a constant defined in terms of $g_2$. We have the following result on robust performance of the adaptive system when there are no external perturbations.

Proposition 4.4. Suppose $b(t) \equiv 0$, $y^*(t) \equiv 0$, $r = 0$. Let $\sigma^*$ be the smallest number such that $\|H_3(q^{-1})\|_{2,\sigma^*} < K_\gamma^*$ and $B(z^{-1}) \ne 0$ for $|z| \ge \sigma^*$. Provided that $\sigma^* < 1$, we have
$\sup_{t>0}\|\phi(t)\|^2 \le g_1^{2m}\left(\max_{0\le i\le d}\|\phi(-i)\|^2 + K_\phi^{-1}m_0\right)$.

Moreover, the convergence is exponential, so that

$\|\phi(t)\|^2 \le \left(\frac{g_1}{g_2}\right)^{2m}e^{-\delta t}\left(\max_{0\le i\le d}\|\phi(-i)\|^2 + K_\phi^{-1}m_0\right)$

for $t > 0$, where $\delta = -\frac{1}{4}\ln g_2^*$.
Consequently, $\lim\|\phi(t)\| = 0$, which implies $\lim y(t) = \lim u(t) = 0$.

Proof. Let $K_7 = |F(q^{-1})|_\infty K_\gamma^*$. From Lemma 3.2 with $k_7 = k_{y^*} = r = 0$,

$U^*(R,N,g_2) = \frac{g_2}{(g_2-\sigma) - 2c_1K_7}\left(c_1K_7 + 2^{d+1}K_\phi\,d\,\frac{p_{\max}}{p_{\min}}\,\frac{M_\theta}{N}\right)$.

As before, we require

$U^*(R,N,g_2) < \frac{-\ln g_2}{\ln g_1 - \ln g_2}$.

In this case the analysis works with $R = 0$. Setting $g_2 = g_2^*$ and $N = 2N^*$ yields the condition

$\frac{g_2}{(g_2-\sigma) - 2c_1K_7}\left(c_1K_7 + \frac{2^{d+3}K_\phi\,d\,p_{\max}}{p_{\min}(2N^*+d+1)}\right) \le \frac{3}{4}B^*$,

which, for the purpose of developing explicit bounds, we rewrite as

$K_\gamma^*\left(1 + a\,e^{2N^*+d+1} - K_\gamma^*b\right) \le c$,

where $a$, $b$, $c$ and $e$ are defined above. It now follows from Lemma 3.1 that

$\sup_{t>0}a(t) \le g_1^{NU^*}a(0)$,

where $NU^* = 2m$ and $a(0) = K_\phi^{-1}\max_{0\le i\le d}\|\phi(-i)\|^2 + m_0$. Moreover,

$a(t) \le \left(\frac{g_1}{g_2}\right)^{2m}e^{-\delta t}a(0)$.

After substituting back we find $U^* \le \frac{3}{4}B^*$. Hence, $\delta = -\ln g_2\left(1 - \frac{3}{4}\right)$, and the result follows. □
This result is remarkable in the sense that if there are only unmodeled dynamics present, then we can guarantee exponential stability and convergence to zero error when the mismatch is not too serious. Since bounded perturbations cannot cause a global instability, this implies that the adaptive algorithm can handle some plants with zeros on or even outside the unit circle provided there is a time scale separation. (The unmodeled unstable zeros should be sufficiently fast or slow relative to the ones we model). This result does not say anything about the properties of the estimates. This result was anticipated in [8].
Example 4.3 (Mareels and Bitmead [8]). Consider adaptive control of the system

$y(t) = ay(t-1) + u(t) + cy(t-2)$.

The parameter $a$ is estimated and the term $cy(t-2)$ is left as unmodeled dynamics. The setpoint equals zero. Mareels and Bitmead show that under certain conditions the parameter estimate undergoes period-doubling bifurcations and behaves erratically. Convincing evidence has been presented that the parameter estimate displays at least transient chaos. In particular, it has been shown that the fourth iterate of the mapping satisfies a set of conditions leading up to the application of the Smale-Birkhoff "Homoclinic Theorem". (It is shown that there exists a transversal intersection between $W^s(p)$ and $W^u(p)$ at a point $q \ne p$, where $p$ is a hyperbolic fixed point. A similar type of analysis is used in [6] to indicate the presence of chaos in the mapping described in Example 3.1.) The Homoclinic Theorem then guarantees the existence of a hyperbolic invariant set topologically similar to a horseshoe. This again gives a strong indication, if not a proof, of the existence of a strange attractor. □

The example discussed by Mareels and Bitmead must be treated with some caution, since the transformation used to define the mapping in question may be singular in the limit $t = \infty$. In fact, whether or not the adaptive system will display persistent chaos in this example depends on how this particular problem is dealt with. Nevertheless, it is safe to say that there is transient chaos present here. Examples 3.1 and 4.1 should dispel any further doubt whether or not chaos is present asymptotically. In fact, it appears that chaos will almost always be present when adaptive controllers are applied to practical control systems. The overriding and important point to notice is that, if the algorithm is implemented properly, then the erratic behavior of the parameter estimates does not translate into poor performance.
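The closed loop of Example 4.3 is easy to reproduce qualitatively in simulation. The sketch below uses a hypothetical normalized-gradient estimator and certainty-equivalence control, not necessarily the exact scheme of [8]; the parameter values are chosen so that the mismatch is mild and the loop remains stable.

```python
def adaptive_loop(a=0.5, c=0.05, mu=0.1, steps=200):
    """Regulate y(t) = a*y(t-1) + u(t) + c*y(t-2) to zero with only
    `a` estimated; c*y(t-2) is left as unmodeled dynamics.  The
    control u(t) = -a_hat*y(t-1) is certainty-equivalence, and a_hat
    is adjusted by a normalized gradient step on the regression
    y(t) - u(t) = a*y(t-1) + c*y(t-2)."""
    y_prev, y_prev2 = 1.0, 1.0      # initial conditions
    a_hat = 0.0
    ys, a_hats = [], []
    for _ in range(steps):
        u = -a_hat * y_prev                 # certainty-equivalence control
        y = a * y_prev + u + c * y_prev2    # true plant, with unmodeled term
        e = (y - u) - a_hat * y_prev        # prediction error; c*y(t-2) acts
        a_hat += mu * y_prev * e / (1.0 + y_prev * y_prev)  # as a disturbance
        ys.append(y)
        a_hats.append(a_hat)
        y_prev2, y_prev = y_prev, y
    return ys, a_hats

ys, a_hats = adaptive_loop()
```

For this mild mismatch the output decays and the estimate stays bounded; pushing $c$ and the gain $\mu$ up is the regime in which [8] observes period doubling in the estimate.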
This is eminently clear in the case discussed above, since the adaptive system converges, so that $\lim y(t) = 0$ exponentially fast [8]. Proposition 4.4 generalizes this to the case where several parameters are estimated. Finally, we tie everything together.

Proposition 4.5 (Ultimate Boundedness and Robustness). Suppose $y^*(t)$ and $b(t) \in l_\infty$; that is, there exist constants so that $\sup_{t>0}y^*(t)^2 \le k_{y^*}$ and $\sup_{t>0}b(t)^2 \le k_b$. Let $\sigma^*$ be the smallest constant so that $\|H_3(q^{-1})\|_{2,\sigma^*} < 0.5K_\gamma^*$ and $B(z^{-1}) \ne 0$ for $|z| \ge \sigma^*$. If $\sigma^* < 1$ then the STR with parameter projection is stable in the sense that
$\sup_{t>0}\|\phi(t)\|^2 \le K_\phi\max\Big\{R_m,\ K_0\max_{0\le i\le d}\|\phi(-i)\|^2 + k_0 + K_\phi^{-1}m_0\Big\} + k_\phi$,

where $R_m$, $K_0$ and $k_0$ are positive constants and $m_0$ overbounds the initial conditions. Convergence to a compact set which scales with respect to $k_{y^*}$ and $k_b$ is exponential in the sense that

$\|\phi(t)\|^2 \le K_\phi\max\Big\{R'_m,\ e^{-\delta t}K'_0\max_{0\le i\le d}\|\phi(-i)\|^2 + k'_0\Big\} + k_\phi$,

where $\delta = -\frac{1}{8}\ln g_2^*$, and $R'_m$, $K'_0$ and $k'_0$ are positive constants independent of initial conditions. In addition, we have the following finite gain result:

$\|y - y^*\|_2 \le \mu_0 + \mu|F(q^{-1})H_3(q^{-1})|_\infty\|b\|_2 + \mu|F(q^{-1})H_3(q^{-1})|_\infty\|y^*\|_2$,

where the gains $\mu_0$ and $\mu$ are defined below.^9

Proof. It is going to be convenient to choose $g_2 = g_2^*$. In addition, define a parameter
$R^* = \max\left\{\frac{2k_3 + k_\gamma}{c_1K_7},\ \frac{k_\phi + r/p_{\max}}{K_\phi},\ \frac{k_{y^*} + k_b}{K_\phi}\right\}$,

where $K_7 = 0.5|F(q^{-1})|_\infty K_\gamma^* \ne 0$. By choosing $R = R^*$ we can (after some algebra and hindsight) rewrite the expression for $U^*$ so that the following inequality is satisfied:

$U^*(R,N,\sigma,g_2) \le \frac{g_2}{(g_2-\sigma) - 2c_1K_7}\left(2c_1K_7 + 2^{d+1}K_\phi\,d\,\frac{p_{\max}}{p_{\min}}\,\frac{M_\theta}{N}\right)$.

By dividing through with 2, this problem is essentially the same as the one leading up to the result in Proposition 4.4. We get the bounds for transient performance by defining the constants as in Proposition 4.3 and replacing $R$ by $R^*$ as defined above. We then have

$\sup_{t>0}a(t) \le \bar{R}$, with $\bar{R} = K_0R^* + k_0$,

and

$\sup_{t>0}\|\phi(t)\|^2 \le \bar{P}_m$, with $\bar{P}_m = K_\phi\bar{R} + k_\phi$.
Following a procedure similar to the one developed in Proposition 4.3 we get the finite gain result. The difference is that we will simplify the expressions by assuming that the (local) adaptation is slow enough to have

$R^* = \frac{k_\phi + r/p_{\max}}{K_\phi}$

(this is not critical; however, it leads to more economical representations). As before, we start out by defining the tracking error, then we introduce the swapping error and relate the obtained expressions to the comparison function $V(t)$. We get, using the same development as before (see the proof of Proposition 4.3),

$\|y(t) - y^*(t)\|_T \le \left(1 + \sqrt{2^{d-1}d\,\bar{R}_m\frac{p_{\max}}{r}}\right)\|\gamma(t)\|_T + M\sqrt{rV(0)}$.

^9 Ideally we would like $\mu_0 = 0$, $\mu = 1$, to recover the performance bound (7).
There is now an added catch, since $\gamma(t)$ contains the effect of the unmodeled terms and is not defined explicitly. First note that

$\|\gamma(t)\|_T \le |F(q^{-1})H_3(q^{-1})|_\infty\left(\|y^*(t+d)\|_T + \|\phi(t)'\tilde\theta(t)\|_T\right) + |F(q^{-1})H_3(q^{-1})|_\infty\|b(t)\|_T + m_0/(1-\sigma)$.

From the definition of $V(t)$, the swapping term, and the fact that the signals are uniformly bounded, we now have

$\|\phi(t)'\tilde\theta(t)\|_T \le \sqrt{1 + \bar{R}\frac{p_{\max}}{r}}\,\|\gamma(t)\|_T + d\sqrt{K_\gamma R + k_\gamma} + \|\phi(t)'(\theta(t) - \theta(t+d-1))\|_T + \sqrt{(r + p_{\max}\bar{R}_m)V(0)}$.

By using the expression for the swapping error from Proposition 4.3,

$\big(\phi(t-d)'(\theta(t-d) - \theta(t-1))\big)^2 \le 2^{d-1}\|\phi(t-d)\|^2p_{\max}\sum_{i=0}^{d-1}\frac{e(t-i)^2}{n(t-i)}$.

Using the ultimate bounds and the slow adaptation,

$\bar{R}_m \le K_\phi\left(K_0\,\frac{k_\phi + r/p_{\max}}{K_\phi} + k_0\right) + k_\phi$.

Hence

$\|\phi(t)'(\theta(t) - \theta(t+d-1))\|_T \le \sqrt{2^{d-1}d\,\bar{R}_m\frac{p_{\max}}{r}}\,\|\gamma(t)\|_T + \sqrt{rV(0)}$.

We can then rewrite the inequality for $\gamma(t)$:

$\|\gamma(t)\|_T \le |F(q^{-1})H_3(q^{-1})|_\infty\left[\|y^*(t+d)\|_T + \left(1 + \sqrt{1 + \bar{R}\frac{p_{\max}}{r}} + \sqrt{2^{d-1}d\,\bar{R}_m\frac{p_{\max}}{r}}\right)\|\gamma(t)\|_T + d\sqrt{K_\gamma R + k_\gamma}\right] + |F(q^{-1})H_3(q^{-1})|_\infty\|b(t)\|_T + m_0(1-\sigma)^{-1}$,

which gives

$(1 - g)\|\gamma(t)\|_T \le c_0 + |F(q^{-1})H_3(q^{-1})|_\infty\|y^*(t+d)\|_T + |F(q^{-1})H_3(q^{-1})|_\infty\|b(t)\|_T$,
where

$g = |F(q^{-1})H_3(q^{-1})|_\infty\left(1 + \sqrt{1 + \bar{R}\frac{p_{\max}}{r}} + \sqrt{2^{d-1}d\,\bar{R}_m\frac{p_{\max}}{r}}\right)$

and

$c_0 = d\sqrt{K_\gamma R + k_\gamma} + \sqrt{(r + p_{\max}\bar{R}_m)V(0)} + \sqrt{rV(0)} + m_0(1-\sigma)^{-1}$.

This we then substitute back into the expression for the tracking error developed in the proof of Proposition 4.3. However, to proceed we will need $g < 1$. This is automatically satisfied by the bound imposed, since $K_\gamma^*$ will have to be quite small for this approach to work at all. From the expressions above we then finally get

$\|y(t) - y^*(t)\|_T \le \mu_0 + \mu|F(q^{-1})H_3(q^{-1})|_\infty\|b(t)\|_T + \mu|F(q^{-1})H_3(q^{-1})|_\infty\|y^*(t+d)\|_T$,

where

$\mu_0 = \frac{c_0}{1-g}\left(1 + \sqrt{2^{d-1}d\,\bar{R}_m\frac{p_{\max}}{r}}\right)$ and $\mu = \frac{1 + \sqrt{2^{d-1}d\,\bar{R}_m\frac{p_{\max}}{r}}}{1-g}$,

and the result follows by extending the norms. □
A few comments are now in order. It is clear from Proposition 4.5 that the design parameters influence the gains and that we can improve performance by tuning. It appears particularly effective to slow down the rate of adaptation after a transient. In the limit ($p_{\max}/r \to 0$) we have

$\mu = \frac{1 + \sqrt{K_0}}{1 - |F(q^{-1})H_3(q^{-1})|_\infty}$.

$|F(q^{-1})H_3(q^{-1})|_\infty$ is quite small, and $K_0$ can be made close to one by using a separate backwards-in-time argument. It follows that $\mu \approx 2$ and the performance of the adaptive system is almost as good as that of the optimized non-adaptive system. This performance is achieved at the cost of a very long transient due to the small gains. One thing which is clear is that during a transient the gains should be high and the condition number of the gain matrix should be close to one. The amount of mismatch that can be tolerated by the algorithm discussed in Section 2 is very small. We believe that this is due to an overly conservative analysis, and current work is aimed towards tightening up the robustness result as well as the performance bounds. We recover the ideal case (Propositions 4.1 and 4.2) since $|H_3|_\infty = 0$ and $b \in l_2 \cap l_\infty$ implies $(y - y^*) \in l_2 \cap l_\infty$. If $|H_3|_\infty \ne 0$ and $b \in l_\infty$ only, then we recover Proposition 4.3 and the dynamics become complicated; all we can say is that $(y - y^*) \in l_\infty$. If $y^* \in l_2$ and $b \in l_2$ then we recover Proposition 4.4 and all signals converge exponentially fast. The latter result best describes the transient behavior of the algorithm.
Example 4.5 (Ydstie [13], Golden and Ydstie [5]). Consider adaptive control of the system

$y(t) + ay(t-1) = bu(t-1) + cy(t-2) + v$.

The parameter $a$ is estimated, $v$ is here a constant perturbation, and $c$ represents an unmodeled pole. There are two sources of mismatch in the adaptive control system and the dynamics become complicated, but are constrained to a small subset of the phase space. □
5 Conclusions and Discussion

In this paper it has been shown that the discrete self-tuning regulator with non-vanishing gain can be applied to linear systems that are undermodeled, provided that the estimated parameters are projected into a compact convex set which contains the stabilizing parameters and excludes the possibility of dividing by zero. The latter requirement implies that we know the sign and a lower bound for the high frequency gain of the modeled part of the system. The performance of the adaptive system is related to the size of the external perturbations, and we obtain the "ideal result" as these tend to zero. We have established the following design guidelines:

- The condition number of $P$ should be close to one during transients.
- Fast adaptation should be used ($r$ small and/or $p_{\min}$ large) initially to minimize the transient.
- Slow adaptation should be used to improve local performance.
- Prior knowledge of the parameter values, in particular $b_{\min}$, should be built into $\Theta^*$.
- The data vector should be filtered so that $\|\phi(t)\| = 0$ at steady state.

This points towards the use of one of the robustified Recursive Least Squares algorithms. These are usually initialized so that the gains are high in the transient phase, and then the gains decrease rapidly as the system approaches steady state. It can be shown that this not only improves the transient performance, but also gives the algorithm the ability to handle more unmodeled dynamics. The results extend readily to indirect adaptive laws like pole assignment, optimal and predictive controllers. Such control laws are used in practice, since they can be applied even when the stable invertibility condition is violated. Some care will have to be taken to ensure stabilizability of the estimated control model when such laws are applied.
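The parameter projection used throughout, which confines the estimates to a compact convex set $\Theta^*$ fixing the sign and a lower bound of the leading coefficient, can be sketched as coordinate-wise clipping onto a box; for a box-shaped $\Theta^*$ this clipping is the exact orthogonal projection. The bounds below are hypothetical.

```python
def project(theta, lower, upper):
    """Orthogonal projection of the estimate vector onto the box
    Theta* = prod_i [lower_i, upper_i].  For a box, projecting each
    coordinate independently is exact, and keeping lower_0 > 0
    enforces the known sign and lower bound of the leading
    (high-frequency-gain) coefficient, so we never divide by zero."""
    return [min(max(t, lo), hi) for t, lo, hi in zip(theta, lower, upper)]

# Hypothetical parameter box: b0 in [0.1, 2.0], a1 in [-1.0, 1.0].
lower, upper = [0.1, -1.0], [2.0, 1.0]
theta = project([-0.3, 1.7], lower, upper)
```

Because the projection onto a convex set is non-expansive, it can only decrease the distance to any stabilizing parameter inside $\Theta^*$, which is exactly the property invoked in the proof of Lemma 3.2.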
Acknowledgements

The author is grateful for numerous discussions with Iven Mareels at the University of Newcastle, Australia, Laurent Praly at Ecole des Mines de Paris, France, and Chris Hollot at the University of Massachusetts. Finally, I would like to thank
Melinda Golden, who is currently working for Simulation Sciences in Los Angeles, for many inspiring discussions while she was working on her PhD on the Dynamics of Adaptive Control Systems.
References

1. K. J. Åström and B. Wittenmark, "On self-tuning regulators," Automatica, vol. 9, pp. 195-199, 1973.
2. B. Egardt, Stability of Adaptive Controllers, Springer-Verlag, New York, NY, 1979.
3. T. R. Fortescue, L. S. Kershenbaum, and B. E. Ydstie, "Self-tuning regulators with variable forgetting factors," Automatica, vol. 17, pp. 831-835, 1981.
4. G. C. Goodwin, P. J. Ramadge, and P. E. Caines, "Discrete time multivariable adaptive control," IEEE Trans. Aut. Control, vol. 25, pp. 449-456, 1980.
5. M. P. Golden and B. E. Ydstie, "Parameter drift in adaptive control systems: Bifurcation analysis of a simple case," Tech. Report, University of Massachusetts at Amherst, 1990.
6. M. P. Golden and B. E. Ydstie, "Small amplitude chaos and ergodicity in adaptive control systems," Tech. Report, University of Massachusetts at Amherst, 1990. To be presented at the IFAC Symposium on Identification and System Parameter Estimation, Budapest, Hungary, 1991.
7. P. Ioannou and J. Sun, "Theory and design of robust direct and indirect adaptive control systems," Int. J. Control, vol. 47, pp. 775-813, 1988.
8. I. M. Y. Mareels and R. R. Bitmead, "Nonlinear dynamics in adaptive control: chaotic and periodic stabilization," Automatica, vol. 22, pp. 641-665, 1986.
9. L. Praly, "Robustness of model reference adaptive control," Proc. 2nd Yale Workshop on Adaptive Systems, 1983.
10. L. Praly, Commande linéaire adaptative: Solutions bornées et leurs propriétés, PhD thesis, Université Paris IX Dauphine, Paris, France, 1988.
11. L. Praly and J.-B. Pomet, "Periodic solutions in adaptive systems: the regular case," Proc. 10th IFAC World Congress, Munich, vol. 10, pp. 40-45, 1987.
12. C. Samson, Problèmes en identification et commande de systèmes dynamiques, PhD thesis, Université de Rennes, France, 1983.
13. B. E. Ydstie, "Bifurcations and complex dynamics in adaptive control systems," Proc. 25th IEEE Conf. Dec. Control, Athens, Greece, 1986.
14. B. E.
Ydstie, "Stability of discrete model reference control-revisited," Syst. Control Lett., vol. 13, pp. 429-438, 1989. 15. B. E. Ydstie and M. P. Golden, "Chaos and strange attractors in adaptive control systems," Proc. lOth IFAC World Congress, Munich, vol. 10, pp. 127-132, 1987.
Appendix 1: Proof of Lemma 3.2

Multiplying through with A(t) in equation (13) we obtain
A(t) a(t+1) ≤ A(t) ( σ a(t) + (φ(t)'θ̃(t))² ).

Dividing through with a(t+1) and using a(t+1) ≥ g₁² a(t),

A(t) (1 − σ/g₁²) ≤ (φ(t)'θ̃(t))² / a(t+1).   (20)
Ydstie
To determine the relative occurrence of the events A(t) = 1 we need to relate this equation to a very specific gradient property of the parameter estimator. The swapping error η(t) = φ(t−d)'(θ(t−1) − θ(t−d)) satisfies

|η(t)| ≤ ‖φ(t−d)‖ Σ_{i=1}^{d−1} ‖P φ(t−d−i) e(t−i)/n(t−i)‖,

where we used the fact that Pr_Θ{·} projects orthogonally onto Θ*, so that ‖Pr_Θ{θ* − θ}‖ ≤ ‖θ* − θ‖ for all θ ∈ Θ*. Since ‖Pφ(t−d)‖²/n(t) ≤ p_max we then have, using (10),

|η(t)| ≤ (K_φ a(t−d+1) + k_φ)^{1/2} p_max^{1/2} Σ_{i=1}^{d−1} |e(t−i)| / n(t−i)^{1/2}.
Here we used the fact that a(t) ≥ m(t). By noting that φ(t−d)'θ̃(t−d) = e(t) − η(t) + v(t) we have the inequality

|φ(t)'θ̃(t)| ≤ |e(t+d)| + (K_γ a(t+d+1) + k_γ)^{1/2} + (K_φ a(t+1) + k_φ)^{1/2} p_max^{1/2} Σ_{i=1}^{d−1} |e(t+d−i)| / n(t+d−i)^{1/2},
so that from (20)

A(t) (1 − σ/g₁²) ≤ (1/a(t+1)) [ n(t+d)^{1/2} |e(t+d)|/n(t+d)^{1/2} + (K_γ a(t+d+1) + k_γ)^{1/2} + (K_φ a(t+1) + k_φ)^{1/2} p_max^{1/2} Σ_{i=1}^{d−1} |e(t+i)|/n(t+i)^{1/2} ]².
Now, a(t) enjoys the same bounded rate of growth property as m(t) (Equation (12)), so that for all t and i > 0 we have

a(t+i) ≤ g₂^i a(t) + k₃,   k₃ = g₂^i Σ_{j=0}^{i−1} g₂^{−j} k₁.

Since n(t+d) ≤ r + p_max ‖φ(t)‖² ≤ r + p_max (K_φ a(t+1) + k_φ), this gives, by re-arranging the expression above,
A(t) (1 − σ/g₁²) ≤ (1/a(t+1)) [ (K_γ (g₂^d a(t+1) + k₃) + k_γ)^{1/2} + (r + p_max(K_φ a(t+1) + k_φ))^{1/2} Σ_{i=1}^{d} |e(t+i)|/n(t+i)^{1/2} ]²
   ≤ c₁ K_γ + 2 (k₃ + k_γ)/a(t+1) + 2d ( p_max K_φ + (r + k_φ p_max)/a(t+1) ) Σ_{i=1}^{d} e(t+i)²/n(t+i),   (21)
Stability of the Direct Self-Tuning Regulator
where c₁ = 2 g₂^d. From this we then have, by summing over the interval [t−N, t],

(1 − σ/g₁²) Σ_{i=t−N}^{t} A(i) ≤ Σ_{i=t−N}^{t} [ c₁ K_γ + 2 (k₃ + k_γ)/a(i+1) + 2d ( p_max K_φ + (r + k_φ p_max)/a(i+1) ) Σ_{j=1}^{d} e(i+j)²/n(i+j) ].   (22)
The last term is the sum of squared a posteriori errors. These are known to satisfy a smallness-in-the-mean condition when the parameter estimates and the unmodeled terms are bounded. Below we show that e(t)²/n(t) is small in the mean on the sets where the adaptive controller is diverging. We use this property to demonstrate that the regressor measures a(t), and from this we get ultimate boundedness of the extended state when there are unmodeled dynamics. To elaborate on this point we introduce another comparison function, namely V(t) = θ̃(t)'P⁻¹θ̃(t). A bit of algebra shows that with the projection we have

V(t) ≤ V(t−1) − (e(t)² + 2 e(t)v(t))/n(t) ≤ V(t−1) − e(t)²/(2n(t)) + 2 v(t)²/n(t).
Let R be an arbitrary positive number. By assuming a(i) ≥ R for i ∈ [t−N+1, t+d], we obtain from inequality (22)

(1 − σ/g₁²) Σ_{i=t−N}^{t} A(i) ≤ N ( c₁ K_γ + 2 (k₃ + k_γ)/R ) + 2d ( p_max K_φ + (r + k_φ p_max)/R ) Σ_{i=t−N+1}^{t+d} [ 2( V(i−1) − V(i) ) + 4 v(i)²/n(i) ].
Hence,

(1 − σ/g₁²) Σ_{i=t−N}^{t} A(i) ≤ N ( c₁ K_γ + 2 (k₃ + k_γ)/R ) + 4d ( p_max K_φ + (r + k_φ p_max)/R ) [ V(t−N) − V(t+d) + 2(N+d+1) max_{t−N+1 ≤ i ≤ t+d} (K_γ a(i+1) + k_γ) / ( p_min ‖φ(i−d)‖² + r ) ].   (23)
What is required now is to show that the last term, under suitable conditions, is small. In other words, we need to show that the regressor overbounds the unmodeled dynamics. We use a "backwards in time" argument which goes as follows: First we show that if A(t_k) = 1 for one or more events t_k ∈ [t−N+1, t+d], then the regressor overbounds the measuring signal a(t)
on this event. This follows from equation (13) since if A(t_k) = 1 then from its definition (φ(t_k)'θ̃(t_k))² > (g₁² − σ) a(t_k) + k₂, so that

‖φ(t_k)‖² > (1/M_θ²) ( (g₁² − σ) a(t_k) + k₂ ).   (24)
We then use a bounded rate of growth argument to show that if R is chosen large and the model mismatch is small, then ‖φ(t−d)‖ overbounds a(t+1) on a limited number of preceding events, due to a bounded rate of growth property of ‖φ(t)‖² and the fact that a(t) decreases when A(t) = 0. The expression φ(t) = A(θ(t))φ(t−1) + b₁v(t) + b₂y*(t+d), derived from equation (2) and the control law (Step 2), gives an alternative description of the closed loop of the adaptive system. Here |A(θ(t))| ≤ g_θ < ∞ because θ(t) ∈ Θ*. This gives

‖φ(t)‖ ≤ g_θ^N ‖φ(t−N)‖ + Σ_{i=0}^{N−1} g_θ^i ( |y*(t+d−i)| + |v(t−i)| )
for all N > 0. This bound is obviously very conservative and presents us with a serious problem when it comes to establishing tight bounds. Dividing through with a(t+d+1)^{1/2} and using a(t) ≥ g₁^i a(t−i) and a(t+d+1) ≤ g₂^{d+1} a(t) + k₃ gives

‖φ(t)‖ / (g₂^{d+1} a(t) + k₃)^{1/2} ≤ (g_θ/g₁^{1/2})^N ‖φ(t−N)‖ / a(t+d+1−N)^{1/2} + Σ_{i=0}^{N−1} (g_θ/g₁^{1/2})^i g₁^{(d−1)/2} [ k_{y*}/a(t+1−i)^{1/2} + ( K_γ + k_γ/a(t+1−i) )^{1/2} ].

This obviously gives what is needed, since we now have for all N

‖φ(t−N)‖ / a(t+d+1−N)^{1/2} ≥ (g₁^{1/2}/g_θ)^N [ ‖φ(t)‖ / (g₂^{d+1} a(t) + k₃)^{1/2} − Σ_{i=0}^{N−1} (g_θ/g₁^{1/2})^i g₁^{(d−1)/2} ( k_{y*}/a(t+1−i)^{1/2} + ( K_γ + k_γ/a(t+1−i) )^{1/2} ) ].
The last term defines a geometric series which converges as N → ∞ (quite fast, as it turns out). It follows, by still assuming a(i) ≥ R,

‖φ(t−N)‖ / a(t+1+d−N)^{1/2} ≥ (g₁^{1/2}/g_θ)^N [ ‖φ(t)‖ / (g₂^{d+1} a(t) + k₃)^{1/2} − ( K_γ^{1/2} + (k_{y*} + k_γ^{1/2})/R^{1/2} ) g₁^{(d−1)/2} / (1 − g_θ/g₁^{1/2}) ].
At this point it is convenient to set k₅ = k₃(g₁² − σ)/g₂^{d+1}. From the above and equation (24) this implies that if A(t_k) = 1 for some t_k ∈ [t−N, t+d], then

‖φ(t−d−i)‖ / a(t−d−i+1)^{1/2} ≥ (g₁^{1/2}/g_θ)^N [ ( (g₁² − σ)/(g₂^{d+1} M_θ²) )^{1/2} − ( K_γ^{1/2} + (k_{y*} + k_γ^{1/2})/R^{1/2} ) g₁^{(d−1)/2} / (1 − g_θ/g₁^{1/2}) ]

for all i ≥ 0. The expression on the right hand side is positive and makes sense if K_γ is small and R is large.
We now have, using expression (23) and the inequality derived above, still assuming that a(t+d−i) ≥ R: since every i ∈ [t−N+1, t+d] lies within N+d+1 steps of the event t_k, the lower bound for ‖φ(t−d−i)‖ gives

max_{t−N+1 ≤ i ≤ t+d} (K_γ a(i+1) + k_γ) / ( p_min ‖φ(i−d)‖² + r ) ≤ (K_γ + k_γ/R) (g_θ²/g₁)^{N+d+1} / ( p_min [ ( (g₁² − σ)/(g₂^{d+1} M_θ²) )^{1/2} − ( K_γ^{1/2} + (k_{y*} + k_γ^{1/2})/R^{1/2} ) g₁^{(d−1)/2} / (1 − g_θ/g₁^{1/2}) ]² ).
We now use the inequality (a − b)² ≥ 0.5 a² − b², which gives

[ ( (g₁² − σ)/(g₂^{d+1} M_θ²) )^{1/2} − ( K_γ^{1/2} + (k_{y*} + k_γ^{1/2})/R^{1/2} ) g₁^{(d−1)/2} / (1 − g_θ/g₁^{1/2}) ]² ≥ 0.5 (g₁² − σ)/(g₂^{d+1} M_θ²) − ( K_γ^{1/2} + (k_{y*} + k_γ^{1/2})/R^{1/2} )² g₁^{d−1} / (1 − g_θ/g₁^{1/2})².

Inserting this into the bound above shows that the last term in (23) can be made arbitrarily small by choosing R sufficiently large and K_γ sufficiently small, and this establishes Lemma 3.2.

□
Adaptive-Invariant Discrete Control Systems Ya. Z. Tsypkin Institute of Control Sciences, Moscow, USSR.
Abstract. The paper investigates the structure and properties of selective-invariant and adaptive-invariant discrete control systems, in which the effects of external regular disturbances are eliminated and those of external stochastic disturbances are essentially weakened. Possibilities of eliminating the constraints caused by multiple delays and nonminimum phase of a dynamic plant are described.
1 Introduction

A great number of works [1] are devoted to the problem of invariance or, in other words, the problem of compensating the influence of external disturbances in both continuous and discrete control systems. Conditions of absolute or complete invariance, providing accurate compensation of arbitrary disturbances, are, as a rule, physically unrealizable. The need to relax these invariance conditions, and to ensure compensation of a known class of disturbances, led to the notion of selective invariance in continuous systems, which was introduced by V. S. Kulebakin [2,3]. Availability of a priori information from various disturbance models made it possible to design systems capable of resisting broader classes of disturbances. The theory of such control systems has been developed by Johnson [4,5] for both continuous [4] and discrete systems [5]. With an insufficient level of a priori information on disturbances it is expedient to use the adaptive approach, which provides an estimate of the class of disturbances and uses it for disturbance compensation. Systems which use adaptation to achieve invariance are called adaptive-invariant systems. This paper presents design principles of σ²-adaptive-invariant discrete-time control systems. After a brief review of discrete-time control systems, the character of a priori and current information about regular disturbances is considered and the structures of selective- and adaptive-invariant systems are established. Then it is shown that an insignificant modification of the system structures allows us to remove constraints caused by delays and nonminimum-phase plants. Properties of selective- and adaptive-invariant systems are analyzed when the compensation condition is accurately satisfied. Finally, possibilities of constructing systems with stochastic disturbances which are approximately adaptive-invariant, that is σ²-invariant, are analyzed and new problems are formulated.
2 Properties of Discrete Control Systems
Discrete-time control of continuous plants leads to difference equations which can be written in the following form:

Q(q) y(n) = q^{1+k} P_u(q) u(n) + f(n).   (1)

At the n-th time instant, y(n) is the output, u(n) is the control, f(n) is an external disturbance and q is the unit lag operator:

q^m x(n) = x(n − m),   m = 1, 2, ....

The polynomials Q(q) and P_u(q), with degrees N and N₁ respectively, are such that Q(0) = 1, P_u(0) ≠ 0. A polynomial satisfying the condition Q(0) = 1 is referred to as a monic polynomial. To simplify the notation, the lag operator and the polynomial variable are denoted by the same letter. The integer k ≥ 0 indicates a multiple lag. The minimum possible lag of a plant is equal to 1, and corresponds to k = 0 [6]. This fact should be taken into account in the analysis and synthesis of discrete-time control systems. Furthermore, we shall assume that the plant is stable and minimum-phase, i.e., that all the zeros of the polynomials P_u(q) and Q(q) lie outside the unit disk |q| ≤ 1. For brevity such polynomials are said to be "external" [7] or stable. The requirement of plant stability is not an essential constraint, since a known unstable plant can always be stabilized by feedback. A nonminimum-phase plant imposes some definite limitations on controller synthesis [6,7]. The possibility of overcoming these constraints will be discussed in Section 7. Meanwhile, unless stated otherwise, a plant is assumed to be minimum-phase and stable. The block-scheme of a plant is given in Fig. 1.
Fig. 1. Plant to be controlled.
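Equation (1) can be simulated directly by treating each polynomial in the unit lag operator q as a list of coefficients; since Q is monic, y(n) can be solved for explicitly. The sketch below is an illustration only; the function name and calling convention are ours, not from the text.

```python
def simulate_plant(Q, Pu, k, u, f):
    """Simulate Q(q) y(n) = q^{1+k} Pu(q) u(n) + f(n).

    Q and Pu are coefficient lists in the unit lag operator q
    (Q[0] must be 1, i.e. Q is monic); u and f are the control and
    disturbance sequences.  Signals before time 0 are taken as zero.
    """
    assert Q[0] == 1, "Q must be monic: Q(0) = 1"
    y = []
    for n in range(len(u)):
        acc = f[n]
        # q^{1+k} Pu(q) u(n) = sum_j Pu[j] * u(n - 1 - k - j)
        for j, b in enumerate(Pu):
            if n - 1 - k - j >= 0:
                acc += b * u[n - 1 - k - j]
        # move the lagged output terms of Q(q) y(n) to the right-hand side
        for m in range(1, len(Q)):
            if n - m >= 0:
                acc -= Q[m] * y[n - m]
        y.append(acc)
    return y
```

For Q(q) = 1 − 0.5q, P_u(q) = 1 and k = 0, a unit pulse in u gives the decaying response 0, 1, 0.5, 0.25, reflecting the minimum lag of one step.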
Without external disturbances, f(n) = 0, the plant equation (1) becomes

Q(q) y(n) = q^{1+k} P_u(q) u(n).   (2)

As an optimal controller for such a plant we define the controller which satisfies the following condition:

T⁰(q) ( y(n) − y₀(n) ) = 0,   n > 1 + k,   (3)
where T⁰(q) is an external monic polynomial of degree N₂ > 0, with the assigned distribution of zeros, and y₀(n) is the output of a reference model,

G₀(q) y₀(n) = q^{1+k} H₀(q) r(n),   (4)

where r(n) is a given reference input, and G₀(q) and H₀(q) are external polynomials of degrees N₃ and N₄, respectively. It is known from [6,7] that an optimal controller for plant (1) is described by the equation

P_u(q) S(q) u(n) = T⁰(q) r̄(n) − P(q) y(n).   (5)

Here S(q) is a monic polynomial of degree k and P(q) is a polynomial of degree max(N − 1, N₂ − k − 1) satisfying the polynomial equation

Q(q) S(q) + q^{1+k} P(q) = T⁰(q),   (6)

and r̄(n) is defined by

r̄(n) = (H₀(q)/G₀(q)) r(n).   (7)
The block-scheme of the optimal system described by equations (2) and (5) is shown in Fig. 2a, and its equivalent form is shown in Fig. 2b.

Fig. 2. Two equivalent forms of the optimal system.

With k = 0, which corresponds to S(q) = 1, we obtain from (2) and (5)
Q(q) y(n) = q P_u(q) u(n),   (8)
P_u(q) u(n) = T⁰(q) r̄(n) − P(q) y(n).   (9)
From (6) it follows that

P(q) = q⁻¹ ( T⁰(q) − Q(q) ),   (10)

and (3) reduces to

T⁰(q) ( y(n) − y₀(n) ) = 0,   n ≥ 1,   (11)

where

y₀(n) = q (H₀(q)/G₀(q)) r(n).   (12)
In this case it is not necessary to solve (6) for S(q) and P(q), nor to include a multiple lag in the reference model. The block-scheme of the optimal system (8) and (9) is shown in Fig. 3a and its equivalent form is given in Fig. 3b. Note that for a nonminimum-phase plant the system becomes non-robust [8] due to cancellation of the polynomial P_u(q) between the plant and the controller. Thus a change in the structure of the controller is necessary. Nonminimum phase of a plant and its multiple lag impose additional constraints upon the choice of a reference model [6,8].
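Because T⁰(q) and Q(q) are both monic, the k = 0 design equation (10) amounts to a coefficient-wise subtraction followed by dropping the vanishing constant term (division by q). A minimal sketch, with a function name of our choosing:

```python
def controller_gain_poly(T0, Q):
    """Solve Q(q) + q P(q) = T0(q) for P(q) (the k = 0 case of (6), S(q) = 1).

    T0 and Q are coefficient lists in q; both must be monic, so the
    constant terms cancel and the difference is divisible by q.
    """
    assert T0[0] == 1 and Q[0] == 1, "T0 and Q must be monic"
    n = max(len(T0), len(Q))
    T0 = T0 + [0.0] * (n - len(T0))
    Q = Q + [0.0] * (n - len(Q))
    diff = [t - c for t, c in zip(T0, Q)]  # T0(q) - Q(q); constant term is 0
    return diff[1:]                        # divide by q: drop the zero constant
```

For Q(q) = 1 − 1.5q + 0.7q² and T⁰(q) = 1 − 0.8q this yields P(q) = 0.7 − 0.7q.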
Fig. 3. Optimal system forms in the case k = 0.

For completely deterministic perturbations f(n) ≠ 0, the synthesis problem of optimal control for a plant described by (1), with respect to various optimality criteria, leads to the solution of corresponding variational problems [8].
For example, if the disturbances satisfy a known bound, then minimax or game approaches can be used to determine the "best" result for the "worst" disturbance [9-11]. Absolutely invariant discrete systems under arbitrary disturbances are, as a rule, unrealizable, because they require infinite feedback gain [12], which is not allowed because of the lag inherent in discrete-time plants. The problem of synthesizing an adaptive-invariant discrete system consists in determining the structure and parameters of a controller such that the control system response coincides with the response of the optimal system in the absence of external regular disturbances. Further problems consist in considering the possibilities of synthesizing adaptive-invariant systems in which the constraints caused by multiple lag and plant nonminimum phase are removed. The solution of these problems will be based on a combination of the adaptive approach [7] with the theory of selective invariance [2,3] for discrete control systems.

3 A Priori and Current Information on Disturbances
A priori information on the disturbance plays an important role in the synthesis of adaptive-invariant systems. Assume that the disturbance satisfies a difference equation

f(n) = D(q) f(n − k − 1),   (13)

where D(q) is a polynomial of degree M. In the sequel (13) is called the equation of extrapolation or prediction, while the polynomial D(q) is a prediction polynomial. The prediction equation is easily reduced to the compensation equation

(1 − q^{1+k} D(q)) f(n) = 0.   (14)
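The compensation condition (14) is easy to verify numerically for simple disturbance classes: a constant f is annihilated by D(q) = 1, a linear f by D(q) = 2 − q, and f(n) = aⁿ by D(q) = a. A sketch (the helper names are ours):

```python
def apply_poly(coeffs, f, n):
    """Evaluate (sum_j coeffs[j] q^j) f at time n, with q the unit lag."""
    return sum(c * f(n - j) for j, c in enumerate(coeffs))

def compensation_residual(D, k, f, n):
    """Residual of (1 - q^{1+k} D(q)) f at time n; zero means f is compensated."""
    # 1 - q^{1+k} D(q) has coefficient 1 at q^0 and -D[j] at q^{1+k+j}
    poly = [1.0] + [0.0] * k + [-d for d in D]
    return apply_poly(poly, f, n)
```

For instance, `compensation_residual([2, -1], 0, lambda n: 3 * n + 2, 5)` evaluates f(5) − 2f(4) + f(3), which vanishes for any linear f.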
Let us call the polynomial 1 − q^{1+k} D(q) a compensation polynomial. Equation (14) indicates the conditions under which the function f(n) is compensated. Theorems providing the possibility to represent a wide class of continuous functions in the form of solutions of homogeneous differential equations have been formulated by C. Shannon in his investigations of differential analyzers [13] and by V. S. Kulebakin for problems of electromechanics and the development of selective-invariant systems [2,3]. Representation of functions f(n) in the form of homogeneous difference equations was considered in [14]. In the presence of sufficiently complete information on the disturbance f(n) a prediction polynomial is quite easily determined. Thus, if f(n) = d is constant then D(q) = 1; if f(n) = d₁n + d₀ is linear, then D(q) = 2 − q; if f(n) = aⁿ is an exponential function, then D(q) = a, etc. In some cases it is expedient to use polynomial splines [15] instead of a conventional prediction polynomial D(q). With incomplete a priori information a prediction polynomial is known only up to its coefficients, which can be reconstructed from disturbance observations. Thus, there is a need to obtain information about the disturbance and its estimate. As a rule, direct measurement of a disturbance is impossible. Therefore an indirect adjustable model method
will be employed, which is well known in identification [6,7,16]. However, the adjustable model will now play a different role. First, its parameters are not to be identified but rather determined from known values of the plant coefficients. Second, this model is a two-input finite impulse response (FIR) model [16]. Third, it is introduced here to identify a deterministic disturbance, rather than a stochastic one. The equation of this model has the following form:

ŷ(n) = (1 − Q(q)) y(n) + q^{1+k} P_u(q) u(n).   (15)

We define the error

e(n) = y(n) − ŷ(n)   (16)

and obtain from (15) that

e(n) = Q(q) y(n) − q^{1+k} P_u(q) u(n).   (17)

By virtue of the plant equation (1) the error is equal to the disturbance:

e(n) = f(n).   (18)

Thus, the two-input FIR model provides an indirect measurement of the disturbance. It would seem that only one step is left to achieve invariance: to send the signal of the measured disturbance e(n) with the opposite sign in order to cancel the effect of the disturbance. But this signal can be sent to the plant only by the controller, and the lag in the plant will cause the difference of disturbances f(n) − f(n − 1) to appear. This difference for an arbitrary disturbance will differ from zero even in the absence of a multiple lag, k = 0. This fact most evidently testifies to the unrealizability of absolute invariance.
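The indirect measurement (17)-(18) can be checked numerically: filtering recorded y and u through the model error e(n) = Q(q)y(n) − q^{1+k}P_u(q)u(n) reproduces the disturbance exactly. A sketch under the same coefficient-list convention as before (function name ours):

```python
def fir_disturbance_estimate(Q, Pu, k, y, u, n):
    """Two-input FIR model error e(n) = Q(q) y(n) - q^{1+k} Pu(q) u(n).

    By the plant equation this error equals the disturbance f(n),
    giving an indirect measurement of f without sensing it directly.
    """
    e = sum(Q[m] * y[n - m] for m in range(len(Q)) if n - m >= 0)
    for j, b in enumerate(Pu):
        if n - 1 - k - j >= 0:
            e -= b * u[n - 1 - k - j]
    return e
```

With Q(q) = 1 − 0.5q, P_u(q) = 1, k = 0, a constant control and a single disturbance pulse of size 2 at n = 2, the recorded output is y = 0, 1, 3.5, 2.75, and the model error recovers f(2) = 2 and f(3) = 0.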
4 Selective-Invariant Systems
Let us use the above indirect method of measuring disturbances for the synthesis of a selective-invariant optimal system. Consider a controller equation which differs from equation (5) in having an additional signal depending on the error e(n):

P_u(q) S(q) u(n) = T⁰(q) r̄(n) − P(q) y(n) − D(q) S(q) e(n).   (19)

Here e(n) is found from equation (17) and the polynomial D(q) is a prediction polynomial, so that

ê(n) = D(q) e(n − k − 1).   (20)

Now, taking into account that e(n) = f(n), we obtain the compensation condition

(1 − q^{1+k} D(q)) f(n) = 0.   (21)

Let us show that controller (19) provides selective invariance for the plant

Q(q) y(n) = q^{1+k} P_u(q) u(n) + f(n).   (22)
Indeed, substituting u(n) from (19) into (22) and taking into account polynomial equation (6) and the reference model equations (4) and (7), we obtain

T⁰(q) y(n) = T⁰(q) y₀(n) − (1 − q^{1+k} D(q)) S(q) f(n).   (23)

But, by virtue of the compensation condition (21), the last term in (23) vanishes and the optimality condition (3) follows from (23). The block-scheme of the selective-invariant system, described by equations (22), (19) and (17), is shown in Fig. 4. There the processes do not depend on the disturbance f(n).

Fig. 4. Selective-invariant system.

The system is equivalent to the optimal system in Fig. 2. Comparing the controller equation (5) of the optimal system in Fig. 2 with equation (19) of the selective-invariant optimal system in Fig. 4, we conclude that the latter differs only in the additional signal D(q)S(q)e(n), where D(q) is a prediction polynomial and S(q) is the solution of polynomial equation (6). This signal is generated by the FIR model (15) and a predictor described by equation (20). Equation (19) with e(n) substituted from (17) can be written in the explicit form

(1 − q^{1+k} D(q)) P_u(q) S(q) u(n) = T⁰(q) r̄(n) − [ P(q) + D(q) S(q) Q(q) ] y(n).   (24)

This equation reveals the role of the polynomial D(q), which is closely connected with the main ideas of the K(D)-transform used by V. S. Kulebakin for selective invariance [3]. Equation (24) is also close to the one used by C. Johnson [5] to construct control systems which accommodate disturbances. However, instead of a two-input FIR model, [5] uses a reduced-order Luenberger observer and does not point out the connection with selective-invariance results.
5 Adaptive-Invariant Systems

If the a priori information on disturbances is not sufficient to determine in advance all predictor polynomial coefficients, then D(q) in (19) will represent a polynomial with unknown coefficients θ, and so instead of (19) we shall have

P_u(q) S(q) u(n) = T⁰(q) r̄(n) − P(q) y(n) − D(q, θ) S(q) e(n).   (25)

An estimate of the coefficients θ = θ(n) from the observations e(n) = f(n) can be obtained using the adaptive approach [19,7], by means of a recurrent algorithm. To construct an algorithm, we consider a prediction equation similar to (20), namely

ê(n) = D(q, θ) e(n − k − 1),   (26)

where θ is the vector of coefficients of the polynomial D(q, θ), of dimension M equal to the polynomial degree. Introducing the vector of observations

ψ_k(n) = ( e(n−k−1), e(n−k−2), ..., e(n−k−M) )ᵀ,   (27)

(26) is written in the following form:

ê(n) = θᵀ ψ_k(n).   (28)
Using [7,19], a general recurrent estimation algorithm is

θ(n) = θ(n−1) + Γ₀(n) [ e(n) − θᵀ(n−1) ψ_k(n) ] ψ_k(n),   (29)

where Γ₀(n) is a gain matrix. In the simplest Kaczmarz algorithm [6,20] this matrix is

Γ₀(n) = [ ψ_kᵀ(n) ψ_k(n) ]⁻¹ I,   (30)

where I is a unit matrix. For the orthogonal projection algorithm [7,16,17] we have

Γ₀(n) = [ ψ_kᵀ(n) Γ(n−1) ψ_k(n) ]⁻¹ Γ(n−1),   (31)

Γ(n) = Γ(n−1) − Γ(n−1) ψ_k(n) ψ_kᵀ(n) Γ(n−1) / ( ψ_kᵀ(n) Γ(n−1) ψ_k(n) ),   (32)

if ψ_kᵀ(n) Γ(n−1) ψ_k(n) ≠ 0. Whenever ψ_kᵀ(n) Γ(n−1) ψ_k(n) = 0, then instead of (31) and (32) we let

θ(n) = θ(n−1),   Γ(n) = Γ(n−1).   (33)
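A minimal sketch of one step of the Kaczmarz update (29)-(30); the skip rule mirrors (33) with the practical threshold δ₀, and the function name and default threshold are ours:

```python
def kaczmarz_step(theta, psi, e, delta0=1e-8):
    """One Kaczmarz update: theta(n) = theta(n-1) + psi (e - theta^T psi) / (psi^T psi).

    theta: current coefficient estimates of the prediction polynomial D(q, theta);
    psi:   observation vector (e(n-k-1), ..., e(n-k-M));
    e:     new prediction target e(n).
    The update is skipped when psi^T psi is below delta0, mirroring the
    regularized condition used in place of (33).
    """
    denom = sum(p * p for p in psi)
    if denom <= delta0:
        return theta[:]                  # leave the estimate unchanged
    err = e - sum(t * p for t, p in zip(theta, psi))
    return [t + p * err / denom for t, p in zip(theta, psi)]
```

For a disturbance obeying an exact prediction equation the estimate can land on the true coefficients quickly; e.g. for e(n) = e(n−1) + e(n−2) (Fibonacci-like data) a single consistent observation already yields θ = (1, 1).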
In practice the condition ψ_kᵀ(n) Γ(n−1) ψ_k(n) ≠ 0 is usually replaced by the condition

ψ_kᵀ(n) Γ(n−1) ψ_k(n) > δ₀,

where δ₀ is some positive number [7,20]. The estimation speed of the Kaczmarz algorithm can be increased by using a modification [21,22]. The orthogonal projection algorithm provides the estimate of the coefficients in a finite number of steps [7,20].

Fig. 5. Adaptive-invariant system.

The block-scheme of the adaptive-invariant system, described by equations (22), (25) and (29), is presented in Fig. 5. In contrast to the selective-invariant system, the system in Fig. 5 has an adaptive loop which accomplishes parameter estimation in the part of the controller which compensates the disturbance. The adaptive loop identifies the disturbance or, more precisely, implements a prediction of the disturbance. The adaptive-invariant system becomes equivalent to the optimal system by the end of the adaptation process. Robustness of the adaptive-invariant system coincides with the robustness of the optimal system to which it is asymptotically equivalent. As is well known [23,24], these optimal systems are robust, i.e., their properties change only slightly under small changes in the plant and/or controller. Hence adaptive-invariant systems also possess the property of robustness.
6 Elimination of Constraints Caused by the Time Lag

The presence of a multiple lag in the plant, k > 0, causes the controller to be more complex. Its order, defined by the degree of the polynomial P_u(q)S(q), is equal to (N₁ + 1 + k), and it increases with the growth of k, while for k = 0 we saw that (9) acquires an extremely simple form. In that case there is no necessity to solve the polynomial equation with respect to S(q) and P(q), since S(q) = 1 and P(q) is explicitly expressed in terms of T⁰(q) and Q(q). Let us now consider
the problem of eliminating the influence of a multiple lag on the processes in the control loop. We will prove that a minor change in the equation of the controller of selective- and adaptive-invariant systems allows us to solve this problem. Consider the equation of plant (1):

Q(q) y(n) = q^{1+k} P_u(q) u(n) + f(n).   (34)

The block scheme of this plant (Fig. 1) with no disturbance can be represented in the form of Fig. 6, where the multiple lag block is isolated.

Fig. 6. Isolating the multiple lag block.

To eliminate the influence of a multiple lag in the feedback loop, the input into the controller should be y(n + k) rather than y(n). This input cannot be obtained directly, but instead it is generated indirectly by means of the two-input FIR model used in the selective-invariant system. Denoting the difference of the two signals by

z(n) = y(n) − y(n + k),   (35)

and the error of the simplified model by

e⁰(n) = Q(q) y(n) − q P_u(q) u(n),   (36)

we obtain z(n) from

Q(q) z(n) = e⁰(n).   (37)

The equation of an optimal controller of the selective-invariant system, in which the influence of a multiple lag is eliminated, is represented in the following form:

P_u(q) u(n) = T⁰(q) r̄(n) − P(q) ( y(n) − z(n) ) − D(q) e(n),   (38)

where, as has been seen earlier in (17),

e(n) = Q(q) y(n) − q^{1+k} P_u(q) u(n).   (39)

The prediction polynomial D(q) is such that

(1 − q^{1+k} D(q)) f(n) = 0,   (40)

and z(n) is found from (36) and (37). Here we have in mind that the polynomial P(q) is determined by (10). Now the feedback signal is equal to y(n) − z(n) = y(n + k), as is seen from (35). Eliminating u(n) from (34) and (38) and taking into account (36), (37), (39) and (40), we obtain the optimality conditions (11), (12) for the optimal system without disturbance and multiple lag.
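The lag-elimination scheme (35)-(37) can be sketched as follows: form the simplified-model error e⁰(n), filter it through 1/Q(q) to get z(n), and return y(n) − z(n), which equals y(n + k) for a disturbance-free plant (function name ours):

```python
def lag_free_output(Q, Pu, y, u):
    """Recover y(n + k) as y(n) - z(n) for a disturbance-free plant.

    e0(n) = Q(q) y(n) - q Pu(q) u(n) is the error of the simplified
    (lag-free) two-input model, and z(n) solves Q(q) z(n) = e0(n).
    """
    z, shifted = [], []
    for n in range(len(y)):
        # e0(n) = Q(q) y(n) - q Pu(q) u(n)
        e0 = sum(Q[m] * y[n - m] for m in range(len(Q)) if n - m >= 0)
        for j, b in enumerate(Pu):
            if n - 1 - j >= 0:
                e0 -= b * u[n - 1 - j]
        # z(n) from Q(q) z(n) = e0(n), using that Q is monic
        zn = e0 - sum(Q[m] * z[n - m] for m in range(1, len(Q)) if n - m >= 0)
        z.append(zn)
        shifted.append(y[n] - zn)
    return shifted
```

For a plant y(n) = 0.5 y(n−1) + u(n−2) (so k = 1), the returned sequence is the recorded output advanced by one step.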
Fig. 7. Selective-invariant system with multiple lag.

The block-scheme of a selective-invariant system without a multiple lag influence is shown in Fig. 7a, where the two previous models are combined. The block-scheme in Fig. 7b is equivalent, except that the lag element is located outside the feedback loop. The processes in these systems are shifted relative to one another by an interval equal to the multiple lag. The idea of eliminating the lag influence in continuous systems by introducing dynamic models of a plant with and without lag was proposed by O. Smith [25,26]. It was also used in discrete systems [27-29]. The adaptive-invariant system which eliminates the influence of multiple lag differs from the selective-invariant system (Fig. 7a) by the presence of an adaptation loop similar to that shown in Fig. 5.
7 Elimination of Constraints Caused by a Nonminimum-Phase Plant

The optimal controller becomes more complex for a nonminimum-phase plant. Modifications of the appropriate equations are presented in [6] and will not be discussed here. Instead, we will consider the possibility of eliminating such complexities. We shall prove that even for a nonminimum-phase plant a relatively simple controller can be used, changing only slightly the process in the optimal system with a minimum-phase plant. The equation of a nonminimum-phase plant with no multiple lag, k = 0, is found from (34):

Q(q) y(n) = q P_u(q) u(n) + f(n).   (41)
But now the polynomial P_u(q) has zeros lying both outside and inside the unit circle |q| = 1. This polynomial can be represented as a product of the external P_u⁺(q) and internal P_u⁻(q) polynomials, i.e.,

P_u(q) = P_u⁺(q) P_u⁻(q).   (42)

Introduce the notation of a normalized inverse polynomial [6]:

P̄_u⁻(q) = q^ℓ P_u⁻(q⁻¹),   (43)

where ℓ is its degree. Then the polynomial

P̄_u(q) = P_u⁺(q) P̄_u⁻(q)   (44)

will be external. From (44) and (43) there follows the equality

P_u(q) P_u(q⁻¹) = P̄_u(q) P̄_u(q⁻¹).   (45)

The division of the polynomial P̄_u(q) by the polynomial P_u(q) is realized with the help of a single iteration of factorization [30], with no need for explicitly determining polynomial zeros. Making use of the notation P̄_u⁻(q) in (43), we represent the block-scheme of the nonminimum-phase plant described by (41) without disturbances in the form shown in Fig. 8. In this case the nonminimum-phase
Fig. 8. Nonminimum-phase plant without disturbances.
element is isolated. Its transfer function is equal to P_u⁻(q)/P̄_u⁻(q). With q = e^{−jω} it follows from (45) that the amplitude-frequency characteristic of this element is constant, that is, it does not depend on the frequency ω [6]. The output of the minimum-phase part is

y_p(n) = q (P̄_u(q)/Q(q)) u(n).   (46)
It can be isolated by appropriately changing the two-input model, and then used as in Section 6 to eliminate the lag influence. In this case the equation of an optimal controller in the selective-invariant system, where the influence of nonminimum phase is eliminated, has the form

P̄_u(q) u(n) = T⁰(q) r̄(n) − P(q) ( y(n) − z(n) ) − D(q) e(n),   (47)

where e(n), z(n) and e⁰(n), with Q(q) z(n) = e⁰(n), are defined from the equations

e(n) = Q(q) y(n) − q P_u(q) u(n),   (48)

z(n) = y(n) − y_p(n) = y(n) − q (P̄_u(q)/Q(q)) u(n),   (49)

e⁰(n) = Q(q) y(n) − q P̄_u(q) u(n).   (50)
The compensation condition (40) is fulfilled and the polynomial P(q) is found from equality (10). The block-scheme of a selective-invariant system with the eliminated influence of nonminimum phase is shown in Fig. 9a, and the equivalent block-scheme is given in Fig. 9b. It differs from the optimal control system of a minimum-phase plant in that the phase change is outside the control loop. Of course, in this case y_p(n) will differ from y(n). All we can say is that the amplitude-frequency spectra of the processes y_p(n) and y(n) will coincide. This is the price we pay for eliminating the nonminimum-phase influence. Moreover, elimination of the influence of nonminimum phase and plant lag can be combined if the necessary changes are entered in the utilized two-input model. By introducing the adaptive loop into the selective-invariant system we obtain the adaptive-invariant system, to which the optimal system will be equivalent after the completion of the adaptation process.

8 Approximate Realizations of Adaptive-Invariant Systems

Assume that a disturbance f(n) is generated by a piecewise continuous function f(t), not a continuous one. In this case the compensation equation (14) acquires the form

(1 − q^{1+k} D(q)) f(n) = x(n),   (51)

where x(n) are jumps caused by the change of intervals of the continuous functions f(t).
Fig. 9. Selective-invariant system with a nonminimum-phase plant.
In this case the optimality condition (3) is replaced by

T⁰(q) ( y(n) − y₀(n) ) = x(n).   (52)

If the change of intervals of f(t) occurs very seldom, then the influence of x(n) on the processes in the system is insignificant. In those cases where the disturbance f(n) changes so slowly that the first difference Δf(n) = f(n) − f(n−1) is sufficiently small, it is sufficient to assume D(q) = 1. In this case there is no need for adaptation and we come to the realization of the simplest selective-invariant system. If the first difference is not very small, but the value Δf(n) − αΔf(n−1) is small (which for α = 1 is the second difference), then we should assume
a prediction polynomial of the following form

D(q) = (1 + α) − α q.   (53)
Tuning of the parameter α is done by the adaptation loop. In this case we realize the simplest adaptive-invariant system.

9 σ²-Adaptive-Invariant Systems
Up to now it was assumed that the external disturbances f(n) are deterministic. Let us now consider disturbances ξ(n) that are stochastic and satisfy the conditions

E{ξ(n)} = 0,   E{ξ(n)²} = σ_ξ².   (54)

For simplicity we consider a minimum-phase, stable plant without multiple lag (k = 0) and write its equation in the form

Q(q) y(n) = q P_u(q) u(n) + ξ(n).   (55)

As follows from [8,20,31], an optimal controller minimizing the criterion of the generalized mean square error

J = E{ [ T⁰(q) ( y(n) − y₀(n) ) ]² }   (56)

coincides with the optimal controller (9) for the same plant with no disturbance. However, in contrast to (11) we now have

T⁰(q) ( y(n) − y₀(n) ) = ξ(n),   (57)

and, consequently, the minimum value of criterion (56) is equal to the variance of the random disturbance:

J_min = E{ [ T⁰(q) ( y(n) − y₀(n) ) ]² } = E{ξ(n)²} = σ_ξ².   (58)
The block scheme of the optimal system with stochastic disturbance is shown in Fig. 10. It should be noted that, as opposed to systems with deterministic disturbances, where the transient processes were of main interest, in systems with stochastic disturbances the steady-state processes are at the center of attention. With stochastic disturbances the notion of invariance loses its meaning. However, the problem of synthesizing a system that is adaptive-invariant up to σ² can be formulated, i.e., a system for which the optimality criterion J_min reaches the smallest possible value. We assume a priori that the disturbance ξ(n) is determined by the autoregression (AR) equation [32,16]:

ξ(n) = C(q) ξ(n − 1) + ε(n).   (59)
(59)
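To see numerically why prediction pays off, one can generate an AR disturbance of the form (59) and compare its variance with that of the residual left after applying the polynomial of incomplete compensation 1 − qC(q); a sketch with an arbitrarily chosen first-order C(q):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 20000
c = 0.8                        # illustrative coefficient of C(q), degree M = 0
eps = rng.standard_normal(N)   # epsilon(n), variance sigma^2 = 1

# AR disturbance (59): xi(n) = c*xi(n-1) + eps(n)
xi = np.zeros(N)
for n in range(1, N):
    xi[n] = c * xi[n - 1] + eps[n]

# Incomplete compensation (63): (1 - qC(q)) xi(n) = xi(n) - c*xi(n-1) = eps(n)
resid = xi[1:] - c * xi[:-1]

# Variance drops from about 1/(1 - c^2), roughly 2.8, down to sigma^2 = 1.
print(xi.var(), resid.var())
```

The residual is exactly ε(n), so its variance is the lower bound σ² of (62); the raw disturbance variance σ_ξ² is strictly larger.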
Fig. 10. Optimal system with stochastic disturbance.
Here ε(n) is a sequence of independent identically distributed random values, satisfying the conditions

E{ε(n)} = 0;   E{ε(n)ε(n − m)} = σ² for m = 0, and 0 for m ≠ 0,   (60)

and C(q) is an external polynomial of degree M, with unknown coefficients. This polynomial is actually a prediction polynomial. The variance of ε(n) is equal to the minimum of the optimality criterion (58). Substituting ξ(n) from (59) in (58), and taking into account (60) and the fact that ξ(n − 1) is uncorrelated with ε(n), we obtain

σ_ξ² = E{[C(q)ξ(n − 1)]²} + σ²,   (61)
whence it follows that always

σ_ξ² ≥ σ².   (62)

The AR-equation (59) leads to the form

(1 − qC(q))ξ(n) = ε(n).   (63)
This is the equation of incomplete compensation, while the polynomial 1 − qC(q) of degree M + 1 is referred to as the polynomial of incomplete compensation. The problem of synthesizing the σ²-adaptive-invariant system consists in determining the structure and parameters of a controller which reduces J_min = σ_ξ² to the smallest possible value J_min = σ². The equation of an optimal controller of the σ²-adaptive-invariant system for the plant
Q(q)y(n) = qPu(q)u(n) + ξ(n)   (64)
can be represented in the form

Pu(q)u(n) = T⁰(q)g(n) − P(q)y(n) − C(q, θ)e(n),   (65)
where e(n) is the error e(n) = Q(q)y(n) - qPu(q)u(n),
(66)
which, as becomes obvious from the comparison of (64) and (66), is equal to the disturbance

e(n) = ξ(n).   (67)
The prediction polynomial C(q, θ) in (65) is not fully determined: the vector θ of its coefficients is unknown. To obtain the estimate θ(n) we shall use estimation algorithms in the presence of noise [16]. Introducing the observation vector

φ(n) = (e(n − 1), …, e(n − M))ᵀ,   (68)
the prediction value of e(n), denoted by ê(n), is

ê(n) = C(q, θ)e(n − 1) = θᵀφ(n).   (69)
The best prediction is realized by minimization of the prediction criterion

J_pr(θ) = E{[e(n) − θᵀφ(n)]²},   (70)

or, equivalently, by satisfying the following condition:

∇J_pr(θ) = −E{[e(n) − θᵀφ(n)]φ(n)} = 0.   (71)
On the basis of the adaptive approach [19,16] we can select an estimation algorithm, for example, of the least squares type:

θ(n) = θ(n − 1) + Γ(n)[e(n) − θᵀ(n − 1)φ(n)]φ(n),   θ(0) = θ₀,
Γ(n) = Γ(n − 1) − Γ(n − 1)φ(n)φᵀ(n)Γ(n − 1) / (1 + φᵀ(n)Γ(n − 1)φ(n)),   Γ(0) = aI, a ≫ 1.   (72)
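A minimal sketch of the least-squares recursion (72), estimating the M coefficients of the prediction polynomial from the observed errors e(n). The disturbance here is a synthetic second-order AR sequence; all numeric values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 5000, 2
theta_true = np.array([1.1, -0.3])           # illustrative AR coefficients

# e(n) = theta^T [e(n-1), e(n-2)] + eps(n)  (a stable AR(2) process)
e = np.zeros(N)
for n in range(2, N):
    e[n] = theta_true @ np.array([e[n - 1], e[n - 2]]) + rng.standard_normal()

theta = np.zeros(M)                          # theta(0) = theta_0
Gamma = 1e3 * np.eye(M)                      # Gamma(0) = a*I, a >> 1
for n in range(M, N):
    phi = e[n - M:n][::-1]                   # observation vector (68)
    # Gamma update of (72), then the parameter update:
    Gamma = Gamma - np.outer(Gamma @ phi, phi @ Gamma) / (1.0 + phi @ Gamma @ phi)
    theta = theta + Gamma @ phi * (e[n] - theta @ phi)

print(theta)    # approaches theta_true as n grows
```

The gain matrix Γ(n) shrinks as data accumulate, so the estimate settles; this is the standard recursive least squares behavior that the adaptation loop exploits.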
This estimation algorithm is optimal on the class of disturbances ε(n) with bounded variance [16]. It realizes identification of the disturbance ξ(n). The block-scheme of the σ²-adaptive-invariant system, described by equations (64)-(66), (72), is shown in Fig. 11. The adaptation loop corrects the controller in order to decrease the influence of the disturbance. An equivalent block-scheme for the σ²-adaptive-invariant system has the form shown in Fig. 10 after the disturbance ξ(n) is replaced by ε(n). If ξ(n) is a sequence with independent increments [32], then

(1 − q)ξ(n) = ε(n),

which corresponds to C(q) = 1 in (59). In this case the optimal system is unfit for operation, since σ_ξ² = ∞. Substituting C(q) = 1 for C(q, θ) in (65), we obtain the controller. The
Fig. 11. σ²-adaptive-invariant system.

Fig. 12. Special case: ξ(n) with independent increments.
block-scheme of the system, selective-invariant up to σ², is shown in Fig. 12. There is no adaptation loop in it, and the realization of the prediction equation is extremely simplified. The equivalent block-scheme for the selective-invariant system has a form similar to that shown in Fig. 10 after the disturbance ξ(n) is replaced by ε(n). In this case selective invariance up to σ² turns the unfit original system into an optimal system with minimum error variance. If ξ(n) is a stochastic disturbance determined by an autoregression moving-average (ARMA) equation [16,32]

ξ(n) = C(q)ξ(n − 1) + ε(n) + B(q)ε(n − 1),

where B(q) is an external polynomial, then instead of (63) we obtain the equation of incomplete compensation:
(1 − q (C(q) + B(q)) / (1 + qB(q))) ξ(n) = ε(n),   (73)

and the prediction equation is

ê(n) = ((C(q) + B(q)) / (1 + qB(q))) e(n − 1).

Now, instead of the prediction polynomial there appears a prediction transfer function. The block-scheme of such a σ²-adaptive-invariant system coincides with the one shown in Fig. 11, if in the adaptation loop we use (C(q) + B(q)) / (1 + qB(q))
instead of C(q, θ). Surely, in this case the algorithms become more complex, because it is necessary to estimate the coefficients of both C(q, θ) and B(q, θ). The results obtained can be generalized to the case when the plant possesses a multiple lag; however, we shall not spend time on describing this procedure. Due to incomplete compensation of a stochastic disturbance, eliminating the constraints caused by multiple lag and nonminimum phase leads to a transformation of the disturbance ε(n), and the variance of this transformed disturbance will be greater than σ². This is the price to be paid for constraint elimination.

10 Some Generalizations
Assume that the disturbance is the sum of a deterministic disturbance, f(n), and a stochastic disturbance, ξ(n). In this case the compensation equation (14) is replaced by the equation of incomplete compensation

(1 − qD(q))(f(n) + ξ(n)) = (1 − qD(q))ξ(n).   (74)
Thus, it follows that in a selective-invariant system a stochastic disturbance is transformed, and the variance of the transformed disturbance is increased, because

E{[(1 − qD(q))ξ(n)]²} > E{ξ(n)²} = σ_ξ².   (75)
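The variance inflation in (75) is easy to check numerically: passing a stochastic disturbance through the deterministic compensation polynomial amplifies it. For example, with D(q) = 2 − q, so that 1 − qD(q) = 1 − 2q + q², applied to unit white noise (chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
xi = rng.standard_normal(100000)      # white disturbance, variance 1

# (1 - qD(q)) xi(n) = xi(n) - 2*xi(n-1) + xi(n-2)  for D(q) = 2 - q
t = xi[2:] - 2.0 * xi[1:-1] + xi[:-2]

# For white input the output variance is (1 + 4 + 1) = 6 times larger.
print(xi.var(), t.var())
```

The sum of squared polynomial coefficients (here 1² + 2² + 1² = 6) gives the amplification factor for a white input, which is why a polynomial tuned to the stochastic part is sought next.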
Let us find another prediction polynomial with which the variance of the transformed stochastic disturbance would be smaller:

(1 − qD(q))(1 − qC(q))(f(n) + ξ(n)) − (1 − qC(q))[(1 − qD(q))f(n)] = (1 − qD(q))[(1 − qC(q))ξ(n)].   (76)
Taking into consideration the compensation equation of the deterministic disturbance (14) with k = 0, and the condition of incomplete compensation of the stochastic disturbance (63), we obtain from (76) the equation of incomplete compensation

(1 − qD(q))(1 − qC(q))(f(n) + ξ(n)) = (1 − qD(q))ε(n).   (77)
It is obvious that

E{[(1 − qD(q))ε(n)]²} < E{[(1 − qD(q))ξ(n)]²},   (78)

and, hence,

σ² < E{[(1 − qD(q))ε(n)]²} < σ_ξ².   (79)
Thus, using the equation of incomplete compensation (76), the variance of the transformed stochastic disturbance is decreased in comparison with σ_ξ², but it does not reach the smallest possible value σ². Noting that

(1 − qD(q))(1 − qC(q)) = 1 − q(D(q) + C(q) − qD(q)C(q)),

it follows that the prediction polynomial in this case is

A(q) = D(q) + C(q) − qD(q)C(q).   (80)
In the adaptive-invariant system we must estimate the coefficients of this polynomial. They enter (80) nonlinearly, or, if overparameterized, they become dependent, which makes the algorithm more complex. Essential simplification is attained in the approximate realization of the invariant systems described in Sections 6 and 10. Thus, assuming D(q) = 1 in (80) we obtain

A(q) = 1 + (1 − q)C(q).   (81)
If, in addition, f(n) is a slowly changing deterministic disturbance, then we can assume A(q) = 1. The processes in this system will be weakly dependent on f(n).
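The identity behind (80) is just polynomial multiplication in the shift operator q; a quick numerical check with arbitrary example coefficients (numpy's convolve multiplies polynomial coefficient sequences):

```python
import numpy as np

# Example polynomials in q, ascending powers: D(q) = 2 - q, C(q) = 0.8 + 0.1q
D = np.array([2.0, -1.0])
C = np.array([0.8, 0.1])

def one_minus_q_times(p):
    """Coefficients of 1 - q*p(q)."""
    return np.concatenate(([1.0], -p))

# Left side: (1 - qD(q)) (1 - qC(q))
lhs = np.convolve(one_minus_q_times(D), one_minus_q_times(C))

# Right side: 1 - qA(q) with A(q) = D(q) + C(q) - qD(q)C(q), i.e. (80)
DC = np.convolve(D, C)
A = np.zeros(max(len(D), len(C), len(DC) + 1))
A[:len(D)] += D
A[:len(C)] += C
A[1:len(DC) + 1] -= DC
rhs = one_minus_q_times(A)

assert np.allclose(lhs, rhs)   # the identity before (80) holds coefficientwise
print(A)                       # combined prediction polynomial A(q)
```

The same arithmetic with D = [1.0] reproduces the simplified polynomial (81).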
11 Examples

Consider a dynamic plant described by the equation

Q(q)y(n) = qPu(q)u(n) + ξ(n) + f(n),

where

Q(q) = 1 + 0.5q + 0.2q²;   Pu(q) = 1.1 + 0.3q,   (82)
to which a regular disturbance f(n) and (or) a stochastic disturbance ξ(n) are applied. The regular disturbance is a periodic saw-tooth signal

f₁(n) = 0.4(n − 50k)   for 50k − 12.5 ≤ n < 50k + 12.5,
f₁(n) = −0.4(n − 25 − 50k)   for 50k + 12.5 ≤ n < 50k + 37.5,   (83)
where k is an integer; or the sum of a periodic saw-tooth signal and a harmonic disturbance

f(n) = f₁(n) + f₂(n),   (84)

where

f₂(n) = 5 sin(2πn/25).   (85)
Alternatively, the disturbance f₂(n) can be stochastic, f₂(n) = ξ(n), with ξ(n) described by an equation of the form (63), i.e.,

ξ(n) = (1 / (1 − qC(q))) ε(n),   C(q) = 2 − q,   (86)

where ε(n) satisfies conditions (60) with σ = 1. The reference model is determined by equation (4) with

H⁰(q) = G⁰(q) = 1,   k = 0,   y₀(n) = qr(n) = r(n − 1).
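The test disturbances (83)-(86) are straightforward to generate. A sketch: the saw-tooth is written through its period-50 phase, and the stochastic ξ(n) follows the recursion ξ(n) = 2ξ(n − 1) − ξ(n − 2) + ε(n) implied by (86) with C(q) = 2 − q:

```python
import numpy as np

def f1(n):
    """Periodic saw-tooth (83): period 50, slopes +-0.4, peak 5."""
    m = (np.asarray(n) + 12.5) % 50.0 - 12.5   # phase in [-12.5, 37.5)
    return np.where(m < 12.5, 0.4 * m, -0.4 * (m - 25.0))

def f2(n):
    """Harmonic disturbance (85)."""
    return 5.0 * np.sin(2.0 * np.pi * np.asarray(n) / 25.0)

def xi(N, rng):
    """Stochastic disturbance (86): (1 - 2q + q^2) xi(n) = eps(n)."""
    eps = rng.standard_normal(N)
    x = np.zeros(N)
    for k in range(2, N):
        x[k] = 2.0 * x[k - 1] - x[k - 2] + eps[k]
    return x

n = np.arange(200)
f = f1(n) + f2(n)      # combined regular disturbance (84)
```

Note that ξ(n) is doubly integrated noise, so its variance grows without bound, in agreement with the remark below that the variance of ξ(n) tends to infinity in the course of time.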
The signals g(n) = r(n) and y₀(n) are presented in Fig. 13. For selective-invariant systems (as follows from (19)) with k = 0 and S(q) = 1, the equation of the controller takes the following form:

Pu(q)u(n) = T⁰(q)g(n) − P(q)y(n) − D(q)e(n),   (87)
e(n) = Q(q)y(n) − qPu(q)u(n),   (88)

where T⁰(q) = 1 + 0.5q, Q(q) and Pu(q) are assigned by equation (82), and

P(q) = q⁻¹(T⁰(q) − Q(q)) = 0.3 − 0.2q,   (89)
D(q) = 2 − q.
Under the control law (87), the responses of the plant, y(n), and of the reference model, y₀(n), to the disturbance f(n) equal to the sum of the regular disturbances (83) and (85) (Fig. 14) are presented in Fig. 15. With D(q) = 0, the error of a conventional system, shown in Fig. 16, is increased when compared with Fig. 15.
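The closed loop (82), (87), (88) can be simulated directly. The sketch below uses the saw-tooth disturbance alone and a square-wave reference chosen for illustration (the actual r(n) of Fig. 13 is not specified here); it checks the key structural property that the two-input model error reproduces the disturbance exactly, e(n) = f(n):

```python
import numpy as np

N = 200
n = np.arange(N)
m = (n + 12.5) % 50.0 - 12.5
f = np.where(m < 12.5, 0.4 * m, -0.4 * (m - 25.0))     # saw-tooth (83)
g = np.where((n // 40) % 2 == 0, 0.8, -0.8)            # illustrative reference

y = np.zeros(N); u = np.zeros(N); e = np.zeros(N)
for k in range(2, N):
    # Plant (82), regular disturbance only:
    # (1 + 0.5q + 0.2q^2) y(n) = q(1.1 + 0.3q) u(n) + f(n)
    y[k] = -0.5 * y[k-1] - 0.2 * y[k-2] + 1.1 * u[k-1] + 0.3 * u[k-2] + f[k]
    # Two-input model error (88): e(n) = Q(q)y(n) - qPu(q)u(n)
    e[k] = y[k] + 0.5 * y[k-1] + 0.2 * y[k-2] - 1.1 * u[k-1] - 0.3 * u[k-2]
    # Controller (87): Pu(q)u(n) = T0(q)g(n) - P(q)y(n) - D(q)e(n)
    u[k] = (g[k] + 0.5 * g[k-1] - (0.3 * y[k] - 0.2 * y[k-1])
            - (2.0 * e[k] - e[k-1]) - 0.3 * u[k-1]) / 1.1

assert np.allclose(e[2:], f[2:])   # disturbance isolation: e(n) = f(n)
```

The assertion holds identically, by comparison of (82) and (88); the tracking behavior of y(n) itself corresponds to Figs. 15-16.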
Fig. 13. Reference model signals r(n) and y₀(n).

Fig. 14. Regular disturbance f(n).
Fig. 15. Selective-invariant system output and control.
Fig. 16. Conventional system output and control.
Finally, we assume that an additional stochastic disturbance (86) (Fig. 17) is applied to the plant along with the regular disturbance (84). In this case the error of the selective-invariant system, as shown in Fig. 18, is quite close to that of the nominal system, i.e., of the system in which regular disturbances are absent and, instead of the stochastic disturbance ξ(n), the disturbance ε(n), whose variance is close to 1, is applied to the plant. A conventional system (with D(q) = 1), as can be seen in Fig. 19, is not applicable. Control actions in the selective-invariant and nominal systems differ essentially from one another (see Fig. 18 and Fig. 19). The reason is that the variance of ξ(n) tends to infinity in the course of time. We shall limit ourselves to these simple examples as illustrations of the principle of selective invariance.
Fig. 17. Stochastic disturbance ξ(n).
Fig. 18. Selective-invariant system output and control.
Fig. 19. Conventional system output and control.
12 Concluding Remarks

The main idea of designing adaptive-invariant systems consists in utilizing the error e(n) between the outputs of the plant and the two-input FIR model. This error is equal to the external action f(n) at each moment of time n. Because of the plant lag in discrete systems, this signal cannot be used directly to compensate a disturbance. Therefore, a predictor is introduced, with which a control law is designed to fulfill the task of disturbance compensation. With a deterministic predictor we come to selective-invariant systems. If the a priori information is not complete and the predictor parameters have to be tuned with the help of the adaptation loop, then we come to adaptive-invariant systems. These selective- and adaptive-invariant systems are equivalent to the corresponding optimal discrete systems without deterministic disturbances. The presence of a stochastic disturbance allows realization of selective or adaptive invariance up to σ². These σ²-adaptive-invariant systems are equivalent to the corresponding discrete systems minimizing a quadratic criterion of a generalized error. An attempt to attain in discrete systems absolute invariance with respect to arbitrary deterministic and random disturbances is groundless, because it is impossible to compensate white noise, and forcing the corresponding transfer function to tend to zero via high feedback loop gain is precluded by the time lag in the plant. Thus, without a priori information on external disturbances, their elimination or compensation is impossible in discrete systems; that is: "there is no invariance without prediction". Utilization of a priori information on disturbances allows the synthesis of selective- or adaptive-invariant systems (with deterministic disturbances) and selective-invariant or σ²-adaptive-invariant systems (with random disturbances). The described approach can, obviously, be extended to the uncertainty caused by unknown plant parameters.
The usual adaptive approach was connected with identification of plant parameters [6,7,31]. As the basis of our adaptive-invariant systems we utilized the principle of disturbance isolation by means of a two-input FIR model, and formulated conditions under which we can design, accurately or approximately, adaptive-invariant and σ²-adaptive-invariant systems. With a sufficiently high level of a priori information there may be no need for adaptation, and we return to classical selective-invariant and σ²-selective-invariant systems.

Acknowledgement. The author expresses his gratitude to P. V. Nadezhdin for fruitful discussions.
References

1. A. I. Kukhtenko, "The main stages of the invariance theory formation, Parts 1 and 2," Avtomatika, 1984, no. 2, pp. 3-13; 1985, no. 2, pp. 3-14.
2. V. S. Kulebakin, "On the behavior of the continuously disturbed automatic linear systems," Doklady of the USSR Academy of Sciences, vol. 68, no. 5, pp. 73-79, 1949.
3. V. Kulebakin, "The K(D) transform and its practical application," Trudy of the N. Zhukovsky VVIA, vol. 695, pp. 59, 1958.
4. C. D. Johnson, "Theory of disturbance accommodating controllers," in Advances in Control and Dynamic Systems, Chap. 12, New York, NY: Academic Press, 1976.
5. C. D. Johnson, "Discrete-time disturbance accommodating control theory with applications to missile digital control," J. Guidance and Control, vol. 4, no. 2, pp. 116-125, 1981.
6. Ya. Z. Tsypkin, "Nonminimum phase in discrete control systems," Itogi nauki i tekhniki, Ser. Tekhn. kibernetiki, vol. 18, pp. 3-40, Moscow: VINITI, 1989.
7. Ya. Z. Tsypkin and E. D. Aved'yan, "Discrete adaptive control systems for deterministic plants," Itogi nauki i tekhniki, Ser. Tekhn. kibernetiki, vol. 18, pp. 45-78, Moscow: VINITI, 1985.
8. L. N. Volgin, Optimal Discrete Control of Dynamic Plants. Moscow: Nauka, 1986.
9. E. D. Yakubovich, "Solution of an optimal control problem by means of a discrete linear system," Avtomatika i telemekhanika, vol. 9, pp. 73-79, 1975.
10. A. B. Kurzhansky, Control and Observation under Uncertainty. Moscow: Nauka, 1977.
11. V. M. Kuntsevich and M. M. Lytchak, Synthesis of Optimal and Adaptive Control Systems. Game Approach. Kiev: Naukova dumka, 1985.
12. A. I. Kukhtenko, Problem of Invariance in Automatics. Kiev, Ukr. SSR: GITL, 1963.
13. C. Shannon, "Mathematical theory of the differential analyzer," Journal of Mathematics and Physics, vol. 20, no. 4, pp. 337-352, 1941.
14. V. A. Nikol'sky and I. P. Sevastyanov, "q(e)-transformation of lattice functions in discrete systems," in Automatics and Electromechanics, pp. 30-36, Moscow: Nauka, 1973.
15. J. H. Ahlberg, E. N. Nilson, and J. L. Walsh, The Theory of Splines and Their Applications. New York and London: Academic Press, 1967.
16. J. Zypkin, Grundlagen der informationellen Theorie der Identifikation. Berlin: VEB Verlag Technik, 1987.
17. A. G. Ivakhnenko, Electrical Automatics. Kiev, Ukr. SSR: Gostekhizdat, 1957.
18. B. M. Mensky, Invariance Principle in Automatic Regulation and Control. Moscow: Mashinostrojenije, 1972.
19. Ya. Z. Tsypkin, Adaptation and Learning in Automatic Systems. Moscow: Nauka, 1968.
20. G. C. Goodwin and K. S. Sin, Adaptive Filtering, Prediction and Control. Englewood Cliffs, NJ: Prentice Hall, 1984.
21. E. D. Aved'yan and Ya. Z. Tsypkin, "The generalized Kaczmarz algorithm," Avtomatika i telemekhanika, vol. 3, pp. 71-84, 1979.
22. Y. Marld, "Convergence acceleration of the Kaczmarz algorithm in the case of input process correlation," Avtomatika i telemekhanika, vol. 8, pp. 70-73, 1980.
23. K. J. Åström, Introduction to Stochastic Control Theory. New York, NY: Academic Press, 1970.
24. V. A. Yakubovich, "Optimization and invariance of linear stationary control systems," Avtomatika i telemekhanika, vol. 8, pp. 5-45, 1984.
25. O. Smith, "Closer control of loops with dead time," Chemical Engineering Progress, vol. 53, no. 3, pp. 217-219, 1957.
26. O. Smith, Feedback Control Systems. New York, NY: McGraw-Hill, 1958.
27. Ya. Z. Tsypkin, "Lag influence compensation in sampled-data systems," Theory and Applications of Discrete Automatic Systems, pp. 157-171, Moscow: Izd. of the USSR Academy of Sciences, 1960.
28. H. Gorecki, Analiza i synteza ukladow regulacji z opoznieniem. Warszawa: Wydawnictwa Naukowo-Techniczne, 1971.
29. E. F. Vogel and T. E. Edgar, "A new dead-time compensator for digital control," in Proceedings of the ISA Annual Conference, pp. 29-46, 1980.
30. Z. Vostry, "New algorithms for polynomial spectral factorization with quadratic convergence," Kybernetika, vol. 11, no. 6, pp. 415-422, 1975.
31. Ya. Z. Tsypkin, "Optimality in adaptive control systems," in Uncertainty and Control, Lecture Notes in Control and Information Sciences, vol. 70, Berlin: Springer-Verlag, 1985.
32. T. W. Anderson, The Statistical Analysis of Time Series. New York, NY: John Wiley, 1971.
Stochastic Adaptive System Theory: Recent Advances and a Reappraisal * W. Ren and P. R. Kumar Coordinated Science Laboratory and Department of Electrical and Computer Engineering University of Illinois, Urbana, IL 61801, USA.
Abstract. Progress has been made in the past year towards the solution of several long-standing open problems in stochastic adaptive systems for identification, signal processing and control. We provide an account of these recent advances and a fresh reappraisal of the field. This paper divides itself naturally into two parts. Part I considers identification, adaptive prediction and control based on the ARMAX model. Recent results on the self-optimality of adaptive minimum variance prediction and model reference adaptive control for general delay systems are presented. Both direct and indirect approaches based on non-interlaced extended least squares as well as stochastic gradient algorithms are considered. We emphasize the use of a generalized certainty equivalence approach where the estimates of the disturbance as well as the parameters are utilized. We also show that self-optimality in the mean square sense in general implies self-tuning, by exhibiting the convergence of the parameter estimates to the null space of a certain covariance matrix. Part II considers stochastic parallel model adaptation problems, which include output error identification, adaptive IIR filtering, adaptive noise and echo cancelling, and adaptive feedforward control with or without input contamination. Recent results on the convergence of these parallel model adaptation schemes in the presence of nonstationary colored noise are presented.
1 Introduction

1.1 What is Stochastic Adaptive System Theory?

The problems of recursive identification, adaptive signal processing, and adaptive control of linear stochastic systems have been an active area of research for more than two decades. In this paper, we provide a fresh appraisal of these fields, based on many recent advances, some occurring in the last year alone [23,26,20,17,18]. We focus on analyzing the performance of various discrete time parameter

* The research reported here has been partially supported by the U.S. Army Research Office under Contract No. DAAL-03-88-K0046, by the Joint Services Electronics Program under Contract No. N00014-O0-J-1270, and by an International Paper Company Fellowship for the first author.
adaptation algorithms in the presence of stochastic disturbances.² This is due to our belief that adaptive systems are used in many applications to improve system performance. In many such applications, the key to improving performance is to exploit any statistical properties of the disturbances and signals, which can realistically and profitably be assumed in many practical situations, to obtain sharp performance results. One of the main contributions of stochastic adaptive system theory is to show how one may do this. By the "performance" of a stochastic adaptive system, we mean the asymptotic properties of relevant signals and parameters, such as the following: i) Stability: Is the system stable in a mean square sense? Does it have bounded states? ii) Self-optimality, or more generally the convergence of signals: Do the signals of interest converge to what they would be if the parameters of the system were known? For example, in the case of prediction, does the adaptive prediction converge to the optimal prediction, say in a mean-square or some other sense? iii) Parameter convergence: Does the parameter estimate converge, and if so to what set? Does it converge to the "true" parameter vector, i.e., is it strongly consistent? iv) Self-tuning property: Does the adaptive filter, predictor or controller converge to what we would have designed had we known the parameters? Note that even if the parameter estimates do not converge to the true parameter, the limit set to which they converge may still yield the self-tuning property. The probabilistic properties that we will exploit in this stochastic adaptive system theory are the following: i) the auto-correlation property of a signal; ii) the cross-correlation property between signals; and iii) any persistence of excitation property of a signal. Property i) characterizes the "predictability" of a signal from its own past.
By exploiting such an auto-correlation property, we may be able to estimate the innovation process of a signal, and the signal model which generates it. This could then be employed either for adaptive prediction of a signal, or to identify a system model based on, say, the prediction error identification method. Or, we could use it to attenuate the effect of the predictable portion of an unmeasured but partially predictable disturbance on a system, using adaptive feedback control. By exploiting property ii), we may be able to define "instruments" to obtain asymptotically unbiased estimates of the parameters in the presence of disturbances, by an instrumental variable or output error identification scheme. Alternatively, by making use of an auxiliary correlated measurement, we could extract a signal from noise, as in adaptive noise or echo cancelling, or cancel the effect of a measurable disturbance, as in adaptive feedforward control. Finally, concerning property iii), the persistency of the disturbance provides a "natural" excitation to the system, thus allowing us to identify the system better and to establish stronger parameter convergence

² A disturbance or noise in identification or control of dynamical systems may very well be the signal of interest in a signal processing context. In this paper we mostly adopt a control viewpoint.
results. Thus, we could, for example, conclude the self-tuning property of an adaptive controller, once we know that it is self-optimal.

1.2 A Deterministic Reduction

In the stochastic adaptive system theory that we develop below, the key concern is to deal with the stochastically modeled disturbances and signals. However, it is possible to fully dispense with all probabilistic assumptions, and adopt a completely deterministic model of such signals and disturbances. Some readers may find that this is a more convenient point of view to adopt. To start, let us consider the key properties satisfied by a "white noise" sequence w(t). A probabilistic model for "white noise" is a stochastic process which satisfies the following three assumptions. i) w(t) is a martingale difference sequence with respect to an increasing sequence
of σ-algebras {F_t}, i.e.,

E[w(t) | F_{t−1}] = 0 a.s., for all t.

ii) E[w²(t) | F_{t−1}] = σ² > 0 a.s., for all t.
iii) sup_t E[|w(t)|^α | F_{t−1}] < +∞ a.s., for some α > 2.
The key consequences of these assumptions are given in the following lemma; see [1-4].

Lemma 1. Let f_t be an F_t-measurable sequence. If assumptions i) and iii) above are satisfied, then

a) Σ_{t=1}^N f_{t−1} w(t) = o(Σ_{t=1}^N f_{t−1}²) + O(1) a.s.³
b) Σ_{t=1}^N f_{t−1} w(t) converges a.s. on the event {Σ_{t=1}^∞ f_{t−1}² < ∞}.
c) Σ_{t=1}^N |f_{t−1}| w²(t) = O(Σ_{t=1}^N |f_{t−1}|) a.s. on the event {sup_t |f_t| < ∞}.

If, in addition, w(t) satisfies assumption ii) above, then

d) (1/N) Σ_{t=1}^N (w²(t) − σ²) = O(N^{−δ}) a.s., where 0 < δ < 1 − 2/α.
e) lim_{N→∞} (1/N) Σ_{t=1}^N f_{t−1}² = 0 a.s. on the event {lim_{N→∞} (1/N) Σ_{t=1}^N f_{t−1}² w²(t) = 0 and sup_t |f_{t−1}| < ∞}.
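As a quick numerical illustration of property a): for a bounded sequence f_{t−1}, the martingale sum Σ f_{t−1}w(t) stays negligible relative to the energy Σ f_{t−1}² (a simulation sketch, not part of the proof; the choice of f is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 100000
w = rng.standard_normal(N)            # "white noise" w(t)
f = np.sin(0.1 * np.arange(N))        # a bounded, slowly varying sequence f_t

cross = np.sum(f[:-1] * w[1:])        # sum of f_{t-1} w(t)
energy = np.sum(f[:-1] ** 2)          # sum of f_{t-1}^2

print(abs(cross) / energy)            # small: cancellations dominate
```

The ratio is of order N^{−1/2} here, which is the cancellation effect the lemma formalizes.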
In the above, f_t is a signal which depends only on w(k), k < t, and on any other signal supposed to be "uncorrelated" with {w(t)}. Properties a) and b) essentially capture the fact that w(t) is mean zero, and moreover is "uncorrelated" with f_{t−1}. Since the value of f_{t−1}w(t) is therefore as "likely" to be positive as

³ Here and in the sequel, we write α_t = o(β_t) if α_t/β_t → 0, α_t = O(β_t) if sup_t α_t/β_t < ∞, and α_t ∼ β_t if α_t = O(β_t) and β_t = O(α_t).
negative, in forming the sum Σ_{t=1}^N f_{t−1}w(t) there are likely to be many cancellations, thus rendering it small in comparison with the energy, Σ_{t=1}^N f_{t−1}², in f. Property c) essentially states that since w²(t) has a bounded average value, by assumption iii), the sum Σ_{t=1}^N |f_{t−1}|w²(t) grows at the same rate as Σ_{t=1}^N |f_{t−1}|. Finally, properties d) and e) show that w is a natural non-negligible unpredictable excitation. The above sample path properties capture the essence of white noise. Thus, instead of starting with a probabilistic set of assumptions, one could simply forego all stochastic assumptions and suppose that the noise has the properties a)-e) above, and all the results in this paper would continue to hold. Readers more familiar with deterministic theory may well find this viewpoint to be more enlightening with respect to understanding the goals and results of stochastic adaptive system theory. In this paper, we will mostly model "stochastic" disturbances and signals as a moving average of such a "white noise" sequence. Thus we will consider disturbances or signals v(t) which can be represented as
v(t) = Σ_{i=−∞}^{t} c_v(t − i) w(i),   (1.1)
where the weights {c_v(i)}_{i≥0} are deterministic. The shift structure contained in the above moving average representation (1.1), and Lemma 1 a)-c), will provide us with certain auto-correlation and cross-correlation properties which we can exploit. On the other hand, Lemma 1 d) and e) will provide us with certain "persistency of excitation" properties which we will find useful. The rest of this paper divides itself naturally into two parts. Section 2 treats identification, adaptive prediction and adaptive control based on the ARMAX model, using the so-called "equation error" approach. Recent results on the self-optimality of adaptive prediction and control schemes for general delay systems are presented. Both direct and indirect approaches based on non-interlaced extended least squares (ELS) as well as stochastic gradient (SG) algorithms are considered. We motivate the use of a generalized certainty equivalence approach whereby the estimates of the disturbance as well as the parameters are utilized. We also show that self-optimality implies self-tuning in general. Section 3 considers stochastic parallel model adaptation problems which consist of an unknown linear time invariant system and a partially or wholly tunable system connected in parallel, with a common input to each. The goal of adaptation is to tune the latter system so that its output matches that of the unknown system, despite the presence of an arbitrary disturbance which is stochastically uncorrelated with the common input. Such a general formulation includes the problems of output error identification, adaptive IIR filtering, adaptive noise and echo cancelling, and adaptive feedforward control with or without "input contamination." The convergence of such schemes will be established for fairly general nonstationary colored noise.
2 Identification, Adaptive Prediction and Control Based on an ARMAX Model

There has been a great deal of interest in establishing the convergence properties of recursive identification, adaptive prediction and control schemes based on an ARMAX model. Despite the notable contributions, among others, of Åström and Wittenmark [5], Ljung [6,7], Solo [8], Goodwin, Ramadge, and Caines [9], Goodwin, Sin and Saluja [10], Fuchs [11], Lai and Wei [2], Becker, Kumar and Wei [3], Chen and Caines [12], Kumar and Praly [13], Lai and Wei [14], Chen and Guo [15,4], and Radenković and Stanković [16], several significant open problems remain. Among them are: i) the self-optimality of adaptive prediction and control based on an ELS algorithm; ii) the self-optimality of adaptive prediction and control schemes for systems with general delay using non-interlaced algorithms; iii) a general self-tuning result. Recently in [21], a fairly complete solution to these problems was obtained for generically all systems under the restrictive condition of Gaussian white noise. Early this year Guo and Chen [23] succeeded brilliantly in establishing the optimality of adaptive minimum variance (MV) control based on the ELS algorithm, for systems with unit delay. The convergence of non-interlaced algorithms for the case of general delay has been resolved in Ren and Kumar [17]. Very recently, a general self-tuning result has been established by Ren and Kumar [18]. It is shown that for the SG as well as the ELS algorithms, the parameter estimate converges almost surely to the null space of a certain covariance matrix. Moreover, each point in the null space gives rise to, for example, the optimal controller in the case of adaptive MV control.

2.1 Identification of an ARMAX model

Consider the single input, single output linear system described by the following ARMAX model

A(q⁻¹)y(t) = q^{−d} B(q⁻¹)u(t) + C(q⁻¹)w(t),
(2.1)
where q⁻¹ is the backward shift operator, i.e., q⁻¹y(t) = y(t − 1),

A(q⁻¹) = 1 + a₁q⁻¹ + … + a_p q^{−p},
B(q⁻¹) = b₀ + b₁q⁻¹ + … + b_h q^{−h},
C(q⁻¹) = 1 + c₁q⁻¹ + … + c_r q^{−r},   (2.2)

{y(t)} and {u(t)} are the output and input of the system, and {w(t)}, the disturbance, is a martingale difference sequence with respect to an increasing sequence of σ-fields {F_t} which satisfies

E[w²(t) | F_{t−1}] = σ² a.s., for all t ≥ 1,
sup_t E[|w(t)|^α | F_{t−1}] < ∞ a.s., for some α > 2.
The system (2.1) can also be parametrized as⁴

y(t) = R(q⁻¹)u(t − d) + S(q⁻¹)y(t − d) + q^{−d} D(q⁻¹)v(t) + v(t),   (2.3)
where v(t) = F(q⁻¹)w(t), and F(q⁻¹) = Σ_{i=0}^{d−1} f_i q^{−i}, f₀ = 1. This alternative model has certain advantages when used for adaptive prediction and control, which will become clear later. In this subsection we consider the parameter estimation of both models (2.1) and (2.3). The following stochastic regression model encompasses both models:

y(t) = xᵀ(t − s)β + d₁v(t − s) + … + d_ℓ v(t − s − ℓ + 1) + v(t),   (2.4)
where v(t) = ~"]i=o , - 1 f i w ( t - i), s > 1, x(t) and fl are column vectors, a n d x(t) is .Tt-measurable and is available at time t. T h e correspondence b e t w e e n (2.3) and (2.4) with s = d is clear, while for the model (2.1), s = 1, v ( t ) = w ( t ) , z ( t - 1 ) = [y(t- 1),...,y(tp),u(t - d),...,u(td - h)] T, ;3 = [ a x , . . . , a p , b 0 , . . . , b h ] T, £ = r and di --- ci, i = 1, . . . ,r. Note t h a t the regression model (2.4) is a generalization of t h a t considered in [14], which corresponds to the case s = 1 here. Let 00 := [fiT, d x , . . . , (it] T. T o recursively e s t i m a t e 00, n n a t u r a l extension of the algorithm considered in [14], i.e., for the case s = 1, is as follows: Let o(t- 1) be the e s t i m a t e of/70 at time t - 1, and c o n s t r u c t an e s t i m a t e ~(t - 1) of the u n o b s e r v a b l e v(t - 1), ~(t
-
1)
:= y ( t
-
1) -
¢T(t
--
s --
1)O('t- 1),
where ~bT(t -- s -- 1) := [zT(~ -- s -- 1), ~(~ -- s -- 1 ) , - - . , ~(t -- s -- g)]. T h e n we u p d a t e the e s t i m a t e to 0(t) using either the recursive least squares algorithm, 0(t) = 0(t - 1) + P ( t - s ) ¢ ( t - s ) ( y ( t ) - ¢ T ( t -- s)O(t -- 1)) P ( t - s) -1 = P ( t - s - 1) -1 + @(t
-
8)@T(t
--
8);
P(--s) = I
(2.5) (2.6)
or its stochastic gradient c o u n t e r p a r t ,
-s~¢(t 0(t) = 0(t - 1) -F -(-y-(~t )s r) ( t r(t - s) = r(t - s - 1) +
- CT(t - s)lT(t - 1))
II¢(t - s)H 2,
r ( - s ) = 1.
(2.7) (2.8)
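For concreteness, the two non-interlaced recursions above can be implemented in a few lines. The sketch below is ours, not from the text: `els_step` is a matrix-inversion-lemma form of (2.5)-(2.6), `sg_step` implements (2.7)-(2.8), and the ARMAX example (orders p = 1, h = 0, r = 1, d = 1, with hypothetical coefficients) uses the a posteriori residual v̂(t) in the regressor exactly as described above.

```python
import numpy as np

def els_step(theta, P, phi, y):
    """One ELS/RLS update, eqs. (2.5)-(2.6), written with the matrix
    inversion lemma so that P^{-1} is never formed explicitly."""
    Pphi = P @ phi
    denom = 1.0 + phi @ Pphi           # 1 + phi^T P(t-s-1) phi
    theta = theta + Pphi * (y - phi @ theta) / denom
    P = P - np.outer(Pphi, Pphi) / denom
    return theta, P

def sg_step(theta, r, phi, y):
    """One stochastic gradient update, eqs. (2.7)-(2.8)."""
    r = r + phi @ phi                  # r(t-s) = r(t-s-1) + ||phi||^2
    theta = theta + phi * (y - phi @ theta) / r
    return theta, r

# Illustrative ARMAX system (2.1): y(t) = -a1 y(t-1) + b0 u(t-1) + w(t) + c1 w(t-1)
rng = np.random.default_rng(0)
a1, b0, c1 = 0.5, 1.0, 0.3             # hypothetical true parameters
n = 20000
w, u, y = rng.standard_normal(n), rng.standard_normal(n), np.zeros(n)
for t in range(1, n):
    y[t] = -a1 * y[t - 1] + b0 * u[t - 1] + w[t] + c1 * w[t - 1]

theta, P = np.zeros(3), np.eye(3)
vhat = np.zeros(n)                     # running estimates of the unobservable w(t)
for t in range(1, n):
    phi = np.array([-y[t - 1], u[t - 1], vhat[t - 1]])
    theta, P = els_step(theta, P, phi, y[t])
    vhat[t] = y[t] - phi @ theta       # a posteriori residual v-hat(t)
print(theta)
```

With this seed and sample size, the estimate lands near the true vector [0.5, 1.0, 0.3]; replacing `els_step` by `sg_step` (with scalar `r = 1.0`) gives the slower stochastic gradient variant.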
In contrast to the above algorithms, for s > 1, interlaced algorithms of the form

θ(t) = θ(t-s) + P(t-s)φ(t-s)(y(t) - φ^T(t-s)θ(t-s)),   (2.9)
P^{-1}(t-s) = P^{-1}(t-2s) + φ(t-s)φ^T(t-s),   (2.10)

have usually been considered in the literature [22]. Due to the interlacing of s separate recursions, the algorithm (2.9, 2.10) requires the additional storage in memory of (θ(t-2), ..., θ(t-s)) and (P(t-s-1), ..., P(t-2s)) at each time t. Since each interlaced recursion effectively sees v(t) as a "white noise", it is clear that the results for s = 1 can be easily extended to the case s > 1 by the use of interlaced algorithms. The consideration of such interlaced algorithms has been purely an artifact of the theoretical inability to demonstrate the convergence of non-interlaced algorithms. In Theorems 2.1 and 2.2 below we resolve the convergence properties of the non-interlaced SG and ELS algorithms defined by (2.7, 2.8) and (2.5, 2.6), respectively.

⁴ Let polynomials F and G satisfy AF + q^{-d}G = C, where deg F ≤ d-1, and let polynomials F̄ and Ḡ satisfy CF̄ + q^{-d}Ḡ = 1, with deg F̄ ≤ d-1. Then R = BFF̄, D = -Ḡ, and S = GF̄ + Ḡ. See [22].

Stochastic Adaptive System Theory

The following preliminary fact is useful.

Lemma 2.1 Consider the regression model (2.4). Let D(z) = d_1 + d_2 z + ... + d_ℓ z^{ℓ-1}. Then

(1 + q^{-s}D(q^{-1}))(v̂(t) - v(t)) = -φ^T(t-s)θ̃(t),   (2.11)

where θ̃(t) := θ(t) - θ_0.
Proof.

(1 + q^{-s}D(q^{-1}))(v̂(t) - v(t)) = v̂(t) + q^{-s}D(q^{-1})v̂(t) - y(t) + x^T(t-s)β
= -φ^T(t-s)θ(t) + q^{-s}D(q^{-1})v̂(t) + x^T(t-s)β
= -φ^T(t-s)(θ(t) - θ_0).  □

For convenience, let us define

R(N) := I + Σ_{t=1}^N φ(t)φ^T(t),
φ_0(t) := [x^T(t), v(t), ..., v(t-ℓ+1)]^T,
R_0(N) := I + Σ_{t=1}^N φ_0(t)φ_0^T(t),
r_0(N) := trace(R_0(N)).

Theorem 2.1 Consider the SG algorithm defined by (2.7) and (2.8), and applied to the stochastic regression model (2.4). Assume that 1 + q^{-s}D(q^{-1}) is Hurwitz⁵ and is strictly positive real (SPR), i.e.,

Re[1 + z^s D(z)] > 0,   for z = e^{jω}, ∀ω.   (2.12)

⁵ Here and in the sequel, we shall say that a polynomial p(q^{-1}) is Hurwitz if its zeroes are inside the open unit disk, and that a rational transfer function is stable if its denominator polynomial is Hurwitz.
Then,

‖θ̃(t)‖ converges   a.s.,   (2.13)

Σ_{t=1}^∞ ‖θ(t) - θ(t-k)‖² < ∞   a.s., ∀k < ∞,   (2.14)

Σ_{t=1}^∞ (v̂(t) - v(t))²/r(t-s-1) < ∞   a.s.,   (2.15)

r(t) = O(r_0(t))   a.s.,   (2.16)

Σ_{t=1}^∞ (y(t) - φ^T(t-s)θ(t-k) - v(t))²/r(t-s) < ∞   a.s., ∀k < ∞.   (2.17)

Moreover, if

sup_N [ λ_max(R_0(N)) / λ_min(R_0(N)) ] < ∞   a.s.,  and  r_0(N) → ∞,   (2.18)

then θ(t) → θ_0 a.s.
Proof. Premultiplying (2.7) by φ^T(t-s) and subtracting from y(t) yields

y(t) - φ^T(t-s)θ(t) = [r(t-s-1)/r(t-s)](y(t) - φ^T(t-s)θ(t-1)) = v̂(t),

so that (2.7) can be rewritten as θ̃(t-1) = θ̃(t) - φ(t-s)v̂(t)/r(t-s-1). Taking squared norms yields

‖θ̃(t)‖² - 2φ^T(t-s)θ̃(t)v̂(t)/r(t-s-1) + ‖φ(t-s)‖²v̂²(t)/r²(t-s-1) = ‖θ̃(t-1)‖².

Let V(t) := ‖θ̃(t)‖², and for convenience define the operator H_t(·) on a random variable z as

H_t(z) := z - E[z | F_{t-s}].   (2.19)

Decomposing v̂(t) = (v̂(t) - v(t)) + v(t) and θ̃(t) = E[θ̃(t)|F_{t-s}] + H_t(θ̃(t)) in the above, we obtain

V(t) - 2φ^T(t-s)θ̃(t)(v̂(t) - v(t))/r(t-s-1) + ‖φ(t-s)‖²v̂²(t)/r²(t-s-1)
= V(t-1) + 2φ^T(t-s)E[θ̃(t)|F_{t-s}]v(t)/r(t-s-1) + 2φ^T(t-s)H_t(θ̃(t))v(t)/r(t-s-1).   (2.20)

We will now show that the last term above is absolutely summable, almost surely. Applying the operator H_t(·) on both sides of (2.7), and noting that y(t) - v(t), φ(t-s), r(t-s) and θ(t-s) are all F_{t-s}-measurable, we have, for i = 0, ..., s-2,

H_t(θ(t-i)) = [I - φ(t-s-i)φ^T(t-s-i)/r(t-s-i)] H_t(θ(t-i-1)) + [φ(t-s-i)/r(t-s-i)] H_t(v(t-i)),   (2.21)

and

H_t(θ(t-s+1)) = [φ(t-2s+1)/r(t-2s+1)] H_t(v(t-s+1)).   (2.22)

Now, using (2.21) with i = 0 and H_t(v(t)) = v(t),

φ^T(t-s)H_t(θ(t))v(t)/r(t-s-1) = φ^T(t-s)H_t(θ(t-1))v(t)/r(t-s) + ‖φ(t-s)‖²v²(t)/(r(t-s)r(t-s-1)).   (2.23)

Since Σ_t ‖φ(t-s)‖²/(r(t-s)r(t-s-1)) < ∞, it follows from Lemma 1 c) that the last term on the RHS of (2.23) is summable. To show that the first term on the RHS of (2.23) is also summable, note that by the Schwarz inequality,

Σ_{t=1}^∞ |φ^T(t-s)H_t(θ(t-1))v(t)|/r(t-s) ≤ ( Σ_{t=1}^∞ ‖φ(t-s)‖²v²(t)/r²(t-s) )^{1/2} ( Σ_{t=1}^∞ ‖H_t(θ(t-1))‖² )^{1/2}.   (2.24)

As earlier, the first summation on the RHS above is finite a.s. To show that the second summation is also finite, note that from (2.21), for i = 1, ..., s-2,

‖H_t(θ(t-i))‖ ≤ ‖H_t(θ(t-i-1))‖ + [‖φ(t-s-i)‖/r(t-s-i)] |H_t(v(t-i))|,   (2.25)

and from (2.22),

‖H_t(θ(t-s+1))‖ ≤ [‖φ(t-2s+1)‖/r(t-2s+1)] |H_t(v(t-s+1))|.   (2.26)

So,

‖H_t(θ(t-1))‖ ≤ Σ_{i=1}^{s-1} [‖φ(t-s-i)‖/r(t-s-i)] |H_t(v(t-i))|.   (2.27)

Hence,

Σ_{t=1}^∞ ‖H_t(θ(t-1))‖² < ∞.   (2.28)

From (2.23), (2.24) and (2.28), it then follows that

Σ_{t=1}^∞ |φ^T(t-s)H_t(θ(t))v(t)|/r(t-s-1) < ∞   a.s.

We now bound the second term on the RHS of (2.20). Applying Lemma 1 a), we can obtain

Σ_{t=1}^N φ^T(t-s)E[θ̃(t)|F_{t-s}]v(t)/r(t-s-1) = o( Σ_{t=1}^N (φ^T(t-s)E[θ̃(t)|F_{t-s}])²/r²(t-s-1) ) + O(1).   (2.29)

Using the decomposition θ̃(t) = E[θ̃(t)|F_{t-s}] + H_t(θ̃(t)), we can further write

Σ_{t=1}^N (φ^T(t-s)E[θ̃(t)|F_{t-s}])²/r²(t-s-1) = O( Σ_{t=1}^N (φ^T(t-s)θ̃(t))²/r²(t-s-1) ) + O( Σ_{t=1}^N (φ^T(t-s)H_t(θ̃(t)))²/r²(t-s-1) ).

Now, from (2.23),

Σ_{t=1}^N (φ^T(t-s)H_t(θ̃(t)))²/r²(t-s-1) = O( Σ_{t=1}^N (φ^T(t-s)H_t(θ̃(t-1)))² ) + O(1) = O(1),   (2.30)

where in the last equality we have used (2.28) and the fact that Σ_t (‖φ(t-s)‖²/(r(t-s)r(t-s-1)))² < ∞. It then follows from (2.29) and (2.30) that

Σ_{t=1}^N φ^T(t-s)E[θ̃(t)|F_{t-s}]v(t)/r(t-s-1) = o( Σ_{t=1}^N (φ^T(t-s)θ̃(t))²/r²(t-s-1) ) + O(1).   (2.31)

Finally, from (2.11) and the SPR assumption (2.12), there exist a constant ε > 0 and a random variable S(0), with S(0) < ∞ a.s., such that

S(N) := Σ_{t=1}^N [ -φ^T(t-s)θ̃(t)(v̂(t) - v(t)) - ε(v̂(t) - v(t))² ] + S(0) ≥ 0.   (2.32)

Summing by parts, we have

Σ_{t=1}^N [S(t) - S(t-1)]/r(t-s-1) = S(N)/r(N-s-1) + Σ_{t=1}^{N-1} S(t)[1/r(t-s-1) - 1/r(t-s)] - S(0)/r(-s).   (2.33)

Since r(t) is monotonically nondecreasing, it follows from (2.32) and (2.33) that

Σ_{t=1}^N [ -φ^T(t-s)θ̃(t)(v̂(t) - v(t)) - ε(v̂(t) - v(t))² ]/r(t-s-1) ≥ -S(0)/r(-s).   (2.34)

Using (2.34), (2.31) and (2.28) in (2.20), we can immediately conclude (2.15) and (2.14). (2.16) follows from (2.15), and (2.13) can then be shown using Lemma 1 b). The following chain of inequalities proves (2.17):

Σ_{t=1}^∞ [y(t) - φ^T(t-s)θ(t-k) - v(t)]²/r(t-s)
≤ 2 Σ_{t=1}^∞ [y(t) - φ^T(t-s)θ(t) - v(t)]²/r(t-s) + 2 Σ_{t=1}^∞ [φ^T(t-s)(θ(t) - θ(t-k))]²/r(t-s)
≤ 2 Σ_{t=1}^∞ (v̂(t) - v(t))²/r(t-s) + 2 Σ_{t=1}^∞ ‖θ(t) - θ(t-k)‖² < ∞.   (2.35)

We now turn to the issue of parameter consistency when (2.18) holds. Multiplying (2.7) by R(t-s) and using R(t-s) = R(t-s-1) + φ(t-s)φ^T(t-s), we have

R(t-s)θ̃(t) = R(t-s-1)θ̃(t-1) + φ(t-s)φ^T(t-s)θ̃(t-1) + R(t-s)[φ(t-s)/r(t-s)](y(t) - φ^T(t-s)θ(t-1)).

Summing over t and dividing by r(N-s), each term on the right hand side can be shown to converge to zero by the Schwarz inequality, (2.15), (2.17) and Lemma 1. Hence,

[1/r(N-s)] ‖R(N-s)θ̃(N)‖ → 0   a.s.   (2.36)

From (2.15),

[1/r(N-s)] (R(N-s) - R_0(N-s)) → 0   a.s.   (2.37)

Combining (2.36), (2.37), (2.16) and (2.18) completes the proof. □
Theorem 2.2 Consider the ELS algorithm defined by (2.5) and (2.6), and applied to the stochastic regression model (2.4). Assume that 1 + q^{-s}D(q^{-1}) is Hurwitz and that

Re[ (1 + z^s D(z))^{-1} ] - 1/2 > 0,   for z = e^{jω}, ∀ω.   (2.38)

Then,

Σ_{t=1}^N (v̂(t) - v(t))² = O(log r(N-s))   a.s.,   (2.39)

r(t) = O(r_0(t))   a.s.,   (2.40)

‖θ̃(N)‖² = O( log r(N-s) / λ_min(R(N-s)) )   a.s.,   (2.41)

Σ_{t=1}^N [y(t) - φ^T(t-s)θ(t-k) - v(t)]² / (1 + φ^T(t-s)P(t-s-k)φ(t-s)) = O(log r(N-s))   a.s., ∀k < ∞.   (2.42)

Moreover, if r_0(N) → ∞ and

log λ_max(R_0(N)) / λ_min(R_0(N)) → 0   a.s.,   (2.43)

then θ(t) → θ_0 a.s.

The proof of Theorem 2.2 below is presented in a manner which follows the steps of the proof of Theorem 2.1, so that the similarities and differences are transparent.
Stochastic Adaptive System Theory
281
Rewrite (2.5) as
Proof.
O(t) -
P(t
1)¢(t -
- s -
s)'6(t)
1),
(2.44)
1)-10"(t - 1).
(2.45)
= O(t -
and also as, P ( t - s - 1)-10"(t) - ¢(t -
s)'O(t)
= P(t
-
s -
Multiplying (2.44) and (2.45), we have V(t)
[~T(t -- s)0"(/)] 2 + ¢(t --
-
s)Tp(t
-- s --
1)¢(t -- s ) ~ ( t )
--2~bT(t -where V(0
V(t-
:=
s)O(t)'O(t)
= V(t
-
1),
~r(t)P(t - s)-lff(t). H e n c e ,
1 ) = V ( t ) - 2~bT(t-
s)O(t)[l~bT(t-
+¢(t - s)Tp(t
( ~ ' ( t ) - v(t))]
s)O(t)+
-- s -- 1 ) ¢ ( t -- s ) ~ 2 ( t )
-
2¢T(t
s)O(t)v(t).
--
(2.46)
We will now show t h a t = o
(2.47)
+O(logr(N-s)).
1=1
1=1
We first decompose it as n
I1
~v(t)¢r(t
- s)~(t) = ~_v(t)¢T(t
t=l
- s)E[O(t)lY~_,]
t=1 n
+ ~(t)¢(t
-
s)u,(o(O).
(2.48)
t=l
We will first bound the second t e r m on the RHS above. From (2.5), we have for i = O , . . . , s - 1, P-X(t
-
s -
Noting t h a t 0,...,s-2 P-l(t
i)O(t y(t)
-
-
s -
- i) = ¢(t
-
v(t),
and P ( t ) are all ~-~- m e a s u r a b l e , we h a v e for i =
¢(t)
i)7-lt(8(t
-
s -
i)y(t
i)) = ¢(t
-
-
i) + P-X(t
s -
i)7-lt(v(t
+P-l(t - s
- i -
- s - i -
-
1)0(t - i - 1).
i))
1)Tt,(o(t
-
i -
1)). (2.49)
For i = s - 1, we have 7~,(O(t
-
s +
1)) =
P(t
-
2s +
1)¢(t - 2s + 1)7/dv(t - s + 1)).
From (2.49) and (2.50), it is easy to see t h a t 1--1
7~,(o(0) = P(t - s) ~ , ¢ ( t - s - i)U,(v(t i=0
0).
(2.50)
Renand Kumar
282
So, N
v(t)t~(t
,)n,(e(t))
-
I=1
N
#-1
= E
s)P-l(t - s) E
¢w(t --
t=l
=
¢(t
- s -- i)v(t)Tt,(v(t - i))
i=O
o
._.
%k/=O
)'
•=I
~bT(t --- O(log
s)
(T(t - s)P(t - s)¢(t -
r(N
s -- i)P(t -- s)¢(t -- s -- i)
+ 0(1)
s)),
-
(2.51)
where in the last equality we have used the fact N
~
¢T(¢ _ i ) P ( 0 ¢ ( t - i) = O 0 o g , . ( g ) ) , Vi _ 0.
(2.52)
t=I
We now bound the first term on the RHS of (2.48). Applying Lemma 1 a), we can obtain that =o I----1
+ oo). t=l
Using the decomposition 8"(t) = S[8"(t)[/',_,] +
7t,(O(t)),
we can further write
N
E(T(t
--
s)E[8(t)
[ :p,_.lv(t)
t=l
= o ( Nt~=(fT(t-- s)O-(t)) u) +o (,=~ [¢T(t -- S)U,(O(t))] u)
+0(1).(2.53)
Now, from (2.50), we have N
DTCt_ slU,(0(0)] 2 t----I
/,-1
= o ~
N
)
~(,~(t
- .)P(~ - . ) , ( , - .
- O)2Eu,(.(~-
0)]'
ki----I$----I
= O0og r(N
-
.1).
(2.a41
Stochastic Adaptive System Theory
283
Combining (2.4S), (2.51), (2.53) and (2.54) then establishes (2.47). Finally, from (2.11) and the SPIt assumption (2.38), we have N
+
o(,)]
't----1 N
N
>__'l~--~[~bT('- - S)0(t)]2 "b E2~(~(t) t---1
--
V('))2 -{-0 ( 1 ) ,
(2.55)
t-~l
for someconstants
Using
ex > 0, e2 > 0. (2.55) and (2.47) in (2.46), we can immediately conclude (2.39) and (2.41). (2.40) follows from (2.39). To show (2.42), we have the following chain of inequalities, t__~l ( y ( t )
-- cT(t
- - 8 ) O ( t - - ~ ) - - 13($))2
1 + CT(t - s ) P ( t - s - k)4~(t - s)
< 2~(~(t)N _
v(t)) 2 + V"N
[4,T(t -- 8 ) ( 0 ( 0 -- O(t -- k))] 2
2~A-~11 + CT(t -----
~----1
s)P(t
-
s
-
k)¢(t - 8)
_< O(log,-(N - s)) k--1 T N [~"]i=o ~b ( t - s ) P ( t -
+2~
8 - i - 1)~b(t- s - i ) ~ ( t -
i))]2
1 + Cw(t -- s ) P ( t -- s -- k)¢(t - s)
t----1
< O(log r ( N - s))
..-./k-1.-.N
+ 0 ~ _ , ~_ c T ( t - - s - - i ) P ( t - - s - - i - -
) 1)¢(t--s-- i)~2(t--i)
= O(log r ( N - 8)). The parameter consistency when (2.43) holds can be shown as in [14].
D
2.2 A d a p t i v e p r e d i c t i o n We now consider the prediction of a stochastic process y(t) which satisfies the ARMAX model (2.1). Such a model is appropriate, for example, when 9(t) is the output of a dynamical system with a random disturbance and an .Twmeasurable input u(t), or when y(t) contains seasonal fluctuations and u(t) is the appropriate known periodic function. For simplicity, we consider only d-step ahead prediction. Let F'(q - x ) := _~_-i be the solution of 1 -t- f~(q-1) + . . . + f~_lq-(d-1), and Gl(q -1) : - x"P z.~i=091~/ the following "long division",
A(q-~)F'(q -~) + q-dG'(q-~) = 1.
(2.56)
284
Ren and Kumar
Also let F(~ -1) = 1-~t-fl(q-1 ) -~-...-{-fd_lq-(d-1), and G(q-1) = E ~ = o g l q - i be the solution of A ( q - 1 ) f ( q -1) + q-riG(q-t) = C(q-1) . (2.57)
Then (2.1) can be reparameterized as y(t + d) - FCq-1)w(t + d) -- BCq-t)f'Cq-t)u(t) + G'(q-1)y(t) + M(q-X)w(t),
(2.58) where
M(q -1) = q~(C(q-1)F'(g -1) - F(q-~)). From (2.56) and (2.57), A ( q - t )M(q -1) = G(q -1) - Gt(q-1)C(q-t) . Hence, it is clear that M(q - t ) = Truncation (F'(q-1)C(q-1), cO,
(2.59)
where Truncation
aiq -i , ki'--0
:= /
oti+dq - i .
i=O
From (2.59), it follows that the R.HS of (2.58) is .~rt-measurable. Hence the conditional mean, and thus also the minimum variance prediction, .~(t + d) := E[y(t + d)13zt], is given by O(t + d) = B(q-1)f'(q-X)u(t) + G'(q -1) + M ( q - t ) w ( t ) .
(2.60)
Note that though w(t) is not an observed variable, we could estimate w by t~ using the following "observer", V(q-1)~(t) = B(q-1)u(t - d) - A(q-1)y(t).
(2.61)
An indirect approach to adaptive prediction is based on the following procedure: Estimate the parameters A, B, C and the innovation sequence w(t); solve the Diophantine equation (2.56) with the estimated parameters to obtain estimates of F ~ and G'; obtain an estimate of M based on (2.59); replace all the unknowns in (2.60), including w(t), by their estimates to generate adaptive prediction ~(t + d), i.e.,
~(t + d) = ( $ ( q - l , 0 ~ ' ( q - 1 , 0 ) , ( 0
+ G'(q-l,0y(0 + M ( q - l , 0 ~ ( 0 ,
(2.62)
where F' and G' are obtained from
7(q-~,OP(q-~,O + q-dO'(q-~,~) = 1,
(2.63)
285
Stochastic Adaptive System Theory
and M is defined as in (2.59) using if' and ~,.e It should be noted that since w(t) is estimated along with parameters, the linear observer (2.61) based on the estimated parameters need not be used. The use of the disturbance estimate as well as parameter estimate results in a more general certainty equivalence approach than commonly advocated. Alternatively, one could use a direct approach to adaptive prediction based on the reparameterized model (2.3), since then the conditional estimate satisfies fl(t + d) = R ( q - ~ ) u ( t - d) + S ( q - ~ ) y ( t - d) + D ( q - ~ ) v ( t ) .
(2.64)
One can therefore simply estimate R, S, D and v(t) in the model (2.3) and use the estimates in (2.64), to give d-step ahead prediction, ~(t + d) = cT(t)0(t).
(2.65)
Theorems 2.3-2.6 below establish the self-optimality of both the direct and indirect approaches to adaptive predictions. T h e o r e m 2.3 Consider the adaptive prediction based on the direct approach (L65) using the SG algorithm (L7-2.8) with s = d. Suppose that l +q-dD(q -1) is Hur'witz and the following SPR condition holds rain Re [ 1 "Jr z dD(z)] > 0, I~1=i
and
1 N sup ~ ~ (y~(t) + ,~(t)) < oo a.s. . N iv'-~"
(2.66)
Then, the adaptive predictor is self-optimizing in the sense N
+ d)-
+ d))
= 0 a.s. .
(2.67)
Proof. When r(t) is finite, (2.67) holds trivially. When r(t) ~ ~ , the optimality (2.67) follows easily from (2.65), (2.66), (2.16-2.17) and Kronecker Lemma. []
T h e o r e m 2.4 Consider adaptive prediction based on the direct approach (~.65) using the ELS algorithm (2.5-~.6) with s = d. Suppose that 1 + q-riD(q-I) is Hurwitz and the following SPR condition holds rain l~e [1 -t- zdD(z)] -1--" > I~1=1
1
and y2(t)+ u2(t) = O ( ~ )
a.s..
(2.68)
e The multiplication of two time varying polynomials of q-1 is to be understood as ordinary polynomial multiplication disregarding the time dependence of their coefficients.
286
Ren and Kumar
Then the adaptive predictor is self-optimizing in the sense of (2.67). Moreover, if y(t), u(t) and w(t) are uniformly bounded, then the ~accumulated regret" satisfies, N
(~(t + d ) - ~(t + d)) 2 = o (log" U )
,.s..
(2.69)
t=l
Proof. As in the proof of Theorem 2.3, it is enough to consider the case r(t) -* oo. Following [23], let (~(t) - 9(t)) 2 ~' := 1 + OT(t -- d ) P ( t - 2d)¢(t - d) 6, := trace ( P ( t - 2d) - e ( ~ - d)).
Then from (2.65), (2.42), (2.40) and (2.39), we have N
N
(~(t) - ~(t)) 2 < ~ t=r
t=l
N
2~, + ~
~,C(t
- d) (e(t
- 2 d ) - P ( t - d ) ) OCt - d)
t=l
= O(logr0(N)) + o(logro(N) • max I
II¢(t - d)ll 2
(
---- O(log roCN)) + o log to(N) I~_~_NI1¢(0 - ¢0(011
)'
+o (logro(N) l<,<_Nmax11¢°(t)[12) -- o(log = ro(N)) +
o(logro(N)) m a x
ll¢o(t)ll 2 .
From (2.68), we have log ro(N) = O(log(N)). The optimality (2.67) and the rate of the regret (2.69) then follow.
[3
R e m a r k : It is worthwhile noting that adaptive prediction based on the least squares algorithm is capable of self-optimizing even for processes with a growing trend as in (2.68). t3
T h e o r e m 2.5 Consider adaptive prediction based on the indirect approach
(2.62) using the SG algorithm (2.7.~.8) with s = 1. Suppose that C(q -1) is Hurwitz and the following SPR condition holds: min Re[C(z)] > 0. i~]=1
Then the adaptive predictor is self-optimizing in the sense of (2.67).
Stochastic Ad~ptive System Theory
287
Proof. From (2.1), we have ~(q-*,t)y(t
+ d -/)
= B(q-l,t)u(t
+ (A(q -1) - ~(q-*, t))y(t + d - j)
- j) +
(B(q-i) -
B(q-l,t))u(t
- j)
+ O ( q - L t)~(~ + d - j) + C(q-~)(~o(t + d - j) - ¢ ( t + d - j)) + ( C ( q - 1 ) _ ~ ( q - 1 , t)~(I + d - j ) , and hence
~(q-1, t ) y ( t
q- d - j ) -- ~ ( q - 1 ,
t ) u ( t - j ) "t"
~(q-1, t ) ~ ( t
"t" d - j )
+ C ( q - 1 ) ( w ( t + d - j) - ~(~ + d - j)) - ~bT(~+ d - j - 1)0"(t).
(2.~o)
So,
(~(q--l,[)fit(q--1
[))y(t
"]- d) ----(j~(q--1, t)~t(q--1, t))ll(~)
d--1
+ ( p ( q - 1 , t ) ~ ( q - 1 , t))~(t + d) + y ~ ~i(t). cT(t + d - i - 1)0"(t)
i=O - (F'(q-l,t)C(q-1))(w(t
+ d) - ~(t + d)).
From (2.62), (2.63) and the above, we have
y(t + d) - P ( q - l , O w ( t + d) - y(t + d) -- ( F ' C q - l , t ) C ( q - 1 ) )
(wit q- d) - ~(t q- d)) + F ( q - l , t ) ( ~ ( t W d) - (w(t q- d))
d--1
--~-~(tl~bT(t + d - i - 110(/1,
(2.71 /
i=0
where P ( q - l , t )
satisfies
A(q-l,t)ff(q-l,t)
+ q-dG(q-l,t)
= C(q-l,t) .
Since IIO(t)ll ~ bounded and the mappings from 0(t) to F ( q - l , t ) and F ' ( q - l , t ) are continuous, the coefficients of f f ( q - l , t ) and f f ' ( q - l , t ) are also bounded. Hence it follows from (2.17) that N
H-.oo -ff y]~ y(t + d) - f f ( q - l , t ) w ( t + d) - "~(t + d)
= 0 a.s..
(2.72)
t----1
So N
2
- - 0 a.s..
(2.73)
Ren a n d K u m a r
288
Noting that both fI(t+d) and ~(t+d) are 9:t-measurable, and that deg ~(q-1, t) = d e g F ( q -1) = d - 1, we have N
E [~(t +
d) - ~(t + d) + (F(q - I ) - F ( q - l , t ) ) w ( t +
d)] 2
t----1 N
N
1=1
I=1 N
+ 2 E ( ~ ( t + d) - ~ t + d)). (F(q-') - F ( q - l , t l l w ( t + d) t=l N
N
+ d) - ff(t 4- d)) 2 "4- E ( ( F ( q -1 ) - F ( q - l , t l ) w ( t -4- d)) 2.
= (1 + o ( l l l Z ( 9 ( t t=l
t=l
The optimality (2.72) then follows from the above equation and (2.73).
[]
T h e o r e m 2.6 Consider adaptive prediction based on the indirect approach (1L6~) using the the ELS algorithm (ES-E6) with s = 1. Let C(q -~) be Hut. witz and the following SPR condition hold:
1
min[ReC-X(z)] > Izl=l
2
and
then the adaptive predictor is self-optimizing in the sense of (2.67). Moreover, if y(t), u(t) and w(t) are uniformly bounded, then N
~,(~(t
+ d) - ~(t - d)) = o(log d+l N ) .
t=l
Proof. It follows from (2.41) and (2.63) that I~(t)l z = O(log a-1 r(t)). The rest of the proof consists of applying the technique of the proof of Theorem 2.4 to (2.71). The details are omitted, r3
Stochastic Adaptive System Theory
289
2.3 A d a p t i v e f e e d b a c k control We consider adaptive feedback control of plants represented by the ARMAX model (2.1). As for adaptive prediction, certainty equivalence adaptive control can also make use of the extra degree of freedom afforded by the availability of the disturbance estimate. Thus, we consider the general control law of the form
R(q-1)u(t) = S(q-1)y(t) + T(q-1)z(t) + M(q-1)w(t),
(2.74)
where R, S, T and M are polynomials, and z(t) is the command signal. In this section, we restrict our attention to a class of certainty equivalence adaptive control which may be regarded as a model reference adaptive control with disturbance attenuation capability. The minimum variance (MV) control may be regarded as a special case with a unity reference model and optimal disturbance attenuation. In order for the closed loop transfer function from the command signal z(t) to the output y(t) to match an arbitrary reference model, we assume that the plant is minimum phase, i.e., B(q -1) is Hurwitz, and we consider the general control law (2.74) with the restriction that R(q -1) presumes B(q -1) as a factor, i.e.,
R(q -1) - R'(q-S)B(q-1), for some polynomial R'(q-1). With this control law, it is easy to calculate that
q-dT(q-1)z(t) R'(q-1)C(q-1) + q-riM(q-I) wft~ y(t) = A(q_l)R,(q_t ) _ q_dS(q_l ) + ~ ) - _ _ ~ ,,, where the first term on the RHS above is the command response, and the second term is the disturbance response. Suppose that the desired trajectory is given by a reference model y*(t) =
q-dBm(q-1) z(t) ' Am(q_i )
then for the command response of y(t) to match that of the reference model, we require that
A(q-1)1~(q -1) _ q-dS(q-i ) _-- S ( q -1) T(q -I) - Ao(q-1)Bm(q-1),
(2.75)
where Ao(q -1) is some "observer" polynomial, and H(q -1) - Am(q-1)Ao(q-1). We now consider the optimization of the disturbance response by making use of the remaining freedom in the control law. Let (R'(q-1), S(q-~)) be a particular solution of (2.75), then (R'(q-1)+q-dA(q-1), S(q-1)+A(q-1)A(q-1)) is the class of all possible solutions, where A(q -1) is a free polynomial. Thus by choosing M(q -~) one exploits the freedom given by A(q-1)in shaping the disturbance response. Hence we need to only consider the appropriate choice of M(q -1) to "optimize" the disturbance response.
290
Ren and Kumar A simple choice is to set U ( q -1) = - Truncation (R'(q-i)C(q-i), d),
(2.76)
which minimizes the variance of H (q- l)(y(t) _ y, (t)), and yields
H(q-i)(y(t) - y*(t)) = Fi(q-1)w(t) , where Fi(q - i ) = R'(q-i)C(q -i) q- U(q-X)q -a . To minimize the variance of (y(t) correlated sequence, the variance of
-
y*(t)), note that since {w(t)}
is a n
un-
R'(q-1)C(q-1) + q-aM(q-i) w(t) H(q - i ) is minimized with respect to M ( q - i ) if for some polynomial F2(q-i ), deg F~(q -i) < d - 1, the following holds,
R'(q-i)C(q -i) + q-aM(q-I) H(q_l)
= F2(q-i).
Hence the optimum M(q - i ) is the minimal degree solution with respect to F2 of the following Diophantine equation,
H(q-1)F2(q -1) - q-aM(q-i) = Rr(q-1)C(q-1) .
(2.77)
It is easy to see that in fact F2(q - i ) = F ( q - i ) , where F(q - i ) is as in (2.57). Hence, with this choice of M(q-1), we have
y(t)
-
y*(t) = F(q-i)w(t).
The difference between the model reference control law described above and a MV control law with y* (t) as the command signal should be noted. The former may have the advantage of resulting in a smaller feedback controller gain, and hence may be more robust in the presence of unmodeled dynamics. The corresponding indirect adaptive control law is to estimate the parameters A, B and C, and the innovation process w(t) of the model (2.1), solve (2.75), (2.76) or (2.77), and use the control law (2.74), with the estimated quantities. Theorems 2.7 and 2.8 below establish the self-optimality of such adaptive control laws based on either the SG or ELS algorithms. T h e o r e m 2.7 Consider the indirect model reference adaptive control based on
the SG algorithm (2.7-2.8) with s = 1. Assume A2.1) H(q - i ) is stable, and z(t) is bounded. Ae.e) B(q - i ) is Hur~oitz, bo • O. Ae.3) minlzl=i Re[C(z)] > O. Ae.4) The distribution of (w(O),..., w(t)) is absolutely continuous with respect to Lebesque measure for Vt > O.
Stochastic Adaptive System Theory
291
Then, with the design (2.76), we have iV
lim N" 1 ~'~ ( H ( q - 1 ) ( Y ( t ) N--oo
y*(t)) - Fl(q-1)w(t)) 2 -- 0 a.s.,
(2.78)
1=0
and with the design (~. 77), we have N
E (,(')-,'(t)-
o
a.,..
(2.79)
t=l
Proof. Substituting the adaptive control law into (2.71), and proceeding as the proof of Theorem 2.5, one can show that N
1 ro(N - 1) ~
t=1
(g(q-1)(Y(t)
-
y*(t))
+ (~(q-l,t)0(q-l,t) + q-~(q-l,t))w(t)
--, 0 a.s.,
Because of the minimum phase assumption (A2.2),
Hence,
~----0
= o(N) H- o
y2(t
a.s..
Because y*(t) is bounded, H(q -1) is stable, and R(q-l,t), C(q-l,t) and (q-t, t) are bounded, we have N
~v~(t)
= O(N)
a.s..
t=0
Hence,
to(N) -- O(N) a.s.. So 1
N
-~ E
[H(q-1)(y(t) - y*(t))
1----0
+(j~(q-1, t)~(q-1, $) + q-d~(q-1, t))w(t)] 2 --'~ 0
a.s..
292
Ren and Kumar
With M designed through (2.76), we have 1 N
53 t=O
2
¢(,))+
It is clear that H(q-1)y(t) - Fl(q-1)w(t) is ~ t - d measurable. The rest of the proof for (2.78) then follows the proof of Theorem 2.5. The proof for (2.79) is similar and is omitted. []
We now consider the ELS algorithm based adaptive control. The following theorem is essentially due to Guo and Chen [23]. T h e o r e m 2.8 Consider the indirect model reference adaptive control for unit delay systems based on the ELS algorithm. Let the assumptions (A2,1), (A2.2) of Theorem 2. 7, and the following conditions hold, 1 A2.5) minrte[C-l(z)] > ~.
A2.6) lim inflb0(t)[ > 0. t --*OO A
Then with M designed through (2.76), we have N
1=0 A
With M designed through (2.77), we have N
y~.(y(t) -- y*(t) - w(¢)) 2 = O(N$), t----O
Proof. First note that with the delay equal to one, F ( q - i ) , F i ( q - i ) , F'(q - i ) and their estimates are all equal to 1. Without loss of generality, we can also take ~1 to be 1. Substituting the control law into (2.70), we have
H(q-1)(y(t + 1) - y*(t + 1)) = M(q-l,t)~(t) + R ' ( q - l , t ) C ( q - l , t ) ~ ( $ -b 1) +C(q-1)(w(t + 1) - ~(t + 1)) + c T ( 0 ~ ( 0 , A
with M designed via (2.76),
H(q-1)(y(t + l ) - y * ( t + l)) = ~(t+ l)+C(q-1)(w(t-k l ) - ~ ( t + l))-FcT(t)O(t). T h e rest of the proof then follows Guo and Chen [23].
n
Stochastic Adaptive System Theory
293
It is also possible to reparameterize the model in the form of the controller, so that the controller parameters can be estimated directly. This is the direct approach. Let us consider the direct approach to adaptive MV control for systems with general delay. With the plant parameterized in the form of (2.3), the MV controller for tracking y*(t) is simply given by
y*(t + d) = R(q-1)u(t) + S(q-Z)y(t) + D(q-1)v(t). The corresponding adaptive control law is to replace R, S, D and v(t) by their estimates. If the algorithms (2.5-2.6) and (2.7-2.8) in Section 2.1 are employed for estimation, the adaptive MV tracker simply is given by
y*(t + d) = ~T(~)O(t).
(~.80)
The self-optimality of the direct MV adaptive control (2.80) can be established as for adaptive prediction. For details, we refer the reader to [18]. 2.4 T h e c o n v e r g e n c e o f a d a p t i v e controllers In the previous section, we have established the self-optimality, i.e., the optimal performance with respect to a certain cost criterion on the output y of the system, of a class of certainty equivalence adaptive control. In this section we consider the issue of parameter convergence and show that self-optimality generally implies self-tuning in some sense. Let us consider the indirect adaptive MV control using the stochastic gradient algorithm. In the previous subsection, we have shown that 1 N lira ~ ~ , ( y ( t ) - y°(t))2 = 0 a.s., N --~ o o
t----0
where y°(t) := y*(t)+F(q-1)w(t). Let u°(t) be the input corresponding to y°(t), i.e., it satisfies
A(q-1)y°(t) = q-dB(q-1)u°(t) + C(q-1)w(t).
(2.81)
Since B(q -1) is Hurwitz, it follows that N
lira 1 N--oo ~ ]~(~(~) -- ~'(t))~ = 0 a.s..
De~ne ¢(~) and R°(~) as ~0(t) and Ro(t), using y°(t) and u°(~) in place of
y(t) and u(t), respectively. Then
±N ~ I1¢(~) - ¢(~)11" - o a.s. and
~IIR°(,)
- R(,)II-.-,. o a.s..
294
Ren and Kumar
It then follows from (2.36) that IIIRO(N
- 1)ff(N)ll --* 0 a.s..
For simplicity, assume that the ensemble correlations of the command signal y*(t) exists. Then ~ converges. Let := lim R ° ( N - 1 ) N--.~ N It can then be shown that 0"(t) converges almost surely to the null space of~. The following theorem summarizes the above development and further characterizes the null space of ~lt. T h e o r e m 2.11 Consider the indirect adaptive M V control using the SG algorflhm. Let the assumptions of Theorem ~.7 hold, and suppose that the ensemble correlations of {y*(t)} exist. Then
g(t)
a.s..
Further VO E .N'(~), using the obvious notation, let A, B, C, F and G be the appropriate polynomials corresponding to the parameter vector O+ 0o. Then F(q-1) = p(q-1),
B(q-
)O(q
(2.82)
(2.83)
=
and
o.
(2.84)
t.7.0
Moreover, i f B ( q -1) and G(q -x) do not have a common factor, and {y*(t)} is persistently ezciting of order lp >_min(deg C(q-1), deg A( q-1) ) + 1, then o(t) - , Oo a.s..
Proof. Since O"E .A/(~), we have N
1 y ~ ( ( A - i)y°(t) + q-~(B - B)u°(t) + (C - C)w(t)) ~ -* 0 a.s.. N t----1 So, 1 ~v -ff Z ( ( A - . 4 ) y
* ( t ) + q - d ( B - B ) u ° ( t ) + ( ( A - A ) F + C - C ) w ( t ) ) 2 ~ O. (2.85)
¢-----1
Since u°(t) is Y't-measurable, it follows as in [3,13] that the first d coefficients of (A - A ) F + C - C have to be zero, which yields (2.82).
Stochastic Adaptive System Theory
295
Multiplying (2.85) by B(q -1) and substituting (2.81) into (2.85), we obtain 1
N
--N ~
[(BA - B.A)y*(O + ( B ( C - .4F) - B ( C - AF))w(t)] 2 --~ 0 a.s..
t=O
Hence
1
N
I2 t----O
2
-.0
and 1
N
So
B ( C - AF) = B ( C -- A F ) ,
(2.87)
which is equivalent to (2.83). If B and G do not have a common factor, then
for some scalar A. Iftp > d e g A + 1, from the above and (2.86), we obtain A A - A = 0. It then follow from (2.87) that A = 1, .4 = A, and C = C. From (2.87), we obtain
( B . A - .BA)F -- B C - B C . Hence if ~p > deg C + 1, we can proceed similarily. This completes the proof. []
The parameter convergence results above can be easily extended to more general adaptive control laws for which signal convergence in the mean square sense can be established, e.g., model reference adaptive control. The extension to the ELS algorithm is also possible. For these extensions, we refer the readers to Ren and Kumar [18]. In the case of direct adaptive MV regulation of systems with general delay, based on the SG algorithm using a-priori estimates, the parameters of C(q -1) need not be estimated, and it can then be shown that if there is no overpararneterization, the limit set # + 00 of 0(0 , is a line passing through the origin and the true parameter. The fact that II0"(t)l] converges can then be used to establish the convergence of 0(0 to a particular point on the line. This result then generalizes to the case of general delay the result of Becker, Kumar and Wei [3], which establishes the convergence of the parameter estimates to a random multiple of the true parameter for the unit delay case.
296
Ren and Kumar
[Fig. 1. Problem A: Output error identification and adaptive IIR filtering.]

[Fig. 2. Problem B: Adaptive feedforward control.]

[Fig. 3. Problem C: Adaptive feedforward control with input contamination.]
3 Stochastic Parallel Model Adaptation Problems
In this section, we consider three successively more complicated parallel model adaptation problems, which are all inspired by practical problems and are illustrated in Figs. 1, 2 and 3. All the fixed systems in Figs. 1-3 are unknown but stable, linear and time invariant. In all three figures, $G(q^{-1})$ is the system whose output $y(t)$ is to be matched by $\hat y(t)$, the output of the (partially) tunable system in the lower channel of each figure. The signal $s(t)$ is the common input to both the upper and lower channels, while $v(t)$ is the disturbance. Problem A, which is the simplest of the three and is depicted in Fig. 1, encompasses both output error identification and adaptive filtering. In the identification context, $G(q^{-1})$ is the unknown dynamical system to be identified, and $z(t)$, $s(t)$ and $v(t)$ are the output, input, and disturbance plus measurement noise of the system, respectively. In the filtering context, $v(t)$ is the signal to be extracted, $y(t)$ is the noise corrupting the measured signal $z(t)$, and $s(t)$ can be regarded as the source of the noise. In both cases, $v(t)$ can be quite arbitrary, except for the reasonable and practical assumption that $\{v(t)\}$ and $\{s(t)\}$ are stochastically uncorrelated. The challenge here is to exploit this correlational property to provide unbiased estimation of the dynamical system $G(q^{-1})$, in the context of identification, or undistorted extraction of the signal $v(t)$, in the context of filtering. Problem B, depicted in Fig. 2, is clearly a generalization of Problem A, and arises in practice as an adaptive feedforward control scheme for disturbance cancelling, which has the following physical interpretation. $P(q^{-1})$ is an unknown "plant", for which $u(t)$ is the input and $e(t)$ is the output, which is subject to the additive disturbances $v(t)$ and $y(t)$, which are mutually uncorrelated. A signal $m(t)$ related to $s(t)$, the source of the disturbance $y(t)$, is measured.
The goal of adapting the feedforward controller $K(q^{-1}, t)$ is to eliminate the part due to $y(t)$ from the output $e(t)$, leaving only the uncancelable disturbance $v(t)$. Problem C, depicted in Fig. 3, represents a further complication that is inspired by practical applications of adaptive feedforward control in the areas of active acoustic noise and vibration control [24]. The feedback $F(q^{-1})$ in Fig. 3 models the contamination of the measurable signal $m(t)$ by the control input $u(t)$. In active acoustic noise control this effect is called acoustic feedback. More generally, it has been termed input contamination in [24]. Section 3.1 considers Problem A and develops tools that allow us to exploit the uncorrelatedness between signals. The extension to Problems B and C is considered in Section 3.2.

3.1 The convergence of output error recursions

In this subsection, we consider Problem A, the output error identification and adaptive IIR filtering problem. Referring to Fig. 1, let
\[
G(q^{-1}) = \frac{B(q^{-1})}{A(q^{-1})},
\]
where $A(q^{-1}) = 1 + a_1 q^{-1} + \cdots + a_n q^{-n}$ and $B(q^{-1}) = b_0 + b_1 q^{-1} + \cdots + b_m q^{-m}$. Hence,
\[
A(q^{-1})\,y(t) = B(q^{-1})\,s(t), \tag{3.1}
\]
and $z(t) = y(t) + v(t)$. As mentioned, the goal is to estimate the parameters of $A$ and $B$, and the signal $v(t)$. We consider the following algorithm, which is a slight variation of Landau's output error identification algorithm [25] in that a projection of the parameter estimate onto a compact convex set $\mathcal{M}$ known to contain $\theta_0 = [a_1,\ldots,a_n, b_0,\ldots,b_m]^T$ is employed. We estimate $\theta_0$ by $\hat\theta(t) := [\hat a_1(t),\ldots,\hat a_n(t), \hat b_0(t),\ldots,\hat b_m(t)]^T$, as follows:
\[
\theta'(t) = \hat\theta(t-1) + \frac{\phi(t-1)}{r(t-1)}\, e(t), \qquad \hat\theta(0) \in \mathcal{M}, \tag{3.2}
\]
\[
\hat\theta(t) = \Gamma[\theta'(t)],
\]
where $\Gamma[\cdot]$ denotes the projection onto $\mathcal{M}$,
\[
r(t-1) = r(t-2) + \|\phi(t-1)\|^2, \qquad r(-1) = 1,
\]
\[
e(t) := z(t) - \phi^T(t-1)\,\hat\theta(t-1),
\]
\[
\phi(t-1) := [-y'(t-1),\ldots,-y'(t-n),\; s(t),\ldots,s(t-m)]^T, \tag{3.3}
\]
and $y'(t) = \phi^T(t-1)\,\hat\theta(t)$.
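Concretely, one step of (3.2)-(3.3) can be sketched as follows; the ball-shaped projection set and the argument layouts are simplifying assumptions made here (the text only requires $\mathcal{M}$ to be compact, convex, and to contain $\theta_0$).

```python
# Sketch of one step of the projected output-error recursion (3.2)-(3.3).
# The ball-shaped projection set M is an illustrative assumption.
def project_onto_ball(theta, radius):
    """Projection Gamma[.] onto M = {theta : ||theta|| <= radius}."""
    norm = sum(x * x for x in theta) ** 0.5
    if norm <= radius:
        return list(theta)
    return [x * radius / norm for x in theta]

def output_error_step(theta, r_prev, y_post, s_hist, z_t, radius):
    """One update of theta = [a1..an, b0..bm] from the new measurement z(t).

    y_post: a posteriori model outputs [y'(t-1), ..., y'(t-n)]
    s_hist: inputs [s(t), ..., s(t-m)]
    """
    phi = [-y for y in y_post] + list(s_hist)               # phi(t-1), as in (3.3)
    r_cur = r_prev + sum(p * p for p in phi)                # r(t-1) = r(t-2) + ||phi(t-1)||^2
    e = z_t - sum(p * th for p, th in zip(phi, theta))      # e(t) = z(t) - phi^T theta(t-1)
    theta_prime = [th + p * e / r_cur for th, p in zip(theta, phi)]  # theta'(t), (3.2)
    theta_new = project_onto_ball(theta_prime, radius)      # theta(t) = Gamma[theta'(t)]
    y_prime = sum(p * th for p, th in zip(phi, theta_new))  # a posteriori output y'(t)
    return theta_new, r_cur, y_prime, e
```

When the measurement agrees exactly with the current model, the step leaves the estimate unchanged and only the normalization $r$ grows.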
Because the estimate of $v(t)$ is given by $e(t)$, we shall say that the algorithm (3.2)-(3.3) is self-optimizing if
\[
\lim_{N\to\infty} \frac{1}{N}\sum_{t=1}^{N} \big(e(t) - v(t)\big)^2 = 0 \quad \text{a.s.} \tag{3.4}
\]
Theorem 3.1 below establishes this optimality under the following set of assumptions.

Assumptions

A3.1) The disturbance is of the form $v(t) = \sum_{i=0}^{+\infty} c_i(t)\, w(t-i)$, where $\{w(t)\}$ is a martingale difference sequence with respect to the increasing sequence of $\sigma$-fields $\mathcal{F}_t$ generated by $(w(0),\ldots,w(t))$ and $\{s(k)\}$, and $\{c_i(t)\}$ is deterministic and satisfies $|c_i(t)| \le K_c\, \alpha^{-i}$, for some $\alpha > 1$, $K_c < \infty$, and $\forall t \ge 0$.

A3.2) The sequences $\{|y(t)|\}$, $\{|s(t)|\}$, $\{|v(t)|\}$, and $\{|w(t)|\}$ are uniformly bounded by a finite number, almost surely.
A3.3) $A(q^{-1})$ is Hurwitz and satisfies the following strictly positive real (SPR) condition: $\mathrm{Re}[A(e^{j\omega})] > 0$, $\forall \omega$.

A3.4) $\theta_0 \in \mathcal{M}$.

A3.5) $\liminf_{N\to\infty} \frac{1}{N}\sum_{t=1}^{N} s^2(t) > 0$.

Note that the set $\mathcal{M}$ can be constructed from minimal a priori information about the transfer function $B(q^{-1})/A(q^{-1})$, for example, from the knowledge of a finite upper bound on its $H_\infty$-norm.

Theorem 3.1 Let the assumptions (A3.1) to (A3.5) hold. Then for the algorithm described by (3.2)-(3.3), the self-optimality result (3.4), as well as the following results, hold:
\[
\sup_N \frac{r(N)}{N} < \infty \quad \text{a.s.},
\]
\[
\sum_{t=1}^{\infty} \|\hat\theta(t) - \hat\theta(t-1)\|^2 < \infty \quad \text{a.s.}
\]
Proof. Let us denote
\[
v'(t) := z(t) - y'(t), \qquad \Delta(t) := \hat\theta(t) - \hat\theta(t-1), \qquad \Delta'(t) := \theta'(t) - \hat\theta(t-1),
\]
\[
\tilde\theta(t) := \hat\theta(t) - \theta_0, \qquad \tilde\theta'(t) := \theta'(t) - \theta_0 .
\]
Because of the projection facility, $\hat\theta(t)$, $\theta'(t)$, $\Delta(t)$ and $\Delta'(t)$ are all uniformly bounded. As a consequence, $\|\phi(t)\|$ has a bounded growth rate, and $r(t)/r(t-k)$ is uniformly bounded in $t$, for finite $k$. Let us consider the "stochastic Lyapunov function" $V(N) := \|\tilde\theta'(N)\|^2$. It can be shown to satisfy the recursion
\[
V(N) + \sum_{t=1}^{N} \|\Delta'(t)\|^2 + \sum_{t=1}^{N} \frac{2\big[A(q^{-1})(v'(t) - v(t))\big]\big[v'(t) - v(t)\big]}{r(t-2)} \le V(0) + \sum_{t=1}^{N} \frac{2\,\phi^T(t-1)\,\tilde\theta'(t)\, v(t)}{r(t-2)} . \tag{3.5}
\]
From the positive real assumption (A3.3), the third term on the left-hand side above satisfies
\[
\sum_{t=1}^{N} \frac{2\big[A(q^{-1})(v'(t) - v(t))\big]\big[v'(t) - v(t)\big]}{r(t-2)} \ge \sum_{t=1}^{N} \frac{\delta_1 \big[v'(t) - v(t)\big]^2 + \delta_2 \big[\phi^T(t-1)\tilde\theta'(t)\big]^2}{r(t-2)} - S_0,
\]
for some finite positive constants $\delta_1$ and $\delta_2$, and a random variable $S_0 < +\infty$ a.s.
Turning next to the last term on the right-hand side of (3.5), our goal is to show that for every $\varepsilon > 0$,
\[
\sum_{t=1}^{N} \frac{\phi^T(t-1)\,\tilde\theta'(t)\, v(t)}{r(t-2)} \le O(1) + o\!\left(\sum_{t=1}^{N} \frac{\big[\phi^T(t-1)\tilde\theta'(t)\big]^2}{r(t-2)}\right) + \varepsilon \sum_{t=1}^{N} \|\Delta'(t)\|^2 . \tag{3.6}
\]
To exhibit this, let us consider a decomposition of the signal $v(t)$ as $v(t) = v_1(t) + v_2(t)$, where
\[
v_1(t) = \begin{cases} \displaystyle\sum_{i=0}^{d(t)-1} c_i(t)\, w(t-i), & d(t) \ge 1, \\[2mm] 0, & d(t) = 0, \end{cases}
\qquad
v_2(t) = \sum_{i=d(t)}^{+\infty} c_i(t)\, w(t-i),
\]
and $d(t) := \max\{d \in \mathbb{Z}^+ : \varepsilon_d \log_\alpha r(t-d) \ge d\}$, where $\varepsilon_d$ is a sufficiently small positive number. Because of the monotonicity of $\log_\alpha r(t)$, $d(t)$ is clearly uniquely defined and depends only on $r(0),\ldots,r(t-d(t))$, hence is $\mathcal{F}_{t-d(t)}$-measurable. Moreover,
\[
\varepsilon_d \log_\alpha r(t-d(t)-1) - 1 \le d(t) \le \varepsilon_d \log_\alpha r(t-d(t)) \le \varepsilon_d \log_\alpha r(t).
\]
The basic idea for proving (3.6) is to bound
\[
\Big| \sum_{t=1}^{N} \frac{\phi^T(t-1)\,\tilde\theta'(t)\, v_2(t)}{r(t-2)} \Big|
\]
by exploiting the vanishing magnitude of $|v_2(t)|$, and
\[
\Big| \sum_{t=1}^{N} \frac{\phi^T(t-1)\,\tilde\theta'(t)\, v_1(t)}{r(t-2)} \Big|
\]
by using a backward recursion developed in [26], and Lemma 1 a). By the backward recursion we approximate the regression vector $\phi(t-1)$ by the fictitious state of the parallel adaptation model at time $t-1$ with the parameter fixed at $\hat\theta(t-d(t))$, which is hence $\mathcal{F}_{t-d(t)}$-measurable. For the rest of the details, we refer the reader to [19,20]. □
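The horizon $d(t)$ used in the decomposition above is easy to evaluate numerically; a minimal sketch, in which the history $r(0),\ldots,r(t)$ and the constants $\varepsilon_d$, $\alpha$ are illustrative (with $r(\cdot) \ge 1$, as guaranteed by $r(-1) = 1$ and monotonicity):

```python
import math

# Sketch of d(t) = max{d >= 0 : eps_d * log_alpha r(t-d) >= d}.
# r_hist[k] stands for r(k); eps_d and alpha are the (illustrative) constants
# of the decomposition; all r values are >= 1, so the logs are nonnegative.
def horizon(r_hist, t, eps_d, alpha):
    best = 0
    for d in range(t + 1):
        if eps_d * math.log(r_hist[t - d], alpha) >= d:
            best = d
    return best
```

Since $\log_\alpha r(t-d)$ is nonincreasing in $d$ while the right-hand side $d$ increases, the admissible set is an initial segment and the maximum is well defined, as noted in the text.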
Having established the self-optimality, we turn next to the issue of parameter consistency.
Theorem 3.2 In addition to the assumptions of Theorem 3.1, assume that $A(q^{-1})$ and $B(q^{-1})$ do not have a common factor, and that $\theta_0$ is an interior point of the set $\mathcal{M}$. If $\{s(t)\}$ satisfies one of the following two conditions:

i) $\{s(t)\}$ is sufficiently rich of order greater than or equal to $n + m + 1$, i.e., there exist an $\varepsilon > 0$ and an integer $T < \infty$ such that
\[
\sum_{i=t}^{t+T} [s(i),\ldots,s(i-n-m)]\,[s(i),\ldots,s(i-n-m)]^T \ge \varepsilon\, I_{n+m+1},
\]
for all $t$ sufficiently large; or

ii) $s(t)$ is an ARMA process, i.e., $s(t) = C(q^{-1})\epsilon(t)$, where $C(q^{-1})$ is a stable rational transfer function, and $\epsilon(t)$ is a martingale difference sequence with finite but nonzero variance.

Then $\hat\theta(t) \to \theta_0$ a.s.
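Before the proof, condition i) can be illustrated numerically: accumulate the outer products of the delayed-signal vector over a window and inspect the smallest eigenvalue. The two-dimensional case ($n + m + 1 = 2$) and the test signals below are illustrative assumptions.

```python
# Numerical illustration of the richness condition i) of Theorem 3.2.
def regressor_moment(s, n_plus_m):
    """Sum of outer products of [s(i), ..., s(i - n - m)] over the record."""
    dim = n_plus_m + 1
    M = [[0.0] * dim for _ in range(dim)]
    for i in range(n_plus_m, len(s)):
        v = [s[i - j] for j in range(dim)]
        for a in range(dim):
            for b in range(dim):
                M[a][b] += v[a] * v[b]
    return M

def min_eig_2x2(M):
    """Smallest eigenvalue of a symmetric 2x2 matrix (enough for dim 2)."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return tr / 2.0 - ((tr / 2.0) ** 2 - det) ** 0.5
```

For the period-2 signal $1, 0, 1, 0, \ldots$ the moment matrix is diagonal and positive definite, so the signal is sufficiently rich of order 2, whereas a constant signal yields a singular matrix and fails the condition.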
Proof. As in Section 2.1, it can be shown that
\[
\frac{1}{N}\big\|R(N-1)\,\tilde\theta(N)\big\| \le \frac{1}{N}\Big\|\sum_{t=1}^{N}\phi(t-1)\big(e(t)-v(t)\big)\Big\| + \frac{1}{N}\Big\|\sum_{t=1}^{N}\phi(t-1)\phi^T(t-1)\big(\tilde\theta(N)-\tilde\theta(t-1)\big)\Big\| + \frac{1}{N}\Big\|\sum_{t=1}^{N}\phi(t-1)\,v(t)\Big\| + \frac{1}{N}\sum_{t=1}^{N}\|R(t-1)\|\,\|\theta'(t)-\hat\theta(t)\| + o(1),
\]
where $R(N) := \sum_{t=1}^{N} \phi(t)\phi^T(t)$. The first two terms on the RHS above converge to zero by the Schwarz inequality and the self-optimality established earlier. The third term can be shown to converge to zero by the techniques in the proof of Theorem 3.1, i.e., the decomposition of $v(t)$, the backward recursion for $\phi(t-1)$, and Lemma 1 a). We turn next to the fourth term on the RHS above. Let $\gamma_D(x)$ denote the indicator function of the complement of a set $D$. Clearly
\[
\|\theta'(t) - \hat\theta(t)\| \le \gamma_{\mathcal{M}}(\theta'(t))\, \|\Delta'(t)\| .
\]
Hence,
\[
\frac{1}{N}\sum_{t=1}^{N}\|R(t-1)\|\,\|\theta'(t)-\hat\theta(t)\| \le \frac{1}{N}\sum_{t=1}^{N}\|\phi(t-1)\,e(t)\|\,\gamma_{\mathcal{M}}(\theta'(t)) \le \frac{1}{N}\sum_{t=1}^{N}\|\phi(t-1)\big(e(t)-v(t)\big)\| + \frac{1}{N}\sum_{t=1}^{N}\|\phi(t-1)\,v(t)\|\,\gamma_{\mathcal{M}}(\theta'(t)) \le o(1) + K\,\frac{1}{N}\sum_{t=1}^{N}\|\phi(t-1)\|^2\,\gamma_{\mathcal{M}}(\theta'(t)),
\]
for some finite constant $K$. From either condition i) or ii) of the theorem, by using the techniques of [3,13] we can establish that $\hat\theta(t)$ converges to $\theta_0$ in the Cesàro sense, i.e., for any open set $D$ containing $\theta_0$,
\[
\frac{1}{N}\sum_{t=1}^{N} \gamma_D(\hat\theta(t)) \to 0 \quad \text{a.s.}
\]
Since $\theta_0$ is an interior point of $\mathcal{M}$, and $\|\theta'(t) - \hat\theta(t)\| \to 0$, it follows that
\[
\frac{1}{N}\sum_{t=1}^{N} \|\phi(t-1)\|^2\,\gamma_{\mathcal{M}}(\theta'(t)) \to 0 \quad \text{a.s.}
\]
Therefore,
\[
\frac{1}{N}\big\|R(N-1)\,\tilde\theta(N)\big\| \to 0 \quad \text{a.s.}
\]
From either i) or ii), and the self-optimality, it can be shown that
\[
\liminf_{N\to\infty} \frac{1}{N}\,\lambda_{\min}\big(R(N)\big) > 0 .
\]
Hence, $\|\tilde\theta(t)\| \to 0$ a.s. □
3.2 Adaptive feedforward control

Let us consider Problems B and C depicted in Figs. 2 and 3. Clearly,
\[
e(t) = G(q^{-1})\,s(t) - P(q^{-1})\,u(t) + v(t). \tag{3.7}
\]
We make the following assumptions.

Assumptions

A3.6) $G(q^{-1})$, $P(q^{-1})$, $H(q^{-1})$ and $F(q^{-1})$ are stable rational transfer functions, and $P(q^{-1})$ is minimum phase.

A3.7) The pure delay in $G(q^{-1})$ is larger than or equal to that in $H(q^{-1})P(q^{-1})$, and $H(q^{-1})$ is minimum phase.

The model (3.7) can then be equivalently represented by
\[
A(q^{-1})\big(e(t) - v(t)\big) = q^{-d}B(q^{-1})\,u(t) + q^{-k}D(q^{-1})\,m(t), \tag{3.8}
\]
where $A(q^{-1})$, $B(q^{-1})$ and $D(q^{-1})$ are polynomials, $A(q^{-1})$ is Hurwitz, and $d \le k$. With the model parameterized as in (3.8), it is clear that the minimum variance feedforward control is given by
\[
B(q^{-1})\,u(t) + q^{-(k-d)}D(q^{-1})\,m(t) = 0.
\]
Let us now consider the adaptation procedure. Let $B(q^{-1}) = b_0 + \cdots + b_l q^{-l}$, $D(q^{-1}) = d_0 + \cdots + d_n q^{-n}$, $\theta_0 = [b_0,\ldots,b_l, d_0,\ldots,d_n]^T$, and
\[
\phi(t) := [u(t),\ldots,u(t-l),\; m(t-k+d),\ldots,m(t-k+d-n)]^T .
\]
Then
\[
A(q^{-1})\big(e(t) - v(t)\big) = \phi^T(t-d)\,\theta_0 .
\]
We propose the following adaptive feedforward control algorithm. The control $u(t)$ is generated so as to satisfy
\[
\phi^T(t)\,\hat\theta(t) = 0,
\]
and $\hat\theta(t)$, the estimate of $\theta_0$, is updated by
\[
\hat\theta(t) = \Gamma\!\left(\hat\theta(t-1) + \frac{a\,\phi(t-d)}{r(t-d)}\, e(t)\right), \qquad a > 0,
\]
\[
r(t-d) = r(t-d-1) + \|\phi(t-d)\|^2, \qquad r(-d) = 1,
\]
where $\Gamma(\cdot)$ denotes the projection onto a convex compact set $\mathcal{M}$. The following theorem establishes the optimality of the above adaptive control law.

Theorem 3.3 Let the following conditions hold:

i) $\{v(t)\}$ and $\{s(t)\}$ satisfy the assumptions (A3.1), (A3.2) and (A3.5).

ii) Assumptions (A3.6) and (A3.7) hold.

iii)
\[
\min_{|z|=1} \mathrm{Re}[A(z)] > \Big(d - \frac{1}{2}\Big)\, a . \tag{3.9}
\]

iv) $\theta_0 \in \mathcal{M}$, and $b_0 \ne 0$, $\forall\, \theta \in \mathcal{M}$.

Then
\[
\lim_{N\to\infty} \frac{1}{N}\sum_{t=1}^{N} \big[e(t) - v(t)\big]^2 = 0 \quad \text{a.s.} \tag{3.10}
\]
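Before turning to the proof, note that the certainty-equivalence control $\phi^T(t)\,\hat\theta(t) = 0$ of the algorithm above is simply a linear solve for $u(t)$; condition iv) ($b_0 \ne 0$ on $\mathcal{M}$) is exactly what keeps the division below well defined. A minimal sketch, where the array layouts are simplifying assumptions:

```python
# Sketch of the certainty-equivalence feedforward control: phi(t)^T theta = 0
# solved for u(t), with theta = [b0, ..., bl, d0, ..., dn]. Array layouts are
# illustrative; b0 != 0 is condition iv) of Theorem 3.3.
def feedforward_control(theta, u_past, m_delayed, l):
    """u_past: [u(t-1), ..., u(t-l)];  m_delayed: [m(t-k+d), ..., m(t-k+d-n)]."""
    b, d = theta[: l + 1], theta[l + 1:]
    acc = sum(bi * ui for bi, ui in zip(b[1:], u_past))   # b1 u(t-1) + ... + bl u(t-l)
    acc += sum(di * mi for di, mi in zip(d, m_delayed))   # d0 m(t-k+d) + ...
    return -acc / b[0]                                    # requires b0 != 0
```

For instance, with $\hat\theta = [2, 1, 3]$ ($l = 1$, $n = 0$), $u(t-1) = 4$ and $m(t-k+d) = 2$, the control is $u(t) = -(1\cdot 4 + 3\cdot 2)/2 = -5$, and indeed $\phi^T\hat\theta = 2(-5) + 4 + 6 = 0$.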
Proof. Let us denote $V(t) := \|\tilde\theta(t)\|^2$, and
\[
\Delta(t) := \frac{a\,\|\phi(t-d)\,e(t)\|}{r(t-d)} .
\]
Then $V(t)$ satisfies the recursion
\[
V(N) \le V(0) + \sum_{t=1}^{N} \Delta^2(t) + 2a \sum_{t=1}^{N} \frac{\phi^T(t-d)\,\tilde\theta(t-d)\, e(t)}{r(t-d)} + 2a \sum_{t=1}^{N} \frac{\phi^T(t-d)\big(\tilde\theta(t-1) - \tilde\theta(t-d)\big)\, e(t)}{r(t-d)} . \tag{3.11}
\]
The last term on the RHS above can be bounded as follows:
\[
2a \sum_{t=1}^{N} \frac{\phi^T(t-d)\big(\tilde\theta(t-1) - \tilde\theta(t-d)\big)\, e(t)}{r(t-d)}
\le 2\sum_{t=1}^{N}\sum_{i=1}^{d-1} \Delta(t)\,\Delta(t-i)
\le \sum_{t=1}^{N}\sum_{i=1}^{d-1} \big(\Delta^2(t) + \Delta^2(t-i)\big)
\le 2(d-1)\sum_{t=1}^{N} \Delta^2(t) + O(1). \tag{3.12}
\]
As in the proof of Theorem 3.1, we can show that
\[
\sum_{t=1}^{N} \frac{\phi^T(t-d)\,\tilde\theta(t-d)\, v(t)}{r(t-d)} \le O(1) + o\!\left(\sum_{t=1}^{N} \frac{\big[\phi^T(t-d)\tilde\theta(t-d)\big]^2}{r(t-d)}\right) + \varepsilon \sum_{t=1}^{N} \Delta^2(t), \tag{3.13}
\]
for every $\varepsilon > 0$. Substituting (3.12) and (3.13) into (3.11), we obtain
\[
V(N) \le (2d - 1 + \varepsilon)\sum_{t=1}^{N} \Delta^2(t) + 2a \sum_{t=1}^{N} \frac{\phi^T(t-d)\,\tilde\theta(t-d)\,\big(e(t) - v(t)\big)}{r(t-d)} + o\!\left(\sum_{t=1}^{N} \frac{\big[\phi^T(t-d)\tilde\theta(t-d)\big]^2}{r(t-d)}\right) + O(1). \tag{3.14}
\]
Now,
\[
\sum_{t=1}^{N} \Delta^2(t) = a^2 \sum_{t=1}^{N} \frac{\|\phi(t-d)\|^2\big[(e(t)-v(t))^2 + v^2(t) + 2(e(t)-v(t))v(t)\big]}{r^2(t-d)} \le a^2 \sum_{t=1}^{N} \frac{\big(e(t)-v(t)\big)^2}{r(t-d)} + O(1).
\]
Substituting the above into (3.14), noting that
\[
A(q^{-1})\big(e(t) - v(t)\big) = -\phi^T(t-d)\,\tilde\theta(t-d),
\]
and using (3.9), we obtain
\[
\sum_{t=1}^{\infty} \frac{\big(e(t) - v(t)\big)^2}{r(t-d)} < \infty \quad \text{a.s.}
\]
Finally, since $F(q^{-1})$ is stable and $P(q^{-1})$ is minimum phase, we have
\[
r(N-1) = O\!\left(\sum_{t=0}^{N}\big(m^2(t) + u^2(t)\big)\right) = O(N) \quad \text{a.s.}
\]
The optimality (3.10) then follows as in the proof of Theorem 2.7. □
Remarks: i) It is worth noting that the presence of the input contamination $F(q^{-1})$ does not require any algorithmic modification. ii) Unlike the output error recursions for filtering or identification, it is not possible to use a posteriori estimates in the regression vector for adaptive feedforward control. iii) When $s(t)$ is periodic, the assumption (A3.7) can be removed from Theorem 3.3. □
4 Concluding Remarks
As we have noted at the outset, stochastic adaptive system theory has as its goal the improvement of the performance of an adaptive system by exploiting any stochastic, typically correlational, properties of disturbances and signals. As a step towards providing such a coherent view of stochastic adaptive system theory, we have provided a somewhat unified treatment of several problems in recursive identification, adaptive signal processing, and adaptive control, based both on the equation error method as well as the parallel model method. There is yet another dimension towards a more encompassing stochastic adaptive system theory which at the current time remains largely an open area. This is to use any prior information about the likely values of the true parameters and treat them also as random variables. Such an approach leads us back essentially to stochastic control, and what is commonly termed "dual" control. While the determination of "optimal" algorithms for such a model seems intractable, there are interesting issues that arise. For example, will an "optimal" or "good" algorithm be stable for a fixed value of the parameters? How "robust" are these algorithms with respect to the assumption on the underlying probability distribution for 00? Such a theory suggests the promise of a better treatment and analysis of transient properties of adaptive algorithms, in contrast to the largely asymptotic theory currently existing. We have also not dealt here with the "robustness" issues in adaptive control. Much progress has been made on this problem, and is surveyed elsewhere in
this volume. However, many performance issues relating to the choice of both adaptation and control designs remain open. A genuinely comprehensive theory of adaptive systems, it seems, will require a unification of all these considerations, and still remains to be developed.
References

1. Y. S. Chow, "Local convergence of martingales and the law of large numbers," Ann. Math. Statist., vol. 36, pp. 552-558, 1965.
2. T. L. Lai and C. Z. Wei, "Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems," Ann. Statist., vol. 10, pp. 154-166, 1982.
3. A. Becker, P. R. Kumar, and C. Z. Wei, "Adaptive control with the stochastic approximation algorithm: Geometry and convergence," IEEE Trans. Aut. Control, vol. AC-30, pp. 330-338, April 1985.
4. H. F. Chen and L. Guo, "Asymptotically optimal adaptive control with consistent parameter estimates," SIAM J. Control Optimiz., vol. 25, no. 3, pp. 558-575, 1987.
5. K. J. Åström and B. Wittenmark, "On self-tuning regulators," Automatica, vol. 9, pp. 185-199, 1973.
6. L. Ljung, "Analysis of recursive stochastic algorithms," IEEE Trans. Aut. Control, vol. AC-22, pp. 551-575, 1977.
7. L. Ljung, "On positive real transfer functions and the convergence of some recursive schemes," IEEE Trans. Aut. Control, vol. AC-22, pp. 539-551, 1977.
8. V. Solo, "The convergence of AML," IEEE Trans. Aut. Control, vol. AC-24, pp. 958-962, 1979.
9. G. C. Goodwin, P. J. Ramadge, and P. E. Caines, "Discrete time stochastic adaptive control," SIAM J. Control Optimiz., vol. 19, pp. 829-853, 1981.
10. G. C. Goodwin, K. S. Sin, and K. K. Saluja, "Stochastic adaptive control and prediction - the general delay-colored noise case," IEEE Trans. Aut. Control, vol. AC-25, pp. 946-949, 1980.
11. J. J. Fuchs, "Indirect stochastic adaptive control: The general delay-white noise case," IEEE Trans. Aut. Control, vol. AC-27, pp. 219-223, 1982.
12. H. F. Chen and P. E. Caines, "Strong consistency of the stochastic gradient algorithm of adaptive control," IEEE Trans. Aut. Control, vol. AC-30, pp. 159-192, 1985.
13. P. R. Kumar and L. Praly, "Self-tuning trackers," SIAM J. Control Optimiz., vol. 25, pp. 1053-1071, July 1987.
14. T. L. Lai and C. Z. Wei, "Extended least squares and their applications to adaptive control and prediction in linear systems," IEEE Trans. Aut. Control, vol. AC-31, pp. 898-906, 1986.
15. H. F. Chen and L. Guo, "Convergence rate of least squares identification and adaptive control for stochastic systems," Int. J. Control, vol. 44, pp. 1459-1476, 1986.
16. M. Radenković and S. Stanković, "Strong consistency of parameter estimates in direct self-tuning control algorithms based on stochastic approximation," Automatica, vol. 26, pp. 533-544, May 1990.
17. W. Ren and P. R. Kumar, "Direct stochastic minimum variance control with noninterlaced algorithms," submitted to 1991 Amer. Control Conf., September 1990.
18. W. Ren and P. R. Kumar, "On stochastic adaptive control," to appear as a Technical Report, 1990.
19. W. Ren and P. R. Kumar, "The convergence of output error recursions in infinite order moving average noise," to appear in New Directions in Time Series Analysis, IMA Volumes in Mathematics and its Applications, Springer-Verlag, 1991.
20. W. Ren and P. R. Kumar, "On stochastic parallel model adaptation problems," to be submitted, 1990.
21. P. R. Kumar, "Convergence of adaptive control schemes using least-squares parameter estimates," IEEE Trans. Aut. Control, vol. AC-35, pp. 416-424, 1990.
22. K. S. Sin, G. C. Goodwin, and R. R. Bitmead, "An adaptive d-step ahead predictor based on least squares," IEEE Trans. Aut. Control, vol. AC-25, pp. 1161-1164, 1980.
23. L. Guo and H. F. Chen, "Revisit to Åström-Wittenmark's self-tuning regulator and ELS-based adaptive trackers," Research Report, Academia Sinica, Institute of Systems Sciences, Beijing, 1990.
24. W. Ren and P. R. Kumar, "Adaptive active noise control: Structures, algorithms and convergence analysis," in Proc. Inter-Noise 89, pp. 435-440, Newport Beach, CA, Dec. 1989.
25. I. D. Landau, "Unbiased recursive identification using model reference adaptive techniques," IEEE Trans. Aut. Control, vol. AC-21, pp. 194-202, 1976.
26. W. Ren and P. R. Kumar, "The convergence of output error identification and adaptive IIR filtering algorithms in the presence of colored noise," Proc. 29th Conf. Dec. Control, pp. 3534-3539, Honolulu, HI, Dec. 1990.
Part II

Adaptive Nonlinear Control
Adaptive Feedback Linearization of Nonlinear Systems*

P. V. Kokotović,¹ I. Kanellakopoulos,¹ and A. S. Morse²

¹ Coordinated Science Laboratory, University of Illinois, Urbana, IL 61801, USA.
² Department of Electrical Engineering, Yale University, New Haven, CT 06520-1968, USA.
Abstract. After an examination of the restrictive assumptions that limit the applicability of existing adaptive nonlinear control schemes, new adaptive regulation and tracking schemes are developed for a class of feedback linearizable nonlinear systems. The coordinate-free geometric conditions, which characterize this class of systems, neither restrict the location of the unknown parameters, nor constrain the growth of the nonlinearities. Instead, they require that the nonlinear system be transformable into the so-called pure-feedback form. When this form is "strict," the proposed scheme guarantees global regulation and tracking properties, and substantially enlarges the class of nonlinear systems with unknown parameters for which global stabilization can be achieved. The new design procedure is systematic and its stability proofs use simple analytical tools, familiar to most control engineers.
1 Introduction

Until a few years ago, adaptive linear [1,2] and geometric nonlinear [3,4] methods belonged to two separate areas of control theory. They were helpful in the design of controllers for plants containing either unknown parameters or known nonlinearities, but not both. In the last few years the problem of adaptive nonlinear control was formulated to deal with the control of plants containing both unknown parameters and known nonlinearities. A realistic plan of attack for this challenging new problem led through a series of simpler problems, each formulated under certain restrictive assumptions. The two most common assumptions are those of linear parametrization and full-state feedback.

Linear parametrization. This assumption, adopted by all the researchers in the field [5-18], requires that in a nonlinear plant the unknown parameters either appear, or can be made to appear, linearly. For example, if the plant model contains not only $\theta_1$ and $\theta_2$, but also $e^{\theta_1\theta_2}$, it is to be "overparametrized" by

* The work of the first two authors was supported in part by the National Science Foundation under Grant ECS-87-15811 and in part by the Air Force Office of Scientific Research under Grant AFOSR 90-0011. The work of the third author was supported by the National Science Foundation under Grants ECS-88-05611 and ECS-90-12551.
introducing $\theta_3 = e^{\theta_1\theta_2}$ as an additional parameter. For linear plants, an analogous linear parametrization is to consider the coefficients of a transfer function as "plant parameters," although they may be nonlinear functions of physical parameters such as resistance, inertia, etc. Most adaptive nonlinear control results have been obtained for linearly parametrized systems of the form
\[
\dot\xi = f_0(\xi) + \sum_{i=1}^{p}\theta_i f_i(\xi) + \Big(g_0(\xi) + \sum_{i=1}^{p}\theta_i g_i(\xi)\Big)\, u, \tag{1.1}
\]
where $\xi \in \mathbb{R}^n$ is the state, $u \in \mathbb{R}$ is the input, $\theta = [\theta_1,\ldots,\theta_p]^T$ is the vector of unknown constant parameters, and $f_i, g_i$, $0 \le i \le p$, are smooth vector fields in a neighborhood of the origin $\xi = 0$, with $f_i(0) = 0$, $0 \le i \le p$, $g_0(0) \ne 0$.
Full-state feedback. With the exception of some very recent papers [17,18], all the adaptive nonlinear control literature [5-16] deals with the full-state feedback problem. The assumption of full-state feedback is motivated by the geometric conditions for input-output or full-state linearization of (1.1). Parameter-dependent forms of the feedback linearization conditions for (1.1) and their "certainty-equivalence" implementation require that further restrictions be imposed either on the location of the unknown parameters or on the type of nonlinearities. According to these additional restrictions, the existing adaptive schemes can be classified into uncertainty-constrained schemes and nonlinearity-constrained schemes.

Uncertainty-constrained schemes, surveyed in Section 2, impose restrictions (matching conditions) on the location of unknown parameters, but can handle all types of nonlinearities. References dealing with this class of schemes are [5-11,14].

Nonlinearity-constrained schemes, surveyed in Section 3, do not restrict the location of unknown parameters. Instead, they impose restrictions on the nonlinearities of the original system, as well as on those appearing in the transformed error system. References dealing with this class of schemes are [12-16].

The brief survey of the existing adaptive schemes in Sections 2 and 3 stresses their major limitations. One of the main results of this paper is that these limitations can be removed for the so-called pure-feedback systems. This is now the broadest class of nonlinear systems for which adaptive controllers can be systematically designed without imposing any growth constraints on system nonlinearities.

Pure-feedback systems are introduced in Section 4. Their geometric characterization identifies the level of uncertainty and nonlinear complexity as structural obstacles to adaptive feedback linearization. For an unknown parameter, the level of uncertainty is its "distance", in terms of the number of integrators,
from the control input. The larger this distance is, the smaller is the number of state variables on which the nonlinearity multiplying this parameter is allowed to depend (nonlinear complexity).
Systematic design procedure. The new adaptive scheme for pure-feedback systems is designed by a step-by-step procedure which interlaces, at each step, the change of coordinates required for feedback linearization, and the construction of parameter update laws required for adaptation. The main idea of this new nonlinear procedure evolved from an earlier linear result of Feuer and Morse [19]. In Section 5, the procedure is developed for single-input systems. Its feasibility and the stability of the resulting adaptive system are established in Section 6. The extension to the multi-input case is given in Section 7. One of the most important stability and robustness properties of every adaptive system is the size of its region of attraction, relative to the size of the region that would have been achieved if all the parameters were known. When with the known parameters the stability and tracking properties are global, but the same properties of the adaptive scheme are only local, then the loss of globality is due to adaptation. To avoid this loss, the schemes of [12,13,15,16] require that the nonlinearities and some of their derivatives satisfy a linear growth condition, which severely limits the applicability of these schemes. The class of systems for which the new adaptive scheme guarantees global regulation and tracking is much wider.

Global regulation and tracking. It is shown in Section 8 that the region of attraction for the new adaptive scheme is global if the feedback linearization is global. A subclass of pure-feedback systems for which this global property is easy to establish are strict-feedback systems. For these systems the new adaptive scheme achieves both global regulation and, as shown in Section 9, global tracking of smooth bounded reference inputs. In contrast to the earlier schemes, these global results are obtained without any growth constraints on system nonlinearities. This is illustrated by two examples in Section 10.
2 Uncertainty-Constrained Schemes

The uncertainty-constrained schemes developed by Kanellakopoulos, Kokotović and Marino [6,7,8] and Campion and Bastin [5] use the so-called extended matching condition (EMC), which was introduced in [6,7] and reformulated in [5] as a "strong linearizability" condition. For the nonlinear system (1.1), the EMC has the following geometric expression:
\[
f_i \in \mathrm{sp}\{g_0, \mathrm{ad}_{f_0} g_0\}, \qquad g_i \in \mathrm{sp}\{g_0\}. \tag{2.1}
\]
This condition extends the strict matching condition (SMC)
\[
f_i \in \mathrm{sp}\{g_0\}, \qquad g_i \in \mathrm{sp}\{g_0\}, \tag{2.2}
\]
which was used by Taylor, Kokotović, Marino and Kanellakopoulos [10,11], and, in a special case, by Slotine and Coetsee [9]. For full-state feedback linearizable systems, the EMC is necessary and sufficient for the existence of a parameter-independent diffeomorphism $x = \phi(\xi)$ which transforms the system (1.1) into
\[
\begin{aligned}
\dot x_i &= x_{i+1}, \qquad i = 1,\ldots,n-2 \\
\dot x_{n-1} &= x_n + \sum_{i=1}^{p} \theta_i \gamma_i(x) \\
\dot x_n &= \Big(\beta_0(x) + \sum_{i=1}^{p} \theta_i \beta_i(x)\Big)\, u .
\end{aligned} \tag{2.3}
\]
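The spanning conditions (2.1) can be checked numerically for a concrete model; the sketch below approximates $\mathrm{ad}_{f_0} g_0 = [f_0, g_0]$ by finite differences for the third-order example used next, with the illustrative choice $\gamma(x_1) = x_1^2$ and an arbitrary test point.

```python
# Finite-difference check of the spanning condition in (2.1) for the system
# (2.4): f0 = (x2, x3, 0), f1 = (0, gamma(x1), 0), g0 = (0, 0, 1).
# The choice gamma(x1) = x1**2 and the test point are illustrative assumptions.
def jacobian(f, x, h=1e-6):
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        for i in range(n):
            J[i][j] = (fp[i] - fm[i]) / (2.0 * h)
    return J

def lie_bracket(f, g, x):
    """[f, g](x) = Dg(x) f(x) - Df(x) g(x), so ad_{f0} g0 = [f0, g0]."""
    Df, Dg = jacobian(f, x), jacobian(g, x)
    fx, gx = f(x), g(x)
    n = len(x)
    return [sum(Dg[i][j] * fx[j] - Df[i][j] * gx[j] for j in range(n))
            for i in range(n)]

gamma = lambda x1: x1 ** 2
f0 = lambda x: [x[1], x[2], 0.0]
f1 = lambda x: [0.0, gamma(x[0]), 0.0]
g0 = lambda x: [0.0, 0.0, 1.0]

x = [0.7, -0.3, 0.2]
ad_f0_g0 = lie_bracket(f0, g0, x)   # numerically (0, -1, 0) for this model
# f1(x) = (0, gamma(x1), 0) is a scalar multiple of ad_{f0} g0, and g1 = 0,
# so the EMC (2.1) holds for this example.
```

Since $f_1$ is always a multiple of $\mathrm{ad}_{f_0} g_0$ here, the membership $f_1 \in \mathrm{sp}\{g_0, \mathrm{ad}_{f_0} g_0\}$ required by (2.1) holds at every point.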
To illustrate the properties of the EMC-based schemes of [6,7] we use the system
\[
\begin{aligned}
\dot x_1 &= x_2 \\
\dot x_2 &= x_3 + \theta\gamma(x_1) \\
\dot x_3 &= u ,
\end{aligned} \tag{2.4}
\]
with $\gamma(x_1)$ a smooth nonlinear function of $x_1$. This system is already in the form (2.3). The design procedure of [6,7] employs an estimate $\hat\theta$ of the unknown parameter $\theta$ and replaces $x_3$ with the new state
\[
\xi_3 = x_3 + \hat\theta\,\gamma(x_1) . \tag{2.5}
\]
The system (2.4) is then rewritten as
\[
\begin{aligned}
\dot x_1 &= x_2 \\
\dot x_2 &= \xi_3 + \gamma(x_1)(\theta - \hat\theta) \\
\dot\xi_3 &= u + \hat\theta\,\frac{\partial\gamma}{\partial x_1}\, x_2 + \dot{\hat\theta}\,\gamma(x_1) .
\end{aligned} \tag{2.6}
\]
It is now possible to design the control as a function of the state and both $\hat\theta$ and $\dot{\hat\theta}$, because the derivative $\dot{\hat\theta}$ will be explicitly known from the update law. The control
\[
u = -\hat\theta\,\frac{\partial\gamma(x_1)}{\partial x_1}\, x_2 - \dot{\hat\theta}\,\gamma(x_1) - k_1 x_1 - k_2 x_2 - k_3 \big[x_3 + \hat\theta\,\gamma(x_1)\big] \tag{2.7}
\]
renders the system (2.6) linear in the parameter error $\theta - \hat\theta$:
\[
\begin{aligned}
\dot x_1 &= x_2 \\
\dot x_2 &= \xi_3 + (\theta - \hat\theta)\,\gamma(x_1) \\
\dot\xi_3 &= -k_1 x_1 - k_2 x_2 - k_3 \xi_3 .
\end{aligned} \tag{2.8}
\]
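Because (2.8) is linear up to the parameter-error term, its gains can be computed by matching the characteristic polynomial $s^3 + k_3 s^2 + k_2 s + k_1$ of the companion matrix written next to a desired stable polynomial; a minimal sketch, with illustrative pole locations:

```python
# Pole placement for the companion-form error dynamics of (2.8): the
# characteristic polynomial is s^3 + k3 s^2 + k2 s + k1, so the gains follow
# by expanding (s - p1)(s - p2)(s - p3). Pole locations are illustrative.
def companion_gains(p1, p2, p3):
    k3 = -(p1 + p2 + p3)
    k2 = p1 * p2 + p1 * p3 + p2 * p3
    k1 = -(p1 * p2 * p3)
    return k1, k2, k3
```

Placing all three eigenvalues at $-1$ gives $(k_1, k_2, k_3) = (1, 3, 3)$, i.e. the polynomial $(s+1)^3$.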
The gains $k_1, k_2, k_3$ are chosen to place the eigenvalues of the system matrix
\[
A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -k_1 & -k_2 & -k_3 \end{bmatrix} \tag{2.9}
\]
at some desired stable locations. The general form of (2.8) is
\[
\dot\zeta = A\zeta + W(\zeta)\,(\theta - \hat\theta). \tag{2.10}
\]
This form is desirable, because it suggests that a parameter update law can be designed using a result of adaptive linear control [1]. This update law is
\[
\dot{\hat\theta} = W^T(\zeta)\, P\, \zeta , \tag{2.11}
\]
where $P = P^T > 0$ is chosen to satisfy $PA + A^T P = -I$. The stability of the equilibrium $\zeta = 0$, $\hat\theta = \theta$ of the closed-loop system (2.10)-(2.11) is then established using the quadratic Lyapunov function $V = \zeta^T P \zeta + (\theta - \hat\theta)^2$, whose derivative along the solutions of (2.10)-(2.11) is $\dot V = -\|\zeta\|^2 \le 0$. Since the feedback linearization of (2.4) can be achieved for all $x$ and $\hat\theta$, this stability result is global. The largest invariant set of (2.10)-(2.11) contained in the set where $\dot V = 0$ is
\[
M = \{(\zeta, \hat\theta) : \zeta = 0,\ (\theta - \hat\theta)\gamma(0) = 0\}. \tag{2.12}
\]
Furthermore, if $\gamma(0) \ne 0$, then $(0, \theta)$ can be shown to be an exponentially stable equilibrium of (2.10)-(2.11). Thus, the stability of this adaptive scheme is robust with respect to both fast stable unmodeled dynamics and small disturbances. In the case when $\gamma(0) = 0$, or, in the multi-input case, when the number of unknown parameters is more than twice the number of control inputs, only stability, rather than exponential stability, can be achieved. The above example demonstrates the main advantage of EMC-based schemes: their stability properties can be established independently of the type of nonlinearities. Clearly, the nonlinear function $\gamma(x_1)$ in (2.4) can be any smooth function, and is not restricted by global Lipschitz or sector-type conditions. Even though the EMC is quite restrictive, it is satisfied by many systems of practical importance, such as most types of electric motors [7] and autocatalyzed chemical reactions [5]. The robustness of EMC-based schemes with respect to unmodeled dynamics [6,7] can be exploited to extend their applicability. For systems that do not satisfy the EMC, a partial high-gain feedback control can be used to induce a two-time-scale property, such that the slow subsystem satisfies the EMC, and the fast stable subsystem is treated as unmodeled dynamics. This possibility was exploited by Taylor [10] and is illustrated on the system
\[
\begin{aligned}
\dot x_1 &= x_2 + \theta\gamma(x_1) \\
\dot x_2 &= x_3 \\
\dot x_3 &= u,
\end{aligned} \tag{2.13}
\]
which does not satisfy the EMC (2.1). Using the control
\[
u = \frac{1}{\mu}\,(v - x_3), \tag{2.14}
\]
where $v$ is the new control variable and $\mu$ is a small positive constant, and replacing $x_3$ with the new state $\eta = x_3 - v$, we transform (2.13) into the standard singular perturbation form
\[
\begin{aligned}
\dot x_1 &= x_2 + \theta\gamma(x_1) \\
\dot x_2 &= v + \eta \\
\mu\dot\eta &= -\eta - \mu\dot v .
\end{aligned} \tag{2.15}
\]
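A quick Euler simulation exhibits the time-scale separation in (2.15): the fast state $\eta$ collapses on the scale $t/\mu$ while $x_1, x_2$ evolve slowly. Everything below, the parameter value, gains, step size, the choice $\gamma(x_1) = x_1^2$, and the simple feedback-linearizing $v$ designed with $\theta$ known, is an assumption made for this sketch, not part of the adaptive scheme discussed in the text.

```python
# Illustrative Euler simulation of the singularly perturbed form (2.15), with
# theta known and a simple stabilizing v for the slow subsystem; all constants
# are assumptions for this sketch.
def simulate(mu=0.01, dt=1e-4, steps=20000, theta=0.5):
    gamma = lambda x1: x1 ** 2
    x1, x2, eta = 1.0, 0.0, 1.0
    v_prev, eta_path = 0.0, []
    for k in range(steps):
        xi2 = x2 + theta * gamma(x1)                  # slow coordinate
        v = -3.0 * x1 - 3.0 * xi2 - 2.0 * theta * x1 * xi2
        vdot = (v - v_prev) / dt if k > 0 else 0.0    # crude estimate of v'
        x1 += dt * (x2 + theta * gamma(x1))
        x2 += dt * (v + eta)
        eta += dt * (-eta - mu * vdot) / mu           # fast dynamics of (2.15)
        v_prev = v
        eta_path.append(eta)
    return x1, x2, eta, eta_path
```

With these illustrative values, $\eta$ decays from 1 to near zero within a few multiples of $\mu$, while $x_1$ settles on the slow time scale; this is the two-time-scale behavior that justifies treating $\eta$ as fast unmodeled dynamics.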
Neglecting the fast unmodeled dynamics by setting $\mu = 0$, we obtain the reduced-order system
\[
\begin{aligned}
\dot x_1 &= x_2 + \theta\gamma(x_1) \\
\dot x_2 &= v,
\end{aligned} \tag{2.16}
\]
for which an EMC-based scheme is:
\[
\xi_2 = x_2 + \hat\theta\,\gamma(x_1) \tag{2.17}
\]
\[
v = -k_1 x_1 - k_2 \xi_2 - \dot{\hat\theta}\,\gamma(x_1) - \hat\theta\,\frac{\partial\gamma(x_1)}{\partial x_1}\,\xi_2 . \tag{2.18}
\]
When applied to (2.15), this scheme results in the closed-loop system

\dot x_1 = \xi_2 + (\theta - \hat\theta)\gamma(x_1)
\dot\xi_2 = -k_1 x_1 - k_2\xi_2 + (\theta - \hat\theta)\hat\theta\frac{\partial\gamma(x_1)}{\partial x_1}\gamma(x_1) + \eta   (2.19)
\mu\dot\eta = -\eta - \mu\dot v.   (2.20)
As shown in [7], this system has x_1 = \xi_2 = \eta = 0, \hat\theta = \theta as a stable equilibrium with a guaranteed region of attraction.

The EMC-based schemes of [6,7,8] and [5] may result in implicit expressions for the control and the update law. The implicit form will arise if and only if there is at least one unknown parameter \theta_i which enters the control vector field (so that its derivative \dot{\hat\theta}_i is defined in terms of the control u) and whose derivative has to be cancelled by the control (so that u is in turn defined in terms of \dot{\hat\theta}_i). For the system (1.1), the explicit extended matching condition (EEMC), which is necessary and sufficient for u and \dot{\hat\theta} to be explicitly defined, is

f_i \in {\rm span}\{g_0, {\rm ad}_{f_0}g_0\}, \quad g_i \in {\rm span}\{g_0\}, \quad \forall i = 1, \dots, p.   (2.21)
To illustrate the implicit form of the scheme of [6,7,8], we use the system

\dot x_1 = x_2
\dot x_2 = x_3 + \theta(1 + x_1^2)   (2.22)
\dot x_3 = (1 + \theta x_1)u,
Adaptive Feedback Linearization of Nonlinear Systems
for which the new state is

\xi_3 = x_3 + \hat\theta(1 + x_1^2).   (2.23)

The control and the parameter update law are

u = \frac{-(1 + x_1^2)\dot{\hat\theta} - 2x_1x_2\hat\theta - k_1x_1 - k_2x_2 - k_3\xi_3}{1 + \hat\theta x_1}   (2.24)

\dot{\hat\theta} = \left[0 \quad (1 + x_1^2) \quad x_1u\right] P \begin{bmatrix} x_1 \\ x_2 \\ \xi_3 \end{bmatrix}.   (2.25)
The key point is that (2.24) and (2.25) define u and \dot{\hat\theta} as implicit functions of x and \hat\theta. Eliminating \dot{\hat\theta} we obtain

\left[1 + \hat\theta x_1 + x_1(1 + x_1^2)(P_{13}x_1 + P_{23}x_2 + P_{33}\xi_3)\right]u =
 -(1 + x_1^2)^2(P_{12}x_1 + P_{22}x_2 + P_{23}\xi_3) - 2x_1x_2\hat\theta - k_1x_1 - k_2x_2 - k_3\xi_3.   (2.26)
To express u explicitly, the term multiplying it must be nonzero, which is true in a region S around the equilibrium x_1 = x_2 = x_3 = 0, \hat\theta = \theta. The boundaries of S are the manifolds on which the term multiplying u in (2.26) is zero. For the scheme of Campion and Bastin [5], the counterpart of (2.26) is equation (20) of [5], assumed to be solvable in a region D_x \times D_\theta.

An overparametrization approach that always avoids the implicit form is due to Pomet and Praly [14]. To illustrate this approach, the system (2.22) is rewritten as

\dot x_1 = x_2
\dot x_2 = x_3 + \theta_1(1 + x_1^2)   (2.27)
\dot x_3 = (1 + \theta_2 x_1)u,

where the same unknown parameter \theta is treated as two parameters \theta_1 and \theta_2, which are estimated separately:

\begin{bmatrix}\dot{\hat\theta}_1 \\ \dot{\hat\theta}_2\end{bmatrix} =
\begin{bmatrix} 0 & 1 + x_1^2 & 0 \\ 0 & 0 & x_1u \end{bmatrix} P \begin{bmatrix} x_1 \\ x_2 \\ \xi_3 \end{bmatrix},   (2.28)
and the implicit form is avoided because

\xi_3 = x_3 + \hat\theta_1(1 + x_1^2)   (2.29)

u = \frac{-(1 + x_1^2)\dot{\hat\theta}_1 - 2x_1x_2\hat\theta_1 - k_1x_1 - k_2x_2 - k_3\xi_3}{1 + \hat\theta_2 x_1}.   (2.30)

A price paid in this overparametrization is the loss of exponential stability. However, the restriction to the region S has been removed and the only requirement is that 1 + \hat\theta_2 x_1 be bounded away from zero. Hence, the designer is faced with a tradeoff: if the solvability region S is large enough, the implicit scheme is preferable. If, on the other hand, S is too small, then the overparametrized explicit scheme must be used, and additional measures will have to be taken to ensure robustness.
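Before moving on, the reduced-order EMC design (2.16)-(2.18) can be exercised numerically. The following is a minimal simulation sketch, not an implementation from the text: the nonlinearity \gamma(x_1) = x_1^2, the gains k_1 = k_2 = 2, the true parameter value, and the Lyapunov-based update law \dot{\hat\theta} = w^T P\xi (with P solving PA + A^TP = -I, as above) are all illustrative assumptions.

```python
# Numerical sketch of an EMC-based adaptive scheme for the reduced-order
# system (2.16): x1' = x2 + theta*gamma(x1), x2' = v, with gamma(x1) = x1^2.
# Gains, gamma, and the update law theta_hat' = w^T P xi are illustrative
# assumptions; P solves P A + A^T P = -I as in the text.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov
from scipy.integrate import solve_ivp

k1, k2 = 2.0, 2.0
A = np.array([[0.0, 1.0], [-k1, -k2]])
P = solve_continuous_lyapunov(A.T, -np.eye(2))   # P A + A^T P = -I

theta = 2.0                                      # true (unknown) parameter
gamma  = lambda x1: x1**2
dgamma = lambda x1: 2.0*x1

def rhs(t, s):
    x1, x2, th = s                               # th = parameter estimate
    xi2 = x2 + th*gamma(x1)                      # new state (2.17)
    xi = np.array([x1, xi2])
    w = np.array([gamma(x1), th*dgamma(x1)*gamma(x1)])  # regressor
    th_dot = w @ P @ xi                          # Lyapunov-based update
    # control (2.18): cancels the estimated nonlinearity
    v = -k1*x1 - k2*xi2 - th*dgamma(x1)*xi2 - th_dot*gamma(x1)
    return [x2 + theta*gamma(x1), v, th_dot]

sol = solve_ivp(rhs, (0.0, 40.0), [1.0, 0.0, 0.0], rtol=1e-8, atol=1e-10)
x1f, x2f, thf = sol.y[:, -1]
print(abs(x1f), abs(x2f))                        # both decay toward zero
```

Along these solutions V = \xi^T P\xi + (\theta - \hat\theta)^2 satisfies \dot V = -\|\xi\|^2, so the state is regulated even though \hat\theta need not converge to \theta (here \gamma(0) = 0, the non-exponential case discussed above).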
3 Nonlinearity-Constrained Schemes
If a system, such as (2.13), does not satisfy the EMC, one has to resort to one of the nonlinearity-constrained schemes developed in [12-16]. To illustrate these schemes we again use the system (2.13).

The scheme of Sastry and Isidori [15] is developed for input-output linearizable systems with exponentially stable zero dynamics. When applied to (2.13), this scheme employs the change of coordinates

y_1 = x_1
y_2 = \dot y_1 = x_2 + \theta\gamma(x_1)   (3.1)
y_3 = \dot y_2 = x_3 + \theta\frac{\partial\gamma(x_1)}{\partial x_1}\left(x_2 + \theta\gamma(x_1)\right)
to transform the system (2.13) into the normal form

\dot y_1 = y_2
\dot y_2 = y_3
\dot y_3 = u + \theta\left[\frac{\partial\gamma(x_1)}{\partial x_1}x_3 + \frac{\partial^2\gamma(x_1)}{\partial x_1^2}x_2^2\right] + \theta^2\left[2\frac{\partial^2\gamma(x_1)}{\partial x_1^2}\gamma(x_1)x_2 + \left(\frac{\partial\gamma(x_1)}{\partial x_1}\right)^2x_2\right] + \theta^3\left[\frac{\partial^2\gamma(x_1)}{\partial x_1^2}\gamma^2(x_1) + \left(\frac{\partial\gamma(x_1)}{\partial x_1}\right)^2\gamma(x_1)\right],   (3.2)

which is no longer linear in \theta. One way to circumvent this obstacle is to introduce two new parameters \theta_2 and \theta_3:

\theta_1 = \theta, \quad \theta_2 = \theta^2, \quad \theta_3 = \theta^3.   (3.3)

This overparametrization makes (3.2) linear in \theta_1, \theta_2 and \theta_3, but it also increases the dimension and complexity of the adaptive controller. A "certainty-equivalence" control is implemented as

u = -k_1\hat y_1 - k_2\hat y_2 - k_3\hat y_3 - \hat\theta_1w_1(x) - \hat\theta_2w_2(x) - \hat\theta_3w_3(x),   (3.4)
where

w_1(x) = \frac{\partial\gamma(x_1)}{\partial x_1}x_3 + \frac{\partial^2\gamma(x_1)}{\partial x_1^2}x_2^2   (3.5)

w_2(x) = 2\frac{\partial^2\gamma(x_1)}{\partial x_1^2}\gamma(x_1)x_2 + \left(\frac{\partial\gamma(x_1)}{\partial x_1}\right)^2x_2   (3.6)

w_3(x) = \frac{\partial^2\gamma(x_1)}{\partial x_1^2}\gamma^2(x_1) + \left(\frac{\partial\gamma(x_1)}{\partial x_1}\right)^2\gamma(x_1).   (3.7)
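The decomposition (3.2)-(3.7) can be checked symbolically: differentiating y_3 of (3.1) along the trajectories of (2.13) and collecting powers of \theta must reproduce w_1, w_2, w_3. A short sketch (the symbolic-verification approach and the particular choice \gamma(x_1) = x_1\sin x_1, taken from (3.21) below, are the only ingredients not in the text):

```python
# Symbolic check that y3' = u + theta*w1 + theta^2*w2 + theta^3*w3,
# with w1, w2, w3 as in (3.5)-(3.7), for an example gamma(x1).
import sympy as sp

x1, x2, x3, th, u = sp.symbols('x1 x2 x3 theta u')
g = x1*sp.sin(x1)                     # example nonlinearity, cf. (3.21)
gp, gpp = sp.diff(g, x1), sp.diff(g, x1, 2)

y3 = x3 + th*gp*(x2 + th*g)           # third coordinate of (3.1)
f = [x2 + th*g, x3, u]                # right-hand side of (2.13)
y3dot = sp.expand(sum(sp.diff(y3, v)*f[i] for i, v in enumerate((x1, x2, x3))))

w1 = gp*x3 + gpp*x2**2                # (3.5)
w2 = 2*gpp*g*x2 + gp**2*x2            # (3.6)
w3 = gpp*g**2 + gp**2*g               # (3.7)

residual = sp.simplify(y3dot - (u + th*w1 + th**2*w2 + th**3*w3))
print(residual)   # 0
```

The residual vanishes identically for any smooth \gamma, confirming that (3.5)-(3.7) are exactly the coefficients of \theta, \theta^2, \theta^3 in (3.2).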
The control (3.4) uses the estimated states

\hat y_2 = x_2 + \hat\theta_1\gamma(x_1)   (3.8)

\hat y_3 = x_3 + \hat\theta_1\frac{\partial\gamma(x_1)}{\partial x_1}x_2 + \hat\theta_2\frac{\partial\gamma(x_1)}{\partial x_1}\gamma(x_1),   (3.9)
since the states y_2 and y_3, defined by (3.1), depend on the unknown parameters and thus cannot be computed from the measurements of x_1, x_2 and x_3. Substitution of (3.4) into (3.2) and use of the notation (3.10)-(3.11) results in the error system

\dot y_1 = y_2
\dot y_2 = y_3   (3.12)
\dot y_3 = -k_1y_1 - k_2y_2 - k_3y_3 + w_1(x)(\theta_1 - \hat\theta_1) + w_2(x)(\theta_2 - \hat\theta_2) + w_3(x)(\theta_3 - \hat\theta_3),

which is linear in \theta_i - \hat\theta_i, i = 1, 2, 3. The gradient-type update law of [15] consists of fifteen dynamic equations (three for the update law and twelve for the four third-order filters), and is not presented here for brevity.

The main limitation of this scheme is that it guarantees global stability and tracking only under the assumption, made in Theorem 3.3 of [15], that the change of coordinates (3.1) is globally Lipschitz in x. This forces \gamma(x_1) to be linear, since if \partial\gamma(x_1)/\partial x_1 were any function of x_1 other than a constant, the term \frac{\partial\gamma(x_1)}{\partial x_1}x_2 would not be globally Lipschitz in x. A further nonlinearity constraint used in [15] is that the derivatives of the regressor vector with respect to both x and \theta be bounded. In the above example this requirement is satisfied by the regressor [w_1(x) \ w_2(x) \ w_3(x)]^T only when \gamma(x_1) is linear.

For input-output linearizable systems of relative degree n, the scheme of Nam and Arapostathis [12] replaces globally Lipschitz conditions with less restrictive sector-type bounds. However, its stability proof requires the initial parameter estimates to be close enough to the true parameter values. In the case of the system (2.13), this scheme employs the change of coordinates

\hat y_1 = x_1
\hat y_2 = x_2 + \hat\theta\gamma(x_1)   (3.13)
\hat y_3 = x_3 + \hat\theta\frac{\partial\gamma(x_1)}{\partial x_1}\left(x_2 + \hat\theta\gamma(x_1)\right)

and the control

u = -k_1\hat y_1 - k_2\hat y_2 - k_3\hat y_3   (3.14)
to obtain the error system

\dot{\hat y} = A\hat y + \Phi(x, \hat\theta)(\theta - \hat\theta) + D(x, \hat\theta)\dot{\hat\theta},   (3.15)

where

\hat y = \begin{bmatrix}\hat y_1 \\ \hat y_2 \\ \hat y_3\end{bmatrix}, \quad
A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -k_1 & -k_2 & -k_3 \end{bmatrix},   (3.16)

\Phi(x, \hat\theta) = \begin{bmatrix}
 \gamma(x_1) \\[2pt]
 \hat\theta\frac{\partial\gamma(x_1)}{\partial x_1}\gamma(x_1) \\[2pt]
 \hat\theta\left\{\frac{\partial^2\gamma(x_1)}{\partial x_1^2}\left[x_2 + \hat\theta\gamma(x_1)\right] + \hat\theta\left(\frac{\partial\gamma(x_1)}{\partial x_1}\right)^2\right\}\gamma(x_1)
\end{bmatrix},   (3.17)

D(x, \hat\theta) = \begin{bmatrix}
 0 \\[2pt]
 \gamma(x_1) \\[2pt]
 \frac{\partial\gamma(x_1)}{\partial x_1}x_2 + 2\hat\theta\frac{\partial\gamma(x_1)}{\partial x_1}\gamma(x_1)
\end{bmatrix}.

Proposition 4.1 in [12] assumes the following sector-type bounds (cf. (4.9)-(4.10) of [12]) to be valid for all \hat\theta \in S(\theta, \rho), a ball of radius \rho around \theta:

\|\Phi(x, \hat\theta)\| \le c_1 + c_2\|\hat y\|   (3.18)

\|D(x, \hat\theta)\| \le d_1 + d_2\|\hat y\|,   (3.19)

where c_1, c_2, d_1, d_2 are positive constants. Furthermore, the initial parameter estimate is required to satisfy

\|\hat\theta(0) - \theta\| < \rho' \le \rho,   (3.20)

where \rho' is a constant determined by \rho and the bounds (3.18)-(3.19).
Upon closer inspection of (3.18), one realizes that \gamma(x_1), although not required to be linear as in the scheme of Sastry and Isidori [15], is still severely restricted. For example, if

\gamma(x_1) = x_1\sin x_1,   (3.21)

then \left(\frac{\partial\gamma}{\partial x_1}\right)^2\gamma(x_1) and \frac{\partial^2\gamma}{\partial x_1^2}\gamma^2(x_1) grow like cubic functions of x_1, and cannot be bounded by \|\hat y\|, which can grow at most quadratically with x_1.

The above two nonlinearity-constrained schemes employ parameter update laws analogous to those used in direct adaptive linear control. In contrast, the "semi-indirect" scheme of Teel, Kadiyala, Kokotović and Sastry [16], developed for the same class of systems as that of Sastry and Isidori, combines parameter estimation elements from both the direct and the indirect approaches. In the case of the system (2.13), this scheme uses the change of coordinates (3.13) and the control (3.14), and results in the same error system (3.15). In addition, following [5,13], the scheme employs the state observer (cf. (75) in [16])

\dot{\tilde y} = A\tilde y + D(x, \hat\theta)\dot{\hat\theta} + \Omega(\hat y - \tilde y),   (3.22)
where \Omega is a Hurwitz matrix. Subtracting (3.22) from (3.15) provides the parameter estimation error system

\dot e = \Omega e + \Phi(x, \hat\theta)(\theta - \hat\theta), \quad e = \hat y - \tilde y.   (3.23)

Assumption (A9) of Theorem 4.2 in [16] requires that the initial parameter estimate satisfy the bound

|\hat\theta(0) - \theta| < \bar\delta,   (3.24)

where \bar\delta is a constant which depends on various gains and Lipschitz constants. In addition, Assumption (A5) requires that the change of coordinates (3.13) be globally Lipschitz in \hat\theta, uniformly in x. This means that the derivative with respect to \hat\theta of (3.13),

\frac{\partial\hat y}{\partial\hat\theta} = \begin{bmatrix}
 0 \\[2pt]
 \gamma(x_1) \\[2pt]
 \frac{\partial\gamma(x_1)}{\partial x_1}x_2 + 2\hat\theta\frac{\partial\gamma(x_1)}{\partial x_1}\gamma(x_1)
\end{bmatrix},   (3.25)

is required to be uniformly bounded, which is true if and only if \gamma(x_1) = {\rm const}. Finally, Assumption (A7) requires that

\|\Phi(x, \hat\theta)\| \le c\|x\|.   (3.26)

When \gamma(x_1) = \gamma = {\rm const}, this inequality becomes |\gamma| \le c\|x\|, and thus \gamma(x_1) \equiv 0, which implies that this scheme is not applicable to the system (2.13). Let us note, however, that the proof of Theorem 4.2 in [16]--in particular, equation (92)--is still valid under the slightly less restrictive assumption

\|\Phi(x, \hat\theta)\| \le c_1 + c_2\|x\|,   (3.27)
under which \gamma can be any constant and not necessarily zero.

All the schemes we have examined so far assume that the nonlinear system is either full-state or input-output feedback linearizable. This enables them to borrow stability results obtained with quadratic Lyapunov functions from adaptive linear control. In contrast, the approach of Pomet and Praly [13,14] assumes the existence of a more general Lyapunov function with prespecified growth properties achievable by an implementable feedback control. While this approach offers some significant advantages, it also has a major drawback: the designer is required to construct such a Lyapunov function, although its existence cannot be ascertained a priori. In spite of this difficulty, this approach has produced [14] the only available scheme that can stabilize the system (2.13) with \gamma(x_1) = x_1^2.
4 Pure-Feedback Systems
In order to remove some of the limitations of the existing adaptive nonlinear schemes, we need to examine whether they are structural, that is, caused by the nature of the problem, or methodological, that is, caused by the design approach. Within the framework of feedback linearization, a quick qualitative examination is now made using the following "benchmark" examples with nonlinearities which do not satisfy linear growth constraints:
uncertainty level one:              uncertainty level two:

\dot x_1 = x_2                      \dot x_1 = x_2 + \theta x_1^2
\dot x_2 = x_3 + \theta x_1^2      \dot x_2 = x_3
\dot x_3 = u                        \dot x_3 = u

\dot x_1 = x_2                      \dot x_1 = x_2 + \theta x_2^2
\dot x_2 = x_3 + \theta x_2^2      \dot x_2 = x_3
\dot x_3 = u                        \dot x_3 = u

\dot x_1 = x_2                      \dot x_1 = x_2 + \theta x_3^2
\dot x_2 = x_3 + \theta x_3^2      \dot x_2 = x_3
\dot x_3 = u                        \dot x_3 = u

Fig. 1. Systems with different levels of uncertainty and nonlinear complexity (uncertainty level increases from left to right, nonlinear complexity from top to bottom).

The two columns in Fig. 1 differ in their uncertainty level. In the left column the unknown parameter \theta is separated from the control u by only one integrator and in the right column by two integrators. Hence, the uncertainty levels of these systems are one and two, respectively. The level of uncertainty and, hence, the design difficulties increase from the left to the right, because the systems on the left satisfy the EMC and those on the right don't. The design difficulties also increase going from top to bottom, even when the parameter \theta is known. The nonlinear complexity is greater when the nonlinearity multiplying the unknown parameter depends on state variables which are "fed forward". An examination of the systems in the right column reveals that with x_1^2 feedback linearization is global, with x_2^2 it is only local, and with x_3^2, which is fed forward over one integrator, feedback linearization is impossible.
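The three right-column cases can be checked with a short symbolic computation of the distributions (4.4). The following sketch is not from the text: the state-space forms of f and g, the Lie-bracket routine, and the rank-based involutivity test are standard constructions supplied here for illustration. For \dot x = f(x) + gu with f = (x_2 + \theta\gamma, x_3, 0)^T and g = (0, 0, 1)^T, the span of \{g, {\rm ad}_f g\} must be involutive for feedback linearizability.

```python
# Involutivity of span{g, ad_f g} for x1' = x2 + theta*gamma, x2' = x3,
# x3' = u (right column of Fig. 1), for the three nonlinearities.
import sympy as sp

x1, x2, x3, th = sp.symbols('x1 x2 x3 theta')
x = sp.Matrix([x1, x2, x3])
g = sp.Matrix([0, 0, 1])

def ad(f, h):                       # Lie bracket [f, h]
    return h.jacobian(x)*f - f.jacobian(x)*h

def G1_involutive(gamma):
    f = sp.Matrix([x2 + th*gamma, x3, 0])
    adfg = ad(f, g)
    D = g.row_join(adfg)            # generators of the distribution G^1
    br = ad(g, adfg)                # bracket of the two generators
    # involutive iff the bracket stays in span{g, ad_f g} (generic rank 2)
    return D.row_join(br).rank() == 2

r1 = G1_involutive(x1**2)   # True: globally linearizable case
r2 = G1_involutive(x2**2)   # True: G^1 involutive (rank of G^2 fails only locally)
r3 = G1_involutive(x3**2)   # False: not feedback linearizable
print(r1, r2, r3)
```

For \gamma = x_3^2 the bracket [g, {\rm ad}_f g] = (-2\theta, 0, 0)^T leaves the span, which is exactly the "fed forward over one integrator" obstruction noted above; for \gamma = x_2^2 the involutivity holds but the rank condition on G^2 degenerates where 1 + 2\theta x_2 = 0, so linearization is only local.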
Adaptive Feedback Linearization of Nonlinear Systems
323
These benchmark examples motivate our interest in pure-feedback systems, defined as systems of the form

\dot\zeta = f_0(\zeta) + \sum_{i=1}^{p}\theta_if_i(\zeta) + \left[g_0(\zeta) + \sum_{i=1}^{p}\theta_ig_i(\zeta)\right]u,   (4.1)

which can be transformed via a parameter-independent diffeomorphism z = \phi(\zeta) into the pure-feedback form:

\dot z_1 = z_2 + \theta^T\gamma_1(z_1, z_2)
\dot z_2 = z_3 + \theta^T\gamma_2(z_1, z_2, z_3)
 \vdots   (4.2)
\dot z_{n-1} = z_n + \theta^T\gamma_{n-1}(z_1, \dots, z_n)
\dot z_n = \gamma_0(z) + \theta^T\gamma_n(z) + \left[\beta_0(z) + \theta^T\beta(z)\right]u,

with

\gamma_0(0) = 0, \quad \gamma_1(0) = \dots = \gamma_n(0) = 0, \quad \beta_0(0) \ne 0.   (4.3)
Necessary and sufficient conditions for the existence of such a diffeomorphism are given in the following proposition.

Proposition 4.1 A diffeomorphism z = \phi(\zeta), with \phi(0) = 0, transforming (4.1) into (4.2), exists in a neighborhood B_z \subset U of the origin if and only if the following conditions are satisfied in U:

(i) Feedback linearization condition. The distributions

G^i = {\rm span}\left\{g_0, {\rm ad}_{f_0}g_0, \dots, {\rm ad}_{f_0}^ig_0\right\}, \quad 0 \le i \le n-1,   (4.4)

are involutive and of constant rank i + 1.

(ii) Pure-feedback condition.

g_i \in G^0, \quad [X, f_i] \in G^{j+1}, \quad \forall X \in G^j, \quad 0 \le j \le n-2, \quad 1 \le i \le p.   (4.5)
Proof. Sufficiency. As proved in [20], condition (i) is sufficient for the existence of a diffeomorphism z = \phi(\zeta) that transforms the system

\dot\zeta = f_0(\zeta) + g_0(\zeta)u, \quad f_0(0) = 0, \quad g_0(0) \ne 0   (4.6)

into the system

\dot z_i = z_{i+1}, \quad 1 \le i \le n-1
\dot z_n = \gamma_0(z) + \beta_0(z)u,   (4.7)

with

\gamma_0(0) = 0, \quad \beta_0(0) \ne 0.   (4.8)
Hence, in the coordinates of (4.7) we have

f_0(\phi^{-1}(z)) = [z_2 \ \cdots \ z_n \ \gamma_0(z)]^T   (4.9)
g_0(\phi^{-1}(z)) = [0 \ \cdots \ 0 \ \beta_0(z)]^T   (4.10)
G^i = {\rm span}\left\{\frac{\partial}{\partial z_n}, \dots, \frac{\partial}{\partial z_{n-i}}\right\}, \quad 0 \le i \le n-1.   (4.11)

Because of (4.11), the pure-feedback condition (4.5), expressed in the z-coordinates, states that

\frac{\partial f_i}{\partial z_j} \in {\rm span}\left\{\frac{\partial}{\partial z_n}, \dots, \frac{\partial}{\partial z_{j-1}}\right\}, \quad 2 \le j \le n, \quad 1 \le i \le p.   (4.12)

But (4.12) can be equivalently rewritten as

g_i(\phi^{-1}(z)) = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ \beta_i(z) \end{bmatrix}, \quad
f_i(\phi^{-1}(z)) = \begin{bmatrix} \gamma_{1,i}(z_1, z_2) \\ \gamma_{2,i}(z_1, z_2, z_3) \\ \vdots \\ \gamma_{n-1,i}(z_1, \dots, z_n) \\ \gamma_{n,i}(z_1, \dots, z_n) \end{bmatrix}, \quad 1 \le i \le p.   (4.13)

Furthermore, since \phi(0) = 0 and f_i(0) = 0, 1 \le i \le p, we conclude from (4.13) that

\gamma_1(0) = \dots = \gamma_n(0) = 0.   (4.14)

Combining (4.9), (4.10), (4.13) and (4.14), we see that in the z-coordinates the system (4.1) becomes (4.2).

Necessity. If there exists a diffeomorphism z = \phi(\zeta) that transforms (4.1) into (4.2), one can directly verify that the coordinate-free conditions (i) and (ii) are satisfied for the system (4.2), and hence for the system (4.1). []

Remark 4.2. The EMC (2.1) is a special case of the pure-feedback condition (4.5). This is clear because if the system (4.1) satisfies the feedback linearization condition (4.4) and (2.1), then it is transformable into the pure-feedback form (4.2) with \gamma_1 \equiv 0, \dots, \gamma_{n-2} \equiv 0. []
5 Adaptive Scheme Design
Since the diffeomorphism z = \phi(\zeta) does not depend on the unknown parameter vector \theta, Proposition 4.1 gives an a priori verifiable characterization of the class of nonlinear systems to which the new adaptive scheme is applicable. Assuming that the transformation of (4.1) into (4.2) has been performed, the new adaptive scheme is designed for the pure-feedback system:

\dot z_i = z_{i+1} + \theta^T\gamma_i(z_1, \dots, z_{i+1}), \quad 1 \le i \le n-1   (5.1)
\dot z_n = \gamma_0(z) + \theta^T\gamma_n(z) + \left[\beta_0(z) + \theta^T\beta(z)\right]u,

with

\gamma_0(0) = 0, \quad \gamma_1(0) = \dots = \gamma_n(0) = 0, \quad \beta_0(0) \ne 0.   (5.2)

Recall that \gamma_0, \beta_0, and the components of \beta and \gamma_i, 1 \le i \le n, are smooth nonlinear functions in B_z, a neighborhood of the origin z = 0. The following step-by-step procedure interlaces, at each step, a change of coordinates with the construction of a parameter update law:

Step 0: Define x_1 = z_1, and denote by c_1, c_2, \dots, c_n constant coefficients to be chosen later.
Step 1: Starting with

\dot x_1 = z_2 + \theta^T\gamma_1(z_1, z_2),   (5.3)

let \hat\theta_1 be an estimate of \theta and define the new state x_2 as

x_2 = c_1x_1 + z_2 + \hat\theta_1^T\gamma_1(z_1, z_2).   (5.4)

Substitute (5.4) into (5.3) to obtain

\dot x_1 = -c_1x_1 + x_2 + (\theta - \hat\theta_1)^T\gamma_1(z_1, z_2) = -c_1x_1 + x_2 + (\theta - \hat\theta_1)^Tw_1(x_1, x_2, \hat\theta_1).   (5.5)

Then, let the update law for the parameter estimate \hat\theta_1 be

\dot{\hat\theta}_1 = x_1w_1(x_1, x_2, \hat\theta_1).   (5.6)
Step 2: Using the definitions for x_1, x_2 and \hat\theta_1, write \dot x_2 as

\dot x_2 = c_1\left[-c_1x_1 + x_2 + (\theta - \hat\theta_1)^Tw_1(x_1, x_2, \hat\theta_1)\right] + x_1w_1^T(x_1, x_2, \hat\theta_1)\gamma_1(z_1, z_2)
 + \hat\theta_1^T\left[\frac{\partial\gamma_1}{\partial z_1}\left(z_2 + \theta^T\gamma_1\right) + \frac{\partial\gamma_1}{\partial z_2}\left(z_3 + \theta^T\gamma_2\right)\right] + z_3 + \theta^T\gamma_2(z_1, z_2, z_3)
 = \left(1 + \hat\theta_1^T\frac{\partial\gamma_1}{\partial z_2}\right)\left[z_3 + \theta^T\gamma_2(z_1, z_2, z_3)\right] + \psi_2(x_1, x_2, \hat\theta_1) + \theta^T\varphi_2(x_1, x_2, \hat\theta_1).   (5.7)

Let \hat\theta_2 be a new estimate of \theta and define the new state x_3 as

x_3 = c_2x_2 + \left(1 + \hat\theta_1^T\frac{\partial\gamma_1}{\partial z_2}\right)\left[z_3 + \hat\theta_2^T\gamma_2(z_1, z_2, z_3)\right] + \psi_2(x_1, x_2, \hat\theta_1) + \hat\theta_2^T\varphi_2(x_1, x_2, \hat\theta_1).   (5.8)
Substitute (5.8) into (5.7) to obtain

\dot x_2 = -c_2x_2 + x_3 + (\theta - \hat\theta_2)^Tw_2(x_1, x_2, x_3, \hat\theta_1, \hat\theta_2).   (5.9)

Then, let the update law for the new estimate \hat\theta_2 be

\dot{\hat\theta}_2 = x_2w_2(x_1, x_2, x_3, \hat\theta_1, \hat\theta_2).   (5.10)
Step i (2 < i \le n-1): Using the definitions for x_1, \dots, x_i and \hat\theta_1, \dots, \hat\theta_{i-1}, express the derivative of x_i as

\dot x_i = \left(1 + \hat\theta_1^T\frac{\partial\gamma_1}{\partial z_2}\right)\cdots\left(1 + \hat\theta_{i-1}^T\frac{\partial\gamma_{i-1}}{\partial z_i}\right)\left[z_{i+1} + \theta^T\gamma_i(z_1, \dots, z_{i+1})\right]
 + \psi_i(x_1, \dots, x_i, \hat\theta_1, \dots, \hat\theta_{i-1}) + \theta^T\varphi_i(x_1, \dots, x_i, \hat\theta_1, \dots, \hat\theta_{i-1}).   (5.11)

Let \hat\theta_i be a new estimate of \theta and define the new state x_{i+1} as

x_{i+1} = c_ix_i + \left(1 + \hat\theta_1^T\frac{\partial\gamma_1}{\partial z_2}\right)\cdots\left(1 + \hat\theta_{i-1}^T\frac{\partial\gamma_{i-1}}{\partial z_i}\right)\left[z_{i+1} + \hat\theta_i^T\gamma_i(z_1, \dots, z_{i+1})\right]
 + \psi_i(x_1, \dots, x_i, \hat\theta_1, \dots, \hat\theta_{i-1}) + \hat\theta_i^T\varphi_i(x_1, \dots, x_i, \hat\theta_1, \dots, \hat\theta_{i-1}).   (5.12)

Substitute (5.12) into (5.11) to obtain

\dot x_i = -c_ix_i + x_{i+1} + (\theta - \hat\theta_i)^Tw_i(x_1, \dots, x_{i+1}, \hat\theta_1, \dots, \hat\theta_i).   (5.13)

Then, let the update law for \hat\theta_i be

\dot{\hat\theta}_i = x_iw_i(x_1, \dots, x_{i+1}, \hat\theta_1, \dots, \hat\theta_i).   (5.14)
Step n: Using the definitions for x_1, \dots, x_n and \hat\theta_1, \dots, \hat\theta_{n-1}, express the derivative of x_n as

\dot x_n = \left(1 + \hat\theta_1^T\frac{\partial\gamma_1}{\partial z_2}\right)\cdots\left(1 + \hat\theta_{n-1}^T\frac{\partial\gamma_{n-1}}{\partial z_n}\right)\left[\beta_0(z) + \theta^T\beta(z)\right]u
 + \psi_n(x, \hat\theta_1, \dots, \hat\theta_{n-1}) + \theta^T\varphi_n(x, \hat\theta_1, \dots, \hat\theta_{n-1}).   (5.15)

Let \hat\theta_n be a new estimate of \theta and define the control u as

u = \frac{1}{\hat\beta(x, \hat\theta_1, \dots, \hat\theta_n)}\left[-c_nx_n - \psi_n - \hat\theta_n^T\varphi_n\right],   (5.16)

where

\hat\beta(x, \hat\theta_1, \dots, \hat\theta_n) = \left(1 + \hat\theta_1^T\frac{\partial\gamma_1}{\partial z_2}\right)\cdots\left(1 + \hat\theta_{n-1}^T\frac{\partial\gamma_{n-1}}{\partial z_n}\right)\left[\beta_0(z) + \hat\theta_n^T\beta(z)\right].   (5.17)
Substitute (5.16) into (5.15) to obtain

\dot x_n = -c_nx_n + (\theta - \hat\theta_n)^Tw_n(x, \hat\theta_1, \dots, \hat\theta_n),   (5.18)

where (5.16) is used in the definition of w_n. Finally, let the update law for the estimate \hat\theta_n be

\dot{\hat\theta}_n = x_nw_n(x, \hat\theta_1, \dots, \hat\theta_n).   (5.19)

Feasibility of this design procedure and the stability of the resulting closed-loop adaptive system are analyzed in the next section.
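The steps above can be made concrete with a minimal implementation for the simplest nontrivial case n = 2. The system choice \gamma_1 = z_1^2, \gamma_0 = \gamma_2 = 0, \beta_0 = 1, \beta = 0, scalar \theta, the true value \theta = 1, and the coefficients c_1 = c_2 = 2 are all illustrative assumptions made for this example, not data from the text.

```python
# Adaptive backstepping (Section 5) for the example system
#   z1' = z2 + theta*z1^2,  z2' = u,
# i.e. (5.1) with gamma_1 = z1^2, gamma_0 = gamma_2 = 0, beta_0 = 1, beta = 0.
import numpy as np
from scipy.integrate import solve_ivp

theta = 1.0            # true parameter (unknown to the controller)
c1, c2 = 2.0, 2.0      # coefficients c_i >= 2, cf. (6.8)

def rhs(t, s):
    z1, z2, th1, th2 = s
    # Step 1: x2 = c1*x1 + z2 + th1*gamma_1  (5.4), with w1 = gamma_1 = z1^2
    x1 = z1
    w1 = z1**2
    x2 = c1*x1 + z2 + th1*w1
    th1_dot = x1*w1                      # update law (5.6)
    # Step 2 (= step n): x2' = u + (c1 + 2*th1*z1)*(z2 + theta*z1^2)
    #                         + th1_dot*z1^2; collecting theta gives w2:
    w2 = (c1 + 2.0*th1*z1)*z1**2
    psi2 = (c1 + 2.0*th1*z1)*z2 + th1_dot*z1**2
    u = -c2*x2 - psi2 - th2*w2           # control (5.16)
    th2_dot = x2*w2                      # update law (5.19)
    return [z2 + theta*z1**2, u, th1_dot, th2_dot]

sol = solve_ivp(rhs, (0.0, 30.0), [1.0, 1.0, 0.0, 0.0], rtol=1e-8, atol=1e-10)
z1f, z2f = sol.y[0, -1], sol.y[1, -1]
print(abs(z1f), abs(z2f))    # regulation: both tend to zero
```

In the (x_1, x_2) coordinates this closed loop is exactly (5.5), (5.9): \dot x_1 = -c_1x_1 + x_2 + (\theta - \hat\theta_1)w_1 and \dot x_2 = -c_2x_2 + (\theta - \hat\theta_2)w_2, so the Lyapunov analysis of the next section applies directly.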
6 Feasibility and Stability

The above design procedure has introduced a control law defined by (5.16)-(5.17) and a set of new coordinates x_1, \dots, x_n defined by (5.12). In order to ensure that the procedure is feasible, we construct in Proposition 6.1 an estimate \mathcal{F} \subset {\rm I\!R}^{n(1+p)} of the feasibility region such that for all (x, \hat\theta_1, \dots, \hat\theta_n) \in \mathcal{F} the denominator in (5.16) is nonzero and the coordinate change (5.12) is one-to-one, onto, continuous and has a continuous inverse.

Proposition 6.1 Suppose the pure-feedback form (5.1) of the system (4.1) exists in B_z, and let B_{\hat\theta} \subset {\rm I\!R}^p be an open set such that

\left|1 + \hat\theta_i^T\frac{\partial\gamma_i(z)}{\partial z_{i+1}}\right| > 0, \quad \forall z \in B_z, \quad \forall\hat\theta_i \in B_{\hat\theta}, \quad 1 \le i \le n-1   (6.1)

\left|\beta_0(z) + \hat\theta_n^T\beta(z)\right| > 0, \quad \forall z \in B_z, \quad \forall\hat\theta_n \in B_{\hat\theta}.   (6.2)

Then, the set \mathcal{F} = B_z \times B_{\hat\theta}^n is a subset of the region in which the design procedure of Section 5 is feasible.

Proof. Obvious, since (6.1) and (6.2) guarantee that in B_z \times B_{\hat\theta}^n the denominator in (5.16) is nonzero and (5.12) is uniquely solvable for z_i. []

Remark 6.2. In general, the feasibility region is not global. However, this is not due to the adaptive scheme, because, even when the parameters \theta are known, the feedback linearization of the system (5.1) can only be guaranteed for \theta \in B_\theta \subset {\rm I\!R}^p, an open set such that

\left|1 + \theta^T\frac{\partial\gamma_i(z)}{\partial z_{i+1}}\right| > 0, \quad \forall z \in B_z, \quad \forall\theta \in B_\theta, \quad 1 \le i \le n-1   (6.3)

\left|\beta_0(z) + \theta^T\beta(z)\right| > 0, \quad \forall z \in B_z, \quad \forall\theta \in B_\theta.   (6.4)  []
In the feasibility region, the adaptive system resulting from the design procedure can be expressed in the x-coordinates as

\dot x_1 = -c_1x_1 + x_2 + (\theta - \hat\theta_1)^Tw_1(x_1, x_2, \hat\theta_1)
 \vdots
\dot x_{n-1} = -c_{n-1}x_{n-1} + x_n + (\theta - \hat\theta_{n-1})^Tw_{n-1}(x_1, \dots, x_n, \hat\theta_1, \dots, \hat\theta_{n-1})   (6.5)
\dot x_n = -c_nx_n + (\theta - \hat\theta_n)^Tw_n(x, \hat\theta_1, \dots, \hat\theta_n)
\dot{\hat\theta}_i = x_iw_i(x_1, \dots, x_{i+1}, \hat\theta_1, \dots, \hat\theta_i), \quad 1 \le i \le n.
A nice property of this system is that its stability can be established using the quadratic Lyapunov function

V(x, \hat\theta_1, \dots, \hat\theta_n) = \frac{1}{2}x^Tx + \frac{1}{2}\sum_{i=1}^{n}(\theta - \hat\theta_i)^T(\theta - \hat\theta_i).   (6.6)

The derivative of V(x, \hat\theta_1, \dots, \hat\theta_n) along the solutions of (6.5) is

\dot V = \sum_{i=1}^{n}x_i\dot x_i - \sum_{i=1}^{n}(\theta - \hat\theta_i)^T\dot{\hat\theta}_i
 = -\sum_{i=1}^{n}c_ix_i^2 + \sum_{i=1}^{n-1}x_ix_{i+1} + \sum_{i=1}^{n}(\theta - \hat\theta_i)^T\left[x_iw_i - \dot{\hat\theta}_i\right]
 = -\sum_{i=1}^{n}c_ix_i^2 + \sum_{i=1}^{n-1}x_ix_{i+1}.   (6.7)

At this point we can choose the coefficients c_1, \dots, c_n to guarantee that \dot V is negative semidefinite. The choice c_i \ge 2, for all i = 1, \dots, n, yields (using x_ix_{i+1} \le \frac{1}{2}(x_i^2 + x_{i+1}^2)):

\dot V \le -\|x\|^2.   (6.8)

This proves the uniform stability of the equilibrium

x = 0, \quad \hat\theta_1 = \theta, \dots, \hat\theta_n = \theta   (6.9)

of the adaptive system (6.5). To give an estimate \Omega of the region of attraction of this equilibrium, we note that \Omega must be a subset of our estimate \mathcal{F} of the feasibility region. Let \Omega(c) be the invariant set of (6.5) defined by V < c, and let c^* be the largest constant c such that \Omega(c) \subset \mathcal{F}. Then an estimate \Omega of the region of attraction is

\Omega = \Omega(c^*) = \left\{(x, \hat\theta_1, \dots, \hat\theta_n) : V(x, \hat\theta_1, \dots, \hat\theta_n) < c^*\right\}, \quad c^* = \sup_{\Omega(c)\subset\mathcal{F}}\{c\}.   (6.10)
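The constant c^* in (6.10) can be computed explicitly in simple cases. The following scalar illustration is an assumption-laden sketch, not from the text: take n = 1, p = 1, V = \frac{1}{2}x^2 + \frac{1}{2}(\theta - \hat\theta)^2, and suppose feasibility only requires 1 + \hat\theta > 0, so \mathcal{F} = {\rm I\!R} \times (-1, \infty). Then \Omega(c) \subset \mathcal{F} iff \theta - \sqrt{2c} > -1, giving c^* = (\theta+1)^2/2, which a grid search reproduces:

```python
# Grid-search sketch of c* = sup{c : Omega(c) subset of F} for a scalar
# example: V = 0.5*x**2 + 0.5*(theta - th_hat)**2, F = {th_hat > -1}.
# (The example feasibility set and theta value are illustrative assumptions.)
import numpy as np

theta = 2.0

def omega_in_F(c):
    # smallest th_hat on the level set V < c is theta - sqrt(2c)
    return theta - np.sqrt(2.0*c) > -1.0

cs = np.linspace(1e-3, 10.0, 100001)
c_star = cs[np.array([omega_in_F(c) for c in cs])].max()
print(c_star, (theta + 1.0)**2 / 2.0)   # both approximately 4.5
```
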
Remark 6.3. It can be expected that the above estimate is not tight because the choice of the unity gains in the update laws was made for simplicity. With some a priori knowledge about the shape of \mathcal{F}, different adaptation gains can be found so that \Omega is maximized by a better fit of \mathcal{F}. []

Next, using the invariance theorem of LaSalle, we establish that for all initial conditions (x, \hat\theta_1, \dots, \hat\theta_n)_{t=0} \in \Omega, the adaptive system (6.5) has the following regulation properties:

\lim_{t\to\infty}x(t) = 0, \quad \lim_{t\to\infty}\dot{\hat\theta}_i(t) = 0, \quad 1 \le i \le n.   (6.11)

In order to return to the original coordinates \zeta, we note that (6.1) guarantees, first, that the solution z_2 = \dots = z_n = 0 of the system of equations

z_{i+1} + \theta^T\gamma_i(0, z_2, \dots, z_{i+1}) = 0, \quad 1 \le i \le n-1   (6.12)

is unique in B_z \times B_\theta, and, second, that z_1, \dots, z_n can be expressed as smooth functions of x, \hat\theta_1, \dots, \hat\theta_n using (5.12). Combining these two facts with (6.11), we obtain

\lim_{t\to\infty}z_1(t) = 0, \quad \lim_{t\to\infty}\dot x_i(t) = 0, \quad 1 \le i \le n.   (6.13)

Using an induction argument, it is now shown that z_i(t) \to 0 as t \to \infty, 1 \le i \le n:

- For i = 1, we have z_1(t) \to 0 as t \to \infty.
- For i = k, 2 \le k \le n, we assume that z_j(t) \to 0 as t \to \infty, 1 \le j \le k-1. Then, from (6.13) we have

\lim_{t\to\infty}\dot x_{k-1}(t) = \lim_{t\to\infty}\left\{z_k + \theta^T\gamma_{k-1}(z_1, \dots, z_{k-1}, z_k)\right\} = 0   (6.14)

and from the uniqueness of solutions of (6.12) we conclude that z_k(t) \to 0 as t \to \infty.

Hence, z(t) \to 0 as t \to \infty. Finally, since z = \phi(\zeta) is a diffeomorphism with \phi(0) = 0, regulation is achieved in the original coordinates \zeta, namely

\lim_{t\to\infty}\zeta(t) = 0.   (6.15)
The above facts prove the following result:

Theorem 6.4 Suppose that the system (4.1) satisfies Proposition 4.1 and that the design procedure of Section 5 is applied to its pure-feedback form (5.1). Then the equilibrium (6.9) of the resulting adaptive system (6.5) is uniformly stable and its region of attraction includes the set \Omega defined in (6.10). Furthermore, regulation of the state \zeta(t) is achieved for all initial conditions in \Omega. []
7 Multi-Input Systems
The design procedure of Section 5 can be easily extended to multi-input nonlinear systems of the form

\dot\zeta = f_0(\zeta) + \sum_{i=1}^{p}\theta_if_i(\zeta) + \sum_{j=1}^{m}\left[g_0^j(\zeta) + \sum_{i=1}^{p}\theta_ig_i^j(\zeta)\right]u_j,   (7.1)

with

f_0(0) = f_1(0) = \dots = f_p(0) = 0, \quad {\rm rank}\,G_0(0) = m, \quad G_0 = [g_0^1 \cdots g_0^m],   (7.2)

that can be transformed into

\dot z_i^j = z_{i+1}^j + \theta^T\gamma_i^j(z_1^1, \dots, z_{k_1-k_j+i+1}^1, \dots, z_1^j, \dots, z_{i+1}^j), \quad 1 \le i \le k_j - 1, \quad 1 \le j \le m
\dot z_{k_j}^j = \gamma_0^j(z) + \theta^T\gamma_{k_j}^j(z) + \sum_{l=1}^{m}\left[\beta_0^{jl}(z) + \theta^T\beta^{jl}(z)\right]u_l, \quad 1 \le j \le m,   (7.3)

with

\gamma_i^j(0) = 0, \quad 0 \le i \le k_j, \quad 1 \le j \le m, \quad \det B_0(0) \ne 0,   (7.4)

where B_0 = [\beta_0^1, \dots, \beta_0^m]^T and \sum_{j=1}^{m}k_j = n.
Proposition 7.1 A parameter-independent diffeomorphism z = \phi(\zeta), with \phi(0) = 0, transforming (7.1) into (7.3), exists in a neighborhood B_z \subset U of the origin if and only if the following conditions are satisfied in U:

(i) Feedback linearization condition. The distributions

G^i = {\rm span}\left\{g_0^j, {\rm ad}_{f_0}g_0^j, \dots, {\rm ad}_{f_0}^ig_0^j, \ 1 \le j \le m\right\}, \quad 0 \le i \le n-1,   (7.5)

are involutive and of constant rank r_i, with r_{n-1} = n.

(ii) Pure-feedback condition.

g_i^j \in G^0, \quad 1 \le j \le m, \quad 1 \le i \le p,
[X, f_i] \in G^{k+1}, \quad \forall X \in G^k, \quad 0 \le k \le n-2.   (7.6)
Proof. As proved in [24,25], condition (i) is necessary and sufficient for the existence of a diffeomorphism z = \phi(\zeta) such that in the z-coordinates we have

f_0(\phi^{-1}(z)) = [z_2^1 \cdots z_{k_1}^1 \ \gamma_0^1(z) \ \cdots \ z_2^m \cdots z_{k_m}^m \ \gamma_0^m(z)]^T   (7.7)

G_0(\phi^{-1}(z)) = [0 \cdots 0 \ \beta_0^1(z) \ \cdots \ 0 \cdots 0 \ \beta_0^m(z)]^T   (7.8)

G^i = {\rm span}\left\{\frac{\partial}{\partial z_{k_j}^j}, \dots, \frac{\partial}{\partial z_{k_j-i}^j}, \ 1 \le j \le m\right\}, \quad 0 \le i \le n-1.   (7.9)

It is now a tedious but straightforward task to verify that condition (ii) is equivalent to

g_i(\phi^{-1}(z)) = [0 \cdots 0 \ \beta_i^1(z) \ \cdots \ 0 \cdots 0 \ \beta_i^m(z)]^T, \quad 1 \le i \le p,   (7.10)

f_i(\phi^{-1}(z)) = \begin{bmatrix}
 \gamma_{1,i}^1(z_1^1, z_2^1, \dots, z_1^m, z_2^m) \\
 \vdots \\
 \gamma_{k_1,i}^1(z) \\
 \vdots \\
 \gamma_{1,i}^m(z_1^1, \dots, z_{k_1-k_m+2}^1, \dots, z_1^m, z_2^m) \\
 \vdots \\
 \gamma_{k_m,i}^m(z)
\end{bmatrix}, \quad 1 \le i \le p.   (7.11)
The design procedure for the system (7.3) is the following:

Steps 0 through (n - m): Apply steps 0 through (k_j - 1) of the single-input procedure to the first (k_j - 1) equations of each of the m subsystems of (7.3), to obtain the system:

\dot x_\ell^j = -c_\ell^jx_\ell^j + x_{\ell+1}^j + (\theta - \hat\theta_t)^Tw_t(x, \hat\theta_1, \dots, \hat\theta_t), \quad \dot{\hat\theta}_t = x_\ell^jw_t(x, \hat\theta_1, \dots, \hat\theta_t),
 \quad t = \sum_{\rho=1}^{j-1}(k_\rho - 1) + \ell, \quad 1 \le \ell \le k_j - 1, \quad 1 \le j \le m, \quad 1 \le t \le n-m   (7.12)

\begin{bmatrix}\dot x_{k_1}^1 \\ \vdots \\ \dot x_{k_m}^m\end{bmatrix} =
 \left[B_0(x, \hat\theta_1, \dots, \hat\theta_{n-m}) + \sum_{i=1}^{p}B_i(x, \hat\theta_1, \dots, \hat\theta_{n-m})\theta_i\right]u
 + \Psi(x, \hat\theta_1, \dots, \hat\theta_{n-m}) + W^T(x, \hat\theta_1, \dots, \hat\theta_{n-m})\theta,

where B_0, B_1, \dots, B_p are defined as

B_i(x, \hat\theta_1, \dots, \hat\theta_{n-m}) = \begin{bmatrix}
 \left(1 + \hat\theta_1^T\frac{\partial\gamma_1^1}{\partial z_2^1}\right)\cdots\left(1 + \hat\theta_{k_1-1}^T\frac{\partial\gamma_{k_1-1}^1}{\partial z_{k_1}^1}\right)\beta_i^{1T}(z) \\
 \vdots \\
 \left(1 + \hat\theta_{n-m-k_m+2}^T\frac{\partial\gamma_1^m}{\partial z_2^m}\right)\cdots\left(1 + \hat\theta_{n-m}^T\frac{\partial\gamma_{k_m-1}^m}{\partial z_{k_m}^m}\right)\beta_i^{mT}(z)
\end{bmatrix}.   (7.13)

Step n - m + 1: Let \hat\theta_{n-m+1} be a new estimate of \theta and define the control u as
u = \left[B_0(x, \hat\theta_1, \dots, \hat\theta_{n-m}) + \sum_{i=1}^{p}B_i(x, \hat\theta_1, \dots, \hat\theta_{n-m})\hat\theta_{n-m+1,i}\right]^{-1}
 \left\{-\begin{bmatrix}c_1x_{k_1}^1 \\ \vdots \\ c_mx_{k_m}^m\end{bmatrix} - \Psi(x, \hat\theta_1, \dots, \hat\theta_{n-m}) - W^T(x, \hat\theta_1, \dots, \hat\theta_{n-m})\hat\theta_{n-m+1}\right\}.   (7.14)

Substitute (7.14) into (7.12) and rewrite the last m equations of (7.12) as

\begin{bmatrix}\dot x_{k_1}^1 \\ \vdots \\ \dot x_{k_m}^m\end{bmatrix} =
 -\begin{bmatrix}c_1x_{k_1}^1 \\ \vdots \\ c_mx_{k_m}^m\end{bmatrix}
 + \left\{W^T + [B_1u \ \cdots \ B_pu]\right\}(\theta - \hat\theta_{n-m+1})
 = -\begin{bmatrix}c_1x_{k_1}^1 \\ \vdots \\ c_mx_{k_m}^m\end{bmatrix}
 + w_{n-m+1}^T(x, \hat\theta_1, \dots, \hat\theta_{n-m+1})(\theta - \hat\theta_{n-m+1}),   (7.15)

where (7.14) was used in the definition of w_{n-m+1}. Finally, let the update law for the estimate \hat\theta_{n-m+1} be

\dot{\hat\theta}_{n-m+1} = w_{n-m+1}(x, \hat\theta_1, \dots, \hat\theta_{n-m+1})\begin{bmatrix}x_{k_1}^1 \\ \vdots \\ x_{k_m}^m\end{bmatrix}.   (7.16)

Note that this procedure is feasible only in the region in which the matrix B = B_0 + \sum_{i=1}^{p}B_i\hat\theta_{n-m+1,i} is invertible. The stability properties of the resulting closed-loop system are analogous to those listed in Theorem 6.4, and can be similarly established using the Lyapunov function

V = \frac{1}{2}x^Tx + \frac{1}{2}\sum_{i=1}^{n-m+1}(\theta - \hat\theta_i)^T(\theta - \hat\theta_i).   (7.17)
8 Global Regulation
There are strong theoretical and practical reasons for investigating whether the stability properties of an adaptive system can be made global in the space of the states and parameter estimates. Systems with a finite region of attraction may not possess a wide enough robustness margin for disturbances and unmodeled dynamics. Furthermore, it is usually hard to find nonconservative estimates of finite regions of attraction. Another aspect of the global stability issue is the comparison of the proposed adaptive controller with its deterministic counterpart, that is, the controller that would be used if the parameter vector \theta were known. Suppose that for all values
of \theta there exists a deterministic controller which achieves global stabilization and regulation of the system. If, with \theta unknown, the proposed adaptive controller does not achieve the same global stability, this loss is clearly due to adaptation.

The stability result of Theorem 6.4 is not global. However, as pointed out in Remark 6.2, this is not due to adaptation, because for pure-feedback systems global stability may not be achievable even with \theta known. In Proposition 8.3 we define the class of "strict-feedback" systems, for which a globally stabilizing controller exists when \theta is known. We then prove that for this class of systems our adaptive scheme guarantees global stability when \theta is unknown.

In order to characterize the class of "strict-feedback" systems, we use the following assumption about the part of the system (4.1) that does not contain unknown parameters:

Assumption 8.1 There exists a global diffeomorphism z = \phi(\zeta), with \phi(0) = 0, transforming the system

\dot\zeta = f_0(\zeta) + g_0(\zeta)u   (8.1)

into the system

\dot z_i = z_{i+1}, \quad 1 \le i \le n-1
\dot z_n = \gamma_0(z) + \beta_0(z)u,   (8.2)

with

\gamma_0(0) = 0, \quad \beta_0(z) \ne 0 \quad \forall z \in {\rm I\!R}^n.   (8.3)
Remark 8.2. The local existence of such a diffeomorphism is equivalent to the feedback linearization condition (4.4). At present there are no necessary and sufficient conditions verifying the global validity of this assumption. Sufficient conditions for Assumption 8.1 are given in [21], while necessary and sufficient conditions for the case where \beta_0(z) = {\rm const} can be found in [22,23]. []

Proposition 8.3 Under Assumption 8.1, the system (4.1) is globally diffeomorphically equivalent to the "strict-feedback" system

\dot z_i = z_{i+1} + \theta^T\gamma_i(z_1, \dots, z_i), \quad 1 \le i \le n-1
\dot z_n = \gamma_0(z) + \theta^T\gamma_n(z) + \beta_0(z)u   (8.4)

if and only if the following condition holds globally:

Strict-feedback condition.

g_i \equiv 0, \quad 1 \le i \le p, \quad [X, f_i] \in G^j, \quad \forall X \in G^j, \quad 0 \le j \le n-1,   (8.5)

with G^j, 0 \le j \le n-1, as defined in (4.4).
Proof. The proof is very similar to that of Proposition 4.1. First, because of the assumptions that the diffeomorphism z = \phi(\zeta) is global and \beta_0(z) \ne 0 \ \forall z \in {\rm I\!R}^n, the distributions G^j, 0 \le j \le n-1, are globally defined and can be expressed in the z-coordinates as

G^j = {\rm span}\left\{\frac{\partial}{\partial z_n}, \dots, \frac{\partial}{\partial z_{n-j}}\right\}, \quad 0 \le j \le n-1.   (8.6)

To prove sufficiency, note that if the pure-feedback condition (4.5) of Proposition 4.1 is replaced by the strict-feedback condition (8.5), then (4.12) is replaced by

\frac{\partial f_i}{\partial z_j} \in {\rm span}\left\{\frac{\partial}{\partial z_n}, \dots, \frac{\partial}{\partial z_j}\right\}, \quad 2 \le j \le n, \quad g_i \equiv 0, \quad 1 \le i \le p.   (8.7)

Thus, the expression for f_i(\phi^{-1}(z)) in (4.13) becomes

f_i(\phi^{-1}(z)) = \begin{bmatrix}
 \gamma_{1,i}(z_1) \\
 \gamma_{2,i}(z_1, z_2) \\
 \vdots \\
 \gamma_{n-1,i}(z_1, \dots, z_{n-1}) \\
 \gamma_{n,i}(z_1, \dots, z_n)
\end{bmatrix}, \quad 1 \le i \le p.   (8.8)

The necessity is again straightforward. []
The above proposition gives a geometric characterization of the class of systems for which the following global properties can be achieved.

Theorem 8.4 Suppose that the system (4.1) satisfies Proposition 8.3 and that the design procedure of Section 5 is applied to its strict-feedback form (8.4). Then the equilibrium x = 0, \hat\theta_1 = \theta, \dots, \hat\theta_n = \theta of the resulting adaptive system is globally uniformly stable. Furthermore, regulation of the state \zeta(t) is achieved:

\lim_{t\to\infty}\zeta(t) = 0,   (8.9)

for all initial conditions in \Omega = {\rm I\!R}^{n(1+p)}.

Proof. When the adaptive design procedure (5.3)-(5.19) is applied to the system (8.4), then for all \hat\theta_i \in {\rm I\!R}^p, 1 \le i \le n, the change of coordinates (5.12) is one-to-one, onto, continuous and has a continuous inverse, and the control (5.16) is well-defined, since

\frac{\partial\gamma_i}{\partial z_{i+1}}(z) = 0, \quad \beta(z) = 0, \quad \beta_0(z) \ne 0 \quad \forall z \in {\rm I\!R}^n.   (8.10)

Hence (6.1)-(6.2) are trivially satisfied on \mathcal{F} = B_z \times B_{\hat\theta}^n = {\rm I\!R}^{n(1+p)}, and from (6.10) we conclude that \Omega = {\rm I\!R}^{n(1+p)}. []
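The global claim of Theorem 8.4 can be exercised on the strict-feedback example \dot z_1 = z_2 + \theta z_1^2, \dot z_2 = u, with the two-step design of Section 5: unlike the pure-feedback case, the initial conditions may be taken arbitrarily large. In the sketch below, the system, the gains c_1 = c_2 = 2, the true \theta = 1, and the stiff-capable integrator are illustrative choices, not prescriptions from the text.

```python
# Global regulation (Theorem 8.4) for the strict-feedback example
#   z1' = z2 + theta*z1^2, z2' = u, from large initial conditions.
# Controller: the two-step design of Section 5 (illustrative gains c1=c2=2).
import numpy as np
from scipy.integrate import solve_ivp

theta, c1, c2 = 1.0, 2.0, 2.0

def rhs(t, s):
    z1, z2, th1, th2 = s
    x1 = z1
    x2 = c1*x1 + z2 + th1*z1**2          # new state (5.4)
    th1_dot = x1*z1**2                   # update law (5.6)
    w2 = (c1 + 2.0*th1*z1)*z1**2
    psi2 = (c1 + 2.0*th1*z1)*z2 + th1_dot*z1**2
    u = -c2*x2 - psi2 - th2*w2           # control (5.16)
    return [z2 + theta*z1**2, u, th1_dot, x2*w2]

sol = solve_ivp(rhs, (0.0, 30.0), [5.0, -5.0, 0.0, 0.0],
                method='LSODA', rtol=1e-8, atol=1e-10)
z1f, z2f = sol.y[0, -1], sol.y[1, -1]
print(abs(z1f), abs(z2f))   # regulation even from far away, Omega = R^(n(1+p))
```
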
9 Global Tracking

Every regulation result in Sections 4-6 has its tracking counterpart. For brevity, we restrict our presentation to the tracking version of the global regulation result in Section 8. The counterparts of nonglobal regulation results can be obtained using the same Lyapunov function argument as in this section to determine an invariant set in which asymptotic tracking is guaranteed.

Consider the nonlinear system

\dot\zeta = f_0(\zeta) + \sum_{i=1}^{p}\theta_if_i(\zeta) + g_0(\zeta)u
y = h(\zeta),   (9.1)

where \zeta \in {\rm I\!R}^n is the state, u \in {\rm I\!R} is the input, y \in {\rm I\!R} is the output, \theta = [\theta_1, \dots, \theta_p]^T is the vector of unknown constant parameters, h is a smooth function on {\rm I\!R}^n with h(0) = 0, and the vector fields g_0, f_i, 0 \le i \le p, are smooth on {\rm I\!R}^n with g_0(\zeta) \ne 0 \ \forall\zeta \in {\rm I\!R}^n, f_i(0) = 0, 0 \le i \le p.

We first formulate the input-output counterpart of Assumption 8.1:

Assumption 9.1 There exist n - \rho smooth functions \phi_i(\zeta), \rho+1 \le i \le n, such that the change of coordinates

z_1 = h(\zeta), \quad z_2 = L_{f_0}h(\zeta), \quad \dots, \quad z_\rho = L_{f_0}^{\rho-1}h(\zeta), \quad z_i = \phi_i(\zeta), \quad \rho+1 \le i \le n,   (9.2)

is a global diffeomorphism z = \phi(\zeta) transforming the system

\dot\zeta = f_0(\zeta) + g_0(\zeta)u
y = h(\zeta)   (9.3)

into the special normal form

\dot z_1 = z_2
 \vdots
\dot z_{\rho-1} = z_\rho
\dot z_\rho = \gamma_0(z) + \beta_0(z)u   (9.4)
\dot z^r = \Psi_0(y, z^r)
y = z_1,

with

\gamma_0(0) = L_{f_0}^\rho h(0) = 0,   (9.5)
\Psi_0(0, 0) = 0.   (9.6)  []
R e m a r k 9.2. In order for (9.3) to be locally equivalent to (9.4), it is necessary and sufficient that the following conditions hold in a neighborhood of the origin (=0:
LgoL~oh =-- O, O < i < p - 2
(9.7)
L,oLPI lh(O ) ~ 0
(9.8)
Gi is involutive and of constant rank i + 1,
0 < i < p - 1.
(9.9)
The sufficiency of these conditions is a consequence of Proposition 10 in [26]. The necessity can be easily established by verifying that (9.7)-(9.9) hold in the coordinates of (9.4). However, at present there are no necessary and sufficient conditions that can verify the global validity of this assumption. □

We are now ready to formulate the input-output counterpart of Proposition 8.3:

Proposition 9.3 Under Assumption 9.1, the system (9.1) is globally diffeomorphically equivalent to the "strict-feedback" normal form

    \dot z_i = z_{i+1} + \theta^T \gamma_i(z_1, \dots, z_i, z^r), \quad 1 \le i \le \rho - 1,
    \dot z_\rho = \gamma_0(z) + \theta^T \gamma_\rho(z) + \beta_0(z) u,
    \dot z^r = \Psi_0(y, z^r) + \sum_{i=1}^{p} \theta_i \Psi_i(y, z^r),   (9.10)
    y = z_1,

if and only if the following condition holds globally:

Strict-feedback condition. [X, f_i] \in \mathcal{G}^j, \quad \forall X \in \mathcal{G}^j, \quad 0 \le j \le \rho - 2, \quad 1 \le i \le p,   (9.11)
with \mathcal{G}^j, 0 \le j \le \rho - 1, as defined in (4.9).

Proof. The proof follows closely that of Proposition 8.3. First, because of the assumptions that the diffeomorphism z = \Phi(\zeta) defined in (9.2) is global and that \beta_0(z) \ne 0 \ \forall z \in \mathbb{R}^n, the distributions \mathcal{G}^j, 0 \le j \le \rho - 1, are globally defined and can be expressed in the z-coordinates as

    \mathcal{G}^j = span\{ \partial/\partial z_\rho, \dots, \partial/\partial z_{\rho-j} \}, \quad 0 \le j \le \rho - 1.   (9.12)

The sufficiency follows from the fact that, by (9.11) and (9.12),

    [\partial/\partial z_j, f_i] \in span\{ \partial/\partial z_\rho, \dots, \partial/\partial z_j \}, \quad 2 \le j \le \rho, \quad 1 \le i \le p.   (9.13)
Adaptive Feedback Linearization of Nonlinear Systems
Thus, the expression for f_i(\Phi^{-1}(z)) is

    f_i(\Phi^{-1}(z)) = \begin{bmatrix} \gamma_{1,i}(z_1, z^r) \\ \gamma_{2,i}(z_1, z_2, z^r) \\ \vdots \\ \gamma_{\rho-1,i}(z_1, \dots, z_{\rho-1}, z^r) \\ \gamma_{\rho,i}(z_1, \dots, z_\rho, z^r) \\ \Psi_i(z_1, z^r) \end{bmatrix}, \quad 1 \le i \le p.   (9.14)

The necessity is again straightforward. □
Remark 9.4. To obtain the input-output counterpart of Proposition 4.1, one just needs to replace the feedback linearization condition (4.4) with conditions (9.7)-(9.9), and the pure-feedback condition (4.5) with

    f_i \in \mathcal{G}^0, \quad [X, f_i] \in \mathcal{G}^{j+1}, \quad \forall X \in \mathcal{G}^j, \quad 0 \le j \le \rho - 2, \quad 1 \le i \le p.   (9.15)
□

As in most tracking problems, we need an assumption about the stability of the zero dynamics of (9.10):

Assumption 9.5 The z^r-subsystem of (9.10) has the bounded-input bounded-state (BIBS) property with respect to y as its input. □

It was shown in [15, Proposition 2.1] that the following conditions are sufficient for Assumption 9.5 to be satisfied:
(i) the zero dynamics of (9.1) are globally exponentially stable, and
(ii) the vector field \Psi = \Psi_0 + \sum_{i=1}^{p} \theta_i \Psi_i in (9.10) is globally Lipschitz in z.
These conditions are more convenient for nonglobal results, where (i) can be used to estimate the region of attraction via a converse Lyapunov theorem. However, they are too restrictive for global results. For example, the system \dot z^r = -(z^r)^3 + y^2 violates both (i) and (ii), but is easily seen to satisfy Assumption 9.5.

The control objective is to force the output y of the system (9.1) to asymptotically track a known reference signal y_r(t), while keeping all the closed-loop signals bounded.

Assumption 9.6 The reference signal y_r(t) and its first \rho derivatives are known and bounded. □
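The claim that \dot z^r = -(z^r)^3 + y^2 satisfies Assumption 9.5 is easy to probe numerically. The sketch below is an added illustration (the input amplitude, initial condition, and integration scheme are arbitrary choices, not from the original text): it integrates this scalar subsystem driven by a bounded signal y(t) and checks that the state never leaves a computable bound.

```python
import math

def simulate_zr(z0=3.0, amp=2.0, dt=1e-3, T=20.0):
    """Integrate z' = -z**3 + y(t)**2 with the bounded input y(t) = amp*sin(t)
    (fixed-step RK4); return the largest |z| along the way and the final z."""
    def f(t, z):
        return -z**3 + (amp * math.sin(t))**2

    z, t, zmax = z0, 0.0, abs(z0)
    for _ in range(int(T / dt)):
        k1 = f(t, z)
        k2 = f(t + dt/2, z + dt/2*k1)
        k3 = f(t + dt/2, z + dt/2*k2)
        k4 = f(t + dt, z + dt*k3)
        z += dt/6 * (k1 + 2*k2 + 2*k3 + k4)
        t += dt
        zmax = max(zmax, abs(z))
    return zmax, z
```

The reason this works is visible in the code: whenever |z| exceeds (max y^2)^{1/3}, the cubic term dominates and drives |z| down, so |z(t)| never exceeds max{|z(0)|, (max y^2)^{1/3}}, which is exactly the BIBS property.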
To achieve the asymptotic tracking objective, the design procedure presented in Section 5 is modified as follows:
Step 0: Define

    x_1 = z_1 - y_r,   (9.16)
and denote by c_1, c_2, \dots, c_\rho constant coefficients to be chosen later.

Step 1: Starting with

    \dot x_1 = z_2 + \theta^T \gamma_1(z_1, z^r) - \dot y_r,   (9.17)

let \hat\theta_1 be an estimate of \theta and define the new state x_2 as

    x_2 = c_1 x_1 + z_2 + \hat\theta_1^T \gamma_1(z_1, z^r) - \dot y_r
        = c_1 x_1 + z_2 + \hat\theta_1^T w_1(x_1, z^r, y_r) - \dot y_r.   (9.18)

Substitute (9.18) into (9.17) to obtain

    \dot x_1 = -c_1 x_1 + x_2 + (\theta - \hat\theta_1)^T w_1(x_1, z^r, y_r).   (9.19)

Then, let the update law for the parameter estimate \hat\theta_1 be

    \dot{\hat\theta}_1 = x_1 w_1(x_1, z^r, y_r).   (9.20)
Step 2: Using the definitions for x_1, x_2 and \hat\theta_1, write \dot x_2 as

    \dot x_2 = c_1 \dot x_1 + \dot z_2 + x_1 w_1(x_1, z^r, y_r)^T \gamma_1(z_1, z^r)
             + \hat\theta_1^T \Big[ \frac{\partial \gamma_1}{\partial z_1} \big( z_2 + \theta^T \gamma_1(z_1, z^r) \big)
             + \frac{\partial \gamma_1}{\partial z^r} \Big( \Psi_0(y, z^r) + \sum_{i=1}^{p} \theta_i \Psi_i(y, z^r) \Big) \Big] - \ddot y_r
             = z_3 + \sigma_2(x_1, x_2, z^r, \hat\theta_1, y_r, \dot y_r, \ddot y_r) + \theta^T w_2(x_1, x_2, z^r, \hat\theta_1, y_r, \dot y_r).   (9.21)

Let \hat\theta_2 be a new estimate of \theta and define the new state x_3 as

    x_3 = c_2 x_2 + z_3 + \sigma_2(x_1, x_2, z^r, \hat\theta_1, y_r, \dot y_r, \ddot y_r) + \hat\theta_2^T w_2(x_1, x_2, z^r, \hat\theta_1, y_r, \dot y_r).   (9.22)

Substitute (9.22) into (9.21) to obtain

    \dot x_2 = -c_2 x_2 + x_3 + (\theta - \hat\theta_2)^T w_2(x_1, x_2, z^r, \hat\theta_1, y_r, \dot y_r).   (9.23)
Then, let the update law for the new estimate \hat\theta_2 be

    \dot{\hat\theta}_2 = x_2 w_2(x_1, x_2, z^r, \hat\theta_1, y_r, \dot y_r).   (9.24)

Step i (2 < i \le \rho - 1): Using the definitions for x_1, \dots, x_i and \hat\theta_1, \dots, \hat\theta_{i-1}, express the derivative of x_i as
    \dot x_i = z_{i+1} + \sigma_i(x_1, \dots, x_i, z^r, \hat\theta_1, \dots, \hat\theta_{i-1}, y_r, \dots, y_r^{(i)})
             + \theta^T w_i(x_1, \dots, x_i, z^r, \hat\theta_1, \dots, \hat\theta_{i-1}, y_r, \dots, y_r^{(i-1)}).   (9.25)

Let \hat\theta_i be a new estimate of \theta and define the new state x_{i+1} as

    x_{i+1} = c_i x_i + z_{i+1} + \sigma_i(x_1, \dots, x_i, z^r, \hat\theta_1, \dots, \hat\theta_{i-1}, y_r, \dots, y_r^{(i)})
            + \hat\theta_i^T w_i(x_1, \dots, x_i, z^r, \hat\theta_1, \dots, \hat\theta_{i-1}, y_r, \dots, y_r^{(i-1)}).   (9.26)
Substitute (9.26) into (9.25) to obtain

    \dot x_i = -c_i x_i + x_{i+1} + (\theta - \hat\theta_i)^T w_i(x_1, \dots, x_i, z^r, \hat\theta_1, \dots, \hat\theta_{i-1}, y_r, \dots, y_r^{(i-1)}).   (9.27)

Then, let the update law for \hat\theta_i be

    \dot{\hat\theta}_i = x_i w_i(x_1, \dots, x_i, z^r, \hat\theta_1, \dots, \hat\theta_{i-1}, y_r, \dots, y_r^{(i-1)}).   (9.28)
Step \rho: Using the definitions for x_1, \dots, x_\rho and \hat\theta_1, \dots, \hat\theta_{\rho-1}, express the derivative of x_\rho as

    \dot x_\rho = \beta_0(z) u + \sigma_\rho(x_1, \dots, x_\rho, z^r, \hat\theta_1, \dots, \hat\theta_{\rho-1}, y_r, \dots, y_r^{(\rho)})
                + \theta^T w_\rho(x_1, \dots, x_\rho, z^r, \hat\theta_1, \dots, \hat\theta_{\rho-1}, y_r, \dots, y_r^{(\rho-1)}).   (9.29)

Let \hat\theta_\rho be a new estimate of \theta and define the control u as

    u = \frac{1}{\beta_0(z)} \left[ -c_\rho x_\rho - \sigma_\rho - \hat\theta_\rho^T w_\rho \right].   (9.30)

Substitute (9.30) into (9.29) to obtain

    \dot x_\rho = -c_\rho x_\rho + (\theta - \hat\theta_\rho)^T w_\rho(x_1, \dots, x_\rho, z^r, \hat\theta_1, \dots, \hat\theta_{\rho-1}, y_r, \dots, y_r^{(\rho-1)}).   (9.31)

Finally, let the update law for the estimate \hat\theta_\rho be

    \dot{\hat\theta}_\rho = x_\rho w_\rho(x_1, \dots, x_\rho, z^r, \hat\theta_1, \dots, \hat\theta_{\rho-1}, y_r, \dots, y_r^{(\rho-1)}).   (9.32)
As was the case in the regulation result of Section 8, the assumptions of Proposition 9.3 guarantee that the design procedure (9.16)-(9.32) is globally feasible. The resulting closed-loop adaptive system is given by

    \dot x_1 = -c_1 x_1 + x_2 + (\theta - \hat\theta_1)^T w_1(x_1, z^r, y_r)
    \vdots
    \dot x_{\rho-1} = -c_{\rho-1} x_{\rho-1} + x_\rho + (\theta - \hat\theta_{\rho-1})^T w_{\rho-1}(x_1, \dots, x_{\rho-1}, z^r, \hat\theta_1, \dots, \hat\theta_{\rho-2}, y_r, \dots, y_r^{(\rho-2)})
    \dot x_\rho = -c_\rho x_\rho + (\theta - \hat\theta_\rho)^T w_\rho(x_1, \dots, x_\rho, z^r, \hat\theta_1, \dots, \hat\theta_{\rho-1}, y_r, \dots, y_r^{(\rho-1)})
    \dot z^r = \Psi_0(y, z^r) + \sum_{i=1}^{p} \theta_i \Psi_i(y, z^r)   (9.33)
    \dot{\hat\theta}_i = x_i w_i(x_1, \dots, x_i, z^r, \hat\theta_1, \dots, \hat\theta_{i-1}, y_r, \dots, y_r^{(i-1)}), \quad 1 \le i \le \rho,
    y = x_1 + y_r.

The stability and tracking properties of (9.33) will be established using the quadratic function

    V_\rho(x_1, \dots, x_\rho, \hat\theta_1, \dots, \hat\theta_\rho) = \frac{1}{2} \sum_{i=1}^{\rho} \left[ x_i^2 + (\theta - \hat\theta_i)^T (\theta - \hat\theta_i) \right].   (9.34)
The derivative of V_\rho along the solutions of (9.33), with c_i \ge 2, 1 \le i \le \rho, is

    \dot V_\rho = \sum_{i=1}^{\rho} x_i \dot x_i - \sum_{i=1}^{\rho} (\theta - \hat\theta_i)^T \dot{\hat\theta}_i
                = -\sum_{i=1}^{\rho} c_i x_i^2 + \sum_{i=1}^{\rho-1} x_i x_{i+1}
                \le -\sum_{i=1}^{\rho} x_i^2 \le 0.   (9.35)

This proves that V_\rho is bounded. Hence x_1, \dots, x_\rho and \hat\theta_1, \dots, \hat\theta_\rho are bounded, and \dot V_\rho is bounded and integrable. The boundedness of x_1 and y_r implies that y is bounded. Combining this with Assumption 9.5 proves that z^r is bounded. Therefore, the state vector of (9.33) is bounded. This fact, combined with Assumption 9.6, implies the boundedness of z, \zeta and u. Thus, the derivatives \dot x_1, \dots, \dot x_\rho are bounded, which implies that \ddot V_\rho is bounded. Hence, \dot V_\rho \to 0 as t \to \infty, which, combined with (9.35), proves that

    \lim_{t\to\infty} x_i(t) = 0, \quad 1 \le i \le \rho.   (9.36)

In particular, this means that asymptotic tracking is achieved:

    \lim_{t\to\infty} [y(t) - y_r(t)] = 0.   (9.37)
These results are summarized as:

Theorem 9.7 Suppose that the system (9.1) satisfies Proposition 9.3 and Assumption 9.5, and that the design procedure (9.16)-(9.32) is applied to its strict-feedback normal form (9.10). Furthermore, suppose that the reference signal y_r satisfies Assumption 9.6. Then, all the signals in the resulting closed-loop adaptive system (9.33) are bounded and asymptotic tracking (9.37) is achieved for all initial conditions in \mathbb{R}^{n+\rho p}. □
10 Discussion and Examples

With the help of two examples, we now discuss some of the main features of the new adaptive scheme. The first example illustrates the systematic nature of the design procedure, while the second one compares the stability properties of the new scheme with those of the nonlinearity-constrained scheme of [15].

Example 10.1 (Regulation). We first consider a "benchmark" example of adaptive nonlinear regulation:

    \dot z_1 = z_2 + \theta z_1^2
    \dot z_2 = z_3   (10.1)
    \dot z_3 = u,
where \theta is an unknown constant parameter. This system violates both the geometric conditions of [5,6,7] and the growth assumptions of [12,13,15,16]. In fact, the only available global result for this example was obtained in [14]. The system (10.1) is already in the form of (8.4) with \beta_0 \equiv 1. Hence, this system satisfies the conditions of Theorem 8.4, which guarantees that the point z = 0, \hat\theta_1 = \hat\theta_2 = \hat\theta_3 = \theta is a globally stable equilibrium of the adaptive system. Moreover, for any initial conditions z(0) \in \mathbb{R}^3, (\hat\theta_1(0), \hat\theta_2(0), \hat\theta_3(0)) \in \mathbb{R}^3, the regulation of the state z(t) is achieved:

    \lim_{t\to\infty} z(t) = 0.   (10.2)
The design procedure of Section 5, applied to (10.1), is as follows:

Step 0: Define x_1 = z_1.

Step 1: Let \hat\theta_1 be an estimate of \theta and define the new state x_2 as

    x_2 = 2 z_1 + z_2 + \hat\theta_1 z_1^2.   (10.3)

Substitute (10.3) into (10.1) to obtain

    \dot x_1 = -2 x_1 + x_2 + z_1^2 (\theta - \hat\theta_1).   (10.4)

Then, let the update law for \hat\theta_1 be

    \dot{\hat\theta}_1 = x_1^3.   (10.5)
Step 2: Using (10.3) and (10.5), write \dot x_2 as

    \dot x_2 = 2(z_2 + \theta z_1^2) + z_3 + 2\hat\theta_1 z_1 (z_2 + \theta z_1^2) + z_1^5.   (10.6)

Let \hat\theta_2 be a new estimate of \theta, and define the new state

    x_3 = 2 x_2 + 2(z_2 + \hat\theta_2 z_1^2)(1 + \hat\theta_1 z_1) + z_1^5 + z_3.   (10.7)

Substitute (10.7) into (10.6) to obtain

    \dot x_2 = -2 x_2 + x_3 + 2 z_1^2 (1 + \hat\theta_1 z_1)(\theta - \hat\theta_2).   (10.8)

Then, let the update law for \hat\theta_2 be

    \dot{\hat\theta}_2 = 2 x_2 z_1^2 (1 + \hat\theta_1 z_1).   (10.9)
Step 3: Using (10.3), (10.5), (10.7) and (10.8), write \dot x_3 as

    \dot x_3 = 2[-2 x_2 + x_3 + 2 z_1^2 (1 + \hat\theta_1 z_1)(\theta - \hat\theta_2)]
             + 2[z_3 + 2\hat\theta_2 z_1 (z_2 + \theta z_1^2) + 2 x_2 z_1^4 (1 + \hat\theta_1 z_1)](1 + \hat\theta_1 z_1)
             + 2(z_2 + \hat\theta_2 z_1^2)[z_1^4 + \hat\theta_1 (z_2 + \theta z_1^2)]
             + 5 z_1^4 (z_2 + \theta z_1^2) + u.   (10.10)
Let \hat\theta_3 be a new estimate of \theta, and define the control u as

    u = -2 x_3 - 2[-2 x_2 + x_3 + 2 z_1^2 (1 + \hat\theta_1 z_1)(\hat\theta_3 - \hat\theta_2)]
      - 2[z_3 + 2\hat\theta_2 z_1 (z_2 + \hat\theta_3 z_1^2) + 2 x_2 z_1^4 (1 + \hat\theta_1 z_1)](1 + \hat\theta_1 z_1)
      - 2(z_2 + \hat\theta_2 z_1^2)[z_1^4 + \hat\theta_1 (z_2 + \hat\theta_3 z_1^2)]
      - 5 z_1^4 (z_2 + \hat\theta_3 z_1^2).   (10.11)

Substitute (10.11) into (10.10) to obtain

    \dot x_3 = -2 x_3 + [4 z_1^2 (1 + \hat\theta_1 z_1)(1 + \hat\theta_2 z_1) + 2\hat\theta_1 z_1^2 (z_2 + \hat\theta_2 z_1^2) + 5 z_1^6](\theta - \hat\theta_3).   (10.12)

Finally, let the parameter update law for \hat\theta_3 be

    \dot{\hat\theta}_3 = x_3 [4 z_1^2 (1 + \hat\theta_1 z_1)(1 + \hat\theta_2 z_1) + 2\hat\theta_1 z_1^2 (z_2 + \hat\theta_2 z_1^2) + 5 z_1^6].   (10.13)
The resulting adaptive system is

    \dot x_1 = -2 x_1 + x_2 + x_1^2 (\theta - \hat\theta_1)
    \dot x_2 = -2 x_2 + x_3 + 2 x_1^2 (1 + \hat\theta_1 x_1)(\theta - \hat\theta_2)
    \dot x_3 = -2 x_3 + [4 x_1^2 (1 + \hat\theta_1 x_1)(1 + \hat\theta_2 x_1) + 2\hat\theta_1 x_1^2 (z_2 + \hat\theta_2 x_1^2) + 5 x_1^6](\theta - \hat\theta_3)   (10.14)
    \dot{\hat\theta}_1 = x_1^3
    \dot{\hat\theta}_2 = 2 x_2 x_1^2 (1 + \hat\theta_1 x_1)
    \dot{\hat\theta}_3 = x_3 [4 x_1^2 (1 + \hat\theta_1 x_1)(1 + \hat\theta_2 x_1) + 2\hat\theta_1 x_1^2 (z_2 + \hat\theta_2 x_1^2) + 5 x_1^6],

where x_1 = z_1 and z_2 = x_2 - 2 x_1 - \hat\theta_1 x_1^2.
Using the Lyapunov function

    V = \frac{1}{2} \left[ x_1^2 + x_2^2 + x_3^2 + (\theta - \hat\theta_1)^2 + (\theta - \hat\theta_2)^2 + (\theta - \hat\theta_3)^2 \right],   (10.15)

it is straightforward to establish the global stability and regulation properties of (10.14). □

Example 10.2 (Tracking). Consider now the problem in which the output y of the nonlinear system

    \dot z_1 = z_2 + \theta z_1^2
    \dot z_2 = u + z_3   (10.16)
    \dot z_3 = -z_3 + y
    y = z_1,
is required to asymptotically track the reference signal y_r = 0.1 \sin t. For the sake of comparison, let us first solve this problem using the scheme of [15]. This scheme employs the control

    u = -z_3 + k_1 (z_1 - y_r) + k_2 (z_2 + \hat\theta_1 z_1^2 - \dot y_r) + \ddot y_r - 2\hat\theta_1 z_1 z_2 - 2\hat\theta_2 z_1^3,   (10.17)

where \hat\theta_1, \hat\theta_2, the estimates of \theta, \theta^2, respectively, are obtained from the update laws:

    \dot{\hat\theta}_1 = e_1 \bar w_1, \qquad \dot{\hat\theta}_2 = e_1 \bar w_2.   (10.18)
Using a relative-degree-two stable filter M(s), the variables e_1, \bar w_1, \bar w_2 in (10.18) are defined by filtering the tracking error and the regressors; in particular,

    \bar w_1 = M(s)[2 z_1 z_2 + k_2 z_1^2], \qquad \bar w_2 = M(s)[2 z_1^3].   (10.19)-(10.22)
Simulations of this system were performed with

    M(s) = \frac{1}{s^2 + 5s + 6}, \quad \theta = 1, \quad k_1 = -6, \quad k_2 = -5,   (10.23)

and all the initial conditions zero, except for z_1(0), which was varied between 0 and 0.45. The results of these simulations, shown in Fig. 2, indicate that the response of the closed-loop system is bounded for z_1(0) sufficiently small, that is, for z_1(0) < 0.45. However, for z_1(0) \ge 0.45, the response is unbounded. This behavior is consistent with the proof of Theorem 3.3 in [15], which guarantees boundedness for all initial conditions only under a global Lipschitz assumption. In the above system, the presence of the term \theta z_1^2 leads to the violation of this assumption, and, as the simulations show, to unbounded response.

The unbounded behavior in Fig. 2 is avoided by the new scheme, which results in globally stable tracking. The design procedure of Section 9, applied to the system (10.16), results in the control

    u = -z_3 - 3 x_2 - 2(z_2 + \hat\theta_2 z_1^2)(1 + \hat\theta_1 z_1) - x_1 z_1^4 + 2 \dot y_r + \ddot y_r,
(10.24)
and the update laws

    \dot{\hat\theta}_1 = x_1 z_1^2, \qquad \dot{\hat\theta}_2 = 2 x_2 z_1^2 (1 + \hat\theta_1 z_1),   (10.25)

where

    x_1 = z_1 - y_r, \qquad x_2 = 2 x_1 + z_2 + \hat\theta_1 z_1^2 - \dot y_r.   (10.26)
Theorem 9.7 establishes that uniform stability and asymptotic tracking are achieved for all z_1(0), z_2(0), z_3(0), \hat\theta_1(0), \hat\theta_2(0). This is illustrated by simulations in Fig. 3. □

The above example illustrates an obvious advantage of the new scheme when applied to strict-feedback systems: it guarantees global stability for all types of smooth nonlinearities. For pure-feedback systems, when the feedback linearization is not global, the new scheme provides an estimate of the region of attraction. An advantage of the schemes in [5,12,13,14,15,16] is that they provide local results without assuming the pure-feedback form. However, estimates of the region of attraction are given only in [5,13,14]. A quantitative comparison of the regions of attraction and robustness properties guaranteed by different schemes is a topic of future research.
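As an added numerical check of the regulation design in Example 10.1, the sketch below (illustrative code, not part of the original text; the regressor formulas implement the closed loop in the form shown in (10.14), parts of which are reconstructed here, and the initial conditions are arbitrary) integrates the closed-loop adaptive system and verifies that the Lyapunov function (10.15) is nonincreasing and that the state converges to zero:

```python
def simulate_regulation(theta=1.0, dt=1e-3, T=15.0):
    """Fixed-step RK4 integration of the closed loop (10.14); the state is
    s = (x1, x2, x3, th1, th2, th3).  Returns (x1, x2, x3) at t = T and the
    Lyapunov function (10.15) at t = 0 and t = T."""
    def f(s):
        x1, x2, x3, t1, t2, t3 = s
        z2 = x2 - 2.0*x1 - t1*x1**2                 # invert (10.3); x1 = z1
        w1 = x1**2
        w2 = 2.0*x1**2*(1.0 + t1*x1)
        w3 = (4.0*x1**2*(1.0 + t1*x1)*(1.0 + t2*x1)
              + 2.0*t1*x1**2*(z2 + t2*x1**2) + 5.0*x1**6)
        return (-2.0*x1 + x2 + w1*(theta - t1),
                -2.0*x2 + x3 + w2*(theta - t2),
                -2.0*x3 + w3*(theta - t3),
                x1*w1, x2*w2, x3*w3)                # update laws

    def V(s):                                       # Lyapunov function (10.15)
        return 0.5*(s[0]**2 + s[1]**2 + s[2]**2
                    + sum((theta - p)**2 for p in s[3:]))

    s = [0.5, 0.0, 0.0, 0.0, 0.0, 0.0]              # initial x and estimates
    V0 = V(s)
    for _ in range(int(T/dt)):
        k1 = f(s)
        k2 = f([a + dt/2*b for a, b in zip(s, k1)])
        k3 = f([a + dt/2*b for a, b in zip(s, k2)])
        k4 = f([a + dt*b for a, b in zip(s, k3)])
        s = [a + dt/6*(b + 2*c + 2*d + e)
             for a, b, c, d, e in zip(s, k1, k2, k3, k4)]
    return s[:3], V0, V(s)
```

Note that V decreases no matter what the regressors w_i are, as long as the same w_i appear in the state equations and in the update laws; this is the structural property that the interlaced design enforces.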
[Figure: tracking error y - y_r versus time t.]
Fig. 2. Locally stable tracking with the adaptive scheme of [15].
[Figure: tracking error y - y_r versus time t.]
Fig. 3. Globally stable tracking with the new adaptive scheme.
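A simulation in the spirit of Fig. 3 can be sketched as follows (an added illustration: it uses the control (10.24) and update laws (10.25)-(10.26) as reconstructed above, with \theta = 1 and y_r = 0.1 sin t as in the text, and the initial condition z_1(0) = 0.45 for which the scheme of [15] diverges; the integration scheme and thresholds are arbitrary choices):

```python
import math

def simulate_tracking(theta=1.0, z10=0.45, dt=1e-3, T=30.0):
    """Closed-loop simulation of the plant (10.16) under the adaptive tracking
    controller (10.24)-(10.26); the state is s = (z1, z2, z3, th1, th2).
    Returns the final tracking error |y - y_r| and the largest |z1| seen."""
    def f(t, s):
        z1, z2, z3, t1, t2 = s
        yr, dyr, ddyr = 0.1*math.sin(t), 0.1*math.cos(t), -0.1*math.sin(t)
        x1 = z1 - yr                                    # (10.26)
        x2 = 2.0*x1 + z2 + t1*z1**2 - dyr
        u = (-z3 - 3.0*x2 - 2.0*(z2 + t2*z1**2)*(1.0 + t1*z1)
             - x1*z1**4 + 2.0*dyr + ddyr)               # control (10.24)
        return (z2 + theta*z1**2,                       # plant (10.16)
                u + z3,
                -z3 + z1,
                x1*z1**2,                               # update laws (10.25)
                2.0*x2*z1**2*(1.0 + t1*z1))

    s, t, z1max = [z10, 0.0, 0.0, 0.0, 0.0], 0.0, abs(z10)
    for _ in range(int(T/dt)):
        k1 = f(t, s)
        k2 = f(t + dt/2, [a + dt/2*b for a, b in zip(s, k1)])
        k3 = f(t + dt/2, [a + dt/2*b for a, b in zip(s, k2)])
        k4 = f(t + dt, [a + dt*b for a, b in zip(s, k3)])
        s = [a + dt/6*(b + 2*c + 2*d + e)
             for a, b, c, d, e in zip(s, k1, k2, k3, k4)]
        t += dt
        z1max = max(z1max, abs(s[0]))
    return abs(s[0] - 0.1*math.sin(t)), z1max
```

The run stays bounded and the tracking error decays, in contrast with the unbounded response of the scheme of [15] from the same initial condition.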
11 Conclusions

The results of this paper have advanced in several directions our ability to control nonlinear systems with unknown constant parameters. The most significant progress has been made in solving the global adaptive regulation and tracking problems. The class of nonlinear systems for which these problems can be solved systematically has been substantially enlarged. The strict-feedback condition precisely characterizes the class of systems for which the global results hold with any type of smooth nonlinearities. For the broader class of systems satisfying the pure-feedback condition, the regulation and tracking results may not be global, but are guaranteed in regions for which a priori estimates are given. It is crucial that the loss of globality, when it occurs, is not due to adaptation, but is inherited from the deterministic part of the problem. All these results are obtained using a step-by-step procedure which, at each step, interlaces a change of coordinates with the construction of an update law. Apart from the geometric conditions, this paper uses simple analytical tools, familiar to most control engineers.
Acknowledgements

The authors are grateful to Professors Riccardo Marino and Laurent Praly for their insightful comments and helpful suggestions.
References 1. K. S. Narendra and A. M. Annaswamy, Stable Adaptive Systems, Prentice-Hall, Inc., Englewood Cliffs, N J, 1989. 2. S. S. Sastry and M. Bodson, Adaptive Control: Stability, Convergence and RobustheSS, Prentice-Hall, Inc., Englewood Cliffs, N J, 1989. 3. A. Isidori, Nonlinear Control Systems, 2nd ed., Springer-Verlag, Berlin, 1989. 4. H. Nijmeijer and A. J. van der Shaft, Nonlinear Dynamical Control Systems, Springer-Verlag, Berlin, 1990. 5. G. Campion and G. Bastin, "Indirect adaptive state feedback control of linearly parametrized nonlinear systems," Int. ?. Adapt. Control Sig. Proc., vol. 4, pp. 345-358, Sept. 1990. 6. I. Kanellak0poulos, P. V. Kokotovi~, and R. Marina, "Robustness of adaptive nonlinear control under an extended matching condition," Prepr. IFAC Syrup. Non. linear Control Syst. Design, pp. 192-197, Capri, Italy, June 1989. 7. I. Kanellakopoulos, P. V. Kokotovi~, and 1t. Marina, "An extended direct scheme for robust adaptive nonlinear control," Automatiea, to appear, March 1991. 8. R. Maxino, I. Kanellakopoulos, and P. V. Kokotovi~, "Adaptive tracking for feedback linearizable SISO systems," Proc. £Sth IEEE Con]. Dec. Control, pp. 10021007, Tampa, FL, Dec. 1989. 9. J.-J. E. Slotine and J. A. Coetsee, "Adaptive sliding controller synthesis for nonlinear systems," Int. J. Control, vol. 43, pp. 1631-1651, June 1986. 10. D. G. Taylor, Feedback Control o] Uncertain Nonlinear Systems with Applications to Electric Machinery and Robotic Manipulators, Ph.D. Thesis, University of Illinois, Urbana, IL, May 1988.
346
Kokotovi~, Ka~eUakopoulos, and Morse
11. D. G. Taylor, P. V. Kokotovi~, R. Marina, and I. K~nellakopoulos, "Adaptive regulation of nonlineax systems with unmodeled dynamics," IEEE Trans. Ant. Controi~ vol. 34, pp. 405-412, April 1989. 12. K. Nam and A. Arapostathis, "A model-reference adaptive control scheme for pure-feedback nonlinear systems," IEEE Trans. Aut. Control, vol. 33, pp. 803811, Sept. 1988. 13. J.-B. Pomet ~nd L. Praly, Adaptive nonlinear controk an estimation.based algodthm, in New Trends in Nonlinear Control Theory, J. Descusse, M. Fliess, A. Isidori and D. Leborgne Eds., Springer-Verlag, Berlin, 1989. 14. J.-B. Parent and L. Praly, "Adaptive nonlinear regulation: equation error from the Lyapunov equation," Proc. ~Sth IEEE Conj. Dec. Control, pp. 1008-1013, Taxnpa, FL, Dec. 1989. 15. S.S. Sa~try and A. Isidori, "Adaptive control of lineaxizable systems," IEEE Trans. Aut. Control, vol. 34, pp. 1123-1131, Nov. 1989. 16. A. Teel, It. Kadiyeda, P. V. Kokotovi~, and S. S. Sa~try, "Indirect techniques for adaptive input output linearization of nonlinear systems," Int. J. Control, vol. 53, pp. 193-222, Jan. 1991. 17. L Kanellakopoulos, P. V. Kokotovi~, and R. H. Middleton, "Observer-based adaptive control of nonlinear systems under matching conditions," Proc. 1990 Amer. Control Conf., pp. 549-555, San Diego, CA, May 1990. 18. I. Ka~ellakopoulos, P. V. Kokotovi~, and R. H. Middleton, "Indirect adaptive output-feedback control of a class of nonlineax systems," Proc. ~9th IEEE Conf. Dec. Control, pp. 2714-2719, Honolulu, HI, Dec. 1990. 19. A. Feuer and A. S. Morse, "Adaptive control of single-input single-output linear systems," ]EEE Trans. Aut. Control, vol. AC-23, pp. 557-569, Aug. 1978. 20. R. Su, "On the linear equivalents of nonlinear systems," Syst. Control Lett., vol. 2, pp. 48-52, Jttly 1982. 21. L. R. Hunt, R. Su, and G. Meyer, "Global transformations of nonlinear systems," IEEE Trans. Automat. Control, vol. AC-28, pp. 24-31, Jan. 1983. 22. W. Dayawansa, W. M. Boothby and D. L. 
Elfiott, "GlobaJ state and feedback equivalence of nonlineax systems," Syst. Control Lett., vol. 6, pp. 229-234, 1985. 23. W. Respondek, Global aspects of linearization, equivalence to polynomial forms and decomposition of nonlinear control systems, in Algebraic and Geometric Methods in Nonlinear Control Theory, M. Fliess and M. Hazewinkel, eds., pp. 257-284, D. Reidel Publishing Co., Dordrecht, 1986. 24. B. Jakubczyk and W. Respondek, "On linearization of control systems," Bull. Acad. Pal. Science, Ser. Science Math., vol. 28, no. 9-10, pp. 517-522, 1980. 25. L . R . Hunt, R. Su, and G. Meyer, Design for multi-input nonlinear systems, in Differential Geometric Control Theory, It. W. Brockett, R. S. Millman a~d H. S. Sussmann, eds., pp. 268-297, Birkh~user, Boston, MA, 1983. 26. R. Maxino, W. M. Boothby and D. L. EUiott, "Geometric properties of linearizable control systems," Math. Syst. Theory, vol. 18, pp. 97-123, 1985.
Adaptive
Stabilization
of Nonlinear
Systems
L. Praly, a G Bastin, 2 J.-B. Pomet, 3 and 7,. P. Jiang x a Centre d ' A u t o m a t i q u e et Informatique - Section Automatique, ]~cole des Mines de P a d s , 35 rue St Honord, 77305 Fontainebleau cddex, F R A N C E . 2 Laboratoire d ' A u t o m a t i q u e , de Dynamique et &Analyse des Syst~mcs, Universitd Catholique de Louvain, Place du Levant 3, B-1348 Louvain-L~-Neuve, BELGIUM. D e p a r t m e n t of Mathematics ~nd Statistics, Queens University, Kingston, Ontario, KTL 3N6, CANADA. A b s t r a c t . An overview of the various parametric approaches which can be adopted to solve the problem of adaptive stabilization of nonlinear systems is presented. The Lyapunov design and two estimation designs -equation error filtering and regressor filtering- are revisited. This allows us to unify and generalize most of the available results on the topic and to propose a classification depending on the required e x t r a assumptions - matching conditions or growth conditions.
1 P r o b l e m S t a t e m e n t and A s s u m p t i o n s 1.1 P r o b l e m
Statement
W e consider a dynamic system which admits a finite state-space representation and whose dynamics are described by an equation which involves uncertain constant parameters. W e are concerned with the design of a dynamic state-feedback controller which ensures, in spite of that uncertainty, that solutions of the closedloop system are bounded and their z-components converge to a desired set point. Example: Consider the following one-dimensional system:
(1)
= p* x 2 + u,
(2)
where p* is a constant parameter. Would the value of Iv* be known, we could use the following linearizing control law to globally stabilize the origin of the closed-loop system:
u = -p*x 2 - z.
(3)
W h e n only an approximate value ~ of iv* is known and is used in the control law (3), we obtain the following closed-loop system: = _z + ( f _ ~ ) x 2 . This system (4) has two equilibrium points: i. z : 0, which is exponentially stable, 1 2. z - p . ~ , which is exponentially unstable.
(4)
348
Praly, Bastin, Pomet, and Jiang
Hence, as long as ~ is not exactly equal to p*, the global asymptotic stability of z = 0 is lost. We notice also that the simple linear control u = -,
(~)
gives exactly the same qualitative behavior. Assume now that we apply the following dynamic controller: (6)
= ~ U ~ - - - p X 2 - - ~ff.
This is the linearizing controller where instead of the true value p* of the parameter, we use an on-line updated estimate ~. The corresponding closed-loop system we get is: = ~3
(7)
"----X
"Jr" ( p * - - ~ ' ) 2
2 •
To study the stability, we consider the function:
=
1
+
p.)2).
(8)
Its time derivative along the solutions of (7) is:
¢~ =
(9)
-z ~ .
It follows that any solution of the closed-loop system (7) is bounded and: lira z(t) = 0.
(10)
t--~oo
Therefore, the convergence of z to 0 is restored.
0
In this example, the parameter p* enters linearly in the dynamic equation (2). This assumption is fundamental throughout the paper. It is formalized as follows: A s s u m p t i o n A-LP (A-Linear P a r a m e t e r i z a t l o n ) (11) We can find a set of measured coordinates z in IKn such that, given an integer k and a C 1 function A : lE '~ x IR+ ~ AcIk,(IR), there ezist an integer l, two C 1 functions: a : IR"
x IR"
-~
IR",
A : IR"
x ] R "~ - ~
.M.~(IR),
and an unknown parameter vector p* in IR t such that: 1. the functions A ( z , t ) a ( z , u ) and A ( z , t ) A ( z , u ) are lmown, i.e., can be evaluated, 2. the dynamics of the system to be controlled are described by: = a(z,u) + A(z,u)p*, where u is the input vector in IE m.
(12)
Adaptive Stabilization of Nonlinear Systems
349
To deal with the case where a and A are afline in u, it is useful to introduce the following notation: m
.(., u)
= .0(.)
m
A ( . , u) = A o ( z ) % ~ u ~ A,(x)
+ ~u,.,(.)
(13)
i=1 def
dd
= -o(~) + u e b(~),
= Ao(~) + u o B(~).
(14)
E x a m p l e : I n t r o d u c t i o n t o S y s t e m (17) Clearly, in the case of the system (2), if' we chooee:
A(z,t)- I,
(15)
assumption A-LP is satisfiedwith the known functions:
a(x,u) = u, A(z,u) = x 2.
(16)
Let us now consider the following two-dimensional system: ~x = x ~ ( ~ +Cl + c ~ u )
(17)
~2 = -c2 ~ (x2 + cs x , ) , where ct, c2 and ca are three unknown real numbers. Depending on our choice for the function A, we get different parameterizations. - A first possible choice is, with k = 2:
A(xl,xa,t) = ( ~ ~ ) .
(18)
Then A-LP is met with l = 3, the parameter vector: f
= ( c , , c~, c~ca) T ,
(19)
and the known functions:
.(~,x2)
=
~
~
,
A(~,,~2,u)
=
_~ ~ .
.
- Another possibility is, with k = 1, A(zt, z2,t) = (xt 0).
(21)
Assumption A-LP holds with I = 2, the parameter vector: ]9* -- (Cl
C2) T ,
(~2)
and the functions:
.(.~,.~)=
*
2
,
~(.~,**, u)
=
-.~ (.~ +
ca.,)
Here the function A is unknown since it involves the unknown constant c3. But both A a and A A are known. In particular,the choice (21) for the function A implies that we pay attention to the dynamics of zi only and
350
Praly, Bastin, Pomet, and :liaztg disregard the dynamics of x2.
[]
This example illustrates that in order to find a state-space representation and a parameterization and to choose a function A satisfying assumption A-LP, it may be useful not to work with the a priori given, say physical, coefficients. First, the knowledge of these coefficients may not be relevant to the Stabilization problem. As emphasized by Sastry and Kokotovic [28], this is in particular the case if, by looking at these coefficients as disturbances, they can be rejected. For the system (17) in Example (14), we will see in Example (49) that the coefficient c3 is irrelevant to the stabilization. Second, it may also be useful to change the a priori given coordinates and define the parameters as functions of the coefficients. For instance, in Example (14), we may choose P3 = c2c3. This trick is classically used in robotics (see [14]). It has also been used by Sastry and Isidori [29] for input-output linearization of systems with relative degree larger than one and by Kanellakopoulos, Kokotovic and Middleton [10] for adaptive dynamic output feeeback.
Example: Introduction to System (25)
(24)
Consider a system whose dynamics is described by the following second-order differential equation: ij = u + L ( y ) p * , (25) where y is a measured output, u is the input and L is a known smooth function. A straightforward state-space representation with (y,/t) as the state vector is not appropriate for satisfying assumption A-LP, since y is not measured. Instead, let us introduce the following filtered quantities: ijf + yf + Yt = Y
vf(0) = i f ( 0 ) = 0
i i f + iJr + uf = u
.f(0) =
L t + L t + Lr = L
Lf(0) = Lf(0) = 0 .
f(0) = 0
(26)
Then, (25) can be rewritten with the following non-minimal state-space representation, with - denoting differentiation with respect to time: (01
=
0
0
0
00
1 0 p* 0
00
0
1
00 -1 -1
0
0
0
0
0 0 0 0 0 1 00
0
0
0~
0 -1 -1
%
0 u
0
L(ylj
(27)
Adaptive Stabilization of Nonlinear Systems
351
y = fjf + ~ + ,,f + Lfp', + 6(~), ~ = y~ + , , + ~ + ( Lt + LOp. + ~it) q- 5(t),
(28)
where z is the following state vector: /
•
" -
\ T
L, L 0
,
and 6(t) is the solution of: ~'+$+~
= 0,
with
6(0) = y(0), 8(0) = ~ ( 0 ) - y ( 0 ) .
(30)
Since x can be obtained from the knowledge of y and u only, assumption A-LP is satisfied up to the presence of the exponentially decaying time function/~ by choosing: A(x,t) - (0 1 0 0 0 0). (31) The presence of 6 implies that our forthcoming results will not apply in a straightforward manner. In each case we will have to study the effects of this term. []
According to assumption A-LP, the only knowledge we have about the system to be controlled is that it is a member of a linearly parameterized family of systems whose dynamics satisfy the following equation, denoted by (Sp) in the sequel: = a(~,u) + A ( z , u ) p .
(S~)
In this case, we formalize the Adaptive Stabilization problem as follows: A d a p t i v e S t a b i l i z a t i o n P r o b l e m : Find an integer v and two functions pl : IRa x l R u ~ I f f f ,
p~ : IRn x H t u --* IRm,
such that there exists an open subset ~) of IKn x IR~ with the following property: The solutions (z(t), X(t)) of the system composed of the system (Sp.) to be controlled and the following dynamic state-feedback controller: = #l(Z, X)
(32)
u = ~2(x, x), with (z(0), X(0)) in :P, (AS1) are well-defined, unique and bounded on [0, +oo), (AS2) have the property lira x(t) = ~ where £ is a desired set point for x, which may depend on p*.
(33)
352
Praly, Bastin, Pomet, and Jiang
A typical illustration of this problem has been given in Example (1). It turns out that, in all the solutions to this problem which are proposed in this paper, part of the state X of the controller can be considered as an estimate ~ of the unknown parameter vector p*. This motivates the adjective "adaptive". Several solutions to the Adaptive Stabilization problem have been proposed in the literature under particular assumptions. Extending the work of Pomet [19], we present here a framework allowing us to unify and generalize most of these solutions. 1.2 C o n n e c t i o n w i t h t h e E r r o r F e e d b a c k R e g u l a t o r P r o b l e m When, in equation (Sp), the function A does not depend on u, i.e., A ( z , u) = Ao(x), and a is affine in u (see (13)), the Adaptive Stabilization problem described above has similarities with the Error Feedback Regulator problem as stated by Isidori for systems of the form [6, Sect. 7.2]: = So(Z) + u ® b ( x ) + A o ( x ) p = s(p)
(34)
u = h(~) + q(p), where p in IRl is an unmeasured disturbance while y in IRk is a measured output signal. E r r o r F e e d b a c k R e g u l a t o r P r o b l e m : Find an integer v and two functions Px : IRJ: x IR~ " * IR~,
#2 : IR~ ~ IR"~,
such that: (EFR1) the equilibrium point of
= a0(x) + .~(x) o b(x) + A0(,)p*
(35)
= m(h(~), X) is asymptotically stable in the first approximation, (EFR2) there exists an open subset D of IRn x IR~ such that the solutions of the system (34) controlled by the following dynamic output feedback: = i,1(u,x) ,,
(36)
= us(x),
with initial condition in this set :D, have the property lim y(t) = ,f.
t .,-*OO
where £ is a desired set point for y, independent of p.
(37)
Adaptive Stabilization of Nonlinear Systems
353
It would appear that the family of systems (Sp) dealt with in our Adaptive Stabilization problem is a subclass of the systems (34) of the Error Feedback Regulator problem. This family is obtained by imposing a constant disturbance,
i.e,, = ,(p) = o ,
(38)
and a measured state, i.e., h(z) -- z
q(p) = 0.
and
(39)
Isidori gives a solution to the Error Feedback Regulator problem in [6, Theorem 7.2.10]. It applies to systems (Sp) under the following assumptions, using the notation (13): (A1) There exists a C ~ function uo(p) defined on an open neighborhood of p* such that:
a0(£) + u0(V)O b(£) + A0(C)p = 0,
(40)
(A2) The n x I matrix Ao(£) has rank l, (A3) The pair
(£) ÷ u0(p*) ® ~z(£) q-
(£) p*, b(£) is stabilizable.
Assumption (A1) is quite natur~l if £ is not allowed to depend on p. It states only the existence of a control, smoothly depending on p, making £ a set point of (Sp) for all p close enough to p*. Assumption (A2) is much more restrictive since it imposes that the number l of parameters do not exceed the dimension r~ of the state and that the matrix A0(z) cannot degenerate at x = £. Finally, assumption (A3) excludes systems whose linearization has uncontrollable modes associated with pure imaginary eigenvalues and, in particular,systems which are not feedback linearizable. ExAmple: S y s t e m (17) C o n t i n u e d (41) Assumptions (A2) and (A3) are not satisfied By the following system (i.e., system (17) with c2 = 1 to make A independent of u): =
+ -)
(42)
~2 = - z~ (x2 + p2 zl) •
aa .g u" OA Indeed, A(£), ~-~x( , ) and ~ ( £ )
are all zero.
D
In fact, assumptions (A2) and (A3) follow from the strong requirement EFRI of asymptotic stability.In this paper, it is precisely in order to be able to deal with systems which may not satisfy assumptions (A2) or (A3), such as system (42), that, instead of requirement EFR1, we ask, in the Adaptive Stabilization problem, for the less stringent Lagrange stability requirement ASI.
854
Praly, Bastin, Pomet, and Jiang
1.3 Assumptions

Our counterparts of assumptions (A1), (A2) and (A3) are the following assumptions on the particular system (S_{p*}): Let Π be an open subset of ℝ^l and Ω be an open neighborhood of x_e in ℝ^n. There exist two known functions:

V : Ω × Π → ℝ_+ of class C² ,
u_n : Ω × Π → ℝ^m of class C¹ ,

such that:

Assumption BO (Boundedness Observability)    (43)
There exist an open neighborhood Ω_0 of x_e in Ω and a strictly positive constant α_0 such that, for all real numbers α, 0 < α < α_0, all compact subsets K of Π and all vectors x_0 ∈ Ω_0, we can find a compact subset Γ of Ω such that, for any C¹ time functions p̂ : ℝ_+ → Π and u : ℝ_+ → ℝ^m and any solution x(t) of:

\dot x = a(x, u(t)) + A(x, u(t)) p* ,   x(0) = x_0 ∈ Ω_0    (44)

defined on [0,T), we have the following implication:

V(x(t), p̂(t)) ≤ α and p̂(t) ∈ K  ∀t ∈ [0,T)   ⟹   x(t) ∈ Γ  ∀t ∈ [0,T) .    (45)
Assumption PRS (Pointwise Reduced-Order Stabilizability)    (46)
For all (x,p) in Ω × Π, we have:

∂V/∂x (x,p) [ a(x, u_n(x,p)) + A(x, u_n(x,p)) p ] ≤ 0 ,    (47)

where the inequality is strict if V(x,p) ≠ 0.
In the sequel, the case where

Ω = Ω_0 = ℝ^n   and   α_0 = +∞    (48)

is called the global case.

Example: System (17) Continued    (49)
Consider the two-dimensional system (17) in Example (14) which, according to the second parameterization we mentioned, is rewritten as:

\dot x_1 = x_2 + p_1 + p_2 u ,
\dot x_2 = - p_2 (x_2 + c_3 x_1) ,    (50)

where p_2 is known to be strictly positive. We choose:

V(x_1,x_2,p_1,p_2) = (1/2) x_1² ,   u_n(x,p) = - (x_2 + p_1 + x_1) / p_2 ,    (51)
Adaptive Stabilization of Nonlinear Systems
Ω_0 = Ω = ℝ² ,   Π = ℝ × (ℝ_+ − {0}) ,   α_0 = +∞ .    (52)
Notice that the constant c_3 is not involved. This justifies a posteriori the choice of the parameterization. Assumption PRS is satisfied. Indeed, we get:

∂V/∂x (x,p) [ a(x, u_n(x,p)) + A(x, u_n(x,p)) p ] = - x_1² .    (53)
Assumption BO holds also if, for any C¹ time function x_1(t), we have:

|x_1(t)| ≤ α  ∀t ∈ [0,T)   ⟹   |x_2(t)| ≤ γ  ∀t ∈ [0,T) ,    (54)

where x_2(t) is a solution of:

\dot x_2 = - p_2* (x_2 + c_3 x_1(t))    (55)

and γ is a positive function of α and x_2(0). To prove this implication, we notice that p_2* > 0 implies:

|x_2(t)| > |c_3| |x_1(t)|   ⟹   (d/dt) x_2²(t) < 0 .    (56)

Hence, with (54), we can choose:

γ = max{ |x_2(0)| , |c_3| α } .    (57)  □
This example illustrates two points of our assumptions:
1. The parameter vector p must in some cases be constrained to lie in Π, an open set strictly contained in ℝ^l. Indeed, here, for p_2 = 0, u_n is not defined.
2. The function V looks like a Lyapunov function in x, but actually it may not be radially unbounded in x. For instance, in Example (49), x_2 may go to infinity without V going to infinity.
The radial unboundedness of a Lyapunov function V(x(t),t) guarantees that the magnitude of V(x(t),t) at time t gives a bound on the norm of the full state vector x(t) at the same time t. In contrast, as illustrated by Example (49), the magnitude of the function V(x(t),p) given by assumption BO at time t gives a bound on the norm of only a part of the state vector x(t) at the same time t, namely x_1(t). What assumption BO actually guarantees is the "observability" of the boundedness of the full state vector x from the "output" function V(x,p), i.e., that if the trajectory {V(x(t),p)}_{t∈[0,T)} of the "output" is bounded, then so is the trajectory {x(t)}_{t∈[0,T)} of the full state. Then, assumption PRS guarantees the existence of a control law which forces the part of the state vector mentioned above to converge to the corresponding part of the equilibrium point x_e. Later on, we will add more constraints on the function V, e.g. assumption MC. In order to allow for more possibilities to find a function V meeting all these requirements, it is useful at this stage to have assumption BO instead of the more standard but more restrictive radial unboundedness.
To guarantee convergence to x_e of the whole state vector x, and not only of its reduced-order part, we will also need:

Assumption CO (Convergence Observability)    (58)
For any bounded C¹ time functions p̂ : ℝ_+ → Π and u : ℝ_+ → ℝ^m, with \dot{p̂} also bounded, and for any solution x(t) of (44) defined on [0,+∞), we have the following implication:

lim_{t→∞} V(x(t),p̂(t)) exists and is zero,
x(t) is bounded on [0,+∞)   and   x(t) ∈ Ω  ∀t ∈ [0,+∞)
⟹   lim_{t→∞} x(t) exists and is equal to x_e .
Example: System (17) Continued    (59)
Assumption CO is satisfied for the system (17) rewritten as (50) with the choice (51). This is a consequence of the uniform asymptotic stability of the zero solution of (see Lakshmikantham and Leela [11, Theorem 3.8.3]):

\dot x_2 = - p_2* x_2 .    (60)

Indeed, since x_1 is a component of a solution of an ordinary differential equation, the convergence of x_1(t) to 0 implies the boundedness:

|x_1(t)| ≤ α  ∀t .    (61)

Then, since assumption BO is satisfied:

|x_2(t)| ≤ γ = max{ |x_2(0)| , |c_3| α }  ∀t .    (62)

Now, the convergence of x_1(t) to 0 also implies that for all δ > 0 there exists a time T such that:

|x_1(t)| ≤ δ  ∀t ≥ T .    (63)
And, with (55), we have:

(d/dt) x_2²(t) = - 2 p_2* x_2(t) ( x_2(t) + c_3 x_1(t) ) .    (64)

It follows that:

|x_2(t)| > δ (1 + |c_3|)   ⟹   (d/dt) x_2²(t) < - 2 p_2* δ² .    (65)

Therefore, with (62), defining τ(δ) by:

τ = ( γ² - δ² (1 + |c_3|)² ) / ( 2 p_2* δ² ) ,    (66)

we have established:

∀δ > 0 , ∃T , ∃τ :   ∀t ≥ T + τ ,   |x_2(t)| ≤ δ (1 + |c_3|) .    (67)

This is CO. □
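The convergence argument (63)-(67) can likewise be illustrated numerically, again assuming the reconstructed dynamics (55), with the hypothetical values p_2* = 1, c_3 = 2 and a particular x_1(t) converging to 0:

```python
import numpy as np

# Illustration of the CO argument: when x1(t) -> 0, the solution of the
# reconstructed dynamics x2' = -p2*(x2 + c3*x1(t)) also converges to 0.
# p2 and c3 are hypothetical values (only p2 > 0 matters).
p2, c3 = 1.0, 2.0
dt, T = 1e-3, 30.0
t = np.arange(0.0, T, dt)
x1 = 0.5 * np.exp(-t)                # a C^1 input converging to 0
x2 = np.empty_like(t)
x2[0] = 3.0
for k in range(len(t) - 1):          # forward Euler integration of (55)
    x2[k + 1] = x2[k] + dt * (-p2 * (x2[k] + c3 * x1[k]))
print(abs(x2[-1]))
```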
This example illustrates a typical fact about assumption CO: it is usually difficult to prove that it holds. Above we used a Total Stability argument; here is another example, invoking Barbălat's Lemma [26, p. 211]:

Example:    (68)
Consider the following system:

\dot x_1 = x_2 + p* ,
\dot x_2 = u .    (69)

We wish to stabilize the equilibrium point x_1 = 0, x_2 = -p*. Assumptions BO and PRS are satisfied when we choose:

V(x_1,x_2,p) = x_1² + (x_2 + p + x_1)² ,   u_n(x_1,x_2,p) = - 2 (x_2 + p + x_1) .    (70)
To check that CO is also met, we follow the same steps as the ones proposed by Kanellakopoulos, Kokotovic and Marino for the proof of [9, Theorem 1]. Clearly, if V(x_1(t),x_2(t),p̂(t)) tends to 0, then so do x_1(t) and x_2(t) + p̂(t) + x_1(t). To obtain our conclusion, it is sufficient to prove that \dot x_1(t) tends to 0. Indeed, in such a case, the first equation in (69) implies that x_2(t) + p* tends to 0. Since we have:

lim_{t→+∞} ( x_1(0) + ∫_0^t \dot x_1(s) ds ) = lim_{t→+∞} x_1(t) = 0 ,    (71)

Barbălat's Lemma shows that \dot x_1(t) tends to 0 if this time function is uniformly continuous. This is indeed the case, since its time derivative is:

\ddot x_1(t) = \dot x_2(t) = u(t) ,    (72)

where, by assumption, the time function u(t) is bounded. □
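The PRS part of this choice can be checked pointwise. The sketch below evaluates, at random points, the derivative of the V of (70) along (69) with u = u_n and compares it with -2V (the identity behind (70), under the reconstruction above):

```python
import numpy as np

# Pointwise check that the choice (70) gives dV/dt = -2V along (69) when
# u = u_n(x1, x2, p), so PRS holds with strict inequality iff V != 0.
rng = np.random.default_rng(0)
err = 0.0
for _ in range(100):
    x1, x2, p = rng.normal(size=3)
    s = x2 + p + x1                   # the quantity squared in (70)
    V = x1**2 + s**2
    dx1 = x2 + p                      # first equation of (69)
    dx2 = -2.0 * s                    # u = u_n(x1, x2, p) from (70)
    dV = 2*x1*dx1 + 2*s*(dx2 + dx1)   # chain rule on V
    err = max(err, abs(dV + 2.0 * V))
print(err)
```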
Assumptions BO, PRS and CO are weaker than (A1), (A2) and (A3). Indeed, it follows from linear systems theory [7] and Total Stability theorems [11] that assumptions (A1) and (A3) imply the existence of:
- Ω, an open neighborhood of x_e,
- Π, an open neighborhood of p*,
- P, an n × n positive definite matrix,
- C, an m × n matrix, and
- k, a strictly positive constant,
such that, by letting:

V(x,p) = V(x) = (x - x_e)ᵀ P (x - x_e) ,   u_n(x,p) = - C (x - x_e) + u_0(p) ,    (73)

we have:

∂V/∂x (x) [ a_0(x) + u_n(x,p) ⊗ b(x) + A_0(x) p ] ≤ - k V(x) ,   ∀(x,p) ∈ Ω × Π .    (74)
This implies that assumption PRS is satisfied. Also, the function V is positive definite and radially unbounded. Hence, assumptions BO and CO hold with Ω_0 = Ω and α_0 the largest positive real number α such that:

{ x | V(x) ≤ α } ⊂ Ω .    (75)

Finally, note that (A2) is not needed.
If the value of the parameter vector p* were known, assumptions BO, PRS and CO would be sufficient to guarantee the stabilizability of x_e. This is made precise as follows:

Proposition    (76)
Let assumptions BO and PRS hold, p* be in Π and the control u_n(x,p*) be applied to the system (S_{p*}). Under these conditions, all the solutions x(t) with initial condition x(0) in Ω_0 and satisfying V(x(0),p*) < α_0 are well-defined on [0,+∞), unique, bounded and:

lim_{t→∞} V(x(t),p*) = 0 .    (77)

If, moreover, assumption CO holds, then x(t) converges to x_e.
Proof. The system we consider is:

\dot x = a(x, u_n(x,p*)) + A(x, u_n(x,p*)) p* .    (78)
With p* fixed in Π, this is an autonomous system with its right-hand side continuously differentiable in the open subset Ω of ℝ^n. Hence, for any initial condition x(0) in Ω_0 ⊂ Ω, there exists a unique solution x(t) of (78) in Ω. It is a continuously differentiable time function (and so is u_n) defined on a right maximal interval [0,T), with T a strictly positive (possibly infinite) real number. Let us prove by contradiction that T = +∞ if V(x(0),p*) < α_0. Assume the contrary. From the theorem on continuation of solutions [5, Theorem 1.2.1], x(t) tends to the boundary of Ω as t tends to T. But since:

x(t) ∈ Ω  ∀t ∈ [0,T)   and   p* ∈ Π ,    (79)

we may use (47) in assumption PRS and conclude:

\dot V(x(t),p*) ≤ 0  ∀t ∈ [0,T) .    (80)

This yields:

V(x(t),p*) ≤ V(x(0),p*) < α_0   ∀t ∈ [0,T) .    (81)

Then, from assumption BO, we know there exists a compact subset Γ of Ω, depending on p*, x(0) and V(x(0),p*), such that:

x(t) ∈ Γ   ∀t ∈ [0,T) .    (82)
Since the set Γ is compact, it is strictly contained in the open set Ω. This establishes the contradiction and the fact that the time functions x(t) and u(t) = u_n(x(t),p*) are bounded on [0,+∞). Now, from (47), we have:

\dot V(x(t),p*) < 0   ∀t : V(x(t),p*) ≠ 0 ,
\dot V(x(t),p*) = 0   ∀t : V(x(t),p*) = 0 .    (83)

Since the function V is nonnegative, this implies (77). Finally, convergence to x_e of x(t) is a straightforward consequence of assumption CO. □

In the following we will show that even when the value of p* is unknown, in which case we cannot implement u_n(x,p*), the Adaptive Stabilization problem can be solved if extra assumptions are added. In Section 2, a solution will be obtained from the Lyapunov design with the assumption that a so-called matching condition is satisfied. In Section 3, other solutions will be obtained from an estimation approach. They will require a stronger version of assumption PRS and that either a matching condition or some growth conditions on the nonlinearities be satisfied.

2 Lyapunov Design
In a famous paper, Parks [18] suggested a very efficient way of getting a controller for the linearly parameterized family of systems (S_p). The idea is to use the control:

u = u_n(x, p̂)    (84)

with the time function p̂ selected so that a positive definite radially unbounded function of x and p̂ be decaying. Even though here the function V is not radially unbounded, let us pursue this idea and compute the time derivative of:

W(x,p̂) = V(x,p̂) + (1/2) ‖p̂ - p*‖²    (85)

along the solutions of (S_{p*})-(84). Using (47) in assumption PRS, we get:

\dot W = ∂V/∂x (x,p̂) [ a(x,u_n(x,p̂)) + A(x,u_n(x,p̂)) p* ] + ∂V/∂p (x,p̂) \dot{p̂} + \dot{p̂}ᵀ (p̂ - p*)
      = ∂V/∂x (x,p̂) [ a(x,u_n(x,p̂)) + A(x,u_n(x,p̂)) p̂ ] + [ - ∂V/∂x (x,p̂) A(x,u_n(x,p̂)) + \dot{p̂}ᵀ ] (p̂ - p*) + ∂V/∂p (x,p̂) \dot{p̂}
      ≤ [ - ∂V/∂x (x,p̂) A(x,u_n(x,p̂)) + \dot{p̂}ᵀ ] (p̂ - p*) + ∂V/∂p (x,p̂) \dot{p̂} .    (86)
2.1 Case: V Independent of p, i.e., V(x,p) = V(x)

It follows that in the particular case where the function V does not depend on p, by choosing:

\dot{p̂} = [ ∂V/∂x (x) A(x, u_n(x,p̂)) ]ᵀ ,   p̂(0) ∈ Π ,    (87)

we are guaranteed that W remains bounded, and in particular:

V(x(t)) ≤ V(x(0)) + (1/2) ‖p̂(0) - p*‖² .    (88)
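The mechanism (84)-(88) can be sketched on a hypothetical scalar member of (S_p), ẋ = u + p*φ(x) with φ(x) = x², V(x) = x² and u_n(x,p) = -x - pφ(x); the update (87) then reads \dot{p̂} = 2xφ(x), and W = V + (1/2)(p̂ - p*)² satisfies Ẇ = -2x². All numerical values below are illustrative assumptions.

```python
import numpy as np

# Sketch of the Lyapunov design (84)-(88) on a hypothetical scalar system
# x' = u + p* phi(x), phi(x) = x**2, V(x) = x**2, u_n(x, p) = -x - p*phi(x).
phi = lambda x: x**2
p_star = 1.5
dt, T = 1e-4, 20.0
x, p_hat = 0.8, 0.0
W0 = x**2 + 0.5 * (p_hat - p_star)**2
W_max = W0
for _ in range(int(T / dt)):
    u = -x - p_hat * phi(x)          # certainty-equivalence control (84)
    dx = u + p_star * phi(x)
    dp = 2.0 * x * phi(x)            # gradient update, the analogue of (87)
    x += dt * dx
    p_hat += dt * dp
    W_max = max(W_max, x**2 + 0.5 * (p_hat - p_star)**2)
print(abs(x), W_max - W0)
```

As predicted by (88), W never exceeds its initial value, and x converges to 0 even though p̂ need not converge to p*.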
However, this boundedness property is not sufficient, since to use assumptions BO and PRS we must check that for all t:
1. the following inequality is satisfied:

V(x(t)) < α_0 ,    (89)

2. the following membership property is satisfied:

p̂(t) ∈ Π .    (90)
It is straightforward to see that, if x(t) and p̂(t) are continuous functions of t and the following function W:

W(x(t),p̂(t)) = α_1 V(x(t)) / ( α_1 - V(x(t)) ) + (1/2) ‖p̂(t) - p*‖²    (91)

is positive and bounded for all t, then necessarily V(x(t)) is strictly smaller than α_1 for all t. Consequently, (89) is satisfied for all t if we can ensure that the modified function W(x(t),p̂(t)) in (91) is positive and bounded for all t and the constant α_1 is chosen smaller than or equal to α_0. In order to meet the membership property (90), we shall constrain p̂ to remain in a closed convex subset of Π by projection. For this, we need the following property of the set Π:

Assumption ICS (Imbedded Convex Sets)    (92)
There exists a known convex C² function 𝒫 from ℝ^l to ℝ such that:
1. for each real number A in [0, 1], the set:
/'/x = {P IP(P) < ,~ }
(93)
is contained in II, e. there exists a strictly positive constant d such that:
I[O-~-(p)ll >__ d
VpE {plO<'P(p)},
(94)
3. the parameter vector p* of the system to be actually controlled satisfies:

𝒫(p*) < 0   and   D* := dist( p* , { p | 𝒫(p) = 0 } ) > 0 .    (95)
Example: System (17) Continued    (96)
For the system (17) rewritten as (50) in Example (49), the set Π is defined by p_2 > 0. Then, by choosing:

𝒫(p_1,p_2) = 2 ( 1 - p_2 / ε ) ,   with ε > 0 ,    (97)

this set Π satisfies assumption ICS if p_2* > ε. More generally, consider the case where the set Π is:

p = (p_1,…,p_l)ᵀ ∈ Π   ⟺   |p_i - p̄_i| < σ_i ,  ∀i ∈ {1,…,l} ,    (98)

with p̄_i and σ_i some given real numbers. To meet assumption ICS we may choose the function 𝒫 as:

𝒫(p) = 1 + (1/ε) ( Σ_{i=1}^{l} |p_i - p̄_i|^q / σ_i^q - 1 ) ,    (99)

with 0 < ε < 1 and q ≥ 2 two real numbers. In this case, we get:

Π_λ = { p | Σ_{i=1}^{l} |p_i - p̄_i|^q / σ_i^q ≤ 1 - ε (1 - λ) }    (100)

and the set Π_λ, for λ = 1, approaches Π when ε decreases and q increases. □
Assumption ICS allows us to define the closed convex subset Π_1 of Π as:

Π_1 = { p | 𝒫(p) ≤ 1 } ,    (101)

and the function Proj as:

Proj(M,p,y) = y ,   if 𝒫(p) ≤ 0 or ∂𝒫/∂p(p) y ≤ 0 ,
Proj(M,p,y) = y - 𝒫(p) [ M (∂𝒫/∂p(p))ᵀ ∂𝒫/∂p(p) y ] / [ ∂𝒫/∂p(p) M (∂𝒫/∂p(p))ᵀ ] ,   if 𝒫(p) ≥ 0 and ∂𝒫/∂p(p) y > 0 ,    (102)

where M is a symmetric positive definite l × l matrix. Namely, Proj(M,p,y) is equal to y if p belongs to the set {𝒫(p) ≤ 0}. In the set {0 ≤ 𝒫(p) ≤ 1}, it subtracts a vector M-normal to the boundary {𝒫(p) = λ}, so that we get a smooth transformation from the original vector field for λ = 0 to an inward or tangent vector field for λ = 1. We have the following technical properties, proved in Appendix A:
Lemma    (103)
Let ℳ be the open set of symmetric positive definite l × l matrices. If assumption ICS holds, then:
1. The function Proj(M,p,y) : ℳ × Π × ℝ^l → ℝ^l is locally Lipschitz-continuous.
2. Proj(M,p,y)ᵀ M⁻¹ Proj(M,p,y) ≤ yᵀ M⁻¹ y ,   ∀p ∈ Π_1 .
3. ∂𝒫/∂p(p) (p - p*) ≥ D* ‖∂𝒫/∂p(p)‖ ,   ∀p : 𝒫(p) ≥ 0 , with D* defined in (95).
4. (p - p*)ᵀ M⁻¹ Proj(M,p,y) ≤ (p - p*)ᵀ M⁻¹ y .
5. Let (M,y) : ℝ_+ → ℳ × ℝ^l be a C¹ time function. On their domain of definition, the solutions of:

\dot{p̂} = Proj(M(t), p̂, y(t)) ,   p̂(0) ∈ Π_1    (104)
satisfy p̂(t) ∈ Π_1.

We are now ready to propose the following dynamic controller to solve the Adaptive Stabilization problem when the function V does not depend on p:

\dot{p̂} = Proj( I , p̂ , ( α_1² / (V(x) - α_1)² ) [ ∂V/∂x (x) A(x, u_n(x,p̂)) ]ᵀ ) ,
u = u_n(x,p̂) ,    (105)

where p̂(0) is selected in Π_1 and the matrix M used in the function Proj is I, the identity matrix. We have:

Proposition    (106)
Let assumptions BO, PRS and ICS hold with a function V not depending on p. Assume also that assumption A-LP is satisfied with:

Λ(x,t) = ∂V/∂x (x) .    (107)

If, in (105), α_1 is chosen smaller than or equal to α_0, then all the solutions (x(t),p̂(t)) of (S_{p*})-(105) with x(0) in Ω_0 and V(x(0)) < α_1 are well-defined on [0,+∞), unique, bounded and:

lim_{t→∞} V(x(t)) = 0 .    (108)
It follows that the Adaptive Stabilization problem is solved if assumption CO also holds.
Proof. The system we consider is:

\dot x = a(x, u_n(x,p̂)) + A(x, u_n(x,p̂)) p* .    (109)
From our smoothness assumptions on the functions a, A, u_n and V and with Point 1 of Lemma (103), this system has a locally Lipschitz-continuous right-hand side in the open set defined by:

(x,p̂) ∈ Ω × Π   and   V(x) < α_1 .    (110)

It follows that, for any initial condition in this open set, there exists a unique solution (x(t),p̂(t)), defined on a right maximal interval [0,T), with T possibly infinite, and satisfying (110) for all t in [0,T). Applying Point 5 of Lemma (103), we also know that p̂(t) ∈ Π_1 for all t in [0,T). Then, let us compute the time derivative of W defined in (91) along such a solution. With assumption PRS, we get as in (86):

\dot W ≤ [ - ( α_1² / (V(x(t)) - α_1)² ) ∂V/∂x (x(t)) A(x(t), u_n(x(t),p̂(t))) + \dot{p̂}(t)ᵀ ] ( p̂(t) - p* ) ,    (111)

with a strict inequality if V(x(t)) ≠ 0. But, with the expression of \dot{p̂} and Point 4 of Lemma (103), we readily get:

\dot W ≤ 0   if V(x(t)) = 0 ,
\dot W < 0   if V(x(t)) ≠ 0 .    (112)
< 0 if V(x(t)) # O. It follows that for all t in [0,T): v(.(0) < ~i ~lV(~(t)) ~
<
1
~I v(0)
- v(x(0) - ~1 -
v(o)
+
~
I1~(o)
-
p*ll ~
def =
(113)
ll~(0-p'll 2 < 2~.
Hence, we get: ~l ~ d e f - a < ~x _< a0, (114) al + 3 and we know that ~(t) E/C, where/C is the following compact subset of ]7:
V(z(t)) _<
= {, Ill,-
p'II ~ <
1"1
(115)
Then, from assumption BO, we know the existence of a compact subset F of/2 such that: x(0 e r
Vt e [0,T).
(116)
Hence, the solution remains in a compact subset of the open set defined in (110). It follows by contradiction that T = +∞ and, in particular, that the time functions x(t), p̂(t), u(t) = u_n(x(t),p̂(t)) and \dot{p̂}(t) are bounded on [0,+∞). Then (108) is a straightforward consequence of (112) and LaSalle's Theorem [5, Theorem X.1.3]. The conclusion follows readily from assumption CO. □
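The projection used in (105) and the inequalities of Lemma (103) can be exercised numerically. The sketch below implements the form of (102) reconstructed above (an assumption; cf. the Pomet-Praly parameter projection), with the function 𝒫 of (97) and the hypothetical values ε = 1, p* = (0,2) and a fixed M:

```python
import numpy as np

eps = 1.0                            # the constant of (97); hypothetical value
def P_fun(p):                        # P(p1, p2) = 2 (1 - p2 / eps), as in (97)
    return 2.0 * (1.0 - p[1] / eps)

def grad_P(p):
    return np.array([0.0, -2.0 / eps])

def proj(M, p, y):
    """Form consistent with (102): subtract an M-normal component, scaled by
    P(p), when P(p) >= 0 and dP/dp(p) y > 0."""
    g = grad_P(p)
    if P_fun(p) <= 0.0 or g @ y <= 0.0:
        return y
    Mg = M @ g
    return y - P_fun(p) * Mg * (g @ y) / (g @ Mg)

rng = np.random.default_rng(1)
M = np.array([[2.0, 0.3], [0.3, 1.0]])   # symmetric positive definite
Minv = np.linalg.inv(M)
p_star = np.array([0.0, 2.0])            # P(p_star) = -2 < 0, as in (95)
worst2 = worst4 = -np.inf
for _ in range(200):                     # Points 2 and 4 of Lemma (103)
    p = np.array([rng.normal(), rng.uniform(0.5 * eps, 2.0 * eps)])
    y = rng.normal(size=2)
    z = proj(M, p, y)
    worst2 = max(worst2, z @ Minv @ z - y @ Minv @ y)
    worst4 = max(worst4, (p - p_star) @ Minv @ (z - y))
print(worst2, worst4)

# Point 5: p' = Proj(M, p, y(t)) keeps p in Pi_1 = {P(p) <= 1}, i.e.
# p2 >= eps / 2, even though y pushes p2 downward.
p = np.array([0.0, 2.0])
dt = 1e-3
low = p[1]
for _ in range(5000):
    p = p + dt * proj(np.eye(2), p, np.array([0.0, -1.0]))
    low = min(low, p[1])
print(low)
```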
Example:    (117)
Consider the following system:

\dot x = - p* x³ + (1 - x²) u ,    (118)

with p* positive. We wish to stabilize the equilibrium point x = 0. Clearly, even if p* were known, this would not be possible globally, but only for x ∈ (-1,1). Then let Ω = (-1,1) and choose:

V(x) = x²   and   u_n(x,p) = ( p x³ - x ) / ( 1 - x² ) .    (119)

Assumptions A-LP, BO, PRS and CO are satisfied with α_0 = 1. According to Proposition (106), the following dynamic controller guarantees the convergence of x(t) to 0 for all initial conditions p̂(0) and x(0), with |x(0)| < 1:

\dot{p̂} = - 2 x⁴ / (1 - x²)² ,
u = ( p̂ x³ - x ) / ( 1 - x² ) .    (120)  □
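A simulation sketch of this controller, with the hypothetical values p* = 2, x(0) = 0.6, p̂(0) = 0; the state must remain in Ω = (-1,1), and the weight 1/(1 - x²)² in (120) blows up near the boundary:

```python
import numpy as np

# Simulation sketch of the dynamic controller (120) for system (118).
# p_star, x(0) and p_hat(0) are hypothetical values, not taken from the text.
p_star = 2.0
x, p_hat = 0.6, 0.0
dt, T = 1e-3, 20.0
x_max = abs(x)
for _ in range(int(T / dt)):
    u = (p_hat * x**3 - x) / (1.0 - x**2)     # control law of (120)
    dx = -p_star * x**3 + (1.0 - x**2) * u    # plant (118)
    dp = -2.0 * x**4 / (1.0 - x**2)**2        # adaptation law of (120)
    x += dt * dx
    p_hat += dt * dp
    x_max = max(x_max, abs(x))
print(x_max, abs(x))
```

In closed loop ẋ = -x + (p̂ - p*) x³, so with these values x decays monotonically and never approaches the boundary of Ω.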
Compared with Proposition (76), Proposition (106) states that, for solving the Adaptive Stabilization problem when p* is unknown, the existence of a function V independent of p is a sufficient condition. This shows that we should look for a control law u_n for which we can find such a function V meeting assumptions BO and PRS (see [19, Chapter 2]). This is where the fact that V need not be radially unbounded proves to be useful, as we now illustrate:

Example: System (17) Continued    (121)
For the system (17) rewritten as (50) in Example (49), we have established that assumptions A-LP, BO, PRS and ICS hold when Λ, V and u_n are chosen as:

Λ(x_1,x_2,t) = ( x_1  0 ) ,   V(x_1,x_2,p_1,p_2) = (1/2) x_1² ,    (122)

and:

u_n(x_1,x_2,p_1,p_2) = - ( x_2 + p_1 + x_1 ) / p_2 .    (123)

By specializing the dynamic controller (105) to this system (50) and choosing α_1 = +∞ and 𝒫 as in (97), the following controller solves the Adaptive Stabilization problem:
\dot{p̂}_1 = x_1 ,
\dot{p̂}_2 = - x_1 ( x_2 + p̂_1 + x_1 ) / p̂_2                     if p̂_2 ≥ ε or x_1 ( x_2 + p̂_1 + x_1 ) ≤ 0 ,
\dot{p̂}_2 = - ( 2 p̂_2 / ε - 1 ) x_1 ( x_2 + p̂_1 + x_1 ) / p̂_2   if p̂_2 < ε and x_1 ( x_2 + p̂_1 + x_1 ) > 0 ,
u = - ( x_2 + p̂_1 + x_1 ) / p̂_2 .    (124)  □
Proposition (106) generalizes results established by Sastry and Isidori [29] for systems that are input-output linearizable via state feedback, and by Taylor et al. [31] for state-feedback linearizable systems:

Corollary [Sastry and Isidori [29], Relative Degree One]    (125)
Let a and A in equation (S_p) be affine in u, let the system to be controlled have a single input, i.e., m = 1, and, finally, let Π be an open subset of ℝ^l which satisfies assumption ICS. Assume the existence of two C² functions h : ℝ^n → ℝ and φ : ℝ^n → ℝ^{n-1} such that:
1. h(x_e) = 0,
2. the functions ∂h/∂x (x) a(x,u) and ∂h/∂x (x) A(x,u) are known,
3. we have, for all (x,p) in ℝ^n × Π (with notation (13)):

∂h/∂x (x) ( b(x) + B(x) p ) ≠ 0 ,    (126)

4. (h(x), φ(x)) is a diffeomorphism and defines new coordinates with which the system (S_{p*}) can be rewritten as:

\dot h = ∂h/∂x (x) [ a(x,u) + A(x,u) p* ] ,    (127)
\dot φ = Z(φ, h) ,    (128)

where Z is a function which is further assumed to be globally Lipschitz. Assume also that φ(x_e) is an exponentially stable equilibrium point in ℝ^{n-1} of:

\dot φ = Z(φ, 0) .    (129)

Under these conditions, we can find functions V, u_n and 𝒫 such that the corresponding dynamic controller (105) solves the Adaptive Stabilization problem.
Assumption (126) means that each system (S_p), with p ∈ Π, is of relative degree one with respect to the output function h (see Isidori [6]), and the exponential stability of (129) implies that (S_{p*}) is globally exponentially minimum phase.
Proof. Assumption A-LP is met with the choice:

Λ(x,t) = ∂h/∂x (x) .    (130)

Then, let:

Ω = Ω_0 = ℝ^n ,   α_0 = +∞ ,    (131)

and choose the function V independent of p as simply:

V(x) = h(x)² .    (132)

According to [29, Proposition 2.1], assumption BO is satisfied. An argument similar to the one used in Example (59) (replace K by ξ in [29, (2.27)]) proves that assumption CO holds also.
Now, from (126), there exists a C¹ function u_n : ℝ^n × Π → ℝ such that:

∂h/∂x (x) [ a(x, u_n(x,p)) + A(x, u_n(x,p)) p ] = - c h(x) ,    (133)

with c a strictly positive constant. With (127), this implies that assumption PRS holds. In conclusion, the controller (105) may be employed, with the function 𝒫 given by assumption ICS and α_1 = +∞. □
Corollary [Taylor et al. [31]]    (134)
Let, in equation (S_p), the functions a and A be known and affine in u, let Π be an open subset of ℝ^l which satisfies assumption ICS and, finally, let p_0 be a known vector in Π. Assume there exist an open neighborhood Ω of x_e in ℝ^n and three known functions:

Φ : Ω → ℝ^n of class C², which is a diffeomorphism,
w_1 : Ω → ℝ^m of class C¹, and
w_2 : Ω → GL(m,ℝ) of class C¹,

such that:
1. by letting:

φ = Φ(x)   and   u = w_1(x) + w_2(x) θ ,    (135)

the time derivative of φ along the solutions of (S_{p_0}) satisfies, for all θ in ℝ^m:

\dot φ = C φ + D θ ,    (136)

where D is an n × m matrix and C is an n × n matrix satisfying:

P C + Cᵀ P = - I ,    (137)

with P a symmetric positive definite matrix,
2. for all (x,p,u) in Ω × Π × ℝ^m, we have, with notation (13):

rank { b(x) + B(x) p } = m ,    (138)
[ A_0(x) + u ⊗ B(x) ] p ∈ span { b(x) + B(x) p_0 } .    (139)

Under these conditions, we can find functions V, u_n and 𝒫 and a constant α_1 such that the corresponding dynamic controller (105) solves the Adaptive Stabilization problem.

Note that (136) simply implies that u = w_1 + w_2 θ is a feedback linearizing control for the system (S_{p_0}) in the coordinates φ = Φ(x).
Proof. Under the above assumptions, it is proved in [31, Proposition 3] that:

Φ⁻¹(0) = x_e    (140)
and there exists a known C¹ function u_n : Ω × Π → ℝ^m such that, for all (x,p) in Ω × Π, we have:

∂Φ/∂x (x) [ a(x, u_n(x,p)) + A(x, u_n(x,p)) p ] = C Φ(x) .    (141)

It follows that assumption PRS is satisfied if we choose:

V(x) = Φ(x)ᵀ P Φ(x) .    (142)

To check that assumption BO is satisfied, let us define α_0 as the largest (possibly infinite) real number such that the set:

{ φ | φᵀ P φ < α_0 }

is contained in Φ(Ω). Since Φ(x_e) = 0, Φ is a diffeomorphism, and P is a positive definite matrix, the so-defined α_0 is strictly positive. Then the sets:

Γ_α = Φ⁻¹ { φ | φᵀ P φ ≤ α }    (143)

are compact subsets of Ω for all α < α_0. It follows that:

V(x) ≤ α   ⟹   x ∈ Γ_α ,    (144)

which implies that BO is satisfied with Ω_0 = Ω. Assumption CO holds since:

V(x(t)) → 0   ⟹   Φ(x(t)) → 0   ⟹   x(t) = Φ⁻¹(Φ(x(t))) → x_e .    (145)

Since assumption A-LP is also satisfied with the function Λ equal to the n × n identity matrix, the controller (105) may be employed, with the function 𝒫 given by assumption ICS and any α_1 such that 0 < α_1 ≤ α_0. □

One of the features of the result established by Taylor et al. [31] is that it gives a sufficient condition which guarantees the existence of a function V not depending on p while satisfying assumptions BO, PRS and CO. This condition is assumption (139), called the strict matching condition in [31, Assumption L]. It turns out that (with notation (13)), when B ≡ 0 and b(x_e) has full rank m, this condition is also the necessary and sufficient condition for the existence of a regular static state-feedback law:

u = c(x) + d(x) w + e(x) p ,    (146)

which decouples the state vector x from p, seen as a measured disturbance (see [6, Proposition 5.5.1 and Proposition 7.3.1]). As pointed out by Pomet in [19, Sect. 1.2.3], it follows that, by applying the control:

u = c(x) + d(x) w + e(x) p    (147)

to any system (S_p) (with input u), p ∈ Π, we obtain the particular system (S_{p_0}) (with input w). In fact, the same result holds if B ≢ 0:
Lemma [19, Théorème 2.2]    (148)
Let, in equation (S_p), the functions a and A be known and affine in u, let Π be an open subset of ℝ^l, let Ω be an open neighborhood of x_e in ℝ^n and, finally, let p_0 be a known vector in Π. Assume:

rank { b(x) + B(x) p } = m   ∀(x,p) ∈ Ω × Π .    (149)

Under these conditions, assumption (139), i.e.,

[ A_0(x) + u ⊗ B(x) ] p ∈ span { b(x) + B(x) p_0 }   ∀(x,p,u) ∈ Ω × Π × ℝ^m ,    (150)

is equivalent to the following proposition: There exist two known C¹ functions:

c : Ω × Π → ℝ^m ,   d : Ω × Π → GL(m,ℝ) ,

such that, for all (x,p,w) in Ω × Π × ℝ^m:

a(x,w) + A(x,w) p_0 = a( x , c(x,p) + d(x,p) w ) + A( x , c(x,p) + d(x,p) w ) p .    (151)

A straightforward consequence is:

Corollary [19, Proposition 3.13]    (152)
Let, in equation (S_p), the functions a and A be known and affine in u, let Π be an open subset of ℝ^l which satisfies assumption ICS and, finally, let p_0 be a known vector in Π. Assume there exists an open neighborhood Ω of x_e in ℝ^n such that:
1. rank { b(x) + B(x) p } = m   ∀(x,p) ∈ Ω × Π,
2. [ A_0(x) + u ⊗ B(x) ] p ∈ span { b(x) + B(x) p_0 }   ∀(x,p,u) ∈ Ω × Π × ℝ^m,
3. there exist two known functions:

V : Ω → ℝ_+ of class C² ,
u_0 : Ω → ℝ^m of class C¹ ,

such that the functions ∂V/∂x (x) a(x,u) and ∂V/∂x (x) A(x,u) are known, assumptions BO and CO hold and we have, for all x in Ω:

∂V/∂x (x) [ a(x, u_0(x)) + A(x, u_0(x)) p_0 ] ≤ 0 ,    (153)

where the inequality is strict if V(x) ≠ 0.
Under these conditions, we can find functions V, u_n and 𝒫 and a constant α_1 such that the corresponding dynamic controller (105) solves the Adaptive Stabilization problem.
Proof. Assumption A-LP holds with:

Λ(x,t) = ∂V/∂x (x) .    (154)

Then, since assumptions BO, CO and ICS are satisfied, it is sufficient to define a function u_n meeting assumption PRS. According to Lemma (148), let:

u_n(x,p) = c(x,p) + d(x,p) u_0(x) .    (155)

This is a C¹ function, and inequality (47) in assumption PRS is satisfied since (151) and (153) hold. □

Assumption (139) is very restrictive. Fortunately, (139) is not necessary for the existence of a function V independent of p, as illustrated below.
Example:    (156)
Consider the following system:

\dot x_1 = u ,
\dot x_2 = x_1 ,
\dot x_3 = x_2 + p* (x_3 + x_2)(x_1 + 2 x_2 + 2 x_3) .    (157)

We have:

a_0(x) = ( 0 , x_1 , x_2 )ᵀ ,   b(x) = ( 1 , 0 , 0 )ᵀ ,
A_0(x) = ( 0 , 0 , (x_3 + x_2)(x_1 + 2 x_2 + 2 x_3) )ᵀ   and   B(x) ≡ 0 .    (158)

Hence, even though b is of full rank 1, assumption (139) does not hold. Nevertheless, assumptions BO, PRS and CO are satisfied if we choose V independent of p as:

V(x_1,x_2,x_3) = x_3²/2 + (x_3 + x_2)²/2 + (x_1 + 2 x_2 + 2 x_3)²/2    (159)

and u_n as:

u_n(x_1,x_2,x_3,p) = - 3 x_3 - 5 x_2 - 3 x_1 - p (x_3 + x_2)(6 x_3 + 5 x_2 + 2 x_1) .    (160)

Indeed, V is positive definite and radially unbounded, and a straightforward computation leads to the following expression for the time derivative of V along the solutions of (157):

\dot V = - x_3² - (x_3 + x_2)²
       + (x_1 + 2 x_2 + 2 x_3) [ u + 2 x_1 + 3 x_2 + x_3 + p (x_3 + x_2)(6 x_3 + 5 x_2 + 2 x_1) ] .    (161)  □
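The identity (161), and the resulting negativity of V̇ under u = u_n, can be checked numerically at random points:

```python
import numpy as np

# Check of Example (156): along the solutions of (157) with the control u_n of
# (160), the derivative of the V of (159) equals
# -x3^2 - (x3+x2)^2 - (x1+2x2+2x3)^2, so PRS holds with V independent of p.
rng = np.random.default_rng(2)
err = 0.0
for _ in range(100):
    x1, x2, x3, p = rng.normal(size=4)
    T = x3 + x2
    S = x1 + 2*x2 + 2*x3
    un = -3*x3 - 5*x2 - 3*x1 - p * T * (6*x3 + 5*x2 + 2*x1)   # control (160)
    dx1, dx2, dx3 = un, x1, x2 + p * T * S                    # system (157)
    # chain rule on V = x3^2/2 + (x3+x2)^2/2 + (x1+2x2+2x3)^2/2
    dV = x3*dx3 + T*(dx3 + dx2) + S*(dx1 + 2*dx2 + 2*dx3)
    err = max(err, abs(dV + x3**2 + T**2 + S**2))
print(err)
```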
2.2 Case: V Dependent on p with a "Matching Condition"

When V depends on p, choosing \dot{p̂} as in (87) is not sufficient to guarantee that \dot W is negative when W is given by (85). This follows from the disturbing term ∂V/∂p (x,p̂) \dot{p̂} present in (86). To overcome this difficulty, a very fruitful idea proposed by Kanellakopoulos, Kokotovic and Marino [8] (see also Middleton and Goodwin [15]) is to compensate this measurable disturbing term by modifying the control to:

u = u_n(x,p̂) + v .    (162)

Indeed, in this case, (86) becomes:

\dot W ≤ [ - ∂V/∂x (x,p̂) A(x, u_n(x,p̂)+v) + \dot{p̂}ᵀ ] (p̂ - p*) + ∂V/∂p (x,p̂) \dot{p̂}
       + ∂V/∂x (x,p̂) [ a(x, u_n(x,p̂)+v) - a(x, u_n(x,p̂))
       + ( A(x, u_n(x,p̂)+v) - A(x, u_n(x,p̂)) ) p̂ ] .    (163)

Hence, taking:

\dot{p̂} = [ ∂V/∂x (x,p̂) A(x, u_n(x,p̂)+v) ]ᵀ ,   p̂(0) ∈ Π ,    (164)

and choosing v as a solution of:

0 = ∂V/∂p (x,p̂) \dot{p̂} + ∂V/∂x (x,p̂) [ a(x, u_n(x,p̂)+v) - a(x, u_n(x,p̂))
    + ( A(x, u_n(x,p̂)+v) - A(x, u_n(x,p̂)) ) p̂ ] ,    (165)

we guarantee that \dot W is negative. However, a new difficulty may arise from the fact that (164) and (165) is a system of implicit equations to be solved in \dot{p̂} and v. A solution smoothly depending on x and p̂ may not exist. Pomet [19] (see also [21]) has proposed the following way to avoid this difficulty when a and A are affine in u: With the control u as in (162), i.e.,

u = u_n(x,p̂) + v ,    (166)

we may embed the family of systems (S_p) into the following larger family (S_{p,q}) (with notation (13)):

\dot x = a_0 + (u_n + v) ⊗ b + (A_0 + u_n ⊗ B) p + v ⊗ B q ,   (p,q) ∈ Π × Π .    (S_{p,q})
Then, we modify the function W to:

W(x,p̂,q̂) = V(x,p̂) + (1/2) ‖p̂ - p*‖² + (1/2) ‖q̂ - p*‖² .    (167)

As above, the time derivative of this function W is made negative if we choose:

\dot{p̂} = [ ∂V/∂x (x,p̂) ( A_0(x) + u_n(x,p̂) ⊗ B(x) ) ]ᵀ ,   p̂(0) ∈ Π ,
\dot{q̂} = [ ∂V/∂x (x,p̂) v(x,p̂,q̂,\dot{p̂}) ⊗ B(x) ]ᵀ ,   q̂(0) ∈ Π .    (168)

This dynamic controller is well-defined with unique solutions if we assume:

Assumption MC (Matching Condition)    (169)
The functions a and A are affine in u and there exists a known C¹ function v(x,p,q,θ) from Ω × Π × Π × ℝ^l to ℝ^m satisfying:

∂V/∂p (x,p) θ + ∂V/∂x (x,p) v ⊗ [ b(x) + B(x) q ] = 0 .    (170)
We remark that, if V is independent of p, this assumption is trivially satisfied (by v = 0). Also, the way this assumption is stated is too restrictive for our purpose. In order to make \dot W negative, it is sufficient that (170) be satisfied for the particular case θ = \dot{p̂}, and not for all θ ∈ ℝ^l. We have stated assumption MC in these terms to allow us independent choices for the function V and for \dot{p̂}. Indeed, relaxing (170) by replacing θ by \dot{p̂} implies that V, u_n and \dot{p̂} must be designed all together so that BO, CO, PRS and MC are satisfied at the same time.

Example:    (171)
Consider the system, with n = 2 and l = 1:

\dot x_1 = u ,
\dot x_2 = x_2² ( x_1 + p x_2² ) .    (172)

Assumptions BO, PRS and CO are satisfied with:

V(x_1,x_2,p) = x_2² + ( x_1 + x_2 + p x_2² )² ,    (173)
u_n(x_1,x_2,p) = - x_2 - x_2³ - ( x_1 + p x_2² ) ( 1 + x_2² (1 + 2 p x_2) ) .    (174)

Assumption A-LP is also satisfied with:

Λ(x_1,x_2,t) = ∂V/∂x ( x_1, x_2, p̂(t) ) .    (175)

Finally, assumption MC holds since in this case equation (170) is:

2 x_2² ( x_1 + x_2 + p x_2² ) θ + 2 ( x_1 + x_2 + p x_2² ) v = 0 ,    (176)
which is satisfied by the choice v = - x_2² θ .    (177)
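Both the PRS identity behind (173)-(174) (namely V̇ = -2x_2⁴ - 2(x_1+x_2+p x_2²)² along (172) with u = u_n) and the matching condition (176)-(177) can be checked pointwise:

```python
import numpy as np

# Check of Example (171): with u = u_n from (174), the derivative of V from
# (173) along (172) is -2*x2^4 - 2*(x1+x2+p*x2^2)^2, and v = -x2^2 * theta
# from (177) solves the matching condition (176).
rng = np.random.default_rng(3)
err_prs = err_mc = 0.0
for _ in range(100):
    x1, x2, p, theta = rng.normal(size=4)
    s = x1 + x2 + p * x2**2
    un = -x2 - x2**3 - (x1 + p * x2**2) * (1.0 + x2**2 * (1.0 + 2*p*x2))
    dx1, dx2 = un, x2**2 * (x1 + p * x2**2)        # system (172)
    dV = 2*x2*dx2 + 2*s*(dx1 + (1.0 + 2*p*x2)*dx2) # chain rule on (173)
    err_prs = max(err_prs, abs(dV + 2*x2**4 + 2*s**2))
    v = -x2**2 * theta                             # the choice (177)
    mc = 2*x2**2 * s * theta + 2*s*v               # left-hand side of (176)
    err_mc = max(err_mc, abs(mc))
print(err_prs, err_mc)
```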
Finally, to guarantee that p̂(t) and q̂(t) remain in Π and that V(x(t),p̂(t)) remains smaller than α_0, we have to modify further the controller (168) as in the previous section:

\dot{p̂} = Proj( I , p̂ , ( α_1² / (V(x,p̂) - α_1)² ) [ ∂V/∂x (x,p̂) ( A_0(x) + u_n(x,p̂) ⊗ B(x) ) ]ᵀ ) ,
\dot{q̂} = Proj( I , q̂ , ( α_1² / (V(x,p̂) - α_1)² ) [ ∂V/∂x (x,p̂) v(x,p̂,q̂,\dot{p̂}) ⊗ B(x) ]ᵀ ) ,
u = u_n(x,p̂) + v(x,p̂,q̂,\dot{p̂}) ,    (178)

with p̂(0) and q̂(0) in Π_1. We have:

Proposition    (179)
Let assumptions BO, PRS, ICS and MC hold. Assume also that assumption A-LP is satisfied with:

Λ(x,t) = ∂V/∂x ( x , p̂(t) ) .    (180)

If, in (178), α_1 is chosen smaller than or equal to α_0, then all the solutions (x(t),p̂(t),q̂(t)) of (S_{p*})-(178), with x(0) in Ω_0 and V(x(0),p̂(0)) < α_1, are well-defined on [0,+∞), unique, bounded and:

lim_{t→∞} V(x(t),p̂(t)) = 0 .    (181)
It follows that the Adaptive Stabilization problem is solved if assumption CO holds also.
Proof. The system we consider is the closed-loop system (S_{p*})-(178), with p̂(0) and q̂(0) in Π_1.    (182)
From our smoothness assumptions on the functions a, A, u_n and V and with Point 1 of Lemma (103), this system has a locally Lipschitz-continuous right-hand side in the open set defined by:

(x,p̂,q̂) ∈ Ω × Π × Π   and   V(x,p̂) < α_1 .    (183)
It follows that, for any initialcondition in this open set, there exists a unique solution (z(t),p~t),~t)), defined on a right maximal interval [0,T), with T possibly infinite,and satisfying(183) for all t in [0,T). Applying Point 5 of L e m m a (I03), we also know that ~(t) e//I and ~t) E//I for all t in [0,T). Then, let us compute the time derivative,along such a solution, of the function W defined by: alV(z,~) + 1 1 W(z,~) = oq-V(x,~) 2 I[P-P'[I2 + 2 Ilq-P*[12 " (184) With assumption PRS, Point 4 of Lemma (103) and equation (170) with 0 = satisfied by v (z, ~', ~',/1~']' we get:
I)v"_< 0 if V(z(t),~(t))=0
(185)
< 0 if V(z(t),~(t)) # 0. From there, we conclude exactly as in the proof of Proposition (106).
n
Example:    (186)
The assumptions of Proposition (179) are satisfied by the system (172) of Example (171). For this system, the Adaptive Stabilization problem is solved by the following dynamic controller (using (164), (173) and (177)):

\dot{p̂} = 2 [ x_2 + ( x_1 + x_2 + p̂ x_2² )( 1 + 2 p̂ x_2 ) ] x_2⁴ ,
u = - x_2 - x_2³ - ( x_1 + p̂ x_2² )( 1 + x_2² (1 + 2 p̂ x_2) )
    - 2 x_2² [ x_2 + ( x_1 + x_2 + p̂ x_2² )( 1 + 2 p̂ x_2 ) ] x_2⁴ .    (187)  □

Proposition (179) generalizes to the case where V is not radially unbounded a result established by Kanellakopoulos, Kokotovic and Marino for state-feedback linearizable systems [8]:

Corollary [Kanellakopoulos, Kokotovic and Marino [8]]    (188)
Let, in equation (S_p), the functions a and A be known and affine in u, and let p_0 be a known vector in ℝ^l. Assume there exist a bounded open subset Ω of ℝ^n, an open neighborhood Π of p_0 in ℝ^l and three known functions:

Ψ : Ω → ℝ^n of class C², which is a diffeomorphism,
w_1 : Ω → ℝ^m of class C¹, and
w_2 : Ω → GL(m,ℝ) of class C¹,

such that:
Praly, Bastin, Pomet, and Jiang
1. by letting:

ξ = Ψ(x)   and   u = w₁(x) + w₂(x) θ ,   (189)

the time derivative of ξ along the solutions of (S_{p₀}) satisfies, for all θ in ℝᵐ:

ξ̇ = Cξ + Dθ ,   (190)

where D is an n × m matrix and C is an n × n matrix satisfying:

PC + CᵀP = −I ,   (191)

with P a symmetric positive definite matrix,

2. for all (x,p,u) in Ω × Π × ℝᵐ, we have, with notation (13):

rank { b(x) + B(x)p } = m   (192)

u ⊗ B(x)p* ∈ span { b(x) + B(x)p₀ }
A₀(x)p* ∈ span { b(x) + B(x)p₀ , [a₀ + A₀p₀ , span{b + Bp₀}](x) } ,   (193)

where [·,·] denotes the Lie bracket. Under these conditions, and if p* is close enough to p₀, we can find functions V, u_n and v and a constant α₁ such that the corresponding dynamic controller (178) solves the Adaptive Stabilization problem.

Proof. From the sections "State diffeomorphism" and "Ideal feedback control" in [8] (see also [9]), there exist an open neighborhood Π_d of p₀ and three known functions:

Φ : Ω × Π_d → ℝⁿ , of class C², a diffeomorphism for each p,
u_n : Ω × Π_d → ℝᵐ , of class C¹, and
v : Ω × Π_d × Π_d × ℝˡ → ℝᵐ , of class C¹,

such that x̄ belongs to Ω, Φ(x̄,p*) = 0,

Φ(x,p₀) = Ψ(x)   ∀x ∈ Ω ,   (194)

and, for all (x,p,q,θ) in Ω × Π_d × Π_d × ℝˡ, we have:

(∂Φ/∂x)(x,p) [ a(x,u_n(x,p)) + A(x,u_n(x,p)) p ] = C Φ(x,p)   (195)

(∂Φ/∂p)(x,p) θ + (∂Φ/∂x)(x,p) v ⊗ [ b(x) + B(x)q ] = 0 .   (196)

Then, let us choose the function V as:

V(x,p) = Φ(x,p)ᵀ P Φ(x,p) .   (197)
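The quadratic form (197) relies on the Lyapunov matrix P of (191), which solves PC + CᵀP = −I for a stable C. As a minimal numerical sketch (the matrix C below is an arbitrary stable example, not the one of the corollary), P can be obtained by integrating P = ∫₀^∞ exp(Cᵀt) exp(Ct) dt:

```python
def lyapunov_P(C, T=40.0, dt=1e-3):
    # Integrate M(t) = exp(C^T t) exp(C t), which obeys dM/dt = C^T M + M C,
    # M(0) = I; then P = integral of M satisfies C^T P + P C = M(T) - I ~ -I
    # for a stable C (M(T) is negligible for large T).
    n = len(C)
    M = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    P = [[0.0] * n for _ in range(n)]
    for _ in range(int(T / dt)):
        dM = [[sum(C[k][i] * M[k][j] for k in range(n)) +
               sum(M[i][k] * C[k][j] for k in range(n))
               for j in range(n)] for i in range(n)]
        for i in range(n):
            for j in range(n):
                P[i][j] += M[i][j] * dt
                M[i][j] += dM[i][j] * dt
    return P

# Hypothetical stable matrix (eigenvalues with negative real parts).
C = [[0.0, 1.0], [-1.0, -1.0]]
P = lyapunov_P(C)
# Residual of C^T P + P C + I should be small.
n = 2
R = [[sum(C[k][i] * P[k][j] for k in range(n)) +
      sum(P[i][k] * C[k][j] for k in range(n)) +
      (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
```

For this example the exact solution is P = [[1.5, 0.5], [0.5, 1.0]], which the integral reproduces up to discretization error.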
Adaptive Stabilization of Nonlinear Systems In order to choose the s e t / / a n d
375
the scalar or0, let us define a function F by: (198)
F : l'2 x /7 d --* IR.n x /T d .
p)
This function is a diffeomorphism satisfying F(gf-*(0),p0) = (0,p0). Since/2 x /-/d is an open neighborhood of (gf-l(0),p0) and p* is assumed to be close enough from Po, there exist strictly positive real numbers a0 and lr such that: llp"-poll 2 <
~"
(199)
'
with 0 < c < l, and the set:
Iltp-p011' <
and
< -0}
is contained in F (12 x l i d ) and contains £. Then, let us define the function P by: (200) V(p) = ~2 [ -l ~- e , ] p - p 0 . , ' - 1] , and the s e t / / b y : /7 = {p IV(p) < l + e } . (201) Assumption ICS is satisfied. From (195) and (196), it is clear that assumptions PRS and MC are satisfied. To check that assumption BO is satisfied, we remark that for all compact subsets /C of 17 and all a < or0, the sets: {(z,p) ] ~ ( z , p ) W p c ( z , p )
< a
and
p e ~:}
are compact subsets of f2 x / 7 and therefore their projections: r~,~ = {z 13p e ~:: ~ ( z , p ) T p ~ ( x , p )
< a
}
(202)
z 6 Fc,,~:,
(203)
are compact subsets of 1'2. It follows that: V(z,p)
< a
and
p 6 K:
==~
which implies that BO is satisfied with f20 = 1"2. The proof that assumption CO holds follows from the remark: v(zct),~(t))
~
0
= = ¢ , ~ ( z ( t , p " ( t ) ) ) --* 0 = ~(£,p*)
(204)
and the proof of [9, Theorem 1] (see also Example (68)). Since assumption A-LP is also satisfied with the function A equal to the n x n identity matrix, the controller (178) applies. 13 One of the nice results proved by Kanellakopoulos, Kokotovic and Marino is that for feedback linearizable systems, assumption (193), called the extended matching condition in [8, Assumption E] is necessary and sufficient for assumption MC to hold. In fact, as for the case where V does not depend on p, this assumption (193) relies on the fact that we can transform any system (Sp) into a particular one (Spo) but this time by using both feedback and ditfeomorphism. Precisely, we have:
Lemma [19, Théorème 2.9 and Lemme 8.14] (205)
Let the functions a and A in equation (S_p) be affine in u and let U be an open neighborhood of x̄ in ℝⁿ. Assume that, in U (with notation (13)):
1. the distribution span{b + Bp*} is involutive with constant full rank m,
2. the distribution span{b + Bp*, [a₀ + A₀p*, span{b + Bp*}]} has constant rank.

Under these conditions, assumption (193), i.e.,

u ⊗ B(x)p* ∈ span { b(x) + B(x)p₀ }
A₀(x)p* ∈ span { b(x) + B(x)p₀ , [a₀ + A₀p₀ , span{b + Bp₀}](x) } ,   (206)

is equivalent to the following proposition: there exist an open neighborhood Ω of x̄, an open neighborhood Π of p*, a vector p₀ in Π and four smooth functions:

Φ : Ω × Π → ℝⁿ , of class C², a diffeomorphism for each p,
c : Ω × Π → ℝᵐ , of class C¹,
d : Ω × Π → GL(m,ℝ) , of class C¹, and
v : Ω × Π × Π × ℝˡ → ℝᵐ , of class C¹,

such that, for all (x,p,q,θ) in Ω × Π × Π × ℝˡ, we have:

(∂Φ/∂p)(x,p) θ + (∂Φ/∂x)(x,p) v ⊗ [ b(x) + B(x)q ] = 0 ,   (207)

and, for all (x,p,w) in Ω × Π × ℝᵐ, we have:

a(Φ(x,p),w) + A(Φ(x,p),w) p₀
   = (∂Φ/∂x)(x,p) [ a(x, c(x,p)+d(x,p)w) + A(x, c(x,p)+d(x,p)w) p ] .   (208)

Comparing (207) and (170) in assumption MC, we understand the importance of this result. It gives us a possible route for finding functions V and u_n satisfying all our assumptions BO, PRS, CO and MC. Indeed, if
1. the conditions of Lemma (205) are satisfied, and
2. for the particular system (S_{p₀}), with p₀ given by Lemma (205), there exist V₀ and u₀ satisfying assumptions BO, PRS and CO, but this time with the family (S_p) reduced to the single element (S_{p₀}), i.e., p = p* = p₀,

then, by choosing:

V(x,p) = V₀(Φ(x,p))
u_n(x,p) = c(x,p) + d(x,p) u₀(Φ(x,p)) ,   (209)
assumptions BO, PRS, CO and MC are necessarily satisfied. However, it is important to notice that finding a solution v in ℝᵐ to the n equations (207) is more difficult than finding a solution v in ℝᵐ to the single equation (170).

For single-input two-dimensional affine-in-u systems with b(x̄) ≠ 0 and B(x) ∈ span{b(x)}, the assumptions of Lemma (205) are "generically" satisfied. Indeed, we can expect that, for almost all x close to x̄, we have:

span { b(x) , [a₀,b](x) } = ℝ² .   (210)

However, the set of such x's may not be a neighborhood of x̄, implying that (207) may not hold. Nevertheless, d'Andréa-Novel, Pomet and Praly [1] have shown that, for this two-dimensional case, there are explicit expressions for the functions V, u_n and v satisfying assumptions BO, PRS and MC.

Compared with Proposition (76), Proposition (179) states that, when p* is unknown, the solution to the Adaptive Stabilization problem given by the Lyapunov design requires the Matching Condition (MC). And nothing is known if this condition does not hold.

Another but very particular possibility to handle the case where V depends on p is to choose:

W = α₁V(x,p*)/(α₁ − V(x,p*)) + (1/2)||p̂ − p*||²   and   u = u_n(x,p̂) .   (211)

This gives:

Ẇ = [p̂ − p*]ᵀ p̂̇ + (α₁²/(α₁ − V(x,p*))²) (∂V/∂x)(x,p*) [ a(x,u_n(x,p̂)) + A(x,u_n(x,p̂)) p* ] .   (212)

Then, using assumption PRS, we get:

Ẇ ≤ [p̂ − p*]ᵀ p̂̇ + (α₁²/(α₁ − V(x,p*))²) (∂V/∂x)(x,p*)
      × [ a(x,u_n(x,p̂)) − a(x,u_n(x,p*)) + ( A(x,u_n(x,p̂)) − A(x,u_n(x,p*)) ) p* ] .   (213)

There is no general expression of p̂̇ not depending on p* and making the right-hand side of this inequality negative. However, Slotine and Li [30] have shown that, in the particular case of rigid robot arms, it is possible to find functions V and u_n which satisfy assumptions BO, PRS and CO and are such that: there exists a C¹ function Z : Ω × Π → ℝˡ such that, for the particular value p*, but for all (x,p) ∈ Ω × Π, we have:

[p* − p]ᵀ Z(x,p) = (α₁²/(α₁ − V(x,p*))²) (∂V/∂x)(x,p)
      × [ a(x,u_n(x,p)) − a(x,u_n(x,p*)) + ( A(x,u_n(x,p)) − A(x,u_n(x,p*)) ) p* ] .   (214)

Indeed, with such a property, the Adaptive Stabilization problem is solved by choosing:

p̂̇ = Proj ( I , p̂ , Z(x,p̂) ) .   (215)
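All the update laws above are saturated by the operator Proj, whose defining properties (Lemma (103)) are that it keeps p̂ in Π and that projection can only decrease the correlation between the parameter error and the update direction. A minimal sketch, assuming Π is a ball of radius R (the actual Proj of definition (102) is smooth; this piecewise version only illustrates Points 4 and 5 of Lemma (103)):

```python
def proj(p, w, R):
    # Projection for a ball {||p|| <= R}: remove the outward radial part of the
    # update direction w when p is on the boundary and w points outward.
    n2 = sum(pi * pi for pi in p)
    radial = sum(pi * wi for pi, wi in zip(p, w))
    if n2 < R * R or radial <= 0.0:
        return list(w)
    return [wi - radial * pi / n2 for wi, pi in zip(w, p)]

# On the boundary with an outward w, the projected update is tangential ...
p = [3.0, 4.0]                      # ||p|| = 5 = R
w = [1.0, 1.0]                      # points outward (p.w = 7 > 0)
pw = proj(p, w, 5.0)
radial_after = sum(a * b for a, b in zip(p, pw))
# ... and for any p_star inside the ball,
# (p - p_star).proj(p, w) <= (p - p_star).w,
# the inequality used to cancel the projection terms in the W computations.
p_star = [1.0, -2.0]
lhs = sum((a - b) * c for a, b, c in zip(p, p_star, pw))
rhs = sum((a - b) * c for a, b, c in zip(p, p_star, w))
```

In the interior of the ball, proj returns w unchanged, so the update law coincides with the unconstrained one.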
3 Estimation Design

Another way to obtain dynamic controllers solving the Adaptive Stabilization problem is to hope that a separation principle holds, i.e., that we can get an estimate p̂ of p* by using a parameter estimator and, simultaneously, apply the control u_n with the unknown parameter vector p* replaced by its estimate p̂, i.e.,

u = u_n(x,p̂) .   (216)

For the estimation, we note that, thanks to assumption A-LP, equation (S_p) is linear in p, i.e., it can be rewritten as:

z = Z p ,   (217)

with:

z = ẋ − a(x,u)   and   Z = A(x,u) .   (218)

Also, the vector p* we want to estimate is constant, i.e., it is a solution of:

ṗ = 0 .   (219)

Estimating p* is then equivalent to observing, through the observation equation (217), the state vector p which obeys the (trivial) dynamic equation (219). From linear observer theory [7], we know that an observer can be written as:

p̂̇ = −K ( Z p̂ − z ) ,   (220)

with K the observer gain. Unfortunately, such an observer cannot be implemented, since it makes use of the unknown quantities ẋ, a(x,u) and A(x,u). This difficulty is handled as follows. About the unmeasured time derivative ẋ, it is quite clear that an integration should help us. It turns out that two ways of implementing this integration are fruitful:
1. equation error filtering, and
2. regressor filtering.
These techniques will be presented hereafter. To deal with the fact that only the functions Λa and ΛA are known, we select an integer k and a C¹ function h : Ω × Π → ℝᵏ such that the functions
(∂h/∂x)(x,p)a(x,u) and (∂h/∂x)(x,p)A(x,u) are known functions of (x,u,p), i.e., assumption A-LP is met with:

Λ(x,t) = (∂h/∂x)(x,p̂(t)) .   (221)

This function h is called the observation function. For any C¹ time functions p̂ : ℝ₊ → Π and u : ℝ₊ → ℝᵐ, the time derivative of h(x(t),p̂(t)), with x(t) a solution of (S_{p*}), satisfies:

ḣ = (∂h/∂x)(x,p̂) ( a(x,u) + A(x,u)p* ) + (∂h/∂p)(x,p̂) p̂̇ .   (222)

This is again an equation linear in p*, and equality (217) is again satisfied but with the following definitions of z and Z:

z = ḣ − (∂h/∂x)(x,p̂) a(x,u) − (∂h/∂p)(x,p̂) p̂̇   (223)
Z = (∂h/∂x)(x,p̂) A(x,u) .

Example: System (17) Continued (224)
For the system (17) rewritten as (50) in Example (49), let us choose the observation function h as:

h(x₁,x₂,p₁,p₂) = (1/2) x₁² .   (225)

We get:

ḣ(x₁(t),x₂(t),p₁(t),p₂(t)) = x₁(t) ( x₁³(t) + p₁* + p₂* u(t) ) .   (226)

By letting:

Z(t) = ( x₁(t)   x₁(t)u(t) ) ,   (227)

we obtain equation (217), i.e.,

z(t) = Z(t) (p₁*  p₂*)ᵀ .   (228)
□
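The computation in this example can be checked numerically: once z and Z are available, two samples of (227) already determine p*. A small sketch under the assumed dynamics ẋ₁ = x₁³ + p₁* + p₂*u of system (17):

```python
# Hypothetical true parameters of system (17): dx1/dt = x1^3 + p1* + p2* u.
p1s, p2s = 0.7, 1.3

def z_and_Z(x1, u):
    # z = hdot - (dh/dx1) * x1^3 = x1 * (p1* + p2* u),  Z = (x1, x1*u) -- (223)
    hdot = x1 * (x1 ** 3 + p1s + p2s * u)
    return hdot - x1 * x1 ** 3, (x1, x1 * u)

# Two samples with different controls give a 2x2 linear system Z p = z.
z_a, Z_a = z_and_Z(0.5, 1.0)
z_b, Z_b = z_and_Z(0.5, -1.0)
det = Z_a[0] * Z_b[1] - Z_a[1] * Z_b[0]
p1 = (z_a * Z_b[1] - Z_a[1] * z_b) / det
p2 = (Z_a[0] * z_b - z_a * Z_b[0]) / det
```

Of course, ḣ is not measurable; this is precisely what the filtering techniques of the next subsections remove.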
3.1 Equation Error Filtering

The estimator is obtained from the observer (220), where the so-called equation error Zp̂ − z is replaced by a filtered version. Namely, let e in ℝᵏ be defined as follows:

ė + r(e,x,p̂) e = Z p̂ − z ,   (229)

or equivalently, using (223):

ξ̇ = −r(e,x,p̂) e + (∂h/∂x)(x,p̂) a(x,u) + (∂h/∂x)(x,p̂) A(x,u) p̂ + (∂h/∂p)(x,p̂) p̂̇
e = ξ − h(x,p̂) ,   (230)

with r a positive C¹ function defined on ℝᵏ × Ω × Π. Clearly, e can be obtained without knowing ḣ, and the estimate p̂ is then given by:

p̂̇ = Proj ( I , p̂ , −K e ) ,   p̂(0) ∈ Π₁ ,   (231)

where the matrix M used in the function Proj is the identity matrix and, typically:

K = Zᵀ = [ (∂h/∂x)(x,p̂) A(x,u) ]ᵀ .   (232)
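A minimal simulation sketch of (229)-(232) on an assumed scalar example (ẋ = p* + u with u = −x, observation function h = x²/2, r ≡ 1, projection omitted since the estimate stays bounded here). It exhibits the two facts used below: the filtered error is implementable without ẋ, and ||p̂ − p*||² + ||e||² decreases, as stated in Lemma (235):

```python
# Equation error filtering (sketch): dx/dt = p* + u, h = x^2/2, so
# z = hdot - x*u = x p*, Z = x, and e obeys de/dt = -r e + Z (phat - p*).
ps, dt, T = 2.0, 0.002, 40.0
x, xi, phat, r = 0.0, 0.0, 0.0, 1.0
W0 = 0.5 * (phat - ps) ** 2            # e(0) = xi - h(x(0)) = 0
for _ in range(int(T / dt)):
    u = -x
    e = xi - 0.5 * x * x               # filtered equation error, no xdot needed
    xidot = -r * e + x * u + x * phat  # (230): -re + (dh/dx)a + (dh/dx)A phat
    phatdot = -x * e                   # (231)-(232) with K = Z^T = x
    x += (ps + u) * dt
    xi += xidot * dt
    phat += phatdot * dt
e = xi - 0.5 * x * x
W = 0.5 * (phat - ps) ** 2 + 0.5 * e * e
```

Along the exact solutions, d/dt[(p̂ − p*)²/2 + e²/2] = −re² ≤ 0; here x converges to p*, so Z stays away from zero and p̂ actually converges.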
Unfortunately, as in the Lyapunov design case, if, instead of (216), we implement the control:
u = u.(x,3) + , ,
(233)
where v depends on ~, we have again an implicit definition of ~ when v is explicitly involved in the right-hand side of (231). In such a case, ff the functions a and A are affine in u, we derive the estimator from equation (Sp,q) instead of equation (S~). Using notation (13), this leads to the following observer: Oh
e + ~ ( ~ , 3 ) [.0(~) + A0(~)3+ ~.(~,3) O (b(~) + B(~)3)]
+-~"x(x,3) v = "~ -
q = Pr o j
(b(z) + S(x)'q) + "~pp(X,3) ~
h(x,3)
p~"= P r o j / J,
P,
~"
I,~,
0
~
~
O
h A
^
;...
~
~(~,p),,(~,p,q,p)eB(~)
T
~
)
, ~o) e / / 1 .
(234) In the sequel, such a modified estimator will be implicitly assumed to be used whenever the Matching Condition holds. We have:
Adaptive Stabilization of Nonlinear Systems
381
Lepta (235) Assume ICS is satisfied. For any C 1 time fnnction u : I ~ -.* IRm, all the solutions (z(t),ff(t),h(t)) of (Sp.)-(g30)-(e31)-(e3B) defined on [0,T) with z(t) remaining in [2 satisfy for all t in [0,T):
1. ~t) e u~ e. I I ~ ( t ) - p ' l l ~ + Ile(t)ll 2 + 2
£
r Ilell' ___ I I ~ ( O ) - p ' l l ' + Ile(O)ll 2 •
Similarly for the solutions of (Sp.,p.)-(g3~t), we have, for all t in [0, T):
s. ~(t) e u ~ 4.
and ~ t ) ~ u ~
J II ~(t) - p*
,
+ I1~(0112 + 2
r flail 2 ~
II II2 ~(o) - p*
+ Ik(O)ll 2
II ~o) -v
I~,) v
Proof. Point 1 is a straightforward consequence of Point 5 of Lemma (103). For Point 2, let us denote T/the input of the dynamic system (229), i.e.,
= z~-
z.
(236)
This system with output e is passive, namely, it satisfies for all t in [0,T):
//
e%-
I'
e " (~ + re)
= 71 lie(011 ~ -
(237)
½ Ile(0)ll ~ + f '
r Ilell 2 >__ - ~ 1 Ile(0)ll 2 .
(238)
On the other hand, the dynamic system (231) with input e and output y defined by: (239) y = z(f-~) is also passive. Indeed, we have thanks to Point 4 of L e m m a (103):
/0' /0' y T e -.
(~__
p.)T (_zT)e
_
( ~ _ p.)T Vroj
>__
(~_
(240)
(I, ~,
- Z Te)
p.)T)
1 >_ ~ lib(t) - f l l
2-
1 --- - 7 I l l 0 ) - p'll 2 •
(241)
(242) I1~O)
-
p"ll ~
(243) (244)
382
Praly, Bastin, Pomet, and Jiang
Noting that p* satisfies: = z f,
(245)
the two passive systems are interconnected with: Y= -7.
(246)
Then, Point 2 follows directly from standard passivity theorems (see Landau [12]) or more directly by comparing (238) and (243). The proof of Points 3 and 4 is similar. O In fact, as emphasized by Landau [12] and expected from this proof, a similar Lemma would be obtained if, instead of the filter (229), we would have used any passive operator. In particular, as already noticed for adaptive control of linear systems by Narendra and Valavani [17], Pomet [19] has shown that, by choosing the identity matrix for the observation function h and a copy of the controlled system itself as the filter (229), we can rederive the adaptation law (178) of the Lyapunov design. Indeed, assume that BO, PRS and MC hold and define e as the output of, instead of (229), the following system with input t / a n d state X, with notation (13): = a0(x) + A 0 ( x ) ~ + u . ( x , ~ o
(b(x) + B(X)~) (247)
aV
e = ~-(x,~
T.
To see that this system is passive, we look at the time derivative of V(X(t), ~(t)). We get:
OV OV . OV 8V f/ = ~-~[ao+ A o F + U , ® ( b + B~)]+~-~ ~o(b + B~) + -~V~+ -5-£,1. (248) Hence, the definition of e, (47) in assumption PtLS and (170) in assumption MC give readily: II _< eWy. (249) Therefore, for any solution of (247) defined on [0, T), we have:
~o
*eTy >_ V(X(t),~(t)) - V(X(0),~(0))
Vt e [0,T).
(250)
In the proof of Lemma (235), we have seen that if equation (Sp,q) is satisfied
with p -- p* and q = p*, the following system with input e, state ( ~ ) and ouput y: Proj ( I , ~ , - [ ( A 0 ( X ) + un(X,~) q) B(X))IT e) , ~(0) E HI ~ = Proj ( I , ~ , - [ v ( X , ~ , ~ , I ~ ) ® B ( x ) ] T e ) ,
y = (Ao(X)+Un(X,~-)®B(X))(~-p*)
q"(0) E / / 1
+ v (X,~,~',~') @ B ( X ) ( ~ - p * )
(251)
Adaptive Stabilization of Nonlinear Systems
383
is passive, namely: fo'
1 II - p. eTy >-- 2 II q~0 - f
1 ~(0) - p* - ~ q~0) - p*
Vt E [o, T ) .
(252)
It follows that the observer defined by equations (247) and (251) is such that V(X(t),~(t)), ~(t) and q~t) are bounded if 7/is chosen such that: 7= -y
= (Ao(X) + un(X,p-') ® B(X))(p* - ~ )
+v(x,~,~,)) eB(x)(p'-~).
(253/
It remains to check that the observer (247)-(251) can be implemented while satisfying this equality, i.e., with (247) rewritten as: "-- aoCX) + AoCX)p* de unCX, ff) (~) (b(X)
+ B(X)p*)
+ v (X,'~,'~,~) ® (b(X)+ B(X)p*) + rI. (254) The right-hand side of this equation is nothing but a copy of the system (Sp.) with the control: .
:
+ v
(255)
It follows that if, in (247), we choose the initial condition X(0) = x(0),
056)
x(0 = ~(0.
(257)
then, necessarily, for all t:
This implies that, in fact, in the observer, the X equation does not need to be implemented, but that we simply have to replace X by x in the definition of e, and ~. This gives exactly the adaptation law in (178) provided by the Lyapunov design. 3.2 R e g r e s s o r F i l t e r i n g To overcome the difficulty of having d: or more precisely ], unmeasured, another way to implement an integration follows from the following remark: Let zf and Zf be the following filtered quantities: z~ + p(~, ~, u) ~ = ~ ,
zf + p(~,~, u) z~ = z ,
z~(0) = 0
zf(0) = 0,
(258)
384
Praly, Bastin, Pomet, and Jiang
or equivalently, using (223):
Oh
~f = p ( ~ , ~ , - ) ~ f + ~ ( ~ , ~ ) . ( ~ , . )
Oh
+ ~(~,~)~,
•
~(0) = h(~(0),~(0))
~f = h(x,~) - ~f Zf--p(z,~,u)Zf
Oh + ~-~x(z,~A(z,u)
, Zf(O) = O,
(259) where p is a C 1 positive function. Clearly, the knowledge of/~ is not needed and equation (Sp.) gives: zf = z f p * .
(260)
This equation is again linear in p*. This yields the following linear observer for p*:
e= Zf~-
zf
(261) ~0) e//1,
= P r o j ( M , ~, - K e ) ,
where, typically, the observer gain K and the matrix M, used in the function Proj, are given by:
K = MZWr(x,~,u,e) ~I>_-(2-el)MZWZfMr(x,~,u,e),
(262) e2M
I_<¢3M(0),
where 0 < ¢1 < 2, 0 < ca, 0 < ¢2, r is a strictly positive C 1 function and, if the observer gain K is allowed to decay to zero with time:
t~f <_ - ¢ 4 M Z T Z f M r ( z , ~ , u , e ) ,
2--¢I > ¢4 > 0.
(263)
Note that, in (261), l~ depends on u via the dependence of the observer gain K on r. Unfortunately, in this case, it is useless to extend the parameterization by embedding (Sp) into (Sp,q), since r will remain a factor in the extended observer gain. It follows that to make sure that ~ is well-defined, we shall impose that r does not depend on u whenever u is allowed to depend on i~. We have: Lemma:
(264)
Assume ICS is satisfied. There ezists a positive continuous function kl such that, for any c I lime f ~ n c t i o . . : m~ --. ~t m, . u (x(t),~(t), ~'f(t), zf(t), M(~)), solutions of (Sp.)-(259)-(26I)-(262), defined on [0, T) with (z(t), M(t)) remaining
Adaptive Stabilization of Nonlinea~Systems
in f2 x
Jr4
385
satisfy for all t in [0, T): 1. ~(t) • Izl
e. ~ I P ( 0 - p ' l l ~ + gl
I'
r Ilell ~ ___ e311~(0)-p'll 2 < k~(p', ~(0)).
s. g (eas) bores, the.
Moreover, for all constant k2, the property: IIZ(t)ll < k2 Vt • [0, T) p(=(0,~(0, u(t))
(265)
-
implies the property: Ilze(t)ll <_ kz and
Ile(t)ll ___ &2
c~/-~llff(0)-P'll
Vt
[0,T).
(266)
Note that a trivial way of meeting the condition (265) is to choose: p(~,p,u) = 1 + IlZll.
(267)
Proof. Point 1 is a straightforward consequence of Point 5 of Lemma (103). For Point 2, first note that (260) and (261) imply: e = zf(~-p*).
(268)
Then, let us consider the time derivative of: WI(t)
1
- - ~ (if(t) - p . ) T
M_l(t) (if(t) - p*)
(269)
along the solutions of (261). This is a well-defined time function since by assumption M(t) E .l~4. First note that, from (262), we get: M-1 < (2-- ¢I)ZTzfr,
(270)
then, with Point 4 of Lemma (103) and (268), we get successively:
IfVI = (ff- p*)T [1M-I (ff- p*) + M-I ~] < ( l - 2)r]lZf(ff-P*)]l" + ( i - p.)W M_I Proj (M, ~, - r M ZWe) < ( 1 - 2 ) r I[Z' ( P - p*)II2 - r ( f f - p.)T ZTe
(271)
+ (if_ p.)T M_I Proj (M, if, - r M ZTe) -- (~-- p.)T (_rZWe)
_<
W gl
< - - 5 - r ildl 2
°
Praly, Bastin, P o m e t , and Jiang
386
On the other hand, from (262), we have: ¢2
-~-IIF(t)-p*ll 2 _< wa(o _<
Amax { M-l(t) }
2
liP(t) -P*llz '
(272)
where Xmax {'} denotes the maximum eigenvalue. Point 2 follows readily and also with (271):
for (~-
p.)T
f
(-r ZWe) - - / ' (if-- p,)W
M - 1 Proj
(M, if,
< ~3 -
~
~ rM
zIe) (273)
IIp(O)-P*II ~
To prove Point 3, let us define X as follows:
X= 0
if :P(ff) < 0 or
--.d-~(v)MZTe > 0
(274)
o~ "^" = 7~(ff) if T'(ff) > 0 and "~pt, p) M ZT e < 0 With the definition (102) of Proj and Point 3 in Lemma (103), we get: (if_ p,)W (_ZWe) _ (~_ p,)W M _ l P r o j (M, ~, - M ZWe) 19"P ^ 07~ ~ T -~v (P)M-K'Fv (P) o~,
^
p,)] /
-
op ,_~M#9
(276)
t'~--'~T
Then, the expression of ~ in (261) and the inequality on /~r in (263) give with (262):
}~I'--IIProj(M,~,--"MZTe) I[ <- ~IIMzT, II + ~x
~o~, (v) ^
(278) MO~ -~(v)^ T
(279)
Adaptive Stabilization of Nonlinear Systems
387
And, with the Cauchy Schwarz inequality and (262), we get:
/' ]]~]] <
~/f0¢ r ]]e][a
~fo`tr{rMZ~rZfM} + Jo - -
+
Jo
o~, ^ MaP ^ T
e2
-~p (p) M-~-pP(p) T
,
(2s0)
where tr{.} denotes the trace of a matrix. The proof of Point 3 is now straightforward. We use the inequalities given by Point 2, (273) and (277) and get: ~'[]]
~ eV/-~ ][p'(O) --P'i]~ t r { M~:4( 0 ) }
%
ca i]~(O)_ p.]12 2 D" e----~
(281)
Finally, we prove the implication stated in the Lemma. To simplify the notations, let: P(~) = P (x(0, P(0, u(0)(282) The solution Zf(t) of (258) is for all t in [0, T): Z](/) = ~0t exp
(-~' p(v)dr) Z(s)ds.
(2s3)
Therefore, if for all t:
IIZ(t)ll < k~ p(t),
(284)
IlZd~)l[ _< ~' exp I-Z ~p(~) d~)IIZ(s)ll ds
(285)
~ k2 ~oteXpI-~o'P(r)dT) P(s)ds <<.
(286) (287)
we have:
k2 .
Finally, the inequality on ][e[] is a straightforward consequence of this inequality, Point 2 and (268). ra An important drawback of this result is that it requires a particular initial condition for the initial condition ~f(0). Choosing another initial condition may create difficulties if at the same time r is not guaranteed to be bounded.
388
Praly, Bastin, Pomet, and Jiang
3.3 Estimation Design The estimation design of a dynamic controller for solving the Adaptive Stabilization problem consists in choosing the observation function h and either the equation error filtering or the regressor filtering technique. This provides an estimate ~ which we use in the nominal control u . to obtain the control:
. = . . ( ~ , ~ ) + ~,
(288)
where, as in the Lyapunov design case, v may be introduced to counteract the effects of updating ~, so that v may depend on l~. In both the equation error filtering case and the regressor filtering case, the estimator is trying to find a function ~ fitting as well as possible the it equation (222). On the other hand, to show that we are solving the Adaptive Stabilization problem, we see, from the proof of Proposition (76), that both the estimation and the control should be such that II is negative. With assumption PRS, such an objective is met by the control Un as long as the 1~" equation is well-fitted by ~. Consequently, the fact that the Adaptive Stabilization problem is solved depends crucially on the fact that, when the/~ equation is well-fitted, the same holds for the V equation. More precisely, this fact relies on the properties of the following set value maps: F(p,~)={(p,u)[u F?(p,v)=
4 Estimation h=
= V(z,p)
{ ( p , g ) [7 = h(z,p)
and h(x,p) = y} and
V(x,p)
Design with the Observation
-
(289)
v}.
Function
~V
When we choose the observation function h as:
-1 v(~, p)
h(~:,v) = ~1 - v ( x , p )
'
(290)
the above set value maps F and F t are in fact standard applications defined by:
F(p,,7) =
(
p , ,7+c~1]
and
F'(p,.) = (P'
(291)
As we will show this is a very favorable situation compared with other possible choices for the observation function h.
Adaptive Stabilization of Nonlinear Systems
389
4.1 E q u a t i o n E r r o r Filtering
Let ~ be obtained from the equation error filtering technique (230)-(232). If we implement the control: u = ..(~,~), (292) then, with (47) in assumption PRS, equation (230) gives: A.
~I 2
h ___ - r ( e , ~ , ~ ) e + ( V ( ~ , ~ -
OV
4,)2 ~ ( ~ ' ~ ) ~ '
•
(293)
where:
= ~'----Zv + ~. (294) al - V If, instead, assumption MC holds, let ~ and ~ be obtained from (234) and use the control: with v given by (170) in MC with 0 = ~. Then, inequality (293) reduces to: h _< - r ( e , x,~) e.
(296)
Since V is positive and e is bounded (see Lemma (235)), ~ is lower bounded. To conclude that h is also upper bounded and therefore V is strictly smaller than al, we would need to know that re is in L t but we only know, from Lemma (235), that yrre is in L 2. In fact to prove boundedness of h, we need to strengthen assumption PRS: A s s u m p t i o n U P S ( U n i f o r m R e d u c e d - o r d e r Stabilizability) (297) There exists a positive constant c such that, for all (z,p) in Y2 x II, we have: OV "Ox (x,p)[a(Z,Un(z,p)) + A ( x , u . ( z , p ) ) p ] <_ - c V ( x , p ) .
(298)
E x a m p l e : S y s t e m (17) C o n t i n u e d (299) Consider the system (17) rewritten as (50). Assumptions BO, CO and URS are satisfied if we choose: Un(~l,X2,Pl,P2) =
X~"~-Pl + X l , P2
(300)
and: V ( X l , z 2 , p t , p 2 ) = V(xt) =
Xl+
= 19exp(3(l_~))if
if xl ~ - 1 -1<
1 _~XlIndeed:
xl < I
(301)
390
Praly, Bastin, Pomet, and Jiang
1. The function V is of class C 2 with:
d2V dxl~(zl) = 2
if
I=11 > 1
if
[xll < 1.
(30~) =2
x--'-~exp 3 1 -
2. If V(xl) is bounded, so is sl, and if V(zl) tends to 0, so does xl. Hence, as in Example (49), BO and CO are satisfied. 3. Finally, assumption URS is met, since we get: (gV "z, ~-(p)ia(x,=.(~,p))
+A(x,u.(~,p))p]
=-2
( z t + ~2) zx3
if zl < _ - 1
-~exp(3(a-~.l-~l))if --2 (Zl----2) z~3
- 1 < zl < 1 if 1<__ X 1
_ -2 v(~,p).
(303) D
ExRmple: S y s t e m (25) C o n t i n u e d (304) For the system (25), assume the function L satisfies the following growth condition: there exist a positive constant 7 and an integer j such that:
IL(Y)] ~_ 7 ([Yl + lYlJ) •
(305)
Under this condition, we can find functions V and Un such that assumptions BO, CO and URS axe satisfied by the non-minimal state-space representation (27) of (25). Indeed, let ~(p) be the following invertible matrix:
~(p)=
[i
~0I°°°°I 10pO[ 0 0 1 0p| 00010| 00001/
"
(306)
It allows us to define new coordinates:
x=e(p)
x=
x2
'
and to rewrite (2"/) in the following block form:
:~=
0 F2
x+
[u+L(y)]+
as
Adaptive Stabilization of Nonllnear Systems
391
- (H1 0)x + 6(t),
(309)
where the pair (F1, G1) is controllable and the matrix F2 is Hurwitz. It follows that there exist matrices CI, P1 and P2 and strictlypositive constants "1 and a~ such that: P~ (F~ - GiC~) + (FI - G~CI) T PI = - I
<_ - 2 a l P1 (310)
el F2 + FTp2 -- -I <_ --2~2 P2. Let us now define a function U by: U'(X) "- -(XTPIX1)J
2j
+
(XTPIXl) (xTp2x2) 2 + ~ 2 '
(311)
where/~ is a strictly positive constant. By letting:
V(x,V) = U (~(p) z),
(312)
assumptions B O and C O are clearly satisfied.To check that assumption U R S holds also, we choose Un as:
Un(Z,p) = -CI XI - L(y),
(X1) X2
-" ~(p)x.
(313)
Then along the solutions of:
the time derivative of U satisfies:
0
--
+ ,] + fl [xTp, F2X2 + xTp2G2L (HIX 1 -{- 5(t))]
_<--1
[(xTPlx,/ + ( TPlXl)] + fl (xTp2x2) ½ ILl ~/G~TP2G2.
(315)
But the growth condition (305), satisfied by the function L, and Young's inequality imply the existence of four positive constants 71 to 74 such that, for all X1, X2 and 5, we have:
(xT p~x2) ½ [L(HIX1 -t- 5)[ ~/G2TP2G2 < - ~"~ (x~p2x~) + vl (xTPlxl) + V2 157 + V~ (XTplXl) # + V, 151~i(316) We have established:
rY < -cU + ~ [~ 15la +v, 1512#],
(317)
where: c = rain {2(0:1 - fl'}'l),2J(0:1 - fits), fl0:2} • (318) It follows that assumption U R S holds, up to the presence of the exponentially decaying terms in 5, if/~ is chosen sufficiently small. []
392
Praly, Bastin, Pomet, and Jiang
Example: Introduction to System (320) Consider the system = 3 2 p* 2:
+
iZ = 2 3 t3=u. Following the Lyapunov design proposed in [27], we choose:
and:
where k and j are integers larger or equal to 1 and X is given by:
Note that, if k = j = 1, u, is a linearizing feedback. Since V is positive definite and radially unbounded, assumptions BO and CO are satisfied. Assumption URS holds also, since a straightforward computation shows that the time derivative of V along the solutions of (320) with p instead of p' satisfies:
Proposition Let assumptions BO, URS and ICS hold and choose: %P)
=
a1
-v(x'p) V(z, P)
and
If assumption A-LP i8 satisfied with:
and:
r(e,x,p) = 1
+
Adaptive Stabilization of Nonlinear Systems
393
either assumption MC holds, aa is chosen smaller than or equal to no, x(O) belongs to ~o and V(~(O),i(O)) < ~ , or we are in the global case, i.e., J2o = 1-2 = 1Rn, ax = as = Jroo and there exists a C o function dz : II ---, fit+ such that, for all (x,p) in 11%n x 11:
1~-(p)A(~,,~.(~,p))
__. dz(p) m~{1, V(~,p)2}, (32a)
then all the corresponding solutions (x(t), i(t), h(t)) of
(sp.).(eso).(esl).(ese)
are well-defined on [0, +oo), unique, bounded and: lim V(x(t),if(t)) = O.
(329)
$---* 0 0
It follows that the Adaptive Stabilization problem is solved if assumption CO holds also. Proposition (325) is an extension to the case when V is not radially unbounded of a result established by Pomet and Praly [23,21].
Proof. Case: Assumption MC holds: The system we consider is, with notation (13):
Oh Oh + ~(x,i)v
Oh • (~,~,~,~) ® ( b ( , ) + BCx)'~) + ~ ( ~ , i ) ~
e =~ - n(,,~)
p--eroj Un(X,p~® B(x)) ] T e ) ~" I I , ~ , - [-~xx(x,i)(Ao(z)Jr Oh
, p~O) e II,
(330) with:
h(~,p) =
Y(x,p) ' ct~1 1 --g(x,p)
r(e,z,p) = 1 Jr h(z,p) 2 ,
(331)
and, with notation (13):
OV O-oVp(z,~))Jr -ff~x(x,~)v(x,i,~,~)®[b(x)JrB(x)~]
= O.
(332)
From our smoothness assumptions on the various functions and with Point 1 of Lemma (103), this system has a locally Lipschitz-continuous right-hand side in the open set defined by:
(x,i, ~,'h) e 0 x 11 x 11 x IR
and
V(z, i) < al .
(333)
394
Praly, Bastin, Pomet, a~d Jiang
It follows that, for any initial condition in this open set, there exists a unique solution (x(t), p~t), q~t),h(t)), defined on a right maximal interval [0, T), with T possibly infinite, and satisfying (333) for all t in [0, T) and in particular:
V(z(t),~(t))
< al
Vt E [0, T).
(334)
Applying Points 3 and 4 of Lemma (235), we also know that, for all t in [0,T): p(~) e //1
and
q"(L) E I/1 ~o) -p*
~ ( 0 - p*
~,) p"
~
+ lie(Oil 2 + 2 fo' ," Ilell 2 _< q~0) - p "
II2
+ ile(O)ll ~ def
,i~2.
(335) Then, with assumption URS, (332), (334) and (331), we get successively from A.
the h equation in (330):
al~V h < -," e -
(336)
c (41 - V) 2
< 4 ; (4;'lel) - e
Otl
--
{1'1-- Y
_< (1 + h) ( v q l e l )
- ch
(337)
h
(338)
_< - (c- (,,/;lel)) h + 4";lel
(339)
_< - (e- (v~lel)) h + (1 + lel) V';'I¢I + clel.
(340)
Now, inequality (335) implies that the assumption of Lemma (583) in Appendix B is satisfied with: X = h,
(341)
and, by using the fact that r > 1: ~2 ~1 = v / T l e l ,
ax = 2, S i x =
~-
~2 ~! = (1 +~).v"71e] + clel,(1 = 2, 321 = 2 ( ( 1 - ~ ) 2 Jr-C2) T •
(342)
It follows that there exists a constant T, depending only on the initial condition, such that, for all t in [0, T), we have: 0 < -
,~ v(x(t), ~(t)) 41 - v ( x ( 0 , ~ ( 0 )
= h(x(t),~(t))
= h(t) -
e ( t ) < r + ft.
(343)
Adaptive Stabilization of Nonlinear Systems
395
Hence, we have established, for all t in [0, T):
V(x(t),i(t)) < a1(T+ fl) < at a,+T+9 -8 <- "h(O <- r Ili(t)-p'll
_< ~
and
if(t) e //1
II~t)-P*II < ~
and
q-'(t) e f I 1 ,
(344)
Then, from assumption BO, we know the existence of a compact subset F o f / 2 such that: z(t) E F Yt E [O,T). (345) Hence, the solution remains in a compact subset of the open set defined in (333). It follows by conUadiction that T -- + c o and that z(t) and i(t), u(t) and ~(t) are bounded on [0, +co). Then, using the second conclusion of Lemma (583), we have:
l
sup
< 0.
t--*+oo
(346)
Also, from (229) and the fact that the solution is bounded, we deduce that ~ is bounded. Since, from (335), e is in L2([0, +co)), we have established:
lim e(t) = 0.
(347)
t--*-I-oo
This yields: 0 < limsup h(z(t),~t)) = limsup h(t) --
~--*+oo
lim e(t) < 0.
~-*+oo
t-*Too
--
(348)
With the definition of h, (334), and assumption CO, this implies: lim xCt) = ~ .
(349)
t .--*-.I-oo
Case: Inequality (328) holds: The system we consider is: - - a, ( X , ~II(Xtp'~)) + A ( x , U n ( X , i ) )
p*
(350)
=Vroj
e
,
e rS ,
396
Praly, Bastin, Pomet, and Jiang
with: r(e, ~,F) = 1 + Y(~, F)~.
(351)
From our smoothness assumptions on the various functions and with Point 1 of Lemma (103), this system has a locally Lipschitz-eontinuous right-hand side for all (x,ff, h) in IRn x / ' / x IR. It follows that, for any initial condition in this open set, there exists a unique solution (x(t), ~ ( t ) , h ( t ) ) , defined on a right maximal interval [0,T), with T possibly infinite, and remaining in this set. Applying Points 1 and 2 of Lemma (235), we also know that, for all f in [0,T):
p̂(t) ∈ Π₁,  ‖p̂(t) − p*‖² + ‖e(t)‖² + 2 ∫₀ᵗ r ‖e‖² ≤ ‖p̂(0) − p*‖² + ‖e(0)‖² ≝ β².  (352)
Then, as in the previous case, with assumption URS, we get from the h̄ equation in (350):

(d/dt)h̄ ≤ −r e − c V + (∂V/∂p)(x, p̂) (d/dt)p̂.  (353)
But, by using inequality (328), Point 2 of Lemma (103), (351) and the expression of (d/dt)p̂ in (350), we get successively:

‖(∂V/∂p)(x, p̂) (d/dt)p̂‖ ≤ d₁(p̂(t)) max{1, V²} |e|
 ≤ d₁(p̂(t)) (1 + V²) |e| = d₁(p̂(t)) r |e|.  (354)
Since d₁ depends continuously on p̂(t), which satisfies (352), there exists a constant k depending only on p̂(0) and e(0) such that (353), (354) and the expression of e in (350) give:
(d/dt)h̄ ≤ −r e − c V + (1 + k) r |e|
 ≤ (1 + k)(1 + h̄ − e)(√r |e|) − c (h̄ − e)
 ≤ −(c − (1 + k)√r |e|) h̄ + (1 + k)(1 + |e|)(√r |e|) + c|e|.  (355)
Hence, with Point 2 of Lemma (235), the assumption of Lemma (583) in Appendix B is satisfied with (see (352)):

X = h̄  (356)

and:

Θ₁ = (1 + k)√r |e|, σ₁ = 2, S₁₁ = (1 + k)²β²/2,
Θ̄₁ = (1 + k)(1 + β)(√r |e|) + c|e|, σ̄₁ = 2, S̄₁₁ = 2((1 + k)²(1 + β)² + c²)β².  (357)

From here, we conclude the proof exactly as in the previous case. □
Compared with the Lyapunov design, we see that, when the Matching Condition (MC) holds, the equation error filtering technique, with V as the observation function, requires the more restrictive Uniform Reduced-order Stabilizability (URS) assumption. However, if assumption MC does not hold, nothing is known for the Lyapunov design, whereas here the Adaptive Stabilization problem is solved in the global case by the equation error filtering technique if the quadratic growth condition (328) is satisfied.

Example: System (320) Continued  (358)

For the system (320), we have shown that assumptions BO, CO and URS are satisfied. For assumption MC, we get (see (322)):
(∂V/∂p)(p, x)  (359)

and:

(∂V/∂x)(p, x) b(x) = X₃.  (360)

Hence, (170) cannot be satisfied, since, when (∂V/∂x) b is zero, ∂V/∂p is not necessarily zero. Then, let us see if the growth condition (328) holds. We have to compare the product of the norms of (see (322)):

(∂V/∂x)(p, x) A(x, uₙ(p, x))  and  (∂V/∂p)(p, x)

to a power of V = U. We have, writing everything in the X coordinates, in which U has a simpler expression:
∂U/∂X and ∂X/∂x, computed from (322), have entries that are polynomials in X₁, X₂, X₃ and p, involving the powers X₁^{2j−1} and X₁^{2k−1}.  (361)–(363)
To obtain our inequalities, we note that:

|X₁| ≤ (γU)^{1/(2j)},  |X₂| ≤ (γU)^{1/2},  |X₃| ≤ (γU)^{1/2},  (364)

with:

γ = sup{…, 2},  (365)

and, for any positive α:

|a + b U^α| ≤ (|a| + |b|) sup{1, U^α}.  (366)
We get:

‖(∂V/∂x) A‖ ≤ d₁(p) sup{1, U^{a₁}}  and  ‖∂V/∂p‖ ≤ d₂(p) sup{1, U^{a₂}},  (367)

with:

d₁(p) = [(c₁ + 2|p|) + (c₁c₂ + 2c₂|p| + 4p²) + 2|p| + (2k − 1)] γ^{a₁},
a₁ = sup{…},  (368)

d₂(p) = [2 + |c₁ − c₂| + 2|p|] γ^{a₂},
a₂ = sup{1, …, (k+1)/(2j)}.  (369)
It follows that:
OV H~---~x(p,z)A(z,un(p,x))]l II-~p(p,x)l[
<
dl(p)d2(p) (l + V(p,x)a),
(370)
where a depending on j and k is given in Table 1. Hence, for this example, (328) is satisfied if we choose k > 2 and j > 1. []
Table 1. a(k, j)
4.2 Regressor Filtering
When the regressor filtering (258)–(261) is used, using (259) and (261) we get:

(d/dt)h̄ = −ρ(x, p̂, u) e + (α₁²/(α₁ − V(x, p̂))²) [(∂V/∂x)(x, p̂)(a(x, u) + A(x, u) p̂) + (∂V/∂p)(x, p̂) (d/dt)p̂] + Zf (d/dt)p̂,  (371)

where:

h̄ = h(x, p̂) + e = Zf p̂ + ξf.  (372)
Hence, if:

u = uₙ(x, p̂),  (373)

and (47) in assumption PRS holds, we have:

(d/dt)h̄ ≤ −ρ(x, p̂, u) e + [(α₁²/(α₁ − V(x, p̂))²) (∂V/∂p)(x, p̂) + Zf] (d/dt)p̂.  (374)
Compared with equation (293) for the equation error filtering case, we have the extra term Zf (d/dt)p̂. But, thanks to Lemma (264), we know that we may choose ρ and (d/dt)M in order to guarantee that Zf is bounded and (d/dt)p̂ is absolutely integrable. It follows that the following counterpart to Proposition (325) can be established:

Proposition  (375)

Let assumptions BO, URS and ICS hold. Choose:

h(x, p) = α₁ V(x, p) / (α₁ − V(x, p)),
ρ(x, p, u) = 1 + (α₁²/(α₁ − V(x, p))²) ‖(∂V/∂x)(x, p) A(x, u)‖,  (376)
r(x, p, u, e) = ρ(x, p, u)²,

and:

(d/dt)M = G(M, Zf, r),  ε₃ M(0) > I,  (377)
where G is a negative symmetric matrix, depending locally-Lipschitz-continuously on its arguments and satisfying (see (262) and (263)):

−ε₄ M Zfᵀ Zf M r ≥ G ≥ −(2 − ε₁) M Zfᵀ Zf M r.  (378)
If assumption A-LP is satisfied with:

A(x, t) = (∂h/∂x)(x, p̂(t)),  (379)

and: either assumption MC holds and, with notation (13), B(x) = 0, α₁ is chosen smaller than or equal to α₀, x(0) belongs to Ω₀ and V(x(0), p̂(0)) < α₁, or we are in the global case, i.e., Ω₀ = Ω = ℝⁿ, α₁ = α₀ = +∞, and there exists a C⁰ function d₂ : Π → ℝ₊ such that, for all (x, p) in ℝⁿ × Π:

‖(∂V/∂p)(x, p)‖ ≤ d₂(p) max{1, V(x, p)},  (380)

then all the corresponding solutions (x(t), p̂(t), ξf(t), Zf(t)), M(t) of (S_{p*}), (259), (261), (262) and (377) are well-defined on [0, +∞), unique, bounded and:

lim_{t→+∞} V(x(t), p̂(t)) = 0.  (381)
It follows that the Adaptive Stabilization problem is solved if assumption CO holds also.

Proof. Case: Assumption MC holds: In this case, we assume also that A(x, u) does not depend on u, i.e., with notation (13), we have:

A(x, u) = A₀(x).  (382)

The system we consider is:

ẋ = a(x, u) + A₀(x) p*
(d/dt)ξf = −ρ(x, p̂) ξf + (∂h/∂x)(x, p̂) a(x, u) + (∂h/∂p)(x, p̂) (d/dt)p̂,  ξf(0) = h(x(0), p̂(0))
(d/dt)Zf = −ρ(x, p̂) Zf + (∂h/∂x)(x, p̂) A₀(x),  Zf(0) = 0
e = Zf p̂ + ξf − h(x, p̂)
(d/dt)p̂ = Proj(M, p̂, −M Zfᵀ r(x, p̂) e),  p̂(0) ∈ Π₁
(d/dt)M = G(M, Zf, r(x, p̂)),  M(0) > 0,  (383)
with:

h(x, p) = α₁ V(x, p) / (α₁ − V(x, p)),  r(x, p) = ρ(x, p)²,  (384)

and:

ρ(x, p) = 1 + (α₁²/(α₁ − V(x, p))²) ‖(∂V/∂x)(x, p) A₀(x)‖,  (385)

where:

(∂V/∂p)(x, p̂) (d/dt)p̂ + (∂V/∂x)(x, p̂) v(x, p̂, (d/dt)p̂) ⊗ b(x) = 0.  (386)

Since r does not depend on u, (d/dt)p̂ is explicitly defined. Then, from our smoothness assumptions on the various functions and with Lemma (103), this system (383) has a locally Lipschitz-continuous right-hand side in the open set defined by:

(x, ξf, Zf, p̂, M) ∈ Ω × ℝ × ℝˡ × Π × ℳ  and  V(x, p̂) < α₁.  (387)
It follows that, for any initial condition in this open set, there exists a unique solution (x(t), ξf(t), Zf(t), p̂(t), M(t)), defined on a right maximal interval [0, T), with T possibly infinite, and satisfying (387). Applying Points 1 and 2 of Lemma (264), we also know that, for all t in [0, T):

p̂(t) ∈ Π₁,  ε₂ ‖p̂(t) − p*‖² + ε₁ ∫₀ᵗ r ‖e‖² ≤ ε₃ ‖p̂(0) − p*‖² ≝ β².  (388)
Moreover, our choice for ρ implies, with the last statement of Lemma (264), that, for all t in [0, T):

‖Zf‖ ≤ 1  and  ‖e‖ ≤ √(ε₃/ε₂) ‖p̂(0) − p*‖.  (389)
Then, since we have (see (260)):

zf = Zf p*,  (390)

we conclude also that, for all t in [0, T):

‖zf‖ ≤ ‖p*‖.  (391)
Finally, since M(t) ∈ ℳ for all t in [0, T), (377) and (378) imply that (d/dt)M is negative and:

(d/dt)(M⁻¹) ≤ (2 − ε₁) Zfᵀ Zf r.  (392)

With (389) and (377), this implies, for all t in [0, T):

I / (ε₃ + (2 − ε₁) ∫₀ᵗ r) ≤ M(t) ≤ M(0).  (393)
Then, for h̄ defined in (372), we get by (371) and (298) in assumption URS:

(d/dt)h̄ ≤ −ρ e − c α₁² V/(α₁ − V)² + Zf (d/dt)p̂  (394)
 ≤ (√r |e|) − c h + Zf (d/dt)p̂  (395)
 ≤ −c h̄ + √r |e| + c|e| + Zf (d/dt)p̂.  (396)
With (388) and Point 3 of Lemma (264), the assumption of Lemma (583) in Appendix B is satisfied with:

X = h̄  (397)

and:

Θ₁ = √r |e| + c|e|,  σ₁ = 2,  S₁₁ = 2(1 + c²)β²/ε₁.  (398)
It follows that there exists a constant Γ depending only on the initial conditions such that, for all t in [0, T), we have:

0 ≤ α₁ V(x(t), p̂(t)) / (α₁ − V(x(t), p̂(t))) = h(x(t), p̂(t))  (399)
 = h̄(t) − e(t)  (400)
 ≤ Γ + √(ε₃/ε₂) ‖p̂(0) − p*‖,  (401)

where we have used (389).
where we have used (389). Hence, we have established, for all t in [0, T):
v(x(t),~(t))
<_
< -1
I I ~ O - p * l l < v ~¢ ;
and
~(t) E //1
(402)
IIz~(t)ll_ 1 II~(t)ll __ IIp*ll • Then, from assumption BO, we know the existence of a compact subset F of 12 such that: z(t) e F Vt E [0,T). (403) We have also:
I / (ε₃ + (2 − ε₁) R) ≤ M(t) ≤ M(0),  (404)
where R is an upper bound for ∫₀ᵗ r, whose existence follows from the continuity of the function r and the boundedness of x and p̂. Finally, with the continuity of the function h and the fact that:

ξf = h̄ − Zf p̂,  (405)

we have proved that the solution remains in a compact subset of the open set defined in (387). It follows by contradiction that T = +∞ and in particular that x(t), p̂(t), u(t) and h̄(t) are bounded on [0, +∞). From here, we conclude the proof exactly as in the proof of Proposition (325).
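The projection operations Proj used throughout these proofs are defined in Lemma (103) and are not reproduced here. The following is only a generic sketch of the underlying idea, with a hypothetical radius, written in Python: the raw update is passed through unchanged while the estimate is strictly inside a ball or the update points inward; otherwise its outward radial component is removed, so the update can never push the estimate further out at the boundary.

```python
import numpy as np

def proj(theta, y, radius=2.0):
    """Sketch of a projection modification onto a ball of the given
    (hypothetical) radius -- not the operator of Lemma (103).  The raw
    update y is returned unchanged inside the ball or when it points
    inward; otherwise its outward radial component is removed."""
    nrm = np.linalg.norm(theta)
    if nrm < radius or theta @ y <= 0.0:
        return y
    # remove the component of y along the outward normal theta/nrm
    return y - ((theta @ y) / nrm**2) * theta

theta = np.array([2.0, 0.0])          # estimate sitting on the boundary
y_out = np.array([1.0, 0.5])          # raw update pointing outward
y_in = np.array([-1.0, 0.3])          # raw update pointing inward

print(proj(theta, y_out))             # outward radial part removed
print(theta @ proj(theta, y_out))     # radial growth rate is exactly 0.0
print(proj(theta, y_in))              # inward update left unchanged
```

In an update such as (d/dt)p̂ = Proj(M, p̂, −M Zfᵀ r e), the same mechanism is what keeps p̂(t) inside the admissible set Π₁ for all time.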
Case: Inequality (380) holds: The system we consider is the system (383) with:

u = uₙ(x, p̂)  (406)

and:

h(x, p) = V(x, p).  (407)

From our smoothness assumptions on the various functions and with Lemma (103), this system has a locally Lipschitz-continuous right-hand side in the open set defined by:

(x, ξf, Zf, p̂, M) ∈ ℝⁿ × ℝ × ℝˡ × Π × ℳ.  (408)
It follows that, for any initial condition in this open set, there exists a unique solution (x(t), ξf(t), Zf(t), p̂(t), M(t)), defined on a right maximal interval [0, T), with T possibly infinite and satisfying (408). Then, as in the previous case, we have, for all t in [0, T):

p̂(t) ∈ Π₁,  ε₂ ‖p̂(t) − p*‖² + ε₁ ∫₀ᵗ r ‖e‖² ≤ ε₃ ‖p̂(0) − p*‖² ≝ β²,
‖Zf‖ ≤ 1,  ‖e‖ ≤ √(ε₃/ε₂) ‖p̂(0) − p*‖,  ‖zf‖ ≤ ‖p*‖,
I / (ε₃ + (2 − ε₁) ∫₀ᵗ r) ≤ M(t) ≤ M(0).  (409)

By taking care of the fact that we are not using v to cancel the term (∂V/∂p)(d/dt)p̂, we get also:
(d/dt)h̄ ≤ −ρ e − c V + ((∂V/∂p) + Zf) (d/dt)p̂.  (410)
And, by using (409), inequality (380), and the continuity of the function d₂, we have:

‖∂V/∂p‖ ≤ d₂(p̂(t)) max{1, V}  (411)
 ≤ k (1 + V)  (412)
for some constant k depending only on p̂(0). Hence, with our choice for ρ and r, we have established:

(d/dt)h̄ ≤ −ρ e − c V + k (1 + V) ‖(d/dt)p̂‖ + Zf (d/dt)p̂.  (413)

Hence, with (409), the assumption of Lemma (583) in Appendix B is satisfied with:

X = h̄  (415)

and:

Θ₁ = k (1 + V) ‖(d/dt)p̂‖,  σ₁ = 1,  S₁₁ = k k₁(p*, p̂(0)),
Θ̄₁ = √r |e| + c|e|,  σ̄₁ = 2,  S̄₁₁ = 2(1 + c²)β²/ε₁.  (416)

From here, we conclude the proof exactly as in the previous case. □
In practice, to apply this Proposition, we must be allowed to choose the particular value for the initial condition ξf(0) and a vanishing observer gain K ((263) is assumed). Compare with Proposition (325), where we have no constraint on the initial conditions and the observer gain is not forced to decay to zero. However, Proposition (375) shows that the regressor filtering technique with V as observation function also has the property of providing a solution to the Adaptive Stabilization problem in the case where the Matching Condition (MC) does not hold. Then, besides this question of initial conditions and observer gains, the choice between the two estimation techniques depends on which of the linear growth condition (380) or quadratic growth condition (328) holds.

Example: System (320) Continued  (417)

For the system (320), let us see if the growth condition (380) holds. From (367), we have:

‖∂V/∂p‖ ≤ d₂(p) sup{1, U^{a₂}},  (418)

where d₂ is given by (368) and a₂, satisfying (369), depends on j and k and is given in Table 2. Hence, for this example, (380) is satisfied if we choose k > 1 and j > 1. This is different from what we obtained in Example (358), where the equation error filtering is to be used. □
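Which of the two growth conditions holds can also be probed numerically before committing to one estimation technique. The sketch below is purely illustrative: the sample quantities and the grid are made up, and a log-log fit over a bounded range is numerical evidence, not a proof. It estimates the smallest exponent a such that a quantity g is dominated by max{1, V^a}, which is the common shape of (328), (367) and (380):

```python
import numpy as np

def growth_exponent(g, v):
    """Estimate the smallest a with g <= C * max(1, v**a) on the sampled
    range, via a log-log least-squares slope over the region v > 1.
    Numerical evidence on a bounded grid only -- not a proof."""
    mask = v > 1.0
    slope, _ = np.polyfit(np.log(v[mask]), np.log(g[mask]), 1)
    return slope

v = np.linspace(0.1, 1.0e3, 2000)
lin = 3.0 * np.maximum(1.0, v)        # toy quantity with linear growth
quad = 0.5 * np.maximum(1.0, v**2)    # toy quantity with quadratic growth

print(round(growth_exponent(lin, v)))    # -> 1: a bound like (380) plausible
print(round(growth_exponent(quad, v)))   # -> 2: only a bound like (328) plausible
```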
5 Estimation Design with an Observation Function h not Directly Related to V

When the observation function h is not directly related to V, the set-valued maps F and F̃ defined in (289) do not have any particular properties. Consequently
Table 2. a₂(k, j)

assumption PRS or URS does not give any information on the dynamics of h. Similarly, the estimation does not necessarily provide a better fit of the h̄ equation. To overcome this difficulty and guarantee some properties for F and F̃, we have to refine our assumptions. Namely, we have to relate V to h. In the global case, i.e., for Ω = ℝⁿ, this yields:

Assumption RBO (Refined Boundedness Observability)  (419)

For all positive real numbers α, all compact subsets K of Π and all vectors x₀ ∈ ℝⁿ, we can find a compact subset F of ℝⁿ such that, for any C¹ time functions p̂ : ℝ₊ → Π and u : ℝ₊ → ℝᵐ and any solution x(t) of (44) defined on [0, T), we have the following implication:

‖h(x(t), p̂(t))‖ ≤ α and p̂(t) ∈ K  ∀t ∈ [0, T)   ⟹   x(t) ∈ F  ∀t ∈ [0, T).  (420)

Namely, the boundedness of the full state vector x is "observable" from the boundedness of the "output" function h.

Assumption RURS (Refined Uniform Reduced-Order Stabilizability)  (421)

There exists a positive constant c and two functions:

f : ℝᵏ × Π → ℝᵏ of class C¹  and  U : ℝᵏ × Π → ℝ₊ of class C²,

with U known and:

1. f(0, p) = 0  ∀p ∈ Π,
2. U(h, p) = 0  ⟺  h = 0,
3. ∀α ≥ 0, ∀K compact subset of Π, the set {h | U(h, p) ≤ α and p ∈ K} is a compact subset of ℝᵏ,

such that, for all (x, p, h) in ℝⁿ × Π × ℝᵏ, we have:

1. f(h(x, p), p) = (∂h/∂x)(x, p) [a(x, uₙ(x, p)) + A(x, uₙ(x, p)) p],
2. (∂U/∂h)(h, p) f(h, p) ≤ −c U(h, p).  (422)
The meaning of assumption RURS is that, for any fixed vector p in Π, the time derivative of h along the solutions x of (S_p) in closed loop with the control:

u = uₙ(x, p),  (423)

is simply:

ḣ = f(h, p),  (424)

i.e., this control makes the reduced-order system, consisting of the components of h, autonomous and decoupled from the other components of the full state vector x. Moreover, U is a Lyapunov function for this reduced-order system, implying that it admits 0 ∈ ℝᵏ as a globally asymptotically stable equilibrium point.
Assumption RCO (Refined Convergence Observability)  (425)

For any bounded C¹ time functions p̂ : ℝ₊ → Π and u : ℝ₊ → ℝᵐ, with u̇ also bounded, and for any solution x(t) of (44) defined on [0, +∞), we have the following implication:

lim_{t→∞} h(x(t), p̂(t)) exists and is zero, and x(t) is bounded on [0, +∞)   ⟹   lim_{t→∞} x(t) exists and is equal to xₑ.

Assumption RMC (Refined Matching Condition)  (426)

The functions a and A are affine in u, the function U does not depend on p, and there exists a known C¹ function v(x, p, q, θ) from ℝⁿ × Π × Π × ℝˡ to ℝᵐ satisfying:

(∂h/∂p)(x, p) θ + (∂h/∂x)(x, p) v ⊗ [b(x) + B(x) q] = 0.  (427)

Note the strong restriction on U in assumption RMC.
Example: System (17) Continued  (428)

For the system (17) rewritten as (50) in Example (49), the refined assumptions RBO, RCO, RURS and RMC are satisfied if we choose:

h(x₁, x₂, p₁, p₂) = x₁,  uₙ(x₁, x₂, p₁, p₂) = (x₂ + p₁ + x₁) / p₂,  (429)

and:

U(h, p₁, p₂) = h + …  if h ≤ −1,
            = … exp(3(1 − 1/h²))  if −1 < h < 1,
            = h − …  if 1 ≤ h,  (430)

with:

f(h, p₁, p₂) = −h³,  Π = ℝ × (ℝ₊ − {0})  (431)

and:

F = {x ∈ ℝ² | ‖x‖ ≤ ‖x(0)‖ + (… + |c₃|)}.  (432) □
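The autonomy claimed by assumption RURS can be observed directly on this example: with p frozen, the closed loop leaves the scalar reduced-order dynamics ḣ = f(h, p₁, p₂) = −h³ of (431). The sketch below integrates it with forward Euler (the step size, horizon, and the quadratic energy W(h) = h², used as a stand-in for the only partially legible U of (430), are our own choices) and checks that W decays monotonically, so h tends to 0:

```python
import numpy as np

def simulate(h0, dt=1e-3, steps=20000):
    """Integrate hdot = -h**3 (the reduced-order dynamics (431))
    with forward Euler and return the trajectory."""
    traj = [h0]
    for _ in range(steps):
        traj.append(traj[-1] + dt * (-traj[-1] ** 3))
    return np.array(traj)

traj = simulate(2.0)
W = traj ** 2                        # toy energy W(h) = h**2
print(bool(np.all(np.diff(W) < 0)))  # -> True: W strictly decreases
print(bool(W[-1] < 0.1 * W[0]))      # -> True: h(t) heads to 0
```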
5.1 Equation Error Filtering

Let p̂ be obtained from the equation error filtering technique (230)–(232). If the control is given by:

u = uₙ(x, p̂),  (433)

equation (230) gives, with the help of (422) in assumption RURS:

(d/dt)h̄ = f(h̄, p̂) + [−r(e, x, p̂) e + (f(h̄ − e, p̂) − f(h̄, p̂)) + (∂h/∂p)(x, p̂) (d/dt)p̂].  (434)
If, instead, assumption RMC holds, let p̂ and q̂ be obtained from (234) and use the control:

u = uₙ(x, p̂) + v(x, p̂, q̂, (d/dt)p̂),  (435)

with v given by (427) in RMC with θ = (d/dt)p̂. Then (434) reduces to:

(d/dt)h̄ = f(h̄, p̂) + [−r(e, x, p̂) e + (f(h̄ − e, p̂) − f(h̄, p̂))].  (436)
In view of assumption RURS, the term enclosed in brackets in both (434) and (436) should be considered as a perturbation. To be able to apply techniques à la Total Stability (see Lakshmikantham and Leela [11]), we need to compare the components of this perturbation with the "Lyapunov function" U. This motivates the following assumption:

Assumption GC1 (Growth Conditions 1)  (437)

There exist two positive continuous functions d₃, defined on Π × ℝᵏ, and d₄, defined on Π, and a known positive real number λ, with λ ≤ 1, such that, for all (h, p) in ℝᵏ × Π:

1. |(∂U/∂h)(h, p) (f(h − e, p) − f(h, p))| ≤ d₃(p, e) max{1, U(h, p)^{2−λ}} ‖e‖,
2. ‖(∂U/∂h)(h, p)‖ ≤ d₄(p) max{1, U(h, p)^λ}.  (438)

Moreover, if assumption RMC does not hold, then λ < 1 and there exist three positive continuous functions dᵢ, i = 5, …, 7, defined on Π, and two positive real numbers ω and κ, with ω + κ ≤ 2 − λ, such that, for all (x, p) in ℝⁿ × Π:

3. ‖(∂h/∂x)(x, p) A(x, uₙ(x, p))‖ ≤ d₅(p) max{1, U(h(x, p), p)^{2(1−λ)}},
4. ‖(∂U/∂p)(h, p)‖ ≤ d₆(p) max{1, U(h, p)^ω}  ∀h ∈ ℝᵏ,
5. ‖(∂h/∂p)(x, p)‖ ≤ d₇(p) max{1, U(h(x, p), p)^κ}.  (439)
We have:
Proposition  (440)

Let assumptions A-LP, RBO, RURS, ICS and GC1 hold in the global case with:

A(x, t) = (∂h/∂x)(x, p̂(t)),  (441)

and choose:

r(e, x, p̂) = (1 + U(h̄ − e, p̂)^{1−λ})².  (442)

All the corresponding solutions (x(t), p̂(t), h̄(t)) of (S_{p*})–(230)–(231) are unique, bounded and well-defined on [0, +∞) and:

lim_{t→+∞} h(x(t), p̂(t)) = 0.  (443)
It follows that the Adaptive Stabilization problem is solved if assumption RCO holds also.

Proof. Case: Assumption RMC holds: The system we consider is, with notation (13):

ẋ = a₀(x) + A₀(x) p* + uₙ(x, p̂) ⊗ (b(x) + B(x) p*)
(d/dt)h̄ = −r(e, x, p̂) e + (∂h/∂x)(x, p̂) [a₀(x) + A₀(x) p̂ + uₙ(x, p̂) ⊗ (b(x) + B(x) q̂)]
  + (∂h/∂x)(x, p̂) v(x, p̂, q̂, (d/dt)p̂) ⊗ (b(x) + B(x) q̂) + (∂h/∂p)(x, p̂) (d/dt)p̂
e = h̄ − h(x, p̂)
(d/dt)p̂ = Proj(p̂, −[(∂h/∂x)(x, p̂) (A₀(x) + uₙ(x, p̂) ⊗ B(x))]ᵀ e),  p̂(0) ∈ Π₁
(d/dt)q̂ = Proj(q̂, −[(∂h/∂x)(x, p̂) v(x, p̂, q̂, (d/dt)p̂) ⊗ B(x)]ᵀ e),  q̂(0) ∈ Π₁,  (444)

with r(e, x, p̂) defined by (442) and, with notation (13):

(∂h/∂p)(x, p̂) (d/dt)p̂ + (∂h/∂x)(x, p̂) v(x, p̂, q̂, (d/dt)p̂) ⊗ [b(x) + B(x) q̂] = 0.  (445)

From our smoothness assumptions on the various functions and with Point 1 of Lemma (103), this system has a locally Lipschitz-continuous right-hand side in the open set defined by:

(x, p̂, q̂, h̄) ∈ ℝⁿ × Π × Π × ℝᵏ.  (446)
It follows that, for any initial condition in this open set, there exists a unique solution (x(t), p̂(t), q̂(t), h̄(t)), defined on a right maximal interval [0, T), with T
possibly infinite and satisfying (446). Applying Points 3 and 4 of Lemma (235), we also know that, for all t in [0, T):

p̂(t) ∈ Π₁,  q̂(t) ∈ Π₁,
‖p̂(t) − p*‖² + ‖q̂(t) − p*‖² + ‖e(t)‖² + 2 ∫₀ᵗ r ‖e‖² ≤ ‖p̂(0) − p*‖² + ‖q̂(0) − p*‖² + ‖e(0)‖² ≝ β².  (447)

Now, from assumption RMC, the function U given by assumption RURS depends on h only. Then, letting:

Ū(t) = U(h̄(t)),  (448)

we look at the time derivative of Ū along the solutions of (444) (see also (436)). From (422) in assumption RURS, (438) in assumption GC1 and (442), we get successively:
(d/dt)Ū ≤ −c Ū + (∂U/∂h)(h̄) [−r e + (f(h̄ − e, p̂) − f(h̄, p̂))]
 ≤ −c Ū + d₄(p̂) max{1, Ū^λ} r ‖e‖ + d₃(p̂, e) max{1, Ū^{2−λ}} ‖e‖
 ≤ −c Ū + d₄(p̂) max{1, Ū^λ} (1 + Ū^{1−λ})² ‖e‖ + d₃(p̂, e) max{1, Ū^{2−λ}} ‖e‖
 ≤ −c Ū + [(d₃(p̂, e) + 2 d₄(p̂)) (1 + Ū^{1−λ}) ‖e‖] (1 + Ū)
 ≤ −[c − k (1 + Ū^{1−λ}) ‖e‖] Ū + k (1 + Ū^{1−λ}) ‖e‖,  (449)

where the constant k depends only on the initial conditions and satisfies:

d₃(p̂(t), e(t)) + 2 d₄(p̂(t)) ≤ k  ∀t ∈ [0, T).  (450)
Such an inequality holds since we have (447) and the functions d₃ and d₄ are continuous. With (442), inequality (449) implies that the assumption of Lemma (583) in Appendix B is satisfied with:

X = Ū  (451)

and:

Θ₁ = k (1 + Ū^{1−λ}) ‖e‖,  σ₁ = 2,  S₁₁ = k² β²/2.  (452)
With property 3 of U in assumption RURS, it follows that there exists a constant Γ depending only on e(0), p̂(0) and q̂(0) such that, for all t in [0, T), we have:

Ū(t) ≤ Γ.  (453)

Then, with (447), this implies the existence of a constant α depending only on e(0), p̂(0) and q̂(0) such that, for all t in [0, T):

‖h(x(t), p̂(t))‖ ≤ α  (454)
‖h̄(t)‖ ≤ α,  (455)

and we also know from (447) that p̂(t) ∈ K and q̂(t) ∈ K, where K is the following compact subset of Π:

K = {p | ‖p − p*‖ ≤ β} ∩ Π₁.  (456)
Then, from assumption RBO, there exists a compact subset F of ℝⁿ such that: x(t) ∈ F ∀t ∈ [0, T). (457) With (447) and (453), we have established that the solution remains in a compact subset of the open set defined by (446). This implies by contradiction that T = +∞ and that x(t), p̂(t), u(t) and h̄(t) are bounded on [0, +∞). Then, using the second conclusion of Lemma (583) and the properties of the function U, we have: lim_{t→+∞} h̄(t) = 0. (458)
Also, from (229) and the fact that the solution is bounded, we deduce that ė is bounded. Since, from (447), e is in L²([0, +∞)), we have also:

lim_{t→+∞} e(t) = 0.  (459)

This yields:

lim_{t→+∞} h(x(t), p̂(t)) = lim_{t→+∞} h̄(t) − lim_{t→+∞} e(t) = 0.  (460)

With assumption RCO this implies finally:

lim_{t→+∞} x(t) = xₑ.  (461)
Case: Assumption RMC does not hold: The system we consider is:

ẋ = a(x, uₙ(x, p̂)) + A(x, uₙ(x, p̂)) p*
(d/dt)h̄ = −r(e, x, p̂) e + (∂h/∂x)(x, p̂) A(x, uₙ(x, p̂)) p̂ + (∂h/∂x)(x, p̂) a(x, uₙ(x, p̂)) + (∂h/∂p)(x, p̂) (d/dt)p̂
e = h̄ − h(x, p̂)
(d/dt)p̂ = Proj(p̂, −[(∂h/∂x)(x, p̂) A(x, uₙ(x, p̂))]ᵀ e),  p̂(0) ∈ Π₁,  (462)

with r(e, x, p̂) defined by (442). From our smoothness assumptions on the various functions and from Point 1 of Lemma (103), this system has a locally Lipschitz-continuous right-hand side in ℝⁿ × Π × ℝᵏ. It follows that, for any initial condition in this open set, there exists a unique solution (x(t), p̂(t), h̄(t)), defined on a right maximal interval [0, T), with T possibly infinite. Applying Points 1 and 2 of Lemma (235), we also know that, for all t in [0, T):
‖p̂(t) − p*‖² + ‖e(t)‖² + 2 ∫₀ᵗ r ‖e‖² ≤ ‖p̂(0) − p*‖² + ‖e(0)‖² ≝ β².  (463)

Then, as in the previous case, we study the evolution of the time derivative of Ū along the solutions of (462) (see also (434)), with:

Ū(t) = U(h̄(t), p̂(t)).  (464)
From (422) in assumption RURS, Point 2 of Lemma (103), (438) and (439) in assumption GC1, (442), and the expression of (d/dt)p̂ in (462), we get successively:

(d/dt)Ū = (∂U/∂h)(h̄, p̂) [f(h̄, p̂) − r e + (f(h̄ − e, p̂) − f(h̄, p̂)) + (∂h/∂p)(x, p̂) (d/dt)p̂] + (∂U/∂p)(h̄, p̂) (d/dt)p̂
 ≤ −c Ū + d₄(p̂) max{1, Ū^λ} [r ‖e‖ + d₅(p̂) max{1, U(h(x, p̂), p̂)^{2(1−λ)}} ‖e‖]
  + d₆(p̂) max{1, Ū^ω} d₇(p̂) max{1, U(h(x, p̂), p̂)^κ} ‖(d/dt)p̂‖.  (465)
A difficulty appearing in this inequality, and which we have not encountered yet, is the distinction between Ū = U(h̄, p̂) and U(h(x, p̂), p̂) = U(h̄ − e, p̂). As proved in Appendix C, thanks to point 2 in (438) of assumption GC1, these two quantities are related by:

max{1, U(h(x, p̂), p̂)^γ} ≤ δ [max{1, Ū^γ} + max{1, Ū^{λγ}} (‖e‖^γ + ‖e‖^{γ/(1−λ)})],  (466)
where γ is any positive real number and δ ≥ 1 depends only on d₄(p). Since the functions dᵢ, i = 3, …, 7, are continuous and e and p̂ are bounded from (463), there exists a constant k depending only on the initial conditions such that inequality (465) yields:

(d/dt)Ū ≤ −c Ū + k max{1, Ū^λ} (1 + Ū^{1−λ})² ‖e‖
 + k max{1, Ū^λ} [max{1, Ū^{2(1−λ)}} + max{1, Ū^{2λ(1−λ)}} (‖e‖^{2(1−λ)} + ‖e‖²)] ‖e‖
 + k max{1, Ū^ω} [max{1, Ū^κ} + max{1, Ū^{λκ}} (‖e‖^κ + ‖e‖^{κ/(1−λ)})] ‖e‖.  (467)

Since assumption GC1 gives:

ω + κ ≤ 2 − λ,  (468)

we have:

max{1, Ū^ω} max{1, Ū^κ} ≤ (1 + Ū)(1 + Ū^{1−λ}).  (469)
Hence, (467) can be simplified into:

(d/dt)Ū ≤ −c Ū + k (1 + Ū)(1 + Ū^{1−λ}) ‖e‖
 + k max{1, Ū^λ} [max{1, Ū^{2(1−λ)}} (‖e‖^{2(1−λ)+1} + ‖e‖³)]  ≝ (a)
 + k max{1, Ū^ω} [max{1, Ū^κ} (‖e‖^{κ+1} + ‖e‖^{κ/(1−λ)+1})]  ≝ (b).  (470)

Let us now show that the terms (a) and (b) defined in this inequality can be bounded from above by terms of the form:
k (1 + Ū) [(1 + Ū^{1−λ}) ‖e‖]^γ × (bounded),  with 0 < γ ≤ 2.

(a) If 2λ − 1 > 0, and since:

λ + 2λ(1 − λ) = 1 + (2λ − 1)(1 − λ),  (471)

we get:

(a) ≤ k (1 + Ū) [(1 + Ū^{1−λ}) ‖e‖]² (‖e‖^{2λ−1} + ‖e‖),  (472)

where, knowing from (463) that ‖e‖ is bounded, the same holds for the last term between parentheses, since we have positive powers. If 2λ − 1 ≤ 0, and since:

λ + 2λ(1 − λ) ≤ 2(1 − λ),  (473)–(474)

we get:

(a) ≤ k (1 + Ū) [(1 + Ū^{1−λ}) ‖e‖]² (‖e‖^{1−2λ} + ‖e‖).  (475)

(b) If κ < 1, and since:

ω + λκ ≤ 1 + (1 − λ)(1 − κ),  (476)

we get:

(b) ≤ k (1 + Ū) [(1 + Ū^{1−λ}) ‖e‖] (…).  (477)

If κ ≥ 1, and since:

ω + κ ≤ 1 + 2(1 − λ),  (478)

we get:

(b) ≤ k (1 + Ū) [(1 + Ū^{1−λ}) ‖e‖]² (‖e‖^{κ−1} + ‖e‖^{κ/(1−λ)−1}).  (479)

With these inequalities, and the expression (442) for r, (470) yields an inequality we write formally as:

(d/dt)Ū ≤ −c Ū + 5k (1 + Ū) √r ‖e‖ + (1 + Ū) [k̄ₐ (√r ‖e‖)^{γa} + k̄ᵦ (√r ‖e‖)^{γb}],  (480)

where the last term represents the sum (a) + (b), with k̄ₐ and k̄ᵦ two constants depending only on the initial conditions, and:

0 < γa ≤ 2,  0 < γb ≤ 2.  (481)

This inequality can be rewritten as:

(d/dt)Ū ≤ −(c − 5k √r ‖e‖ − [k̄ₐ (√r ‖e‖)^{γa} + k̄ᵦ (√r ‖e‖)^{γb}]) Ū + 5k √r ‖e‖ + [k̄ₐ (√r ‖e‖)^{γa} + k̄ᵦ (√r ‖e‖)^{γb}].  (482)

We may now apply Lemma (583) in Appendix B with:

X = Ū  (483)
and:

Θ₁ = 5k √r ‖e‖,  σ₁ = 2,  S₁₁ = 25 k² β²/2,
Θ₂ = k̄ₐ (√r ‖e‖)^{γa},  σ₂ = 2/γa,  S₁₂ = k̄ₐ^{2/γa} β²/2,
Θ₃ = k̄ᵦ (√r ‖e‖)^{γb},  σ₃ = 2/γb,  S₁₃ = k̄ᵦ^{2/γb} β²/2,
Θ̄₁ = 5k √r ‖e‖,  σ̄₁ = 2,  S₂₁ = 25 k² β²/2,
Θ̄₂ = k̄ₐ (√r ‖e‖)^{γa},  σ̄₂ = 2/γa,  S₂₂ = k̄ₐ^{2/γa} β²/2,
Θ̄₃ = k̄ᵦ (√r ‖e‖)^{γb},  σ̄₃ = 2/γb,  S₂₃ = k̄ᵦ^{2/γb} β²/2.  (484)

From here, we conclude the proof exactly as in the previous case. □
This proposition shows that, in fact, the observation function h cannot be chosen independently of V. The refined assumptions RCO and RURS specify the link between these two functions, with U playing the role of V. However, not taking h = V creates some difficulties. Even when the Refined Matching Condition (RMC) is met, some extra conditions are required: the increment condition in point 1 and the growth condition in point 2 of assumption GC1. Nevertheless, as with the other estimation designs, we may have a solution to the Adaptive Stabilization problem when assumption RMC does not hold.
Corollary [Campion and Bastin [2]]  (485)

Let, in equation (S_p), the functions a and A be known and affine in u, and let Π be an open subset of ℝˡ which satisfies assumption ICS. Assume there exist three known functions:

h : ℝⁿ × Π → ℝⁿ of class C², a diffeomorphism for each p,
uₙ : ℝⁿ × Π → ℝᵐ of class C¹, and
v : ℝⁿ × Π × Π × ℝˡ → ℝᵐ of class C¹,

such that:

1. h satisfies assumption RCO,
2. for all (x, p, q, θ) in ℝⁿ × Π × Π × ℝˡ, we have:

(∂h/∂p)(x, p) θ + (∂h/∂x)(x, p) v ⊗ [b(x) + B(x) q] = 0,  (486)

3. and, by letting:

φ = h(x, p),  (487)

the time derivative of φ along the solutions of (S_p) with u = uₙ satisfies:

φ̇ = C φ,  (488)
Adaptive Stabilization of Nonlinear Systems
where C is
an
n X n
415
matrix satisfying: = -I,
PC + cTp
(489)
with P a symmetric positive definite matrix. Under these conditions, the Adaptive Stabilization problem is solved by the dynamic controller consisting of (234) and:

u = uₙ(x, p̂) + v(x, p̂, q̂, (d/dt)p̂).  (490)
Proof. Let us first notice that, the functions a, A and h being known, assumption A-LP holds with:

A(x, t) = (∂h/∂x)(x, p̂(t)).  (491)

Then, we define two functions U and f as follows:

U(h) = hᵀ P h  (492)

and:

f(h, p) = C h.  (493)
and From the assumptions, conditions A-LP, RURS and KMC are satisfied. Now, define the function F by: F : I K '~ x H - - +
IRn x H .
(x,p)
(h(x,p), p)
(494)
This function is a diffeomorphism. Hence, for all compact subsets K: of H and all positive real numbers a, the set: F - I { ( ~ , P ) IPE/C
and
II~]] < a }
is a compact subset of/2 x H, and therefore its projection: Fa,~: = {z 13p • K: : ]]h(x,p)][ _< a }
(495)
is a compact subset of ℝⁿ. It follows that assumption RBO holds. Finally, let us check that (438) in assumption GC1 is satisfied. We have:

‖(∂U/∂h)(h)‖ = ‖2 P h‖ ≤ 2 √(λmax{P}) U(h)^{1/2},  (496)–(497)

and also:

|(∂U/∂h)(h) (f(h − e, p) − f(h, p))| = |2 hᵀ P C e|  (498)
 ≤ 2 U(h)^{1/2} √(eᵀ Cᵀ P C e)  (499)
 ≤ 2 √(λmax{Cᵀ P C}) U(h)^{1/2} ‖e‖.  (500)
Hence, (438) holds with any λ satisfying:

1/2 ≤ λ ≤ 1.  (501)

So we choose λ = 1 in order to simplify the expression of r in (442). □
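The pair (U, f) of (492)–(493) can be produced numerically: given a Hurwitz matrix C (the one below is an arbitrary stand-in), P solves the Lyapunov equation (489), and along φ̇ = Cφ the derivative of U(h) = hᵀPh is exactly −‖h‖². A sketch with SciPy:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

C = np.array([[-1.0, 2.0],
              [0.0, -3.0]])          # arbitrary Hurwitz example matrix

# Solve P C + C^T P = -I, as in (489): solve_continuous_lyapunov(A, Q)
# returns X with A X + X A^T = Q, so we pass A = C^T and Q = -I.
P = solve_continuous_lyapunov(C.T, -np.eye(2))

print(np.allclose(P, P.T))               # P symmetric
print(np.all(np.linalg.eigvalsh(P) > 0)) # P positive definite

# Along hdot = C h:  d/dt (h^T P h) = h^T (P C + C^T P) h = -||h||^2
h = np.array([1.0, -2.0])
Udot = h @ (P @ C + C.T @ P) @ h
print(np.isclose(Udot, -(h @ h)))        # -> True
```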
Note that in Corollary (485) the Matching Condition (486) can be replaced by Points 3 and 5 of assumption GC1, namely: there exist two positive continuous functions d₅ and d₇ defined on Π such that, for all (x, p) in ℝⁿ × Π:

‖(∂h/∂x)(x, p) A(x, uₙ(x, p))‖ ≤ d₅(p) max{1, ‖h(x, p)‖³}.  (502)

5.2 Regressor Filtering

When the regressor filtering technique (258)–(261) is applied, by letting:

h̄ = h(x, p̂) + e = Zf p̂ + ξf,  (503)
and using (259) and (261), we get:

(d/dt)h̄ = f(h̄, p̂) + [−ρ e + (f(h̄ − e, p̂) − f(h̄, p̂)) + ((∂h/∂p) + Zf) (d/dt)p̂],  (504)

if the control u is:

u = uₙ(x, p̂).  (505)
And we get:

(d/dt)h̄ = f(h̄, p̂) + [−ρ e + (f(h̄ − e, p̂) − f(h̄, p̂)) + Zf (d/dt)p̂],  (506)

if assumption RMC holds, A does not depend on u and we use:

u = uₙ(x, p̂) + v(x, p̂, (d/dt)p̂),  (507)

where, with notation (13):

(∂h/∂p)(x, p̂) (d/dt)p̂ + (∂h/∂x)(x, p̂) v(x, p̂, (d/dt)p̂) ⊗ b(x) = 0.  (508)
Compared with the equation error filtering case, we have the extra term Zf (d/dt)p̂ in both (504) and (506). However, we know that, by choosing ρ appropriately, Zf is bounded. And, by using a vanishing observation gain, (d/dt)p̂ is absolutely integrable. This latter property being different from what we have in the equation error filtering case, different growth conditions are needed:
Assumption GC2 (Growth Conditions 2)  (509)

There exist three positive continuous functions dᵢ, i = 8, …, 10, with d₈ defined on Π × ℝᵏ and d₉ and d₁₀ defined on Π, and two positive real numbers λ and κ, with λ ≤ 1 and κ known, such that, for all (φ, x, p) in ℝᵏ × ℝⁿ × Π:

1. |(∂U/∂φ)(φ, p) (f(φ − e, p) − f(φ, p))| ≤ d₈(p, e) max{1, U(φ, p)^{λ+κ}} ‖e‖,
2. ‖(∂U/∂φ)(φ, p)‖ ≤ d₉(p) max{1, U(φ, p)^λ},
3. ‖(∂h/∂x)(x, p) A(x, uₙ(x, p))‖ ≤ d₁₀(p) max{1, U(h(x, p), p)^κ}.  (510)

Moreover, if assumption RMC does not hold, there exist two positive continuous functions d₁₁ and d₁₂, defined on Π, such that, for all (φ, x, p) in ℝᵏ × ℝⁿ × Π:

‖(∂U/∂p)(φ, p)‖ ≤ d₁₁(p) …,  ‖(∂h/∂p)(x, p)‖ ≤ d₁₂(p) ….  (511)
Note that there is no constraint on κ besides its existence. We have:

Proposition  (512)

Let assumptions A-LP, RBO, RURS, ICS and GC2 hold in the global case with:

A(x, t) = (∂h/∂x)(x, p̂(t)).  (513)

Choose:

r(x, p) = ρ(x, p)²,  (514)

with:

ρ(x, p) = 1 + ‖(∂h/∂x)(x, p) A(x, uₙ(x, p))‖,  (515)

and, in (262):

(d/dt)M = G(M, Zf, r),  ε₃ M(0) > I,  (516)

where G is a negative symmetric matrix, depending locally-Lipschitz-continuously on its arguments and satisfying (see (262) and (263)):

−ε₄ M Zfᵀ Zf M r ≥ G ≥ −(2 − ε₁) M Zfᵀ Zf M r.  (517)

Assume that A does not depend on u when u is chosen to depend on (d/dt)p̂. Under these conditions, all the solutions (x(t), p̂(t), ξf(t), Zf(t)), M(t) of (S_{p*}), (259), (261), (262), (516) are well-defined on [0, +∞), unique, bounded and:

lim_{t→+∞} h(x(t), p̂(t)) = 0.  (518)
It follows that the Adaptive Stabilization problem is solved if assumption RCO holds also.
Proposition (512) generalizes a result established by Pomet and Praly [22] who choose the state vector x as the observation function h.
Proof. Case: Assumption RMC holds: In this case, we assume also that A(x, u) does not depend on u, i.e., with notation (13), we have:

A(x, u) = A₀(x).  (519)

The system we consider is:

ẋ = a(x, u) + A₀(x) p*
(d/dt)ξf = −ρ(x, p̂) ξf + (∂h/∂x)(x, p̂) a(x, u) + (∂h/∂p)(x, p̂) (d/dt)p̂,  ξf(0) = h(x(0), p̂(0))
(d/dt)Zf = −ρ(x, p̂) Zf + (∂h/∂x)(x, p̂) A₀(x),  Zf(0) = 0
e = Zf p̂ + ξf − h(x, p̂)
(d/dt)p̂ = Proj(M, p̂, −M Zfᵀ r(x, p̂) e),  p̂(0) ∈ Π₁
(d/dt)M = G(M, Zf, r(x, p̂)),  M(0) > 0,  (520)

with r given by (514), ρ given by (515) and:

u = uₙ(x, p̂) + v(x, p̂, (d/dt)p̂),  (521)

where v satisfies (508). Since r does not depend on u, (d/dt)p̂ is explicitly defined. Then, from our smoothness assumptions on the various functions and with Lemma (103), this system has a locally Lipschitz-continuous right-hand side in the open set defined by:

(x, ξf, Zf, p̂, M) ∈ ℝⁿ × ℝᵏ × 𝕄(ℝ) × Π × ℳ.  (522)
It follows that, for any initial condition in this open set, there exists a unique solution (x(t), ξf(t), Zf(t), p̂(t), M(t)), defined on a right maximal interval [0, T), with T possibly infinite, and satisfying (522). Then, as in the proof of Proposition (375), we have, for all t in [0, T):

ε₂ ‖p̂(t) − p*‖² + ε₁ ∫₀ᵗ r ‖e‖² ≤ ε₃ ‖p̂(0) − p*‖² ≝ β²,
‖Zf‖ ≤ 1,  ‖e‖ ≤ √(ε₃/ε₂) ‖p̂(0) − p*‖,  ‖zf‖ ≤ ‖p*‖,
I / (ε₃ + (2 − ε₁) ∫₀ᵗ r) ≤ M(t) ≤ M(0).  (523)

Now, as in the proof of Proposition (440), we look at the time derivative along the solutions of (520) (see also (506)) of Ū defined in (448). From (510) in assumption GC2, (422) in assumption RURS, (514), (515), and (523), we get successively:
(d/dt)Ū ≤ −c Ū + d₉(p̂) d₁₀(p̂) max{1, Ū^λ} (1 + max{1, U(h(x, p̂))^κ}) ‖e‖
  + d₈(p̂, e) max{1, Ū^{λ+κ}} ‖e‖
  + d₉(p̂) max{1, Ū^λ} ‖Zf (d/dt)p̂‖.  (524)

And, with Appendix C, we have the following inequality:

max{1, U(h(x, p̂), p̂)^κ} ≤ δ [max{1, Ū^κ} + max{1, Ū^{λκ}} (‖e‖^κ + ‖e‖^{κ/(1−λ)})]  (525)

for some constant δ ≥ 1 depending only on d₉(p). Since the functions dᵢ, i = 8, …, 10, are continuous and e and p̂ are bounded from (523), there exists a constant k depending only on the initial conditions such that:

(d/dt)Ū ≤ −c Ū + k (1 + Ū^λ) [(1 + Ū^κ) ‖e‖] (‖e‖^{…} + ‖e‖^{…}) + k (1 + Ū^λ) ‖(d/dt)p̂‖.  (526)
Praly, Bastin, Pomet, and Jiang
Finally, since r satisfies (514), e is bounded, and from GC2 we have λ < 1, we get more simply:

(527)

We may now apply Lemma (583) in Appendix B with:

X = Û   (528)

and:

θ₁ = k ‖e‖ + 2k √r ‖e‖ ,   ζ₁ = 2 ,   S₁₁ = … ,
w₁ = 2k √r ‖e‖ ,   ξ₁ = 2 ,   S₂₁ = 4k² … ,
w₂ = k (√r ‖e‖)^λ ,   ξ₂ = 2/λ ,   S₂₂ = … ,   S₂₃ = k² k₁(p*, p̂(0)) .   (529)
With property 3 of U in assumption RURS, it follows that there exists a constant, depending only on the initial conditions, bounding Û(t) for all t in [0, T).   (530)
Then, with (523), this implies the existence of a constant α, depending only on the initial conditions, such that, for all t in [0, T):

‖h(x(t), p̂(t))‖ ≤ α   (531)

‖p̂̇(t)‖ + ‖e(t)‖ ≤ α .   (532)
We also know from (447) that p̂(t) ∈ 𝒦, where 𝒦 is the following compact subset of Π:

𝒦 := { p : ‖p − p*‖ ≤ … } ∩ Π₁ .   (533)
With assumption RBO, this proves the existence of a compact subset Γ of ℝⁿ such that:

x(t) ∈ Γ   ∀t ∈ [0, T) .   (534)

Finally, with the continuity of the functions r and h and the fact that:

Z_f p̂ − e = h(x, p̂) − ξ_f ,   (535)
we have established that the solution remains in a compact subset of the open set defined in (522). It follows by contradiction that T = +∞ and that the time functions x(t), p̂(t), u(t) and e(t) are bounded on [0, +∞). From here, we conclude the proof exactly as in the proof of Proposition (440).
Case: Assumption RMC does not hold: The only difference with the previous case is that we use:

u = u_n(x, p̂)   (536)

instead of (521), and the fact that U may depend on p. Hence, everything remains the same up to, but not including, equation (524). To get the equivalent of (524), we have to evaluate the time derivative of:

Û(t) = U(h(x(t), p̂(t)), p̂(t))   (537)
along the solutions of (504). We get successively:

Û̇ ≤ −cÛ + d₉(p̂) d₁₀(p̂) max{1, Û^λ} (1 + max{1, U(h(x, p̂))^λ}) ‖e‖ + d₉(p̂) d₁₁(p̂) max{1, Û^λ} max{1, U(h(x, p̂))^{1−λ}} ‖p̂̇‖ .   (538)
With Appendix C, we get similarly to (525):

max{1, U(h(x, p̂) − e, p̂)^{1−λ}} ≤ δ max{1, Û^{1−λ}} + δ [max{1, Û^{λ(1−λ)}} (‖e‖^{1−λ} + ‖e‖)] .   (539)
And, since 0 ≤ λ < 1 implies:

λ(2 − λ) < 1 ,   (540)

we have also:

(541)
As in the previous case, this implies the existence of a constant k depending only on the initial conditions such that:

(542)

We can now apply Lemma (583) in Appendix B with:

X = Û   (543)
and:

θ₁ = k (‖e‖ + √r ‖e‖) ,   ζ₁ = 2 ,   S₁₁ = 2k² … ,
w₁ = k (‖e‖ + √r ‖e‖) ,   ξ₁ = 2 ,   S₂₁ = 2k² … ,
w₂ = k (√r ‖e‖)^λ ,   ξ₂ = 2/λ ,   S₂₂ = 2k² … ,   S₂₃ = k² k₁(p*, p̂(0)) .   (544)

From here, we conclude as in the previous case. □
Again, in practice, to apply Proposition (512), we must be allowed to choose a particular value for the initial condition ξ_f(0), and a vanishing observer gain K ((263) is assumed). However, the main feature of this result together with Proposition (440) is that, when the observation function h is not related to V and assumption RMC is not satisfied, it provides solutions to the Adaptive Stabilization problem under different growth conditions.

Corollary (545): Let, in equation (S_p), the functions a and A be known and affine in u and let Π be an open subset of ℝˡ which satisfies assumption IGS. Assume there exist two known functions: h : ℝⁿ × Π → ℝⁿ of class C², which is a diffeomorphism for each p, and u_n : ℝⁿ × Π → ℝᵐ of class C¹, a positive real number κ and a positive continuous function d₁₀, defined on Π, such that:
1. h satisfies assumption RCO,
2. for all (x, p) in ℝⁿ × Π, we have:

‖(∂h/∂x)(x, p) A(x, u_n(x, p))‖ ≤ d₁₀(p) max{1, ‖h(x, p)‖^{2κ}} ,   (546)
3. by letting:

ŷ = h(x, p) ,   (547)

the time derivative of ŷ along the solutions of (S_p) with u = u_n satisfies:

ŷ̇ = C ŷ ,   (548)

where C is an n × n matrix satisfying:

P C + Cᵀ P = −I ,   (549)

with P a symmetric positive definite matrix. Under these conditions, if
either the function A does not depend on u and there exists a known function:

v : ℝⁿ × Π × ℝᵏ → ℝᵐ of class C¹ ,   (550)

such that, for all (x, p, θ) in ℝⁿ × Π × ℝᵏ, we have:

(∂h/∂x)(x, p) [A₀(x) θ + b(x) v(x, p, θ)] = 0 ,   (551)

or there exists a positive continuous function d₁₁, defined on Π, such that, for all (x, p) in ℝⁿ × Π:

‖(∂h/∂p)(x, p)‖ ≤ d₁₁(p) max{1, ‖h(x, p)‖²} ,   (552)
then the Adaptive Stabilization problem is solved by the dynamic controller consisting of (259), (261), (262), (516), and:

u = u_n(x, p̂) + v(x, p̂, …)   (553)

if v exists, or:

u = u_n(x, p̂)   (554)
if not.

Proof. This proof follows exactly the same lines as the proof of Corollary (485). In particular, the above growth conditions are nothing but GC2 with:

λ = … ,   U(h) = hᵀ P h   and   f(h, p) = C h .   (555) □
This Corollary (545) should be compared with the result of Nam and Arapostathis in [16]. In the same context of adaptive feedback linearization, they propose the same dynamic controller except that:
1. the Matching Condition (551) is not assumed,
2. they do not restrict the choice of Γ_r(0),
3. the observer gain is not allowed to go to zero, namely, (263) does not hold,
4. ρ is kept constant,
5. finally, r is given by:

r = 1 + ‖(∂h/∂x)(x, p) A(x, u_n(x, p))‖ ,   (556)

i.e., they impose … ≡ 1. As a consequence, Nam and Arapostathis get a solution to the Adaptive Stabilization problem under more restrictive Growth Conditions. Namely, instead of (546), they assume:

‖(∂h/∂x)(x, p) A(x, u_n(x, p))‖ ≤ d₁₀(p) max{1, ‖h(x, p)‖} ,   (557)

and, instead of (552), they have:

‖(∂h/∂p)(x, p)‖ ≤ d₁₁(p) max{1, ‖h(x, p)‖} .   (558)
6 Conclusion

In this paper, we have given a unified and generalizing overview of most (not all!) of the presently proposed approaches to stabilize an equilibrium point of a nonlinear system described by a differential equation containing unknown parameters. Table 3 summarizes our results.

A key assumption is the fact that the right-hand side of the differential equation describing the system depends linearly on these unknown parameters, or at least on those actually needed for the control (see Example (49)). To meet this assumption, called A-Linear Parameterization (A-LP), we have mentioned the fact that it may be useful not to work with the a priori given coordinates and parameterization: a parameter-dependent change of coordinates and a reparameterization, i.e., a transformation of the parameters, are allowed (see [14] and [29]). Also, Middleton and Goodwin [15], Pomet and Praly [22] and Slotine and Li [30], for example, have shown that the proposed results extend in some cases to a more general case called Implicit Linear Parameterization in [19]. Finally, Mareels, Penfold and Evans [13] have shown that it is also possible to follow a nonparametric approach to solve the problem of stabilizing an equilibrium point of a system whose dynamics are not completely known.

Another important assumption is assumption PRS and its more restrictive versions URS and RURS. It guarantees not only that the system is stabilizable, but also that a parameter-dependent control law for this stabilization is known. It follows that the only problem addressed here is: how can we use this control law when the parameters are unknown, i.e., how can we make this control law adaptive?
Table 3 shows the ten routes we have studied for designing and analyzing adaptive controllers of nonlinear systems. It shows the interplay between structural assumptions, control design techniques and estimation algorithms. In particular, it emphasizes the fact that not every stabilizing control law can be made adaptive: it has to give the closed-loop system properties which differ depending on which control design and estimation technique is used. However, a very desirable general property is that the so-called Matching Condition (MC) can be satisfied.

Implicit in all that has been presented was the assumption that the state is completely measured. It follows that the problem we are addressing is a very particular case of the Error Feedback Regulator problem stated by Isidori [6, Sect. 7.2]. But thanks to this particularity, we have solved the problem under less restrictive assumptions. It is also worth mentioning that the results established by Kanellakopoulos, Kokotovic and Middleton [10] lead us to expect that relaxation of the assumptions is also possible in some cases when the state is not completely measured.

Appendices

A: Proof of Lemma (103)

In this proof, we denote by S the following open subset of Π × ℝˡ:
S = { (p, y) : P(p) > 0 and (∂P/∂p)(p) y > 0 } .   (559)

Proj(M, p, y) differs from y if and only if (p, y) belongs to S.
Point 1: We make the following preliminary remark: since 0 ≤ P(p) and M ∈ 𝓜 imply:

(∂P/∂p)(p) M (∂P/∂p)ᵀ(p) > 0 ,   (560)

and P is a twice continuously differentiable function,
1. the function Proj(M, p, y) is continuously differentiable in the set 𝓜 × S,
2. Proj(M, p, y) tends to y as P(p) or (∂P/∂p)(p) y tends to 0,
3. for any compact subset C of 𝓜 × { (p, y) : P(p) ≥ 0 and (∂P/∂p)(p) y ≥ 0 }, there exists a constant k_C bounding the Jacobian matrix:

‖∇Proj(M, p, y)‖ ≤ k_C   ∀(M, p, y) ∈ C .   (561)
Table 3. Summary of results. The table lists, for each of the ten routes, the fundamental assumptions (e.g., V independent of p, MC, RMC, h not related to V, RBO and RURS), the additional assumptions (CO, RCO, ICS, and the growth conditions GC (375), (380), (438)-(439), (510)-(511)), and the estimation algorithm used.

Abbreviations:
BO : Boundedness Observability
CO : Convergence Observability
EEF : Equation Error Filtering
GC : Growth Condition
MC : Matching Condition
R : Refined
A-LP : A-Linear Parameterization
ICS : Imbedded Convex Sets
PRS : Pointwise Reduced-order Stabilizability
RF : Regressor Filtering
URS : Uniform Reduced-order Stabilizability
Now, let (M₁, p₁, y₁) and (M₀, p₀, y₀) be two points such that, for any σ in [0, 1], the point (M_σ, p_σ, y_σ) is in the set 𝓜 × Π × ℝˡ, with:

M_σ = σ M₁ + (1 − σ) M₀ ,   p_σ = σ p₁ + (1 − σ) p₀ ,   y_σ = σ y₁ + (1 − σ) y₀ .   (562)

Four cases must be distinguished:

Case 1: (p₁, y₁) and (p₀, y₀) are not in S. Then, we have trivially:

‖Proj(M₁, p₁, y₁) − Proj(M₀, p₀, y₀)‖ = ‖y₁ − y₀‖ .   (563)
Case 2: For all σ in [0, 1], (p_σ, y_σ) lies in S. Then, from the above preliminary remark and the Mean Value Theorem, we get:

‖Proj(M₁, p₁, y₁) − Proj(M₀, p₀, y₀)‖ ≤ k [‖M₁ − M₀‖ + ‖p₁ − p₀‖ + ‖y₁ − y₀‖]   (564)

with the constant k given by (561).

Case 3: Say (p₀, y₀) belongs to S but (p₁, y₁) does not. Then, we define σ* by:

σ* = min { σ ∈ [0, 1] : (p_σ, y_σ) ∉ S } .   (565)

Since S is open, all the points of the segment [(M₀, p₀, y₀), (M_{σ*}, p_{σ*}, y_{σ*})) have their (p, y) component lying in S. But (p_{σ*}, y_{σ*}) is not in S. With (561) and (562), this implies:

‖Proj(M_{σ*}, p_{σ*}, y_{σ*}) − Proj(M₀, p₀, y₀)‖ ≤ k [‖M₁ − M₀‖ + ‖p₁ − p₀‖ + ‖y₁ − y₀‖] .   (566)

We also have:

‖Proj(M₁, p₁, y₁) − Proj(M_{σ*}, p_{σ*}, y_{σ*})‖ = ‖y₁ − y_{σ*}‖ ≤ ‖y₁ − y₀‖ .   (567)

This yields:

‖Proj(M₁, p₁, y₁) − Proj(M₀, p₀, y₀)‖ ≤ (1 + k) [‖M₁ − M₀‖ + ‖p₁ − p₀‖ + ‖y₁ − y₀‖] .   (568)
Case 4: Finally, when both (p₁, y₁) and (p₀, y₀) belong to S, but there are some σ in (0, 1) for which (p_σ, y_σ) is not in S, we define σ* as above and let:

β* = max { σ ∈ [0, 1] : (p_σ, y_σ) ∉ S } .   (569)

Then all the points of the segments [(M₀, p₀, y₀), (M_{σ*}, p_{σ*}, y_{σ*})) and ((M_{β*}, p_{β*}, y_{β*}), (M₁, p₁, y₁)] have their (p, y) component in S. But the points (p_{σ*}, y_{σ*}) and (p_{β*}, y_{β*}) are not in S. With (561), we get:

‖Proj(M₁, p₁, y₁) − Proj(M_{β*}, p_{β*}, y_{β*})‖ + ‖Proj(M_{σ*}, p_{σ*}, y_{σ*}) − Proj(M₀, p₀, y₀)‖ ≤ 2k [‖M₁ − M₀‖ + ‖p₁ − p₀‖ + ‖y₁ − y₀‖] .   (570)

The conclusion follows, since we have trivially:

Proj(M_{β*}, p_{β*}, y_{β*}) − Proj(M_{σ*}, p_{σ*}, y_{σ*}) = y_{β*} − y_{σ*} .   (571)
Point 2: For (p, y) not in S the inequality of point 2 is trivial. If (p, y) is in S, a direct computation gives:

Proj(M, p, y)ᵀ M⁻¹ Proj(M, p, y) = yᵀ M⁻¹ y − P(p) (2 − P(p)) ((∂P/∂p)(p) y)² / ((∂P/∂p)(p) M (∂P/∂p)ᵀ(p)) .   (572)

The conclusion follows since, by definition, for all (p, y) in S with p in Π₁, we have:

P(p) (2 − P(p)) > 0 .   (573)
Point 3: For any p satisfying P(p) ≥ 0, let q be the orthogonal projection of p* on the hyperplane through p and orthogonal to (∂P/∂p)ᵀ(p), i.e.,

q = p* − ((∂P/∂p)(p) (p* − p) / ‖(∂P/∂p)(p)‖²) (∂P/∂p)ᵀ(p) .   (574)

Since P is a convex function we have:

P(q) ≥ P(p) + (∂P/∂p)(p) (q − p)   (575)
≥ P(p) ≥ 0 .   (576)

It follows that q is not an interior point of the set Π₀ and, therefore, ‖q − p*‖ is larger than or equal to D*, the distance from p* to the boundary of the closed set Π₀ = { p : P(p) ≤ 0 }, i.e., we have:

((∂P/∂p)(p) (p − p*))² / ‖(∂P/∂p)(p)‖² ≥ D*² .   (577)

The conclusion follows from the fact that, p* being an interior point of the set Π₀ (see (95) and (93)), the Basic Separation Hahn-Banach Theorem [4, Theorem V.1.12] implies:

(∂P/∂p)(p) (p − p*) > 0   ∀p : P(p) ≥ 0 .   (578)
Point 4: Again, point 4 is clear for all (p, y) not in S. And, for (p, y) in S, we get, using point 3:

(p − p*)ᵀ M⁻¹ Proj(M, p, y) = (p − p*)ᵀ M⁻¹ y − P(p) ((∂P/∂p)(p) y) ((∂P/∂p)(p) (p − p*)) / ((∂P/∂p)(p) M (∂P/∂p)ᵀ(p)) ≤ (p − p*)ᵀ M⁻¹ y .   (579)
Point 5: Let us compute the time derivative of P(p̂(t)) along a solution of (104). We get:

d/dt P(p̂(t)) = (∂P/∂p)(p̂(t)) y(t)   if P(p̂(t)) ≤ 0 or (∂P/∂p)(p̂(t)) y(t) ≤ 0 ,
d/dt P(p̂(t)) = (∂P/∂p)(p̂(t)) y(t) (1 − P(p̂(t)))   if P(p̂(t)) > 0 and (∂P/∂p)(p̂(t)) y(t) > 0 .   (580)

Therefore, we have:

d/dt P(p̂(t)) ≤ 0   if   P(p̂(t)) ≥ 1 .   (581)

Since the initial condition satisfies:

P(p̂(0)) ≤ 1 ,   (582)

a continuity argument proves that the same holds for all t where p̂(t) is defined. □
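The projection operator analyzed above can be sketched numerically. The closed-form expression below is an assumption reconstructed from the properties proved in Points 1-5 (in particular (572) and (580)); the functions `P`, `gradP` and all test values are our own illustrative choices, not data from the text.

```python
# Hypothetical sketch of the smooth parameter projection: outside the set S
# of (559) it returns y unchanged; inside S it subtracts a correction scaled
# by P(p), which vanishes as P(p) or (dP/dp)y tends to 0 (Point 1, item 2).

def proj(M, p, y, P, gradP):
    """Return Proj(M, p, y) for gain matrix M (list of lists) and lists p, y."""
    g = gradP(p)
    n = len(p)
    gy = sum(g[i] * y[i] for i in range(n))
    if P(p) <= 0.0 or gy <= 0.0:
        return list(y)                       # (p, y) not in S: Proj = y
    Mg = [sum(M[i][j] * g[j] for j in range(n)) for i in range(n)]
    gMg = sum(g[i] * Mg[i] for i in range(n))
    scale = P(p) * gy / gMg                  # vanishes on the boundary of S
    return [y[i] - scale * Mg[i] for i in range(n)]
```

Along p̂̇ = Proj(M, p̂, y) this formula gives d/dt P(p̂) = ((∂P/∂p) y)(1 − P(p̂)) in the active region, so the invariance of P ≤ 1 in (581)-(582) holds.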
B: A Useful Technical Lemma

Lemma: (see also [3]) (583) Let X be a C¹ time function defined on [0, T) (0 < T ≤ +∞), satisfying:

Ẋ ≤ −c X + Σᵢ θᵢ(t) X(t) + Σⱼ wⱼ(t) ,   (584)

where c is a strictly positive constant, Σᵢ and Σⱼ are finite sums and the θᵢ and wⱼ are positive time functions satisfying:

∫₀ᵀ θᵢ^{ζᵢ} ≤ S₁ᵢ   and   ∫₀ᵀ wⱼ^{ξⱼ} ≤ S₂ⱼ ,   (585)

where ζᵢ ≥ 1 and ξⱼ ≥ 1. Under this assumption, X(t) is bounded from above on [0, T) and, precisely:

X(t) ≤ K₁ X(0) + K₂   ∀t ∈ [0, T) ,   (586)

with K₁ and K₂ depending only on ζᵢ, ξⱼ, S₁ᵢ and S₂ⱼ. Moreover, if T is infinite then:

limsup_{t→+∞} X(t) ≤ 0 .   (587)
Proof. This is a straightforward consequence of a known result on differential inequalities. From (584), one derives (see [5, Theorem 1.6.1]):

X(t) ≤ exp(−c t + Σᵢ ∫₀ᵗ θᵢ) X(0) + ∫₀ᵗ exp(−c (t − r) + Σᵢ ∫ᵣᵗ θᵢ) Σⱼ wⱼ(r) dr .   (588)

Let σᵢ > 1 and ηⱼ > 1 be defined by:

1/ζᵢ + 1/σᵢ = 1   and   1/ξⱼ + 1/ηⱼ = 1 .   (589)

Inequalities (585) and the Hölder Inequality yield, for any positive t and r, t > r:

∫ᵣᵗ θᵢ ≤ (t − r)^{1/σᵢ} S₁ᵢ^{1/ζᵢ} .   (591)
Similarly, we get:

∫₀ᵗ exp(−c (t − r)) wⱼ(r) dr ≤ (1/(c ηⱼ))^{1/ηⱼ} S₂ⱼ^{1/ξⱼ} .   (592)

Then, let us note that the function:

f(s) = −(c/2) s + Σᵢ s^{1/σᵢ} S₁ᵢ^{1/ζᵢ}   (593)

is well-defined and continuous on [0, +∞), with:

f(0) = 0   and   f(+∞) = −∞ .   (594)

This implies the existence of a constant K₁ depending only on S₁ᵢ and σᵢ such that, for all 0 ≤ r ≤ t:

exp(−(c/2)(t − r) + Σᵢ (t − r)^{1/σᵢ} S₁ᵢ^{1/ζᵢ}) ≤ K₁ .   (595)

With this bound, (588) and (592), we have established:

X(t) ≤ K₁ exp(−(c/2) t) X(0) + K₁ Σⱼ ∫₀ᵗ exp(−(c/2)(t − r)) wⱼ(r) dr .   (596)

The proof is completed by noticing that (585) implies that, if T = +∞, we have:

lim_{t→+∞} ∫₀ᵗ exp(−(c/2)(t − r)) wⱼ(r) dr = 0 .   (597) □
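As a numerical illustration of Lemma (583) (a sketch, not part of the proof), one can integrate a scalar instance of the differential inequality (584) with equality and observe the boundedness (586) and decay (587). The perturbations `theta` and `w` below are arbitrary choices whose squares have finite integrals, as (585) requires.

```python
# Forward-Euler integration of dX/dt = -c X + theta(t) X + w(t), the scalar
# equality case of (584).  With integrable-power theta and w, X stays bounded
# and eventually decays, as Lemma (583) asserts.

def simulate(c, X0, theta, w, T=50.0, dt=1e-3):
    X, t, Xmax = X0, 0.0, X0
    for _ in range(int(T / dt)):
        X += dt * (-c * X + theta(t) * X + w(t))
        t += dt
        Xmax = max(Xmax, X)
    return X, Xmax
```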
C: An Inequality

Lemma: (598) Let U : ℝᵏ → ℝ₊ be a C¹ function such that:

‖(∂U/∂h)(h)‖ ≤ d max{1, U(h)^λ} ,   (599)

with 0 ≤ λ < 1 and d a positive constant. For any positive real number γ, there exists a constant δ such that, for all (h, e) in ℝᵏ × ℝᵏ, we have:

max{1, U(h − e)^γ} ≤ δ max{1, U(h)^γ} + δ [max{1, U(h)^{λγ}} (‖e‖^{γ/(1−λ)} + ‖e‖)] .   (600)
Proof. Let us define a function W as follows:

W(h) = max{1, U(h)^{1−λ}} .   (601)

This function is of class C¹ in the open set { h : U(h) > 1 } with:

‖(∂W/∂h)(h)‖ ≤ d (1 − λ) .   (602)

Then, for all (h, e) in ℝᵏ × ℝᵏ, we have:

W(h − e) − W(h) ≤ d (1 − λ) ‖e‖ .   (603)

This is proved by breaking the segment [h − e, h] into pieces depending on whether U is larger than 1 or not and by noting that W(h − e) − W(h) is equal to the sum of the variations of W on the two extreme pieces only (see the proof of Point 1 in Appendix A). This yields:

max{1, U(h − e)^γ} − max{1, U(h)^γ} ≤ (W(h) + d (1 − λ) ‖e‖)^{γ/(1−λ)} − W(h)^{γ/(1−λ)} .   (604)

Let us now note that, for all τ ≥ 0, the function:

f(x) = ((1 + x)^τ − x^τ) / (x^τ + x + 1)   (605)

is positive, well-defined and continuous on [0, +∞), with:

f(0) = 1   and   f(+∞) = 0 .   (606)

This implies the existence of a constant K depending only on γ and λ such that:

max{1, U(h − e)^γ} − max{1, U(h)^γ} ≤ K [W(h)^{γ/(1−λ) − 1} (d (1 − λ) ‖e‖) + (d (1 − λ) ‖e‖)^{γ/(1−λ)}] .   (607)

The conclusion follows readily. □
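The Lipschitz bound (603) can be checked numerically for a concrete U. The choice U(h) = 1 + ‖h‖² below is our own example, not from the text; it satisfies the gradient condition with d = 2 and λ = 1/2, so (603) predicts W(h − e) − W(h) ≤ ‖e‖.

```python
import math, random

# Random sampling check of (603) for U(h) = 1 + ||h||^2, for which
# ||dU/dh|| = 2||h|| <= 2 U(h)**0.5, i.e. d = 2 and lam = 0.5.

def U(h):
    return 1.0 + sum(x * x for x in h)

def W(h, lam=0.5):
    return max(1.0, U(h) ** (1.0 - lam))     # W as defined in (601)

def worst_slack(trials=1000, d=2.0, lam=0.5, dim=3):
    """Largest violation of W(h-e) - W(h) <= d (1-lam) ||e|| over random pairs."""
    rng = random.Random(0)
    worst = -float("inf")
    for _ in range(trials):
        h = [rng.uniform(-5.0, 5.0) for _ in range(dim)]
        e = [rng.uniform(-5.0, 5.0) for _ in range(dim)]
        he = [a - b for a, b in zip(h, e)]
        norm_e = math.sqrt(sum(x * x for x in e))
        worst = max(worst, W(he, lam) - W(h, lam) - d * (1.0 - lam) * norm_e)
    return worst
```

A non-positive result (up to rounding) is consistent with (603).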
References

1. B. d'Andréa-Novel, J.-B. Pomet, and L. Praly, "Adaptive stabilization for nonlinear systems in the plane," Proc. 11th IFAC World Congress, Tallinn, 1990.
2. G. Campion and G. Bastin, "Indirect adaptive state feedback control of linearly parametrized nonlinear systems," Int. J. Adapt. Control Sig. Proc., vol. 4, pp. 345-358, Sept. 1990.
3. C. A. Desoer and M. Vidyasagar, Feedback Systems: Input-Output Properties, Academic Press, 1975.
4. N. Dunford and J. T. Schwartz, Linear Operators, Part I: General Theory, Interscience Publishers, 1957.
5. J. K. Hale, Ordinary Differential Equations, Wiley-Interscience, 1969.
6. A. Isidori, Nonlinear Control Systems, 2nd ed., Springer-Verlag, 1989.
7. T. Kailath, Linear Systems, Prentice Hall, 1980.
8. I. Kanellakopoulos, P. V. Kokotovic, and R. Marino, "Robustness of adaptive nonlinear control under an extended matching condition," Prepr. IFAC Symp. Nonlinear Control Syst. Design, pp. 192-197, Capri, Italy, 1989.
9. I. Kanellakopoulos, P. V. Kokotovic, and R. Marino, "An extended direct scheme for robust adaptive nonlinear control," Automatica, to appear, March 1991.
10. I. Kanellakopoulos, P. V. Kokotovic, and R. H. Middleton, "Observer-based adaptive control of nonlinear systems under matching conditions," Proc. 1990 Amer. Control Conf., pp. 549-555, San Diego, CA, May 1990.
11. V. Lakshmikantham and S. Leela, Differential and Integral Inequalities: Theory and Applications, Volume I: Ordinary Differential Equations, Academic Press, 1969.
12. I. D. Landau, Adaptive Control: The Model Reference Approach, Control and Systems Theory, vol. 8, Dekker, 1979.
13. I. M. Y. Mareels, H. B. Penfold, and R. J. Evans, "Controlling nonlinear time-varying systems via Euler approximations," Technical Report EE 8939, Dept. of Electrical Engineering and Computer Science, University of Newcastle, Australia, Oct. 1989.
14. H. Mayeda, K. Osuka, and A. Kangawa, "A new identification method for serial manipulator arms," Proc. 9th IFAC World Congress, Sect. 08.2/B-3, July 1984.
15. R. H. Middleton and G. C. Goodwin, "Adaptive computed torque control for rigid link manipulators," Syst. Control Lett., vol. 10, no. 1, pp. 9-16, 1988.
16. K. Nam and A. Arapostathis, "A model-reference adaptive control scheme for pure-feedback nonlinear systems," IEEE Trans. Aut. Control, vol. 33, pp. 803-811, Sept. 1988.
17. K. S. Narendra and L. S. Valavani, "A comparison of Lyapunov and hyperstability approaches to adaptive control of continuous systems," IEEE Trans. Aut. Control, April 1980.
18. P. C. Parks, "Lyapunov redesign of model reference adaptive control systems," IEEE Trans. Aut. Control, vol. AC-11, pp. 362-367, 1966.
19. J.-B. Pomet, Sur la commande adaptative des systèmes non linéaires, Thèse de l'École des Mines de Paris en Mathématiques et Automatique, 1989.
20. J.-B. Pomet and L. Praly, "Indirect adaptive nonlinear control," Proc. 27th IEEE Conf. Dec. Control, pp. 2414-2415, Austin, TX, Dec. 1988.
21. J.-B. Pomet and L. Praly, "Adaptive nonlinear regulation: equation error from the Lyapunov equation," Proc. 28th IEEE Conf. Dec. Control, pp. 1008-1013, Tampa, FL, Dec. 1989.
22. J.-B. Pomet and L. Praly, "Adaptive nonlinear control: an estimation-based algorithm," in New Trends in Nonlinear Control Theory, J. Descusse, M. Fliess, A. Isidori and D. Leborgne, Eds., Springer-Verlag, Berlin, 1989.
23. J.-B. Pomet and L. Praly, "Adaptive non-linear stabilization: estimation from the Lyapunov equation," CAI Report 232, submitted for publication, Feb. 1990.
24. J.-B. Pomet and L. Praly, "Adaptive nonlinear control of feedback equivalent systems," Proc. 9th Int. Conf. Analysis and Optimization of Systems, Antibes, France, June 1990.
25. J.-B. Pomet and I. Kupka, "Feedback equivalence of a parametrized family of control systems," in preparation, Department of Mathematics, University of Toronto, Canada, 1990.
26. V. M. Popov, Hyperstability of Control Systems, Springer-Verlag, 1973.
27. L. Praly, B. d'Andréa-Novel, and J.-M. Coron, "Lyapunov design of stabilizing controllers for cascaded systems," IEEE Trans. Aut. Control, to appear; see also Proc. 28th IEEE Conf. Dec. Control, Tampa, FL, Dec. 1989.
28. S. S. Sastry and P. V. Kokotovic, "Feedback linearization in the presence of uncertainties," Int. J. Adapt. Control Sig. Proc., vol. 2, pp. 327-346, 1988.
29. S. S. Sastry and A. Isidori, "Adaptive control of linearizable systems," IEEE Trans. Aut. Control, vol. 34, pp. 1123-1131, Nov. 1989.
30. J.-J. E. Slotine and W. Li, "Adaptive manipulator control: a case study," IEEE Trans. Aut. Control, vol. 33, pp. 995-1003, Nov. 1988.
31. D. G. Taylor, P. V. Kokotovic, R. Marino, and I. Kanellakopoulos, "Adaptive regulation of nonlinear systems with unmodeled dynamics," IEEE Trans. Aut. Control, vol. 34, pp. 405-412, April 1989.
Adaptive Nonlinear Control of Induction Motors via Extended Matching*

Riccardo Marino,¹ Sergei Peresada,² and Paolo Valigi¹

¹ Seconda Università di Roma, Dipartimento di Ingegneria Elettronica, Via O. Raimondo, 00173 Roma, Italy.
² Kiev Polytechnical Institute, Department of Electrical Engineering, Prospect Pobedy 37, Kiev 252056, USSR.
Abstract. A nonlinear adaptive state-feedback input-output linearizing control is designed for a fifth-order model of an induction motor which includes both electrical and mechanical dynamics under the assumption of linear magnetic circuits. The control algorithm contains a nonlinear identification scheme which asymptotically tracks the true values of the load torque and rotor resistance, which are assumed to be constant but unknown. Once those parameters are identified, the two control goals of regulating rotor speed and rotor flux amplitude are decoupled. Full state measurements are required.

1 Introduction

In the last decade, significant advances have been made in the theory of nonlinear state-feedback control (see [1] and [2] for a comprehensive introduction to nonlinear geometric control): in particular, feedback linearization and input-output decoupling techniques have proved useful in applications. More recently the problems of feedback linearization and input-output linearization have been generalized allowing for some parameters not to be known [3,4,5,6]. In this paper we address the problem of adaptive speed regulation for induction motors with load torque and rotor resistance being unknown but constant parameters. Nonadaptive input-output decoupling controls were presented in [7,8,9] using geometric techniques. We develop an adaptive version of the controller presented in [9], assuming that load torque and rotor resistance are unknown parameters. A preliminary version of this work can be found in [10].

The paper is organized as follows. In Section 2 a fifth-order state-space model of an induction motor, which includes both electrical and mechanical dynamics, is given. In Section 3 previous control schemes are reviewed and it is shown that field oriented control can be viewed as a feedback transformation which achieves asymptotic input-output decoupling and linearization.
In Section 4, following the results presented in [8,6], we develop an adaptive version of the exact decoupling and linearizing control given in [9] which covers the more realistic situation in which the load

* We would like to thank Prof. A. Bellini for providing us the data of the motor and for useful discussions. This work was supported in part by Ministero della Università e della Ricerca Scientifica e Tecnologica (fondi 40%).
torque and the rotor resistance are not known. We present a second-order nonlinear identification scheme which asymptotically tracks the correct value of the load torque and, when the electric torque is different from zero, the correct value of the rotor resistance as well. The adaptive state-feedback linearizing control achieves full decoupling in speed and rotor flux magnitude regulation as soon as the identification scheme has converged to the true parameter values. This is shown in Section 5, where the effects of parameter uncertainties are analyzed and some simulation results illustrate the performance of the proposed control algorithm, which has some advantages over the classical scheme of field oriented control: with a comparable complexity, two critical parameters are identified and exact decoupling is achieved.
2 Induction Motor Model
The reader is referred to [11] and [12] for the general theory of electric machines and induction motors and to [13] for related control problems. The symbols used and their meaning are collected in the Appendix. An induction motor is made up of three stator windings and three rotor windings. Krause [14] introduced a two-phase equivalent machine representation of an induction motor with two rotor windings and two stator windings. The dynamics of an induction motor under the assumptions of equal mutual inductances and linear magnetic circuit are given by the fifth-order model:

dω/dt = (n_p M/(J L_r)) (ψ_ra i_sb − ψ_rb i_sa) − T_L/J
dψ_ra/dt = −(R_r/L_r) ψ_ra − n_p ω ψ_rb + (R_r/L_r) M i_sa
dψ_rb/dt = n_p ω ψ_ra − (R_r/L_r) ψ_rb + (R_r/L_r) M i_sb
di_sa/dt = (M R_r/(σ L_s L_r²)) ψ_ra + (n_p M/(σ L_s L_r)) ω ψ_rb − γ i_sa + (1/(σ L_s)) u_sa
di_sb/dt = −(n_p M/(σ L_s L_r)) ω ψ_ra + (M R_r/(σ L_s L_r²)) ψ_rb − γ i_sb + (1/(σ L_s)) u_sb
(1)

where i, ψ, u_s denote current, flux linkage and stator voltage input to the machine; the subscripts s and r stand for stator and rotor; (a, b) denote the components of a vector with respect to a fixed stator reference frame and σ = 1 − M²/(L_s L_r).
From now on we will drop the subscripts r and s since we will only use rotor fluxes (ψ_ra, ψ_rb) and stator currents (i_sa, i_sb). Let

x = (ω, ψ_a, ψ_b, i_a, i_b)ᵀ   (2)

be the state vector and let

p = (p₁, p₂)ᵀ = (T_L − T_LN, R_r − R_rN)ᵀ   (3)
be the unknown parameter deviations from the nominal values T_LN and R_rN of load torque T_L and rotor resistance R_r. T_L is typically unknown whereas R_r may have a range of variation of ±50% around its nominal value (see [13, p. 224]) due to rotor heating. Let u = (u_a, u_b)ᵀ be the control vector. Let

α = R_rN/L_r ,  β = M/(σ L_s L_r) ,  γ = (M² R_rN)/(σ L_s L_r²) + R_s/(σ L_s) ,  μ = (n_p M)/(J L_r)

be a reparametrization of the induction motor model, where α, β, γ, μ are known parameters depending on the nominal values. The system (1) can be rewritten in compact form as

ẋ = f(x) + u_a g_a + u_b g_b + p₁ f₁ + p₂ f₂(x) ,   (4)
f(z) =
P ( ¢ a l b - - Cbla)-- ~ '~ --t~ba - nptO~bb"1- otMQ | npWCa OtOb-I- otMib | , - -
(5)
--np/~WCa "1"a 3 ¢ b -- 7ib ]
li)[il [!/ g~=
,
gb=
,
(6)
/ -~¢~+~, °
yl(~) =
, f2(x) =
1
M.
1
M
| -~¢b + ~i
/M /
~
M
(7)
M~ -
3 Induction Motor Control
3.1 Field Oriented Control

A classical control technique for induction motors is field oriented control. First introduced by Blaschke [15,16] in 1971, it involves the transformation of the vectors (i_a, i_b), (ψ_a, ψ_b) in the fixed stator frame (a, b) into vectors in a frame (d, q) which rotates along with the flux vector (ψ_a, ψ_b); if one defines

ρ = arctan(ψ_b/ψ_a) ,   (8)
the transformations are:

( i_d )   [  cos ρ   sin ρ ] ( i_a )
( i_q ) = [ −sin ρ   cos ρ ] ( i_b )   (9)

( ψ_d )   [  cos ρ   sin ρ ] ( ψ_a )
( ψ_q ) = [ −sin ρ   cos ρ ] ( ψ_b ) .   (10)
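The rotations (9)-(10) can be sketched directly; with ρ = arctan(ψ_b/ψ_a), the flux vector maps to (|ψ|, 0), i.e. the d-axis is aligned with the rotor flux and ψ_q = 0. The function names are ours, and `atan2` is used in place of arctan to cover all quadrants.

```python
import math

def ab_to_dq(xa, xb, rho):
    """Rotate a stator-frame vector (a, b) into the flux frame (d, q), cf. (9)."""
    c, s = math.cos(rho), math.sin(rho)
    return (c * xa + s * xb, -s * xa + c * xb)

def dq_to_ab(xd, xq, rho):
    """Inverse rotation, back to the fixed stator frame (a, b)."""
    c, s = math.cos(rho), math.sin(rho)
    return (c * xd - s * xq, s * xd + c * xq)
```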
We now reinterpret field oriented control as a state-feedback transformation (involving a state-space change of coordinates and a nonlinear state feedback) to a control system of simpler structure. If we define the state-space change of coordinates

ω = ω ,  ψ_d = |ψ| = √(ψ_a² + ψ_b²) ,  ρ = arctan(ψ_b/ψ_a) ,
i_d = (ψ_a i_a + ψ_b i_b)/|ψ| ,  i_q = (ψ_a i_b − ψ_b i_a)/|ψ| ,   (11)

and the state feedback

( u_a )           [ cos ρ  −sin ρ ] ( −n_p ω i_q − α M i_q²/ψ_d + v_d )
( u_b ) = σ L_s [ sin ρ   cos ρ ] ( n_p β ω ψ_d + n_p ω i_d + α M (i_d i_q)/ψ_d + v_q ) ,   (12)
the closed-loop system (1), (12) becomes in the new coordinates:

dω/dt = μ ψ_d i_q − T_LN/J
dψ_d/dt = −α ψ_d + α M i_d
di_d/dt = −γ i_d + α β ψ_d + v_d
di_q/dt = −γ i_q + v_q
dρ/dt = n_p ω + α M i_q/ψ_d .
(13)

In other words, the system (1) is transformed into (13) by the feedback transformation (11)-(12). The system (13) has a simpler structure: its flux amplitude dynamics are linear:

dψ_d/dt = −α ψ_d + α M i_d
di_d/dt = −γ i_d + α β ψ_d + v_d ,
(14)
and can be independently controlled by v_d, for instance via a PI controller, as proposed in [13]:

v_d = −k_d1 (ψ_d − ψ_d,ref) − k_d2 ∫₀ᵗ (ψ_d(τ) − ψ_d,ref) dτ .   (15)
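A minimal simulation of the flux subsystem (14) under the PI law (15) illustrates the independent flux loop. All numerical values below (α, M, β, γ and the gains k_d1, k_d2) are arbitrary illustrative choices, not motor data from the paper.

```python
# Forward-Euler sketch of (14) with the PI control (15).  The state z
# integrates the flux error; the chosen gains make the third-order
# closed loop (psi_d, i_d, z) stable, so psi_d converges to psi_ref.

def simulate_flux_loop(psi_ref=1.0, alpha=1.0, M=1.0, beta=0.5, gamma=1.0,
                       kd1=5.0, kd2=2.0, T=30.0, dt=1e-3):
    psi, i_d, z = 0.0, 0.0, 0.0
    for _ in range(int(T / dt)):
        err = psi - psi_ref
        v_d = -kd1 * err - kd2 * z               # PI law (15)
        dpsi = -alpha * psi + alpha * M * i_d    # flux dynamics of (14)
        di = -gamma * i_d + alpha * beta * psi + v_d
        psi, i_d, z = psi + dt * dpsi, i_d + dt * di, z + dt * err
    return psi
```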
When the flux amplitude ψ_d is regulated to the constant reference value ψ_d,ref, rotor speed dynamics are also linear:

dω/dt = μ ψ_d,ref i_q − T_L/J
di_q/dt = −γ i_q + v_q ,
(16)

and can be independently controlled by v_q, for instance by two nested loops of PI controllers, as proposed in [13]:

v_q = −k_q1 (T − T_ref) − k_q2 ∫₀ᵗ (T(τ) − T_ref(τ)) dτ
T_ref = −k_q3 (ω − ω_ref) − k_q4 ∫₀ᵗ (ω(τ) − ω_ref) dτ
T = μ ψ_d i_q .
(17)

If ω and ψ_d are defined as outputs, field oriented control achieves asymptotic input-output linearization and decoupling via the nonlinear state feedback (12), (15), (17): PI controllers are used to counteract parameter variations. During the flux transient the nonlinearity ψ_d i_q in (13) makes the first four equations in (13) still nonlinear and coupled. Flux transients occur when the motor has to be operated above the nominal speed. In this case flux weakening (for instance ψ_ref = k/ω) is required in order to keep the applied voltage within inverter ceiling limits [13, p. 217], and the speed transients of the closed-loop system (13), (15), (17) are difficult to evaluate and may be unsatisfactory. It should also be mentioned that flux measurements, which are required in (12), are difficult to obtain (see [17,18]), even though flux observers from stator currents and rotor speed measurements have been determined [19].

3.2 Input-Output Decoupling
As shown in [9], one can improve field oriented control by achieving exact input-output decoupling and linearization via a nonlinear state-feedback control which is not more complex than (12). We will use the following notation for the directional (or Lie) derivative of a state function φ(x) : ℝⁿ → ℝ along a vector field f(x) = (f₁(x), …, f_n(x)):

L_f φ(x) = Σᵢ₌₁ⁿ (∂φ/∂xᵢ)(x) fᵢ(x) .   (18)
Iteratively we define L_f^i φ = L_f(L_f^{i−1} φ). The outputs to be controlled are ω and ψ_a² + ψ_b². Let us define the change of coordinates

y₁ = φ₁(x) = ω
y₂ = L_f φ₁(x) = μ (ψ_a i_b − ψ_b i_a) − T_LN/J
y₃ = φ₂(x) = ψ_a² + ψ_b²
y₄ = L_f φ₂(x) = −2α (ψ_a² + ψ_b²) + 2αM (ψ_a i_a + ψ_b i_b)
y₅ = arctan(ψ_b/ψ_a) = φ₃(x) ,   (19)

which is one-to-one in Ω = { x ∈ ℝ⁵ : ψ_a² + ψ_b² ≠ 0 } but onto only for y₃ > 0, −90° < y₅ < 90°. The inverse transformation is defined as

ω = y₁
ψ_a = √y₃ cos y₅ ,  ψ_b = √y₃ sin y₅
i_a = (cos y₅/(2αM √y₃)) (y₄ + 2α y₃) − (sin y₅/(μ √y₃)) (y₂ + T_LN/J)
i_b = (sin y₅/(2αM √y₃)) (y₄ + 2α y₃) + (cos y₅/(μ √y₃)) (y₂ + T_LN/J) .   (20)
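The invertibility of (19) on Ω can be checked by composing it with (20). The constants μ, α, M, T_LN, J below are placeholder values, and `atan2` replaces arctan.

```python
import math

# Round-trip check of the change of coordinates (19) and its inverse (20).
# All parameter values are placeholders, not motor data from the text.
MU, ALPHA, M_, TLN, J = 0.8, 1.2, 0.9, 0.5, 2.0

def forward(w, psia, psib, ia, ib):
    y1 = w
    y2 = MU * (psia * ib - psib * ia) - TLN / J
    y3 = psia ** 2 + psib ** 2
    y4 = -2.0 * ALPHA * y3 + 2.0 * ALPHA * M_ * (psia * ia + psib * ib)
    y5 = math.atan2(psib, psia)
    return y1, y2, y3, y4, y5

def inverse(y1, y2, y3, y4, y5):
    w = y1
    r = math.sqrt(y3)
    psia, psib = r * math.cos(y5), r * math.sin(y5)
    c1 = (y4 + 2.0 * ALPHA * y3) / (2.0 * ALPHA * M_)  # = psia*ia + psib*ib
    c2 = (y2 + TLN / J) / MU                           # = psia*ib - psib*ia
    ia = (psia * c1 - psib * c2) / y3                  # solve the 2x2 system
    ib = (psib * c1 + psia * c2) / y3
    return w, psia, psib, ia, ib
```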
The dynamics of the induction motor with nominal parameters are given in the new coordinates by:

ẏ₁ = y₂
ẏ₂ = L_f²φ₁ + L_{g_a}L_fφ₁ u_a + L_{g_b}L_fφ₁ u_b
ẏ₃ = y₄
ẏ₄ = L_f²φ₂ + L_{g_a}L_fφ₂ u_a + L_{g_b}L_fφ₂ u_b
ẏ₅ = L_fφ₃ .
(21)

The first four equations in (21) can be rewritten as

( ẏ₂ )   ( L_f²φ₁ )          ( u_a )
( ẏ₄ ) = ( L_f²φ₂ ) + D(x) ( u_b ) ,   (22)
[Lg~,LlCt LgbL/¢I] D(x) = [Lg~Lf¢2 LgbLJ¢2j Since
[ -#--¢b 2o~¢7 #-'M¢a 2a~ffa ¢~
aM# 2 det [D] = - 2 - - ~ - ( ¢ . + ¢~),
~ Cb
"
(23)
(24)
Induction Motors: Adaptive Nonlinear Control
441
D(x) is nonsingular everywhere in Ω. The terms L_f²φ1 and L_f²φ2 are given by

    L_f²φ1 = −μβ n_p ω (ψ_a² + ψ_b²) − μ(α + γ)(ψ_a i_b − ψ_b i_a)
             − μ n_p ω (ψ_a i_a + ψ_b i_b),
    L_f²φ2 = (4α² + 2α²βM)(ψ_a² + ψ_b²) + 2αM n_p ω (ψ_a i_b − ψ_b i_a)     (25)
             − (6α²M + 2αγM)(ψ_a i_a + ψ_b i_b) + 2α²M²(i_a² + i_b²).

The dynamics of the flux angle y5 = φ3(x) are

    dφ3/dt = dy5/dt = n_p ω + αM (ψ_a i_b − ψ_b i_a)/(ψ_a² + ψ_b²)
           = n_p y1 + (R_rN/(n_p y3)) (J y2 + T_LN).                        (26)

The difference between the flux angular speed φ̇3 and the rotor speed n_p ω is usually called the slip speed ω_s, which can be expressed, recalling the expression of α, as

    ω_s = φ̇3 − n_p ω = (R_rN M/L_r) (ψ_a i_b − ψ_b i_a)/(ψ_a² + ψ_b²)
        = R_rN T/(n_p |ψ|²),                                               (27)

where

    T = (n_p M/L_r)(ψ_a i_b − ψ_b i_a)                                      (28)
represents the electric torque. The input-output linearizing feedback for the system (21) is given by
    (u_a, u_b)ᵀ = D⁻¹(x) ( −L_f²φ1 + v_a , −L_f²φ2 + v_b )ᵀ,               (29)

where v = (v_a, v_b)ᵀ is the new input vector. Substituting the state feedback (29) in (21), the closed-loop dynamics become, in the y-coordinates,

    ẏ1 = y2
    ẏ2 = v_a
    ẏ3 = y4                                                                (30)
    ẏ4 = v_b
    ẏ5 = n_p y1 + (R_rN/(n_p y3)) (J y2 + T_LN).

Equation (26) or (27) represents the dynamics which have been made unobservable from the outputs by the state feedback control (29). In order to track desired reference signals ω_ref(t) and |ψ|²_ref(t) for the speed y1 = ω and the square of the flux modulus y3 = ψ_a² + ψ_b², the input signals v_a and v_b in (29) are designed as

    v_a = −k_a1 (y1 − ω_ref(t)) − k_a2 (y2 − ω̇_ref(t)) + ω̈_ref(t)
    v_b = −k_b1 (y3 − |ψ|²_ref) − k_b2 (y4 − (d/dt)|ψ|²_ref) + (d²/dt²)|ψ|²_ref,   (31)
where (k_a1, k_a2) and (k_b1, k_b2) are constant design parameters to be assigned in order to shape the response of the decoupled, linear second-order systems

    (d²/dt²)(ω − ω_ref) = −k_a1 (ω − ω_ref) − k_a2 (d/dt)(ω − ω_ref)
    (d²/dt²)(|ψ|² − |ψ|²_ref) = −k_b1 (|ψ|² − |ψ|²_ref) − k_b2 (d/dt)(|ψ|² − |ψ|²_ref).   (32)
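A minimal numerical sketch of one decoupled error channel of (32), integrated with a forward-Euler step. The gains used here (k1 = 100, k2 = 20, i.e. a critically damped channel with natural frequency 10 rad/s) are illustrative choices, not values from the text.

```python
# Euler simulation of one channel of (32):  e'' = -k1 e - k2 e'
k1, k2 = 100.0, 20.0      # illustrative: wn = 10 rad/s, zeta = 1
e, de = 1.0, 0.0          # initial regulation error and its derivative
dt = 1e-3
for _ in range(int(2.0 / dt)):          # simulate 2 s
    e, de = e + dt * de, de + dt * (-k1 * e - k2 * de)

# the linear error dynamics drive the regulation error to zero
assert abs(e) < 1e-4 and abs(de) < 1e-3
```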
Remarks

1) The system (30) is input-output decoupled; the input-output mapping is a pair of second-order linear systems. This allows for an independent regulation (or tracking) of the outputs according to (32). Transient responses are now decoupled also when flux weakening is performed. This is an improvement over field-oriented control (see also [9]).

2) The state-space changes of coordinates both in the field-oriented control and in the decoupling control (i.e., (11) and (20)) are valid in the open set Ω = {x ∈ ℝ⁵ : ψ_a² + ψ_b² ≠ 0}; notice that ψ_a² + ψ_b² = 0 is a physical singularity of the motor in starting conditions.

3) While measurements of (ω, i_a, i_b) are available, measurements of (ψ_a, ψ_b) pose some problems (see [17]). As far as parameters are concerned, variations in the load torque T_L and the rotor resistance R_r cause a loss of input-output decoupling and steady-state regulation errors. This calls for an adaptive version of the control (29), (31), which is given in the next Section.

4) Easy computations show that the induction motor model (1) is not feedback linearizable. The necessary and sufficient conditions given in [20] fail; in fact the distribution G1 = span{g_a, g_b, ad_f g_a, ad_f g_b} is not involutive, since the vector field [ad_f g_a, ad_f g_b] does not belong to G1 (ad_X Y or [X, Y] denotes the Lie bracket of two vector fields; one defines recursively ad_X^i Y = ad_X(ad_X^{i−1} Y)). Following the results in [21], since G0 = span{g_a, g_b} is involutive and rank G1 = 4, it turns out that the largest feedback linearizable subsystem has dimension 4. This shows that the control (29), (31) provides the largest linearizable subsystem in the closed loop.

5) The state-feedback control (29), (31) is essentially the one proposed in [9]. It is made clear here that the decoupling control makes the flux angle φ3 unobservable from the outputs, and that (1) is not feedback linearizable.
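The consistency between the slip-speed expression (27) and the torque expression (28) can be verified numerically. The state values below are arbitrary, and n_p = 1 is assumed since no numerical value is given in the text.

```python
# Appendix values; the number of pole pairs n_p is an assumption
Rr, Lr, M, np_pairs = 0.15, 0.0699, 0.068, 1
psi_a, psi_b, i_a, i_b = 0.9, 0.3, 5.0, 12.0

cross = psi_a * i_b - psi_b * i_a                 # psi_a i_b - psi_b i_a
T = np_pairs * M / Lr * cross                     # electric torque, eq. (28)
ws = (Rr * M / Lr) * cross / (psi_a**2 + psi_b**2)  # slip speed, eq. (27)

# eq. (27): the slip speed equals Rr*T/(n_p |psi|^2)
assert abs(ws - Rr * T / (np_pairs * (psi_a**2 + psi_b**2))) < 1e-12
```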
Exact input-output decoupling controls for induction motors are also proposed in [7,8], with reference to a simplified model: the mechanical dynamics in (1) are not considered and ω is viewed as a parameter in the last four equations of (1).
4 Adaptive Input-Output Linearization

In this section we develop an adaptive version of the decoupling control (29) under the assumption that T_L and R_r are unknown constant parameters. Let us rewrite the system (4) in the y-coordinates defined by (20); since the Lie
derivatives L_{f_2}φ1, L_{f_1}L_fφ1, L_{f_1}φ2, L_{f_1}L_fφ2, L_{f_1}φ3, L_{f_1}L_{f_2}φ2, L_{g_a}φ3, L_{g_b}φ3 all vanish, we have

    ẏ1 = y2 + p1 L_{f_1}φ1
    ẏ2 = L_f²φ1 + p2 L_{f_2}L_fφ1 + L_{g_a}L_fφ1 u_a + L_{g_b}L_fφ1 u_b
    ẏ3 = y4 + p2 L_{f_2}φ2                                                 (33)
    ẏ4 = L_f²φ2 + p2 L_{f_2}L_fφ2 + L_{g_a}L_fφ2 u_a + L_{g_b}L_fφ2 u_b
    ẏ5 = L_fφ3 + p2 L_{f_2}φ3.
Let p̂(t) = (p̂1(t), p̂2(t))ᵀ be a time-varying estimate of the parameters and let

    e_p = (e_{p1}, e_{p2})ᵀ = (p1 − p̂1(t), p2 − p̂2(t))ᵀ                     (34)
be the parameter error. Following [3,6] we now introduce a time-varying state-space change of coordinates depending on the parameter estimate p̂(t):

    z1 = y1
    z2 = y2 + p̂1 L_{f_1}φ1
    z3 = y3                                                                (35)
    z4 = y4 + p̂2 L_{f_2}φ2
    z5 = y5.

In the z-coordinates the system (4) becomes

    ż1 = z2 + e_{p1} L_{f_1}φ1
    ż2 = L_f²φ1 + p2 L_{f_2}L_fφ1 + (dp̂1/dt) L_{f_1}φ1 + L_{g_a}L_fφ1 u_a + L_{g_b}L_fφ1 u_b
    ż3 = z4 + e_{p2} L_{f_2}φ2                                             (36)
    ż4 = L_f²φ2 + p2 L_{f_2}L_fφ2 + p̂2 L_f L_{f_2}φ2 + p2 p̂2 L_{f_2}²φ2 + (dp̂2/dt) L_{f_2}φ2
           + u_a (L_{g_a}L_fφ2 + p̂2 L_{g_a}L_{f_2}φ2) + u_b (L_{g_b}L_fφ2 + p̂2 L_{g_b}L_{f_2}φ2)
    ż5 = L_fφ3 + p2 L_{f_2}φ3.
Let

    (u_a, u_b)ᵀ = D⁻¹(x, p̂2) ( −L_f²φ1 − p̂2 L_{f_2}L_fφ1 − (dp̂1/dt) L_{f_1}φ1 + v_a ,
                   −L_f²φ2 − p̂2 L_{f_2}L_fφ2 − p̂2 L_f L_{f_2}φ2 − p̂2² L_{f_2}²φ2
                   − (dp̂2/dt) L_{f_2}φ2 + v_b )ᵀ,                           (37)

where

    D(x, p̂2) = [ L_{g_a}L_fφ1                         L_{g_b}L_fφ1                       ]
               [ L_{g_a}L_fφ2 + p̂2 L_{g_a}L_{f_2}φ2   L_{g_b}L_fφ2 + p̂2 L_{g_b}L_{f_2}φ2 ]

and

    v_a = −k_a1 (z1 − z1ref) − k_a2 z2
    v_b = −k_b1 (z3 − z3ref) − k_b2 z4,                                     (38)
(k_a1, k_a2), (k_b1, k_b2) are control parameters to be designed, and z1ref and z3ref are the desired values for the rotor speed and the square of the rotor flux amplitude, respectively. Since

    det D(x, p̂2) = −(2Mμ/(σ²L_r))(R_rN + p̂2)(ψ_a² + ψ_b²),                  (39)

the decoupling matrix is singular not only when ψ_a² + ψ_b² = 0, as in the nonadaptive case, but also when p̂2(t) = −R_rN; this additional singularity has to be taken into account in the design of the adaptive algorithm. Defining the regulation error

    e = (z1 − z1ref, z2, z3 − z3ref, z4)ᵀ,                                  (40)
the closed-loop system becomes

    ė1 = e2 + e_{p1} L_{f_1}φ1
    ė2 = −k_a1 e1 − k_a2 e2 + e_{p2} L_{f_2}L_fφ1
    ė3 = e4 + e_{p2} L_{f_2}φ2                                             (41)
    ė4 = −k_b1 e3 − k_b2 e4 + e_{p2} (L_{f_2}L_fφ2 + p̂2 L_{f_2}²φ2)
    ż5 = L_fφ3 + p2 L_{f_2}φ3.
While the dynamics of z5 are

    ż5 = n_p z1 + (R_r/(n_p z3)) (J z2 + T_L − e_{p1}),                      (42)
the dynamics of the vector e can be rearranged as

    ė = K e + [ L_{f_1}φ1   0                                ]
              [ 0           L_{f_2}L_fφ1                     ] (e_{p1})
              [ 0           L_{f_2}φ2                        ] (e_{p2})
              [ 0           L_{f_2}L_fφ2 + p̂2 L_{f_2}²φ2    ]
      ≜ K e + W(x, p̂2) e_p,                                                (43)
where K = block diag(K_a, K_b),

    K_a = [   0       1    ],      K_b = [   0       1    ],
          [ −k_a1   −k_a2 ]             [ −k_b1   −k_b2 ]

    L_{f_1}φ1 = −1/J,
    L_{f_2}L_fφ1 = −(μ/L_r)(1 + M²/(σL_r))(ψ_a i_b − ψ_b i_a),              (44)
    L_{f_2}φ2 = (2/L_r)( M(ψ_a i_a + ψ_b i_b) − (ψ_a² + ψ_b²) ),
    L_{f_2}L_fφ2 = R_rN η,    L_{f_2}²φ2 = η,

    η = ((4σL_r + 2M²)/(σL_r³))(ψ_a² + ψ_b²) + (2M²/L_r²)(i_a² + i_b²)
        − ((6σML_r + 2M³)/(σL_r³))(ψ_a i_a + ψ_b i_b).
W(x, p̂2) is called the regressor matrix and is a function of the x-variables (and therefore of the z-variables). Let P = block diag(P_a, P_b) be the positive definite symmetric solution of the Lyapunov equation

    KᵀP + PK = −Q,                                                          (45)

with Q = block diag(Q_a, Q_b), Q_a and Q_b positive definite symmetric matrices. Consider the quadratic function

    V = eᵀPe + e_pᵀΓ⁻¹e_p,                                                  (46)

where Γ is a positive definite symmetric matrix. The time derivative of V is

    dV/dt = eᵀ(KᵀP + PK)e + 2e_pᵀ[ WᵀPe + Γ⁻¹ (de_p/dt) ].                   (47)

If we now define

    de_p/dt = −ΓWᵀPe,                                                       (48)

or equivalently

    dp̂/dt = ΓWᵀPe,                                                          (49)

which defines the dynamics of the parameter estimate p̂(t), and use (45), equation (47) becomes

    dV/dt = −eᵀQe.                                                          (50)
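The Lyapunov equation (45) can be solved numerically, e.g. with SciPy's solve_continuous_lyapunov (which solves AX + XAᴴ = Q). The sketch below borrows the gain values later reported in Section 5 and takes Q = I, then checks that the resulting P is symmetric positive definite.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# gains as chosen in the simulation section; Q = I is also that choice
Ka = np.array([[0.0, 1.0], [-400.0, -40.0]])
Kb = np.array([[0.0, 1.0], [-900.0, -60.0]])
K = np.block([[Ka, np.zeros((2, 2))], [np.zeros((2, 2)), Kb]])
Q = np.eye(4)

# with A = K^T, solve_continuous_lyapunov(A, -Q) returns P with K^T P + P K = -Q
P = solve_continuous_lyapunov(K.T, -Q)

assert np.allclose(K.T @ P + P @ K, -Q)                   # eq. (45) holds
assert np.all(np.linalg.eigvalsh((P + P.T) / 2) > 0)      # P positive definite
```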
This guarantees that e(t) and p̂(t) are bounded and that e(t) is an L² signal; it follows from (43) that the first four state variables (z1, …, z4) are bounded. We are guaranteed to avoid the singularities z3 = 0 and p̂2 = −R_rN for the decoupling matrix, and therefore for the control (37) as well, if the initial conditions (e(0), e_p(0)) are in S = {(e, e_p) ∈ ℝ⁶ : eᵀPe + e_pᵀΓ⁻¹e_p ≤ c}, the largest set entirely contained in {(e, e_p) ∈ ℝ⁶ : e_{p2} < R_rN + p2 − a1, e3 > a2 − z3ref}, where a1 > 0 and a2 > 0 are arbitrary. Since W(z, p̂2) is continuous, contains only bounded functions of z5 (sines and cosines), and (z1, z2, z3, z4, p̂2) are bounded, it follows that W(z, p̂2) is bounded and therefore ė_p and ṗ̂ are bounded as well; since, according to (42), ż5 is bounded for (e, e_p) ∈ S, it follows that
ė = Ke + W(z, p̂2)e_p is bounded as well. Now, since e is a bounded L² signal with bounded derivative ė, by the Barbalat lemma ([22], p. 211) it follows that

    lim_{t→∞} ‖e(t)‖ = 0,                                                   (51)

i.e. zero steady-state regulation error is achieved. Since ë is bounded as well, ė is uniformly continuous, and (51) implies by the Barbalat lemma again that

    lim_{t→∞} ‖ė(t)‖ = 0.                                                   (52)

Therefore, it must be that

    lim_{t→∞} W(z(t), p̂2(t)) e_p(t) = 0.                                    (53)
Equation (53) implies, from (44), that

    lim_{t→∞} L_{f_1}φ1 e_{p1}(t) = lim_{t→∞} ( −(1/J) e_{p1}(t) ) = 0
    lim_{t→∞} L_{f_2}L_fφ1 e_{p2}(t)
        = −(μ/(n_p M))(1 + M²/(σL_r)) lim_{t→∞} T(t) e_{p2}(t) = 0,          (54)

i.e.,

    lim_{t→∞} e_{p1}(t) = 0,                                                (55)

and, since lim_{t→∞} T(t) = T_L, whenever T_L ≠ 0, i.e. in any physical situation,

    lim_{t→∞} e_{p2}(t) = 0,                                                (56)
that is, parameter convergence is achieved. The difficulty of identifying the rotor resistance under no-load conditions is a common problem ([23]) and is related to physical reasons. If the motor is unloaded, when speed and rotor flux regulation is achieved, the slip frequency in (27) is zero, so that the flux vector rotates at speed n_p ω and we have R_r i_rd = 0, R_r i_rq = 0; it follows that the rotor currents are zero and therefore the rotor resistance is not identifiable in steady state. It is proposed in [23] to track a sinusoidal reference signal for ψ_a² + ψ_b², so that the rotor currents are different from zero and the rotor resistance can be identified. In summary, we have shown that the adaptive feedback control (37), (49) gives the closed-loop system

    ė = Ke + We_p
    ė_p = −ΓWᵀPe.                                                           (57)

If the initial conditions (e(0), e_p(0)) ∈ S, we have

    lim_{t→∞} ‖e(t)‖ = 0,    lim_{t→∞} |e_{p1}(t)| = 0.                      (58)
Moreover, if T_L ≠ 0 we also have

    lim_{t→∞} |e_{p2}(t)| = 0.                                              (59)

From (51) and (52) it follows that in any case we have

    lim_{t→∞} ω(t) = ω_ref,      lim_{t→∞} ω̇(t) = 0,
    lim_{t→∞} |ψ(t)| = |ψ|_ref,  lim_{t→∞} (d/dt)|ψ(t)| = 0.                 (60)
5 Simulations

The proposed control algorithm has been extensively simulated for a 15 kW motor, with nominal torque of 70 Nm and nominal speed of 220 rad/sec, whose parameter values are reported in the Appendix. The simulation test reported involves the following operating sequence: the unloaded motor is required to reach the nominal speed and the nominal value of 1.3 Wb for the rotor flux amplitude |ψ|, with the initial estimate of the rotor resistance R_r in error by +50%. At t = 2 sec, a 40 Nm load torque, which is unknown to the controller, is applied. At t = 5 sec the speed is required to reach 300 rad/sec, well above the nominal value, and the rotor flux amplitude reference is weakened according to the rule |ψ|_ref = k/ω_ref. The reference signals for flux amplitude and speed, reported in Fig. 1, consist of step functions. A small time delay at the beginning of the speed reference trajectory is introduced in order to avoid time overlapping of flux and speed transients.
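The operating sequence described above can be sketched as a simple reference generator. The flux-weakening constant and the initial time delay are assumptions here (the text gives the rule only as |ψ|_ref = k/ω_ref); k is chosen so that |ψ|_ref is continuous at the nominal speed.

```python
def references(t, psi_nom=1.3, w_nom=220.0, w_high=300.0,
               t_speed=0.1, t_high=5.0):
    """Speed and flux-amplitude references for the simulated sequence.

    t_speed is the assumed small delay before the speed step; above the
    nominal speed, flux weakening |psi|_ref = k/w_ref is applied with
    k = psi_nom * w_nom (continuity at the nominal speed).
    """
    w_ref = 0.0 if t < t_speed else (w_nom if t < t_high else w_high)
    psi_ref = psi_nom if w_ref <= w_nom else psi_nom * w_nom / w_ref
    return w_ref, psi_ref

# flux is established before the speed step; weakening above nominal speed
assert references(0.05) == (0.0, 1.3)
assert references(1.0) == (220.0, 1.3)
assert references(6.0)[0] == 300.0
```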
Fig. 1. Reference trajectories: (a) speed reference; (b) flux amplitude reference.
Both the non-adaptive (29), (31) and adaptive (37), (49) control laws have been simulated.
In the non-adaptive case, we observe from simulations that parameter errors cause a steady-state error both in speed and flux regulation and a coupling between speed and rotor flux which is noticeable both during the speed transient and at load insertion (see Fig. 2, where dashed lines stand for reference trajectories, solid lines correspond to simulated behavior, and the dotted line represents the electric torque). To clarify the effects of unknown parameters, consider the closed-loop system obtained by applying the feedback control (29), (31) to the motor (4) (recall (33)):

    ė = K e + [ L_{f_1}φ1   0            ]
              [ 0           L_{f_2}L_fφ1 ] (p1)   ≜  K e + W*(y) p           (61)
              [ 0           L_{f_2}φ2    ] (p2)
              [ 0           L_{f_2}L_fφ2 ]
    ẏ5 = L_fφ3,

where e = (e1, e2, e3, e4)ᵀ = (y1 − y1ref, y2, y3 − y3ref, y4)ᵀ is the regulation error vector, y1ref and y3ref are the desired values for the speed and the square of the flux amplitude respectively, K has the structure given in (44) and represents the linear part corresponding to (32), while W*p takes into account the effects of parameter uncertainties (see (3)). The entries of the matrix W* are given in (44), from which we see that L_{f_1}φ1 is constant and L_{f_2}L_fφ1 is proportional to the electric torque T. The entry L_{f_2}φ2 is proportional, via a nonzero constant, to the derivative of the squared flux amplitude and therefore, once the flux steady state is achieved, L_{f_2}φ2 = 0. The entry L_{f_2}L_fφ2 can be rewritten as L_{f_2}L_fφ2 = c1 (d|ψ|²/dt) + c2 T², with c1 and c2 nonzero constants, that is, as the sum of a term proportional to the derivative of the squared flux amplitude and a term proportional to the square of the electric torque. When the electric torque is zero and the flux amplitude steady state is achieved, L_{f_2}L_fφ2 = 0. Up to 2 sec there is no load torque, so that p1 = 0, the electric torque T is zero (except for a short transient after the first step in desired speed, when a coupling is noticed) and the rotor flux dynamics reach a steady state (Fig. 2.b): this implies zero steady-state error according to the above analysis. This is confirmed by simulation; we see (Fig. 2, 0 < t < 2) that the speed and flux steady-state errors are zero even if the rotor resistance is in error by +50%. From load insertion (from t = 2 sec) on, the electric torque and p1 are different from zero, which causes, according to (61), a coupling and steady-state errors, as confirmed by Figs. 2.a and 2.b. Notice that even if the load torque were known (and therefore p1 = 0), a rotor resistance error (p2 ≠ 0) would still cause a speed steady-state error due to the entry L_{f_2}L_fφ1, which is proportional to the electric torque (see Figs. 3.a and 3.b).
The dynamic responses when both parameters are known are reported in Fig. 4 for comparison. The adaptive-case simulations are reported in Fig. 5: speed and flux amplitude behavior are shown in Figs. 5.a and 5.b respectively, where solid lines
Fig. 2. Nonadaptive case (with both load torque and rotor resistance error): (a) speed and torque; (b) flux amplitude.
Fig. 3. Nonadaptive case (with rotor resistance error): (a) speed and torque; (b) flux amplitude.
Fig. 4. Nonadaptive case (without parameter error): (a) speed and torque; (b) flux amplitude.
represent actual variables, dashed lines the corresponding reference values, and the dotted line the electric torque; Figs. 5.c and 5.d report the load torque and rotor resistance respectively, where solid lines represent true parameter values and dashed lines the corresponding estimates. In Fig. 5.c the load estimator was disabled until t = 2 sec. The Q matrix in (45) has been chosen equal to the identity matrix, the gain matrices K_a and K_b have been chosen as (k_a1, k_a2) = (400, 40), (k_b1, k_b2) = (900, 60), and the parameter update gain matrix Γ has been chosen as Γ = diag(γ1, γ2) = diag(0.1, 7 × 10⁻⁵).
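With these gains the two error channels in (32) are critically damped: a quick check, using the standard second-order correspondence ω_n = √k1 and ζ = k2/(2ω_n).

```python
import math

def second_order_params(k1, k2):
    """Natural frequency and damping ratio of  e'' + k2 e' + k1 e = 0  (cf. (32))."""
    wn = math.sqrt(k1)
    return wn, k2 / (2.0 * wn)

assert second_order_params(400.0, 40.0) == (20.0, 1.0)   # speed loop
assert second_order_params(900.0, 60.0) == (30.0, 1.0)   # flux loop
```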
Fig. 5. Adaptive case: (a) speed and torque; (b) flux amplitude; (c) true and estimated load torque; (d) true and estimated rotor resistance.
The dynamic performance of the adaptive control law is satisfactory: no steady-state errors occur and the transient responses are decoupled, except for an initial short time interval. During the first speed transient, due to the wrong initial resistance estimate, a small flux error occurs. At the same time, due to the torque required to increase the speed, the rotor resistance estimate quickly converges to the true value and complete decoupling is achieved: no flux error occurs during the subsequent load insertion. Notice that the estimation algorithm (49) is driven by the regulation error e defined by (40); more precisely, in view of the structure of the regressor matrix W and of the matrix P solving (45), the
load estimate dynamics are driven by the regulation errors on speed and speed time derivative, i.e. by e1 and e2, while the rotor resistance estimate dynamics are driven by the whole error vector e. It is clearly impossible for the adaptation law to distinguish between regulation error due to a change in reference value (e.g., step function references) and regulation error due to parameter variations. As a result, the rotor resistance estimate varies slightly both after the load insertion (at t = 2 sec) and after the step change in speed reference (at t = 5 sec). Similarly, the step change in speed reference at t = 5 sec causes a peak in the load torque estimate, even though the actual load does not change. Figure 6 shows the control input signal u_a. The control action consists in varying the amplitude and frequency of the applied voltage. The nonlinear state-feedback control law results in voltage supply signals which are well within the capabilities of actual inverters and can therefore be easily implemented with current power electronic technology.
Fig. 6. Applied input voltage u_a, shown over four successive time windows (Figures 6.a-6.d).
Figure 7 shows the i_a current waveform. The step functions used as reference trajectories in the simulation are clearly "worst case" signals. In a real implementation smoother reference signals should be used, improving the transient responses and reducing the stator current and stator voltage peaks.
Fig. 7. Stator current i_a, shown over four successive time windows (Figures 7.a-7.d).
6 Conclusions

In this paper it is shown how the theory of input-output decoupling and its adaptive versions lead to the design of a satisfactory controller for a detailed nonlinear model of an induction motor. The control is adaptive with respect to two parameters which cannot be measured and is based on a converging identification algorithm. The main drawback of the proposed control is the need for flux measurements. However, nonlinear flux observers from stator current and rotor speed measurements have been determined [19]. Preliminary simulations show that good performance is maintained when the flux signals are provided to the adaptive control algorithm by the observers. This is a direction of further investigation. Another direction of research is the real implementation of the control, in order to verify the influence of the sampling rate, truncation errors in the digital implementation, measurement noise, simplifying modeling assumptions, unmodeled dynamics and saturations.
Appendix
Induction Motor Parameters (two-phase data)

    R_s   stator resistance
    R_r   rotor resistance
    i_s   stator current
    ψ_s   stator flux linkage
    i_r   rotor current
    ψ_r   rotor flux linkage
    u     voltage input
    ω     angular speed
    n_p   number of pole pairs
    θ     angle of rotation
    L_s   stator inductance
    L_r   rotor inductance
    M     mutual inductance
    J     rotor inertia
    T_L   load torque
    T     electric motor torque

Parameter Values

    R_s = 0.18 Ω
    R_r = 0.15 Ω
    L_s = 0.0699 H
    L_r = 0.0699 H
    M   = 0.068 H
    J   = 0.0586 kg m²
References

1. A. Isidori, Nonlinear Control Systems, 2nd ed., Berlin, Springer-Verlag, 1989.
2. H. Nijmeijer and A. van der Schaft, Nonlinear Dynamical Control Systems, Berlin, Springer-Verlag, 1990.
3. R. Marino, I. Kanellakopoulos, and P. Kokotovic, "Adaptive tracking for feedback linearizable SISO systems," in Proc. 28th IEEE Conf. Dec. Control, Tampa, FL, pp. 1002-1007, Dec. 1989.
4. S. Sastry and A. Isidori, "Adaptive control of linearizable systems," IEEE Trans. Aut. Control, vol. 34, pp. 1123-1131, Nov. 1989.
5. D. Taylor, P. Kokotovic, R. Marino, and I. Kanellakopoulos, "Adaptive regulation of nonlinear systems with unmodeled dynamics," IEEE Trans. Aut. Control, vol. 34, pp. 405-412, April 1989.
6. I. Kanellakopoulos, P. Kokotovic, and R. Marino, "An extended direct scheme for robust adaptive nonlinear control," Automatica, to appear, March 1991.
7. A. De Luca and G. Ulivi, "Dynamic decoupling of voltage frequency controlled induction motors," in 8th Int. Conf. on Analysis and Optimization of Systems, INRIA, Antibes, pp. 127-137, 1988.
8. A. De Luca and G. Ulivi, "Design of an exact nonlinear controller for induction motors," IEEE Trans. Aut. Control, vol. AC-34, pp. 1304-1307, Dec. 1989.
9. Z. Krzeminski, "Nonlinear control of induction motors," in 10th IFAC World Congress, Munich, pp. 349-354, 1987.
10. R. Marino, S. Peresada, and P. Valigi, "Adaptive partial feedback linearization of induction motors," in Proc. 29th IEEE Conf. Dec. Control, Honolulu, HI, Dec. 1990.
11. A. Fitzgerald, C. Kingsley Jr., and S. Umans, Electric Machinery, McGraw-Hill, 1983.
12. P. Krause, Analysis of Electric Machinery, McGraw-Hill, 1986.
13. W. Leonhard, Control of Electrical Drives, Berlin, Springer-Verlag, 1985.
14. P. Krause and C. Thomas, "Simulation of symmetrical induction machinery," IEEE Trans. Power Apparatus and Systems, vol. PAS-84, no. 11, pp. 1038-1053, 1965.
15. F. Blaschke, "Das Prinzip der Feldorientierung, die Grundlage für die Transvector-Regelung von Asynchronmaschinen," Siemens-Zeitschrift, vol. 45, pp. 757-760, 1971.
16. F. Blaschke, "The principle of field orientation applied to the new transvector closed-loop control system for rotating field machines," Siemens Rev., vol. 39, pp. 217-220, 1972.
17. R. Gabriel and W. Leonhard, "Microprocessor control of induction motors," in Proc. IEEE/IAS Int. Semiconductor Power Converter Conf., Orlando, FL, pp. 385-396, 1982.
18. W. Leonhard, "Microcomputer control of high dynamic performance ac-drives: a survey," Automatica, vol. 22, no. 1, pp. 1-19, Jan. 1986.
19. G. Verghese and S. Sanders, "Observers for flux estimation in induction machines," IEEE Trans. Industr. Electronics, vol. 35, pp. 85-94, Feb. 1988.
20. B. Jakubczyk and W. Respondek, "On linearization of control systems," Bull. Acad. Pol. Sci., Ser. Sci. Math., vol. 28, no. 9-10, pp. 517-522, 1980.
21. R. Marino, "On the largest feedback linearizable subsystem," Syst. Control Lett., vol. 6, pp. 345-351, Jan. 1986.
22. V. Popov, Hyperstability of Control Systems, Berlin, Springer-Verlag, 1973.
23. H. Sugimoto and S. Tamai, "Secondary resistance identification of an induction-motor applied model reference adaptive system and its characteristics," IEEE Trans. Industr. Application, vol. IA-23, pp. 296-303, March 1987.
Global Adaptive Observers and Output Feedback Stabilization for a Class of Nonlinear Systems*

Riccardo Marino and Patrizio Tomei
Seconda Università di Roma "Tor Vergata"
Dipartimento di Ingegneria Elettronica
Via O. Raimondo, 00173 Roma, ITALY.
Abstract. We address the problem of determining global adaptive observers for a class of single-output nonlinear systems which are linear with respect to an unknown constant parameter vector. Sufficient conditions are given for the construction of a global adaptive observer of an equivalent state, without persistency of excitation. Under additional geometric conditions the original (physical) state can be asymptotically observed as well. The results obtained are based on nonlinear changes of coordinates driven by auxiliary filters (filtered transformations). When only a single input is allowed and it is assumed to enter linearly in the state equations, we determine via geometric conditions a more restricted class of nonlinear single-input, single-output systems which can be globally stabilized by a dynamic (adaptive) observer-based output feedback control. Linear minimum-phase systems with unknown poles and zeroes, known sign of the high-frequency gain and known relative degree belong to such a class of systems. Systems which are not feedback linearizable may belong to such a class as well.

1 Introduction

In the last few years several papers have developed adaptive versions of nonlinear control algorithms designed via the so-called geometric approach to nonlinear systems (see [1] and [2] for an introduction to the field). Those nonlinear algorithms very often rely on the cancellation of nonlinear terms by feedback or output injection: when those terms contain unknown parameters, the development of adaptive versions is very much needed, as simple examples can show. Under the assumption that unknown parameters enter linearly in the nonlinearities to be cancelled, adaptive versions of state feedback linearizing controls have been developed in [3-7], adaptive versions of input-output linearizing controls were reported in [8], and adaptive versions of nonlinear observers in [9,10]. Nonlinear adaptive stabilization using the "control Lyapunov function" technique was studied in [11,12].
More recently, an adaptive version of a static output-feedback linearizing control given in [26] has been proposed in [13] and [14], under the restriction of sector-type nonlinearities and of so-called output matching conditions.

* This work was supported in part by Ministero della Università e della Ricerca Scientifica e Tecnologica and in part by the U.S. Air Force Office of Scientific Research under Grant AFOSR 90-0011.
456
Marino and Tomei
In this paper, we address the problem of global adaptive output feedback stabilization of a class of nonlinear systems, that is, the design of a dynamic output feedback control (compensator) such that, for any initial condition of the closed-loop system and for any unknown value of the parameter vector, the state of the system tends to zero and the state of the compensator is bounded.
The main result presented determines via geometric conditions a class of nonlinear single-input single-output systems, linear with respect to parameters, which are globally stabilizable by an (adaptive) dynamic observer-based output feedback control for every unknown constant value of the parameter vector. Those systems are characterized by globally asymptotically stable, linear zero dynamics and are such that global observers with linear error dynamics [19] can be designed. The only restriction imposed on the nonlinearities is their dependence, in suitable state coordinates, on the output only. No sector-type restrictions or output matching conditions [13,14] are required. Systems which are not feedback linearizable may be included in such a class. Linear minimum-phase systems with known relative degree, known sign of the high-frequency gain, known upper bound on the number of poles but unknown poles and zeroes are included in the class of nonlinear systems considered.
We make use of nonlinear adaptive observers developed in [10,15] and of filtered transformations into adaptive observer forms. In the case of relative degree greater than one, we use techniques recently developed in [16-18] for the construction of Lyapunov functions of cascade systems and, following [5], we employ several "parameter estimators" in the design of nonlinear adaptive controls.
The paper is organized as follows. In Section 2 some preliminary results on adaptive observers and on the adaptive observer form are recalled. In Section 3 we recall from [15] the concept of filtered transformation and show that a class of nonlinear systems can always be globally transformed into an adaptive observer form by a filtered transformation, so that global adaptive observers can be designed. This generalizes a well-known result on adaptive observers for linear systems [21]. Section 4 addresses the problem of observing the original "physical" state. In Section 5 we further restrict the class of nonlinear systems considered. In Section 6 we show how to design an adaptive output-feedback stabilizing control, in the relative-degree-one case, by using filtered transformations. In Section 7 we show how to design adaptive output-feedback stabilizing controls for the class of nonlinear systems determined in Section 5, when the relative degree is greater than one. The results presented in Sections 6 and 7 generalize well-established results for linear minimum-phase systems [21].
Adaptive Observers and Output Feedback for Nonlinear Systems

2 Adaptive Observers: Basic Results
We consider nonlinear single-output systems

    ẋ = f(x) + q_0(x, u) + Σ_{i=1}^{p} θ_i q_i(x, u) ≜ f(x) + q_0(x, u) + Q(x, u)θ
    y = h(x),        x ∈ ℝⁿ, u ∈ ℝᵐ, θ ∈ ℝᵖ, y ∈ ℝ,                          (Σ)
with q_i : ℝⁿ × ℝᵐ → ℝⁿ, 0 ≤ i ≤ p, f : ℝⁿ → ℝⁿ, h : ℝⁿ → ℝ smooth functions, h(x_0) = 0, q_0(x, 0) = 0 for every x ∈ ℝⁿ; x is the state, u(t) : ℝ₊ → ℝᵐ is the control, θ is the parameter vector, which is assumed to be constant, and y is the scalar output. We assume that the system (Σ) with u = 0 and θ = 0,

    ẋ = f(x)
    y = h(x),

satisfies the condition

    rank{ d(L_f^j h)(x) : 0 ≤ j ≤ n − 1 } = n,    ∀x ∈ ℝⁿ,

which implies local observability for every x [2, p. 95]. The assumption of linear dependence on the parameter vector θ is certainly a restrictive one. We further require that each system (Σ) be transformable by a parameter-independent, global state-space change of coordinates in ℝⁿ,
    ξ = T(x),    T(x_0) = 0,                                                (2.1)

into the system

    ξ̇ = A_c ξ + ψ_0(y, u) + Σ_{i=1}^{p} θ_i ψ_i(y, u) ≜ A_c ξ + ψ_0(y, u) + Ψ(y, u)θ
    y = C_c ξ,                                                              (S)

with

    A_c = [ 0 1 0 … 0 ]
          [ 0 0 1 … 0 ]
          [ ⋮       ⋮ ],      C_c = [ 1 0 0 … 0 ],
          [ 0 0 0 … 1 ]
          [ 0 0 0 … 0 ]

and ψ_i : ℝ × ℝᵐ → ℝⁿ smooth functions for i = 0, …, p. There are two reasons to restrict ourselves to such a class of systems. The first one is that for each
known parameter vector θ, asymptotic observers are available for the system (S) and therefore for (Σ) [19]. The second reason is that necessary and sufficient conditions which characterize those systems (Σ) which are locally transformable into (S) by (2.1) are known (see [10]): they are an immediate consequence of the main result in [19]. Following [20] and [25], the global version of such a result can be stated as follows.

Theorem 2.1 There exists a parameter-independent global state-space diffeomorphism ξ = T(x), with T(x_0) = 0, transforming the system (Σ) into (S) if and only if

(i) [ad_f^i r, ad_f^j r] = 0, 0 ≤ i, j ≤ n − 1,
(ii) [q_i(u), ad_f^j r] = 0, 0 ≤ i ≤ p, 0 ≤ j ≤ n − 2, ∀u ∈ ℝᵐ,
(iii) the vector fields f and r are complete,
where r is the vector field satisfying

    ⟨d(L_f^j h), r⟩ = 0,  0 ≤ j ≤ n − 2,      ⟨d(L_f^{n−1} h), r⟩ = 1.

Proof. Conditions (i) are shown in [19] to be necessary and sufficient for (Σ) with u = 0 and θ = 0 to be transformable via a local diffeomorphism (2.1) in U_{x_0} into the system

    ξ̇ = A_c ξ + ψ_0(y)
    y = C_c ξ.

The needed local coordinates ξ = T(x) are defined, by virtue of assumption (i), as those in which

    ad_f^i r = (−1)^i ∂/∂ξ_{n−i},    0 ≤ i ≤ n − 1,

that is, the vector fields ad_f^i r, 0 ≤ i ≤ n − 1, are simultaneously rectified. Consequently, conditions (ii) guarantee that the vector fields q_i(x, u) depend, in the ξ-coordinates, on the output y only. Condition (iii) is necessary and sufficient, according to [20], for the above change of coordinates to be a global one. When condition (iii) fails we only have a local change of coordinates. □

For every known parameter vector θ, the system
"~= A¢~ + Co(Y, u) + EOi¢i(y, u) + K ( y - C¢~) i=l
(2.2)
Adaptive Observers and Output Feedback for Nonlinear Systems
459
is, with a suitable choice of $K$, an asymptotic observer (provided that the state $x(t)$ is bounded) with linear error dynamics ($e = \zeta - \hat\zeta$, see [19]):
\[
\dot e = (A_c - KC_c)e =
\begin{bmatrix}
-k_1 & 1 & 0 & \cdots & 0\\
-k_2 & 0 & 1 & \cdots & 0\\
\vdots & \vdots & & \ddots & \vdots\\
-k_{n-1} & 0 & 0 & \cdots & 1\\
-k_n & 0 & 0 & \cdots & 0
\end{bmatrix} e.
\]
The eigenvalues of $A_c - KC_c$ can be arbitrarily placed by choosing the vector $K$, since they coincide with the zeroes of the polynomial $s^n + k_1 s^{n-1} + \dots + k_n$. In order to obtain linear error dynamics the parameter vector $\theta$ has to be exactly known. If its estimate $\hat\theta$ does not coincide with $\theta$, the following error dynamics results ($e_\theta = \theta - \hat\theta$):
\[
\dot e = (A_c - KC_c)e + \Psi(y,u)\,e_\theta,
\]
that is, we are not guaranteed that $\lim_{t\to\infty}\|e(t)\| = 0$. This situation calls for an adaptive version of the observer (2.2). In the literature, sufficient conditions of different nature are available which guarantee the existence of adaptive asymptotic observers for system $(\Sigma)$. In [9] an adaptive observer, driven by a $p(n-1)$-dimensional auxiliary filter, is shown to converge under persistency of excitation for any system $(\Sigma)$ satisfying conditions (i)-(iii). This is a general result; the only drawback is that persistency of excitation cannot be checked a priori. In [10] further structural geometric conditions, which are a priori checkable, are imposed on the class of systems $(\Sigma)$ and guarantee the construction of an adaptive observer without auxiliary filters and without any persistency of excitation. However, only for a very special subset of systems $(\Sigma)$, identified by necessary and sufficient geometric conditions, can such a simple adaptive observer be used. In [15] it is shown that conditions (i)-(iii) are sufficient for the existence of an adaptive asymptotic observer for an equivalent state of system $(\Sigma)$. We will now state some basic definitions and results which are needed in order to recall the main results on adaptive observers given in [15].

Definition 2.1 A global adaptive observer for system $(\Sigma)$ is a finite-dimensional system
\[
\dot w = a(w, y(t), u(t)), \qquad w(0) = w_0 \in \mathbb{R}^s,
\]
\[
\hat x = b(w, y(t), u(t)), \qquad \hat x \in \mathbb{R}^n,
\]
driven by the inputs $u(t)$, $y(t)$, such that

(i) $\|x(t) - \hat x(t)\|$ is bounded, $\forall t \ge 0$, $\forall x(0) \in \mathbb{R}^n$, $\forall w_0 \in \mathbb{R}^s$;
(ii) $\lim_{t\to\infty}\|x(t) - \hat x(t)\| = 0$, $\forall x(0) \in \mathbb{R}^n$, $\forall w_0 \in \mathbb{R}^s$.
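The eigenvalue-assignment property used for the observer gain in (2.2) — the spectrum of $A_c - KC_c$ coincides with the zeroes of $s^n + k_1 s^{n-1} + \dots + k_n$ — can be checked numerically. A minimal sketch in pure Python (the gain values are arbitrary illustrative numbers, not from the text), computing the characteristic polynomial of the companion-type error matrix with the Faddeev-LeVerrier recursion:

```python
# Sketch: the error matrix A_c - K C_c of observer (2.2) has
# characteristic polynomial s^n + k1 s^(n-1) + ... + kn.
# Pure Python; n = 3 and the gains are illustrative.

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def char_poly(A):
    """Coefficients [1, c1, ..., cn] of det(sI - A), Faddeev-LeVerrier."""
    n = len(A)
    M = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    coeffs = [1.0]
    for k in range(1, n + 1):
        M = mat_mul(A, M)                         # M_k = A (M_{k-1} + c_{k-1} I)
        c = -sum(M[i][i] for i in range(n)) / k   # c_k = -tr(M_k)/k
        coeffs.append(c)
        for i in range(n):
            M[i][i] += c
    return coeffs

n, K = 3, [2.0, 5.0, 7.0]                         # gains k1, k2, k3
# A_c - K C_c: superdiagonal of ones with -K in the first column
A = [[-K[i]] + [1.0 if j == i + 1 else 0.0 for j in range(1, n)]
     for i in range(n)]
print(char_poly(A))   # -> [1.0, 2.0, 5.0, 7.0]
```

The printed coefficients reproduce $s^3 + k_1 s^2 + k_2 s + k_3$ exactly, so any desired Hurwitz spectrum can be assigned by reading the gains off the target polynomial.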
Definition 2.2 The constant integer $\nu$ such that
\[
L_{q_i}L_{f+q_0}^{j}h = 0, \qquad \forall i,\ 1 \le i \le p,\ \ \forall j,\ 0 \le j \le \nu-2,\ \ \forall x \in \mathbb{R}^n,\ \forall u \in \mathbb{R}^m,
\]
\[
L_{q_i}L_{f+q_0}^{\nu-1}h \neq 0, \qquad \text{for some } i,\ 1 \le i \le p,\ \text{for some } x \in \mathbb{R}^n,
\]
is called the relative degree between the output $y$ and the parameter $\theta$ for the system $(\Sigma)$.
Definition 2.3 A vector $b = [b_1,\dots,b_n]^T$ is said to be Hurwitz of degree $\rho$ if $b_1 = \dots = b_{\rho-1} = 0$ and $b_\rho \neq 0$, so that the associated polynomial
\[
b_1 s^{n-1} + b_2 s^{n-2} + \dots + b_n
\]
has degree $n-\rho$, and if this polynomial is Hurwitz, i.e., all its zeroes have real part less than zero.
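Definition 2.3 admits a small numeric test. The sketch below is illustrative and not from the text: it locates the first nonzero component of $b$ and applies a standard Routh-Hurwitz array to the associated polynomial.

```python
# Sketch of Definition 2.3 (illustrative helper): a vector b in R^n is
# "Hurwitz of degree rho" when its first rho-1 components vanish,
# b_rho != 0, and b_1 s^(n-1) + ... + b_n is Hurwitz. Stability is
# tested with a plain Routh-Hurwitz array (pure Python).

def routh_hurwitz(p):
    """True iff the polynomial with coefficients p (highest power first,
    p[0] != 0) has all zeroes with negative real part."""
    if len(p) == 1:
        return True                      # nonzero constant: no zeroes
    if p[0] < 0:
        p = [-c for c in p]
    r1, r2 = list(p[0::2]), list(p[1::2])
    r2 += [0.0] * (len(r1) - len(r2))
    rows = [r1, r2]
    for _ in range(len(p) - 2):          # build the remaining Routh rows
        a, b = rows[-2], rows[-1]
        if b[0] == 0:
            return False                 # zero pivot: not strictly Hurwitz
        rows.append([(b[0] * a[j + 1] - a[0] * b[j + 1]) / b[0]
                     for j in range(len(a) - 1)] + [0.0])
    return all(row[0] > 0 for row in rows)

def hurwitz_degree(b):
    """Return (rho, stable): 1-based index of the first nonzero entry
    of b and whether the associated polynomial is Hurwitz."""
    rho = next(i + 1 for i, bi in enumerate(b) if bi != 0)
    return rho, routh_hurwitz(b[rho - 1:])

print(hurwitz_degree([0.0, 1.0, 3.0, 2.0]))   # s^2 + 3s + 2: Hurwitz, degree 2
print(hurwitz_degree([1.0, -3.0, 2.0]))       # s^2 - 3s + 2: unstable, degree 1
```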
Lemma (Meyer-Kalman-Yakubovich). Given an $n\times n$ Hurwitz matrix $A$, two $n\times 1$ vectors $b$ and $c$, and an $n\times n$ symmetric positive definite matrix $Q$, if
\[
\mathrm{Re}\bigl[c^T(j\omega I - A)^{-1}b\bigr] > 0, \qquad \forall\omega \in \mathbb{R}, \tag{2.3}
\]
then there exist a positive scalar $\varepsilon$, a vector $l$ and a symmetric positive definite matrix $P$ such that
\[
A^T P + PA = -ll^T - \varepsilon Q, \qquad Pb = c. \tag{2.4}
\]
Proof. See for instance [21, p. 66]. $\Box$
We now recall a basic result from the theory of linear adaptive observers [21, Chap. 4] which shows how to construct adaptive observers for a class of systems said to be "in adaptive observer form".

Theorem 2.2 Consider the following system in adaptive observer form
\[
\dot z = A_c z + \psi_0(y,u) + b\,\beta^T(t)\theta, \qquad y = C_c z, \tag{2.5}
\]
with $z \in \mathbb{R}^n$, $y \in \mathbb{R}$, $\beta : \mathbb{R}_+ \to \mathbb{R}^p$ a known and piecewise continuous vector, and $b = [b_1,\dots,b_n]^T$. If $\beta(t)$ has bounded components and if $b$ is Hurwitz of degree one, then the following system is an adaptive observer for (2.5):
\[
\dot{\hat z} = (A_c - KC_c)\hat z + \psi_0(y,u) + b\,\beta^T(t)\hat\theta + Ky \tag{2.6}
\]
\[
\dot{\hat\theta} = \Gamma\beta(t)(y - C_c\hat z)\,\mathrm{sgn}(b_1), \tag{2.7}
\]
where $\hat z \in \mathbb{R}^n$, $\hat\theta \in \mathbb{R}^p$, $\Gamma$ is a symmetric positive definite matrix and $K = [k_1,\dots,k_n]^T$ is given by
\[
K = \frac{1}{b_1}(A_c b + \lambda b), \tag{2.8}
\]
with $\lambda$ an arbitrary positive scalar.

Proof. The error system is
\[
\dot e = Ae + b\,\mathrm{sgn}(b_1)\bigl(\beta^T(t)\,\mathrm{sgn}(b_1)\bigr)e_\theta
\]
\[
\dot e_\theta = -\Gamma\bigl(\beta(t)\,\mathrm{sgn}(b_1)\bigr)C_c e,
\]
where $A = A_c - KC_c$, $e = z - \hat z$, and $e_\theta = \theta - \hat\theta$. Since by assumption $b_1 \neq 0$, by virtue of (2.8) we can write
\[
|b_1|\,(s^n + k_1 s^{n-1} + \dots + k_n) = (s+\lambda)(b_1 s^{n-1} + \dots + b_n)\,\mathrm{sgn}(b_1),
\]
which implies that the transfer function
\[
C_c(sI - A)^{-1}b\,\mathrm{sgn}(b_1) = \frac{1}{s+\lambda} \tag{2.9}
\]
satisfies condition (2.3). There are $n-1$ pole-zero cancellations in (2.9). The matrix $A$ is Hurwitz, since $\lambda > 0$ and by assumption the vector $b$ is Hurwitz. The thesis is proved once the following claim is proved.

Claim: Consider the system
\[
\dot e = Ae + b\,\beta^T(t)e_\theta, \qquad e \in \mathbb{R}^n,\ e_\theta \in \mathbb{R}^p
\]
\[
\dot e_\theta = -\Gamma\beta(t)\,c^T e,
\]
with $A$ a Hurwitz matrix and $\Gamma$ a $p\times p$ symmetric positive definite matrix. If the transfer function $c^T(sI - A)^{-1}b$ satisfies condition (2.3) and the components of $\beta(t)$ are bounded, piecewise continuous functions, then $e = 0$, $e_\theta = 0$ is a uniformly stable equilibrium point, the components of $e(t)$ and $e_\theta(t)$ are bounded, and
\[
\lim_{t\to\infty}\|e(t)\| = 0. \tag{2.10}
\]
Proof of the Claim (a standard result, see for instance [21, p. 142]). Since $c^T(sI - A)^{-1}b$ satisfies the conditions of the lemma, there exist two symmetric positive definite matrices $P$ and $Q$ such that
\[
A^T P + PA = -Q, \qquad Pb = c.
\]
Choose as candidate Lyapunov function
\[
V(e, e_\theta, t) = e^T P e + e_\theta^T\Gamma^{-1}e_\theta. \tag{2.11}
\]
The time derivative of $V$ is
\[
\dot V(e, e_\theta, t) = -e^T Q e. \tag{2.12}
\]
It follows that $e = 0$, $e_\theta = 0$ is uniformly stable [22, p. 199], and $e(t)$, $e_\theta(t)$ are bounded. Since, by hypothesis, $\beta(t)$ is bounded, $\dot e$ is also bounded and, as a consequence, $e(t)$ is uniformly continuous. Equations (2.11) and (2.12) imply that $e(t) \in L_2$. By Barbalat's Lemma [23, p. 210], (2.10) follows. $\Box$

In the next section we shall discuss the transformation of $(\Sigma)$ into adaptive observer form.

3 Filtered transformations into adaptive observer form
The system (S) is not in general in adaptive observer form: (S) coincides with (2.5) only when $\psi_i(y,u) = b\,\beta_i(y,u)$, $1 \le i \le p$, where $\beta_i(y,u) : \mathbb{R}\times\mathbb{R}^m \to \mathbb{R}$ and $b$ is a constant vector. The goal of this section is to transform $(\Sigma)$ into an adaptive observer form. We first recall the main result in [10], which gives necessary and sufficient conditions for system $(\Sigma)$ to be transformed into an adaptive observer form by a state-space change of coordinates which is independent of $\theta$.

Theorem 3.1 [10] There exists a global change of coordinates $z = T(x)$, with $T(x_0) = 0$, transforming system $(\Sigma)$ into the adaptive observer form
\[
\dot z = A_c z + \psi_0(y,u) + b\,\beta^T(y,u)\theta, \qquad y = C_c z, \tag{3.1}
\]
if and only if conditions (i), (ii) and (iii) of Theorem 2.1 are satisfied and, in addition,
\[
\text{(iv)}\quad q_i(x,u) = \beta_i(h(x),u)\sum_{j=0}^{n-1} b_{n-j}(-1)^j\,\mathrm{ad}_f^j r, \qquad \forall x \in \mathbb{R}^n,\ \forall u \in \mathbb{R}^m,\ 1 \le i \le p.
\]

Remark 3.1. Condition (iv) is a simplification of condition (iv) stated in [10, Theorem 3.1]; a slightly different notation is used here. The above result is the global version of the one presented in [10]. $\Box$

Remark 3.2. If Theorem 3.1 applies and the assumptions of Theorem 2.2 are met, namely the vector $b$ is Hurwitz of degree one and the vector $\beta(y(t),u(t))$ has bounded components (for instance, it suffices that the system $(\Sigma)$ be bounded-input-bounded-state (BIBS) stable for every $\theta \in \mathbb{R}^p$ and that $u(t)$ be bounded), then the system (2.6)-(2.7) with output $\hat x = T^{-1}(\hat z)$ is a global adaptive observer for $(\Sigma)$. $\Box$

Suppose now that condition (iv) of Theorem 3.1 either fails or holds for a constant vector $b$ not satisfying the assumptions of Theorem 2.2. Since the conditions of Theorem 3.1 are necessary and sufficient, in order to apply Theorem 2.2
we need to determine a class of transformations, more general than parameter-independent state-space changes of coordinates, taking $(\Sigma)$ into an adaptive observer form with a Hurwitz vector $b$ of degree one. We now recall from [15] the definition of filtered transformations, which will be shown to be able to transform a system $(\Sigma)$ satisfying assumptions (i), (ii) and (iii) into an adaptive observer form with a desired $b$.

Definition 3.1 A filtered transformation is a time-varying global state diffeomorphism, which may depend on the unknown $\theta$:
\[
z = S(x, \xi(t), \theta), \qquad z \in \mathbb{R}^n,\ x \in \mathbb{R}^n,\ \xi \in \mathbb{R}^q,\ \theta \in \mathbb{R}^p,
\]
such that there exists a $C^\infty$ function $S^{-1}(z, \xi(t), \theta)$ satisfying
\[
x = S^{-1}\bigl(S(x, \xi(t), \theta), \xi(t), \theta\bigr), \qquad \forall x \in \mathbb{R}^n,\ \forall\theta \in \mathbb{R}^p,
\]
where $\xi(t)$ is a signal generated by the auxiliary linear asymptotically stable filter ($\Lambda$ is Hurwitz)
\[
\dot\xi = \Lambda\xi + \delta(y(t), u(t)), \qquad \xi(0) = 0,
\]
driven by the inputs $u(t)$, $y(t)$.

Lemma 3.2 [15] If conditions (i)-(iii) of Theorem 2.1 are satisfied for the system $(\Sigma)$, then $(\Sigma)$ is globally transformable by a filtered transformation into an adaptive observer form (2.5), with $b = [1, b_2, \dots, b_n]^T$ the vector of coefficients of the Hurwitz polynomial
\[
s^{n-1} + b_2 s^{n-2} + \dots + b_n = (s+\lambda_1)\cdots(s+\lambda_{n-1}),
\]
where $\lambda_i$, $1 \le i \le n-1$, are arbitrary positive scalars.

Proof. If conditions (i)-(iii) of Theorem 2.1 hold, then Theorem 2.1 guarantees that $(\Sigma)$ is transformable by a global change of coordinates $\zeta = T(x)$ into (S). Consider the $n\times n$ matrix $D = [\,d_1\ d_2\ \cdots\ d_n\,]$, where
\[
d_n = [0\ 0\ \cdots\ 1]^T, \qquad d_j = A_c d_{j+1} + \lambda_j d_{j+1}, \quad 1 \le j \le n-1. \tag{3.2}
\]
By construction the vectors $d_1, \dots, d_n$ are linearly independent: in fact the matrix
\[
D = \begin{bmatrix}
1 & 0 & 0 & \cdots & 0\\
d_{12} & 1 & 0 & \cdots & 0\\
d_{13} & d_{23} & 1 & \cdots & 0\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
d_{1n} & d_{2n} & d_{3n} & \cdots & 1
\end{bmatrix} \tag{3.3}
\]
is nonsingular, and its entries dis satisfy the relations
Sn--1 .~ dl=sn-2 + . . . + din = (s + ) q ) . . . (s + A~_ 0 8n-2 "t" d23 sn-3 + . . . + d2n - (s + ~2) " "" (s q- An-l)
s + d,,-1,n
= s +
An-1 • I'i
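A minimal numeric check of the recursion (3.2) and of the relations above (pure Python; $n = 4$ and the $\lambda_i$ are arbitrary illustrative values): the entries of $d_1$ must reproduce the coefficients of $(s+\lambda_1)(s+\lambda_2)(s+\lambda_3)$.

```python
# Sketch: build d_n, ..., d_1 by the recursion (3.2),
#   d_n = [0,...,0,1]^T,   d_j = A_c d_{j+1} + lambda_j d_{j+1},
# and check that d_1 carries the coefficients of (s+l1)(s+l2)(s+l3),
# as stated by the relations for the matrix D.

def poly_mul(p, q):
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

n = 4
lams = [2.0, 3.0, 5.0]                     # lambda_1 .. lambda_{n-1}, illustrative
d = [0.0] * (n - 1) + [1.0]                # d_n
for j in range(n - 1, 0, -1):              # d_j = A_c d_{j+1} + lambda_j d_{j+1}
    shifted = d[1:] + [0.0]                # A_c d: shift components up
    d = [shifted[i] + lams[j - 1] * d[i] for i in range(n)]

target = [1.0]
for l in lams:
    target = poly_mul(target, [1.0, l])    # (s+2)(s+3)(s+5) = s^3+10s^2+31s+30
print(d, target)                           # both -> [1.0, 10.0, 31.0, 30.0]
```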
Each vector field $\psi_i$ in (S), $1 \le i \le p$, is uniquely expressed as
\[
\psi_i = \sum_{j=1}^{n}\bar\psi_{ij}\,d_j,
\]
which, substituted in (S), gives
\[
\dot\zeta = A_c\zeta + \psi_0(y,u) + \sum_{j=1}^{n} d_j\,\varphi_j^T(y,u)\,\theta, \tag{3.4}
\]
where the rows $\varphi_j^T(y,u) = [\bar\psi_{1j}(y,u), \dots, \bar\psi_{pj}(y,u)]$, $1 \le j \le n$, form the $n\times p$ matrix
\[
\Phi(y,u) = \begin{bmatrix}\varphi_1^T\\ \vdots\\ \varphi_n^T\end{bmatrix}. \tag{3.5}
\]
Now consider the following linear, time-invariant, asymptotically stable system in $\mathbb{R}^{p(n-1)}$:
\[
\dot\xi_i = -\lambda_i\xi_i + \xi_{i+1} + \varphi_{i+1}(y,u), \qquad \xi_i(0) = 0, \quad \xi_i \in \mathbb{R}^p, \quad 1 \le i \le n-1, \tag{3.6}
\]
with $\xi_n \equiv 0$. Define the time-varying affine change of coordinates, depending on the unknown parameter vector $\theta$,
\[
z = \zeta - \sum_{j=2}^{n} d_j\,\xi_{j-1}^T\theta, \tag{3.7}
\]
which admits the global inverse
\[
\zeta = z + \sum_{j=2}^{n} d_j\,\xi_{j-1}^T\theta.
\]
Differentiating (3.7) with respect to time and using (3.6) and (3.4), we obtain
\[
\dot z = A_c\zeta + \psi_0(y,u) + \sum_{j=1}^{n} d_j\varphi_j^T(y,u)\theta - \sum_{j=2}^{n} d_j\,\dot\xi_{j-1}^T\theta
\]
\[
\phantom{\dot z}= A_c z + \psi_0(y,u) + \sum_{j=1}^{n} d_j\varphi_j^T(y,u)\theta + A_c\sum_{j=2}^{n} d_j\,\xi_{j-1}^T\theta + \sum_{j=2}^{n} d_j\bigl(\lambda_{j-1}\xi_{j-1}^T - \xi_j^T - \varphi_j^T(y,u)\bigr)\theta
\]
\[
\phantom{\dot z}= A_c z + \psi_0(y,u) + d_1\varphi_1^T(y,u)\theta + \sum_{j=2}^{n}\bigl(A_c d_j + \lambda_{j-1}d_j\bigr)\xi_{j-1}^T\theta - \sum_{j=2}^{n} d_j\,\xi_j^T\theta.
\]
Taking (3.2) into account, the last two sums telescope to $d_1\xi_1^T\theta$, and we have
\[
\dot z = A_c z + \psi_0(y,u) + d_1\bigl(\varphi_1^T(y,u) + \xi_1^T\bigr)\theta. \tag{3.8}
\]
In addition, since, according to (3.3), $C_c d_j = 0$, $2 \le j \le n$, from (3.7) we have
\[
y = C_c\zeta = C_c z + C_c\sum_{j=2}^{n} d_j\,\xi_{j-1}^T\theta = C_c z. \tag{3.9}
\]
The system (3.8)-(3.9) is in the adaptive observer form (2.5) with $b = d_1$ and $\beta(t) = \varphi_1(y(t),u(t)) + \xi_1(t)$. The filtered transformation is therefore given by the composition of $\zeta = T(x)$ with (3.7), where the dynamics of $\xi_{j-1}(t)$, $2 \le j \le n$, are given by (3.6). $\Box$

Remark 3.3. Assume there exists a choice of $\lambda_i$, $1 \le i \le n-1$, such that in (3.4) $\varphi_i(y,u) = 0$, $i^* \le i \le n$ (a necessary condition for such a choice to exist is that $\mathrm{rank}\{\Phi(y,u)\} = i^*-1$, $\forall y, u$). Then the system (3.6) becomes
\[
\dot\xi_i = -\lambda_i\xi_i + \xi_{i+1} + \varphi_{i+1}(y,u), \qquad 1 \le i \le i^*-2, \qquad \xi_{i^*-1} = 0.
\]
Indeed, from (3.6) we obtain $\xi_i(t) = 0$, $i^*-1 \le i \le n-1$, and therefore the auxiliary filter is of reduced dimension $(i^*-2)p$. Consequently, the transformation (3.7) becomes
\[
z = \zeta - \sum_{j=2}^{i^*-1} d_j\,\xi_{j-1}^T\theta. \tag{3.10}
\]
$\Box$

We are now ready to state the main result of this section.

Theorem 3.3 [15] If the system $(\Sigma)$ is bounded-input-bounded-output (BIBO) stable and conditions (i)-(iii) of Theorem 2.1 are satisfied, then for every bounded input $u(t)$ there exists a global adaptive observer for a system whose state is equivalent to the physical state $x$ of $(\Sigma)$.
Proof. Lemma 3.2 applies, since conditions (i)-(iii) of Theorem 2.1 are satisfied; consequently, there exists a filtered transformation (3.6)-(3.7) which takes $(\Sigma)$ into (3.8)-(3.9), which is in adaptive observer form. Since $(\Sigma)$ is BIBO stable, for every bounded $u(t)$ the functions $\varphi_i(y,u)$, $1 \le i \le n$, are bounded, $\varphi_i$ being continuous, and $\xi_1$ is bounded since (3.6) is exponentially stable. It follows that $\varphi_1(y,u) + \xi_1$ is bounded, and therefore Theorem 2.2 applies to the system (3.8)-(3.9), guaranteeing the construction of an adaptive observer for the state $z$ of (3.8)-(3.9), which is, according to (2.1) and (3.7), an equivalent state for $(\Sigma)$: in fact, $z$ is related to $x$ by a time-varying nonlinear change of coordinates depending on the unknown parameter vector $\theta$. $\Box$

Remark 3.4. Let us examine how Theorem 3.3 specializes for single-output linear systems with $m$ inputs
\[
\dot x = A_c x - \begin{bmatrix} a_1\\ a_2\\ \vdots\\ a_{n-1}\\ a_n\end{bmatrix}y + \sum_{i=1}^{m}\begin{bmatrix} b_1^i\\ \vdots\\ b_n^i\end{bmatrix}u_i, \qquad
y = [\,1\;\;0\;\;0\;\cdots\;0\,]\,x, \tag{L}
\]
which are the realizations in observer form of the transfer matrix
\[
W(s) = [\,W_1(s)\ W_2(s)\ \cdots\ W_m(s)\,], \qquad
W_i(s) = \frac{b_1^i s^{n-1} + \dots + b_n^i}{s^n + a_1 s^{n-1} + \dots + a_n}.
\]
If we are given the transfer matrix $W(s)$ with unknown poles and zeroes, then
\[
\theta = [a_1, \dots, a_n, b_1^1, \dots, b_n^1, \dots, b_1^m, \dots, b_n^m]^T
\]
is the $(m+1)n$-dimensional unknown constant parameter vector and system (L) is of type $(\Sigma)$. Under the assumption of BIBO stability, Theorem 3.3 applies and provides a global adaptive observer for the system described by the transfer matrix $W(s)$ with unknown poles and zeroes. We then reobtain, as a corollary of Theorem 3.3, a well-known result of adaptive control theory for linear systems (see [21, Chap. 4] for an overview). Filtered transformations can be viewed as a tool to produce a nonminimal realization of the transfer matrix $W(s)$: nonminimal realizations of $W(s)$ are needed in order to achieve an adaptive observer form which, as discussed, may be nonminimal. Theorem 3.1 gives sufficient conditions under which filtered transformations, and therefore nonminimal realizations, are not needed. $\Box$
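As a concrete illustration of the adaptive observer (2.6)-(2.7) behind these results, the following toy simulation (pure Python, forward-Euler integration; all numeric values, the regressor $\beta(t)$ and the stabilizing choice of $\psi_0$ are illustrative assumptions, not from the text) runs the observer on a second-order system in adaptive observer form and checks that the state-estimation error becomes small:

```python
import math

# Toy run of the adaptive observer (2.6)-(2.7) for a 2nd-order system
# in adaptive observer form (2.5); all numbers are illustrative.
# b = [1, 1]^T is Hurwitz of degree one (polynomial s + 1).

dt, T = 1e-3, 30.0
theta = 0.5                                 # "unknown" scalar parameter
b = [1.0, 1.0]
lam = 2.0
K = [(b[1] + lam * b[0]) / b[0],            # (2.8): K = (A_c b + lam b)/b1
     (0.0 + lam * b[1]) / b[0]]

z, zh, th = [1.0, 0.0], [0.0, 0.0], 0.0     # state, estimate, parameter estimate
gamma = 2.0                                 # adaptation gain
t = 0.0
for _ in range(int(T / dt)):
    y = z[0]
    beta = math.sin(t) + 0.5                # bounded, persistently exciting
    psi0 = [-3.0 * y, -2.0 * y]             # keeps the plant internally stable
    # plant: zdot = A_c z + psi0(y) + b beta theta
    dz = [z[1] + psi0[0] + b[0] * beta * theta,
          psi0[1] + b[1] * beta * theta]
    # observer (2.6) and update law (2.7), sgn(b1) = +1
    e1 = y - zh[0]
    dzh = [zh[1] + psi0[0] + b[0] * beta * th - K[0] * zh[0] + K[0] * y,
           psi0[1] + b[1] * beta * th - K[1] * zh[0] + K[1] * y]
    dth = gamma * beta * e1
    z = [z[i] + dt * dz[i] for i in range(2)]
    zh = [zh[i] + dt * dzh[i] for i in range(2)]
    th += dt * dth
    t += dt

err = math.hypot(z[0] - zh[0], z[1] - zh[1])
print(err)   # state-estimation error after 30 s (should be small)
```

Here $C_c(sI - A_c + KC_c)^{-1}b = 1/(s+\lambda)$, as in (2.9), so the Claim in Theorem 2.2 applies and the estimation error decays.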
Remark 3.5. Theorem 3.3 does not guarantee an estimate of the physical state $x$, since $x$ is related to $z$ through the unknown parameter $\theta$. If, in addition to the hypotheses of Theorem 3.3, we assume the following persistency of excitation condition
\[
\infty > c_1 I \ge \int_t^{t+T}(\varphi_1+\xi_1)(\varphi_1+\xi_1)^T\,d\tau \ge c_2 I > 0, \tag{3.11}
\]
where $c_1$, $c_2$ and $T$ are positive constants, and $\varphi_1$, $\xi_1$ have bounded time-derivatives for all $t \ge 0$, except possibly at a countable number of points with fixed minimum separation, then the parameter estimates given by the adaptive observer of Theorem 2.2 converge to the true parameter values. This result can be proved by a direct application of Theorem 2.3 of [24, p. 44]. $\Box$

Remark 3.6. It remains an open problem to determine sufficient conditions under which the system $\dot x = f(x, u, \theta)$, which is more general than the system $(\Sigma)$ since the parameters may enter in a nonlinear way, can be transformed by filtered transformations (or more general ones) into adaptive observer form. Any result in this direction would enlarge the applicability of the methods discussed in this section. $\Box$
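The lower bound in a persistency-of-excitation condition of the type (3.11) can be checked numerically for a given regressor. A minimal sketch (pure Python; the regressor $w(t) = [\sin t, \cos t]^T$ and the window $T = 2\pi$ are illustrative assumptions):

```python
import math

# Numeric check of a persistency-of-excitation condition of type (3.11):
# for w(t) = [sin t, cos t]^T, the matrix M = int_0^T w w^T dt over one
# period T = 2*pi is positive definite (M ~ diag(pi, pi)).

N = 100000
T = 2 * math.pi
h = T / N
M = [[0.0, 0.0], [0.0, 0.0]]
for k in range(N):
    t = k * h
    w = [math.sin(t), math.cos(t)]
    for i in range(2):
        for j in range(2):
            M[i][j] += w[i] * w[j] * h

# 2x2 positive definiteness: leading entry and determinant both positive
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
print(M[0][0], det)    # approx pi and pi^2
```

Both tests of the Sylvester criterion pass, so the regressor is persistently exciting over this window with $c_2 I \approx \pi I$.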
4 Adaptive observers for the original coordinates
In this section we provide sufficient conditions for the existence of adaptive observers which asymptotically track the physical coordinates of system $(\Sigma)$, without requiring persistency of excitation.

Theorem 4.1 [15] Assume that the system $(\Sigma)$ is BIBS stable and satisfies the conditions of Theorem 3.1, with $b = [b_1,\dots,b_n]^T$ Hurwitz. Then, for every bounded input $u(t)$ there exists a global adaptive observer for $(\Sigma)$.

Proof. Since the conditions of Theorem 3.1 (and of Theorem 2.1) hold, the system $(\Sigma)$ is globally transformable by a parameter-independent change of coordinates $\zeta = T(x)$ into the adaptive observer form (3.1). Let $\nu$ be the relative degree of $(\Sigma)$. The system (3.1) can be rewritten in the form (3.4)
\[
\dot\zeta = A_c\zeta + \psi_0(y,u) + d_\nu\,\varphi^T(y,u)\,\theta, \qquad y = C_c\zeta, \tag{4.1}
\]
with
\[
d_\nu = \frac{1}{b_\nu}\,b = \Bigl[0, \dots, 0, 1, \frac{b_{\nu+1}}{b_\nu}, \dots, \frac{b_n}{b_\nu}\Bigr]^T, \qquad \varphi(y,u) = b_\nu\,\beta(y,u).
\]
Define the constant vectors
\[
d_{j-1} = A_c d_j + \lambda_{j-1}d_j, \qquad 2 \le j \le \nu, \tag{4.2}
\]
where $\lambda_j$, $1 \le j \le \nu-1$, are arbitrary positive scalars. As in the proof of Lemma 3.2, we introduce the auxiliary linear filter
\[
\dot\xi_i = -\lambda_i\xi_i + \xi_{i+1}, \qquad \xi_i \in \mathbb{R}^p, \quad 1 \le i \le \nu-2, \quad \xi_i(0) = 0
\]
\[
\dot\xi_{\nu-1} = -\lambda_{\nu-1}\xi_{\nu-1} + \varphi(y,u), \qquad \xi_{\nu-1} \in \mathbb{R}^p, \quad \xi_{\nu-1}(0) = 0 \tag{4.3}
\]
(note that no filter is needed when $\nu = 1$) and define the time-varying change of coordinates
\[
z = \zeta - \sum_{j=2}^{\nu} d_j\,\xi_{j-1}^T\theta, \tag{4.4}
\]
which transforms the system (4.1) into the adaptive observer form
\[
\dot z = A_c z + \psi_0(y,u) + d_1\bigl(\varphi_1^T(y,u) + \xi_1^T\bigr)\theta, \qquad y = C_c z, \tag{4.5}
\]
with $\varphi_1(y,u) = 0$ if $\nu > 1$ ($\varphi_1(y,u) = \varphi(y,u)$ if $\nu = 1$) and $d_1$ satisfying
\[
[\,s^{n-1}\ s^{n-2}\ \cdots\ 1\,]\,d_1 = (s+\lambda_1)\cdots(s+\lambda_{\nu-1})\Bigl(s^{n-\nu} + \frac{b_{\nu+1}}{b_\nu}s^{n-\nu-1} + \dots + \frac{b_n}{b_\nu}\Bigr). \tag{4.6}
\]
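Identity (4.6) states that each recursion step (4.2) multiplies the polynomial associated with $d_j$ by a factor $(s+\lambda_{j-1})$. A minimal numeric check (pure Python; $n = 4$, $\nu = 2$ and all coefficients are illustrative):

```python
# Sketch of identity (4.6): starting from d_nu = b/b_nu, one recursion
# step (4.2) multiplies the associated polynomial by (s + lambda_1).
# Illustrative values: n = 4, nu = 2, b/b_nu -> s^2 + 3s + 2.

def poly_mul(p, q):
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

n = 4
bbar = [0.0, 1.0, 3.0, 2.0]            # b/b_nu: first nonzero entry equals 1
lam1 = 2.0                             # lambda_1, arbitrary positive
shifted = bbar[1:] + [0.0]             # A_c d: shift components up
d1 = [shifted[i] + lam1 * bbar[i] for i in range(n)]

target = poly_mul([1.0, lam1], [1.0, 3.0, 2.0])   # (s+2)(s^2+3s+2)
print(d1, target)                      # both -> [1.0, 5.0, 8.0, 4.0]
```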
Since $\lambda_i$, $1 \le i \le \nu-1$, are positive scalars and by hypothesis the vector $b$ is Hurwitz, the polynomial in (4.6) is Hurwitz of degree $n-1$. Since, in addition, $(\Sigma)$ is assumed to be BIBS stable, and therefore BIBO stable for every $\theta \in \mathbb{R}^p$, $\varphi_1 + \xi_1$ in (4.5) is bounded for every bounded $u(t)$, and Theorem 2.2 applies to (4.5). We consider the adaptive observer
\[
\dot{\hat z} = (A_c - KC_c)\hat z + \psi_0(y,u) + d_1\bigl(\varphi_1^T(y,u) + \xi_1^T\bigr)\hat\theta + Ky \tag{4.7}
\]
\[
\dot{\hat\theta} = \Gamma\bigl(\varphi_1(y,u) + \xi_1\bigr)(y - C_c\hat z), \tag{4.8}
\]
with $K = (A_c + \lambda I)d_1$, $\lambda$ being an arbitrary positive scalar. The error system is given by
\[
\dot e = (A_c - KC_c)e + d_1\bigl(\varphi_1^T(y,u) + \xi_1^T\bigr)e_\theta \tag{4.9}
\]
\[
\dot e_\theta = -\Gamma\bigl(\varphi_1(y,u) + \xi_1\bigr)C_c e. \tag{4.10}
\]
For every bounded input $u(t)$, Theorem 2.2 guarantees that $e(t)$ and $e_\theta(t)$ are bounded, and
\[
\lim_{t\to\infty}\|e(t)\| = 0. \tag{4.11}
\]
Now, using the transformation (4.4), define the estimate of the state $x$ of system $(\Sigma)$
\[
\hat x = T^{-1}\Bigl(\hat z + \sum_{j=2}^{\nu} d_j\,\xi_{j-1}^T\hat\theta\Bigr), \tag{4.12}
\]
so that (4.12) is the output of the adaptive observer (4.3), (4.7), (4.8). By (4.4) and (4.12), the estimation error in the $\zeta$-coordinates is given by
\[
\zeta - \hat\zeta = e + \sum_{j=2}^{\nu} d_j\,\xi_{j-1}^T e_\theta. \tag{4.13}
\]
If $\nu = 1$, then no filter is needed and the transformation (4.4) coincides with the identity transformation. Consequently, (4.11) and the hypothesis of bounded state $x(t)$ imply
\[
\lim_{t\to\infty}\|x(t) - \hat x(t)\| = 0. \tag{4.14}
\]
If $\nu > 1$, differentiating (4.9) with respect to time and recalling that $\varphi_1(y,u) = 0$, we obtain
\[
\ddot e = (A_c - KC_c)\dot e + d_1\dot\xi_1^T e_\theta + d_1\xi_1^T\dot e_\theta.
\]
By (4.3), (4.9) and (4.10), the signals $\xi_1$, $\dot e$ and $\dot e_\theta$ are bounded. It follows that $\ddot e$ is also bounded which, in turn, implies that $\dot e$ is uniformly continuous. Besides,
\[
\lim_{t\to\infty}\int_0^t \dot e_i(\tau)\,d\tau = \lim_{t\to\infty} e_i(t) - e_i(0) = -e_i(0) < \infty, \qquad 1 \le i \le n.
\]
Applying Barbalat's Lemma [23, p. 210], we have
\[
\lim_{t\to\infty}\|\dot e(t)\| = 0. \tag{4.15}
\]
From (4.9), (4.11) and (4.15), we obtain
\[
\lim_{t\to\infty}\xi_1^T(t)e_\theta(t) = 0. \tag{4.16}
\]
The time derivatives of $\xi_1^T e_\theta$ are
\[
\frac{d}{dt}(\xi_1^T e_\theta) = (-\lambda_1\xi_1 + \xi_2)^T e_\theta + \xi_1^T(-\Gamma\xi_1 C_c e)
\]
\[
\frac{d^2}{dt^2}(\xi_1^T e_\theta) = \ddot\xi_1^T e_\theta + 2\dot\xi_1^T\dot e_\theta + \xi_1^T\ddot e_\theta. \tag{4.17}
\]
Since $\xi_1$, $\xi_2$, $e$ and $e_\theta$ are bounded, $\frac{d^2}{dt^2}(\xi_1^T e_\theta)$ is also bounded and, as a consequence, $\frac{d}{dt}(\xi_1^T e_\theta)$ is uniformly continuous. By (4.16) and using Barbalat's Lemma it follows that
\[
\lim_{t\to\infty}\frac{d}{dt}(\xi_1^T e_\theta) = 0, \tag{4.18}
\]
which, in view of (4.11), (4.16), (4.17) and (4.18), implies
\[
\lim_{t\to\infty}\xi_2^T(t)e_\theta(t) = 0.
\]
Iterating the above argument, it is easy to show that
\[
\lim_{t\to\infty}\xi_i^T(t)e_\theta(t) = 0, \qquad 1 \le i \le \nu-1. \tag{4.19}
\]
From (4.13) and (4.19) we can see that (4.14) holds for every value of $\nu$. Therefore, the dynamic equations (4.3), (4.7), (4.8) and (4.12) represent a global adaptive observer for system $(\Sigma)$, for every bounded input $u(t)$. $\Box$
Remark 4.1. Consider the system $(\Sigma)$ and suppose that Theorem 4.1 does not apply, while Theorem 3.3 does: we can then construct an adaptive observer that asymptotically estimates an equivalent state. Let $\nu$ be the relative degree of system $(\Sigma)$. It is easy to see, by using the same arguments as in the proof of Theorem 4.1, that the estimates obtained by the inverse transformation
\[
\hat x = T^{-1}\Bigl(\hat z + \sum_{j=2}^{n} d_j\,\xi_{j-1}^T\hat\theta\Bigr)
\]
are such that
\[
\lim_{t\to\infty}|x_i(t) - \hat x_i(t)| = 0, \qquad 1 \le i \le \nu, \tag{4.20}
\]
where $x_i$ is the $i$-th component of $x$. $\Box$

5 Adaptive output-feedback stabilization: preliminary results
~ = ~ ( = ) + 90(=)+
0,g,(=) ~+
0,q,(~), ~ e n ~ " , ~ e m , 0 e ~ c ~ p i=l
y=h(z),
yeIR,
( 57n) where z is the state vector, u is the control, 0 = [01,..., Op]w is the (constant) parameter vector belonging to 1"2,a closed subset o f l R p, h : C ~ ( I W *, R), h(0) = 0, is the output function, f , go, gi, qi, 1 < i < p, are smooth vector fields on IR", with f(0) = 0, qi(O) = O, 1 < i < p, and p
go(~) + ~o,gi(~) ~=g(~; 0) # 0, w e m " , v0 e ~. i=l
From now on we shall assume without loss of generality that the origin is an equilibrium point. We now further restrict the class of systems $(\Sigma_R)$ determined by Theorem 2.1.

Theorem 5.1 [18] The system $(\Sigma_R)$ is transformable by a global state-space diffeomorphism (independent of $\theta$)
\[
\zeta = T(x), \qquad T(0) = 0, \tag{5.1}
\]
into (let $\theta_0 = 1$ for convenience of notation)
\[
\dot\zeta = A_c\zeta + \Bigl(b_0 + \sum_{i=1}^{p}\theta_i b_i\Bigr)\sigma(y)u + \psi_0(y) + \sum_{i=1}^{p}\theta_i\psi_i(y), \qquad y = C_c\zeta, \tag{$S_R$}
\]
with $(A_c, b, C_c)$ in observer canonical form
\[
A_c = \begin{bmatrix}
0 & 1 & 0 & \cdots & 0\\
0 & 0 & 1 & \cdots & 0\\
\vdots & & & \ddots & \vdots\\
0 & 0 & 0 & \cdots & 1\\
0 & 0 & 0 & \cdots & 0
\end{bmatrix}, \qquad
b(\theta) = \begin{bmatrix} b_1(\theta)\\ b_2(\theta)\\ \vdots\\ b_{n-1}(\theta)\\ b_n(\theta)\end{bmatrix}, \qquad
C_c = [\,1\;\;0\;\;0\;\cdots\;0\,],
\]
if and only if

(i) $[\mathrm{ad}_f^i r, \mathrm{ad}_f^j r] = 0$, $0 \le i, j \le n-1$,
(ii) $[q_i, \mathrm{ad}_f^j r] = 0$, $1 \le i \le p$, $0 \le j \le n-2$,
(iii) $[g_j, \mathrm{ad}_f^i r] = 0$, $0 \le j \le p$, $0 \le i \le n-2$,
(iv) $g_j = \sigma(h(x))\sum_{i=1}^{n} b_{ji}(-1)^{n-i}\,\mathrm{ad}_f^{n-i}r$, $0 \le j \le p$,
(v) the vector fields $f$ and $r$ are complete,

where $r$ is the vector field satisfying
\[
\langle dh, r\rangle = \langle d(L_f h), r\rangle = \dots = \langle d(L_f^{n-2}h), r\rangle = 0, \qquad \langle d(L_f^{n-1}h), r\rangle = 1.
\]
Proof. Conditions (i) are shown in [19] to be necessary and sufficient for the existence of a local state-space change of coordinates
\[
\zeta_i = T_i(x), \qquad 1 \le i \le n, \tag{5.2}
\]
in which $\mathrm{ad}_f^i r = (-1)^i\partial/\partial\zeta_{n-i}$, $0 \le i \le n-1$, and the system
\[
\dot x = f(x), \qquad y = h(x) \tag{5.3}
\]
takes the form
\[
\dot\zeta = A_c\zeta + \psi_0(y), \qquad y = C_c\zeta. \tag{5.4}
\]
Conditions (i), (ii) and (iii) are shown in [10] to be necessary and sufficient for the existence of a local, parameter-independent state-space diffeomorphism (5.2) transforming $(\Sigma_R)$ into
\[
\dot\zeta = A_c\zeta + \Bigl(\gamma_0(y) + \sum_{i=1}^{p}\theta_i\gamma_i(y)\Bigr)u + \psi_0(y) + \sum_{i=1}^{p}\theta_i\psi_i(y), \qquad y = C_c\zeta. \tag{5.5}
\]
Condition (iv) implies that the vector fields $\gamma_i(y)$ in (5.5) become
\[
\gamma_i(y) = \sigma(y)\,[\,b_{i1}\ b_{i2}\ \cdots\ b_{in}\,]^T, \qquad 0 \le i \le p. \tag{5.6}
\]
The proof of this statement is an easy consequence of the main result in [19]: in the coordinates $\zeta = T(x)$ in which the system (5.3) takes the form (5.4), we have
\[
\mathrm{ad}_f^i r = (-1)^i\frac{\partial}{\partial\zeta_{n-i}},
\]
and (iv) becomes, in the $\zeta$-coordinates,
\[
\text{(iv)}\quad \gamma_j = \sigma(y)\sum_{i=1}^{n} b_{ji}\frac{\partial}{\partial\zeta_i}, \qquad 0 \le j \le p,
\]
i.e., (5.6). On the other hand, the necessity of conditions (i), (ii), (iii) and (iv), which are coordinate-free, is readily verified on $(S_R)$. Condition (v) is necessary and sufficient, according to [20], for the above change of coordinates to be global. When (v) is violated the change of coordinates is a local one. $\Box$

Let us recall [1,2] the definition of relative degree.

Definition 5.1 Let $\rho$, the (strong) relative degree of the output $y$ with respect to the control $u$ for the system $(\Sigma_R)$ with $\theta = 0$ (nominal system), be defined by
\[
L_g L_f^i h(x) = 0, \qquad 0 \le i \le \rho-2,\ \forall x \in \mathbb{R}^n; \qquad L_g L_f^{\rho-1}h(x) \neq 0 \quad \text{for some } x \in \mathbb{R}^n. \qquad \Box
\]

The vector field $g(x;\theta)$ becomes, in the $\zeta$-coordinates,
\[
\bar g = \sigma(y)\sum_{i=1}^{n}\Bigl(\sum_{j=0}^{p}\theta_j b_{ji}\Bigr)\frac{\partial}{\partial\zeta_i}.
\]
At this point we introduce a different parametrization. We consider the components $b_\rho(\theta), \dots, b_n(\theta)$ of $b(\theta) = [b_1(\theta),\dots,b_n(\theta)]^T$ as unknown parameters and rewrite $(S_R)$ as
\[
\dot\zeta = A_c\zeta + b\,\sigma(y)u + \Psi(y)\bar\theta, \qquad y = C_c\zeta,
\]
where $\bar\theta = [\theta_0, \theta_1, \dots, \theta_p]^T$ with $\theta_0 = 1$ for convenience of notation. The above system contains $p + n - \rho + 1$ unknown parameters. In what follows we assume (recall Definition 2.3) that
(vi) the vector $b(\theta) = b_0 + \sum_{i=1}^{p}\theta_i b_i$, which appears in condition (iv), is Hurwitz with constant degree $\rho$ for every $\theta \in \Omega$, and the sign of the scalar $b_{0\rho} + \sum_{i=1}^{p} b_{i\rho}\theta_i$ is known and constant in $\Omega$.

The class of systems $(\Sigma_R)$ satisfying conditions (i)-(v) of Theorem 5.1 and, in addition, assumption (vi), has the remarkable property that the zero dynamics (see [1, Chap. 4] for a definition) can be expressed, in suitable global coordinates, by a linear asymptotically stable system.

Theorem 5.2 The zero dynamics of the system $(\Sigma_R)$ with relative degree $\rho$ satisfying conditions (i)-(vi) can be expressed in suitable global coordinates as
\[
\dot z = \begin{bmatrix}
-\dfrac{b_{\rho+1}}{b_\rho} & 1 & 0 & \cdots & 0\\[4pt]
-\dfrac{b_{\rho+2}}{b_\rho} & 0 & 1 & \cdots & 0\\
\vdots & \vdots & & \ddots & \vdots\\
-\dfrac{b_{n-1}}{b_\rho} & 0 & 0 & \cdots & 1\\[4pt]
-\dfrac{b_n}{b_\rho} & 0 & 0 & \cdots & 0
\end{bmatrix} z. \tag{5.7}
\]
Proof. Conditions (i)-(v) guarantee by Theorem 5.1 the existence of a global change of coordinates (5.1) in which the system $(\Sigma_R)$ is expressed as $(S_R)$. Since the relative degree is $\rho$, we have $b_1 = \dots = b_{\rho-1} = 0$, $b_\rho \neq 0$ and, by assumption (vi), the polynomial
\[
b_\rho s^{n-\rho} + \dots + b_{n-1}s + b_n \tag{5.8}
\]
is Hurwitz. By choosing the control law
\[
u = \frac{1}{b_\rho\sigma(y)}\,(-\zeta_{\rho+1}), \tag{5.9}
\]
the manifold $\{\zeta \in \mathbb{R}^n : \zeta_1 = \dots = \zeta_\rho = 0\}$ is made invariant, and the dynamics on it (the zero dynamics according to [1, Chap. 4]) are given by (5.7) and are asymptotically stable since (5.8) is Hurwitz. $\Box$

Definition 5.2 A global (adaptive) output-feedback stabilizing controller for system $(\Sigma_R)$ is a finite-dimensional system
\[
\dot w = \alpha(w, y(t)), \qquad w(0) = w_0, \quad w \in \mathbb{R}^l,
\]
\[
u = \beta(w, y(t)), \qquad u \in \mathbb{R},
\]
such that for every $x(0) \in \mathbb{R}^n$ and $w_0 \in \mathbb{R}^l$, $\|x(t)\|$ and $\|w(t)\|$ are bounded $\forall t \ge 0$, and
\[
\lim_{t\to\infty}\|x(t)\| = 0
\]
for the closed-loop system. $\Box$
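The structure in Theorem 5.2 can be checked numerically: the zero-dynamics matrix in (5.7) is a companion-type matrix whose characteristic polynomial is $(b_\rho s^{n-\rho} + \dots + b_n)/b_\rho$, so stability reduces to the Hurwitz property of (5.8). A sketch with illustrative numbers ($n = 4$, $\rho = 2$):

```python
# Numeric sketch of Theorem 5.2 (illustrative values: n = 4, rho = 2):
# the zero-dynamics matrix of (5.7) has characteristic polynomial
# (b_rho s^(n-rho) + ... + b_n)/b_rho, i.e., (5.8) up to the factor b_rho.

def char_poly(A):
    """Coefficients of det(sI - A), highest power first (Faddeev-LeVerrier)."""
    n = len(A)
    M = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    out = [1.0]
    for k in range(1, n + 1):
        M = [[sum(A[i][m] * M[m][j] for m in range(n)) for j in range(n)]
             for i in range(n)]
        c = -sum(M[i][i] for i in range(n)) / k
        out.append(c)
        for i in range(n):
            M[i][i] += c
    return out

b_rho, b3, b4 = 1.0, 3.0, 2.0            # b_2, b_3, b_4: (5.8) is s^2 + 3s + 2
Z = [[-b3 / b_rho, 1.0],
     [-b4 / b_rho, 0.0]]                 # zero-dynamics matrix of (5.7)
print(char_poly(Z))                      # -> [1.0, 3.0, 2.0]
```

Since $s^2 + 3s + 2 = (s+1)(s+2)$ is Hurwitz, the zero dynamics in this illustrative case are asymptotically stable, as the theorem asserts.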
We now establish a technical result which generalizes Lemma 3.2 and will be one of the basic tools in the next two sections.

Lemma 5.3 Consider the system
\[
\dot x = A_c x + f_0(y, u, \theta) + \sum_{i=1}^{p} f_i(y,u)\,\theta_i, \qquad y = C_c x. \tag{5.10}
\]
Let $f_{ij}$, $1 \le j \le n$, be the components of the vector $f_i$. If
\[
f_{ij}(y,u) = 0, \qquad 1 \le i \le p, \quad 1 \le j \le \rho_1-1, \quad \forall y \in \mathbb{R},\ \forall u \in \mathbb{R}, \tag{5.11}
\]
then there exists a filtered transformation
\[
z = S(x, \xi(t), \theta), \qquad \dot\xi = \Lambda\xi + \delta(y,u), \quad \xi(0) = 0, \tag{5.12}
\]
taking (5.10) into
\[
\dot z = A_c z + f_0(y,u,\theta) + b\sum_{i=1}^{p}\gamma_i(y,u,\xi)\,\theta_i, \qquad y = C_c z, \tag{5.13}
\]
in which $b = [0, \dots, 0, b_{\rho_2}, \dots, b_n]^T$ is a Hurwitz vector of degree $\rho_2 \le \rho_1$ such that
\[
b_{\rho_2}s^{n-\rho_2} + \dots + b_n = b_{\rho_2}(s+\lambda_{\rho_2})\cdots(s+\lambda_{n-1}),
\]
with $\lambda_i$, $\rho_2 \le i \le n-1$, arbitrary positive scalars.

Proof. Consider the $n\times(n-\rho_2+1)$ matrix
\[
D = [\,d_{\rho_2}\ \cdots\ d_n\,] = \begin{bmatrix}
0 & 0 & \cdots & 0\\
\vdots & \vdots & & \vdots\\
1 & 0 & \cdots & 0\\
d_{\rho_2,\rho_2+1} & 1 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
d_{\rho_2,n} & d_{\rho_2+1,n} & \cdots & 1
\end{bmatrix}, \tag{5.14}
\]
where the columns $d_i$, $\rho_2 \le i \le n$, which are linearly independent, are related by the recursive formula
\[
d_n = [0\ \cdots\ 0\ 1]^T, \qquad d_{j-1} = A_c d_j + \lambda_{j-1}d_j, \quad n \ge j \ge \rho_2+1. \tag{5.15}
\]
From (5.14) and (5.15) it follows that
\[
s^{n-\rho_2} + d_{\rho_2,\rho_2+1}s^{n-\rho_2-1} + \dots + d_{\rho_2,n} = (s+\lambda_{\rho_2})\cdots(s+\lambda_{n-1})
\]
\[
s^{n-\rho_2-1} + d_{\rho_2+1,\rho_2+2}s^{n-\rho_2-2} + \dots + d_{\rho_2+1,n} = (s+\lambda_{\rho_2+1})\cdots(s+\lambda_{n-1})
\]
\[
\vdots
\]
\[
s + d_{n-1,n} = s + \lambda_{n-1}.
\]
Note that $d_{\rho_2} = \frac{1}{b_{\rho_2}}\,b$. Condition (5.11) guarantees that the first $\rho_1-1$ components of the vectors $f_i$, $1 \le i \le p$, are zero: each vector field $f_i(y,u)$ can be uniquely expressed as
\[
f_i(y,u) = \sum_{k=\rho_2}^{n}\varphi_{ik}(y,u)\,d_k, \qquad 1 \le i \le p. \tag{5.16}
\]
Substituting the expression (5.16) in (5.10), we obtain
\[
\dot x = A_c x + \sum_{i=1}^{p}\theta_i\sum_{k=\rho_2}^{n}\varphi_{ik}(y,u)\,d_k + f_0(y,u,\theta)
       = A_c x + \sum_{k=\rho_2}^{n} d_k\,\varphi_k^T(y,u)\,\theta + f_0(y,u,\theta), \tag{5.17}
\]
where the $p\times 1$ vectors $\varphi_k$ are given by $\varphi_k = [\varphi_{1k},\dots,\varphi_{pk}]^T$, $\rho_2 \le k \le n$. Now consider the filtered transformation consisting of the linear time-invariant asymptotically stable filters
\[
\dot\xi_j = -\lambda_j\xi_j + \xi_{j+1} + \varphi_{j+1}(y,u), \qquad \rho_2 \le j \le n-1, \qquad \xi_n \equiv 0, \tag{5.18}
\]
where $\xi_j \in \mathbb{R}^p$ and $\xi_j(0) = 0$, and the time-varying affine change of coordinates
\[
z = x - \sum_{j=\rho_2+1}^{n} d_j\,\xi_{j-1}^T\theta, \tag{5.19}
\]
which has the global inverse
\[
x = z + \sum_{j=\rho_2+1}^{n} d_j\,\xi_{j-1}^T\theta.
\]
Differentiating (5.19) with respect to time, and using (5.17) and (5.18), we obtain
\[
\dot z = A_c x + \sum_{k=\rho_2}^{n} d_k\,\varphi_k^T(y,u)\theta + f_0(y,u,\theta) - \sum_{j=\rho_2+1}^{n} d_j\,\dot\xi_{j-1}^T\theta
\]
\[
\phantom{\dot z}= A_c z + f_0(y,u,\theta) + d_{\rho_2}\varphi_{\rho_2}^T(y,u)\theta + \sum_{j=\rho_2+1}^{n}\bigl(A_c d_j + \lambda_{j-1}d_j\bigr)\xi_{j-1}^T\theta - \sum_{j=\rho_2+1}^{n} d_j\,\xi_j^T\theta,
\]
which, taking (5.15) into account and telescoping, becomes
\[
\dot z = A_c z + f_0(y,u,\theta) + d_{\rho_2}\bigl[\varphi_{\rho_2}^T(y,u) + \xi_{\rho_2}^T\bigr]\theta. \tag{5.20}
\]
In addition, since according to (5.14) $C_c d_j = 0$, $\rho_2+1 \le j \le n$, from (5.19) we obtain
\[
y = C_c x = C_c z + \sum_{j=\rho_2+1}^{n} C_c d_j\,\xi_{j-1}^T\theta = C_c z. \tag{5.21}
\]
Recalling that $d_{\rho_2} = \frac{1}{b_{\rho_2}}\,b$, we can finally express (5.20)-(5.21) in the form (5.13), defining $\gamma_i(y,u,\xi)$ as
\[
\gamma_i(y,u,\xi) = \frac{1}{b_{\rho_2}}\bigl[\varphi_{i,\rho_2}(y,u) + \xi_{\rho_2,i}\bigr], \qquad 1 \le i \le p,
\]
with $\xi_{\rho_2,i}$ being the $i$-th component of the vector $\xi_{\rho_2}$. $\Box$

6 Adaptive output-feedback stabilization: relative-degree-one case
Theorem 6.1 Consider the system $(\Sigma_R)$ with relative degree $\rho = 1$. If assumptions (i)-(vi) are satisfied, then there exists a global adaptive output-feedback stabilizing controller for every unknown value of the parameter vector $\theta \in \Omega$.

Proof. Since assumptions (i)-(v) are satisfied, Theorem 5.1 applies and guarantees the existence of a global change of coordinates $\zeta = T(x)$ transforming $(\Sigma_R)$
into
\[
\dot\zeta = A_c\zeta + b\,\sigma(y)u + \sum_{i=0}^{p}\theta_i\psi_i(y) \triangleq A_c\zeta + b\,\sigma(y)u + \Psi(y)\bar\theta, \qquad y = C_c\zeta, \tag{6.1}
\]
which can be rewritten as
\[
\dot\zeta = A_c\zeta + \frac{1}{k}\,\bar b\,\mathrm{sgn}(b_1)\,\sigma(y)u + \sum_{i=0}^{p}\theta_i\psi_i(y), \qquad y = C_c\zeta, \tag{6.2}
\]
where $k = \frac{1}{|b_1|}$ and $\bar b = \bigl[1, \frac{b_2}{b_1}, \dots, \frac{b_n}{b_1}\bigr]^T$. Consider the matrix
\[
D = \begin{bmatrix}
1 & 0 & \cdots & 0\\
d_{12} & 1 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
d_{1n} & d_{2n} & \cdots & 1
\end{bmatrix} \triangleq [\,d_1\ \cdots\ d_n\,], \tag{6.3}
\]
where the vectors $d_i$, $1 \le i \le n$, which are linearly independent, are related by the recursive formula
\[
d_n = [0\ \cdots\ 0\ 1]^T, \qquad d_{j-1} = A_c d_j + \lambda_{j-1}d_j, \quad n \ge j \ge 2, \tag{6.4}
\]
with $\lambda_j$, $1 \le j \le n-1$, being positive scalars. The system (6.2) can be rewritten as
\[
\dot\zeta = A_c\zeta + \frac{1}{k}\Bigl(d_1 + \sum_{j=2}^{n}\tilde\beta_j d_j\Bigr)\mathrm{sgn}(b_1)\,\sigma(y)u + \sum_{i=0}^{p}\theta_i\psi_i(y), \tag{6.5}
\]
where $\tilde\beta = [1, \tilde\beta_2, \dots, \tilde\beta_n]^T$ is such that $\bar b = D\tilde\beta$. Defining $\beta_j = \tilde\beta_{j+1}/k$, $1 \le j \le n-1$, we can rewrite (6.5) as
\[
\dot\zeta = A_c\zeta + \frac{1}{k}\,d_1\,\mathrm{sgn}(b_1)\,\sigma(y)u + \sum_{j=1}^{n-1} d_{j+1}\beta_j\,\mathrm{sgn}(b_1)\,\sigma(y)u + \sum_{i=0}^{p}\theta_i\psi_i(y), \qquad y = C_c\zeta. \tag{6.6}
\]
Now, consider the following filtered transformation consisting of the filters
\[
\dot\mu_{n-1,1} = -\lambda_1\mu_{n-1,1} + \mu_{n-1,2}
\]
\[
\vdots
\]
\[
\dot\mu_{n-1,n-1} = -\lambda_{n-1}\mu_{n-1,n-1} + \mathrm{sgn}(b_1)\,\sigma(y)u
\]
\[
\dot\mu_{n-2,1} = -\lambda_1\mu_{n-2,1} + \mu_{n-2,2} \tag{6.7}
\]
\[
\vdots
\]
\[
\dot\mu_{n-2,n-2} = -\lambda_{n-2}\mu_{n-2,n-2} + \mathrm{sgn}(b_1)\,\sigma(y)u
\]
\[
\vdots
\]
\[
\dot\mu_{1,1} = -\lambda_1\mu_{1,1} + \mathrm{sgn}(b_1)\,\sigma(y)u
\]
and of the transformation
\[
\bar z = \zeta - \sum_{i=1}^{n-1}\beta_i\sum_{j=2}^{i+1} d_j\,\mu_{i,j-1}. \tag{6.8}
\]
In these new coordinates, we have
\[
\dot{\bar z} = A_c\bar z + \frac{1}{k}\,d_1\,\mathrm{sgn}(b_1)\,\sigma(y)u + \sum_{i=0}^{p}\theta_i\psi_i(y) + d_1\sum_{i=1}^{n-1}\beta_i\mu_{i,1}, \qquad y = C_c\bar z. \tag{6.9}
\]
We now apply Lemma 5.3, which guarantees that there exists a filtered transformation
\[
z = S(\bar z, \xi(t), \theta), \qquad \dot\xi = \Lambda\xi + \delta(y), \quad \xi(0) = 0, \tag{6.10}
\]
such that (6.9) is transformed into
\[
\dot z = A_c z + \frac{1}{k}\,d_1\,\mathrm{sgn}(b_1)\,\sigma(y)u + d_1\sum_{i=1}^{n-1}\beta_i\mu_{i,1} + d_1\sum_{i=0}^{p}\gamma_i(y,\xi)\theta_i
\]
\[
\phantom{\dot z}\triangleq A_c z + \frac{1}{k}\,d_1\,\mathrm{sgn}(b_1)\,\sigma(y)u + d_1\mu^T\beta + d_1\bigl[\gamma_0(y,\xi) + \gamma^T(y,\xi)\theta\bigr], \qquad y = C_c z, \tag{6.11}
\]
where $\beta = [\beta_1,\dots,\beta_{n-1}]^T$, $\mu = [\mu_{1,1},\dots,\mu_{n-1,1}]^T$, $\gamma = [\gamma_1,\dots,\gamma_p]^T$ and $\gamma_0(0,0) = 0$, $\gamma(0,0) = 0$. Let the control $u$ be defined as
\[
u = -\frac{\hat k}{\sigma(y)}\,\mathrm{sgn}(b_1)\bigl[K_c\hat z + \hat\beta^T\mu + \gamma_0(y,\xi) + \gamma^T(y,\xi)\hat\theta\bigr], \tag{6.12}
\]
where a(y) • 0, Vy E IR, by virtue of the hypothesis that g(z; O) ~ 0, Vz E IRn, V0 E J~, and sgn(bx) is known by assumption (vi). The matrix Kc is chosen so that (Xc is a positive scalar) det[sI - (Ac - d l K c ) ] -- (s + .~c)(s " - t + d12s n-2 + - - - + dzn),
(6.13)
i.e., the eigenvalues of the matrix (Ac - d z K c ) are the zeroes of the polynomial s"-Z+ds2s"-2+ ". .+dx., which is Hurwitz by (6.4), in addition to -A¢. Such a K¢ exists since the pair (A¢, dx) is controllable (since dl is Hurwitz by construction and, therefore, dzn ¢ 0). The state estimate ~ in (6.12) is provided by the adaptive observer z=
-{- bl dl ~r(y)u -~ dl [.T~ -'J"")'O(y, ~¢) "[- "yT(y,~¢)0] @ K o ( y - C c ~ ' ) ,
(6.14)
in which Ks is chosen so that (As is a positive scalar) det[sI - (A¢ - KoC¢)] = (s + ~o)(S "-x + d12 sn-2 -~- • • • -~ din ) .
(6.15)
Such a Ks exists since the pair (C¢, A¢) is observable. The parameter update laws are chosen to be
\begin{align*}
\dot{\hat\theta} &= G_\theta\,[y+\eta(y-C_c\hat z)]\,\gamma(y,\xi) \tag{6.16}\\
\dot{\hat\beta} &= G_\beta\,[y+\eta(y-C_c\hat z)]\,\mu \tag{6.17}\\
\dot{\hat k} &= -g_k\,\big[K_c\hat z + \hat\beta^T\mu + \gamma_0(y,\xi) + \gamma^T(y,\xi)\hat\theta\big]\,y \tag{6.18}\\
\dot{\hat b}_1 &= -g_b\,(y-C_c\hat z)\,\sigma(y)u, \tag{6.19}
\end{align*}
with $\eta$, $g_k$ and $g_b$ positive scalars, and $G_\theta$ and $G_\beta$ symmetric positive definite matrices. From (6.11), (6.12), (6.14) and (6.16)–(6.19), we obtain the extended dynamics (we use the notations $\tilde z = z-\hat z$, $\tilde\theta = \theta-\hat\theta$, $\tilde\beta = \beta-\hat\beta$, $\tilde k = k-\hat k$, $\tilde b_1 = b_1-\hat b_1$):
\begin{align*}
\dot z &= (A_c-d_1K_c)z + d_1K_c\tilde z + d_1\big[\mu^T\tilde\beta + \gamma^T(y,\xi)\tilde\theta\big]\\
&\quad + d_1|b_1|\,\tilde k\,\big[K_c\hat z + \hat\beta^T\mu + \gamma_0(y,\xi) + \gamma^T(y,\xi)\hat\theta\big] \tag{6.20}\\
\dot{\tilde z} &= (A_c-K_oC_c)\tilde z + d_1\tilde b_1\sigma(y)u + d_1\big[\mu^T\tilde\beta + \gamma^T(y,\xi)\tilde\theta\big] \tag{6.21}\\
\dot{\tilde\theta} &= -G_\theta\,C_c(z+\eta\tilde z)\,\gamma(y,\xi)\\
\dot{\tilde\beta} &= -G_\beta\,C_c(z+\eta\tilde z)\,\mu \tag{6.22}\\
\dot{\tilde k} &= g_k\,\big[K_c\hat z + \hat\beta^T\mu + \gamma_0(y,\xi) + \gamma^T(y,\xi)\hat\theta\big]\,C_cz\\
\dot{\tilde b}_1 &= g_b\,C_c\tilde z\,\sigma(y)u.
\end{align*}
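The gain choices (6.13) and (6.15) are classical pole placements, and they make the triples that are later invoked with the Meyer–Kalman–Yakubovich lemma strictly positive real. A numerical sketch with illustrative third-order data (the matrices, the vector $d_1$ and $\lambda_c$ below are hypothetical, not from the text):

```python
import numpy as np

# Illustrative data (not from the text): n = 3, A_c the shift matrix,
# C_c = [1 0 0], and a Hurwitz vector d1 = [1, 3, 2] (s^2 + 3s + 2).
A = np.diag([1.0, 1.0], k=1)          # A_c
d1 = np.array([1.0, 3.0, 2.0])        # d_1
C = np.array([1.0, 0.0, 0.0])         # C_c
lam_c = 5.0

# Desired characteristic polynomial (s + lam_c)(s^2 + 3s + 2), as in (6.13).
p = np.polymul([1.0, lam_c], [1.0, 3.0, 2.0])

# Ackermann's formula: K_c = e_n^T * Ctrb^{-1} * p(A_c).
ctrb = np.column_stack([d1, A @ d1, A @ A @ d1])
pA = sum(c * np.linalg.matrix_power(A, len(p) - 1 - i) for i, c in enumerate(p))
Kc = np.linalg.inv(ctrb)[-1] @ pA

Acl = A - np.outer(d1, Kc)
eigs = np.sort(np.linalg.eigvals(Acl).real)
print(eigs)   # close to [-5, -2, -1], as (6.13) requires

# MKY hypothesis: C_c (sI - (A_c - d1 Kc))^{-1} d1 is strictly positive real;
# here the zeros (s+1)(s+2) cancel, leaving 1/(s + lam_c).
w = np.logspace(-2, 2, 200)
W = [C @ np.linalg.solve(1j * wk * np.eye(3) - Acl, d1) for wk in w]
print(min(x.real for x in W) > 0.0)   # True: Re W(jw) > 0 on the grid
```

The same recipe, applied to the dual pair $(A_c^T, C_c^T)$, produces the observer gain $K_o$ of (6.15).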
Marino and Tomei
Consider the function
\[
V(z,\tilde z,\tilde\theta,\tilde\beta,\tilde k,\tilde b_1) = z^TP_zz + \eta\,\tilde z^TP_e\tilde z + \tilde\theta^TG_\theta^{-1}\tilde\theta + \tilde\beta^TG_\beta^{-1}\tilde\beta + \frac{1}{g_k}\tilde k^2 + \frac{\eta}{g_b}\tilde b_1^2, \tag{6.23}
\]
where $k = 1/|b_1|$ is a positive unknown constant and $P_z$ and $P_e$ are, respectively, the symmetric positive definite solutions of
\[
(A_c-d_1K_c)^TP_z + P_z(A_c-d_1K_c) = -l_zl_z^T - \varepsilon_zI, \qquad P_zd_1 = C_c^T \tag{6.24}
\]
and
\[
(A_c-K_oC_c)^TP_e + P_e(A_c-K_oC_c) = -l_el_e^T - \varepsilon_eI, \qquad P_ed_1 = C_c^T, \tag{6.25}
\]
with $\varepsilon_z$ and $\varepsilon_e$ positive scalars. Such solutions exist since, according to (6.13) and (6.15), the triples $(A_c-d_1K_c,\,d_1,\,C_c)$ and $(A_c-K_oC_c,\,d_1,\,C_c)$ satisfy the Meyer-Kalman-Yakubovich Lemma [21, p. 66]. The time derivative of (6.23) along the dynamics (6.20)–(6.22) satisfies the inequality
\[
\dot V \le -\begin{bmatrix}\|z\| & \|\tilde z\|\end{bmatrix}
\begin{bmatrix}\varepsilon_z & -\|C_c^TK_c\|\\ -\|C_c^TK_c\| & \eta\,\varepsilon_e\end{bmatrix}
\begin{bmatrix}\|z\|\\ \|\tilde z\|\end{bmatrix} \tag{6.26}
\]
and, therefore, if $\eta$ is chosen such that
\[
\eta > \frac{\|C_c^TK_c\|^2}{\varepsilon_z\varepsilon_e}, \tag{6.27}
\]
then there exists a positive constant $k_L$ such that
\[
\dot V \le -k_L\left\|\begin{bmatrix}\|z\|\\ \|\tilde z\|\end{bmatrix}\right\|^2. \tag{6.28}
\]
Thus $\dot V$ is negative semidefinite. From (6.23) and (6.28), we obtain that $\|z(t)\|$, $\|\tilde z(t)\|$, $\|\tilde\theta(t)\|$, $\|\tilde\beta(t)\|$, $|\tilde k(t)|$ and $|\tilde b_1(t)|$ are bounded, which implies that $y(t)=C_cz(t)$ is also bounded, $\forall t\ge 0$. From (6.10) it then follows that $\|\xi(t)\|$ is bounded. From (6.1), the input-output map is given by the differential equation
\[
\sum_{i=1}^{n} b_i\,\frac{d^{\,n-i}}{dt^{\,n-i}}\big[\sigma(y)u\big] = \frac{d^{\,n}y}{dt^{\,n}} - \sum_{j=0}^{n-1}\frac{d^{\,j}}{dt^{\,j}}\Big[\sum_{i=1}^{p}\psi_{ji}(y)\theta_i\Big], \tag{6.29}
\]
in which $\psi_{ji}$ denotes the $i$-th element of the vector $\psi_j$. From the first set of filters in (6.7), we obtain the differential equation relating $\sigma(y)u$ to $\mu_{n-1,1}$:
\[
\frac{d^{\,n-1}}{dt^{\,n-1}}\mu_{n-1,1} + d_{12}\frac{d^{\,n-2}}{dt^{\,n-2}}\mu_{n-1,1} + \cdots + d_{1n}\mu_{n-1,1} = \mathrm{sgn}(b_1)\sigma(y)u. \tag{6.30}
\]
Substituting (6.30) in (6.29), we have
\[
\sum_{i=1}^{2n-1} a_i\,\frac{d^{\,2n-1-i}}{dt^{\,2n-1-i}}\mu_{n-1,1} = \frac{d^{\,n}y}{dt^{\,n}} - \sum_{j=0}^{n-1}\frac{d^{\,j}}{dt^{\,j}}\Big[\sum_{i=1}^{p}\psi_{ji}(y)\theta_i\Big], \tag{6.31}
\]
where $a_i$, $1\le i\le 2n-1$, are the coefficients of the Hurwitz polynomial defined by
\[
\mathrm{sgn}(b_1)\,(b_1s^{n-1}+\cdots+b_n)(s^{n-1}+d_{12}s^{n-2}+\cdots+d_{1n}) = a_1s^{2n-2}+\cdots+a_{2n-1}.
\]
The system (6.31) can be realized as
\[
\dot\omega = \begin{bmatrix}
-a_2/a_1 & 1 & 0 & \cdots & 0\\
-a_3/a_1 & 0 & 1 & \cdots & 0\\
\vdots & & & \ddots & \\
-a_{2n-2}/a_1 & 0 & 0 & \cdots & 1\\
-a_{2n-1}/a_1 & 0 & 0 & \cdots & 0
\end{bmatrix}\omega
+ \frac{1}{a_1}\Big( y\,E_{n-2} - \sum_{j=0}^{n-1}E_{2n-2-j}\sum_{i=1}^{p}\psi_{ji}(y)\theta_i \Big) \tag{6.32}
\]
\[
\mu_{n-1,1} = [\,1\;\;0\;\;0\;\cdots\;0\,]\,\omega,
\]
where $\omega$ is the $(2n-2)\times 1$ state vector and $E_i$ denotes the $i$-th column of the $(2n-2)\times(2n-2)$ identity matrix. Since (6.32) is an asymptotically stable linear system whose inputs are bounded ($y$ has already been shown to be bounded), its output $\mu_{n-1,1}(t)$ is also bounded. A similar reasoning proves that the other filter variables $\mu_{j,1}$, $1\le j\le n-2$, are bounded. According to (6.12), the control $u(t)$ is then bounded as well. Consequently, in (6.7) the state vectors $\mu_i(t)$, $1\le i\le n-1$, are bounded. From (6.20) and (6.21) we see that $\|\dot z(t)\|$ and $\|\dot{\tilde z}(t)\|$ are bounded, which implies that the quadratic form on the right-hand side of (6.28) is uniformly continuous.
Moreover,
\[
\int_0^t k_L\left\|\begin{bmatrix}\|z(\tau)\|\\ \|\tilde z(\tau)\|\end{bmatrix}\right\|^2 d\tau \le -\int_0^t \dot V\,d\tau = V(0) - \lim_{t\to\infty}V(t) < \infty .
\]
Applying Barbalat's Lemma [23, p. 210], it follows that
\[
\lim_{t\to\infty}\|z(t)\| = 0, \qquad \lim_{t\to\infty}\|\tilde z(t)\| = 0, \tag{6.33}
\]
and, since $y = C_cz$,
\[
\lim_{t\to\infty}y(t) = 0. \tag{6.34}
\]
Now consider the systems (6.10) and (6.32). Since $y(t)$ is bounded and (6.34) holds, we can apply a result in [22, p. 59] to establish that
\[
\lim_{t\to\infty}\|\xi(t)\| = 0, \qquad \lim_{t\to\infty}\mu_{n-1,1}(t) = 0, \tag{6.35}
\]
and, similarly,
\[
\lim_{t\to\infty}\mu_{i,1}(t) = 0, \qquad 1\le i\le n-2. \tag{6.36}
\]
Taking (6.33)–(6.36) into account, we obtain from (6.12) that
\[
\lim_{t\to\infty}u(t) = 0, \tag{6.37}
\]
which allows us to apply again the result in [22, p. 59] to the filters (6.7), yielding
\[
\lim_{t\to\infty}\|\mu_i(t)\| = 0, \qquad 1\le i\le n-1. \tag{6.38}
\]
Since $\zeta(t)$ is related to $z(t)$ by the transformations (6.8) and (6.10), and (6.33)–(6.38) hold, we finally have that $\lim_{t\to\infty}\|\zeta(t)\| = 0$, which implies $\lim_{t\to\infty}\|x(t)\| = 0$. $\Box$

Example 6.1. Consider the system
\begin{align*}
\dot x_1 &= x_2 + \theta(e^y-1)\\
\dot x_2 &= x_3\\
\dot x_3 &= u\\
y &= c_1x_1 + c_2x_2 + c_3x_3
\end{align*}
with $c_3>0$ and $[c_1,c_2,c_3]$ an unknown Hurwitz vector. For every value of the constant parameter $\theta\neq 0$, this system is not state-feedback linearizable; hence, the adaptive algorithms proposed in [3,4,6,8] do not apply. On the other hand, the global Lipschitz conditions required in [7] are not satisfied either. However, this example is a relative-degree-one system which satisfies the assumptions of Theorem 6.1 and can therefore be globally adaptively stabilized by output feedback. $\Box$
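The stability mechanism used in the proof above — a Lyapunov function with $\dot V \le -k_L(\|z\|^2+\|\tilde z\\|^2)$, boundedness of all signals, and Barbalat's lemma — can be illustrated on a scalar toy loop (a hypothetical example, not the system of the text): with $\dot e = -e + \tilde\theta\varphi(t)$ and $\dot{\tilde\theta} = -e\varphi(t)$, the function $V = e^2+\tilde\theta^2$ has $\dot V = -2e^2 \le 0$, so $e$ is square-integrable and uniformly continuous, hence $e(t)\to 0$:

```python
import numpy as np

# Toy adaptive error system (illustrative only):
#   de/dt  = -e + th*phi(t)
#   dth/dt = -e*phi(t),     phi(t) = sin(t)
# V = e^2 + th^2 gives dV/dt = -2 e^2 <= 0, so V is non-increasing and,
# by Barbalat's lemma, e(t) -> 0 even though th(t) need not vanish as fast.
dt, T = 1e-3, 100.0
e, th = 1.0, 2.0
V0 = e**2 + th**2
for k in range(int(T / dt)):
    phi = np.sin(k * dt)
    e, th = e + dt * (-e + th * phi), th + dt * (-e * phi)

print(abs(e) < 0.2)           # True: the error has converged
print(e**2 + th**2 < V0)      # True: V decreased along the trajectory
```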
7 Adaptive output-feedback stabilization: general case ($\rho \ge 2$)

In this section we generalize Theorem 6.1 to the case of any relative degree $\rho$, by using an adaptive scheme introduced in [5] and a recursive procedure for the construction of Lyapunov functions for cascade nonlinear systems given in [16-18].

Theorem 7.1. Consider the system $(S_R)$ with relative degree $\rho$, $2\le\rho\le n$. If assumptions (i)-(vi) are satisfied, then there exists a global adaptive output-feedback stabilizing controller for every unknown value of the parameter vector $\theta\in\Theta$.
Proof. Since assumptions (i)-(v) are satisfied, Theorem 5.1 applies and guarantees the existence of a global change of coordinates $\zeta = T(x)$ transforming $(S_{TR})$ into $(S_R)$ with $b = [0,\ldots,0,b_\rho,\ldots,b_n]^T$. Now consider the filtered transformation consisting of the filters
\[
\dot\varphi_1 = -\lambda_1\varphi_1 + \varphi_2, \quad\ldots,\quad \dot\varphi_{\rho-1} = -\lambda_{\rho-1}\varphi_{\rho-1} + \sigma(y)u, \tag{7.1}
\]
with $\varphi_i\in\mathbb{R}$, $1\le i\le\rho-1$, and of the transformation
\[
\bar\zeta = \zeta - \sum_{i=2}^{\rho} d_i\varphi_{i-1}, \tag{7.2}
\]
where $\lambda_i$, $1\le i\le\rho-1$, are arbitrary positive scalars and $d_i$, $1\le i\le\rho$, are unknown vectors given by the recursive formula
\[
d_\rho = b, \qquad d_{j-1} = (A_c + \lambda_{j-1}I)\,d_j, \qquad \rho\ge j\ge 2. \tag{7.3}
\]
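Reading the recursion (7.3) as $d_{j-1} = (A_c+\lambda_{j-1}I)d_j$, each step multiplies the polynomial associated with the vector by $(s+\lambda_{j-1})$; the claim that follows (that $d_1$ is Hurwitz with first element $b_\rho$) can then be checked numerically. All data below are illustrative, not from the text:

```python
import numpy as np

# Illustrative data: n = 4, rho = 3, so b = [0, 0, b3, b4] with
# b(s) = 2s + 6 Hurwitz, and lambda_1 = 1, lambda_2 = 2.
n, rho = 4, 3
A = np.diag(np.ones(n - 1), k=1)          # A_c, the shift matrix
b = np.array([0.0, 0.0, 2.0, 6.0])        # vector <-> polynomial 2s + 6
lam = [1.0, 2.0]                          # lambda_1, lambda_2

# Recursion (7.3): d_rho = b, d_{j-1} = (A_c + lambda_{j-1} I) d_j.
d = b.copy()
for j in range(rho, 1, -1):               # j = 3, 2
    d = (A + lam[j - 2] * np.eye(n)) @ d  # multiplies the polynomial by (s + lambda)

# d_1 should represent (s+1)(s+2)(2s+6) = 2s^3 + 12s^2 + 22s + 12.
expected = np.polymul(np.polymul([1, 1], [1, 2]), [2, 6])
print(d)                                  # [ 2. 12. 22. 12.]
print(d[0] == b[rho - 1])                 # True: first element of d_1 is b_rho
print(np.all(np.roots(d).real < 0))       # True: d_1 is Hurwitz
```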
Note that, since the vector $b$ is Hurwitz by assumption (vi), $d_1$ is also Hurwitz and its first element is $d_{11} = b_\rho$. The dynamics in the $\bar\zeta$-coordinates is given by
\[
\dot{\bar\zeta} = A_c\bar\zeta + d_1\varphi_1 + \sum_{i=0}^{p}\theta_i\psi_i(y) = A_c\bar\zeta + d_1\varphi_1 + \psi(y)\theta \tag{7.4}
\]
\[
y = C_c\bar\zeta,
\]
where $d_1$ is an unknown Hurwitz vector, of relative degree one by construction, whose first element $b_\rho$ is of known sign. We now apply Theorem 6.1 to (7.4) and, following [16-18], consider $\varphi_1$ as the control variable. According to the proof of Theorem 6.1, we rewrite (7.4) as
\[
\dot{\bar\zeta} = A_c\bar\zeta + \frac{1}{k}\,\bar d_1\varphi_1\,\mathrm{sgn}(b_\rho) + \psi(y)\theta, \qquad y = C_c\bar\zeta, \tag{7.5}
\]
where $k = 1/|b_\rho|$ and $\bar d_1 = [\,1,\ d_{12}/b_\rho,\ \ldots,\ d_{1n}/b_\rho\,]^T$. We introduce the filtered transformation
\begin{align*}
\dot\mu_{n-1,1} &= -\lambda_1\mu_{n-1,1} + \mu_{n-1,2}\\
&\ \ \vdots\\
\dot\mu_{n-1,n-1} &= -\lambda_{n-1}\mu_{n-1,n-1} + \mathrm{sgn}(b_\rho)\varphi_1\\
\dot\mu_{n-2,1} &= -\lambda_1\mu_{n-2,1} + \mu_{n-2,2}\\
&\ \ \vdots \tag{7.6}\\
\dot\mu_{n-2,n-2} &= -\lambda_{n-2}\mu_{n-2,n-2} + \mathrm{sgn}(b_\rho)\varphi_1\\
&\ \ \vdots\\
\dot\mu_{1,1} &= -\lambda_1\mu_{1,1} + \mathrm{sgn}(b_\rho)\varphi_1
\end{align*}
\[
\zeta' = \bar\zeta - \sum_{i=1}^{n-1}\beta_i\sum_{j=2}^{i+1}d_j\,\mu_{i,j-1}, \tag{7.7}
\]
where $\beta_i$, $1\le i\le n-1$, and $d_i$, $1\le i\le n$, are defined as in the proof of Theorem 6.1. In the $\zeta'$-coordinates, we have
\[
\dot\zeta' = A_c\zeta' + \frac{1}{k}\,\bar d_1\,\mathrm{sgn}(b_\rho)\varphi_1 + \bar d_1\mu^T\beta + \psi(y)\theta, \qquad y = C_c\zeta', \tag{7.8}
\]
where $\beta = [\beta_1,\ldots,\beta_{n-1}]^T$ and $\mu = [\mu_{1,1},\ldots,\mu_{n-1,1}]^T$. We now apply Lemma 5.3, which guarantees that there exists a filtered transformation
\[
z = S(\zeta',\xi(t),\theta), \qquad \dot\xi = A\xi + \delta(y), \quad \xi(0) = 0, \tag{7.9}
\]
such that (7.8) is transformed into
\[
\dot z = A_cz + \frac{1}{k}\,\bar d_1\,\mathrm{sgn}(b_\rho)\varphi_1 + \bar d_1\mu^T\beta + \bar d_1\big[\gamma_0(y,\xi) + \gamma^T(y,\xi)\theta\big], \qquad y = C_cz, \tag{7.10}
\]
where $\gamma = [\gamma_1,\ldots,\gamma_p]^T$ with $\gamma_0(0,0)=0$ and $\gamma(0,0)=0$. Following the proof of Theorem 6.1, $\varphi_1$ should be designed as
\[
\varphi_1^* = -\mathrm{sgn}(b_\rho)\,\hat k\,\big[K_c\hat z + \mu^T\hat\beta_1 + \gamma_0(y,\xi) + \gamma^T(y,\xi)\hat\theta_1\big], \tag{7.11}
\]
where the estimates of the states and of the parameters are given by
\begin{align*}
\dot{\hat z} &= A_c\hat z + \hat b_{\rho_1}\bar d_1\varphi_1^* + \bar d_1\big[\mu^T\hat\beta_1 + \gamma_0(y,\xi) + \gamma^T(y,\xi)\hat\theta_1\big] + K_o(y - C_c\hat z) \tag{7.12}\\
\dot{\hat\theta}_1 &= G_{\theta_1}\big[y + \eta(y - C_c\hat z)\big]\,\gamma(y,\xi) \tag{7.13}\\
\dot{\hat\beta}_1 &= G_{\beta_1}\big[y + \eta(y - C_c\hat z)\big]\,\mu \tag{7.14}\\
\dot{\hat k} &= -g_k\,y\,\big[K_c\hat z + \mu^T\hat\beta_1 + \gamma_0(y,\xi) + \gamma^T(y,\xi)\hat\theta_1\big] \tag{7.15}\\
\dot{\hat b}_{\rho_1} &= g_{b_1}\,(y - C_c\hat z)\,\varphi_1^*. \tag{7.16}
\end{align*}
For the closed-loop system (7.10)–(7.16), with $\varphi_1 \equiv \varphi_1^*$, the function
\[
V_0 = z^TP_zz + \eta\,\tilde z^TP_e\tilde z + \tilde\theta_1^TG_{\theta_1}^{-1}\tilde\theta_1 + \tilde\beta_1^TG_{\beta_1}^{-1}\tilde\beta_1 + \frac{1}{g_k}\tilde k^2 + \frac{\eta}{g_{b_1}}\tilde b_{\rho_1}^2, \tag{7.17}
\]
with $k = 1/|b_\rho|$ a positive unknown constant and $P_z$ and $P_e$ solutions of (6.24) and (6.25), is such that its time derivative, denoted by $\dot V_0^*$, satisfies the inequality (recall (6.28))
\[
\dot V_0^* \le -k_L\left\|\begin{bmatrix}\|z\|\\ \|\tilde z\|\end{bmatrix}\right\|^2, \tag{7.18}
\]
where $k_L$ is a positive scalar. Consider the extended system consisting of (7.10), (7.13)–(7.15) and
\begin{align*}
\dot{\hat z} &= A_c\hat z + \hat b_{\rho_1}\bar d_1\varphi_1 + \bar d_1\big[\mu^T\hat\beta_1 + \gamma_0(y,\xi) + \gamma^T(y,\xi)\hat\theta_1\big] + K_o(y - C_c\hat z) \tag{7.19}\\
\dot{\hat b}_{\rho_1} &= g_{b_1}\,(y - C_c\hat z)\,\varphi_1 \tag{7.20}\\
\dot\varphi_1 &= -\lambda_1\varphi_1 + \varphi_2. \tag{7.21}
\end{align*}
Perform the change of coordinates
\[
\tilde\varphi_1 = \varphi_1 - \varphi_1^*. \tag{7.22}
\]
In the new coordinates, we have
\begin{align*}
\dot z &= A_cz + \mathrm{sgn}(b_\rho)\frac{1}{k}\,\bar d_1(\tilde\varphi_1 + \varphi_1^*) + \bar d_1\big[\mu^T\beta + \gamma_0(y,\xi) + \gamma^T(y,\xi)\theta\big]\\
\dot{\hat z} &= A_c\hat z + \hat b_{\rho_1}\bar d_1(\tilde\varphi_1 + \varphi_1^*) + \bar d_1\big[\mu^T\hat\beta_1 + \gamma_0(y,\xi) + \gamma^T(y,\xi)\hat\theta_1\big] + K_o(y - C_c\hat z) \tag{7.23}\\
\dot{\hat\theta}_1 &= G_{\theta_1}\big[y + \eta(y - C_c\hat z)\big]\,\gamma(y,\xi)\\
\dot{\hat k} &= -g_k\,y\,\big[K_c\hat z + \mu^T\hat\beta_1 + \gamma_0(y,\xi) + \gamma^T(y,\xi)\hat\theta_1\big].
\end{align*}
We now consider $\varphi_2$ as a control input, which is to be designed using the function
\[
V_1 = V_0 + \tilde\varphi_1^2 + g_{b_2}^{-1}\tilde b_{\rho_2}^2, \tag{7.24}
\]
where $\hat b_{\rho_2}$ is an auxiliary estimate of $b_\rho$ ($\tilde b_{\rho_2} = b_\rho - \hat b_{\rho_2}$) and $g_{b_2}$ is a positive constant. The time derivative of $V_1$ with respect to (7.23) is
\[
\dot V_1 = \dot V_0^* + 2y\tilde\varphi_1\,\frac{1}{k}\,\mathrm{sgn}(b_\rho) + 2\tilde\varphi_1\dot{\tilde\varphi}_1 + 2g_{b_2}^{-1}\tilde b_{\rho_2}\dot{\tilde b}_{\rho_2}, \tag{7.25}
\]
which can be rewritten as
\[
\dot V_1 = \dot V_0^* + 2\tilde b_{\rho_2}y\tilde\varphi_1 + 2\hat b_{\rho_2}y\tilde\varphi_1 + 2\tilde\varphi_1\dot{\tilde\varphi}_1 + 2g_{b_2}^{-1}\tilde b_{\rho_2}\dot{\tilde b}_{\rho_2}. \tag{7.26}
\]
Factoring out $\tilde\varphi_1$ in (7.26), using the last of (7.23), and choosing $\hat b_{\rho_2}$ as
\[
\dot{\hat b}_{\rho_2} = g_{b_2}\,y\tilde\varphi_1, \tag{7.27}
\]
the function $\varphi_2$ should be designed as
\[
\varphi_2^d = \dot\varphi_1^* - y\hat b_{\rho_2} + \lambda_1\varphi_1^*, \tag{7.28}
\]
so that, substituting $\varphi_2 = \varphi_2^d$ in (7.23), the time derivative of $V_1$, denoted by $\dot V_1^d$, would become
\[
\dot V_1^d = \dot V_0^* - 2\lambda_1\tilde\varphi_1^2 \le -k_L\left\|\begin{bmatrix}\|z\|\\ \|\tilde z\|\\ \tilde\varphi_1\end{bmatrix}\right\|^2. \tag{7.29}
\]
Let us examine $\dot\varphi_1^*$, which is given by
\[
\dot\varphi_1^* = \frac{\partial\varphi_1^*}{\partial y}\dot y + \frac{\partial\varphi_1^*}{\partial\hat z}\dot{\hat z} + \frac{\partial\varphi_1^*}{\partial\xi}\dot\xi + \frac{\partial\varphi_1^*}{\partial\mu_2}\dot\mu_2 + \frac{\partial\varphi_1^*}{\partial\hat k}\dot{\hat k} + \frac{\partial\varphi_1^*}{\partial\hat\beta_1}\dot{\hat\beta}_1 + \frac{\partial\varphi_1^*}{\partial\hat\theta_1}\dot{\hat\theta}_1, \tag{7.30}
\]
where $\mu_2 = [\mu_{2,2},\ldots,\mu_{n-1,2}]^T$. Since $\dot y$ is not available, it follows that $\dot\varphi_1^*$ and, consequently, $\varphi_2^d$ are not known. Noting that
\begin{align*}
\dot y = C_c\dot z &= C_cA_cz + C_c\bar d_1\Big[\frac{1}{k}(\tilde\varphi_1 + \varphi_1^*)\,\mathrm{sgn}(b_\rho) + \gamma_0(y,\xi) + \gamma^T(y,\xi)\theta + \mu^T\beta\Big]\\
&= C_cA_cz + \frac{1}{k}(\tilde\varphi_1 + \varphi_1^*)\,\mathrm{sgn}(b_\rho) + \gamma_0(y,\xi) + \gamma^T(y,\xi)\theta + \mu^T\beta, \tag{7.31}
\end{align*}
we will replace $\dot y$ by (the use of several "parameter estimators" was introduced in [5])
\[
\hat y_1 = C_cA_c\hat z + \hat b_{\rho_2}(\tilde\varphi_1 + \varphi_1^*) + \gamma_0(y,\xi) + \gamma^T(y,\xi)\hat\theta_2 + \mu^T\hat\beta_2 - \alpha_1(t), \tag{7.32}
\]
in which the dynamics of $\hat\theta_2$ and $\hat\beta_2$ and the function $\alpha_1(t)$ are yet to be defined; replacing $\dot y$ by $\hat y_1$ in (7.28), we obtain
\[
\varphi_2^* = \varphi_2^d + \frac{\partial\varphi_1^*}{\partial y}\,(\hat y_1 - \dot y). \tag{7.33}
\]
Consider the function
\[
\bar V_1 = V_1 + \tilde\beta_2^TG_{\beta_2}^{-1}\tilde\beta_2 + \tilde\theta_2^TG_{\theta_2}^{-1}\tilde\theta_2, \tag{7.34}
\]
where $G_{\beta_2}$ and $G_{\theta_2}$ are symmetric positive definite matrices. The time derivative of (7.34) with respect to the dynamics (7.23), (7.27), with $\varphi_2 = \varphi_2^*$, is
\[
\dot{\bar V}_1 = \dot V_1^d + 2\tilde\varphi_1\frac{\partial\varphi_1^*}{\partial y}\,(\hat y_1 - \dot y) + 2\tilde\theta_2^TG_{\theta_2}^{-1}\dot{\tilde\theta}_2 + 2\tilde\beta_2^TG_{\beta_2}^{-1}\dot{\tilde\beta}_2. \tag{7.35}
\]
We choose, with $k_{\alpha_1}$ a positive scalar,
\[
\dot{\hat\theta}_2 = -G_{\theta_2}\,\gamma(y,\xi)\,\frac{\partial\varphi_1^*}{\partial y}\,\tilde\varphi_1, \qquad
\dot{\hat\beta}_2 = -G_{\beta_2}\,\mu\,\frac{\partial\varphi_1^*}{\partial y}\,\tilde\varphi_1, \qquad
\alpha_1 = k_{\alpha_1}\,\frac{\partial\varphi_1^*}{\partial y}\,\tilde\varphi_1, \tag{7.36}
\]
so that (7.35) becomes
\[
\dot{\bar V}_1 = \dot V_1^d - 2k_{\alpha_1}\Big(\frac{\partial\varphi_1^*}{\partial y}\Big)^2\tilde\varphi_1^2 - 2\tilde\varphi_1\frac{\partial\varphi_1^*}{\partial y}\big[C_cA_c\tilde z + \tilde b_{\rho_2}(\tilde\varphi_1 + \varphi_1^*)\big], \tag{7.37}
\]
and, in view of (7.18), a positive finite value of $k_1$ exists such that
\[
\dot{\bar V}_1 \le -k_1\left\|\begin{bmatrix}\|z\|\\ \|\tilde z\|\\ \tilde\varphi_1\end{bmatrix}\right\|^2. \tag{7.38}
\]
From (7.11), (7.30), (7.32), (7.33) and (7.36), we have
\begin{align*}
\varphi_2^* &= \frac{\partial\varphi_1^*}{\partial\hat z}\dot{\hat z} + \frac{\partial\varphi_1^*}{\partial\xi}\dot\xi + \frac{\partial\varphi_1^*}{\partial\mu_2}\dot\mu_2 + \frac{\partial\varphi_1^*}{\partial\hat k}\dot{\hat k} + \frac{\partial\varphi_1^*}{\partial\hat\beta_1}\dot{\hat\beta}_1 + \frac{\partial\varphi_1^*}{\partial\hat\theta_1}\dot{\hat\theta}_1 - y\hat b_{\rho_2} + \lambda_1\varphi_1^*\\
&\quad + \frac{\partial\varphi_1^*}{\partial y}\Big[C_cA_c\hat z + \hat b_{\rho_2}(\tilde\varphi_1 + \varphi_1^*) + \gamma_0(y,\xi) + \mu^T\hat\beta_2 + \gamma^T(y,\xi)\hat\theta_2 - k_{\alpha_1}\frac{\partial\varphi_1^*}{\partial y}\tilde\varphi_1\Big]. \tag{7.39}
\end{align*}
If $\rho = 2$, $\varphi_2^*$ represents the final control $v = \sigma(y)u$, so that the input $u$ is given by
\[
u = \sigma^{-1}(y)\,\varphi_2^* = \bar u(y,\xi,\hat z,\hat k,\hat b_{\rho_1},\hat b_{\rho_2},\hat\theta_1,\hat\theta_2,\hat\beta_1,\hat\beta_2,\mu_1,\mu_2,\varphi_1). \tag{7.40}
\]
If $\rho > 2$, we make a further change of coordinates $\tilde\varphi_2 = \varphi_2 - \varphi_2^*$, consider also the dynamics of $\tilde\varphi_2$,
\[
\dot{\tilde\varphi}_2 = -\lambda_2\varphi_2 + \varphi_3 - \dot\varphi_2^*, \tag{7.41}
\]
and consider $\varphi_3$ as a control variable which is designed according to the function
\[
V_2 = \bar V_1 + \tilde\varphi_2^2, \tag{7.42}
\]
whose time derivative satisfies the inequality
\[
\dot V_2 \le -k_1\left\|\begin{bmatrix}\|z\|\\ \|\tilde z\|\\ \tilde\varphi_1\end{bmatrix}\right\|^2 + 2\tilde\varphi_1\tilde\varphi_2 + 2\tilde\varphi_2\dot{\tilde\varphi}_2. \tag{7.43}
\]
From (7.41) and (7.43) we can obtain the ideal expression of $\varphi_3$,
\[
\varphi_3^d = \dot\varphi_2^* + \lambda_2\varphi_2^* - \tilde\varphi_1, \tag{7.44}
\]
and proceed as we did in the design of $\varphi_2$. At the end of this step, we obtain an expression for the desired $\varphi_3$ in the following form:
\[
\varphi_3^* = \varphi_3^d + \frac{\partial\varphi_2^*}{\partial y}\,(\hat y_2 - \dot y), \tag{7.45}
\]
having introduced a new estimate $\hat y_2$ of the time derivative of the output, along with new parameter estimators $\hat b_{\rho_3}$, $\hat\beta_3$, $\hat\theta_3$ and a new function $\alpha_2(t)$. Proceeding the same way, we can obtain the final control $u$, which will be of the form
\[
u = \bar u(y,\xi,\hat z,\hat k,\hat\theta_1,\ldots,\hat\theta_\rho,\hat\beta_1,\ldots,\hat\beta_\rho,\hat b_{\rho_1},\ldots,\hat b_{\rho_\rho},\mu_1,\ldots,\mu_\rho,\varphi_1,\ldots,\varphi_{\rho-1}), \tag{7.46}
\]
where $\mu_\rho = [\mu_{\rho,\rho},\ldots,\mu_{n-1,\rho}]^T$, with the associated function
\[
V_{\rho-1} = V_0 + \sum_{i=1}^{\rho-1}\tilde\varphi_i^2 + \sum_{i=2}^{\rho}g_{b_i}^{-1}\tilde b_{\rho_i}^2 + \sum_{i=2}^{\rho}\tilde\beta_i^TG_{\beta_i}^{-1}\tilde\beta_i + \sum_{i=2}^{\rho}\tilde\theta_i^TG_{\theta_i}^{-1}\tilde\theta_i, \tag{7.47}
\]
whose time derivative satisfies the inequality
\[
\dot V_{\rho-1} \le -k_L\left\|\begin{bmatrix}\|z\|\\ \|\tilde z\|\\ \|\tilde\varphi\|\end{bmatrix}\right\|^2, \tag{7.48}
\]
where $\tilde\varphi = [\tilde\varphi_1,\ldots,\tilde\varphi_{\rho-1}]^T$.
From (7.47)–(7.48) it follows that $\|\tilde\varphi\|$, $\|z\|$, $\|\tilde z\|$, $\hat k$, $\hat b_{\rho_i}$, $\hat\beta_i$, $\hat\theta_i$, $1\le i\le\rho$, and, consequently, $y = C_cz$, are bounded. The
boundedness of $\|\xi(t)\|$ follows from (7.9). From (7.4), the differential equation relating $\varphi_1$ to $y$ is
\[
\sum_{i=1}^{n} d_{1i}\,\frac{d^{\,n-i}}{dt^{\,n-i}}\varphi_1 = \frac{d^{\,n}y}{dt^{\,n}} - \sum_{j=0}^{n-1}\frac{d^{\,j}}{dt^{\,j}}\Big[\sum_{i=1}^{p}\psi_{ji}(y)\theta_i\Big]. \tag{7.49}
\]
The first set of filters in (7.6) can be written as
\[
\frac{d^{\,n-1}}{dt^{\,n-1}}\mu_{n-1,1} + d_{12}\frac{d^{\,n-2}}{dt^{\,n-2}}\mu_{n-1,1} + \cdots + d_{1n}\mu_{n-1,1} = \mathrm{sgn}(b_\rho)\varphi_1, \tag{7.50}
\]
which, substituted in (7.49), gives
\[
\mathrm{sgn}(b_\rho)\sum_{i=1}^{2n-1} a_i\,\frac{d^{\,2n-1-i}}{dt^{\,2n-1-i}}\mu_{n-1,1} = \frac{d^{\,n}y}{dt^{\,n}} - \sum_{j=0}^{n-1}\frac{d^{\,j}}{dt^{\,j}}\Big[\sum_{i=1}^{p}\psi_{ji}(y)\theta_i\Big], \tag{7.51}
\]
where $a_i$, $1\le i\le 2n-1$, are the coefficients of the Hurwitz polynomial defined by
\[
(d_{11}s^{n-1}+\cdots+d_{1n})(s^{n-1}+d_{12}s^{n-2}+\cdots+d_{1n}) = a_1s^{2n-2}+\cdots+a_{2n-1}. \tag{7.52}
\]
As shown in the proof of Theorem 6.1, (7.51) implies that $\mu_{n-1,1}$ is bounded. Similarly, we can prove that $\mu_{j,1}$, $1\le j\le n-2$, are bounded. From (7.11), we can see that $\varphi_1^*$ is bounded, and therefore $\varphi_1$ is also bounded, since $\tilde\varphi_1$ has already been proved to be bounded. Hence, the input to the filters (7.6) is bounded, which implies that all the state vectors $\mu_j$, $1\le j\le n-1$, and, from (7.46), the input $u$, are bounded. Thus, from (7.1), (7.9), (7.10), (7.13)–(7.15), (7.19) and (7.21), $\|\dot z\|$, $\|\dot{\tilde z}\|$ and $\|\dot{\tilde\varphi}\|$ are bounded. Recalling (7.48) and applying Barbalat's Lemma, we obtain
\[
\lim_{t\to\infty}\|z(t)\| = 0, \qquad \lim_{t\to\infty}\|\tilde z(t)\| = 0, \qquad \lim_{t\to\infty}\|\tilde\varphi(t)\| = 0, \tag{7.53}
\]
and, consequently,
\[
\lim_{t\to\infty}y(t) = 0. \tag{7.54}
\]
Now consider the systems (7.9) and (7.51). By virtue of (7.54), we can apply [22, p. 59], which yields
\[
\lim_{t\to\infty}\|\xi(t)\| = 0, \qquad \lim_{t\to\infty}\mu_{n-1,1}(t) = 0, \tag{7.55}
\]
and, similarly,
\[
\lim_{t\to\infty}\mu_{i,1}(t) = 0, \qquad 1\le i\le n-2,
\]
which, along with (7.46) and (7.53)–(7.55), imply
\[
\lim_{t\to\infty}u(t) = 0. \tag{7.56}
\]
Hence, we can apply again [22, p. 59] to (7.1) and (7.6), to obtain
\[
\lim_{t\to\infty}\|\mu_i(t)\| = 0, \quad 1\le i\le n-1, \qquad \lim_{t\to\infty}\varphi_i(t) = 0, \quad 1\le i\le\rho-1.
\]
Since $\zeta(t)$ is related to $z(t)$ by the transformations (7.2), (7.7) and (7.9), in view of (7.53)–(7.55) and (7.56), we finally have that $\lim_{t\to\infty}\|\zeta(t)\| = 0$, which, since $\zeta = T(x)$ with $T(0) = 0$, implies $\lim_{t\to\infty}\|x(t)\| = 0$. $\Box$

Example 7.1. Consider the system
\begin{align*}
\dot x_1 &= x_2 + \theta x_1^2\\
\dot x_2 &= x_3\\
\dot x_3 &= bu, \qquad y = x_1, \qquad b > 0.
\end{align*}
This system is both globally state-feedback linearizable and input-output linearizable by state feedback for any known value of the parameters $\theta$ and $b$. Nevertheless, the adaptive algorithms proposed in [3,4,8] do not apply. A global adaptive stabilizing state-feedback control for a simpler example ($b=1$) was first obtained in [11]; this system belongs to the class of systems in the so-called "pure feedback form," which can be globally adaptively stabilized by state feedback via the systematic algorithm proposed in [5]. According to Theorem 7.1, this example can also be adaptively globally stabilized by output feedback: in fact, the example is of relative degree $\rho = 3$ and belongs to the class of systems $(S_R)$, with $b = [0,0,1]^T$ a Hurwitz vector of degree $\rho = 3$. $\Box$

Remark 7.1. Let us examine how Theorem 7.1 applies to a single-input, single-output linear system characterized by a transfer function of known relative degree $\rho$,
\[
W(s) = \frac{b_\rho s^{n-\rho}+\cdots+b_n}{s^n + a_1s^{n-1}+\cdots+a_n} = \frac{n(s)}{d(s)}, \tag{7.57}
\]
where the numerator is a Hurwitz polynomial with unknown coefficients but with $b_\rho$ of known sign, while the denominator is a polynomial with unknown coefficients $a_i$, $1\le i\le n$. The above assumptions are standard ones in adaptive
control of linear systems [21, p. 183]. A realization in observer form of (7.57) can be written as
\[
\dot\zeta = \begin{bmatrix}
0 & 1 & 0 & \cdots & 0\\
0 & 0 & 1 & \cdots & 0\\
\vdots & & & \ddots & \vdots\\
0 & 0 & 0 & \cdots & 1\\
0 & 0 & 0 & \cdots & 0
\end{bmatrix}\zeta
+ \begin{bmatrix}0\\ \vdots\\ 0\\ b_\rho\\ \vdots\\ b_n\end{bmatrix}u
- \begin{bmatrix}a_1\\ a_2\\ \vdots\\ a_n\end{bmatrix}y \tag{7.58}
\]
\[
y = [\,1\;\;0\;\;0\;\cdots\;0\,]\,\zeta.
\]
If we set $\theta = [a_1,\ldots,a_n,b_\rho,\ldots,b_n]^T$ as the $(2n-\rho+1)$-dimensional vector of unknown parameters, the system (7.58) is of type $(S_R)$ and Theorem 7.1 (or Theorem 6.1 when $\rho = 1$) applies, i.e., there exists an adaptive output-feedback control which drives the state $\zeta(t)$ to zero. $\Box$
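The observer-form realization (7.58) can be checked numerically by comparing its frequency response with $n(s)/d(s)$; a sketch with illustrative third-order coefficients (the numbers below are hypothetical, not from the text):

```python
import numpy as np

# Illustrative data: n = 3, rho = 2, W(s) = (2s + 3)/(s^3 + 6s^2 + 11s + 6).
a = np.array([6.0, 11.0, 6.0])       # denominator coefficients a_1..a_n
bvec = np.array([0.0, 2.0, 3.0])     # [0, b_rho, ..., b_n]
C = np.array([1.0, 0.0, 0.0])

# Observer form (7.58): shift matrix plus input vector bvec; the -a*y
# output-injection term folds into the state matrix since y = C*zeta.
A = np.eye(3, k=1) - np.outer(a, C)

for s in [1.0j, 2.0, 1.0 + 1.0j]:    # sample points in the complex plane
    W_real = C @ np.linalg.solve(s * np.eye(3) - A, bvec)
    W_poly = np.polyval([2.0, 3.0], s) / np.polyval([1.0, 6.0, 11.0, 6.0], s)
    assert np.isclose(W_real, W_poly)
print("observer form reproduces W(s)")
```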
8 Conclusions

We have determined in Theorem 2.1 a class of single-output nonlinear systems, linear with respect to an unknown parameter vector, for which adaptive observers for equivalent states can be designed without the assumption of persistency of excitation (Theorem 3.3). Theorem 3.3 generalizes to a class of nonlinear systems the available theory for linear systems. Theorem 4.1 gives sufficient conditions under which the adaptive observer tracks the original physical coordinates. The observers obtained in Theorem 3.3 and filtered transformations are then used to design adaptive output-feedback stabilizing controls for a more restricted class of nonlinear systems, characterized by the geometric conditions given in Theorem 5.1. Theorem 5.2 shows that this class of systems has globally defined, linear, asymptotically stable zero dynamics. Theorem 6.1 (for systems with relative degree equal to one) and Theorem 7.1 (for systems with relative degree greater than one) provide the construction of globally stabilizing adaptive output-feedback controls. Linear minimum-phase systems with unknown poles and zeroes, known sign of the high-frequency gain and known relative degree belong to the class of systems determined. As shown by Example 6.1, systems which are not feedback linearizable for every value of the unknown parameter vector may belong to this class and can therefore be adaptively stabilized by output feedback. The theory presented in this paper for adaptive stabilization also leads to the design of output-tracking adaptive controls by output feedback, once a suitable linear reference model is introduced.
Acknowledgements
The authors would like to thank P. V. Kokotovic and I. Kanellakopoulos for helpful discussions on the proof of Theorem 7.1 and for making available to them a preprint of [5].
References
1. A. Isidori, Nonlinear Control Systems, 2nd ed., Berlin, Springer-Verlag, 1989.
2. H. Nijmeijer and A. van der Schaft, Nonlinear Dynamical Control Systems, Berlin, Springer-Verlag, 1990.
3. D. Taylor, P. Kokotovic, R. Marino, and I. Kanellakopoulos, "Adaptive regulation of nonlinear systems with unmodeled dynamics," IEEE Trans. Aut. Control, vol. 34, pp. 405-412, April 1989.
4. I. Kanellakopoulos, P. Kokotovic, and R. Marino, "An extended direct scheme for robust adaptive nonlinear control," Automatica, to appear, March 1991.
5. I. Kanellakopoulos, P. V. Kokotovic, and A. S. Morse, "Systematic design of adaptive controllers for feedback linearizable systems," Report UILU-ENG-90-2249, DC-124, University of Illinois at Urbana-Champaign, October 1990.
6. K. Nam and A. Arapostathis, "A model-reference adaptive control scheme for pure-feedback nonlinear systems," IEEE Trans. Aut. Control, vol. 33, pp. 803-811, Sept. 1988.
7. G. Campion and G. Bastin, "Indirect adaptive state feedback control of linearly parametrized nonlinear systems," Int. J. Adapt. Control Sig. Proc., vol. 4, pp. 345-358, 1990.
8. S. S. Sastry and A. Isidori, "Adaptive control of linearizable systems," IEEE Trans. Aut. Control, vol. 34, pp. 1123-1131, Nov. 1989.
9. G. Bastin and M. R. Gevers, "Stable adaptive observers for nonlinear time-varying systems," IEEE Trans. Aut. Control, vol. 33, pp. 650-658, July 1988.
10. R. Marino, "Adaptive observers for single output nonlinear systems," IEEE Trans. Aut. Control, vol. 35, pp. 1054-1058, Sept. 1990.
11. J. B. Pomet and L. Praly, "Adaptive nonlinear regulation: estimation from the Lyapunov equation," Proc. 28th IEEE Conf. Dec. Control, pp. 1008-1013, Tampa, FL, 1989.
12. L. Praly, G. Bastin and J. B. Pomet, "Adaptive stabilization of nonlinear systems," Proc. Conf. Analysis of Controlled Dynamical Systems, Lyon, France, 1990.
13. I. Kanellakopoulos, P. V. Kokotovic and R. H. Middleton, "Observer-based adaptive control of nonlinear systems under matching conditions," Proc. 1990 Amer. Control Conf., San Diego, CA, pp. 549-555.
14. I. Kanellakopoulos, P. V. Kokotovic, and R. H. Middleton, "Indirect adaptive output-feedback control of a class of nonlinear systems," Proc. 29th IEEE Conf. Dec. Control, pp. 2714-2719, Honolulu, HI, 1990.
15. R. Marino and P. Tomei, "Global adaptive observers for nonlinear systems via filtered transformations," Report R-90.06, University of Rome "Tor Vergata," September 1990.
16. C. I. Byrnes and A. Isidori, "New results and examples in nonlinear feedback stabilization," Syst. Control Lett., vol. 12, pp. 437-442, 1989.
17. P. V. Kokotovic and H. J. Sussmann, "A positive real condition for global stabilization of nonlinear systems," Syst. Control Lett., vol. 12, pp. 125-134, 1989.
18. J. Tsinias, "Sufficient Liapunov-like conditions for stabilization," Math. Control Signals Syst., vol. 2, pp. 343-357, 1989.
19. A. J. Krener and A. Isidori, "Linearization by output injection and nonlinear observers," Syst. Control Lett., vol. 3, pp. 47-52, 1983.
20. W. Respondek, "Global aspects of linearization, equivalence to polynomial forms and decomposition of nonlinear control systems," in Algebraic and Geometric Methods in Nonlinear Control Theory, M. Fliess and M. Hazewinkel, eds., pp. 257-284, D. Reidel Publishing Co., Dordrecht, 1986.
21. K. S. Narendra and A. M. Annaswamy, Stable Adaptive Systems, Prentice-Hall, Inc., Englewood Cliffs, NJ, 1989.
22. C. A. Desoer and M. Vidyasagar, Feedback Systems: Input-Output Properties, Academic Press, Orlando, 1975.
23. V. Popov, Hyperstability of Control Systems, Berlin, Springer-Verlag, 1973.
24. B. D. O. Anderson, R. R. Bitmead, C. R. Johnson, Jr., P. V. Kokotovic, R. L. Kosut, I. M. Y. Mareels, L. Praly, and B. D. Riedle, Stability of Adaptive Systems: Passivity and Averaging Analysis, MIT Press, Cambridge, MA, 1986.
25. W. P. Dayawansa, W. M. Boothby, and D. L. Elliott, "Global state and feedback equivalence of nonlinear systems," Syst. Control Lett., vol. 6, pp. 229-234, 1985.
26. W. Respondek, "Linearization, feedback and Lie brackets," Proc. Int. Conf. Geometric Theory of Nonlinear Control Systems, Wroclaw Technical University Press, Wroclaw, 1985, pp. 131-166.
Adaptive Output-Feedback Control of Systems with Output Nonlinearities*

I. Kanellakopoulos,¹ P. V. Kokotović,¹ and A. S. Morse²

¹ Coordinated Science Laboratory, University of Illinois, Urbana, IL 61801, USA.
² Department of Electrical Engineering, Yale University, New Haven, CT 06520-1968, USA.
Abstract. For a class of single-input single-output nonlinear systems with unknown constant parameters, we present a direct model-reference adaptive control scheme which requires only output, rather than full-state, measurement. The nonlinearities are not required to satisfy any growth conditions. The assumptions on the linear part of the nonlinear system are the same as in the standard adaptive control problem for linear systems, which now appears as a special case of the nonlinear problem solved in this paper.
1 Introduction

The two most common assumptions employed in adaptive nonlinear control are those of linear parametrization [1-13] and full-state feedback [1-11]. The purpose of this paper is to avoid the full-state feedback assumption and to remove the specific restrictions of previous output-feedback results [12,13]. In the linear case, adaptive output-feedback designs follow either a direct model-reference path or an indirect path via adaptive observers. Current research on adaptive observers for nonlinear systems [14-16] indicates that the indirect path may become promising for adaptive nonlinear control. However, the major stumbling block along this path continues to be its linear-like proof of stability, which imposes restrictive conic conditions on the nonlinearities [12,13]. Under such linear growth constraints the actual nonlinear problem is, in fact, not addressed. In this paper we formulate and solve a truly nonlinear output-feedback problem by following the direct model-reference path of Feuer and Morse [17]. In contrast to other, more popular adaptive linear control methods [18,19], the method of Feuer and Morse offers a possibility to prove stability without any growth constraints. In a companion paper [20] we have exploited this possibility to solve a full-state-feedback adaptive nonlinear control problem. In this paper we present an adaptive output-feedback result without nonlinearity growth constraints.

* The work of the first two authors was supported in part by the National Science Foundation under Grant ECS-87-15811 and in part by the Air Force Office of Scientific Research under Grant AFOSR 90-0011. The work of the third author was supported by the National Science Foundation under Grants ECS-88-05611 and ECS-90-12551.
Kanellakopoulos, Kokotović, and Morse
The results of this paper apply to nonlinear input-output models consisting of a linear transfer function and output-dependent nonlinearities. The coefficients of the transfer function and the parameters multiplying the nonlinearities are unknown. For the linear part, the assumptions of minimum phase and known sign of the high-frequency gain are the same as in adaptive linear control theory, which now appears as a special case of the nonlinear theory presented in this paper. For easier understanding, the new adaptive scheme is first designed for a particular system of sufficient complexity to be illustrative of both the design procedure and the stability properties of the resulting closed-loop adaptive system. In Section 2 we design the adaptive scheme for this system and then prove the stability and tracking properties of the resulting adaptive system in full detail. The design procedure for the general case is presented in Section 3, and the proof of stability and tracking is given in Section 4. Nonlinear input-output models are intimately tied to state-space equations which originate from nonlinear physical laws expressed in specific state coordinates. In Section 5 we give a state-space form of the class of nonlinear plants which have the desired input-output representation, and characterize this class of plants via a set of geometric conditions.

2 Adaptive Scheme Design: An Example
The purpose of this section is to make both the proposed adaptive scheme and the main features of the Feuer-Morse method more easily accessible to the reader with the usual background in control theory and limited familiarity with adaptive linear control.

2.1. Nonlinear system properties. The nonlinear system is assumed to be minimum-phase [21, Chap. 4] and its nonlinearities depend only on the output variable. This implies that the nonlinear system is linearizable by output injection [22]. The input-output description of a typical nonlinear system of this kind is given by
\[
D^5y = (D^2+2D+1)u + \theta\big[D^2p_2(y) + Dp_1(y) + p_0(y)\big], \tag{2.1}
\]
where $u$ and $y$ are the scalar control and output, respectively, $D = \frac{d}{dt}$, and $\theta$ is an unknown constant parameter. To address a truly nonlinear problem, we choose nonlinearities which do not satisfy linear growth constraints:
\[
p_2(y) = y^2, \qquad p_1(y) = y + 2y^3, \qquad p_0(y) = y^3. \tag{2.2}
\]
It is important to notice that these nonlinearities are not in the span of $u$ and, hence, the system (2.1) is not full-state linearizable by static output feedback, or even by static full-state feedback, as shown in Sect. 5. However, it is input-output linearizable by full-state feedback [21, Chap. 4].
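The claim that these nonlinearities violate any linear growth (conic) bound is easy to check numerically; for the cubic term $p_1(y) = y + 2y^3$ of (2.2), the ratio $|p_1(y)|/|y|$ grows without bound:

```python
import numpy as np

# p1(y) = y + 2y^3: no constant L can satisfy |p1(y)| <= L|y|,
# since |p1(y)|/|y| = 1 + 2y^2 is unbounded.
def p1(y):
    return y + 2.0 * y**3

ys = np.array([1.0, 10.0, 100.0, 1000.0])
ratios = np.abs(p1(ys)) / np.abs(ys)
print(ratios)   # [3.0, 201.0, 20001.0, 2000001.0] -- unbounded growth
```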
Adaptive Output-Feedback Control for Nonlinear Systems
The above structural and growth properties of (2.1) and its relative degree [21, Chap. 4] show that (2.1) is a nonlinear system of considerable complexity. However, this system also satisfies a structural constraint under which the results of this paper are applicable: the nonlinearities do not enter the system before the control input $u$.

2.2. Augmenting the CE control. As in most adaptive designs, our first step is to find a dynamic output-feedback control that guarantees the specified stability and tracking properties when the parameter $\theta$ is known. Most adaptive schemes then replace the unknown $\theta$ with its estimate $\hat\theta$ and implement the so formed "certainty-equivalence" control. Such certainty-equivalence designs have been satisfactory in adaptive linear control, but have failed to produce truly nonlinear results because of their inherent linear growth constraints [12,13]. To avoid this difficulty we must go beyond the certainty-equivalence approach. Following Feuer and Morse [17], we will augment the certainty-equivalence control by an additive term $\bar u$ which will counteract the effects of rapidly growing nonlinearities. It will also provide us with additional flexibility in the proof of stability. The certainty-equivalence part of our control will be designed to match a reference model of the same relative degree as that of the nonlinear plant (2.1). As this plant is input-output linearizable by full-state feedback, we will choose the simplest linear reference model of relative degree three:
\[
(D+1)^3y_r = r. \tag{2.3}
\]
The first step in matching this reference model is to filter the plant equation (2.1) by the strictly proper stable filter $F/E_2$, where $F$ is a monic polynomial of degree 2 and $E_2$ is a monic Hurwitz polynomial of degree 4. This results in
\[
\frac{FA}{E_2}y = \frac{FB}{E_2}u + \theta\Big[\frac{FD^2}{E_2}p_2(y) + \frac{FD}{E_2}p_1(y) + \frac{F}{E_2}p_0(y)\Big], \tag{2.4}
\]
where $A = D^5$, $B = D^2+2D+1$ as in (2.1). It is now straightforward to verify that the desired matching is achieved by the control
\[
u^* = \frac{G}{E_2}y + r - \theta v(y) - \frac{FB-E_2}{E_2}u^*, \tag{2.5}
\]
provided that
\[
v(y) = \frac{FD^2}{E_2}p_2(y) + \frac{FD}{E_2}p_1(y) + \frac{F}{E_2}p_0(y), \tag{2.6}
\]
and that $G$, a polynomial of degree 4, and $F$ satisfy the polynomial equation
\[
FD^5 + G = (D+1)^3E_2. \tag{2.7}
\]
Note that the polynomial $FB-E_2$ in (2.5) is of degree 3, since $FB$ and $E_2$ are both monic polynomials of degree 4. As an illustration, the choice $E_2 = (D+2)^4$ yields the following solution of (2.7):
\[
F = D^2 + 11D + 51 \tag{2.8}
\]
\[
G = 129D^4 + 192D^3 + 168D^2 + 80D + 16. \tag{2.9}
\]
When the control (2.5) is applied to the system (2.1), and the initial conditions of the filters used in (2.4), (2.5) and (2.6) are exactly matched with those of the system (2.1), then (2.5) achieves the exact tracking $y(t)\equiv y_r(t)$ for all $t\ge 0$. However, the initial conditions of (2.1) are unknown and the tracking can be achieved only asymptotically, that is,
\[
y(t) = y_r(t) + \epsilon(t) \to y_r(t) \quad \text{as } t\to\infty, \tag{2.10}
\]
where $\epsilon(t)$ is the exponentially decaying tracking error caused by the mismatch of the initial conditions. When the parameter $\theta$ is unknown, we replace it in (2.5) by its estimate $\hat\theta$, to be obtained from a parameter update law. To this "certainty-equivalence" part of our control we add a term $\bar u$ which will be a handy tool later. So, our adaptive control will be of the following form:
\[
u = \frac{G}{E_2}y + r - \hat\theta v(y) - \frac{FB-E_2}{E_2}u + \bar u. \tag{2.11}
\]
When applied to the nonlinear plant (2.1), this control yields the following input-output description of the resulting feedback system:
\[
y = \frac{1}{(D+1)^3}\big[r + (\theta-\hat\theta)v(y) + \bar u\big] + \epsilon(t), \tag{2.12}
\]
where, as in the case when $\theta$ was known, $\epsilon(t)$ contains all the exponentially decaying terms caused by the mismatch of the initial conditions. It should be observed that with an exact estimate $\hat\theta = \theta$ the linearization of (2.12) is achieved. Introducing the error variables
\[
e = y - y_r, \qquad \tilde\theta = \theta - \hat\theta, \tag{2.13}
\]
and taking the difference between (2.12) and the reference model (2.3), we obtain the tracking error equation:
\[
e = \frac{1}{(D+1)^3}\big[\tilde\theta v(y) + \bar u\big] + \epsilon(t). \tag{2.14}
\]
2.3. Error augmentation and swapping. Following the standard practice in adaptive control, we now set out to construct an error equation in which the parameter error is filtered only by a strictly positive real (SPR) transfer function. As a first step, we rewrite (2.14) in the form
\[
e = \frac{1}{D+1}\Big[\tilde\theta\,\frac{1}{(D+1)^2}v(y)\Big] + \frac{1}{(D+1)^3}\bar u + \frac{1}{(D+1)^3}\big[\tilde\theta v(y)\big] - \frac{1}{D+1}\Big[\tilde\theta\,\frac{1}{(D+1)^2}v(y)\Big] + \epsilon(t). \tag{2.15}
\]
The first term in (2.15) is in the desired SPR form, while the second term is due to the additional control term $\bar u$. As for the third and fourth terms, these are the
familiar swapping terms, whose presence is caused by the time-varying nature of θ̂: if θ̂ were constant, these two terms would cancel out. Let us therefore define the augmented error

ē = e + η_0,   (2.16)

where the term η_0 represents all the undesirable terms in (2.15):

η_0 = −[ (1/(D+1)^3) ū + (1/(D+1)^3) [θ̃ v(y)] − (1/(D+1)) [θ̃ (1/(D+1)^2) v(y)] ].   (2.17)

The signal multiplying θ̃ in the first brackets is of particular importance and is denoted by

h_1 = (1/(D+1)^2) v(y).   (2.18)
Considering v(y) as the input and h_1 as the output, we represent (2.18) in the state-space form

ḣ = A_0 h + b v(y),   h = [h_1, h_2]^T,   (2.19)

where (A_0, b) is a minimal realization of the transfer function 1/(D+1)^2.   (2.20)
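A quick numerical sanity check of (2.19) is to simulate a realization of 1/(D+1)^2 and verify its unit DC gain. The matrices below are one standard minimal realization with characteristic polynomial (s+1)^2; they are an illustrative assumption, since (2.20) may use a different (equivalent) coordinate choice.

```python
import numpy as np

# One minimal realization of h1 = v(y)/(D+1)^2 (an illustrative choice;
# the paper's A0, b in (2.20) may differ by a similarity transformation):
A0 = np.array([[-2.0, 1.0],
               [-1.0, 0.0]])
b = np.array([0.0, 1.0])

def simulate(v, T=20.0, dt=1e-3):
    """Forward-Euler integration of  h' = A0 h + b v(t)."""
    h = np.zeros(2)
    for k in range(int(T / dt)):
        h = h + dt * (A0 @ h + b * v(k * dt))
    return h

h = simulate(lambda t: 1.0)   # constant input v = 1
print(round(h[0], 2))         # h1 -> 1.0, the DC gain of 1/(s+1)^2
```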
It can now be verified that η_0 is the output of the third-order system

η̇_0 = −η_0 + η_1,   (2.21)

η̇ = A_0 η − b ū − h dθ̂/dt.   (2.22)
The variables h_1 and η_1 from (2.19) and (2.22) allow us to express the tracking error as

e = (1/(D+1)) [θ̃ h_1 − η_1] + ε(t).   (2.23)

The analogous expression for the augmented error is

ē = (1/(D+1)) [θ̃ h_1] + ε(t),   (2.24)

and it has the desired SPR form: the parameter error θ̃ multiplied by the "regressor" h_1 is the input into the SPR filter 1/(D + 1).
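Strict positive realness of the filter 1/(D+1) is easy to confirm numerically: a stable transfer function H(s) is SPR when Re H(jω) > 0 for all ω, and here Re{1/(jω+1)} = 1/(1+ω^2). A minimal check:

```python
import numpy as np

# SPR check for 1/(D+1): the real part of the frequency response
# 1/(jw+1) equals 1/(1+w^2), which is positive for every w.
w = np.logspace(-3, 3, 1000)
H = 1.0 / (1j * w + 1.0)
print(bool(np.all(H.real > 0)))                  # True
print(np.allclose(H.real, 1.0 / (1.0 + w**2)))   # True
```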
2.4. Update law. From this point on, the route prescribed by most of the adaptive linear control literature is to choose a normalized gradient update law and to set ū ≡ 0 (thus returning to a pure certainty-equivalence control). In the case of adaptive linear systems, boundedness of the closed-loop signals can then be established using the Gronwall lemma or some type of small-gain argument. Attempts to apply this type of stability proof to nonlinear systems have so far been successful only when conic constraints are imposed on the nonlinearities. Without such linear growth constraints, the term (θ − θ̂) v(y) can cause some signals to escape to infinity in finite time if the parameter error θ − θ̂ is not rapidly decreased. The difficulty with normalized update laws is that they do not allow a rapid enough decrease of the parameter error when this error is most harmful. A simulation example of instability of a full-state-feedback scheme with normalization [10] is given in our companion paper [20], where it is also shown that an unnormalized update law preserves global stability. We are, therefore, motivated to look for an unnormalized update law. The SPR form of (2.24) suggests the unnormalized gradient update law

dθ̂/dt = h_1 ē.   (2.25)
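The behavior of the unnormalized law (2.25) on the SPR error model (2.24) can be sketched with a toy simulation. The constant regressor h_1 = 1 and true parameter θ = 2 are hypothetical choices for illustration only; with ε(t) ≡ 0, the error model reduces to dē/dt = −ē + θ̃ h_1.

```python
# Toy simulation of the unnormalized gradient law (2.25) on the SPR
# error model (2.24) with epsilon(t) = 0.  The regressor h1 = 1 and the
# true parameter theta = 2.0 are illustrative assumptions.
theta, theta_hat, e_bar = 2.0, 0.0, 0.0
h1, dt = 1.0, 1e-3
for _ in range(int(40.0 / dt)):
    e_dot = -e_bar + (theta - theta_hat) * h1   # filter 1/(D+1)
    theta_hat += dt * h1 * e_bar                # update law (2.25)
    e_bar += dt * e_dot
print(round(theta_hat, 2), round(e_bar, 4))
```

With this constant (persistently exciting) regressor, the estimate converges to the true parameter and the augmented error decays to zero.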
Given the complexity of the nonlinear system (2.1), it is likely that such a simple update law will shift the difficulties of the adaptive design to the proof of stability. Indeed, this is the case. A simple Lyapunov-like function involving ē^2 and θ̃^2 has a nonpositive derivative, but fails to prove boundedness of y. It clearly shows, however, that η_0 must be taken into account. Our next attempt is with the function

V_1 = (1/2) (ē^2 + θ̃^2 + ∫_t^∞ ε^2(τ) dτ) + η^T P η,   (2.26)
where P is the positive definite solution of the Lyapunov equation

P A_0 + A_0^T P = −Q.   (2.27)

Using (2.22), (2.24) and (2.25), we compute

V̇_1 = −(1/2)(ē − ε(t))^2 − (1/2) ē^2 − [η^T Q η + 2 η^T P h h_1 ē + 2 η^T P b ū],   (2.28)
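The Lyapunov equation (2.27) is routinely solved numerically. The sketch below uses the vectorization identity vec(PA_0 + A_0^T P) = (A_0^T ⊗ I + I ⊗ A_0^T) vec(P), with the same hypothetical realization A_0 of 1/(s+1)^2 as above; since A_0 is Hurwitz and Q > 0, the solution P is positive definite.

```python
import numpy as np

# Solve P A0 + A0^T P = -Q by vectorization:
# (A0^T (x) I + I (x) A0^T) vec(P) = -vec(Q), column-major vec.
# A0 is a hypothetical Hurwitz realization of 1/(s+1)^2; Q = I.
A0 = np.array([[-2.0, 1.0], [-1.0, 0.0]])
Q = np.eye(2)
n = A0.shape[0]
M = np.kron(A0.T, np.eye(n)) + np.kron(np.eye(n), A0.T)
P = np.linalg.solve(M, -Q.flatten(order="F")).reshape(n, n, order="F")
print(P)                                          # [[0.5 -0.5], [-0.5 1.5]]
print(bool(np.all(np.linalg.eigvalsh(P) > 0)))    # True: P > 0
```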
and try to render it nonpositive. The tool we have prepared for this task is ū. However, it turns out to be impossible to counteract the effects of the two-dimensional vector h h_1 ē by ū alone. Hence, we need an additional degree of freedom involving h_1. This prompts us to replace η in (2.26) by the new vector

ζ = [ζ_1, ζ_2]^T = [η_1, η_2 + ξ_1(h_1) η_1]^T,   (2.29)

where the nonlinear function ξ_1(h_1) is at our disposal. With the new variables ζ, the nonnegative function to be used in our proof is

V = (1/2) (ē^2 + θ̃^2 + ∫_t^∞ ε^2(τ) dτ) + ζ^T P ζ.   (2.30)
To evaluate ~Ywe need ~, which is obtained by differentiating (2.29) and using (2.19)-(2.22): tl = & - hi0 - e(hl)¢~ ~2 = - ~ 1 - 2C2 + 2~1(hx)~1 - fi - h ~
(2.31)
Introducing the notation [ -h~ ] w = [-hlh2 h~l J '
(2.32)
-
we compute the time derivative of V as:
[
~
0~lh~]}. (2.33)
2.5. Design equation. We now have two tools to make V̇ nonpositive: the function ξ_1(h_1) and the control term ū. With these tools we will attempt to represent the quantity enclosed in braces in (2.33) as the sum of two squares. It turns out that this is possible to achieve by decomposing P as P = P_1 + P_2 such that the following design equation holds:
[
~t~x
0~1 h2(~1] . (2.34)
The substitution of (2.34) into (2.33) yields the desired form for V̇:

V̇ ≤ 0.   (2.35)

Our task now is to find P_1, P_2, ξ_1(h_1) and ū which satisfy the design equation (2.34). For the example considered here, the systematic procedure of Sect. 3 gives the following solution for P_1 and P_2:
O0 '
P2=
11 "
Substitution of (2.36) into (2.34) results in [(hlh~ + h~ + (ah~) 2 - 2h~] (x + (hlh~. + h~ +(xh~)2(2
=
-2~ + ~ -
0,~1 ~. '~ ~,,~) ~i -
~¢~ + ~
= '
(2.37)
which directly yields the following solutions for ξ_1 and ū:

ξ_1 = 2h_1^4,   (2.38)

ū = [8h_1^3 h_2 + 2h_1^4 − 4h_1^8 + (h_1h_2 + h_1^4 + 2h_1^6)^2] ζ_1 + [2h_1^4 + (h_1h_2 + h_1^4 + 2h_1^6)^2] ζ_2.   (2.39)

Using (2.29) and the notation

ψ̄_1 = 8h_1^3 h_2 + 2h_1^4 + (1 + 2h_1^4)(h_1h_2 + h_1^4 + 2h_1^6)^2,   (2.40)

ψ̄_2 = 2h_1^4 + (h_1h_2 + h_1^4 + 2h_1^6)^2,   (2.41)

ū can be defined in terms of available signals as

ū = ψ̄_1 η_1 + ψ̄_2 η_2.   (2.42)
To summarize, the complete closed-loop adaptive system is

Plant:  D^5 y = (D^2 + 2D + 1) u + θ [D^2 p_2(y) + D p_1(y) + p_0(y)]

Control:  u = −(G/E_2) y + r − θ̂ v(y) − ((FB − E_2)/E_2) u + ū   (2.43)

Update law:  dθ̂/dt = h_1 ē,   ē = e + η_0 = y − y_r + η_0

Filters:  ḣ = A_0 h + b v(y)
          η̇ = A_0 η − b (ψ̄_1 η_1 + ψ̄_2 η_2) − h h_1 ē
          η̇_0 = −η_0 + η_1,

where y_r is the output of the reference model (2.3) and v(y) is defined in (2.6).

2.6. Stability and tracking. The stability and tracking properties of (2.43) are now established using the nonnegative function V from (2.30), whose derivative, given in (2.35), is nonpositive. Because of the piecewise continuity of r(t) and the smoothness of the nonlinear functions p_0, p_1, p_2, the solution of (2.43) has a maximum interval of definition, which we denote by [0, t_f). We will now show that t_f = ∞. From (2.30) and (2.35) we conclude that ē, θ̃ and ζ are bounded on [0, t_f). Solving (2.29) for η and using (2.38) we obtain

η_1 = ζ_1,   η_2 = ζ_2 − 2h_1^4 ζ_1.   (2.44)
Thus, the boundedness of ζ implies that η is bounded, which, in turn, implies that η_0 is bounded: η̇_0 = −η_0 + η_1. Since ē and η_0 are bounded, e is bounded: e = ē − η_0. By the boundedness of y_r, this implies that y is bounded: y = e + y_r.
Hence, v(y) is bounded, which means that h is bounded: ḣ = A_0 h + b v(y). The boundedness of h and ζ implies that η and ū are bounded (cf. (2.44) and (2.39)). This does not yet prove that u is bounded. From (2.11), to prove the boundedness of u we only need to show that ((FB − E_2)/E_2) u is bounded. Since (FB − E_2)/E_2 is of relative degree 1, we can express ((FB − E_2)/E_2) u in the form

((FB − E_2)/E_2) u = ((D+1)^3 (D+λ_1)/E_2) [E_3 (1/(D+1)^3) u + (E_4/(D+λ_1)) (1/(D+1)^3) u] + ε(t),   (2.45)

where λ_1 is a positive constant and E_3, E_4 are polynomials of degree 2. Now (2.45) clearly shows that u is bounded if (D^i/(D+1)^3) u is bounded for i = 0, 1, 2. Since y is bounded and the plant (2.1) is minimum phase and of relative degree 3, it follows that (1/(D+1)^3) u is bounded. Differentiating (2.14) and substituting −θ̂ v(y) + ū from (2.11), we obtain for i = 1, 2:
Di e(') = ( D ~ 1 ) 3
Di
Using hi -
[0~(U) +
[
fi]
+
e(t)
FB
G
- ( D ~ 1)3 0~(y) + - ~ - .
+ ~y
] - ~ + ~(0"
(2.46)
Di-i (D +
1,2.v(y),) i = 1, 2, from (2.19) and rearranging terms in (2.46)
we get
o' (D + 1) 3 ['E'~'z J -
,]
D
D + 1 hi8 + e(t).
( D " ~ 1) 3 E'~'2y -
(2.47)
From (2.16), (2.21), (2.22), (2.24) and (2.25) we have

ė = dē/dt − η̇_0 = −ē + θ̃ h_1 + η_0 − η_1 + ε(t),   (2.48)
~, = - ~ +
Ohl + ~hl + ~o - 01 + ~.(t)
= ~ - 8hl - h~e + 8h2 - T/0 + ,/1 - T/2 + h ~ + e(t).
(2.49)
Since ē, h, y, θ̂ are bounded, (2.48) and (2.49) imply that ė and ë are bounded. Hence, by (2.47), (D^i/(D+1)^3) [(FB/E_2) u] is bounded for i = 1, 2. The boundedness of (D^i/(D+1)^3) u for i = 1, 2 then follows from the boundedness of (1/(D+1)^3) u and the
recursive expression

(D^i/(D+1)^3) u = (D^i/(D+1)^3) (FB/E_2) u − (D(FB − E_2)/E_2) (D^{i−1}/(D+1)^3) u + ε(t).   (2.50)
Next, we prove that the state of the plant is bounded. From (2.12) it follows that D^i y, 0 ≤ i ≤ 3, are bounded. Combining this with the fact that the plant is minimum phase, we conclude that the state of any minimal realization of (2.1) is bounded on [0, t_f). Thus, we have shown that the state of the closed-loop adaptive system (2.43) is bounded on its maximum interval of existence [0, t_f). Therefore, t_f = ∞. Finally, we prove convergence of the tracking error to zero. From t_f = ∞ and (2.35) we conclude that V̇ is bounded and integrable on [0, ∞). Furthermore, the boundedness of dē/dt (cf. (2.24)), ζ̇ (cf. (2.31)) and ḣ (cf. (2.19)) implies that V̈ is bounded. Hence, V̇ → 0 as t → ∞, which implies (cf. (2.35)) that ē → 0, ζ → 0 and, from (2.44), that η → 0 (since h is bounded). This, in turn, implies that η_0 → 0 as t → ∞ by (2.21). We conclude that

lim_{t→∞} [y(t) − y_r(t)] = lim_{t→∞} [ē(t) − η_0(t)] = 0.   (2.51)  □
3 The Systematic Design Procedure
Even though the expressions in the general case become more complicated than in the preceding section, the main steps of the design procedure remain essentially the same.

3.1. Nonlinear system properties. We consider the class of n-dimensional nonlinear systems which have an input-output description expressed globally by the n-th order scalar differential equation

A(D) y = B(D) [q(y)u] + Σ_{i=0}^{m} D^i [p_{i0}(y) + p_i^T(y) θ_1],   (3.1)

where
- the coefficients a_0, …, a_{n−1} of the denominator polynomial A(D) = D^n + a_{n−1}D^{n−1} + ⋯ + a_0 are unknown,
- the coefficients b_0, …, b_m (m ≤ n − 1) of the numerator polynomial B(D) = b_m D^m + ⋯ + b_0 are unknown, but B(D) is known to be Hurwitz, and the sign of b_m is known,
- θ_1 is an ℓ-dimensional vector of unknown parameters,
- q(y), p_{ij}(y), 0 ≤ i ≤ m, 0 ≤ j ≤ ℓ, are smooth nonlinearities with q(y) ≠ 0 for all y ∈ ℝ, and p_{ij}(0) = 0, 0 ≤ i ≤ m, 0 ≤ j ≤ ℓ.

Systems in this class are linearizable by output injection, and input-output linearizable by full-state feedback, but not necessarily full-state linearizable, even by full-state feedback, as will be shown in Sect. 5.
3.2. Augmenting the CE control. The design objective of the certainty-equivalence part of our control is to match a reference model of the same or higher relative degree than that of the nonlinear plant (3.1). As this plant is input-output linearizable by output feedback, we choose the linear reference model:

E_1(D) E_2(D) y_r = R r,   E_2(D) = E_{21}(D) E_{22}(D),   (3.2)

where E_1(D), E_2(D), E_{21}(D), E_{22}(D) are monic Hurwitz polynomials of degree n − m, n − 1, n − m − 1, and m, respectively, and R(D) is a polynomial of degree h ≤ n − 1. Filtering (3.1) by the strictly proper stable filter F/E_2, where F is a monic polynomial of degree n − m − 1, we obtain

(FA/E_2) y = (FB/E_2) [q(y)u] + Σ_{i=0}^{m} (FD^i/E_2) [p_{i0}(y) + p_i^T(y) θ_1].   (3.3)
It is now straightforward to verify that in the case when the coefficients of A(D) and B(D) and the parameters 01 are known, the desired matching is achieved by the control
q(~)
-~Y
+ ~ -
~ j=0
----
T~
~
-•( y ) i=0 E22p''
-
[q(y),']
-~2
(3.4)
'
provided that G, a polynomial of degree n − 1, F, M, ψ_j and L_0 satisfy the polynomial equations

FA + b_m G = E_1 E_2,   (3.5)
M = (1/b_m) R,   (3.6)

ψ_j = (1/b_m) θ_{1j} F,   0 ≤ j ≤ ℓ,   with θ_{10} = 1,   (3.7)

L_0 = (1/b_m) FB − E_2.   (3.8)

Note that L_0 is a polynomial of degree n − 2, since both (1/b_m)FB and E_2 are monic polynomials of degree n − 1. When the control (3.4) is applied to the system (3.1), asymptotic tracking is achieved:

y(t) = y_r(t) + ε(t) → y_r(t)  as t → ∞,   (3.9)
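Because deg G = n − 1 is smaller than deg A = n, the matching equation (3.5) can always be solved by polynomial division: F is the quotient and b_m G the remainder of E_1E_2 ÷ A. The second-order numbers below (A = D^2 + 1, b_m = 2, E_1 = (D+1)^2, E_2 = D + 2) are illustrative, not from the paper.

```python
import numpy as np

# Solve F*A + bm*G = E1*E2 (eq. (3.5)) by polynomial division, valid
# because deg G = n-1 < n = deg A.  Illustrative data: A = D^2 + 1,
# bm = 2, E1 = (D+1)^2, E2 = D + 2, so E1*E2 = D^3 + 4D^2 + 5D + 2.
A, bm = [1.0, 0.0, 1.0], 2.0
E1E2 = np.convolve(np.convolve([1, 1], [1, 1]), [1, 2])
F, rem = np.polydiv(E1E2, A)
G = rem / bm
print(F)   # [1. 4.]   -> F = D + 4
print(G)   # [ 2. -1.] -> G = 2D - 1
# residual check: F*A + bm*G reproduces E1*E2
assert np.allclose(np.polyadd(np.convolve(F, A), bm * G), E1E2)
```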
where ε(t) is the exponentially decaying tracking error caused by the mismatch of the initial conditions. We now rewrite the control (3.4) as

u* = (1/q(y)) [φ^T(y, u*, r) θ],   (3.10)
where the n_0-dimensional vectors θ and φ (with n_0 = 2n + (ℓ+1)(n − m)) are defined as

θ^T = [g_0, …, g_{n−1}, m_0, ψ_{00}, …, ψ_{0,n−m−1}, ψ_{10}, …, ψ_{1,n−m−1}, …, ψ_{ℓ0}, …, ψ_{ℓ,n−m−1}, l_{00}, …, l_{0,n−2}],   (3.11)
CT(y,u*,r)
,[-D"-ly, R = L-~
-~.-m-l X-.__ Di
(3.11)
(
-~._.,_l ~_~ Di
. _ , E , i=o 22 pio,y,, -~2 r' - - ~E21
-E21 -
~.~___p,~(~) , -~,,-~
]
-Dn-rn-1 m Si ...,
~21
i=O z':"22
t \
i=0 22
[q(y)u.]j,
(3.12)
with D_k defined as the (k+1)-dimensional row operator D_k = [1, D, …, D^k]. The form (3.10) is particularly useful in the case where the coefficients of A(D) and B(D) and the components of the vector θ_1 are unknown. Since in that case the parameter vector θ defined in (3.11) and used in (3.10) cannot be computed, it is replaced by an estimate θ̂. The so-formed "certainty-equivalence" control is then augmented by an additive term ū which is yet to be designed. Hence, the adaptive control will be of the form:

u = (1/q(y)) [φ^T(y, u, r) θ̂ + ū].   (3.13)
4
F Filtering the system equation (3.1) by the strictly proper stable filter - E1E2 ' and using (3.5)-(3.8) and (3.11)-(3.12), we obtain: FA -EI - YE2
FB ~-~ F D i [P,0(Y) + pT(y)0,] + e(t) -- E1E2 [q(y)u] + ~.= E1E2
y--
+ e(t) 1:-'2
y= ~
q(y)u + cT(y,., r)0 + ~ r
j=O
=
+ e(¢),
(3.14)
where, as in the case when 0 is known, e(t) denotes a linear combination of exponentially decaying terms caused by the mismatch of the initial conditions. Substitution of (3.13) into (3.14) yields the following description of the resulting feedback system:
Y=~
~
+¢T(~ 'u'r)
+
+
.
Introducing the error variables

e = y − y_r,   θ̃ = θ − θ̂,   (3.16)

and taking the difference between (3.15) and the reference model (3.2), we obtain the tracking error equation:

e = (b_m/E_1) [φ^T(y, u, r) θ̃ + ū] + ε(t).   (3.17)

In the special case of relative degree one (n − m = 1), the design is extremely simple. Since the transfer function |b_m|/E_1 is SPR, the parameter update law

dθ̂/dt = sgn(b_m) Γ φ(y, u, r) e,   (3.18)

where Γ = Γ^T > 0 is the adaptive gain, guarantees boundedness of all the closed-loop signals and convergence of the tracking error e to zero [18, Chap. 5], and the control augmentation is not needed: ū ≡ 0.

3.3. Error augmentation and swapping. For relative degree higher than one (n − m > 1), the design becomes considerably more complicated, since |b_m|/E_1 is no longer SPR. We first rewrite (3.17) in the form
e = (1/(D + λ_0)) [b_m θ̃^T (1/E_0) φ] + (b_m/E_1) ū + (b_m/E_1) [φ^T θ̃] − (1/(D + λ_0)) [b_m θ̃^T (1/E_0) φ] + ε(t),   (3.19)

where

E_0(D)(D + λ_0) = E_1(D).   (3.20)
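The relative-degree-one update law (3.18) can be illustrated with a minimal scalar simulation. Everything below (plant y' = ay + u with unknown a = 1, reference model y_r' = −y_r + r, gain γ = 2, input r = sin t) is a hypothetical toy example, not the paper's system; the certainty-equivalence control u = −(â + 1)y + r yields the SPR error equation e' = −e + (a − â)y, and â' = γye is the scalar instance of (3.18).

```python
import math

# Toy relative-degree-one adaptive tracking loop in the spirit of (3.18).
# Plant y' = a*y + u (a unknown), reference model yr' = -yr + r,
# control u = -(a_hat + 1)*y + r, update a_hat' = gamma*y*e.
a, a_hat, gamma = 1.0, 0.0, 2.0
y, yr, dt = 0.5, 0.0, 1e-3
for k in range(int(60.0 / dt)):
    t = k * dt
    r = math.sin(t)
    e = y - yr
    u = -(a_hat + 1.0) * y + r
    y += dt * (a * y + u)
    yr += dt * (-yr + r)
    a_hat += dt * gamma * y * e
print(abs(y - yr) < 0.05)   # tracking error has decayed
```

The Lyapunov function V = e^2/2 + (a − â)^2/(2γ) has V̇ = −e^2 along these trajectories, which is what guarantees the observed boundedness and error decay.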
In contrast to the example of Sect. 2, where the high-frequency gain b_m was known, here it is unknown. Therefore, using an estimate b̂_m and denoting b̃_m = b_m − b̂_m, we rewrite (3.19) in the form
1
e= D+A0
¼
"~b~r ~b+bm
(~ o a - 0 w¼~b+~--~0
-
E
Since the first summand in (3.21) is in the desired SPR form, we define the augmented error

ē = e + η_0,   (3.22)

where the term η_0 represents all the undesirable terms in (3.21).   (3.23)
The vector multiplying θ̃ in the second term is denoted by

ψ = (1/E_0) φ(y, u, r).   (3.24)
Considering φ(y, u, r) as the input and ψ as the output, we represent (3.24) in the matrix state-space form:

Ḣ = A_0 H + b φ^T(y, u, r),
ψ^T = c^T H,   (3.25)

where (c, A_0, b) is a minimal realization of 1/E_0:

c^T (sI − A_0)^{−1} b = 1/E_0(s).   (3.26)

Now η_0 is the output of the (n − m)-dimensional system

η̇_0 = −λ_0 η_0 + b̂_m c^T η,   (3.27)

η̇ = A_0 η − b ū − H dθ̂/dt.   (3.28)
The variables ψ and c^T η from (3.25) and (3.28) allow us to express the tracking error as (cf. (3.19))

e = (1/(D + λ_0)) [b_m θ̃^T ψ − b_m c^T η] + ε(t).   (3.29)

The analogous expression for the augmented error is

ē = (1/(D + λ_0)) [b_m θ̃^T ψ − b̃_m c^T η] + ε(t),   (3.30)

and it has the desired SPR form: the parameter errors θ̃ and b̃_m are filtered only by the SPR filter 1/(D + λ_0).

3.4. Update law. As in Sect. 2, we choose the unnormalized gradient update laws suggested by the SPR form of (3.30):

dθ̂/dt = sgn(b_m) Γ ψ ē,   (3.31)

db̂_m/dt = −γ c^T η ē,   (3.32)
where Γ = Γ^T > 0 and γ > 0 are the adaptive gains. From Sect. 2 we know that in the proof of stability there will be a need to balance the interaction between η and H. Therefore, we introduce the new variables

ζ = S^{−1} η,   S = C_n̄^{−1} Ξ C_n̄,   (3.33)

where n̄ = n − m − 1 and

C_i = [c, A_0^T c, …, (A_0^T)^{i−1} c]^T,   i = 1, …, n̄,   (3.34)
E_{j,i} = [I_{j×j}, 0]_{j×i},   0 < j ≤ i,   (3.36)

and I_{i×i} is the i × i identity matrix. The components of the i-dimensional row vectors ξ_i are nonlinear functions of the elements of H which represent the aforementioned additional degrees of design freedom. In order to show that the matrix S defined in (3.33) is invertible, we note that, because of the structure of the matrices E_{j,i} defined by (3.36), the matrix Ξ in (3.35) is lower triangular with ones on its diagonal. From this it follows that Ξ^{−1} always exists. Furthermore, C_n̄^{−1} exists because (c^T, A_0) is an observable pair. The nonnegative function to be used in our stability proof is

V = (1/2) (ē^2 + |b_m| θ̃^T Γ^{−1} θ̃ + (1/γ) b̃_m^2 + ∫_t^∞ ε^2(τ) dτ) + (λ_0/n̄) ζ^T P ζ.   (3.37)

The form of (3.37) is the same as that of (2.30), where P is the positive definite solution of the Lyapunov equation

P A_0 + A_0^T P = −Q_0.   (3.38)
To evaluate V̇ we need ζ̇, which is obtained by differentiating (3.33) and using (3.28), (3.31) and (3.34)–(3.36):

ζ̇ = S^{−1}(A_0 η − b ū − sgn(b_m) H Γ ψ ē) + (d/dt)(S^{−1}) η
  = A_0 ζ − sgn(b_m) S^{−1} H Γ ψ ē + S^{−1}(A_0 S − S A_0 − Ṡ) ζ − S^{−1} b ū.   (3.39)

Introducing the notation

w = −sgn(b_m) S^{−1} H Γ ψ,   (3.40)

W = S^{−1}(S A_0 − A_0 S + Ṡ),   (3.41)

we rewrite (3.39) as

ζ̇ = A_0 ζ + w ē − W ζ − S^{−1} b ū,   (3.42)

and compute the time derivative of V as

V̇ = −(1/2)(ē − ε(t))^2 − (1/2) ē^2 − (λ_0/n̄) { ζ^T Q_0 ζ − 2 ζ^T P (w ē − W ζ − S^{−1} b ū) }.   (3.43)
3.5. Design equation. The tools we have at our disposal to make V̇ nonpositive are the functions ξ_i(H) and the control term ū. With these tools we will attempt to represent the quantity enclosed in braces in (3.43) as the sum of n̄ squares. It turns out that this is possible to achieve by decomposing P as P = Σ_{i=1}^{n̄} P_i such that the following design equation holds:

Σ_{i=1}^{n̄} P_i w w^T P_i ζ = P (W ζ + S^{−1} b ū).   (3.44)
The substitution of (3.44) into (3.43) yields the desired form for 1?: 1( 1 )' ~ 2)~o~(2_(Tp, ~r=__2 v ~ 0 e - - ' ~ 0 e -- _ (TQ0(-- n " i = 1
w ) Z < O. (3.45)
Our task now is to find P_i, ξ_i(H) and ū which satisfy the design equation (3.44). Following the development in [17], we define

P_i = C_i^T M_i C_i,   (3.46)
where M1 = ( C I p - 1 c 1 ) -1 Mi = (CiP-1C~r~ ) -1
(
Iixi-
CiP-ICTMjEi,i
/
,
i = 2,...,ft.
(3.47)
]
~-----1
In [17, Lemma 1] it is proved that (3.46)–(3.47) result in

Σ_{i=1}^{n̄} P_i = P,   (3.48)
C_j P^{−1} C_i^T M_i C_i = 0,   1 ≤ j < i ≤ n̄.   (3.49)
This proof is now given for completeness. From (3.47) we have i-1
Iixi = C i P - 1 C T M, T E C ' p - 1 C T MyEj, i
j=l i
= C, P - 1 E C T M ~ E k , , .
(3.50)
k----1
Postmultiplying both sides of (3.50) by 6'~ and using the identity
Ck=Ek,iG, k
(3.51)
i
Ci = CiP -1 E k=l
cW Mkck "
(3.52)
Evaluating (3.52) at i = fi and using the nonsingularity of Ca we obtain (3.48). Furthermore, premultiplying (3.52) by Ej,i, where j < i, and using (3.51) again, we obtain i
Cj - - C j p - i ~
CTMkCk,
i > j.
(3.53)
k=l
But from (3.52) we have J k=l
which, combined with (3.53), results in (3.49). Having established (3.48) and (3.49), we now set out to find ~i(H) and fi which, along with Pi defined by (3.46)-(3.47), satisfy the design equation (3.44). Substituting (3.46) into (3.44) we obtain a
We + S-Xbfi = p-1 ~-~ CT MiCiwwTCT MTCi .
(3.54)
Using (3.41) we rewrite the design equation (3.54) as a
(3.55) i=l
Premultiplying both sides of (3.55) by cTAio-1 for { ----1 , . . . , fi, we obtain
+S] ~ + cWA~-iba,
1 < i < ~.
(3.56)
From (3.33), (3.35) and (3.51) we have
~101 (2C2
CaS = Y.Cn = Ca +
(3.57)
(.-iC.-i which gives
cTAio-IS = cTAio-1 + ~ i - l C / - 1 ,
1 < / < fi,
(3.58)
where we have defined for notational convenience ~0=0,
C0=0.
(3.59)
Furthermore, 0 cTAio-lS = cTAio-lC~l~--"Cr1 _ eTAio-Ic~I
:
~r~-lCr,-I =~i-tCi-1,
1 < i < fi,
(3.60)
where we have used the definition of Cn to obtain the last equality. Finally, from the definition (3.26) of the triple (c, A0, b) we have: l<{
cWAio-lb=O,
(3.61) (3.62)
cWA~-lb : 1.
Substituting (3.58)-(3.62) into (3.56) results in ~Ci = ~-1C~-i + ~-lCi-lAo
- (cVA~o- I +
~-1C~-1)
fl
1< i <
x P-1ECTMjCjwwTCTMTcI,
'h-
1
(3.63)
j=l
-"
[cTA~S-~n-,Cn-,-
c T A o. - ~ n - l C n - l A o
+ (cTA~ -1 + ~ n - I C a - 1 )
f~
x P-l c?M c ww cTMTc
¢
(3.64)
j=l
At this point, we have almost achieved our goal of finding ~i and fi which satisfy the design equation and thus render l/nonpositive. Still, (3.63)-(3.64) are in a rather complicated form and, moreover, they involve the time derivatives of the functions ~i. Therefore, we now set out to simplify (3.63)-(3.64) and to express ~i-x as explicit functions of available signals. Motivated by the appearance of the terms C j w in (3.63)-(3.64), we introduce the i-dimensional column vectors w l , . . . , w,~ which are defined as wi-Ciw,
l
(3.65)
Combining (3.65) with (3.51) we see that these vectors satisfy the recursive expressions
l
(3.66) (3.67)
< i < fi-
(3.68)
W?t ~ C f t w
wi = Ei,i+lWi+l ,
Using (3.30) we can rewrite (3.67) as 1
1.
We now set out to obtain explicit expressions for W l , . . . , wn in terms of ~i. Substituting (3.40) into (3.66) results in ~-wn = -sgn(b,,~)CnHF¢
(3.69)
We then use (3.25), (3.35) and (3.68) to rewrite (3.69) as
°1)
1f1,.
+
•
cT cTA0
[w._x] = -sgn(bm)
HFHTc. (3.70)
Lw . , . j
cTA~- 1
~n-lEn-l,n J
By (3.36), (3.70) is equivalent to
~IEl,n-1 l(n-1)×(n-l) nu
:
W . - 1 --
(3.71)
-sgn(bm)Cn-1HFHTc
L~n-2Ea-2,,-1
wn,. + ~n-,wn-x = -sgn(b.,)cT A'~-l H PHT c. (3.72) Starting from (3.71)-(3.72), one can repeat the above procedure to show that (3.70) is equivalent to Wl w2,2 + ~i w l
cT
cTAo
HFHTc.
= -sgn(bm)
(3.73)
cTA~- 1
Lwn,n + ~n-lWn-1
Hence, the explicit expressions for the vectors w , , . . . , wn are w T = -sgn(bm) (cT H F H Tc)
(3.74)
wT = [wwl,--sgn(bm)(cWAio-lHFHTc)--~,-tw,-1] , i = 2 , . . . , f t .
(3.75)
Combining (3.49) and (3.65) we obtain n
i
C,p-I~CTMjCjwwTCTMTCj = CiP-1 ZCjT MjCjwwT CjT MjT Cj j=l
j=l
= Ci P - ' L j=l = CiNiCi,
MjwiwT MT Ei,, Ci (3.76)
where i
Ni = p - 1 Z C J T MjwjwjT MjT Ej,i, i = 1 , . . . , f t . j=l
(3.77)
Thus, the last term in (3.63) and (3.64) can be rewritten as
j=l f~
= ( [ 0 . . . 0 1] + ~,_IE,_:,,)C,P-:~'~CiMiCiwwTCTMTCj j=l
= ([0 ... 0 1] + ~i-lEi-l,i)CiNiC, = (cTAi0- I + ~i-lCi-1)Ni(7,.
(3.78)
Introducing the i x (i + 1) matrix R~+I --- [0, I~×i] and substituting (3.78) into (3.63)-(3.64) we obtain for i - 1 , . . . , fi - 1: fiei
"-
~i-lCi-1 + ~,_,Ci_lAo - (cWA~-1 + ~,_lC,_:)g, ci
(3.79)
=
fi
{cTA~S - cTA~ - [ ~ , - 1 E , - : , , + e , _ : R , -
(c TA~-: + f , _ : C , _ : ) Nn] C , } ( .
(3.80)
This form makes it apparent that the design equation (3.54) is satisfied by the recursive expressions
(3.81)
~1 = --cT N1
~i = ~ i - l E , - : , , + ~,-1R~ - (cWA~-1 + ~,-1Ci-1) Y,,
i = 2 , . . . , 5,
fi = [cWA~ -- (cWA~ + eaCa) S-:] 71.
(3.82) (3.83)
To finally solve the design equation, we need to express ~i-1 in (3.82) as an explicit function of available signals. We first show via an induction argument that (3.74), (3.75), (3.77), (3.81) and (3.82) imply that the elements of wi, Ni and ~i are polynomial functions of the elements of CiH: -
For i = 1, this fact is obvious from the definitions:
wl = -sgn(bm )C1H F( C: H) w g l = p - 1 C T M t w : w T MT E1 = --cT N1 • -
For i -- k _< fi - 1, suppose the elements of wk, Nk and ~k are polynomial functions of the elements of Ck H. Then, the derivative of ~ can be expressed as
no
~k = ~--~fi~kWj 0 ~
(3.84)
i=1 Ohk,j ' where hkj is the j-th column of CkH and O{~/Oh~,i is the k × j matrix of partial derivatives of{k(CkH) with respect to the elements of hkj. But from (3.25), (3.61), (3.62), and the definition of R~+I, we obtain
Cj-I=C~AoH=Rk+:Ck+IH,
l
(3.85)
Combining (3.84) and (3.85) we can express ~k as I%# ~k = j=l E hk+ld T T :~k~,i Rk+l
- Fori=k+l,
(3.86)
wehave
wW+l = [WT, --sgn(bm)[0... 0 llC~,+IHF(CIH) w - ~kwk] Nk.I. 1 ---- NkEk,k+ 1 "Jffp-1cT+IMk.I.111)k.I.lwTI.I MT+I n, T T O~k E ~ ~+1 = E h~+lJR~+l O-~k,/ .,.+l + ~kRk+l -- (cTAko + ~kCk)N~+l • j=l
Hence, the elements of wk+l, Nk+l and ~k+l are polynomial functions of the elements of Ck+lH. Thus, the term ~ - x in (3.82) can be calculated explicitly from (3.86). Furthermore, the lower triangular form of 3 (cf. (3.35)) and the polynomial dependence of ~i on the elements of CiH imply that the elements of both S and ~ - I are polynomial functions of the elements of H. The design procedure is now complete. The expressions for ~i and fi, which guarantee that the nonnegative function V in (3.37) has the nonpositive derivative (3.45), are ~x = - c r N 1 no
~i -- E
W
W
O~i-1
hi,j Ri O ~
Ei- l,i q- ~i- l Ri
j=l - (cWA~-1 + ~,-1C,-1) N~, i = 2 , . . . , fi
(3.87)
~w = cWA~ _ (cWA~ + ~nCn) (cffX s e n ) -x • The designed closed-loop adaptive system is:
Plant:
ẋ = A_Σ x + b_Σ q(y)u + Σ_{i=0}^{m} e_{n−i} [p_{i0}(y) + p_i^T(y) θ_1]
y = c_Σ^T x

Control:
u = (1/q(y)) [φ^T(y, u, r) θ̂ + ū]   (3.88)

Update law:
dθ̂/dt = sgn(b_m) Γ ψ ē
db̂_m/dt = −γ c^T η ē
ē = e + η_0 = y − y_r + η_0

Filters:
Ḣ = A_0 H + b φ^T(y, u, r)
ψ^T = c^T H
η̇_0 = −λ_0 η_0 + b̂_m c^T η
η̇ = A_0 η − b ū − H sgn(b_m) Γ ψ ē,
-a,~-1 . I] A E --
, bE =
-ao
hie
, cE =
,
(3.89)
0 ... 0 . b0
with e_{n−i} the (n − i)-th coordinate vector in ℝ^n. The stability and tracking properties of (3.88) are established in the next section.

4 Stability and Tracking

We are now ready to state and prove our main result:

Theorem 4.1 For any uniformly bounded and piecewise continuous reference input r, all the signals in the closed-loop adaptive system (3.88) are well-defined and uniformly bounded on [0, ∞), and, in addition,

lim_{t→∞} e(t) = 0,   lim_{t→∞} η(t) = 0,   lim_{t→∞} η_0(t) = 0.   (4.1)
Proof. Due to the piecewise continuity of r(t) and the smoothness of the nonlinear functions appearing in the definitions of various terms in the closed-loop system (3.88), the solution of (3.88) has a maximum interval of existence [0, t_f). On this interval, the time derivative of the nonnegative function V defined in (3.37),

V = (1/2) (ē^2 + |b_m| θ̃^T Γ^{−1} θ̃ + (1/γ) b̃_m^2 + ∫_t^∞ ε^2(τ) dτ) + (λ_0/n̄) ζ^T P ζ,

computed along the solutions of (3.88), is given by (3.45) and is nonpositive: V̇ ≤ 0.
We conclude that V, ē, θ̃, b̃_m and ζ are bounded on [0, t_f) by constants depending only on the initial conditions of (3.88). This implies that θ̂, b̂_m are bounded on [0, t_f). The boundedness of ζ together with (3.33) and (3.58) evaluated at i = 1 imply that c^T η is bounded; from the definition of η_0 in (3.88) and the boundedness of b̂_m and c^T η it follows that η_0 is bounded. But since e = ē − η_0, and ē, η_0 are bounded, we have that e is bounded. Now from the boundedness of r we have that y_r is bounded, and, hence, y is bounded, since y = e + y_r. The boundedness of y implies that all the nonlinearities appearing are bounded, and, furthermore, that q(y) is bounded away from zero. Filtering the system equation (3.1) with the strictly proper stable filter 1/(BE_1) and rearranging terms, we obtain

(1/E_1) [q(y)u] = (A/(BE_1)) y − Σ_{i=0}^{m} (D^i/(BE_1)) [p_{i0}(y) + p_i^T(y) θ_1],   (4.2)

which, by the boundedness of y, implies that (1/E_1)[q(y)u] is bounded. The boundedness of H is now established by proving that the row vectors c^T A_0^i H, 0 ≤ i ≤ n̄ − 1, are bounded. This, in turn, is proved as follows: First, from [17, Lemma 5] we have that the first n̄ derivatives of ē can be expressed as

ē^{(i)} = p_i(ē, θ̃, b̃_m, η_0, C_i H, C_i ζ, ε_i),   i = 1, …, n̄,   (4.3)
where the p_i's are continuous functions of their arguments and the ε_i's are exponentially decaying terms. This is straightforward to show (cf. (2.48)–(2.49) in the example of Sect. 2), starting from e = ē − η_0 and using (3.88) and the facts that the derivative of ζ is given by

ζ̇ = A_0 ζ + w ē − P^{−1} Σ_{i=1}^{n̄} C_i^T M_i C_i w w^T C_i^T M_i^T C_i ζ,   (4.4)
and that the elements of w_i, N_i and ξ_i are polynomial functions of the elements of C_i H. Second, from (3.25) we have

c^T A_0^i H = (D^i/E_0) φ^T(y, u, r) + ε_i(t),   0 ≤ i ≤ n̄ − 1,   (4.5)
where ε_i(t) are n_0-dimensional row vectors of exponentially decaying terms. Then, we use (3.12) to express φ as

φ^T(y, u, r) = [ φ̄(y, r), (D_{n−2}/E_2) [q(y)u] ],   (4.6)

with

φ̄(y, r) = [ −(D_{n−1}/E_2) y, (R/E_2) r, (D_{n−m−1}/E_{21}) Σ_{i=0}^{m} (D^i/E_{22}) p_{i0}(y), …, (D_{n−m−1}/E_{21}) Σ_{i=0}^{m} (D^i/E_{22}) p_{iℓ}(y) ]   (4.7)
being bounded, since y and r are bounded. Combining (3.13), (3.17) and (4.5) we obtain the following expressions for the first n̄ derivatives of the tracking error e:

e^{(i)} = (b_m D^i/E_1) [φ^T(y, u, r) θ̃ + ū] + ε(t)
       = (b_m D^i/E_1) [q(y)u + φ^T(y, u, r) θ] + ε(t)
       = (b_m D^i/E_1) [q(y)u] + (b_m E_0 D/E_1) (D^{i−1}/E_0) φ^T(y, u, r) θ + ε(t)
       = (b_m D^i/E_1) [q(y)u] + (b_m E_0 D/E_1) c^T A_0^{i−1} H θ + ε(t),   1 ≤ i ≤ n̄ − 1.   (4.8)

It is important to note that b_m E_0 D/E_1 is stable and proper. The boundedness of c^T A_0^i H is now established for i = 0, …, n̄ − 1 by the following induction argument:
- For i = 0, (4.5)–(4.6) give

c^T H = (1/E_0) [ φ̄(y, r), (D_{n−2}/E_2) [q(y)u] ] = [ (1/E_0) φ̄(y, r), (D_{n−2} E_1/(E_0 E_2)) (1/E_1) [q(y)u] ].   (4.9)

Since φ̄(y, r) and (1/E_1)[q(y)u] are bounded and D_{n−2} E_1/(E_0 E_2) is a row of stable proper filters, (4.9) implies that c^T H is bounded. Furthermore, we have already shown that ē is bounded.
- For 1 ≤ i ≤ n̄ − 1, assume that c^T A_0^k H and ē^{(k)} are bounded for 0 ≤ k ≤ i − 1. Hence, C_i H is bounded, and, by (4.3), ē^{(i)} is bounded. Then, rewriting (4.8)
as
D_~. i
E1 [q(y)u]
=
l"~'e(O E°DcT A~-I H O + e(t')
(4.10)
D~
we conclude that ~'1 [q(y)u] is bounded. Finally, using (4.5)-(4.7), we obtain
cT AioH ---=
P(y, r) , EoE2 [q(y)u] [ ~ 0ip ( y '
, EID'-,-2 D' ] EoE2 E1 [q(y)u] + ~(t) .
(4.11)
Hence, cWAioH is bounded. This proves that H is bounded, which, by (4.3), means that e(0, 1 < i < fi, are bounded. Next, we prove the boundedness of u. From (3.13), the boundedness of 0 and the fact that q(y) is bounded away from zero, it follows that u is bounded if
and φ(y, u, r) are bounded. The boundedness of H implies the boundedness of the w_i and ξ_i. Since η = Sζ = C_n̄^{−1} Ξ C_n̄ ζ, and Ξ and ζ are bounded, η is bounded as well. Hence, ū is bounded. From (4.6)–(4.7), to prove boundedness of φ(y, u, r) we only need to show that (D_{n−2}/E_2)[q(y)u] is bounded. We first rewrite (4.8) as
Di 1 i E'-~ [q(y)u] = -~-~e( ) Di
which implies that ~
1
with the fact that ~
EoD T i 1 ~ c A"o- H 0 + e(t), 1 < i < fi - 1,
(4.12)
[q(y)u] is bounded for i = 1 , . . . , fi - 1. Combining this Di
[q(y)u] is bounded, it follows that ~ [q(y)u] is bounded
for / = 1 , . . . , n - 3. Differentiating (4.12) with i = n - 1 and using (3.25) and (3.62) we obtain
OnE1[q(y)u] = ~,,,le(n) _ ZODEt [cTA~H + ¢T(y, u, r)] 0 + ~(t).
(4.13)
Substituting ¢W(y, u,r) from i4.6) and rewriting (3.11) as 8 w = [~W, £0,-2], we express (4.13) as
~-~[q(y)u]
DnE2 + ~.On_2Dn-IEo E1E2 [q(y)u] _ l_e(n)_ EoD cTA~ HO+ -- bm
L
which implies that ~ Di
(y,r),
[q(y)u]
+e(t), (4.14)
Et
[q(y)u] is bounded. Since L is of degree n - 2 and
[q(y)u] is bounded for i = 1 , . . . , n -
D n-2 3, it follows that ~ [q(y)u] is
D,-2 bounded. Hence, ~ [q(y)u] is bounded, which proves that u is bounded. In order to show the boundedness of the state of the plant, we note that the boundedness of u, y and r together with (3.14) imply that Diy, 0 < i < n - rn, are bounded. From this and the fact that B(D) is Hurwitz, we conclude that the state ~ in (3.88) is bounded. We have thus proved that the state of the closed-loop adaptive system (3.88) is bounded on [0, re). Hence, tf -- oo. To prove the convergence of the tracking error e to zero, we first note that (3.37) and (3.45) imply that V is bounded and integrable on [0, co). Furthermore, the boundedness of ~ (.cf. (3.30)), ( (cf. (4.4)) and H (cf. (3.88)) implies that is bounded. Hence, V --. 0 as t ---} ~ , which, in view of (3.45), proves that ~ O, ~ ~ 0 as t ~ oo. Since 7 / = S~ and S is bounded, ~/ ~ 0 as t ~ oo. Combined with (3.88) and the boundedness of bin, this also proves that r/0 ---* 0 as t --+ oo. Thus,
Kanellakopoulos, Kokotović, and Morse
$$\lim_{t\to\infty}\,[y(t) - y_r(t)] = \lim_{t\to\infty}\,[\eta(t) - \eta_0(t)] = 0. \qquad (4.15)$$
□
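The final limit step invokes a standard consequence of Barbalat's lemma, stated here for reference (a standard fact, not part of the original derivation):
$$V(t) \ge 0,\quad \int_0^\infty V(t)\,dt < \infty,\quad \dot V \text{ bounded} \;\Longrightarrow\; \lim_{t\to\infty} V(t) = 0,$$
since a nonnegative, integrable function with a bounded derivative is uniformly continuous and must therefore tend to zero.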
5 The Class of Nonlinear Systems
Most models of nonlinear systems are expressed in specific state coordinates. From that state-space form it may not always be obvious whether or not the nonlinear system at hand has the input-output description assumed in Sect. 3. Therefore, we now give coordinate-free geometric conditions which are necessary and sufficient for a single-input single-output nonlinear system of the form
$$\dot z = f(z;\alpha) + g(z;\alpha)u$$
$$y = h(z;\alpha) \qquad (5.1)$$
to have an input-output description of the form (3.1), which is repeated here for convenience:
$$A(D)y = B(D)[q(y)u] + \sum_{i=0}^{m} D^i\left[p_{i0}(y) + p_i^{\mathrm T}(y)\theta_1\right]. \qquad (5.2)$$
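To fix the operator notation (here $D$ denotes the differentiation operator, as in Sect. 3), it may help to write out a low-order instance; for $n = 2$, $m = 1$, equation (5.2) reads
$$\ddot y + a_1\dot y + a_0\,y = b_1\,D[q(y)u] + b_0\,q(y)u + D\big[p_{10}(y) + p_1^{\mathrm T}(y)\theta_1\big] + p_{00}(y) + p_0^{\mathrm T}(y)\theta_1.$$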
In (5.1), $z \in \mathbb{R}^n$ is the state, $u \in \mathbb{R}$ is the input, $y \in \mathbb{R}$ is the output, $\alpha = [\alpha_1 \cdots \alpha_\ell]^{\mathrm T} \in \mathbb{R}^\ell$ is a vector of unknown constant parameters, $f$, $g$ are smooth vector fields and $h$ is a smooth function, with $f(0;\alpha) = 0$, $h(0;\alpha) = 0$ for all $\alpha \in \mathbb{R}^\ell$, and $g(z) \ne 0$ for all $z \in \mathbb{R}^n$. In (5.2):
- the coefficients $a_0,\ldots,a_{n-1}$ of the denominator polynomial $A(D) = D^n + a_{n-1}D^{n-1} + \cdots + a_0$ are unknown,
- the coefficients $b_0,\ldots,b_m$ ($m \le n-1$) of the numerator polynomial $B(D) = b_m D^m + \cdots + b_0$ are unknown,
- $\theta_1$ is a $\bar\ell$-dimensional vector of unknown parameters, resulting from a possible overparameterization in which products and powers of the original unknown parameters $\alpha_i$ are treated as new parameters (so that $\bar\ell \ge \ell$),
- $q(y)$, $p_{ij}(y)$, $0 \le i \le m$, $0 \le j \le \bar\ell$, are smooth nonlinearities with $q(y) \ne 0$ for all $y \in \mathbb{R}$, and $p_{ij}(0) = 0$, $0 \le i \le m$, $0 \le j \le \bar\ell$.

We first note that a minimal state representation of (5.2) is given by
$$\dot x = A_\Sigma x + b_\Sigma\, q(y)u + \sum_{i=0}^{m} e_{n-i}\left[p_{i0}(y) + p_i^{\mathrm T}(y)\theta_1\right]$$
$$y = c^{\mathrm T}x, \qquad (5.3)$$
with $A_\Sigma$, $b_\Sigma$, $c$, $e_{n-i}$ as defined in (3.89). Hence, the following statement becomes obvious:
Adaptive Output-Feedback Control for Nonlinear Systems
Fact 5.1. The nonlinear system (5.1) has an input-output description of the form (5.2) if and only if there exists a global in $z$, possibly parameter-dependent, diffeomorphism transforming (5.1) into (5.3). □

Using this fact, we now state the following result:

Proposition 5.2. The system (5.1) has an input-output description of the form (5.2) if and only if the following conditions are satisfied for all $z \in \mathbb{R}^n$ and for the true value of the parameter vector $\alpha$:

(C1) the one-forms $dh,\, dL_f h,\, \ldots,\, dL_f^{n-1}h$ are linearly independent,
(C2) $\left[\mathrm{ad}_f^{\,i}\xi,\; \mathrm{ad}_f^{\,j}\xi\right] = 0$, $i,j = 0,\ldots,n-1$, where $\xi$ is uniquely defined by
$$L_\xi L_f^i h = \begin{cases} 0, & i = 0,\ldots,n-2 \\ 1, & i = n-1, \end{cases}$$

(C3)
$$\mathrm{ad}_f^{\,n}\xi = \sum_{j=0}^{n-1} d_j(\alpha)\,\mathrm{ad}_f^{\,j}\xi + \sum_{j=0}^{m}\left[\bar p_{j0}(y) + \bar p_j^{\mathrm T}(y)\theta_1\right]\mathrm{ad}_f^{\,j}\xi,$$
$$g = q(y)\sum_{j=0}^{m} c_j(\alpha)\,\mathrm{ad}_f^{\,j}\xi,$$
with $d_j(\alpha)$, $c_j(\alpha)$ polynomial functions of $\alpha$, $\theta_1$ the new unknown parameter vector, and $\bar p_i(y) = \int_0^y p_i(v)\,dv$, $i = 0,\ldots,\ell$, and
(C4) the vector fields $f$ and $\xi$ are complete.

Proof. Using Proposition 3 of [22], it is straightforward to show that conditions (C1)–(C3) are necessary and sufficient for the existence of a local diffeomorphism such that in the new coordinates the system (5.1) is expressed as
$$\dot x = \begin{bmatrix} (-1)\,d_{n-1}(\alpha) & & \\ \vdots & I_{n-1} & \\ (-1)^{n-1}\,d_1(\alpha) & & \\ (-1)^n\,d_0(\alpha) & 0 \cdots 0 & \end{bmatrix}x + \begin{bmatrix} 0 \\ \vdots \\ (-1)^{n-m}\,c_m \\ \vdots \\ (-1)^n\,c_0 \end{bmatrix} q(y)u + \begin{bmatrix} 0 \\ \vdots \\ (-1)^{n-m}\left[\bar p_{m0}(y) + \bar p_m^{\mathrm T}(y)\theta_1\right] \\ \vdots \\ (-1)^n\left[\bar p_{00}(y) + \bar p_0^{\mathrm T}(y)\theta_1\right] \end{bmatrix}$$
$$y = x_1, \qquad (5.4)$$
which is exactly in the form (5.3), where the coefficients $a_0,\ldots,a_{n-1}$, $b_0,\ldots,b_m$ depend on the physical parameters $\alpha$. From [23], condition (C4) is necessary and sufficient for the above diffeomorphism to be global. □

Remark 5.3. The above proposition gives a set of geometric conditions characterizing the class of nonlinear systems to which our adaptive scheme can be applied. Whenever the conditions (C1)–(C4) can be verified a priori, the input-output description (5.2) of the nonlinear system at hand is determined directly from (C3), without the need to compute the diffeomorphism of Proposition 5.2. Unfortunately, the verification of (C1)–(C4) may require some a priori information about the unknown parameter vector $\alpha$. □

Remark 5.4. The conditions of Proposition 5.2 are satisfied by nonlinear systems that are linearizable by output injection and input-output linearizable by full-state feedback. However, they need not be full-state feedback linearizable. □

We illustrate these two remarks with an example.

Example 5.6. It may not be obvious that (2.1) is the input-output description of the nonlinear system
$$\begin{aligned} \dot z_1 &= z_2 + a\,(3ye^y + 2y^2) \\ \dot z_2 &= z_3 - a\,(2ye^y + y^2) \\ \dot z_3 &= z_4 + a\,ye^y \\ \dot z_4 &= z_5 + a\,y^2 \\ \dot z_5 &= u + a\,y^3 \\ y &= z_1 + 2z_2 + z_3. \end{aligned} \qquad (5.5)$$
However, this can be established by checking the conditions (C1)–(C4). Straightforward calculations show that for (5.5) we have
$$\xi = 5\frac{\partial}{\partial z_1} - 4\frac{\partial}{\partial z_2} + 3\frac{\partial}{\partial z_3} - 2\frac{\partial}{\partial z_4} + \frac{\partial}{\partial z_5} \qquad (5.6)$$
$$\mathrm{ad}_f^5\xi = a\left[-3y^2\,\xi + (2y + 6y^2)\,\mathrm{ad}_f\xi - (e^y + ye^y + 4y + 3y^2)\,\mathrm{ad}_f^2\xi\right] \qquad (5.7)$$
$$g = \xi - 2\,\mathrm{ad}_f\xi + \mathrm{ad}_f^2\xi. \qquad (5.8)$$
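Since bracket computations of this kind are error-prone by hand, the relations (5.6)–(5.8) can be checked symbolically. The following sketch is our illustration, not part of the original text; it assumes Python with sympy, the bracket convention $\mathrm{ad}_f\xi = [f,\xi]$, and the drift/input split of (5.5) with $g = \partial/\partial z_5$:

```python
import sympy as sp

z = sp.symbols('z1:6')
a = sp.symbols('a')
y = z[0] + 2*z[1] + z[2]                       # output y = z1 + 2 z2 + z3

# drift vector field f of (5.5) (u = 0; the input enters through g)
f = sp.Matrix([z[1] + a*(3*y*sp.exp(y) + 2*y**2),
               z[2] - a*(2*y*sp.exp(y) + y**2),
               z[3] + a*y*sp.exp(y),
               z[4] + a*y**2,
               a*y**3])
g = sp.Matrix([0, 0, 0, 0, 1])                 # input vector field d/dz5

def bracket(v, w):                             # Lie bracket [v, w]
    Z = sp.Matrix(z)
    return w.jacobian(Z)*v - v.jacobian(Z)*w

def lie(h, v):                                 # Lie derivative of a scalar
    return sum(sp.diff(h, zi)*vi for zi, vi in zip(z, v))

xi = sp.Matrix([5, -4, 3, -2, 1])              # candidate xi from (5.6)

# defining relations of (C2): L_xi L_f^i h = 0 (i = 0..3), = 1 (i = 4)
h = y
vals = []
for i in range(5):
    vals.append(sp.simplify(lie(h, xi)))
    h = lie(h, f)
print(vals)                                    # -> [0, 0, 0, 0, 1]

# iterated brackets ad_f^j xi, j = 0..5
ad = [xi]
for j in range(5):
    ad.append(sp.simplify(bracket(f, ad[-1])))

# (5.8): g = xi - 2 ad_f xi + ad_f^2 xi
print(sp.simplify(ad[0] - 2*ad[1] + ad[2] - g))   # zero vector

# (5.7): expansion of ad_f^5 xi
rhs = a*(-3*y**2*ad[0] + (2*y + 6*y**2)*ad[1]
         - (sp.exp(y) + y*sp.exp(y) + 4*y + 3*y**2)*ad[2])
print(sp.simplify(ad[5] - rhs))                   # zero vector
```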
Hence, the conditions (C1)–(C4) are satisfied for all $\alpha$, and the input-output description of (5.5) is
$$D^5 y = (D^2 + 2D + 1)\,u + \theta\left[D^2(ye^y + 2y^2 + y^3) + D(y^2 + 2y^3) + y^3\right], \qquad (5.9)$$
where $\theta = a$. It is important to note that to determine this input-output description no explicit change of coordinates was required. In this simple example,
however, one can find the corresponding change of coordinates by inspection:
$$\begin{aligned} x_1 &= z_1 + 2z_2 + z_3 \\ x_2 &= z_2 + 2z_3 + z_4 \\ x_3 &= z_3 + 2z_4 + z_5 \\ x_4 &= z_4 + 2z_5 \\ x_5 &= z_5. \end{aligned} \qquad (5.10)$$
In these coordinates, (5.5) becomes
$$\begin{aligned} \dot x_1 &= x_2 \\ \dot x_2 &= x_3 \\ \dot x_3 &= x_4 + u + a\,(ye^y + 2y^2 + y^3) \\ \dot x_4 &= x_5 + 2u + a\,(y^2 + 2y^3) \\ \dot x_5 &= u + a\,y^3 \\ y &= x_1. \end{aligned} \qquad (5.11)$$
We immediately see that (5.11) has the input-output description (5.9). However, it should also be pointed out that (5.5) is not full-state feedback linearizable, since the distribution
$$G_3 = \mathrm{span}\left\{g,\; \mathrm{ad}_f g,\; \mathrm{ad}_f^2 g,\; \mathrm{ad}_f^3 g\right\} \qquad (5.12)$$
is not involutive [21]. □
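The claim that the change of coordinates (5.10) transforms (5.5) into (5.11) can also be checked symbolically. This is a minimal sketch of ours, assuming Python with sympy; it differentiates each $x_i$ along the right-hand side of (5.5) and compares with (5.11):

```python
import sympy as sp

z = sp.symbols('z1:6')
u, a = sp.symbols('u a')
y = z[0] + 2*z[1] + z[2]                # output y = z1 + 2 z2 + z3

# right-hand side of (5.5), input u included
zdot = [z[1] + a*(3*y*sp.exp(y) + 2*y**2),
        z[2] - a*(2*y*sp.exp(y) + y**2),
        z[3] + a*y*sp.exp(y),
        z[4] + a*y**2,
        u + a*y**3]

# change of coordinates (5.10)
x = [z[0] + 2*z[1] + z[2],
     z[1] + 2*z[2] + z[3],
     z[2] + 2*z[3] + z[4],
     z[3] + 2*z[4],
     z[4]]

# xdot_i = grad(x_i) . zdot
xdot = [sp.expand(sum(sp.diff(xi, zj)*fj for zj, fj in zip(z, zdot)))
        for xi in x]

# expected right-hand side of (5.11), with y = x1
expected = [x[1],
            x[2],
            x[3] + u + a*(y*sp.exp(y) + 2*y**2 + y**3),
            x[4] + 2*u + a*(y**2 + 2*y**3),
            u + a*y**3]

print(all(sp.simplify(xd - e) == 0 for xd, e in zip(xdot, expected)))  # True
```

Note how the nonlinearities cancel exactly in the first two equations, which is what makes the observer form (5.11) emerge.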
6 Conclusions

This paper has extended the theory of adaptive control for linear systems to a class of systems which are essentially nonlinear in the sense that their nonlinearities are not restricted by any growth constraints. In spite of this absence of growth constraints, all the stability and tracking results are global. The assumptions on the linear part of the system are the same as in the standard adaptive theory for linear systems. However, to guarantee the aforementioned global properties, the systematic design procedure has departed from the two main ingredients of most adaptive schemes for linear systems: the certainty-equivalence control and the normalization of the update law. In addition to the certainty-equivalence part, the control contains a term which counteracts the effects of rapidly growing nonlinearities. Thanks to the presence of this term, the normalization of the update law is avoided, which allows the rapid decrease of the parameter error. This proved to be crucial in preventing finite escape times common in systems with rapidly growing nonlinearities. The class of nonlinear systems has been restricted by coordinate-free geometric conditions which are equivalent to the structural requirements that the nonlinearities depend only on the output and do not enter the system before
the control does, and that the zero dynamics are linear and exponentially stable. Relaxing these restrictions, and thus enlarging the class of systems that can be adaptively controlled using only output measurements, is a topic of further research.
Acknowledgement
The authors would like to thank Professor Riccardo Marino for many helpful discussions which led to the development of Section 5.
References
1. G. Campion and G. Bastin, "Indirect adaptive state feedback control of linearly parametrized nonlinear systems," Int. J. Adapt. Control Sig. Proc., vol. 4, pp. 345-358, Sept. 1990.
2. I. Kanellakopoulos, P. V. Kokotović, and R. Marino, "Robustness of adaptive nonlinear control under an extended matching condition," Prepr. IFAC Symp. Nonlinear Control Syst. Design, pp. 192-197, Capri, Italy, June 1989.
3. I. Kanellakopoulos, P. V. Kokotović, and R. Marino, "An extended direct scheme for robust adaptive nonlinear control," Automatica, to appear, March 1991.
4. R. Marino, I. Kanellakopoulos, and P. V. Kokotović, "Adaptive tracking for feedback linearizable SISO systems," Proc. 28th IEEE Conf. Dec. Control, pp. 1002-1007, Tampa, FL, Dec. 1989.
5. J.-J. E. Slotine and J. A. Coetsee, "Adaptive sliding controller synthesis for nonlinear systems," Int. J. Control, vol. 43, pp. 1631-1651, June 1986.
6. D. G. Taylor, P. V. Kokotović, R. Marino, and I. Kanellakopoulos, "Adaptive regulation of nonlinear systems with unmodeled dynamics," IEEE Trans. Aut. Control, vol. 34, pp. 405-412, April 1989.
7. K. Nam and A. Arapostathis, "A model-reference adaptive control scheme for pure-feedback nonlinear systems," IEEE Trans. Aut. Control, vol. 33, pp. 803-811, Sept. 1988.
8. J.-B. Pomet and L. Praly, "Adaptive nonlinear control: an estimation-based algorithm," in New Trends in Nonlinear Control Theory, J. Descusse, M. Fliess, A. Isidori, and D. Leborgne, Eds., Springer-Verlag, Berlin, 1989.
9. J.-B. Pomet and L. Praly, "Adaptive nonlinear regulation: equation error from the Lyapunov equation," Proc. 28th IEEE Conf. Dec. Control, pp. 1008-1013, Tampa, FL, Dec. 1989.
10. S. S. Sastry and A. Isidori, "Adaptive control of linearizable systems," IEEE Trans. Aut. Control, vol. 34, pp. 1123-1131, Nov. 1989.
11. A. Teel, R. Kadiyala, P. V. Kokotović, and S. S. Sastry, "Indirect techniques for adaptive input-output linearization of nonlinear systems," Int. J. Control, vol. 53, pp. 193-222, Jan. 1991.
12. I. Kanellakopoulos, P. V. Kokotović, and R. H. Middleton, "Observer-based adaptive control of nonlinear systems under matching conditions," Proc. 1990 Amer. Control Conf., pp. 549-555, San Diego, CA, May 1990.
13. I. Kanellakopoulos, P. V. Kokotović, and R. H. Middleton, "Indirect adaptive output-feedback control of a class of nonlinear systems," Proc. 29th IEEE Conf. Dec. Control, pp. 2714-2719, Honolulu, HI, Dec. 1990.
14. G. Bastin and M. R. Gevers, "Stable adaptive observers for nonlinear time-varying systems," IEEE Trans. Aut. Control, vol. 33, pp. 650-658, July 1988.
15. R. Marino, "Adaptive observers for single output nonlinear systems," IEEE Trans. Aut. Control, vol. 35, pp. 1054-1058, Sept. 1990.
16. R. Marino and P. Tomei, "Global adaptive observers for nonlinear systems via filtered transformations," submitted to IEEE Trans. Aut. Control, 1990.
17. A. Feuer and A. S. Morse, "Adaptive control of single-input single-output linear systems," IEEE Trans. Aut. Control, vol. AC-23, pp. 557-569, Aug. 1978.
18. K. S. Narendra and A. M. Annaswamy, Stable Adaptive Systems, Prentice-Hall, Inc., Englewood Cliffs, NJ, 1989.
19. S. S. Sastry and M. Bodson, Adaptive Control: Stability, Convergence and Robustness, Prentice-Hall, Inc., Englewood Cliffs, NJ, 1989.
20. P. V. Kokotović, I. Kanellakopoulos, and A. S. Morse, "Adaptive feedback linearization of nonlinear systems," this volume, pp. 309-344.
21. A. Isidori, Nonlinear Control Systems, 2nd ed., Springer-Verlag, Berlin, 1989.
22. A. J. Krener and A. Isidori, "Linearization by output injection and nonlinear observers," Syst. Control Lett., vol. 3, pp. 47-52, June 1983.
23. W. Respondek, "Global aspects of linearization, equivalence to polynomial forms and decomposition of nonlinear control systems," in Algebraic and Geometric Methods in Nonlinear Control Theory, M. Fliess and M. Hazewinkel, Eds., pp. 257-284, D. Reidel Publishing Co., Dordrecht, 1986.